KryoFlux - Optimisation Strategies


Unless there is an unnoticed issue, the code is now fully working. A few dirty shortcuts remain here and there and will be replaced with proper functions. For example, data streaming cannot yet be disabled properly: at the moment, once it's turned on, it stays on until the host requests a software reset.

Things still to do:

  • Do testing to verify integrity... just in case.
  • Add OOB (Out Of Band) data for index signal.
  • Proper un-initialization.
  • Fix host timeout issue.

On the last point: we just noticed that DD transfer now fails (!) because the transfer buffer has to be filled completely within the timeout period set in WinUSB, and DD data is too compact to fill the large buffer in time... However, this buffer size is needed for HD. The WinUSB timeout would have been a lot more useful if it detected a period of bus inactivity, instead of measuring how long it takes to fill the supplied buffer. Such is life.

To resolve this, the firmware either needs to be changed to periodically signal transfer end within the host timeout period, or we completely disable timeouts while transferring data. We’re not very keen on the latter unless the transfer end signal method fails.
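The transfer end signal method relies on a standard USB property: a bulk transfer completes as soon as the device sends a packet shorter than the endpoint's maximum packet size (a zero-length packet if the data is an exact multiple of it). Below is a minimal sketch of the decision logic in C. The names, the 64-byte packet size and the millisecond tick are assumptions made for illustration, not the actual firmware code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical value: maximum packet size of the bulk endpoint. */
enum { MAX_PACKET = 64 };

/* Hypothetical bookkeeping for the host transfer currently in progress. */
typedef struct {
    uint32_t started_ms;  /* tick at which the current host transfer started */
    uint32_t limit_ms;    /* must be shorter than the host-side timeout */
} xfer_timer;

/* True once the current transfer has been open for limit_ms or longer and
   should be ended early. Unsigned subtraction keeps the comparison correct
   when the tick counter wraps around. */
static bool transfer_end_due(const xfer_timer *t, uint32_t now_ms)
{
    assert(t->limit_ms > 0);
    return (uint32_t)(now_ms - t->started_ms) >= t->limit_ms;
}

/* Size of the last packet needed to end a transfer carrying `pending` bytes.
   A result of 0 means a zero-length packet has to be sent, because the data
   would otherwise end on a full-size packet and the transfer would stay open. */
static uint32_t final_packet_size(uint32_t pending)
{
    return pending % MAX_PACKET;
}
```

With something like this in the transfer routine, a slow DD stream would hand a partially filled buffer to the host before the timeout expires, while an HD stream that fills the buffer quickly would never reach the limit.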

Previously, the data transferred always maxed out the bandwidth and filled the buffer, so ironically a timeout due to not filling a buffer completely within a given period has never been an issue before.

This kind of performance can be achieved because everything is done in a rather unusual way. Every operation that costs performance was analysed to find a way to do one of the following:

  1. Delay the operation, if something costly is happening at the time, until the system is definitely idle.
  2. Merge the operation with something else.
  3. Let the host software do it.

For example, checking the index signal and encoding it into a stream is a very expensive operation if you have to do it a few hundred thousand times a second while also making sure you write your buffer in sync with the sampling system.

Instead, we introduced an OOB (Out Of Band) data marker that is sent regardless of its validity whenever the ring buffer wraps around. This way, the index signal (or whatever) information is encoded as part of the buffer, but never ever actually touched by the sampler → zero cost.
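To illustrate the idea, here is a minimal sketch in C of a transfer routine that adds an OOB record each time it wraps back to the first page of the ring buffer. Everything in it is hypothetical: the tiny page sizes, the marker byte, the record layout and the field names are assumptions for illustration and not the real stream encoding. The point is only that the sampler's buffer is read, never modified, and the marker goes out whether or not it carries valid index information.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Tiny hypothetical sizes so the behaviour is easy to follow. */
enum { PAGE_SIZE = 4, PAGE_COUNT = 2, OOB_SIZE = 3 };
#define OOB_TAG 0xFFu  /* hypothetical marker byte, not the real encoding */

typedef struct {
    uint8_t  ring[PAGE_COUNT * PAGE_SIZE]; /* written only by the sampler */
    unsigned next_page;    /* next page the transfer routine will send */
    uint8_t  index_valid;  /* latched outside the sampling loop */
    uint8_t  index_info;   /* payload; meaningful only if index_valid is set */
} stream_state;

/* Copy the next page into `out`. Before page 0 (the wrap point, which also
   includes the very first page) an OOB record is emitted, valid or not.
   Returns the number of bytes written to `out`. */
static size_t emit_next_page(stream_state *s, uint8_t *out, size_t cap)
{
    size_t n = 0;
    assert(cap >= (size_t)OOB_SIZE + PAGE_SIZE);

    if (s->next_page == 0) {
        out[n++] = OOB_TAG;
        out[n++] = s->index_valid;
        out[n++] = s->index_info;
        s->index_valid = 0;  /* consumed; the sampler was never involved */
    }
    memcpy(out + n, s->ring + s->next_page * PAGE_SIZE, PAGE_SIZE);
    n += PAGE_SIZE;
    s->next_page = (s->next_page + 1) % PAGE_COUNT;
    return n;
}
```

Because the record is produced only once per wrap and only in the transfer path, its cost does not scale with the sample rate.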

If we want to transmit information about, say, buffer overflow (right now it just lights the LEDs and the system freezes), we'll just add that to the transfer routine as OOB data, and since it's not associated with the sampler → zero cost.

Checking for buffer overflow at each sample is very expensive. Checking at transfer time whether the sampler has overwritten any data that has just been transferred (i.e. whether it has passed the start of the current page in the ring buffer) → zero cost.
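A check of this kind can be sketched in C as follows. It assumes, purely for illustration, that both the sampler and the transfer routine keep running 32-bit byte counts rather than only wrapped ring offsets; the function and parameter names are hypothetical and the real firmware may track positions differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Called once per page, after the page has been handed to USB.
   written_total:    running count of bytes the sampler has written so far
   page_start_total: running count at which the page just sent began
   ring_size:        capacity of the ring buffer in bytes
   If the sampler is more than one full ring ahead of the page start, it has
   overwritten at least the first byte of that page while it was being sent.
   Unsigned subtraction keeps the result correct when the counters wrap. */
static bool page_overrun(uint32_t written_total,
                         uint32_t page_start_total,
                         uint32_t ring_size)
{
    assert(ring_size > 0);
    return (uint32_t)(written_total - page_start_total) > ring_size;
}
```

The comparison runs once per transferred page instead of once per sample, which is why it adds nothing to the sampling loop.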

You get the idea. There are quite a few other things like that.

We could even make it faster, but the compiler thinks it’s smarter... and we’re not going to convert it to assembly code as it would not be maintainable anymore. HD disks now indeed work, and we have about 80% more bandwidth than the minimum required now. The only failure that can happen is a buffer overrun and detecting that is already in there, although freezing the system is not a nice way of indicating it. :)


  • Fixed a special case with the buffer overrun check.
  • Added code that sends a short packet to signal (current) transfer end when the timeout limit is reached.
  • Added code for disabling disk read streaming.