Why are the disk images so big?

Actually, they are very small, considering what they contain. Some Atari ST dumps might be quite big though.

Normal Disk Images

The normal size of a dump is around 3 MiB. The theoretical maximum size of a dump is about 18 MiB, but this would only happen if all tracks on a disk were unformatted; these cannot be compressed, since you rarely get redundancy in random data. ;-)

If 12500 bytes is the theoretical optimum track size for DD disks (some games have tracks of up to 13900 bytes), and each track is sampled 5 times (to avoid redumping, and to highlight various sorts of protection), you get:

  12500 * 5 = 62500 bytes


Also, the timing (density) table uses 4 bytes per byte sampled, which gives us:

  12500 * 4 = 50000 bytes

This gives us a total of 112500 bytes (112.5 kB, or about 110 KiB) per track with no compression. Dumping the whole disk would give you:

  84 cylinders * 2 sides * 112500 bytes = 18900000 bytes (~18 MiB)

Now, using our special lossless compression algorithm takes this down to around 3 MiB per dump. This means a 112500-byte uncompressed track becomes a compressed one of around 19000 bytes. That is a compression ratio of about 6:1, or a size reduction of roughly 83%. Pretty good really...
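The arithmetic above can be checked with a short Python sketch. It uses only the figures quoted in the text; the constant and function names are made up for illustration and are not part of any dumping tool. Like the calculation above, it counts the timing table once per track.

```python
# Figures quoted in the text; all names here are illustrative only.
TRACK_BYTES = 12500          # theoretical optimum DD track size
SAMPLES_PER_TRACK = 5        # each track is sampled 5 times
TIMING_BYTES_PER_BYTE = 4    # timing (density) table: 4 bytes per byte
CYLINDERS = 84
SIDES = 2
MIB = 1024 * 1024


def raw_track_bytes():
    """Uncompressed size of one dumped track: samples plus timing table."""
    sample_data = TRACK_BYTES * SAMPLES_PER_TRACK
    timing_table = TRACK_BYTES * TIMING_BYTES_PER_BYTE
    return sample_data + timing_table


def raw_disk_bytes():
    """Uncompressed size of a whole dump (every track on both sides)."""
    return CYLINDERS * SIDES * raw_track_bytes()


def compression_ratio(compressed_bytes):
    """How many times smaller the compressed dump is than the raw one."""
    return raw_disk_bytes() / compressed_bytes


if __name__ == "__main__":
    print(raw_track_bytes())                       # 112500
    print(raw_disk_bytes())                        # 18900000
    print(round(raw_disk_bytes() / MIB, 2))        # 18.02
    print(round(compression_ratio(3 * MIB), 1))    # 6.0
```

A typical 3 MiB dump therefore works out at roughly a sixth of the raw size.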

Large Atari ST Disk Images

The problem we have found when dumping Atari ST disks is that quite a few are single sided. That is, one side is unformatted or contains random data. These types of disks can be very large when dumped. It is exactly the same for Amiga games that do not use all of the disk, or whose data is scattered around the disk, as in Archipelagos.

The tracks on Archipelagos can be seen below. The ones marked grey are random, i.e. unformatted and just “noise”.

[Image: Archipelagos Track Display]

As discussed, a dump of a normal disk is usually around 3 MiB, because each track is sampled 5 times and information like track geometry and bit density is stored with it. However, a dump of a single sided Atari ST game can be around 8-10 MiB.

Now, this is the disk dumped in a raw unprocessed form. However, the IPF (Interchangeable Preservation Format) file produced from such a single sided disk will be half the size of a double sided IPF, as you might expect. In real figures, this is roughly 1 MiB for a double sided disk (depending on the copy protection used) and about 500 KiB for single sided ones.

The fact that the dumps are so big is annoying, but unfortunately unavoidable, because the dumping tool does not (and should not) care whether a track is unformatted; it just reads it anyway. It cannot detect this, because it would need to determine whether the track was really unformatted or just part of the copy protection. This determination is the job of the analyser, which is an x86-based application due to the sheer processing power required.
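To see how much unformatted tracks inflate a dump, here is a rough back-of-envelope estimate in Python. It rests on two simplifying assumptions that are ours, not measured behaviour of the dumping tool: an unformatted track is stored at its full uncompressed size of 112500 bytes, and a formatted track compresses to the average implied by a typical 3 MiB dump. The function name is hypothetical.

```python
MIB = 1024 * 1024
RAW_TRACK_BYTES = 112500        # uncompressed track size, from the text
TOTAL_TRACKS = 84 * 2           # 84 cylinders, 2 sides
TYPICAL_DUMP_BYTES = 3 * MIB    # typical compressed dump, from the text


def estimate_dump_bytes(unformatted_tracks):
    """Estimate dump size for a disk with some unformatted tracks.

    Assumptions (illustrative only): unformatted tracks do not compress
    at all; formatted tracks compress to the typical per-track average.
    """
    if not 0 <= unformatted_tracks <= TOTAL_TRACKS:
        raise ValueError("track count out of range")
    compressed_track = TYPICAL_DUMP_BYTES / TOTAL_TRACKS
    formatted_tracks = TOTAL_TRACKS - unformatted_tracks
    return (formatted_tracks * compressed_track
            + unformatted_tracks * RAW_TRACK_BYTES)


if __name__ == "__main__":
    # Single sided disk: one whole side (84 tracks) is unformatted.
    single_sided = estimate_dump_bytes(84)
    print(round(single_sided / MIB, 1))   # 10.5
```

Under these assumptions a single sided disk comes out at about 10.5 MiB, a little above the 8-10 MiB seen in practice, which is about as close as such a crude model can be expected to get.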