Monday, August 13, 2012

The Persistence of Memory

This is the oldest book I could find on my bookshelf. It's over 100 years old, printed in 1909. It's holding up quite nicely. It has a musty smell, but all the pages are intact and legible. Based on the page count (355) and the words per page (29x29=841), I'd say there are about 300,000 words in the book. At an average of 4.5 letters per word, there are about 1.34 million characters in the book. It takes 8 bits to encode a letter, so this book holds about 10.7 million bits of information, or roughly 1.3 MB. Its volume is 1,457 cubic centimeters.
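As a quick check on that estimate, here is a minimal Python sketch that re-runs the arithmetic from the inputs quoted above (355 pages, 29x29 words per page, 4.5 letters per word, 8 bits per letter). The function and variable names are my own, and the megabyte figure assumes decimal megabytes (10^6 bytes).

```python
# Inputs quoted in the post; the names are illustrative, not from the post.
PAGES = 355
WORDS_PER_PAGE = 29 * 29      # 841, the per-page estimate used in the post
LETTERS_PER_WORD = 4.5
BITS_PER_LETTER = 8


def book_capacity_bits(pages, words_per_page, letters_per_word, bits_per_letter):
    """Return the estimated information content of a book, in bits."""
    words = pages * words_per_page
    characters = words * letters_per_word
    return characters * bits_per_letter


bits = book_capacity_bits(PAGES, WORDS_PER_PAGE, LETTERS_PER_WORD, BITS_PER_LETTER)
megabytes = bits / 8 / 1_000_000   # assumption: decimal megabytes (10**6 bytes)

print(f"{PAGES * WORDS_PER_PAGE:,} words")   # 298,555 words
print(f"{bits / 1e6:.1f} million bits")      # 10.7 million bits
print(f"{megabytes:.2f} MB")                 # 1.34 MB
```

Changing any one input (say, a more conservative words-per-page figure) shows how sensitive the final number is to the rough page estimate.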

This second picture is a 5 1/4 inch floppy. I couldn't find one in my house, so I borrowed the image from Wikipedia. According to Wikipedia, this technology was introduced in 1976 and was nearly obsolete by the end of the 1980s. But in their heyday, these disks were ubiquitous. They hold (at their maximum) 1.2 MB. The volume of one disk is 53 cubic centimeters.

Comparing the two, we see that the floppy takes up significantly less space and packs far more data into each cubic centimeter. For the same volume as my book, a stack of floppies can store about 33 MB of data. The floppies are more delicate though, and my experience with them as a child was that they not infrequently bent to the point of being unusable. This was a known design flaw, and they were later replaced with 3.5" floppies, which were sturdier and held 1.44 MB.
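The same-volume comparison is simple enough to sketch in Python. This uses only the figures quoted above (1,457 cubic centimeters for the book, 53 cubic centimeters and 1.2 MB per floppy) and assumes, unrealistically, that floppies stack with no wasted space. The names are my own.

```python
# Figures quoted in the post; names are illustrative.
BOOK_VOLUME_CM3 = 1457
FLOPPY_VOLUME_CM3 = 53
FLOPPY_CAPACITY_MB = 1.2


def capacity_in_volume(volume_cm3, unit_volume_cm3, unit_capacity_mb):
    """Capacity (MB) of as many storage units as fit in the given volume.

    Assumes perfect packing and allows fractional units, so this is an
    upper-bound estimate rather than a count of real disks.
    """
    units = volume_cm3 / unit_volume_cm3
    return units * unit_capacity_mb


floppies_per_book = BOOK_VOLUME_CM3 / FLOPPY_VOLUME_CM3
stack_mb = capacity_in_volume(BOOK_VOLUME_CM3, FLOPPY_VOLUME_CM3, FLOPPY_CAPACITY_MB)

print(f"{floppies_per_book:.1f} floppies fit in the book's volume")
print(f"{stack_mb:.1f} MB in the same volume")
```

The same function works for any medium with a known volume and capacity, so it is easy to extend the comparison to 3.5" floppies or a hard disk.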

These, too, have practically disappeared. There are modern hard disks, which survive being dropped much more easily and store upwards of 1 TB (and climbing). But they're mechanical, and do catastrophically fail from time to time. Solid-state drives are coming down in price and provide much better data persistence; I have one with 128 GB. Their failure rate per bit should be much, much lower than that of HDs, floppies, or books.

Now here's a picture (also attributed to Wikipedia) of a medieval manuscript from China, predating the 13th century.

I don't know how to calculate the bit rate of Chinese, but it's pretty cool that something so old is still around and readable. If you know Chinese, can you tell me if it's readable? Unlike several medieval manuscripts I found in Latin, this actually looked like something legible, whereas medieval Latin script can sometimes be an eyesore to modern readers.

As a civilization, we have a lot invested in our collective knowledge. That used to be housed in books in libraries. Now it's gone digital. It takes up much less space, is better indexed, and is infinitely more accessible to a much wider breadth of the literate population. But is this transition for the better?

Books have been recovered from Pompeii that are still legible (with a bit of reconstruction). There are legions of extant medieval documents that are readable, translatable, and transcribable. Books printed 200 years ago adorn bookshelves and can even be read without translation.

A floppy disk from 1976 is unusable, and any data it once stored, if it wasn't copied, is lost. Same for 3.5" microfloppies. While we're surrounded with tons of storage media (USB fobs, HDs, SSDs, "The Cloud"), all of that is relatively recent, and there is nothing to say that any of it is more persistent than its digital predecessors.

What has changed is our transcription error rate. Now computers convert from one format to another instead of zoned-out medieval monks. The prospects for saving more information from century to century are rosy. But we had that with the printing press, too. What makes us think that the digital age will preserve our data? If Facebook or Blogger goes away tomorrow, or in twenty years, those pictures are lost for good (in twenty years, do you really think your CD backups will still work or even be around?). But your photo album will still be there on your shelf. As will the diary beside your bed.

Our collective memory is being uploaded right now to the Internet with the false assumption that it will always be there, but what will happen to us when we wake up and realize that most of our memories, without our consent or knowledge, had all their bits flipped to zero?


  1. Great post!

    By the way, I have to comment that I am still forced to use 3.5" floppy disks in the process of designing state-of-the-art ADCs for various next-generation mobile devices. Why, you ask? This is because we still have old logic analyzers whose only I/O is a 3.5" floppy drive (and sometimes ethernet, but no one can ever figure that out)!

  2. As for the bit rate of Chinese before the 13th Century, I found this:

    If we take the upper limit of about 54,000 characters we can conclude that it takes 16 bits to represent each character. So there's your bit rate...

  3. IBM still makes floppy drives for 5 inch disks. And also 8 inch disks. If you can keep it safe from the elements, it's arguable that a floppy disk is safer than the cloud, which stores your data more or less at the whims of the corporations hosting it.

    A Kindle can hold as much as many bookcases. But in 10 years, those bookcases will still be holding readable books, while the Kindle will be e-waste. There are major issues of sustainability, aside from anything else.
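The 16-bit figure in the second comment follows from a fixed-width encoding: with about 54,000 distinct characters, each one needs enough bits to give every character its own code. A minimal Python sketch, assuming every character gets a code of the same length (this is an upper bound, not an entropy estimate, since it ignores how often each character occurs); the function name is my own:

```python
def fixed_width_bits(alphabet_size: int) -> int:
    """Smallest number of bits that gives each symbol a distinct fixed-width code."""
    if alphabet_size < 1:
        raise ValueError("alphabet_size must be at least 1")
    # Codes run from 0 to alphabet_size - 1, so we need enough bits
    # to represent the largest code. A one-symbol alphabet needs 0 bits.
    return (alphabet_size - 1).bit_length()


chinese_bits = fixed_width_bits(54_000)   # character count from the comment above
latin_bits = fixed_width_bits(256)        # the 8-bit letter assumed in the post

print(chinese_bits)  # 16
print(latin_bits)    # 8
```

On this crude measure a Chinese character carries twice as many bits as an 8-bit letter, though it usually also stands for a whole word or syllable rather than a single letter.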


ubi solitudinem faciunt, pacem appellant.