09 November, 2005

Recorded history

According to the Democrats, the White House has once again changed something on its website to avoid embarrassment. This wouldn't be the first time -- during the last Presidential campaign, quite a few pages disappeared.

At the same moment -- as I write -- two of the Internet's greatest pioneers are on the White House website responding to "Ask the White House" questions from the public. Somehow I doubt they will respond to my question:
How can we ensure that history remains accurately recorded when pages on the Internet can so easily be changed? Does the Internet make it easier or harder to change recorded history?
UPDATE: They did take it. Their vaguely satisfying response:
That's a really good question! There are projects underway to capture the dynamic contents of the World Wide Web. Brewster Kahle is running an Internet Archive project for example. The content of the WWW is dynamic and often ephemeral and potentially modifiable, as you suggest. Digital Signature technology is one way of protecting information by exposing any attempt to modify it. But even that may not guarantee absolute integrity protection forever. The use of digital objects and its underlying ability to verify the integrity of digital content through the use of the Handle System that Bob Kahn has been working on at CNRI offers another fruitful avenue towards solving this problem. In addition efforts such as the American Memory project at the Library of Congress and recent efforts to automate the National Archives represent institutional approaches to this problem.
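To make the "exposing any attempt to modify it" idea concrete, here is a minimal sketch, not anything the White House or CNRI actually runs. It uses a plain SHA-256 digest, which is a simpler stand-in for a real digital signature (a signature would also tie the digest to a signer's key). The function names and the sample page text are made up for illustration.

```python
import hashlib


def fingerprint(page_text: str) -> str:
    """Return the SHA-256 hex digest of a page's text."""
    return hashlib.sha256(page_text.encode("utf-8")).hexdigest()


def is_unmodified(page_text: str, recorded_digest: str) -> bool:
    """True if the page still matches the digest recorded earlier."""
    return fingerprint(page_text) == recorded_digest


# Hypothetical page contents, invented for this example.
original = "Example statement as first published."
edited = "Example statement as later revised."

# Record the digest when the page is first seen...
recorded = fingerprint(original)

# ...and compare against it later.
print(is_unmodified(original, recorded))
print(is_unmodified(edited, recorded))
```

The catch is the same one the answer hints at: the check is only as trustworthy as the place where the recorded digest is kept. If whoever edits the page can also rewrite the digest, nothing is exposed.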


You know, I've often worried about this problem in another context. Specifically, much of what we know of history and historical figures is drawn from letters. E.g. from Pharyngula: "During their lifetimes, Darwin sent at least 7,591 letters and received 6,530; Einstein sent more than 14,500 and received more than 16,200." Now that there's e-mail, what are historians going to make use of in the future? We delete most of our correspondence; whatever we don't will only last as long as our hard-drives do...

Posted by saurabh

I don't delete my correspondence. In fact, I organize it pretty well. I use Eudora, which maintains all the letters, in and out, in ascii-searchable folders.

Outlook is, predictably for Microsoft, pretty bad in this regard. If a user is on a work network, s/he can tell Outlook to archive a set of folders to a separate file. But that file can be read, of course, only by Outlook. So it's a combination of good and bad news that such correspondence will be largely lost. Bad news for the reason you cite. Good news because people using Outlook are maybe not so smart, so it will save historians from ploughing through a lot of dreck.

My father insists on printing out all e-mails to read them. He also sends his letters as word processor files and insists that his recipients print them to read them. It's quaint but effective -- at least it was when I had a printer. Personally, I don't find paper to be any more persistent than data. Often it's less so: viz my pal whose 40,000-piece library of Tibetan manuscripts recently burned in a house fire.

And most of all I take a Buddhist approach. I try not to live for posterity, but for now. I am perfectly content to know that everything I do and say could be washed away in a tsunami. It comforts me several times a day when I say something stupid.  

Posted by hedgehog

I've often thought that a worthwhile project for someone with a lot of server space and bandwidth would be to regularly crawl and archive dated snapshots of certain government sites. The White House and Defense being key. But I'm sure if you unleashed a bot on .mil, they'd notice and not be happy about it.

Saurabh, I was just writing about that to a friend today--by email of course. It's not quite as inspiring of epistolary lyricism as a fountain pen on nice paper.

Hedgehog---I'm pretty sure there's a way to export any given Outlook folder as a single text file--tab separated values or somesuch. So yeah, you have to do it folder by folder, and then you'd have to write a script or something to break the resulting text file up into the component pieces, but I doubt that's very hard.
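Something like the following is roughly what such a script could look like. It is only a sketch under assumptions: the export is taken to be tab-separated with a header row, and the column names From, Subject and Body are hypothetical, not Outlook's actual export layout.

```python
import csv
import io


def split_export(export_text: str) -> list[str]:
    """Split a tab-separated export into one text block per message.

    Assumes a header row with hypothetical columns From, Subject, Body.
    """
    reader = csv.DictReader(io.StringIO(export_text), delimiter="\t")
    messages = []
    for row in reader:
        messages.append(
            f"From: {row['From']}\nSubject: {row['Subject']}\n\n{row['Body']}\n"
        )
    return messages


# Made-up sample standing in for one exported folder.
sample = (
    "From\tSubject\tBody\n"
    "alice@example.com\tHello\tFirst message.\n"
    "bob@example.com\tRe: Hello\tSecond message.\n"
)

for number, message in enumerate(split_export(sample), start=1):
    print(f"--- message {number} ---")
    print(message)
```

Writing each block to its own file instead of printing it would be a small extra step. The real work would be matching the column names to whatever the export actually produces.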

That's really awful about the Tibetan manuscripts. I hate the thought of books burning. When books have persevered for so long, I feel like there's a point at which it becomes ever increasingly worthwhile to keep preserving them. I've really enjoyed the time I've gotten to spend with old manuscripts and such, and I am still childishly pleased to encounter old family papers, so I wouldn't want to rob my own children of that. But I've moved often enough and been disconnected from the physical trappings of family heritage enough to learn how to quickly swallow being upset at losing email backups and the like.
I discovered that by making it incredibly inconvenient to print, I often didn't really need to. It's like TV that way.

Posted by Saheli
