Scribes
saurabh is a manic-depressive graduate student with delusions of overturning well-established social hierarchies through sheer weight of cynicism. in his spare time he writes self-effacing autobiographical blurbs.
dan makes things up casually, effortlessly, and often. Never believe a
word he says.
hedgehog burrows between San Francisco and other areas rich in roots and nuts. His father says he is a literalist and his mother says he is very smart. Neither of them says aloud that he should spend less time with blegs and more time out of doors.
Pollocrisy
Blegs
- scrofulous
- wax banks
- a tiny revolution
- under the same sun
- alt hippo
- isthatlegal?
- informed comment
- abu aardvark
- crooked timber
- bob harris
- saheli: the gathering
- john & belle have a blog
- red state son
- pharyngula
- critical montages
- living the scientific life
- pass the roti
- attitude adjustor
- pandagon
- this modern world
- orcinus
- a lovely promise
- ufo breakfast
- sabdariffa
- to do: 1. get hobby, 2. floss
31 December, 2004
Summat amusing (White House Paranoia remix)
I was poking around on Bruce Schneier's website, which eventually led me (through a convoluted chain) to the White House website's robots.txt file. For the unlettered, robots.txt is a file served by most websites that contains instructions to automatic web-crawlers about which files within the site they should and should not index. Spiders (web-crawlers) are generally used by search engines (e.g. Google) and web archives (e.g. archive.org's Wayback Machine) to trawl the internet and keep track of what's what. Spiders are of course free to disregard robots.txt, but the instructions are often useful to both parties, and almost all spiders conform to them as a courtesy to the site. A good number of sites also use robots.txt to keep files from being cached (e.g. by Google).
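If you want to play spider yourself, Python's standard urllib.robotparser module does the courtesy check for you. A minimal sketch (the URL is real, but the file it serves today will have long since changed from the 2004 version discussed below):

    from urllib.robotparser import RobotFileParser

    # Fetch and parse the site's robots.txt.
    rp = RobotFileParser()
    rp.set_url("https://www.whitehouse.gov/robots.txt")
    rp.read()

    # A well-behaved spider asks this question before fetching any page;
    # False means the rules ask crawlers to keep out.
    print(rp.can_fetch("*", "https://www.whitehouse.gov/barney/iraq"))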
Case in point: the White House, which has apparently gone to the trouble of excluding every page on the site that might contain text transcripts or might mention Iraq, by postfixing 'iraq' or 'text' to every damn page on their site, whether or not the indicated page actually exists. This precludes the slightest possibility of anyone caching anything that even mentions Iraq, to prevent embarrassing incidents of your words coming back to bite you on the ass*. This produces some pretty comical results, viz. Barney the Dog's site:
Disallow: /barney/iraq

Et tu, Barney?
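If you're curious about the scale of the thing, a quick-and-dirty script can tally the 'iraq' rules directly. Again a sketch only: whatever you fetch today won't match the 2004 file.

    import urllib.request

    ROBOTS_URL = "https://www.whitehouse.gov/robots.txt"

    # Pull down the raw file and split it into lines.
    with urllib.request.urlopen(ROBOTS_URL) as resp:
        lines = resp.read().decode("utf-8", errors="replace").splitlines()

    # Count the Disallow rules that mention 'iraq'.
    iraq_rules = [ln for ln in lines
                  if ln.lower().startswith("disallow") and "iraq" in ln.lower()]
    print(f"{len(iraq_rules)} of {len(lines)} lines disallow an 'iraq' path")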
* Yes, Thomas Friedman gets brownie points for this awesome bitch-slap maneuver, but he's still a weenie.