Nelson's Web Time Travels Get More Support from Library of Congress

Michael L. Nelson

A team of researchers including Old Dominion University computer scientist Michael L. Nelson has received a $450,000 grant from the Library of Congress to continue a project that has been dubbed "time travel for the Web."

"The Web has a terrible memory," is how Nelson and his colleagues described the challenge they were addressing when late last year they unveiled a Web architecture they called Memento.

The Library of Congress had supported the preliminary work on the project, and now it has given the research team the money to refine and expand Memento in a variety of ways. The two-year grant period begins May 15.

As Nelson sees it, "The Memento project advocates a rather straightforward approach to make navigating last year's Web as easy as navigating today's. Remnants of the past Web are available, and there are many efforts ongoing to archive even more. It's just that the past Web is not as readily accessible as today's.

"For example," Nelson added, "if you want to see an archived version of http://cnn.com, you can go to the Internet Archive's Wayback Machine and search for it there. But wouldn't it be much easier if you could just connect to cnn.com indicating that you are interested in the pages of, say, March 20, 2008, not the current ones? You could activate a time machine in your Web browser!"

Most Web users understand that a URI (uniform resource identifier) they bookmark one month may well return a quite different page a few months later, because the original page will have been supplanted by a later version. These same users also probably know that current avenues for retrieving old, archived versions of Web pages are limited and can be convoluted and slow.

To enhance the Web's "memory," the researchers have devised a solution that uses software on both servers and browsers to let a browser enter "time-travel" mode. In this mode, the browser requests the version of a page from a specific date and time rather than the most up-to-date version. Memento, in other words, can provide an easy way to dig up a news service's Web page from late summer 2005, when Hurricane Katrina and its flooding were ravaging New Orleans.

The Memento project is supported by the Library of Congress under the National Digital Information Infrastructure and Preservation Program (NDIIPP). Leading the research team together with Nelson is Herbert Van de Sompel, a computer scientist at the U.S. Department of Energy's Los Alamos National Laboratory (LANL).

Other members of the team are Lyudmila Balakireva, Robert Sanderson and Harihar Shankar of LANL and Scott Ainsworth, a graduate student at ODU.

Since Nelson and Van de Sompel first presented the Memento technology at an NDIIPP seminar at the Library of Congress in Washington on Nov. 16, 2009, there have been dozens of other international presentations, tutorials, and podcasts about Memento. Memento has also become a popular topic on Twitter, Slideshare and various technology blogs.

Memento works via a function of the Hypertext Transfer Protocol (HTTP), which underlies the World Wide Web. HTTP defines how requests and responses are formatted and transmitted between browsers and servers.

A function called "content negotiation" allows latitude in how the browser interacts with the Web server. For example, the page request settings of the browser that contacts the URI might dictate that the page be sent in French as opposed to English. The settings might also show a preference for HTML over PDF, or for GNU Zip files over Zip files.
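The preferences described above travel as request headers. The following is a minimal sketch of how such headers might be assembled; the helper name `build_accept_header`, the quality weights and the choice of `deflate` as the less-preferred encoding are assumptions made for illustration and do not come from the Memento team.

```python
def build_accept_header(preferences):
    """Join (value, q) pairs into an HTTP preference list, highest weight first."""
    ordered = sorted(preferences, key=lambda item: item[1], reverse=True)
    parts = []
    for value, q in ordered:
        # A weight of 1.0 is the default, so it is conventionally left out.
        parts.append(value if q == 1.0 else f"{value};q={q:g}")
    return ", ".join(parts)


# Hypothetical browser settings: French over English, HTML over PDF,
# gzip over deflate.
headers = {
    "Accept-Language": build_accept_header([("fr", 1.0), ("en", 0.5)]),
    "Accept": build_accept_header([("text/html", 1.0), ("application/pdf", 0.8)]),
    "Accept-Encoding": build_accept_header([("gzip", 1.0), ("deflate", 0.5)]),
}

for name, value in headers.items():
    print(f"{name}: {value}")
```

A server that supports content negotiation reads these weights and returns the variant of the page that best matches them.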

Nelson, Van de Sompel and their colleagues have designed Memento to use a new dimension of this page request function that negotiates for a specific date and time.
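As a rough illustration, a date-and-time preference can be expressed as one more request header. The sketch below uses the header name `Accept-Datetime`, which is the name given in the later Memento specification (RFC 7089) and is not stated in this article; the function name `datetime_negotiation_headers` is hypothetical.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def datetime_negotiation_headers(when):
    """Build a header asking for a page as it existed at `when`.

    `when` must be a timezone-aware datetime; it is converted to UTC
    because HTTP dates are expressed in GMT.
    """
    as_utc = when.astimezone(timezone.utc)
    # usegmt=True renders the HTTP-style date ending in "GMT".
    return {"Accept-Datetime": format_datetime(as_utc, usegmt=True)}


# The date from Nelson's cnn.com example: March 20, 2008.
headers = datetime_negotiation_headers(datetime(2008, 3, 20, tzinfo=timezone.utc))
print(headers["Accept-Datetime"])  # Thu, 20 Mar 2008 00:00:00 GMT
```

Sending this header along with an ordinary request for the original URI is what lets the same address stand for both the current page and its past versions.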

On a server running the Apache Web server software, four lines of extra code will build in the date-and-time negotiation, according to the researchers. On the browser, a drop-down menu will let users enter the date and time for the page they want to view.
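The article does not reproduce the four lines of Apache code, so the sketch below is not that code. It only illustrates one plausible policy an archive could follow once it knows the requested date and time: return the most recent snapshot taken at or before that moment, falling back to the earliest snapshot if the request predates the archive. The function name `pick_snapshot` and the snapshot dates are hypothetical.

```python
from bisect import bisect_right
from datetime import datetime, timezone


def pick_snapshot(snapshot_times, requested):
    """Return the latest snapshot at or before `requested`.

    Falls back to the earliest snapshot when the request is older than
    everything in the archive, and returns None for an empty archive.
    """
    if not snapshot_times:
        return None
    ordered = sorted(snapshot_times)
    # Number of snapshots taken at or before the requested moment.
    index = bisect_right(ordered, requested)
    return ordered[index - 1] if index > 0 else ordered[0]


utc = timezone.utc
# Hypothetical capture dates for one news page in 2005.
archive = [
    datetime(2005, 8, 1, tzinfo=utc),
    datetime(2005, 8, 29, tzinfo=utc),
    datetime(2005, 9, 15, tzinfo=utc),
]
chosen = pick_snapshot(archive, datetime(2005, 9, 1, tzinfo=utc))
print(chosen.date())  # 2005-08-29
```

Other policies, such as choosing the snapshot nearest in time in either direction, are equally possible; which one an archive uses determines which version a user sees when no capture exists for the exact moment requested.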

"We wanted to implement a solution that was appropriately placed at the protocol level, so it would be applicable to all Web content," said Nelson. "We don't want to promote ad hoc solutions that differ from site to site."

A Memento research paper the team has written notes that most archival copies have URIs that are not protocol-connected to the URI of the original resource. "This turns accessing archived resources into a significant discovery challenge for both human and software agents, which typically involves following a multitude of links from the original to the archival resource, or of searching archives for the original URI," the authors state.

"This paper proposes the protocol-based Memento solution to address this problem, and describes a proof-of-concept experiment that includes major servers of archival content, including Wikipedia and the Internet Archive. The Memento solution is based on existing HTTP capabilities applied in a novel way to add the temporal dimension. The result is a framework in which archived resources can seamlessly be reached via the URI of their original."

A recent blog post about how to get and use the Firefox add-on called MementoFox is at:

http://ws-dl.blogspot.com/2010/03/2010-03-19-mementofox-add-on-released.html

At this link is a 10-minute podcast about the project:

http://www.educause.edu/blog/gbayne/PodcastMementoProtocolBasedTim/196509

Nelson is an associate professor of computer science at ODU who won a $500,000 National Science Foundation Young Career Development Award in 2007 to pursue his innovative ideas about digital preservation. He said the new grant will allow the researchers to iron out several Memento bugs, such as time-travel requests that return a previous version of a page but not the version that was sought.

This article was posted on: May 10, 2010

Old Dominion University
Office of University Relations

Room 100 Koch Hall Norfolk, Virginia 23529-0018
Telephone: 757-683-3114
http://www.odu.edu/news