All of the following can be found in the
Agents section of the online proceedings.
Existing agents on the World Wide Web aren't really agents: they're
clever little programs sitting on your local machine that talk to
servers, rather than clever little programs that cross the Web. Thus,
you have things like ``spiders'' that crawl across the net, often in a
depth-first manner, getting pages and following the links therein.
There's also the ``fish,'' a sort of lookahead cache: when you arrive
at a page, it immediately requests that all referenced pages (to a
specified depth, usually two or three) also be sent along. Of course,
this tends to waste network bandwidth even more than Mosaic does
normally.
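To make the fetch-and-follow behavior concrete, here is a minimal
sketch of the core that both a spider and a fish-style prefetcher
share: fetch a page, pull out its links, and repeat to a fixed depth.
The depth limit, the link extraction, and the function names are
illustrative assumptions, not any particular agent's implementation.

    # A sketch of the fetch-and-follow pattern shared by "spiders" and
    # the "fish" lookahead cache.  Depth limit and link extraction are
    # illustrative assumptions, not anyone's actual implementation.
    from html.parser import HTMLParser
    from urllib.parse import urljoin
    from urllib.request import urlopen


    class LinkCollector(HTMLParser):
        """Collects href attributes from <a> tags."""
        def __init__(self):
            super().__init__()
            self.links = []

        def handle_starttag(self, tag, attrs):
            if tag == "a":
                for name, value in attrs:
                    if name == "href" and value:
                        self.links.append(value)


    def crawl(url, depth=2, seen=None):
        """Depth-first fetch of a page and the pages it references."""
        seen = set() if seen is None else seen
        if depth < 0 or url in seen:
            return
        seen.add(url)
        try:
            page = urlopen(url, timeout=10).read().decode("utf-8", "replace")
        except OSError:
            return          # unreachable page: skip it
        print("fetched", url)
        parser = LinkCollector()
        parser.feed(page)
        for link in parser.links:
            crawl(urljoin(url, link), depth - 1, seen)


    # crawl("http://example.com/", depth=2)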
Dave Eichmann discussed a number of these issues in his
ethical web agents talk. He also spoke of ``gravitational
wells'' in the Web, usually caused by intelligent servers that make up
pages on the fly, often strongly interconnected. He cited an example
of a clever thesaurus server: you enter a word, it gives you a number
of antonyms, all linked to their antonyms, etc. An unsuspecting
spider (such as one written by Mr. Eichmann) could get caught
in such an environment, eventually traversing every word in the
thesaurus.
His paper contains a number of interesting graphs obtained from
``spider'' searches. One shows how many documents he found as a
function of the number of inbound links (one inbound link was the
average). Another shows how many documents he found as a function of
the number of URLs they contained. Most fell in the under-25 range,
but there were a number of terrifying outliers.
His suggestion is a set of conventions for search engines: an
automated searcher could ask the server what it should or should not
look at, warning it away from gravitational wells, for example.
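As a rough illustration of what such a convention could look like
from the searcher's side, here is a sketch that consults a
server-supplied exclusion file before fetching. It uses the later
robots-exclusion (robots.txt) convention as an analogue; the file
name, the MySpider user-agent string, and the example URLs are
assumptions, not Eichmann's actual proposal.

    # A sketch of a polite searcher consulting a server-supplied
    # exclusion convention before fetching.  The robots.txt file and
    # its semantics are the later robots-exclusion convention, used
    # here only as an analogue to the kind of scheme proposed.
    from urllib.robotparser import RobotFileParser

    rp = RobotFileParser("http://example.com/robots.txt")
    rp.read()

    if rp.can_fetch("MySpider/0.1", "http://example.com/thesaurus/"):
        print("server says this area is fair game")
    else:
        print("server warns the spider away; skip it")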
Scott Spetka presented a
modified version of TkWWW that allows easy writing of local agents
(e.g., one of Eichmann's spiders). TkWWW is a GNU-produced WWW
browser that uses the Tcl/Tk packages.
Steven Whitehead presented a curious experiment in artificial
intelligence that he calls
auto-FAQ. His basic idea is this: use computers to automate the
maintenance of FAQ lists, and take advantage of the enormous number
of users of such lists.
His approach uses ``Eliza-level'' artificial intelligence to search a
large FAQ database. He separates a natural-language query into
keywords and searches through the stored questions using those
keywords. The hope is that the database is large enough for such a
simple-minded scheme to work, and the plan is to employ the users to
expand the database.
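A minimal sketch of what ``Eliza-level'' keyword matching over a FAQ
list might look like appears below; the stopword list, the scoring
rule, and the sample entries are illustrative assumptions, not the
auto-FAQ implementation.

    # A sketch of "Eliza-level" keyword matching over a FAQ list.
    # Stopwords, scoring, and sample entries are illustrative
    # assumptions, not the auto-FAQ implementation.
    STOPWORDS = {"how", "do", "i", "a", "the", "what", "is", "to", "my"}

    faq = [
        ("How do I set my default home page?", "Edit the homeURL resource..."),
        ("How do I report a broken link?", "Mail the page's maintainer..."),
    ]

    def keywords(text):
        """Split text into lowercase words, dropping stopwords."""
        return {w.strip("?.,").lower() for w in text.split()} - STOPWORDS

    def best_match(query):
        """Return the stored question/answer sharing the most keywords."""
        q = keywords(query)
        return max(faq, key=lambda entry: len(q & keywords(entry[0])))

    question, answer = best_match("what is the default home page")
    print(question, "->", answer)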
This seems like it could be either very clever or very unsuccessful.
I'm not sure what mainstream AI researchers would think, but it does
seem like a curious proposition.
Yechezkal Gutfreund presented his
WWWinda system for building distributed browsers based on TclDP.
It seems like a clean, simple little system.
Karen Oostendorp presented a program called
paint that is essentially a hierarchical hotlist
maintainer. It looks a bit like something that could be put together
in a day or two using Tcl/Tk.
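To give a sense of how little machinery a hierarchical hotlist really
needs, here is a sketch of the underlying data structure: folders
that hold bookmarks and other folders. The class names, layout, and
example URLs are illustrative assumptions, not how paint is actually
built.

    # A sketch of a hierarchical hotlist: folders holding bookmarks
    # and sub-folders.  Names, layout, and URLs are illustrative
    # assumptions, not how paint is actually built.
    from dataclasses import dataclass, field

    @dataclass
    class Bookmark:
        title: str
        url: str

    @dataclass
    class Folder:
        name: str
        entries: list = field(default_factory=list)  # Bookmarks and Folders

        def show(self, indent=0):
            """Print the folder tree with simple indentation."""
            print(" " * indent + self.name + "/")
            for entry in self.entries:
                if isinstance(entry, Folder):
                    entry.show(indent + 2)
                else:
                    print(" " * (indent + 2) + f"{entry.title} <{entry.url}>")

    root = Folder("hotlist", [
        Folder("conferences", [Bookmark("WWW proceedings",
                                        "http://www.example.org/www/")]),
        Bookmark("TkWWW", "http://www.example.org/tkwww/"),
    ])
    root.show()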