Robot of the Week (6)
The Web Bot:
“#!/usr/bin/perl # -w
# use diagnostics;
use LWP::RobotUA;
use HTML::Parser;
use URI::URL;
use POSIX;
use DB_File;
my $url;
my $arg = (shift @ARGV);
my $domain_name = "http://".$arg."/";
my @get_list = $domain_name;
local (%main,%localise);
local $counter = 0; # local files
my $maxcount = 100;
my $dirname = $arg;”
Written as strings of command-based algorithms, WebBots are busily at work within the computer networks we access daily: querying specific web sites, monitoring and reporting their findings, and communicating with other WebBots deemed sufficiently “trustworthy” in order to gain as much reliable information as possible. Their actions form a complex web of choreography, protocol, and code, a dance of data that is all the more impressive for its apparent lack of physicality.
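To make the "querying" part of that dance a little more concrete, here is a minimal sketch of the loop such a bot might run: keep a list of pages still to visit, read each one, collect its links, and stop after a fixed number of pages. It borrows the names @get_list and $maxcount from the excerpt above, but reading them as a work queue and a page limit is an assumption, as is everything else here. The %fake_web hash and the crawl function are hypothetical stand-ins, not part of the bot by “deep”; a real bot would fetch pages over the network (the excerpt loads LWP::RobotUA) rather than look them up in a hash.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for the network: page path => HTML body.
# A real bot would fetch these pages over HTTP instead.
my %fake_web = (
    '/'  => '<a href="/a">A</a> <a href="/b">B</a>',
    '/a' => '<a href="/b">B</a> <a href="/c">C</a>',
    '/b' => '<a href="/">home</a>',
    '/c' => 'no links here',
);

# Visit pages breadth-first from $start and stop once $maxcount
# pages have been read. Returns the pages in visiting order.
sub crawl {
    my ($start, $maxcount, $web) = @_;
    my @get_list = ($start);        # pages still to visit
    my %seen     = ($start => 1);   # pages already queued
    my @visited;

    while (@get_list && @visited < $maxcount) {
        my $page = shift @get_list;
        my $html = $web->{$page};
        next unless defined $html;  # skip pages we cannot "fetch"
        push @visited, $page;

        # Naive link extraction with a regex; good enough for a sketch,
        # but a real bot should use a proper HTML parser.
        while ($html =~ /href="([^"]+)"/g) {
            my $link = $1;
            next if $seen{$link}++; # queue each page only once
            push @get_list, $link;
        }
    }
    return @visited;
}

print join("\n", crawl('/', 100, \%fake_web)), "\n";
```

The %seen hash is what keeps the bot from circling forever between pages that link to each other, and the page limit keeps it from wandering indefinitely across a large site.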
If the bodiless, text-based robots of today, such as those catalogued on the WebCrawler search engine (http://www.robotstxt.org/wc/active/html/), are any indication, the conception of the robot as a simulation of physical functioning has traveled an astonishingly long distance since the time of mechanical ducks, clockwork-powered paddling swans, clumsy box-like men of metal, and other such automata of an earlier age. Rather than existing as the embodied mimesis of any physical system, the WebBots listed on WebCrawler, such as the ArchitextSpider on the Excite search engine and the ChristCrawler on ChristCENTRAL.com, are text-based command lines, designed to track and index information on the World Wide Web.
You can read more about web bots here:
http://www.usatoday.com/tech/news/2002-12-24-web-bots_x.htm
Here is the entire Perl code for this retrieval bot by “deep”:
http://www.woodmann.com/fravia/rt_bot2.htm
About this entry
You’re currently reading “Robot of the Week (6),” an entry on C-LIT 146: Class Forum
- Published: September 11, 2006 / 6:14 pm
- Category: Robot of the Week