Part A

Zen and the Art of the Internet

Copyright (c) 1992 Brendan P. Kehoe

Permission is granted to make and distribute verbatim copies of this guide provided the copyright notice and this permission notice are preserved on all copies.

Permission is granted to copy and distribute modified versions of this booklet under the conditions for verbatim copying, provided that the entire resulting derived work is distributed under the terms of a permission notice identical to this one.

Permission is granted to copy and distribute translations of this booklet into another language, under the above conditions for modified versions, except that this permission notice may be stated in a translation approved by the author.

Zen and the Art of the Internet
A Beginner's Guide to the Internet
First Edition
January 1992

by Brendan P. Kehoe

This is revision 1.0 of February 2, 1992.

The composition of this booklet was originally started because the
Computer Science department at Widener University was in desperate
need of documentation describing the capabilities of this "great new
Internet link" we obtained.

It's since grown into an effort to acquaint the reader with much of what's currently available over the Internet. Aimed at the novice user, it attempts to remain operating system "neutral"--little information herein is specific to Unix, VMS, or any other environment. This booklet will, hopefully, be usable by nearly anyone.

A user's session is usually offset from the rest of the paragraph, as such:

prompt> command
The results are usually displayed here.

The purpose of this booklet is two-fold: first, it's intended to serve as a reference piece, which someone can easily grab on the fly to look something up. Second, it forms a foundation from which people can explore the vast expanse of the Internet. Zen and the Art of the Internet doesn't spend a significant amount of time on any one point; rather, it provides enough for people to learn the specifics of what their local system offers.

One warning is perhaps in order--this territory we are entering can become a fantastic time-sink. Hours can slip by, people can come and go, and you'll be locked into Cyberspace. Remember to do your work!

With that, I welcome you, the new user, to The Net.

brendan@cs.widener.edu Chester, PA

Acknowledgements

Certain sections in this booklet are not my original work--rather, they are derived from documents that were available on the Internet and already aptly stated their areas of concentration. The chapter on Usenet is, in large part, made up of what's posted monthly to news.announce.newusers, with some editing and rewriting. Also, the main section on archie was derived from whatis.archie by Peter Deutsch of the McGill University Computing Centre. It's available via anonymous FTP from archie.mcgill.ca. Much of what's in the telnet section came from an impressive introductory document put together by SuraNet. Some of the definitions in this booklet are from an excellent glossary put together by Colorado State University.

This guide would not be the same without the aid of many people on The Net, and the providers of resources that are already out there. I'd like to thank the folks who gave this a read-through and returned some excellent comments, suggestions, and criticisms, and those who provided much-needed information on the fly. Glee Willis deserves particular mention for all of his work; this guide would have been considerably less polished without his help.

Andy Blankenbiller, Army at Aberdeen

bajan@cs.mcgill.ca Alan Emtage, McGill University Computer Science Department

Brian Fitzgerald, Rensselaer Polytechnic Institute

John Goetsch, Rhodes University, South Africa

composer@chem.bu.edu Jeff Kellem, Boston University's Chemistry Department

kraussW@moravian.edu Bill Krauss, Moravian College

Steve Lodin, Delco Electronics

Mike Nesel, NASA

Bob Neveln, Widener University Computer Science Department

wamapi@dunkin.cc.mcgill.ca Wanda Pierce, McGill University Computing Centre

Joshua.R.Poulson@cyber.widener.edu Joshua Poulson, Widener University Computing Services

de5@ornl.gov Dave Sill, Oak Ridge National Laboratory

bsmart@bsmart.tti.com Bob Smart, CitiCorp/TTI

emv@msen.com Ed Vielmetti, Vice President of MSEN

Craig E. Ward, USC/Information Sciences Institute (ISI)

Glee Willis, University of Nevada, Reno

Charles (Chip) Yamasaki, OSHA

Network Basics

We are truly in an information society. Now more than ever, moving vast amounts of information quickly across great distances is one of our most pressing needs. From small one-person entrepreneurial efforts, to the largest of corporations, more and more professional people are discovering that the only way to be successful in the '90s and beyond is to realize that technology is advancing at a break-neck pace--and they must somehow keep up. Likewise, researchers from all corners of the earth are finding that their work thrives in a networked environment. Immediate access to the work of colleagues and a "virtual" library of millions of volumes and thousands of papers affords them the ability to incorporate a body of knowledge heretofore unthinkable. Work groups can now conduct interactive conferences with each other, paying no heed to physical location--the possibilities are endless.

You have at your fingertips the ability to talk in "real-time" with someone in Japan, send a 2,000-word short story to a group of people who will critique it for the sheer pleasure of doing so, see if a Macintosh sitting in a lab in Canada is turned on, and find out if someone happens to be sitting in front of their computer (logged on) in Australia, all inside of thirty minutes. No airline (or tardis, for that matter) could ever match that travel itinerary.

The largest problem people face when first using a network is grasping all that's available. Even seasoned users find themselves surprised when they discover a new service or feature that they'd never known even existed. Once you are acquainted with the terminology and sufficiently comfortable with making occasional mistakes, the learning process will drastically speed up.

Domains

Getting where you want to go can often be one of the more difficult aspects of using networks. The variety of ways that places are named will probably leave a blank stare on your face at first. Don't fret; there is a method to this apparent madness.

If someone were to ask for a home address, they would probably expect a street, apartment, city, state, and zip code. That's all the information the post office needs to deliver mail in a reasonably speedy fashion. Likewise, computer addresses have a structure to them. The general form is:

a person's email address on a computer: user@somewhere.domain
a computer's name: somewhere.domain

The user portion is usually the person's account name on the system, though it doesn't have to be. somewhere.domain tells you the name of a system or location, and what kind of organization it is. The trailing domain is often one of the following:

com Usually a company or other commercial institution or organization, like Convex Computers (convex.com).

edu An educational institution, e.g. New York University, named nyu.edu.

gov A government site; for example, NASA is nasa.gov.

mil A military site, like the Air Force (af.mil).

net Gateways and other administrative hosts for a network (it does not mean all of the hosts in a network). {The Matrix, 111.} One such gateway is near.net.

org A domain reserved for private organizations that don't comfortably fit in the other classes of domains. One example is the Electronic Frontier Foundation, named eff.org.

Each country also has its own top-level domain. For example, the us domain includes each of the fifty states. Other countries represented with domains include:

au Australia

ca Canada

fr France

uk The United Kingdom. These also have sub-domains of things like ac.uk for academic sites and co.uk for commercial ones.
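The address structure described above can be pulled apart mechanically. What follows is only a rough sketch in Python: the function names and the small lookup table are made up for illustration, and the table holds just the domains listed in this section rather than any complete registry.

```python
# Top-level domains mentioned in this section; not a complete list.
TOP_LEVEL_DOMAINS = {
    "com": "commercial organization",
    "edu": "educational institution",
    "gov": "government site",
    "mil": "military site",
    "net": "network gateway or administrative host",
    "org": "private organization",
    "au": "Australia",
    "ca": "Canada",
    "fr": "France",
    "uk": "United Kingdom",
    "us": "United States",
}


def split_address(address):
    """Split user@somewhere.domain into (user, host, top-level domain)."""
    user, at_sign, host = address.partition("@")
    if not at_sign or not user or not host:
        raise ValueError("expected something of the form user@somewhere.domain")
    # The trailing domain is whatever follows the last period in the host name.
    top_level = host.rsplit(".", 1)[-1].lower()
    return user, host, top_level


def describe(address):
    """Return a one-line, human-readable description of an address."""
    user, host, top_level = split_address(address)
    kind = TOP_LEVEL_DOMAINS.get(top_level, "unknown kind of site")
    return f"{user} at {host} ({kind})"


if __name__ == "__main__":
    print(describe("brendan@cs.widener.edu"))
```

Only the last piece of the host name is examined here; sub-domains such as ac.uk would need an extra step.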

FQDN (Fully Qualified Domain Name)

The proper terminology for a site's domain name (somewhere.domain above) is its Fully Qualified Domain Name (FQDN). It is usually selected to give a clear indication of the site's organization or sponsoring agent. For example, the Massachusetts Institute of Technology's FQDN is mit.edu; similarly, Apple Computer's domain name is apple.com. While such obvious names are usually the norm, there are the occasional exceptions that are ambiguous enough to mislead--like vt.edu, which on first impulse one might surmise is an educational institution of some sort in Vermont; not so. It's actually the domain name for Virginia Tech. In most cases it's relatively easy to glean the meaning of a domain name--such confusion is far from the norm.

Internet Numbers

Every single machine on the Internet has a unique address, {At least one address, possibly two or even three--but we won't go into that.} called its Internet number or IP Address. It's actually a 32-bit number, but is most commonly represented as four numbers joined by periods (.), like 147.31.254.130. This is sometimes also called a dotted quad; a 32-bit number allows for roughly 4.3 billion different dotted quads. The ARPAnet (the mother to today's Internet) originally only had the capacity to have up to 256 systems on it because of the way each system was addressed. In the early eighties, it became clear that things would fast outgrow such a small limit; the 32-bit addressing method was born, freeing up a vastly larger pool of host numbers.
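To see how a dotted quad and a 32-bit number are the same thing written two ways, here is a small Python sketch. The function names are invented for this example; the only address used is the one quoted above.

```python
def dotted_quad_to_int(address):
    """Pack a dotted quad such as 147.31.254.130 into one 32-bit number."""
    octets = address.split(".")
    if len(octets) != 4:
        raise ValueError("a dotted quad has exactly four parts")
    number = 0
    for octet in octets:
        value = int(octet)
        if not 0 <= value <= 255:
            raise ValueError("each part must fit in eight bits (0-255)")
        # Shift what we have so far left by eight bits and add the new octet.
        number = (number << 8) | value
    return number


def int_to_dotted_quad(number):
    """Unpack a 32-bit number back into its dotted quad form."""
    return ".".join(str((number >> shift) & 0xFF) for shift in (24, 16, 8, 0))


if __name__ == "__main__":
    n = dotted_quad_to_int("147.31.254.130")
    print(n, int_to_dotted_quad(n))
    print("possible dotted quads:", 2 ** 32)
```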

Each piece of an Internet address (like 192) is called an "octet," representing one of four sets of eight bits. The first two or three pieces (e.g. 192.55.239) represent the network that a system is on, called its subnet. For example, all of the computers for Wesleyan University are in the subnet 129.133. They can have numbers like 129.133.10.10, 129.133.230.19, up to 65 thousand possible combinations (possible computers).
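The "65 thousand" figure can be checked with Python's standard ipaddress module. This is just a sketch: the /16 notation (meaning the first sixteen bits, or two octets, name the network) is a later convention that this booklet doesn't use, and the helper name on_network is made up.

```python
import ipaddress

# The 129.133 network from the text: the first two octets are fixed,
# leaving two octets (16 bits) free for individual machines.
wesleyan = ipaddress.ip_network("129.133.0.0/16")


def on_network(address, network=wesleyan):
    """Report whether a dotted quad falls inside the given network."""
    return ipaddress.ip_address(address) in network


if __name__ == "__main__":
    print(wesleyan.num_addresses)          # 2 ** 16 combinations
    print(on_network("129.133.230.19"))
    print(on_network("147.31.254.130"))
```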

IP addresses and domain names aren't assigned arbitrarily--that would lead to unbelievable confusion. An application must be filed with the Network Information Center (NIC), either electronically (to hostmaster@nic.ddn.mil) or via regular mail.

Resolving Names and Numbers

Ok, computers can be referred to by either their FQDN or their
Internet address. How can one user be expected to remember them all?

They aren't. The Internet is designed so that one can use either method. Since humans find it much more natural to deal with words than numbers in most cases, the FQDN for each host is mapped to its Internet number. Each domain is served by a computer within that domain, which provides all of the necessary information to go from a domain name to an IP address, and vice-versa. For example, when someone refers to foosun.bar.com, the resolver knows that it should ask the system foovax.bar.com about systems in bar.com. It asks what Internet address foosun.bar.com has; if the name foosun.bar.com really exists, foovax will send back its number. All of this "magic" happens behind the scenes.
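The exchange between the resolver and foovax can be imitated with a toy lookup table. This is a simplified sketch in Python, not how a real name server is built; the table, the function names, and the 192.0.2.x numbers (a range set aside for examples) are all invented for illustration.

```python
# Hypothetical data: which system answers for a domain, and what it knows.
ZONES = {
    "bar.com": {
        "server": "foovax.bar.com",
        "hosts": {
            "foovax.bar.com": "192.0.2.1",
            "foosun.bar.com": "192.0.2.10",
        },
    },
}


def resolve(name):
    """Go from a domain name to an Internet number, as the resolver would."""
    _host, _dot, domain = name.partition(".")
    zone = ZONES.get(domain)
    if zone is None:
        raise LookupError(f"nobody is known to answer for {domain!r}")
    # "Ask" the system that serves this domain about the name.
    number = zone["hosts"].get(name)
    if number is None:
        raise LookupError(f"{zone['server']} has never heard of {name}")
    return number


def reverse(number):
    """Go the other way, from an Internet number back to a name."""
    for zone in ZONES.values():
        for name, known_number in zone["hosts"].items():
            if known_number == number:
                return name
    raise LookupError(f"no name is on file for {number}")


if __name__ == "__main__":
    print(resolve("foosun.bar.com"))
    print(reverse("192.0.2.10"))
```

On a machine that's actually connected, Python's socket.gethostbyname will perform a real lookup of this kind.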

Rarely will a user have to remember the Internet number of a site (although often you'll catch yourself remembering an apparently obscure number, simply because you've accessed the system frequently). However, you will remember a substantial number of FQDNs. It will eventually reach a point when you are able to make a reasonably accurate guess at what domain name a certain college, university, or company might have, given just their name.

The Networks

Internet The Internet is a large "network of networks." There is no one network known as The Internet; rather, regional nets like SuraNet, PrepNet, NearNet, et al., are all inter-connected (nay, "inter-networked") together into one great living thing, communicating at amazing speeds with the TCP/IP protocol. All activity takes place in "real-time."

UUCP The UUCP network is a loose association of systems all communicating with the UUCP protocol. (UUCP stands for `Unix-to-Unix Copy Program'.) It's based on two systems connecting to each other at specified intervals, called polling, and executing any work scheduled for either of them. Historically most UUCP was done with Unix equipment, although the software's since been implemented on other platforms (e.g. VMS). For example, the system oregano polls the system basil once every two hours. If there's any mail waiting for oregano, basil will send it at that time; likewise, oregano will at that time send any jobs waiting for basil.

BITNET BITNET (the "Because It's Time Network") is comprised of systems connected by point-to-point links, all running the NJE protocol. It's continued to grow, but has found itself suffering at the hands of the falling costs of Internet connections. Also, a number of mail gateways are in place to reach users on other networks.

The Physical Connection

The actual connections between the various networks take a variety of forms. The most prevalent for Internet links are 56k leased lines (dedicated telephone lines carrying 56-kilobit-per-second connections) and T1 links (special phone lines with 1.544Mbps connections). Also installed are T3 links, acting as backbones between major locations to carry a massive 45Mbps load of traffic.

These links are paid for by each institution to a local carrier (for
example, Bell Atlantic owns PrepNet, the main provider in
Pennsylvania). Also available are SLIP connections, which carry
Internet traffic (packets) over high-speed modems.

UUCP links are made with modems (for the most part), that run from 1200 baud all the way up to as high as 38.4Kbps. As was mentioned in The Networks, the connections are of the store-and-forward variety. Also in use are Internet-based UUCP links (as if things weren't already confusing enough!). The systems do their UUCP traffic over TCP/IP connections, which give the UUCP-based network some blindingly fast "hops," resulting in better connectivity for the network as a whole. UUCP connections first became popular in the 1970's, and have remained in wide-spread use ever since. Only with UUCP can Joe Smith correspond with someone across the country or around the world, for the price of a local telephone call.

BITNET links mostly take the form of 9600bps modems connected from site to site. Often places have three or more links going; the majority, however, look to "upstream" sites for their sole link to the network.

"The Glory and the Nothing of a Name"
Byron, {Churchill's Grave}

—————-
Electronic Mail

The desire to communicate is the essence of networking. People have always wanted to correspond with each other in the fastest way possible, short of normal conversation. Electronic mail (or email) is the most prevalent application of this in computer networking. It allows people to write back and forth without having to spend much time worrying about how the message actually gets delivered. As technology grows closer and closer to being a common part of daily life (as has been evidenced by the ISDN effort), the need to understand the many ways it can be utilized and how it works, at least to some level, is vital.

Email Addresses

Electronic mail is hinged around the concept of an address; the section on Network Basics made some reference to it while introducing domains. Your email address provides all of the information required to get a message to you from anywhere in the world. An address doesn't necessarily have to go to a human being. It could be an archive server, {See Archive Servers, for a description.} a list of people, or even someone's pocket pager. These cases are the exception to the norm--mail to most addresses is read by human beings.

%@!.: Symbolic Cacophony

Email addresses usually appear in one of two forms--using the Internet format which contains @, an "at"-sign, or using the UUCP format which contains !, an exclamation point, also called a "bang." The latter of the two, UUCP "bang" paths, is more restrictive, yet more clearly dictates how the mail will travel.

To reach Jim Morrison on the system south.america.org, one would address the mail as jm@south.america.org. But if Jim's account was on a UUCP site named brazil, then his address would be brazil!jm. If it's possible (and one exists), try to use the Internet form of an address; bang paths can fail if an intermediate site in the path happens to be down. There is a growing trend for UUCP sites to register Internet domain names, to help alleviate the problem of path failures.

Another symbol that enters the fray is %--it acts as an extra "routing" method. For example, if the UUCP site dream is connected to south.america.org, but doesn't have an Internet domain name of its own, a user debbie on dream can be reached by writing to the address

debbie%dream@south.america.org

The form is significant. This address says that the local system should first send the mail to south.america.org. There the address debbie%dream will turn into debbie@dream, which will hopefully be a valid address. Then south.america.org will handle getting the mail to the host dream, where it will be delivered locally to debbie.
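The rewriting step just described can be sketched in a few lines of Python. The function name next_hop is made up, and real mailers apply many more rules than this; the sketch only shows the one transformation covered here.

```python
def next_hop(address):
    """Return (system to send the mail to, address to hand that system).

    Everything after the last @ names the next system. Once the mail is
    there, the last % in what remains (if any) turns into an @.
    """
    local_part, at_sign, host = address.rpartition("@")
    if not at_sign or not local_part or not host:
        raise ValueError("expected an address containing @")
    if "%" in local_part:
        user, _percent, inner_host = local_part.rpartition("%")
        forwarded = f"{user}@{inner_host}"
    else:
        # No further routing: the mail is delivered locally to this user.
        forwarded = local_part
    return host, forwarded


if __name__ == "__main__":
    print(next_hop("debbie%dream@south.america.org"))
    print(next_hop("debbie@dream"))
```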

All of the intricacies of email addressing methods are fully covered
in the book "!%@:: A Directory of Electronic Mail Addressing and
Networks" published by O'Reilly and Associates, as part of their
Nutshell Handbook series. It is a must for any active email user.
Write to nuts@ora.com for ordering information.

Sending and Receiving Mail

We'll make one quick diversion from being OS-neutral here, to show you what it will look like to send and receive a mail message on a Unix system. Check with your system administrator for specific instructions related to mail at your site.

A person sending the author mail would probably do something like this:

% mail brendan@cs.widener.edu
Subject: print job's stuck

I typed `print babe.gif' and it didn't work! Why??

The next time the author checked his mail, he would see it listed in his mailbox as:

% mail
"/usr/spool/mail/brendan": 1 messages 1 new 1 unread
 U  1 joeuser@foo.widene Tue May 5 20:36 29/956 print job's stuck
?

which gives information on the sender of the email, when it was sent, and the subject of the message. He would probably use the reply command of Unix mail to send this response:

? r
To: joeuser@foo.widener.edu
Subject: Re: print job's stuck

You shouldn't print binary files like GIFs to a printer!

Brendan

Try sending yourself mail a few times, to get used to your system's mailer. It'll save a lot of wasted aspirin for both you and your system administrator.

Anatomy of a Mail Header

An electronic mail message has a specific structure to it that's common across every type of computer system. {The standard is written down in RFC-822. See also RFCs for more info on how to get copies of the various RFCs.} A sample would be:

From bush@hq.mil Sat May 25 17:06:01 1991
Received: from hq.mil by house.gov with SMTP id AA21901
(4.1/SMI for dan@house.gov); Sat, 25 May 91 17:05:56 -0400
Date: Sat, 25 May 91 17:05:56 -0400
From: The President <bush@hq.mil>
Message-Id: <9105252105.AA06631@hq.mil>
To: dan@senate.gov
Subject: Meeting

Hi Dan .. we have a meeting at 9:30 a.m. with the Joint Chiefs. Please don't oversleep this time.

The first line, with From and the two lines for Received: are usually not very interesting. They give the "real" address that the mail is coming from (as opposed to the address you should reply to, which may look much different), and what places the mail went through to get to you. Over the Internet, there is always at least one Received: header and usually no more than four or five. When a message is sent using UUCP, one Received: header is added for each system that the mail passes through. This can often result in more than a dozen Received: headers. While they help with dissecting problems in mail delivery, odds are the average user will never want to see them. Most mail programs will filter out this kind of "cruft" in a header.

The Date: header contains the date and time the message was sent. Likewise, the "good" address (as opposed to "real" address) is laid out in the From: header. Sometimes it won't include the full name of the person (in this case The President), and may look different, but it should always contain an email address of some form.

The Message-ID: of a message is intended mainly for tracing mail routing, and is rarely of interest to normal users. Every Message-ID: is guaranteed to be unique.

To: lists the email address (or addresses) of the recipients of the message. There may be a Cc: header, listing additional addresses. Finally, a brief subject for the message goes in the Subject: header.

The exact order of a message's headers may vary from system to system, but it will always include these fundamental headers that are vital to proper delivery.
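Because the header layout is standardized, a program can pick a message apart without any guesswork. Below is a sketch using Python's standard email package on the sample message above. The address inside the From: line is assumed from the envelope line of the sample, and the function name summarize is made up for this example.

```python
from email import message_from_string
from email.utils import parseaddr

# The sample message, minus its first envelope line. A header that
# continues onto a second line must start that line with whitespace.
SAMPLE = """\
Received: from hq.mil by house.gov with SMTP id AA21901
    (4.1/SMI for dan@house.gov); Sat, 25 May 91 17:05:56 -0400
Date: Sat, 25 May 91 17:05:56 -0400
From: The President <bush@hq.mil>
Message-Id: <9105252105.AA06631@hq.mil>
To: dan@senate.gov
Subject: Meeting

Hi Dan .. we have a meeting at 9:30 a.m. with the Joint Chiefs.
"""


def summarize(text):
    """Pull out the headers a reader usually cares about."""
    message = message_from_string(text)
    full_name, address = parseaddr(message["From"])
    return {
        "from_name": full_name,
        "from_address": address,
        "to": message["To"],
        "subject": message["Subject"],
        # One Received: header is added per system the mail passed through.
        "hops": len(message.get_all("Received", [])),
    }


if __name__ == "__main__":
    for field, value in summarize(SAMPLE).items():
        print(f"{field}: {value}")
```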

Bounced Mail

When an email address is incorrect in some way (the system's name is wrong, the domain doesn't exist, whatever), the mail system will bounce the message back to the sender, much the same way that the Postal Service does when you send a letter to a bad street address. The message will include the reason for the bounce; a common error is addressing mail to an account name that doesn't exist. For example, writing to Lisa Simpson at Widener University's Computer Science department will fail, because she doesn't have an account. {Though if she asked, we'd certainly give her one.}

From: Mail Delivery Subsystem
Date: Sat, 25 May 91 16:45:14 -0400
To: mg@gracie.com
Cc: Postmaster@cs.widener.edu
Subject: Returned mail: User unknown

----- Transcript of session follows -----
While talking to cs.widener.edu:
>>> RCPT To:
<<< 550 … User unknown
550 lsimpson… User unknown

As you can see, a carbon copy of the message (the Cc: header entry) was sent to the postmaster of Widener's CS department. The Postmaster is responsible for maintaining a reliable mail system on his system. Usually postmasters at sites will attempt to aid you in getting your mail where it's supposed to go. If a typing error was made, then try re-sending the message. If you're sure that the address is correct, contact the postmaster of the site directly and ask him how to properly address it.

The message also includes the text of the mail, so you don't have to retype everything you wrote.

----- Unsent message follows -----
Received: by cs.widener.edu id AA06528; Sat, 25 May 91 16:45:14 -0400
Date: Sat, 25 May 91 16:45:14 -0400
From: Matt Groening <mg@gracie.com>
Message-Id: <9105252045.AA06528@gracie.com>
To: lsimpson@cs.widener.edu
Subject: Scripting your future episodes
Reply-To: writing-group@gracie.com

…. verbiage …

The full text of the message is returned intact, including any headers that were added. This can be cut out with an editor and fed right back into the mail system with a proper address, making redelivery a relatively painless process.

Mailing Lists

People that share common interests are inclined to discuss their hobby or interest at every available opportunity. One modern way to aid in this exchange of information is by using a mailing list--usually an email address that redistributes all mail sent to it back out to a list of addresses. For example, the Sun Managers mailing list (of interest to people that administer computers manufactured by Sun) has the address sun-managers@eecs.nwu.edu. Any mail sent to that address will "explode" out to each person named in a file maintained on a computer at Northwestern University.

Administrative tasks (sometimes referred to as administrivia) are often handled through other addresses, typically with the suffix -request. To continue the above, a request to be added to or deleted from the Sun Managers list should be sent to sun-managers-request@eecs.nwu.edu.

When in doubt, try to write to the -request version of a mailing list address first; the other people on the list aren't interested in your desire to be added or deleted, and can certainly do nothing to expedite your request. Often if the administrator of a list is busy (remember, this is all peripheral to real jobs and real work), many users find it necessary to ask again and again, often with harsher and harsher language, to be removed from a list. This does nothing more than waste traffic and bother everyone else receiving the messages. If, after a reasonable amount of time, you still haven't succeeded in being removed from a mailing list, write to the postmaster at that site and see if they can help.

Exercise caution when replying to a message sent by a mailing list. If you wish to respond to the author only, make sure that the only address you're replying to is that person, and not the entire list. Often messages of the sort "Yes, I agree with you completely!" will appear on a list, boring the daylights out of the other readers. Likewise, if you explicitly do want to send the message to the whole list, you'll save yourself some time by checking to make sure it's indeed headed to the whole list and not a single person.

A list of the currently available mailing lists is available in at least two places; the first is in a file on ftp.nisc.sri.com called interest-groups under the netinfo/ directory. It's updated fairly regularly, but is large (presently around 700K), so only get it every once in a while. The other list is maintained by Gene Spafford (spaf@cs.purdue.edu), and is posted in parts to the newsgroup news.lists semi-regularly. (See Usenet News for info on how to read that and other newsgroups.)

Listservs

On BITNET there's an automated system for maintaining discussion lists called the listserv. Rather than have an already harried and overworked human take care of additions and removals from a list, a program performs these and other tasks by responding to a set of user-driven commands.

Areas of interest are wide and varied--ETHICS-L deals with ethics in computing, while ADND-L has to do with a role-playing game. A full list of the available BITNET lists can be obtained by writing to LISTSERV@BITNIC.BITNET with a body containing the command

list global

However, be sparing in your use of this--see if it's already on your system somewhere. The reply is quite large.

The most fundamental command is subscribe. It will tell the listserv to add the sender to a specific list. The usage is

subscribe foo-l Your Real Name

It will respond with a message either saying that you've been added to the list, or that the request has been passed on to the system on which the list is actually maintained.

The mate to subscribe is, naturally, unsubscribe. It will remove a given address from a BITNET list. It, along with all other listserv commands, can be abbreviated--subscribe as sub, unsubscribe as unsub, etc. For a full list of the available listserv commands, write to LISTSERV@BITNIC.BITNET, giving it the command help.
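Since a listserv is driven entirely by the text of the mail it receives, a request is easy to put together by program. The sketch below uses Python's standard email package to build (not send) such a message. The function names are invented, and using LISTSERV@BITNIC.BITNET as the default destination is only an assumption for the example; a subscription normally goes to the listserv that handles the list in question.

```python
from email.message import EmailMessage


def listserv_request(sender, command, listserv="LISTSERV@BITNIC.BITNET"):
    """Build a message whose body is a single listserv command."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = listserv
    # The listserv reads its commands from the body of the message.
    message.set_content(command)
    return message


def subscribe(sender, list_name, real_name, listserv="LISTSERV@BITNIC.BITNET"):
    """Build a request of the form: subscribe foo-l Your Real Name"""
    return listserv_request(sender, f"subscribe {list_name} {real_name}", listserv)


if __name__ == "__main__":
    print(subscribe("jm@south.america.org", "foo-l", "Your Real Name"))
```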

As an aside, there have been implementations of the listserv system for non-BITNET hosts (more specifically, Unix systems). One of the most complete is available on cs.bu.edu in the directory pub/listserv.

"I made this letter longer than usual because
I lack the time to make it shorter."
Pascal, Provincial Letters XVI

———————

Anonymous FTP

FTP (File Transfer Protocol) is the primary method of transferring files over the Internet. On many systems, it's also the name of the program that implements the protocol. Given proper permission, it's possible to copy a file from a computer in South Africa to one in Los Angeles at very fast speeds (on the order of 5-10K per second). This normally requires either a user id on both systems or a special configuration set up by the system administrator(s).

There is a good way around this restriction--the anonymous FTP service. It essentially will let anyone in the world have access to a certain area of disk space in a non-threatening way. With this, people can make files publicly available with little hassle. Some systems have dedicated entire disks or even entire computers to maintaining extensive archives of source code and information. They include gatekeeper.dec.com (Digital), wuarchive.wustl.edu (Washington University in Saint Louis), and archive.cis.ohio-state.edu (The Ohio State University).

The process involves the "foreign" user (someone not on the system itself) creating an FTP connection and logging into the system as the user anonymous, with an arbitrary password:

Name (foo.site.com:you): anonymous
Password: jm@south.america.org

Custom and netiquette dictate that people respond to the Password: query with an email address so that the sites can track the level of FTP usage, if they desire. (See Email Addresses for information on email addresses.)

The speed of the transfer depends on the speed of the underlying link. A site that has a 9600bps SLIP connection will not get the same throughput as a system with a 56k leased line (see The Physical Connection for more on what kinds of connections can exist in a network). Also, the traffic of all other users on that link will affect performance. If there are thirty people all FTPing from one site simultaneously, the load on the system (in addition to the network connection) will degrade the overall throughput of the transfer.
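A little arithmetic shows how much the link matters. The Python sketch below gives a best-case estimate only: it ignores protocol overhead and other users unless you lower the made-up efficiency parameter, and it takes "700K" to mean 700 x 1024 bytes.

```python
def transfer_seconds(size_bytes, link_bits_per_second, efficiency=1.0):
    """Best-case time to move a file over a link.

    efficiency is a rough fudge factor between 0 and 1 for overhead and
    competing traffic; 1.0 means the whole link is yours.
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be greater than 0 and at most 1")
    return size_bytes * 8 / (link_bits_per_second * efficiency)


if __name__ == "__main__":
    size = 700 * 1024  # a file of around 700K
    for label, speed in (("9600bps SLIP", 9600), ("56k leased line", 56000)):
        minutes = transfer_seconds(size, speed) / 60
        print(f"{label}: about {minutes:.1f} minutes")
```

Under these assumptions the file takes roughly ten minutes over the SLIP link and under two over the leased line.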

FTP Etiquette

Lest we forget, the Internet is there for people to do work. People using the network and the systems on it are doing so for a purpose, whether it be research, development, whatever. Any heavy activity takes away from the overall performance of the network as a whole.

The effects of an FTP connection on a site and its link can vary; the general rule of thumb is that any extra traffic created detracts from the ability of that site's users to perform their tasks. To help be considerate of this, it's highly recommended that FTP sessions be held only after normal business hours for that site, preferably late at night. The possible effects of a large transfer will be less destructive at 2 a.m. than 2 p.m. Also, remember that if it's past dinner time in Maine, it's still early afternoon in California--think in terms of the current time at the site that's being visited, not of local time.

Basic Commands

While there have been many extensions to the various FTP clients out there, there is a de facto "standard" set that everyone expects to work. For more specific information, read the manual for your specific FTP program. This section will only skim the bare minimum of commands needed to operate an FTP session.

Creating the Connection

The actual command to use FTP will vary among operating systems; for the sake of clarity, we'll use ftp here, since it's the most general form.

There are two ways to connect to a system--using its hostname or its Internet number. Using the hostname is usually preferred. However, some sites aren't able to resolve hostnames properly, and have no alternative but to use the Internet number. We'll assume you're able to use hostnames for simplicity's sake. The form is

ftp somewhere.domain

See Domains for help with reading and using domain names (in the example below, somewhere.domain is ftp.uu.net).

You must first know the name of the system you want to connect to.
We'll use ftp.uu.net as an example. On your system, type:

ftp ftp.uu.net

(the actual syntax will vary depending on the type of system the connection's being made from). It will pause momentarily then respond with the message

Connected to ftp.uu.net.

and an initial prompt will appear:

220 uunet FTP server (Version 5.100 Mon Feb 11 17:13:28 EST 1991) ready.
Name (ftp.uu.net:jm):

to which you should respond with anonymous:

220 uunet FTP server (Version 5.100 Mon Feb 11 17:13:28 EST 1991) ready.
Name (ftp.uu.net:jm): anonymous

The system will then prompt you for a password; as noted previously, a good response is your email address:

331 Guest login ok, send ident as password.
Password: jm@south.america.org
230 Guest login ok, access restrictions apply.
ftp>

The password itself will not echo. This is to protect a user's security when he or she is using a real account to FTP files between machines. Once you reach the ftp> prompt, you know you're logged in and ready to go.

Notice the ftp.uu.net:jm in the Name: prompt? That's another clue that anonymous FTP is special: FTP expects a normal user account to be used for transfers.
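For readers who would rather script a session than type it, the login exchange above can be approximated with Python's standard ftplib module. This is a minimal sketch, not what any particular ftp program does internally: the helper name anonymous_login is made up for illustration, and the host in the commented-out usage is simply the one from the example and may not be reachable.

```python
from ftplib import FTP

def anonymous_login(ftp, email):
    """Log in to an already-connected FTP object as 'anonymous'.

    By convention the password is the user's email address.
    Returns the server's reply line (for example a '230 ...' message).
    """
    return ftp.login(user="anonymous", passwd=email)

# Hypothetical usage (needs a reachable server, so it is left commented out):
# ftp = FTP("ftp.uu.net")      # connects, like typing: ftp ftp.uu.net
# print(ftp.getwelcome())      # the '220 ...' greeting line
# print(anonymous_login(ftp, "jm@south.america.org"))
# ftp.quit()
```

Keeping the connection and the login as separate steps mirrors the two stages of the session shown above: first the "Connected to" message, then the Name and Password prompts.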

dir

At the ftp> prompt, you can type a number of commands to perform various functions. One example is dir, which will list the files in the current directory. Continuing the example from above:

ftp> dir

200 PORT command successful.
150 Opening ASCII mode data connection for /bin/ls.
total 3116
drwxr-xr-x  2 7   21      512 Nov 21  1988 .forward
-rw-rw-r--  1 7   11        0 Jun 23  1988 .hushlogin
drwxrwxr-x  2 0   21      512 Jun  4  1990 Census
drwxrwxr-x  2 0   120     512 Jan  8 09:36 ClariNet
… etc etc …
-rw-rw-r--  1 7   14    42390 May 20 02:24 newthisweek.Z
… etc etc …
-rw-rw-r--  1 7   14  2018887 May 21 01:01 uumap.tar.Z
drwxrwxr-x  2 7   6      1024 May 11 10:58 uunet-info
226 Transfer complete.
5414 bytes received in 1.1 seconds (4.9 Kbytes/s)
ftp>

The file newthisweek.Z was specifically included because we'll be using it later. Just for general information, it happens to be a listing of all of the files added to UUNET's archives during the past week.

The directory shown is on a machine running the Unix operating system; the dir command will produce different results on other operating systems (e.g. TOPS, VMS, et al.). Learning to recognize different formats will take some time. After a few weeks of traversing the Internet, it proves easier to see, for example, how large a file is on an operating system you're otherwise not acquainted with.
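As an illustration of reading the Unix format, here is a small Python sketch that pulls the size and name out of one listing line. It assumes the nine-column layout seen in the ftp.uu.net example (mode, links, owner, group, size, month, day, time or year, name); the function name is invented for this example, and listings from other operating systems will not fit it.

```python
def parse_unix_listing_line(line):
    """Split one line of a Unix-style dir listing into a few useful fields.

    Returns None for lines that do not fit the assumed nine-column
    layout (such as the 'total 3116' line).
    """
    fields = line.split(None, 8)  # at most 9 fields; the name keeps its spaces
    if len(fields) < 9 or not fields[4].isdigit():
        return None
    return {
        "is_dir": fields[0].startswith("d"),  # a leading 'd' marks a directory
        "size": int(fields[4]),               # size in bytes
        "name": fields[8].rstrip(),
    }

sample = "-rw-rw-r--  1 7  14  42390 May 20 02:24 newthisweek.Z"
print(parse_unix_listing_line(sample))
print(parse_unix_listing_line("total 3116"))
```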

With many FTP implementations, it's also possible to take the output of dir and put it into a file on the local system with

ftp> dir n* outfilename

the contents of which can then be read outside of the live FTP connection; this is particularly useful for systems with very long directories (like ftp.uu.net). The above example would put the names of every file that begins with an n into the local file outfilename.

cd

At the beginning of an FTP session, the user is in a "top-level" directory. Most things are in directories below it (e.g. /pub). To change the current directory, one uses the cd command. To change to the directory pub, for example, one would type

ftp> cd pub

which would elicit the response

250 CWD command successful.

meaning the "Change Working Directory" command (cd) worked properly. Moving "up" a directory is more system-specific: in Unix use the command cd .., and in VMS, cd [-].

get and put

The actual transfer is performed with the get and put commands. To get a file from the remote computer to the local system, the command takes the form:

ftp> get filename

where filename is the file on the remote system. Again using ftp.uu.net as an example, the file newthisweek.Z can be retrieved with

ftp> get newthisweek.Z
200 PORT command successful.
150 Opening ASCII mode data connection for newthisweek.Z (42390 bytes).
226 Transfer complete.
local: newthisweek.Z remote: newthisweek.Z
42553 bytes received in 6.9 seconds (6 Kbytes/s)
ftp>

The section below on using binary mode instead of ASCII will describe why this particular choice will result in a corrupt and subsequently unusable file.

If, for some reason, you want to save a file under a different name (e.g. your system can only have 14-character filenames, or can only have one dot in the name), you can specify what the local filename should be by providing get with an additional argument

ftp> get newthisweek.Z uunet-new

which will place the contents of the file newthisweek.Z in uunet-new on the local system.

The transfer works the other way, too. The put command will transfer a file from the local system to the remote system. If the permissions are set up for an FTP session to write to a remote directory, a file can be sent with

ftp> put filename

As with get, put will take an additional argument, letting you specify a different name for the file on the remote system.
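The get and put forms above, including the optional second filename, map fairly directly onto Python's ftplib. The sketch below is an assumption-laden illustration: the helper names fetch and send are invented, ftp is assumed to be an ftplib.FTP object that has already logged in, and both helpers use binary transfers, which sidesteps the ASCII-mode problem discussed in the next section.

```python
def fetch(ftp, remote_name, local_name=None):
    """Retrieve remote_name, saving it locally as local_name.

    If local_name is omitted the remote name is reused, as with a
    plain 'get filename'.
    """
    if local_name is None:
        local_name = remote_name
    with open(local_name, "wb") as out:
        # retrbinary hands each chunk of the file to out.write
        ftp.retrbinary("RETR " + remote_name, out.write)
    return local_name

def send(ftp, local_name, remote_name=None):
    """Store local_name on the remote system, optionally under a new name."""
    if remote_name is None:
        remote_name = local_name
    with open(local_name, "rb") as source:
        ftp.storbinary("STOR " + remote_name, source)
    return remote_name

# Hypothetical usage, equivalent to: get newthisweek.Z uunet-new
# fetch(ftp, "newthisweek.Z", "uunet-new")
```

As with the interactive put, send will only succeed if the remote directory permits writing.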

ASCII vs Binary

In the example above, the file newthisweek.Z was transferred, but supposedly not correctly. The reason is this: in a normal ASCII transfer (the default), certain characters are translated between systems, to help make text files more readable. However, when binary files (those containing non-ASCII characters) are transferred, this translation should not take place. One example is a binary program: a few changed characters can render it completely useless.

To avoid this problem, it's possible to be in one of two modes, ASCII or binary. In binary mode, the file isn't translated in any way. What's on the remote system is precisely what's received. The commands to go between the two modes are:

ftp> ascii
200 Type set to A.
(Note the A, which signifies ASCII mode.)

ftp> binary
200 Type set to I.
(Set to Image format, for pure binary transfers.)

Note that each command need only be done once to take effect; if the user types binary, all transfers in that session are done in binary mode (that is, unless ascii is typed later).

The transfer of newthisweek.Z will work if done as:

ftp> binary
200 Type set to I.
ftp> get newthisweek.Z
200 PORT command successful.
150 Opening BINARY mode data connection for newthisweek.Z (42390 bytes).
226 Transfer complete.
local: newthisweek.Z remote: newthisweek.Z
42390 bytes received in 7.2 seconds (5.8 Kbytes/s)

Note: The file size (42390 bytes) is different from that of the ASCII-mode transfer (42553 bytes), and the number 42390 matches the one in the listing of UUNET's top directory. We can be relatively sure that we've received the file without any problems.
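To see why the byte counts can differ, consider one common kind of ASCII-mode translation: every line feed byte is sent as a carriage return plus line feed. The Python sketch below imitates only that one rule on made-up data; it is a simplification, since the real translation depends on both systems involved, and the function name is invented for this example.

```python
def ascii_mode_wire_size(data):
    """Size of data after expanding each bare LF to CR LF.

    This imitates one typical ASCII-mode translation; it is not a
    full model of what any given FTP server does.
    """
    normalized = data.replace(b"\r\n", b"\n")       # avoid doubling existing CRs
    return len(normalized.replace(b"\n", b"\r\n"))

# Made-up "binary" data in which the byte 0x0A happens to occur twice.
payload = bytes([0x1F, 0x9D, 0x0A, 0x41, 0x0A, 0x00])
print(len(payload))                   # 6 bytes when sent untouched
print(ascii_mode_wire_size(payload))  # 8 bytes: each 0x0A gained a 0x0D
```

If that is what happened to newthisweek.Z, the 163-byte difference between the two transfers (42553 minus 42390) would correspond to 163 translated bytes, though that is an inference and not something the session output states.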

mget and mput

The commands mget and mput allow for multiple file transfers using wildcards to get several files, or a whole set of files at once, rather than having to do it manually one by one. For example, to get all files that begin with the letter f, one would type

ftp> mget f*

Similarly, to put all of the local files that end with .c:

ftp> mput *.c

Rather than reiterate what's been written a hundred times before, consult a local manual for more information on wildcard matching (every DOS manual, for example, has a section on it).

Normally, FTP assumes a user wants to be prompted for every file in an mget or mput operation. You'll often need to get a whole set of files and not have each of them confirmed, because you know they're all right. In that case, use the prompt command to turn the queries off.

ftp> prompt
Interactive mode off.

Likewise, to turn it back on, the prompt command should simply be issued again.
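For a feel of which names a pattern such as f* or *.c will pick up before committing to a long mget or mput, the matching can be tried out locally. This Python sketch uses the standard fnmatch module's shell-style rules; the file names are invented, and a given FTP client or server may expand wildcards somewhat differently.

```python
from fnmatch import fnmatchcase

def select_files(names, pattern):
    """Return the names that a shell-style wildcard pattern matches."""
    return [name for name in names if fnmatchcase(name, pattern)]

# A made-up set of file names for illustration.
remote_files = ["faq.txt", "foo.tar.Z", "newthisweek.Z", "main.c", "util.c"]
print(select_files(remote_files, "f*"))   # ['faq.txt', 'foo.tar.Z']
print(select_files(remote_files, "*.c"))  # ['main.c', 'util.c']
```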

Joe Granrose's List

Monthly, Joe Granrose (odin@pilot.njin.net) posts to Usenet (see Usenet News) an extensive list of sites offering anonymous FTP service. It's available in a number of ways:

The Usenet groups comp.misc and comp.sources.wanted

Anonymous FTP from pilot.njin.net [128.6.7.38], in /pub/ftp-list.

Write to odin@pilot.njin.net with a Subject: line of listserv-request and a message body of send help. Please don't bother Joe with your requests; the server will provide you with the list.

The archie Server

{archie is always in lowercase.}

A group of people at McGill University in Canada got together and created a query system called archie. It was originally formed to be a quick and easy way to scan the offerings of the many anonymous FTP sites that are maintained around the world. As time progressed, archie grew to include other valuable services as well.

The archie service is accessible through an interactive telnet session, email queries, and command-line and X-window clients. The email responses can be used along with FTPmail servers for those not on the Internet. (See FTP-by-Mail Servers, for info on using FTPmail servers.)

Using archie Today

Currently, archie tracks the contents of over 800 anonymous FTP archive sites containing over a million files stored across the Internet. Collectively, these files represent well over 50 gigabytes of information, with new entries being added daily.

The archie server automatically updates the listing information from each site about once a month. This avoids constantly updating the databases, which could waste network resources, yet ensures that the information on each site's holdings is reasonably up to date.

To access archie interactively, telnet to one of the existing servers. {See Telnet, for notes on using the telnet program.} They include

archie.ans.net (New York, USA)
archie.rutgers.edu (New Jersey, USA)
archie.sura.net (Maryland, USA)
archie.unl.edu (Nebraska, USA)
archie.mcgill.ca (the first archie server, in Canada)
archie.funet.fi (Finland)
archie.au (Australia)
archie.doc.ic.ac.uk (Great Britain)

At the login: prompt of one of the servers, enter archie to log in. A greeting will be displayed, detailing information about ongoing work in the archie project; the user will be left at an archie> prompt, at which he or she may enter commands. Using help will yield instructions on using the prog command to make queries, the set command to control various aspects of the server's operation, et al. Type quit at the prompt to leave archie. Typing the query prog vine.tar.Z will yield a list of the systems that offer the source to the X-windows program vine; a piece of the information returned looks like:

Host ftp.uu.net (137.39.1.9)
Last updated 10:30 7 Jan 1992

Location: /packages/X/contrib
FILE rw-r--r-- 15548 Oct 8 20:29 vine.tar.Z

Host nic.funet.fi (128.214.6.100)
Last updated 05:07 4 Jan 1992

Location: /pub/X11/contrib
FILE rw-rw-r-- 15548 Nov 8 03:25 vine.tar.Z

archie Clients

There are two mainstream archie clients, one called (naturally enough) archie, the other xarchie (for X-Windows). They query the archie databases and yield a list of systems that have the requested file(s) available for anonymous FTP, without requiring an interactive session to the server. For example, to find the same information you tried with the server command prog, you could type

% archie vine.tar.Z
Host athene.uni-paderborn.de
Location: /local/X11/more_contrib
FILE -rw-r--r-- 18854 Nov 15 1990 vine.tar.Z

Host emx.utexas.edu
Location: /pub/mnt/source/games
FILE -rw-r--r-- 12019 May 7 1988 vine.tar.Z

Host export.lcs.mit.edu
Location: /contrib
FILE -rw-r--r-- 15548 Oct 9 00:29 vine.tar.Z

Note that your system administrator may not have installed the archie clients yet; the source is available on each of the archie servers, in the directory archie/clients.
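Since the client's output follows a regular Host / Location / FILE pattern, it lends itself to being digested by a short script, for instance to build a list of places to try with FTP. The following Python sketch assumes exactly the three-line layout shown in the example above; the function name and the dictionary keys are invented for illustration, and other archie clients or versions may format their results differently.

```python
def parse_archie_output(text):
    """Turn Host / Location / FILE lines into a list of dictionaries."""
    results = []
    host = location = None
    for line in text.splitlines():
        words = line.split()
        if len(words) < 2:
            continue  # blank or unrecognized line
        if words[0] == "Host":
            host = words[1]
        elif words[0] == "Location:":
            location = words[1]
        elif words[0] == "FILE" and len(words) >= 4 and words[2].isdigit():
            results.append({
                "host": host,
                "path": (location or "") + "/" + words[-1],
                "size": int(words[2]),  # third word is the size in bytes
            })
    return results

sample = """\
Host export.lcs.mit.edu
Location: /contrib
FILE -rw-r--r-- 15548 Oct 9 00:29 vine.tar.Z
"""
for hit in parse_archie_output(sample):
    print(hit["host"], hit["path"], hit["size"])
```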

Using the X-windows client is much more intuitive; if it's installed, just read its man page and give it a whirl. It's essential for the networked desktop.

Mailing archie

Users limited to email connectivity to the Internet should send a message to the address archie@archie.mcgill.ca with the single word help in the body of the message. An email message will be returned explaining how to use the email archie server, along with the details of using FTPmail. Most of the commands offered by the telnet interface can be used with the mail server.

The whatis database

In addition to offering access to anonymous FTP listings, archie also permits access to the whatis description database. It includes the names and brief synopses for over 3,500 public domain software packages, datasets and informational documents located on the Internet.

Additional whatis databases are scheduled to be added in the future. Planned offerings include listings for the names and locations of online library catalog programs, the names of publicly accessible electronic mailing lists, compilations of Frequently Asked Questions lists, and archive sites for the most popular Usenet newsgroups. Suggestions for additional descriptions or locations databases are welcomed and should be sent to the archie developers at archie-l@cs.mcgill.ca.

"Was für plündern!"
("What a place to plunder!")
Gebhard Leberecht Blücher

———
Usenet News

Original from: chip@count.tct.com (Chip Salzenberg)
[Most recent change: 19 May 1991 by spaf@cs.purdue.edu (Gene Spafford)]

The first thing to understand about Usenet is that it is widely misunderstood. Every day on Usenet the "blind men and the elephant" phenomenon appears, in spades. In the opinion of the author, more flame wars (rabid arguments) arise because of a lack of understanding of the nature of Usenet than from any other source. And consider that such flame wars arise, of necessity, among people who are on Usenet. Imagine, then, how poorly understood Usenet must be by those outside!

No essay on the nature of Usenet can ignore the erroneous impressions held by many Usenet users. Therefore, this section will treat falsehoods first. Keep reading for truth. (Beauty, alas, is not relevant to Usenet.)

What Usenet Is

Usenet is the set of machines that exchange articles tagged with one or more universally-recognized labels, called newsgroups (or "groups" for short). (Note that the term newsgroup is correct, while area, base, board, bboard, conference, round table, SIG, etc. are incorrect. If you want to be understood, be accurate.)

The Diversity of Usenet

If the above definition of Usenet sounds vague, that's because it is. It is almost impossible to generalize over all Usenet sites in any non-trivial way. Usenet encompasses government agencies, large universities, high schools, businesses of all sizes, home computers of all descriptions, etc.

Every administrator controls his own site. No one has any real control over any site but his own. The administrator gets his power from the owner of the system he administers. As long as the owner is happy with the job the administrator is doing, he can do whatever he pleases, up to and including cutting off Usenet entirely. C'est la vie.

What Usenet Is Not

Usenet is not an organization. Usenet has no central authority. In fact, it has no central anything. There is a vague notion of "upstream" and "downstream" related to the direction of high-volume news flow. It follows that, to the extent that "upstream" sites decide what traffic they will carry for their "downstream" neighbors, "upstream" sites have some influence on their neighbors. But such influence is usually easy to circumvent, and heavy-handed manipulation typically results in a backlash of resentment.

Usenet is not a democracy. A democracy can be loosely defined as "government of the people, by the people, for the people." However, as explained above, Usenet is not an organization, and only an organization can be run as a democracy. Even a democracy must be organized, for if it lacks a means of enforcing the people's wishes, then it may as well not exist.

Some people wish that Usenet were a democracy. Many people pretend that it is. Both groups are sadly deluded.

Usenet is not fair. After all, who shall decide what's fair? For that matter, if someone is behaving unfairly, who's going to stop him? Neither you nor I, that's certain.

Usenet is not a right. Some people misunderstand their local right of "freedom of speech" to mean that they have a legal right to use others' computers to say what they wish in whatever way they wish, and the owners of said computers have no right to stop them.

Those people are wrong. Freedom of speech also means freedom not to speak; if I choose not to use my computer to aid your speech, that is my right. Freedom of the press belongs to those who own one.

Usenet is not a public utility. Some Usenet sites are publicly funded or subsidized. Most of them, by plain count, are not. There is no government monopoly on Usenet, and little or no control.

Usenet is not a commercial network. Many Usenet sites are academic or government organizations; in fact, Usenet originated in academia. Therefore, there is a Usenet custom of keeping commercial traffic to a minimum. If such commercial traffic is generally considered worth carrying, then it may be grudgingly tolerated. Even so, it is usually separated somehow from non-commercial traffic; see comp.newprod.

Usenet is not the Internet. The Internet is a wide-ranging network, parts of which are subsidized by various governments. The Internet carries many kinds of traffic; Usenet is only one of them. And the Internet is only one of the various networks carrying Usenet traffic.

Usenet is not a Unix network, nor even an ASCII network.

Don't assume that everyone is using "rn" on a Unix machine. There are Vaxen running VMS, IBM mainframes, Amigas, and MS-DOS PCs reading and posting to Usenet. And, yes, some of them use (shudder) EBCDIC. Ignore them if you like, but they're out there.

Usenet is not software. There are dozens of software packages used at various sites to transport and read Usenet articles. So no one program or package can be called "the Usenet software."

Software designed to support Usenet traffic can be (and is) used for other kinds of communication, usually without risk of mixing the two. Such private communication networks are typically kept distinct from Usenet by the invention of newsgroup names different from the universally-recognized ones.

Usenet is not a UUCP network.

UUCP is a protocol (some might say protocol suite, but that's a technical point) for sending data over point-to-point connections, typically using dialup modems. Usenet is only one of the various kinds of traffic carried via UUCP, and UUCP is only one of the various transports carrying Usenet traffic.

Well, enough negativity.

Propagation of News

In the old days, when UUCP over long-distance dialup lines was the dominant means of article transmission, a few well-connected sites had real influence in determining which newsgroups would be carried where. Those sites called themselves "the backbone."

But things have changed. Nowadays, even the smallest Internet site has connectivity the likes of which the backbone admin of yesteryear could only dream. In addition, in the U.S., the advent of cheaper long-distance calls and high-speed modems has made long-distance Usenet feeds thinkable for smaller companies. There is only one pre-eminent UUCP transport site today in the U.S., namely UUNET. But UUNET isn't a player in the propagation wars, because it never refuses any traffic. It gets paid by the minute, after all; to refuse based on content would jeopardize its legal status as an enhanced service provider.

All of the above applies to the U.S. In Europe, different cost structures favored the creation of strictly controlled hierarchical organizations with central registries. This is all very unlike the traditional mode of U.S. sites (pick a name, get the software, get a feed, you're on). Europe's "benign monopolies", long uncontested, now face competition from looser organizations patterned after the U.S. model.

Group Creation

As discussed above, Usenet is not a democracy. Nevertheless, currently the most popular way to create a new newsgroup involves a "vote" to determine popular support for (and opposition to) a proposed newsgroup. See Newsgroup Creation, for detailed instructions and guidelines on the process involved in making a newsgroup.

If you follow the guidelines, it is probable that your group will be created and will be widely propagated. However, due to the nature of Usenet, there is no way for any user to enforce the results of a newsgroup vote (or any other decision, for that matter). Therefore, for your new newsgroup to be propagated widely, you must not only follow the letter of the guidelines; you must also follow their spirit. And you must not allow even a whiff of shady dealings or dirty tricks to mar the vote.

So, you may ask: How is a new user supposed to know anything about the "spirit" of the guidelines? Obviously, she can't. This fact leads inexorably to the following recommendation:

If you're a new user, don't try to create a new newsgroup alone.

If you have a good newsgroup idea, then read the news.groups newsgroup for a while (six months, at least) to find out how things work. If you're too impatient to wait six months, then you really need to learn; read news.groups for a year instead. If you just can't wait, find a Usenet old hand to run the vote for you.

Readers may think this advice unnecessarily strict. Ignore it at your peril. It is embarrassing to speak before learning. It is foolish to jump into a society you don't understand with your mouth open. And it is futile to try to force your will on people who can tune you out with the press of a key.

If You're Unhappy…

Property rights being what they are, there is no higher authority on Usenet than the people who own the machines on which Usenet traffic is carried. If the owner of the machine you use says, "We will not carry alt.sex on this machine," and you are not happy with that order, you have no Usenet recourse. What can we outsiders do, after all?

That doesn't mean you are without options. Depending on the nature of your site, you may have some internal political recourse. Or you might find external pressure helpful. Or, with a minimal investment, you can get a feed of your own from somewhere else. Computers capable of taking Usenet feeds are down in the $500 range now, Unix-capable boxes are going for under $2000, and there are at least two Unix lookalikes in the $100 price range.

No matter what, appealing to "Usenet" won't help. Even if those who read such an appeal regarding system administration are sympathetic to your cause, they will almost certainly have even less influence at your site than you do.

By the same token, if you don't like what some user at another site is doing, only the administrator and/or owner of that site have any authority to do anything about it. Persuade them that the user in question is a problem for them, and they might do something (if they feel like it). If the user in question is the administrator or owner of the site from which he or she posts, forget it; you can't win. Arrange for your newsreading software to ignore articles from him or her if you can, and chalk one up to experience.

The History of Usenet (The ABCs)

In the beginning, there were conversations, and they were good. Then came Usenet in 1979, shortly after the release of V7 Unix with UUCP; and it was better. Two Duke University grad students in North Carolina, Tom Truscott and Jim Ellis, thought of hooking computers together to exchange information with the Unix community. Steve Bellovin, a grad student at the University of North Carolina, put together the first version of the news software using shell scripts and installed it on the first two sites: unc and duke. At the beginning of 1980 the network consisted of those two sites and phs (another machine at Duke), and was described at the January 1980 Usenix conference in Boulder, CO. {The Usenix conferences are semi-annual meetings where members of the Usenix Association, a group of Unix enthusiasts, meet and trade notes.} Steve Bellovin later rewrote the scripts into C programs, but they were never released beyond unc and duke. Shortly thereafter, Steve Daniel did another implementation in the C programming language for public distribution. Tom Truscott made further modifications, and this became the "A" news release.

In 1981 at the University of California at Berkeley, grad student Mark Horton and high school student Matt Glickman rewrote the news software to add functionality and to cope with the ever increasing volume of news; "A" news was intended for only a few articles per group per day. This rewrite was the "B" news version. The first public release was version 2.1 in 1982; all versions before 2.1 were considered in beta test. As The Net grew, the news software was expanded and modified. The last version maintained and released primarily by Mark was 2.10.1.

Rick Adams, then at the Center for Seismic Studies, took over coordination of the maintenance and enhancement of the news software with the 2.10.2 release in 1984. By this time, the increasing volume of news was becoming a concern, and the mechanism for moderated groups was added to the software at 2.10.2. Moderated groups were inspired by ARPA mailing lists and experience with other bulletin board systems. In late 1986, version 2.11 of news was released, including a number of changes to support a new naming structure for newsgroups, enhanced batching and compression, enhanced ihave/sendme control messages, and other features. The current release of news is 2.11, patchlevel 19.

A new version of news, becoming known as "C" news, has been developed at the University of Toronto by Geoff Collyer and Henry Spencer. This version is a rewrite of the lowest levels of news to increase article processing speed, decrease article expiration processing and improve the reliability of the news system through better locking, etc. The package was released to The Net in the autumn of 1987. For more information, see the paper News Need Not Be Slow, published in the Winter 1987 Usenix Technical Conference proceedings.

Usenet software has also been ported to a number of platforms, from the Amiga and IBM PCs all the way to minicomputers and mainframes.

Hierarchies

Newsgroups are organized according to their specific areas of concentration. Since the groups are in a tree structure, the various areas are called hierarchies. There are seven major categories:

comp
Topics of interest to both computer professionals and hobbyists, including topics in computer science, software sources, and information on hardware and software systems.

misc
Groups addressing themes not easily classified into any of the other headings or which incorporate themes from multiple categories. Subjects include fitness, job-hunting, law, and investments.

sci
Discussions marked by special knowledge relating to research in or application of the established sciences.

soc
Groups primarily addressing social issues and socializing. Included are discussions related to many different world cultures.

talk
Groups largely debate-oriented and tending to feature long discussions without resolution and without appreciable amounts of generally useful information.

news
Groups concerned with the news network, group maintenance, and software.

rec
Groups oriented towards hobbies and recreational activities.

These "world" newsgroups are (usually) circulated around the entire Usenet, which implies world-wide distribution. Not all groups actually enjoy such wide distribution, however. The European Usenet and Eunet sites take only a selected subset of the more "technical" groups, and controversial "noise" groups are often not carried by many sites in the U.S. and Canada (these groups are primarily under the talk and soc classifications). Many sites do not carry some or all of the comp.binaries groups because of the typically large size of the posts in them (being actual executable programs).
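Because newsgroup names are dotted paths in a tree, the hierarchy a group belongs to is simply the part of its name before the first dot. As a small Python sketch (the function names and the WORLD set are made up for this example; the seven category names are the ones listed above):

```python
# The seven major "world" categories listed above.
WORLD = {"comp", "misc", "sci", "soc", "talk", "news", "rec"}

def hierarchy(newsgroup):
    """Return the top-level hierarchy of a dotted newsgroup name."""
    return newsgroup.split(".", 1)[0]

def is_world_group(newsgroup):
    """True if the group falls under one of the seven major categories."""
    return hierarchy(newsgroup) in WORLD

print(hierarchy("comp.newprod"))      # comp
print(is_world_group("news.groups"))  # True
print(is_world_group("alt.sex"))      # False
```

A site that declines to carry, say, everything under talk is in effect filtering on this first component of the name.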

Also available are a number of "alternative" hierarchies:

alt
True anarchy; anything and everything can and does appear; subjects include sex, the Simpsons, and privacy.

gnu
Groups concentrating on interests and software with the GNU Project of the Free Software Foundation. For further info on what the FSF is, see FSF.

biz
Business-related groups.

Moderated vs Unmoderated

Some newsgroups insist that the discussion remain focused and on-target; to serve this need, moderated groups came to be. All articles posted to a moderated group get mailed to the group's moderator. He or she periodically (hopefully sooner rather than later) reviews the posts, and then either posts them individually to Usenet, or posts a composite digest of the articles for the past day or two. This is how many mailing list gateways work (for example, the Risks Digest).

news.groups & news.announce.newgroups