History

This is a rough outline of the history of the web. Details are available in many places on the web and in history books.

Computers were originally very large (physically, anyway) stand-alone machines. I recall working on a Univac 1100 series machine where 256K of memory occupied a row of file-cabinet-size frames. When people needed to share data, they did it with either cards or tape. Sometime in the 1960s, the minicomputer was developed. "Mini" in this case was relative to the mainframe machines. Now a given institution might have several of these. Using tape was annoying and slow, cards weren't being used much, and 8-inch floppies had too little capacity.

Technologies were developed to connect computers together over cables. These were called networks. A very popular one, called Ethernet, was developed at the Xerox Palo Alto Research Center. It allowed very fast transfer of data between computers.

Over time, most groups of computers were connected by networks. However, there were several different types of network and no consistent or fast way to connect computers on different networks. The Defense Advanced Research Projects Agency (DARPA) funded much of the research that led to a solution. The resulting network, the ARPANET, and the protocols that grew out of it allowed different computers on different networks to communicate. This became known as the internet because it allowed the interconnection of different networks.

It is important to separate the internet from the applications that run on it. The internet itself is a set of protocols or standards and the physical implementation of those standards. Essentially, it is the hardware that connects the computers and the software that moves bits from one computer to another. The things we think of as the internet (the web, email, news groups, etc.) are applications that are built on top of the internet.

We will talk in more detail about the applications of the early internet, most of which are still in use today, just in different forms. The researchers were mostly interested in trading scientific information, so the earliest applications were file transfer (FTP) and email. These proved to be extremely useful to the scientists who worked at the government labs where the technology was developed. They soon spread to most academic institutions but were little known or used outside of these.

An early extension of the network was Usenet, which was carried over the dial-up UUCP (Unix-to-Unix Copy) system. This allowed small computers to use phone lines to connect to another computer that was already on the network. My computer would call the one at the local university, collect email and other data, and do two things with it. Things that were addressed to my machine would stay there and, if I allowed other machines to connect to mine, mine would pass the rest on to the next one. One popular use for this was netnews. This was, and is, a collection of discussion groups on an enormous variety of topics.
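The keep-or-pass-on behavior described above can be sketched in a few lines. This is only an illustration, not how the real software was written: the function name, the message fields, and the host names are all made up for the example.

```python
def handle_message(message, my_host, local_mailbox, outbox):
    """Keep messages addressed to this host; queue the rest for the next hop.

    The message format (a dict with "to_host" and "body") is an assumption
    made for this sketch, not the real UUCP or netnews format.
    """
    if message["to_host"] == my_host:
        local_mailbox.append(message)   # addressed to us: it stays here
    else:
        outbox.append(message)          # otherwise: hold it for the next machine that calls


# Hypothetical batch collected during one phone call to the university machine.
mailbox, outbox = [], []
incoming = [
    {"to_host": "myhost", "body": "Hello"},
    {"to_host": "otherhost", "body": "Please pass this along"},
]
for msg in incoming:
    handle_message(msg, "myhost", mailbox, outbox)
```

After the loop, mailbox holds the message for myhost, and outbox holds the one waiting for whichever machine connects next.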

Now people had access to other computers, but how to find what you wanted? People would publish lists of the files available at their site on the news groups. Other people collected these lists and reversed them. They would list a file and the sites where it was available. You could search these lists and use the file transfer tools to get the files to your machine.

The high point of this kind of system was Archie, no relation. It was a set of programs that searched known public sites and collected a list of the files that were stored there. These lists were stored in a database along with the location of the site. You could use a search program to look up the name of the file you wanted and it would give you back a list of the places that had it.
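The "reversed list" idea is what is now usually called an inverted index. Here is a minimal sketch of it, assuming each site publishes a simple list of file names. The site names, file names, and function names are invented for the example and are not taken from Archie itself.

```python
from collections import defaultdict


def build_index(site_listings):
    """Reverse {site: [files]} into {file: [sites that have it]}."""
    index = defaultdict(set)
    for site, files in site_listings.items():
        for name in files:
            index[name].add(site)
    # Sort the site names so that lookups give a stable, readable answer.
    return {name: sorted(sites) for name, sites in index.items()}


def lookup(index, name):
    """Return the sites that hold the named file, or an empty list."""
    return index.get(name, [])


# Hypothetical listings, as a site might have posted them to a news group.
listings = {
    "ftp.site-a.example": ["readme.txt", "kermit.tar"],
    "ftp.site-b.example": ["kermit.tar", "notes.txt"],
}
index = build_index(listings)
```

Calling lookup(index, "kermit.tar") returns both sites, while a name that no site listed returns an empty list. Note that the lookup only works on an exact file name, which is the drawback the next paragraph describes.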

This collection of tools proved pretty useful but had several drawbacks. It was hard for untrained people to use. And you had to know very specifically what you were looking for.

Around 1989, Tim Berners-Lee and others at the CERN nuclear physics research center in Switzerland (similar to Fermilab) were trying to solve this problem as it pertained to the dispersal of physics research. They built a new application on top of the internet called the World Wide Web. It implemented what is called hypertext. The term was coined in the mid-1960s by Ted Nelson. It describes a document that contains links to other documents. Instead of starting at page 1 and reading until the end, the reader can jump around in the document wherever the author has provided links. You can even link to other documents. This is a very non-linear style of document reading and writing.

What they had created at CERN was a set of protocols and standards. These are precise descriptions of the rules that govern the use of the system. The two key inventions were HTTP (HyperText Transfer Protocol) and HTML (HyperText Markup Language). We will see these in more detail later.
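To give a rough feel for the two pieces, here is a small sketch: one helper builds the text of a very simple HTTP request, and another pulls the links out of a scrap of HTML. The page content, the URLs, and the helper names are made up for illustration. The only library used is Python's standard html.parser module.

```python
from html.parser import HTMLParser


def build_request(path):
    """Return the text of a minimal HTTP/1.0 GET request for the given path."""
    return "GET " + path + " HTTP/1.0\r\n\r\n"


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag that is fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


def extract_links(html_text):
    collector = LinkCollector()
    collector.feed(html_text)
    return collector.links


# A made-up hypertext page with two links in it.
PAGE = """<html><body>
<p>See the <a href="http://example.org/physics.html">physics notes</a>
and the <a href="results.html">results</a>.</p>
</body></html>"""
```

Here extract_links(PAGE) returns the two link targets in the order they appear. A browser does roughly this, then uses something like build_request to fetch whichever link the reader picks.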

The development of these standards allowed many people to develop software that would all interact correctly, as long as it followed the standards. The first versions of the web were text only. Later, especially after Mosaic, the first widely used graphical browser, was created at the University of Illinois at Urbana-Champaign, graphics and other non-text objects could be linked together. All the initial critical components of the web were developed first at public research institutions and were thus free to spread around the world.