Montana Tech of The University of Montana
Computer Science & Software Engineering

CSCI 466
Networks
Fall 2013



PROGRAM #1

The goal of this assignment is to learn the basics of socket programming in Python using TCP. You will also learn some basics about the HTTP header format. First, you will develop a simple web server that can be used from any browser. Second, you will develop a simple web client that can retrieve a page via HTTP.


Part 1: Web Server
You need to develop a web server that handles one HTTP request at a time. Your web server needs to:

By default, your web server should display helpful information about what it is up to and who has connected to it. It should display a ready message, along with the port it is listening on, whenever it is waiting for a new client. When a client connects, it should display the IP address and port number of the client. It should then display the name of the file requested by the client. If the file was sent, it should display the number of bytes sent (including the final \r\n). If the client requests a missing file, it should display a file-not-found error message. Here is example output from a server that delivers the files HelloWorld.html and temp/HelloWorld2.html and is then asked for a missing file:
% python WebServer.py 1234
Ready to serve on port =  1234
New client:  ('127.0.0.1', 50341)
File requested:  HelloWorld.html
Bytes sent:  52
Ready to serve on port =  1234
New client:  ('127.0.0.1', 50342)
File requested:  temp/HelloWorld2.html
Bytes sent:  67
Ready to serve on port =  1234
New client:  ('127.0.0.1', 50346)
File requested:  MissingFile.html
File not found!
Ready to serve on port =  1234
...

% python WebServer.py
Ready to serve on port =  6789
...
Code
Download the skeleton code for the web server. You are to complete the skeleton code. The places where you need to fill in code are marked. Each place may require one or more lines of code.

All the socket-related functions you will need are in Chapter 2 of the book, or in the lecture on socket programming. You can read more about Python's socket interface here.
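As a rough illustration of the calls involved (this is not the skeleton code), the sketch below opens a listening TCP socket and waits for one client, printing messages in the format of the example output above. The names parse_port, open_server_socket, and accept_client are hypothetical helpers chosen for this sketch, and Python 3 syntax is assumed.

```python
import socket

DEFAULT_PORT = 6789  # port shown in the example output when no argument is given


def parse_port(argv):
    """Return the port from the first command-line argument, or the default."""
    return int(argv[1]) if len(argv) > 1 else DEFAULT_PORT


def open_server_socket(port):
    """Create a TCP socket bound to all local interfaces and start listening."""
    server_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server_socket.bind(("", port))
    server_socket.listen(1)  # one request at a time, so a short backlog is enough
    return server_socket


def accept_client(server_socket):
    """Print the ready message, block until a client connects, and report it."""
    port = server_socket.getsockname()[1]
    print("Ready to serve on port = ", port)
    connection_socket, client_address = server_socket.accept()
    print("New client: ", client_address)
    return connection_socket, client_address
```

A server built this way would call parse_port(sys.argv) once, open the socket once, and then call accept_client in a loop, closing each connection socket after the response has been sent.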

Running the server
Put an HTML file (e.g. HelloWorld.html) in the same directory that the server is in. Run the server program. Determine the IP address of the host that is running the server (e.g. 128.238.251.26). From another host, open a browser and provide the corresponding URL. For example: http://128.238.251.26:6789/HelloWorld.html

"HelloWorld.html" is the name of the file you placed in the server directory. Note also the use of the port number after the colon. You need to replace the port number with whatever port you have used in the server code. In the above example, we have used the port number 6789. The browser should then display the contents of HelloWorld.html. If you omit ":6789", the browser will assume port 80, and you will get the web page from the server only if your server is listening on port 80.

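To make the port rule concrete, here is a small sketch that uses Python's standard urllib.parse module to show which port a URL refers to. The function name effective_port is hypothetical, and the sketch assumes plain http URLs only.

```python
from urllib.parse import urlsplit

HTTP_DEFAULT_PORT = 80  # port a browser assumes when an http URL names none


def effective_port(url):
    """Return the port named in the URL, or 80 when the URL omits it."""
    explicit_port = urlsplit(url).port  # None when no ":port" is present
    return explicit_port if explicit_port is not None else HTTP_DEFAULT_PORT


print(effective_port("http://128.238.251.26:6789/HelloWorld.html"))
print(effective_port("http://128.238.251.26/HelloWorld.html"))
```

The first call reports the explicit port 6789 and the second falls back to 80, which is why the second URL only works if the server is listening on port 80.
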
Next, try to get a file that is not present on the server. You should get a "404 Not Found" message.
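One possible way to produce both the normal response and the "404 Not Found" response is sketched below. This is an illustration rather than the skeleton's approach: the helper names requested_filename and build_response, the HTML text of the error page, and the decision to send a Content-Length header (on the assumption that the Part 2 client will want it) are all choices made for this sketch.

```python
def requested_filename(request_line):
    """Extract the file name from a line like 'GET /HelloWorld.html HTTP/1.1'."""
    path = request_line.split()[1]  # second token is the requested path
    return path.lstrip("/")         # open files relative to the server directory


def build_response(filename):
    """Return the complete HTTP response, as bytes, for the requested file."""
    try:
        with open(filename, "rb") as requested_file:
            body = requested_file.read()
        status_line = b"HTTP/1.1 200 OK\r\n"
    except OSError:
        # Missing or unreadable file: answer with a small error page instead.
        body = b"<html><body><h1>404 Not Found</h1></body></html>"
        status_line = b"HTTP/1.1 404 Not Found\r\n"
    # A blank line (\r\n\r\n) separates the headers from the body.
    headers = b"Content-Length: " + str(len(body)).encode("ascii") + b"\r\n\r\n"
    return status_line + headers + body
```

The returned bytes can be handed to the connection socket's sendall() method; whether your server also appends a final \r\n after the body depends on the skeleton code.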

I get an "Address already in use" error when I try to start my server. What is going on?
First, make sure you don't have another copy of the server running. If you have recently aborted a server process on the same port number, it can take a while before the OS considers the port free, so you may need to switch to a new port number unless you want to wait.
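Another common approach, which this assignment does not require, is to set the SO_REUSEADDR socket option before calling bind(). On most systems this lets a server rebind a port that the OS is still holding after a recent shutdown. The helper name open_reusable_server_socket is hypothetical; treat this as a sketch under that assumption, not as part of the skeleton code.

```python
import socket


def open_reusable_server_socket(port):
    """Open a listening TCP socket with SO_REUSEADDR set before bind()."""
    server_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # The option must be set before bind() to have any effect.
    server_socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server_socket.bind(("", port))
    server_socket.listen(1)
    return server_socket
```

This does not help if another copy of the server is still running on the port; in that case bind() is still expected to fail.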
Part 2: Web Client
Write an HTTP client program that can retrieve a single page from a web server. The program takes three command-line arguments: the server name, the server port, and the page to retrieve. If any of these is missing, the program aborts after printing a helpful list of the required arguments. Your program needs to parse the HTTP headers to locate the Content-Length: header. Your program only needs to return the body of a page whose response includes the Content-Length: header. If the header is absent, simply return without reading or printing the body. Your client should work with your previously developed web server as well as with other sites that deliver pages in which Content-Length is set.
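The argument check and the header search could look roughly like the sketch below. The function names parse_arguments and find_content_length are hypothetical, and the sketch assumes the header lines have already been read from the socket and decoded to strings with the trailing \r\n removed.

```python
USAGE = "WebClient.py <server name> <server port> <page>"


def parse_arguments(argv):
    """Return (server_name, server_port, page), or None if arguments are missing."""
    if len(argv) < 4:  # argv[0] is the script name, so four entries are expected
        print(USAGE)
        return None
    return argv[1], int(argv[2]), argv[3]


def find_content_length(header_lines):
    """Return the Content-Length value as an int, or None if the header is absent."""
    for line in header_lines:
        name, separator, value = line.partition(":")
        # HTTP header names are case-insensitive, so compare in lower case.
        if separator and name.strip().lower() == "content-length":
            return int(value.strip())
    return None
```

A return value of None from find_content_length is the signal to stop without reading or printing a body.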

The program simply outputs the entire body of the HTTP response (this excludes the header lines and the \r\n that precede the body). Here is an example run both against our simple Python server and against several pages on the Internet:
% python WebClient.py
WebClient.py <server name> <server port> <page> 

% python WebClient.py localhost 6789 HelloWorld.html
<html>
<body>
<h1>Hello world!</h1>
</body>
</html>

% python WebClient.py cs.mtech.edu 80 index.html
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>302 Found</title>
</head><body>
<h1>Found</h1>
<p>The document has moved <a href="http://cs.mtech.edu/main">here</a>.</p>
<hr>
<address>Apache/2.2.16 (Debian) Server at cs.mtech.edu Port 80</address>
</body></html>

% python WebClient.py cs.mtech.edu 80 main/ 

The last example has no output because this page uses "chunked" encoding for the body, so there is no Content-Length header in the HTTP response.
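Once the headers have been consumed, reading the body might be sketched as follows. The function name read_body and the 4096-byte read size are assumptions made for this sketch. The loop is there because a single recv() call may return fewer bytes than requested.

```python
import socket


def read_body(sock: socket.socket, content_length):
    """Read exactly content_length bytes of body, or nothing if it is None."""
    if content_length is None:
        return b""  # no Content-Length header (e.g. chunked encoding): no output
    chunks = []
    remaining = content_length
    while remaining > 0:
        chunk = sock.recv(min(remaining, 4096))
        if not chunk:  # server closed the connection early
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```

With this shape, the chunked page in the last example produces an empty result, matching the empty output shown above.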

My first call to recv() returned just the status code. The next call to recv() returned all the headers plus the body. How do I get just a single line from the HTTP response?
A TCP socket is simply a stream of bytes. The number and size of the send() calls made by the sender may not be reflected in the behavior of recv() on the other end. So you'll need to receive the header one byte at a time, waiting for \r\n to signify the end of a line. We recommend implementing a function like def getLine(sock) that reads from the socket one byte at a time until it gets the \r\n sequence.
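A minimal sketch of such a function is shown below. Only the name getLine and its sock parameter come from the recommendation above; returning the line as a string without its terminator, and decoding with ISO-8859-1, are assumptions made for this sketch.

```python
import socket


def getLine(sock: socket.socket):
    """Read one byte at a time until CRLF; return the line without the CRLF."""
    line = b""
    while not line.endswith(b"\r\n"):
        next_byte = sock.recv(1)
        if not next_byte:  # connection closed before a full line arrived
            break
        line += next_byte
    if line.endswith(b"\r\n"):
        line = line[:-2]  # drop only the terminator itself
    return line.decode("iso-8859-1")
```

Calling getLine repeatedly until it returns an empty string consumes the status line and all header lines, leaving the socket positioned at the start of the body.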
Submission. Submit your programs WebServer.py and WebClient.py via Moodle. Be sure each submitted source file has the required header with your name, username, and a description of the program. Programs should use descriptive variable names and appropriate levels of commenting (explain at a high level the goal of different sections of code).

Page last updated: October 14, 2013