The Web Server
Shana Blair and Mark Kimmet
University of Notre Dame, Notre Dame IN 46556, USA
Abstract
Since the creation of the World Wide Web by Tim Berners-Lee in 1990, web
servers have been a necessity for transferring information. Currently, however, loopholes
and backdoors in web servers like Microsoft’s IIS are widely known and are a cause for
concern. We set out to build a web server that sidesteps many of these known
vulnerabilities by eliminating the defaults set by programs like IIS and by giving us
more control over what is served and how it is served.
1 Introduction
Today nobody thinks twice about web servers or how they work. We go about our web
surfing activities rarely thinking about what goes on behind the scenes. Every copy of
Windows comes with its own personal web server and you can download Apache web
server online for free. No matter how much we take them for granted, they make up the
backbone of the World Wide Web, serving information to all corners of the world.
With this in mind, we endeavored to create a web server to gain better knowledge of how
the World Wide Web works. In so doing we have additionally addressed several key
security concerns of Microsoft’s popular IIS web server.
2 Implementation
When you sit down at your computer and point your browser to your favorite website
there are several transactions that take place. First, your browser sends a request to the
server. This request header has a defined format set by the HTTP (HyperText Transfer
Protocol) specification, which is published by the IETF and was developed in cooperation
with the W3C. The header looks similar to the following:
Accept: text/html, image/jpeg
/*blank line*/
The next step happens on the server’s end. The server parses the request header it
received to extract the data it needs to fill the request. From the first line it
determines what type of request was made (GET, HEAD, POST), the file that was
requested, in this case “/index.html,” and the HTTP version the client accepts. Below
this line are various optional attributes such as date, host, user-agent,
accepted languages, and accepted content types.
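As a rough illustration of this parsing step, the sketch below splits a request line into its three parts. It is a minimal example under stated assumptions, not the code from our Header class: the `RequestLine` struct and the `parseRequestLine` function are hypothetical names introduced only for this illustration.

```cpp
#include <sstream>
#include <string>

// Hypothetical holder for the three parts of an HTTP request line.
struct RequestLine {
    std::string method;   // e.g. "GET"
    std::string path;     // e.g. "/index.html"
    std::string version;  // e.g. "HTTP/1.1"
    bool valid = false;   // set only when all three parts look well-formed
};

// Splits the first line of a request on whitespace and checks that the
// method is one of GET, HEAD, POST and that the version starts with "HTTP/".
RequestLine parseRequestLine(const std::string& line) {
    RequestLine r;
    std::istringstream in(line);
    if (in >> r.method >> r.path >> r.version) {
        const bool knownMethod =
            r.method == "GET" || r.method == "HEAD" || r.method == "POST";
        r.valid = knownMethod && r.version.compare(0, 5, "HTTP/") == 0;
    }
    return r;
}
```

Calling `parseRequestLine("GET /index.html HTTP/1.1")` would yield the method `GET`, the path `/index.html`, and the version `HTTP/1.1`. Because stream extraction skips whitespace, a trailing carriage return on the line does not end up in the version string.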
Once this information is parsed, the server uses it to create a response. First it
verifies that the file exists and that the user has permission to view it. Next it builds the
response header and sends the header and the file (if applicable) to the client. An
example header looks like this:
HTTP/1.1 200 OK
Server: NCSA/2
Location: http://www.nd.edu/~mkimmet/index.html
Content-type: text/html
Content-length: 67
/* blank line */
The first line of the header gives the HTTP version used and the response
code, which indicates whether the request succeeded. The following optional lines give
information on the server software used, the content type being returned, and the content
length. Finally, a blank line separates the header from the contents of the file.
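The layout described above can be sketched as follows. This is a minimal illustration rather than the code from our Response class: the function name `buildResponse` and its parameters are hypothetical. It assumes the whole body is already in memory, so the content length can be taken directly from the body size instead of being counted separately.

```cpp
#include <string>

// Builds a complete response: status line, header fields, a blank line,
// then the body. Content-length comes from body.size(), so it always
// matches the number of bytes that follow the blank line.
std::string buildResponse(int code, const std::string& reason,
                          const std::string& contentType,
                          const std::string& body) {
    std::string out = "HTTP/1.1 " + std::to_string(code) + " " + reason + "\r\n";
    out += "Content-type: " + contentType + "\r\n";
    out += "Content-length: " + std::to_string(body.size()) + "\r\n";
    out += "\r\n";  // blank line separating the header from the body
    out += body;
    return out;
}
```

Deriving the length from the same buffer that is sent is one way to avoid a mismatch between the declared and the actual number of bytes.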
We broke the implementation into three separate classes: the server class (Myserver), the
header class (Header), and the response class (Response).
The server class is the main class from which all of our actions emanate. It stores the
general information about the server and controls the starting of the server as well as all
the subsequent connections that are created.
When we receive a request from a client browser over the internet, we create a new
process to handle this request. We used the Practical Sockets foundation, by Baylor
University’s CSE department, to allow us easy control over socket connections in a
process-driven environment. Once a process is created we then create a header object.
2.2.2 Header Class
The header class takes in the header buffer sent by the client browser, and instantiates a
Header object that parses and stores the data from the client’s header.
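To illustrate the kind of parsing such a class performs, the sketch below collects the "Name: value" lines of a request into a map. It is a simplified example, not our Header class: `parseHeaderFields` is a hypothetical name, and the sketch assumes field names are matched exactly as sent, whereas HTTP treats them as case-insensitive.

```cpp
#include <map>
#include <sstream>
#include <string>

// Parses the "Name: value" lines that follow the request line into a map.
// Parsing stops at the blank line that ends the header.
std::map<std::string, std::string> parseHeaderFields(const std::string& raw) {
    std::map<std::string, std::string> fields;
    std::istringstream in(raw);
    std::string line;
    std::getline(in, line);  // skip the request line itself
    while (std::getline(in, line)) {
        if (!line.empty() && line.back() == '\r') {
            line.pop_back();  // strip the CR of a CRLF line ending
        }
        if (line.empty()) break;  // blank line: end of header
        const std::string::size_type colon = line.find(':');
        if (colon == std::string::npos) continue;  // ignore malformed lines
        const std::string name = line.substr(0, colon);
        const std::string::size_type start =
            line.find_first_not_of(" \t", colon + 1);
        fields[name] = (start == std::string::npos) ? "" : line.substr(start);
    }
    return fields;
}
```

For a request containing the line `Accept: text/html, image/jpeg`, the map entry for `Accept` would hold `text/html, image/jpeg`.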
Using the Header object, the response class checks that the requested file name is valid
and that the file exists. Based on this information it generates a response, filling in the
necessary fields. Once this is done, the process sends the response and the file (if
necessary) to the client.
We have also implemented a form of logging to keep track of which files are accessed on
our server. The log is saved as a comma-delimited file containing the IP that made the request,
the file that was requested, and the response code.
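A log record of this shape could be produced as sketched below. The function names `formatLogLine` and `appendLog` are hypothetical and chosen only for this illustration, and the sketch assumes that file names do not themselves contain commas.

```cpp
#include <fstream>
#include <string>

// Formats one log record: client IP, requested file, response code.
std::string formatLogLine(const std::string& ip, const std::string& file,
                          int code) {
    return ip + "," + file + "," + std::to_string(code);
}

// Appends one record to the log file, creating the file if needed.
// Returns false if the file cannot be opened.
bool appendLog(const std::string& logPath, const std::string& ip,
               const std::string& file, int code) {
    std::ofstream log(logPath.c_str(), std::ios::app);
    if (!log) return false;
    log << formatLogLine(ip, file, code) << '\n';
    return true;
}
```

Opening the file in append mode for each record keeps earlier entries intact across server restarts, at the cost of reopening the file on every request.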
2.4 Security
Our first security check is made when the response class validates the file name. In so
doing it makes sure that the file requested does not contain “../” which otherwise could be
used to gain access to any file on the server. If the request is made with “../” in the file
name, our server returns a 403 (forbidden) error as seen below.
Another feature we have built in is checking to make sure the file exists. When a file that
is requested is found to be non-existent, a 404 error is returned to the user, similar to the
error below.
To further protect the server from viruses circulating on the internet, we do not use the
default directories that IIS web servers use. Some of these default scripting directories
can be used by viruses like Code Red to gain write permissions on the server,
allowing an attacker to install a program that uses your computer to attack
other vulnerable systems or to launch denial-of-service (DoS) attacks against
commercial websites.
Overall, our web server gives us more control over what we serve and how we serve it. It
serves only images, text, and HTML, and does not serve scripts such as ASP that could be
manipulated maliciously. We can decide what directory to use as our root directory, and we
can change this every time we run our server. Having written the code ourselves gives us
the reassurance that we know exactly what our server is doing.
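The checks described in this section can be summarized in a short sketch. It is an illustration under stated assumptions rather than our validation code: the function names are hypothetical, the list of extensions is an assumed example of "images, text and html", and returning 403 for a file type that is not served is an assumption (the text above specifies 403 only for the "../" case and 404 for a missing file).

```cpp
#include <string>

// True if the path contains a parent-directory reference ("../" or "..\").
bool containsTraversal(const std::string& path) {
    return path.find("../") != std::string::npos ||
           path.find("..\\") != std::string::npos;
}

// True if the extension is one of the static types the server will send.
// The extension list is an assumption made for this illustration.
bool isServedType(const std::string& path) {
    const std::string::size_type dot = path.rfind('.');
    if (dot == std::string::npos) return false;
    const std::string ext = path.substr(dot + 1);
    return ext == "html" || ext == "htm" || ext == "txt" ||
           ext == "jpg" || ext == "gif";
}

// Chooses a response code. fileExists is passed in by the caller so the
// decision logic can be exercised without touching the file system.
int statusFor(const std::string& path, bool fileExists) {
    if (containsTraversal(path) || !isServedType(path)) return 403;
    if (!fileExists) return 404;
    return 200;
}
```

Checking for traversal before checking for existence means a request such as `../secret.txt` is refused without revealing whether the file is present.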
3 Difficulties Faced
The majority of problems we encountered were due to our lack of experience with
creating web servers. This was a new area of programming for both of us and
implementing a web server requires an extensive knowledge of internet transfer protocol,
what constitutes a header and response, and many other aspects of programming with the
internet that we had not expected. Although our server may seem simple on the outside,
underneath the user interface it is very intricate. There were a number of subtle
programming issues that ended up being extremely important to the functionality of the
web server and cost us many hours of intense scrutiny over only a few lines of code. For
example, if our content length was off by even one byte, the response to the
client would not work.
Some of our other difficulties came with trying to implement a GUI for our web server.
Neither of us had any experience with GUIs in C++, so many hours were spent just
learning how to set one up. We then had to find out how to disable the
MFC precompiled headers for the files that did not need them and how to read
input from text boxes and manipulate it. Our GUI is still not at the level we would
like it to be, and our program is much more reliable when run from the console.
Also, our web server is somewhat sensitive to the computer it runs on. We
believe this is due to the socket class we are using, to how the host computer is
connected to the internet, and to how it handles security.
Our last set of difficulties came with implementing Design by Contract. At some point in
our coding, the assertions we supplied started causing the program to terminate
abnormally. Although we worked on this for a number of hours, we were not able to
locate the point at which something went wrong. Therefore, although we did implement
Design by Contract throughout our whole project, the assertions have been disabled so
that our program will execute correctly.
4 Conclusion
We started this project with the view that implementing a web server would be
a challenging yet entirely possible project. We looked forward to learning about
something that not many people know about and everyone takes for granted. Web servers
are essential to present-day information sharing. We did not realize quite how complex it
would be, though. We started out with big goals for our web server, but during the
process realized we needed to rethink those goals somewhat.
At first we planned to have our server handle every type of request method,
every type of response code, and all possible content types. This soon proved to be too
much to implement in such a short time. Now our server handles the GET request method,
three response codes, and five content types. We are excited that our web server works
and retrieves pages, because we had to work so many hours just to get it to do that. Also,
the console version is quite reliable and we were able to implement a couple of forms of
security. One of them is making sure the user can only access files in a specified folder
and its subdirectories. We also handled the case in which the user tries to
access folders using “..\”. We did not use virtual directories like Microsoft does, and
because of this, viruses like Code Red cannot get in through our web server.
We were able to provide a function that creates a log file. This way, the user
could keep track of what is being accessed on the server, who is doing the accessing, and
other similar information. Other extras we implemented include allowing the user to set
the port number to listen on and allowing the user to set the root directory from which to
serve web pages. Our goal of creating a GUI was accomplished, but it is not up to the
standards that we originally expected.
At the conclusion of our project, our web server provides a useful utility
for serving web pages to clients. The console version is stable and secure, and our
code was written so that adding improvements will be simple. We both have
learned a great deal about a topic essential to present-day information sharing that we
barely knew anything about before this project.
The task of creating a web server was more complex than we anticipated, and
there is room for a number of improvements to our project. First, we would like to make
it possible to turn the assertions back on, because the server would then be
extremely stable and reliable, and we would not have to worry about someone overflowing
the character array that holds the header to gain access to the rest of the
files on the computer. Second, we would like to support the rest of the response codes
and content types. This would require a lot more time but, given the way
we designed our code, could be implemented easily.
We would like to allow form inputs and processing. We also could implement
virtual directories, although our web server would not be quite as secure. Lastly, we
would like to provide even more security by limiting access based on user rights and
enabling password protected directories.
Our server is a basic web server, but it provides a strong base to implement a
much more advanced server. We did not worry so much about speed as we did about
security and reliability for this project. Increasing the speed of our server also could be a
goal for future work.
5 References
Hughes, Merlin. Java Network Programming. Greenwich: Manning, 1999.
“Microsoft’s HTTP Revealed (a HTTP Primer).” Accessed Dec 1, 2002.
<http://www.microsoft.com/mind/0796/protocol/protocol.asp>
Appendix A – Screen Shots
Appendix B – Selected Code Segments
try {
TCPServerSocket servSock(my_port); // Server Socket object
// GET HEADER SENT FROM CLIENT
char echoBuffer[RCVBUFSIZE];
char headerbuffer[RCVBUFSIZE];
int recvMsgSize, headerSize;
string theheader = "";
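// reads the header sent by the client into a char array and records its size
headerSize = 0; // stays 0 if nothing is received
if ((recvMsgSize = sock->recv(echoBuffer, RCVBUFSIZE)) > 0) {
// copy exactly the bytes received: recv data is not null-terminated, so strcpy is unsafe here
memcpy(headerbuffer, echoBuffer, recvMsgSize); // requires <cstring>
headerSize = recvMsgSize; // records its size
}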
// CREATE HEADER
Header currentHeader(headerbuffer, headerSize, getmy_default_file());
// places them in a character array for use with reading from files
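int z = 0; // declared outside the loop because it is still used after the loop ends
for(; z<dir.size() && z<100; z++)
{
fullpath[z] = dir.at(z);
}
// if the last character of the default directory is not a slash add one
// (z > 0 guards against reading fullpath[-1] when the directory string is empty)
if (z > 0 && fullpath[z-1] != '\\') {
fullpath[z] = '\\';
z++;
}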
// adds the file name to the character array, completing the full path
for(int x=0; x<gfile.size() && x<100 && z<202; x++)
{
fullpath[z] = gfile.at(x);
z++;
}
// end set full file name and path
// SENDING THE FILE TO THE CLIENT
test.close();
}
delete sock;
}
int my_string_size = 0;
int my_accept_size = 0;
int my_path_size = 0;
int my_encoding_size = 0;
char c; // current element in array
char c2; // next element in array
// More Preconditions
Observe("Old my_string_size", my_string_size);
Observe("Old my_accept_size", my_accept_size);
Observe("Old my_path_size", my_path_size);
Observe("Old my_encoding_size", my_encoding_size);
// Implementation
// constructor
explicit Response(Header myHeader, char sname[100], char fullpath[202]){
string tempsname = sname;
string tempfullpath = fullpath;
// Preconditions
Require(tempsname.size()<100, "No overflowing");
Require(tempfullpath.size()<202, "No overflowing");
// Implementation
//check if valid filename/check if file exist, then set the status
validation(myHeader.get_path_info(), fullpath);
fin.close();
}