In this project, we'll finish the implementation of a web server in C.
What you need to write:
- HTTP request parser
- HTTP response builder
- LRU cache
  - Doubly linked list (some functionality provided)
  - Use existing hashtable functionality (below)
- Your code will interface with the existing code. Understanding the existing code is an expected part of this challenge.
What's already here:
- net.h and net.c contain low-level networking code
- mime.h and mime.c contain functionality for determining the MIME type of a file
- file.h and file.c contain handy file-reading code that you may want to use, namely the file_load() and file_free() functions for reading file data and deallocating file data, respectively (or you could perform these operations manually)
- hashtable.h and hashtable.c contain an implementation of a hashtable (this one is a bit more complicated than what you built in the Hashtables sprint)
- llist.h and llist.c contain an implementation of a doubly-linked list (used solely by the hashtable; you don't need it)
- cache.h and cache.c are where you will implement the LRU cache functionality for days 3 and 4
A web server is a piece of software that accepts HTTP requests (e.g. GET requests for HTML pages), and returns responses (e.g. HTML pages). Other common uses are GET requests for getting data from RESTful API endpoints, images within web pages, and POST requests to upload data to the server (e.g. a form submission or file upload).
We will write a simple web server that returns files and some specialized data on a certain endpoint.
- http://localhost:3490/d20 should return a random number between 1 and 20 inclusive as text/plain data.
- Any other URL should map to the serverroot directory and files that lie within. For example: http://localhost:3490/index.html should serve the file ./serverroot/index.html
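As a rough illustration of these two routing rules, here is a minimal sketch that parses a request line and decides where it should go. The names route_request, enum route, and SERVER_ROOT are hypothetical and are not part of the provided skeleton; the real handlers in server.c will look different.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define SERVER_ROOT "./serverroot"  /* assumed location of the served files */

/* Hypothetical route kinds, for illustration only. */
enum route { ROUTE_D20, ROUTE_FILE, ROUTE_NOT_FOUND };

/*
 * Parse the request line (e.g. "GET /index.html HTTP/1.1") and decide how
 * to route it. For ROUTE_FILE, the on-disk path is written to filepath.
 */
enum route route_request(const char *request, char *filepath, size_t size)
{
    char method[16], path[1024], protocol[16];

    assert(request != NULL && filepath != NULL);

    /* Field widths keep sscanf() from overflowing the buffers. */
    if (sscanf(request, "%15s %1023s %15s", method, path, protocol) != 3) {
        return ROUTE_NOT_FOUND;
    }

    /* strcmp() returns 0 when the strings match. */
    if (strcmp(method, "GET") != 0) {
        return ROUTE_NOT_FOUND;
    }

    if (strcmp(path, "/d20") == 0) {
        return ROUTE_D20;
    }

    /* Any other path maps into the server root directory. */
    int n = snprintf(filepath, size, "%s%s", SERVER_ROOT, path);
    if (n < 0 || (size_t)n >= size) {
        return ROUTE_NOT_FOUND;  /* the path did not fit in the buffer */
    }
    return ROUTE_FILE;
}
```

Note that this sketch does not guard against paths containing "..", which a real server would want to reject.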
Examine the skeleton source code in server.c and cache.c for which pieces you'll need to implement.
IMPORTANT: Spend some time inventorying the code to see what is where. Write down notes. Write an outline. Note which functions call which other functions. Time spent up front doing this will reduce overall time spent down the road.
The existing code is all one big hint on how to attack the problem.
For the portions that are already written, study the moderately-well-commented code to see how it works.
There is a Makefile provided. On the command line, type make to build the server.
Type ./server to run the server.
Read through all the main and stretch goals before writing any code to get an overall view, then come back to goal #1 and dig in.
- Implement send_response().

  This function is responsible for formatting all the pieces that make up an HTTP response into the proper format that clients expect. In other words, it needs to build a complete HTTP response with the given parameters. It should write the response to the string in the response variable.

  The total length of the header and body should be stored in the response_length variable so that the send() call knows how many bytes to send out over the wire.

  See the HTTP section above for an example of an HTTP response and use that to build your own.

  Hint: sprintf() for creating the HTTP response. strlen() for computing content length. sprintf() also returns the total number of bytes in the result string, which might be helpful. For getting the current time for the Date field of the response, you'll want to look at the time() and localtime() functions, both of which are declared in the time.h header file.

  The HTTP Content-Length header only includes the length of the body, not the header. But the response_length variable used by send() is the total length of both header and body.

  You can test whether you've gotten send_response working by calling the resp_404 function from somewhere inside the main function, and seeing if the client receives the 404 response.
- Examine handle_http_request() in the file server.c.

  You'll want to parse the first line of the HTTP request header to see if this is a GET or POST request, and to see what the path is. You'll use this information to decide which handler function to call.

  The variable request in handle_http_request() holds the entire HTTP request once the recv() call returns.

  Read the three components from the first line of the HTTP header. Hint: sscanf().

  Right after that, call the appropriate handler based on the request type (GET, POST) and the path (/d20 or other file path). You can start by just checking for /d20 and then add arbitrary files later.

  Hint: strcmp() for matching the request method and path. Another hint: strcmp() returns 0 if the strings are the same!

  Note: you can't switch() on strings in C, since switch works only on integer values and cannot compare string contents. You have to use an if-else block with strcmp() to get the job done.

  If you can't find an appropriate handler, call resp_404() instead to give them a "404 Not Found" response.
- Implement the get_d20() handler. This will call send_response().

  See above at the beginning of the assignment for what get_d20() should pass to send_response().

  If you need a hint as to what the send_response() call should look like, check out the usage of it in resp_404(), just above there.

  Note that unlike the other responses that send back file contents, the d20 endpoint will simply compute a random number and send it back. It does not read the number from a file.

  The fd variable that is passed widely around to all the functions holds a file descriptor. It's just a number used to represent an open communications path. Usually they point to regular files on disk, but in this case it points to an open socket network connection. All of the code to create and use fd has been written already, but we still need to pass it around to the points it is used.
- Implement arbitrary file serving.

  Any other URL should map to the serverroot directory and files that lie within. For example:

  http://localhost:3490/index.html serves file ./serverroot/index.html.

  http://localhost:3490/foo/bar/baz.html serves file ./serverroot/foo/bar/baz.html.

  You might make use of the functionality in file.c to make this happen.

  You also need to set the Content-Type header depending on what data is in the file. mime.c has useful functionality for this.
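To make the response-building goal more concrete, here is a minimal sketch of formatting a response with snprintf(). The helpers build_response and roll_d20 are hypothetical names, not functions from the skeleton, and the Date format and the Connection header are assumptions. The real send_response() has its own parameters and also has to call send().

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper: a random number between 1 and 20 inclusive. */
int roll_d20(void)
{
    return rand() % 20 + 1;
}

/*
 * Hypothetical helper: format a complete HTTP response into buf.
 * Returns the total number of bytes (header + body), or -1 on failure.
 * The body is assumed to be a NUL-terminated text string here.
 */
int build_response(char *buf, size_t size, const char *status,
                   const char *content_type, const char *body)
{
    assert(buf != NULL && status != NULL && content_type != NULL && body != NULL);

    /* Date header: current local time (this format is an assumption). */
    char date[64];
    time_t now = time(NULL);
    struct tm *tm_now = localtime(&now);
    if (tm_now == NULL ||
        strftime(date, sizeof date, "%a %b %d %H:%M:%S %Y", tm_now) == 0) {
        return -1;
    }

    size_t body_length = strlen(body);

    /* Content-Length counts only the body; the return value counts everything. */
    int n = snprintf(buf, size,
                     "%s\r\n"
                     "Date: %s\r\n"
                     "Connection: close\r\n"
                     "Content-Type: %s\r\n"
                     "Content-Length: %zu\r\n"
                     "\r\n"
                     "%s",
                     status, date, content_type, body_length, body);

    if (n < 0 || (size_t)n >= size) {
        return -1;  /* formatting error or buffer too small */
    }
    return n;
}
```

A d20 handler could format roll_d20() into a small buffer with snprintf() and pass that buffer as the body with a content type of text/plain. Calling srand() once at startup gives different rolls on each run.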
Implement an LRU cache. This will be used to cache files in RAM so you don't have to load them through the OS.
When a file is requested, the cache should be checked to see if it is there. If so, the file is served from the cache. If not, the file is loaded from disk, served, and saved to the cache.
The cache has a maximum number of entries. If it has more entries than the max, the least-recently used entries are discarded.
The cache consists of a doubly-linked list and a hash table.
The hashtable code is already written and can be found in hashtable.c.
- Implement cache_put() in cache.c.

  Algorithm:

  - Allocate a new cache entry with the passed parameters.
  - Insert the entry at the head of the doubly-linked list.
  - Store the entry in the hashtable as well, indexed by the entry's path.
  - Increment the current size of the cache.
  - If the cache size is greater than the max size:
    - Remove the tail entry from the hashtable, using the entry's path and the hashtable_delete function.
    - Remove the cache entry at the tail of the linked list (this is the least-recently used one).
    - Free the cache entry.
    - Ensure the size counter for the number of entries in the cache is correct.
- Implement cache_get() in cache.c.

  Algorithm:

  - Attempt to find the cache entry pointer by path in the hash table.
  - If not found, return NULL.
  - Move the cache entry to the head of the doubly-linked list.
  - Return the cache entry pointer.
- Add caching functionality to server.c.

  When a file is requested, first check to see if the path to the file is in the cache (use the file path as the key). If it's there, serve it back. If it's not there:

  - Load the file from disk (see file.c)
  - Store it in the cache
  - Serve it
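The put and get algorithms above can be sketched in a self-contained form. This is an illustration only: the types struct entry and struct lru and the functions lru_put and lru_get are hypothetical stand-ins for what lives in cache.h and cache.c, entries hold only a path, and a linear search over the list replaces the hashtable lookup so that the example compiles on its own.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified types; the real ones live in cache.h. */
struct entry {
    char *path;
    struct entry *prev, *next;
};

struct lru {
    struct entry *head, *tail;  /* head = most recently used */
    int max_size, cur_size;
};

/* Unlink an entry from the list without freeing it. */
static void unlink_entry(struct lru *c, struct entry *e)
{
    if (e->prev) e->prev->next = e->next; else c->head = e->next;
    if (e->next) e->next->prev = e->prev; else c->tail = e->prev;
    e->prev = e->next = NULL;
}

/* Insert an entry at the head of the list. */
static void push_head(struct lru *c, struct entry *e)
{
    e->prev = NULL;
    e->next = c->head;
    if (c->head) c->head->prev = e; else c->tail = e;
    c->head = e;
}

/* Linear search standing in for the hashtable lookup (an assumption). */
static struct entry *find_entry(struct lru *c, const char *path)
{
    for (struct entry *e = c->head; e != NULL; e = e->next) {
        if (strcmp(e->path, path) == 0) return e;
    }
    return NULL;
}

/* Add a path; evict the least-recently used entry if over capacity. */
int lru_put(struct lru *c, const char *path)
{
    assert(c != NULL && path != NULL);

    struct entry *e = malloc(sizeof *e);
    if (e == NULL) return -1;
    e->path = malloc(strlen(path) + 1);
    if (e->path == NULL) { free(e); return -1; }
    strcpy(e->path, path);

    push_head(c, e);
    c->cur_size++;

    if (c->cur_size > c->max_size) {
        struct entry *old = c->tail;  /* least-recently used entry */
        unlink_entry(c, old);
        free(old->path);
        free(old);
        c->cur_size--;
    }
    return 0;
}

/* Look up a path; on a hit, move the entry to the head of the list. */
struct entry *lru_get(struct lru *c, const char *path)
{
    assert(c != NULL && path != NULL);

    struct entry *e = find_entry(c, path);
    if (e == NULL) return NULL;
    unlink_entry(c, e);
    push_head(c, e);
    return e;
}
```

The sketch does not handle putting the same path twice. With a hashtable in place of find_entry, lookups no longer need to walk the list, which is the reason the cache pairs the two data structures.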
There's a set of unit tests included to ensure that your cache implementation is functioning correctly. From the src directory, run make tests in order to run the unit tests against your implementation.
- Implement find_start_of_body() to locate the start of the HTTP request body (just after the header).
- Implement the post_save() handler. Modify the main loop to pass the body into it. Have this handler write the file to disk.

  Hint: open(), write(), close(). The fopen(), fwrite(), and fclose() variants can also be used, but the former three functions will be slightly more straightforward to use in this case.

  The response from post_save() should be of type application/json and should be {"status":"ok"}.
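One possible way to locate the body is to search for the blank line that ends the header. The sketch below uses the hypothetical name find_body so it is not mistaken for the skeleton's own signature, and it assumes the request is NUL-terminated and uses CRLF line endings.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Return a pointer to the first byte after the blank line that ends the
 * header, or NULL if no blank line is present. Assumes the request buffer
 * is NUL-terminated and uses CRLF ("\r\n") line endings.
 */
char *find_body(char *request)
{
    assert(request != NULL);

    char *sep = strstr(request, "\r\n\r\n");
    if (sep == NULL) {
        return NULL;
    }
    return sep + 4;  /* skip past "\r\n\r\n" */
}
```

A more tolerant version might also accept a bare "\n\n" separator, since not every client sends CRLF.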
We know that if the user hits http://localhost:3490/index.html it should
return the file at ./serverroot/index.html.
Make it so that if the user hits http://localhost:3490/ (which is endpoint
/, on disk ./serverroot/), if no file is found there, try adding an
index.html to the end of the path and trying again.
So http://localhost:3490/ would first try:
./serverroot/
fail to find a file there, then try:
./serverroot/index.html
and succeed.
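The retry step can be sketched as a small path-building helper. The function name index_fallback_path is hypothetical; the idea is to try loading the original path first and, only if that fails, to try the path this helper produces.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Build the fallback path by appending "index.html" to path, adding a "/"
 * separator only if path does not already end with one.
 * Returns 0 on success, -1 if the result does not fit in out.
 */
int index_fallback_path(const char *path, char *out, size_t size)
{
    assert(path != NULL && out != NULL);

    size_t len = strlen(path);
    const char *sep = (len > 0 && path[len - 1] == '/') ? "" : "/";

    int n = snprintf(out, size, "%s%sindex.html", path, sep);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

For example, "./serverroot/" becomes "./serverroot/index.html", and "./serverroot/docs" becomes "./serverroot/docs/index.html".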
3. Implement functionality that will allow your server to serve any type of data, not just text data
All the files that the server has been responding with have been text files of some sort. Augment the server such that if a client requests http://localhost:3490/cat.jpg, the server is able to fetch and respond with the requested file (of course, the file needs to exist in the server's directory structure).
Add an image to the ./serverroot directory and update the send_response function such that it can handle any type of data. Hint: you'll want to look into the memcpy function from the C standard library.
Note that file_load doesn't actually need any modification. It's already been written in such a way that it can handle arbitrary types of file data.
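The reason memcpy matters is that binary data can contain zero bytes, which string functions such as sprintf() and strlen() treat as the end of the string. Here is a minimal sketch of the idea, using a hypothetical helper named build_binary_response with a fixed 200 status line. It is not the skeleton's send_response().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: write header text followed by a raw body into buf.
 * The body may contain zero bytes, so it is copied with memcpy() and its
 * length is passed explicitly instead of being measured with strlen().
 * Returns the total byte count, or -1 if buf is too small.
 */
int build_binary_response(char *buf, size_t size, const char *content_type,
                          const void *body, size_t body_length)
{
    assert(buf != NULL && content_type != NULL && body != NULL);

    /* The header is plain text, so snprintf() is fine for this part. */
    int header_length = snprintf(buf, size,
                                 "HTTP/1.1 200 OK\r\n"
                                 "Content-Type: %s\r\n"
                                 "Content-Length: %zu\r\n"
                                 "\r\n",
                                 content_type, body_length);

    if (header_length < 0 || (size_t)header_length >= size) return -1;
    if (body_length > size - (size_t)header_length) return -1;

    /* Copy the body byte for byte, right after the header. */
    memcpy(buf + header_length, body, body_length);
    return header_length + (int)body_length;
}
```

The returned total, not strlen() of the buffer, is what a send() call would need as its length.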
It doesn't make sense to cache things forever--what if the file changes on disk?
Add a created_at timestamp to cache entries.
If an item is found in the cache, check to see if it is more than 1 minute old. If it is, delete it from the cache, then load the new one from disk as if it weren't found.
You'll have to add a cache_delete function to your cache code that does the work of actually removing entries that are too old from the cache.
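The age check itself could look something like the sketch below. The struct timed_entry and the function entry_expired are hypothetical and carry only the timestamp; in the real code, created_at would be a new field on the existing cache entry, set with time(NULL) when the entry is created.

```c
#include <assert.h>
#include <time.h>

#define MAX_AGE_SECONDS 60.0  /* "more than 1 minute old" */

/* Hypothetical, trimmed-down cache entry carrying only the timestamp. */
struct timed_entry {
    time_t created_at;
};

/* Return 1 if the entry is more than MAX_AGE_SECONDS old at time now. */
int entry_expired(const struct timed_entry *e, time_t now)
{
    assert(e != NULL);

    /* difftime() returns the difference in seconds as a double. */
    return difftime(now, e->created_at) > MAX_AGE_SECONDS;
}
```

When this returns 1, the caller would delete the entry and then fall through to the same path used for a cache miss.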
Difficulty: Pretty Dang Tough
Research the pthreads library.
When a new connection comes in, launch a thread to handle it.
Be sure to lock the cache when a thread accesses it so the threads don't step on each other's toes and corrupt the cache.
Also have thread cleanup handlers to handle threads that have died.
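As a starting point for the research, here is a minimal sketch of the locking pattern. It does not touch sockets or the real cache: a shared counter stands in for the cache, and all names (cache_lock, shared_hits, handle_connection, run_workers) are hypothetical. It needs to be compiled with pthread support, for example with the -pthread flag.

```c
#include <assert.h>
#include <pthread.h>

#define INCREMENTS_PER_THREAD 10000
#define MAX_THREADS 16

/* Stand-in for the shared cache: a counter guarded by a mutex. */
static pthread_mutex_t cache_lock = PTHREAD_MUTEX_INITIALIZER;
static long shared_hits = 0;

/* Thread entry point: each "connection" touches shared state under the lock. */
static void *handle_connection(void *arg)
{
    (void)arg;  /* a real handler would receive the connection's fd here */

    for (int i = 0; i < INCREMENTS_PER_THREAD; i++) {
        pthread_mutex_lock(&cache_lock);
        shared_hits++;
        pthread_mutex_unlock(&cache_lock);
    }
    return NULL;
}

/* Launch n threads, wait for them, and return the final counter value. */
long run_workers(int n)
{
    assert(n > 0 && n <= MAX_THREADS);

    pthread_t threads[MAX_THREADS];
    int started = 0;

    shared_hits = 0;  /* no threads are running yet, so no lock is needed */

    for (int i = 0; i < n; i++) {
        if (pthread_create(&threads[i], NULL, handle_connection, NULL) != 0) {
            break;
        }
        started++;
    }
    for (int i = 0; i < started; i++) {
        pthread_join(threads[i], NULL);
    }
    return shared_hits;
}
```

Without the mutex, the increments from different threads could interleave and the final count could come up short. A server that launches one thread per connection would more likely call pthread_detach() than join each thread, since the main loop has to keep accepting new connections.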