My Google Summer of Code project proposal stated that I would add TCP support to the network branch of ReactOS, which sought to integrate lwIP 1.4.1 as the operating system's protocol-level network driver. The driver would ultimately be tested by using it to replace the network driver in an installation of Windows Server 2003. The full proposal can be found here. At the time of my proposal, I underestimated the amount of effort a fully working network driver would take.
In this final week, I tried to do as much as possible to get my driver into a usable state for simple network C programs. My first task this week was to fix a problem with port freeing. When a TCP connection died, its lwIP PCB would sometimes remain, preventing new sockets from binding to the port it still occupied. After a lot of tracing, I discovered that an lwIP internal semantic was at play.
In week 10, I had completed a major rewrite of my driver. In week 11, I dove into the problem of lwIP not being thread-safe once again. While I was able to deal with most of the individual bugs that kept popping up, each one was taking me more time to solve due to the haphazard nature of my previous fixes. At the beginning of last week, it became apparent that I would need to rework most of my code once again if I wanted to have any hope of circumventing the multithreading issue once and for all.
This past week, I have primarily focused on thread-safety. Three weeks ago, I discovered that lwIP's core code is not thread-safe. When left unmodified, each lwIP thread will access several unprotected global linked lists as well as use a set of global variables to process any and all incoming packets. One option to solve this problem was to modify the core code so the global data was protected from concurrent access.
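To illustrate that option, here is a minimal sketch of guarding a global linked list with a mutex. It is not lwIP's code: `demo_pcb`, `active_pcbs`, and the `demo_pcb_*` functions are hypothetical names, and it assumes a user-space build with POSIX threads, whereas a kernel driver would use a kernel lock instead.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a protocol control block; not lwIP's struct. */
struct demo_pcb {
    int local_port;
    struct demo_pcb *next;
};

/* A global list of the kind described above, plus the lock that guards it. */
static struct demo_pcb *active_pcbs = NULL;
static pthread_mutex_t active_pcbs_lock = PTHREAD_MUTEX_INITIALIZER;

/* Link a PCB at the head of the list while holding the lock. */
void demo_pcb_register(struct demo_pcb *pcb)
{
    assert(pcb != NULL);
    pthread_mutex_lock(&active_pcbs_lock);
    pcb->next = active_pcbs;
    active_pcbs = pcb;
    pthread_mutex_unlock(&active_pcbs_lock);
}

/* Unlink a PCB; returns 1 if it was on the list, 0 otherwise. */
int demo_pcb_remove(struct demo_pcb *pcb)
{
    int found = 0;
    pthread_mutex_lock(&active_pcbs_lock);
    for (struct demo_pcb **link = &active_pcbs; *link != NULL; link = &(*link)->next) {
        if (*link == pcb) {
            *link = pcb->next;
            pcb->next = NULL;
            found = 1;
            break;
        }
    }
    pthread_mutex_unlock(&active_pcbs_lock);
    return found;
}

/* Walk the list under the lock so a concurrent insert cannot corrupt the walk. */
int demo_pcb_count(void)
{
    int n = 0;
    pthread_mutex_lock(&active_pcbs_lock);
    for (struct demo_pcb *p = active_pcbs; p != NULL; p = p->next)
        n++;
    pthread_mutex_unlock(&active_pcbs_lock);
    return n;
}
```

Every reader and writer has to take the same lock for this to help, which is why retrofitting it onto code that touches its globals from many places is a large change.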
By the end of last week, I had rewritten all of the functions that I implemented and modified some of the functions that existed before I started on this project. I now had to start flushing out all of the bugs that invariably exist after a rewrite. The major issue I dealt with this past week revolved around properly handling a TDI_LISTEN. Part of the purpose of the rewrite was to reorganize my data so each TCP_CONTEXT struct represented a user socket or connection endpoint, and each ADDRESS_FILE struct only represented a local network address.
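The split between the two structs can be pictured roughly as follows. This is only a sketch under assumptions: the field names, the reference count, and the `demo_` types are hypothetical and are not taken from the driver's actual TCP_CONTEXT or ADDRESS_FILE definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: one local network address, shared by any number of sockets. */
struct demo_address_file {
    uint32_t local_ip;    /* IPv4 address, host byte order in this sketch */
    uint16_t local_port;
    unsigned ref_count;   /* number of contexts currently attached */
};

/* Hypothetical: one user socket or connection endpoint. */
struct demo_socket_context {
    struct demo_address_file *address; /* NULL until associated */
};

/* Attach a socket context to a local address. */
void demo_associate(struct demo_socket_context *ctx, struct demo_address_file *addr)
{
    assert(ctx != NULL && addr != NULL && ctx->address == NULL);
    ctx->address = addr;
    addr->ref_count++;
}

/* Detach a socket context; the address stays valid for its other users. */
void demo_disassociate(struct demo_socket_context *ctx)
{
    assert(ctx != NULL);
    if (ctx->address != NULL) {
        ctx->address->ref_count--;
        ctx->address = NULL;
    }
}
```

With this shape, a listening socket and the connections it accepts can each have their own context while pointing at the same local address.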
Last week ended with my realization that lwIP was not thread-safe, and me reading up on various ways to get around that. Last weekend, I spent a lot of time tinkering with lwIP's core code to see how hard it would be to make it thread-safe. I ultimately failed to actually make the library thread-safe, but I did learn a lot of things about lwIP that I hadn't known before I started digging into the source code in so much depth.
This past week started with me mindlessly chasing down memory bugs after having gotten WinDBG up and running. A particularly annoying bug involved an lwIP protocol control block being dereferenced by lwIP after it had been freed. I could not find a place in my driver where I tried to use a dead PCB pointer, so I looked deeper. I stepped through the code and read more of lwIP's source code.
Last week ended with Art going through decompiled assembly to find a bug for me, because the stack trace in the kernel debugger was pointing me to the wrong line in the C source code. It turns out that the problem was a NULL pointer dereference in the RECEIVE callback. As always in programming, a lot of effort went into catching a small oversight. With that leftover bug from last week resolved, I moved on to my tasks for this week - more debugging, a code review done by Thomas Faber, and finally setting up WinDBG.
I closed out last week by drawing a flow chart in preparation for restructuring my code around some new state variables. For most of this week, I was working out the details of this chart and implementing it. One of the new state variables I added is a field in the CONNECTION_CONTEXT struct that specifies what that particular connection is doing. This way, my driver can easily identify whether a socket is bound, whether it is currently connected, and what operations it is trying to perform.
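A state field like this is commonly modeled as an enum with a small transition check. The sketch below is illustrative only: the state names, the allowed transitions, and the `demo_` identifiers are assumptions and do not come from the driver's CONNECTION_CONTEXT.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical state values; the driver's real names are not given here. */
enum demo_conn_state {
    DEMO_UNBOUND,
    DEMO_BOUND,
    DEMO_LISTENING,
    DEMO_CONNECTING,
    DEMO_CONNECTED,
    DEMO_CLOSING
};

struct demo_connection_context {
    enum demo_conn_state state; /* what this connection is currently doing */
};

bool demo_is_bound(const struct demo_connection_context *ctx)
{
    assert(ctx != NULL);
    return ctx->state != DEMO_UNBOUND;
}

bool demo_is_connected(const struct demo_connection_context *ctx)
{
    assert(ctx != NULL);
    return ctx->state == DEMO_CONNECTED;
}

/* Apply a transition if this sketch's rules allow it; returns false otherwise. */
bool demo_transition(struct demo_connection_context *ctx, enum demo_conn_state next)
{
    bool ok = false;
    assert(ctx != NULL);
    switch (ctx->state) {
    case DEMO_UNBOUND:    ok = (next == DEMO_BOUND); break;
    case DEMO_BOUND:      ok = (next == DEMO_LISTENING || next == DEMO_CONNECTING); break;
    case DEMO_LISTENING:  ok = (next == DEMO_CLOSING); break;
    case DEMO_CONNECTING: ok = (next == DEMO_CONNECTED || next == DEMO_CLOSING); break;
    case DEMO_CONNECTED:  ok = (next == DEMO_CLOSING); break;
    case DEMO_CLOSING:    ok = (next == DEMO_UNBOUND); break;
    }
    if (ok)
        ctx->state = next;
    return ok;
}
```

Keeping the rules in one function means a request that arrives in the wrong state, such as a connect on an unbound socket, is rejected in a single place.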
Going into week 5, I started with a code-complete but very much incorrect implementation of the TDI_SEND and TDI_RECEIVE IRP handlers. Neither my TCP_CONTEXT data structure nor the existing ADDRESS_FILE data structure contained a way to keep track of pending IRPs, so I had no way of knowing which IRPs were outstanding and which connection contexts they were supposed to be associated with. Without a clear scheme for keeping track of the information, IRP pointers invariably got lost.
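One common way to avoid losing pending requests is to give each connection context its own queue of them. The following is a minimal, portable sketch of that idea and not the driver's code: the `demo_` types are hypothetical, and the `irp` field is an opaque pointer standing in for a real IRP.

```c
#include <assert.h>
#include <stddef.h>

enum demo_request_kind { DEMO_SEND, DEMO_RECEIVE };

/* Hypothetical record for one request that has been marked pending. */
struct demo_pending_request {
    enum demo_request_kind kind;
    void *irp;                          /* opaque stand-in for an IRP pointer */
    struct demo_pending_request *next;
};

/* Hypothetical per-connection context that owns its pending requests. */
struct demo_tcp_context {
    struct demo_pending_request *head;
    struct demo_pending_request *tail;
};

/* Append a request so it stays reachable from its connection context. */
void demo_queue_pending(struct demo_tcp_context *ctx, struct demo_pending_request *req)
{
    assert(ctx != NULL && req != NULL);
    req->next = NULL;
    if (ctx->tail != NULL)
        ctx->tail->next = req;
    else
        ctx->head = req;
    ctx->tail = req;
}

/* Remove and return the oldest pending request of the given kind, or NULL. */
struct demo_pending_request *demo_take_pending(struct demo_tcp_context *ctx,
                                               enum demo_request_kind kind)
{
    struct demo_pending_request *prev = NULL;
    assert(ctx != NULL);
    for (struct demo_pending_request *cur = ctx->head; cur != NULL;
         prev = cur, cur = cur->next) {
        if (cur->kind == kind) {
            if (prev != NULL)
                prev->next = cur->next;
            else
                ctx->head = cur->next;
            if (ctx->tail == cur)
                ctx->tail = prev;
            cur->next = NULL;
            return cur;
        }
    }
    return NULL;
}
```

A receive callback could then call `demo_take_pending(ctx, DEMO_RECEIVE)` to find the request it should complete. In a real driver the queue would also need a lock and cancellation handling, both omitted here.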