Session Manager Subsystem Rewrite
The session manager subsystem (SMSS) is responsible for initializing much of the operating system environment and starting the services and processes needed to get a user to the login screen. Specifically, it creates the environment variables inherited by other processes, loads the Win32 subsystem, sets up the pagefiles for virtual memory, and finally starts the winlogon.exe process, amongst other things. The implementation that was in ReactOS took many shortcuts in how it communicated with subsystem components, which in turn led to those components not correctly implementing the interfaces SMSS was supposed to use. Alex Ionescu decided to rectify this situation and systematically replaced SMSS components with a new implementation, SMSS2. Initially, SMSS2 did nothing more than initialize the environment, but it slowly took over more and more responsibility from the existing SMSS, including creating the environment variables and creating the pagefiles. Alex jumped back and forth between SMSS and the Win32 subsystem components, win32k.sys and CSRSRV, the client-server runtime subsystem, to fix their interactions with SMSS, and ultimately finished the CSRSRV implementation in trunk.
The CSRSRV work also dealt with the threading model used to service requests. Previously, CSRSRV would create a new thread for every request it received. In addition, the old code leaked ETHREADs, the data structures that represent threads, on a massive scale. ETHREAD instances are allocated from the non-paged pool, and because CSRSRV was spawning several thousand threads, each leaking approximately 300 bytes, ReactOS ended up chewing through a rather scarce system resource rather quickly, even before considering the other smaller leaks the old CSRSRV suffered. Resolution of this issue has significantly decreased the number of simultaneous threads existing in CSRSRV and plugged a fairly major memory leak.
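The difference between spawning a thread per request and reusing a bounded set of threads can be sketched in a few lines. This is only an illustration of the general idea and not the actual CSRSRV design, which the text above does not describe; the function name serve_with_pool, the worker_count parameter, and the use of a sentinel value to stop workers are all assumptions made for the example.

```python
import queue
import threading


def serve_with_pool(requests, handler, worker_count=4):
    """Service every request using a fixed number of worker threads.

    Hypothetical helper for illustration: returns the replies and the
    number of distinct threads that actually handled a request.
    """
    pending = queue.Queue()
    results = []
    threads_used = set()
    lock = threading.Lock()

    def worker():
        while True:
            request = pending.get()
            if request is None:  # sentinel: no more work for this worker
                break
            reply = handler(request)
            with lock:
                results.append(reply)
                threads_used.add(threading.get_ident())

    workers = [threading.Thread(target=worker) for _ in range(worker_count)]
    for w in workers:
        w.start()
    for request in requests:
        pending.put(request)
    for _ in workers:
        pending.put(None)  # one sentinel per worker
    for w in workers:
        w.join()
    return results, len(threads_used)
```

With this shape, a thousand requests are handled by at most worker_count threads, so the number of per-thread structures that exist at any one time stays bounded regardless of load.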
Memory Leakages and Inefficiencies
As mentioned above, Alex's SMSS and CSRSRV work significantly reduced the amount of memory being used by ReactOS, mostly by plugging leaks and making the code use resources more efficiently. Alex, however, has not been the only one trying to plug memory leaks and corruptions, and Jérôme Gardou managed to nail the memory manager bug that caused the conveniently but inaccurately named 'mshtml' issue. ReactOS currently has two memory managers, the original memory manager and the Another Rewrite of the Memory Manager Module (ARM3) that the ARM porting team started. While ARM3 has much of the functionality a replacement memory manager needs, it cannot yet outright replace the old memory manager, and the two effectively run concurrently. When the old memory manager and ARM3 differ, issues can crop up.
A very short explanation of virtual memory and paging data structures follows. Those who are already familiar with virtual memory terminology can skip to the next paragraph. Virtual memory is a way for the operating system to abstract away direct access to physical memory and present programs with a contiguous address space. The mapping between virtual addresses and physical addresses is maintained in page tables, with entries in those tables mapping one page each. Page tables are themselves organized into page directories, with each page directory entry (PDE) pointing to a page table and holding bits of associated information. These compose the primary data structures a memory manager needs to do its job. Errors in handling any of them tend to produce bugs ranging from subtle data corruption to outright crashes.
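To make the two-level lookup concrete, here is a minimal sketch that models a page directory and its page tables as nested dictionaries. It assumes the classic 32-bit x86 layout without PAE (4 KiB pages, 10 index bits per level); that layout, along with the names split_virtual_address and translate, is an assumption for illustration and is not taken from the ReactOS sources.

```python
PAGE_SHIFT = 12  # assumed 4 KiB pages: low 12 bits are the offset in the page
TABLE_BITS = 10  # assumed 1024 entries per page table and per page directory


def split_virtual_address(va):
    """Split a 32-bit virtual address into (PDE index, PTE index, offset)."""
    offset = va & ((1 << PAGE_SHIFT) - 1)
    pte_index = (va >> PAGE_SHIFT) & ((1 << TABLE_BITS) - 1)
    pde_index = (va >> (PAGE_SHIFT + TABLE_BITS)) & ((1 << TABLE_BITS) - 1)
    return pde_index, pte_index, offset


def translate(page_directory, va):
    """Walk the directory, then the table; return None if nothing is mapped.

    page_directory maps PDE index -> page table, and each page table maps
    PTE index -> physical frame number.
    """
    pde_index, pte_index, offset = split_virtual_address(va)
    page_table = page_directory.get(pde_index)
    if page_table is None:
        return None  # no page table behind this PDE
    frame = page_table.get(pte_index)
    if frame is None:
        return None  # page table exists, but this page is not mapped
    return (frame << PAGE_SHIFT) | offset
```

If a PDE is lost or overwritten, every lookup that passes through it goes wrong at once, which is why mistakes at this level are so destructive.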
Each memory manager maintained its own set of page directories and handled allocating and freeing them on its own. ARM3 kept a reference count of how many entries were in use in a page directory to help it decide when a PDE could be discarded. The old memory manager did not, and dropped PDEs whenever a process exited or the processor performed a context switch. ARM3 currently only supports allocating kernel memory, a responsibility it shares with the old memory manager. This sharing of responsibilities can, however, lead to very bad things because the records maintained by the old memory manager and ARM3 are never synchronized. What appeared to happen was that the old memory manager would lose track of pages that held page directories. Those pages might then be claimed again by either itself or ARM3 and used for other purposes, corrupting the PDEs and ultimately losing the mappings between virtual and physical addresses. Jérôme's fix involved having ARM3 assume responsibility for all pages that hold PDEs, so that ARM3's reference counting mechanism can be used to keep track of whether a page directory is in use. So far, feedback from testers indicates the 'mshtml' bug has been squashed, though other issues are now cropping up, some likely hidden by the long reign of the 'mshtml' issue.
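The reference counting idea can be sketched as a toy model: a page table is released only when the last mapping that goes through its PDE is removed. This is a simplified illustration and not ARM3's actual bookkeeping; the class PageDirectory, its methods, and the freed_tables list are hypothetical names invented for the example.

```python
class PageDirectory:
    """Toy page directory that frees a page table once it holds no entries."""

    def __init__(self):
        self.tables = {}        # PDE index -> {PTE index: frame number}
        self.ref_counts = {}    # PDE index -> number of live entries
        self.freed_tables = []  # PDE indexes whose page table was released

    def map_page(self, pde_index, pte_index, frame):
        table = self.tables.setdefault(pde_index, {})
        if pte_index not in table:
            # Only a new entry takes a reference; remapping does not.
            self.ref_counts[pde_index] = self.ref_counts.get(pde_index, 0) + 1
        table[pte_index] = frame

    def unmap_page(self, pde_index, pte_index):
        table = self.tables.get(pde_index)
        if table is None or pte_index not in table:
            raise KeyError("page is not mapped")
        del table[pte_index]
        self.ref_counts[pde_index] -= 1
        if self.ref_counts[pde_index] == 0:
            # Last entry gone: the page table can safely be released.
            del self.tables[pde_index]
            del self.ref_counts[pde_index]
            self.freed_tables.append(pde_index)
```

Because a single owner holds the count, a page backing a page table is never handed out for another purpose while mappings still depend on it, which is the property the two unsynchronized managers could not guarantee.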
The work of Jérôme and a few others in the memory manager inspired Alex to take a look in his pile of old patches and he came upon an old one sent to him for review by the ARM team. After looking it over, Alex realized this patch likely would have dealt with the 'mshtml' issue months ago and saved all of us from a great deal of grief. Like Jérôme's work, this patch dealt with the PDE and proper cleanup to ensure entries do not leak, along with some other rather nice gems. Alex updated the patch where he could to make it work with the current state of the memory manager and committed it, though some additional work by Jérôme was needed to fix releasing of PDE pages that pointed to shared memory.
In addition to the PDE handling code, the ARM team patch also dealt with discarding data that is no longer needed after program initialization. Specifically, functions in drivers can be marked with the INIT_FUNCTION macro, which tells the loader to put them in a special section of a binary image. After the loader loads the image into memory and initialization of the program or driver is complete, the memory manager can free the block of memory where the section was loaded. This obviously should only be used for functions that are needed during initialization and never afterward, but it can still result in visible savings in used memory. The final gem in the ARM patch was fairly straightforward: discarding data that was needed to boot the operating system but was useless once the kernel took over operation.
An obvious question many may ask is when ARM3 can outright supplant the old memory manager. The brief discussion above should have demonstrated the tighter bookkeeping ARM3 already does and its superiority to the old memory manager. Unfortunately, ARM3 is not yet plugged into the user mode memory components and it will likely take some time before this work is completed, so for the near term ReactOS has little choice but to live with having two memory managers.
Window Stations and Desktops
Window stations are a rather integral part of the whole security model in Windows, and their sad state in ReactOS means security is effectively nonexistent. Giannis Adamopoulos has been diligently working to rectify this situation and has faced a wide range of challenges. One of the biggest blockers to a proper implementation was actually the incorrectly designed input handling for keyboard and mouse. Rafał Harabień fixed this issue and Giannis has taken advantage of this to remove several hacks. Many of these hacks were related to the THREADINFO data structure, which as its name implies is used by win32k to describe threads. An especially egregious bug dealt with the handling of desktop objects, though this was not the fault of THREADINFO itself. A desktop object has its own memory heap associated with it, and all threads that belong to that desktop allocate their data on that heap. When the CreateDesktop function was called for an existing desktop object, a new heap was given to the object. The problem, however, was that all of the threads still had their data on the old heap and became hopelessly confused when the heap their desktop object owned suddenly changed.
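The desktop heap bug is easy to reproduce in a toy model. The sketch below contrasts a create call that always hands out a fresh heap with one that leaves an existing desktop untouched. The classes Desktop and WindowStation and both method names are hypothetical, a Python list stands in for the heap, and the guarded version is an assumed shape for the corrected behaviour rather than the actual win32k change.

```python
class Desktop:
    def __init__(self, name):
        self.name = name
        self.heap = []  # stand-in for the desktop's private heap


class WindowStation:
    def __init__(self):
        self.desktops = {}  # desktop name -> Desktop

    def create_desktop_buggy(self, name):
        """Models the bug: an existing desktop gets a brand new heap."""
        desktop = self.desktops.get(name)
        if desktop is None:
            desktop = Desktop(name)
            self.desktops[name] = desktop
        desktop.heap = []  # bug: discards the heap threads are still using
        return desktop

    def create_desktop(self, name):
        """Assumed fix: reuse the existing desktop and leave its heap alone."""
        desktop = self.desktops.get(name)
        if desktop is None:
            desktop = Desktop(name)
            self.desktops[name] = desktop
        return desktop
```

In the buggy path, anything a thread stored on the old heap is no longer reachable through the desktop object, which mirrors the confusion described above.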
Windows system developers often make use of a variety of access controls and masks to specify what kind of operations they want to carry out when they create or open objects, and the system checks these masks to make sure the caller actually has permission to carry out the desired operation. For objects managed by win32k, such as desktops and window stations, these checks were not being carried out. The cause of this is something of a vicious circle. First, ReactOS never specified the valid access masks to test against. As a result, some sanity checking code discarded the access mask handed to it because no valid masks were specified, which in theory should have caused the permission and security checks to fail. Second, ReactOS did not set proper permissions for the owner of desktop objects, so in theory no one should have been able to do anything with a desktop object. And third, in order to get ReactOS past the previous two issues, access control checks were basically turned off everywhere. Giannis has dealt with the first issue and is working on the second. Once those two are done, he can turn on access checks in the appropriate places, and hopefully security will actually mean something in ReactOS then.
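The first two problems can be illustrated with a small access mask check. The three DESKTOP_* values below match the documented Windows desktop access rights; everything else, including the check_access function and its parameters, is a simplified assumption for illustration and not how win32k or the object manager actually structures the check.

```python
# Desktop access rights as documented for Windows.
DESKTOP_READOBJECTS = 0x0001
DESKTOP_CREATEWINDOW = 0x0002
DESKTOP_SWITCHDESKTOP = 0x0100


def check_access(desired, valid_mask, granted_to_caller):
    """Return True only if every requested right is valid and granted.

    Hypothetical helper: valid_mask is the set of rights that make sense
    for the object type, granted_to_caller is what the caller holds.
    """
    if desired & ~valid_mask:
        return False  # request names a right the object type does not define
    return (desired & granted_to_caller) == desired
```

With an empty valid_mask, every non-empty request is rejected (the first issue), and with nothing granted to the owner, even well-formed requests fail (the second issue), so the only way to get anything running was to skip the check entirely.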
As mentioned in the joint statement with Haiku, the ReactOS project recently made significant progress on its USB stack. During and since the sprint to get the USB branch merged into trunk, Cameron has been working extensively with Johannes Anderwald to debug and fix issues to get USB to a usable and even useful state. So far their achievements include mounting USB drives, a long-requested feature, and much better support for USB input devices. Cameron was also able to finally drop the NT4 USB driver that ReactOS had been relying on as a hack to provide limited USB support. This driver effectively combined the entire USB stack into a single driver, including mass storage, keyboard, mouse, and even the drivers for the various controller types such as EHCI, OHCI, and UHCI. For a brief explanation of USB controllers, please see the previous newsletter. This was very much a gigantic hack, and the work begun by Michael Martin and continued by Cameron and Johannes has now made it unnecessary.
The boot issues people had with USB devices were traced back to the legacy HAL reporting invalid bus numbers to the PCI driver, which would then fail to detect anything when using those numbers to scan for devices. Since the PCI driver found nothing, it would not attempt to load the USB driver. Resets for EHCI and OHCI were also massive hacks: OHCI was busy-waiting by polling for the reset bit instead of waiting for an interrupt, while EHCI was doing things it should not have in the reset interrupt handler. EHCI was also failing to clear status change bits, which, short of a reboot, prevented new devices from being used on a port after a low-speed device had been plugged into and removed from it. Finally, the USB keyboard issue during first-stage boot ended up being due to missing data in the registry. In the past nothing else used this information, so there was no pressing need to fix it, but the device interface registration needed to get USB devices working so early did need it. Altogether, USB support has improved substantially, but the stack is still extremely fragile and incomplete. A lot more work needs to be done, and Cameron and Johannes are wrestling with a steady stream of bugs, misbehaving controllers, and ambiguous specification documentation. It will be a long road before ReactOS finally gains stable USB support, but this is a good start.