Difference between revisions of "Programming Guidelines"
(added: SHGetPathFromIDList in section "don't imply path names")
(added other SH function related to special paths in section "don't imply path names")
Revision as of 22:27, 7 March 2005
See http://www.microsoft.com/resources/practices/ for Microsoft's take on best coding practices.
Always remember these guidelines:
- Write thread safe code
- Use multithreading where possible and useful but judiciously.
- Try to open files shared (esp. FILE_SHARE_DELETE)
- Make programs support multiple instances
- Avoid data corruption on crashes.
- Write preemptible code
- Use spinlocks rarely but where needed
- raise IRQL for as short a time as possible
- think of writing a DPC
- write time and memory efficient programs
- avoid memory leaks
- put plenty of comments into your code
- think of block comments which explain a whole module and how its parts play together.
- find the right balance between abstraction and straightforward code.
- to be continued.......
- remember Unicode/non-Unicode
- don't hard code English phrases into source code
- if you must, collect them in one place (easier to localise)
- don't create a thread per connected user, use I/O Completion Ports instead
- Think of changing writing directions
- Use a GUI with a layout manager rather than pixel positioning of controls
- Don't use "goto" in C-Code (if you feel you need this in C, then you should use ASM instead)
With respect to writing time and memory efficient programs, I would put forward that one should not be concerned with execution speed while building code. Such concerns are mostly a waste of time. Build the program to be robust and easily maintainable; then, if its performance is not acceptable, profile it and improve only those sections that most need to be optimized. -RexJolliff
Programming Guidelines
Hints for writing good code. Originally collected in the context of the ReactOS project.
Write thread safe code
Especially if you write a library: your library may be used by a multithreaded (MT) application. This applies to the kernel in any case, since the kernel is a library that has to be thread safe per se.

What differentiates thread safe code from thread unsafe code is the use of global variables. Thread safe code avoids nearly all global variables, including local static variables and class variables. Decide carefully whether a variable or data structure is local, thread-global or process-global, and try to eliminate global variables. If that isn't possible and you need a thread-global variable, thread local storage (TLS) is the right thing. If you need a process-global variable, the right thing is a classic global variable. Declare it with the volatile modifier so that the compiler rereads it from memory instead of keeping a stale copy in a register. Note, however, that volatile alone does not make an access atomic: a variable that several threads modify still needs the Interlocked functions or a synchronization object, otherwise data consistency is not guaranteed.

Global data structures are a little more difficult. A thread-global structure such as a list is no problem, since only the owning thread knows about it. A process-global structure that is accessed by multiple threads needs more care: you have to synchronize the accesses with the synchronization objects the kernel provides. These guarantee your threads mutually exclusive access to shared data structures. The mutex is commonly used for this purpose. One mutex per list is usually fine. If such a structure is accessed very heavily, consider using more than one mutex. One per element is a waste of resources, so one mutex per range of elements is a good compromise. The same applies to kernel mode. It is only harder there, and one uses spinlocks to synchronize threads even across multiple processors.
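volatile int really_global_i;                /* process-global; volatile alone does not synchronize */
DWORD tlsi = TlsAlloc();                     /* one TLS slot, holding a separate value per thread */
if (tlsi != TLS_OUT_OF_INDEXES)
    TlsSetValue(tlsi, (LPVOID)(INT_PTR)95);  /* TlsSetValue stores a pointer-sized value */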
In short: Avoid global variables. Use TLS for thread-global variables. Find the right synchronization granularity for global structures. Write multiprocessor-aware code.
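The "one mutex per range" compromise can be sketched as follows. This is only an illustration under stated assumptions: it uses POSIX threads and C11 _Thread_local so that it stays portable, whereas on Win32 the same roles would be played by mutexes or critical sections and by TlsAlloc/TlsSetValue. The table layout and all names (counter_table, table_add, BUCKETS, STRIPES) are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define BUCKETS 64   /* hypothetical number of elements */
#define STRIPES 8    /* one mutex guards a range of BUCKETS / STRIPES elements */

struct counter_table {
    long            count[BUCKETS];
    pthread_mutex_t lock[STRIPES];
};

/* Thread-global value: every thread sees its own copy, so it needs no lock. */
static _Thread_local int last_bucket = -1;

void table_init(struct counter_table *t)
{
    assert(t != NULL);
    for (size_t i = 0; i < BUCKETS; i++)
        t->count[i] = 0;
    for (size_t i = 0; i < STRIPES; i++)
        pthread_mutex_init(&t->lock[i], NULL);
}

/* Increment one element while holding only the mutex for its range. */
long table_add(struct counter_table *t, unsigned key)
{
    size_t bucket = key % BUCKETS;
    pthread_mutex_t *m = &t->lock[bucket / (BUCKETS / STRIPES)];
    long value;

    pthread_mutex_lock(m);
    value = ++t->count[bucket];
    pthread_mutex_unlock(m);

    last_bucket = (int)bucket;   /* thread-local bookkeeping */
    return value;
}

/* Last element touched by the calling thread, or -1 if none yet. */
int table_last_bucket(void)
{
    return last_bucket;
}
```

Threads that touch elements in different ranges never wait for each other, while the table needs only eight mutexes instead of sixty-four.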
Use multithreading where possible and useful, but judiciously.
Multithreading is not the holy grail. However, there are many examples out in the world where MT would have been appropriate. Win32 GUI apps usually have one GUI thread and a bunch of worker threads. It is also possible to have many GUI threads, but this gets complex very fast. It is not that simple to write MT GUI apps. However, some things are just intuitively threadable, so have a try.

Server applications are another case. A server app simply has to be multithreaded, and fortunately it is no big problem to make a server use multiple threads. Using too many threads is not good either. A better strategy is to have a pool of sleeping server threads. When a request arrives, it is queued and dispatched to one of the threads. Win32 provides I/O completion ports for this purpose. With this mechanism, all the dispatch and queue work becomes a minor matter for you.

Using multiple threads which execute the same code in parallel leads us to another problem. If such a program is run on an SMP or SMT capable system, so-called cache aliasing happens. Processor caches are organized in cache lines, each of which consists of 32, 64 or more bytes. These lines are mapped circularly to memory addresses. As a result, addresses that map to the same cache line repeat every 8 MB or so.
On SMP systems this means that two threads with the same memory access pattern hinder each other. The same access pattern arises when the same code was started at nearly the same time and the memory locations are 8 MB apart. When a thread reads a byte, its processor loads the whole cache line. The same happens on the other processor, some MB away, at an address that maps to the same cache line. The next time the first processor accesses its address, the two processors have to synchronize their caches with each other, and this happens again and again. The result is that such a program runs faster on a uniprocessor system :-( It gets even worse on SMT systems, because the two virtual processors share the same cache. See also: SMT and cache aliasing.
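The circular mapping of addresses to cache lines comes down to simple arithmetic, sketched below. The geometry is a hypothetical example picked so that the repeat distance comes out at 8 MB; real caches differ in size and are usually set-associative, so a few aliasing lines can coexist before they start to evict each other. All names are illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical cache geometry, chosen only for illustration. */
struct cache_geometry {
    uint64_t line_size;  /* bytes per cache line, e.g. 64 */
    uint64_t num_sets;   /* number of line slots the addresses wrap around */
};

/* Index of the slot an address falls into. */
uint64_t cache_set_index(struct cache_geometry g, uint64_t addr)
{
    assert(g.line_size != 0 && g.num_sets != 0);
    return (addr / g.line_size) % g.num_sets;
}

/* Distance in bytes after which the slot indices repeat. */
uint64_t cache_alias_stride(struct cache_geometry g)
{
    return g.line_size * g.num_sets;
}

/* Nonzero if two addresses compete for the same slot. */
int cache_addresses_alias(struct cache_geometry g, uint64_t a, uint64_t b)
{
    return cache_set_index(g, a) == cache_set_index(g, b);
}
```

With 64-byte lines and 131072 slots, two buffers allocated exactly 8 MB apart alias on every line, while shifting one of them by a single line size moves it to a neighbouring slot.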
-Try to open files shared (esp. FILE_SHARE_DELETE)
-Make programs tolerate multiple instances
-avoid data corruption on crashes.
-write preemptible code
-Use spinlocks rarely but where needed
-raise IRQL for as short a time as possible
-think of writing a DPC
- remember Unicode/non-Unicode
- don't hardcode English phrases into source code
- if you must, collect them in one place (easier to localise)
- don't create a thread per connected user, use I/O Completion Ports instead
Having one thread serve all users is a bad idea. It forces users to wait an indefinite time rather than being served more slowly. One solution is to create one thread for every user, but this is a bad idea, too. It is OK if you serve a determinable maximum number of queries. If you cannot determine how many queries will arrive, you had better use the comfortable I/O completion ports. The problem with creating one server thread after another is this: mostly your thread does I/O operations (hard disk access). Starting too many threads does not hurt the OS, but it hurts your I/O subsystem. You get something like thrashing: the I/Os of the individual threads keep each other from finishing, the whole thing gets slower and capacity melts away. So restricting yourself to a fixed number of server threads is the best idea. One could program such behaviour by hand or use I/O completion ports.
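Programming the restricted number of server threads "by hand" could look roughly like the sketch below: a fixed set of workers sleeps on a bounded request queue. It is written with POSIX threads as an assumption, to keep it portable. On Win32 an I/O completion port supplies the queue and the wake-up logic itself (CreateIoCompletionPort, GetQueuedCompletionStatus), so none of this has to be written. The pool size, queue size and all names are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define POOL_THREADS 4    /* fixed number of server threads (assumption) */
#define QUEUE_SLOTS  16   /* pending requests the queue can hold */

typedef void (*request_fn)(void *arg);
struct request { request_fn fn; void *arg; };

struct pool {
    pthread_t       threads[POOL_THREADS];
    size_t          started;
    struct request  queue[QUEUE_SLOTS];
    size_t          head, tail, count;
    int             shutting_down;
    pthread_mutex_t lock;
    pthread_cond_t  not_empty, not_full;
};

/* Each worker sleeps until a request is queued, then serves it. */
static void *pool_worker(void *p)
{
    struct pool *pool = p;
    for (;;) {
        pthread_mutex_lock(&pool->lock);
        while (pool->count == 0 && !pool->shutting_down)
            pthread_cond_wait(&pool->not_empty, &pool->lock);
        if (pool->count == 0) {               /* shutting down, queue drained */
            pthread_mutex_unlock(&pool->lock);
            return NULL;
        }
        struct request r = pool->queue[pool->head];
        pool->head = (pool->head + 1) % QUEUE_SLOTS;
        pool->count--;
        pthread_cond_signal(&pool->not_full);
        pthread_mutex_unlock(&pool->lock);
        r.fn(r.arg);                          /* serve outside the lock */
    }
}

/* Start the workers; returns 0 on success, -1 if a thread could not be created. */
int pool_start(struct pool *pool)
{
    assert(pool != NULL);
    pool->started = pool->head = pool->tail = pool->count = 0;
    pool->shutting_down = 0;
    pthread_mutex_init(&pool->lock, NULL);
    pthread_cond_init(&pool->not_empty, NULL);
    pthread_cond_init(&pool->not_full, NULL);
    for (size_t i = 0; i < POOL_THREADS; i++) {
        if (pthread_create(&pool->threads[i], NULL, pool_worker, pool) != 0)
            return -1;
        pool->started++;
    }
    return 0;
}

/* Queue a request; blocks while the queue is full. */
void pool_submit(struct pool *pool, request_fn fn, void *arg)
{
    pthread_mutex_lock(&pool->lock);
    while (pool->count == QUEUE_SLOTS)
        pthread_cond_wait(&pool->not_full, &pool->lock);
    pool->queue[pool->tail].fn = fn;
    pool->queue[pool->tail].arg = arg;
    pool->tail = (pool->tail + 1) % QUEUE_SLOTS;
    pool->count++;
    pthread_cond_signal(&pool->not_empty);
    pthread_mutex_unlock(&pool->lock);
}

/* Let the workers drain the queue, then wait for them to exit. */
void pool_stop(struct pool *pool)
{
    pthread_mutex_lock(&pool->lock);
    pool->shutting_down = 1;
    pthread_cond_broadcast(&pool->not_empty);
    pthread_mutex_unlock(&pool->lock);
    for (size_t i = 0; i < pool->started; i++)
        pthread_join(pool->threads[i], NULL);
}

/* Example request handler: adds 1 to a mutex-protected total. */
struct tally { pthread_mutex_t lock; long total; };

void tally_bump(void *arg)
{
    struct tally *t = arg;
    pthread_mutex_lock(&t->lock);
    t->total++;
    pthread_mutex_unlock(&t->lock);
}
```

However many requests arrive, at most POOL_THREADS of them are served at once. The rest wait in the queue, which keeps the I/O subsystem from being flooded.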
-Think of changing writing directions
Left to right is not the only direction in which text is read and written around the globe. There is also right-to-left reading order, as in Hebrew, and top-to-bottom reading order, as in Japanese. However, left to right is also a legal reading order in Japanese, so you can restrict yourself to these two orders. This means that <help me>
- (?) Use a GUI with a layout manager rather than pixel positioning of controls
-write time and memory efficient programs
Memory sizes keep getting bigger and processors keep getting faster, but programs keep getting less efficient. OK, enough. This can only be general advice, because everyone has to decide on and design their own application. At least take these hints: Keep in mind that you can use memory mapped files. This gives you easier access to your file data, and you do not have to rebuild the whole data world in memory. That is something you should always avoid: some bad games occupy one gigabyte on the hard disk and, once started, the same amount of swap space. If you work object oriented, use references for object parameters. This avoids the copy constructor being called twice: once for the temporary object and once for the higher level variable. Avoid copying redundant data through an object hierarchy. That is, do not pass a set of variables down to the next deeper object and so on, but use an intelligent design with pointers.
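The advice about references is C++-specific. In C, the language used for the other examples here, the same idea is to pass a pointer instead of the structure itself. The sketch below contrasts the two; the structure and its size are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define SAMPLE_COUNT 1024

/* A deliberately large structure: 1024 doubles, typically 8 KiB. */
struct sample_block {
    double values[SAMPLE_COUNT];
};

/* By value: the whole block is copied for every call. */
double sum_by_value(struct sample_block block)
{
    double total = 0.0;
    for (size_t i = 0; i < SAMPLE_COUNT; i++)
        total += block.values[i];
    return total;
}

/* By pointer: only an address is passed; const documents read-only use. */
double sum_by_pointer(const struct sample_block *block)
{
    double total = 0.0;
    assert(block != NULL);
    for (size_t i = 0; i < SAMPLE_COUNT; i++)
        total += block->values[i];
    return total;
}
```

Both functions return the same result; only the second one avoids copying the block on each call.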
-avoid memory leaks
Easier said than done. Memory leaks are always an offence, and there is nothing one can do that guarantees never having one. However, take these hints. You can use a language which has a garbage collector, like Eiffel. In C, your only option is to be more careful and always write malloc and free in pairs. If you use C++, there are the techniques of auto pointers and smart pointers. For how to use them, see the corresponding literature.
-- There are garbage collectors for C and C++ - Jakov
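One way to keep malloc and free in pairs in C is to wrap each allocation in a create function and give it a matching destroy function, so callers pair two calls instead of tracking every buffer. This is a minimal sketch; the message type and both function names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct message {
    char  *text;
    size_t length;
};

/* Allocates a message. Every successful call must be paired with
   exactly one message_destroy(). Returns NULL if memory runs out. */
struct message *message_create(const char *text)
{
    struct message *m;

    assert(text != NULL);
    m = malloc(sizeof *m);
    if (m == NULL)
        return NULL;

    m->length = strlen(text);
    m->text = malloc(m->length + 1);
    if (m->text == NULL) {
        free(m);              /* undo the first allocation before failing */
        return NULL;
    }
    memcpy(m->text, text, m->length + 1);
    return m;
}

/* Releases everything message_create() allocated; accepts NULL like free(). */
void message_destroy(struct message *m)
{
    if (m == NULL)
        return;
    free(m->text);
    free(m);
}
```

The failure path inside message_create releases what was already allocated, so a half-built object never escapes to the caller.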
-put plenty of comments into your code
-think of block comments which explain a whole module and how its parts play together.
-find the right balance between abstraction and straightforward code.
-Don't make redundant copies of program data.
-to be continued.......
-If using assembler, always implement an alternative in C (or whatever language you use)
ReactOS is or will be a multi-platform OS. Having nice, fast assembler pieces for time critical operations is good. However, there always has to be an alternative written in C. Only this guarantee enables ROS to compile on each of its target platforms. One last hint: The first goal is always to get a piece of code running. If we find this piece to be a bottleneck, we'll optimize it and possibly write it in assembler.
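A minimal sketch of this rule, assuming a GCC-compatible compiler for the assembler path: the C version is always compiled, and the inline assembler variant is selected only on x86. The byte swap operation and the function names are just an example and are not taken from the ReactOS sources.

```c
#include <assert.h>
#include <stdint.h>

/* Portable C version: always present, on every target platform. */
uint32_t swap32_c(uint32_t x)
{
    return (x >> 24) | ((x >> 8) & 0x0000FF00u) |
           ((x << 8) & 0x00FF0000u) | (x << 24);
}

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
/* Optional fast path: GCC-style inline assembler, x86 only. */
uint32_t swap32(uint32_t x)
{
    __asm__("bswap %0" : "+r"(x));
    return x;
}
#else
/* Every other platform falls back to the C version. */
uint32_t swap32(uint32_t x)
{
    return swap32_c(x);
}
#endif

/* The C version doubles as a reference for checking the assembler path. */
void swap32_selftest(void)
{
    assert(swap32(0x11223344u) == swap32_c(0x11223344u));
    assert(swap32(0x000000FFu) == swap32_c(0x000000FFu));
}
```

Callers only ever use swap32, so moving to a platform without an assembler variant requires no source changes.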
-Preserve the meta information of files you use.
On ReactOS filesystems, files may carry a lot of additional information. This includes timestamps, attributes, access control lists, multiple file streams and extended attributes. Not every filesystem provides every feature. But where it does, it is annoying for a user who works with this information if a program destroys it. Mostly this happens when copying or packing files, or when saving a file. Such programs just create a new file and store the content of the old one in the new one. Afterwards they delete the old file and rename the new one. Generally the optimal way would be to memory map the original and use its mapped data directly for read-only access. If changes are required because the user works with the program, a self-made copy-on-write mechanism kicks in and makes a working copy. If the user wants to save the changes, the changed structures are written back through the memory mapping. If this is not an alternative for you, remember to use the hTemplateFile parameter of the CreateFile call.
-Do not use absolute paths; don't imply path names.
Obvious, isn't it? As you know, drive letters (what a pain) vary from system to system, so including drive letters in paths is bad. Use relative paths or local paths instead. Also, paths are not axioms on Windows systems. The "Program Files" directory, for example, changes its spelling from localization to localization. On Win32 there are the functions GetWindowsDirectory, GetSystemDirectory, SHGetPathFromIDList, SHGetSpecialFolderPath and SHGetFolderPath. With these functions you can resolve the differences between individual installations and localizations.
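As a sketch of asking the system instead of assuming a path, the code below builds the full name of a file in the Windows directory from whatever GetWindowsDirectory reports. The helper names join_path and windows_file_path are hypothetical. The non-Win32 branch exists only so that the sketch compiles on other platforms, where it simply reports failure.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#ifdef _WIN32
#include <windows.h>
#endif

/* Joins a base directory and a relative name with a backslash.
   Returns 0 on success, -1 if the result does not fit into out. */
int join_path(char *out, size_t out_size, const char *base, const char *relative)
{
    size_t n;
    const char *sep;
    int written;

    assert(out != NULL && base != NULL && relative != NULL);
    n = strlen(base);
    /* Do not double the separator if the base already ends in one. */
    sep = (n > 0 && (base[n - 1] == '\\' || base[n - 1] == '/')) ? "" : "\\";
    written = snprintf(out, out_size, "%s%s%s", base, sep, relative);
    return (written >= 0 && (size_t)written < out_size) ? 0 : -1;
}

/* Builds the path of a file inside the Windows directory without
   assuming a drive letter or a directory name. */
int windows_file_path(char *out, size_t out_size, const char *relative)
{
#ifdef _WIN32
    char base[MAX_PATH];
    UINT len = GetWindowsDirectoryA(base, MAX_PATH);
    if (len == 0 || len >= MAX_PATH)
        return -1;                 /* call failed or buffer too small */
    return join_path(out, out_size, base, relative);
#else
    (void)out; (void)out_size; (void)relative;
    return -1;                     /* no Windows directory on this platform */
#endif
}
```

The same pattern applies to the other functions named above: obtain the base directory at run time, then append only the relative part that the program itself controls.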