Help diagnosing a problem with cygwin/samba on ROS

All development related issues welcome

Moderator: Moderator Team

bugdude
Posts: 28
Joined: Wed Nov 27, 2019 11:41 pm

Re: Help diagnosing a problem with cygwin/samba on ROS

Post by bugdude »

Didn't mean to annoy you. I am working on building the test you sent to see if I can reproduce the failure with it. I did extract some code out of Cygwin that shows the pattern used in the code, but I'll hold onto that until I can reproduce the failure with the test.

bigdude
bugdude
Posts: 28
Joined: Wed Nov 27, 2019 11:41 pm

Re: Help diagnosing a problem with cygwin/samba on ROS

Post by bugdude »

learn_more,

The reason your test code does not fail is that you re-used the same structures for both calls to NtOpenFile, while in my Delphi test app I declared two independent structures for ObjectAttributes, iosb, and FileStandardInformation. It seems that if a string is already present in the structure and you try to replace it with an empty constant InitializeObjectAttributes leaves the original string in place. I could be wrong, but it seems you may have hit a completely separate issue with the macro for InitializeObjectAttributes. Since I used two separate structures, I get different results which includes triggering the failure condition I am seeing in Cygwin's NtOpenFile usage. In Cygwin the calls come from separate functions with local structures in different scopes so the structures cannot be the same ones. If you alter the test code to use two independent structures for the calls to NtOpenFile you will recreate the failure condition with your test code. I did that, and added some instrumentation to print the filesizes and with those changes your test code got the exact same failed results I see in Delphi/Cygwin including the second handle being the volume (distinguishable by its size).

I continued to work on building a patch for the issue and I came up with one that seems to work. I am testing it more carefully now, but it comes down to inserting an 'if' to detect the parameter condition and execute a separate code branch that uses ObReferenceObjectByHandle and ObDuplicateObject to duplicate the handle received in ObjectAttributes->RootDirectory. The code below would be inserted just after the parameter validations in the IopCreateFile function in iomgr\file.c. As I mentioned before I am not a C programmer so I'm sure it requires cleanup and validation to really be usable once the DPRINT1 instrumentation is converted to match the style used in the rest of file.c but the code builds and works for the tests I've run using your test code and my and Delphi test. I will test it with my preliminary Samba build shortly. There is probably a cleaner way to detect the empty unicode_string than the one I used, but those are not friendly to work with..

Code: Select all

    /* Handle Cygwin edge case of re-opening FileObject by calling NtOpenFile passing only existing Handle. Deflect the call to 
       ObDuplicateObject because we don't need OpenPacket or other calls in this case, we have all we need in hand already. */

    if ((ObjectAttributes->ObjectName->Length <= 1 || (!ObjectAttributes->ObjectName->Length) || (!ObjectAttributes->ObjectName) || 		 
        (!ObjectAttributes->ObjectName->Buffer)) && (ObjectAttributes->RootDirectory != 0) && (ObjectAttributes->RootDirectory != NULL)) 
    {
    	Status = ObReferenceObjectByHandle(NtCurrentProcess(), 
                                       	   PROCESS_DUP_HANDLE,
                                       	   PsProcessType,
                                       	   PreviousMode,
                                       	   (PVOID*)&SourceProcess,
                                       	   NULL);
    	if (!NT_SUCCESS(Status)) {
        	DPRINT1("(PID %lx) IopCreateFile call to ObReferenceObjectByHandle for Source Process failed - status: %lx\n", 
        	                                         PsGetCurrentProcessId(), Status); 
		return Status;
    	}
    	Status = ObDuplicateObject(SourceProcess, 
                               	   ObjectAttributes->RootDirectory,
                               	   SourceProcess, 
                               	   &LocalHandle,
                               	   DesiredAccess,
                               	   ObjectAttributes->Attributes,
                               	   CreateOptions,
                               	   PreviousMode);
    	if (!NT_SUCCESS(Status)) {
        	DPRINT1("(PID %lx) IopCreateFile call to ObDuplicateObject failed - status: %lx\n", PsGetCurrentProcessId(), Status); 
        	ObDereferenceObject(SourceProcess);
		return Status;
    	}
    	ObDereferenceObject(SourceProcess);

    	if (AccessMode != KernelMode) 
    	{ /* probe for write access for non-kernel mode */
	       	_SEH2_TRY
        	{
            		/* Probe the output parameters */
            		ProbeForWriteHandle(FileHandle);
            		ProbeForWriteIoStatusBlock(IoStatusBlock);
        	}
        	_SEH2_EXCEPT(EXCEPTION_EXECUTE_HANDLER)
        	{
            		/* Get the exception status */
            		Status = _SEH2_GetExceptionCode();
        	}
        	_SEH2_END;
    	}   

	/* perform write back of handle and status information */
        _SEH2_TRY
        {
            	*FileHandle = LocalHandle;
            	/* IoStatusBlock->Information = OpenPacket->Information; */
            	/* since we do not use Io for this calling mode set up 'fake' Io status and final status if we didn't crash before this point we succeeded */
            	IoStatusBlock->Status = STATUS_SUCCESS;
            	Status = STATUS_SUCCESS;
        }
        _SEH2_EXCEPT(EXCEPTION_EXECUTE_HANDLER)
        {
            	/* Get the exception status */
            	Status = _SEH2_GetExceptionCode();
        }
        _SEH2_END;

    	if (!NT_SUCCESS(Status)) {
       		if (LocalHandle != 0) {
          		DPRINT1("(PID %lx) IopCreateFile Completed status != NT_SUCCESS - returning status: 0x%lx handle: %p FileName: %wZ \n", 
                                                                          PsGetCurrentProcessId(), Status, LocalHandle, ObjectAttributes->ObjectName );
       		} else {
          		DPRINT1("(PID %lx) IopCreateFile Completed status != NT_SUCCESS - returning status: 0x%lx handle: 0x0 FileName: %wZ \n", 
                                                                          PsGetCurrentProcessId(), Status, ObjectAttributes->ObjectName );
       		}
    	}
    	return Status;
    } /* cygwin edge case code branch ends here */
Bugdude
hbelusca
Developer
Posts: 1204
Joined: Sat Dec 26, 2009 10:36 pm
Location: Zagreb, Croatia

Re: Help diagnosing a problem with cygwin/samba on ROS

Post by hbelusca »

Bear in mind that Windows doesn't have a "Handle Cygwin bugs" case in its kernel, so your "patch" is most probably wrong.
bugdude
Posts: 28
Joined: Wed Nov 27, 2019 11:41 pm

Re: Help diagnosing a problem with cygwin/samba on ROS

Post by bugdude »

That is true, but neither Cygwin or the test applications fail under windows when they make the same calls, so internally Windows handles this case very differently than ReactOS currently does. Thank you for your help, the test that you sent makes it possible to recreate the error using ROSBE based code so it should make the case much clearer.

I reported this on JIRA a moment ago because it is what most would define as bug since ROS does not match the behavior of W2K3 or XP. It seems to be a rarely occurring issue, but it could still be impacting other applications preventing them from working on ROS as well. I attached a modified version of the test you sent me and the preliminary patch to the JIRA report as well. For my purposes I now have the patched build that I can use to continue to test the Samba build I have been working on.

I have had to create small patches for Samba, Cygwin, waf, and Python as well as make lots of changes to the Samba build system to get those to even build. It will sound strange to say it, but I am getting used to finding issues everywhere in the stack on this pet project. In fact hunting this quirk down in ROS has been like the rest of the project, it has been an educational brain twister.

Bugdude
bugdude
Posts: 28
Joined: Wed Nov 27, 2019 11:41 pm

Re: Help diagnosing a problem with cygwin/samba on ROS

Post by bugdude »

Hello again,

I think I have traced one part of the problem to a problem with the Winsock implementation in ReactOS. It is another slightly unusual problem, so I wanted to ask about it before posting it on Jira. The issue appears to be in the implementation of the connect() function. I will include complete sample code below, but the problem is simpler to explain in terms of a sequence of function calls.

My questions are simply:
Has anyone had UDP applications that failed to work on ReactOS that used this pattern ?
Does anyone have a ReactOS patch for this ?

I would patch it in samba or cygwin, but the calls come from different contexts and the calling context for sendto() does not have the target information readily available (different thread). In any event, the testing process I used confirmed the issue is really in winsock and not the callers.

This key steps of the failing sequence C code are:

Code: Select all

WSAStartup();
socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);
    memset((char *) &si_other, 0, sizeof(si_other));
    si_other.sin_family = AF_INET;
    si_other.sin_port = htons(PORT);
    si_other.sin_addr.S_un.S_addr = inet_addr(SERVER);
connect(s, (struct sockaddr *) &si_other, slen);  // <=== NOTE: that sockaddr si_other is passed to connect()
sendto(s, message, strlen(message) , 0 , NULL, 0);  // <=== NOTE: sockaddr IS NOT passed to sendto()
closesocket(s);
WSACleanup();
I coded a test program in C but I didn't have access to a C compiler so I actually tested using Delphi. Under Windows 10 this sequence works just fine because the call to connect alters the target information stored in the socket itself. Apparently under ROS the winsock implementation does not do that. I confirmed this is what happens on Windows 10 by adding a call to getpeername() and displaying the port value before and after the calls to connect() and sendto().

Below are a client and server test application in C (borrowed from a website and altered to follow the above sequence). It's a simple UDP sender and receiver that will work completely on Win10 but fails on ReactOS in the same way my Samba port is currently failing..

The code fails at the sendto() rather than the connect(), but the samba messaging API does the same thing on ReactOS, both fail with sendto returning WSAEINVAL (invalid parameter).

Code: Select all

/*
	Simple udp client  https://www.winsocketdotnetworkprogramming.com/winsock2programming/winsock2advancedcode1e.html
*/
#include<stdio.h>
#include<winsock2.h>

#pragma comment(lib,"ws2_32.lib") //Winsock Library

#define SERVER "127.0.0.1"	//ip address of udp server
#define BUFLEN 512	//Max length of buffer
#define PORT 8888	//The port on which to listen for incoming data

int main(void)
{
	struct sockaddr_in si_other;
	int s, slen=sizeof(si_other);
	char buf[BUFLEN];
	char message[BUFLEN];
	WSADATA wsa;

	//Initialise winsock
	printf("\nInitialising Winsock...");
	if (WSAStartup(MAKEWORD(2,2),&wsa) != 0)
	{
		printf("Failed. Error Code : %d",WSAGetLastError());
		exit(EXIT_FAILURE);
	}
	printf("Initialised.\n");
	
	//create socket
	if ( (s=socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP)) == SOCKET_ERROR)
	{
		printf("socket() failed with error code : %d" , WSAGetLastError());
		exit(EXIT_FAILURE);
	}
	
	//setup address structure
	memset((char *) &si_other, 0, sizeof(si_other));
	si_other.sin_family = AF_INET;
	si_other.sin_port = htons(PORT);
	si_other.sin_addr.S_un.S_addr = inet_addr(SERVER);
	
	//start communication
	while(1)
	{
		printf("Enter message : ");
		gets(message);
		
		if ( connect(s, (struct sockaddr *) &si_other, slen) == SOCKET_ERROR)
		{
			printf("connect() failed with error code : %d" , WSAGetLastError());
			exit(EXIT_FAILURE);
		}

		if (sendto(s, message, strlen(message) , 0 , NULL, 0) == SOCKET_ERROR)
		{
			printf("sendto() failed with error code : %d" , WSAGetLastError());
			exit(EXIT_FAILURE);
		}
		
		//receive a reply and print it
		//clear the buffer by filling null, it might have previously received data
		memset(buf,'\0', BUFLEN);
		//try to receive some data, this is a blocking call
		if (recvfrom(s, buf, BUFLEN, 0, (struct sockaddr *) &si_other, &slen) == SOCKET_ERROR)
		{
			printf("recvfrom() failed with error code : %d" , WSAGetLastError());
			exit(EXIT_FAILURE);
		}
		
		puts(buf);
	}

	closesocket(s);
	WSACleanup();

	return 0;
}

Code: Select all

/*
	Simple UDP Server  https://www.winsocketdotnetworkprogramming.com/winsock2programming/winsock2advancedcode1e.html
*/

#include<stdio.h>
#include<winsock2.h>

#pragma comment(lib,"ws2_32.lib") //Winsock Library

#define BUFLEN 512	//Max length of buffer
#define PORT 8888	//The port on which to listen for incoming data

int main()
{
	SOCKET s;
	struct sockaddr_in server, si_other;
	int slen , recv_len;
	char buf[BUFLEN];
	WSADATA wsa;

	slen = sizeof(si_other) ;
	
	//Initialise winsock
	printf("\nInitialising Winsock...");
	if (WSAStartup(MAKEWORD(2,2),&wsa) != 0)
	{
		printf("Failed. Error Code : %d",WSAGetLastError());
		exit(EXIT_FAILURE);
	}
	printf("Initialised.\n");
	
	//Create a socket
	if((s = socket(AF_INET , SOCK_DGRAM , 0 )) == INVALID_SOCKET)
	{
		printf("Could not create socket : %d" , WSAGetLastError());
	}
	printf("Socket created.\n");
	
	//Prepare the sockaddr_in structure
	server.sin_family = AF_INET;
	server.sin_addr.s_addr = INADDR_ANY;
	server.sin_port = htons( PORT );
	
	//Bind
	if( bind(s ,(struct sockaddr *)&server , sizeof(server)) == SOCKET_ERROR)
	{
		printf("Bind failed with error code : %d" , WSAGetLastError());
		exit(EXIT_FAILURE);
	}
	puts("Bind done");

	//keep listening for data
	while(1)
	{
		printf("Waiting for data...");
		fflush(stdout);
		
		//clear the buffer by filling null, it might have previously received data
		memset(buf,'\0', BUFLEN);
		
		//try to receive some data, this is a blocking call
		if ((recv_len = recvfrom(s, buf, BUFLEN, 0, (struct sockaddr *) &si_other, &slen)) == SOCKET_ERROR)
		{
			printf("recvfrom() failed with error code : %d" , WSAGetLastError());
			exit(EXIT_FAILURE);
		}
		
		//print details of the client/peer and the data received
		printf("Received packet from %s:%d\n", inet_ntoa(si_other.sin_addr), ntohs(si_other.sin_port));
		printf("Data: %s\n" , buf);
		
		//now reply the client with the same data
		if (sendto(s, buf, recv_len, 0, (struct sockaddr*) &si_other, slen) == SOCKET_ERROR)
		{
			printf("sendto() failed with error code : %d" , WSAGetLastError());
			exit(EXIT_FAILURE);
		}
	}

	closesocket(s);
	WSACleanup();
	
	return 0;
}
bugdude
bugdude
Posts: 28
Joined: Wed Nov 27, 2019 11:41 pm

Re: Help diagnosing a problem with cygwin/samba on ROS

Post by bugdude »

Hello everyone,

I haven't checked for any impacted software, but I suspect this issue would stop other network connected server software from operating. It is not an uncommon method to use in Linux based multi threaded server software.

After tracing the failure carefully I developed some test code that resolves this problem. I do not know if the method I chose is the way windows does this, or the simplest way to do this, but what I did is the following..

In the AfdStreamSocketConnect() function I copied some of the code used for SOCK_STREAM handling (towards the end of the function) into the section of code for handling connectionless mode (UDP). The copied code stores the RemoteAddress information received by the connect() function into FCB->ConnectCallInfo as well as FCB->ConnectReturnInfo(). In AfdPacketSocketWriteData() I added code that checks the result of the existing call to TdiBuildConnectionInfo which populates TargetAddress, and if that call failed calls TdiBuildconnectionInfo again using FCB->ConnectCallInfo->RemoteAddress followed by FCB->ConnectReturnInfo->RemoteAddress until one succeeds or all fail. Since the address info was previously stored there, SendTo call can can send succesfully even though the SendReq->TdiConnection.RemoteAddress was not populated by the parameters it received.

The code works, but I don't understand the 'plumbing' well enough to know if this is a viable way to do this, or to know if I should also backfill SendReq->TdiConnection.RemoteAddress with this same information to optimize subsequent calls. I am just starting to test the code now, but since it verifies this condition exists I will now post this on JIRA.

Any advice on how to improve this WIP code would be greatly appreciated.

bugdude
Post Reply

Who is online

Users browsing this forum: Bing [Bot] and 11 guests