WriteAsync

Dec 2, 2014 at 10:48 PM
Hi,

As I said in a previous thread, I am really impressed by this library, but I have been having a weird problem.

I have been using the following method to write to serialportstream:
public async Task WriteAsync(Stream stream, bool flush = true, CancellationToken token = default(CancellationToken))
        {
            if (!stream.CanWrite)
                throw new ProtocolViolationException("Unable to write to stream");

            var buffer = new List<byte>();

            buffer.Add(HeaderByte);

            buffer.AddRange(Payload);

            buffer.Add(GetXoredByte());

            await stream.WriteAsync(buffer.ToArray(), 0, buffer.Count);

            if (flush)
            {
                await stream.FlushAsync();
                Console.Out.WriteLine(string.Format("Wrote the packet: {0}", this.ToString()));
            }
        }
However, every now and again I have been experiencing lockups when a packet is written. when this happens If I instigate another packet write (from the UI, the packets start being sent and received as normal again...

Because of this I tried using it synchronously like so:
 public void Write(Stream stream, bool flush = true, CancellationToken token = default(CancellationToken))
        {
            if (!stream.CanWrite)
                throw new ProtocolViolationException("Unable to write to stream");

            var buffer = new List<byte>();

            buffer.Add(HeaderByte);

            buffer.AddRange(Payload);

            buffer.Add(GetXoredByte());

            stream.Write(buffer.ToArray(), 0, buffer.Count);

            if (flush)
            {
                stream.Flush();
                Console.Out.WriteLine(string.Format("Wrote the packet: {0}", this.ToString()));
            }
        }
and I haven't seen the problem since.

I just thought I would report it as a potential bug, and if you need more info / stacktrace etc. just let me know..
Coordinator
Dec 3, 2014 at 5:14 PM
Can you capture a stacktrace/core dump at the time it stops? If you build the code yourself, you should send me also the revision of SVN your using and the PDBs you built so i can read the core. I've not used the async methods, so I'm not entirely sure how they map to the .NET 2.0 asynchronous model.

There's also a trace scope you can enable "IO.Ports.SerialPortStream". e.g. you can put this in your "app.config"

<system.diagnostics>
<sharedListeners>
  <add name="DiskLog" type="System.Diagnostics.TextWriterTraceListener" traceOutputOptions="DateTime" initializeData="C:\serialportstream.log" />
</sharedListeners>

<sources>
  <source name="IO.Ports.SerialPortStream" switchValue="Information">
    <listeners>
      <add name="DiskLog"/>
    </listeners>
  </source>
</sources>

<trace autoflush="true" useGlobalLock="true"/>
</system.diagnostics>
Dec 10, 2014 at 11:19 PM
Sorry for the delayed reply, life is getting in the way...

I am in the process of uploading the dump file and will PM you with a link in the next day or two (it is around 100MB). I was unable to get the trace listener to work, here is my app config if you can see the issue:
<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <configSections>
    <section name="exceptionless" type="Exceptionless.Configuration.ExceptionlessSection, Exceptionless" />
  </configSections>
  <startup>
    <supportedRuntime version="v4.0" sku=".NETFramework,Version=v4.5.1" />
  </startup>
  <exceptionless apiKey="f3c2e174f87c441e93e83662ac99a5fa" enableLogging="false" logPath=".\Trace.log" />

<system.diagnostics>
    <sharedListeners>
        <add name="DiskLog" type="System.Diagnostics.TextWriterTraceListener" traceOutputOptions="DateTime" initializeData=".\serialportstream.log" />
    </sharedListeners>
    <sources>
        <source name="IO.Ports.SerialPortStream" switchValue="Information">
            <listeners>
                <add name="DiskLog"/>
            </listeners>
        </source>
    </sources>
<trace autoflush="true" useGlobalLock="true"/>
</system.diagnostics>
</configuration>
BTW, I think it has something to do with waiting to return using the await keyword in "await serialport.WriteAsync(.." perhaps it is not escaping from a loop and/or a timeout is not occurring...
Dec 12, 2014 at 1:52 PM
PMed you.
Coordinator
Dec 13, 2014 at 9:52 PM
Edited Dec 13, 2014 at 9:54 PM
As I'm using Win7 x64 for debugging with WinDBG, and you have the problem with Win8.1 (6.3.9600), I need a couple of files from your .NET folder to properly debug.

From the folder: C:\Windows\Microsoft.NET\Framework64\v4.0.30319, grab the files
  • clr.dll
  • mscordacwks.dll
  • mscorwks.dll
  • SOS.dll
That's because when I debug, I get the error:
> !clrstack
The version of SOS does not match the version of CLR you are debugging.  Please
load the matching version of SOS for the version of CLR you are debugging.
CLR Version: 4.0.30319.34209
SOS Version: 4.0.30319.18444
Hopefully with these binaries from your machine I can see exactly where it's failing. that's because the backtrace with the symbol file isn't what I'd expect (the path to the sources and the lines don't match):
> 0:016> !clrstack
The version of SOS does not match the version of CLR you are debugging.  Please
load the matching version of SOS for the version of CLR you are debugging.
CLR Version: 4.0.30319.34209
SOS Version: 4.0.30319.18444
OS Thread Id: 0x1574 (16)
        Child SP               IP Call Site
0000000f5127e838 00007ff93d0d1c2a [HelperMethodFrame_1OBJ: 0000000f5127e838] System.Threading.WaitHandle.WaitOneNative(System.Runtime.InteropServices.SafeHandle, UInt32, Boolean, Boolean)
0000000f5127e960 00007ff9288b3264 System.Threading.WaitHandle.InternalWaitOne(System.Runtime.InteropServices.SafeHandle, Int64, Boolean, Boolean)
0000000f5127e9a0 00007ff8ced86cd0 RJCP.IO.Ports.SerialPortStream+NativeSerialPort+CommOverlappedIo.WaitForWriteEmptyEvent(Int32) [c:\Users\Robin\Source\Repos\RailMasterSharp\src\SerialPortStream\NativeSerialPort_CommOverlappedIo.cs @ 969]
0000000f5127e9f0 00007ff8ced86b5d RJCP.IO.Ports.SerialPortStream.Flush() [c:\Users\Robin\Source\Repos\RailMasterSharp\src\SerialPortStream\SerialPortStream.cs @ 1527]
Line NativeSerialPort_CommOverlappedIo.cs @ 969 is not WaitForWriteEmptyEvent, but is "m_WriteClearDoneEvent.WaitOne(-1);" in DiscardOutBuffer().

Line SerialPortStream.cs @ 1527 is not Flush(), but "if (!ar.IsCompleted) ar.AsyncWaitHandle.WaitOne(-1);" in EndWrite(), so the lines don't match up. Anyway, based on the callstack, it's clear what functions are being called, but there may be other subtle changes.

I'll keep looking and post what I find.
Dec 13, 2014 at 11:41 PM
Thanks for the update.

I have PMed you the link to the files requested minus mscorwks.dll as I was unable to locate it!

Anything else you need, just let me know!
Dec 13, 2014 at 11:51 PM
Oh, and I have now got tracing working by changing the output to "All", so I will post that when I re-run the test but due to the nature it doesn't usually freeze on the same interval.

Also This is a quick run-through of what was happening when sending to SerialPortStream.WriteAsync froze and I paused and created a minidump (I think it also happened near the start of the execution but I kicked it by manually sending a packet that wasn't in its sequence).

Basically the program is writing a sequential packet to a device, the device then sends back the result and the program sends the next packet in the sequence. the range is 1 through 256. Each request response takes about 4.1 seconds.
Coordinator
Dec 14, 2014 at 11:47 AM
Analysing the core dump with the correct SOS/MSCORDACWKS libraries, It looks like the low level USB serial driver didn't send the EV_TXEMPTY event back to my driver to indicate that the hardware buffers were emptied. Internally, the write buffers are empty and the main loop is just waiting for a new event. And as the flush is given with -1 (no timeout), it will sit there stuck forever.

So once I've got full traces (that is not enabled with Information, but with Verbose), I can see what the driver replied with. Then I can see if there's a race condition or not in the SerialPortStream library.
Dec 14, 2014 at 10:46 PM
OK,

I have reimported the SerialPortStream library and it the diff shows it has modified the spacing around many of the brackets, this is probably the reason why the line numbers were different. I will attach the new symbols files with the dump.

Just for FYI, the serial device I am using is the default mdmcqp.inf driver that references itself as a CDC RS-232 Emulation Demo driver by Microchip Technology inc. Version 5.1.2600.7.

I will also note that when Verbose logging is turned on I don't get very far before the lockup seems to occur. In fact it might even be the first write event! I am not sure if this is due to the performance of my dev system?!

I will PM you when the files have finished uploading.

Thanks.
Coordinator
Dec 16, 2014 at 6:06 PM
Hi, it's been a couple of days now. Do you have any new logs?
Dec 16, 2014 at 9:49 PM
Hi, Sorry for the delay, I have run into a couple of problems since overwriting the serialportstream project in my solution.

I have started getting read problems with incomplete packets being received all of a sudden so I thought it would be better to try and diagnose what is going on there (it could well be my interface) before I troubled you...

The copy I uploaded was targeted at a lower framework for my app so I have changed it back to 4.5.1 in my solution, however I am still seeing the problem... Hopefully tonight I will be able to revert the solution back to see if the problem disappears and then work forward again.

I will get back to you as soon as I can with the next debug.

Thanks for your help.
Dec 17, 2014 at 2:33 PM
Progress, although a couple of questions:

When I turn on tracing, the UI responsiveness is non existent. it gets a little better when I set the WriteTimeout to 5000 (any lower and writes don't seem to occur) so first question is am I right in saying that this option is set in milliseconds (it would be helpful to state the unit within the write timeout functions XML, same with the read timeout)?

Second question is, is there any particular reason for the UI to lockup whilst the library is writing trace events?

Obviously this would have an effect on my ability to provide you with a decent log and minidump if timeouts are occurring due to the trace logging...



PS, do you have any knowledge of what type of handshaking would be required for a USB CDC virtual serial port? I am guessing (and now explicitly set) it is Handshake.None...
Dec 17, 2014 at 3:10 PM
Update:

Questions 1 and 2 fixed by adding <remove name="Default"/> to the listeners. It seems performance is an issue when sending too many things to the console!?

I am just zipping the logs and minidump and will upload asap!
Coordinator
Dec 20, 2014 at 11:17 AM
do you have any knowledge of what type of handshaking would be required for a USB CDC virtual serial port? I am guessing (and now explicitly set) it is Handshake.None...
You'll need to check the documentation for your hardware (not just the USB serial chip). If the hardware itself uses 4-wire (Rx, Tx, CTS, RTS), then you should use hardware handshake. Otherwise, if the remote is a microprocessor and it supports XON/XOFF (the Ctrl-Q and Ctrl-S signals) to handle flow control, but that is rare today and not intended for binary data transfer (you'll need to implement some kind of encapsulation, e.g. how PPP does it).
Questions 1 and 2 fixed by adding <remove name="Default"/> to the listeners. It seems performance is an issue when sending too many things to the console!?
Good to see that you resolved the performance. You're absolutely right about logging to the console. Verbose logs a lot of data and will freeze the console. I've had this with other traces also.

I haven't looked at the logs yet. Will do something about it today. Can't promise anything, my time will be limited in the next 1-2 weeks.
Coordinator
Dec 20, 2014 at 1:53 PM
Edited Dec 20, 2014 at 1:53 PM
So I've analysed the latest logs. Unfortunately, the dump files are only valid with a current PDB that you compiled with. Everytime you recompile, a new PDB needs to be provided for debugging with WinDBG. So I'll just assume the results of my last analysis.

Looking at the write logs, the driver definitely did not return with an EV_TXEMPTY event. This indicates a failure in the USB driver.
IO.Ports.SerialPortStream Verbose: 0 : COM6: SerialThread: DoWriteEvent: WriteFile(1808, 1011127436651, 4, ...) == False
--> #2 Writes 4 bytes
IO.Ports.SerialPortStream Verbose: 0 : COM6: SerialThread: ProcessWaitCommEvent: EV_TXEMPTY
IO.Ports.SerialPortStream Verbose: 0 : COM6: CommEvent: EV_TXEMPTY
--> Got a TXEMPTY event. Probably for the previous write #1. This is buffered.
IO.Ports.SerialPortStream Verbose: 0 : COM6: SerialThread: ProcessWriteEvent: 4 bytes
--> #2 Write of 4 bytes finished.
IO.Ports.SerialPortStream Verbose: 0 : COM6: SerialThread: ProcessWriteEvent: TX-BUFFER empty
--> Previous TXEMPTY is now used and assumed to be valid. May or may not be, I wouldn't expect this to be given in this state, indicating the driver doesn't clear this flag on the next write.
IO.Ports.SerialPortStream Verbose: 0 : COM6: SerialThread: DoWriteEvent: WriteFile(1808, 1011127436655, 3, ...) == False
--> #3 Writes 3 bytes.
IO.Ports.SerialPortStream Verbose: 0 : COM6: SerialThread: ProcessWriteEvent: 3 bytes
--> #3 Write of 3 bytes finished.
The serial driver doesn't indicate that the buffer is empty at all. This indicates an error in the USB driver. It explains why the flush hangs forever.

In the above logs, I also see that the TXEMPTY event comes sometimes after a write but before the write is finished which we know cannot be empty - so we remember it for when the write is finished and assume then it's empty. As all operations are serial, technically the race condition for the write and then the TXEMPTY is in the USB driver itself (and shouldn't occur). Worst case analysis is, we think it's flushed when it's not really. But this is not related to your problem, that the TXEMPTY doesn't occur at all (as instead too often). There's no solution for this, I'd rather think it was flushed, than not at all precisely to avoid hang's in the Flush() as you're observing.

It works correctly when you send a next sequence of bytes, because somehow the "USB" driver is reset by having bytes in the output buffer and then it does send the TXEMPTY event unblocking your Flush().

The only suggestion I have is to put a time limit of the maximum flush time you expect. Assuming you have no flow control, operating at 115200, each byte should take about 0.1ms. Sending 10 bytes means it should flush in about 1ms maximum to send out on the wire. So flush with a timeout of 55ms (this takes into account the Windows scheduler - not a problem with Win8, but Win7 and earlier would only schedule every 55ms when no multimedia timers were running).

The complete I/O is handled in a single thread (CommOverlappedIO) and so the only difference between async and non-async is timing. You're probably running on a multicore machine, which has more stressful timing with async as now we're running simultaneously on multiple cores, than with the sync that runs on a single core.

So to set the timeout, set the property SerialPortStream.WriteTimeout to 55 (default is -1, which means to block). This property is used by Flush() and Write() methods. Because you're calling Flush() regularly and your write buffer is large enough that it can never get full, you shouldn't run into any more problems.
Marked as answer by jmcurl on 12/26/2014 at 6:14 AM
Jan 29, 2015 at 1:02 PM
Hi,

Sorry for the very delayed reply, I have had pressure on to get a totally different project started (NETMF serial ports are a lot more reliable ;-)).

Setting the timeout did indeed make a huge difference but 100ms seems to be the sweet spot. Although I dont know how the VCP using USBSer.sys can be so faulty for an MS written driver! (I have found some info on the quirks at http://www.keil.com/forum/58625/usbser-sys-quirks-explained/ but they are a bit over my head)

I have one last problem that I need to deal with, the fact that although on my main development computer the serialportstream connects to the device imediately and is able to transfer data, on my other device (a surface pro 2 with exactly the same OS and device driver) this is not the case, in fact the only work around I can find at the moment is so add a 4 second delay between initialising serialportstream and opening the port, otherwise you receive the error "Unable to set serial break state".

If you dont mind, I will send you the dump when I get a chance.

Lastly, It is great that the library is now on Nuget, but would benefit from the symbols being on SymbolSource (http://docs.nuget.org/create/creating-and-publishing-a-symbol-package) as then it would make the software easier to debug...

Thanks for all of your help.
Coordinator
Jan 29, 2015 at 9:18 PM
Thanks for the tip about NuGet! I'll wait some time before creating a new nuget package in case there are any changes. I'll investigate creating a source package, however I can't say until when for now. I'll create a new issue so it's easier to track.

I've tested primarily with FTDI and Prolific. Reading through the documents, I don't see how I can improve the code. Still regarding the flush, the time it takes to actually write the data can be calculated by the maximum number of bytes in your buffer (which should be known) plus 32ms.

As for your new problem, can you please post the entire exception? It will have the Win32 error associated with it:
throw new IOException("Unable to set the serial break state", Marshal.GetLastWin32Error());

Please check the driver versions. I assume you have to install a specific driver for your USBSER device, perhaps there are differences there. The error is being raised as a direct failure from the OS call.

https://msdn.microsoft.com/en-us/library/windows/desktop/aa363433%28v=vs.85%29.aspx

Are you setting the property BreakState in your code? Or the method GetPortSettings()? In any case, the exception stack will definitely help.
Jan 30, 2015 at 1:42 PM
I will see if I can get the full exception but will have to download the latest trunk and import it into the solution first as the exception thrown is only reporting the simple exception and null for the inner exception. I expect this is due to there being no symbols available for the nugget package...

The driver (or actually just the inf) is automatically downloaded from Windows update when the device is first plugged in and both report the same version number. I have uninstalled the device driver and reinstalled it on both computers to the same problem, so I don't think it is that. Although I am starting to wonder about the USB chipset driver for the Surface Pro 2!

I don't use either GetPortSettings() the BreakState property (although I think the latter needs the port to be open before you can set it?!) Should I be calling GetPortSettings() before opening the port?
Coordinator
Jan 30, 2015 at 6:54 PM
Well, I don't understand the exception at all if you're not using these two methods (unless I've overseen something).\

It's not intended that GetPortSettings be used. This is for the special case that a user wants to configure the serial port settings using the driver manager, where opening the driver uses the last settings or the settings in the driver manager (think Windows 3.1 behaviour). In just about all cases you want to set the values explicitly because your application knows best how to talk to the remote device.

Regarding the PDB's, they're in the Nuget package. I don't know where Nuget puts them though. One reason why I can think it's not all there is they're probably built as a release build (not a debug build). I have seen a difference that line numbers don't get printed for release builds. In any case, the exception should show a HRESULT which is the error returned by the Windows low level driver.
Feb 2, 2015 at 11:15 PM
OK,

I seem to now have a stable serial connection to this device via this stupid driver!

It seems that USBSER does something to mangle SetCommState on certain computers (but not all) if sent within 4 seconds of creating the serialportstream object and returns a HResult of 31.

It seems that because it is a VCP SetCommState is not actually used so I have commented it out: more info found here
public void SetCommState()
    {
        //if (!UnsafeNativeMethods.SetCommState(m_ComPortHandle, ref m_Dcb)) {
        //    throw new IOException("Unable to set the serial port state", Marshal.GetLastWin32Error());
        //}
     }
and the problem totally disappears!

I might as well comment out m_SerialPort.SerialPortModemStatus.ClearCommBreak(); rather than having a try/cacth as well since it always throws an exception and I am fedup of seeing it in my traces :-/

At least this solution will work for me until I can find a better way, which will probably be having a "compatibility" flag so that I can reuse the same library for behaving drivers... I will submit a patch when I have done it as hopefully you will be able to include it in your nuget package?!

I still cant believe it was created and still maintained by Microsoft... Time to submit some windows 10 feedback I think!!!
Coordinator
Feb 4, 2015 at 7:19 PM
I really appreciate your research on the USB serial issue. It confirms for me I should stay with FTDI as they appear to be the most reliable.

A HRESULT=31 translates to: public const int ERROR_GEN_FAILURE = 31; In the headers of MS the comment is "A device attached to the system is not functioning.".

If you comment out SetCommState, then how is any settings provided to the serial port? Because then none of the flags are set, baud rate isn't set, etc. I'm afraid I'm just getting confused and either a dump, or a stack trace will help me.

SetCommBreak (your last issue), is used only in the two scenarios I mentioned.
SetCommBreak <- GetPortSettings() (Open() calls here, but only with a non-null value)
SetCommBreak <- set_BreakState

You saw this error 31 in your post on 31st October 2014 during the Open() call. So I wrapped Open() with a try/catch with SerialPortStream.cs r31428 (10th November) around ClearCommBreak. But I didn't wrap SetCommBreak(), because they need to be explicitly used, and thus should raise an exception if the underlying driver raises an exception (but shouldn't if they're used as part of initialisation, which is always cleared).
Feb 4, 2015 at 8:49 PM
It seems that any settings I enter for the device (on a pc that works with SetCommState but with ClearCommBreak still disabled) can use anything for the settings anyway, for example the correct settings for the device according to the manual are:
IoPort.BaudRate = 115200;
IoPort.DataBits = 8;
IoPort.Parity = Parity.None;
IoPort.StopBits = StopBits.One;
IoPort.Handshake = Handshake.None;
but if I set:
IoPort.BaudRate = 2400;
IoPort.DataBits = 7;
IoPort.Parity = Parity.Even;
IoPort.StopBits = StopBits.One;
IoPort.Handshake = Handshake.None;
the device still talks as normal. I am guessing that it is because the VCP driver totally ignores these settings (it is just a bulk endpoint after all) and for some reason on certain computers it is throwing this error. Now all of this could be related to a problem with the firmware on the device and unfortunately I have no access to the firmware sourcecode.

I will try to create you a dump soon but for now, I will upload you a copy of how I have modified SerialPortStream. It would be interesting if the same was true with any of the devices you use....
Coordinator
Feb 8, 2015 at 4:18 PM
I looked over your changes. Instead of "CompatibilityMode", it looks like a "DirectMode" just using CDC over USB, but no physical translation to hardware that even uses a serial protocol. Also, what do you think if I instead have a new constructor that ignores the settings instead of defining a property? I don't think it should be changed once set and if it should be changed, then a new object can be instantiated for the same port after closing the previous one.
Feb 15, 2015 at 3:54 PM
I am willing to give it a try if you send me an example...
Coordinator
Feb 15, 2015 at 10:16 PM
I've taken a different approach in SVN. I've created a new public Open() method. Call Open(false) instead of Open(). I'm thinking of changing the name instead to OpenDirect(). It avoids calling SetCommState() in the Open() method which appears to be your problem. I prefer that over overriding all methods, as this gives more flexibility. Please try and tell me what you find.

I went through all the instances that calls SetCommState. That is all the properties after the port is opened (Baudrate, DataBits, StopBits, Parity, ParityReplace, DiscardNull, HandShake, XonLimit, XoffLimit, TxContinueOnXOff) and GetPortSettings(), so make sure you are not calling this code. So now, just call Open(false), and it should never be called.
Feb 16, 2015 at 1:45 AM
Thanks,

I have downloaded the latest revision and it works fine on all of my devices (including the one that is particularly temperamental with the serial port). If you were to choose not to have an if statement in all properties it will need to be made very clear in the documentation (preferably the XML for each property and on the website documentation) as I can see people calling them by mistake and then wondering what is going on...

Hope that makes sense, I am a little tired.