I used the following three messaging patterns.
- Publish/Subscribe: A client subscribes to specific types of messages. When the server reads these messages from the hardware, it publishes them to the subscribed clients.
- Request/Response: A client sends a request to the server, which executes it, interacts with the hardware, and returns the response. For example, a client can request to open a serial port or play an audio file.
- Push/Pull: All clients push their logs to a central logging server, which pulls the messages and writes them to a file.
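As a rough illustration, a Request/Response exchange with the clrzmq 3.x API might look like the sketch below. The endpoint address and the "OPEN_SERIAL_PORT COM1" command string are made up for this example, and error handling is omitted:

```csharp
using System.Text;
using ZeroMQ;  // clrzmq binding

// Server side: receive a request, act on the hardware, reply.
using (ZmqContext context = ZmqContext.Create())
using (ZmqSocket server = context.CreateSocket(SocketType.REP))
{
    server.Bind("tcp://*:5555");                     // endpoint chosen for this sketch
    string request = server.Receive(Encoding.UTF8);  // blocks until a request arrives
    // ... interact with the hardware here, e.g. open the serial port ...
    server.Send("OK", Encoding.UTF8);                // send the result back
}

// Client side: send a request and wait for the response.
using (ZmqContext context = ZmqContext.Create())
using (ZmqSocket client = context.CreateSocket(SocketType.REQ))
{
    client.Connect("tcp://localhost:5555");
    client.Send("OPEN_SERIAL_PORT COM1", Encoding.UTF8);  // command string is illustrative
    string reply = client.Receive(Encoding.UTF8);
}
```

The Publish/Subscribe and Push/Pull patterns follow the same shape, just with SocketType.PUB/SUB and SocketType.PUSH/PULL sockets instead.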
Since development is done in C# on a Windows Embedded environment, I use clrzmq, a C# binding for ZeroMQ. Based on my initial performance tests, I realized that clrzmq was consuming far more CPU than I expected.
I used Red Gate's ANTS Performance Profiler for .NET, which gives a detailed analysis of how many CPU cycles are spent in each function and how many times each is called.
I found that the ZmqSocket.Receive() method spent its time as follows:
- SpinWait: 17.1%
- Stopwatch.GetElapsedDateTimeTicks: 4.1%
- Stopwatch.StartNew: 2.4%
- Receive: 73.3%
Within Receive(), 64.4% of the time was spent in SocketProxy.Receive():
- ErrorProxy.get_ShouldTryAgain: 5.1%
- SocketProxy.Receive: 64.4%
And the CPU usage within SocketProxy.Receive():
- DisposableIntPtr.Dispose: 11.1%
- ZmqMsgT.Init: 7.1%
- ZmqMsgT.Close: 5.8%
- SocketProxy.RetryIfInterrupted: 20.8%
See the attached picture, where SocketProxy.Receive() uses 13,142.42 CPU ticks per request:
Average CPU ticks per request = 191,196,025 / 14,548 = 13142.42
As part of the optimization, I used a pre-allocated raw buffer to send and receive data instead of a ZmqMsg object, and moved the Stopwatch and SpinWait code into a limited scope that runs only when a timeout is defined and longer than a certain value.
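The idea behind the buffer change can be sketched as follows. The Receive overload shown and the helper names are illustrative, not the exact patched clrzmq API:

```csharp
// Before: each receive allocated and disposed a ZmqMsgT wrapper, and the
// Stopwatch/SpinWait retry machinery ran on every call.
//
// After: one raw buffer is allocated up front and reused for every receive,
// so the hot path does no per-message allocation.
byte[] buffer = new byte[8192];            // pre-allocated once; size is illustrative
while (running)
{
    int size = socket.Receive(buffer);     // fills the existing buffer (sketch of the overload)
    if (size > 0)
        ProcessMessage(buffer, size);      // hypothetical message handler
}
```

Reusing a caller-owned buffer avoids both the managed allocations and the Init/Close calls on the native message object that showed up in the profile above.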
After these optimizations, SocketProxy.Receive() uses only 2,696.54 CPU ticks per request, almost one-fifth of the original CPU usage. See the attached picture below.
Average CPU ticks per request = 5,132,725,522 / 1,903,448 = 2,696.54
Here is the GitHub link for the optimized ZeroMQ library.
I am happy to say that my patch was accepted by the clrzmq author and merged into the mainline clrzmq library.