Monday, April 7, 2008

Improving SPI performance - our starting line

OK, I've spent the past week on three things: measuring the current SPI performance against live CAN traffic, working out how to address the issues I found, and implementing the solution. The exciting news is that it was a very productive week. It appears that the performance is now where it needs to be, but further testing is needed to prove this.

Let me show you what I measured. The picture to the right shows us a number of things (most of which are not good ;-)

First let's look at how we initiated the traffic. I'm set up with five CAN nodes on the CAN bus. I sent a message out to all five nodes asking them to identify themselves (part of the AMSAT protocol). In the logic analyzer trace, we first see the serial RxD (Red) traffic at the bottom left, followed by the serial TxD (Yellow) "OK" response. Next come the MCP2515 transmit buffer load (SPI-labelled signals in the middle) and the send-buffer command, again via SPI. Then we see the actual CAN transmit (Green) signal at the bottom, followed by the CAN receive (Red) traffic, with a Green CAN-Tx ACK for each CAN message as it arrives.
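To make the two SPI transactions in the trace concrete, here is a minimal sketch of the bytes they would carry, assuming the standard MCP2515 LOAD TX BUFFER and RTS instructions, transmit buffer 0, and an 11-bit identifier. The function names and the example identifier are my own placeholders; the prototype code itself isn't shown in this post and may be organized differently.

```python
# Sketch only: builds the SPI byte sequences, it does not drive any hardware.
# Instruction values follow the MCP2515 datasheet; helper names are hypothetical.

LOAD_TX_BUFFER_0 = 0x40  # LOAD TX BUFFER, starting at TXB0SIDH
RTS_BASE = 0x80          # RTS: low three bits select TXB0..TXB2


def load_tx_buffer0(can_id, data):
    """Return the SPI bytes that load a standard (11-bit) frame into TXB0."""
    if not 0 <= can_id <= 0x7FF:
        raise ValueError("standard identifiers are 11 bits")
    if len(data) > 8:
        raise ValueError("a CAN frame carries at most 8 data bytes")
    sidh = can_id >> 3            # upper 8 bits of the identifier
    sidl = (can_id & 0x07) << 5   # lower 3 bits, EXIDE left clear
    header = [LOAD_TX_BUFFER_0, sidh, sidl, 0x00, 0x00, len(data)]
    return bytes(header) + bytes(data)


def request_to_send(buffer_mask=0b001):
    """Return the single-byte RTS instruction for the selected TX buffers."""
    return bytes([RTS_BASE | (buffer_mask & 0x07)])


if __name__ == "__main__":
    # 0x123 is an arbitrary example identifier, not one from the AMSAT protocol.
    print(load_tx_buffer0(0x123, b"\x01\x02").hex(" "))
    print(request_to_send().hex(" "))
```

Each sequence is sent inside its own chip-select window, which is why the trace shows two separate bursts of SPI activity before the CAN transmit begins.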

Notice how quickly these messages arrive? CAN is a great little protocol in that all devices listen to all traffic on the bus and inject their message as soon as they can. Each one waits only for the mandatory gap between successive messages, and then the next message is transmitted. This means that as a receiving device we wait around for traffic which comes in bursts, and sometimes, as in this test case, it arrives in maximum-speed bursts. How fun! Well, OK, it's not fun at all. It means that our device has to support running at the best possible speed right away. The only slack we have is that the MCP2515 can double-buffer received traffic, and that is very little slack.
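A rough way to put numbers on "maximum speed bursts" is to compare how long one back-to-back CAN frame occupies the bus with how long it takes to pull one frame out of the MCP2515 over SPI. The sketch below does that arithmetic. The bit rates in the example are placeholders I picked for illustration (the actual rates aren't stated in this post), and stuff bits are ignored, which gives the shortest frame and therefore the tightest deadline.

```python
# Sketch only: nominal timing budget, ignoring stuff bits and software overhead.

def can_frame_bits(data_bytes, extended=False):
    """Nominal bits per frame, including the 3-bit gap before the next frame."""
    # Standard frame: 44 bits of overhead + 3-bit intermission = 47
    # Extended frame: 64 bits of overhead + 3-bit intermission = 67
    overhead = 67 if extended else 47
    return overhead + 8 * data_bytes


def frame_time_us(data_bytes, can_bit_rate, extended=False):
    """Microseconds between the starts of two back-to-back frames."""
    return can_frame_bits(data_bytes, extended) * 1e6 / can_bit_rate


def spi_unload_time_us(data_bytes, spi_clock_hz):
    """Microseconds of SPI clocking needed to read one received frame."""
    # 1 instruction byte + 5 header bytes (SIDH, SIDL, EID8, EID0, DLC) + data
    spi_bytes = 1 + 5 + data_bytes
    return spi_bytes * 8 * 1e6 / spi_clock_hz


if __name__ == "__main__":
    # Hypothetical rates, chosen only to show the comparison.
    budget = frame_time_us(8, can_bit_rate=500_000)
    unload = spi_unload_time_us(8, spi_clock_hz=1_000_000)
    print(f"one frame on the bus: {budget:.0f} us")
    print(f"one SPI unload:       {unload:.0f} us")
```

With two receive buffers, the receiver keeps up only if, on average, one unload (plus whatever time the software takes to notice the buffer-full signal and start the transfer) fits within one frame time. Frames with fewer data bytes shrink that budget further.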

Now, turning our attention to the top signals, we see the /INT line (Red) being asserted, which is our transmit-complete signal to the Propeller. We then see the /Rx0BF signal (Green) asserting, followed almost exactly one CAN message later by the /Rx1BF signal (Yellow) asserting. These are the "receive buffer zero full" and "receive buffer one full" signals. This is all working exactly as we want. We are being notified of each event as we intended!

The first indication of performance issues with our prototype code is that we don't begin to unload the 2nd message until after all 5 messages have arrived. (The first unload occurs during the tail end of the 4th message arriving, and the 2nd unload occurs quite a bit after the last message arrived.) Given that we have only two receive buffers, we just proved that we lost 3 of the 5 messages arriving. We are simply not fast enough.

Clearing the transmit-complete interrupt can happen much faster (it should likely happen before the first message arrives), and our receives certainly need to be much faster. In fact, this almost certainly proves that we can't have a separate Cog watching the /Rx0BF and /Rx1BF lines and then asking the SPI back-end Cog to unload the buffers, as we do here. This simply isn't fast enough.
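For reference, here is a minimal sketch of what the two time-critical SPI operations could look like at the byte level, assuming the MCP2515's BIT MODIFY instruction is used to clear the transmit-complete flag and its READ RX BUFFER instruction is used to unload a receive buffer. The register address, bit mask, and instruction values are from the MCP2515 datasheet as I understand it; the helper names are placeholders, and this isn't necessarily how the prototype code does it.

```python
# Sketch only: builds SPI byte sequences, it does not drive any hardware.

BIT_MODIFY = 0x05   # BIT MODIFY instruction
CANINTF = 0x2C      # interrupt flag register
TX0IF = 0x04        # "transmit buffer 0 empty" flag in CANINTF


def clear_tx0_interrupt():
    """SPI bytes that clear only the TX0IF flag, leaving other flags alone."""
    # Format: instruction, register address, mask, new value for masked bits
    return bytes([BIT_MODIFY, CANINTF, TX0IF, 0x00])


def read_rx_buffer_command(buffer_index):
    """Instruction byte that starts reading RXB0 or RXB1 at its SIDH register."""
    if buffer_index not in (0, 1):
        raise ValueError("the MCP2515 has receive buffers 0 and 1 only")
    return 0x90 if buffer_index == 0 else 0x94


if __name__ == "__main__":
    print(clear_tx0_interrupt().hex(" "))
    print(hex(read_rx_buffer_command(0)), hex(read_rx_buffer_command(1)))
```

One reason READ RX BUFFER is attractive here is that, per the datasheet, the matching receive flag is cleared automatically when chip select is raised at the end of the read, so no separate flag-clearing transaction is needed before the buffer can accept the next message.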

This means that our initial functional decomposition and assignment of responsibility among Cogs is not going to work. It looks like we have to move some of the transmit acknowledgement handling and most, if not all, of the receive handling. Well, it's back to the drawing board for me to figure out which Cog needs to do what, one more time...

In my next post I'll describe how things are to get rearranged and then I'll follow up with measurements of the new organization.
