My new favourite tool


Life is much easier when you can see what's going on

Sometimes I think these machines are getting a little too autonomous for my liking...

Right now, Auric is being a downright pain in the rear.  It just seems to be uncooperative.

It all started when I observed that the directional control was erratic.  Given that the steering process is a stream of adjustments based on image processing, it is normal for there to be some mismatch in the speeds of the motors on each side.  But when the camera is shown a static image of a receding path, I would expect it to steer more or less straight, or to make only small adjustments.  This, however, was not the case.

More digging hinted that not all of the commands sent from the main controller to the motor controller were being processed.  This boiled down to one of three possibilities:
  • The main controller was not sending the commands;
  • The motor controller was not receiving all the commands being sent; or
  • The motor controller was not acting on some of the commands being received.
I was able to eliminate the first option by connecting a serial interface to the output of the main controller and comparing what hit the wire with the logging output.  They matched, so I ruled out the main controller and its code as the culprit.

This left the blame on the motor controller.  Unfortunately, since it has only a single serial port and runs in what is basically a tight loop, my debugging options were limited.  With no console available to me, I modified the controller code to flash the onboard LED (LED_BUILTIN) each time a message was received.  Firing up again, the LED was flickering constantly, so that looked OK.

I then modified the code again to flash the LED each time it executed a command, and the result was a much slower flicker. Aha!

Or at least so I thought.  The commands follow a simple formula:

direction:pwm_speed

So R:512 would mean set the PWM value on the right side to 512, and the motor controller will respond with "0" for OK or "1" for an error.  If I ran minicom on the main controller, and entered the commands manually, it worked fine every time.
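To make the command format concrete, here is a minimal Python sketch of how a receiver might validate a `direction:pwm_speed` message and pick the reply.  It is not the code running on the motor controller.  The set of valid directions (`L` and `R`) and the 10-bit PWM limit of 1023 are my assumptions for illustration; only `R:512` and the "0"/"1" replies come from the description above.

```python
VALID_DIRECTIONS = {"L", "R"}  # assumption: the post only shows "R"
PWM_MAX = 1023                 # assumption: 10-bit PWM range


def parse_command(line):
    """Split 'direction:pwm_speed' into (direction, speed).

    Raises ValueError if the message does not fit the format.
    """
    direction, sep, speed_text = line.strip().partition(":")
    if sep != ":" or direction not in VALID_DIRECTIONS:
        raise ValueError(f"bad direction in {line!r}")
    if not (speed_text.isascii() and speed_text.isdigit()):
        raise ValueError(f"bad speed in {line!r}")
    speed = int(speed_text)
    if speed > PWM_MAX:
        raise ValueError(f"speed out of range in {line!r}")
    return direction, speed


def handle_command(line):
    """Return the reply described above: "0" for OK, "1" for an error."""
    try:
        parse_command(line)
    except ValueError:
        return "1"
    return "0"
```

With these assumptions, `handle_command("R:512")` replies "0", while a malformed message such as `"R512"` replies "1".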

Then I wrote a smaller test program using libserialport to send a command to each motor.  What was curious here was that consistently only the first command was executed!  Now we are getting somewhere, but why?

I suspected that the line speed might be a problem, so I lowered the serial rate to 9600 baud, but the same symptoms were there.

Time for the big guns

A while ago I purchased a BitScope Micro through a special deal from TronixLabs.  I hadn't really put it to serious use, but if ever there was a time, this was it.  The BS05 (as it's known) has a range of software from basic oscilloscope through logic analyser and protocol analysis.  It also has tiny little probes which can get between pins.  The software also works under Windows, MacOS and Linux.  As a bonus, they are a local company.  Even better as far as I am concerned.

Images courtesy of BitScope
With everything running, I was going to use the protocol analysis features to decode the serial stream, but I didn't need to.  I modified serialtest (my test program) to loop through its transmission and report the number of bytes sent, then put one channel on the transmit line from the main controller and another channel on its receive line.

This little device has helped protect my sanity!
I could see the blocks of data on both channels, and the size of the blocks seemed about right.  But wait!  The relative positions of the blocks were changing: sometimes there was a response in between the commands, other times responses seemed to overlap with a command.

This was the break I needed (see what I did there?).  I modified serialtest yet again to report when a frame was received.  Now I could understand what was happening.  The RockChip UART has a deep transmit buffer, so when we wrote a command, the write returned instantly as if the data had already been sent, and the program was ready to write the next command.  As a result, the commands would just keep arriving in the motor controller's RX buffer faster than it could deal with them, and because there was no handshaking, some commands were simply lost.
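The failure mode is easy to reproduce in a toy model.  The sketch below is a simplified simulation, not a measurement: the buffer capacity and the per-tick rates are made-up numbers, and it drops whole commands where a real UART would drop bytes.  It only shows why a sender that never waits will lose data once the receiver's buffer fills.

```python
from collections import deque


def simulate(commands, rx_capacity, sends_per_tick, handled_per_tick):
    """Return (executed, lost) for a sender that never waits for a reply.

    All parameters are hypothetical knobs for the model, not real
    hardware figures.
    """
    if sends_per_tick < 1 or handled_per_tick < 1:
        raise ValueError("rates must be at least 1 per tick")
    pending = deque(commands)
    rx_buffer = deque()
    executed, lost = [], []
    while pending or rx_buffer:
        # Sender: pushes commands out as fast as its own UART accepts them.
        for _ in range(sends_per_tick):
            if not pending:
                break
            cmd = pending.popleft()
            if len(rx_buffer) < rx_capacity:
                rx_buffer.append(cmd)
            else:
                lost.append(cmd)  # buffer full and no handshake: dropped
        # Receiver: works through its buffer at its own, slower pace.
        for _ in range(handled_per_tick):
            if not rx_buffer:
                break
            executed.append(rx_buffer.popleft())
    return executed, lost


if __name__ == "__main__":
    cmds = [f"R:{i}" for i in range(10)]
    executed, lost = simulate(cmds, rx_capacity=2,
                              sends_per_tick=4, handled_per_tick=1)
    print("executed:", executed)
    print("lost:", lost)
```

With those numbers, only 4 of the 10 commands are executed.  When the sender is paced to one command per tick, nothing is lost.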

The tiny probes make it easier to grab the right part on a PCB
I can't change the buffer behaviour, and I have no capacity to add hardware handshaking between the two controllers, so the solution is to make the main controller actively wait for the acknowledgement from the motor controller before sending the next command.  By making this change, I could ramp the data speed back up and things held together fine.
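The wait-for-acknowledgement idea can be sketched as follows.  This is an illustration in Python rather than my actual libserialport code, and it makes some assumptions: the `port` object only needs pyserial-style `write()` and `read()` methods, commands are terminated with a newline, and the reply is a single "0" or "1" byte.  `FakeMotorController` is a hypothetical stand-in so the example runs without hardware.

```python
import time


class AckTimeout(Exception):
    """Raised when no reply arrives within the allowed time."""


def send_and_wait(port, command, timeout_s=0.5):
    """Write one command, then block until the one-byte reply arrives.

    Returns True for "0" (OK) and False for "1" (error).
    The newline terminator is an assumption.
    """
    port.write((command + "\n").encode("ascii"))
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        reply = port.read(1)  # b"" means nothing has arrived yet
        if reply in (b"0", b"1"):
            return reply == b"0"
        # Anything else (for example a line ending) is ignored.
    raise AckTimeout(f"no reply to {command!r}")


class FakeMotorController:
    """Hypothetical stand-in for the serial port: replies "0" to everything."""

    def __init__(self):
        self.received = []
        self._replies = bytearray()

    def write(self, data):
        self.received.append(data)
        self._replies += b"0"
        return len(data)

    def read(self, size=1):
        chunk = bytes(self._replies[:size])
        del self._replies[:size]
        return chunk


if __name__ == "__main__":
    port = FakeMotorController()
    for cmd in ("L:512", "R:512"):
        ok = send_and_wait(port, cmd)
        print(cmd, "->", "OK" if ok else "error")
```

Because each command is only sent after the previous reply has been read, the receiver never has more than one command waiting, whatever the line speed.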

It was definitely a software problem of my own making, but the symptoms were pointing to anything but that.

The odd thing was that this same code model had held up just fine in the previous car even though my poor code was present, so what is the difference?  There are a few explanations I can offer.  I am not yet sure which is the most likely:
  • Because the Raspberry Pi is slower, perhaps it simply didn't manifest;
  • There may be a difference in either the UART driver code or the UART itself between the two boards;
  • Maybe it was there, but because of the way the steering mechanism worked, it just wasn't obvious.

So what did I learn?

Sometimes a console just isn't enough, and no console is even harder
Debugging embedded systems has unique challenges because you can't always have the controller report what it's thinking.  Using a spare GPIO port to indicate state will usually cost very little in performance, but might be all you need.
Don't assume.  Check
Maybe if I had been less rushed in writing the code in the first place, I would not have spent a day trying to solve this, but I fell into the trap of "what the heck, it works!"
The right tools make all the difference
I have a couple of perfectly good CROs (cathode ray oscilloscopes), but they are big and bulky and I couldn't be bothered to drag them out.  The BitScope, on the other hand, is a little USB device, which makes it portable and powerful enough for my needs.

There is a lot of functionality I have yet to explore, but I am sure that with Auric's bad attitude I am going to get the opportunity!

Want to help?

If you would like to make a financial contribution to the project, please go to my Patreon Page at:




Please note:  I am not associated in any way with BitScope Designs, MetaChip Pty Ltd, or their resellers.  I am just a very happy customer, and sought their permission before writing.













