Update: the sampled image is such a low resolution because we don’t have a computer powerful enough to train on anything bigger. If we had a much better computer and GPU we could train it at a higher resolution. (hint hint Intel/NVIDIA 😉 )
I’ve just finished a recent side project with my friend Kendrick. (his GitHub) We built an autonomous car that you can teach how to drive, then it drives around by itself. I did all of the hardware/arduino software and Kendrick did all of the machine learning software. He called his project Suiron and it’s available on GitHub here. The code running on the arduino is called car-controller and is available on my GitHub here.
Now that you’ve got the links, feel free to go and have a look. Work through the code and try to figure out how it works. I’ll try to briefly cover the software here but my main focus will be on the hardware. One thing to note is the open source licenses: all my stuff is GPL and Kendrick’s Suiron project is MIT.
This post is intended more as an overview of how the whole thing works. If I get time I might turn it into a tutorial on how to get it working yourself.
Before we begin here is a short video of it in action.
Now onto the fun stuff! How does it work?
These are the main components used.
1) Remote Control Car – we used this car (link) but anything of similar size will work, as long as it has a standard ESC and steering servo. It comes with a remote control, battery and charger to start with. I recommend buying a new remote control system. (link 5 below)
2) Intel NUC – The Raspberry Pi doesn’t really have enough power and is ARM based. An x86 based processor like the i5 in our NUC is much easier to use for machine learning purposes. The exact one you use doesn’t matter.
4) Lens filters – if you are operating in any sunlight, you will want a Polarising and ND (Neutral Density) filter. The camera just can’t cope with the harsh sunlight and shadows so these filters help bring the conditions into something much better. A variable ND is great as it lets you adjust the “darkness” level.
5) Radio control system – if you intend on doing lots of this stuff then get an FrSky TARANIS. You won’t be disappointed. Otherwise, a Turnigy 9XR will work just as well. Make sure you get a receiver too if it isn’t included.
6) You’ll also need an arduino. I like the Arduino Nanos because they are super cheap and have on board USB.
I won’t go into details on how to wire everything as this isn’t a tutorial. However, if you need some help drop a comment below. I suggest you learn how an ESC (electronic speed controller) works together with a motor, receiver, servo and battery. This is a standard setup on normal remote control cars. Once you understand that you should look at arduinos and how to use them to blink lights and read inputs. Read through the arduino code and the wiring should be pretty self explanatory.
How it all fits together
It’s up to you how you put everything together. I recommend trying to keep everything as low as possible for better stability when driving. The webcam needs to be mounted up high so it has a better chance of seeing the lane that it’s in. I just used a square bit of balsa wood as it’s really light and strong, then glued the webcam to it. Instead of explaining exactly how I mounted everything I’ll dump a few pictures here. All the white things are 3D printed, but you could easily do it without a 3D printer.
The importance of a polarising filter cannot be overstated. It reduces reflections and the harsh glare sometimes encountered. In the image below (credit) you can see how much of a difference a polarising filter can make. Now water is a bit of an extreme example, but I chose that picture so it’s easier to demonstrate the difference. In reality, where we’re operating, the difference won’t be so obvious.
The neutral density filter is at least as important as the polarising filter. The ND filter is basically like sunglasses for the webcam. The webcam doesn’t like really harsh light, so the filter reduces the intensity of it without interfering with the image too much. The picture below (credit wikipedia) shows how much better the right ND filter can make an image in harsh light.
I suggest making the lens filters removable as they will make the image too dark in lower lighting situations. For example, it was perfect at midday but much too dark a few hours later just before dusk. I made a simple mount that just uses an alligator clip to hold the filters in place. The filters are both glued together then onto a small 3D printed right angle mount.
The diagram below shows how everything is hooked up. Basically the arduino is the “brains of the hardware”. It reads in the values from the R/C receiver (bottom left) and then decides what to do based on the mode channel. Dig through the arduino code (link) to see exactly how. Basically there are 3 modes: manual, autonomous and emergency stop.
In manual mode the arduino reads in the steering and motor values and passes them straight to the motor and steering servo. In this mode, with the right flag enabled, it also sends those values back over UART every time it receives a character. (Only replying when it receives a character prevents the serial buffer getting full and “lagging”.) In autonomous mode the arduino reads inputs over UART from the NUC. In this mode it receives two messages, steer,x and motor,x, where x is the value you want to set it to. It then writes those outputs to the steering servo or motor. Finally, emergency stop kills the motor output and straightens the steering servo. This emergency stop overrides any sort of manual or autonomous control.
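To make the three modes a bit more concrete, here is a rough Python sketch of the decision logic. It is not the real car-controller code (that runs on the arduino and is linked above). The names Mode, parse_command and select_outputs, the neutral value of 90 and the exact message handling are all assumptions made up for illustration.

```python
from enum import Enum

# Assumed neutral values; the real numbers live in car-controller.
STEER_STRAIGHT = 90
MOTOR_STOP = 90


class Mode(Enum):
    MANUAL = "manual"
    AUTONOMOUS = "autonomous"
    EMERGENCY_STOP = "emergency_stop"


def parse_command(message):
    """Parse a 'steer,x' or 'motor,x' message into (name, value)."""
    name, _, raw_value = message.strip().partition(",")
    if name not in ("steer", "motor"):
        raise ValueError(f"unknown command: {message!r}")
    return name, int(raw_value)


def select_outputs(mode, receiver, uart_messages):
    """Return the (steering, motor) values to write for the current mode.

    receiver: (steering, motor) as read from the R/C receiver.
    uart_messages: messages received from the NUC since the last update.
    """
    if mode is Mode.EMERGENCY_STOP:
        # Overrides everything: steering straight, motor off.
        return STEER_STRAIGHT, MOTOR_STOP
    if mode is Mode.MANUAL:
        # Pass the human driver's inputs straight through.
        return receiver
    # Autonomous: apply whatever the NUC asked for, neutral otherwise.
    outputs = {"steer": STEER_STRAIGHT, "motor": MOTOR_STOP}
    for message in uart_messages:
        name, value = parse_command(message)
        outputs[name] = value
    return outputs["steer"], outputs["motor"]


print(select_outputs(Mode.AUTONOMOUS, (100, 120), ["steer,75", "motor,95"]))
# (75, 95)
```

Checking emergency stop first is the important design choice here: whatever the receiver or the NUC is saying, the stop always wins.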
The Machine Learning Part
This isn’t my expertise so I’ll briefly summarise what it’s doing. (not really how it’s doing it, no one really knows) We used a library called TensorFlow. It’s an open source machine learning library published by Google and released under an Apache license. It has a nice Python API and a “no nonsense” C++ API.
This is a really short summary of the whole process. Each time a video frame is recorded Suiron (software on the NUC) asks car-controller (software on arduino) what the human operator is doing. Remember, in manual mode the human operator is driving the car around. Car-controller responds by sending the current steering and motor values back to Suiron. Suiron takes these values and saves them along with a processed version of the frame.
This process happens at about 30Hz (or 30 times per second) for as long as you record data. In the final model, we used about 20 minutes worth of training data. That is 20 minutes of continuously driving around the track. It may not seem like a lot but it gets repetitive very quickly. 😉 In reality, 20 minutes is nowhere near enough data. It works great on this particular track with similar lighting conditions but would likely fail if the conditions changed too much.
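The recording loop described above can be sketched roughly like this. It is not Suiron’s actual code: read_frame, ask_controller and preprocess are hypothetical stand-ins for the webcam, the serial exchange with car-controller and the image processing, and keeping the samples in a list is just for illustration.

```python
def record(read_frame, ask_controller, preprocess, num_frames):
    """Collect (processed_frame, steering, motor) samples, one per video frame."""
    samples = []
    for _ in range(num_frames):
        frame = read_frame()
        # Ask car-controller what the human driver is doing right now.
        steering, motor = ask_controller()
        samples.append((preprocess(frame), steering, motor))
    return samples


# Rough size of the data set: about 30 frames per second for 20 minutes.
FRAMES_PER_SECOND = 30
RECORDING_MINUTES = 20
expected_samples = FRAMES_PER_SECOND * RECORDING_MINUTES * 60
print(expected_samples)  # 36000
```

So 20 minutes of driving works out to roughly 36,000 frame and steering/motor pairs.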
Again, I’m not an expert at this but I’ll try to briefly explain how the training works. Convolutional Neural Networks (CNNs) are weird in the way they work. It’s very hard to know exactly how or why a trained CNN arrives at its answers. Basically, we’re giving TensorFlow the frame and two numbers (steering and motor). Then we’re asking it to work out how the frame relates to those two numbers. After giving it tens of thousands of examples (frames) it can try to generalise a model.
Because of the amount of computing power required it takes a very long time to train a good model. Due to the type of calculations it has to do, TensorFlow runs much faster on a dedicated GPU. With only 20 minutes of data our model took half a day to train properly. The training took place in a desktop with a borrowed GTX 980, a GPU that’s towards the higher end of consumer graphics cards.
Using the model
You can see it in action in the gif below. The blue line is what the model thinks it should do, the green line is what I actually did when I was steering it. Note that this data was not included in the training set, so it shows how the model copes with data it hasn’t seen before.
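If you want to put a number on that comparison, one simple option is to hold back some of the recorded samples and measure how far the predictions are from what the driver actually did. The sketch below is only an illustration: split_holdout and mean_absolute_error are made-up helper names, the 20% hold-out is an arbitrary choice, and the steering values in the example are invented, not results from our model.

```python
def split_holdout(samples, holdout_fraction=0.2):
    """Split samples into (training, held_out), keeping the last part back."""
    if not 0 < holdout_fraction < 1:
        raise ValueError("holdout_fraction must be between 0 and 1")
    held = int(len(samples) * holdout_fraction)
    cut = len(samples) - held
    return samples[:cut], samples[cut:]


def mean_absolute_error(predicted, actual):
    """Average distance between predicted and actual steering values."""
    if not predicted or len(predicted) != len(actual):
        raise ValueError("need two non-empty lists of the same length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


training, held_out = split_holdout(list(range(10)))
print(len(training), len(held_out))  # 8 2

# Invented values: model output (blue line) vs. the driver (green line).
print(mean_absolute_error([90, 100], [92, 96]))  # 3.0
```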
Once it has been trained we can then use the model. Basically, what happens is we collect just a frame from the webcam. Then we pass it to TensorFlow and ask it to run it through the model. The model then spits out what it thinks our two values should be, one for steering and one for throttle. At the moment the throttle is unused and the car runs at a constant speed. However, we thought we’d include it just in case we wanted to use it in the future.
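In code, one step of the driving loop looks roughly like the following. This is a sketch, not Suiron itself: model is any callable standing in for the trained TensorFlow model, send stands in for writing to the serial port, CONSTANT_MOTOR is an invented value, and rounding the steering to a whole number is an assumption. The steer,x and motor,x strings are the messages car-controller expects in autonomous mode.

```python
CONSTANT_MOTOR = 100  # invented placeholder for the fixed speed


def drive_step(frame, model, send):
    """Run one frame through the model and send the commands to the arduino."""
    steering, throttle = model(frame)
    # The predicted throttle is ignored for now; the car runs at a constant speed.
    send(f"steer,{int(round(steering))}")
    send(f"motor,{CONSTANT_MOTOR}")
    return steering, throttle


sent = []
drive_step("frame", lambda frame: (87.6, 0.4), sent.append)
print(sent)  # ['steer,88', 'motor,100']
```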