How to build a small self-driving car?

2017 · Amsterdam


My recent side project is building a small self-driving car with my friend Filip. See the code on my github.


There is a fair number of folks doing similar projects, and there are even people who race these things. If it looks interesting, you could do it too - it is easier than ever and I'm sure you'd learn a lot.






* Idea *

The car works in the following way:


The diagram below visualizes the flow of data:




To enable all of this, various pieces of software infrastructure and hardware are necessary.


On the GPU server side we need: Each of the above is a multiprocessing.Process, and they share state through a multiprocessing.Namespace (see the sketch at the end of this section). On the RasPi side we need:


On the hardware side there are the following elements:


In the text below I will explain how we figured out how to combine all of them into a small self-driving car, and hopefully convey that it was a lot of fun - and that perhaps we have even learned a little.
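As a side note on the server-side plumbing mentioned above, here is a minimal sketch of how several multiprocessing.Process workers can share state through a multiprocessing.Namespace. The worker names and fields are made up for illustration; they are not our actual process layout.

    import time
    from multiprocessing import Manager, Process

    def camera_receiver(state):
        # Hypothetical worker: pretend to receive frames and publish the latest one.
        for i in range(5):
            state.latest_frame = "frame-%d" % i
            time.sleep(0.1)

    def steering_predictor(state):
        # Hypothetical worker: read the shared state and produce a steering command.
        for _ in range(5):
            state.steering = "left" if state.latest_frame else "none"
            time.sleep(0.1)

    if __name__ == "__main__":
        state = Manager().Namespace()   # shared state visible to all processes
        state.latest_frame = None
        state.steering = "none"
        workers = [Process(target=camera_receiver, args=(state,)),
                   Process(target=steering_predictor, args=(state,))]
        for w in workers:
            w.start()
        for w in workers:
            w.join()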


* Before the first car *

The whole story started when I finally had time to learn some electronics. I have always wanted to understand it, as it delivers such exceptional amounts of value and - in my book - is as close to magic as it gets.


There are so many cool books about electronics, online write-ups, YouTube videos and electronics simulators, but what I found most practical for the car project was this book targeted at young teens:




It is so good. It lacks the proper theory build-up, so you will have to read up on that somewhere else if you are interested, but for the car project it should be sufficient. After some lazy evenings with a soldering iron I was able to reverse engineer and debug basic electronic circuits such as this "astable multivibrator", to use the proper name:




* The first car era *

Armed with a rudimentary knowledge of electronics, it was time to reverse engineer how an RC car works. As I generally had little idea what I was doing, I decided to get the cheapest one I could find so I wouldn't be too wasteful if I broke it while disassembling it. The cheapest one I could find was this little marvel:




which I bought in the Dutch bargain store Action.nl, new, for under 5 euros.


Deconstructing it, you just cannot help wondering how simply and elegantly it is designed. It truly amazes me that for under 5 euros per unit people are able to design, fabricate, package, ship and sell this toy. It is possible mostly due to an amusingly "to the point", "brutalist" design, where most of the elements are right at their tolerances and the desired effects are achieved in the simplest possible way. This approach is best showcased by the design of the front-wheel steering system, which I will describe below.


Coming back to the self-driving project, when we pop the body off the frame, we see this:




Luckily for our project, this construction can be adapted to our needs in a straightforward fashion. In the center we see the logic PCB. There are three pairs of red & black wires attached to the PCB:


When the logic board receives radio inputs, it (more or less) connects the battery wires to the engine wires and the car moves.




You can see this for yourself - if you directly connect the pair from the battery to the pair coming from one of the engines (front or back), you see the wheels move in the expected direction. So if we substitute the current logic board with our own custom RasPi-based logic board that controls the connection between the battery and the engines based on instructions we control in code, we would be able to drive by code.


Warning!

As the Raspberry Pi has some General Purpose Input/Output (GPIO) pins, it is tempting to try to hook up the engine directly to the RasPi. This is a very bad idea, which will destroy your RasPi by running too much current through it.


Thus, we need a hardware engine controller which, based on signals from the RasPi, will drive the motors. You can buy a ready-made motor controller or be thrifty and implement your own. As conceptually the wanted mechanism of action sounds like a transistor - "use a small amount of electricity to switch on the flow of a bigger amount" - I decided to build my own engine controller. Following this tutorial produced an engine controller based on the L293D chip.




The tutorial worked - at least partially. I was able to steer the front wheels and move the car forward, but there was no reverse gear.




It turns out the method shown in the above tutorial couldn't spin the second engine in the reverse direction. After consulting the chip's datasheet I came up with another way of hooking up the L293D, which fixed the bug. Now I could control the car completely from Python code running on the RasPi.
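For illustration, here is a minimal sketch of driving one motor through an L293D from Python on the RasPi. It assumes a wiring where two GPIO pins go to the chip's two input pins for that motor and the enable pin is tied high; the BCM pin numbers are made up, not necessarily the ones we used.

    import time
    import RPi.GPIO as GPIO

    # Hypothetical BCM pin numbers wired to the L293D inputs for one motor.
    IN1, IN2 = 23, 24

    GPIO.setmode(GPIO.BCM)
    GPIO.setup(IN1, GPIO.OUT)
    GPIO.setup(IN2, GPIO.OUT)

    def forward():
        GPIO.output(IN1, GPIO.HIGH)   # one input high, the other low -> spin one way
        GPIO.output(IN2, GPIO.LOW)

    def reverse():
        GPIO.output(IN1, GPIO.LOW)    # swap the inputs -> spin the other way
        GPIO.output(IN2, GPIO.HIGH)

    def stop():
        GPIO.output(IN1, GPIO.LOW)    # both inputs low -> motor off
        GPIO.output(IN2, GPIO.LOW)

    forward(); time.sleep(1)
    reverse(); time.sleep(1)
    stop()
    GPIO.cleanup()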


Now that I had a car that could be controlled by a keyboard attached to the RasPi, it was time to make the control remote. I connected a basic small WiFi dongle to the Raspberry and implemented a pair of Python scripts:


* Steering mechanism in first car *

Now I will describe how steering is implemented mechanically in the first RC car we worked with.


Let me remark here that I am impressed by how simple this mechanism is. Perhaps the reason I am so excited about it is that I have only a vague idea about how mechanical design works, so I am easy to impress, and perhaps stuff like this gets done every day, who knows. But I see an element of playful mastery in it and I can't help feeling happy about it.


The mechanism is hidden if you look from the front




and similarly remains hidden if you look at it from the top.




It is only after removing the top cover that we see the following mechanism (I have removed all the plastic elements to help visibility):




Seen from the top:




And here comes the trick:


On top of the shaft there is a pinion. The pinion is just slid onto the shaft, without any kind of glue. The size and tightness of the pinion are chosen in such a way that when the engine starts moving, the friction between the shaft and the pinion is just enough to move the half-circle rack.




But when the half-circle rack hits the maximum extension point, the friction breaks and the shaft spins around inside the pinion. Is there a cheaper trick (speaking both symbolically and literally)? How ingenious!




Below the semicircular rack there is a spring that pulls the wheels back to the neutral position:




You can see it better in this close-up:




A small plastic element is connected to the wheels. It has a small peg that sits in the middle of the spring, so the spring can act on it:




So, as you can see, it is just a collection of the simplest possible elements, each doing its job well. Everything is super cheap, but the materials and their shapes are chosen in a smart way that makes it all work together nicely. So cool!


That wraps up the description of the "brutalist" steering mechanism. Back to self-driving!


* Hardware coming together *

When we already had the "drive by keyboard attached to the RasPi by cable" capability, we wanted to test remote control of the car. In order to do that, we needed to power the RasPi. Not being sure how much power it needs, we bought a pretty big mobile-phone power bank with a capacity of 10,000 mAh, weighing around 400 g.


I am a huge fan of the American automotive show Roadkill, in which two guys, mostly by themselves, fix extremely clapped-out classic American cars, usually on the road / in a WalMart parking lot / in a junkyard. Typically they take some rotten engine-less chassis and put the cheapest V8 they can find into it. Needless to say, they are masters of using zip-ties.


Inspired by Freiburger and Finnegan's experiences, I relied on zip-tie engineering to combine the collection of parts into one coherent vehicle that would move together. The end product looked like this:










You may say that it looks pretty, but looking pretty is about all it did. Notice how big the battery is in relation to the car. It turns out it was way too big. Unfortunately, the car was far too heavy to move on its own.




So the heaviest part of the car was the battery pack powering the RasPi. After gaining some initial experience it became clear that 10,000 mAh is far more than enough to run the RasPi, even with a WiFi interface, for a reasonable amount of time. Thus the easiest way to save weight was to buy a much lighter battery with less capacity.
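For a rough sanity check (my back-of-the-envelope numbers, not a measurement): a RasPi with a WiFi dongle draws somewhere in the ballpark of 0.5-1 A at 5 V, so even in the pessimistic case a 10 Ah pack gives on the order of 10 Ah / 1 A ≈ 10 hours - far more than any single driving session needs.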


We were really lucky about the Pokemon Go craze that had rolled through the Netherlands in early 2017, as it meant an abundant supply of cheap mobile battery packs. We quickly sourced a new one, and with the new battery, the car looked like this:






Unfortunately, the car was still way too weak to drive well with the RasPi strapped to its back. We needed a bigger car, but before I describe it, I will show how we put together the image capture and streaming system.


* Image capture and streaming *

Building this subsystem was a big loop of trial and error. In the end we settled on the RPi Camera on the hardware side and the amazing picamera Python module on the software side.




An interesting detail here is the camera holder. Filip works at 3D Hubs, an Amsterdam-based 3D printing company. As a perk, team members are given a significant 3D printing allowance, which we used to 3D print a high-quality case for the RasPi camera. In this technology, sprayed plastic particles are hit with a laser so that they solidify, which results in very accurate and resilient products.


When it comes to the code side, at the beginning we struggled a lot with high latency:




and then experimented some more:




and even more:




but we quickly converged on a pretty resilient UDP-based stream. The code version that worked best for us in the end was the one presented in the Advanced Recipes section of the picamera docs, but with a UDP-based connection. You can see the code here. After this, we were consistently seeing good latency:




Measuring latency takes both hands.
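For reference, here is a minimal sketch of the kind of UDP-based JPEG streaming loop we converged on - not our exact code (that is linked above), and the server address, port and resolution here are placeholders.

    import io
    import socket
    import picamera

    SERVER = ("192.168.0.10", 5005)   # placeholder address of the GPU server

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

    with picamera.PiCamera() as camera:
        camera.resolution = (320, 240)
        camera.framerate = 24
        stream = io.BytesIO()
        # use_video_port=True trades a bit of image quality for much faster capture
        for _ in camera.capture_continuous(stream, format="jpeg", use_video_port=True):
            frame = stream.getvalue()
            if len(frame) < 65000:            # a JPEG this small fits in one UDP datagram
                sock.sendto(frame, SERVER)    # fire-and-forget: lost frames are simply skipped
            stream.seek(0)
            stream.truncate()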


Here is what the final user interface looks like:






What does the self-driving car see when it sees itself in the mirror?


* Second car *

After the first car wouldn't move under its own weight even with the lighter battery pack, it became clear we needed a new one. In the local shopping mall, this time in a proper toy shop, we found this car:




which is a bigger and slightly more advanced construction. In this car we can find some interesting mechanical pieces, such as a rear differential and slightly more advanced front steering. For now, I have resisted the temptation to tear it completely apart, but I think I will come back to it after we move to a bigger car. The disassembly yielded a similar result for our project - three wire pairs: one for the battery, one for the front steering engine and one for the rear drive engine.




As we didn't want to disassemble the front steering, we were unsure whether it would work in the same simple, linear way - that is, whether by just applying voltage to the front engine we could steer the car. We tested it by hand:




and luckily it worked, which meant we could reuse the previous engine control unit. We used the opportunity to clean up the control module a bit and fit it into a slightly tighter package.




Overall, the finished product looked like this:




After this we added the high-quality camera holder described above.




The black car turned out to be quite fast:




And it was fun to just drive it around.




* Remote control *

Of course, driving by wire didn't cut it for us.


We decided to stream the steering over WiFi; another viable alternative was Bluetooth. First we implemented an event-based controller that listens for keyboard events on the controller (server) side. If it catches any, it sends a message over a TCP socket to the car to update the remote state.


On the testing bench, it worked just fine:




But in practice, it was error-prone:




The basic WiFi dongle we had bought for the RasPi was small and had rather weak reception. It was easy to get into a spot where the data could not be transferred instantly. As TCP retries sending the data until successful, by the time the information arrived at the car, several new events might already have happened. The car did not feel very responsive.


We partially resolved this issue by changing the approach: in the new implementation, the algorithm listens for keyboard events and updates a local state based on them, and we send that state to the car all the time, over and over again. You can see the final code here and here.
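As a sketch of the "just keep re-sending the whole state" idea - not our actual code (which is linked above); here the state goes out as a small JSON blob over UDP, and the address and update rate are placeholders:

    import json
    import socket
    import time

    CAR = ("192.168.0.20", 6000)      # placeholder address of the RasPi on the car

    # The controller keeps one authoritative state dict, updated by keyboard events
    # elsewhere in the program; here it is just a static example.
    state = {"throttle": "forward", "steering": "left"}

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    while True:
        sock.sendto(json.dumps(state).encode("utf-8"), CAR)   # resend the full state
        time.sleep(0.05)                                      # ~20 updates per second

The nice property of this design is that a lost or delayed packet doesn't matter much: the next full-state update arrives a fraction of a second later anyway.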


These changes in code helped reliability a lot, but it still wasn't perfect. In places with weak WiFi reception, the car still didn't work flawlessly. We felt there was still room for improvement, so we tried the easiest thing there was and added a stronger WiFi dongle.




We went for a pentesting-style Alfa WiFi dongle with a reasonably big antenna. Look how well the dongle sits between the back engine and the wing. It's almost as if we planned it to be constructed that way!




But we did not. It's just a bunch of zipties.


This concluded the hardware part for now. This is how it looks when you (badly) drive the car around the track in the evening.




* Machine learning *

We have implemented a ConvNet-based behavioral cloning system that steers the car. It is a variation of Nvidia's widely used implementation. See the full code here.


The inspiration came from many places:


As shown in the diagram at the top of the blog post, based on a dataset of (image, steering) pairs sampled from a human driving record, we learn to predict the steering from the camera image. This is done using a convolutional neural network.


There are good resources galore on deep learning / neural nets online, so I won't go into detail about how this was implemented. Instead, I will focus on explaining the differences between our approach and Nvidia's.


First of all, in our case the steering is binary - each direction is either fully on or fully off. In a typical real-world self-driving car implementation you would gain control over a steer-by-wire system built by the car manufacturer. This usually means you gain control over pretty precise actuators that allow you to set the steering angle with consistency and precision. This was not the case in our model: currently we only have three steering states - steer full left, steer full right, or no steering. Additionally, when returning from full steer to the neutral position, the mechanism is often not completely accurate and returns only to a somewhat-neutral position, with a bias to one side that is noticeable when the car goes fast. Check the driving video linked above and compare how fast everything happens relative to driving a normal car.
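To make the "three steering states" point concrete, here is a rough sketch of what such a network could look like in Keras. The layer sizes and input shape are illustrative, loosely following the Nvidia-style stack, and are not our exact architecture (which is linked above).

    from keras.models import Sequential
    from keras.layers import Conv2D, Dense, Flatten

    # Input: a single grayscale camera frame; output: probabilities for the
    # three steering classes (full left, no steering, full right).
    model = Sequential([
        Conv2D(24, (5, 5), strides=(2, 2), activation="relu", input_shape=(66, 200, 1)),
        Conv2D(36, (5, 5), strides=(2, 2), activation="relu"),
        Conv2D(48, (5, 5), strides=(2, 2), activation="relu"),
        Conv2D(64, (3, 3), activation="relu"),
        Flatten(),
        Dense(100, activation="relu"),
        Dense(50, activation="relu"),
        Dense(3, activation="softmax"),
    ])
    model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])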


Secondly, in our implementation we only use one camera, whereas Nvidia's implementation uses three cameras that are generally forward facing, but at slightly different angles. This trick is done to get rid of a problem specific to behavioral cloning: the overrepresentation of idealized trajectories in the training data. If you consider a person driving a car on a road, even on a closed-off private road, samples of the car going off-road, or near-crashing and recovering at the last moment, are extremely rare compared to calm driving in the middle of the road. However, for the algorithm both kinds of samples are equally important. The self-driving system needs to know what to do when it starts drifting away from the middle of the track.




I have copy-pasted this image from Sergey Levine's slides for one of the opening lectures of Berkeley CS294-112: Deep Reinforcement Learning - an incredibly high-quality online course generously offered for free by the university.


You can see in this image that during training, we have explored the black trajectory. Due to inherent randomness, we slightly diverge from this well-known trajectory - for example, the car steers a bit more to the right than usual. The correct response would be to correct and go back to the middle of the track, but there is no data like this in the training dataset (or there is, but it is not frequent enough to be picked up by the machine learning).


The solution to this is to build a distribution of cost (and therefore, of desired behaviours) around the optimal trajectory:




Image again courtesy of UC Berkeley and Prof. Sergey Levine.


There are several ways one could go about producing such a distribution.


One could try to gather a lot more data, especially taking care to sample trajectories that are away from the optimal one. In the case of the car this could mean turning off recording, steering slightly off the center of the road, turning recording back on, and correcting the steering back to the center. Another idea would be to drive over the same piece of road many times, taking care to drive a bit to the left, then a bit to the right, then more to the right, etc. As you can see, these solutions are doable but not extremely practical, so there is not always an easy way to just gather more data.


One could instead use the trick that Nvidia employed. They set up two cameras at slight angles to the left and right, in addition to the central forward-facing one. Then they correct the steering commands that are recorded as the pair for each sideways-facing image. So for example, if they are driving straight forward, the right-facing camera will record a picture as if the car were heading to the right of the road, and a steering correction of the same angle to the left is added to the current (fully straight) steering. This helps the machine learning algorithm understand that it needs to steer left when it sees such an image.
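A rough sketch of how such a correction could be applied when building the training set - the correction value here is a made-up constant, and in a real setup it would have to be tuned to the actual camera geometry:

    # Sign convention assumed here: positive steering = right, negative = left.
    CORRECTION = 0.15   # made-up offset, in the same units as the steering angle

    def expand_sample(center_img, left_img, right_img, steering):
        """Turn one three-camera sample into three (image, steering) training pairs."""
        return [
            (center_img, steering),               # centre view: keep the recorded steering
            (left_img,   steering + CORRECTION),  # left view looks off to the left -> steer right
            (right_img,  steering - CORRECTION),  # right view looks off to the right -> steer left
        ]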




In our case the problem is less severe than in the case of the real self-driving car. This is because:


We have also added some more layers to the NVIDIA architecture, but I don't think it should make a big difference (as the structure of the data we have is relatively simple).


One more difference is purely architectural - NVIDIA runs the deep learning directly on the car, using the cutting-edge (but super expensive) NVIDIA DRIVE PX computer for autonomous driving, while we run the network on a GPU server and stream data back and forth over WiFi. NVIDIA claims that they got 30 fps, and we got 25 fps, so I would say not too bad.


In general the system works well on the track it was trained on, but the performance doesn't generalize to a more complicated, tighter track. We will run some experiments with training on the new track soon. My guess is that performance should improve steadily with the amount of training data gathered.


You can see that even on the vanilla one-turn track the driving is not perfect - the car doesn't always stay inside the track - but with the current controls that may simply not be possible. It should be achievable with a car that has more precise, preferably non-binary left-right steering.


* Machine learning development story *

From the beginning of the development, we were pretty sure that our approach should work - we knew it had been applied successfully in real-life applications, and a very similar project (but inside a simulator) was part of the Udacity self-driving car course. Still, getting it right required some degree of patience and creative tweaks.


After the hardware construction & testing phase, we had a real-time, low-latency system where you drive the car around the track based solely on the video input.


The final user interface looks like this:




Here you can see a previous iteration of it being tested carefully by another friend.




So we started off by gathering data on one big track, around 10 meters long, around the whole house. In this iteration, everything that we did was "on camera" and got appended to the dataset.


We trained a neural net the same as in the Nvidia paper. It didn't work! And by "didn't work" we don't mean running it once and declaring failure. In general, we made at least three optimization runs for each setting of architecture / metaparameters, to rule out the effect of random initialization.


After the initial attempt didn't work, we decided to simplify the problem to just one turn. We gathered a lot of data once again - more or less two sessions of two hours each. The raw collected pictures look like this:




After training the network we found that, again, it didn't work. We had a brainstorming session and decided that the most effective thing would be to implement a manual data adjuster. We were thinking that perhaps there was too much error in our training data (that we were driving too far outside of the lane because things were happening too fast, and that the start and end of the track - with zero velocity but nonzero steering - were overrepresented). We implemented some additional infrastructure to help debug this kind of data error - a data viewer / editor, the ability to turn recording on and off during data collection, and a dataset combiner. Then we manually went through all of the data and corrected what we thought were inconsistencies. This also didn't work...




Then we spent even more time looking at the data, and only then did we notice that the horizon was located differently in different parts of the dataset. That meant our camera was moving around too much, introducing significant noise into the data. We fixed the camera placement, added a guiding line to the driver UI and collected more data. Still didn't work!


Then we had the breakthrough, which was noticing that we should convert to grayscale and implement brightness and contrast adjustments, to correct for differing lighting conditions. An example of our final augmentation pipeline is visible below:
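In code terms, this kind of adjustment looks roughly like the following sketch (the parameter ranges are illustrative, not our exact values):

    import cv2
    import numpy as np

    def augment(frame):
        """Grayscale + random brightness/contrast jitter for one BGR camera frame."""
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        alpha = np.random.uniform(0.7, 1.3)        # contrast factor
        beta = np.random.uniform(-30, 30)          # brightness shift
        jittered = np.clip(gray.astype(np.float32) * alpha + beta, 0, 255)
        return jittered.astype(np.uint8)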




Then it finally worked! Somewhat... The car doesn't stay perfectly inside the track (everything happens so much faster compared to a real-life car), and we only tested the whole project thoroughly on the one-turn track.


We can squeeze out / cheat a run that doesn't exit the track on a more curvy, tighter track, but it doesn't really count. Still, it shows some degree of generalization. We think that if you just gather more data, the current architecture should work.




* Lessons *



* Plans for the future *

We have many ideas for the future. We want to try some of the ideas listed below, but our priority is getting the car to work very well on a track more complex than a single turn.