How to build a small self-driving car?

2017 · Amsterdam


My recent side project is building a small self-driving car with my friend Filip. See the code on my github.


There is a fair number of folks doing similar projects, and there are even people who race these things. If it looks interesting, you could do it too - it is easier than ever and I'm sure you'd learn a lot.






* Idea *

The car works in the following way:


The diagram below visualizes the flow of data:




To enable all of this, various pieces of software infrastructure and hardware are necessary.


On the GPU server side we need: Each of the above is a multiprocessing.Process, and they share state through a multiprocessing.Namespace (see the sketch at the end of this section). On the RasPi side we need:


On the hardware side there are the following elements:


In the text below I will explain how we figured out how to combine all of them into a small self-driving car, and hopefully convey that it was a lot of fun - and that perhaps we have even learned a little.
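As a side note on the server-side plumbing mentioned above, here is a minimal sketch of how several multiprocessing.Process workers can share state through a multiprocessing.Namespace. The worker names and fields are made up for illustration; they are not our actual process layout.

    import time
    from multiprocessing import Manager, Process

    def camera_receiver(state):
        # Hypothetical worker: pretend to receive frames and publish the latest one.
        for i in range(5):
            state.latest_frame = "frame-%d" % i
            time.sleep(0.1)

    def steering_predictor(state):
        # Hypothetical worker: read the shared state and produce a steering command.
        for _ in range(5):
            state.steering = "left" if state.latest_frame else "none"
            time.sleep(0.1)

    if __name__ == "__main__":
        state = Manager().Namespace()   # shared state visible to all processes
        state.latest_frame = None
        state.steering = "none"
        workers = [Process(target=camera_receiver, args=(state,)),
                   Process(target=steering_predictor, args=(state,))]
        for w in workers:
            w.start()
        for w in workers:
            w.join()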


* Before the first car *

The whole story started when I finally had time to learn some electronics. I have always wanted to understand it, as it delivers such exceptional amounts of value and - in my book - is as close to magic as it gets.


There are so many cool books about electronics, online write-ups, YouTube videos and electronics simulators, but what I found most practical for the car project was this book targeted at young teens:




It is so good. It lacks the proper theory build-up, so you will have to read up on that somewhere else if you are interested, but for the car project it should be sufficient. After some lazy evenings with a soldering iron I was able to reverse engineer and debug basic electronic circuits such as this "astable multivibrator", to use the proper name:




* The first car era *

Armed with a rudimentary knowledge of electronics, it was time to reverse engineer how an RC car works. As I generally had little idea what I was doing, I decided to get the cheapest one I could find so I wouldn't be too wasteful if I broke it while disassembling it. The cheapest one I could find was this little marvel:




which I bought in the Dutch bargain store Action.nl, new, for under 5 euros.


Deconstructing it, you just cannot help wondering how simply and elegantly it is designed. It truly amazes me that for under 5 euros per unit people are able to design, fabricate, package, ship and sell this toy. It is possible mostly due to an amusingly "to the point", "brutalist" design, where most of the elements are right at their tolerances and the desired effects are achieved in the simplest possible way. This approach is best showcased by the design of the front-wheel steering system, which I will describe below.


Coming back to the self-driving project, when we pop the body off the frame, we see this:




Luckily for our project, this construction can be adapted to our needs in a straightforward fashion. In the center we see the logic PCB. There are three pairs of red & black wires attached to the PCB:


When the logic board receives radio inputs, it (more or less) connects the battery wires to the engine wires and the car moves.




You can see this for yourself - if you directly connect the pair from the battery to the pair coming from one of the engines (front or back), you see the wheels move in the expected direction. So if we substitute the current logic board with our own custom RasPi-based logic board that controls the connection between the battery and the engines based on instructions we control in code, we would be able to drive by code.


Warning!

As the Raspberry Pi has some General Purpose Input/Output (GPIO) pins, it is tempting to try to hook up the engine directly to the RasPi. This is a very bad idea, which will destroy your RasPi by running too much current through it.


Thus, we need a hardware engine controller which, based on signals from the RasPi, will drive the motors. You can buy a ready-made motor controller or be thrifty and implement your own. As conceptually the wanted mechanism of action sounds like a transistor - "use a small amount of electricity to switch on the flow of a bigger amount" - I decided to build my own engine controller. Following this tutorial produced an engine controller based on the L293D chip.




The tutorial worked - at least partially. I was able to steer the front wheels and move the car forward, but there was no reverse gear.




It turns out the method shown in the above tutorial couldn't spin the second engine in the reverse direction. After consulting the chip's datasheet I came up with another way of hooking up the L293D, which fixed the bug. Now I could control the car completely from Python code running on the RasPi.
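For illustration, here is a minimal sketch of driving one motor through an L293D from Python on the RasPi. It assumes a wiring where two GPIO pins go to the chip's two input pins for that motor and the enable pin is tied high; the BCM pin numbers are made up, not necessarily the ones we used.

    import time
    import RPi.GPIO as GPIO

    # Hypothetical BCM pin numbers wired to the L293D inputs for one motor.
    IN1, IN2 = 23, 24

    GPIO.setmode(GPIO.BCM)
    GPIO.setup(IN1, GPIO.OUT)
    GPIO.setup(IN2, GPIO.OUT)

    def forward():
        GPIO.output(IN1, GPIO.HIGH)   # one input high, the other low -> spin one way
        GPIO.output(IN2, GPIO.LOW)

    def reverse():
        GPIO.output(IN1, GPIO.LOW)    # swap the inputs -> spin the other way
        GPIO.output(IN2, GPIO.HIGH)

    def stop():
        GPIO.output(IN1, GPIO.LOW)    # both inputs low -> motor off
        GPIO.output(IN2, GPIO.LOW)

    forward(); time.sleep(1)
    reverse(); time.sleep(1)
    stop()
    GPIO.cleanup()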


Now that I had a car that could be controlled by a keyboard attached to the RasPi, it was time to make the control remote. I connected a basic small WiFi dongle to the Raspberry and implemented a pair of Python scripts:


* Steering mechanism in first car *

Now I will describe how steering is implemented mechanically in the first RC car we worked with.


Let me remark here that I am impressed by how simple this mechanism is. Perhaps the reason I am so excited about it is that I have only a vague idea about how mechanical design works, so I am easy to impress, and perhaps stuff like this gets done every day, who knows. But I see an element of playful mastery in it and I can't help feeling happy about it.


The mechanism is hidden if you look from the front




and similarly remains hidden if you look at it from the top.




It is only after removing the top cover that we see the following mechanism (I have removed all the plastic elements to help visibility):




Seen from the top:




And here comes the trick:


On top of the shaft there is a pinion. The pinion is just slid onto the shaft, without any kind of glue. The size and tightness of the pinion are chosen in such a way that when the engine starts moving, the friction between the shaft and the pinion is just enough to move the half-circle rack.




But when the half-circle rack hits the maximum extension point, the friction breaks and the shaft spins around inside the pinion. Is there a cheaper trick (speaking both symbolically and literally)? How ingenious!




Below the semicircular rack there is a spring that pulls the wheels back to the neutral position:




You can see it better in this close-up:




A small plastic element is connected to the wheels. It has a small peg that sits in the middle of the spring, so the spring can act on it:




So, as you can see, it is just a collection of the simplest possible elements, each doing its job well. Everything is super cheap, but the materials and their shapes are chosen in a smart way that makes it all work together nicely. So cool!


That wraps up the description of the "brutalist" steering mechanism. Back to self-driving!


* Hardware coming together *

When we already had the "drive by keyboard attached to the RasPi by cable" capability, we wanted to test remote control of the car. In order to do that, we needed to power the RasPi. Not being sure how much power it needs, we bought a pretty big mobile-phone power bank with a capacity of 10,000 mAh, weighing around 400 g.


I am a huge fan of the American automotive show Roadkill, in which two guys, mostly by themselves, fix extremely clapped-out classic American cars, usually on the road / in a WalMart parking lot / in a junkyard. Typically they take some rotten engine-less chassis and put the cheapest V8 they can find into it. Needless to say, they are masters of using zip-ties.


Inspired by Freiburger and Finnegan's experiences, I relied on zip-tie engineering to combine the collection of parts into one coherent vehicle that would move together. The end product looked like this:










You may say that it looks pretty, but looking pretty is about all it did. Notice how big the battery is in relation to the car. It turns out it was way too big. Unfortunately, the car was far too heavy to move on its own.




So the heaviest part of the car was the battery pack powering the RasPi. After gaining some initial experience it became clear that 10,000 mAh is far more than enough to run the RasPi, even with a WiFi interface, for a reasonable amount of time. Thus the easiest way to save weight was to buy a much lighter battery with less capacity.
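For a rough sanity check (my back-of-the-envelope numbers, not a measurement): a RasPi with a WiFi dongle draws somewhere in the ballpark of 0.5-1 A at 5 V, so even in the pessimistic case a 10 Ah pack gives on the order of 10 Ah / 1 A ≈ 10 hours - far more than any single driving session needs.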


We were really lucky about the Pokemon Go craze that had rolled through the Netherlands in early 2017, as it meant an abundant supply of cheap mobile battery packs. We quickly sourced a new one, and with the new battery, the car looked like this:






Unfortunately, the car was still way too weak to drive well with the RasPi strapped to its back. We needed a bigger car, but before I describe it, I will show how we put together the image capture and streaming system.


* Image capture and streaming *

Building this subsystem was a big loop of trial and error. In the end we settled on the RPi Camera on the hardware side and the amazing picamera Python module on the software side.




An interesting detail here is the camera holder. Filip works at 3D Hubs, an Amsterdam-based 3D printing company. As a perk, team members are given a significant 3D printing allowance, which we used to 3D print a high-quality case for the RasPi camera. In this technology, sprayed plastic particles are hit with a laser so that they solidify, which results in very accurate and resilient products.


When it comes to the code side, at the beginning we struggled a lot with high latency:




and then experimented some more:




and even more:




but we quickly converged on a pretty resilient UDP-based stream. The code version that worked best for us in the end was the one presented in the Advanced Recipes section of the picamera docs, but with a UDP-based connection. You can see the code here. After this, we were consistently seeing good latency:




Measuring latency takes both hands.
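For reference, here is a minimal sketch of the kind of UDP-based JPEG streaming loop we converged on - not our exact code (that is linked above), and the server address, port and resolution here are placeholders.

    import io
    import socket
    import picamera

    SERVER = ("192.168.0.10", 5005)   # placeholder address of the GPU server

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

    with picamera.PiCamera() as camera:
        camera.resolution = (320, 240)
        camera.framerate = 24
        stream = io.BytesIO()
        # use_video_port=True trades a bit of image quality for much faster capture
        for _ in camera.capture_continuous(stream, format="jpeg", use_video_port=True):
            frame = stream.getvalue()
            if len(frame) < 65000:            # a JPEG this small fits in one UDP datagram
                sock.sendto(frame, SERVER)    # fire-and-forget: lost frames are simply skipped
            stream.seek(0)
            stream.truncate()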


Here is what the final user interface looks like:






What does the self-driving car see when it sees itself in the mirror?


* Second car *

After the first car wouldn't move under its own weight even with the lighter battery pack, it became clear we needed a new one. In the local shopping mall, this time in a proper toy shop, we found this car:




which is a bigger and slightly more advanced construction. In this car we can find some interesting mechanical pieces, such as a rear differential and slightly more advanced front steering. For now, I have resisted the temptation to tear it completely apart, but I think I will come back to it after we move to a bigger car. The disassembly yielded a similar result for our project - three wire pairs: one for the battery, one for the front steering engine and one for the rear drive engine.




As we didn't want to disassemble the front steering, we were unsure whether it would work in the same simple, linear way - that is, whether by just applying voltage to the front engine we could steer the car. We tested it by hand:




and luckily it worked, which meant we could reuse the previous engine control unit. We used the opportunity to clean up the control module a bit and fit it into a slightly tighter package.




Overall, the finished product looked like this:




After this we added the high-quality camera holder described above.




The black car turned out to be quite fast:




And it was fun to just drive it around.




* Remote control *

Of course, driving by wire didn't cut it for us.


We decided to stream the steering over WiFi; another viable alternative was Bluetooth. First we implemented an event-based controller that listens for keyboard events on the controller (server) side. If it catches any, it sends a message over a TCP socket to the car to update the remote state.


On the testing bench, it worked just fine:




But in practice, it was error-prone:




The basic WiFi dongle we had bought for the RasPi was small and had rather weak reception. It was easy to get into a spot where the data could not be transferred instantly. As TCP retries sending the data until successful, by the time the information arrived at the car, several new events might already have happened. The car did not feel very responsive.


We partially resolved this issue by changing the approach: in the new implementation, the algorithm listens for keyboard events and updates a local state based on them, and we send that state to the car all the time, over and over again. You can see the final code here and here.
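As a sketch of the "just keep re-sending the whole state" idea - not our actual code (which is linked above); here the state goes out as a small JSON blob over UDP, and the address and update rate are placeholders:

    import json
    import socket
    import time

    CAR = ("192.168.0.20", 6000)      # placeholder address of the RasPi on the car

    # The controller keeps one authoritative state dict, updated by keyboard events
    # elsewhere in the program; here it is just a static example.
    state = {"throttle": "forward", "steering": "left"}

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    while True:
        sock.sendto(json.dumps(state).encode("utf-8"), CAR)   # resend the full state
        time.sleep(0.05)                                      # ~20 updates per second

The nice property of this design is that a lost or delayed packet doesn't matter much: the next full-state update arrives a fraction of a second later anyway.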


These changes in code helped reliability a lot, but it still wasn't perfect. In places with weak WiFi reception, the car still didn't work flawlessly. We felt there was still room for improvement, so we tried the easiest thing there was and added a stronger WiFi dongle.




We went for a pentesting-style Alfa WiFi dongle with a reasonably big antenna. Look how well the dongle sits between the back engine and the wing. It's almost as if we planned it to be constructed that way!




But we did not. It's just a bunch of zipties.


This concluded the hardware part for now. This is how it looks when you (badly) drive the car around the track in the evening.




* Machine learning *

We have implemented a ConvNet-based behavioral cloning system that steers the car. It is a variation of Nvidia's widely used implementation. See the full code here.


The inspiration came from many places:


As shown in the diagram at the top of the blog post, based on a dataset of (image, steering) pairs sampled from a human driving record, we learn to predict the steering from the camera image. This is done using a convolutional neural network.


There are good resources galore on deep learning / neural nets online, so I won't go into detail about how this was implemented. Instead, I will focus on explaining the differences between our approach and Nvidia's.


First of all, in our case the steering is binary - each direction is either fully on or fully off. In a typical real-world self-driving car implementation you would gain control over a steer-by-wire system built by the car manufacturer. This usually means you gain control over pretty precise actuators that allow you to set the steering angle with consistency and precision. This was not the case in our model: currently we only have three steering states - steer full left, steer full right, or no steering. Additionally, when returning from full steer to the neutral position, the mechanism is often not completely accurate and returns only to a somewhat-neutral position, with a bias to one side that is noticeable when the car goes fast. Check the driving video linked above and compare how fast everything happens relative to driving a normal car.
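To make the "three steering states" point concrete, here is a rough sketch of what such a network could look like in Keras. The layer sizes and input shape are illustrative, loosely following the Nvidia-style stack, and are not our exact architecture (which is linked above).

    from keras.models import Sequential
    from keras.layers import Conv2D, Dense, Flatten

    # Input: a single grayscale camera frame; output: probabilities for the
    # three steering classes (full left, no steering, full right).
    model = Sequential([
        Conv2D(24, (5, 5), strides=(2, 2), activation="relu", input_shape=(66, 200, 1)),
        Conv2D(36, (5, 5), strides=(2, 2), activation="relu"),
        Conv2D(48, (5, 5), strides=(2, 2), activation="relu"),
        Conv2D(64, (3, 3), activation="relu"),
        Flatten(),
        Dense(100, activation="relu"),
        Dense(50, activation="relu"),
        Dense(3, activation="softmax"),
    ])
    model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])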


Secondly, in our implementation we only use one camera, whereas Nvidia's implementation uses three cameras that are generally forward facing, but at slightly different angles. This trick is done to get rid of a problem specific to behavioral cloning: the overrepresentation of idealized trajectories in the training data. If you consider a person driving a car on a road, even on a closed-off private road, samples of the car going off-road, or near-crashing and recovering at the last moment, are extremely rare compared to calm driving in the middle of the road. However, for the algorithm both kinds of samples are equally important. The self-driving system needs to know what to do when it starts drifting away from the middle of the track.




I have copy-pasted this image from Sergey Levine's slides for one of the opening lectures of Berkeley CS294-112: Deep Reinforcement Learning - an incredibly high-quality online course generously offered for free by the university.


You can see in this image that during training, we have explored the black trajectory. Due to inherent randomness, we slightly diverge from this well-known trajectory - for example, the car steers a bit more to the right than usual. The correct response would be to correct and go back to the middle of the track, but there is no data like this in the training dataset (or there is, but it is not frequent enough to be picked up by the machine learning).


The solution to this is to build a distribution of cost (and therefore, of desired behaviours) around the optimal trajectory:




Image again courtesy of UC Berkeley and Prof. Sergey Levine.


There are several ways one could go about producing such a distribution.


One could try to gather a lot more data, especially taking care to sample trajectories that are away from the optimal one. In the case of the car this could mean turning off recording, steering slightly off the center of the road, turning recording back on, and correcting the steering back to the center. Another idea would be to drive over the same piece of road many times, taking care to drive a bit to the left, then a bit to the right, then more to the right, etc. As you can see, these solutions are doable but not extremely practical, so there is not always an easy way to just gather more data.


One could instead use the trick that Nvidia employed. They set up two cameras at slight angles to the left and right, in addition to the central forward-facing one. Then they correct the steering commands that are recorded as the pair for each sideways-facing image. So for example, if they are driving straight forward, the right-facing camera will record a picture as if the car were heading to the right of the road, and a steering correction of the same angle to the left is added to the current (fully straight) steering. This helps the machine learning algorithm understand that it needs to steer left when it sees such an image.
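A rough sketch of how such a correction could be applied when building the training set - the correction value here is a made-up constant, and in a real setup it would have to be tuned to the actual camera geometry:

    # Sign convention assumed here: positive steering = right, negative = left.
    CORRECTION = 0.15   # made-up offset, in the same units as the steering angle

    def expand_sample(center_img, left_img, right_img, steering):
        """Turn one three-camera sample into three (image, steering) training pairs."""
        return [
            (center_img, steering),               # centre view: keep the recorded steering
            (left_img,   steering + CORRECTION),  # left view looks off to the left -> steer right
            (right_img,  steering - CORRECTION),  # right view looks off to the right -> steer left
        ]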




In our case the problem is less severe than in the case of the real self-driving car. This is because:


We have also added some more layers to the NVIDIA architecture, but I don't think it should make a big difference (as the structure of the data we have is relatively simple).


One more difference is purely architectural - NVIDIA runs the deep learning directly on the car, using the cutting-edge (but super expensive) NVIDIA DRIVE PX computer for autonomous driving, while we run the network on a GPU server and stream data back and forth over WiFi. NVIDIA claims that they got 30 fps, and we got 25 fps, so I would say not too bad.


In general the system works well on the track it was trained on, but the performance doesn't generalize to a more complicated, tighter track. We will run some experiments with training on the new track soon. My guess is that performance should improve steadily with the amount of training data gathered.


You can see that even on the vanilla one-turn track the driving is not perfect - the car doesn't always stay inside the track - but with the current controls that may simply not be possible. It should be achievable with a car that has more precise, preferably non-binary left-right steering.


* Machine learning development story *

From the beginning of the development, we were pretty sure that our approach should work - we knew it had been applied successfully in real-life applications, and a very similar project (but inside a simulator) was part of the Udacity self-driving car course. Still, getting it right required some degree of patience and creative tweaks.


After the hardware construction & testing phase, we had a real-time, low-latency system where you drive the car around the track based solely on the video input.


The final user interface looks like this:




Here you can see a previous iteration of it being tested carefully by another friend.




So we started off by gathering data on one big track, around 10 meters long, around the whole house. In this iteration, everything that we did was "on camera" and got appended to the dataset.


We trained a neural net the same as in the Nvidia paper. It didn't work! And by "didn't work" we don't mean running it once and declaring failure. In general, we made at least three optimization runs for each setting of architecture / metaparameters, to rule out the effect of random initialization.


After the initial attempt didn't work, we decided to simplify the problem to just one turn. We gathered a lot of data once again - more or less two sessions of two hours each. The raw collected pictures look like this:




After training the network we found that, again, it didn't work. We had a brainstorming session and decided that the most effective thing would be to implement a manual data adjuster. We were thinking that perhaps there was too much error in our training data (that we were driving too far outside of the lane because things were happening too fast, and that the start and end of the track - with zero velocity but nonzero steering - were overrepresented). We implemented some additional infrastructure to help debug this kind of data error - a data viewer / editor, the ability to turn recording on and off during data collection, and a dataset combiner. Then we manually went through all of the data and corrected what we thought were inconsistencies. This also didn't work...




Then we spent even more time looking at the data, and only then did we notice that the horizon was located differently in different parts of the dataset. That meant our camera was moving around too much, introducing significant noise into the data. We fixed the camera placement, added a guiding line to the driver UI and collected more data. Still didn't work!


Then we had the breakthrough, which was noticing that we should convert to grayscale and implement brightness and contrast adjustments, to correct for differing lighting conditions. An example of our final augmentation pipeline is visible below:
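In code terms, this kind of adjustment looks roughly like the following sketch (the parameter ranges are illustrative, not our exact values):

    import cv2
    import numpy as np

    def augment(frame):
        """Grayscale + random brightness/contrast jitter for one BGR camera frame."""
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        alpha = np.random.uniform(0.7, 1.3)        # contrast factor
        beta = np.random.uniform(-30, 30)          # brightness shift
        jittered = np.clip(gray.astype(np.float32) * alpha + beta, 0, 255)
        return jittered.astype(np.uint8)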




Then it finally worked! Somewhat... The car doesn't stay perfectly inside the track (everything happens so much faster compared to a real-life car), and we only tested the whole project thoroughly on the one-turn track.


We can squeeze out / cheat a run that doesn't exit the track on a more curvy, tighter track, but it doesn't really count. Still, it shows some degree of generalization. We think that if you just gather more data, the current architecture should work.




* Lessons *



* Plans for the future *

We have many ideas for the future. We want to try some of the ideas listed below, but our priority is getting the car to work very well on a track more complex than a single turn.