Besides investing mildly in new hardware so I can keep wasting my time, I also did some real hacking, or at least prepared for it.

During a long phone call with a good friend I got an interesting idea. There’s a company that offers modules for almost all BMW models with which you can turn the car into your own custom light organ. Want to randomly blind people by switching on the high beams when unlocking it? Just program their 150-buck module to do so. Yes, people are buying such stuff and paying big money for it. While I understand that for the generations where BMW used a proprietary IBus interface for all their comfort modules, I’m a bit surprised there’s no competition for this on CAN bus based models. Also, no one is offering such nonsense for other manufacturers.

Yeah, you get it!

As there is a nice Mercedes Benz in my garage I got the idea to tap into its interior CAN bus and fool around there a little. Before you do something like that you should have an idea of what you are about to do and how the stuff you’re going to interfere with works. So let’s take a little lesson on the CAN bus.

Basically, in the automotive world most manufacturers use a two-wire CAN bus, or rather several of them. The common car has at least three. One is for the OBD II diagnostic port; it exists just for this purpose and is provided by the main engine control unit. It features some standard messages for diagnostic purposes but is practically limited to that. The second is the actual engine bus that’s used by the important systems, like engine, transmission and safety systems. The third is the interesting one, used for everything that’s not highly mission critical, like lights, radio, AC, seats and such. The engine and interior CAN are, at least for most Mercedes cars, loosely connected by the instrument panel (or whatever you call this thing in English). The instrument panel is connected to both the engine bus and the interior bus and acts as a bridge for certain kinds of messages, like speed or steering wheel position. What exactly it bridges and what not I’ll have to find out.

So we now know the topology. The CAN used for the engine typically runs at 500 kbit/s, while, at least for older interior buses, 125 kbit/s is common. But how do the whole CAN shenanigans work?

First, it is quite a bit different from the usual bus and network connections you know from computers. It features a different concept of addressing and also of arbitration and collision management. Physically it is a bus of two wires with differential signaling on those two lines and a 60 Ohm termination, made up of one 120 Ohm terminator on each end. The transceivers typically run from 5 V; iirc, on high-speed CAN the two lines both rest at about 2.5 V and get pulled roughly 2 V apart to signal. There is no separate clock line, so every device has to recover the bit timing from the signal edges themselves; you’ll see why this matters more here than elsewhere. To guarantee enough edges, bit stuffing is used. Ones and zeros get signaled by pulling the bus to its dominant level (a zero) or leaving it at its recessive level (a one); the actual signal is the level, not a change of level. So three ones in a row are just one long recessive stretch without any edge in it. Therefore, after five identical bits in a row, one inverted bit is inserted to ensure the devices stay in sync with their idea of what the clock is.
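To make the bit stuffing rule a bit more tangible, here is a minimal Python sketch. It works on plain lists of 0s and 1s rather than real bus levels, and the function names stuff_bits and unstuff_bits are just made up for this illustration, not taken from any CAN library.

```python
def stuff_bits(bits):
    """Insert an inverted bit after every run of five identical bits."""
    out = []
    run_bit, run_len = None, 0
    for b in bits:
        out.append(b)
        if b == run_bit:
            run_len += 1
        else:
            run_bit, run_len = b, 1
        if run_len == 5:
            stuffed = 1 - b
            out.append(stuffed)
            # The stuff bit itself counts as the first bit of the next run.
            run_bit, run_len = stuffed, 1
    return out


def unstuff_bits(bits):
    """Remove the stuff bits again; raise if a stuff bit is not inverted."""
    out = []
    run_bit, run_len = None, 0
    expect_stuff = False
    for b in bits:
        if expect_stuff:
            if b == run_bit:
                raise ValueError("stuff error: six identical bits in a row")
            run_bit, run_len = b, 1
            expect_stuff = False
            continue
        out.append(b)
        if b == run_bit:
            run_len += 1
        else:
            run_bit, run_len = b, 1
        if run_len == 5:
            expect_stuff = True
    return out


if __name__ == "__main__":
    raw = [1, 1, 1, 1, 1, 1, 0, 0]
    print(stuff_bits(raw))
```

A receiver that sees six identical bits in a row inside a frame knows something went wrong, which is what the ValueError stands in for here.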

So how does addressing devices work?

Not at all! CAN devices have no device address. The protocol consists mainly of message IDs and payloads for those IDs. So if you want to talk to a specific device on the bus, you just need to send a message with an ID that device listens to. If you want to listen to everything a certain device says, you need to know all message IDs it uses. This scheme is actually quite beneficial for tapping into such a bus, as there’s no way for any other device to detect the presence of a pure listener or to distinguish between messages sent by two devices. If my device uses the same message ID as another one, nobody on the bus has any idea of that. The only option would be that devices listen for the IDs they use themselves to see if anyone else uses them.
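A toy model of this ID-based scheme might look like the following Python sketch. Everything in it is hypothetical: the Node class, the broadcast helper and the IDs 0x1A0 and 0x2B4 are invented for the example and have nothing to do with real Mercedes messages.

```python
class Node:
    """A bus participant that only cares about certain message IDs."""

    def __init__(self, name, wanted_ids):
        self.name = name
        self.wanted_ids = set(wanted_ids)
        self.received = []

    def on_frame(self, can_id, data):
        # There is no destination address; each node filters by ID itself.
        if can_id in self.wanted_ids:
            self.received.append((can_id, data))


def broadcast(nodes, can_id, data):
    """Every frame reaches every node; the sender is not identified."""
    for node in nodes:
        node.on_frame(can_id, data)


if __name__ == "__main__":
    radio = Node("radio", [0x1A0])             # hypothetical ID
    sniffer = Node("sniffer", range(0x800))    # accepts all 11-bit IDs
    bus = [radio, sniffer]
    broadcast(bus, 0x1A0, bytes([0x01]))
    broadcast(bus, 0x2B4, bytes([0xFF, 0x00]))  # hypothetical ID
    print(len(radio.received), len(sniffer.received))
```

Note that broadcast takes no sender argument at all, which mirrors why nobody can tell two senders of the same ID apart.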

How are collisions handled?

Whenever you have a bus, and especially if it is half duplex, you have to somehow handle situations where more than one device wants to say something. You might know that for classic shared Ethernet it’s basically that when two start talking, both shut up and try again after a random timeout. That’s done for a reason but it has one big downside: there is no guarantee that anything can be sent in a finite amount of time. Yeah sure, in practice it works out, but in theory nothing is guaranteed, and you can solve this in other ways. You could e.g. have a master, like the USB host controller or a cell tower, assign time slots and manage it this way, but that would not be very fail-safe, as you would at least have to define what happens if the master fails. Your options for this are, in short, failure or complex stuff.

CAN does not have a collision detection approach, but a collision resolution approach. Message IDs not only distinguish between different kinds of information, they are also responsible for determining the priority of a message. The higher priority message always wins during the arbitration period and gets transmitted, while the lower priority message is held back and has to be transmitted later. Now we’ll see why a steady clock for all participants is important. If two participants want to send a message, they start at the same time and also listen to the bus. While both send the same bits, they have no idea whether someone else is also sending or not. Once a bit differs, one has to pull the bus low and one has to leave the level alone to signal the opposite bit. The one who leaves the bus alone sees it being pulled down, backs off and loses arbitration. The other one, and anybody else on the bus, has no idea that there ever was someone else intending to send and just goes on. Since pulling the bus down signals a zero, the numerically lower ID has the higher priority. The fact that one bit has to differ is guaranteed by ensuring that all devices use different message IDs for sending.
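The arbitration described above can be sketched in a few lines of Python. This is a simplified model, assuming standard 11-bit IDs sent most significant bit first and a bus where any sender pulling it low (a zero) overrides everyone who leaves it alone (a one). The function name arbitrate and the example IDs are made up.

```python
def arbitrate(ids, width=11):
    """Return the ID that wins bitwise arbitration among unique IDs."""
    contenders = set(ids)
    for bit_pos in range(width - 1, -1, -1):  # most significant bit first
        sent = {i: (i >> bit_pos) & 1 for i in contenders}
        # Any node sending 0 pulls the bus low, so the bus carries the minimum.
        bus_level = min(sent.values())
        # Nodes that sent 1 but read back 0 notice the mismatch and back off.
        contenders = {i for i in contenders if sent[i] == bus_level}
    (winner,) = contenders  # exactly one node is left when all IDs differ
    return winner


if __name__ == "__main__":
    waiting = [0x244, 0x0A5, 0x7FF]  # hypothetical message IDs
    print(hex(arbitrate(waiting)))
```

The winner never sees a mismatch between what it sent and what it read back, so from its point of view nothing special happened at all.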

So this way you can avoid congestion at the cost of delaying or suppressing messages with lower priority. This makes perfect sense in an automotive environment, where you can live with increased latency or the loss of unimportant information. Who’s gonna care about the latency of the power window button while your anti-lock brakes are kicking in? And besides, as I already mentioned, power windows and such are separated from the critical bus, so a more realistic example would be something like adjusting ignition timing while you’re sliding at a 90 degree angle with ESP trying to keep you from spinning like a dreidel. And I’m also sure the capacity of the engine CAN is high enough to not have any relevant latency even in this case.

So what does this tell us in regard to reverse engineering a Mercedes interior CAN?

A lot. For one it tells us to better get hardware that handles the CAN protocol itself, at least for the start. Implementing it with a normal microcontroller should be doable at 125 kbit/s and 5 V signaling level, but that would be a mid-sized project in itself. If you wanna do it really well and make sure signal edges and such are clean enough to be within spec, and not just clean enough to happen to work with whatever you have, it would likely be a pretty big project.
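To get a feel for the timing a do-it-yourself implementation would have to hit, here is a small back-of-the-envelope calculation in Python. It assumes a standard data frame with an 11-bit ID, which has 44 bits of overhead plus 8 bits per data byte, and it ignores stuff bits and the gap between frames, so real frames take somewhat longer. The function names are made up for this sketch.

```python
def frame_bits(data_bytes):
    """Bits in a standard (11-bit ID) data frame before bit stuffing.

    1 SOF + 11 ID + 1 RTR + 1 IDE + 1 reserved + 4 DLC
    + 16 CRC incl. delimiter + 2 ACK incl. delimiter + 7 EOF = 44 bits.
    """
    if not 0 <= data_bytes <= 8:
        raise ValueError("classic CAN frames carry 0 to 8 data bytes")
    return 44 + 8 * data_bytes


def bit_time_us(bitrate_bps):
    """Duration of a single bit in microseconds."""
    return 1_000_000 / bitrate_bps


def frame_time_us(data_bytes, bitrate_bps):
    """Lower bound for the time one frame occupies the bus, in microseconds."""
    return frame_bits(data_bytes) * 1_000_000 / bitrate_bps


if __name__ == "__main__":
    for rate in (125_000, 500_000):
        print(rate, bit_time_us(rate), frame_time_us(8, rate))
```

At 125 kbit/s that leaves 8 microseconds per bit, which is not much room for a general-purpose microcontroller to sample, compare and react in software.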

Another thing is that we can assume Mercedes kept to the idea of using message IDs for priority. So chances are this can give us hints for decoding messages. I’d guess the brake light gets preferred over the buttons to control the radio.
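Once there is a capture to look at, a first pass could be as simple as the following Python sketch: sort the observed IDs by priority and note which payload bytes ever change. The input format, a list of (id, payload) tuples, is an assumption about what a sniffer might hand over, and the IDs in the example are invented.

```python
def summarize(frames):
    """Return (id, count, changing byte positions), highest priority first."""
    seen = {}
    for can_id, data in frames:
        entry = seen.setdefault(
            can_id, {"count": 0, "first": data, "changing": set()}
        )
        entry["count"] += 1
        # Compare against the first payload seen for this ID.
        for pos, (old, new) in enumerate(zip(entry["first"], data)):
            if old != new:
                entry["changing"].add(pos)
    # A lower ID means a higher priority, so a plain sort does the ranking.
    return [
        (cid, seen[cid]["count"], sorted(seen[cid]["changing"]))
        for cid in sorted(seen)
    ]


if __name__ == "__main__":
    capture = [  # hypothetical log entries
        (0x200, bytes([0x00, 0x00])),
        (0x010, bytes([0x01])),
        (0x200, bytes([0x00, 0x05])),
    ]
    for can_id, count, changing in summarize(capture):
        print(hex(can_id), count, changing)
```

Pressing one button in the car while logging and then checking which byte positions show up as changing would be the obvious way to use this.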

So yeah, I’m gonna get my hands on something that can tap the CAN bus sooner or later and see what I’m able to decode and understand. My current preference is something based on the ELM327 chip, which, among other things, does speak CAN, handles all the bus details and not only has a UART interface, but also comes with a very detailed datasheet explaining a lot not only about the chip and its behavior, but also CAN itself.

Don’t get too excited, though, this is more a long term idea and not necessarily happening this weekend.