Reverse Engineering the DeltaV protocol

This document describes what we did in order to reverse-engineer the DeltaV protocol.

The sole purpose of this document is to document our path, so that we can eventually protect ourselves against accusations of having used illegal measures to obtain the information we got.

Starting point

We were kindly provided with access to a DeltaV system used for training.

Therefore we were able to change a few things without fear of breaking anything important.

This system consisted of three DeltaV controllers and two Operator-System instances. So in general there were three types of systems involved:

  1. DeltaV Controller

  2. DeltaV Operator-System

  3. Laptop running WireShark

The first change here was to replace the network switch with a network hub, so we could capture the entire network traffic. (By the way … if you happen to still have a working hub, don’t give that away. It turns out it’s really hard to get your hands on hubs these days.)

With this we let the system run and did a pretty long network capture.

This 29MB WireShark capture was what we started working with.

Identifying the protocol

As WireShark didn’t have a DeltaV dissector, we had to track it down ourselves.

When looking at the capture, we very quickly filtered out the usual network protocols and the Windows communication and found a lot of binary UDP traffic running on port 18507.

What was special about all of these was that their payload always started with 0xFACE.

First steps in understanding the protocol

Here it appeared that UDP packets are sent with some sort of payload, which are obviously responded to by packets with a fixed and very small size (64 bytes).

So we are assuming these short messages, which seem to partially repeat some of the information of the first packet (but not all of it), are probably acknowledgement packets.

From looking at the nature of the traffic and knowing the IP addresses of the DeltaV controllers (PLC) and the Operator Systems (control software), we could see that this communication is probably subscription-based, as we could see the Controller sending a lot of packets to the OS, which the OS then acknowledges at regular intervals.

When filtering only the UDP packets on port 18507 and scrolling through the byte representation, we could quickly identify some things.

  1. Directly after the 0xFACE header came two bytes (a short) which were identical for similar packet content, so at first we thought this somehow correlated to a type.

  2. After this came another short value which also seemed identical for similar packets. However as there were differences, we assumed this was some sort of sub-type.

  3. This was followed by one byte that seemed identical for all messages sent from one node to another, but we did find up to 3 different values in the dumps. So we at first thought this was some sort of connection-id.

  4. The most obvious thing about the byte after that was that it seemed to increment by one for messages with the same "connection-id", so we assumed this was a message-id.

  5. Between the message-id and an obviously constant sequence of 3 bytes there were 3 more bytes which, at first, we had no idea what they might be used for.

In order to prove the assumptions we made, we wrote little programs that used Pcap4J to check them programmatically.
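To illustrate, here is a minimal sketch of such a Pcap4J check (the capture file name is just an example): it reads a capture, filters the UDP packets on port 18507 and counts how many of them actually start with the assumed 0xFACE magic.

```java
import java.io.EOFException;
import java.util.concurrent.TimeoutException;

import org.pcap4j.core.NotOpenException;
import org.pcap4j.core.PcapHandle;
import org.pcap4j.core.PcapNativeException;
import org.pcap4j.core.Pcaps;
import org.pcap4j.packet.Packet;
import org.pcap4j.packet.UdpPacket;

public class DeltaVAssumptionCheck {

    private static final int DELTAV_PORT = 18507;

    public static void main(String[] args) throws PcapNativeException, NotOpenException, TimeoutException {
        // The file name is just an example for one of our captures.
        PcapHandle handle = Pcaps.openOffline("deltav-capture.pcapng");
        int total = 0;
        int withMagic = 0;
        while (true) {
            Packet packet;
            try {
                packet = handle.getNextPacketEx();
            } catch (EOFException e) {
                break; // end of the capture file reached
            }
            UdpPacket udp = packet.get(UdpPacket.class);
            if (udp == null || udp.getPayload() == null
                || udp.getHeader().getDstPort().valueAsInt() != DELTAV_PORT) {
                continue; // not DeltaV traffic
            }
            byte[] payload = udp.getPayload().getRawData();
            total++;
            // Check the assumption that every payload starts with the magic bytes 0xFA 0xCE.
            if (payload.length >= 2 && (payload[0] & 0xFF) == 0xFA && (payload[1] & 0xFF) == 0xCE) {
                withMagic++;
            }
        }
        handle.close();
        System.out.println(withMagic + " of " + total + " UDP packets on port " + DELTAV_PORT + " start with 0xFACE");
    }
}
```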

Decoding the wrapper packet header

As mentioned before, it appeared obvious that for almost identical packets the first two short values were very similar too, which led us to the assumption that these are type and sub-type. So we wrote a little program that scanned all the packets and counted how many types we had. This was a very large number. We also output some additional information about the packets on the console … one of these pieces of information was the length of the packet.

When outputting a sorted list of the type values (maybe it’s strange, but I like data sorted) it seemed quite obvious that a larger presumed type value also went with a larger packet length. Comparing the differences even showed that if one packet’s presumed type value was larger than another’s by some amount, its length was larger by exactly that number of bytes.

So in the end it turned out that the first short value is not the type, but a packet-length-related value. It even turned out that the number encoded here was constantly 2 bytes less than the number of bytes following the 3 constant bytes 0x800400, leaving two bytes at the end unaccounted for.
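A sketch of the check we used for this observation follows; the big-endian byte order is our assumption, and 0x800400 is the constant 3-byte sequence mentioned above.

```java
/**
 * Sketch of the check for the assumed length field: the short value directly after the
 * 0xFACE magic should equal the number of bytes following the constant 0x80 0x04 0x00
 * sequence, minus the two trailing bytes at the very end of the packet.
 */
public static boolean lengthFieldMatches(byte[] udpPayload) {
    // Read the (assumed big-endian) short directly after the two magic bytes.
    int lengthField = ((udpPayload[2] & 0xFF) << 8) | (udpPayload[3] & 0xFF);
    // Locate the constant 0x80 0x04 0x00 sequence.
    for (int i = 0; i < udpPayload.length - 2; i++) {
        if ((udpPayload[i] & 0xFF) == 0x80 && (udpPayload[i + 1] & 0xFF) == 0x04
            && (udpPayload[i + 2] & 0xFF) == 0x00) {
            int bytesAfterMarker = udpPayload.length - (i + 3);
            return lengthField == bytesAfterMarker - 2;
        }
    }
    return false; // marker not found
}
```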

These two bytes seemed to be extremely jumpy. Here we could see that for every packet this value seemed to change quite dramatically, even if the payload was almost identical. We did see values repeated, but couldn’t see any correlation between the content of the colliding packets. So in the end we assume this is some sort of checksum value.

With this assumption, we now think that the first short value is the payload-length value.

Knowing that the first short value is no longer the type candidate upgraded the second short value to being the new candidate for a full type.

Scanning the communication, it pretty quickly became obvious that most communication packets have a value of 0x0002 here and some have 0x0006. Only when starting a new OS instance while capturing did some others pop up: 0x0003 and 0x0004.

So we are assuming the 0x0003 and 0x0004 packets are related to establishing a connection. Also we could see that the node initiating a connection sends a 0x0003 packet, which is then replied to with a 0x0004 packet.

Most information is then sent via 0x0002 packets. However, every now and then there comes a pair of 0x0006 packets, whose length seems too small to transfer any data. We are therefore assuming these packets are used for syncing.
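To summarize what we think we know about the wrapper types so far (the names are our own working labels, not official ones):

```java
/** Wrapper-protocol type values we have observed so far; meanings are our working assumptions. */
public final class WrapperType {
    public static final int DATA             = 0x0002; // carries most of the payload traffic
    public static final int CONNECT_REQUEST  = 0x0003; // sent by the node initiating a connection
    public static final int CONNECT_RESPONSE = 0x0004; // reply to a connect request
    public static final int SYNC             = 0x0006; // too short to carry data; presumably keeps connections in sync

    private WrapperType() {}
}
```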

As mentioned before, we did a lot of automatic splitting of PCAPNG files into separate ones depending on different conditions, because this allowed us to browse through similar packets and use simple human pattern recognition for decoding.
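The following sketch shows the kind of splitter we used, here splitting by directed sender/receiver pair; the file names and the splitting condition are just examples.

```java
import java.io.EOFException;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.TimeoutException;

import org.pcap4j.core.NotOpenException;
import org.pcap4j.core.PcapDumper;
import org.pcap4j.core.PcapHandle;
import org.pcap4j.core.PcapNativeException;
import org.pcap4j.core.Pcaps;
import org.pcap4j.packet.IpV4Packet;
import org.pcap4j.packet.Packet;
import org.pcap4j.packet.UdpPacket;

/**
 * Splits a capture into one file per directed sender/receiver pair, so similar packets
 * end up next to each other and can be compared by eye.
 */
public class SplitByConversation {

    public static void main(String[] args) throws PcapNativeException, NotOpenException, TimeoutException {
        PcapHandle handle = Pcaps.openOffline("deltav-capture.pcapng");
        Map<String, PcapDumper> dumpers = new HashMap<>();
        while (true) {
            Packet packet;
            try {
                packet = handle.getNextPacketEx();
            } catch (EOFException e) {
                break; // end of the capture file reached
            }
            IpV4Packet ip = packet.get(IpV4Packet.class);
            UdpPacket udp = packet.get(UdpPacket.class);
            if (ip == null || udp == null || udp.getHeader().getDstPort().valueAsInt() != 18507) {
                continue;
            }
            // One output file per directed sender/receiver pair.
            String key = ip.getHeader().getSrcAddr().getHostAddress() + "-to-"
                + ip.getHeader().getDstAddr().getHostAddress();
            PcapDumper dumper = dumpers.get(key);
            if (dumper == null) {
                dumper = handle.dumpOpen(key + ".pcap");
                dumpers.put(key, dumper);
            }
            dumper.dump(packet, handle.getTimestamp());
        }
        for (PcapDumper dumper : dumpers.values()) {
            dumper.close();
        }
        handle.close();
    }
}
```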

Pretty soon after starting to split up the streams, so that the communication between two nodes was separated into its own file, we noticed what we assumed to be a connection-id and a message-counter immediately following the type.

However, when doing the statistical counting and listing of these values, we noticed that after the message-id 0xFF came 0x00, as we would have expected, but the assumed connection-id was also incremented by one. So it turned out that the similarity of the first byte was not due to a connection-id, but simply due to the high byte of a counter changing only rarely. After the type there seems to come a two-byte message-id counter. It seems as if this is initialized to some random value during connection establishment and is then incremented with every message.

However, our little check did reveal that this id is identical for the packet following a 0x0006 packet. Therefore we are assuming the 0x0006 packets are used to sync the connection between two nodes.
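The check itself can be as simple as the following sketch; the counter offset (directly after the two-byte type, i.e. bytes 6 and 7 of the UDP payload) and the big-endian interpretation are our current assumptions, and the exception right after a 0x0006 sync packet is the one described above.

```java
/**
 * Sketch of the rollover check: the two bytes directly after the type should behave like
 * a single 16-bit counter that increases by one from message to message within the same
 * stream (except for the packet right after a 0x0006 sync packet).
 */
public static boolean counterIncrementsByOne(byte[] previousPayload, byte[] currentPayload) {
    int previous = ((previousPayload[6] & 0xFF) << 8) | (previousPayload[7] & 0xFF);
    int current = ((currentPayload[6] & 0xFF) << 8) | (currentPayload[7] & 0xFF);
    // The modulo handles the 0xFFFF -> 0x0000 rollover we observed.
    return current == ((previous + 1) & 0xFFFF);
}
```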

The short value (or a byte value following a 0x00 byte) after this message-id seems to be constant for every message sent from one node to another, but we think we have seen one node use different values for communicating with different nodes. Also, all Controllers seem to have values from 0x0000 to 0x0009 and OSes from 0x000A to 0x000F. We are assuming this is some sort of sender-id value.

The following 3 bytes were quite a mystery to us. We could see that an ever-increasing number seemed to be stored here, however the increment was not linear. But when interpreting these 3 bytes as a number and filtering only packets initiated by the controller towards one particular OS, accidentally filtering out the Ack messages as well, we did notice a certain pattern. When correlating that to the capture file, we could see that it matched the pattern in the communication. We were told that the OS is configured to update its values once a second and could see that rhythm in the capture. This is when we noticed that the numeric value in the 3 bytes followed the same rhythm. In the end it turned out that if we interpret this 3-byte number as a timestamp, with the first 14 bits encoding seconds and the following 10 bits the sub-second part, the values matched the timestamps of the packet capture.

We have to admit that we were a little surprised about this. Currently we are assuming that this timestamp is used for tracking transmission times. 4 bytes would allow quite a time range, but would come at the cost of one additional byte per packet that wouldn’t be needed for simple run-time checking. Two bytes, however, would only allow encoding about 63 seconds, which is probably too little when thinking about packet losses and re-transmissions. So in the end we assume these 3 bytes are a simple timestamp.
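Under this assumption, decoding the timestamp is straightforward; the 14/10 bit split is the one described above, so one tick of the sub-second part is 1/1024 of a second and the seconds counter wraps roughly every four and a half hours.

```java
/**
 * Decodes the assumed 3-byte timestamp: the upper 14 bits count seconds,
 * the lower 10 bits the sub-second part (1/1024 s per tick).
 */
public static double decodeTimestamp(byte b0, byte b1, byte b2) {
    int raw = ((b0 & 0xFF) << 16) | ((b1 & 0xFF) << 8) | (b2 & 0xFF);
    int seconds = raw >> 10;        // upper 14 bits
    int subSeconds = raw & 0x3FF;   // lower 10 bits
    return seconds + subSeconds / 1024.0;
}
```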

As mentioned before, the following byte is usually 0x80, except for the first packet each party exchanges with the other side; in this case the value is 0x00. So we’ll just keep that in mind and not worry about interpreting a meaning into this. The last two bytes of the wrapper packet header are simply a constant 0x8000 in every packet. Here too, we’ll just live with knowing that and not wonder what it could mean.
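Putting all of this together, our current working model of the wrapper packet header looks roughly like the following sketch. All field meanings, offsets and the big-endian byte order are assumptions derived from the captures; the presumed checksum sits at the very end of the packet and is not part of this header.

```java
import java.nio.ByteBuffer;

/** Working model of the wrapper packet header; every field meaning here is an assumption. */
public class WrapperHeader {
    public final int magic;          // always 0xFACE so far
    public final int payloadLength;  // the length-related value described above
    public final int type;           // 0x0002 / 0x0003 / 0x0004 / 0x0006
    public final int messageCounter; // 16-bit counter, random start, +1 per message
    public final int senderId;       // constant per sending node
    public final int timestamp;      // 3-byte value: 14 bits seconds, 10 bits sub-seconds
    public final int flag;           // 0x00 in the first packet of a connection, 0x80 afterwards
    public final int constant;       // always 0x8000 so far

    public WrapperHeader(byte[] udpPayload) {
        ByteBuffer buffer = ByteBuffer.wrap(udpPayload); // big-endian by default
        magic = buffer.getShort() & 0xFFFF;
        payloadLength = buffer.getShort() & 0xFFFF;
        type = buffer.getShort() & 0xFFFF;
        messageCounter = buffer.getShort() & 0xFFFF;
        senderId = buffer.getShort() & 0xFFFF;
        timestamp = ((buffer.get() & 0xFF) << 16) | ((buffer.get() & 0xFF) << 8) | (buffer.get() & 0xFF);
        flag = buffer.get() & 0xFF;
        constant = buffer.getShort() & 0xFFFF;
    }
}
```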

Decoding the internal packet

After having decoded what we think is the wrapper-protocol packet structure, we knew decoding the internal protocol would be a bigger challenge.

Therefore the test-setup was changed a little.

The number of controllers and operator-systems was reduced to one each. The OS was changed to display only one single value of the controller. A simulator was introduced, which was connected to the controller and communicated with it via the ProfiNet protocol. This simulator allowed us to provide values to the controller via the backend ProfiNet connection.

With this setup we started working on the payload decoding.

The first two bytes of the payload seem again to relate to a type as similar packets usually had identical values here.

In particular, we did a lot of separate captures for all sorts of different operations. During this we managed to reverse engineer the connection handling on the wrapper-protocol layer as well as how the communication looks inside the internal protocol.

Connecting

When an OS is booted, the following communication could be observed:

  1. The OS sends a wrapper-protocol type 0x0003 to the controller.

  2. The controller responds with a 0x0004 message back to the OS.

  3. After receiving the 0x0004 message, the OS sends a type 0x0002 message with a wrapped type 0x0501

  4. This packet is acknowledged by the controller

  5. The controller sends a very similar packet as the 0x0501 packet back to the OS

  6. The OS acknowledges this packet back to the controller

From looking at the packets in WireShark’s byte and ASCII view, we could see ASCII characters that seem to relate to text strings. A quick consultation with the guy in charge of these systems made us realize that these strings refer to parts of the controller’s software. Digging even deeper made us realize that the 4-byte values relate to unix timestamps - at least we could decode the byte values to absolutely sensible dates. So it seems that in these packets each participant exchanges its software state in order to know that all are using compatible versions.

This does make sense. When connecting to a remote system, checking the version compatibility is surely an important thing to do.

Normal operation

When just starting the system and not changing anything, we did notice those previously mentioned 0x0006-type messages, which we think are responsible for keeping the connections in sync.

What is also noticeable when just starting a system and not yet logging in to Windows is that the DeltaV Operator System seems to connect before a user logs in. After the connection is established, the controller almost immediately starts sending alarm telegrams (0x03xx types), but no value telegrams (0x04xx).

It also appears that there is some upper bound on the size of packets: when a lot of alarms are sent in 0x0304 messages, these are sometimes split up into multiple messages sent at very short intervals, the last of these usually being smaller than the rest.

Changing values

So now we started changing values. First we started changing valid values into other valid values.

Whenever we changed a value, a packet with wrapper-type 0x0002 and payload-type 0x0403 was immediately sent from the controller to the OS.

Whenever we changed values to ones that raise alarms, we could see the same 0x0403 messages being sent, but with a much larger payload; additionally, packets with payload-types starting with 0x03 were sent.

So after increasing a value to one that raises alarms, we got a payload-type 0x0304 message containing a string-like representation of the name of the alarm raised, as well as probably the name of a user or role. We also noticed that the 0x0403 packets seem to contain string constants that relate exactly to the text displayed at the bottom of each screen window.

We assume payload-types starting with 0x04 relate to normal data exchange and ones starting with 0x03 to events and alarms.
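For reference, the payload types we have identified so far (again, the names are only our working labels):

```java
/** Payload (internal-protocol) type values seen so far; meanings are our working assumptions. */
public final class PayloadType {
    public static final int VERSION_INFO = 0x0501; // exchanged right after connection establishment
    public static final int VALUE_CHANGE = 0x0403; // sent whenever a displayed value changes
    public static final int ALARM        = 0x0304; // contains the alarm name and probably a user or role name
    // In general: 0x04xx seems to be normal data exchange, 0x03xx events and alarms.

    private PayloadType() {}
}
```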

Decoding value changes

After coming to the conclusion that wrapper-type 0x0002 messages with a payload-type of 0x0403 seem to relate to value changes, we started filtering for exactly these packets.

Next we created some captures that contain all sorts of documented value changes. We started with the value 0 and incremented it in steps of 1 up to 20 and then in steps of 10 up to 200, then went back to 0 and down to the minimum value of -40.

We could see that certain parts of the packet changed when changing values, while most of the packet remained constant. When changing a value back to the original value, the packet’s payload was identical in this part. As the packets didn’t all have the same content, and sometimes blocks were added or left out at regular intervals, we got the impression that these packets too consist of a fixed packet header followed by blocks. In our case the changing blocks all had two indicator bytes 0x0b08, with the 4 bytes after that changing according to the input. Sometimes a block of bytes was added before this block, and then the entire content was just pushed back.

The last of these blocks was always a sequence of 9 bytes: 0x01000000000000000000, so we assume this is a sort of terminating block for such a packet.
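To tabulate which bytes change for which input value, a scan like the following sketch is enough; the 0x0b08 indicator bytes are the ones described above, everything else about the layout is an assumption.

```java
/**
 * Sketch: collect the 4 bytes following every 0x0b 0x08 indicator as a hex string,
 * so the observed bytes can be tabulated against the values entered on the OS.
 */
public static java.util.List<String> extractValueBlocks(byte[] payload) {
    java.util.List<String> blocks = new java.util.ArrayList<>();
    for (int i = 0; i + 5 < payload.length; i++) {
        if ((payload[i] & 0xFF) == 0x0B && (payload[i + 1] & 0xFF) == 0x08) {
            StringBuilder hex = new StringBuilder();
            for (int j = i + 2; j < i + 6; j++) {
                hex.append(String.format("%02X", payload[j]));
            }
            blocks.add(hex.toString());
        }
    }
    return blocks;
}
```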

One thing that is quite obvious is that the header again seems to contain a one-byte counter which is incremented for every 0x0403 packet.

As we currently wanted to understand how the data is encoded we decided to ignore all other parts and concentrate on the block we identified as containing the information we were looking for.

Now, how should we interpret this information? At first I was told we were working with an unsigned 16-bit integer. Unfortunately we could see 4 bytes changing. It took us quite a while. As mentioned before, we created a table containing the value we input and what we could see in the packet. We started seeing patterns but couldn’t quite get the meaning; however we did notice the "end" of the hex values alternating around 0x00001 and 0xFFFFF. This didn’t help much, so we converted these 4 bytes into a numeric value. We noticed the numbers being a lot bigger for negative values, which made us think that maybe the first bit is used for the sign. We could confirm that values of +20 and -20 did produce values that were equal except for the first bit. Unfortunately we still didn’t quite understand how to interpret the other 31 bits. So we converted the payload to its binary representation, and this is when we finally found out what it meant. In reality this value was not a 16-bit integer, but an IEEE-754 floating-point value: the first bit is interpreted as the sign, followed by an 8-bit exponent and a 23-bit mantissa. This explained the end of the value alternating around 0.
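In Java, Float.intBitsToFloat performs exactly this IEEE-754 interpretation, so decoding the 4 bytes following the 0x0b08 indicator boils down to the following sketch (big-endian byte order assumed).

```java
/**
 * Decodes 4 bytes as an IEEE-754 single-precision float:
 * 1 sign bit, 8 exponent bits, 23 mantissa bits.
 */
public static float decodeValue(byte[] payload, int offset) {
    int bits = ((payload[offset] & 0xFF) << 24)
             | ((payload[offset + 1] & 0xFF) << 16)
             | ((payload[offset + 2] & 0xFF) << 8)
             |  (payload[offset + 3] & 0xFF);
    return Float.intBitsToFloat(bits);
}
```

For example, +20.0 encodes as 0x41A00000 and -20.0 as 0xC1A00000, differing only in the first bit, which matches what we saw in our table.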

With this we were able to write a simple console application that was able to instantly display value changes.

However, when applying what we had learnt to the older captures, it turned out that unfortunately we could no longer find these 0x0b08 sequences; we did, however, find a lot of 0x08 bytes followed by a 32-bit floating-point value. So it seems that 0x08 indicates the type of a signed 32-bit IEEE-754 floating-point value. Perhaps the 0x0b part referred to some sort of subscription-id.

As the communication seems to be subscription-based, the OS has to tell the controller what information it is interested in.

Right now we are assuming that in one of the other 0x04 packets a subscription is initiated and assigned some sort of id, and that this id is used to identify which value has actually changed.

We will hopefully be able to decode this addressing problem in one of our next reverse-engineering sessions.

Decoding Strings

As mentioned before, we could see content in the packets that was sometimes readable just from looking at the payload, so we decided to have another look at these.

After some time we noticed that strings seem to have a type of 0x00 and are followed by one byte (or a short value) containing the length of the string. This is then followed by two bytes per character. We assume this is in order to allow transferring two-byte UTF-16 characters, but we never found any. So far the first byte of every character is simply 0x00, followed by the UTF-8 character value.
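Based on these observations, a string block could be decoded roughly like the following sketch; whether the length is one byte or a short is still open, so this version simply assumes a single length byte.

```java
/**
 * Sketch of decoding a string block as described above: a length value followed by
 * two bytes per character, of which the first has always been 0x00 in our captures.
 */
public static String decodeString(byte[] payload, int offset) {
    int length = payload[offset] & 0xFF;               // number of characters (assumed one length byte)
    StringBuilder text = new StringBuilder(length);
    for (int i = 0; i < length; i++) {
        int high = payload[offset + 1 + 2 * i] & 0xFF; // 0x00 in everything we have seen so far
        int low = payload[offset + 2 + 2 * i] & 0xFF;  // the actual character value
        text.append((char) ((high << 8) | low));
    }
    return text.toString();
}
```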