Introduction
SIP-R stands for Serial Interface Protocol - Reliable
This is a protocol designed by Michael Richards and Marcell. It is designed specifically so tuning software can efficiently communicate with the ECU using a standard reliable protocol over serial bus (RS232 or RS485, or even CAN).
- It uses a pre-defined format that is easy to parse
- It provides forward and backward compatibility. (read: it does not restrict firmware developers by freezing a set of variable-offsets. Temporarily it's OK though).
- It is stateless and provides ack packets for every command sent. This allows continuous tasks like sensor logging to occur without interruption while settings and map adjustments are made.
- uses as much of existing standards as possible
- same addressing as the comm.c send_page() and store_page() commands that are used for MegaTunix
- same human readable (nice to maintain, and there anyway) key=>binary offset translation as mcd (eg. config.cwl position=02). This is not necessary if the config offset database for the given config version is stored or cached in the tuning software (not urgent).
- it supports some reasonable restrictions. Some table data will be stored in EEPROM, so the firmware can only save one variable at a time. The tuning software must support this (at least later, when needed), either by reading back the variable or, preferably, by checking the busy condition
- It's OK to send "empty" packets that only have flow control (NAK, or ACK), but normally, when there is "production talk", NAKs and ACKs must be piggybacked on production packets.
How it is done in proven standard solutions
In an ideal world 2 layers are necessary. This (separation of 2 layers) is something that [modbus.org] [modbus protocol] gets perfectly right. No wonder modbus is the absolute market-leader protocol for the exact task that we are inventing our own for.
application layer is responsible for content dispatch (appPDU: which byte means what?)
- while [http://www.modbus.org/modbus/standMbusLibrary.nsf/fa_libsearch?OpenForm# modbus application protocol] is nice and maps very well to our application (so it is a good idea to browse through the 40 pages to get an idea of commands)
- I think that it's OK for now to have a simplified "sipr" implementation instead of a conformant modbus
- Note: the operations that get/set registers work on 16 bits for modbus, while some of our registers are 8 bit: when we move towards conformance, we can treat 2x8 bit config elements as 1 word (some coding inconvenience) or send 16 bits over the network but only store the MSB (tiny network overhead)
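The two options from the note above can be sketched in a few lines of C. This is only an illustration: the function names are made up here, and putting the first config element into the high byte is an assumption (modbus sends the high byte of a register first, so this keeps the elements in order on the wire).

```c
#include <stdint.h>

/* Option 1: two adjacent 8-bit config elements travel as one 16-bit
 * register. ASSUMPTION: the first element goes into the high byte. */
uint16_t pack_pair(uint8_t first, uint8_t second)
{
    return (uint16_t)((first << 8) | second);
}

void unpack_pair(uint16_t reg, uint8_t *first, uint8_t *second)
{
    *first = (uint8_t)(reg >> 8);
    *second = (uint8_t)(reg & 0xFF);
}

/* Option 2: one 8-bit element per 16-bit register; only the MSB is
 * stored, the low byte is sent as zero and ignored on receipt. */
uint16_t widen_msb(uint8_t value)
{
    return (uint16_t)(value << 8);
}

uint8_t narrow_msb(uint16_t reg)
{
    return (uint8_t)(reg >> 8);
}
```

Option 1 needs the firmware to know which elements are paired, while option 2 doubles the bytes on the wire for every 8-bit element.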
network layer is responsible for framing and reliable communications (CRC)
- modbus appPDU length is not explicit. It is well defined, but not stored explicitly at a given position inside the appPDU. The intention is clear: the application layer deals with frames and the network layer must provide framing in whatever way is appropriate
- [modbus over serial line] is a very good example of framing without byte stuffing (at least the obligatory RTU, not the optional ASCII), but based on "frame timing" information (blank between transmitted bytes) which might not work on win32 (especially if USB-serial is also involved)
- the most efficient protocol that is still win32 friendly: frame marker, explicit frame length and CRC. Without byte stuffing, the frame marker and CRC together make it easy to find the frame again if synchronization gets lost, for virtually no added complexity and very little CPU overhead, and this is clearly the most performant option for normal transmission (the "find frame with the help of the CRC" method is proven even in gigabit ATM; the frame marker just cuts down the worst-case cost further).
- byte stuffing is also possible of course, with some loss in performance even for the normal (no transmit-error) case
- when packing modbus-like (answer on virtual request) PDU-s in MMC flash (datalogging) the framing based on "frame timing" is a no-go. (but either of above 2 works)
- modbus-serial uses an incredibly performant CRC that I suggest we drop into our flash (C source code is also provided). This will cut flash usage since we'll use the modbus CRC in any case and there are no other restrictions that would bind us to any other CRC type.
- It must be possible to leave out CRC, if the carrier (CAN, zipfile, etc...) provides CRC anyway. That is, when another (than serial) network layer is used.
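For reference, here is a minimal sketch of the CRC-16 that modbus RTU uses (reflected polynomial 0xA001, initial value 0xFFFF, no final XOR). Note that this is a bitwise variant written for this page, not the table-driven source from the modbus documents: it is slower per byte but needs no lookup table in flash. The function name is an assumption.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-16 with the parameters used by modbus RTU:
 * reflected polynomial 0xA001, initial value 0xFFFF, no final XOR. */
uint16_t crc16_modbus(const uint8_t *data, size_t len)
{
    uint16_t crc = 0xFFFF;

    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 1)
                crc = (uint16_t)((crc >> 1) ^ 0xA001);
            else
                crc >>= 1;
        }
    }
    return crc;
}
```

The usual check string "123456789" gives 0x4B37 with these parameters, which is handy for verifying a port to a new target.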
Note that modbus solves some existing problems out of the box, in a standard way:
- besides setting and querying traditional variables (that can be either config or runtime)
- trigger and other event logging (query FIFO command)
modbus shortcomings
- strict predefined allocation of master. When we have a small WBO2+other sensors dongle, a display, an ECU, and a PC to chat, which one should be the master? Any 2 should be able to cooperate without the other 2 present.
- I thought about a simple master election that works on the logical level. Even if the master is shut down, the bus continues to work (if otherwise physically operable: unpowered devices don't disturb the RS485 bus) as long as the slaves detect the loss of the master: after 1000 usec of silence, each slave chooses a random delay t_rand=0..2000 usec and, once t_rand has elapsed, becomes the master by advertising itself, unless another node became master in the meantime.
- there is also a hardware issue: there is a slight recommendation to place "line polarization" (almost like pullup, but symmetrical) into the master. It is probably the best to not install "line polarization" too hard on boards, but leave that to the installer along with "line termination"-s (easy to make in the standard DSUB9)
- no slave-to-slave direct communication (although the RS485 bus would allow)
- slave cannot talk at all unless requested by master
- in the reference case of 4 units on the RS485 bus with flows PC=>ECU, sensors=>ECU, ECU=>PC, ECU=>display, the control of flows is non-trivial (unless the ECU-PC link is separate; then the ECU can be the absolute master)
- when 2 ECU-s are interested in the same signal data (eg. MAP signals), the standard way is to relay the data, but it would be possible for the slave ECU to catch the data at the same time as the first ECU
- if I understand right, the master cannot ask another question until reply is received (or timeout)
- this decreases max utilization on the bus (since 115200 is easily possible even with 70m cable, might not be a real problem for most applications)
- different slaves could work on different questions independently
- this could be a place to improve modbus for even higher performance. Collisions must be avoided, though (nontrivial: might not be worth the hassle).
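- if the questions are simple (as is the case for typical register reads) the timeout can be very low and utilization high with short idle times. Actually, the opposite can be a problem: ensuring that the reply comes no sooner than the required 3.5 character blank-time (about 334 usec at 115200 baud with 11-bit characters; the modbus serial specification recommends a fixed 1.75 ms inter-frame delay for baud rates above 19200).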
- I don't see a way in modbus to pack multiple commands into a frame
- sigh: the application layer packets can be very short
- the overhead of a network packet might be significant because of CRC, acknowledgement, and possibly routing and bandwidth. It is extremely prohibitive in wireless communications.
- with the incredibly nice CRC it might not be a big issue.
- if the application's PDU-s are made sufficiently big (eg. a set of data sent in one batch, like all 16 ADC values; modbus supports this, see function code 0x04) it can work efficiently without any hacks, even when data goes to MMC flash.
The "wired in CRC", which is a violation of the 2-layer design (application + network, see above), is not a performance issue. It becomes a maintenance issue when we start supporting CAN, MMC flash or another data carrier besides the serial link. No question that we can solve it then, though.
Entering SIPR mode
On startup the ECU can switch to the SIPR protocol by sending the command 'Mas'. You can only enter this from the MENU_ZERO state because this does not make sense for a keyboard user. Upon entering the protocol the string "OK\n" will be sent.
Packet Format
Every packet must be sent in the following format:
Signature | Sequence | Type | Size | Data | CRC-16 |
1 byte | 1 byte | 1 byte | 1 byte | variable length of Size bytes | 2 bytes |
Discussion: The signature byte serves as a trigger to the receiver that a new packet is starting. 0x0A is defined as the byte to use. If the receiver gets a corrupted packet or times out receiving one it will return to a state where it is looking for a packet signature.
The sequence byte is necessary to ensure that ACK packets are matched with the command they refer to. If the tuning software sends 2 commands and one ACK becomes corrupted then the software must be able to correctly determine which command was accepted. In our case it is the responsibility of the sender to increment the sequence number with every command sent to ensure no collisions occur.
The type byte allows up to 256 different types of packets to be defined (0x00 to 0xFF). In reality we expect to see about 30 different types. Should the need for more than 256 types ever arise, an "extended" packet type could be defined by carrying extra type definition bytes in the data area.
There is only one byte designated for the data size. This means the largest possible packet will be 261 bytes - more than sufficient for the limited resources of the ECU. As apparent from the description above, size in this context refers only to the number of bytes in the data segment. In some packets it may also be valid to have 0 bytes of data.
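The layout in the table above can be sketched as a small encoder. Treat this as an illustration under stated assumptions, not as the firmware's code: the names are invented here, the CRC-16 variant is assumed to be the modbus one (this page only says "CRC-16"), and the CRC byte order on the wire (low byte first, as modbus does) is also an assumption. The CRC covers everything after the signature byte, as this section specifies.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SIPR_SIGNATURE  0x0A
#define SIPR_MAX_DATA   255
#define SIPR_MAX_PACKET (4 + SIPR_MAX_DATA + 2)   /* 261 bytes */

/* ASSUMPTION: CRC-16 with modbus parameters (0xA001, init 0xFFFF). */
static uint16_t crc16_modbus(const uint8_t *data, size_t len)
{
    uint16_t crc = 0xFFFF;

    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc & 1) ? (uint16_t)((crc >> 1) ^ 0xA001)
                            : (uint16_t)(crc >> 1);
    }
    return crc;
}

/* Writes one packet into out[] and returns its total length,
 * or 0 if the buffer is too small. size may be 0 (no data). */
size_t sipr_encode(uint8_t *out, size_t out_cap, uint8_t seq, uint8_t type,
                   const uint8_t *data, uint8_t size)
{
    size_t total = 4u + size + 2u;
    uint16_t crc;

    if (out_cap < total)
        return 0;

    out[0] = SIPR_SIGNATURE;
    out[1] = seq;
    out[2] = type;
    out[3] = size;
    if (size > 0)
        memcpy(&out[4], data, size);

    /* CRC over sequence, type, size and data; signature excluded. */
    crc = crc16_modbus(&out[1], 3u + size);
    out[4 + size] = (uint8_t)(crc & 0xFF);   /* ASSUMPTION: low byte first */
    out[5 + size] = (uint8_t)(crc >> 8);
    return total;
}
```

A ping (ACK with no data) comes out as 6 bytes, and a packet with 255 data bytes as the full 261.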
Size number of bytes comprise the data segment or payload of a packet.
The last 2 bytes in a packet are the checksum bytes. In this implementation CRC-16 was chosen as the algorithm. This choice was made because of its proven track record and library availability. The CRC is applied to the entire packet excluding the signature byte and, obviously, the CRC bytes themselves. Since the signature byte is necessary for the packet to be recognised at all, running the CRC routine on it would be a waste of time and resources.
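On the receiving side, the behaviour described in this section (hunt for the signature, collect the header and Size data bytes, check the CRC, fall back to hunting on any error) maps naturally onto a byte-at-a-time state machine. The following is a rough sketch only: all names are hypothetical, the CRC variant (modbus parameters) and the CRC byte order (low byte first) are assumptions, and timeout handling is left to the caller, who can simply set the state back to HUNT.

```c
#include <stdbool.h>
#include <stdint.h>

#define SIPR_SIGNATURE 0x0A

typedef struct {
    enum { HUNT, HEADER, DATA, CRC_LO, CRC_HI } state;
    uint8_t  hdr[3];      /* sequence, type, size */
    uint8_t  data[255];
    uint16_t idx;         /* bytes collected in the current field */
    uint16_t crc;         /* running CRC over header and data */
    uint8_t  crc_lo;      /* first received CRC byte */
} sipr_rx;                /* zero-initialise before first use */

/* ASSUMPTION: CRC-16 with modbus parameters (0xA001, init 0xFFFF). */
static uint16_t crc16_update(uint16_t crc, uint8_t byte)
{
    crc ^= byte;
    for (int bit = 0; bit < 8; bit++)
        crc = (crc & 1) ? (uint16_t)((crc >> 1) ^ 0xA001)
                        : (uint16_t)(crc >> 1);
    return crc;
}

/* Feed one received byte. Returns true exactly when a complete packet
 * with a matching CRC is available in rx->hdr and rx->data. */
bool sipr_rx_byte(sipr_rx *rx, uint8_t b)
{
    switch (rx->state) {
    case HUNT:                       /* ignore everything but the signature */
        if (b == SIPR_SIGNATURE) {
            rx->idx = 0;
            rx->crc = 0xFFFF;
            rx->state = HEADER;
        }
        return false;
    case HEADER:
        rx->hdr[rx->idx++] = b;
        rx->crc = crc16_update(rx->crc, b);
        if (rx->idx == 3) {
            rx->idx = 0;
            rx->state = (rx->hdr[2] > 0) ? DATA : CRC_LO;
        }
        return false;
    case DATA:
        rx->data[rx->idx++] = b;
        rx->crc = crc16_update(rx->crc, b);
        if (rx->idx == rx->hdr[2])
            rx->state = CRC_LO;
        return false;
    case CRC_LO:                     /* ASSUMPTION: low CRC byte first */
        rx->crc_lo = b;
        rx->state = CRC_HI;
        return false;
    case CRC_HI:
        rx->state = HUNT;            /* good or bad, look for the next packet */
        return rx->crc == (uint16_t)(rx->crc_lo | (b << 8));
    }
    return false;
}
```

A real receiver would probably want to tell "packet still incomplete" apart from "CRC mismatch" so that it can send a NAK; the boolean return keeps the sketch short.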
Packet Type Definitions
NAK
- Sender: ECU
- Sequence: Set the same as the packet it is responding to.
- Type: 0x00
- Size: 0 (no data necessary)
- Response: none
- Description: The NAK packet is always sent in response to receiving an invalid (wrong CRC) or unrecognised (type not found) packet. In the future it may be necessary to add a reason byte to the data.
ACK
- Sender: ECU
- Sequence: Set the same as the packet it is responding to.
- Type: 0x01
- Size: 0 (no data necessary)
- Response: none
- Description: The ACK packet is always sent in response to a command packet. The ACK packet tells the sender that the command was received, understood and processed.
- Sender: Tuner
- Sequence: Sequentially chosen by the sender.
- Type: 0x01
- Size: 0 (no data necessary)
- Response: ACK
- Description: As a special case the tuning software can send an ACK and expect an ACK to be returned. This functions as a "Ping".
GET_VERSION
- Sender: Tuner
- Sequence: Sequentially chosen by the sender.
- Type: 0x02
- Size: 0 (no data necessary)
- Response: GET_VERSION
- Description: The GET_VERSION packet is sent by the tuning software to determine the version of firmware it is communicating with. This detail is often necessary to determine the features available to the software.
- Sender: ECU
- Sequence: Set the same as the packet it is responding to.
- Type: 0x02
- Size: 3
- Response: none
- Data: The 3 data bytes returned are the Major, Revision and Build numbers of the firmware. 0x01 0x00 0x03 for example would mean version 1.0.3
- Description: The GET_VERSION response packet provides the tuner with the firmware version.
SET_BAUD
- Sender: Tuner
- Sequence: Sequentially chosen by the sender.
- Type: 0x03
- Size: 1
- Response: ACK
- Data: The byte here represents the baudrate as chosen from the following table.
byte | Baudrate |
0x01 | 9600 |
0x02 | 19200 (default rate) |
0x03 | 38400 |
0x04 | 57600 |
0x05 | 115200 |
- Description: The SET_BAUD command is a special case. The ECU must respond with its ACK packet at the current baudrate before switching baudrates. The tuning software must take care, as missing the ACK packet may result in it retrying at the wrong baudrate. Once the command is sent, the tuning software should wait for an ACK. If none arrives, it should resend the command. If it still does not receive an ACK, it should try switching to the new baudrate and sending one or more pings (ACK). Previously the data would have been the divisor needed to form the new baudrate. In the interest of portability I have chosen constants. In the future with an ARM processor the divisors may be different.
Does it make sense to use the same baudrate=1000000/x that is used at other places in the firmware? This covers every baudrate that is currently possible. The ARM would have no problem using (or converting from) these very same values. In any case, 0xE0..0xFF can be reserved for future (eg. other speeds, that don't fit this well). Otherwise we must be _very_ careful to choose the baudrates.
Besides me forgetting to list important baudrates, these are very unlikely to change. I've seen these as the common baudrates for at least the past 15 years. I feel that using pre-defined values is clearer and less prone to errors. Being off by one for example will still work on many serial implementations but oddly not on others.
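With pre-defined constants, decoding the SET_BAUD data byte is a plain table lookup. A minimal sketch, using only the codes from the table above (the function name is made up for illustration):

```c
#include <stdint.h>

/* Maps a SET_BAUD data byte to a baudrate.
 * Returns 0 for codes that are not in the table. */
uint32_t sipr_baud_from_code(uint8_t code)
{
    static const uint32_t rates[] = { 9600, 19200, 38400, 57600, 115200 };

    if (code < 1 || code > sizeof rates / sizeof rates[0])
        return 0;
    return rates[code - 1];
}
```

An unknown code returns 0, which the ECU could answer with a NAK instead of an ACK.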
GET_TABLESIZE
- Sender: Tuner
- Sequence: Sequentially chosen by the sender.
- Type: 0x04
- Size: 1
- Response: GET_TABLESIZE
- Data: Only 1 data byte is sent to specify the table we are querying.
byte | table |
0x01 | todo |
0x02 | todo |
0x03 | todo |
- Description: The GET_TABLESIZE packet is sent by the tuning software to determine the dimensions of a given table. Compiled options can alter the size of certain tables and knowing the correct sizes is very important.
- Sender: ECU
- Sequence: Set the same as the packet it is responding to.
- Type: 0x04
- Size: 2
- Response: none
- Data: The two data bytes correspond to the number of rows and columns respectively. A table with 5 rows and 2 columns would be sent as 0x05 0x02
- Description: The GET_TABLESIZE response tells the tuning software the dimensions of the specified table.
GET_TABLEDATA
- Sender: Tuner
- Sequence: Sequentially chosen by the sender.
- Type: 0x05
- Size: 1
- Response: GET_TABLEDATA
- Data: Only 1 data byte is sent to specify the table we are querying.
byte | table |
0x01 | todo |
0x02 | todo |
0x03 | todo |
- Description: The GET_TABLEDATA packet is sent by the tuning software to retrieve all the data from the specified table. GET_TABLESIZE can be called to determine the rows and columns format of this data.
- Sender: ECU
- Sequence: Set the same as the packet it is responding to.
- Type: 0x05
- Size: (determined by the actual size of the table)
- Response: none
- Data: The table data is returned as a byte stream.
- Description: The GET_TABLEDATA response sends the actual content of a table to the tuning software.
SET_TABLEDATA
- Sender: Tuner
- Sequence: Sequentially chosen by the sender.
- Type: 0x06
- Size: 3
- Response: ACK
- Data: The first data byte identifies the table we are modifying. The second byte contains the offset within that table and the third byte is the actual data.
byte | table |
0x01 | todo |
0x02 | todo |
0x03 | todo |
- Description: The SET_TABLEDATA packet is sent by the tuning software to modify an entry within a table.
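As an illustration, the tuning software could fill the 3 data bytes like this. Big caveat: this page does not say how a cell maps to the offset byte, so the row-major layout (offset = row * columns + column) is purely an assumption, and the function name is hypothetical. The rows and columns would come from a GET_TABLESIZE response.

```c
#include <stdbool.h>
#include <stdint.h>

/* Fills the 3 data bytes of a SET_TABLEDATA packet for one cell.
 * ASSUMPTION: cells are stored row-major, offset = row * cols + col.
 * Returns false if the cell is outside the table or the offset does
 * not fit into the single offset byte. */
bool sipr_set_tabledata_payload(uint8_t payload[3], uint8_t table,
                                uint8_t rows, uint8_t cols,
                                uint8_t row, uint8_t col, uint8_t value)
{
    unsigned offset;

    if (row >= rows || col >= cols)
        return false;

    offset = (unsigned)row * cols + col;
    if (offset > 0xFF)
        return false;

    payload[0] = table;
    payload[1] = (uint8_t)offset;
    payload[2] = value;
    return true;
}
```

Since the offset is a single byte, a table addressed this way cannot have more than 256 cells (16x16 just fits).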
READ_SENSORS
- Sender: Tuner
- Sequence: Sequentially chosen by the sender.
- Type: 0x07
- Size: 2
- Response: READ_SENSORS
- Data: The first data byte specifies the number of data sets to return. 0 means stop sending and 255 means send continuously. The second byte specifies the delay between packets.
- Description: The READ_SENSORS packet is sent to request one or more sets of sensor data. The entire structure is returned in an effort to provide the maximum amount of data with a minimum amount of overhead. The repeat and delay bytes are very important for creating data logs as they allow the ECU to send the data at ECU determined intervals - more accurate and not subject to the stacking errors of the software generating the requests.
- Sender: ECU
- Sequence: Set the same as the packet it is responding to.
- Type: 0x07
- Size: to be determined
- Response: none
- Data: Data format as specified in the following table
offset | size | meaning |
Comments From Alexander Guy:
I think that byte stuffing isn't something that should be overlooked. It's easy to implement, and with it and frame 'flags', frame synchronization becomes trivial. Rather than counting bytes based on the alleged size of the frame, the frame continues until the next flag is hit. If no flag is hit before the maximum buffer size is reached, it can be assumed that the frame was invalid and frame reading can be reset until the next flag.
A rough implementation of an input function is as follows. This is the same HDLC-like framing that PPP uses over 8-bit async serial links:
```c
#define BYTE_FLAG   0x7e
#define BYTE_ESCAPE 0x7d
#define MAX_SIZE    80

struct {
    char rbuf[MAX_SIZE];
    int ridx;
    enum { INVALID, NORMAL, ESCAPE } rstate;
} commstate;

void input_byte(char input)
{
    int idx = commstate.ridx;

    if (input == BYTE_FLAG) {
        if ((commstate.rstate == NORMAL) && (idx > 0)) {
            /* XXX - Verify Checksum and Process Message Here */
        }
        commstate.ridx = 0;
        commstate.rstate = NORMAL;
        return;
    }

    if (commstate.rstate == INVALID)
        return;

    if (input == BYTE_ESCAPE) {
        commstate.rstate = ESCAPE;
        return;
    }

    if (commstate.rstate == ESCAPE) {
        input ^= 0x20;
        commstate.rstate = NORMAL;
    }

    commstate.rbuf[idx] = input;
    /* XXX - You Can Update Checksum Here */
    idx++;

    if (idx >= MAX_SIZE) {
        idx = 0;
        commstate.rstate = INVALID;
    }
    commstate.ridx = idx;
    return;
}
```
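For completeness, a matching output function might look roughly like the sketch below. It was not part of the original comment: the function name and the buffer-based interface are assumptions, and it only does the framing (any checksum would have to be appended to the message before stuffing). The flag and escape values and the XOR with 0x20 follow the input function above.

```c
#include <stddef.h>
#include <stdint.h>

#define BYTE_FLAG   0x7e
#define BYTE_ESCAPE 0x7d

/* Wraps len bytes from in[] into one flag-delimited, byte-stuffed frame.
 * Returns the number of bytes written to out[], or 0 if out_cap is too
 * small. The worst case needs 2 * len + 2 bytes. */
size_t stuff_frame(const uint8_t *in, size_t len, uint8_t *out, size_t out_cap)
{
    size_t n = 0;

    if (out_cap < 2)
        return 0;
    out[n++] = BYTE_FLAG;                     /* opening flag */

    for (size_t i = 0; i < len; i++) {
        uint8_t b = in[i];

        if (b == BYTE_FLAG || b == BYTE_ESCAPE) {
            if (n + 2 >= out_cap)             /* keep room for closing flag */
                return 0;
            out[n++] = BYTE_ESCAPE;
            out[n++] = (uint8_t)(b ^ 0x20);
        } else {
            if (n + 1 >= out_cap)
                return 0;
            out[n++] = b;
        }
    }

    out[n++] = BYTE_FLAG;                     /* closing flag */
    return n;
}
```

For example, the message 7e 01 7d goes out as 7e 7d 5e 01 7d 5d 7e.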
See also
- GenBoard/BinaryProtocol - predecessor (almost same)