This page will contain a detailed description - for developers - of how the Binary Protocol will be laid out.
In the initial discussion, mcell and hackish worked out ideas for how frames should be defined.
TODO: search for opendiag and see if there's something (standard) we could use (they apparently don't have a wiki, so it'll suck a bit to catch up); someone could subscribe and keep an eye?
Aims
- robustness: transmission errors handled gracefully
- compactness: reasonable data-footprint in flash and via network
- reasonable CPU usage (no compression, avoiding byte stuffing, reasonable checksum, no ECC)
- preferably the same code used for
  - logging into flash (the flash applies ECC internally, transparent to the user),
  - logging to network, e.g. to a PC,
  - runtime sensor data to another board,
  - maybe even for dumping configuration.
Solution
In order to make an error resilient system we have decided on the following.
- All data transferred within frames (length: 8..255 bytes).
- The frame contains as compact info as possible:
  - marker byte (e.g. \n)
  - frame_type word (proposed: 2 bytes; 1 byte is major and 1 minor. The minor might have some meaning, depending on the major number.)
  - production data. The layout is determined by the frame_type. See the frame_type_description below.
  - 2 CRC16 bytes as defined in the avrlibc CRC_update() routine. Note that this is independent of the CRC7 that is used when talking to MMC in SPI mode. The CRC16 would be highly redundant when using CAN, as CAN applies frame logic and error protection automatically (and there might be other similar cases). So the protocol must syntactically support switching error detection ON/OFF, but it must not accept a frame (without any previous setup) in which the error detection flag has been damaged to OFF.
- Since the frame_type determines the length of the frame, we don't need a separate length byte (although the minor can serve as the length in some cases, if the developer has no better idea). The description of the frame (frame_type_description) can be queried, so the relevant frames can be interpreted. The Byteflight standard allows a payload of at most 12 bytes, but I think we want to allow longer frames too (maybe max 255 bytes, although reasonable lengths are recommended). With shorter frames the overhead is worse, but latency can be better (if a priority packet scheduler is applied).
- several common frame_type_description-s must live in flash. They can be listed, and each can be queried for details.
- at least one (maybe more later) frame_type_description can be defined from the menu. This has the same structure as the ones in flash, but lives in SRAM (which is more expensive).
- the frame_type_description refers to variables that are boardtype+application specific. Note: 256 variables are not enough for a given application, so a 2-byte variable reference is likely. That also helps if arrays are needed. Should this be flat or hierarchical?
A frame is defined as
frame_marker | frame_type | data | CRC16 |
0x0A | FRAME TYPE | DATA | CRC |
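A minimal sketch of building such a frame, written in Python as it might appear in a PC-side tool. Several details are assumptions, not decisions recorded on this page: the CRC is taken to be the 0xA001 polynomial with initial value 0xFFFF (the variant avr-libc provides as `_crc16_update()`), the CRC covers frame_type and data but not the marker, and the CRC is sent low byte first. The names `crc16` and `encode_frame` are hypothetical.

```python
FRAME_MARKER = 0x0A  # "\n", as in the layout above


def crc16(data: bytes, crc: int = 0xFFFF) -> int:
    """Bitwise CRC16, polynomial 0xA001, initial value 0xFFFF (assumed variant)."""
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ 0xA001 if crc & 1 else crc >> 1
    return crc


def encode_frame(major: int, minor: int, payload: bytes) -> bytes:
    """Build marker | frame_type (major, minor) | data | CRC16 (low byte first)."""
    body = bytes([major, minor]) + payload
    crc = crc16(body)  # assumption: the marker byte is not covered by the CRC
    frame = bytes([FRAME_MARKER]) + body + bytes([crc & 0xFF, crc >> 8])
    if not 8 <= len(frame) <= 255:
        raise ValueError("frame length must be within 8..255 bytes")
    return frame


# Hypothetical frame_type 0x01/0x00 carrying three data bytes
frame = encode_frame(0x01, 0x00, bytes([0x10, 0x20, 0x30]))
```

With this CRC variant and byte order, running the CRC over frame_type, data and the two CRC bytes of an undamaged frame yields zero, which gives the receiver a cheap validity check.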
A frame_type_description payload is defined as
- length (special value might mean it is computed ?)
- description text
and a repetition of:
- data-type (=> determines data-length)
The frame_type_description is packaged in a frame (using a special frame_type) when transmitted or stored (tricky?).
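To make the idea concrete, here is a small sketch of a frame_type_description held in memory, where each data-type determines its own length and the payload length is computed from the list. The data-type names (`u8`, `s16`, ...), the variable names and the little-endian byte order are illustrative assumptions only; nothing here is a settled format.

```python
import struct
from dataclasses import dataclass

# Hypothetical data-type codes, mapped to Python struct format characters
DATA_TYPES = {"u8": "B", "s8": "b", "u16": "H", "s16": "h", "u32": "I"}


@dataclass
class FrameTypeDescription:
    text: str     # description text
    fields: list  # repetition of (variable_name, data_type) pairs

    def struct_format(self) -> str:
        # "<" = little-endian, no padding (byte order is an assumption)
        return "<" + "".join(DATA_TYPES[t] for _, t in self.fields)

    def payload_length(self) -> int:
        """Length computed from the data-types instead of being stored."""
        return struct.calcsize(self.struct_format())

    def decode(self, payload: bytes) -> dict:
        values = struct.unpack(self.struct_format(), payload)
        return {name: v for (name, _), v in zip(self.fields, values)}


desc = FrameTypeDescription(
    "runtime sample", [("rpm", "u16"), ("map", "u8"), ("clt", "s8")]
)
decoded = desc.decode(bytes([0xB8, 0x0B, 100, 0xF6]))
```

A receiver that has queried the description can interpret any frame of that frame_type without a length byte in the frame itself.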
Byte stuffing or not?
Byte stuffing (escaping the frame_marker byte) is not needed.
- only valid frames are processed in any case (we cannot avoid the CRC anyway - except maybe in flash)
- we save bytes without byte stuffing (especially nice in MMC flash - GenBoard/LoggerIntegration)
- we save computational power without byte stuffing in the normal (optimistic) case (when there are no communication or storage errors)
- in the worst case (if some bytes are lost or corrupted) it might take slightly more computation to find the valid frames without byte stuffing: if the payload also contains the frame_marker, some extra frame candidates will be examined (checksum calculation) and dropped once the frame_marker is found to be fake. Processing then continues with the next candidate. Note that even with byte stuffing it is unavoidable that fake frame candidates are sometimes examined, because fake frame_marker-s can occur out of the blue (corruption).
It seems we get better overall performance without byte stuffing.
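The candidate search described above could look roughly like the following sketch. It assumes a hypothetical lookup table from frame_type to payload length, the same assumed CRC16 variant (polynomial 0xA001, initial value 0xFFFF, low byte first, marker not covered), and a complete buffer rather than a live stream; `extract_frames` and `make_frame` are made-up names.

```python
FRAME_MARKER = 0x0A
# Hypothetical table: frame_type (major, minor) -> payload length in bytes
PAYLOAD_LENGTH = {(0x01, 0x00): 3}


def crc16(data: bytes, crc: int = 0xFFFF) -> int:
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ 0xA001 if crc & 1 else crc >> 1
    return crc


def make_frame(frame_type, payload: bytes) -> bytes:
    body = bytes(frame_type) + payload
    crc = crc16(body)
    return bytes([FRAME_MARKER]) + body + bytes([crc & 0xFF, crc >> 8])


def extract_frames(stream: bytes):
    """Return (valid frames, number of rejected candidates) without byte stuffing."""
    frames, rejected, pos = [], 0, 0
    while pos < len(stream):
        if stream[pos] != FRAME_MARKER:
            pos += 1
            continue
        frame_type = tuple(stream[pos + 1:pos + 3])
        length = PAYLOAD_LENGTH.get(frame_type)
        end = pos + 3 + (length or 0) + 2
        # Unknown type, truncated candidate or CRC mismatch: a fake marker.
        if length is None or end > len(stream) or crc16(stream[pos + 1:end]) != 0:
            rejected += 1
            pos += 1  # continue the search at the very next byte
            continue
        frames.append((frame_type, stream[pos + 3:end - 2]))
        pos = end  # a valid frame is skipped as a whole
    return frames, rejected


payload = bytes([0x0A, 0x22, 0x33])  # payload deliberately contains the marker
good = make_frame((0x01, 0x00), payload)
damaged = bytearray(good)
damaged[5] ^= 0x01                   # simulate a corrupted payload byte
frames, rejected = extract_frames(b"\x55" + bytes(damaged) + good)
```

In the undamaged case the marker inside the payload is never examined, because the valid frame is consumed whole. Only after the damaged frame fails its CRC does the parser fall back to inspecting (and rejecting) the marker byte inside its payload.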
Acknowledge
We had better support:
- Sending acknowledge
- Requesting acknowledge - perhaps this is redundant?
- The protocol could define frame_types which require acknowledgement. All non-returning messages (eg. "get" messages) should be acknowledged.
Some sequence number is required for this.
We must support the acknowledge piggybacked in a useful frame (that might possibly be empty) to avoid extreme acknowledge overhead. Overhead is one cost of robustness and security. It'll always be a trade-off - we need not even use a frame... a single bit or marker byte is plenty.
Let's look at 2 usecases for updating a value from tuning software. Steps 1 & 2 are about all that's done right now, but there's no verification that anything worked (correct me if I'm wrong!). That's not good enough.
Usecase without ack:
- Software sends command to set VE table [1][2] to 35.
- ECU receives command, and updates VE table [1][2].
- Software sends command to read VE table [1][2].
- ECU sends back VE table [1][2].
- Software tests that command worked.
Usecase with ack:
- Software sends command to set VE table [1][2] to 35.
- ECU receives command, updates VE table [1][2], and sends ack.
- Software receives ack and knows command worked.
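The usecase with ack can be sketched as a toy model, with the ECU and the tuning software as two Python objects and the link as a plain function call. The command code, the ack code, the message layout (command, sequence number, row, column, value) and the 4x4 table size are all invented for illustration.

```python
CMD_SET_VE = 0x10  # hypothetical command code
ACK = 0x06         # hypothetical acknowledge code


class Ecu:
    def __init__(self):
        self.ve_table = [[0] * 4 for _ in range(4)]  # toy-sized VE table

    def handle(self, message: bytes) -> bytes:
        command, seq, row, col, value = message
        if command == CMD_SET_VE:
            self.ve_table[row][col] = value
            return bytes([ACK, seq])  # the ack echoes the sequence number
        return b""                    # unknown command: no ack


class TuningSoftware:
    def __init__(self, ecu: Ecu):
        self.ecu = ecu
        self.next_seq = 0

    def set_ve(self, row: int, col: int, value: int) -> bool:
        seq = self.next_seq
        self.next_seq = (seq + 1) & 0xFF  # one-byte sequence number wraps
        reply = self.ecu.handle(bytes([CMD_SET_VE, seq, row, col, value]))
        # The command counts as done only if the ack carries our sequence number.
        return reply == bytes([ACK, seq])


ecu = Ecu()
software = TuningSoftware(ecu)
worked = software.set_ve(1, 2, 35)
```

The sequence number is what lets the software tell the ack for this command apart from a late ack for an earlier one.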
Are we trying too hard to reinvent a reliable message passing protocol? If we're talking about using CRC functions and ack and so on anyway... CAN already works, and can be implemented over a 1-wire interface (see SAE J2411). And we need it for next gen hardware. We should get it figured out on some level now.
Some bytes are lost forever - this condition must be handled gracefully in any case.
We expect there to be a defined timeout where the receiver will give up and request that a sender re-send the frame (issuing the same - idempotent - command again). Note that for the PC-GenBoard communication the PC will take care of this. The GenBoard doesn't care if the PC does not get the reply (the PC will request again).
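A rough sketch of the PC side of this: re-send the same idempotent command until a reply arrives or the attempts run out. The real timeout is simulated here by the fake ECU returning `None` for a lost reply; `LossyEcu`, `send_with_retry` and the command tuples are hypothetical.

```python
class LossyEcu:
    """Toy ECU whose first `drop_first` replies get lost on the way back."""

    def __init__(self, drop_first: int):
        self.drops_left = drop_first
        self.values = {}

    def request(self, command):
        op, key, value = command
        if op == "set":
            self.values[key] = value  # idempotent: repeating it changes nothing
        if self.drops_left > 0:
            self.drops_left -= 1
            return None               # reply lost, the PC would see a timeout
        return ("ack", key, self.values.get(key))


def send_with_retry(ecu, command, max_attempts: int = 3):
    """Issue the same command again after each (simulated) timeout."""
    for attempt in range(1, max_attempts + 1):
        reply = ecu.request(command)
        if reply is not None:
            return reply, attempt
    raise TimeoutError("no reply after %d attempts" % max_attempts)


reply, attempts = send_with_retry(LossyEcu(drop_first=2), ("set", "ve[1][2]", 35))
```

Note that the ECU applies the command on every attempt, which is harmless only because the command is idempotent. That is why the non-idempotent commands need to be listed.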
Please list any operations (firmware commands) that are not idempotent (issuing again can have side-effects).
- mcb.. (going into bootloader mode: repeating it unnecessarily can trigger an unwanted action in the bootloader, although the chance is low)
- mcB.. change baudrate but remain in ECM application
- ...
---
See also:
- GenBoard/LoggerIntegration/DataFormat (TODO: cleanup, this page is newer)
- GenBoard/LoggerIntegration