Friday, October 12, 2012

software serial, a possible alternative implementation

(All code on this page is (c) 2012 Joost Yervante Damad and licenced under the GPL version 3)

Everybody at some point at least temporary needs more serial interfaces  on their Arduino/ATMega/... micro-controller.
The usual solution is to use the software serial library. The existing implementations have some serious limitations so I'm working on an alternative solution.

First let's have a look at the existing ones and their features and limitations.

Before Arduino 1.0 there was a version which I will call "Old Software Serial".
Starting with Arduino 1.0 a library called "New Software Serial" was integrated.

Old Software Serial


This was written in 2006 by David A. Mellis.

Sending is done via setting the transmit pin HIGH and LOW and  the timing of the bits is done by means of the "delayMicroseconds" call which does busy looping.

Receiving is done by calling the "read" function which will busy wait for the start of the byte and then read the bits where the timing between the bits is again done via delayMicroseconds.


It does not use any interrupts.


This works but has severe limitations:

  • sending data keeps the CPU busy
  • receiving data also keeps the CPU busy
  • if your program is not actively reading at the right time you will loose bytes
  • only one instance can be active at once as reading is done in the main program

New Software Serial


This started it's life as AFSerial done by Lady Ada from Adafruit industries. It was then picked up by Mikal Hart from Arduiniana and there were bugfixes and improvements by various other people (Paul Stoffregen from the Teensy, ...) In Arduino 1.0 it was integrated as the "SoftwareSerial" library.

Sending works basically the same as in the Old Software Serial.

Receiving is interrupt based.
The receive pin is registered for pin change interrupts. This means every time the pin changes from LOW to HIGH or the other direction an interrupt is called.
In practice when a pin change happens and it is a HIGH->LOW transition this indicates a start bit and then the byte is read by means of short delays implemented via a busy loop. The byte is then inserted into a circular buffer. Reading from the main program just reads from that buffer.

The big improvement here is that reading does not require blocking the main program. 

This works better then the Old Software Serial but also has limitations:
  • sending data still keeps the CPU busy
  • it only supports one active receiver; you have to change the active receiver with the listen() call if you have multiple instances of SoftwareSerial (this is actually a limitation caused by how reading a byte works)
  • reading a byte BLOCKS busy looping within the interrupt handler
That last in particular is quite nasty. It means that while the library is reading a byte, no other interrupts can be handled. This Includes interrupts for a receive of another software serial instance or even another hardware serial interface! Normally the goal should be to have interrupt handlers do as little as possible.

JD Software Serial


Giving comments on other people's code is easy so let's put my money where my mouth is and try to come up with a better solution.

For now I'm focusing on the receive part, for the sending part I have some ideas about as well, but that is not that important as sending doesn't block interrupts and people expect it to be synchronous.

The basic idea is to also use the pin change interrupt, but instead of blocking to read an entire byte, look at the HIGH->LOW and LOW-HIGH transitions and the timing of when they happen and find out the bytes like that.

This approach also has some shortcomings and caveats, I'll get to that later.

A first try


First I'm going to write a simple interrupt handler that just puts the transitions in a 32bit circular buffer annotating each bit with it's timestamp. All this code is not optimized it is just to do some playing with the concept.

The type for the buffer and data fields for the interrupt:

 1 typedef uint32_t stamp_t;
 2 typedef struct {
 3   bool rise;
 4   stamp_t timestamp;
 5 } bitchange;
 6 static bitchange bitchanges[32] = { { 0,0 } };
 7 static int datapos = 31;
 8 static int freepos = 0;
 9 static volatile uint8_t * receivePortRegister = 0;
10 static uint8_t receiveBitMask = 0;

The interrupt handler:

 1 inline void handle_interrupt()
 2 {
 3   // if our bit change buffer is full, just drop this bit
 4   if (datapos == freepos) { return; }
 5   const uint8_t pin_data = *receivePortRegister & receiveBitMask;
 6   bitchanges[freepos].rise = pin_data > 0;
 7   bitchanges[freepos].timestamp = mytimer();
 8   freepos = (freepos + 1) & 31;
 9 }
10 
11 ISR(PCINT0_vect)
12 {
13   handle_interrupt();
14 }

The mytimer() function uses the timer0 facility which is already in use by the Arduino core. It uses the lower 24 bit of the timer0 overflow counter together with the fast changing content  of the TCNT0 register.

1 extern unsigned long timer0_overflow_count;
2 static inline stamp_t mytimer() {
3   const stamp_t ov = (timer0_overflow_count & 16777215l) << 8;
4   const uint8_t t = TCNT0;
5   return ov+t;
6 }

My setup uses two Arduino boards. Arduino one is sending a 'U' every second on it's TX pin, meaning it is using the hardware serial. It is clocked at 31250 baud.
That TX pin is connected to pin 2 on the second Arduino. The second arduino is connected with it's normal serial interface to a PC via USB at 115200 baud.

The setup. On the left the sender, on the right the receiver. Data is going via the green wire.


To get started I just look at the bit transitions in a busy loop.

 1 void loop()
 2 {
 3   if ((datapos + 1)%32 != freepos) {
 4     const bool rise = bitchanges[datapos].rise;
 5     const stamp_t timestamp = bitchanges[datapos].timestamp;
 6     ATOMIC_BLOCK(ATOMIC_RESTORESTATE) {
 7       datapos = (datapos + 1) & 31;
 8     }
 9     Serial.print(timestamp); Serial.print(' ');   Serial.println(rise?'1':'0'); 
10 }

That works fine. Evert time it sees a byte it prints:

11835708 1
12085644 0
12085652 1
12085660 0
12085668 1
12085676 0
12085684 1
12085692 0
12085700 1
12085708 0

First line is the stop bit of the previous byte.
Second line is the start bit of this byte.
The remaining lines are the data bits.
The data bits give 10101010 which is LSB (Least significant bit) to MSB (most significant bit) meaning it represents 01010101 which is 0x55 which is 85 decimal which is the character 'U'. Now the reason I'm using that particular byte is that it has an edge on every bit, which makes it an easy case.

In general there are a couple of special cases:

  1. two or more times the same bit in one byte
  2. initially starting out in the middle of a byte
  3. ending with one or more HIGH bits meaning there is no stop bit interrupt
1 is easy to fix: just look at the time-stamps and insert extra bits with the previous bit value. The amount of bits can be determined from the difference of the current time-stamp and the previous time-stamp.
2 just implies that we need to be clever in trying to find the start bit in an existing stream.

3 is the tough one. The solution I'm implementing is that when a start bit is received and I was actually still expecting a data bit the remainder of the bits are filled with HIGH because that is what they might be.
However that does mean that the byte is only known when the next byte starts, which could be quite late. In practice though for the typical ASCII values this is not an issue as the MSB is 0, and the stop bit is always seen.

Get those bytes


Let's give decoding the bytes a stab. After quite some testing/tweaking I came with the following test code that nicely dumps the received byte and works for the whole 0 till 255 (0xFF) range:

 1 void loop()
 2 {
 3   if ((datapos + 1)%32 != freepos) {
 4     // first get data out of array
 5     const bool rise = bitchanges[datapos].rise;
 6     const stamp_t timestamp = bitchanges[datapos].timestamp;
 7     // only then update datapos to avoid getting overwritten data
 8     ATOMIC_BLOCK(ATOMIC_RESTORESTATE) {
 9       datapos = (datapos + 1) & 31;
10     }
11     // Serial.print(timestamp); Serial.print(' '); Serial.println(rise?'1':'0');
12     if (want_start) {
13       if (!rise) {
14         if (start_delay > 0) {
15           const stamp_t d = stampdiff(timestamp, prev_ts);
16           if (d < start_delay) {
17             // to soon, skip
18           } else {
19             want_start = false;
20             bit_count = 0;
21             recv_byte = 0;
22             start_delay = 0;
23             //Serial.print('T');
24           }
25         } else {
26           // \/ we actually receive what could be a start bit
27           want_start = false;
28           bit_count = 0;
29           recv_byte = 0;
30           //Serial.print('S');
31         }
32       } else {
33         //Serial.print('Q');
34         // /\ probably a stop bit; just discard and wait for next start
35       }
36     } else {
37       // we are receiving a byte
38       const stamp_t d = stampdiff(timestamp, prev_ts);
39       // Serial.println(d);
40       if (d > 80) {
41         // if it takes to long to get a bit, the byte has ended
42         if (!rise) {
43           // \/ we find what is possibly a new start bit
44           // this means we need to fill the remaining bits with
45           // /\ meaning a 1
46           const int num_bits_remaining = 8 - bit_count;
47           // Serial.print("+"); Serial.print(num_bits_remaining); Serial.print('-');
48           for (int i = 0; i  < num_bits_remaining; ++i) {
49             recv_byte >>= 1;
50             recv_byte += 128;
51           }
52           Serial.print('$');
53           Serial.println((int)recv_byte, HEX);
54           want_start = false;
55           bit_count = 0;
56           recv_byte = 0;
57         } else {
58           // /\ we're out of sync, restart
59           //Serial.println('*');
60           bit_count = 0;
61           want_start = true;
62           start_delay = 80;
63         }
64       } else {
65         // new edge within the same byte
66         // if the edge comes after double the expected time or more
67         // we've skipped a bunch of bit of the previous value
68         const int8_t prev_num_bits = ((d+1) / 8) - 1;
69         // Serial.print("+"); Serial.print(prev_num_bits); Serial.print('-');
70         // push previous bits
71         for (int8_t i = 0; i < prev_num_bits; ++i) {
72           recv_byte >>= 1;
73           recv_byte += rise ? 0 : 128; // inverse of what we're going to now
74           // Serial.print(rise ? 'O' : '|');
75           ++bit_count;
76         }
77         // push current bit
78         if (bit_count < 8) {
79           // Serial.print(rise ? '1' : '0');
80           recv_byte >>= 1;
81           recv_byte += rise ? 128 : 0;
82           ++bit_count;
83         }
84         if (bit_count >= 8) {
85           // we reached the end of the byte
86           Serial.print('#');
87           Serial.println((int)recv_byte, HEX);
88           want_start = true;
89         }
90       }
91     }
92     prev_ts = timestamp;
93   }
94 }


This actually somewhat works :)

Now comes the tough part to rework it to do the byte decoding in the interrupt handling and work with a circular buffer like New Software Serial. After that it needs to be optimized.

Also tables need to be made to accommodate different baud rates then the fixed 31250 baud used here.
An interesting thing is that because I'm already looking at time-stamps anyway, implementing auto-baud should not be hard!

Another future addition could be to actually sample the pin a second time shortly after the first time and if it's not the same discard the bit. This would work as a simple noise filtering.

To be continued in the near future!

10 comments:

Matthijs Kooijman said...

Did you ever finish this project? I'm planning a project which also needs multiple Software Serial ports being monitored at the same time, so I'd hate to duplicate work :-)

Matthijs Kooijman said...

Did you ever finish this project? I'm planning a project which also needs multiple software serial ports to be monitored at the same time and I'd hate to duplicate work :-)

Unknown said...

@Matthijs Kooijman: no, I never really finished it, it was more a proof of concept in the end. I ended up using other solutions for the project I was working on at the time.

Unknown said...

P.S.: later I also found altsoftserial available at: http://www.pjrc.com/teensy/td_libs_AltSoftSerial.html
If you can afford losing a timer and using the specific ports it works nicely.

Matthijs Kooijman said...

Yeah, I just stumbled upon AltSoftSerial as well, but that only supports a single instance (since it uses the "timer1 capture pin" for input and some other special pin for output...). So that's not going to work for me...

I might create something based on your proof-of-concept, then. Can I use and redistribute your code under an open license (e.g., MIT)?

Unknown said...

Sure, if you want I can send you the full files. Perhaps they're best licensed under the GPL just like the existing SoftwareSerial libraries.

Unknown said...

On the other hand, if you want them MIT licensed, I'd be happy to re-license the code.

Matthijs Kooijman said...

Full files would be nice, could you send them to matthijs@stdin.nl ?

As for licensing, I usually prefer MIT, since it's more flexibile for others (and can still be used in a GPL project as well).

Unknown said...

the example code is available as files at https://github.com/andete/blog-stuff/tree/master/SOFTSER1

Matthijs Kooijman said...

Unfortunately, this approach didn't work out for me.

I did a few experiments with your code and came to the conclusion that,
at least for my application, it isn't feasible to do software serial on
multiple GPIO pins. The overhead of interrupt calls is considerable and
I expect the implementation will be very sensitive to delays caused by
other interrupt handlers (e.g., I was missing bits with a single channel
with minimal processing running on 20k baud due to the timer interrupt
already). With multiple channels receiving, the chances of missing bits
will get a lot bigger.

Of course I can do retransmits, but then the latency is going to be a
lot bigger and I'm afraid I'll end up retransmitting everything all the
time, which I can't really handle in my application.

I'm now investigation a single-wire handshaking protocol instead, so I
can poll the lines to see which peers are ready to send something and
explicitely ack before they start sending, instead of having to pay
attention and start receiving right away whenever one decides to send.
So far, this works pretty well, but it of course only works when you can
control both side of the communication channel...