RFM69 hat library receive lockup debugging

Every so often my Enums & Masks test harness locked up and stopped receiving messages from my test rig. This seemed to happen more often when the send functionality of my library was not being used.

easysensors RFM69HCW test rig

After 5 to 30 minutes (a couple of times it took 5 to 8 hours overnight) the application stopped receiving messages and wouldn’t resume until the application was restarted (or the device reset), or the RegOpMode Mode field was briefly switched to sleep and then back to receive.

private void InterruptGpioPin2_ValueChanged(GpioPin sender, GpioPinValueChangedEventArgs args)
{
    Debug.WriteLine("InterruptGpioPin2_ValueChanged");

    rfm69Device.SetMode(Rfm69HcwDevice.RegOpModeMode.Sleep);
    rfm69Device.SetMode(Rfm69HcwDevice.RegOpModeMode.Receive);
}

After re-reading the Semtech SX1231 datasheet, one of the other possible solutions involved writing to the RestartRx bit of RegPacketConfig2.

RegPacketConfig2 configuration options

Of the different approaches, I found this code was the most reliable way of restarting packet reception.

private void InterruptGpioPin3_ValueChanged(GpioPin sender, GpioPinValueChangedEventArgs args)
{
    Debug.WriteLine("InterruptGpioPin3_ValueChanged");

    // Read RegPacketConfig2 (0x3D), set the RestartRx bit (0x04), then write it back
    byte regpacketConfig2 = rfm69Device.RegisterManager.ReadByte(0x3d);
    regpacketConfig2 |= (byte)0x04;
    rfm69Device.RegisterManager.WriteByte(0x3d, regpacketConfig2);
}

I had noticed this code in the Low Power Lab library and wondered what it was for. Interestingly, the HopeRF library didn’t appear to have any code like this to restart reception.

void RFM69::send(uint16_t toAddress, const void* buffer, uint8_t bufferSize, bool requestACK)
{
  writeReg(REG_PACKETCONFIG2, (readReg(REG_PACKETCONFIG2) & 0xFB) | RF_PACKET2_RXRESTART); // avoid RX deadlocks
  uint32_t now = millis();
  while (!canSend() && millis() - now < RF69_CSMA_LIMIT_MS) receiveDone();
  sendFrame(toAddress, buffer, bufferSize, requestACK, false);
}

// should be called immediately after reception in case sender wants ACK
void RFM69::sendACK(const void* buffer, uint8_t bufferSize) {
  ACK_REQUESTED = 0;   // TWS added to make sure we don't end up in a timing race and infinite loop sending Acks
  uint16_t sender = SENDERID;
  int16_t _RSSI = RSSI; // save payload received RSSI value
  writeReg(REG_PACKETCONFIG2, (readReg(REG_PACKETCONFIG2) & 0xFB) | RF_PACKET2_RXRESTART); // avoid RX deadlocks
  uint32_t now = millis();
  while (!canSend() && millis() - now < RF69_CSMA_LIMIT_MS) receiveDone();
  SENDERID = sender;    // TWS: Restore SenderID after it gets wiped out by receiveDone()
  sendFrame(sender, buffer, bufferSize, false, true);
  RSSI = _RSSI; // restore payload RSSI
}

void RFM69::receiveBegin() {
  DATALEN = 0;
  SENDERID = 0;
  TARGETID = 0;
  PAYLOADLEN = 0;
  ACK_REQUESTED = 0;
  ACK_RECEIVED = 0;
#if defined(RF69_LISTENMODE_ENABLE)
  RF69_LISTEN_BURST_REMAINING_MS = 0;
#endif
  RSSI = 0;
  if (readReg(REG_IRQFLAGS2) & RF_IRQFLAGS2_PAYLOADREADY)
    writeReg(REG_PACKETCONFIG2, (readReg(REG_PACKETCONFIG2) & 0xFB) | RF_PACKET2_RXRESTART); // avoid RX deadlocks
  writeReg(REG_DIOMAPPING1, RF_DIOMAPPING1_DIO0_01); // set DIO0 to "PAYLOADREADY" in receive mode
  setMode(RF69_MODE_RX);
}
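The expression (readReg(REG_PACKETCONFIG2) & 0xFB) | RF_PACKET2_RXRESTART that appears in all three functions is a read-modify-write: it clears bit 2 with the 0xFB mask, sets it again, and leaves the other seven bits of the register alone. Here is a minimal sketch of just that step. FakeRadio and restartRx are hypothetical names used for illustration and are not part of either library. The fake simply stores whatever is written, so it does not model what the chip does with the bit after the write.

```cpp
#include <cassert>
#include <cstdint>

// Register address and mask, matching the values used in the code above.
constexpr uint8_t REG_PACKETCONFIG2 = 0x3D;
constexpr uint8_t RF_PACKET2_RXRESTART = 0x04;  // RestartRx, bit 2

// Hypothetical stand-in for the radio's register file (illustration only).
struct FakeRadio {
    uint8_t regs[0x80] = {};

    uint8_t readReg(uint8_t addr) const {
        assert(addr < 0x80);  // register addresses are 7 bits wide
        return regs[addr];
    }

    void writeReg(uint8_t addr, uint8_t value) {
        assert(addr < 0x80);
        regs[addr] = value;
    }
};

// Read-modify-write: clear bit 2 (mask 0xFB), set it again, and keep the
// remaining bits of RegPacketConfig2 exactly as they were read.
inline uint8_t restartRx(FakeRadio& radio) {
    const uint8_t current = radio.readReg(REG_PACKETCONFIG2);
    const uint8_t updated =
        static_cast<uint8_t>((current & 0xFB) | RF_PACKET2_RXRESTART);
    radio.writeReg(REG_PACKETCONFIG2, updated);
    return updated;
}
```

With the register holding 0x12 beforehand, restartRx writes back 0x16: only bit 2 changes.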

In the debug output you can see that the clock frequencies of the two test devices are slightly different. Every so often they transmit close enough together to corrupt one of the message payloads, which causes the deadlock.

22:20:26.379 Address 0X99 10011001
22:20:26 Received 14 byte message Hello World:10
22:20:26.561 RegIrqFlags2 01100110
22:20:26.576 Address 0X66 01100110
22:20:26 Received 14 byte message Hello World:26
.22:20:27.501 RegIrqFlags2 01100110
22:20:27.517 Address 0X99 10011001
22:20:27 Received 14 byte message Hello World:11
22:20:27.699 RegIrqFlags2 01100110
22:20:27.714 Address 0X66 01100110
22:20:27 Received 14 byte message Hello World:27
...............................
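The RegIrqFlags2 values in the log are easier to read once they are decoded bit by bit. This is a rough sketch, assuming the bit layout from the RegIrqFlags2 table in the datasheet (bit 0 is left out). parseBits and decodeIrqFlags2 are hypothetical helper names, not functions from my library or the Low Power Lab one.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Convert an 8-character binary string from the log (e.g. "01100110") to a byte.
inline uint8_t parseBits(const std::string& bits) {
    assert(bits.size() == 8);
    uint8_t value = 0;
    for (char c : bits) {
        assert(c == '0' || c == '1');
        value = static_cast<uint8_t>((value << 1) | (c == '1' ? 1 : 0));
    }
    return value;
}

// Return the names of the flags set in a RegIrqFlags2 value, highest bit first.
inline std::vector<std::string> decodeIrqFlags2(uint8_t value) {
    struct Flag { uint8_t mask; const char* name; };
    static const Flag flags[] = {
        {0x80, "FifoFull"},
        {0x40, "FifoNotEmpty"},
        {0x20, "FifoLevel"},
        {0x10, "FifoOverrun"},
        {0x08, "PacketSent"},
        {0x04, "PayloadReady"},
        {0x02, "CrcOk"},
    };
    std::vector<std::string> names;
    for (const Flag& flag : flags) {
        if (value & flag.mask) {
            names.push_back(flag.name);
        }
    }
    return names;
}
```

For the 01100110 value that appears in every line of the log above, this gives FifoNotEmpty, FifoLevel, PayloadReady and CrcOk, which is what a normally received packet should look like.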

Now I need to integrate the fix back into the send and receive message methods of my code, then stress test the library with even more client devices.