LowPowerLab Forum

Hardware support => Low Power Techniques => Topic started by: ChemE on January 12, 2017, 09:50:06 PM

Title: Speeding Up The RFM69 Library
Post by: ChemE on January 12, 2017, 09:50:06 PM
Has anyone fooled with writing smaller tighter code to speed up Felix's library in order to save power?  Without any major heroics I was able to reduce the time needed to transmit a 6-byte payload from 1,300us to 1,000us by essentially just rewriting Select, Unselect, ReadReg, and WriteReg.  Once I'm happy with my version of the library I'll share it here for everyone but it seems there is a decent power saving potential just in speeding up the code.
Title: Re: Speeding Up The RFM69 Library
Post by: ChemE on January 12, 2017, 11:28:23 PM
Okay wow, now I've got the actual transmission down to 476us just by ditching the bloated and slow digitalRead() in sendFrame in favor of

Code: [Select]
PIND & _BV(_interruptPin)

I knew it was too good to be true.  The speedup was just an error that made the while loop go false immediately, so the code never actually waited for the transmission to finish.
Title: Re: Speeding Up The RFM69 Library
Post by: joelucid on January 13, 2017, 02:20:40 AM
I did a lot of this stuff to make the library smaller to fit in the bootloader. BTW a good way to save power is to sleep the CPU while waiting for packetsent or payloaddone. This is particularly useful for coin cells where peak current limits battery life.
Title: Re: Speeding Up The RFM69 Library
Post by: ChemE on January 13, 2017, 07:31:31 AM
Quote from: joelucid
I did a lot of this stuff to make the library smaller to fit in the bootloader. BTW a good way to save power is to sleep the CPU while waiting for packetsent or payloaddone. This is particularly useful for coin cells where peak current limits battery life.

Geez I bet you did then!  I forgot that the bootloader knows how to work the radio.  Thanks for the tip on sleeping.  It seems to take around 500us between switching to Tx and getting a signal back on DIO0.  I hope the radio isn't sucking down 17mA for that whole period or my calculated power budget is far from accurate.
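To put a number on that worry, here is a rough sketch of the arithmetic. It takes the 17mA and 500us figures from the post above and assumes, purely for illustration, one transmission every 8 seconds; the function names are made up for this example.

```c
#include <stdint.h>

/* Charge drawn by a constant current over a time window.
 * mA x us = nC (1e-3 A x 1e-6 s = 1e-9 C). */
uint32_t charge_nC(uint32_t current_mA, uint32_t time_us) {
    return current_mA * time_us;
}

/* Average current contributed by one such burst repeated every
 * period_s seconds. nC per second = nA (integer division, rounds down). */
uint32_t average_nA(uint32_t burst_nC, uint32_t period_s) {
    return burst_nC / period_s;
}
```

Under those assumptions, charge_nC(17, 500) is 8500 nC per transmission and average_nA(8500, 8) is about 1062 nA, i.e. roughly 1 uA of average current from the radio's Tx window alone.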
Title: Re: Speeding Up The RFM69 Library
Post by: ChemE on January 13, 2017, 02:48:09 PM
Quote from: ChemE
...Without any major heroics I was able to reduce the time needed to transmit a 6-byte payload from 1,300us to 1,000us by...

I spent some time reading Joe's OTA bootloader thread and decided, like he did, to dispense with everything: all the waits, whiles, etc., and just bang some bits.  The same transmission (6-byte payload with 7 bytes of preamble/sync/len/address) now only takes 600 microseconds.  Not bad considering the actual transmission of 104 bits at 300 kbps should take about 347 microseconds and I have to write all those bits to the FIFO before the transmission can begin.  Shame I can't get the first byte or two into the FIFO, start the transmission and then continue to fill the FIFO while the transmission is in progress.  It takes around 28 microseconds to move a byte to the FIFO so I could in theory save another 300 microseconds if this were possible.  I need to clean this code up for a day or two but I'll be sharing this.  This could be extended back to do ACKs and whatnot but right now it just turns on the radio, fills the FIFO, and sends it out with no retry or ACK request.  I like to get down to bare bones before I add back.
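For anyone who wants to redo this arithmetic with their own packet size or bit rate, a minimal sketch follows. The helper names are hypothetical and the inputs are the ones quoted in the post: 13 bytes on air, 300 kbps, and about 28 microseconds per FIFO byte.

```c
#include <stdint.h>

/* Time on air in microseconds for n_bytes at bitrate_bps,
 * rounded to the nearest microsecond. */
uint32_t airtime_us(uint32_t n_bytes, uint32_t bitrate_bps) {
    uint64_t bits = (uint64_t)n_bytes * 8u;
    return (uint32_t)((bits * 1000000u + bitrate_bps / 2u) / bitrate_bps);
}

/* Time to clock n_bytes into the FIFO at a measured per-byte cost. */
uint32_t fifo_fill_us(uint32_t n_bytes, uint32_t per_byte_us) {
    return n_bytes * per_byte_us;
}
```

airtime_us(13, 300000) comes out at 347 microseconds, and fifo_fill_us(11, 28) at 308 microseconds, which is where the "another 300 microseconds" of possible savings comes from if the FIFO writes could overlap the transmission.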
Title: Re: Speeding Up The RFM69 Library
Post by: Felix on January 13, 2017, 02:52:46 PM
Quote from: ChemE
I'll be sharing this.
Very nice sir, sharing is caring!
Title: Re: Speeding Up The RFM69 Library
Post by: WhiteHare on January 13, 2017, 04:33:33 PM
Quote from: ChemE
Shame I can't get the first byte or two into the FIFO, start the transmission and then continue to fill the FIFO while the transmission is in progress.

The datasheet refers to a similar technique for handling large packets, but I suppose it might (?) work with shorter packets like yours also:

Quote
5.5.6. Handling Large Packets
When Payload length exceeds FIFO size (66 bytes) whether in fixed, variable or unlimited length packet format, in addition to PacketSent in Tx and PayloadReady or CrcOk in Rx, the FIFO interrupts/flags can be used as described below:
For Tx:
FIFO can be prefilled in Sleep/Standby but must be refilled "on-the-fly" during Tx with the rest of the payload.
1) Prefill FIFO (in Sleep/Standby first or directly in Tx mode) until FifoThreshold or FifoFull is set
2) In Tx, wait for FifoThreshold or FifoNotEmpty to be cleared (i.e. FIFO is nearly empty)
3) Write bytes into the FIFO until FifoThreshold or FifoFull is set.
4) Continue to step 2 until the entire message has been written to the FIFO (PacketSent will fire when the last bit of the packet has been sent).
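
The four steps above can be sketched as a small host-side simulation. This is not driver code: sim_fifo and send_large are hypothetical names, the "radio" is modelled as simply draining one byte per loop pass, and the level flag is assumed to be set while the byte count strictly exceeds the threshold.

```c
#include <stddef.h>
#include <stdint.h>

#define FIFO_SIZE 66  /* RFM69 FIFO depth in bytes, per the datasheet quote */

typedef struct {
    uint8_t buf[FIFO_SIZE];
    size_t head, count;   /* ring buffer state */
    size_t threshold;     /* level flag is set while count > threshold */
} sim_fifo;

static int fifo_level_flag(const sim_fifo *f) { return f->count > f->threshold; }

static void fifo_push(sim_fifo *f, uint8_t b) {
    f->buf[(f->head + f->count) % FIFO_SIZE] = b;
    f->count++;
}

static uint8_t fifo_pop(sim_fifo *f) {
    uint8_t b = f->buf[f->head];
    f->head = (f->head + 1) % FIFO_SIZE;
    f->count--;
    return b;
}

/* Pushes msg through the simulated FIFO and records what the "radio"
 * sends in out[]. Returns the number of bytes sent. */
size_t send_large(const uint8_t *msg, size_t len, size_t threshold, uint8_t *out) {
    sim_fifo f = { {0}, 0, 0, threshold };
    size_t written = 0, sent = 0;

    if (f.threshold >= FIFO_SIZE) f.threshold = FIFO_SIZE - 1;  /* never overfill */

    /* Step 1: prefill until the level flag is set or the message runs out. */
    while (written < len && !fifo_level_flag(&f)) fifo_push(&f, msg[written++]);

    while (sent < len) {
        /* Step 2: in Tx the radio drains the FIFO; model one byte per pass. */
        if (f.count > 0) out[sent++] = fifo_pop(&f);
        /* Steps 3 and 4: once the flag clears, top the FIFO back up. */
        while (written < len && !fifo_level_flag(&f)) fifo_push(&f, msg[written++]);
    }
    return sent;
}
```

Calling send_large() with a 100-byte message and a threshold of 15 moves all 100 bytes through the 66-byte FIFO in order, which is the point of the technique. On real hardware the drain rate is set by the bit rate, so the refill loop has to keep up with it.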
Title: Re: Speeding Up The RFM69 Library
Post by: ChemE on January 13, 2017, 04:44:36 PM
Okay, I've satisfied my #define code art urges at least with the most fundamental routines.

Code: [Select]
#define         SS_PIN                  PB2
#define         interruptPin            2
#define         SELECT                  noInterrupts(); SS_WRITE_LOW
#define         UNSELECT                SS_WRITE_HIGH; interrupts()  //if (!_inISR) interrupts()
#define         SS_WRITE_LOW            PORTB &= ~(1<<SS_PIN)
#define         SS_WRITE_HIGH           PORTB |= 1<<SS_PIN
#define         WAIT_WHILE_SPI_BUSY     asm volatile("nop"); while (!(SPSR & 1<<SPIF))

uint8_t readReg(uint8_t addr) {
  SELECT;
  SPDR = ( addr & 0x7F );
  WAIT_WHILE_SPI_BUSY;
  SPDR = ( 0 );
  WAIT_WHILE_SPI_BUSY;
  UNSELECT;
  return SPDR;
}

void writeReg(uint8_t addr, uint8_t value) {
  SELECT;
  SPDR = ( addr | 0x80 );
  WAIT_WHILE_SPI_BUSY;
  SPDR = ( value );
  WAIT_WHILE_SPI_BUSY;
  UNSELECT;
}

static inline uint8_t SPI_XFER(uint8_t data) {
  SPDR = data;
  WAIT_WHILE_SPI_BUSY;
  return(SPDR);
}

This should more or less play ball with sendFrame in Felix's library, except one needs to change SPI.transfer to SPI_XFER.  A lot of execution time was going to thrashing the SPI settings and jumping into and back from routines.  What Felix wrote is obviously much safer than this, but in a well controlled loop where one isn't changing SPI settings, there is no need to save them, change them, and restore them each time we read and write a byte.
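For readers wondering about the addr & 0x7F and addr | 0x80 in the routines above: the first byte of each SPI access carries a 7-bit register address, and the top bit selects write (1) or read (0). A tiny sketch with hypothetical helper names:

```c
#include <stdint.h>

/* First SPI byte for a register read: top bit cleared. */
uint8_t rfm69_read_cmd(uint8_t addr)  { return (uint8_t)(addr & 0x7F); }

/* First SPI byte for a register write: top bit set. */
uint8_t rfm69_write_cmd(uint8_t addr) { return (uint8_t)(addr | 0x80); }
```

So a burst write to the FIFO at address 0x00 starts with 0x80, which is the REG_FIFO | 0x80 that sendFrame sends before the payload bytes.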
Title: Re: Speeding Up The RFM69 Library
Post by: ChemE on January 13, 2017, 04:49:28 PM
Full Code:

Code: [Select]
#include "HTU21D.h"
#include "LowPower.h"
#include <RFM69.h>
#include <RFM69registers.h>
#include <SPI.h>

// Define various ADC prescaler
#define    ADC_PS_16    (1 << ADPS2)
#define    ADC_PS_32    ((1 << ADPS2) | (1 << ADPS0))    // outer parentheses keep the macro safe inside larger expressions
#define    ADC_PS_64    ((1 << ADPS2) | (1 << ADPS1))
#define    ADC_PS_128    ((1 << ADPS2) | (1 << ADPS1) | (1 << ADPS0))
#define sbi(sfr, bit) (_SFR_BYTE(sfr) |= _BV(bit))

//*********************************************************************************************
// *********** IMPORTANT SETTINGS - YOU MUST CHANGE/CONFIGURE TO FIT YOUR HARDWARE *************
//*********************************************************************************************
#define         NETWORKID               100  //the same on all nodes that talk to each other - 170 is 10101010 DC free value
#define         RECEIVER                1    //unique ID of the gateway/receiver
#define         SENDER                  2
#define         NODEID                  SENDER  //change to "SENDER" if this is the sender node (the one with the button)
#define         FREQUENCY               RF69_915MHZ
#define         SS_PIN                  PB2
#define         interruptPin            2
#define         SELECT                  noInterrupts(); SS_WRITE_LOW
#define         UNSELECT                SS_WRITE_HIGH; interrupts()  //if (!_inISR) interrupts()
#define         SS_WRITE_LOW            PORTB &= ~(1<<SS_PIN)        // Much faster and smaller version of digitalWrite(Pin, LOW)
#define         SS_WRITE_HIGH           PORTB |= 1<<SS_PIN           // Much faster and smaller version of digitalWrite(Pin, HIGH)
#define         WAIT_WHILE_SPI_BUSY     asm volatile("nop"); while (!(SPSR & 1<<SPIF))

RFM69 radio;    //Create an instance of the object
bool _inISR=false;

uint8_t readReg(uint8_t addr) {
  SELECT;
  SPDR = ( addr & 0x7F );
  WAIT_WHILE_SPI_BUSY;
  SPDR = ( 0 );
  WAIT_WHILE_SPI_BUSY;
  UNSELECT;
  return SPDR;
}

void writeReg(uint8_t addr, uint8_t value) {
  SELECT;
  SPDR = ( addr | 0x80 );
  WAIT_WHILE_SPI_BUSY;
  SPDR = ( value );
  WAIT_WHILE_SPI_BUSY;
  UNSELECT;
}

static inline uint8_t SPI_XFER(uint8_t data) {
  SPDR = data;
  WAIT_WHILE_SPI_BUSY;
  return(SPDR);
}

static inline void SendFrame(uint8_t toAddress, const void* buffer, uint8_t bufferSize) {
  uint8_t FOO = (readReg(REG_OPMODE) & 0xE3);    // I see no reason not to cache this and save some reads
  writeReg(REG_OPMODE, FOO | RF_OPMODE_STANDBY); // turn off receiver to prevent reception while filling fifo
  while ((readReg(REG_IRQFLAGS1) & RF_IRQFLAGS1_MODEREADY) == 0x00); // wait for ModeReady
  writeReg(REG_DIOMAPPING1, RF_DIOMAPPING1_DIO0_00); // DIO0 is "Packet Sent"
 
  // write to FIFO
  SELECT;
  SPI_XFER(REG_FIFO | 0x80);
  SPI_XFER(bufferSize + 3);
  SPI_XFER(toAddress);
  SPI_XFER(NODEID);
  SPI_XFER(0x00);

  for (uint8_t i = 0; i < bufferSize; i++) SPI_XFER(((uint8_t*) buffer)[i]);
  UNSELECT;

  // no need to wait for transmit mode to be ready since its handled by the radio
  writeReg(REG_OPMODE, FOO | RF_OPMODE_TRANSMITTER);
  uint32_t txStart = millis();
  while (!(PIND & _BV(interruptPin)) && millis() - txStart < RF69_TX_LIMIT_MS);// wait for DIO0 to turn HIGH signalling transmission finish
  writeReg(REG_OPMODE, FOO | RF_OPMODE_STANDBY);
}

static inline void myInit(void) {
  sei();
 
  // Timer 0 initialization from wiring.c for a ATmega 328P (Arduino Uno rev 3) + 12 bytes to sketch size
  TCCR0A = _BV(WGM01) | _BV(WGM00);      // set timer 0 to fast PWM mode
  TCCR0B = _BV(CS01) | _BV(CS00);        // set timer 0 prescale factor to 64
  TIMSK0 = _BV(TOIE0);                 // enable timer 0 overflow interrupt
 
  // Timer 2 initialization from wiring.c for an ATmega 328P (Arduino Uno rev 3) + 20 bytes to sketch size
  TCCR2A |= _BV(COM2A1) | _BV(WGM20);    // Enable timer 2 so _delay_ms() works properly
  TCCR2B |= _BV(CS22);                   // set clkT2S/64 (From prescaler); CS22 is a bit number, so it needs _BV()
 
  // ADC Housekeeping
  ADMUX = _BV(REFS0) | _BV(MUX3) | _BV(MUX2) | _BV(MUX1);    // Set the multiplexer to read the internal bandgap voltage
  ADCSRA |= ADC_PS_32;    // set our own prescaler to 32
}

int main(void) {
  myInit();
  radio.initialize(FREQUENCY,NODEID,NETWORKID);
  radio.setPowerLevel(0);
  radio.sleep();
  Serial.begin(115200);
  initTWI();
 
  uint8_t data[6];  //16-bit temp, 16-bit RH, 16-bit Vcc
  uint16_t startT, elapsed;
 
  for(;;) {
    startT = micros();           // ==================== START THE CLOCK ====================
    sbi(ADCSRA, ADSC);  // start a conversion
    issueCommand(WRITE_USER_REGISTER, ELEVEN_BIT_TEMP);    // this conversation takes 88uS - plenty long enough for the ADC
    issueCommand(TRIGGER_TEMP_MEASURE_NOHOLD,0);
    data[4] = ADCL;    // Avoid 16-bit math in this loop since it adds 40us (ADCL must be read before ADCH)
    data[5] = ADCH;    // Vcc goes in data[4..5]; data[6] would be past the end of the 6-byte array and never sent
    LowPower.powerDown(SLEEP_15MS, ADC_OFF, BOD_OFF);
    readRaw(&data[0]);
    issueCommand(WRITE_USER_REGISTER, EIGHT_BIT_RH);
    issueCommand(TRIGGER_HUMD_MEASURE_NOHOLD,0);
    LowPower.powerDown(SLEEP_15MS, ADC_OFF, BOD_OFF); 
    readRaw(&data[2]);
    TWCR = (1<<TWINT)|(1<<TWEN)| (1<<TWSTO);  // stop the TWI
   
    // Send the data
    SendFrame(RECEIVER, data, 6);
    radio.sleep();
    elapsed = micros()-startT;  // ==================== STOP THE CLOCK ====================
 
 
    float Tamb = ((uint16_t) (data[0]<<8 | data[1])) * (316.296 / 65536.0) - 52.33;  // degrees F; the HTU21D datasheet divides by 2^16
    float RHamb = ((uint16_t) (data[2]<<8 | data[3])) * (125.0 / 65536.0) - 6.0;
    Serial.print("\nTemperature: ");
    Serial.print(Tamb);
    Serial.print("\tHumidity: ");
    Serial.print(RHamb);
    Serial.print("\t\tVcc: ");
    Serial.print(112296/(data[5]<<8 | data[4]));
    Serial.print("\t\tLoop took: ");
    Serial.print(elapsed);
    Serial.print(" us");
    _delay_ms(4);  // Give the UART time to get our output across
    Serial.flush();
   
    LowPower.powerDown(SLEEP_8S, ADC_OFF, BOD_OFF);    // Sleep for 8 seconds
  }  // end for
}  // end main

HTU21D Code (probably compatible with Si7021s?)
Code: [Select]
#include "Arduino.h"
#define   BAUD_RATE                     8000000ul
#define   TRIGGER_TEMP_MEASURE_NOHOLD   0xF3
#define   TRIGGER_HUMD_MEASURE_NOHOLD   0xF5
#define   WRITE_USER_REGISTER           0xE6
#define   ELEVEN_BIT_TEMP               B10000011
#define   EIGHT_BIT_RH                  B00000011
#define   SLA_W                         TWDR = (0x40 << 1)
#define   SLA_R                         TWDR = ((0x40 << 1) + 0x01)
#define   START_TWI                     TWCR = (1<<TWINT) | (1<<TWSTA) | (1<<TWEN); WAIT_FOR_TWI_INT
#define   RESTART_TWI                   TWCR = (1<<TWINT) | (1<<TWEN);              WAIT_FOR_TWI_INT
#define   RESTART_TWI_ACK               TWCR = (1<<TWINT) | (1<<TWEA) | (1<<TWEN);  WAIT_FOR_TWI_INT
#define   WAIT_FOR_TWI_INT              while (!(TWCR & (1<<TWINT)) && ++counter)
#define   STOP_TWI                      TWCR = (1<<TWINT)|(1<<TWEN)| (1<<TWSTO)//;    while ((TWCR & (1<<TWSTO)) && ++counter)
#define   NOT_READY                     (TWSR & 0xF8) == 0x48 && ++counter

static inline void initTWI() {
  DDRC |= (1<<PC3) | (1<<PC2);
  PORTC |= (1<<PC2) | (1<<PC4) | (1<<PC5);
  TWBR=1;  //TWBR = ((F_CPU / BAUD_RATE) - 16) / 2;
}

static inline void stopTWI() {
  uint16_t counter;
  STOP_TWI;
}

static inline void issueCommand(uint8_t comm, uint8_t res) {
  uint16_t counter = 0;  // timeout counter used by the TWI wait macros; start from a known value
  START_TWI; 
  SLA_W;
  RESTART_TWI;
  TWDR = comm;    // Send the command
  RESTART_TWI;
  if (comm == WRITE_USER_REGISTER) {  // Send the new resolution
    TWDR = res;
    RESTART_TWI;
  } else {    // Issue a stop on the I2C bus so we can enter sleep
    STOP_TWI;
  }
}

static inline void readRaw(uint8_t *ptr) {
  uint16_t counter = 0;  // timeout counter used by the TWI wait macros; start from a known value
  do {  // Start + SLA(R) until we get an ACK
    START_TWI;
    SLA_R;
    RESTART_TWI;
  } while (NOT_READY);

  // Measurement is ready, read back 2 bytes and store them at the address of ptr
  RESTART_TWI_ACK;    // Set the ACK bit to let the transmitter know we need another byte
  *ptr++ = TWDR; // Write the MSB
  RESTART_TWI;    // Set the NACK bit to let the transmitter know we are done
  *ptr = TWDR & 0xFC; // Write the LSB with the status bits masked off
}

I need (want) to finish culling any references to SPI or RFM69 as I intend for this to be small and self-contained but for now I'd like it in the wild in case anyone else can benefit from this.
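A note on the TWBR=1 setting in initTWI(): the ATmega328P datasheet gives the TWI clock as SCL = F_CPU / (16 + 2 * TWBR * prescaler). The sketch below just evaluates that formula; the helper names are hypothetical, and it assumes the prescaler bits in TWSR are left at their reset value of 1.

```c
#include <stdint.h>

/* SCL frequency from the ATmega328P TWI bit-rate formula. */
uint32_t twi_scl_hz(uint32_t f_cpu_hz, uint8_t twbr, uint8_t prescaler) {
    return f_cpu_hz / (16u + 2u * twbr * prescaler);
}

/* TWBR needed for a target SCL with prescaler 1.
 * Only meaningful when f_cpu_hz / scl_hz is at least 16. */
uint8_t twbr_for(uint32_t f_cpu_hz, uint32_t scl_hz) {
    return (uint8_t)((f_cpu_hz / scl_hz - 16u) / 2u);
}
```

With TWBR=1 this gives roughly 889 kHz on a 16 MHz part and roughly 444 kHz at 8 MHz. For a 400 kHz bus at 16 MHz, twbr_for(16000000, 400000) returns 12. Whether a given sensor tolerates the faster clock is worth checking against its datasheet.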
Title: Re: Speeding Up The RFM69 Library
Post by: ChemE on January 13, 2017, 04:52:23 PM
Quote from: WhiteHare
The datasheet refers to a similar technique for handling large packets, but I suppose it might (?) also work with shorter packets like yours also:

Thanks, I'll have to check out the datasheet to see how many bytes have to be in the FIFO to trigger FifoThreshold.  Hopefully it is less than 9 or I'm boned.

EDIT: Looks like one can program this threshold to be anything that 7 bits can hold: RegFifoThresh (0x3C).  Thanks WhiteHare, this may have legs!
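As a sketch of how that register is laid out (going by the RFM69 datasheet as I read it, so please verify): bit 7 of RegFifoThresh is TxStartCondition (0 = start when the FIFO level exceeds the threshold, 1 = start as soon as the FIFO is not empty) and bits 6..0 hold the threshold. The helper name below is hypothetical.

```c
#include <stdint.h>

#define REG_FIFOTHRESH 0x3C  /* register address given in the post above */

/* Build a RegFifoThresh value from its two fields. */
uint8_t fifo_thresh_value(int start_on_not_empty, uint8_t threshold) {
    return (uint8_t)((start_on_not_empty ? 0x80 : 0x00) | (threshold & 0x7F));
}
```

fifo_thresh_value(1, 15) gives 0x8F. A value such as fifo_thresh_value(0, 2) would hold off Tx until more than two bytes are in the FIFO, which could then be written with writeReg(REG_FIFOTHRESH, ...).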
Title: Re: Speeding Up The RFM69 Library
Post by: WhiteHare on January 13, 2017, 06:07:47 PM
If your motivation is merely to conserve energy, it might be simpler to just load the FIFO while the RFM69 sleeps, then initiate the Tx, then immediately sleep the atmega328p, and then have the radio wake the atmega328p with an interrupt when Tx finishes. 

If you're primarily wanting to reduce latency, though, your approach seems worthwhile. 
Title: Re: Speeding Up The RFM69 Library
Post by: perky on January 13, 2017, 08:24:20 PM
Also the packet you send might have some static values in it that don't change from one packet to another (like packet type, server address etc.), in which case create a 'template' packet in memory and just change the bytes you need to change before sending. Make sure your SPI runs in the MHz range; even at 1MHz the time taken to send a single byte to the FIFO during a burst write is 8us or so plus a little time for polling. I think the 328P has a single data register for SPI; however, if you were able to use the USART in SPI mode you would get double buffering, which would allow back-to-back transfers.
Mark.
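The 8us figure is simply 8 clock edges at 1 MHz. A minimal sketch for trying other clocks and burst lengths, ignoring polling overhead and any gap between bytes (the helper name is made up for this example):

```c
#include <stdint.h>

/* Nanoseconds needed to clock n_bytes over SPI at sck_hz,
 * ignoring polling overhead and inter-byte gaps. */
uint32_t spi_burst_ns(uint32_t n_bytes, uint32_t sck_hz) {
    return (uint32_t)(((uint64_t)n_bytes * 8u * 1000000000u) / sck_hz);
}
```

spi_burst_ns(1, 1000000) is 8000 ns, and an 11-byte burst at 8 MHz is 11000 ns of pure clocking, so at higher SPI clocks the per-byte software overhead starts to dominate.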
Title: Re: Speeding Up The RFM69 Library
Post by: TD22057 on January 14, 2017, 11:40:28 AM
I haven't had time to play with this library yet: https://github.com/iwanders/plainRFM69  but it uses the RFM auto-mode selection to handle rx/tx which in theory should be more efficient.  There are couple of discussions (here (https://github.com/iwanders/plainRFM69/issues/1) and here (https://github.com/iwanders/plainRFM69/issues/3)) about use w/ the moteino.  Might be worth looking at if it really does improve efficiency.
Title: Re: Speeding Up The RFM69 Library
Post by: joelucid on January 14, 2017, 12:18:53 PM
Another trick: the rfm69 doesn't overwrite the FIFO with any new data that comes in after a packet has been received. It waits until the FIFO is emptied before reading new packets. Given this you can use the FIFO as the only data store for incoming data and eliminate the 61-byte DATA buffer the rfm69 lib uses. Also you don't need to use interrupts to copy the FIFO there.

Joe
Title: Re: Speeding Up The RFM69 Library
Post by: ChemE on January 15, 2017, 03:09:04 PM
Quote from: TD22057
I haven't had time to play with this library yet: https://github.com/iwanders/plainRFM69  but it uses the RFM auto-mode selection to handle rx/tx which in theory should be more efficient.  There are couple of discussions (here (https://github.com/iwanders/plainRFM69/issues/1) and here (https://github.com/iwanders/plainRFM69/issues/3)) about use w/ the moteino.  Might be worth looking at if it really does improve efficiency.

Great links thank you!  Automatic mode seems like it might be wildly efficient for what I'm wanting to do with my TH nodes.  It will take me a while to digest this code and mode of operation but I suspect I'll be updating my code to make use of it.
Title: Re: Speeding Up The RFM69 Library
Post by: ChemE on January 15, 2017, 04:03:31 PM
Thanks for all the good ideas everyone.  There are lots of interesting things to test out here.  I would love to build a send only version that is self contained, very small code size (cause I always shoot for that), uses auto mode to transmit and then immediately go back to sleep, and can actually begin transmitting as the FIFO is still being written because that is just cool and why not reduce latency.  Plus if the radio can be putting data into the FIFO while it is transmitting and then automatically put itself to sleep once the packet is sent, the radio itself will spend less time awake and more time asleep.  So faster and less power.  Hopefully once I have something working someone with a scope can verify that this is the case.
Title: Re: Speeding Up The RFM69 Library
Post by: ChemE on January 19, 2017, 08:00:41 AM
Well, I'm quite impressed with automatic mode so far.  It took an entire 3 minutes to implement (I'm really not used to things working on the first try with the RFM69W).  The uC is now awake for 324 microseconds and packets are still getting out just as before.  This 324 microseconds breaks down as follows:
    292 microseconds = get a Vin, Temp, RH measurement
     32 microseconds = write 11 bytes to the FIFO

Without having to manually switch modes my send frame code now looks like this:
Code: [Select]
static inline void SendFrame(uint8_t toAddress, const void* buffer, uint8_t bufferSize) {
  SELECT;                      // write to FIFO using SPI burst mode
  SPI_XFER(REG_FIFO | 0x80);
  SPI_XFER(bufferSize + 3);    // LEN byte
  SPI_XFER(toAddress);         // 1st byte
  SPI_XFER(NODEID);            // 2nd byte
  SPI_XFER(0x00);              // 3rd byte
  for (uint8_t i = 0; i < bufferSize; i++) SPI_XFER(((uint8_t*) buffer)[i]);  // Write 6 more bytes to the FIFO
  UNSELECT;
}

Someone with a scope would have to verify whether or not the radio spends more/less/the same time drawing current but with certainty the uC is awake less using auto mode.  I still have some decoupling to do from the Arduino libraries, and general code clean up/compaction, and I do want to add back 4 error check bits, but this super stripped down power optimized code for nodes that only send back data is looking very promising.
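For reference, the auto mode behaviour is selected by one register, RegAutoModes, which packs three fields: an enter condition (3 bits), an exit condition (3 bits) and an intermediate mode (2 bits). The sketch below assumes that layout and uses the field values that match the AUTO_TRANSMITTER definition posted later in this thread (enter on FIFO level, exit on Packet Sent, intermediate mode Tx); the enum and function names are hypothetical.

```c
#include <stdint.h>

/* Field values assumed from the RFM69 datasheet; check before relying on them. */
enum {
    ENTER_FIFO_LEVEL = 2,  /* enter intermediate mode when the FIFO level flag rises */
    EXIT_PACKET_SENT = 6,  /* leave it again once PacketSent fires */
    INTERMEDIATE_TX  = 3   /* intermediate mode is transmit */
};

/* Pack the three fields into a RegAutoModes value. */
uint8_t automodes_value(uint8_t enter, uint8_t exit_cond, uint8_t intermediate) {
    return (uint8_t)(((enter & 0x07) << 5) | ((exit_cond & 0x07) << 2) | (intermediate & 0x03));
}
```

automodes_value(ENTER_FIFO_LEVEL, EXIT_PACKET_SENT, INTERMEDIATE_TX) evaluates to 0x5B, i.e. B01011011.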

EDIT: Forgot to mention that I came across an initialization setting that shaves 30 microseconds off a normal broadcast.  Not sure what if any the implications are of a faster PA ramp rate but this certainly sped up my manual mode transmissions:
Code: [Select]
writeReg(REG_PARAMP, B00001111);    // Save 30us by speeding up the PA ramp rate
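
Where the 30us comes from, as far as I can tell: the low four bits of RegPaRamp select the FSK rise/fall time of the PA. The table below is copied from memory of the datasheet, so treat it as an assumption and check it against your copy; the names are hypothetical.

```c
#include <stdint.h>

/* PA ramp time in microseconds for each 4-bit PaRamp setting
 * (assumed from the RFM69 datasheet; 0x09 is the power-on default). */
static const uint16_t pa_ramp_us[16] = {
    3400, 2000, 1000, 500, 250, 125, 100, 62,
    50,   40,   31,   25,  20,  15,  12,  10
};

uint16_t pa_ramp_time_us(uint8_t reg_paramp) {
    return pa_ramp_us[reg_paramp & 0x0F];
}
```

Going from the default setting 0x09 (40 us) to 0x0F (10 us) accounts for the 30 microseconds saved.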
Title: Re: Speeding Up The RFM69 Library
Post by: emjay on January 19, 2017, 07:44:33 PM
@ChemE,

Quote
the implications are of a faster PA ramp rate

In a phrase, spectral splatter.  Just look at it as the start of an OOK transmission (no RF -> full RF).  OOK of course has a spectrum centered around the raw carrier frequency with skirts that get wider as the modulation rate increases. Now if you allow the PA to race from zero to max in the shortest time, this is analogous to a high baud rate - hence the skirts are wide, potentially upsetting adjacent channels. 
Unfortunately you need a good spectrum analyser to catch this - it's beyond the capability of a typical "dongle" SA
 
Title: Re: Speeding Up The RFM69 Library
Post by: WhiteHare on January 19, 2017, 08:52:40 PM
In terms of receiving, I can see there'd be a palpable benefit in emptying the receive FIFO as it fills rather than waiting for it to fully fill: you can issue an ACK faster.  Of course, to do it you have to trade away hardware encryption (since that works on the entire FIFO queue all at once), but, meh, for most people full-bore encryption probably isn't really needed anyway.
Title: Re: Speeding Up The RFM69 Library
Post by: perky on January 20, 2017, 06:48:03 AM
A proprietary hashing algorithm that uses a counter and a unique serial number would probably be sufficient in place of encryption, and that's extremely quick to calculate.
Mark.
Title: Re: Speeding Up The RFM69 Library
Post by: ChemE on January 20, 2017, 08:27:29 PM
**WARNING:  This code is just for W's not HW's.  My code might screw up your HW if you don't make adjustments.

Uff, it took vastly longer than I thought it would, but I've finally stripped out all references to the Arduino libraries as well as taken just a teeny subset of Felix's library and rewritten it to make use of auto mode as well as sped that up as much as I could.  The only outside reference is to LowPower for sleep duties. 

Regarding code size, I'm down to 1,418 bytes with the fastest version of the code, which uses always inline on some RFM69CW routines.  With that macro commented out, the compiler will optimize for smaller code, which adds 8 microseconds to the loop but drops the hex file down to 1,084 bytes.  The difference between fast and "slow" is 304 microseconds vs. 312 microseconds of awake time to get a Temp, RH, Vin measurement and send it all to the radio.  Almost all of that is the brutally slow I2C conversation.

Main program
Code: [Select]
#include "HTU21D.h"
#include "LowPower.h"
#include "ADC.h"
#include "RFM69CW.h"
#define DEBUG 0

int main(void) {
  uint8_t data[6];  //16-bit temp, 16-bit RH, 16-bit Vcc
  #if DEBUG
    uint16_t startT, elapsed;
    startT = micros();           // ==================== START THE CLOCK ====================
  #endif
 
  Fast_ADC_Init();    // Initialize the AVR registers needed
  RadioInit();  // Configure the radio while it is asleep
  initTWI();  // Initialize the I2C bus at 400kHz
  #if DEBUG
    elapsed = micros()-startT;   // ==================== STOP THE CLOCK ====================
    Serial.begin(115200);
    Serial.print("\nInitialization took ");
    Serial.print(elapsed);
    Serial.print(" us\n");
    _delay_ms(2);
  #endif

  for(;;) {  // Loop forever
    #if DEBUG
      startT = micros();           // ==================== START THE CLOCK ====================
    #endif
    ENABLE_ADC;                                            // Turn on the ADC just prior to needing it
    START_ADC_CONVERSION;                                  // start a conversion
    issueCommand(WRITE_USER_REGISTER, ELEVEN_BIT_TEMP);    // Change the measurement resolution to 11 bits
    issueCommand(TRIGGER_TEMP_MEASURE_NOHOLD,0);           // Begin a temperature measurement
    data[4] = ADCL;                                        // Avoid 16-bit math in this loop since it adds 40us (ADCL must be read before ADCH)
    data[5] = ADCH;                                        // Read back the results of the ADC into data[4..5]; data[6] would be out of bounds
    DISABLE_ADC;                                           // Turn off the ADC to save power
    LowPower.powerDown(SLEEP_15MS, ADC_OFF, BOD_OFF);      // Sleep while the sensor measures...
    readRaw(&data[0]);                                     // Retrieve the measurement results
    issueCommand(WRITE_USER_REGISTER, EIGHT_BIT_RH);       // Change the measurement resolution to 8 bits
    issueCommand(TRIGGER_HUMD_MEASURE_NOHOLD,0);           // Begin a RH measurement
    LowPower.powerDown(SLEEP_15MS, ADC_OFF, BOD_OFF);      // Sleep while the sensor measures...
    readRaw(&data[2]);                                     // Retrieve the measurement results
    TWCR = (1<<TWINT)|(1<<TWEN)| (1<<TWSTO);               // stop the TWI
    SendFrame(RECEIVER, data, 6);                          // Send the data
    #if DEBUG
      elapsed = micros()-startT;   // ==================== STOP THE CLOCK ====================
      float Tamb = ((uint16_t) (data[0]<<8 | data[1])) * (316.296 / 65536.0) - 52.33;  // degrees F; the HTU21D datasheet divides by 2^16
      float RHamb = ((uint16_t) (data[2]<<8 | data[3])) * (125.0 / 65536.0) - 6.0;
      Serial.print("\nTemperature: ");
      Serial.print(Tamb);
      Serial.print("\tHumidity: ");
      Serial.print(RHamb);
      Serial.print("\t\tVcc: ");
      Serial.print(112296/(data[5]<<8 | data[4]));
      Serial.print("\t\tLoop took: ");
      Serial.print(elapsed);
      Serial.print(" us");
      _delay_ms(4);  // Give the UART time to get our output across
      Serial.flush();
    #endif
   
    for ( uint8_t sleep_count=0; sleep_count<1; sleep_count++) LowPower.powerDown(SLEEP_8S, ADC_OFF, BOD_OFF);    // Sleep for 8 seconds per pass (raise the loop limit to 4 for 32 seconds)
  }  // end for
}  // end main

ADC.h
Code: [Select]
#include "Arduino.h"

// Define various ADC prescaler
#define       ADC_PS_16               (1 << ADPS2)
#define       ADC_PS_32               ((1 << ADPS2) |                (1 << ADPS0))    // outer parentheses keep the macro safe inside larger expressions
#define       ADC_PS_64               ((1 << ADPS2) | (1 << ADPS1))
#define       ADC_PS_128              ((1 << ADPS2) | (1 << ADPS1) | (1 << ADPS0))
#define       sbi(sfr, bit)           (_SFR_BYTE(sfr) |= _BV(bit))
#define       cbi(sfr, bit)           (_SFR_BYTE(sfr) &= ~_BV(bit))
#define       START_ADC_CONVERSION    sbi(ADCSRA, ADSC);  // start a conversion
#define       ENABLE_ADC              cbi(PRR, PRADC); ADCSRA |= bit(ADEN); // Enable the ADC
#define       DISABLE_ADC             cbi(ADCSRA, ADEN); sbi(PRR, PRADC); // Disable the ADC to save power

static inline void Fast_ADC_Init(void) {
  // Timer 0 initialization from wiring.c for a ATmega 328P (Arduino Uno rev 3) + 12 bytes to sketch size
  TCCR0A = _BV(WGM01) | _BV(WGM00);      // set timer 0 to fast PWM mode
  TCCR0B = _BV(CS01) | _BV(CS00);        // set timer 0 prescale factor to 64
  TIMSK0 = _BV(TOIE0);                 // enable timer 0 overflow interrupt
 
  // Timer 2 initialization from wiring.c for an ATmega 328P (Arduino Uno rev 3) + 20 bytes to sketch size
  TCCR2A |= _BV(COM2A1) | _BV(WGM20);    // Enable timer 2 so _delay_ms() works properly
  TCCR2B |= _BV(CS22);                   // set clkT2S/64 (From prescaler); CS22 is a bit number, so it needs _BV()
 
  // ADC Housekeeping
  ADMUX = _BV(REFS0) | _BV(MUX3) | _BV(MUX2) | _BV(MUX1);    // Set the multiplexer to read the internal bandgap voltage
  ADCSRA |= ADC_PS_32;    // set our own prescaler to 32
}

HTU21D.h
Code: [Select]
#include "Arduino.h"
#define   BAUD_RATE                     8000000ul
#define   TRIGGER_TEMP_MEASURE_NOHOLD   0xF3
#define   TRIGGER_HUMD_MEASURE_NOHOLD   0xF5
#define   WRITE_USER_REGISTER           0xE6
#define   ELEVEN_BIT_TEMP               B10000011
#define   EIGHT_BIT_RH                  B00000011
#define   SLA_W                         TWDR = (0x40 << 1)
#define   SLA_R                         TWDR = ((0x40 << 1) + 0x01)
#define   START_TWI                     TWCR = (1<<TWINT) | (1<<TWSTA) | (1<<TWEN); WAIT_FOR_TWI_INT
#define   RESTART_TWI                   TWCR = (1<<TWINT) | (1<<TWEN);              WAIT_FOR_TWI_INT
#define   RESTART_TWI_ACK               TWCR = (1<<TWINT) | (1<<TWEA) | (1<<TWEN);  WAIT_FOR_TWI_INT
#define   WAIT_FOR_TWI_INT              while (!(TWCR & (1<<TWINT)) && ++counter)
#define   STOP_TWI                      TWCR = (1<<TWINT)|(1<<TWEN)| (1<<TWSTO)
#define   NOT_READY                     (TWSR & 0xF8) == 0x48 && ++counter

static inline void initTWI() {
  DDRC |= (1<<PC3) | (1<<PC2);
  PORTC |= (1<<PC2) | (1<<PC4) | (1<<PC5);
  TWBR=1;  //TWBR = ((F_CPU / BAUD_RATE) - 16) / 2;
}

static inline void issueCommand(uint8_t comm, uint8_t res) {
  uint16_t counter = 0;  // timeout counter used by the TWI wait macros; start from a known value
  START_TWI; 
  SLA_W;
  RESTART_TWI;
  TWDR = comm;    // Send the command
  RESTART_TWI;
  if (comm == WRITE_USER_REGISTER) {  // Send the new resolution
    TWDR = res;
    RESTART_TWI;
  } else {    // Issue a stop on the I2C bus so we can enter sleep
    STOP_TWI;
  }
}

static inline void readRaw(uint8_t *ptr) {
  uint16_t counter = 0;  // timeout counter used by the TWI wait macros; start from a known value
  do {  // Start + SLA(R) until we get an ACK
    START_TWI;
    SLA_R;
    RESTART_TWI;
  } while (NOT_READY);

  // Measurement is ready, read back 2 bytes and store them at the address of ptr
  RESTART_TWI_ACK;    // Set the ACK bit to let the transmitter know we need another byte
  *ptr++ = TWDR; // Write the MSB
  RESTART_TWI;    // Set the NACK bit to let the transmitter know we are done
  *ptr = TWDR & 0xFC; // Write the LSB with the status bits masked off
}

RFM69CW.h
Code: [Select]
#include "Arduino.h"
#define         OPT_FLAG                                //__attribute__((always_inline))  // Comment this definition out to optimize for size
// ============================== User Definitions ==============================
#define         NETWORKID                               100    //the same on all nodes that talk to each other - 170 is 10101010 DC free value
#define         RECEIVER                                1      // ID of the gateway that all nodes report back to
#define         NODEID                                  2      // ID of this Node
#define         SS_PIN                                  PB2    // Slave select pin

// ============================== SPI Definitions ==============================
#define         SELECT                                  noInterrupts(); PORTB &= ~(1<<SS_PIN)
#define         UNSELECT                                SS_WRITE_HIGH; interrupts()
#define         SS_WRITE_HIGH                           PORTB |= 1<<SS_PIN
#define         WAIT_WHILE_SPI_BUSY                     asm volatile("nop"); while (!(SPSR & 1<<SPIF))

// ============================== RFM69CW Definitions ==============================
#define         SLEEP_MODE                              B00000000
#define         AUTO_TRANSMITTER                        B01011011    // Enter = FIFO level; Exit = Packet Sent; Intermediate Mode = TX
#define         CHANGE_OP_MODE(mode)                    writeReg(REG_OPMODE, mode)
#define         SET_POWER_LEVEL(level)                  writeReg(REG_PALEVEL, 0x80 | (level & 0x0F));  // Only works for the RFM69CW not the RFM69HCW
#define         REG_FIFO                                0x00
#define         REG_OPMODE                              0x01
#define         REG_BITRATEMSB                          0x03
#define         REG_BITRATELSB                          0x04
#define         REG_FDEVMSB                             0x05
#define         REG_FDEVLSB                             0x06
#define         REG_PALEVEL                             0x11
#define         REG_RXBW                                0x19
#define         REG_RSSITHRESH                          0x29
#define         REG_SYNCCONFIG                          0x2E
#define         REG_SYNCVALUE1                          0x2F
#define         REG_SYNCVALUE2                          0x30
#define         REG_PACKETCONFIG1                       0x37
#define         REG_AUTOMODES                           0x3B
#define         REG_PACKETCONFIG2                       0x3D
#define         RF_BITRATEMSB_300000                    0x00    // Begin 300 kbps auto Tx settings
#define         RF_BITRATELSB_300000                    0x6B
#define         RF_FDEVMSB_300000                       0x13
#define         RF_FDEVLSB_300000                       0x33
#define         RF_RXBW_DCCFREQ_111                     0xE0
#define         RF_RXBW_MANT_16                         0x00
#define         RF_RXBW_EXP_0                           0x00
#define         RF_SYNC_ON                              0x80
#define         RF_SYNC_FIFOFILL_AUTO                   0x00
#define         RF_SYNC_SIZE_2                          0x08
#define         RF_SYNC_TOL_0                           0x00
#define         RF_PACKET1_FORMAT_VARIABLE              0x80
#define         RF_PACKET1_DCFREE_OFF                   0x00
#define         RF_PACKET1_CRC_OFF                      0x00
#define         RF_PACKET1_CRCAUTOCLEAR_OFF             0x08
#define         RF_PACKET1_ADRSFILTERING_OFF            0x00
#define         RF_PACKET2_RXRESTARTDELAY_2BITS         0x10
#define         RF_PACKET2_AUTORXRESTART_ON             0x02
#define         RF_PACKET2_AES_OFF                      0x00    // End 300 kbps auto Tx settings

static inline void SPI_INIT(void) {  // Level 0 code - initialize the SPI bus at fosc/2 (8MHz)
  PORTB |= 1<<SS_PIN;
  DDRB |= _BV(SS_PIN);
  SPCR |= _BV(MSTR) | _BV(SPE);
  SPSR |= (1<<SPI2X);    // Set the SPI bus speed to Fosc/2 = 8MHz at full speed
 
  // No clue why this is necessary as opposed to DDRB |= 1<<SCK | 1<<MOSI
  volatile uint8_t *reg;
  reg = &DDRB;
  *reg |= 0x28;  // Bit mask of SCK and MOSI
}

OPT_FLAG uint8_t SPI_XFER(uint8_t data) {    // Level 0 code - move data over the SPI bus
  SPDR = data;
  WAIT_WHILE_SPI_BUSY;
  return SPDR;
}

OPT_FLAG uint8_t readReg(uint8_t addr) {    // Level 1 code - interact with the radio's registers
  SELECT;
  SPDR = ( addr & 0x7F );
  WAIT_WHILE_SPI_BUSY;
  SPDR = ( 0 );
  WAIT_WHILE_SPI_BUSY;
  UNSELECT;
  return SPDR;
}

OPT_FLAG void writeReg(uint8_t addr, uint8_t value) {  // Level 1 code - interact with the radio's registers
  SELECT;
  SPDR = ( addr | 0x80 );
  WAIT_WHILE_SPI_BUSY;
  SPDR = ( value );
  WAIT_WHILE_SPI_BUSY;
  UNSELECT;
}

static inline void SendFrame(uint8_t toAddress, const void* buffer, uint8_t bufferSize) {  // Level 2 code - do useful work
  SELECT;
  SPI_XFER(REG_FIFO | 0x80);   // write to FIFO using SPI burst mode
  SPI_XFER(bufferSize + 3);    // LEN byte
  SPI_XFER(toAddress);         // 1st byte
  SPI_XFER(NODEID);            // 2nd byte
  SPI_XFER(0x00);              // 3rd byte
  for (uint8_t i = 0; i < bufferSize; i++) SPI_XFER(((const uint8_t*) buffer)[i]);  // Write the payload (bufferSize bytes) to the FIFO
  UNSELECT;
}

static inline void RadioInit(void) {
    SPI_INIT();
    CHANGE_OP_MODE(SLEEP_MODE);  // Put the radio to sleep ASAP to save power
    writeReg( REG_BITRATEMSB, RF_BITRATEMSB_300000 ); // 0x03
    writeReg( REG_BITRATELSB, RF_BITRATELSB_300000 ); // 0x04
    writeReg( REG_FDEVMSB, RF_FDEVMSB_300000 ); // 0x05
    writeReg( REG_FDEVLSB, RF_FDEVLSB_300000 ); // 0x06
    writeReg( REG_RXBW, RF_RXBW_DCCFREQ_111 | RF_RXBW_MANT_16 | RF_RXBW_EXP_0 ); // 0x19
    writeReg( REG_RSSITHRESH, 220 ); // 0x29
    writeReg( REG_SYNCCONFIG, RF_SYNC_ON | RF_SYNC_FIFOFILL_AUTO | RF_SYNC_SIZE_2 | RF_SYNC_TOL_0 ); // 0x2E - 2 sync bytes
    writeReg( REG_SYNCVALUE1, 0xAA ); // 0x2F
    writeReg( REG_SYNCVALUE2, NETWORKID ); // 0x30
    writeReg( REG_PACKETCONFIG1, RF_PACKET1_FORMAT_VARIABLE | RF_PACKET1_DCFREE_OFF | RF_PACKET1_CRC_OFF | RF_PACKET1_CRCAUTOCLEAR_OFF | RF_PACKET1_ADRSFILTERING_OFF );  // 0x37
    writeReg( REG_AUTOMODES, AUTO_TRANSMITTER );  // 0x3B - Put the radio in automatic mode
    writeReg( REG_PACKETCONFIG2, RF_PACKET2_RXRESTARTDELAY_2BITS | RF_PACKET2_AUTORXRESTART_ON | RF_PACKET2_AES_OFF ); // 0x3D
    SET_POWER_LEVEL(0);
}

I now plan to look into adding back ACKs (done; that added 4 bytes) and generalizing back to the HW radios as well.  Once that is done I'll need to experiment with auto receivers.  Other pending items are a CRC-4 check or some sort of encryption.
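
For anyone who wants to check the framing off-target, here is a minimal host-side sketch of the byte sequence that SendFrame in the listing above shifts out over SPI. build_fifo_burst is a hypothetical helper, not part of the library; it only mirrors the order of the SPI_XFER calls (FIFO write address, LEN, destination, sender, a zero third header byte, then the payload).

```c
#include <stddef.h>
#include <stdint.h>

#define REG_FIFO     0x00  /* same value as in the listing above */
#define HEADER_BYTES 3     /* destination, sender, third header byte */

/* Hypothetical host-side helper: fills `out` with the bytes SendFrame would
 * shift out and returns how many were written, or 0 if `out` is too small. */
static size_t build_fifo_burst(uint8_t *out, size_t out_size,
                               uint8_t to_address, uint8_t node_id,
                               const uint8_t *payload, uint8_t payload_len) {
    size_t total = 2u + HEADER_BYTES + payload_len;  /* address + LEN + header + payload */
    if (out_size < total) return 0;

    out[0] = REG_FIFO | 0x80;                        /* bit 7 set = write access */
    out[1] = (uint8_t)(payload_len + HEADER_BYTES);  /* LEN counts the bytes that follow it */
    out[2] = to_address;                             /* 1st byte */
    out[3] = node_id;                                /* 2nd byte */
    out[4] = 0x00;                                   /* 3rd byte, always zero in the listing */
    for (uint8_t i = 0; i < payload_len; i++) out[5 + i] = payload[i];
    return total;
}
```

With a 6-byte payload this produces 11 bytes on the wire, and the LEN byte is 9.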
Title: Re: Speeding Up The RFM69 Library
Post by: ChemE on January 26, 2017, 11:04:39 AM
I managed to track down a subtle bug that was preventing much of anything from actually functioning when DEBUG was set to 0 (no serial output).  The culprit was the lack of a Timer 0 overflow vector.  The fix was this simple once I realized what was going on:

Code: [Select]
#if !DEBUG
  ISR(TIMER0_OVF_vect){}  // Needed to make the WDT work
#endif

The library can now be switched between Auto and Manual mode, since it appears that in auto mode there is a LONG delay before the radio goes to sleep.  It might just be a bad setting on my part, but I haven't tracked down a fix for that yet.  We know we can get to sleep fast in manual mode.  The code size is 970 bytes in auto mode and 1066 bytes in manual mode.  I do still plan to add CRC-4, though I wonder if a parity bit would be sufficient for such small payloads.
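
To make the CRC-4 versus parity question concrete, here is a small sketch that runs on a PC. Both helpers are hypothetical, and the polynomial (x^4 + x + 1, initial value 0) is an assumption since no particular CRC-4 variant has been picked yet.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: bitwise CRC-4, polynomial x^4 + x + 1, initial value 0.
 * The polynomial choice is an assumption, not something fixed by the library. */
static uint8_t crc4(const uint8_t *data, size_t len) {
    uint8_t crc = 0;                              /* remainder lives in the low nibble */
    for (size_t i = 0; i < len; i++) {
        for (int bit = 7; bit >= 0; bit--) {      /* feed each byte MSB first */
            uint8_t in  = (uint8_t)((data[i] >> bit) & 1u);
            uint8_t top = (uint8_t)((crc >> 3) & 1u);
            crc = (uint8_t)((crc << 1) & 0x0Fu);
            if (top ^ in) crc ^= 0x03u;           /* x + 1, the low terms of the polynomial */
        }
    }
    return crc;
}

/* Hypothetical helper: single even-parity bit over the whole payload. */
static uint8_t parity_bit(const uint8_t *data, size_t len) {
    uint8_t x = 0;
    for (size_t i = 0; i < len; i++) x ^= data[i];
    x ^= (uint8_t)(x >> 4);
    x ^= (uint8_t)(x >> 2);
    x ^= (uint8_t)(x >> 1);
    return (uint8_t)(x & 1u);
}
```

A parity bit catches any odd number of flipped bits but misses every even number, so two adjacent flipped bits leave it unchanged, while this CRC-4 always changes for an error burst of four bits or fewer. Being only 4 bits wide, it will still let roughly 1 in 16 random corruptions through.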
Title: Re: Speeding Up The RFM69 Library
Post by: zingbat on January 31, 2018, 04:37:28 AM
@ChemE ,

It's awesome to have such a small and dependency free implementation. Did you end up writing optimized receive code as well? I'm tinkering with an Optiboot variant that does OTA programming and thus am looking around for slim RFM69 implementations.

Jeff
Title: Re: Speeding Up The RFM69 Library
Post by: Felix on January 31, 2018, 04:29:06 PM
I'm tinkering with an Optiboot variant that does OTA programming

@zingbat - which one might that be?
Title: Re: Speeding Up The RFM69 Library
Post by: zingbat on February 03, 2018, 11:56:09 AM
I'm working on it - no established project.
Title: Re: Speeding Up The RFM69 Library
Post by: ChemE on February 03, 2018, 01:31:25 PM
@ChemE ,

It's awesome to have such a small and dependency free implementation. Did you end up writing optimized receive code as well? I'm tinkering with an Optiboot variant that does OTA programming and thus am looking around for slim RFM69 implementations.

Jeff

Jeff,

I did start fooling with the receive code but got bogged down because the registers didn't seem to work properly.  I wanted an implementation where I could simply poll the registers periodically to see if a packet had been received, but I never got that to work.  It looks like I'll need to set it up using interrupts just as Felix did.  I have been meaning to pick this project back up since I've put it down for so very long.  I'm glad at least the send code is useful for you though.  One day I'll have a reliable datagram working, dependency-free, and shoehorned into a few kilobytes of code space!
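
For reference, the polling idea described above could be sketched roughly as follows. This has not been verified on hardware. It assumes the radio has already been switched to receive mode, and it takes the register read as a callback (on the node that would just wrap readReg) so the loop can be exercised on a PC. wait_payload_ready and fake_read_reg are hypothetical names; the register address and bit come from the RFM69 register map (RegIrqFlags2 at 0x28, PayloadReady in bit 2).

```c
#include <stdbool.h>
#include <stdint.h>

#define REG_IRQFLAGS2          0x28  /* RegIrqFlags2 in the RFM69 register map */
#define IRQFLAGS2_PAYLOADREADY 0x04  /* bit 2: a complete packet is waiting in the FIFO */

/* Register-read callback; on the target this would simply call readReg(). */
typedef uint8_t (*read_reg_fn)(uint8_t addr);

/* Polls PayloadReady at most max_polls times; returns true if it was seen set. */
static bool wait_payload_ready(read_reg_fn read_reg, uint16_t max_polls) {
    while (max_polls--) {
        if (read_reg(REG_IRQFLAGS2) & IRQFLAGS2_PAYLOADREADY) return true;
    }
    return false;
}

/* Host-side stand-in for the radio: reports PayloadReady once
 * fake_reads_left has counted down to 1. */
static uint16_t fake_reads_left = 3;
static uint8_t fake_read_reg(uint8_t addr) {
    if (addr != REG_IRQFLAGS2) return 0;
    if (fake_reads_left > 1) { fake_reads_left--; return 0; }
    return IRQFLAGS2_PAYLOADREADY;
}
```

The bounded poll count is a design choice so that a node cannot hang forever when no packet arrives; the caller can go back to sleep and try again later.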
Title: Re: Speeding Up The RFM69 Library
Post by: zingbat on February 03, 2018, 02:08:15 PM
That would be awesome if you could send it! I think I'm running into the same problem: I'm polling the FIFO and just getting 999999 back.

Also, I haven't looked closely at this yet, but I believe the plainRFM69 library is using auto-receive mode; not sure if he's polling or using interrupts:

https://github.com/iwanders/plainRFM69
Title: Re: Speeding Up The RFM69 Library
Post by: Felix on February 03, 2018, 06:55:40 PM
I'm working on it - no established project.
You mean you are developing it yourself?
Title: Re: Speeding Up The RFM69 Library
Post by: zingbat on February 03, 2018, 08:44:03 PM
Yep! Slowly, since it has to fit in under 4k..
Title: Re: Speeding Up The RFM69 Library
Post by: ChemE on February 03, 2018, 11:04:34 PM
Probably should take this to a new thread, but I wonder if there should be an open-source Dual Optiboot 2.0 that does OTA.  Joe has done it; his version is closed source, but he seems willing to help with hints.  Felix could help too, though I know he won't go crazy on it since he has other commitments.  But still, there are enough of us interested/able that I bet we could get it done and make a big leap forward in capability.
Title: Re: Speeding Up The RFM69 Library
Post by: zingbat on February 04, 2018, 12:02:25 PM
Sounds great! I have receive working now and will hopefully post a proof of concept soon.
Title: Re: Speeding Up The RFM69 Library
Post by: ChemE on February 05, 2018, 05:36:59 AM
I would love to see the code extended to a working receive!  Please post it for everyone when you get a chance.
Title: Re: Speeding Up The RFM69 Library
Post by: LukaQ on April 10, 2018, 08:43:02 AM
Hey ChemE, do you also have a gateway available for this?
And I don't know where you set the 915M or 868M freq.
Title: Re: Speeding Up The RFM69 Library
Post by: ChemE on April 10, 2018, 02:10:50 PM
Hey ChemE, do you also have a gateway available for this?
And I don't know where you set the 915M or 868M freq.

When the module powers up, it is already set to 915MHz so I don't bother setting the carrier frequency.  To change it to 868MHz, though, you would use the following:

Code: [Select]
  // Change the carrier frequency to 868MHz
  writeReg( REG_FRFMSB, 0xD9 );  // RegFrfMsb, address 0x07
  writeReg( REG_FRFMID, 0x00 );  // RegFrfMid, address 0x08
  writeReg( REG_FRFLSB, 0x00 );  // RegFrfLsb, address 0x09

My gateway is not quite finished yet.  I have been successful in listening for packets and detecting payloadready in DIO0 though.
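
In case another carrier frequency is needed, the three register values in the snippet above follow from the datasheet relation FRF = f_carrier / Fstep, with Fstep = Fxosc / 2^19 and a 32MHz crystal. Here is a minimal sketch of that arithmetic; rfm69_frf_from_hz and the three byte helpers are hypothetical names, not part of the library.

```c
#include <stdint.h>

/* FRF = f_carrier / Fstep, where Fstep = Fxosc / 2^19 and Fxosc = 32 MHz. */
static uint32_t rfm69_frf_from_hz(uint32_t carrier_hz) {
    const uint64_t fxosc_hz = 32000000ULL;
    /* 64-bit intermediate avoids overflow; adding fxosc/2 rounds to nearest */
    return (uint32_t)((((uint64_t)carrier_hz << 19) + fxosc_hz / 2) / fxosc_hz);
}

/* Split the 24-bit FRF value into the bytes for RegFrfMsb/Mid/Lsb. */
static uint8_t frf_msb(uint32_t frf) { return (uint8_t)(frf >> 16); }
static uint8_t frf_mid(uint32_t frf) { return (uint8_t)(frf >> 8); }
static uint8_t frf_lsb(uint32_t frf) { return (uint8_t)frf; }
```

For 868MHz this gives 0xD90000, matching the values written above, and 915MHz gives 0xE4C000, which is the power-up default.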
Title: Re: Speeding Up The RFM69 Library
Post by: LukaQ on April 10, 2018, 03:31:28 PM
What is your CPU freq.? You've set TWBR=1, which would mean that the I2C clock would be very fast, 888kHz? Prescaler is 1?
I have too much capacitance on the bus, so I had to go with 48.

I can confirm that it works with the si7021; it should then also work with the si7051.
But I would like to ask whether 278 for Vcc is correct. As far as I know you are looking at the bandgap voltage, right?

Any reason you are going for "everything done fast as possible"?
Title: Re: Speeding Up The RFM69 Library
Post by: ChemE on April 10, 2018, 04:21:08 PM
What is your CPU freq.? You've set TWBR=1, which would mean that the I2C clock would be very fast, 888kHz? Prescaler is 1?
I have too much capacitance on the bus, so I had to go with 48.

I can confirm that it works with the si7021; it should then also work with the si7051.
But I would like to ask whether 278 for Vcc is correct. As far as I know you are looking at the bandgap voltage, right?

Any reason you are going for "everything done fast as possible"?

Fcpu is 16MHz yes.  278 is roughly what I measure on my Moteinos and yes my code is using the internal bandgap voltage as the reference.  As for why I'm doing everything as fast as possible, I always code that way!  I enjoy speeding code up, always have.  Things cannot break, but until they break, to me faster is better.
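
For the I2C clock numbers in the question, the ATmega328P datasheet gives SCL = F_CPU / (16 + 2 * TWBR * prescaler). A quick sketch of that arithmetic, with twi_scl_hz being a hypothetical helper name:

```c
#include <stdint.h>

/* ATmega328P TWI bit rate: SCL = F_CPU / (16 + 2 * TWBR * prescaler). */
static uint32_t twi_scl_hz(uint32_t f_cpu_hz, uint8_t twbr, uint8_t prescaler) {
    return f_cpu_hz / (16UL + 2UL * twbr * prescaler);
}
```

At 16MHz with a prescaler of 1, TWBR=1 works out to about 888kHz and TWBR=48 to about 142kHz. The datasheet only specifies the TWI up to 400kHz, so the faster setting is outside the rated range even if it happens to work on a short bus.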
Title: Re: Speeding Up The RFM69 Library
Post by: LukaQ on April 11, 2018, 02:28:15 AM
Also very refreshing to have code without dependencies. A totally different approach to doing the same job. I like your code; it gives me insight into how things work "in the background".
Do you have any other code like that which you would share? Maybe on GitHub?