jra, thanks for the links. My original suggestion is something like the Sleeping Beauty, but with an RFM radio layout.
Felix, a few more ideas ...
I definitely prefer the 1284, and these days I use the board shown for most of my projects. You'll notice my board has everything on it, so it makes a complete robot: 5V/3.3V 1A v.regs, radio, extra RAM/EEPROM chip, piezo, 32 kHz xtal layout, Arduino female headers plus 3-row male headers, and an I2C header. Sensors and servos plug right in. The 328 chips don't have enough RAM or I/O pins for a decent robot.
One thing I did compromise on was putting the ICSP header in the "wrong" place, since the XBee socket and 5V regulator took up so much space. However, to my knowledge only the Ethernet shields use that header, so I have some that are hardwired to D11..D13, and I rewired others so they'd plug in. If the board had both DPAK vregs and a smaller layout for the RFM radio, then the 1284 chip could be moved back and the ICSP put in the usual place. With SMT 1284 chips that's no issue anyway. And you could add a header on the right side for the extra 1284 I/O pins, as you mentioned.
These days I use DPAK v.regs for both the 5V and 3.3V regulators on my newer boards. They are small, very cheap, and handle heat better than SOT-223 or SOT-23 parts.
http://www.digikey.com/product-detail/en/LD1117DT50TR/497-1238-6-ND/1848396
There is a jumper on the board that can switch the CPU Vcc between 5V and 3.3V. If you look "closely" at the oscillator frequency vs. Vcc curves in the AVR datasheets, you'll see that at 3.3V the allowable frequency shown is actually about 13.3 MHz, so using a 16 MHz xtal is only overclocking by 20%. I've never had any problem with a 16 MHz xtal at 3.3V, and many other people on the Arduino forum have done likewise. I do have voltage dividers under the XBee radio socket for when Vcc=5V, but these days I mostly use Vcc=3.3V anyway.
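For anyone who wants to check the 13.3 MHz figure, here is a rough sketch in Python. It assumes the usual ATmega safe-operating-area breakpoints (10 MHz at 2.7V, 20 MHz at 4.5V, linear in between). Those breakpoints and the function name max_safe_mhz are my assumptions, so verify them against the datasheet for your exact chip.

```python
def max_safe_mhz(vcc):
    """Piecewise-linear safe-operating-area limit in MHz for a given Vcc."""
    # Assumed breakpoints: 10 MHz at 2.7 V, 20 MHz at 4.5 V and above.
    # These are illustrative; check the speed-grade figure in the datasheet.
    if vcc < 2.7 or vcc > 5.5:
        raise ValueError("Vcc outside the range modeled here")
    if vcc >= 4.5:
        return 20.0
    # Linear interpolation between the 2.7 V and 4.5 V breakpoints.
    return 10.0 + (vcc - 2.7) * (20.0 - 10.0) / (4.5 - 2.7)


limit = max_safe_mhz(3.3)
overclock_pct = (16.0 / limit - 1.0) * 100.0
print(f"limit at 3.3 V: {limit:.1f} MHz, 16 MHz xtal is {overclock_pct:.0f}% over")
```

With those assumed breakpoints the limit at 3.3V works out to about 13.3 MHz, which is where the 20% number comes from.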
One other thing, there is plenty of room to have more than a single SOIC8 layout for SPI flash/RAM, and I would have at least 2 such layouts. Also, my boards have series-Rs for protection in all I/O lines, but they take up a lot of space. I would however include 1K series-Rs at least in the Rx,Tx lines going to the FTDI connector, so then the FTDI adapter pin voltages wouldn't be a problem.
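To put a number on the Rx/Tx series resistors, here is a back-of-envelope sketch in Python. It assumes a 5V FTDI adapter driving a board running at Vcc=3.3V, and an input clamp diode that starts conducting about 0.5V above Vcc. Both figures and the function name clamp_current_ma are assumptions for illustration, not values from any particular datasheet.

```python
def clamp_current_ma(v_drive, vcc, r_series_ohms, v_diode=0.5):
    """Current in mA pushed into the input clamp diode through a series resistor."""
    # Voltage left over once the pin is clamped at Vcc + diode drop (assumed 0.5 V).
    excess = v_drive - (vcc + v_diode)
    # No current flows into the clamp if the driver is below the clamp voltage.
    return max(excess, 0.0) / r_series_ohms * 1000.0


# Assumed case: 5 V adapter output, 3.3 V board, 1K series resistor.
print(f"{clamp_current_ma(5.0, 3.3, 1000.0):.1f} mA")
```

Under those assumptions the 1K resistor holds the current to roughly a milliamp, and with a 3.3V adapter nothing flows into the clamp at all.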
EDIT: I would also put a jumper on the Vin pin on the FTDI header, so it wouldn't conflict with direct power to the board.