This simple code works with the MSSP on the PIC24F series. I wired SDO1 to SDI1 as a hardware loopback for self-testing. I can confirm that the simulator doesn't work for this; you need real hardware.
#pragma config FNOSC = FRCPLL // Oscillator Select (Fast RC Oscillator with Postscaler and PLL Module (FRCDIV+PLL))
#pragma config FWDTEN = OFF // Watchdog Timer Enable bits (WDT disabled in hardware; SWDTEN bit disabled)

#include <xc.h>
#define FCY 16000000 // Instruction clock in Hz; must be defined before including libpic30.h
#include <libpic30.h>
#include <stdint.h>

#define BUFFERSIZE 8

static const uint8_t _au8Tx[BUFFERSIZE]={0x55,0xAA,0x11,0x88,0x22,0x44,0xFF,0x00};
static uint8_t _au8Rx[BUFFERSIZE];

int main(void)
{
    int i;

    CLKDIVbits.RCDIV=0b000; // 0b000 -> div-by-1 before FRCPLL for 16MHz Fcy

    SSP1CON1=0; // Disable module for configuration
    SSP1CON1bits.SSPM=0b0010; // Bit clock=Fcy/16
    // SSP1CON1bits.SSPM=0b0001; // Bit clock=Fcy/4
    // SSP1CON1bits.SSPM=0b0000; // Bit clock=Fcy
    SSP1CON1bits.SSPEN=1; // Enable the module

    for (i=0;i<BUFFERSIZE;i++)
    {
        SSP1BUF=_au8Tx[i]; // Writing the buffer starts the transfer
        while (SSP1STATbits.BF==0) // Wait for buffer full: byte received
        {
            Nop();
        }
        _au8Rx[i]=SSP1BUF; // Reading the buffer clears BF
    }

    while (1)
    {
        Nop();
    }
}
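The listing above fills `_au8Rx` but leaves the comparison against `_au8Tx` to the debugger. As a minimal sketch of an explicit loopback check, something like the following could be called after the transfer loop. The function name `loopback_first_mismatch` is hypothetical and not part of the original code; it is plain C with no device dependencies.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of the original listing: returns the index of
 * the first byte that differs between the transmitted and received buffers,
 * or -1 if all len bytes match (i.e. the loopback passed). */
int loopback_first_mismatch(const uint8_t *tx, const uint8_t *rx, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++) {
        if (tx[i] != rx[i]) {
            return (int)i;
        }
    }
    return -1;
}
```

In the listing, this would be called as `loopback_first_mismatch(_au8Tx, _au8Rx, BUFFERSIZE)` just before the final `while (1)`, with a breakpoint or an LED on the result.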
I have refactored it for the MSSP on the PIC18F4620, but I'm out and about today and don't have any silicon with me to test it on. I have some PIC18F2620s back in the lab, which are the same device in a 28-pin package, and I can try it on those for you later. Here's the code to try for now. Again, the simulator doesn't work with the MSSP on this device either.
//#pragma config OSC = INTIO67 // Oscillator Selection bits (Internal oscillator block, port function on RA6 and RA7)
#pragma config OSC = INTIO7 // Oscillator Selection bits (Internal oscillator block, CLKOUT function on RA6, port function on RA7)
#pragma config PWRT = OFF // Power-up Timer Enable bit (PWRT disabled)
#pragma config WDT = OFF // Watchdog Timer Enable bit (WDT disabled (control is placed on the SWDTEN bit))
#pragma config MCLRE = ON // MCLR Pin Enable bit (MCLR pin enabled; RE3 input pin disabled)
#pragma config LVP = OFF // Single-Supply ICSP Enable bit (Single-Supply ICSP disabled)

#include <xc.h>
#include <stdint.h>

#define BUFFERSIZE 8

static const uint8_t _au8Tx[BUFFERSIZE]={0x55,0xAA,0x11,0x88,0x22,0x44,0xFF,0x00};
static uint8_t _au8Rx[BUFFERSIZE];

int main(void)
{
    int i;

    OSCCONbits.IRCF=0b111; // 0b111 -> div-by-1 8MHz internal clock

    // CS_ pin setup for master mode
    LATAbits.LATA5=1; // SS_ disabled
    TRISAbits.TRISA5=0; // RA5 & SS_ on the same pin, in master mode we need to manually assert

    // SCK pin setup for master mode
    LATCbits.LATC3=0;
    TRISCbits.TRISC3=0; // RC3 & SCK on same pin, TRIS bit needs setting to output for master mode

    // SDO pin setup (added: untested, per the datasheet requirement that TRISC<5> be cleared)
    TRISCbits.TRISC5=0; // RC5 & SDO on same pin, TRIS bit needs setting to output

    SSPCON1=0; // Disable module for configuration
    // SSPCON1bits.SSPM=0b0010; // Bit clock=Fcy/16
    // SSPCON1bits.SSPM=0b0001; // Bit clock=Fcy/4
    SSPCON1bits.SSPM=0b0000; // Bit clock=Fcy
    SSPCON1bits.SSPEN=1; // Enable the module

    LATAbits.LATA5=0; // Assert CS_
    for (i=0;i<BUFFERSIZE;i++)
    {
        SSPBUF=_au8Tx[i]; // Writing the buffer starts the transfer
        while (SSPSTATbits.BF==0) // Wait for buffer full: byte received
        {
            Nop();
        }
        _au8Rx[i]=SSPBUF; // Reading the buffer clears BF
    }
    LATAbits.LATA5=1; // Deassert CS_

    while (1) // Never returns, so no return statement is needed
    {
        Nop();
    }
}
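To put numbers on the "Bit clock" comments in the two listings, here is a small sketch of the arithmetic. The function name `mssp_spi_bit_clock_hz` is hypothetical, and the divider table only reflects the three SSPM settings named in the comments above. It also assumes Fcy = Fosc/4 on the PIC18, so the 8 MHz internal oscillator gives Fcy = 2 MHz, while the PIC24F listing defines FCY as 16 MHz.

```c
#include <stdint.h>

/* Hypothetical helper: SPI master bit clock in Hz for the three fixed-divider
 * SSPM settings used in the listings (0b0000, 0b0001, 0b0010).
 * Returns 0 for any other SSPM value, which this sketch does not model. */
uint32_t mssp_spi_bit_clock_hz(uint32_t fcy_hz, uint8_t sspm)
{
    switch (sspm) {
    case 0x0: return fcy_hz;        /* Bit clock = Fcy    */
    case 0x1: return fcy_hz / 4u;   /* Bit clock = Fcy/4  */
    case 0x2: return fcy_hz / 16u;  /* Bit clock = Fcy/16 */
    default:  return 0u;            /* Not covered here   */
    }
}
```

Under those assumptions, the PIC24F listing (Fcy = 16 MHz, SSPM = 0b0010) clocks SPI at 1 MHz, and the PIC18F4620 listing (Fcy = 2 MHz, SSPM = 0b0000) clocks it at 2 MHz.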
Edit: In my experience, the simulator is of very limited use for real-world development.
It can be quite useful for a limited number of things, such as optimizing number crunching during unit testing, because the cycle counts are generally reliable. Keep in mind, though, that for higher-end PIC32 devices with cache and wait states the simulator won't give you realistic cycle times.
For anything real-world, such as real-time scenarios with external stimulus or non-trivial peripherals, the simulator is next to useless and you need real hardware.
One final thing: always keep things as simple as possible to begin with, and build incrementally on top. In particular, avoid things like interrupts, FIFOs and DMA to start with. If you can't get a simple polling solution to work during prototype unit testing, I guarantee you'll never get those more complicated concepts to work. They just complicate things and cloud the issue.
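One way to keep a simple polling solution debuggable on real hardware is to bound the wait, so that a misconfigured module shows up as a timeout instead of a hang. The following is only a sketch: `wait_for_bits` is a hypothetical helper, not something from the listings above, and it takes an 8-bit register pointer, which suits the PIC18 code but would need a wider type for the 16-bit PIC24F registers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: polls *reg until any bit in mask is set, giving up
 * after max_polls reads. Returns true if a masked bit was seen set,
 * false if the poll budget ran out first. */
bool wait_for_bits(volatile const uint8_t *reg, uint8_t mask, uint32_t max_polls)
{
    while (max_polls-- > 0u) {
        if ((*reg & mask) != 0u) {
            return true;
        }
    }
    return false;
}
```

In the PIC18 listing, the `while (SSPSTATbits.BF==0)` loop could then become something like `wait_for_bits(&SSPSTAT, 0x01, 10000)`, assuming BF is bit 0 of SSPSTAT and picking the poll budget to suit the bit clock.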