Author Topic: [SOLVED] SPI Sdcard + I2C Sensor = No go!  (Read 11480 times)

Offline radhoo (Topic starter)

  • Contributor
  • Posts: 31
  • Country: ro
    • My technology blog
[SOLVED] SPI Sdcard + I2C Sensor = No go!
« on: December 28, 2016, 10:34:44 pm »
Heya folks!

This isn't my first post about crazy voodoo wasting precious time; something similar happened some time ago.

This time, things are different. I'm working on the uRADMonitor model D. To summarise, it is a 60x110mm board with an ATmega2561, a 2500mAh battery, two inverters based on the LTC3440 for 3.3V / 5.0V used independently, a TP4056 battery charger, a DS1337 RTC (I2C), an EEPROM (I2C), an FT232 USB2Serial, a software-configurable high voltage inverter with multiplier (configured for 480V), a microSD slot (SPI), a NEO6M GPS (UART), an ESP8266-ESP03 Wifi board (UART), a Bosch BME680 (I2C), a Sharp GP2Y1010AU0F and an ILI9341 2.4" LCD (SPI) with touchscreen.

After several changes and revisions:


I finally got to the fifth variant, which was good enough. I assembled two PCBs and wrote all the software: drivers for the various modules and sensors, a FAT32 implementation for the SD card, a GPS NMEA parser, an LCD driver with a minimalistic library, etc. Everything worked just fine.
I pushed it to production, but not before making some changes to the PCB layout; nothing fancy, just moving the speaker a little and adjusting the size of the SMD pads. The factory finished assembling a few devices, but the test software fails on them.

Here's the Voodoo part:
1. If my code inits the BME680 (over I2C) and tries to write something to the SD card, the device will NOT start at all; not even some innocent code that just writes something on the screen runs.
2. If I take out the BME680 init code and the read function, the code can write to the SD card just fine and everything works, for all the modules and components.

This behaviour is not happening on my two assembled test units. I suspect the following:
1. Different Atmega2561 with some weird memory issues (unlikely)
2. Code memory corruption (unlikely) - the code is nothing but a collection of personal libraries for all the modules used.

Testing is slow, as it is done remotely between Romania and the factory in China.

Any suggestions on how this could be addressed remotely?

Thanks!
« Last Edit: February 11, 2017, 04:24:47 pm by radhoo »

Blog :: My Youtube :: uRADMonitor :: "Build something that matters!"
 

Offline Fortran

  • Regular Contributor
  • *
  • Posts: 206
  • Country: fi
Re: SPI Sdcard + I2C Sensor = No go!
« Reply #1 on: December 28, 2016, 10:44:48 pm »
Are you running the 2561 on 3.3V?
Because they're available in two voltage ranges. 
Atmega2561V-xx  1.8V - 5.5V
Atmega2561-xx 4.5V - 5.5V
 

Fortran
Re: SPI Sdcard + I2C Sensor = No go!
« Reply #2 on: December 28, 2016, 11:30:20 pm »
Are you sure? :)

Atmega2561-16AU has a specified input rating of 4.5V to 5.5V.
3.3V is below that.

Going outside the specified rating even a little means basically anything can happen.
The ones you have that are working might be doing so just out of pure luck.
 

radhoo (Topic starter)
Re: SPI Sdcard + I2C Sensor = No go!
« Reply #3 on: December 28, 2016, 11:44:07 pm »
I had to delete previous reply because it was wrong.

So to put everything in order: the latest version is using a mega2561-16AU running on 3.3V. Long story short, initially it was a mega128, which got upgraded to the 2561. When selecting this IC I made a mistake, as per the picture attached.

Thanks for the second pair of eyes on this, Fortran. I will investigate this direction and report back.

Radu



 

Offline Someone

  • Super Contributor
  • ***
  • Posts: 4863
  • Country: au
    • send complaints here
Re: SPI Sdcard + I2C Sensor = No go!
« Reply #4 on: December 28, 2016, 11:55:10 pm »
You say the tube is alpha sensitive, but then how are you applying the biological weighting for the sievert? The datasheet for the tube doesn't show a reduced sensitivity to account for any weighting in the window. It would be very misleading to present that measurement if it's only based on a single isotope.
 

radhoo (Topic starter)
Re: SPI Sdcard + I2C Sensor = No go!
« Reply #5 on: December 29, 2016, 09:44:21 am »
The tube is alpha sensitive (but also beta and X-ray); the datasheet contains this info as well. The CPM to Sv/h conversion is done pretty much the same way on all Geiger tube based detectors, ranging from gamma-only to window tubes (see the expensive Gamma Scout, which is an excellent detector).
For better accuracy or scientific purposes, I would use selective filters on the input window, and the CPM value in calculations. The filters can be applied easily due to the aluminium enclosure's shape.

But we're getting a little off-topic, please hold your horses at least until I solve the problem posted above.  Thanks!
 

 

Offline dgtl

  • Regular Contributor
  • *
  • Posts: 183
  • Country: ee
Re: SPI Sdcard + I2C Sensor = No go!
« Reply #6 on: December 29, 2016, 10:50:33 am »
Perhaps just a simple stack overflow issue. The amount of static allocation has reached the point where stack collides with it and things get trashed.
 

radhoo (Topic starter)
Re: SPI Sdcard + I2C Sensor = No go!
« Reply #7 on: December 29, 2016, 12:52:12 pm »
Quote
Perhaps just a simple stack overflow issue. The amount of static allocation has reached the point where stack collides with it and things get trashed.
This crossed my mind, but no matter how much I simplified / changed the code, I wasn't able to prove it. Still, this would be the only logical explanation, given the behaviour. I am waiting to try Fortran's suggestion too.

 

Offline bktemp

  • Super Contributor
  • ***
  • Posts: 1616
  • Country: de
Re: SPI Sdcard + I2C Sensor = No go!
« Reply #8 on: December 29, 2016, 01:53:34 pm »
+1 for the stack overflow (or some other SRAM data corruption issue).

The datasheets aren't very clear about the voltage ratings, but from what I have heard, both variants are actually identical. They are only being tested under different conditions.
If all boards fail the same way it is clearly a software bug or a hardware issue (like supply voltage dropping when everything is active at the same time).
 

radhoo (Topic starter)
Re: SPI Sdcard + I2C Sensor = No go!
« Reply #9 on: December 29, 2016, 01:58:16 pm »
Well, the datasheet does say that the atmega2561-16AU needs 4.5-5.5V. My bad for overlooking that, due to a confusion I made while reading the datasheet.

Still, my two test boards work perfectly, mega2561-16AU, with 14.7MHz crystals on 3.3V.
Quote
The datasheets aren't very clear about the voltage ratings, but from what I have heard, both variants are actually identical. They are only being tested under different conditions.
Do you have more info on this?

Quote
If all boards fail the same way it is clearly a software bug or a hardware issue (like supply voltage dropping when everything is active at the same time).
On the other hand, two test boards in the factory failed in the weird way presented above: they do work, but not when I want to use the BME680 code + SD card code together. When this happens, the LCD will not even show anything, despite the BME/SD card code being a few lines later, as if the code didn't run in a linear way.
The same code runs on my two test boards perfectly. I'd say that yes, the hardware difference is a different mega2561-16AU used in the factory.

I know you guys can't guess such a problem from so little info. But thank you for all your ideas so far, I'll double check everything!

What I am planning to do is to follow the datasheet indications, and switch the 2561-16au for a 2561V-8AU as a first step.
« Last Edit: December 29, 2016, 02:02:29 pm by radhoo »

bktemp
Re: SPI Sdcard + I2C Sensor = No go!
« Reply #10 on: December 29, 2016, 03:06:21 pm »
Quote
Still, my two test boards work perfectly, mega2561-16AU, with 14.7MHz crystals on 3.3V.
14.7MHz @3.3V is way out of spec, regardless of the variant. For 14.7MHz you need at least 4.2V.

Looking deeper into the datasheet there seems to be a difference between the smaller versions and the ATMega2560/2561:
All of the smaller versions are specified for 2.7-5.5V while ATMega2560/2561 are only specified at 4.5-5.5V.
In Table 31-1 the voltage range is still at 2.7-5.5V for the ATMega2560.
It almost looks like they designed the chip for 2.7-5.5V but experienced some problems and changed the specifications to 4.5-5.5V.
The errata for older revisions of the ATmega2560 also state: "Part does not work under 2.4 volts".

Quote
The datasheets aren't very clear about the voltage ratings, but from what I have heard, both variants are actually identical. They are only being tested under different conditions.
Do you have more info on this?
I can't remember where I got this information, but it makes sense and there are some clues in the datasheets. For example, there are no separate errata for the V version.

Quote
On the other hand two test boards in the factory, failed in the weird way presented above: they do work, but not when I want to use BME680 code + SDCard code together.  When this happens, the LCD will not even show anything, despite the bme/sdcard code being a few lines after, like the code wouldn't run in a linear way.
The same code runs on my two test boards perfectly. I'd say that yes, the hardware difference is a different mega2561-16AU used in the factory.
Functions are executed in the order you write them. But they can affect each other, because they may have different stack requirements.
I would replace the AVR on a non-working PCB but change nothing else (make sure to use the same hex file and the same fuse bits).
That is probably the only way to be sure it was a problem with the 2561 instead of 2561V version.


 

dgtl
Re: SPI Sdcard + I2C Sensor = No go!
« Reply #11 on: December 29, 2016, 03:47:37 pm »
To check for stack overflows, do the following:
At the start of the code, fill the whole RAM after the static allocations up to the stack start with a known pattern. After a while, print out the RAM and you'll see where the pattern is gone. Wherever the pattern is gone, the stack has reached that point. If the whole stack space is used up, you have a problem. This is easier to do with a debugger, but you can do it even with UART debug prints.
 

radhoo (Topic starter)
Re: SPI Sdcard + I2C Sensor = No go!
« Reply #12 on: December 29, 2016, 04:35:16 pm »
Quote
It almost looks like they designed the chip for 2.7-5.5V but experienced some problems and changed the specifications to 4.5-5.5V.
Add to that the fact that the ones I am using function properly, and the code is quite complex, exercising most of the IC's sub-systems.

Quote
I would replace the AVR on a non working pcb but change nothing else (make sure to use the same hex file and the same fuse bits).
That is probably the only way to be sure it was a problem with the 2561 instead of 2561V version.
Yes, that's the plan.

Quote
To check for stack overflows, do the following:
At the start of the code, fill the whole RAM after static allocations up to stack start with a known pattern. After a while, print out the ram and you'll see where the pattern is gone. If the pattern is gone, stack has reached that point. If the whole stack space is used up, you'll have a problem. This is easier to do with a debugger, but even with uart debug prints you can do it.
Thanks for the idea, I will try this as well. Testing is going very slowly, because I am doing everything remotely with the factory. The code is behaving strangely. Again, this code is well verified, being made of classes I wrote over time for other projects. The problem's mechanism on the factory units is as follows (quoting test code):
Initialisation:
Code:
at24c eeprom;
ILI9341 lcd; // LCD
Inverter inverter; // regulated high voltage inverter for the Geiger tube
NMEA nmea;
DS1337 rtc;
SDClass sd; // the sdcard
TouchScreen touch; // touchscreen
ESP8266 wifi; // setup wifi on serial communication
Battery battery; // Battery manager
// define sensors
BME680 bme680; // Bosch BME680
GEIGER geiger; // Geiger LND712 tube
GP2Y1010AU0F pm25; // Sharp Optical PM sensor
Code:
void Devices::init() {
    lcdReset.init(&PORTA, PA0);
    lcdDC.init(&PORTA, PA1);
    lcdCS.init(&PORTA, PA2);
    lcdBacklight.init(&PORTA, PA3);
    sdCS.init(&PORTE, PE3);
    touchXP.init(&PORTA, PA4);
    touchYN.init(&PORTA, PA5);
    touchXN.init(&PORTF, PF6);
    touchYP.init(&PORTF, PF7);
    pwrUnit.init(&PORTA, PA6);
    pwrWLAN.init(&PORTE, PE2);
    pwrGPS.init(&PORTG, PG0);
    pwrFAN.init(&PORTD, PD6);
    pwrSPK.init(&PORTA, PA7);
    pwrDustLED.init(&PORTD, PD7);
    sdAvailable.init(&PORTE, PE5, DigitalPin::INPUT);
    pwrBatteryADC.init(&PORTG, PG2);
    pwrInverter5V.init(&PORTC, PC7);

    eeprom.init(&i2c);
    rtc.init(&i2c);

    lcd.init(&lcdDC, &lcdReset, &lcdCS, &lcdBacklight, &spi);
    touch.init(&adc, &touchXP, &touchYN, &touchXN, &touchYP);
}

Test run:
Code:
// keep power pin High unless we want to shutdown
pwrUnit = ON;
_delay_ms(500);
pwrGPS = ON;
_delay_ms(500);
pwrWLAN = ON;
_delay_ms(200);

pwrSPK = 1; _delay_ms(1); pwrSPK = 0; // a short start beep

// system config
// Disable JTAG so PF4 (UNUSED), PF5 (BAT ADC), PF6 (TOUCH XN ADC) and PF7 (TOUCH YP ADC) can be used for ADC and GPIO.
// JTD has to be written twice within four clock cycles for the change to take effect.
MCUCR |= (1 << JTD); MCUCR |= (1 << JTD);
adc.start(); // start ADC
uart0.start(0, 9600, 1); // connected to the GPS
uart1.start(1, 115200, 1); // used by esp8266 .
EICRB |= (1 << ISC40) | (1 << ISC41); // Configure INT4 to trigger on RISING EDGE (INT4's sense bits in EICRB are ISC40/ISC41; same bit positions as ISC00/ISC01)
EIMSK |= (1 << INT4); // Configure INT4 to fire interrupts
i2c.init();
rtc.start();
time.init(callback_timeSecond, callback_timeMinute);

spi.startfast();

// start LCD
lcd.start(ILI9341::ROT0, BLACK, ON);
lcd.drawString(buffer, len, 0,0, 20, RED, BLACK, "1  ");
wifi.init(&uart1);

lcd.drawString(buffer, len, 0,0, 20, RED, BLACK, "2  ");
battery.init(&pwrBatteryADC, &adc, BATTERY_ADC); // init battery manager

lcd.drawString(buffer, len, 0,0, 20, RED, BLACK, "3  ");
inverter.start(&adc, INVERTER_ADC); // create Timer T1 PWM to drive inverter for regulated Geiger tube voltage

lcd.drawString(buffer, len, 0,0, 20, RED, BLACK, "4  ");
bme680.start(&i2c); // init Bosch BME680 sensor

lcd.drawString(buffer, len, 0,0, 20, RED, BLACK, "5  ");
pm25.init(&pwrDustLED, &adc, PMSENSOR_ADC); // init PM2.5 Sharp GP2Y1010AU0f sensor

lcd.drawString(buffer, len, 0,0, 20, RED, BLACK, "6  ");

int res1 = sd.begin(&sdCS, &sdAvailable, &spi);
cid_t cid;
int res2 = sd.mkdir("test-dir");
File f = sd.open("test-dir/test.txt", FILE_WRITE);
int res3 = f.write((uint8_t *)"Hello World!\r\n",14);
f.close();
sd.getCard().readCID(&cid);

This works on both my test devices. On the factory units, this fails:
1. As it is, the LCD only blinks once when the backlight is turned on (the line lcd.start(ILI9341::ROT0, BLACK, ON); ). There is no BLACK background, so the logic already breaks at that point.
2. If I comment out the bme680.start line, the code runs just fine and the SD card gets the new folder with the test.txt file in it. Before that I see 1,2,3,4,5,6.
3. If instead I leave bme680.start enabled but comment out the SD card folder and file creation code, the code again runs. Before that I see 1,2,3,4,5,6.

Strange, isn't it?

Attached a few random test images.

 

Offline senso

  • Frequent Contributor
  • **
  • Posts: 951
  • Country: pt
    • My AVR tutorials
Re: SPI Sdcard + I2C Sensor = No go!
« Reply #13 on: December 29, 2016, 06:45:01 pm »
Running out of memory(ram)?
I had a lot of "problems" with I2C recently due to having a lot of devices attached to one micro-controller. If the lib you are using is anything like Peter Fleury's, it is not that hardened, and neither were my device drivers for each I2C device, so probe around that a bit.
Is there some cli/sei in the fatFS code?
It might be that the I2C lib is interrupt dependent and the BME is taking a long time to respond; I would say some kind of race condition.
 

radhoo (Topic starter)
Re: SPI Sdcard + I2C Sensor = No go!
« Reply #14 on: December 29, 2016, 06:48:51 pm »
The I2C lib is just a very basic read / write with no interrupts. When the problem happens, it doesn't even show "1,2,3...", which comes before getting to the BME680.start line. That's the weird part.

bktemp
Re: SPI Sdcard + I2C Sensor = No go!
« Reply #15 on: December 30, 2016, 11:54:35 am »
What happens if you switch to internal 8MHz RC oscillator?

Stack overflow problems can be really difficult to find. That's why I prefer static memory allocation, because the memory usage is known at compile time.
I once had software that worked fine for years, until I changed the SD card. Then it crashed occasionally, exactly after 1022 files had been written to the SD card.
It took me weeks to find the actual problem: when writing the 1023rd file, the cluster holding the directory was full, so the FAT driver had to allocate a new one. Calling this function needs additional stack.
If an interrupt triggered at the same time, the stack overflowed and damaged the file system structure.

If interrupts are being used, even minor changes in the code or the timing can have large effects if there isn't much available memory left for the stack.
 

Offline lujji

  • Contributor
  • Posts: 29
  • Country: 00
Re: SPI Sdcard + I2C Sensor = No go!
« Reply #16 on: December 30, 2016, 01:45:36 pm »
Quote
1. If my code inits the BME680 (over I2c) and tries to write something to the SDCard, the device will NOT start, nothing, not even some innocent code just writing something on the screen.
Are you polling during I2C read/write operations? Perhaps this is a signal integrity issue and your uC freezes while waiting for the slave to respond. Can you probe the I2C lines with a scope?
 

radhoo (Topic starter)
Re: SPI Sdcard + I2C Sensor = No go!
« Reply #17 on: December 30, 2016, 01:48:13 pm »
I thought about that too, and the read op has a timeout already. It doesn't help. And keep in mind that I get no LCD output even before reaching the BME680 code, which is the weird part, really!

My best bet is on Fortran's observations. Something goes totally wrong due to the chip working out of specs. Will see on Monday.

radhoo (Topic starter)
Re: SPI Sdcard + I2C Sensor = No go!
« Reply #18 on: January 07, 2017, 12:27:47 pm »
Additional testing was performed:
1. atmega1281-16au with 14.7M crystal
2. atmega2561v-8au with 7.37M crystal
3. the initial atmega2561-16au with 14.7M crystal

The behaviour and results were identical: the code and the various modules (LCD, RTC, Dust sensor, 480V inverter, Tube counter, GPS, Wifi) all work OK, but when the BME680 is initialised and then the SD card is initialised, the unit won't start (even though there is LCD display code before the BME and SD card init, nothing gets on the screen).

So weird. Still at this point I can only conclude it is a software issue. The code works perfectly on my two test units.

The factory will ship two units here so I can do more detailed tests.

Any suggestions are welcome.

senso
Re: SPI Sdcard + I2C Sensor = No go!
« Reply #19 on: January 09, 2017, 09:09:06 pm »
Quote
The I2C lib is just a very basic read / write with no interrupts. When the problem happens, it doesn't even show "1,2,3..." even before getting to the BME680.start line. That's the weird part

Can you post it?
All the I2C libs I know use an interrupt to run the state machine of the I2C hardware engine; maybe that's your problem, a weird race condition.
 

Offline krho

  • Regular Contributor
  • *
  • Posts: 223
  • Country: si
Re: SPI Sdcard + I2C Sensor = No go!
« Reply #20 on: January 10, 2017, 06:31:19 am »
senso might have a point.
Also, if you have a debugger, uploading and running a debug version via Atmel Studio and hitting the pause button will show you where it is stuck. The Atmel-ICE is rather cheap and it can save your ass more times than you think.
 

radhoo (Topic starter)
Re: SPI Sdcard + I2C Sensor = No go!
« Reply #21 on: January 16, 2017, 06:34:39 pm »
Hi Guys!

Got two units shipped from the factory, those that handled the remote tests with the various issues I presented. The soldering is beautiful, see the first picture:

I flashed the firmware, and guess what, they work perfectly, exactly like on my previous test devices.


I was hoping to replicate the issue locally so I could find the cause. But noooo, this has to be hard, or life loses its awesomeness. Anyway, since everything works as expected, I can only suspect the programmer + programming software, exactly as happened in the old article I mentioned in the first post: http://www.pocketmagic.net/atmega128-voodoo/
Back then, as you'll see if you read it, it was a similar problem, but on the mega128, where some simple code would work, but the same functionality placed in a class would fail. I think it was LED blinking code, nothing fancy. It failed right up until I changed the programming software. Unbelievable, isn't it??! But apparently that was a known bug.



« Last Edit: January 16, 2017, 06:38:56 pm by radhoo »

radhoo (Topic starter)
[SOLVED] Re: SPI Sdcard + I2C Sensor = No go!
« Reply #22 on: February 11, 2017, 09:56:20 am »
Ok Guys, problem solved  8)

Be very careful about this, as it's a lesson I knew about, but neglected it. Maybe it will save some time for some of you.

So we have some code that works in a sequential manner, meaning it does some things (LCD text display), and then it does some more things (BME680 sensor reading). We add a third set of things (SD card ops), but then the entire thing breaks. We no longer see the first set of things being executed. WEIRD!

So of course you would assume it's a software glitch, except that the very same program runs perfectly on another set of devices (not one, but 3 other devices).

So ok, then you figure it must be related to hardware. So you send some devices all the way around the globe, to test the code on the suspect hardware. And when you finally do that, you see that everything is working properly.

Ok, what's this BS?

Apparently, it all has to do with the software used to program the microcontroller! That's right. We all think there are plenty of checksums and verifications in place, not to mention that if some code works, it's enough proof that writing the HEX works. BUT NO!
This stupid problem was caused because they used PROGISP 1.72 in the factory, while I was using the good old command-line avrdude. Go figure. This is exactly the same issue I mentioned at the beginning of this thread, with the ATmega128 voodoo thing. Exactly the same!

The factory switched to a different program, AVR Fighter, and it works OK, exactly like the samples they sent me and exactly like those that I built initially.

Yes, this makes life beautiful.

Later edit: I just blogged about this: http://www.pocketmagic.net . Case closed. Thank you, good folks, for your help on this! EEVblog is a great place to discuss hardware.
« Last Edit: February 11, 2017, 04:26:12 pm by radhoo »
