Interrupt Madness
Posted on 2010-10-11 22:44 in Blog
The other day I ran into a very strange issue while debugging a problem at work. We were evaluating a new microprocessor for use in one of our products and had been trying to exercise all the different functionalities the processor had to offer.
I was testing out the input capture functionalities and discovered that the first capture (rising edge of input pulse) always worked correctly, but the second edge was never detected. After some digging, it appeared as though the I2C communication routines were interfering with the capture.
I turned on the debugger, ran the code, and it broke inside the I2C routine, hanging while it waited for a reply that was never going to come (the test code had no timeouts). I commented out all calls to the I2C code within the program, reran it, and the debugger once again broke inside the I2C routine. Somehow, the program counter was jumping to an unused block of code. I knew it was unused because the compiler complained about the unused segment.
Putting on my debugger hat, I placed a breakpoint immediately after the first capture and stepped through the code one line at a time. On the following line, something strange happened:
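In hindsight, even throwaway test code is easier to debug when its polling loops can give up. Here is a minimal sketch of a bounded wait, written so it runs on a desktop compiler. The callback type and the names wait_with_timeout, never_ready, and ready_after are made up for illustration and are not from the vendor's code or ours.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical status poll: returns true once the awaited reply has arrived. */
typedef bool (*poll_fn)(void *ctx);

/* Poll until ready() reports success or max_polls attempts have been made.
   Returns true on success, false on timeout. */
bool wait_with_timeout(poll_fn ready, void *ctx, uint32_t max_polls)
{
    for (uint32_t i = 0; i < max_polls; i++) {
        if (ready(ctx)) {
            return true;
        }
    }
    return false; /* the reply never came; let the caller decide what to do */
}

/* Stand-in for a bus that never answers (the situation in the post). */
bool never_ready(void *ctx)
{
    (void)ctx;
    return false;
}

/* Stand-in for a device that answers after *ctx more polls. */
bool ready_after(void *ctx)
{
    uint32_t *remaining = ctx;
    if (*remaining == 0) {
        return true;
    }
    (*remaining)--;
    return false;
}
```

With something like this in place, a stray jump into the I2C routine would have returned an error instead of hanging, which would at least have pointed at the real question: who called it?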
counter += 1;
Instead of advancing to the next instruction, the program jumped almost to the beginning of memory (15 bytes after the end of the memory-mapped register range). I repeated this several times, always with the same result. As an experiment, I rearranged a few lines of code and reran. This time, the code jumped on a different line, but landed in the same location.
The only things I know of in microprocessor land that can change your execution point are a faulty processor or, more likely, an interrupt occurring. After poking through the map file, I realized the I2C routine was the first executable block in code, and that 0xFF, the value which fills all unused space, decodes as a move instruction:
MOV A, B
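To see why a stray jump ends up in the first real routine, here is a small host-side simulation of the slide. It assumes each 0xFF fill byte executes as a one-byte instruction that simply falls through to the next address. The post only says 0xFF decodes as a move, so the one-byte length is my assumption, and slide_to_code is a made-up name.

```c
#include <stddef.h>
#include <stdint.h>

#define FILL_BYTE 0xFFu /* value of unused program memory, per the post */

/* Walk forward from pc the way the CPU would while executing harmless
   one-byte fill instructions. Returns the address of the first non-fill
   byte, or size if the end of memory is reached first. */
size_t slide_to_code(const uint8_t *mem, size_t size, size_t pc)
{
    while (pc < size && mem[pc] == FILL_BYTE) {
        pc++; /* "MOV A, B" changes no control flow, so just advance */
    }
    return pc;
}
```

Fill a buffer with 0xFF, drop a few non-0xFF bytes in to stand for the I2C routine, and call slide_to_code from any earlier address: it always returns the start of that routine, no matter where in the fill you land.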
It makes sense that a bad jump into unused space would eventually land the code in the I2C routine, but what was triggering the interrupt? A five-minute search yielded the solution.
The code we were using was based on example code provided by the vendor. I hacked the various pieces together and changed them from interrupt-driven functions to polling-driven functions. Unfortunately, I missed the line in the initialization routine where the input capture interrupt was enabled. Most surprisingly, the compiler allowed us to enable an interrupt without defining an interrupt handler. (In this processor, all interrupt controls had direct bit access. They were not grouped together requiring a byte mask.)
The interrupt event occurred and the program counter was loaded with the default value from the interrupt table. We have no idea how 0xFFFF from the table was translated into 0x034A, which is where execution picked back up. Commenting out the interrupt enable line resolved our issue, and we were able to finish evaluating the board.
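One common defence against this kind of surprise is to point every unused vector at a catch-all handler, so an unexpected interrupt lands somewhere obvious. The sketch below models the idea on a desktop compiler with a table of function pointers. It is not this processor's real vector table format, and NUM_VECTORS and every function name are invented for the example.

```c
#include <stddef.h>

#define NUM_VECTORS 8 /* hypothetical number of interrupt sources */

typedef void (*isr_t)(void);

/* Counters stand in for whatever a real handler would do. */
volatile int unexpected_irq_count = 0;
volatile int capture_irq_count = 0;

/* Catch-all: on real hardware this is a good place for a breakpoint. */
void default_isr(void) { unexpected_irq_count++; }

/* Example "real" handler for the input capture source. */
void capture_isr(void) { capture_irq_count++; }

isr_t vector_table[NUM_VECTORS];

/* Point every vector at the catch-all so no entry is left at a fill value. */
void vectors_init(void)
{
    for (size_t i = 0; i < NUM_VECTORS; i++) {
        vector_table[i] = default_isr;
    }
}

/* Install a handler for one source. Returns 0 on success, -1 on bad input. */
int vectors_install(size_t irq, isr_t handler)
{
    if (irq >= NUM_VECTORS || handler == NULL) {
        return -1;
    }
    vector_table[irq] = handler;
    return 0;
}

/* Simulate the hardware dispatching interrupt number irq. */
void dispatch(size_t irq)
{
    if (irq < NUM_VECTORS && vector_table[irq] != NULL) {
        vector_table[irq]();
    }
}
```

Had our table been set up along these lines, the forgotten enable bit would have dropped us into default_isr on the first capture instead of sending us on a tour of unused memory.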
Morals:
1) Interrupts are evil. They introduce too much variability into the
execution process.
2) Sample code is a great place to start from, but don’t base your
demonstration code on it.
3) Debugging takes time and patience. Take a break once in a while to
relax and clear your thoughts.