I will not analyze every way DSP processors differ from regular microcontrollers; there are many differences, and they boost performance in specific tasks like filtering, FFT, etc. One thing is obvious: a DSP processor has to perform mathematical calculations rapidly enough to deliver predictable results. The better the result we want, the more processing power we need. We know that any processor performs two main kinds of work: data manipulation and mathematical operations. In signal processing, both have to be done really fast. General-purpose microcontrollers aren’t optimized to perform these tasks efficiently, because a microcontroller has to be as universal as possible to fit many application areas. In other words, flexibility reduces performance.
DSP processors are more specialized microprocessors, optimized for the tasks they usually do: multiplication and addition. Let’s take the most common DSP routine, a FIR digital filter. It takes several samples of the input signal x[] and produces the output signal y[] by multiplying each sample by its corresponding coefficient a0, a1, …, ak and adding up the products:
y[n] = a0·x[n] + a1·x[n-1] + … + ak·x[n-k]
The quality of the filter output depends on the number of coefficients: more coefficients give better-looking results. In practice, there may be anywhere from a few coefficients to thousands. This set of coefficients is the so-called filter kernel. Now consider how many operations a DSP has to perform to keep up with the sampling rate. Let’s take a maximum sound frequency of 20 kHz and a sampling rate of just 40 kHz, the bare minimum by the Nyquist criterion, so the filter has to deal with 40,000 samples per second. Now imagine a filter with 100 coefficients and look at the formula above. For every output sample, the DSP has to keep the last 100 input samples, multiply each of them by its coefficient, and add up the products. That is 100 multiply-accumulate operations per sample, or 4,000,000 per second. Then the oldest sample has to be removed, all samples shifted, and the new sample x[n] placed at the front of the array. Could you imagine doing this with a simple MCU?
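To make the cost concrete, here is a minimal sketch in C of the "shift everything" approach a plain MCU would use. The names (`fir_shift_t`, `fir_shift_step`, `macs_per_second`) and the kernel length of 4 are assumptions made for illustration only; the example in the text would use 100 coefficients.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_TAPS 4 /* hypothetical kernel length; the text's example uses 100 */

/* Delay line: history[0] holds the newest sample x[n], history[k] holds x[n-k]. */
typedef struct {
    float history[NUM_TAPS];
} fir_shift_t;

/* Process one input sample: shift the whole delay line, then
 * multiply-accumulate across the kernel. */
float fir_shift_step(fir_shift_t *f, const float coeffs[NUM_TAPS], float sample)
{
    assert(f != NULL && coeffs != NULL);

    /* Drop the oldest sample and move every other sample back by one slot. */
    for (size_t i = NUM_TAPS - 1; i > 0; --i) {
        f->history[i] = f->history[i - 1];
    }
    f->history[0] = sample;

    /* y[n] = a0*x[n] + a1*x[n-1] + ... */
    float acc = 0.0f;
    for (size_t k = 0; k < NUM_TAPS; ++k) {
        acc += coeffs[k] * f->history[k];
    }
    return acc;
}

/* Multiply-accumulate operations required per second. */
unsigned long macs_per_second(unsigned long sample_rate_hz, unsigned long taps)
{
    return sample_rate_hz * taps;
}
```

With a 40 kHz sampling rate and 100 coefficients, `macs_per_second(40000, 100)` gives 4,000,000, and the shifting loop adds roughly 100 memory moves per sample on top of that.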
Let’s talk about shifting the input samples. On a regular MCU you have to move the samples in memory yourself, while DSP processors have built-in hardware to deal with this, so updating the samples takes very little time. That matters because output samples have to go out at the same rate as input samples arrive (real-time, or more precisely online, processing). DSP hardware provides so-called circular buffers, which solve this problem with one operation: updating a single sample value.
A circular buffer is a region of memory managed by four pointer parameters: the first indicates the start of the circular buffer, the second indicates its end, the third gives the step size between memory locations (for example, when each value takes two bytes), and the last points to the next sample. As we can see, all the work is done through addressing: only one memory address is updated with the new sample value. The oldest sample in the circular buffer is overwritten and becomes the newest one in the next cycle. This way the circle goes on and on 🙂 A circular buffer is managed at the hardware level; you only need to set up the initial pointers at the beginning. When a new sample arrives, it is written at the fourth pointer, which then advances to the next location to be updated. Simple and fast. Circular buffers in DSPs are optimized so that processing can reach the highest speed.
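The four parameters can be emulated in software, which may help to show what the DSP hardware does for free. The following C sketch is only an illustration: the type and function names (`circ_buf_t`, `circ_init`, `circ_put`, `circ_get`) are hypothetical, `end` is assumed to point one element past the buffer, and the step is counted in elements and fixed at 1 rather than in bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Software model of the four circular-buffer parameters. */
typedef struct {
    float *start;   /* first element of the buffer */
    float *end;     /* one past the last element (assumption of this sketch) */
    ptrdiff_t step; /* step size in elements; fixed at 1 here */
    float *next;    /* where the next sample will be written */
} circ_buf_t;

void circ_init(circ_buf_t *cb, float *storage, size_t length)
{
    assert(cb != NULL && storage != NULL && length > 0);
    cb->start = storage;
    cb->end = storage + length;
    cb->step = 1;
    cb->next = storage;
}

/* Overwrite the oldest sample with the new one, then advance and wrap.
 * No other sample is moved. */
void circ_put(circ_buf_t *cb, float sample)
{
    *cb->next = sample;
    cb->next += cb->step;
    if (cb->next >= cb->end) {
        cb->next = cb->start;
    }
}

/* Read the sample that is `age` steps old; age 0 is the newest sample.
 * Assumes step == 1. */
float circ_get(const circ_buf_t *cb, size_t age)
{
    size_t length = (size_t)(cb->end - cb->start);
    size_t next_index = (size_t)(cb->next - cb->start);
    assert(age < length);
    size_t index = (next_index + length - 1 - age) % length;
    return cb->start[index];
}
```

On a real DSP the wrap-around check in `circ_put` is handled by the address generation hardware, so it costs no extra instructions; in this software model it is an explicit comparison.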
Finally, let’s see what operations are needed to implement a simple FIR filter:
- Obtain a new sample from ADC;
- Detect interrupt and manage it;
- Move sample to circular buffer, replacing the oldest sample;
- Update pointer for input sample;
- Zero Accumulator register;
- Loop through each of the coefficients;
- Fetch coefficients from another circular buffer;
- Update pointer for coefficients;
- Fetch sample from the circular buffer;
- Update pointer;
- Multiply coefficient by sample;
- Add product to accumulator;
- Move output sample to output buffer;
- Move output sample to DAC.
All these steps have to be done quickly to achieve real-time processing. A traditional MCU carries out these operations in series, while a DSP performs many of them in parallel, so that one pass of the loop can be done in a single clock cycle.
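The per-sample steps listed above can be sketched in C as follows. This is a software illustration under stated assumptions, not DSP code: `fir_circ_t` and `fir_circ_step` are hypothetical names, the kernel length of 4 is arbitrary, the coefficients sit in a plain array rather than a second circular buffer, and the ADC interrupt and DAC write are left out so that the function simply takes a sample and returns the output.

```c
#include <assert.h>
#include <stddef.h>

#define FIR_TAPS 4 /* hypothetical kernel length */

typedef struct {
    float samples[FIR_TAPS]; /* circular buffer of input samples */
    size_t newest;           /* index of the most recent sample */
} fir_circ_t;

/* Process one input sample without shifting the buffer. */
float fir_circ_step(fir_circ_t *f, const float coeffs[FIR_TAPS], float sample)
{
    assert(f != NULL && coeffs != NULL);

    /* Update the input pointer (with wrap-around) and overwrite
     * the oldest sample with the new one. */
    f->newest = (f->newest + 1) % FIR_TAPS;
    f->samples[f->newest] = sample;

    float acc = 0.0f;       /* zero the accumulator */
    size_t idx = f->newest; /* sample pointer starts at x[n] */

    for (size_t k = 0; k < FIR_TAPS; ++k) {
        /* Fetch coefficient and sample, multiply, add to the accumulator. */
        acc += coeffs[k] * f->samples[idx];
        /* Step the sample pointer back to x[n-k-1], wrapping at the start. */
        idx = (idx == 0) ? FIR_TAPS - 1 : idx - 1;
    }
    return acc; /* output sample, ready for the output buffer or DAC */
}
```

Each pass of the loop here takes several instructions on a general-purpose MCU (two fetches, two pointer updates, a multiply, and an add), which is exactly the work a DSP can overlap.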