I am not going to analyse every way in which DSP processors differ from regular microcontrollers, as there are many differences that boost performance in specific tasks such as filtering, FFT, etc. One thing is obvious: DSP processors have to perform mathematical calculations fast enough to give predictable results. The better the result we want, the more processing power we need. We know that MCUs perform two main tasks: data manipulation and mathematical operations. In signal processing, though, both have to be done really fast. General purpose microcontrollers aren’t optimised to perform these tasks efficiently, as a microcontroller has to be as universal as possible to fit many application areas. In other words, flexibility reduces performance.
DSP processors are more specialized microprocessors that are optimised for the tasks they usually perform: multiplication and addition. Let's take the most common DSP routine, the FIR digital filter. It takes several samples of the input signal x[] and produces the output signal y[] by multiplying the appropriate samples by coefficients a_{n} and adding the products together.
y[n] = a_{0}x[n] + a_{1}x[n-1] + … + a_{k}x[n-k]
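To make the formula concrete, here is a minimal Python sketch that evaluates it directly for one output sample. The function name fir_output, the three coefficients and the test signal are made up for illustration, and treating samples before the start of the signal as zero is an assumption, not something the formula itself dictates.

```python
def fir_output(a, x, n):
    """Return y[n] = a[0]*x[n] + a[1]*x[n-1] + ... + a[k]*x[n-k]."""
    acc = 0.0
    for i, coeff in enumerate(a):
        # Assumption: samples before the start of the signal count as zero.
        if n - i >= 0:
            acc += coeff * x[n - i]
    return acc


# Hypothetical 3-coefficient kernel and a short input signal.
a = [0.5, 0.25, 0.25]
x = [4.0, 8.0, 12.0, 16.0]
y = [fir_output(a, x, n) for n in range(len(x))]
print(y)
```

Each output sample costs one multiplication and one addition per coefficient, which is exactly the workload the rest of this article is concerned with.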
The filter output depends on the number of coefficients: the more coefficients, the better the result looks. In practice there may be anywhere from a few coefficients to thousands. The set of these coefficients is the so-called filter kernel. You can imagine what operations a DSP has to perform to keep up with the sampling rate. Let's take a maximum sound frequency of 20 kHz and the lowest sampling rate that can capture it, 40 kHz, so the filter has to deal with 40000 samples per second. Now imagine a filter with 100 coefficients and look at the formula above. The DSP has to store the last 100 input samples, multiply each by its coefficient, and add the products to get one output sample; that is 100 multiplications and additions per sample, or 4 million per second. Then the oldest sample has to be removed, all remaining samples shifted, and the new sample x[n] placed at the front of the array. Could you imagine doing this with a simple MCU?
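The shifting approach described above can be sketched in a few lines of Python. This is only an illustration of the slow method, not how any particular MCU library does it; fir_step_shift and the constant names are hypothetical. The last lines count how many sample moves the shifting alone adds at the 40 kHz, 100-coefficient example.

```python
def fir_step_shift(history, coeffs, new_sample):
    """Process one sample the slow way: shift every stored sample by one place."""
    # Move each stored sample one place back; the oldest one falls off the end.
    for i in range(len(history) - 1, 0, -1):
        history[i] = history[i - 1]
    history[0] = new_sample  # the newest sample goes to the front
    # Multiply each stored sample by its coefficient and add up the products.
    acc = 0.0
    for c, s in zip(coeffs, history):
        acc += c * s
    return acc


# Hypothetical 3-coefficient kernel, small enough to follow by hand.
coeffs = [0.5, 0.25, 0.25]
history = [0.0] * len(coeffs)
outputs = [fir_step_shift(history, coeffs, s) for s in [4.0, 8.0, 12.0]]

# Extra work caused by shifting alone in the example from the text.
SAMPLE_RATE_HZ = 40_000
NUM_COEFFS = 100
moves_per_second = SAMPLE_RATE_HZ * (NUM_COEFFS - 1)
print(outputs, moves_per_second)
```

The shifting loop does no useful arithmetic at all, yet it costs almost as many memory operations as the filtering itself.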
Let's talk about the shifting of input samples. With a regular MCU you have to reorder the samples in software, while DSP processors have built-in hardware for this, so that updating the samples doesn’t take much time; output samples have to go out at the same rate as input samples arrive (real-time processing or, better, on-line processing). DSP hardware provides so-called circular buffers, which solve this problem with one operation: updating one sample value.
A circular buffer is a memory region managed by four pointer parameters: the first indicates the start of the circular buffer, the second indicates its end, the third indicates the step size between memory locations (in case each value takes two bytes, etc.) and the last indicates where the next sample goes. As we can see, everything is done through addressing, and only one memory address is updated with the new sample value. The last (oldest) sample of the circular buffer becomes the first one in the next cycle, with the updated value. This way the circle goes on and on 🙂 The circular buffer is managed at the hardware level; it only has to be set up at the beginning with the initial pointers. When a new sample arrives, it is written to the location given by the fourth pointer, which then advances to the next location. Simple and fast. Circular buffers in DSPs are optimised so that processing can reach the highest speed.
Finally, let's see what operations are needed to implement a simple FIR filter:
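The four parameters can be modelled in software to see how little work a new sample requires. The sketch below is a plain Python imitation, assuming a list stands in for data memory; the class and attribute names are invented for illustration, and real DSPs do the wrap-around in their address generation hardware rather than with an if statement.

```python
class CircularBuffer:
    """Software model of a circular buffer described by four parameters."""

    def __init__(self, memory, start, end, step=1):
        self.memory = memory  # stands in for the DSP's data memory
        self.start = start    # 1st parameter: first address of the buffer
        self.end = end        # 2nd parameter: last address of the buffer
        self.step = step      # 3rd parameter: address step per sample
        self.next = start     # 4th parameter: where the next sample goes

    def write(self, sample):
        """Overwrite the oldest sample, then advance the pointer with wrap-around."""
        self.memory[self.next] = sample
        self.next += self.step
        if self.next > self.end:
            self.next = self.start


# Hypothetical 8-word memory with a 4-sample buffer at addresses 2..5.
memory = [0] * 8
buf = CircularBuffer(memory, start=2, end=5)
for sample in [10, 20, 30, 40, 50]:
    buf.write(sample)
print(memory, buf.next)
```

The fifth write wraps around and replaces the oldest value (10) with 50, while every other sample stays where it is. No sample is ever moved.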

Obtain new sample from ADC;

Detect an interrupt and manage it;

Move sample to circular buffer, replacing the oldest sample;

Update pointer for input sample;

Zero Accumulator register;

Loop through each of the coefficients (the next six steps are repeated for every coefficient);

Fetch coefficients from other circular buffer;

Update pointer for coefficients;

Fetch sample from circular buffer;

Update pointer;

Multiply coefficient by sample;

Add product to accumulator;


Move output sample to output buffer;

Move output sample to DAC.
So these steps have to be done quickly to achieve real-time processing. A traditional MCU carries out these operations one after another, while a DSP performs many of them in parallel, so that one pass through the loop can be completed in a single clock cycle.
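The steps listed above can be sketched in Python as follows. This is a sequential software model under stated assumptions: a plain list of input values stands in for the ADC, a list of outputs stands in for the DAC, and the FirFilter class with its attribute names is hypothetical. It performs the steps one after another, as an MCU would, whereas a DSP would overlap the fetches, pointer updates and multiply-accumulate.

```python
class FirFilter:
    """FIR filter that keeps its input samples in a circular buffer."""

    def __init__(self, coeffs):
        self.coeffs = list(coeffs)
        self.samples = [0.0] * len(self.coeffs)  # circular buffer of inputs
        self.newest = len(self.coeffs) - 1       # index of the latest sample

    def process(self, new_sample):
        n = len(self.coeffs)
        # Advance the input pointer (with wrap-around) to the oldest sample
        # and overwrite it with the new one.
        self.newest = (self.newest + 1) % n
        self.samples[self.newest] = new_sample
        acc = 0.0           # zero the accumulator
        idx = self.newest   # start at the newest sample, x[n]
        for coeff in self.coeffs:             # fetch each coefficient in turn
            acc += coeff * self.samples[idx]  # fetch sample, multiply, add
            idx = (idx - 1) % n               # step back to the older sample
        return acc


# Hypothetical kernel; the input list plays the role of the ADC.
fir = FirFilter([0.5, 0.25, 0.25])
adc_samples = [4.0, 8.0, 12.0, 16.0]
dac_output = [fir.process(s) for s in adc_samples]
print(dac_output)
```

Everything inside the for loop is what a DSP aims to finish in a single clock cycle per coefficient.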