Compressed Thumb instructions of ARM MCU

Thumb instructions Thumb instructions shrink ARM instructions from 32 bit to 16-bit length. This allows saving up to 40% of program memory comparing to the ARM instruction set. Maybe this is the main reason for Thumb instructions being used. Thumb instructions lose many properties of ARM instructions and become similar to traditional RISC instructions. Thumb instructions cant be conditional. Data processing has a two-address format where the destination register is one of the source registers. ARM instruction: ADD R0, R0, R1 Thumb instruction: ADD R0, R1 As the Thumb instruction set takes less program space, it allows to upload bigger applications, but with lower speed than using ARM, instructions performance is up to 40% faster. But in noncritical applications or functions speed isn’t a significant factor. Also, Thumb instructions don’t have full access to all registers. Thumb instructions can only access “low registers”(R0-R7). And only a few instructions can access “high registers”(R8-R12).

Continue reading

ARM7 core instruction set explained

ARM7 architecture has a normal 32bit ARM7 instruction set and a compressed 16-bit instruction set, the so-called “Thumb.” ARM7 instructions have complex behavior. As ARM processor programming is usually written in C, there is no need to be an ARM expert, but understanding the basics may help develop efficient programs. ARM7 datatypes ARM7 processor can support following datatypes: 8 bit signed and unsigned bytes; 16 bit signed and unsigned half-words; 32 bit signed and unsigned words But shorter than 32-bit data types are supported only by data transfer functions, but when internally processed, they are extended to 32-bit size. ARM7 core doesn’t support floating point datatypes – they can only be interpreted by software.

Continue reading

ARM7 exception modes

This is an important part of understanding ARM operation modes and handling them. ARM7 supports seven types (0x00000014 address is reserved) of exceptions: As you can see in the table, each exception has its own name and fixed address, so-called exception vectors. When an exception occurs, execution is forced from a fixed memory address corresponding to the type of exception. When an exception occurs, R14 and SPSR registers act like this:

Continue reading

ARM7 MCU registers

ARM has 31 general purposes 32-bit registers where 16 of these are visible at any time. Other registers are used to speed up the processing of exceptions. There also are 6 32bit wide status registers. Let’s see how it looks like. Registers are arranged in partially overlapping banks with a different register ban of each MCU mode. As I mentioned, 15 general-purpose registers(R0 to R14) and one or two status registers and PC are visible at any time. Basically, R0-R12 registers are user register, that doesn’t have a special purpose. Registers R13 – R15 has special functions. R13 is used as stack pointer (SP), R14 is used as link register (LR), and R15 is as the program counter (PC):

Continue reading

How does ARM7 pipelining works

One of the key features of the fast performance of ARM microcontrollers is Pipelining. ARM7 Core has a three-stage pipeline that increases instruction flow through the processor up to three times. So each instruction is executed in three stages: Fetch – instruction is fetched from memory and placed in the pipeline; Decode – instruction is decoded and data-path signals prepared for the next cycle; Execute – instruction from prepared data-path reads from registry bank, shifts operand to ALU, and writes generated result to dominant register. Pipelining is implemented at the hardware level. The pipeline is linear, which means that in simple data processing processor executes one instruction in a single clock cycle while individual instruction takes three clock cycles. But when the program structure has branches, the pipeline faces difficulties because it cannot predict which command will be next. In this case, the pipeline flushes and has to be refilled what means execution speed drops to 1 instruction per 3 clock cycles. But it isn’t true, actually. ARM instructions have nice features that allow for the smooth performance of small branches in code that assure optimal performance. This is achieved at a hardware level where PC (Program Counter) is calculated…

Continue reading