I was finally able to get my hands on an STM32! This board (sometimes called "Blue Pill") is pretty cheap and has an STM32F103C8T6 chip on it.
The STM32F103C8T6 chip has an ARM Cortex-M3 core, 20KB of SRAM and 64KB of flash. I have great memories of playing with ARM processors with the Game Boy Advance (ARM7TDMI) and Nintendo DS (ARM946E-S), so I decided to try to write some stuff using just assembly.
The whole deal was much easier than I expected. In the Nintendo DS days I had to compile my own gcc and binutils to be able to build binaries, and then use a "backup" flash cartridge to run them on the hardware. Today things are much easier: the "ST-Link" programmer is very cheap (although what I got is probably a clone), the Arduino IDE supports STM32 with a simple change in its configuration, and people make Linux packages for the cross-compiler and other tools (there's also the STM32CubeIDE, which is based on Eclipse and is provided by ST). For my needs, I just had to install binutils-arm-none-eabi and gcc-arm-none-eabi on Ubuntu and I was well on my way to writing some code.
Now, since I wanted to write pure assembly with no help from any compiler or library, I needed a linker script, which tells the linker where each section of the program goes in memory. I searched the web but couldn't find anyone who had exactly what I needed: examples were either for the right chip but other toolchains (there's a YouTube video of someone doing this with the STM32CubeMX IDE, which doesn't seem to use GNU's ld?), or they were using the GNU toolchain but for different STM32 chips. So I mixed and matched (confirming things in the STM32F10x user manual) and ended up with this:

    /* linker script for stm32f103.ld */
    ENTRY(start);

    MEMORY {
      RAM (rwx) : ORIGIN = 0x20000000, LENGTH = 20K
      FLASH (rx) : ORIGIN = 0x08000000, LENGTH = 64K
    }

    SECTIONS {
      .text : {
        . = ALIGN(4);
        *(.isr_vector)  /* <-- this has to be at the very start of flash */
        *(.text)
        *(.text*)
        *(.rodata)
        *(.rodata*)
        . = ALIGN(4);
      } > FLASH

      __data_flash = .;
      .data : AT ( __data_flash ) {
        . = ALIGN(4);
        __data_start = .;
        *(.data)
        *(.data*)
        . = ALIGN(4);
        __data_end = .;
      } > RAM

      .bss : {
        __bss_start = .;
        *(.bss)
        *(.bss*)
        *(COMMON)
        . = ALIGN(4);
        __bss_end = .;
      } > RAM

      __stack_size = 1024;
      __stack_end = ORIGIN(RAM)+LENGTH(RAM);
      __stack_start = __stack_end - __stack_size;
      . = __stack_start;
      .stack : {
        . = . + __stack_size;
      } > RAM
    }

It has a lot of stuff I don't need (a lot of these sections are used by gcc), but I kept them anyway since they don't hurt and I might want to use them in the future.
The assembly code needed to set up and control a GPIO pin is pretty simple; the only thing that stumped me for a bit was not knowing that the flash memory needs to have an interrupt vector table at the very start. I ended up putting it in its own section, .isr_vector, to make it easy for the linker script to place it exactly at the beginning of flash. The first entry of the vector table is not an interrupt handler, but the address of the top of the stack region (the stack grows down), and the second one is the address of the start of the code. Apparently it really wants some extra entries there for other handlers; the CPU locks up if they're not present (even though I made the handlers themselves into an infinite loop). I'm not entirely sure what's going on there, but adding entries for the NMI and fault handlers after the start address seems to keep the CPU happy.
Anyway, here's the assembly code. In the STM32, everything is controlled via registers mapped in memory. The code first enables the clock for the pins on port C (all peripheral clocks start disabled to save power), then configures pin 13 of port C as output, then sets the pin state, and then loops forever.

    .cpu cortex-m3
    .syntax unified
    .thumb

    @ ======================================
    @ isr_vector
    @ ======================================
    .section .isr_vector
        .word __stack_end
        .word start
        .word halt    @ NMI handler
        .word halt    @ hard fault
        .word halt    @ memory fault
        .word halt    @ bus fault
        .word halt    @ usage fault

    .text

    @ ======================================
    @ start
    @ ======================================
    .global start
    .thumb_func
    start:
        @ enable clock for port C pins
        ldr r0, =0x10          @ bit 4 = port C enable
        ldr r1, =0x40021000    @ RCC base address
        str r0, [r1, #0x18]    @ 0x18 = reg for APB2 enable (where port C lives)

        @ configure port C pin 13 as output
        ldr r0, =0x44244444    @ set pin 13 as output, other 7 pins stay as input
        ldr r1, =0x40011000    @ port C base address
        str r0, [r1, #0x04]    @ 0x04 = reg for config of 8 higher pins of port

        @ set port C pin 13 to high (commented) or low
        @ldr r0, =0x2000       @ pin 13 high (0x2000 = 1<<13)
        ldr r0, =0             @ pin 13 low
        str r0, [r1, #0x0c]    @ 0x0c = reg for output data of port pins

    start_loop:
        b start_loop
    start_end:

    .pool

    @ ======================================
    @ halt
    @ ======================================
    .thumb_func
    halt:
        b halt

    @ ======================================
    @ DATA section
    @ (not used, added here just to check that
    @ the linker script is working)
    @ ======================================
    .section .data
        .ascii "Hello, world!\0"

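As a side note, the value 0x44244444 written to the port configuration register doesn't have to be a magic number. Here's a rough sketch of how it could be built from assembler constants instead. The constant names are made up for this illustration (they don't come from any ST header), and the sketch assumes the register layout described in the reference manual: 4 bits per pin, a reset value of 0x44444444 (every pin a floating input), and 0x2 meaning a push-pull output at up to 2 MHz.

```asm
@ Hypothetical constant names, for illustration only.
.equ GPIO_IN_FLOATING, 0x4           @ CNF=01, MODE=00: floating input
.equ GPIO_OUT_PP_2MHZ, 0x2           @ CNF=00, MODE=10: push-pull output, 2 MHz
.equ CRH_RESET,        0x44444444    @ 8 pins x GPIO_IN_FLOATING
.equ PIN13_SHIFT,      (13 - 8) * 4  @ CRH covers pins 8..15, 4 bits each

@ Clear the 4 bits that belong to pin 13, then put the output mode there.
.equ CRH_PC13_OUT, (CRH_RESET & ~(0xF << PIN13_SHIFT)) | (GPIO_OUT_PP_2MHZ << PIN13_SHIFT)

@ CRH_PC13_OUT evaluates to 0x44244444, so this load could replace
@ "ldr r0, =0x44244444" in the code above:
@     ldr r0, =CRH_PC13_OUT
```

The assembler computes the expression at build time, so the generated code is the same as with the literal value.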
The board has an LED connected to pin 13 of port C; it seems that the LED's cathode is connected to the pin (with the anode connected to +Vdd) because the LED turns on when the pin is brought low.
This code works, but if I ever want to use the stuff in the .data section, I'll have to copy it from the flash memory to RAM. The linker script is prepared for that: it sets the symbol __data_flash to the address just past the end of the .text section, which is where the contents of the .data section are stored in the flash memory. So I'll just have to make a memcpy function that copies __data_end-__data_start bytes from __data_flash to __data_start (which is the address of the .data section in RAM).
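I haven't needed that copy yet, but a minimal sketch of it could look like the routine below. The name copy_data and its labels are my own invention; the three symbols it loads are the ones defined by the linker script. It copies a word at a time, which relies on the script's ALIGN(4) statements keeping all three addresses 4-byte aligned.

```asm
.cpu cortex-m3
.syntax unified
.thumb

.text

@ copy_data (hypothetical name): copy the initial contents of .data
@ from flash to RAM, one 32-bit word at a time.
.global copy_data
.thumb_func
copy_data:
    ldr r0, =__data_flash    @ source: where .data is stored in flash
    ldr r1, =__data_start    @ destination: start of .data in RAM
    ldr r2, =__data_end      @ end of .data in RAM
copy_data_loop:
    cmp r1, r2
    bhs copy_data_done       @ done when destination reaches __data_end
    ldr r3, [r0], #4         @ load a word from flash, advance source
    str r3, [r1], #4         @ store it in RAM, advance destination
    b copy_data_loop
copy_data_done:
    bx lr

.pool
```

It would be called with "bl copy_data" at the very beginning of start, before anything reads from .data. The routine only uses r0 to r3 and never touches the stack.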
After assembling the code with arm-none-eabi-as and linking it with arm-none-eabi-ld, I used arm-none-eabi-objcopy to convert the ELF executable to a raw binary suitable for writing to flash. These are the commands that make runs:

    arm-none-eabi-as -o main.o main.s
    arm-none-eabi-ld -Tstm32f103.ld -o main.elf main.o
    arm-none-eabi-objcopy -O binary main.elf main.bin

And, finally, to upload the final binary to the flash in the chip I ended up using the Windows STLink command line utility that the Arduino IDE installs when you install the STM32 boards. I think it's the same tool that can be downloaded from the STM32 website (which comes with a GUI besides the command line tool), but I haven't checked.