My Real Time Scheduler a RTOS study part 1

Case study

This is going to be more of a case study than a how-to, so please don’t expect to find a finished program.

A lot has been written about the sense and nonsense of Real Time Operating Systems.
The internet is filled with arguments about why and when you should use a RTOS, and there is of course the eternal debate of commercial, open source or roll-your-own.
All of which I’m not going to repeat.

Personally I think that in about 10% of all embedded systems projects a RTOS is absolutely the wrong choice.
I also think that in 15% of embedded systems a RTOS is absolutely necessary; the complexity of those systems justifies the use of a full-blown RTOS.
If we add another 15% for systems that need, or are capable of running, a more complex OS like Linux, we end up with 60% of all embedded systems in the middle of a grey area. To service my projects within this grey area I decided to roll my own scheduler.

Roll-my-own scheduler

Why roll my own? There is nothing more insightful than rolling your own, so that’s what I did.
I used the core of a RTOS as my starting point, the scheduler.

A pre-emptive prioritised and co-operative scheduler

The target is to design a prioritised, co-operative, pre-emptive scheduler. Nothing fancy: each task has a priority, and I want each task to give up control when it is finished, so it doesn’t waste time doing nothing during its time slice. I design a lot for battery-powered systems, so the basic idea is to use a context switch algorithm that races to the idle task, where I put the system in sleep mode.

My target platform is the Cortex-M3/M4 family, so my scheduler is built, like all similar schedulers, around the SysTick interrupt handler. This is where the context switch algorithm is implemented.
In addition I use the PendSV interrupt as the actual context switching function, because I would like to manually force a context switch, using the SVC handler, when a task needs to give up control.
The SysTick IRQ (highest priority) will run the switch algorithm and the PendSV IRQ (lowest priority) will do the actual context switching. This means that the context switch will occur after all other interrupts have been serviced. [1]
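
As a rough illustration of how the SysTick side could hand the work over to PendSV, here is a minimal sketch. The helper name and its parameters are hypothetical and not taken from my scheduler; the register address and bit position are those of the Cortex-M3/M4 ICSR register. With CMSIS you would normally write `SCB->ICSR = SCB_ICSR_PENDSVSET_Msk;` instead of using a raw pointer.

```c
#include <stdint.h>

/* ICSR (Interrupt Control and State Register) sits at 0xE000ED04 on Cortex-M3/M4;
 * writing 1 to bit 28 (PENDSVSET) marks the PendSV exception as pending. */
#define ICSR_ADDRESS    0xE000ED04u
#define ICSR_PENDSVSET  (1u << 28)

/* Hypothetical helper: the register is passed in as a pointer so the logic can be
 * tried off-target; on hardware, pass (volatile uint32_t *)ICSR_ADDRESS. */
static void request_context_switch(volatile uint32_t *icsr,
                                   uint32_t current_task,
                                   uint32_t next_task)
{
    if (next_task != current_task) {
        /* The set/clear bits in ICSR ignore writes of 0, so a plain store is enough. */
        *icsr = ICSR_PENDSVSET;
    }
}
```

Because PendSV has the lowest priority, the pending request is only taken once SysTick and every other active interrupt handler have returned.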

Figure: Basic flow of a RTOS

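The switch algorithm will look like this; the priority level is indicated by the position in the array of tasks (a higher index means a higher priority):

 index = number_of_tasks;
 do {
     /* Walk down from the highest priority and stop at the first task that needs service */
     if( TCB.task[ --index ].status == NEED_SERVICE )
         break;
 } while( index > 0 );
 next_task = TCB.task[ index ];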

If no task needs service we will end up in the idle task (0).
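
To make the selection logic easy to try out on a PC, here is a minimal, self-contained sketch of the same idea. The struct layout, the `TASK_WAITING` status and the function name are assumptions made for this example only; the real TCB holds more than a status field.

```c
#include <stdint.h>

/* Hypothetical names for this sketch; only NEED_SERVICE appears in the text above. */
enum task_status { TASK_WAITING = 0, NEED_SERVICE = 1 };

struct task {
    enum task_status status;
};

#define NUMBER_OF_TASKS 4u

/* Returns the index of the highest-priority task that needs service,
 * or 0 (the idle task) when nothing is pending. */
static uint32_t select_next_task(const struct task *tasks, uint32_t number_of_tasks)
{
    uint32_t index = number_of_tasks;

    while (index > 1u) {
        index--;
        if (tasks[index].status == NEED_SERVICE) {
            return index;
        }
    }
    return 0u; /* nothing to do: race to the idle task */
}
```

The idle task is never tested for `NEED_SERVICE`; it is simply the fallback when every other slot is quiet.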

The context switch

Now for the interesting part.

I am not going to explain the registers in the Cortex-M series processor. Instead I’ll point to the Procedure Call Standard for the ARM Architecture [6] and repeat Table 2, Core registers and AAPCS usage.

Table 2, Core registers and AAPCS usage

For more information please look at a white paper by ARM (Cortex-M for Beginners) [3], look for paragraph 3.1 Programmer’s model, or take a look at NXP UM10360 document [4] paragraph Core registers.

Now the trick with context switching is to save the context of the outgoing task and restore the context of the incoming task (duh).

The Cortex-M family hardware stores a lot on its own [2]; however, we need to store the remaining registers ourselves.


The hardware uses the Decrement address Before access strategy.

PendSV stack sequence

After the interrupt (in my configuration the PendSV interrupt) the hardware pushes r0-r3, r12, LR, PC and xPSR onto the task stack; then we enter the interrupt handler, where we need to store the remaining registers (r4-r11). After that we are ready to do the context switch.

Pseudo-code first part of PendSV:
 mrs r0, psp
 stmfd r0!,{r4-r11}  /* Store registers r4, r5, r6, r7, r8, r9, r10, r11 */
 str   r0, #current_task_sp_variable

 /* Let's switch */

So at each context switch we need to store at least 64 bytes (16 x 32 bits) on our task stack: 8 registers pushed by the hardware and 8 by software.
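
The resulting 16-word frame can be pictured as a C struct, lowest address first. This is only a sketch to visualise the layout: the struct name is hypothetical, and it assumes no floating-point context and no extra alignment word inserted by the hardware.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical struct name; members are listed from the lowest address upwards,
 * which is the order in which they sit on a descending stack. */
struct context_frame {
    /* Saved by software in PendSV (stmfd r0!,{r4-r11}) */
    uint32_t r4, r5, r6, r7, r8, r9, r10, r11;
    /* Saved by hardware on exception entry */
    uint32_t r0, r1, r2, r3, r12, lr, pc, xpsr;
};

_Static_assert(sizeof(struct context_frame) == 64, "16 registers x 4 bytes");
_Static_assert(offsetof(struct context_frame, r0) == 32, "hardware frame follows r4-r11");
```

The saved stack pointer of a suspended task points at the `r4` slot, which is why restoring starts with r4-r11 and leaves the hardware part for the exception return.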

Main stack and process stack

From the ARM(c) Cortex Technical Reference Manual:
Out of reset, all code uses the main stack. An exception handler such as SVC can change the stack used by Thread mode from main stack to process stack by changing the EXC_RETURN value it uses on exit. All exceptions continue to use the main stack. Only one stack, the process stack or the main stack, is visible, using R13, at any time. It is also possible to switch from main stack to process stack while in Thread mode by writing to CONTROL[1] using the MSR instruction, in addition to being selectable using the EXC_RETURN value from an exit from Handler mode.

In my scheduler I would like to use this dual stack pointer. That means that the tasks use the process stack (PSP), and when we enter the PendSV interrupt, R13 points to the main stack (MSP). So the first thing to do is to make sure we push the remaining registers to the PSP location.

Initialise the scheduler

Assume we have two task stacks:

uint32_t task1_stack[128];
uint32_t task2_stack[128];

During initialization we set up the stack of each task:

TCB.task[taskid].sp = (uint32_t)(taskx_stack+128-16); /* Set up stack pointer */
taskx_stack[128-1] = 0x01000000;               /* xPSR default value (Thumb bit set) */
taskx_stack[128-2] = (uint32_t)taskx_function; /* PC will point to the task function */
taskx_stack[128-3] = (uint32_t)exit_function;  /* LR will point to an exit function, in case the task exits */

It’s +128 because the Cortex-M family uses a descending stack configuration, so we start from the top of the stack.
It’s -16 (== 64 bytes) because at the context switch we pop 64 bytes (32 by software and 32 by hardware) from this stack to start the next task. So at the end of the context switch function, the PC register is loaded with the address of the task function and the program will continue from there.
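
The same set-up can be wrapped in a small function. This is a sketch under assumptions: the helper, the macro names and the two demo functions are hypothetical, and the remaining registers are simply zeroed, which the text above does not require.

```c
#include <stdint.h>

#define STACK_WORDS  128u
#define FRAME_WORDS  16u          /* 8 software-saved + 8 hardware-saved registers */
#define XPSR_THUMB   0x01000000u  /* xPSR with only the Thumb bit set */

/* Hypothetical helper: fills in the initial frame at the top of a fresh stack and
 * returns the initial stack pointer (the slot that will be restored into r4). */
static uint32_t *init_task_stack(uint32_t *stack, void (*entry)(void), void (*on_exit)(void))
{
    uint32_t *top = stack + STACK_WORDS;

    for (uint32_t *slot = top - FRAME_WORDS; slot < top; slot++) {
        *slot = 0u;                          /* r0-r12 start out as zero */
    }
    top[-1] = XPSR_THUMB;                    /* xPSR */
    top[-2] = (uint32_t)(uintptr_t)entry;    /* PC: first instruction of the task */
    top[-3] = (uint32_t)(uintptr_t)on_exit;  /* LR: where the task lands if it returns */

    return top - FRAME_WORDS;
}

/* Placeholder task and exit functions, for illustration only. */
static void demo_task(void) { /* a real task would loop here */ }
static void demo_exit(void) { /* a real exit function would park the task */ }
```

Calling `init_task_stack(task1_stack, demo_task, demo_exit)` returns `task1_stack+128-16`, the value stored in the task’s `sp` field above.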

Start the scheduler

Now let’s start the scheduler:

__set_PSP((uint32_t)(task1_stack+128)); /* Set PSP to the top of the task1 stack */
__set_CONTROL(0x02);                    /* Switch to PSP, privileged mode */
__ISB();                                /* Make sure the stack switch has taken effect before continuing */


Because we start the first task manually, we can discard the previously filled values of xPSR, PC and LR for this particular task. The careful reader may notice that we set the stack pointer to a memory position outside the boundaries of the task1_stack array. As mentioned before, the hardware uses the Decrement address Before access strategy, so it will decrement the memory position before it writes anything on the stack.
Now that the stack is set up, we can call the task1 function. When this function is entered, registers are pushed onto the stack that will remain there forever.
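
For reference, the value 0x02 written to CONTROL is made up of two bits. The sketch below spells them out; the macro and function names are my own for this example (CMSIS provides similar `CONTROL_nPRIV_Msk` and `CONTROL_SPSEL_Msk` definitions).

```c
#include <stdint.h>
#include <stdbool.h>

#define CONTROL_NPRIV  (1u << 0)  /* 0 = privileged Thread mode, 1 = unprivileged */
#define CONTROL_SPSEL  (1u << 1)  /* 0 = Thread mode uses MSP, 1 = Thread mode uses PSP */

/* Hypothetical helper that builds a CONTROL value from the two choices. */
static uint32_t control_value(bool use_psp, bool unprivileged)
{
    uint32_t value = 0u;

    if (use_psp) {
        value |= CONTROL_SPSEL;
    }
    if (unprivileged) {
        value |= CONTROL_NPRIV;
    }
    return value;
}
```

`control_value(true, false)` gives the 0x02 used above: tasks run on the process stack but stay privileged.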

The Switching

Pseudo-code PendSV:
 mrs r0, psp
 stmfd r0!,{r4-r11}  /* STMFD is a synonym for STMDB */
 str   r0, #current_task_sp_variable

 /* Let's switch */

 ldr   r0, #next_task_sp_variable
 ldmfd r0!,{r4-r11}  /* LDMFD is a synonym for LDMIA */
 msr psp, r0
 ldr r0, =0xFFFFFFFD /* EXC_RETURN: return to Thread mode, using PSP */
 bx  r0

Most of the PendSV code is straightforward, except maybe the last two instructions.

According to [5] an exception return occurs when the processor is in Handler mode and executes one of the following instructions to load the EXC_RETURN value into the PC:
  1. an LDM or POP instruction that loads the PC
  2. an LDR instruction with PC as the destination
  3. a BX instruction using any register.

We use the third option and load r0 with the “Return to Thread mode, exception return uses non-floating-point state from the PSP and execution uses PSP after return” value.
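
For comparison, here are the three EXC_RETURN values that apply when no floating-point state is involved, together with two small decoding helpers. The macro and function names are hypothetical and chosen for this sketch; the values and bit meanings are those of the Cortex-M3/M4.

```c
#include <stdint.h>
#include <stdbool.h>

/* EXC_RETURN values without floating-point state */
#define EXC_RETURN_HANDLER_MSP  0xFFFFFFF1u  /* return to Handler mode, use MSP */
#define EXC_RETURN_THREAD_MSP   0xFFFFFFF9u  /* return to Thread mode, use MSP  */
#define EXC_RETURN_THREAD_PSP   0xFFFFFFFDu  /* return to Thread mode, use PSP  */

/* Hypothetical helpers: bit 3 selects Thread mode, bit 2 selects the process stack. */
static bool exc_return_to_thread(uint32_t value)
{
    return (value & (1u << 3)) != 0u;
}

static bool exc_return_uses_psp(uint32_t value)
{
    return (value & (1u << 2)) != 0u;
}
```

Only the last value fits the scheduler: the hardware-saved half of the frame is unstacked from the PSP that was just loaded with the next task’s stack pointer.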

Below is an attempt to visualise the start and switch sequence.

Scheduler start sequence

These are the basics of a RTOS: the scheduler.