# Process Coordination and Shared Data

Lecture 27

#### In These Notes . . .

#### Sharing data safely

- When multiple threads/processes interact in a system, new species of bugs arise
  1. The compiler tries to save time by not reloading values which it doesn't realize may have changed
  2. Switching between threads can lead to operating on partially updated variables/data structures
- We must design the system to prevent or avoid them

#### Operating System support for Process Coordination

- Monitors
- Bugs and solutions

#### **Volatile Data**

Compilers assume that variables in memory do not change spontaneously, and optimize based on that belief

- Don't reload a variable from memory if you haven't stored a value there
- Read variable from memory into register (faster access)
- Write back to memory at end of the procedure, or before a procedure call

#### This optimization can fail

- Example: reading from input port, polling for key press
  - while (SW\_0); will read from SW\_0 once and reuse that value
  - Will generate an infinite loop triggered by SW\_0 being true

#### Variables for which it fails

- Memory-mapped peripheral register: the register changes on its own
- Global variables modified by an ISR: the ISR changes the variable
- Global variables in a multithreaded application: another thread or ISR changes the variable

#### The Volatile Directive

Need to tell the compiler which variables may change outside of its control

- Use the volatile keyword to force the compiler to reload these variables from memory for each use

```
volatile unsigned int num_ints;
```

- Pointer to a volatile int

```
volatile int * var;
// or
int volatile * var;
```

- Now each C source read of a variable (e.g.
status register) will result in an assembly language move instruction
- Good explanation in Nigel Jones' "Volatile," Embedded Systems Programming, July 2001

## **Cooperation and Sharing Information**

A program consists of one or more threads/processes

Any two threads/processes are either independent or cooperating

#### Cooperation enables

- Improved performance by overlapping activities or working in parallel
- Better program structure (easier to develop and debug)
- Easy sharing of information

#### Two methods to share information

- Shared memory
- Message passing

## **Shared Memory**

Shared memory is practical when communication cost is low

Low-end embedded systems have no memory protection support

- Threads can access the data directly, e.g. global variables
- (Who needs seatbelts or airbags!)

UNIX and high-end embedded systems have memory protection support

- Impossible to see other processes' memory space by default
  - E.g. virtual memory
- Establish a mapping from a process's address space to a named memory object which can be shared across processes
- POSIX Threads (pthreads) API is a standard for workstation programming

## Message Passing

Most useful when communication cost is high

- Often used for distributed systems

Producer process generates a message, consumer process receives it

Each process must be able to name the other process

Consumer is assumed to have an infinite receive queue

- Bounded queue complicates the programming

OS manages messages

*Mailbox* is a queue with only one entry

#### **The Shared Data Problem**

Often we want to split work between ISR and the task code

Some variables must be shared to transfer information

Problem results from task code using shared data *non-atomically*

- An atomic part of a program is non-interruptible
- A critical section (group of instructions) in a program must be executed atomically for correct program behavior

get\_ticks() returns a long, formed by concatenating variable tchi and register tc

- If an interrupt occurs in get\_ticks, we may
get old value of tchi and new value of tc

| Step | temp       | tchi   | tc     |
|------|------------|--------|--------|
| 1    | 0x00000123 | 0x0123 | 0xffff |
| 2    | 0x01230000 | 0x0123 | 0xffff |
| 3    | 0x01230000 | 0x0124 | 0x0000 |
| 4    | 0x01230000 | 0x0124 | 0x0000 |

#### **Critical Sections Lead to Race Conditions**

Critical section: A non-re-entrant piece of code that can only be executed by one process at a time. Some synchronization mechanism is required at the entry and exit of the critical section to ensure exclusive use.

Re-entrant code: Code which can have multiple simultaneous, interleaved, or nested invocations which will not interfere with each other. This is important for parallel processing, recursive functions or subroutines, and interrupt handling.

- If invocations must share data, the code is non-reentrant (e.g. using a global variable, or not restoring all relevant processor state such as flags).
- If each invocation has its own data, the code is reentrant (e.g. using its own stack frame and restoring all relevant processor state).

Race condition: Anomalous behavior due to unexpected critical dependence on the relative timing of events. Result of increment example depends on the *relative timing* of the read and write operations.

#### **Long Integer**

```
long int ct;

void f1() {
    ct++;
}

void f2() {
    if (ct == 0x10000) {
        /* ... */
    }
}
```

What if f2() starts running after f1's add.w (resulting in a carry) but before the adcf.w?

Race condition due to **non-atomic** operation

- Data structures
- Large variables

```
; void f1()
    add.w   #0001H,_ct      ; increment low word, may set carry
    adcf.w  _ct+2           ; add carry into high word
    rts

; void f2()
    cmp.w   #0,_ct          ; low word == 0x0000?
    jnz     unequal
    cmp.w   #1,_ct+2        ; high word == 0x0001?
    jnz     unequal
    ; equal
unequal:
    ; unequal
```

## Is Queue Access Atomic for Serial Example?

Size field is modified by both enqueue and dequeue functions

Does compiler generate code which is atomic?
This code is very inefficient; the compiler vendor wants you to buy the licensed and optimized version

```
; Enqueue
; q->Size++;
    mov.w   -2[FB],A0               ; q
    mov.w   -2[FB],A1               ; q
    mov.w   0024H[A0],0024H[A1]
    add.w   #0001H,0024H[A1]

; Dequeue
; q->Size--;
    mov.w   -3[FB],A0               ; q
    mov.w   -3[FB],A1
    mov.w   0024H[A0],0024H[A1]
    sub.w   #0001H,0024H[A1]
```

## Solution 1 – Disable Interrupts

#### Disable interrupts during critical section

Renesas syntax is shown in the code below

#### **Problems**

- You must determine where the critical sections are, not the compiler (it's not smart enough)
- Disabling interrupts increases the response time for other interrupts
- What if interrupts were already disabled when we called get\_ticks?
  - Need to restore the interrupt masking to its previous value

```
#define ENABLE_INTS  {_asm(" FSET I");}   /* set I flag: enable interrupts */
#define DISABLE_INTS {_asm(" FCLR I");}   /* clear I flag: disable interrupts */

unsigned long get_ticks() {
    unsigned long temp;
    DISABLE_INTS;       /* start of critical section */
    temp = tchi;
    temp <<= 16;
    temp += tc;
    ENABLE_INTS;        /* end of critical section */
    return temp;
}
```

## **Are Interrupts Currently Enabled?**

#### FLG's I flag (bit 6)

- Enables/disables interrupts
- Section 1.4 of ESM

Need to examine flag register, but how?

- Not memory-mapped
- Can't access with BTST

#### Solution

- STC: Store from control register (ESM, p. 123)
- Use a macro (CLPM, p. 98) to copy the flag bit into a variable **iflg** in our code (we copy the whole register, then mask out the other bits). Nifty feature!
- Later use that variable **iflg** to determine whether to re-enable interrupts

```
#define I_MASK (0x0040)   /* bit 6 of FLG is the I flag */
#define GET_INT_STATUS(x) {_asm(" STC FLG, $$[FB]", x); x &= I_MASK;}
#define ENABLE_INTS  {_asm(" FSET I");}
#define DISABLE_INTS {_asm(" FCLR I");}

unsigned long get_ticks() {
    unsigned long temp, iflg;
    GET_INT_STATUS(iflg);       /* remember whether interrupts were enabled */
    DISABLE_INTS;
    temp = tchi;
    temp <<= 16;
    temp += tc;
    if (iflg)                   /* re-enable only if they were enabled on entry */
        ENABLE_INTS;
    return temp;
}
```

## Solution 2 – Repeatedly Read Data

Keep reading until the function returns the same value

- Easy here because get\_seconds returns an easily compared value (a long)

#### Problems which limit this approach

- tc might be changing every clock cycle, so get\_seconds would never return. Loop time must be short compared with the interval between interrupts
- What if we wanted to compare two structures? Would need a function (slower, more code)
- Compiler may optimize out the repeated reads unless the shared variables are declared volatile

```
unsigned long get_seconds() {
    unsigned long temp1, temp2;
    temp2 = tchi;
    temp2 <<= 16;
    temp2 += tc;
    do {
        temp1 = temp2;          /* keep the previous reading */
        temp2 = tchi;           /* take a new reading */
        temp2 <<= 16;
        temp2 += tc;
    } while (temp1 != temp2);   /* repeat until two readings agree */
    return temp2;
}
```

## A Gotcha! TC keeps changing!

See Ganssle's "Asynchronicity"

Solution: after disabling interrupts, do the timer C ISR's work if needed

- Examine the Interrupt Request bit of tcic (timer C interrupt control register), which indicates overflow
- Increment counter if it did overflow

```
unsigned long get_ticks() {
    unsigned long temp, iflg;
    unsigned temp1, temp2;
    GET_INT_STATUS(iflg);
    DISABLE_INTS;
    temp2 = tc;
    temp1 = tchi;
    if (ir_tcic) {              /* overflow pending but ISR has not run yet */
        temp1++;                /* do the ISR's increment on our local copy */
        temp2 = tc;             /* re-read tc after the overflow */
    }
    if (iflg)
        ENABLE_INTS;
    temp = temp1;
    temp <<= 16;
    temp += temp2;
    return temp;
}
```

#### Solution 3 – Use a Lock

Relies on kernel/scheduler for efficiency

Define a lock variable (global) for each resource to be shared (variable (inc.
data structure), I/O device)

- Lock is 0 if resource is available
- Lock is 1 if resource is busy

Functions agree to check lock before accessing resource

- if lock is 0, can use resource
- if lock is 1, need to try again later
  - if preemptive kernel is used, call kernel to reschedule this thread later
  - for non-preemptive kernel, call kernel to yield processor to other threads

Enable interrupts when possible to reduce interrupt latency

Some processors have atomic read-modify-write instructions, avoiding the need to disable interrupts when accessing the lock variable

```
DISABLE_INTS;
if (lock_var == 0) {
    lock_var = 1;       /* claim the resource */
    ENABLE_INTS;
    /* access resource */
    DISABLE_INTS;
    lock_var = 0;       /* release the resource */
    ENABLE_INTS;
} else {
    ENABLE_INTS;
    /* try again later */
}
```
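
The remark about atomic read-modify-write instructions can be sketched in portable C. This is a minimal sketch, assuming a toolchain that provides C11 `<stdatomic.h>` (which may not be true of the Renesas compiler used in these notes). The names `resource_lock`, `shared_counter`, `try_lock`, `unlock`, and `try_increment` are hypothetical and chosen only for illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock for one shared resource: clear = available, set = busy. */
atomic_flag resource_lock = ATOMIC_FLAG_INIT;

/* Hypothetical shared resource protected by resource_lock. */
unsigned long shared_counter = 0;

/* Try to take the lock without disabling interrupts.
 * atomic_flag_test_and_set() sets the flag and returns its previous value
 * as one indivisible step, so a previous value of false means we own it. */
bool try_lock(void) {
    return !atomic_flag_test_and_set(&resource_lock);
}

/* Release the lock so another thread can take it. */
void unlock(void) {
    atomic_flag_clear(&resource_lock);
}

/* Returns true if the resource was updated,
 * false if it was busy and the caller should try again later. */
bool try_increment(void) {
    if (!try_lock()) {
        return false;       /* resource busy: try again later */
    }
    shared_counter++;       /* access resource */
    unlock();
    return true;
}
```

The test-and-set replaces the DISABLE_INTS / check / set / ENABLE_INTS sequence, so interrupt latency is not affected by taking the lock. One caveat with any such lock: an ISR that finds the lock busy cannot wait for it, because the task holding it cannot run until the ISR returns, so the ISR has to give up and retry on a later interrupt.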