3rd edition: chap. 5 Concurrency and Race Conditions

old world: interrupts were mostly the problem.
new world: SMP means TH vs TH, BH vs BH, and TH vs BH (TH = top half, BH = bottom half).
and kernel preemption ... (which is TH vs TH)

pitfalls in scull:
-----------------
see code, p. 107. you can't assign a ptr without serialized access.
if A assigns first, B will overwrite it, hence we have a memory leak.
this is a classic race condition.

concurrency and its management
------------------------------
race conditions occur because of shared access to "resources"
(usually memory, but hw resources are another possibility: if you have one
buffer in a network card, you can't have two "threads" sucking a packet out
at the same time).
if we avoid global memory, we might be able to minimize it,
BUT WE HAVE A SMALL STACK.
we need serialization mechanisms (mutex/locking ... whatever you call it).

semaphores and mutexes
----------------------
add locking to scull ... with semaphores.
TH can "sleep" (or block) and give up the CPU.
BH never sleeps. (it can however delay execution, but never mind.)
semaphores CAN block.

standard semaphore "protocol" for a TH:
    init to 1
    P (may sleep) to get in
    serialized access
    V to get out

The Linux Semaphore Implementation
----------------------------------
p. 110. functions: sema_init(), down*(), up() are the real functions.
macros packaging those functions:
    static declaration:
        DECLARE_MUTEX(name)         <--- init to 1
        DECLARE_MUTEX_LOCKED(name)  <--- init to 0
    dynamic declaration:
        init_MUTEX()
        init_MUTEX_LOCKED()
note down() vs down_interruptible(). you probably want down_interruptible()
(the user can Ctrl-C out of it).

Using Semaphores in scull
-------------------------
note that scull_dev has a semaphore in it, thus one semaphore per minor
device. we could have used 1 global sem., but there is no reason to make
process A block while process B is using the *other* device.
see init code: p. 112

Reader/writer semaphores
------------------------
semaphores come with various usage patterns: init to 1, init to 0 and have
the bottom half wake you up, etc. reader/writer is one such pattern.
we can allow multiple readers access without restriction as long as they are
not CHANGING the data. RO tasks can run in parallel.
the point here: readers do not lock out readers. readers lock out writers.
writers lock out readers. if writers and readers contend, writers get in
first. if there are many writers you may not want to use this, as it can
lead to reader starvation.
linux has a special sem called "rwsem", or reader/writer semaphore, for
this. see functions p. 113-114.

Completions
-----------
common programming pattern:
    1. initiate some activity outside of the current thread
    2. wait for completion (block) of that activity
e.g., create a new thread or user-space process, or start some
hardware-based action.
you could use a semaphore as on p. 114, BUT the problem with that approach
is that often the event you are waiting for arrives faster than you expect,
and other problems may exist. completions are a lightweight (compared to
sems) mechanism:
    top-half calls: wait_for_completion()
    bottom-half calls: complete() or complete_all(), see p. 115
see code example p. 116; note the read/write top-half interaction.

spinlocks:
-----------------------------
semaphores block. you cannot block in BH. semaphores are slow because they
may block. you may want finer time granularity.
a spinlock is similar to an atomic variable with test-and-set semantics
(how it is implemented is arch. dependent; it typically needs a machine
instruction or two). conceptually:
    lock:   while (x) ;  x = 1;
            do something to the data
    unlock: x = 0
cons: note the spin in the lock. what happens if we naively execute that on
a single CPU?
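to make the busy-wait pseudocode concrete, here is a minimal user-space C11
sketch of the same test-and-set idea. it is NOT the kernel's spin_lock()
(that is arch-specific and adds preemption/interrupt handling); the names
toy_lock and toy_spin_* are invented for this illustration.

#include <stdatomic.h>

/* one-word lock: clear = free, set = held */
static atomic_flag toy_lock = ATOMIC_FLAG_INIT;

static void toy_spin_lock(void)
{
    /* atomically test-and-set: keep spinning while the old value was
     * already set, i.e. while someone else holds the lock */
    while (atomic_flag_test_and_set_explicit(&toy_lock,
                                             memory_order_acquire))
        ;  /* burn CPU until the holder clears the flag */
}

static void toy_spin_unlock(void)
{
    atomic_flag_clear_explicit(&toy_lock, memory_order_release);
}

answering the question above: on a single CPU, if the spinning thread cannot
be preempted, the lock holder never runs again and the flag is never
cleared, so the wait lasts forever. that is why uniprocessor kernel builds
reduce spinlocks to little more than preemption control, and why code
holding a spinlock must never sleep (more below).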
introduction to spinlock API
-----------------------------
see p. 117
init: static or dynamic
spin_lock()
spin_unlock()

spinlocks and atomic context
-----------------------------
example:
    lock
    copy_from_user
    unlock
what could go wrong here?
    1. you block in copy_from_user
    2. a higher-priority process kicks you out after you get the lock
    3. another scenario: you take out a lock, your interrupt handler
       interrupts and takes out the same lock ... the TH cannot run to
       free the lock
two possible outcomes:
    1. delay (likely)
    2. deadlock (possible)
rule: code holding a spinlock should be atomic, should not sleep, and should
be minimal ...
background: if you hold a spinlock, your CPU will not be preempted.
it is hard to avoid sleeping ... you must have intimate knowledge of the
functions the TH calls.
you must disable interrupts on the local CPU if a BH contends for that lock.
keep spinlock hold times short.

the spinlock functions
-----------------------------
spin_lock
spin_lock_irqsave - disables interrupts on the local CPU only, saving the
    previous interrupt state
spin_lock_irq - same, but doesn't save the state
spin_lock_bh - disables software interrupts before taking the lock, but
    leaves hw interrupts alone
if tasklet code can take the spinlock, use spin_lock_bh.
if hw interrupt code can take the spinlock, use spin_lock_irqsave as the
1st choice.

reader/writer spinlocks
-----------------------------
similar to r/w semaphores. situation: > 1 reader can get into the crit.
section, but only 1 writer can get in. a busy writer can starve readers.
p. 120

locking traps
-----------------------------
lots of experience about what can go wrong ...

ambiguous rules
-----------------------------
be clear and explicit (and comment...) about how spinlocks are to be used.
e.g., functionA takes lock Z, then functionB is called and wants lock Z
too ...
scull design: functions invoked directly from a system call acquire the sem
that locks the device structure; all internal functions can then assume they
have access to the device structure.

lock ordering rules
-----------------------------
in general, the fewer locks the better. danger comes with > 1 lock.
take the locks in the same order:
    take lock 1
    take lock 2
avoid this deadlock:
    thread 1 takes lock 1
    thread 2 takes lock 2
    neither can proceed
if you have 2 locks, where lock 1 is your data structure lock and lock 2 is
a global lock, take the local lock first.
don't do a down on a semaphore when you hold a spinlock.

fine versus coarse-grained locking
-----------------------------
the kernel started out with the big kernel lock. don't use it.
locking keeps getting finer and finer granularity, which means the kernel
becomes more and more complex.
lockmeter: http://oss.sgi.com/projects/lockmeter
measures how much time is spent waiting for locks.

alternatives to locking
-----------------------------
algorithms can exist that provide a different way to access data and avoid
spinlocks.

lock-free algorithms
-----------------------------
e.g., circular buffer: the writer writes at one end, the reader reads at the
other end. the trick here is that the threads modify only certain variables
and do not share those variables.
could we have a circular buffer with minimized locking, 1 writer, and many
readers? how would you do that?
note: the kernel provides a generic circular buffer implementation.

atomic variables
-----------------------------
n_op++ may not be atomic. (it is on some CPUs.)
lock / access / unlock is not efficient. we need an arch-independent way to
do this: see the atomic_t primitives, p. 125.
p. 126 top is important: if you have operations requiring multiple atomic_t
variables, you still need locking.

bit operations
-----------------------------
p. 126
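a hedged illustration of the atomic_t and bit-operation primitives above
(pp. 125-126): the names sketch_open, scull_open_count, and SCULL_BUSY_BIT
are invented, and the header locations have moved between kernel versions
(<asm/atomic.h>/<asm/bitops.h> in LDD3-era kernels, <linux/atomic.h>/
<linux/bitops.h> later); the primitives themselves are the real kernel API.

#include <asm/atomic.h>   /* <linux/atomic.h> on newer kernels */
#include <asm/bitops.h>   /* <linux/bitops.h> on newer kernels */
#include <linux/errno.h>

static atomic_t scull_open_count = ATOMIC_INIT(0); /* arch-independent counter */
static unsigned long scull_flags;                  /* word holding atomic bits */
#define SCULL_BUSY_BIT 0

static int sketch_open(void)
{
    /* the safe version of n_op++: no lock needed for a lone counter */
    atomic_inc(&scull_open_count);

    /* test_and_set_bit() atomically sets the bit and returns its old
     * value: nonzero means someone else already marked us busy */
    if (test_and_set_bit(SCULL_BUSY_BIT, &scull_flags)) {
        atomic_dec(&scull_open_count);
        return -EBUSY;
    }
    return 0;
}

static void sketch_release(void)
{
    clear_bit(SCULL_BUSY_BIT, &scull_flags);
    atomic_dec(&scull_open_count);
}

note that the counter and the flag bit are each updated atomically, but the
pair is not updated as one atomic step; per the warning at the top of
p. 126, any invariant spanning both would still need a real lock.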
seqlocks
-----------------------------
1. another mechanism for fast, lockless access to a small, simple,
   frequently accessed shared resource
2. write access should be rare. readers get free access but must check for
   collisions with writers, and if a collision occurred, retry.
cannot be used with pointers ...
see read code example p. 128

read/copy/update (RCU)
-----------------------------
another way to avoid spinlocks. it places constraints on the types of data
structure it can protect.
optimal use:
    1. reads are common / writes are rare
    2. a pointer should be how the resource is accessed
    3. the resource must only be held by atomic code
when the data structure needs to be changed, the writing thread makes a
copy, changes the copy, then aims the relevant pointer at the new version.
when the kernel is sure no references to the old version remain, it can be
freed.
how to do that: we know that code holding the data structure is atomic, so
once every CPU on the system has been scheduled at least once, all
references are gone. RCU schedules a callback function, and the callback
frees the old storage.
read code on p. 129.
write code on p. 130; note the callback function passed in.
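a hedged sketch of the copy / update / free-later sequence just described.
the struct, the global pointer, and the function names (my_conf, cur_conf,
conf_set_threshold, ...) are invented for illustration; rcu_read_lock(),
rcu_dereference(), rcu_assign_pointer(), and call_rcu() are the real
primitives, though call_rcu()'s exact signature has varied across kernel
versions (this sketch uses the current two-argument form).

#include <linux/rcupdate.h>
#include <linux/slab.h>
#include <linux/spinlock.h>
#include <linux/string.h>
#include <linux/errno.h>

struct my_conf {
    int threshold;
    int enabled;
    struct rcu_head rcu;      /* lets call_rcu() defer the kfree */
};

static struct my_conf *cur_conf;           /* read only via RCU */
static DEFINE_SPINLOCK(conf_update_lock);  /* serializes writers only */

/* reader: cheap, never blocks, may see either the old or the new copy */
static int conf_get_threshold(void)
{
    struct my_conf *c;
    int val = 0;

    rcu_read_lock();
    c = rcu_dereference(cur_conf);
    if (c)
        val = c->threshold;
    rcu_read_unlock();
    return val;
}

/* callback: runs once no CPU can still reference the old copy */
static void conf_free_rcu(struct rcu_head *head)
{
    kfree(container_of(head, struct my_conf, rcu));
}

/* writer: copy, change the copy, swing the pointer, free the old later */
static int conf_set_threshold(int threshold)
{
    struct my_conf *newc, *oldc;

    newc = kmalloc(sizeof(*newc), GFP_KERNEL);
    if (!newc)
        return -ENOMEM;

    spin_lock(&conf_update_lock);
    oldc = cur_conf;              /* safe: writers serialized by the lock */
    if (oldc)
        *newc = *oldc;            /* start from a copy of the old version */
    else
        memset(newc, 0, sizeof(*newc));
    newc->threshold = threshold;
    rcu_assign_pointer(cur_conf, newc);   /* publish the new version */
    spin_unlock(&conf_update_lock);

    if (oldc)
        call_rcu(&oldc->rcu, conf_free_rcu);  /* freed after grace period */
    return 0;
}

the writer still takes an ordinary spinlock, but only against other writers;
readers never touch it, which is the whole point of the pattern.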