# OS Kernels

## Lecture 1

• Will require C!
• Will have a complicated C assignment
• Office Hours in 105 Schorr, Tuesday 10:30-12:30
• more low-level
• HW (35%)
• 2 programming assignments
• 2 problem solving
• Project (15%)
• Pair programming
• Trio programming (requires additional components)
• 3 midterms (40% total)
• Quizzes (10%)
• Tips
• Get to know classmates
• participate!!
• Prior to due dates, class time will be spent answering questions
• Goals
• knowledge of internals
• OS roles
• interaction with hardware
• systems programming
• memory management
• concurrency
• practical OS experience
• Know what a process is
• And how multiple processes work
• And things like race conditions

## Lecture 2: Review – Cache Memories

• OS Objectives
• Provides convenience
• simplifies io
• simplifies user interfaces
• simplifies portability
• manages resources
• memory
• processes
• I/O devices
• communications media
• ease of evolution
• support new hardware components
• provides new services
• supports maintenance
• Major portion on resource management
• review of memory hierarchy and cache

### Memory Hierarchy and Cache

• Memory Hierarchy
1. Registers
2. L1 Cache (SRAM)
3. L2 Cache (SRAM)
4. Main Memory (DRAM)
5. Local storage
6. remote storage
• Lower numbers are smaller in capacity and faster, but more expensive per byte
• rule of locality – bring things closer as you use them
• The Cache
• A small, fast storage device – a staging area for data in a larger, slower device
• for each $$k$$ the faster, smaller device at $$k$$ serves as cache for larger, slower $$k+1$$
• Hierarchies work because of locality – programs tend to access data at level $$k$$ more often than at level $$k+1$$ – thus storage at level $$k+1$$ can be slower, and thus larger and cheaper per bit
• The memory hierarchy creates a large pool of storage that costs as little as the cheap storage near the bottom, while serving data to programs at the rate of the fast storage near the top
• SRAM – managed automatically in hardware – holds frequently accessed blocks of main memory
• CPU looks for data in caches first, then in the main memory
• data is copied in transfer blocks
• Requires various ways of managing which blocks are in the cache, and which aren’t
• Cache concepts
• Cache hits – When the cache has the block
• Cache misses – when the cache doesn’t have the block
• Get from main memory
• store in cache – using placement (determines where a block goes) and replacement (determines which block gets evicted) policies
• Types
• Cold start / Compulsory – cache is empty
• conflict miss – the placement policy restricts blocks at level $$k+1$$ to a small subset of the positions at level $$k$$
• occurs when the level $$k$$ cache is large enough, but multiple objects map to the same level $$k$$ block
• capacity miss – occurs when the set of active blocks is larger than the cache
• Faults – when the needed stuff isn’t in main memory

### Cache Organization

• $$S$$ – set
• $$E$$ – way/line per set
• $$B$$ – block size
• Block has $$v$$ (valid bit), tag, $$B$$ bytes
• $$C = S \times E \times B$$ for cache size
• Address is Tag, Set, Block offset
• Caches operate on physical addresses
• $$B = 8$$
• $$S = 4$$
• $$E = 1$$
• Locate Set
• Locate matching tag, and valid
• Locate data starting at offset
• Direct mapped cache, $$E = 1$$
• when no match, evict line and replace
• $$E$$-way set associative cache – multiple lines per set (e.g., two) means more flexible placement; matching requires comparing every tag in the set, handling a miss is a bit harder, and it requires a slightly different replacement protocol
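The tag/set/offset split above can be sketched in C. This uses the example parameters from the notes ($$B = 8$$, $$S = 4$$); the helper names are illustrative, not from the lecture:

```c
#include <stdint.h>

/* Example parameters from the notes: B = 8 bytes/block, S = 4 sets. */
#define B_BITS 3  /* log2(8) -> block-offset bits */
#define S_BITS 2  /* log2(4) -> set-index bits    */

static uint32_t block_offset(uint32_t addr) { return addr & ((1u << B_BITS) - 1); }
static uint32_t set_index(uint32_t addr)    { return (addr >> B_BITS) & ((1u << S_BITS) - 1); }
static uint32_t tag(uint32_t addr)          { return addr >> (B_BITS + S_BITS); }
```

For address `0xAB` (`1010 1011`), the low 3 bits give offset 3, the next 2 bits give set 1, and the remaining bits give tag 5.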

## Lecture 3

### Cache – Continued

• understand matching and replacement
• replacement policies – random, least recently used, etc
• writebacks – happen to the original memory
• write-through will automatically write to memory
• write-back defers writing until the line is replaced – needs a dirty bit
• write-miss policies
• write-allocate – load into cache, update line
• no-write-allocate – immediately write to memory
• miss rate, hit time, miss penalty are the performance metrics
• make sure to write s.t. the common case goes fast
• minimize misses in the inner loops
• repeated refs to vars are good
• stride-1 ref patterns are good
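The stride-1 advice can be illustrated with a 2D array: C stores arrays row-major, so letting the column index vary fastest touches memory contiguously. This is a sketch, not lecture code; `grid`, `sum_rowwise`, and `sum_colwise` are hypothetical names:

```c
#define N 256

static int grid[N][N];   /* static: large arrays don't belong on the stack */

/* Stride-1: j varies fastest, matching row-major layout, so each cache
   block is fully used before moving on. */
static long sum_rowwise(void) {
    long s = 0;
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            s += grid[i][j];
    return s;
}

/* Stride-N: i varies fastest, jumping N*sizeof(int) bytes per access,
   so locality is poor even though the result is identical. */
static long sum_colwise(void) {
    long s = 0;
    for (int j = 0; j < N; j++)
        for (int i = 0; i < N; i++)
            s += grid[i][j];
    return s;
}
```

Both functions compute the same sum; only the access pattern (and hence the miss rate) differs.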

#### Problems

• physical address is 12 bits wide
• L1 is physically addressed and direct mapped with a 4-byte line and 16 total sets

## Lecture 4: Virtual Memory

• lets a computer think it has more memory than it actually does
• helps to deal with a lot of more basic stuff
• physical addressing – used in small systems – embedded microcontrollers for things like cars, elevators, etc
• can be an issue – a set amount of memory is provided, no more, no less
• 512mb means 512mb
• programs are machine dependent
• running multiple programs becomes very difficult
• virtual memory uses an mmu to translate a virtual address to a physical address
• means that it doesn’t matter how much memory it has
• linear address space – ordered set of contiguous, non-negative integer addresses
• virtual address space – set of $$N = 2^n$$ virtual addresses
• physical address space – set of $$M = 2^m$$ physical addresses
• difference between data and attributes
• an object can have multiple addresses
• every byte in main memory has one PA and one or more VAs
• virtual memory uses main memory efficiently
• simplifies memory management
• isolates address space – one process can’t interfere with another’s, and a user program cannot access privileged kernel information

### Virtual memory as a cache

• VM is an array of N contiguous bytes stored on disk
• contents of the array are cached in physical memory
• blocks are called pages (size $$P = 2^p$$ bytes)
• pages are of fixed, but configurable size
• Given an address space of $$2^n$$ addresses and a page size of $$2^p$$ bytes, you have $$2^{n-p}$$ pages
• keeping track requires special software, cannot be implemented in hardware. This is the page table
• write-back, not write-through
• Page tables
• an array of page table entries (PTEs) that map virtual pages to physical pages
• per-process kernel data structure, in DRAM
• Has valid bit, keeps physical page number or disk address
• valid bit clear signals the page is on disk (or not yet allocated)
• works because of locality
• the set of active pages is called the working set
• working set < main memory size: good performance for one process after compulsory misses
• sum of working sets > main memory size: thrashing, lots of swapping

### Virtual memory for memory management

• each process has its own virtual address space – own page table
• views memory as a simple linear array
• mapping function scatters through physical memory
• can map virtual pages in different processes to the same physical page
• which allows a single library to be shared by multiple programs
• kernel vm is shared
• linking maps programs into the common structure

## Lecture 5: Virtual Memory

• homework is a tool
• quiz based
• midterm will be electronic?
• first midterm, programming assignment coming, week, week and a half

### Virtual memory for memory management

• use chunks to bring stuff up!
• everything has its own virtual address space
• mapping scatters addresses through physical memory
• well chosen mappings simplify allocation and management
• includes protecting sections from execution

### Memory protection

• $$N = 2^n$$ number of addresses in virtual address space
• $$M = 2^m$$ number of addresses in physical memory
• $$P = 2^p$$ page size
• TLBI – tlb index
• TLBT – tlb tag
• VPO – virtual page offset
• VPN – virtual page number
• PPO – Physical page offset
• PPN – Physical page number
• CO – byte offset
• CI – cache index
• CT – cache tag
• given base address, use page number and offset
• mmu translates based on ptable entry/address
• VM and cache together make things faster/more able to handle a lot of memory
• TLB – another cache in the MMU, maps vpns to ppns, contains complete page table entries for a small number of pages
• multilevel page tables – a flat table would need a 512 GB page table for a 48-bit address space with 8-byte PTEs at a 4 KB page size
• paging the page table!
• page tables that list page tables
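A multi-level walk starts by slicing the virtual address into per-level indices plus an offset. This sketch assumes a hypothetical 32-bit, two-level scheme with 4 KB pages (10-bit VPN1, 10-bit VPN2, 12-bit VPO); the names are not from the lecture:

```c
#include <stdint.h>

/* Hypothetical 32-bit layout: | 10-bit VPN1 | 10-bit VPN2 | 12-bit VPO | */
#define VPO_BITS 12
#define VPN_BITS 10

static uint32_t vpo(uint32_t va)  { return va & ((1u << VPO_BITS) - 1); }
static uint32_t vpn2(uint32_t va) { return (va >> VPO_BITS) & ((1u << VPN_BITS) - 1); }
static uint32_t vpn1(uint32_t va) { return va >> (VPO_BITS + VPN_BITS); }
```

VPN1 indexes the top-level table to find a second-level table; VPN2 indexes that to find the PPN; the VPO is appended unchanged.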

## Lecture 6: Virtual Memory

### Page faults

• small page size – less internal fragmentation
• more pages required
• larger page tables
• large programs means page tables may be in virtual memory
• rotational characteristics favor larger page size
• 4k is the most common size
• internal fragmentation
• 10 seats, 10 students - no fragmentation
• 20 seats, 10 students - fragmentation of 10
• 40 seats, 10 students - fragmentation of 30
• Small size with small pages – high page fault rate, but faster to fill
• large size, large pages – low page fault rate, slower to fill
• many pages in process, large page size, low page fault
• contemporary techniques decrease locality of references within a process
• Fetch policy – when pages should be brought into memory
• demand paging – when the page is needed, as they are needed
• many page faults when a process is first started
• prepaging – load what is needed, and then a bit more
• exploits characteristics of secondary memory devices
• more efficient to bring in a number of pages at one time
• ineffective if extra pages aren’t referenced
• placement policy
• determines where stuff is to be put
• replacement policy
• deals with selection of a page in main memory to be replaced
• objective is to remove the page least likely to be referenced in the near future
• more elaborate policies require greater overhead
• frame locking
• a frame that must stay in memory at all times
• the kernel, and key control structures
• i/o buffers and time-critical areas may be locked
• locking is achieved by lock bit with the frame
• algorithms
• Optimal – selects the page for which the time to next reference is longest; produces three page faults in the example once the frame allocation has been filled
• FIFO
• page frames treated as a circular buffer
• first thing in gets pushed out
• least recently used
• based on locality
• least likely to be used is the least recently used
• difficult to implement
• clock policy
• requires a use bit
• reference
• replacement if not 1
• flipped periodically
• Desireable to maintain as many processes in memory as possible
• frees programmers from size restrictions
• virt mem
• references are logical, translated
• breaks processes into pieces
• two approaches are paging and segmentation
• management scheme requires both hardware and software

## Lecture 7: Virtual Memory

• Consider a paged VM system with 32-bit virtual addresses and 2K-byte pages. Each page table entry requires 32 bits. Each page table is exactly one page (2K bytes) in size
• how many levels of page tables
• how many entries does the table use at each level (at one level, the table uses a smaller number of entries)
• the smaller number of entries can be used at the top or bottom. Which strategy uses the least number of pages? Explain by calculating the number of pages for both
• $$2^{32}/2^{11} = 2^{21}$$ pages, so the VPN is 21 bits
• $$2^{11}/2^{2} = 2^{9}$$ entries per page-table page; thus 3 levels, $$(9 + 9 + 3) = 21$$ bits
• entries used per level: 512, 512, 8
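The level count above follows mechanically from the parameters. A small helper (hypothetical, just to check the arithmetic) makes the dependence explicit:

```c
/* Number of page-table levels when each level's table must fit in one
   page: ceil(VPN bits / index bits per level). */
static int pt_levels(int va_bits, int page_bits, int log2_pte_bytes) {
    int vpn_bits = va_bits - page_bits;            /* bits to translate   */
    int entry_bits = page_bits - log2_pte_bytes;   /* log2(entries/page)  */
    return (vpn_bits + entry_bits - 1) / entry_bits; /* ceiling division  */
}
```

For the problem's parameters (32-bit VA, 2K pages, 4-byte PTEs): 21 VPN bits, 9 index bits per level, so 3 levels.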

### Dynamic Memory allocation

• basic concepts
• malloc etc acquires vm at run time
• manage an area of process vm known as the heap
• allocator maintains heap as collection of variable sized blocks, either allocated or free
• types
• explicit – malloc and free in C
• implicit – application allocates, doesn’t free, GC in Java, ML, Lisp
• void *malloc(size_t size) – returns a void pointer to a memory block of at least size bytes, aligned to an 8-byte boundary (usually); on failure, returns NULL and sets errno
• void free(void *p) – returns the block pointed at by p to the pool of available memory – p must come from a previous call to malloc
• sbrk is the internal function used to grow the heap
• applications – can issue an arbitrary sequence of malloc and free
• free must be called on a malloc’d block
• cant control number or size of allocated blocks
• must respond immediately to malloc
• must allocate from free memory
• must align blocks to satisfy all alignment requirements (8 byte for GNU Malloc)
• can manipulate and modify only free memory
• can’t move the allocated blocks once malloc’d
• given sequence of malloc and free,
• maximize throughput and peak memory utilization
• throughput is number of completed requests per unit time
• payload, malloc(p) results in a block with payload of p bytes
• current heap size is non-decreasing
• peak utilization after $$k$$ requests: max aggregate payload over current heap size
• internal fragmentation is when the payload is smaller than the block size
• enough aggregate memory, but not enough contiguous memory, is external fragmentation
• compacting the heap can be done, but requires a tremendous amount of work – a linear scan of the heap/memory plus introspection
• size is embedded in a block
• keep track of free blocks – some structure in the heap
• picking blocks for allocation
• reinserted freed blocks?
• standard block – length of block in the word preceding the block (header field, or header); requires an extra word for every allocated block
• implicit list using length (links all blocks)
• explicit list (lists free blocks)
• segregated free list
• different free lists for different size classes
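The malloc/free contract described above can be exercised with a short sketch (the `demo` helper is hypothetical, not lecture code):

```c
#include <stdlib.h>
#include <string.h>

/* malloc/free usage sketch: request at least n bytes, check the result
   for NULL, use the block, and free it exactly once. */
static int demo(size_t n) {
    char *p = malloc(n);          /* at least n bytes, 8-byte aligned  */
    if (p == NULL) return -1;     /* on failure malloc has set errno   */
    memset(p, 'x', n);            /* the allocator never touches the
                                     payload; initialization is ours   */
    int ok = (p[0] == 'x' && p[n - 1] == 'x');
    free(p);                      /* p must come from malloc/calloc/realloc */
    return ok ? 0 : -1;
}
```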

## Lecture 8: Malloc

• how much memory to free – the word before the block holds its size
• implicit list - use length/links of all blocks – implemented in K&R
• segregated free list – free list for different size classes
• blocks sorted by size – use an RB tree with pointers to each block, length used as the key
• The implicit list
• size and allocation status – could store in two words, but this is wasteful
• use a word for block size; the low bit holds allocation status (works because block sizes are a multiple of 8, so the low bits are always zero)
• alignment with 8
• while not past end, and already allocated, and too small, go to the next block
• next fit, first fit, but starts where previous search finished
• best fit, looks for the smallest block with appropriate size
• allocation in a free block – split a block
• freeing – simply clear allocated flag
• or coalesce – check the next/previous block if its free
• boundary tags – size/status at both ends of the block let you traverse the list backwards; requires extra space, but an important, general technique (Knuth, 1973)
• extra internal fragmentation with boundary tags
• Placement – first fit, next fit, best fit
• splitting – how much internal frag
• coalescing – immediate or deferred
• often used in special purpose apps, easy to implement
• The explicit free list
• allocated blocks are as before; free blocks have size, allocation status, and next/previous pointers stored in the first two words of the free block
• allocation requires moving previous and next for several
• faster than implicit lists when memory is mostly full (only free blocks are searched)
• more complex free/allocate because blocks need to be spliced in/out of the list
• extra space required by links
• commonly used with segregated free lists
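The size/status packing trick from the implicit list can be sketched directly. Because block sizes are multiples of 8, the low three bits of the size word are always zero, so the low bit is free for the allocated flag (helper names are illustrative):

```c
#include <stdint.h>

typedef uint32_t word_t;

/* Header word = size (multiple of 8) | allocated bit in the low bit. */
static word_t pack(word_t size, int alloc)  { return size | (word_t)(alloc & 1); }
static word_t get_size(word_t header)       { return header & ~(word_t)7; }
static int    get_alloc(word_t header)      { return (int)(header & 1); }
```

Freeing a block is then just clearing the low bit of its header; traversal adds `get_size(header)` to the current block pointer to reach the next block.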

## Lecture 9: Malloc Continued

• programming assignment coming
• Explicit List
• splice out predecessor and successor blocks, insert new block at root of list, coalesce if needed
• explicit – only include free blocks
• linked lists are mostly used in segregated free lists
• segregated free list
• blocks sorted by size – RB tree, or similar, using length as a key
• size classes of blocks have own free list
• large sizes – one for each two power size
• Search for appropriate free list for block of size $$m > n$$
• if an appropriate block is found
• split block and place on appropriate list
• If no block found, try next larger class
• repeat until block found
• If no block is yet found – request additional heap memory from os (sbrk)
• allocate block of $$n$$ bytes from this new memory
• place remainder as a single free block in largest size class
• Memory perils and pitfalls
• overwriting memory
• referencing non-existent variables
• freeing multiple times
• referencing freed blocks
• failing to free blocks
• be careful of various operators and their precedence
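The precedence warning is worth a concrete case: `*p++` and `(*p)++` do very different things. A small sketch (hypothetical `demo` helper, mentally traced):

```c
/* *p++ dereferences, then advances the pointer;
   (*p)++ increments the pointed-to value in place. */
static int demo(void) {
    int a[2] = {10, 20};
    int *p = a;
    int x = *p++;    /* x = 10, p now points at a[1]  */
    int y = (*p)++;  /* y = 20, a[1] becomes 21       */
    return x + y + a[1] + (int)(p - a);  /* 10 + 20 + 21 + 1 */
}
```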

## Lecture 10: Garbage Collection!

• Function pointers are used to implement dynamic loading and in some OO languages, v-tables

### Garbage Collection

• OO and Functional Programming languages
• storage is still exhaustable
• Let the runtime handle it
• less debugging
• no leaks
• no explicit management
• malloc function, new operator
• used to allocate needed memory
• mutator – does useful work, changes or mutates the connectivity of the object graph
• gc – shift responsibility of freeing to systems
• static allocation – binding at compile time – local vars bound to the same locations at every activation of the procedure
• heap allocation – allocated and deallocated in any order, outlives method creating it, real-world data structures
• liveness – held in roots, can be detected
• memory is now a directed graph
• non-reachable is garbage
• nodes are reachable if there is a path from any root to that node
• initialization is part of new – which acts like malloc otherwise
• ref count – the number of objects that refer to the current object
• harder to determine deallocatability
• but less time is spent debugging memory
• Classical algorithms
• Reference counting
• counts number of references to each object
• incremental
• Perl, Photoshop, Smalltalk, VB
• update(right(m), null)
• recursively decrement all of the things below it, and update anything below again
• management distributed through execution
• better locality than tracing
• cells that are short-lived can be reused to reduce heap usage
• but fragile, and doesn’t deal with cyclic structures
• Mark sweep
• dead objects not reclaimed immediately
• begins when a request is made that results in memory exhaustion
• useful work is suspended
• sweep unused objects back to list
• global traversal of all live objects starting from root
• sweep from the bottom up
• unmarked cells returned to the free pool
• marking bits are unset in prep for next gc
• Start/stop
• complexity proportional to size of heap
• more fragmentation/reduced locality
• Copying algorithm
• split into from and to spaces
• allocation through pointer bumping in from-space
• live identification is done through tracing
• live object copied to other side of heap
• once copied, tags for spaces are flipped
• fast allocation, simple check for exhaustion
• no fragmentation
• poor space utilization
• more page faults
• Generational gc
• the de facto standard
• uses multiple kinds
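Reference counting, the first classical algorithm above, can be sketched in a few lines of C. This is a minimal illustration with hypothetical names, not a real collector (and, as the notes say, it cannot reclaim cycles):

```c
#include <stdlib.h>

/* Minimal reference-counting sketch. */
typedef struct Obj {
    int refcount;
    /* ... payload would go here ... */
} Obj;

static Obj *obj_new(void) {
    Obj *o = malloc(sizeof *o);
    o->refcount = 1;              /* creator holds the first reference */
    return o;
}

static Obj *obj_retain(Obj *o) { o->refcount++; return o; }

/* Returns 1 if the count dropped to zero and the object was freed. */
static int obj_release(Obj *o) {
    if (--o->refcount == 0) { free(o); return 1; }
    return 0;
}
```

This shows why management is "distributed through execution": every pointer update implies a retain/release pair rather than a separate collection phase.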

## Lecture 11: Process Management

• difference between programs and processes
• program is a lifeless object, sits on disk, just instructions
• a process is a running program
• OS takes the instructions and gets them running
• transforms the program into something useful
• issues with programs running on different OS – because of memory differences, abis etc
• creation
• stored to memory (code and data), stack and heap
• i/o – create file descriptors
• execute, starting at entry point (main)
• System to be initialized
• user requests a new process
• initiation of batch jobs
• be very careful of fork
• termination conditions
1. normal exit
2. error exit
3. fatal error
4. killed by another process
• Parents create children, children can create children
• forming a hierarchy, called a process group (except on windows)
• processes may be either running or not running (suspension is used to wait for other things)
• has to use a queue – kinda
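The "be very careful of fork" warning comes from fork's unusual contract: it returns twice, 0 in the child and the child's pid in the parent. A minimal parent/child sketch (the helper name is hypothetical):

```c
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child, let it exit with a distinctive status, and reap it.
   Returns the child's exit status, or -1 on abnormal termination. */
static int spawn_and_reap(void) {
    pid_t pid = fork();
    if (pid == 0)
        _exit(42);                /* child: runs only this branch */
    int status;
    waitpid(pid, &status, 0);     /* parent: wait for the child   */
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Note that two bare `fork()` calls in a row produce four processes, which is the usual exam trap.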

## Lecture 12: Process Management, continued

• the two process model
• running or not running
• running requires availability of cpu resources
• many state models
• two state model
• running, and not running, looped together
• not running is entry point
• exit is exit point from running
• not-running-queue
• uses a queue for not yet running
• three state model
• running to blocked, or to ready
• lowest layer of a process-structured os
• handles interrupts and scheduling
• above the scheduler are sequential processes
• can preempt, switch, etc
• scheduler preemption
• save program counter
• register values
• memory, etc
• five state
• running, ready, new, blocked, exit
• timeout moves running to ready
• event wait moves running to blocked
• running releases to exit
• two/multiple queues
• blocked queue
• a queue for each event waiting type
• process suspension
• processor is faster than i/o
• swap processes to disk to free memory
• blocked state becomes suspend when swapped to disk
• process suspension may happen for a variety of reasons
• i/o tables
• file tables
• status tables
• process structure points to where various things are
• registers
• pc
• program status
• stack pointer
• process state
• priority
• sched params
• pid
• parent
• process group
• a variety of others
• shared memory can be used to communicate/coordinate between processes
• threads – lighter weight than processes
• but have a similar purpose
• keep the same process, execution context
• and only are a specific function
• can still lead to race conditions
• can cause issues w/ stack space

## Lecture 13: Process Examples

• know how many processes are in play
• remember that process have parent/child relationships

• threads are miniature processes – part of one process, same resources as the original, but a new execution context (new stack space for the thread)
• may/does require dynamic allocation on the stack
• multi threading helps to utilize as much of the processor as possible
• threads can be implemented either in userspace or by the kernel
• requires scheduling, a thread table, etc
• kernel threads are now the norm – makes things much easier to handle
• good concurrency support – blocking thread doesn’t starve siblings
• scheduling needs system calls
• not as flexible
• not as scalable
• scheduler activations
• avoid unnecessary user/kernel transitions
• kernel assigns virt processor to each process
• relies on the kernel to notify of kernel blocking events
• more virtual processors are assigned on blocking
• mysql
• some before, but most after
• may have private globals
• and static locals

• create a thread and start routine
• exit simply terminates a thread
• Be careful with how sync is done.
• And consider how syncing in threads is done
• Mutexs, flags, semaphores, etc
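The create/start-routine/exit lifecycle above maps directly onto the pthreads API. A minimal sketch (the helper names are illustrative):

```c
#include <pthread.h>

/* A start routine receives one void* argument and may return a void*.
   Threads share the process's memory, so writing through arg is visible
   to the creator after the join. */
static void *start_routine(void *arg) {
    int *n = arg;
    *n += 1;
    return arg;
}

static int run_one_thread(void) {
    pthread_t tid;
    int n = 41;
    pthread_create(&tid, NULL, start_routine, &n);
    pthread_join(tid, NULL);      /* wait for the thread to terminate */
    return n;
}
```

`pthread_join` is what makes reading `n` afterwards safe; without it, the read would race with the thread's write.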

## Lecture 16: More Concurrency

• Print Queue – $$N = 100$$
• Processes add print jobs to the queue, but can’t add more to a full queue
• Processes remove jobs from the queue and do the printing, but can’t consume from the empty queue
• Some buffers are circular
• When buffer is full, producers go to sleep
• When empty, consumers go to sleep
• If empty buffer gets added to, wake consumer; if full buffer gets removed, wake producers
• Updates may, and likely do, require a mutex
• Semaphores – a number, 0 or a positive number
• down – if val > 0, decrement; if val = 0, suspend without completing the down
• up – increment and if there are processes sleeping on the semaphore, wake one or all up
• Monitors
• Semaphores are used for mutual exclusion
• barriers and signalling events
• low level
• Monitors provide mutual exclusion implicity
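The bounded-buffer pattern above (producers sleep when full, consumers sleep when empty, a mutex guards updates) can be sketched with POSIX semaphores. Names and structure are illustrative, assuming $$N = 100$$ from the notes:

```c
#include <pthread.h>
#include <semaphore.h>

#define N 100                       /* queue capacity from the notes */

static int buf[N], in, out;
static sem_t empty_slots, full_slots;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

static void pc_init(void) {
    sem_init(&empty_slots, 0, N);   /* N empty slots initially  */
    sem_init(&full_slots, 0, 0);    /* no jobs queued initially */
}

static void put(int job) {
    sem_wait(&empty_slots);         /* producer sleeps if buffer full */
    pthread_mutex_lock(&lock);
    buf[in] = job; in = (in + 1) % N;
    pthread_mutex_unlock(&lock);
    sem_post(&full_slots);          /* wake a sleeping consumer */
}

static int get(void) {
    sem_wait(&full_slots);          /* consumer sleeps if buffer empty */
    pthread_mutex_lock(&lock);
    int job = buf[out]; out = (out + 1) % N;
    pthread_mutex_unlock(&lock);
    sem_post(&empty_slots);         /* wake a sleeping producer */
    return job;
}
```

The two semaphores count slots and jobs; the mutex only protects the index updates, which is exactly the split the notes describe.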

## Lecture 17: Monitor

• Semaphores as barriers
• Monitor is a language construct
• Only one process can be active in a monitor at any instant
• Use condition variables to block processes
• wait operations
• signal operations
• See slide 11 from class
• producer-consumer problem, only one process can be active at one time
• buffer has $$n$$ slots
• implementation independent
• only visible effects matter
• callers are independent
• monitor writers can change the implementation as long as the same effects are maintained
• provides mutual exclusion implicitly
• condition variables
• delay a process if a monitor state fails to satisfy boolean condition
• awaken a process or processes if the condition becomes true
• signaling disciplines
• signal-and-continue, signal-and-wait
• Helps to solve race conditions
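The wait/signal pattern above maps onto pthread condition variables: the mutex supplies the monitor's implicit mutual exclusion, and the `while` loop re-checks the predicate, which is what signal-and-continue requires. A sketch with illustrative names:

```c
#include <pthread.h>

static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  nonzero = PTHREAD_COND_INITIALIZER;
static int count = 0;

static void deposit(void) {
    pthread_mutex_lock(&m);
    count++;
    pthread_cond_signal(&nonzero);  /* signal-and-continue: signaler keeps the lock */
    pthread_mutex_unlock(&m);
}

static int withdraw(void) {
    pthread_mutex_lock(&m);
    while (count == 0)              /* loop, not if: condition may no longer
                                       hold when the waiter finally runs  */
        pthread_cond_wait(&nonzero, &m);
    int c = --count;
    pthread_mutex_unlock(&m);
    return c;
}
```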

• When there’s a shared resource that you only need a part of
• many different methods of dealing with deadlock
• happens on mutual exclusion
• no preemption
• circular waiting
• mutual exclusion helps to avoid race conditions
• hold-and-wait – must request all resources initially
• no preemption – hard to apply well in practice
• circular wait – define linear ordering of resource types
• Most OS use the ostrich’s approach – head in sand
• Use the banker’s algorithm to help control allocation/release
• max requirement stated in advance
• processes must be independent – no synchronization requirements
• fixed number of resources
• no process may exit while holding resources
• detect by
• marking each process that has a row of all zeroes in the allocation matrix
• initialize a temp vector $$W$$ to equal the available vector
• find an $$i$$ such that process $$i$$ is unmarked and the $$i$$th row of $$Q \leq W$$
• if no such $$i$$ exists, terminate
• if one exists, mark process $$i$$ and add row $$i$$ of $$A$$ to $$W$$
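The marking steps above can be sketched as a small C function. This is an illustrative implementation for fixed-size matrices ($$A$$ = allocation, $$Q$$ = request, `avail` = available vector); the dimensions and names are assumptions:

```c
#include <string.h>

#define NPROC 3
#define NRES  2

/* Returns the number of processes left unmarked (i.e., deadlocked). */
static int detect(const int A[NPROC][NRES], const int Q[NPROC][NRES],
                  const int avail[NRES]) {
    int W[NRES], marked[NPROC] = {0};
    memcpy(W, avail, sizeof W);

    /* Step 1: mark processes holding no resources (all-zero A row). */
    for (int i = 0; i < NPROC; i++) {
        int zero = 1;
        for (int j = 0; j < NRES; j++) if (A[i][j]) zero = 0;
        marked[i] = zero;
    }
    /* Steps 2-3: repeatedly mark any unmarked i with row Q_i <= W,
       returning its allocation to W. */
    int progress = 1;
    while (progress) {
        progress = 0;
        for (int i = 0; i < NPROC; i++) {
            if (marked[i]) continue;
            int fits = 1;
            for (int j = 0; j < NRES; j++) if (Q[i][j] > W[j]) fits = 0;
            if (fits) {
                for (int j = 0; j < NRES; j++) W[j] += A[i][j];
                marked[i] = 1;
                progress = 1;
            }
        }
    }
    int deadlocked = 0;
    for (int i = 0; i < NPROC; i++) if (!marked[i]) deadlocked++;
    return deadlocked;
}
```

With two processes each holding what the other requests and nothing available, both stay unmarked; give the system one free unit of each resource and the deadlock dissolves.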

## Lecture 20: Dealing with Interrupts

• be careful of how synchronous/asynchronous function calls happen
• interrupts are handled by function pointers
• execution context
• alt_exception_entry.s – handles changing execution context, i.e., interrupts
• stores return address, exception handling exit
• stores current stack pointer, all registers, etc
• Loads and stores into the stack
• Stack pointer points to the last used slot
• FP points to the last saved frame
• In multi-threading, you can just go to the next thread on return from an interrupt handler
• Next up – scheduling algorithms
• Signal and continue – signaler continues and process executes at some later time
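"Interrupts are handled by function pointers" usually means a vector of handlers indexed by IRQ number. This is a hypothetical user-space sketch of that dispatch pattern (real entry code like `alt_exception_entry.s` saves registers and the return address first):

```c
#include <stddef.h>

#define NIRQ 4
typedef void (*handler_t)(int irq);

static handler_t vector[NIRQ];       /* one slot per interrupt source */
static int timer_ticks;

static void timer_handler(int irq) { (void)irq; timer_ticks++; }

static void register_handler(int irq, handler_t h) {
    if (irq >= 0 && irq < NIRQ) vector[irq] = h;
}

/* What the low-level entry code would call after saving context. */
static void dispatch(int irq) {
    if (irq >= 0 && irq < NIRQ && vector[irq] != NULL)
        vector[irq](irq);
}
```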

## Lecture 21: Uniprocessor Scheduling

• Assign processes to be executed
• minimize response time
• maximize throughput and processor efficiency
• Long term scheduling – determines which programs are admitted, determines degree of multiprogramming
• Medium-term – swapping function, manage degree of multiprogramming and available memory
• short-term scheduling – dispatcher, executes most frequently, when events occur (clock, i/o interrupts, OS calls, signals)
• user-oriented by response time, which should be minimized
• throughput for the system
• turnaround time – interval between submission and completion – batch processing
• response time – submission and initial response
• deadline – time for a process to complete; maximize the percentage of deadlines met
• predictability – job should take about the same time to run and incur the same cost on a system regardless of load
• Multiple queues based on what is being waited on
• Schedulers often use priorities – low priority may suffer starvation
• First Come First served – first process
• Arrival and time Required
• Look at it like a Gantt chart
• Pick oldest in queue to go first
• Short processes may have to wait a long time before they can execute
• Round Robin – Preemption based on a clock
• A time quantum is determined that allows each process to use the processor for that amount of time
• Time slicing
• Shortest process next
• shortest remaining time
• highest response ratio next – choose based on time waiting plus service time over service time
• Multilevel feedback queue – penalize jobs that have been running longer, don’t know remaining time process needs to execute
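FCFS is easy to work by hand from a Gantt chart, and just as easy to compute: each job finishes at the later of its arrival or the previous finish, plus its service time; turnaround is finish minus arrival. A sketch (hypothetical helper, jobs assumed sorted by arrival):

```c
#define NJOBS 3

/* FCFS: fill finish[i] for each job; turnaround_i = finish_i - arrive_i. */
static void fcfs(const int arrive[NJOBS], const int service[NJOBS],
                 int finish[NJOBS]) {
    int t = 0;
    for (int i = 0; i < NJOBS; i++) {
        if (t < arrive[i]) t = arrive[i];  /* CPU idles until arrival */
        t += service[i];
        finish[i] = t;
    }
}
```

For arrivals {0, 1, 2} and service times {3, 2, 1}, jobs finish at 3, 5, 6, so turnarounds are 3, 4, 4 — showing how later short jobs wait behind earlier long ones.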

## Lecture 22: Scheduling

• Use context library to implement thread scheduler
• Context should be pretty helpful