
OS Kernels


Lecture 1

  • Will require C!
  • Will have a complicated C assignment
  • Office Hours in 105 Schorr, Tuesday 10:30-12:30
  • more low-level
  • HW (35%)
    • 2 programming assignments
    • 2 problem solving
  • Project (15%)
    • Pair programming
    • Trio programming (requires additional components)
  • Midterms: 3 (40% total)
  • Quizzes (10%)
  • Tips
    • Get to know classmates
    • participate!!
    • Prior to due dates, class time will be spent answering questions
  • Goals
    • knowledge of internals
    • OS roles
    • interaction with hardware
    • systems programming
    • memory management
    • concurrency
    • practical OS experience
  • Know what a process is
  • And how multiple processes work
  • thinking about shared resources
  • And things like race conditions
  • Thread-safety is important – all code must be thread safe!

Lecture 2: Review – Cache Memories

  • OS Objectives
    • Provides convenience
      • simplifies I/O
      • simplifies user interfaces
      • simplifies portability
    • manages resources
      • memory
      • processes
      • I/O devices
      • communications media
    • ease of evolution
      • support new hardware components
      • provides new services
      • supports maintenance
  • A major portion of the course is on resource management
  • review of memory hierarchy and cache

Memory Hierarchy and Cache

  • Memory Hierarchy
    1. Registers
    2. L1 Cache (SRAM)
    3. L2 Cache (SRAM)
    4. Main Memory (DRAM)
    5. Local storage
    6. remote storage
  • Lower levels (smaller numbers) are smaller in capacity and faster, but more expensive per byte
  • rule of locality – bring things closer as you use them
  • The Cache
    • A small, fast storage device – a staging area for data in a larger, slower device
    • for each \(k\) the faster, smaller device at \(k\) serves as cache for larger, slower \(k+1\)
    • Hierarchies work because of locality – programs tend to access data at level \(k\) more often than at level \(k+1\) – thus storage at level \(k+1\) can be slower, and in turn larger and cheaper per bit
    • The memory hierarchy creates a large pool of storage that costs as much as the cheap storage near the bottom, while serving data to programs at the rate of the fast storage near the top
    • SRAM – managed automatically in hardware – holds frequently accessed blocks of main memory
    • CPU looks for data in caches first, then in the main memory
    • data is copied in transfer blocks
    • Requires various ways of managing which blocks are in the cache, and which aren’t
  • Cache concepts
    • Cache hits – When the cache has the block
    • Cache misses – when the cache doesn’t have the block
      • Get from main memory
      • store in cache – using placement (determines where a block goes) and replacement (determines which block gets evicted) policies
      • Types
        • Cold start / Compulsory – cache is empty
        • conflict miss – the placement policy restricts blocks from level \(k+1\) to a small subset of the positions at level \(k\)
          • occurs when the level \(k\) cache is large enough, but multiple objects map to the same level \(k\) block
        • capacity miss – occurs when the set of active blocks is larger than the cache
    • Faults – when the needed stuff isn’t in main memory

Cache Organization

  • \(S\) – set
  • \(E\) – way/line per set
  • \(B\) – block size
  • Each cache line has \(v\) (valid bit), a tag, and \(B\) bytes of data
  • \(C = SEB\) for cache size
  • Address is Tag, Set, Block offset
  • Caches operate in the physical address domain
  • The CPU generally issues virtual addresses, not physical addresses
  • To read (example parameters; a decomposition sketch follows this list):
    • \(B = 8\)
    • \(S = 4\)
    • \(E = 1\)
    • Locate Set
    • Locate matching tag, and valid
    • Locate data starting at offset
  • Direct mapped cache, \(E = 1\)
  • when no match, evict line and replace
  • \(E\)-way set associative cache – e.g. two lines per set (\(E = 2\)) – packs more lines per set, but a lookup must compare the tag against all \(E\) lines in the set – a miss is a bit harder to handle, and requires a replacement policy to pick which line to evict
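
A minimal sketch of the read lookup above in C, assuming the example parameters \(B = 8\), \(S = 4\), \(E = 1\) (constants and names here are illustrative, not from any particular cache):

    #include <stdint.h>
    #include <stdio.h>

    #define B_BITS 3   /* b = log2(B) = log2(8): block-offset bits */
    #define S_BITS 2   /* s = log2(S) = log2(4): set-index bits    */

    int main(void) {
        uint32_t addr   = 0xABC;                                   /* some physical address */
        uint32_t offset = addr & ((1u << B_BITS) - 1);             /* low b bits            */
        uint32_t set    = (addr >> B_BITS) & ((1u << S_BITS) - 1); /* next s bits           */
        uint32_t tag    = addr >> (B_BITS + S_BITS);               /* remaining high bits   */
        printf("tag=0x%x set=%u offset=%u\n", tag, set, offset);
        return 0;
    }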

Lecture 3

Cache – Continued

  • understand matching and replacement
  • replacements policies – random, least recently used, etc
  • writes must eventually propagate back to the original memory; two policies
  • write-through writes to memory immediately on every store
  • write-back defers writing until the line is replaced – needs a dirty bit
  • write misses
    • write-allocate – load into cache, update line
    • no-write-allocate – immediately write to memory
  • miss rate, hit time, miss penalty are the performance metrics
  • write code such that the common case goes fast
  • minimize misses in the inner loops
    • repeated refs to vars are good
    • stride-1 ref patterns are good (see the example after this list)
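
An illustration of the stride-1 advice (a sketch; N and the array are arbitrary): both functions compute the same sum, but the first walks memory in stride-1 order while the second jumps a whole row per access.

    #define N 1024
    double a[N][N];

    /* Stride-1: consecutive accesses fall in the same cache block,
       so there is roughly one miss per block. */
    double sum_rowwise(void) {
        double s = 0.0;
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++)
                s += a[i][j];
        return s;
    }

    /* Stride-N: each access jumps N * sizeof(double) bytes, so once
       the working set exceeds the cache, nearly every access misses. */
    double sum_colwise(void) {
        double s = 0.0;
        for (int j = 0; j < N; j++)
            for (int i = 0; i < N; i++)
                s += a[i][j];
        return s;
    }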

Problems

  • physical address only
  • memory is byte addressable
  • accesses are to 1-byte words
  • the physical address is 12 bits wide
  • L1 is physically addressed and direct mapped, with a 4-byte line and 16 total sets
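  • worked field split for these parameters: a 4-byte line gives \(\log_2 4 = 2\) offset bits, 16 sets give \(\log_2 16 = 4\) index bits, and the remaining \(12 - 4 - 2 = 6\) bits are the tag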

Lecture 4: Virtual Memory

  • lets a computer think it has more memory than it actually does
  • helps to deal with a lot of more basic stuff
  • physical addressing – used in small systems – embedded microcontrollers for things like cars, elevators, etc
    • can be an issue – a set amount of memory is provided, no more, no less
    • 512 MB means 512 MB
    • programs are machine dependent
    • running multiple programs becomes very difficult
  • virtual memory uses an mmu to translate a virtual address to a physical address
    • means a program doesn’t need to care how much physical memory the machine actually has
  • linear address space – ordered set of contiguous, non-negative integer addresses
  • virtual address space – set of \(N = 2^n\) virtual addresses
  • physical address space – set of \(M = 2^m\) physical addresses
  • difference between data and attributes
  • an object can have multiple addresses
  • every byte in main memory has one PA and one or more VAs
  • virtual memory uses main memory efficiently
  • simplifies memory management
  • isolates address space – one process can’t interfere with another’s, and a user program cannot access privileged kernel information

Virtual memory as a cache

  • VM is an array of \(N\) contiguous bytes stored on disk
  • contents of the array are cached in physical memory
  • blocks are called pages (size \(P = 2^p\) bytes)
  • pages are of fixed, but configurable size
  • Given an address space of \(2^n\) bytes and a page size of \(2^p\) bytes, there are \(2^{n-p}\) pages
  • keeping track requires a data structure managed in software rather than pure hardware. This is the page table
  • write-back, not write-through
  • Page tables
    • an array of page table entries (PTEs) that map virtual pages to physical pages
    • per-process kernel data structure, in DRAM
    • Has valid bit, keeps physical page number or disk address
    • a cleared valid bit signals that the page is on disk (or not yet allocated)
    • works because of locality
    • the set of active pages is called the working set
    • if the working set fits in main memory: good performance for the process after the initial misses
    • if the sum of the working sets exceeds main memory: thrashing – pages are continually swapped in and out

Virtual memory for memory management

  • each process has its own virtual address space – own page table
  • views memory as a simple linear array
  • mapping function scatters through physical memory
  • multiple virtual pages can map to the same physical page (sharing)
  • and can allow a single library to be used for multiple programs
  • can make linking and loading easier
  • kernel vm is shared
  • linking maps programs into the common structure
  • loading allocates and copies things as needed

Lecture 5: Virtual Memory

  • homework is a tool
  • quiz based
  • midterm will be electronic?
  • first midterm and programming assignment coming in a week to a week and a half

Virtual memory for memory management

  • bring data up the hierarchy in chunks (pages)!
  • everything has its own virtual address space
  • mapping scatters addresses through physical memory
  • well chosen mappings simplify allocation and management
  • includes protecting sections from execution

Memory protection

  • PTEs carry protection bits – supervisor, read, write – plus the block address; a violation raises a page fault

Address Translation

  • \(N = 2^n\) number of addresses in virtual address space
  • \(M = 2^m\) number of addresses in physical memory
  • \(P = 2^p\) page size
  • TLBI – tlb index
  • TLBT – tlb tag
  • VPO – virtual page offset
  • VPN – virtual page number
  • PPO – Physical page offset
  • PPN – Physical page number
  • CO – byte offset
  • CI – cache index
  • CT – cache tag
  • given a base address, use the page number and the offset
  • the MMU translates using the matching page table entry (an address-split sketch follows this list)
  • VM and cache together make things faster/more able to handle a lot of memory
  • TLB – another cache in the MMU, maps vpns to ppns, contains complete page table entries for a small number of pages
  • multilevel page tables – a flat table for a 48-bit address space with 8-byte PTEs and 4 KB pages would need 512 GB
    • paging the page table!
    • page tables that list page tables
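
A small sketch of the VPN/VPO split defined above, assuming 4 KB pages (\(p = 12\)); the MMU does this in hardware, the code just shows the arithmetic:

    #include <stdint.h>
    #include <stdio.h>

    #define PAGE_BITS 12   /* p: page size P = 2^p = 4 KB */

    int main(void) {
        uint64_t va  = 0x7f1234567abcULL;              /* some virtual address */
        uint64_t vpn = va >> PAGE_BITS;                /* high n-p bits        */
        uint64_t vpo = va & ((1ULL << PAGE_BITS) - 1); /* low p bits           */
        /* Translation looks up vpn in the page table to get the PPN;
           PA = (ppn << PAGE_BITS) | vpo -- the offset passes through. */
        printf("vpn=0x%llx vpo=0x%llx\n",
               (unsigned long long)vpn, (unsigned long long)vpo);
        return 0;
    }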

Lecture 6: Virtual Memory

Page faults

  • small page size – less internal fragmentation
    • more pages required
    • larger page tables
    • large programs means page tables may be in virtual memory
    • the rotational characteristics of disks favor a larger page size
  • 4k is the most common size
  • internal fragmentation
    • 10 seats, 10 students - no fragmentation
    • 20 seats, 10 students - fragmentation of 10
    • 40 seats, 10 students - fragmentation of 30
  • Small size with small pages – high page fault rate, but faster to fill
  • large size, large pages – low page fault rate, slower to fill
  • many pages in process, large page size, low page fault
  • contemporary techniques decrease locality of references within a process
  • Fetch policy – when pages should be brought into memory
    • demand paging – when the page is needed, as they are needed
      • many page faults when a process is first started
    • prepaging – load what is needed, and then a bit more
      • exploits characteristics of second memory devices
      • more efficient to bring in a number of pages at one time
      • ineffective if extra pages aren’t referenced
  • placement policy
    • determines where stuff is to be put
  • replacement policy
    • deals with selection of a page in main memory to be replaced
    • objective is to remove the page least likely to be referenced in the near future
    • more elaborate policies require greater overhead
  • frame locking
    • a frame that must stay in memory at all times
    • the kernel, and key control structures
    • i/o buffers and time-critical areas may be locked
    • locking is achieved by a lock bit associated with the frame
  • algorithms
    • optimal (Belady’s policy)
      • selects the page for which the time to next ref is longest, produces three page faults after the frame allocation has been filled
    • FIFO
      • treats page frames as a circular buffer
      • first thing in gets pushed out
    • least recently used
      • based on locality
      • least likely to be used is the least recently used
      • difficult to implement
    • clock policy (see the sketch after this list)
      • requires a use bit per frame
        • set when the page is loaded
        • set again on each reference
        • the hand sweeps the frames, clearing use bits that are 1
        • the first frame found with its use bit at 0 is replaced
  • Desirable to maintain as many processes in memory as possible
  • frees programmers from size restrictions
  • virt mem
    • references are logical, translated
    • breaks processes into pieces
    • two approaches are paging and segmentation
    • management scheme requires both hardware and software
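
A minimal sketch of the clock policy in C, assuming a fixed frame count and a global use-bit array (names illustrative):

    #include <stdbool.h>

    #define NFRAMES 64

    static bool use_bit[NFRAMES];   /* set on load and on each reference */
    static int  hand = 0;           /* the clock hand sweeps circularly  */

    /* Pick a victim frame: clear use bits that are set, and evict the
       first frame whose use bit is already 0 (second chance). */
    int clock_select_victim(void) {
        for (;;) {
            if (!use_bit[hand]) {
                int victim = hand;
                hand = (hand + 1) % NFRAMES;
                return victim;
            }
            use_bit[hand] = false;          /* give it a second chance */
            hand = (hand + 1) % NFRAMES;
        }
    }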

Lecture 7: Virtual Memory

  • Consider a paged VM system with 32-bit virtual addresses and 2K-byte pages. Each page table entry requires 32 bits. Each page table is exactly one page (2K bytes) in size
    • how many levels of page tables are needed?
    • how many entries does the table use at each level? (at one level, the table uses a smaller number of entries)
    • the smaller number of entries can be used at the top or the bottom. Which strategy uses the least number of pages? Explain by calculating the number of pages for both
    • number of virtual pages: \(2^{32}/2^{11} = 2^{21}\)
    • entries per page-table page: \(2^{11}/2^{2} = 2^{9} = 512\), so 3 levels, splitting the 21-bit VPN as \(9 + 9 + 3 = 21\)
    • entries per level: 512, 512, 8

Dynamic Memory allocation

  • basic concepts
    • malloc etc acquires vm at run time
    • manage an area of process vm known as the heap
    • allocator maintains heap as collection of variable sized blocks, either allocated or free
    • types
      • explicit – malloc and free in C
      • implicit – application allocates, doesn’t free, GC in Java, ML, Lisp
  • void *malloc(size_t size) – returns a void pointer to a memory block of at least size bytes, aligned to an 8-byte boundary (usually); on failure returns NULL and sets errno (see the usage sketch after this list)
  • void free(void *p) – returns the block pointed at by p to the pool of available memory – p must come from a previous call to malloc
  • sbrk is an internal function used to grow the heap
  • memory is word addressed
  • applications can issue an arbitrary sequence of malloc and free
  • free must be called on a malloc’d block
  • the allocator can’t control the number or size of allocated blocks
  • must respond immediately to malloc
  • must allocate from free memory
  • must align blocks to satisfy all alignment requirements (8 byte for GNU Malloc)
  • can manipulate and modify only free memory
  • can’t move the allocated blocks once malloc’d
  • given sequence of malloc and free,
  • maximize throughput and peak memory utilization
  • throughput is number of completed requests per unit time
  • payload, malloc(p) results in a block with payload of p bytes
  • aggregate payload is the sum of the currently allocated payloads
  • current heap size is non-decreasing
  • peak utilization after \(k\) requests is the maximum aggregate payload so far, divided by the current heap size
  • internal fragmentation is when the payload is smaller than the block size
  • enough aggregate memory, but not enough contiguous memory, is external fragmentation
  • compacting the heap can be done, but requires a tremendous amount of work – a linear search of memory and introspection to find and update pointers
  • the size is embedded in the block
  • keep track of free blocks – some structure in the heap
  • picking blocks for allocation
  • reinserted freed blocks?
  • standard block – length of the block in the word preceding the block (the header field, or header); requires an extra word for every allocated block
  • implicit list using length (links all blocks)
  • explicit list (lists free blocks)
  • segregated free list
  • different free lists for different size classes
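
A minimal usage sketch of the malloc/free contract above (check for NULL, free exactly the pointer malloc returned):

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    int main(void) {
        size_t n = 100;
        int *p = malloc(n * sizeof *p);   /* block of at least n ints, aligned */
        if (p == NULL) {                  /* on failure: NULL, errno set */
            perror("malloc");
            return 1;
        }
        memset(p, 0, n * sizeof *p);
        /* ... use p[0..n-1] ... */
        free(p);                          /* exactly the pointer malloc returned */
        return 0;
    }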

Lecture 8: Malloc

  • how much memory does free free? – the size is stored in the word before the block
  • implicit list - use length/links of all blocks – implemented in K&R
  • segregated free list – free list for different size classes
  • blocks sorted by size – use an RB tree with pointers to each block, with the block length used as the key
  • The implicit list
    • need size and allocation status – could store them in two words, but this is wasteful
    • use one word for the block size, with the low bit as the allocated flag (possible because block sizes are multiples of 8, so the low bits are always zero)
    • a block is the status word (header), the payload, and padding to fulfill alignment
    • alignment to 8 bytes
    • first fit: while not past the end, and the block is allocated or too small, go to the next block (see the sketch after this list)
    • next fit: like first fit, but starts where the previous search finished
    • best fit: looks for the smallest free block of appropriate size
    • allocation in a free block – split a block
    • freeing – simply clear the allocated flag
      • or coalesce – check whether the next/previous block is free
        • boundary tags – size/status at both ends of the block, allowing backwards traversal of the list; requires extra space; an important, general technique (Knuth, 1973)
        • extra internal fragmentation with boundary tags
    • Placement – first fit, next fit, best fit
    • splitting – how much internal fragmentation to accept
    • coalescing – immediate or deferred
    • often used in special purpose apps, easy to implement
  • The explicit free list
    • an allocated block is as before; a free block has size, allocation status, and next/previous pointers stored in the first two words of the payload
    • allocation requires splicing the block out of the list, updating several next/previous pointers
    • faster than the implicit list when memory is nearly full, since only free blocks are scanned
    • more complex free/allocate because blocks need to be spliced in/out of the list
    • extra space required by links
    • commonly used with segregated free lists
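
A sketch of the implicit-list header trick and first-fit scan above; heap_start/heap_end and the surrounding allocator are assumed, and the names are illustrative:

    #include <stddef.h>

    typedef size_t word_t;

    /* Low bit of the header is the allocated flag; legal because block
       sizes are multiples of 8, so a size's low 3 bits are always zero. */
    #define ALLOC_BIT ((size_t)1)

    static size_t  block_size(word_t *hdr) { return *hdr & ~(size_t)7; }
    static int     is_alloc(word_t *hdr)   { return (int)(*hdr & ALLOC_BIT); }
    static word_t *next_block(word_t *hdr) {
        return (word_t *)((char *)hdr + block_size(hdr));
    }

    /* First fit: walk every block from the start of the heap; return the
       first free block big enough for asize, or NULL if none is found. */
    word_t *find_fit(word_t *heap_start, word_t *heap_end, size_t asize) {
        for (word_t *b = heap_start; b < heap_end; b = next_block(b))
            if (!is_alloc(b) && block_size(b) >= asize)
                return b;
        return NULL;
    }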

Lecture 9: Malloc Continued

  • programming assignment coming
  • Explicit List
  • splice out predecessor and successor blocks, insert new block at root of list, coalesce if needed
  • explicit – only include free blocks
  • linked lists are mostly used in segregated free lists
  • segregated free list
    • blocks sorted by size – an RB tree or similar, using the block length as the key
    • each size class of blocks has its own free list
    • for large sizes – one class for each power of two
    • Search the appropriate free list for a block of size \(m \ge n\)
      • if an appropriate block is found
        • split block and place on appropriate list
      • If no block found, try next larger class
      • repeat until block found
      • If no block is yet found – request additional heap memory from os (sbrk)
      • allocate block of \(n\) bytes from this new memory
      • place remainder as a single free block in largest size class
  • Memory perils and pitfalls (see the example after this list)
    • dereferencing bad pointers
    • reading uninitialized memory
    • overwriting memory
    • referencing non existent variables
    • freeing multiple times
    • referencing freed blocks
    • failing to free blocks
    • be careful of various operators and their precedence
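
The freed-block pitfalls above in their simplest form (a sketch; the commented-out lines are undefined behavior, which is what makes these bugs hard to find):

    #include <stdlib.h>

    int main(void) {
        int *p = malloc(sizeof *p);
        if (!p) return 1;
        *p = 42;
        free(p);
        /* BUG: p is now dangling.                               */
        /* int x = *p;   <- referencing a freed block            */
        /* free(p);      <- freeing multiple times (double free) */
        p = NULL;        /* defensive: a NULL deref at least fails loudly */
        return 0;
    }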

Lecture 10: Garbage Collection!

  • Function pointers are used to implement dynamic loading and in some OO languages, v-tables
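
A minimal illustration of the function-pointer idea behind v-tables (a sketch, not any particular language's object model): a struct of function pointers, dispatched indirectly at run time.

    #include <stdio.h>

    /* A "v-table" is just a struct of function pointers; each "class"
       supplies its own implementations, and calls dispatch indirectly. */
    struct shape_ops {
        double (*area)(double size);
    };

    static double square_area(double s) { return s * s; }
    static double circle_area(double r) { return 3.141592653589793 * r * r; }

    int main(void) {
        struct shape_ops square = { square_area };
        struct shape_ops circle = { circle_area };
        printf("%f %f\n", square.area(2.0), circle.area(2.0));
        return 0;
    }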

Garbage Collection

  • OO and Functional Programming languages
  • storage is still exhaustible
  • Let the runtime handle it
    • less debugging
    • no leaks
    • no explicit management
  • malloc function, new operator
    • used to allocate needed memory
  • mutator – does useful work, changes or mutates the connectivity of the object graph
  • gc – shift responsibility of freeing to systems
  • static allocation – binding at compile time – local vars bound to the same locations at every activation of the procedure
  • heap allocation – allocated and deallocated in any order, outlives method creating it, real-world data structures
  • liveness – held in roots, can be detected
  • memory is now a directed graph
  • non-reachable is garbage
  • nodes are reachable if there is a path from any root to that node
  • initialization is part of new – which acts like malloc otherwise
  • ref count – the number of objects that refer to the current object
  • harder to determine when memory can be deallocated
  • but less time is spent debugging memory
  • Classical algorithms
    • Reference counting
      • counts number of references to each object
      • incremental
      • Perl, Photoshop, Smalltalk, VB
      • e.g. update(right(m), null) (see the sketch after this list)
        • recursively decrement the counts of everything below it, freeing whatever drops to zero
      • management distributed through execution
      • better locality than tracing
      • cells that are short lived can be reused to reduce the amount of heap usage
      • but fragile, and doesn’t deal with cyclic structures
    • Mark sweep
      • dead objects not reclaimed immediately
      • begins when a request is made that results in memory exhaustion
      • useful work is suspended while the collector runs
      • sweep unused objects back to list
      • global traversal of all live objects starting from root
      • sweep from the bottom up
      • unmarked cells returned to the free pool
      • marking bits are unset in prep for next gc
      • Start/stop
      • complexity proportional to size of heap
      • more fragmentation/reduced locality
      • requires head-room
    • Copying algorithm
      • split into from and to spaces
      • allocation through pointer bumping in from-space
      • live identification is done through tracing
      • live object copied to other side of heap
      • once copied, the roles of the two spaces are flipped
      • fast allocation, simple check for exhaustion
      • no fragmentation
      • poor space utilization
      • more page faults
    • Generational gc
      • the de facto standard
      • uses multiple generations – young objects are collected more often than old ones
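
A minimal reference-counting sketch in C (illustrative; a real collector would also release the object's children, and counting alone still misses cycles):

    #include <stdlib.h>

    typedef struct obj {
        int refcount;          /* number of references to this object */
        /* ... payload ... */
    } obj;

    obj *obj_new(void) {
        obj *o = malloc(sizeof *o);
        if (o) o->refcount = 1;          /* creator holds the first reference */
        return o;
    }

    void rc_retain(obj *o)  { o->refcount++; }

    void rc_release(obj *o) {
        if (--o->refcount == 0)          /* last reference dropped */
            free(o);
    }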

Lecture 11: Process Management

  • difference between programs and processes
  • program is a lifeless object, sits on disk, just instructions
  • a process is a running program
    • OS takes the instructions and gets them running
    • transforms the program into something useful
  • issues with programs running on different OSes – because of memory differences, ABIs, etc
  • creation
    • program loaded from disk
    • stored to memory (code and data), stack and heap
    • i/o – create file descriptors
    • execute, starting at entry point (main)
  • System to be initialized
  • user requests a new process
  • initiation of batch jobs
  • be very careful of fork (see the example after this list)
  • termination conditions
    1. normal exit
    2. error exit
    3. fatal error
    4. killed by another process
  • Parents create children, children can create children
  • forming a hierarchy, called a process group (except on Windows)
  • processes may be either running or not running (suspension is used to wait for other things)
  • not-running processes are kept on a queue – more or less
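
A minimal fork example showing why care is needed: one call, two returns, and the parent should reap the child.

    #include <stdio.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void) {
        pid_t pid = fork();              /* one call returns twice */
        if (pid < 0) {
            perror("fork");
            return 1;
        }
        if (pid == 0) {                  /* child: fork returned 0 */
            printf("child %d\n", (int)getpid());
            return 0;
        }
        waitpid(pid, NULL, 0);           /* parent: fork returned the child's pid */
        printf("parent %d reaped %d\n", (int)getpid(), (int)pid);
        return 0;
    }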

Lecture 12: Process Management, continued

  • the two process model
  • running or not running
  • running requires availability of cpu resources
  • many state models
  • two state model
    • running, and not running, looped together
    • not running is entry point
    • exit is exit point from running
  • not-running-queue
    • uses a queue for not yet running
  • three state model
    • ready, blocked, running
    • running to blocked, or to ready
    • ready to running
    • blocked to ready
    • lowest layer of a process-structured os
    • handles interrupts and scheduling
    • above the scheduler are sequential processes
    • can preempt, switch, etc
  • scheduler preemption
    • save program counter
    • register values
    • memory, etc
  • five state
    • running, ready, new, blocked, exit (a transition sketch follows this list)
    • new admits to ready,
    • ready dispatches to running
    • running times out back to ready
    • an event wait moves running to blocked
    • an event occurrence moves blocked to ready
    • running releases to exit
    • two/multiple queues
      • blocked queue
      • ready queue
      • a queue for each event waiting type
  • process suspension
    • processor is faster than i/o
    • swap processes to disk to free memory
    • a blocked process becomes suspended when swapped to disk
    • blocked/suspend, ready/suspend
  • process suspension may happen for a variety of reasons
  • i/o tables
  • file tables
  • status tables
  • process structure points to where various things are
    • registers
    • pc
    • program status
    • stack pointer
    • process state
    • priority
    • sched params
    • pid
    • parent
    • process group
    • a variety of others
  • shared memory can be used to communicate/coordinate between processes
  • Threading
    • lighter weight than processes
    • but have a similar purpose
    • keep the same process, execution context
    • and only are a specific function
    • can still lead to race conditions
    • can cause issues w/ stack space
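
A minimal sketch of the five-state transitions as code (names illustrative, not from any real kernel; each helper applies one legal transition and leaves other states untouched):

    /* Five-state model: states plus the legal transitions listed above. */
    enum proc_state { NEW, READY, RUNNING, BLOCKED, EXIT };

    enum proc_state admit(enum proc_state s)    { return s == NEW     ? READY   : s; }
    enum proc_state dispatch(enum proc_state s) { return s == READY   ? RUNNING : s; }
    enum proc_state timeout(enum proc_state s)  { return s == RUNNING ? READY   : s; }
    enum proc_state ev_wait(enum proc_state s)  { return s == RUNNING ? BLOCKED : s; }
    enum proc_state ev_occur(enum proc_state s) { return s == BLOCKED ? READY   : s; }
    enum proc_state release(enum proc_state s)  { return s == RUNNING ? EXIT    : s; }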

Lecture 13: Process Examples

  • virtual addresses, physical addresses must be mapped for shared memory
  • know how many processes are in play
  • remember that processes have parent/child relationships

Threading

  • miniature processes – part of one, same resources as the original, but new execution context (new stack space for the thread)
  • creating a thread requires dynamically allocating a new stack
  • multi threading helps to utilize as much of the processor as possible
  • threads can be implemented either in userspace or by the kernel
    • requires scheduling, a thread table, etc
  • kernel threads are now the norm – makes things much easier to handle
    • good concurrency support – blocking thread doesn’t starve siblings
    • scheduling needs system calls
    • not as flexible
    • not as scalable
  • hybrid implementations – part kernel threads, part user threads
  • scheduler activations
    • mimic kernel threads while gaining the performance of user threads
    • avoid unnecessary user/kernel transitions
    • the kernel assigns virtual processors to each process
    • relies on the kernel to notify the process of kernel blocking events
    • more virtual processors can be assigned when threads block
  • MySQL
    • pop-up threads – a new thread is created on message arrival
    • some processing happens before the thread exists, but most happens after
  • thread code must be reentrant
  • may have private globals
    • and static locals

Lecture 14: Threads, Continued

  • create a thread and start routine
  • exit simply terminates a thread
  • pthread_join suspends execution of the calling thread until the target thread terminates, unless the target thread has already terminated (see the example after this list)
  • Be careful with how synchronization is done, and consider the mechanisms available for syncing threads
  • Mutexes, flags, semaphores, etc
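
A minimal create/join example with the pthreads API mentioned above:

    #include <pthread.h>
    #include <stdio.h>

    /* Start routine: runs in its own execution context (its own stack),
       sharing the process's address space. */
    static void *worker(void *arg) {
        printf("hello from thread %ld\n", (long)arg);
        return NULL;
    }

    int main(void) {
        pthread_t tid;
        if (pthread_create(&tid, NULL, worker, (void *)1L) != 0)
            return 1;
        pthread_join(tid, NULL);   /* suspends until worker terminates */
        return 0;
    }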

Lecture 16: More Concurrency

  • Print Queue – \(N = 100\)
  • Processes add print jobs to the queue, but can’t add more to a full queue
  • Processes remove jobs from the queue and do the printing, but can’t consume from an empty queue
  • Some buffers are circular
  • When buffer is full, producers go to sleep
  • When empty, consumers go to sleep
  • If an empty buffer gets added to, wake a consumer; if a full buffer gets removed from, wake a producer
  • Updates may, and likely do, require a mutex
  • Semaphores – a number, 0 or a positive number (see the sketch after this list)
    • down – if val > 0, decrement; if val = 0, suspend the caller without completing the down
    • up – increment and if there are processes sleeping on the semaphore, wake one or all up
  • Monitors
    • semaphores are used for mutual exclusion
    • and for barriers and signalling events
    • but they are low level
  • Monitors provide mutual exclusion implicitly
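
A bounded-buffer sketch with POSIX semaphores and a mutex, following the scheme above (N = 100 as in the print-queue example; call sem_init(&empty_slots, 0, N) and sem_init(&full_slots, 0, 0) before use):

    #include <pthread.h>
    #include <semaphore.h>

    #define N 100                       /* queue capacity */

    static int queue[N], in = 0, out = 0;
    static sem_t empty_slots;           /* counts free slots,  init to N */
    static sem_t full_slots;            /* counts queued jobs, init to 0 */
    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

    void produce(int job) {
        sem_wait(&empty_slots);         /* sleeps while the queue is full */
        pthread_mutex_lock(&lock);      /* mutex protects the buffer update */
        queue[in] = job;
        in = (in + 1) % N;
        pthread_mutex_unlock(&lock);
        sem_post(&full_slots);          /* wakes a sleeping consumer */
    }

    int consume(void) {
        sem_wait(&full_slots);          /* sleeps while the queue is empty */
        pthread_mutex_lock(&lock);
        int job = queue[out];
        out = (out + 1) % N;
        pthread_mutex_unlock(&lock);
        sem_post(&empty_slots);         /* wakes a sleeping producer */
        return job;
    }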

Lecture 17: Monitor

  • Semaphores as barriers
  • Monitor is a language construct
  • Only one process can be active in a monitor at any instant
  • Use condition variables to block processes
  • wait operations
  • signal operations
  • See slide 11 from class
  • producer-consumer problem: only one process can be active in the monitor at one time
  • buffer has \(n\) slots
  • implementation independent
  • only visible effects matter
  • callers are independent
  • monitor writers can change the implementation as long as the same effects are maintained
  • provides mutual exclusion implicitly
  • condition variables
    • delay a process if a monitor state fails to satisfy boolean condition
    • awaken a process or processes if the condition becomes true
  • signaling disciplines
  • signal-and-continue, signal-and-wait (a signal-and-continue sketch follows this list)
  • Helps to solve race conditions
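
A monitor-style bounded buffer sketched with a pthread mutex and condition variables (signal-and-continue discipline, hence the while loops; NSLOTS is arbitrary):

    #include <pthread.h>

    #define NSLOTS 8

    static int buf[NSLOTS], count = 0, in = 0, out = 0;
    static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t not_full  = PTHREAD_COND_INITIALIZER;
    static pthread_cond_t not_empty = PTHREAD_COND_INITIALIZER;

    void deposit(int v) {
        pthread_mutex_lock(&m);             /* enter the monitor */
        while (count == NSLOTS)             /* condition fails: wait */
            pthread_cond_wait(&not_full, &m);
        buf[in] = v;
        in = (in + 1) % NSLOTS;
        count++;
        pthread_cond_signal(&not_empty);    /* condition now holds: awaken one */
        pthread_mutex_unlock(&m);           /* leave the monitor */
    }

    int fetch(void) {
        pthread_mutex_lock(&m);
        while (count == 0)
            pthread_cond_wait(&not_empty, &m);
        int v = buf[out];
        out = (out + 1) % NSLOTS;
        count--;
        pthread_cond_signal(&not_full);
        pthread_mutex_unlock(&m);
        return v;
    }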

Licensed under Creative Commons Attribution-ShareAlike 4.0 International License