Deep Understanding of Computer Systems Notes

Deep Understanding of Computer Systems Notes

Encoding

One’s complement and sign-magnitude encoding have two different representations for the number 0 Two’s complement encoding has uniqueness, where the sign bit 1 indicates negative numbers and sign bit 0 indicates non-negative numbers Subtraction can be converted to addition, making circuit design simpler and more efficient The carry bit from the highest position during addition can be discarded

Essence of Compilation

  graph LR
A(C preprocessor expands source code)-->B(compiler generates assembly code for two files)
B-->C
C(assembler converts assembly code to binary object code)-->D(linker merges object files with standard Unix library functions to produce final executable file)

Machine Words

Word represents 16-bit data type, 32-bit is double word, 64-bit is quad word

Truncation

High precision to low precision conversion is direct truncation Low precision to high precision conversion uses sign extension to fill in

Floating Point Representation

IEEE 754 standard defines single precision floating point as 32 bits long | Sign 1 bit | Exponent 8 bits | Fraction 23 bits |

Locality Principle

Spatial locality: If a memory location is referenced once, nearby memory locations will likely be referenced soon Temporal locality: A memory location that has been referenced once will likely be referenced multiple times soon

Program Loading

Any Unix program can call the execve function to invoke the loader The loader copies code and data from the executable object file from disk to the corresponding process address space, then jumps to the first instruction of the program to run it

fork -> exec complete flow Parent shell forks a child process, which is a copy of the parent process. The child process uses exec to call the loader, which deletes the child’s existing virtual memory segments and initializes new code, data, and stack segments. The stack segment is initialized to zero values. Except for some header information, no data is copied from disk to memory during loading until the CPU references a mapped virtual page, at which point the operating system automatically transfers pages from disk to memory using its paging mechanism

Dynamic Linking

Allocate a dedicated address space block for each shared library implementation, then require the loader to always load shared libraries at this address

Exception Control Flow

Exception is a form of exception control flow, partially implemented by hardware and partially by the operating system

Exception Handling (requires kernel mode)

Each type of exception is assigned a unique non-negative integer exception number

Types: CPU computational exceptions, operating system kernel (system calls, external IO signals)

When the system starts, the operating system allocates and initializes an exception table jump table The starting address of the table is placed in a special register

Exception Types

Interrupts: IO, network adapters, disks, clocks trigger interrupts by sending signals to a processor pin and placing the exception number on the bus

Traps: Intentional exceptions, system calls run in kernel mode

Faults and Aborts

Address Space

The bottom three quarters of the address space are reserved for user programs, including typical text, data, heap, and stack segments The top quarter is reserved for the kernel (such as code, data, and stacks used when executing system calls)

| Kernel Virtual Memory | | User Stack | <- esp stack pointer |Shared Library Memory Mapping Area| <- brk system call (malloc implementation) | Runtime Heap | | BSS Segment Uninitialized | | Data Segment Initialized | | Read-only Segment |

Context Switching

  1. Save the context of the current process
  2. Restore the saved context of a previously preempted process
  3. Transfer control to this newly restored process

Time-slice round-robin is achieved through timer interrupts to allow the scheduler to reschedule

fork

The biggest difference between the parent process and the newly created child process is that they have different PIDs

Process Communication

Signals: Higher-level software form of exceptions. Different signals represent different system events, allowing processes to interrupt other processes Main purpose is to notify upper-layer processes of invisible underlying exceptions ctrl-c sends SIGINT signal to foreground process SIGKILL forces termination of process SIGCHLD: When child processes terminate or pause, the kernel sends this signal to the parent process Can selectively block reception Use ps command to check for defunct flag to identify zombie processes

Virtual Memory

  1. It treats main memory as a high-speed cache for address space stored on disk, keeping only active regions in main memory and transferring data back and forth between disk and main memory as needed, efficiently using main memory in this way
  2. It provides each process with a consistent address space, thus simplifying memory management
  3. It protects each process’s address space from being corrupted by other processes

Virtual address -> MMU (contains TLB cache) -> Physical address Organize main memory in page format, page tables maintained by operating system. When TLB cache misses, need to look up table in memory. If not found, page fault occurs, handled by operating system to load page

Standard library calls: Map virtual pages in different processes to the same physical pages, allowing multiple processes to share the same code

After fork, write operations by either process on virtual memory trigger copy-on-write mechanism, creating new pages

execve replacement process:

  1. Delete existing user areas
  2. Map private areas
  3. Map shared areas
  4. Set program counter

Stack Space

What is the essence of stack allocation? Two CPU instructions: push data to stack space, release only requires moving stack pointer

System pre-allocates a continuous address space that can be managed by compiler. In X86, function calls create new stack frames with sizes calculated by compiler. As long as stack pointer doesn’t touch protection pages when moving down, stack space remains available until overflow. Stack space is smaller than heap space, suitable for allocating variables with known byte lengths Thread stack size is fixed, but main thread stack is much larger than child thread stacks If accessing main memory touches protection pages, CPU generates page fault segfault

Heap Essence!!

  • A dynamic storage allocator maintains a process’s virtual memory region called heap
  • For each process, the kernel maintains a variable brk (break) pointing to the top of the heap
  • Heap is a request binary zero region

Why use heap? Need to know data structure size during actual program runtime

Allocator maintains heap as a collection of blocks of different sizes, connected with doubly linked lists, either allocated or free

Allocation strategy: First fit, Next fit, Best fit Merge and split free blocks, request failure: merge, request success: calculate, if possible split in half

Operating system defaults to allocating part of heap memory. Through brk we can dynamically expand or shrink heap. mmap can be used as heap space without file mapping, but requests are allocated in page multiples

glibc heap allocation algorithm: For allocations less than 64 bytes, uses object pool-like method; for allocations greater than 512 bytes uses best-fit algorithm (covered in OS textbooks); for allocations between 64 and 512 bytes, adopts optimal compromise strategy based on circumstances; for allocations greater than 128KB, uses mmap mechanism to directly request space from operating system

Since heap memory allocation generates system calls, runtime programs initially request a large chunk of heap space from operating system, shrinking only when unused for extended periods, avoiding unnecessary overhead from repeated heap allocation

For C language, process initialization only includes stack, no heap. Only after calling malloc does system allocate heap space. If heap usage doesn’t reach brk address, kernel doesn't participate in heap allocation because it's sufficient

Garbage Collection

Garbage collection treats memory as a directed reachable graph with a set of root nodes and a set of heap nodes C garbage collection is conservative garbage collection, where each reachable block is correctly marked as reachable, while some unreachable nodes may be incorrectly marked as reachable

System I/O

Output: Main memory to I/O Input: I/O copied to main memory

RIO built-in two functions:

  • Unbuffered input/output functions, directly transfer data between memory and files (especially for network read/write)
  • Buffered input functions (thread-safe bufio)

Shared files: Each process maintains a descriptor table All processes share open file table All processes share v-node table fd0 File1 offset File inode fd1 File2 offset fd2

Close actually closes the file descriptor, allowing kernel to safely delete entries from open file table