Physical Memory (RAM) is small, fast and volatile (pull the plug and nothing’s left). Remember that. :)
A process needs memory to run (storing code, stacks, information, data, etc.), so first we need to allocate some for it. Here are some allocation methods:
- Growing Segments – Give the process a predetermined amount of space for code (that will never change, of course) and a preliminary amount of space for data. Leave a big block of unused space after that for data to be added later, if needed. This still means that a process can get a limited amount of space and that there’s a finite amount of processes that can live in the system.
- Tracking Allocated Memory – Give the process the space for code and an initial space for data. Then give it more as it needs and track where the used/free spaces are. You can do that with Bitmaps (the structure, not the image type) or linked lists. Now we need an Allocation Strategy, that is – where do we allocate from the space we have left?
- First Fit – No need to search a lot, but highly ineffective.
- Next Fit – The first one we can find from the last place we allocated. Not a lot of improvement there.
- Best Fit – It needs 500 bytes? We’ll find the closest we can get to it (without going under 500) and give it that. The problem we get here is external fragmentation (later).
- Worst Fit – Now you’re just guessing.
- Quick Fit – Use Best Fit, but do it more quickly by keeping not one linked list, but several for size ranges (want 500 bytes? no prob – we’ll go look for it in the 250-750 bytes queue).
- The Buddy System – (Knuth 1973) We need X bytes, so we divide the first piece of memory into twos until we get a piece small enough so that xfloor(lgx) <= x <= xceil(lgx) and then allocate that part. For example, if we have 1024KB and we need 200KB, we divide 1024 in half and then the first half in half as well (we now have chunks sized 256, 256 and 512) and then allocate the first 256KB chunk. Once a chunk is freed (or re-merged), it will be re-merged with it’s buddy (that is, the other chunk in the original division). For example, freeing our 256KB chunk would cause it to re-merge with it’s 256KB buddy and the ‘new’ 512KB chunk will then re-merge with it’s 512KB buddy to re-form the original 1024KB chunk.
This algorithm’s memory utilization isn’t that good and there’s a lot of internal fragmentation (think what would happen if we needed a chunk of size 2x + 1).
Fragmentation is something we don’t like, because it means we have free memory we can’t use. There are two types of fragmentation:
- Internal Fragmentation is what happens when a chunk of memory is allocated by the allocation strategy, but only a small percentage of it is ever used. This means that the memory is logically free to be used by someone else, but can’t be, since it’s allocated.
- External Fragmentation happens when we have a lot of little holes of free memory between regions of allocated memory. We now can’t allocate them, since they’re not big enough for any of the requests for memory we get. A way of solving this problem is Memory Compaction, where regions of memory are moved to seal the holes and all of the holes are merged.
We’ve mentioned that memory is small, but we need to host a lot of processes, so we’ll just Swap some of it to a backing store (disk). We could swap an entire process if it’s blocked, too large, has too many holes in its memory, etc.. But we can’t move daemons (services)… Or shared libraries (dlls), so what do we decide?
You know what? Let’s put everything on the disk and simply use the physical memory to store it. The contents of the disk are called the Virtual Memory. If you have 4GB of virtual memory and 1GB of physical memory, you have 4GB of memory, since the physical memory is only a sort of buffer for quick and easy access. Remember that. :)
A process will never know where its memory is, since the abstraction supplied by the OS means that it only sees its Logical Address Space, no matter where it is at the moment. It should never need to stoop to that level. We’ll discuss how this is implemented later.
There are two ways of managing the backing store’s (disk) Swap File (that is – where the virtual memory is): It can either be pre-allocated and used directly (most systems) or allocated when needed and a translation table sits in memory. There is no real reason to go with the second way.
Needing to work fast and easily means we should probably divide our memory into same sized blocks. We’ll call those Pages and use their numbers as the first part in a logical address. The second part would be the offset in that page. The Memory Management Unit (MMU) is a (fast) part of the processor that takes care of translating the logical address’s page part into the page’s location in the physical memory, where it’s now called a Frame (it uses a Page Table, that is a table where the index is the logical page number, and the values are the frame’s location in the physical memory and some control bits). The important control bits are (in order of checking by the MMU):
- Valid Bit – 0 if the page wasn’t even allocated yet for the process (this is because the table includes the maximum amount of entries for a process, even though they may have not yet been allocated). When the MMU sees a 0 value in this bit, it throws a Segmentation Fault (Unix) or BSOD (Windows).
- In Memory Bit (aka Present/Absent Bit) – 1 if the page is actually in the physical memory. 0 means that none of the rest of the entry is valid and when the MMU sees it, it throws a Page Fault, telling the OS that the page isn’t in the physical memory and should be first brought from the virtual memory before being asked for.
- Dirty Bit – 1 if the page has been changed in the physical memory and must be committed to disk when paged out.
- Referenced Bit – Changed to 1 whenever the page is read from or written to. Used in the algorithms for selection of pages to be paged out.
The page table in the MMU can be really big, so we can simply split it into a Multilevel Page Table (mostly 2) and only have to keep the topmost level in the physical memory all the time. Associative memory (aka Translation Lookaside Buffer or TLB) helps by keeping recently used logical addresses so that we have less of a hit on performance (think about the amount of page faults bringing the xth level page to physical memory, where x > 1, since the first is always there).
We could also use, for big enough memories, an Inverted Page Table, which means that a table is kept only for the available space in the physical memory (instead of one for the maximum size of a logical address space). The problem is that we now don’t have a good lookup method to find logical address pages (it would take O(n)) and we can solve that by creating a hashtable where the key is the <process id, logical page number> pair (costing us only O(1)).
You can read more about these techniques on Wikipedia.
Page Replacement Algorithms come into play when we need
to copy a page into physical memory, but can’t find an open spot. The following algorithms return a frame number that will be thrown:
- Optimal – If you think it’s weird that we start with the optimal algorithm – don’t – because it simply does not exist. It’s only an index to measure an algorithm against. The optimal algorithm knows what pages are going to be requested next (the future Reference String) and outputs the number of the frame that contains the page that is the farthest away.
- FIFO – We had the best case, now we have the worst case. This algorithm simply outputs the frame number of the page that was paged in the earliest.
- Least Recently Used (LRU) – We can take the reference string of the past and see which page was least recently used. This is like FIFO, only that it doesn’t take a frame by the time its page entered the physical memory, but rather by the last reference to that page. This algorithm is good, but implementing it takes too much time and space.
- FIFO Second Chance – This is an elaboration on FIFO and LRU. This algorithm finds the page that was paged in the earliest: if its reference bit is off, its frame number is given, otherwise it is turned off, the frame number goes back to the head of the queue (as if it was paged out and back in) and the algorithm runs again. This is ‘fair’ towards pages that are referenced often, since the original FIFO would have thrown them away as well.
- Clock – The clock’s hand points to the ith frame. If the frame’s reference bit is set, it is turned off, the clock’s hand moves to frame (i + 1) mod n (where n is the total amount of frames) and the algorithm is run again. Otherwise, the algorithm outputs i. This is the base for WS-Clock, so that’s the only reason it’s important to us.
- Not Recently Used (NRU) – This algorithm divides pages into four states:
- Both the reference bit and the dirty bit are off.
- The reference bit if off, but the dirty bit is on.
- The reference bit if on, but the dirty bit is off.
- Both the reference bit and the dirty bit are on.
The frame of the page with the lowest number class is returned. Note that state 2 is created by the fact that every few seconds, the algorithm intentionally clears all of the reference flags for more precise results.
- Not Frequently Used (NFU) – This algorithm keeps a counter for every frame. At every clock tick it shifts them all by one bit to the right, adds their reference bit in the most significant bit and clears it. The lowest counter value is the counter for the frame that will be returned, since it was the one referenced the farthest backwards and also the least. This allows for simulated aging of references to pages.
Remember: before removing a page, we need to copy it back to the disk if the dirty bit is 1 and also set its in memory bit to 0 in the MMU’s page table.
Belady’s Anomaly states that increasing the number of frames in memory does not necessarily decrease the number of page faults that will occur. This does not, however, apply to Stack Algorithms (such as Optimal and LRU). So we need to find a middle ground, where not too many (more page faults) or too few (hurts multiprogramming) frames are allocated. This is done dynamically in the OS, which sets a maximum and minimum number of frames usable by a process in the physical memory. These bars are also dynamic themselves, adjusting according to the number of page faults, etc.
When a process has too few frames, it starts Thrashing – the high level scheduler is too busy paging in and out (lots of I/O = bad). This could either be caused by the above paragraph’s minimum bar being too low or the fact that there are just too many processes alive and working in the system right now.
To solve thrashing, we use the Working Set Model (WSM), which is based on the principle of locality and uses a window to look at only the frames used by the process in the past n instructions and tries to preserve these pages the most, since it is more likely the process will need them again soon. If the window is too long, it will include some frames not in the last locality that we usually don’t need and will limit our choices of frame freeing; too short and it won’t include the whole current locality and thrashing will ensue.
Since implementing pure WSM is costly, we use the Working Set Clock Page Replacement Algorithm (aka WS-Clock). Each process has a virtual time that measures the amount of ticks since it was created. For each function, its reference virtual time is kept. When the clock hand reaches a process with its reference bit on, it will turn it off and update its reference virtual time to the owning process’s virtual time (similar to FIFO-SC). If the reference bit is off, it will check if it has been referenced in the last τ (Greek letter tau) ticks by comparing τ and the difference between the owning process’s virtual time and the page’s reference virtual time. If it wasn’t, this frame is selected for freeing. Note that if τ is too small, it will cause thrashing and if it’s too large, the level of multiprogramming will decline because WS-Clock will keep getting called instead of running a process. τ is dynamically changed by trial and error.
Unix has a page daemon that uses a two-handed clock algorithm – both act as the hands of a WS-Clock with a fixed angle between them. The smaller the angle, the busier a page needs to be to survive.
Windows 2000 keeps a dynamic minimum and maximum of frames per process. Requests for more frames than the maximum are not given. Once in a while, the balance-set manager runs and clears frames until it reaches the minimum frames for the process.
Both systems have a threshold that once it is passed (meaning too few free frames are available), the algorithms kick in.
Segments are logical groups of pages (you might know them as Data Segment, Stack Segment, Code Segment, etc.), which are used to distinguish between different parts of a process’s memory. Segments are also the smallest unit of memory shareable between processes (since it is the smallest unit of logical significance).