Operating systems
Page fault
Prashanth L.A.
2023-09-25
Lecture 23 (contd)
So far
Paging:
- translation too slow: use a TLB
- page tables too big: use multi-level page tables
What happens if pages, or even the page tables, don’t fit in memory?
- When does this occur?
  - one program with tons of data
  - many programs sharing memory
Require an additional level in the memory hierarchy
- The OS needs a place to stash away portions of the address space that aren’t currently in great demand
- Solution: use the hard disk drive
The crux
How can the OS make use of a larger, slower device to transparently provide the illusion of a large virtual address space?
Swap Space

- A page may be in memory or on disk
- The OS needs to track this, and may have to get the disk address of a given page (How?)
Page table entry

- Other bits (valid and protection): not needed for today’s discussion
- present bit P:
  - if \(P=1\), the page is in memory; else, it is on disk
  - if \(P=0\), the hardware ignores the PFN field, so the OS can reuse it. What to use it for? (Hint: this answers the “How?” above; see the sketch after this list)
- Page fault: accessing a page that is not in physical memory
- The OS has to bring the page back from swap space to service the page fault; this is done by the OS page-fault handler
- Why doesn’t hardware handle page faults?
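As a concrete (and purely illustrative) picture, here is one way such a PTE could be laid out in C. The field widths and names are assumptions for this sketch, not any particular architecture’s format; the point is that the same bits serve as the PFN when \(P=1\) and as a swap location when \(P=0\):

#include <stdint.h>

// Illustrative page table entry; field widths are made up for the sketch.
// Bitfield layout is implementation-defined in C, so treat this as a
// diagram in code form rather than a portable definition.
typedef union {
    struct {
        uint32_t present : 1;   // P = 1: page is in physical memory
        uint32_t valid   : 1;   // is this part of the address space at all?
        uint32_t protect : 3;   // read/write/execute permissions
        uint32_t use     : 1;   // set by HW on access (used later by clock)
        uint32_t dirty   : 1;   // set by HW on write
        uint32_t pfn     : 25;  // physical frame number
    } in_memory;
    struct {
        uint32_t present   : 1;  // P = 0: rest of the entry is OS-private,
        uint32_t swap_addr : 31; // so it can record where the page is on disk
    } on_disk;
} pte_t;

When \(P=0\), the hardware raises a page fault on access, and the OS page-fault handler reads swap_addr to find the page in swap space.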
What If Memory Is Full?
- The OS may first page out one or more pages to make room for the new pages it is about to bring in
- The process of picking a page to kick out, or replace, is known as the page-replacement policy
- Are all pages equally easy to replace?
New Control Flow
Try to fetch the instruction at virtual address VA:
- extract the VPN
- check the TLB for the VPN
- TLB hit: get PFN + offset \(\Rightarrow\) fetch PA
- TLB miss: calculate the PTE address \(PA_{PTE}\)
  - fetch \(PA_{PTE}\)
  - if present and accessible: get the PFN, insert it into the TLB, and retry
  - else: raise a PAGE FAULT exception
New Control Flow (again)
VPN = (VirtualAddress & VPN_MASK) >> SHIFT
(Success, TlbEntry) = TLB_Lookup(VPN)
if (Success == True)                    // TLB hit
    if (CanAccess(TlbEntry.ProtectBits) == True)
        Offset   = VirtualAddress & OFFSET_MASK
        PhysAddr = (TlbEntry.PFN << SHIFT) | Offset
        Register = AccessMemory(PhysAddr)
    else
        RaiseException(PROTECTION_FAULT)
else                                    // TLB miss
    PTEAddr = PTBR + (VPN * sizeof(PTE))
    PTE = AccessMemory(PTEAddr)
    if (PTE.Valid == False)
        RaiseException(SEGMENTATION_FAULT)
    else if (CanAccess(PTE.ProtectBits) == False)
        RaiseException(PROTECTION_FAULT)
    else if (PTE.Present == True)
        // assuming hardware-managed TLB
        TLB_Insert(VPN, PTE.PFN, PTE.ProtectBits)
        RetryInstruction()
    else if (PTE.Present == False)
        RaiseException(PAGE_FAULT)
Page Fault Control Flow – Software
PFN = FindFreePhysicalPage()
if (PFN == -1)                 // no free page found:
    PFN = EvictPage()          // run the replacement algorithm
DiskRead(PTE.DiskAddr, PFN)    // sleep (waiting for I/O)
PTE.present = True             // update the page table with the present
PTE.PFN = PFN                  // bit and the translation (PFN)
RetryInstruction()
- The OS must find a physical frame for the soon-to-be-faulted-in page to reside within
- If there is no free frame, the OS waits for the replacement algorithm to run and kick some pages out of memory
When to replace
Option 1: Wait until memory is entirely full
- There are many reasons for the OS to keep a small portion of memory free more proactively (What reasons?)
Option 2: Let daemons free things up
- Two watermarks: a high watermark HW and a low watermark LW
- If fewer than LW pages are available, a background thread, the page daemon, runs
- The page daemon evicts pages until HW pages are available (see the sketch below)
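A minimal sketch of the watermark scheme in C. All the kernel hooks here (free_page_count, evict_one_page, sleep_until_low_memory) and the watermark values are hypothetical names for this illustration, not a real kernel API:

// Hypothetical kernel hooks -- illustrative names only.
extern int  free_page_count(void);        // number of free physical pages
extern void evict_one_page(void);         // run the replacement policy once
extern void sleep_until_low_memory(void); // block until free pages < LW

enum { LW = 64, HW = 256 };  // low and high watermarks (assumed values)

// Background page daemon: wakes when free memory dips below LW and
// evicts pages in a batch until HW pages are free, so the page-fault
// path usually finds a free frame immediately.
void page_daemon(void) {
    for (;;) {
        sleep_until_low_memory();
        while (free_page_count() < HW)
            evict_one_page();
    }
}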
Lecture 24
Replacement policy
- Deciding which page to evict is the job of the replacement policy
- Similar to cache management
- Replacement policy goal: minimize the number of cache misses
- Metric: average memory access time (AMAT) \[AMAT = T_M + (P_{Miss} \times T_D)\] where \(T_M\) is the memory access time, \(T_D\) the disk access time, and \(P_{Miss}\) the probability of a miss
Example
- 4KB address space with 256-byte pages \(\Rightarrow\) 4-bit VPN and 8-bit offset
- Number of virtual pages: 16
- A memory reference sequence:
  0x000, 0x100, 0x200, 0x300, 0x400, 0x500, 0x600, 0x700, 0x800, 0x900
- Suppose page 3 isn’t in memory
- Hit rate: ? (9 of the 10 references hit, i.e., \(90\%\))
- \(T_M = 100\,ns\) and \(T_D = 10\,ms\)
- AMAT: \(100\,ns + 0.1 \times 10\,ms = 1.0001\,ms\)
- With a \(99.9\%\) hit rate, AMAT \(= 10.1\,\mu s\)
- The miss rate dominates AMAT (see the check below)
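The arithmetic above, spelled out as a standalone check (the constants are the \(T_M\), \(T_D\), and hit rates from this example):

#include <stdio.h>

int main(void) {
    double T_M = 100e-9;  // memory access time: 100 ns
    double T_D = 10e-3;   // disk access time: 10 ms

    // AMAT = T_M + P_miss * T_D
    double amat_90  = T_M + 0.1   * T_D;  // 90% hit rate
    double amat_999 = T_M + 0.001 * T_D;  // 99.9% hit rate

    printf("90%%   hits: AMAT = %.4f ms\n", amat_90  * 1e3);  // 1.0001 ms
    printf("99.9%% hits: AMAT = %.1f us\n", amat_999 * 1e6);  // 10.1 us
    return 0;
}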
Optimal Replacement Policy
- Replaces the page that will be accessed furthest in the future
- Incurs the fewest misses of any policy
- Useful for comparison; not practical, since it requires knowing the future (a sketch of its victim choice follows the example below)
Example with cache size 3
Access stream: \(0, 1, 2, 0, 1, 3, 0, 3, 1, 2, 1\)

| Access | Hit/Miss? | Evict | Resulting cache |
|--------|-----------|-------|-----------------|
| 0 | Miss | | 0 |
| 1 | Miss | | 0,1 |
| 2 | Miss | | 0,1,2 |
| 0 | Hit | | 0,1,2 |
| 1 | Hit | | 0,1,2 |
| 3 | Miss | 2 | 0,1,3 |
| 0 | Hit | | 0,1,3 |
| 3 | Hit | | 0,1,3 |
| 1 | Hit | | 0,1,3 |
| 2 | Miss | 3 | 0,1,2 |
| 1 | Hit | | 0,1,2 |

Hit rate \(=\frac{6}{6+5}\approx 0.545\)
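For reference, a sketch of the victim choice OPT makes on each miss, assuming the future access stream is known (which is exactly what makes the policy impractical); the function and parameter names are hypothetical:

// Return the index into cache[] of the page whose next use lies furthest
// in the future. accesses[pos..n-1] is the not-yet-seen part of the stream.
int opt_victim(const int cache[], int cache_size,
               const int accesses[], int n, int pos) {
    int victim = 0, victim_dist = -1;
    for (int i = 0; i < cache_size; i++) {
        int dist = n - pos;  // default: never used again (largest distance)
        for (int j = pos; j < n; j++)
            if (accesses[j] == cache[i]) { dist = j - pos; break; }
        if (dist > victim_dist) { victim = i; victim_dist = dist; }
    }
    return victim;
}

On the miss at the sixth access above (page 3), the remaining stream is \(0, 3, 1, 2, 1\); page 2’s next use is furthest away, so it is evicted, matching the table.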
Policy 1: FIFO
- Put pages on a queue, replace from the tail (the first-in page)
- Simple to implement, but cannot determine the importance of blocks
Example
| Access | Hit/Miss? | Evict | Resulting cache |
|--------|-----------|-------|-----------------|
| 0 | Miss | | 0 |
| 1 | Miss | | 0,1 |
| 2 | Miss | | 0,1,2 |
| 0 | Hit | | 0,1,2 |
| 1 | Hit | | 0,1,2 |
| 3 | Miss | 0 | 1,2,3 |
| 0 | Miss | 1 | 2,3,0 |
| 3 | Hit | | 2,3,0 |
| 1 | Miss | 2 | 3,0,1 |
| 2 | Miss | 3 | 0,1,2 |
| 1 | Hit | | 0,1,2 |

Hit rate \(=\frac{4}{4+7}\approx 0.364\)
Belady’s anomaly
How does the cache hit rate change when moving from a cache size of 3 to 4 pages?
With FIFO, the number of misses may actually increase
Try the memory stream \(1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5\) (or run the simulator sketch below)
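A small FIFO simulator to try it with (self-contained C; the stream is the one above). With 3 frames it produces 9 misses; with 4 frames, 10, which is Belady’s anomaly:

#include <stdio.h>

// Count FIFO misses for the given access stream and frame count.
static int fifo_misses(const int *stream, int n, int frames) {
    int cache[16], head = 0, used = 0, misses = 0;
    for (int i = 0; i < n; i++) {
        int hit = 0;
        for (int j = 0; j < used; j++)
            if (cache[j] == stream[i]) { hit = 1; break; }
        if (hit) continue;
        misses++;
        if (used < frames)
            cache[used++] = stream[i];   // fill a free frame
        else {
            cache[head] = stream[i];     // evict the oldest page
            head = (head + 1) % frames;
        }
    }
    return misses;
}

int main(void) {
    int s[] = {1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5};
    int n = sizeof s / sizeof s[0];
    printf("3 frames: %d misses\n", fifo_misses(s, n, 3));  // prints 9
    printf("4 frames: %d misses\n", fifo_misses(s, n, 4));  // prints 10
    return 0;
}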
Policy 2: Random
- Picks a random page to replace under memory pressure
Sample run
| Access | Hit/Miss? | Evict | Resulting cache |
|--------|-----------|-------|-----------------|
| 0 | Miss | | 0 |
| 1 | Miss | | 0,1 |
| 2 | Miss | | 0,1,2 |
| 0 | Hit | | 0,1,2 |
| 1 | Hit | | 0,1,2 |
| 3 | Miss | 0 | 1,2,3 |
| 0 | Miss | 1 | 2,3,0 |
| 3 | Hit | | 2,3,0 |
| 1 | Miss | 3 | 2,0,1 |
| 2 | Hit | | 2,0,1 |
| 1 | Hit | | 2,0,1 |

Hit rate \(=\frac{5}{5+6}\approx 0.454\)
Using History
| Criterion | Intuition | Policy |
|-----------|-----------|--------|
| recency | If a page has been accessed recently, it should not be replaced | LRU |
| frequency | If a page has been accessed many times, it should not be replaced | LFU |
LRU example
| Access | Hit/Miss? | Evict | Resulting cache (LRU \(\to\) MRU) |
|--------|-----------|-------|-----------------------------------|
| 0 | Miss | | 0 |
| 1 | Miss | | 0,1 |
| 2 | Miss | | 0,1,2 |
| 0 | Hit | | 1,2,0 |
| 1 | Hit | | 2,0,1 |
| 3 | Miss | 2 | 0,1,3 |
| 0 | Hit | | 1,3,0 |
| 3 | Hit | | 1,0,3 |
| 1 | Hit | | 0,3,1 |
| 2 | Miss | 0 | 3,1,2 |
| 1 | Hit | | 3,2,1 |

Hit rate \(=\frac{6}{6+5}\approx 0.545\)
Lecture 25
Workload 1: No-Locality
- Each reference is to a random page within a set of \(100\) pages
- Overall, 10,000 accesses are made
(Figure: no-locality workload)

Remarks
- It doesn’t matter much which realistic policy you are using
  - when there is no locality, or
  - when the cache is large enough to fit the entire workload
Workload 2: 80-20
- 80% of the references are to 20% of the pages (the hot pages)
- Overall, 10,000 accesses are made
(Figure: 80-20 workload)

Remarks
- LRU does better, since it tends to hold on to the hot pages
- but optimal is optimal
Workload 3: Looping sequential
- References loop over 50 pages: from page 0 up to page 49, then back to 0, repeatedly
- Total: 10,000 accesses
(Figure: looping-sequential workload)

Remarks
- LRU and FIFO do worse: they evict exactly the pages that are about to be accessed
- Random fares better
Implementing LRU
- To keep track of which pages have been least-recently used, the system has to do some accounting work on every memory reference (see the sketch after this list)
- Requires some HW support, in the form of a use bit
- Upon every page access, the use bit is set to 1 by HW
- After running for a while, the OS can differentiate which pages are in use
- Is this enough?
  - No: the OS must clear the use bits periodically
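To see where that cost comes from, here is a sketch of what perfect LRU bookkeeping would require, modeled in software with a global access counter (purely illustrative; real hardware would have to do the equivalent on every single load and store):

#define NFRAMES 256

static unsigned long last_used[NFRAMES]; // timestamp of most recent access
static unsigned long now;                // global access counter

// Perfect LRU must run this on EVERY memory reference.
void on_access(int frame) {
    last_used[frame] = ++now;
}

// Eviction scans for the smallest timestamp (the least-recently used).
int lru_victim(void) {
    int victim = 0;
    for (int f = 1; f < NFRAMES; f++)
        if (last_used[f] < last_used[victim])
            victim = f;
    return victim;
}

The use bit replaces on_access with a single hardware-set bit, trading exact recency for speed; the clock algorithm below is built on that approximation.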
Clock Algorithm
- All pages in a circular list
- A clock hand points to some page
if (use == 1)
    clear the bit (use = 0)    // give the page a second chance
    advance the clock hand
else
    replace this page          // it has not been used recently
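The same loop as a compact C sketch (array-based circular list; NFRAMES, use_bit, and hand are illustrative names):

#define NFRAMES 256

static int use_bit[NFRAMES];  // set to 1 by hardware on each access
static int hand;              // the clock hand

// Sweep the hand forward, clearing use bits (second chances) until a
// page with use bit 0 is found; that page is the victim.
int clock_victim(void) {
    while (use_bit[hand] == 1) {
        use_bit[hand] = 0;
        hand = (hand + 1) % NFRAMES;
    }
    int victim = hand;
    hand = (hand + 1) % NFRAMES;  // resume the next search past the victim
    return victim;
}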
Simulation results
(Figure: simulation results)

Remark
The clock algorithm doesn’t do as well as perfect LRU, but it does better than algorithms that don’t consider history at all
Clock algorithm variant with a dirty bit
- If a page has been modified and is thus dirty, it must be written back to disk before it can be replaced
- If not, replacement is free
- HW includes a dirty bit
- The clock algorithm could scan
  - first for pages that are both unused and clean to evict;
  - failing to find those, then for unused pages that are dirty; etc. (see the sketch below)
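One way to express that preference, as a hedged sketch extending the clock code above (dirty_bit is an assumed hardware-maintained array, like use_bit):

static int dirty_bit[NFRAMES]; // set to 1 by hardware on each write

// Pass 0: look for an unused AND clean page (free to evict).
// Pass 1: settle for an unused but dirty page (needs write-back first).
// If every page was recently used, clear all use bits and sweep again.
int clock_victim_with_dirty(void) {
    for (;;) {
        for (int pass = 0; pass < 2; pass++)
            for (int i = 0; i < NFRAMES; i++) {
                int f = (hand + i) % NFRAMES;
                if (use_bit[f] == 0 && (pass == 1 || dirty_bit[f] == 0)) {
                    hand = (f + 1) % NFRAMES;
                    return f;
                }
            }
        for (int f = 0; f < NFRAMES; f++)
            use_bit[f] = 0;
    }
}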
Page Selection Policy
The OS has to decide when to bring a page into memory
Demand paging
- The OS brings the page into memory when it is accessed: on demand
Prefetching
- The OS could guess that a page is about to be used, and thus bring it in ahead of time
- For example, if code page P is brought into memory, code page P+1 will likely be accessed soon and thus should be brought into memory too
Thrashing
Problem
- Memory is overcommitted: the demands of the current processes exceed the available memory
Solution
- Admission control: reduce the set of running processes (possibly by killing some)
- Buy more memory