Processes need to be managed, so that they will know when to run and when not to run, so here enter the Scheduling Algorithms. A scheduling algorithm needs to be fair and efficient, but how do you measure that?
- Throughput – This measures the number of processes completed per time unit, which is an important thing for batch systems, where you want as many jobs to finish as fast as possible.
- Turnaround Time – The average time it takes a process to complete from the moment it is introduced to the system.
- Waiting Time – The average amount of time the process is ready to run, but isn’t run (notice that this does not include blocked processes). This indicates how much CPU time we ‘waste’ (when of course, we do not waste it – it simply goes towards running other processes).
- Response Time – The time between submitting a command and the beginning of output. This is a measurement mostly used in interactive systems, where response to the user is very important.
- CPU Utilization – How much of the time was the CPU really utilized? A low ranking (that is – not close to 100%) in a batch system means something is wrong. On the other hand, in interactive systems, a computer might be idle and therefore would have no need for the CPU.
Some of these measurements conflict with one another, so they don’t all need to be at their best – it all depends on what kind of operating system we’re using and what we’re using it for.
Let’s review some scheduling algorithms:
- First Come, First Served (FCFS) – Better known as FIFO, this is a non-preemptive scheduling algorithm. Whenever a process enters the Ready state, it is assigned a timestamp. When a process exits the Running state (either ends or is blocked), the process with the earliest (smallest) timestamp is allowed to run. This is a very basic approach and is highly ineffective for interactive systems (waiting time is very high).
- Shortest Job First (SJF) – This enhancement of FCFS adds the estimated amount of future running time by the process to the mix and lets the processes that are shortest to run first. This algorithm can be implemented as either pre-emptive or not. The main point of the algorithm is that we’re willing to hurt fairness if it gets us better performance.
This produces optimal turnaround time, but can’t be implemented without statistics, since processes aren’t in the habit of declaring the amount of CPU time they’d need in advance.
In order to guess the amount of CPU time a process needs, we can use a simple equation: G(n+1) = a*T(n) + (1 – a)*G(n), where G is the guess for a CPU-burst instance n, T is the actual time spent in CPU in instance n and a is their balancer, which can easily be altered based on past experience.
- Round Robin (RR) – Time in the CPU is divided into quanta and the processes that want to run are given one quantum of time each in a circular pattern (process 1 runs 1 quantum, process 2 runs 1 quantum, … process n runs 1 quantum, process 1 runs 1 quantum, …). Note that if we take a quantum large enough, we get FCFS.
Since context switching takes up time, this might not be a great algorithm when the difference between the time to switch between contexts and the time in one quantum is high enough (of course in favor of the quanta). Changing the size of each quantum does not necessarily mean it will have a good impact on the turnaround time.
- Shortest Remaining Time First (SRTF) – This is a pre-emptive upgrade of SJF, in which a process stays in Running until such a time as it either finishes running (blocked or exits) or a process with less time to run enters Ready, in which case the running process is pre-emptively returned to Ready and the shorter process enters Running.
- Guaranteed Scheduling (GS) – Also known as Fair Share Scheduling, this algorithm is fair towards every single process in the system (as opposed to RR, which is fair only towards those in the Ready and Running states).
To do this, it holds two counters for each process:
- Time Ran – The amount of time quanta the process has stayed in the Running state.
- Time Deserved – This is a sum of the relative amount of time the process was entitled to run: If there are n processes in the system, for each quantum, each of their Time Deserved counters will be increased by 1/n.
If a new process enters, its Time Deserved is 0. A Blocked and/or Suspended process still deserves time (which is the difference from RR).
When the ratio of Time Ran/Time Deserved is greater than 1, this means that a process has taken up more than the amount of time it deserved and now another process must be placed in the Running state.
Imagine two processes (A and B) in the system, which both want to run. A will run first (why not), so that at the end of the first quantum, it would have deserved 1/2 and run 1, while B would deserve 1/2 and run 0. Since A has run more than it deserved (TR/TD ratio = 2 > 1), B will now run. At the end of the second quantum, both A and B have a TR/TD ratio of 1 and A can run again.
- Highest Response Ratio Next (HRRN) – In order to avoid starvation, we can use a ratio of the amount of time a process has spent waiting and the amount of time it needs to run. Since this is a non-preemptive algorithm, whenever a process exits Running, each process i has its ratio P(i) computed, where P(i) = (time waiting + expected run time) / expected run time and the process with the highest ratio wins and goes to Running.
- Feedback Scheduling (FS) – Each process is assigned the highest priority group. Each group has its own scheduling algorithm (Round Robin), when the process that would run at any given moment is the process decided upon by the scheduler of the highest priority group that has runnable processes. Once a process has run one time-quantum, it is demoted to the next priority level. This means that there is also fairness towards process that have yet to run, since they will enter the system with the highest priority.
The algorithm penalizes jobs that have been running longer (means that we keep the overall idea of SJF) without the need to know or guess the time a process might run in advance.
- Dynamic Priority Groups (CTSS) – FS presents us with the problem that longer running processes will finish very late and their turnaround time will be horrid. Since we can assume that a process waiting for more time in the lowest priority will likely wait again, we could decide that the lower the priority of the process, it will be given more time quanta. A good example is giving each process in a priority group n (where 0 is the highest priority group) the amount of 2n time-quanta per run. This would mean that instead of taking 100 context switches to finish a 100-time-quanta job, it would only take 7, since the process would run in the following time frames (in time-quanta): 1, 2, 4, 8, 16, 32, 37 (out of 64 given to it). This means that longer running processes are less penalized and that waiting time is smaller.
But how is scheduling really done in Unix? Each process is given a priority and entered into a priority queue. The queue with the smallest number runs first, just like in FS. Kernel-mode processes (daemons and such) get negative priorities (high) while User-mode processes get only positive priorities.
The Low-Level Scheduler maintains these queues (while the
High-Level Scheduler makes sure that Suspended processes get their time share too by moving them to and from the disk) and recalculates the processes’ priorities for the point in time n+1 by using Pn+1 = base priority + nice + Cn where Cn = cpu usagen + Cn-1/2. The fact that the more the process uses the cpu, the lower its priority, means that no process will ever feel starvation.
In Windows, there are 32 priority level, where the higher 16 are for system processes. Each process will consume its allocated time and then call the scheduler to call the next process, instead of the scheduler having to stop the process itself.
One problem we could have with scheduling is called Priority Inversion, in which a high-priority process waits for a lower-priority process to act, yet the lower-priority process doesn’t get to run yet. This means that a higher-priority process will not run and this is contrary to the principle of priorities. Windows solves this by giving the blocking process priority 16 (lowest system priority) for two time quanta at a time until the blocking process stops blocking the higher-priority process.