UNIT - I
1.1 INTRODUCTION
An operating system acts as an
intermediary between the user of a computer and computer hardware. The purpose
of an
operating system is to provide an environment in
which a user can execute
programs in a convenient
and efficient manner.
An operating system is software that manages the computer
hardware. The hardware must provide appropriate mechanisms to ensure the
correct operation of the computer system and to prevent user programs from
interfering with the proper operation of the system.
1.2 OPERATING SYSTEM
1.2.1 Definition of Operating System:
An Operating system is a program that
controls the execution of application programs
and acts as an interface
between the user
of a computer and the computer hardware.
A more common definition is that the
operating system is the one program running at all times on the computer
(usually called the kernel), with all else being applications programs.
An
Operating system is
concerned with the allocation of resources
and services, such as memory, processors, devices
and information. The
Operating System correspondingly includes programs to manage these resources,
such as a traffic controller, a scheduler, a memory management module, I/O
programs, and a file system.
1.3 HISTORY OF OPERATING SYSTEM
Operating systems have been evolving through the years. The following
sections outline the history of operating systems, generation by generation.
The 1940's - First Generation
The earliest electronic digital
computers had no operating systems. Machines of the time were so primitive that
programs were often entered one bit at a time on rows of mechanical switches
(plug boards). Programming languages were unknown (not even assembly
languages). Operating systems were unheard of.
The 1950's - Second Generation
By the early 1950s, the routine had
improved somewhat with the introduction of punch cards. The General Motors
Research Laboratories implemented the first operating system in the early
1950s for their IBM 701. The systems of the 1950s generally ran one job at a time.
These were called single-stream batch processing systems because programs and
data were submitted in groups or batches.
The 1960's - Third Generation
The systems of the 1960's were also
batch processing systems, but they were able to take better advantage of the
computer's resources by running several jobs at once. So operating systems
designers developed the concept of multiprogramming in which several jobs are
in main memory at once; a processor is switched from job to job as needed to
keep several jobs advancing while keeping the peripheral devices in use.
For example, on a system with no
multiprogramming, when the current job paused to wait for an I/O operation
to complete, the CPU simply sat idle until the I/O finished. The solution
that evolved for this problem was to partition memory into several pieces, with a
different job in each partition. While one job was waiting for I/O to complete,
another job could be using the CPU.
Another major feature of third-generation operating systems was the technique
called spooling (simultaneous
peripheral operations on line). In spooling, a high-speed device like a disk is
interposed between a running program and a low-speed device involved with the
program in input/output. Instead of writing directly to a printer, for example,
outputs are written to the disk. Programs can run to completion faster, and
other programs can be initiated sooner; when the printer becomes available, the
outputs may be printed.
Note that the spooling technique is much like thread being spun onto a spool
so that it may later be unwound as needed.
Another feature present in this
generation was the time-sharing technique, a variant of the multiprogramming technique,
in which each user has an on-line (i.e., directly connected) terminal. Because
the user is present and interacting with the computer, the computer system must
respond quickly to user requests, otherwise user productivity could suffer.
Timesharing systems were developed to multiprogram large numbers of
simultaneous interactive users.
Fourth Generation
With the development of LSI (Large
Scale Integration) circuits and chips, operating systems entered the personal
computer and workstation age. Microprocessor technology evolved to the point
that it became possible to build desktop computers as powerful as the
mainframes of the 1970s. Two operating systems have dominated the personal
computer scene: MS-DOS, written by Microsoft, Inc. for the IBM PC and other
machines using the Intel 8088 CPU and its successors, and UNIX, which is
dominant on the large personal computers using the Motorola 68000 CPU family.
1.4 THE OPERATING SYSTEM ZOO
All of this history and development
has left us with a wide variety of operating systems, not all of which are
widely known. In this section we will briefly touch upon seven of them. We will
come back to some of these different kinds of systems later in the book.
1.4.1 Mainframe Operating Systems
At the high end are the operating
systems for the mainframes, those room-sized computers still found in major
corporate data centers. These computers distinguish themselves from personal
computers in terms of their I/O capacity. A mainframe with 1000 disks and
thousands of gigabytes of data is not unusual: a personal computer with these
specifications would be odd indeed. Mainframes are also making something of a comeback
as high-end Web servers, servers for large-scale electronic commerce sites, and
servers for business-to-business transactions.
The operating systems for
mainframes are heavily oriented toward processing many jobs at once, most of
which need prodigious amounts of I/O. They typically offer three kinds of
services: batch, transaction processing, and timesharing. A batch system is one
that processes routine jobs without any interactive user present. Claims
processing in an insurance company or sales reporting for a chain of stores are
typically done in batch mode. Transaction processing systems handle large
numbers of small requests; for example, check processing at a bank or airline
reservations. Each unit of work is small, but the system must handle hundreds
or thousands per second. Timesharing systems allow multiple remote users to run
jobs on the computer at once, such as querying a big database. These functions
are closely related: mainframe operating systems often perform all of them. An
example mainframe operating system is OS/390, a descendant of OS/360.
1.4.2 Server Operating Systems
One level down are the server
operating systems. They run on servers, which are very large personal
computers, workstations, or even mainframes. They serve multiple users at once
over a network and allow the users to share hardware and software resources.
Servers can provide print service, file service, or Web service. Internet
providers run many server machines to support their customers and Web sites use
servers to store the Web pages and handle the incoming requests. Typical server
operating systems are UNIX and Windows 2000. Linux is also gaining ground for servers.
1.4.3 Multiprocessor Operating Systems
An increasingly common way to get
major-league computing power is to connect multiple CPUs into a single system.
Depending on precisely how they are connected and what is shared, these systems
are called parallel computers, multicomputers, or multiprocessors. They need
special operating systems, but often these are variations on the server
operating systems, with special features for communication and connectivity.
1.4.4 Personal Computer Operating Systems
The next category is the personal
computer operating system. Their job is to provide a good interface to a single
user. They are widely used for word processing, spreadsheets, and Internet
access. Common examples are Windows 98, Windows 2000, the Macintosh operating
system, and Linux. Personal computer operating systems are so widely known that
probably little introduction is needed. In fact, many people are not even aware
that other kinds exist.
1.4.5 Real-Time Operating Systems
Another type of operating system is
the real-time system. These systems are characterized by having time as a
key parameter. For example, in industrial process
control systems, real-time
computers have to collect data about the production process and use it to
control machines in the factory. Often there are hard deadlines that must be
met. For example, if a car is moving down an assembly line, certain actions
must take place at certain instants of time; if a welding robot welds too early
or too late, the car will be ruined. If the action absolutely must occur
at a certain moment (or within a certain range), we have a hard real-time
system.
Another kind of real-time system is
a soft real-time system, in which missing an occasional deadline is
acceptable. Digital audio or multimedia systems fall in this
category. VxWorks and QNX are
well-known real-time operating systems.
1.4.6 Embedded Operating Systems
Continuing on down to smaller and
smaller systems, we come to palmtop computers and embedded systems. A palmtop
computer or PDA (Personal Digital Assistant) is a small computer that
fits in a shirt pocket and performs a small number of functions such as an
electronic address book and memo pad. Embedded systems run on the computers
that control devices that are not generally thought of as computers, such as TV
sets, microwave ovens, and mobile telephones. These often have some
characteristics of real-time systems but also have size, memory, and power
restrictions that make them special.
Examples of such operating systems
are PalmOS and Windows CE (Consumer Electronics).
1.4.7 Smart Card Operating Systems
The smallest operating systems run on smart
cards, which are credit card-sized devices containing a CPU chip. They have
very severe processing power and memory constraints. Some of them can handle
only a single function, such as electronic payments, but others can handle
multiple functions on the same smart card. Often these are proprietary systems.
Some smart cards are Java oriented. What this means is that the ROM on the
smart card holds an interpreter for the Java Virtual Machine (JVM). Java
applets (small programs) are downloaded to the card and are interpreted by the
JVM interpreter. Some of these cards can handle multiple Java applets at the
same time, leading to multiprogramming and the need to schedule them. Resource
management and protection also become an issue when two or more applets are
present at the same time. These issues must be handled by the (usually
extremely primitive) operating system present on the card.
1.5 OPERATING SYSTEM CONCEPTS
Modern operating systems share the goal of supporting a common set of system
components. The system components are:
1. Process Management
2. Main Memory Management
3. File Management
4. Secondary Storage Management
5. I/O System Management
6. Networking
7. Protection System
8. Command Interpreter System
1.6 OPERATION ON PROCESSES
Several operations are possible on processes. Processes must be created and
deleted dynamically, and the operating system must provide the environment for
these operations. We discuss the two main operations on processes:
1. Create a process
2. Terminate a process
1.6.1 Create Process
The operating system creates a new process with the specified or default
attributes and identifier. A process may create several new subprocesses. The
syntax for creating a new process is:
CREATE (process_id, attributes)
Two names are used in this operation: parent process and child process. The
parent process is the creating process; the child process is the process it
creates. A child process may in turn create its own subprocesses, so the
processes form a tree. When the operating system executes a CREATE system
call, it obtains a new process control block from the pool of free memory,
fills the fields with provided and default parameters, and inserts the PCB
into the ready list. This makes the specified process eligible to run.
When a process is created, it requires some parameters: priority, level of
privilege, memory requirements, access rights, memory protection information,
and so on. A process will need certain resources, such as CPU time, memory,
files, and I/O devices, to complete its operation. When a process creates a
subprocess, the subprocess may obtain its resources directly from the
operating system; otherwise it uses the resources of its parent process.
When a process creates a new process, two possibilities exist in terms of
execution:
1. The parent continues to execute concurrently with its children.
2. The parent waits until some or all of its children have terminated.
For the address space, two possibilities occur, as the sketch below illustrates:
1. The child process is a duplicate of the parent process.
2. The child process has a program loaded into it.
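On UNIX systems the generic CREATE call corresponds to fork, and the two
address-space possibilities map onto fork alone (duplicate) and fork followed
by exec (load a new program). A minimal sketch, assuming a POSIX system (the
use of ls as the loaded program is only an illustration):

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void)
{
    pid_t pid = fork();                      /* create a child process */

    if (pid < 0) {                           /* fork failed */
        perror("fork");
        exit(1);
    } else if (pid == 0) {                   /* child: a duplicate of the parent */
        execlp("ls", "ls", "-l", (char *)NULL);  /* load a new program */
        perror("execlp");                    /* reached only if exec fails */
        exit(1);
    } else {                                 /* parent */
        wait(NULL);                          /* wait until the child terminates */
        printf("child %d terminated\n", (int)pid);
    }
    return 0;
}

Here the parent chooses the second execution possibility (waiting for its
child); omitting the wait call would give the first, concurrent, possibility.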
1.6.2 Terminate a Process
The DELETE system call is used for terminating a process. A process may
delete itself or be deleted by another process; a process can cause the
termination of another process via an appropriate system call. The operating
system reacts by reclaiming all resources allocated to the specified process
and closing any files opened by or for the process. The PCB is also removed
from its place of residence in the list and is returned to the free pool. The
DELETE service is normally invoked as a part of orderly program termination.
Following are the reasons for which a parent process may terminate a child
process (a sketch follows the list):
1. The task given to the child is no longer required.
2. The child has exceeded its usage of some of the resources that it has been
allocated.
3. The operating system does not allow a child to continue if its parent
terminates.
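On POSIX systems these ideas appear as the exit, kill, and wait calls; a
minimal sketch of a parent terminating a child whose task is no longer
required (the one-second sleep simply stands in for useful work):

#include <signal.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    pid_t pid = fork();
    if (pid == 0) {                  /* child: loop until terminated */
        for (;;)
            pause();                 /* sleep until a signal arrives */
    }
    sleep(1);                        /* parent: the child's task is done */
    kill(pid, SIGTERM);              /* terminate the child */
    waitpid(pid, NULL, 0);           /* let the OS reclaim the child's PCB */
    return 0;
}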
1.6.3 Co-operating Processes
A co-operating process is a process that can affect or be affected by the
other processes executing in the system. Any process that shares data with
other processes is a co-operating process. The benefits of co-operating
processes are:
1. Sharing of information
2. Increased computation speed
3. Modularity
4. Convenience
Co-operating processes share information such as files and memory, so the
system must provide an environment that allows concurrent access to these
types of resources. Computation speed increases if the computer has multiple
processing elements connected together. The system is constructed in a modular
fashion, with its functions divided into a number of modules.
1.7 PROCESS MANAGEMENT / PROCESS SCHEDULING
1.7.1 Concept of Process
A process is a sequential program in execution. A process defines the
fundamental unit of computation for the computer. The components of a process
are:
1. Object program
2. Data
3. Resources
4. Status of the process execution
The object program is the code to be executed. The data are used in executing
the program. While executing, the program may require some resources. The last
component tracks the status of the process execution. A process can run to completion
only when all requested resources have been allocated to the process. Two or
more processes could be executing the same program, each using their own data
and resources.
1.7.2 Processes and Programs
A process is a dynamic entity: a program in execution. A process is a
sequence of instruction executions. A process exists for a limited span of
time.
A program is a static entity made up of program statements. A program
contains the instructions. A program exists at a single place in space and
continues to exist over time. A program does not perform any action by
itself.
A multiprogramming operating system allows more than one process to be loaded
into executable memory at a time and allows the loaded processes to share the
CPU using time multiplexing.
The scheduling mechanism is the part of the process manager that handles the
removal of the running process from the CPU and the selection of another
process on the basis of a particular strategy.
1.7.3 Scheduling Queues
When processes enter the system, they are put into a job queue. This queue
consists of all processes in the system. The operating system also has other
queues. A device queue is the list of processes waiting for a particular I/O
device; each device has its own device queue. In queueing diagrams, each queue
is represented by a rectangular box, and the circles represent the resources
that serve the queues.
Queues are of two types: the ready queue and the set of device queues. A
newly arrived process is put in the ready queue, where processes wait for the
CPU to be allocated to them. Once the CPU is assigned to a process, the
process executes. While it executes, one of several events could occur:
1. The process could issue an I/O request and then be placed in an I/O queue.
2. The process could create a new subprocess and wait for its termination.
3. The process could be removed forcibly from the CPU as a result of an
interrupt and be put back in the ready queue.
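A minimal sketch of how the ready queue and device queues might be chained
together as linked lists of PCBs (the field and function names are
illustrative, not from any particular kernel):

#include <stddef.h>

struct pcb {
    int pid;
    struct pcb *next;                /* link to the next PCB in the queue */
};

struct queue {                       /* ready queue or a device queue */
    struct pcb *head, *tail;
};

/* append a PCB to the tail of a queue */
void enqueue(struct queue *q, struct pcb *p)
{
    p->next = NULL;
    if (q->tail)
        q->tail->next = p;
    else
        q->head = p;
    q->tail = p;
}

/* remove and return the PCB at the head, or NULL if the queue is empty */
struct pcb *dequeue(struct queue *q)
{
    struct pcb *p = q->head;
    if (p) {
        q->head = p->next;
        if (q->head == NULL)
            q->tail = NULL;
    }
    return p;
}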
1.7.4 Two State Process Model
A process may be in one of two states:
a) Running
b) Not Running
When the OS creates a new process, it enters that process into the system in
the Not Running state.
Processes that are not running are kept in a queue, waiting their turn to
execute. Each entry in the queue is a pointer to a particular process. The
queue is implemented using a linked list. The dispatcher behaves as follows:
when the running process is interrupted, it is transferred to the waiting
queue; if the process has completed or aborted, it is discarded. In either
case, the dispatcher then selects a process from the queue to execute.
1.8 PROCESS STATE
As a process executes, it changes state. The process state is defined as the
current activity of the process. Fig. 3.1 shows the general form of the
process state transition diagram. There are five states, and each process is
in exactly one of them. The states are listed below:
1. New
2. Ready
3. Running
4. Waiting
5. Terminated (exit)
1. New: A process that has just been created.
2. Ready: Ready processes are waiting to
have the processor allocated to them by the operating system so that they can
run.
3. Running: The process that is currently being executed. A running process
possesses all the resources needed for its execution, including the
processor.
4. Waiting: A process that cannot
execute until some event occurs such as the completion of an I/O operation. The
running process may become suspended by invoking an I/O module.
5. Terminated: A process that has been
released from the pool of executable processes by the operating system.
Whenever a process changes state, the operating system reacts by placing the
process's PCB in the list that corresponds to its new state. Only one process
can be running on any processor at any instant, while many processes may be in
the ready and waiting states.
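The five states and their legal transitions can be captured directly in code;
a sketch (the names are illustrative):

enum proc_state { NEW, READY, RUNNING, WAITING, TERMINATED };

/* returns 1 if the transition is legal in the five-state model */
int legal_transition(enum proc_state from, enum proc_state to)
{
    switch (from) {
    case NEW:     return to == READY;      /* admitted by the OS */
    case READY:   return to == RUNNING;    /* dispatched */
    case RUNNING: return to == READY       /* interrupted */
                      || to == WAITING     /* I/O or event wait */
                      || to == TERMINATED; /* exit */
    case WAITING: return to == READY;      /* I/O or event completed */
    default:      return 0;                /* no way out of TERMINATED */
    }
}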
1.8.1 Suspended Processes
Characteristics of a suspended process:
1. A suspended process is not immediately available for execution.
2. The process may or may not be waiting on an event.
3. The process was placed in the suspended state by an agent (itself, its
parent process, or the OS) for the purpose of preventing its execution.
4. The process may not be removed from the suspended state until the agent
orders the removal.
Swapping is used to move all of a process from main memory to disk: the OS
suspends the process by putting it in the suspended state and transferring it
to disk. The reasons for process suspension are:
1. Swapping
2. Timing
3. Interactive user request
4. Parent process request
Swapping: The OS needs to release sufficient main memory to bring in a
process that is ready to execute.
Timing: A process may be suspended while waiting for the next time interval.
Interactive user request: A process may be suspended by the user, for example
for debugging purposes.
Parent process request: A parent may suspend a descendant to modify the
suspended process or to coordinate the activity of various descendants.
1.8.2 Process Control Block (PCB)
Each process is represented by a process control block (PCB), a data
structure used by the operating system. The operating system groups into the
PCB all the information that it needs about a particular process. The fields
of a PCB are described below.
1. Pointer: The pointer points to another process control block; it is used
for maintaining the scheduling list.
2. Process State: Process state may be
new, ready, running, waiting and so on.
3. Program Counter: It indicates the address of the next
instruction to be executed for this process.
4. Event information: For a process in the blocked state, this field contains
information concerning the event for which the process is waiting.
5. CPU registers: These include the general-purpose registers, stack
pointers, index registers, accumulators, etc. The number and type of registers
depend entirely on the computer architecture.
6. Memory Management Information: This information may include the values of
the base and limit registers. It is useful for deallocating memory when the
process terminates.
7. Accounting Information: This information includes the amount of CPU and
real time used, time limits, job or process numbers, account numbers, etc.
The process control block also includes information about CPU scheduling, I/O
resource management, file management, priority, and so on.
The PCB simply serves as the repository for any information that
may vary from process to process.
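The fields above translate naturally into a structure; a simplified sketch of
a PCB (real kernels store many more fields, and the names here are
illustrative):

struct pcb {
    struct pcb   *next;             /* pointer for the scheduling list */
    int           state;            /* new, ready, running, waiting, ... */
    unsigned long pc;               /* program counter */
    unsigned long regs[16];         /* saved CPU registers */
    int           event_id;         /* event a blocked process waits for */
    unsigned long base, limit;      /* memory-management registers */
    long          cpu_time_used;    /* accounting information */
    int           pid;              /* process number */
};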
When a process is created, its hardware registers and flags are set to the
values provided by the loader or linker. Whenever the process is suspended,
the contents of the processor registers are usually saved on the stack, and
the pointer to the related stack frame is stored in the PCB. In this way, the
hardware state can be restored when the process is scheduled to run again.
1.9 INTRODUCTION OF THREAD
A thread is a flow of execution through the process code, with its own
program counter, system registers, and stack. Threads are a popular way to
improve application performance through parallelism. A thread is sometimes
called a lightweight process.
Threads represent a software approach to improving operating system
performance by reducing overhead; in other respects a thread behaves like a
classical process. Each thread belongs to exactly one process and no thread
can exist outside a process. Each thread represents a separate flow of
control.
Threads have been successfully used in
implementing network servers. They also provide a suitable foundation for
parallel execution of applications on shared memory multiprocessors.
1.9.1 Types of Thread
Threads are implemented in two ways:
1. User level
2. Kernel level
1.9.2 User Level Thread
With user-level threads, all of the work of thread management is done by the
application; the kernel is not aware of the existence of threads.
The thread library contains code for creating and destroying threads,
for passing messages and data between threads, for scheduling thread execution,
and for saving and restoring thread contexts. The application begins with a
single thread and begins running in that thread.
Advantages of user-level threads over kernel-level threads:
1. Thread switching does not require kernel-mode privileges.
2. User-level threads can run on any operating system.
3. Scheduling can be application specific.
4. User-level threads are fast to create and manage.
Disadvantages of user-level threads:
1. In a typical operating system, most system calls are blocking.
2. A multithreaded application cannot take advantage of multiprocessing.
1.9.3 Kernel Level Threads
With kernel-level threads, thread management is done by the kernel; there is
no thread-management code in the application area. Kernel threads are
supported directly by the operating system, and any application can be
programmed to be multithreaded. All of the threads within an application are
supported within a single process. The kernel maintains context information
for the process as a whole and for the individual threads within the process.
Scheduling by the kernel is done on a thread basis. The kernel performs
thread creation, scheduling, and management in kernel space. Kernel threads
are generally slower to create and manage than user threads.
Advantages of kernel-level threads:
1. The kernel can simultaneously schedule multiple threads from the same
process on multiple processors.
2. If one thread in a process is blocked, the kernel can schedule another
thread of the same process.
3. Kernel routines themselves can be multithreaded.
Disadvantages:
1. Kernel threads are generally slower to create and manage than user
threads.
2. Transfer of control from one thread to another within the same process
requires a mode switch to the kernel.
1.9.4 Advantages of Thread
1. Threads minimize context-switching time.
2. Use of threads provides concurrency within a process.
3. Efficient communication.
4. Economy: it is more economical to create and context-switch threads than
processes.
5. Utilization of multiprocessor architectures: the benefits of
multithreading can be greatly increased in a multiprocessor architecture.
1.9.5 Multithreading Models
Some operating systems provide a combined user-level and kernel-level thread
facility; Solaris is a good example of this combined approach. In a combined
system, multiple threads within the same application can run in parallel on
multiple processors, and a blocking system call need not block the entire
process.
There are three multithreading models:
1. Many to many relationship
2. Many to one relationship
3. One to one relationship
1.9.6 Many to Many Model
In this model, many user-level threads are multiplexed onto a smaller or
equal number of kernel threads. The number of kernel threads may be specific
to either a particular application or a particular machine. Fig. 4.3 shows the
many to many model. In this model, developers can create as many user threads
as necessary, and the corresponding kernel threads can run in parallel on a
multiprocessor.
1.9.7 Many to One Model
The many to one model maps many user-level threads to one kernel-level
thread. Thread management is done in user space. When a thread makes a
blocking system call, the entire process blocks. Only one thread can access
the kernel at a time, so multiple threads are unable to run in parallel on
multiprocessors. Fig. 4.4 shows the many to one model. User-level thread
libraries implemented on operating systems that do not support kernel threads
use the many to one relationship model.
1.9.8 One to One Model
In the one to one model there is a one to one relationship between user-level
threads and kernel-level threads. Fig. 4.5 shows the one to one relationship
model. This model provides more concurrency than the many to one model: it
allows another thread to run when a thread makes a blocking system call, and
it allows multiple threads to execute in parallel on multiprocessors. The
disadvantage of this model is that creating a user thread requires creating
the corresponding kernel thread. OS/2, Windows NT, and Windows 2000 use the
one to one relationship model.
1.9.9 Threading Issues
The fork and exec system calls are discussed here. In a multithreaded
program environment, the semantics of the fork and exec system calls change.
UNIX systems have two versions of fork: one that duplicates all threads and
another that duplicates only the thread that invoked the fork system call.
Which version to use depends entirely upon the application.
Duplicating all threads is unnecessary if exec is called immediately after
the fork system call.
Thread cancellation is the task of terminating a thread before it has
completed. For example, in a multithreaded environment several threads may
concurrently search through a database; if one thread returns the result, the
remaining threads might be cancelled.
Thread cancellation is of two types:
1. Asynchronous cancellation
2. Deferred cancellation
In asynchronous cancellation, one thread immediately terminates the target
thread. In deferred cancellation, the target thread periodically checks
whether it should terminate, which allows it to terminate itself in an orderly
fashion. The difficulty arises when resources have been allocated to a
cancelled thread, or when a thread is cancelled in the middle of updating data
it shares with other threads; with asynchronous cancellation, system-wide
resources may not be freed when a thread is cancelled. Nevertheless, most
operating systems allow a process or thread to be cancelled asynchronously.
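POSIX threads expose both modes; a minimal sketch of the database-search
example using pthread_cancel, assuming deferred cancellation (the search loop
is only indicated by a comment):

#include <pthread.h>

void *searcher(void *arg)
{
    /* deferred cancellation is the POSIX default */
    pthread_setcanceltype(PTHREAD_CANCEL_DEFERRED, NULL);
    for (;;) {
        /* ... examine the next database record ... */
        pthread_testcancel();        /* safe point: terminate here if asked */
    }
    return NULL;
}

int main(void)
{
    pthread_t tid;
    pthread_create(&tid, NULL, searcher, NULL);
    /* another thread has found the result, so cancel this searcher */
    pthread_cancel(tid);
    pthread_join(tid, NULL);         /* wait for the cancellation to finish */
    return 0;
}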
1.10 THREADS
In
traditional operating systems, each process has an address space and a single
thread of control.
In
fact, that is almost the definition of a process. Nevertheless, there are
frequently situations in which it is desirable to have multiple threads of
control in the same address space running in quasi-parallel, as though they
were separate processes (except for the shared address space). In the following
sections we will discuss these situations and their implications.
1.10.1
The Thread Model
The process model as we have discussed it thus far is based on two
independent concepts: resource grouping and execution. Sometimes it is useful
to separate them; this is where threads come in.
One way of looking at a process is that it is a way to group related
resources together. A process has an address space containing program text and
data, as well as other resources. These resources may include open files,
child processes, pending alarms, signal handlers, accounting information, and
more. By putting them together in the form of a process, they can be managed
more easily.
The
other concept a process has is a thread of execution, usually shortened to just
thread. The thread has a program counter that keeps track of which
instruction to execute next. It has registers, which hold its current working
variables. It has a stack, which contains the execution history, with one frame
for each procedure called but not yet returned from. Although a thread must
execute in some process, the thread and its process are different concepts and
can be treated separately. Processes are used to group resources together;
threads are the entities scheduled for execution on the CPU.
What
threads add to the process model is to allow multiple executions to take place
in the same process environment, to a large degree independent of one another.
Having multiple threads running in parallel in one process is analogous to
having multiple processes running in parallel in one computer. In the former
case, the threads share an address space, open files, and other resources. In
the latter case, processes share physical memory, disks, printers, and other resources.
Because threads have some of the properties of processes, they are sometimes
called lightweight processes. The term multithreading is also
used to describe the situation of allowing multiple threads in the same
process.
In Fig. 2-6(a) we see three traditional processes. Each process has its own
address space and a single thread of control. In Fig. 2-6(b) we see a single
process with three threads of control. Although in both cases we have three
threads, in Fig. 2-6(a) each of them operates in a different address space,
whereas in Fig. 2-6(b) all three of them share the same address space.
Fig. 2-6. (a) Three processes each with one thread. (b) One process with
three threads.
When
a multithreaded process is run on a single-CPU system, the threads take turns
running. We saw earlier how multiprogramming of processes works. By switching back and
forth among multiple processes, the system gives the illusion of separate
sequential processes running in parallel. Multithreading works the same way.
The CPU switches rapidly back and forth among the threads providing the
illusion that the threads are running in parallel, albeit on a slower CPU than
the real one. With three compute-bound threads in a process, the threads would
appear to be
running
in parallel, each one on a CPU with one-third the speed of the real CPU.
Different
threads in a process are not quite as independent as different processes. All
threads have exactly the same address space, which means that they also share
the same global variables.
Since
every thread can access every memory address within the process’ address space,
one
thread
can read, write, or even completely wipe out another thread’s stack. There is
no protection between threads because (1) it is impossible, and (2) it should
not be necessary. Unlike different processes, which may be from different users
and which may be hostile to one another, a process is always owned by a single
user, who has presumably created multiple threads so that they can cooperate,
not fight. In addition to sharing an address space, all the threads share the
same set of open files, child processes, alarms, and signals. This
organization is appropriate when the threads are actually part of the same job
and are actively and closely cooperating with each other.
The items in the first column of the table below are process properties, not
thread properties. For
example, if one thread opens a file, that file is visible to the other threads
in the process and they can read and write it. This is logical since the
process is the unit of resource management, not the thread. If each thread had
its own address space, open files, pending alarms, and so on, it would be a separate
process. What we are trying to achieve with the thread concept is the ability
for multiple threads of execution to share a set of resources so they can work
together closely to perform
some
task.
Per process items                  Per thread items
Address space                      Program counter
Global variables                   Registers
Open files                         Stack
Child processes                    State
Pending alarms
Signals and signal handlers
Accounting information
The
first column lists some items shared by all threads in a process. The second
one lists some items private to each thread.
Like
a traditional process (i.e., a process with only one thread), a thread can be
in any one of several states: running, blocked, ready, or terminated. A running
thread currently has the CPU and is active. A blocked thread is waiting for
some event to unblock it. For example, when a thread performs a system call to
read from the keyboard, it is blocked until input is typed.
A
thread can block waiting for some external event to happen or for some other
thread to unblock it. A ready thread is scheduled to run and will as soon as
its turn comes up.
The transitions between thread states are the same as the transitions between
process states.
It is important to realize that each thread has its own stack. Each thread's
stack contains one frame for each procedure called but not yet returned from.
This frame contains the procedure's local variables and the return address to
use when the procedure call has finished. For example, if procedure X calls
procedure Y and this one calls procedure Z, then while Z is executing, the
frames for X, Y, and Z will all be on the stack. Each thread will generally
call different procedures and thus have a different execution history. This is
why each thread needs its own stack.
When
multithreading is present, processes normally start with a single thread
present. This thread has the ability to create new threads by calling a library
procedure, for example, thread_create. A parameter to thread_create typically
specifies the name of a procedure for the new thread to run.
It
is not necessary (or even possible) to specify anything about the new thread’s
address space since it automatically runs in the address space of the creating
thread. Sometimes threads are hierarchical, with a parent-child relationship,
but often no such relationship exists, with all threads being equal. With or
without a hierarchical relationship, the creating thread is usually returned a
thread identifier that names the new thread.
When
a thread has finished its work, it can exit by calling a library procedure,
say, thread_exit.
It
then vanishes and is no longer schedulable. In some thread systems, one thread
can wait for a (specific) thread to exit by calling a procedure, for example, thread_wait.
This procedure blocks the calling thread until a (specific) thread has
exited. In this regard, thread creation and termination is very much like process
creation and termination, with approximately the same options as well.
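The generic calls thread_create, thread_exit, and thread_wait map roughly
onto the POSIX calls pthread_create, pthread_exit, and pthread_join; a
minimal sketch:

#include <pthread.h>
#include <stdio.h>

void *worker(void *arg)
{
    printf("hello from thread %d\n", *(int *)arg);
    pthread_exit(NULL);              /* thread_exit: the thread vanishes */
}

int main(void)
{
    pthread_t tid;                   /* identifier returned to the creator */
    int n = 1;

    /* thread_create: run the procedure worker in a new thread */
    pthread_create(&tid, NULL, worker, &n);

    /* thread_wait: block until the specific thread has exited */
    pthread_join(tid, NULL);
    return 0;
}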
Another
common thread call is thread_yield, which allows a thread to voluntarily
give up the CPU to let another thread run. Such a call is important because
there is no clock interrupt to actually enforce timesharing as there is with
processes. Thus it is important for threads to be polite and voluntarily
surrender the CPU from time to time to give other threads a chance to run.
Other
calls allow one thread to wait for another thread to finish some work, for a
thread to announce that it has finished some work, and so on.
While
threads are often useful, they also introduce a number of complications into
the programming model. To start with, consider the effects of the UNIX fork
system call. If the parent process has multiple threads, should the child also
have them? If not, the process may not function properly, since all of them may
be essential.
However,
if the child process gets as many threads as the parent, what happens if a thread
in the parent was blocked on a read call, say, from the keyboard? Are two
threads now blocked on the keyboard, one in the parent and one in the child?
When a line is typed, do both threads get a copy of it? Only the parent? Only
the child? The same problem exists with open network connections.
Another
class of problems is related to the fact that threads share many data
structures. What happens if one thread closes a file while another one is still
reading from it? Suppose that one thread notices that there is too little
memory and starts allocating more memory. Part way through, a thread switch
occurs, and the new thread also notices that there is too little memory and
also starts allocating more memory. Memory will probably be allocated twice.
These problems can be solved with some effort, but careful thought and design
are needed to make multithreaded programs work correctly.
1.10.2
Thread Usage
Having
described what threads are, it is now time to explain why anyone wants them.
The main reason for having threads is that in many applications, multiple
activities are going on at once.
Some
of these may block from time to time. By decomposing such an application into
multiple sequential threads that run in quasi-parallel, the programming model
becomes simpler.
We
have seen this argument before. It is precisely the argument for having
processes. Instead of thinking about interrupts, timers, and context switches,
we can think about parallel processes. Only now with threads we add a new
element: the ability for the parallel entities to share an address space and
all of its data among themselves. This ability is essential for certain applications,
which is why having multiple processes (with their separate address spaces)
will not work.
A
second argument for having threads is that since they do not have any resources
attached to them, they are easier to create and destroy than processes. In many
systems, creating a thread goes 100 times faster than creating a process. When
the number of threads needed changes dynamically and rapidly, this property is
useful.
A
third reason for having threads is also a performance argument. Threads yield
no performance gain when all of them are CPU bound, but when there is
substantial computing and also substantial I/O, having threads allows these
activities to overlap, thus speeding up the application.
Finally,
threads are useful on systems with multiple CPUs, where real parallelism is
possible.
It
is probably easiest to see why threads are useful by giving some concrete
examples. As a first example, consider a word processor. Most word processors
display the document being created on the screen formatted exactly as it will
appear on the printed page. In particular, all the line breaks and page breaks
are in their correct and final position so the user can inspect them and change
the document if need be (e.g., to eliminate widows and orphans—incomplete top
and bottom lines on a page, which are considered esthetically unpleasing).
Suppose
that the user is writing a book. From the author’s point of view, it is easiest
to keep the entire book as a single file to make it easier to search for
topics, perform global substitutions, and so on. Alternatively, each chapter
might be a separate file. However, having every section and subsection as a
separate file is a real nuisance when global changes have to be made to the
entire book since then hundreds of files have to be individually edited. For
example, if proposed Standard xxxx is approved just before the book goes to
press, all occurrences of “Draft Standard xxxx” have to be changed to
“Standard xxxx” at the last minute. If the entire book is one file, typically
a single command can do all the substitutions. In contrast, if the book is
spread over 300 files, each one must be edited separately.
Now
consider what happens when the user suddenly deletes one sentence from page 1
of an 800-page document. After checking the changed page to make sure it is
correct, the user now wants to make another change on page 600 and types in a
command telling the word processor to go to that page (possibly by searching
for a phrase occurring only there). The word processor is now forced to
reformat the entire book up to page 600 on the spot because it does not know
what the first line of page 600 will be until it has processed all the previous
pages. There may be a
substantial
delay before page 600 can be displayed, leading to an unhappy user.
Threads
can help here. Suppose that the word processor is written as a two-threaded
program.
One
thread interacts with the user and the other handles reformatting in the
background. As soon as the sentence is deleted from page 1 the interactive
thread tells the reformatting thread to reformat the whole book. Meanwhile, the
interactive thread continues to listen to the keyboard and mouse and responds
to simple commands like scrolling page 1 while the other thread is computing
madly in the background. With a little luck, the reformatting will be completed
before the user asks to see page 600, so it can be displayed instantly.
While
we are at it, why not add a third thread? Many word processors have a feature
of automatically saving the entire file to disk every few minutes to protect
the user against losing a day’s work in the event of a program crash, system
crash, or power failure. The third thread can handle the disk backups without
interfering with the other two.
If
the program were single-threaded, then whenever a disk backup started, commands
from the keyboard and mouse would be ignored until the backup was finished. The
user would perceive this as sluggish performance. Alternatively, keyboard and
mouse events could interrupt the disk backup, allowing good performance but
leading to a complex interrupt-driven programming model. With three threads,
the programming model is much simpler. The first thread just interacts with the
user. The second thread reformats the document when told to. The third thread writes
the contents of RAM to disk periodically.
It
should be clear that having three separate processes would not work here
because all three threads need to operate on the document. By having three
threads instead of three processes, they share a common memory and thus all
have access to the document being edited.
An
analogous situation exists with many other interactive programs. For example,
an electronic spreadsheet is a program that allows a user to maintain a matrix,
some of whose elements are data provided by the user. Other elements are
computed based on the input data using potentially complex formulas. When a
user changes one element, many other elements may have to be recomputed. By
having a background thread do the recomputation, the interactive thread can
allow the user to make additional changes while the computation is going on.
Similarly, a third thread can handle periodic backups to disk on its own.
Now
consider yet another example of where threads are useful: a server for a World
Wide Web site. Requests for pages come in and the requested page is sent back
to the client. At most Web sites, some pages are more commonly accessed than
other pages. For example, Sony’s home page is accessed far more than a page
deep in the tree containing the technical specifications of some particular
camcorder. Web servers use this fact to improve performance by maintaining a collection
of heavily used pages in main memory to eliminate the need to go to disk to get
them.
Such
a collection is called a cache and is used in many other contexts as
well.
One
way to organize the Web server is shown in Fig. 2-10(a). Here one thread, the dispatcher,
reads incoming requests for work from the network. After examining the
request, it chooses an idle (i.e., blocked) worker thread and hands it
the request, possibly by writing a pointer to the message into a special word
associated with each thread. The dispatcher then wakes up the sleeping worker,
moving it from blocked state to ready state.
When
the worker wakes up, it checks to see if the request can be satisfied from the
Web page cache, to which all threads have access. If not, it starts a read
operation to get the page from the disk and blocks until the disk operation
completes. When the thread blocks on the disk operation, another thread is
chosen to run, possibly the dispatcher, in order to acquire more work, or possibly
another worker that is now ready to run.
This
model allows the server to be written as a collection of sequential threads.
The dispatcher’s program consists of an infinite loop for getting a work
request and handing it off to a worker.
Each
worker’s code consists of an infinite loop consisting of accepting a request
from the dispatcher and checking the Web cache to see if the page is present.
If so, it is returned to the client and the worker blocks waiting for a new
request. If not, it gets the page from the disk, returns it to the client, and
blocks waiting for a new request.
A
rough outline of the code is given in Fig. 2-11. Here, as in the rest of this
book, TRUE is assumed to be the constant 1. Also, buf and page
are structures appropriate for holding a work request and a Web page,
respectively.
while (TRUE) {                               /* (a) dispatcher thread */
    get_next_request(&buf);                  /* wait for a request to arrive */
    handoff_work(&buf);                      /* hand it to an idle worker */
}

while (TRUE) {                               /* (b) worker thread */
    wait_for_work(&buf);                     /* block until a request is handed over */
    look_for_page_in_cache(&buf, &page);
    if (page_not_in_cache(&page))
        read_page_from_disk(&buf, &page);    /* blocks only this thread */
    return_page(&page);                      /* send the page to the client */
}
Consider
how the Web server could be written in the absence of threads. One possibility
is to have it operate as a single thread. The main loop of the Web server gets
a request, examines it, and carries it out to completion before getting the
next one. While waiting for the disk, the server is idle and does not process
any other incoming requests. If the Web server is running on a dedicated machine,
as is commonly the case, the CPU is simply idle while the Web server is waiting
for the disk. The net result is that many fewer requests/sec can be processed.
Thus threads gain considerable performance, but each thread is programmed
sequentially, in the usual way.
So
far we have seen two possible designs: a multithreaded Web server and a
single-threaded Web server. Suppose that threads are not available but the
system designers find the performance loss due to single threading
unacceptable. If a nonblocking version of the read system call is available, a
third approach is possible. When a request comes in, the one and only thread
examines it. If it can be satisfied from the cache, fine, but if not, a
nonblocking disk operation is started.
The
server records the state of the current request in a table and then goes and
gets the next event. The next event may either be a request for new work or a
reply from the disk about a previous operation. If it is new work, that work is
started. If it is a reply from the disk, the relevant information is fetched
from the table and the reply processed. With nonblocking disk I/O, a reply
probably will have to take the form of a signal or interrupt.
In
this design, the “sequential process” model that we had in the first two cases
is lost. The state of the computation must be explicitly saved and restored in
the table every time the server switches from working on one request to
another. In effect, we are simulating the threads and their stacks the hard
way. A design like this in which each computation has a saved state and there
exists some set of events that can occur to change the state is called a finite-state
machine.
This
concept is widely used throughout computer science.
It
should now be clear what threads have to offer. They make it possible to retain
the idea of sequential processes that make blocking system calls (e.g., for
disk I/O) and still achieve parallelism. Blocking system calls make programming
easier and parallelism improves performance. The single-threaded server retains
the ease of blocking system calls but gives up performance. The third approach
achieves high performance through parallelism but uses nonblocking calls and
interrupts and is thus hard to program. The three models are summarized below:

Model                      Characteristics
Threads                    Parallelism, blocking system calls
Single-threaded process    No parallelism, blocking system calls
Finite-state machine       Parallelism, nonblocking system calls, interrupts
A
third example where threads are useful is in applications that must process
very large amounts of data. The normal approach is to read in a block of data,
process it, and then write it out again.
The
problem here is that if only blocking system calls are available, the process
blocks while data are coming in and data are going out. Having the CPU go idle
when there is lots of computing to do is clearly wasteful and should be avoided
if possible.
Threads
offer a solution. The process could be structured with an input thread, a
processing thread, and an output thread. The input thread reads data into an
input buffer.
The
processing thread takes data out of the input buffer, processes them, and puts
the results in an output buffer.
The output thread writes these results back to disk. In this way, input,
output, and processing can all be going on at the same time. Of course, this
model only works if a system call blocks only the calling thread, not the
entire process.
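A hedged sketch of the processing stage of such a pipeline using POSIX
threads, with one-slot input and output buffers protected by a mutex and a
condition variable (the buffer size and all names are illustrative):

#include <pthread.h>
#include <string.h>

#define BLOCK 4096

static char in_buf[BLOCK], out_buf[BLOCK];
static int in_full, out_full;            /* 1 when a buffer holds data */
static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  c = PTHREAD_COND_INITIALIZER;

/* processing thread: moves data from the input to the output buffer */
void *process(void *arg)
{
    for (;;) {
        pthread_mutex_lock(&m);
        while (!in_full || out_full)     /* wait for input and output space */
            pthread_cond_wait(&c, &m);
        memcpy(out_buf, in_buf, BLOCK);  /* ... real processing goes here ... */
        in_full = 0;                     /* input thread may refill in_buf */
        out_full = 1;                    /* output thread may write out_buf */
        pthread_cond_broadcast(&c);
        pthread_mutex_unlock(&m);
    }
    return NULL;
}

The input thread fills in_buf and sets in_full; the output thread writes
out_buf to disk and clears out_full; both use the same mutex, condition
variable, and broadcast, symmetrically.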
1.10.3
Implementing Threads in User Space
There
are two main ways to implement a threads package: in user space and in the
kernel. The choice is moderately controversial, and a hybrid implementation is
also possible. We will now describe these methods, along with their advantages
and disadvantages.
The
first method is to put the threads package entirely in user space. The kernel
knows nothing about them. As far as the kernel is concerned, it is managing
ordinary, single-threaded processes.
The
first, and most obvious, advantage is that a user-level threads package can be
implemented on an operating system that does not support threads. All operating
systems used to fall into this category, and even now some still do.
The
threads run on top of a run-time system, which is a collection of procedures
that manage threads. We have seen four of these already: thread_create,
thread_exit, thread_wait, and thread_yield, but usually there are more.
When threads are managed in user space, each process needs its own private
thread table to keep track of the threads in that process. This table is
analogous to the kernel's process table, except that it keeps track only of
the per-thread properties, such as each thread's program counter, stack
pointer, registers, state, etc. The thread table is managed by the run-time
system.
When
a thread is moved to ready state or blocked state, the information needed to
restart it is stored in the thread table, exactly the same way as the kernel
stores information about processes in the process table.
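A sketch of what one entry in such a per-process thread table might look like
(the fields and the table size are illustrative):

#define MAX_THREADS 64

struct thread {
    unsigned long sp;                /* saved stack pointer */
    unsigned long pc;                /* saved program counter */
    unsigned long regs[16];          /* other saved registers */
    int           state;             /* ready, running, or blocked */
};

/* one private table per process, managed entirely by the run-time
   system in user space; the kernel never sees it */
static struct thread thread_table[MAX_THREADS];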
When
a thread does something that may cause it to become blocked locally, for
example, waiting for another thread in its process to complete some work, it
calls a run-time system procedure.
This
procedure checks to see if the thread must be put into blocked state. If so, it
stores the thread’s registers (i.e., its own) in the thread table, looks in the
table for a ready thread to run and reloads the machine registers with the new
thread’s saved values. As soon as the stack pointer and program counter have
been switched, the new thread comes to life again automatically. If the machine
has an instruction to store all the registers, and another one to load them
all, the entire thread switch can be done in a handful of instructions. Doing
thread switching like this is at least
an
order of magnitude faster than trapping to the kernel and is a strong argument
in favor of user level threads packages.
However,
there is one key difference with processes. When a thread is finished running
for the moment, for example, when it calls thread_yield, the code of thread
_yield can save the thread’s information in the thread table itself.
Furthermore, it can then call the thread scheduler to pick another thread to
run. The procedure that saves the thread’s state and the scheduler are just
local procedures, so invoking them is much more efficient than making a kernel
call. Among other issues, no trap is needed, no context switch is needed, the
memory cache need not be flushed, and so on. This makes thread scheduling very
fast.
User-level
threads also have other advantages. They allow each process to have its own customized
scheduling algorithm. For some applications, for example, those with a garbage collector
thread, not having to worry about a thread being stopped at an inconvenient
moment is a plus. They also scale better, since kernel threads invariably
require some table space and stack space in the kernel, which can be a problem
if there are a very large number of threads.
Despite
their better performance, user-level threads packages have some major problems.
First among these is the problem of how blocking system calls are implemented.
Suppose that a thread reads from the keyboard before any keys have been hit.
Letting the thread actually make the system call is unacceptable, since this
will stop all the threads. One of the main goals of having threads in the first
place was to allow each one to use blocking calls, but to prevent one blocked thread
from affecting the others. With blocking system calls, it is hard to see how
this goal can be achieved readily.
The system calls could all be changed to be nonblocking (e.g., a read on the
keyboard would just return 0 bytes if no characters were already buffered), but
requiring changes to the operating system is unattractive. Besides, one of the
arguments for user-level threads was precisely that they could run with existing
operating systems. In addition, changing the semantics of read will require
changes to many user programs.
Another alternative is available if it is possible to tell in advance whether
a call will block.
In
some versions of UNIX, a system call select exists, which allows the caller to
tell whether a prospective read will block. When this call is present, the
library procedure read can be replaced with a new one that first does a
select call and then only does the read call if it is safe (i.e., will not
block). If the read call will block, the call is not made. Instead, another
thread is run. The next time the run-time system gets control, it can check
again to see if the read is now safe. This approach requires rewriting parts of
the system call library, is inefficient and inelegant, but there
is
little choice. The code placed around the system call to do the checking is
called a jacket or wrapper.
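A hedged sketch of such a jacket around read, using the UNIX select call with
a zero timeout to test whether the real read would block
(schedule_another_thread is a hypothetical run-time system call, not a real
API):

#include <sys/select.h>
#include <unistd.h>

extern void schedule_another_thread(void);  /* hypothetical run-time call */

/* jacket (wrapper): only issue the real read if it will not block */
ssize_t safe_read(int fd, void *buf, size_t n)
{
    for (;;) {
        fd_set rfds;
        struct timeval tv = { 0, 0 };        /* zero timeout: just poll */

        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);
        if (select(fd + 1, &rfds, NULL, NULL, &tv) > 0)
            return read(fd, buf, n);         /* safe: data is waiting */
        schedule_another_thread();           /* meanwhile, run another thread */
    }
}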
Somewhat analogous to the problem of blocking system calls is the problem of
page faults. It is sufficient to say that computers can be set up in such a
way that not all of the program is in main memory at once. If the program
calls or jumps to an instruction that is not in memory, a page fault occurs
and the operating system will go and get the missing instruction (and its
neighbors) from disk. The process is blocked while the necessary
instruction is being located and read in. If a thread causes a page fault, the
kernel, not even knowing about the existence of threads, naturally blocks the
entire process until the disk I/O is complete, even though other threads might
be runnable. Another problem with user-level thread packages is that if a
thread starts running, no other thread in that process will ever run unless the
first thread voluntarily gives up the CPU. Within a single
process, there are no clock interrupts, making it impossible to schedule
threads in round-robin fashion (taking turns). Unless a thread enters the
run-time system of its own free will, the scheduler will never get a chance.
One
possible solution to the problem of threads running forever is to have the
run-time system request a clock signal (interrupt) once a second to give it control,
but this, too, is crude and messy to program. Periodic clock interrupts at a
higher frequency are not always possible, and even if they are, the total
overhead may be substantial. Furthermore, a thread might also need a clock
interrupt, interfering with the run-time system’s use of the clock.
Another,
and probably the most devastating argument against user-level threads, is that programmers
generally want threads precisely in applications where the threads block often,
as, for example, in a multithreaded Web server. These threads are constantly
making system calls.
Once
a trap has occurred to the kernel to carry out the system call, it is hardly
any more work for the kernel to switch threads if the old one has blocked, and
having the kernel do this eliminates the need for constantly making select
system calls that check to see if read system calls are safe. For applications
that are essentially entirely CPU bound and rarely block, what is the point of having
threads at all? No one would seriously propose computing the first n prime
numbers or playing chess using threads because there is nothing to be gained by
doing it that way.
1.10.4
Implementing Threads in the Kernel
Now let us consider having the kernel know about and manage the threads. No
run-time system is needed in each process, as shown in Fig. 2-13(b). Also,
there is no thread table in each process.
Instead,
the kernel has a thread table that keeps track of all the threads in the
system. When a thread wants to create a new thread or destroy an existing
thread, it makes a kernel call, which then does the creation or destruction by
updating the kernel thread table.
The
kernel’s thread table holds each thread’s registers, state, and other
information. The information is the same as with user-level threads, but it is
now in the kernel instead of in user space (inside the run-time system). This
information is a subset of the information that traditional kernels maintain
about each of their single-threaded processes, that is, the process state. In addition,
the kernel also maintains the traditional process table to keep track of
processes.
All
calls that might block a thread are implemented as system calls, at
considerably greater cost than a call to a run-time system procedure. When a
thread blocks, the kernel, at its option, can run either another thread from
the same process (if one is ready), or a thread from a different process. With
user-level threads, the run-time system keeps running threads from its own
process until the kernel takes the CPU away from it (or there are no ready
threads left to run).
Due
to the relatively greater cost of creating and destroying threads in the
kernel, some systems take an environmentally correct approach and recycle their
threads. When a thread is destroyed, it is marked as not runnable, but its
kernel data structures are not otherwise affected.
Later,
when a new thread must be created, an old thread is reactivated, saving some
overhead. Thread recycling is also possible for user-level threads, but since
the thread management overhead is much smaller, there is less incentive to do
this.
Kernel threads do not require any new, nonblocking system calls. In addition,
if one
thread in a process causes a page fault, the kernel can easily check to see if
the process has any other runnable threads, and if so, run one of them while
waiting for the required page to be brought in from the disk. Their main
disadvantage is that the cost of a system call is substantial, so if thread operations
(creation, termination, etc.) are common, much more overhead will be incurred.