FIT9137 (Weeks 1–4)
Week 1
Three major functions of an operating system (OS):
- process management
- memory management
- file management
1940s – ENIAC, the first general-purpose (programmable) electronic digital computer.
Bill Gates - Microsoft founder
Steve Jobs - Apple computer co-founder
Ken Thompson and Dennis Ritchie - received the US National Medal of Technology in 1999 for the invention of Unix
Larry Ellison - Oracle founder
Richard Stallman - Founder of the Free Software Foundation and creator of the GNU Project
Linus Torvalds - Creator of Linux
Basic components of a PC:
- CPU: Central Processing Unit
- Primary storage or Main Memory: Holds running programs whilst they are being executed
- Secondary storage: Holds data and programs permanently
- Bus: Connects, and carries signals (Control, Address, Data) to, all components of the system
- Input / Output devices and controllers
The von Neumann Architecture:
- The solution: a “stored program computer”
Three concepts underlie the architecture:
- Program instructions and data are stored in a single read/write store (the main memory)
- The contents of this memory are addressable by memory location, regardless of the type of data contained at the location
- Execution of program instructions occurs sequentially, unless explicitly modified
Binary number system (“base 2”)
Binary Digit (or bit): 0 or a 1
8 bits = 1 ‘byte’, e.g. 0100 1001
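For a quick check of binary values, a small sketch (bash’s base#value arithmetic syntax is assumed available):
$ echo $((2#01001001))   # interpret the byte 0100 1001 in base 2
73                       # = 64 + 8 + 1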
Machine language is the lowest-level programming language (closest to the hardware).
The Fetch/Execute Cycle
- Cycle: A processor can have several states; the following are used in conventional computers:
- Fetch: The CPU fetches instructions and data from main memory and stores them in special memory locations (registers).
- Decode: CPU interprets the instruction it just fetched
- Execute: The instruction is carried out (executed) on the data and any temporary result is stored in a register.
- The CPU advances through the program using a Program Counter (PC)
Interrupts
- Modern computer systems are interrupt driven.
• An interrupt is a signal to the processor to suspend its current tasks and deal with whatever caused the interrupt.
- Interrupts can be classified broadly as:
• Program/Software
• Timer
• I/O
• Other Hardware
Interrupt and Multiprocessing
- The interrupt mechanism can be used to implement one of the key features of all modern computer systems – multiprocessing
- Multiprocessing is the capacity to have multiple programs in memory and switch the processor between them – gives the illusion of many programs running at the same time.
- The order in which programs are executed can be determined by factors such as priority, time-sharing, I/O-interrupts, etc
Week 2
Infrastructure Architecture
- Application / Programming languages
- Virtual Machine / Operating System
- Instruction Set Architecture (ISA) / Microarchitecture
- Logic / Circuits
Application Architecture
- Complex hardware/networks require more complex software architectures
- These are commonly used approaches (patterns) for application architecture
- client/server architecture
- three-layer client/server architecture
- web services architecture
- internet and web-based application architecture
History of Operating Systems
- 1st generation (Vacuum Tube Computers, 1945)
- 2nd generation (Transistorized Computers, 1954)
- 3rd generation (Integrated Circuits Computers, 1965)
- 4th generation (Very Large Scale Integrated Computers, 1980)
- 5th generation (Quantum Computers)
Operating Systems
- Operating systems control the underlying computer hardware.
- Provide an environment within which other programs can be run.
- Can also control how the CPU and other computer resources are allocated to individual users or programs.
- In Unix/Linux, the kernel is the most important part of the operating system.
- We can generally categorize the various software sitting on top of the O/S into “Utilities” and “Application Software”, depending on the complexity and purpose of the software.
- A multi-user operating system is generally also a multi-tasking operating system:
• individual users can be running more than one program simultaneously.
- On a single-CPU machine, this is actually just an illusion – since if there is only one physical CPU, there can only be one program running on the CPU at any given instant.
- This illusion is achieved by rapidly switching the CPU between the different programs in memory, in turn giving them some “time-slices” on the CPU.
- Unix is a multi-user, multi-tasking operating system.
History of Unix - GNU
Richard Stallman (often referred to by his username, rms) is the father of the Free Software Foundation and of the GNU (“GNU’s Not Unix”) project.
History of Unix - MINIX
Andrew Tanenbaum (PhD UC Berkeley) wrote a UNIX clone from scratch called MINIX in order to support an operating systems course he was teaching.
Unix Philosophy
- Programs are tools.
- Like any good tool, they should be specific in function, but usable for many different purposes.
- Within limits, the output of any program should be usable as the input of another program (see the example below).
- All of the information needed by a program should either be contained in the data stream passed to it from another program or specified on the command line.
- The UNIX philosophy underpins much of the development of UNIX programs and operating system utilities.
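A minimal sketch of this philosophy in action (words.txt is a hypothetical input file): small tools chained with pipes, each program’s output becoming the next program’s input.
$ sort words.txt | uniq -c | sort -rn | head -5   # the five most frequent lines in words.txt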
File Management
Everything in Unix is treated as a file.
Typical tasks of the File Management System:
- Controlling transfer of data to and from secondary storage.
- Controlling security of file access.
- Keeping track of storage space and maintaining file directories.
- Providing sharing mechanisms for files
- (Possibly) providing recovery and restoration mechanisms
Memory Management
The operating system is also key to the control of data in primary storage (main memory).
In modern operating systems, memory allocation may be non-contiguous (a single logical object may be spread over several disjointed memory areas).
A major function of modern memory management is the implementation of virtual memory.
Process Management
A process is normally defined as “a program in execution”.
The operating system typically needs to provide the following functionality with regard to processes:
- Creating and destroying processes
- Controlling the execution/progress of processes
- Acting on exceptional conditions arising during the execution of processes (eg. errors, interrupts, etc)
- Allocating hardware resources (fairly) among processes
- Providing some form of inter-process communication
File Management
File management systems allow users to store information in fundamental units called 'files'. What the file actually represents is defined by the system and/or the user.
The file system provides the connection between the logical file structure and its physical implementation, creating a logical view for the user and hiding the physical view.
- A File System is a data structure to serve a particular application need.
- Through a file system driver, the O/S controls creation/deletion/access.
- File systems must be mounted before they can be used by the operating system (see the example below).
- Shared file systems such as NFS, SMB, etc.
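A sketch of mounting in practice (the device name and mount point below are illustrative, and root privileges are assumed):
$ sudo mount /dev/sdb1 /mnt/usb   # attach the file system on /dev/sdb1 at /mnt/usb
$ df -hT /mnt/usb                 # confirm the mount and show its file-system type
$ sudo umount /mnt/usb            # detach it again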
Popular File Systems
- Windows:
• FAT, VFAT, and FAT32 – used in MS-DOS, older versions of Windows, and removable storage devices like USB memory sticks
• NTFS (New Technology File System) – used in modern Windows
- Unix:
• UFS (Unix File System) and VxFS (Veritas File System) – used in most UNIX flavours
• ext (and ext2, ext3, ext4) – used in Linux
O/S maintains a directory structure for each device, to facilitate location and organization of user files, and keeps track of free space, allocating space and reclaiming it as required.
Provides naming, access/manipulation/storage, security/protection functions for files and directories.
Ordinary Files
- Data (e.g. a text file, program source code)
- Executables (e.g. a Unix command, a shell script, etc)
Directories
- A directory is another type of file in Unix – a special “file” that can contain other files and other directories.
Special Files
- Other types of files, e.g. files that represent hardware devices like hard drives.
Files - naming conventions
- Unix is case-sensitive. In general, most Unix commands are in lowercase letters (i.e. Unix does not like uppercase letters!).
- Unix filenames are generally made up of lowercase and uppercase letters, digits, dots (.) and commas.
- Spaces (and other “special” characters) in filenames can occasionally make file-handling difficult, so try to avoid them if possible.
- There is no notion of a file “extension” in Unix (unlike O/S such as Microsoft Windows). While files can have an extension, the extension (i.e. the bit after the dot) has no special meaning, and does not necessarily define the type of the file or indicate how it should be dealt with by an application.
- In theory, any name can be used for a file or a directory (with the exception of the “root directory” which must always be named /). Unix is also very generous with the length of a filename.
- Try to avoid using special characters.
- However, important system files and directories are generally given the same “standard” names on Unix systems. Examples of some common/typical system directories include (but are not limited to) /etc, /bin, /home, /mnt, /usr, /var, /tmp, /proc and /lib.
- You should not modify/delete these system directories/files unless you know what you are doing!
UNIX stores files on the disk in a hierarchical structure.
The top of the hierarchy is referred to as the root directory and is always named /
[Figure: a typical Unix file system hierarchy]
Week 3
A special file : /etc/passwd
This file holds important information about all the users in the system. For instance, you may find in this file entries such as:
muni:x:1000:1000:muni:/home/muni:/bin/bash
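The fields are colon-separated. Reading the entry above left to right (the awk line is just one way to pull fields out):
# username : password placeholder ('x' means the hash is in /etc/shadow) : UID : GID :
# comment (GECOS) : home directory : login shell
$ awk -F: '{ print $1, $3, $6, $7 }' /etc/passwd   # name, UID, home directory, shell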
/etc/group holds information about all the groups in the system.
• a device is represented as a file.
• a program is a file.
• a directory is really also a file (!)
The kernel does not identify files by names; it uses a unique number, called the inode number, to identify each file.
$ stat /root/test.txt
File type characters (shown in the first column of a long listing):
- '-' : regular (ordinary) file
- 'd' : directory
- 'b' : buffered special file (e.g. a disk drive)
- 'c' : unbuffered special file (e.g. a terminal)
- 'l' : symbolic link
- 'p' : pipe
- 's' : socket
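The inode number and file type can also be inspected directly (the format specifiers below are GNU stat options):
$ stat -c '%i %F %A' /etc/passwd   # inode number, file type, permissions
$ ls -id /etc                      # -i prints the inode number; -d lists the directory itself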
File permissions
Permissions come in three triplets – user, group, others – each showing read, write and execute bits:
user: r/- w/- x/-   group: r/- w/- x/-   others: r/- w/- x/-
$ ls -la
The option “-l” in the command above requests the output in long-listing format.
The option “-a” in the command above requests that all files be listed (including hidden files).
There is another option, “-h”, which will make ls display sizes in “human-readable” format (e.g. 8K, 555M, 4G, etc.)
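Putting this together, the first character of each long-listing line is the file-type character from the table above, followed by the three permission triplets. The output below is illustrative, not real system output:
$ ls -lah /etc/passwd /tmp
-rw-r--r--  1 root root 1.6K ... /etc/passwd    ('-' = regular file)
drwxrwxrwt 10 root root 4.0K ... /tmp           ('d' = directory)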
Change file permission (chmod)
Syntax: chmod [-R] who op permission file-list
who is one of:
• u (user) – the owner of the file
• g (group) – the group to which the owner belongs
• o (other) – all other users
• a (all) – can be used in place of u, g, o
op is one of:
- '+' add permission
- '-' remove permission
- '=' set permission (reset all other permissions)
permission is one or more of : r, w, x
Note: this is typical of a Unix command – the command is given with some option(s) and the actual operands.
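Some typical invocations following that syntax (the filenames are hypothetical):
$ chmod u+x script.sh       # add execute permission for the user (owner)
$ chmod go-w notes.txt      # remove write permission for group and others
$ chmod a=r readme.txt      # set everyone's permissions to read-only
$ chmod -R g+rx ~/project   # apply recursively to a directory tree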
File access for processes
When a process executes, it has four id’s:
- a real user-id
- an effective user-id
- a real group-id
- an effective group-id
These IDs determine a process’s access permissions to files/directories.
• Real UID is the UID of the user that created THIS process – i.e. the user who executes/runs the program.
• Effective UID is used to evaluate the privileges of the process to perform a particular action.
setuid and setgid
- A process’ access privileges depend on who executes the process, not on who owns the executable program itself
- This is safer in general, but not helpful in some (rare, but important) cases.
- This can be overcome using special permissions:
- set-user-id and
- set-group-id
- When a program with setuid permission is executed, the resulting process’s effective user-id becomes that of the owner of the program (instead of the user who executes that program).
- Similarly with setgid.
- In both cases, the real uid and gid are not affected
Numeric (octal) permissions: r = 4, w = 2, x = 1 – e.g. chmod 764 file
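Each octal digit is just the sum of those values. A worked example (report.txt is hypothetical):
$ chmod 764 report.txt   # user: 7 = 4+2+1 = rwx; group: 6 = 4+2 = rw-; others: 4 = r--
$ ls -l report.txt       # should now show -rwxrw-r--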
Every time a shell is started, three files (streams) are opened automatically:
- stdin (fd 0) – keyboard
- stdout (fd 1) – screen
- stderr (fd 2) – screen
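These streams can be redirected by file-descriptor number; a small sketch with hypothetical filenames:
$ sort < names.txt > sorted.txt 2> errors.txt
# fd 0 (stdin) reads from names.txt, fd 1 (stdout) writes to sorted.txt,
# and fd 2 (stderr) writes any error messages to errors.txt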
Links - Hard Links
- A hard link is a pointer/reference to a file - every file has at least one hard link to it.
- The hard link is how the operating system associates a file name with the address of the actual data on the storage device.
- Additional links can be created to allow sharing of files or access through a different name.
- Hard links can be created in Unix using ln.
- A file exists until the last hard link to it is removed. When the last hard link is removed, the space previously used by the file is marked for re-use.
Hard link (default) : ln ~/week3/newfile.txt hardlink1
Soft link (using -s option) : ln -s ~/week3/newfile.txt softlink1
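Comparing the two with ls -li (the -i option prints inode numbers) shows the difference: a hard link shares the original file’s inode, while a soft link gets its own inode and simply stores the target path.
$ ls -li ~/week3/newfile.txt hardlink1 softlink1
# newfile.txt and hardlink1 show the same inode number;
# softlink1 shows a different inode and an arrow: softlink1 -> /home/<user>/week3/newfile.txt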
Memory management
Physical main memory is finite (and expensive)
✔ Single-processing: 1 process in memory at any one time. Easy to implement – either it fits or it doesn't.
✔ Multi-processing: multiple processes in memory at the same time.
Solutions : Swapping, Virtual Memory.
Swapping is a technique used to run more than one process at once. It allows the computer to rapidly “swap” its CPU between processes by loading and unloading them into/from memory. The switching occurs sufficiently quickly that it gives the user the illusion that the system is multi-tasking.
Virtual Memory is a more sophisticated technique used to solve memory management problems. It allows the computer to separate logical program addresses from actual physical addresses, using dynamic relocation of program addresses in memory.
It allows programs to be divided up into small sections stored in different parts of memory, and allows execution of programs larger than physical memory, i.e. only the currently executing portion of a program needs to be in memory.
Virtual memory may be implemented using paging and/or segmentation
External Fragmentation– memory which is unallocated (to any processes) and unused.
Internal Fragmentation – memory which is allocated (to a process) but unused.
Memory Management – Paging
Paging:
- Programs/data are divided into logical sections called PAGES.
- Physical memory is divided into areas called page FRAMES.
- A Page Table then translates the pages to their corresponding frames at run-time.
- Page size and Frame size are equal.
- When a program is running, its pages are brought into memory as required – i.e. only the portion being executed is in memory.
- The rest stays on the hard drive. So we can now run programs which may be larger than the available physical memory.
Page size is usually hardware-dependent, and is typically a power of 2 (e.g. 512 bytes, 1024 bytes, etc.) to aid address translation. A size of 2–4 KB is common.
Large page sizes will:
- increase internal fragmentation (i.e. it is more likely that a frame will have an unused portion, as a process/page may not need the entire frame), but
- reduce the maintenance overhead of page tables, since there are fewer pages to maintain.
✔ Small page sizes will have the opposite effect.
✔ So it is important to choose the correct page size.
During program execution, some program “pages” stay in memory in page frames, the rest stay on secondary storage (in special area, sometimes known as swap space) until required (when they are then “swapped” back in).
Program logical addresses are converted to the form:
• page_number : page_offset
• e.g. a 32-bit address may use 21 bits for the page # and 11 bits for the page_offset (so a total of 2^21, or ~2.1 million, pages).
• A page # is used as an index into the Page Table, to find the corresponding frame #.
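A worked split of a 32-bit address under that 21/11 layout (the address value itself is made up), using shell arithmetic:
$ addr=$((0x0040A1F3))     # an arbitrary 32-bit logical address
$ echo $(( addr >> 11 ))   # page number = top 21 bits
$ echo $(( addr & 0x7FF )) # page offset = low 11 bits (0..2047, i.e. 2 KB pages)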
The O/S also maintains a Process Table for all the processes. A Process Table contains entries called Process Control Blocks (PCB).
Each PCB represents one process
A typical PCB entry contains data such as :
- process state & process ID
- program counter
- memory address of the Page Table
- resources in use/needed
- etc
Improving Paging performance by Caching
- Data in the page table is constantly accessed while the process is running. For efficiency, special high-speed memory is provided for the “active” part of the page table – called a cache, associative memory, or Translation Lookaside Buffer (TLB).
- All locations in the cache can be searched simultaneously, rather than sequentially.
- The cache memory is small and stores only active pages. It is searched first for a page #; if not found, the search then goes to the page table in conventional memory.
Page Replacement/Swapping
- If a required page is not in memory, an interrupt called a Page Fault results. This causes the required page to be loaded (from secondary storage) into main memory, and the page table updated. The page table indicates whether a page is in memory using a “valid/invalid” bit entry.
- If the process’s memory is full (all allocated frames used), some form of page replacement policy is needed, i.e. replace a current frame with the new one. This is called Page Replacement/Swapping. If the evicted frame was modified, a write to disk is necessary before replacing the page.
Paging Algorithms
✔ Least Recently Used (LRU) – the least recently used page is swapped out; each access must be time-stamped → extra overhead.
✔ Least Frequently Used (LFU) – the least frequently used page is swapped out; each access adds to a count, but this causes problems with new pages that have just been brought in (which would naturally have a low frequency count!).
✔ First In First Out (FIFO) – the oldest page is swapped out first; simplest, but has the disadvantage that the most heavily used page may be replaced.
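As a sketch of FIFO behaviour, the classic reference string below incurs 9 page faults with 3 frames (the string and frame count are illustrative; the awk script just counts faults):
$ echo "1 2 3 4 1 2 5 1 2 3 4 5" | awk '{
    for (i = 1; i <= NF; i++)
      if (!($i in mem)) {                                    # page fault: page not resident
        faults++
        if (n == 3) { delete mem[q[head]]; head++ } else n++ # evict the oldest of 3 frames
        q[tail++] = $i; mem[$i] = 1
      }
    print faults, "page faults"                              # prints: 9 page faults
  }'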
Paging Algorithm Variations
Not Used Recently (NUR) - modification of LRU, that also looks at whether page has been modified (in addition to being accessed).
Second Chance algorithms are modification of FIFO to allow second chance if reference bit is set. Page is time stamped again as though new.
The Working Set Model
- As the number of pages allocated to a process decreases, the number of page faults increases. We need to keep enough pages of the process in memory so that the CPU can be used effectively.
- This minimum set is called the Working Set (or Resident Set) of the process. If the number of pages allocated falls below this level, then Thrashing will result.
- Thrashing is when so many page faults occur that most activity is just paging in and out while the CPU is blocked waiting on I/O. This has a very serious performance impact, since I/O access time is typically much slower than memory access time.
Pages in the working set may be:
- loaded on demand, as a result of direct page faults (demand paging), or
- anticipated in advance (pre-paging), based on some common access pattern.
Page Replacement Policy
- local - replaces pages actually owned by the process
- global - replaces pages from any process
Global replacement is generally more flexible, because there is a larger set of replacement pages to choose from;
• disadvantage : can reduce the working set size for unrelated processes.
Crucial pages (e.g. the actual disk driver, video driver, etc.) can be marked as “locked” so that they are never swapped out.
Segmentation
Segmentation is another approach to memory management, similar to Paging.
• Reflects the Logical division in programs and data, e.g. may have segment for global variables, code portions of functions and procedures, etc.
• the division is decided by the programmer, and the O/S will then build the appropriate Segment Table (as opposed to a Page Table) to mirror the division.
• Program addresses are now of form:
segment_number : offset
The address mapping for logical to physical addresses is maintained in a Segment Table (similar to a Page Table).
• each segment may vary in size depending on its function (i.e. logical division)
• each segment is a logical entity of the program it represents – compare this approach to Paging, where each page/frame is of the same size
• Segmentation is more complicated to program than paging, so it is less popular as a way to implement virtual memory
Segmentation vs Paging
[Figure: segmentation vs paging comparison]
Virtual Memory Technique
- Advantages:
- Provides a large ratio of logical memory space to physical memory space.
- Allows more processes to run concurrently.
- Process isolation – protects processes from each other. Each process has its own virtual memory space.
- Less I/O, as we load in only the required sections of a user process.
- Disadvantages:
- Adds complexity to memory management
- Can cause performance degradation if not implemented properly – e.g. Thrashing due to excessive page faults
Memory Management
Order of increasing complexity:
- Single-processing – process loaded into memory and stays there until it has finished.
- Multi-processing – all processes stay in memory.
- Swapping - system can handle more processes than it has room for in memory, by swapping processes to and from disk.
- Virtual memory - can handle processes bigger than physical memory using paging or segmentation.
- Paging/Segmentation - memory subdivided into pages or segments.
Process Management
- A process is a program in execution.
- consists of the executable program, its data & stack, its Program Counter (PC), Stack Pointer (SP) and other info necessary to run (or restart) the program.
- Process control is an important part of multi-tasking O/S –this involves allocating memory, cpu time, I/O devices, etc
- Modern O/Ss break processes down further – into threads – which are smaller, individually executable pieces of a process. This makes more efficient use of the CPU.
- Information about processes is stored in a Process Table (a data structure of O/S)
Process States
✔ New: The process is being created.
✔ Ready: waiting for a processor to become available
✔ Running: instructions are being executed
✔ Blocked: waiting for some event, e.g. I/O completion
Processes
- A process may create a new process as it runs - this is called forking or spawning a new process.
- The original process is called the parent process, the new one the child process.
- The Unix ps command (try the -ael option) can show information about currently running processes.
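For example (output and supported columns vary slightly between systems):
$ ps -ael | head -5          # long listing of every process, first few lines
$ ps -o pid,ppid,stat,comm   # selected columns: PID, parent PID, state, command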
Process Control Block
Usually there are more processes than processors. Concurrency is achieved by interleaving processes, i.e. allocating each process a fraction of the CPU time.
When a process is interrupted, its current state must be saved so that it can be resumed later. This info is stored in a “Process Control Block”, which forms one entry in the Process Table.
Process Control Block (PCB)
- is an in-memory data structure created by O/S
- is used to identify process
- stores status of process
- stores its 'volatile environment' (e.g. register values)
- A PCB exists for every process in the system.
- A Process Table contains PCBs for all the processes.
- Entries in the Process Table may be linked together to form a list, or stored in an array; each entry in the list (or in the array) is for one PCB.
Scheduling
Long-term (or high-level) scheduler decides which processes will be admitted into the system’s Ready Queue. The decision is based on memory availability and system load. Important in batch systems, but less so in most modern interactive systems.
• Short-term (or low-level) scheduler works with processes already in memory and ready to run. A Dispatcher then decides which one to run next.
High-Level Scheduler
- If there is not enough memory to hold all processes, the high-level scheduler will swap jobs from disk to memory and vice versa.
- Criteria that the high-level scheduler could use may include:
• how long the process has been swapped in/out
• how much CPU time has the process recently had
• how big is the process
• how high is the process priority
• etc
Low-Level Scheduler
- A running process may need to stop for I/O, or is interrupted for some reason. The dispatcher will then choose the next process to run.
- An operating system may use pre-emptive or non-pre-emptive low-level scheduling.
Pre-emptive Vs Non-Pre-emptive Scheduling
• Non-pre-emptive scheduling means the system relies on a (well-written) process itself to relinquish the CPU when it finishes. E.g. Windows 3.1, Windows 95 (16-bit), “Classic” Mac OS, etc.
• Pre-emptive scheduling means the O/S controls how the CPU is shared by the processes. E.g. Unix, Linux, Windows NT/2000/XP/Vista/8/10, Mac OS X, etc.
Non-pre-emptive scheduling
Non-pre-emptive algorithms are more applicable to batch systems. Differ from pre-emptive as processes will only stop executing when they decide to stop.
• Examples of non-pre-emptive algorithms:
- First-in, first-out (FIFO), where processes are executed in the order they arrive
- Shortest job first (SJF), which maximizes throughput, but may result in job starvation
- Priority scheduling, where priorities are assigned based on criteria such as resources requested, or financial criteria
Pre-emptive Scheduling
With pre-emptive scheduling, the computer uses an inbuilt clock to ensure no process runs for too long.
• Pre-emptive scheduling is more common in interactive systems, but involves much more overhead.
• Most modern O/S’s use pre-emptive scheduling. e.g: an internal clock creates interrupts 50-100 times/sec.
• The O/S dispatcher runs at each clock interrupt to decide on next process to execute.
• Different algorithms can be used to achieve maximum efficiency and response.
Pre-emptive scheduling algorithms
Round Robin:
• All processes assigned equal time quantum to run.
• All ready-to-run processes are maintained in circular linked-list, and take turn to use CPU.
How long should a reasonable time quantum be? E.g.:
• If a process switch takes 5 msec, a 20 msec quantum means 25% of CPU time is spent just switching processes.
• If a 500 msec quantum is used – very slow response time for interactive users.
• Usually, a quantum of ~100 msec is used.
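The 25% figure is simply the switch time as a fraction of the quantum; checking the arithmetic in the shell:
$ switch_ms=5; quantum_ms=20
$ echo "$(( 100 * switch_ms / quantum_ms ))% overhead"   # prints: 25% overhead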
Round Robin problems
• Round Robin does not allow definition of “more important processes”, ie. priority
• Round robin also indirectly penalizes processes that frequently use I/O resources, by always returning them to back of queue even if used only small % of quantum. This is because I/O always takes longer to complete, hence such processes have higher chance of waiting/blocking.
Other Scheduling Algorithms
• Priority Scheduling:
• Processes given initial priority level.
• Usually multiple priority classes exist.
• Runnable processes maintained in priority queues, with round robin used within queue.
• To prevent CPU starvation for low priority jobs, may need to temporarily boost priority (e.g. if they have been waiting for long periods). Once process has had its share of CPU time, its priority drops back to normal.
Dynamic Priority Scheduling
• Another variation of priority scheduling is to assign priorities dynamically, using some formula.
• For instance, based on fraction of the last time quantum used (f), priority formula could be 1/f (i.e. more time used now, lesser priority later).
• This would favor interactive users and I/O-bound jobs rather than CPU-bound jobs (interactive and I/O-bound jobs tend to spend more time in blocked states and use less of their CPU time quantum, so the 1/f formula gives them higher priorities).
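A quick tabulation of that (illustrative) 1/f formula for a few usage fractions:
$ awk 'BEGIN { for (f = 0.25; f <= 1.0; f += 0.25) printf "f=%.2f -> priority=%.1f\n", f, 1/f }'
# f=0.25 -> priority=4.0   (used little of its quantum: higher priority next time)
# f=0.50 -> priority=2.0
# f=0.75 -> priority=1.3
# f=1.00 -> priority=1.0   (used the whole quantum: lowest priority next time)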
- Mutual Exclusion - ensuring that non-shareable resources (e.g. printers) are accessed by only one process at a time.
- Synchronization - for processes which must cooperate (e.g. chat programs) with one another.
- Deadlock - when two or more processes want to use a non-shareable resource held by each other.
Process Management
In order to implement mutual exclusion, synchronization and deal with deadlocks, some form of process-coordination is required.
• In addition, processes often need to exchange information.
• Both of these goals are met by Inter-Process Communication (IPC) mechanisms.
Week 4
• Local Area Network (LAN) (room, building): a group of clients and servers that share a circuit
• Backbone Network (BN) (< a few km): high-speed connection between LANs
• Metropolitan Area Network (MAN) (> a few km): connect LANs and BNs across locations
• Wide Area Network (WAN): same as MAN except longer distances
Typical data rates:
• 1 Mbps (million bits per second) from your home to your ISP (Internet Service Provider), 10–20 Mbps in the other direction
• 50–500 Mbps within your WLAN (wireless network)
• 1 Gbps in LANs (local area networks, e.g. a Monash lab)
• 10 Gbps in backbone networks
• Tbps (terabits per second, 10^12) in optical fibre networks
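A handy sanity check when relating these rates to file sizes (which are measured in bytes, not bits) is to divide by 8:
$ echo "$(( 1000000000 / 8 )) bytes per second at 1 Gbps"   # 125000000, i.e. ~125 MB/s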