UTLK - Understanding the Linux Kernel
The kernel version covered is 2.6.11.
Chapter 1 Introduction
Chapter 1.1 Linux Versus Other Unix-Like Kernels
The kernel is preemptive.
Chapter 1.6 An Overview of Unix Kernels
Synchronization and Critical Regions
Kernel preemption disabling
Because the kernel supports preemption, preemption must be disabled before entering a critical region and re-enabled when leaving it. Spin locks in the kernel disable preemption but do not disable interrupts.
Interrupt disabling
On a single CPU, disabling interrupts is an effective protection, though it hurts performance. On multiprocessor systems it is not sufficient.
Spin locks
Chapter 3. Processes
3.1 Processes
3.2 Process Descriptor
A process is described by a task_struct.
Process State
- TASK_RUNNING
- TASK_INTERRUPTIBLE
- TASK_UNINTERRUPTIBLE: rarely used; for example while a driver is probing a device, because interrupting it could leave the device in an unpredictable state.
- TASK_STOPPED
- TASK_ZOMBIE // the process has terminated, but the parent has not called wait() yet, so the kernel cannot reclaim the task_struct and related resources
- EXIT_DEAD // entered after TASK_ZOMBIE once the parent calls wait()
Identifying a Process
Because Linux implements threads as lightweight processes, the threads of one program form a thread group, and the PID of the group leader serves as the PID of all of them: getpid() actually returns current->tgid rather than current->pid.
Process descriptors handling
thread_info and the kernel stack share 8 KB of memory (two pages).
Identifying the current process
current_thread_info() expands to:
movl $0xffffe000, %ecx
andl %esp, %ecx
movl %ecx, p
The current macro expands to:
current_thread_info( )->task
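The same computation in C, a minimal sketch of the 2.6-era x86 inline helper, assuming 8 KB kernel stacks (THREAD_SIZE == 8192, so ~(THREAD_SIZE - 1) is the 0xffffe000 mask above):
/* thread_info sits at the bottom of the two-page kernel stack, so clearing
 * the low 13 bits of %esp yields its address. */
static inline struct thread_info *current_thread_info(void)
{
    struct thread_info *ti;
    __asm__("andl %%esp, %0" : "=r" (ti) : "0" (~(THREAD_SIZE - 1)));
    return ti;
}
#define current (current_thread_info()->task)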
The lists of TASK_RUNNING processes
Linux 2.6 reworked the runqueue: instead of one long list, tasks are split into many lists, one per priority, and each CPU has its own runqueue.
void enqueue_task(struct task_struct *p, struct prio_array *array)
{
    sched_info_queued(p);
    list_add_tail(&p->run_list, array->queue + p->prio); /* append to the per-priority list */
    __set_bit(p->prio, array->bitmap);                   /* mark this priority as non-empty */
    array->nr_active++;
    p->array = array;
}
The code of dequeue_task() is similar; a sketch follows.
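For reference, the 2.6-era dequeue_task() looks roughly like this (a sketch, mirroring the enqueue path above):
static void dequeue_task(struct task_struct *p, struct prio_array *array)
{
    array->nr_active--;
    list_del(&p->run_list);                      /* unlink from the per-priority list */
    if (list_empty(array->queue + p->prio))      /* last task at this priority? */
        __clear_bit(p->prio, array->bitmap);     /* then clear the priority bit */
}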
Relationships Among Processes
The pidhash table and chained lists
The simplified view (figure omitted).
The realistic view (figure omitted): the entries chained through pid_list all carry the same nr, because they belong to the same thread group, and that nr is their PID.
- do_each_task_pid(nr, type, task)
- while_each_task_pid(nr, type, task)
- find_task_by_pid_type(type, nr)
- find_task_by_pid(nr)
How Processes Are Organized
- Processes in the TASK_STOPPED, EXIT_ZOMBIE, or EXIT_DEAD state are not kept on any list.
- TASK_INTERRUPTIBLE and TASK_UNINTERRUPTIBLE processes each sit on a wait queue of their own.
Wait queues
The queue head:
struct __wait_queue_head {
wq_lock_t lock;
struct list_head task_list;
};
typedef struct __wait_queue_head wait_queue_head_t;
The queue element:
struct __wait_queue {
unsigned int flags;
#define WQ_FLAG_EXCLUSIVE 0x01
void *private;
wait_queue_func_t func;
struct list_head task_list;
};
typedef struct __wait_queue wait_queue_t;
There are two kinds of sleepers on a wait queue:
- exclusive processes: flags is 1, inserted with add_wait_queue_exclusive(); only one of them is woken up at a time
- nonexclusive processes: flags is 0, inserted with add_wait_queue(); all of them are woken up
Handling wait queues
- init_waitqueue_head()
- init_waitqueue_entry()
- sleep_on()
- interruptible_sleep_on()
- interruptible_sleep_on_timeout()
void sleep_on(wait_queue_head_t *q)
{
wait_queue_t wait;
init_waitqueue_entry(&wait, current);
current->state = TASK_UNINTERRUPTIBLE;
add_wait_queue(q, &wait); // insert into the wait queue
schedule( ); // go to sleep
remove_wait_queue(q, &wait); // remove from the wait queue after waking up
}
wake_up()=>__wake_up()=>__wake_up_common()
static void __wake_up_common(wait_queue_head_t *q, unsigned int mode,
int nr_exclusive, int sync, void *key)
{
struct list_head *tmp, *next;
list_for_each_safe(tmp, next, &q->task_list) {
wait_queue_t *curr = list_entry(tmp, wait_queue_t, task_list);
unsigned flags = curr->flags;
if (curr->func(curr, mode, sync, key) &&
(flags & WQ_FLAG_EXCLUSIVE) && !--nr_exclusive)
break;
}
}
Here curr->func defaults to try_to_wake_up(), whose main job is to set the process state back to TASK_RUNNING.
Wait-related macros (sleep_on() is essentially obsolete; a usage sketch follows the wake macros below):
- wait_event(wq, condition)
- wait_event_interruptible(wq, condition)
Wake-related macros:
- wake_up
- wake_up_nr: wakes up a given number of threads
- wake_up_all: wakes up all of them
- wake_up_interruptible
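A hedged usage sketch of these macros; the names my_queue, data_ready, consumer and producer are made up for illustration:
#include <linux/wait.h>
#include <linux/sched.h>

static DECLARE_WAIT_QUEUE_HEAD(my_queue);
static int data_ready;

/* consumer: sleep in TASK_INTERRUPTIBLE until data_ready becomes nonzero */
static int consumer(void)
{
    if (wait_event_interruptible(my_queue, data_ready != 0))
        return -ERESTATSYS == -ERESTARTSYS ? -ERESTARTSYS : -ERESTARTSYS; /* woken by a signal */
    data_ready = 0;
    return 0;
}

/* producer: publish the condition first, then wake the sleepers */
static void producer(void)
{
    data_ready = 1;
    wake_up_interruptible(&my_queue);
}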
Process Resource Limits
System calls: getrlimit() and setrlimit().
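A minimal user-space sketch of the two calls, raising the soft open-file limit up to the hard limit:
#include <stdio.h>
#include <sys/resource.h>

int main(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
        perror("getrlimit");
    printf("soft=%lu hard=%lu\n",
           (unsigned long)rl.rlim_cur, (unsigned long)rl.rlim_max);

    rl.rlim_cur = rl.rlim_max;              /* the soft limit may be raised up to the hard limit */
    if (setrlimit(RLIMIT_NOFILE, &rl) != 0)
        perror("setrlimit");
    return 0;
}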
Process Switch
Hardware context switch
Older Linux versions performed the switch in hardware, via a far jump.
Linux 2.4 uses a software switch, i.e. registers are saved and restored by ordinary instructions.
TSS segment: on a transition from User Mode to Kernel Mode, the CPU reads the TSS to obtain the kernel stack address (the esp0/ss0 fields) for the new privilege level.
thread_struct // some of the process's hardware registers are saved here
tss_struct // the in-memory counterpart of the TSS segment
schedule
The local variables prev and next point to the process being switched out and the process being switched in, respectively.
The switch_to macro reaches __switch_to() through a jmp rather than a call, so when __switch_to() executes ret, control continues at next->thread.eip; prev->thread.eip was set to the address of label 1.
3.4 Creating Processes
Three system calls create processes:
- clone(): creates a lightweight process
- fork()
- vfork()
All of them end up calling do_fork(). A user-space sketch of clone() follows.
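A hedged user-space sketch of the difference: clone() with CLONE_VM/CLONE_FS/CLONE_FILES creates a lightweight process sharing those resources, while fork() is roughly clone() with none of these flags. child_fn and child_stack are made-up names for illustration:
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

static char child_stack[8192];          /* the new lightweight process needs its own stack */

static int child_fn(void *arg)
{
    printf("lightweight process, pid=%d\n", getpid());
    return 0;
}

int main(void)
{
    /* share address space, filesystem info and open files with the parent */
    pid_t pid = clone(child_fn, child_stack + sizeof(child_stack),
                      CLONE_VM | CLONE_FS | CLONE_FILES | SIGCHLD, NULL);

    if (pid == -1) {
        perror("clone");
        return 1;
    }
    waitpid(pid, NULL, 0);              /* both paths end up in do_fork() inside the kernel */
    return 0;
}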
user_struct records per-user data.
Chapter 4. Interrupts and Exceptions
4.8 Work Queues (differs from the 2nd edition)
Work queues appeared in 2.6 to replace the task queues of 2.4. A work queue function runs inside a kernel thread, so it has no user-mode address space to access.
/*
* The per-CPU workqueue (if single thread, we always use the first
* possible cpu).
*
* The sequence counters are for flush_scheduled_work(). It wants to wait
* until all currently-scheduled works are completed, but it doesn't
* want to be livelocked by new, incoming ones. So it waits until
* remove_sequence is >= the insert_sequence which pertained when
* flush_scheduled_work() was called.
*/
struct cpu_workqueue_struct {
spinlock_t lock;
long remove_sequence; /* Least-recently added (next to run) */
long insert_sequence; /* Next to add */
struct list_head worklist;
wait_queue_head_t more_work;
wait_queue_head_t work_done;
struct workqueue_struct *wq;
struct task_struct *thread;
int run_depth; /* Detect run_workqueue() recursion depth */
} ____cacheline_aligned;
/*
* The externally visible workqueue abstraction is an array of
* per-CPU workqueues:
*/
struct workqueue_struct {
struct cpu_workqueue_struct *cpu_wq;
const char *name;
struct list_head list; /* Empty if single thread */
};
### Work queue functions
- create_workqueue("foo"): creates a work queue named foo and spawns n worker threads, where n is the number of CPUs.
- create_singlethread_workqueue(): similar, but creates only one worker thread.
- destroy_workqueue(): destroys a work queue.
- queue_work(): inserts a work_struct into the queue.
- flush_workqueue(): takes the workqueue_struct as its argument and blocks the current process until every function pending on that work queue has finished.
The actual execution happens in worker_thread() => run_workqueue(). A usage sketch of this interface follows.
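A hedged usage sketch tying these calls together (2.6-era API, where INIT_WORK takes three arguments and the handler receives a void * data pointer; my_wq, my_work and my_handler are made-up names):
#include <linux/init.h>
#include <linux/module.h>
#include <linux/workqueue.h>

static struct workqueue_struct *my_wq;
static struct work_struct my_work;

static void my_handler(void *data)
{
    printk(KERN_INFO "work ran, data=%p\n", data);
}

static int __init my_init(void)
{
    my_wq = create_workqueue("foo");        /* one worker thread per CPU */
    if (!my_wq)
        return -ENOMEM;
    INIT_WORK(&my_work, my_handler, NULL);  /* 3-argument form of the 2.6.11 API */
    queue_work(my_wq, &my_work);            /* hand the work_struct to a worker thread */
    flush_workqueue(my_wq);                 /* block until everything queued so far has run */
    return 0;
}

static void __exit my_exit(void)
{
    destroy_workqueue(my_wq);
}

module_init(my_init);
module_exit(my_exit);
MODULE_LICENSE("GPL");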
The predefined work queue
The predefined work queue is called events.
4.9 Returning from Interrupts and Exceptions (differs from the 2nd edition)
Returning from an interrupt or exception goes through ret_from_intr() or ret_from_exception().
The entry points
The interrupt-return path already runs with local interrupts disabled, so only the exception-return path has to disable them explicitly. The kernel loads the thread_info pointer into ebp. By looking at what was pushed on the stack when the interrupt or exception was entered, the code decides whether it is returning to Kernel Mode or to User Mode.
Resuming a kernel control path
If kernel preemption is enabled, the code checks whether preempt_count in thread_info is 0; if so, preempt_schedule_irq() is invoked and the kernel is preempted, otherwise execution continues, and finally the interrupt returns with iret.
Checking for kernel preemption
If preemption is needed, preempt_schedule_irq() is called. It adds PREEMPT_ACTIVE to preempt_count, sets the big kernel lock counter lock_depth to -1, enables local interrupts, and calls schedule(). After schedule() returns, it disables local interrupts again, removes the PREEMPT_ACTIVE flag, and restores lock_depth to its previous value.
Chapter 5. Kernel Synchronization
5.1 How the Kernel Services Requests (differs from the 2nd edition)
Kernel Preemption
- In both preemptive and nonpreemptive kernels a process can voluntarily relinquish the CPU; a preemptive kernel additionally allows the current process to be replaced by a higher-priority one while running in Kernel Mode.
- Both preemptive and nonpreemptive Linux switch processes through the switch_to macro; a nonpreemptive kernel simply never forces a switch while the process executes in Kernel Mode.
Example 1: process A is running an exception handler in Kernel Mode when an interrupt arrives and its ISR runs.
- In a preemptive kernel, a switch to another process may happen as soon as the ISR returns.
- In a nonpreemptive kernel, no switch happens when the ISR returns.
Example 2: the current process's time quantum expires while it is in Kernel Mode.
- In a preemptive kernel, a switch to another process happens.
- In a nonpreemptive kernel, no switch happens.
The purpose of a preemptive kernel: make User Mode processes more responsive.
preempt_count: when it is greater than 0, kernel preemption is disabled. This happens when:
- an ISR is running
- the kernel is executing a softirq or tasklet
- preempt_count has been raised explicitly, e.g. preempt_disable() increments it by one (see the sketch below)
On a uniprocessor, a variable that is not accessed by interrupt or exception handlers is safe as long as preemption is disabled, because nothing can interleave with the access.
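A hedged sketch of the third case above (the per-CPU counter name is made up): preempt_disable() protects against preemption and CPU migration, not against interrupt handlers.
#include <linux/percpu.h>
#include <linux/preempt.h>

static DEFINE_PER_CPU(int, my_counter);

static void bump_my_counter(void)
{
    preempt_disable();                  /* preempt_count++, preemption now off */
    __get_cpu_var(my_counter)++;        /* safe: no migration, no preemption */
    preempt_enable();                   /* preempt_count--, may reschedule now */
}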
When Synchronization Is Necessary
Example 1: several interrupt handlers access the same resource; on a single CPU this can be protected simply by disabling interrupts.
Example 2: several system call service routines share a resource; on a single CPU this can be protected simply by disabling kernel preemption.
When Synchronization Is Not Necessary
- An interrupt handler disables its own IRQ line while it runs.
- Interrupt handlers, softirqs, and tasklets are nonpreemptable and non-blocking.
- Kernel code handling an interrupt cannot be interrupted by deferrable functions or by system call service routines.
- Softirqs and tasklets cannot be interleaved on a given CPU.
- The same tasklet is never executed on several CPUs at the same time.
5.2 Synchronization Primitives
Spin lock
In Linux 2.4 on a uniprocessor, acquiring a spin lock does nothing at all, because the 2.4 kernel is nonpreemptive. A usage sketch follows the macro definitions below.
for SMP // copy from asm/spinlock.h
#define spin_lock_init(x) do { *(x) = SPIN_LOCK_UNLOCKED; } while(0)
for UP //copy from include/spinlock.h
#define spin_lock_init(lock) do { } while(0)
#define spin_lock(lock) (void)(lock) /* Not "unused variable". */
#define spin_is_locked(lock) (0)
#define spin_trylock(lock) ({1; })
#define spin_unlock_wait(lock) do { } while(0)
#define spin_unlock(lock) do { } while(0)
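A hedged usage sketch (my_lock and shared_count are made-up names); in 2.6, spin_lock() disables preemption, and the _irqsave variant additionally masks local interrupts:
#include <linux/spinlock.h>

static spinlock_t my_lock = SPIN_LOCK_UNLOCKED;   /* 2.6.11-era static initializer */
static int shared_count;

static void update_from_process_context(void)
{
    spin_lock(&my_lock);                /* preemption disabled; other CPUs spin */
    shared_count++;
    spin_unlock(&my_lock);
}

static void update_shared_with_irqs(void)
{
    unsigned long flags;

    spin_lock_irqsave(&my_lock, flags);       /* also disables local interrupts */
    shared_count++;
    spin_unlock_irqrestore(&my_lock, flags);
}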
Chapter 12. The Virtual Filesystem
Five standard UNIX file types:
- Regular file
- Directory
- Symbolic link
- Device
- Pipe
12.1 The Role of the Virtual Filesystem (VFS)
The filesystem that holds the root directory is the root filesystem; all other filesystems must be mounted on directories below it.
The Common File Model
read() => sys_read() => file->f_op->read(...)
The common file model consists of:
- The superblock object
- The inode object
- The file object
- The dentry object
Comparison of the various caches:
- disk cache: reduces disk I/O
- dentry cache : pathname => dentry
- inode cache
- page cache
- hardware cache
- memory cache : to bypass the Kernel Memory Allocator
12.2 VFS Data Structures
12.7 File Locking
If two processes write to the same file at the same time, the result is unpredictable.
- advisory lock
- mandatory lock: before open(), read(), and write() are carried out, the kernel checks that they do not conflict with an existing lock.
Note: NFSv4 does not support mandatory locks; see the kernel function nfs_flock().
Linux File Locking
- flock() is a system call; it provides advisory locks only
- fcntl() is a system call
- lockf() is not a system call; it is a library routine built on top of fcntl()
- a lease is a special lock set via fcntl(); when another process tries to open the file, the holder is notified and gets /proc/sys/fs/lease-break-time seconds, usually 45, to release it
To use mandatory locks:
- mount the filesystem with the -o mand option, which sets MS_MANDLOCK (the feature is off by default)
- set the set-group bit (SGID) and clear the group-execute permission bit, as in the chmod commands below
- acquire and release the locks with fcntl(); see the sketch after the commands
chmod g+s /mnt/testfile
chmod g-x /mnt/testfile
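A hedged user-space sketch of the last step: taking and releasing a write lock on the first 100 bytes with fcntl(). With the SGID trick above and a -o mand mount, the same lock becomes mandatory:
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    struct flock fl = {
        .l_type   = F_WRLCK,            /* exclusive write lock */
        .l_whence = SEEK_SET,
        .l_start  = 0,
        .l_len    = 100,                /* lock the first 100 bytes */
    };
    int fd = open("/mnt/testfile", O_RDWR);

    if (fd < 0) {
        perror("open");
        return 1;
    }
    if (fcntl(fd, F_SETLKW, &fl) == -1) /* F_SETLKW blocks until the lock is granted */
        perror("fcntl(F_SETLKW)");
    /* ... access the locked region ... */
    fl.l_type = F_UNLCK;
    fcntl(fd, F_SETLK, &fl);            /* release the lock */
    close(fd);
    return 0;
}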
Using leases:
- invoke a fcntl() system call with a F_SETLEASE or F_GETLEASE command.
- an fcntl() invocation with the F_SETSIG command may be used to change the type of signal sent to the lease holder process
System calls and the fl_flags value they set:
- flock(): sets fl->fl_flags to FL_FLOCK
- fcntl(): sets fl->fl_flags to FL_POSIX
- lease: sets fl->fl_flags to FL_LEASE
#define IS_POSIX(fl) (fl->fl_flags & FL_POSIX)
#define IS_FLOCK(fl) (fl->fl_flags & FL_FLOCK)
#define IS_LEASE(fl) (fl->fl_flags & (FL_LEASE|FL_DELEG|FL_LAYOUT))
File-Locking Data Structures
struct file_lock objects form a singly linked list; the i_flock field of struct inode points to the first element of the list.
// The i_flock list is ordered by:
struct file_lock {
struct file_lock *fl_next; /* singly linked list for this inode */
struct hlist_node fl_link; /* node in global lists */
struct list_head fl_block; /* processes blocked waiting on this lock */
fl_owner_t fl_owner;
unsigned int fl_flags;
unsigned char fl_type;
unsigned int fl_pid;
int fl_link_cpu; /* what cpu's list is this on? */
struct pid *fl_nspid;
wait_queue_head_t fl_wait;
struct file *fl_file;
loff_t fl_start;
loff_t fl_end;
};
/* active locks; these are what /proc/locks displays */
static DEFINE_PER_CPU(struct hlist_head, file_lock_list);
//The blocked_hash is used to find POSIX lock loops for deadlock detection.
static DEFINE_HASHTABLE(blocked_hash, BLOCKED_HASH_BITS);
FL_FLOCK Locks
An FL_FLOCK lock is always associated with a file object.
sys_flock()
{
/* for NFS, f.file->f_op->flock is nfs_flock() */
if (f.file->f_op && f.file->f_op->flock)
error = f.file->f_op->flock(f.file,
(can_sleep) ? F_SETLKW : F_SETLK,
lock);
else
error = flock_lock_file_wait(f.file, lock);
}
flock_lock_file_wait() => flock_lock_file()
It finds filp->f_dentry->d_inode->i_flock and adds or removes the lock on that list.
FL_POSIX Locks
An FL_POSIX lock is always associated with a process and with an inode.
sys_fcntl() calls fcntl_getlk() or fcntl_setlk() depending on its arguments.
For NFSv4, filp->f_op->lock is nfs_lock() and filp->f_op->flock is nfs_flock().
int vfs_lock_file(struct file *filp, unsigned int cmd, struct file_lock *fl, struct file_lock *conf)
{
if (filp->f_op && filp->f_op->lock)
return filp->f_op->lock(filp, cmd, fl);
else
return posix_lock_file(filp, fl, conf);
}
Chapter 15. The Page Cache
The inode cache and the dentry cache store metadata; the page cache is a disk cache that stores whole pages of data.
15.1 The Page Cache
If a page is not in the page cache, a new page frame is allocated and the data is copied into it from disk.
Pages in the page cache can belong to the following classes:
- Pages containing data of regular files
- Pages containing directories
- Pages containing data directly read from block device files (skipping the filesystem layer)
- Pages containing data of User Mode processes that have been swapped out on disk
- Pages belonging to files of special filesystems
Each page belongs to an owner, identified by an inode.
read() and write() rely on the page cache, except when the O_DIRECT flag is used, in which case the I/O bypasses the cache and transfers the data directly.
The page cache must satisfy two requirements:
- a page can be found quickly in the cache
- owners of different types (regular files, device files, and so on) can be handled differently
Every page has an owner corresponding to an inode; a page is located by the owner's inode plus an offset within the owner's data.
The address_space Object
It is a data structure embedded in the inode.
- i_mapping in struct inode points to the struct address_space
- mapping in struct page points to the struct address_space
- index in struct page records the page's position within the owner's data
struct address_space {
struct inode *host; /* owner: inode or block_device; points back to the inode that embeds this address_space */
struct radix_tree_root page_tree; /* radix tree of all pages */
spinlock_t tree_lock; /* and lock protecting it */
unsigned int i_mmap_writable;/* count VM_SHARED mappings */
struct rb_root i_mmap; /* tree of private and shared mappings */
struct list_head i_mmap_nonlinear;/*list VM_NONLINEAR mappings */
struct mutex i_mmap_mutex; /* protect tree, count, list */
/* Protected by tree_lock together with the radix tree */
unsigned long nrpages; /* number of total pages */
pgoff_t writeback_index;/* writeback starts here */
const struct address_space_operations *a_ops; /* methods */
unsigned long flags; /* error bits/gfp mask */
struct backing_dev_info *backing_dev_info; /* device readahead, etc */
spinlock_t private_lock; /* for use by the address_space */
struct list_head private_list; /* ditto */
void *private_data; /* ditto */
} __attribute__((aligned(sizeof(long))));
struct inode {
...
struct address_space *i_mapping; // always points to the owner's page cache (address_space)
struct address_space i_data; // if the owner is a regular file, the address_space is embedded here in the inode
...
}
struct page {
...
struct address_space *mapping; // points to the i_mapping of the inode that owns this page
pgoff_t index;
unsigned long private; // if the page holds buffers, points to the first buffer head
}
- If the page's owner is a regular file, the i_data field of the inode is the address_space, and the inode's i_mapping pointer also points to it; in turn, the host field of the address_space points back to this inode.
- If the page holds data read from a block device file, the address_space lives in the inode associated with the device object in the bdev special filesystem.
The Radix Tree
Page Cache Handling Functions
- find_get_page()
- find_get_pages()
- add_to_page_cache()
- remove_from_page_cache()
- read_cache_page()
- find_or_create_page()
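A hedged sketch of how a couple of these functions combine, roughly what the per-page loop of the read path does; get_cached_page is a made-up helper:
#include <linux/pagemap.h>

static struct page *get_cached_page(struct address_space *mapping, pgoff_t index)
{
    struct page *page = find_get_page(mapping, index);      /* takes a reference if found */

    if (!page) {
        page = page_cache_alloc_cold(mapping);               /* not cached: allocate a new page */
        if (!page)
            return NULL;
        if (add_to_page_cache_lru(page, mapping, index, GFP_KERNEL)) {
            page_cache_release(page);                         /* raced with another insert, or no memory */
            return NULL;
        }
        mapping->a_ops->readpage(NULL, page);                 /* ask the owner to read it from disk */
    }
    return page;
}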
15.2 Storing Blocks in the Page Cache
Since 2.4.10 the separate buffer cache is gone; the buffers allocated for blocks now live in buffer pages. A buffer page is associated with buffer head descriptors, whose purpose is to describe where each block's data is located inside the page.
Block Buffers and Buffer Heads
struct buffer_head {
unsigned long b_state; /* buffer state bitmap (see above) */
struct buffer_head *b_this_page; /* circular list of the page's buffers: next buffer head in the same page */
struct page *b_page; /* the page this bh is mapped to */
sector_t b_blocknr; /* start block number */
size_t b_size; /* size of mapping */
char *b_data; /* pointer to data within the page */
struct block_device *b_bdev;
bh_end_io_t *b_end_io; /* I/O completion */
void *b_private; /* reserved for b_end_io */
struct list_head b_assoc_buffers; /* associated with another mapping */
struct address_space *b_assoc_map; /* mapping this buffer is
associated with */
atomic_t b_count; /* users using this buffer_head */
};
b_bdev plus b_blocknr identify the exact position of the block on its block device.
Managing the Buffer Heads
Buffer heads have a dedicated slab allocator.
- alloc_buffer_head()
- free_buffer_head()
- b_count in the buffer head records the reference count
Buffer Pages
The kernel creates buffer pages in two cases:
- When reading or writing pages of a file that are not stored in contiguous disk blocks. In this case the page is inserted into the file's address_space.
- When accessing a single disk block (for instance, when reading a superblock or an inode block). In this case the page is inserted into the address_space of the block device file. Such pages are called block device buffer pages (sometimes blockdev's pages); this chapter concentrates on this case.
For example, when the VFS reads the contents of one block, it allocates a whole page and creates four buffer heads for it, one per block in the page (figure omitted).
Allocating Block Device Buffer Pages
static int
grow_buffers(struct block_device *bdev, sector_t block, int size)
- bdev:block device
- block: block pos
- size: block size
Releasing Block Device Buffer Pages
int try_to_release_page(struct page *page, gfp_t gfp_mask)
Searching Blocks in the Page Cache
struct buffer_head *
__find_get_block(struct block_device *bdev, sector_t block, unsigned size)
It first searches the LRU cache of recently used buffer heads; if the block is not found there, it searches the block device's address_space.
struct buffer_head *
__getblk(struct block_device *bdev, sector_t block, unsigned size)
__getblk() calls __find_get_block(); if that fails, it calls __getblk_slow() to allocate the buffer page.
struct buffer_head *
__bread(struct block_device *bdev, sector_t block, unsigned size)
__bread() calls __getblk(); if the buffer's contents are not up to date, it calls __bread_slow() to read the block from disk. A usage sketch follows.
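A hedged sketch of a typical caller; sb_bread() is the superblock convenience wrapper around __bread(), and read_one_block is a made-up helper name:
#include <linux/buffer_head.h>

static int read_one_block(struct super_block *sb, sector_t blocknr)
{
    struct buffer_head *bh = sb_bread(sb, blocknr); /* __bread() on sb->s_bdev with sb->s_blocksize */

    if (!bh)
        return -EIO;
    /* bh->b_data now points to the up-to-date block contents inside the buffer page */
    /* ... use the data ... */
    brelse(bh);                                     /* drop the reference taken by __bread() */
    return 0;
}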
Submitting Buffer Heads to the Generic Block Layer
int submit_bh(int rw, struct buffer_head *bh)
//for example
submit_bh(READ | REQ_META | REQ_PRIO, bh);
Internally, the actual I/O is issued through a bio.
void ll_rw_block(int rw, int nr, struct buffer_head *bhs[])
//for example
ll_rw_block(WRITE, 1, &bh);
15.3 Writing Dirty Pages to Disk
Chapter 16. Accessing Files
Ways of accessing a file:
- Canonical mode: neither the O_SYNC nor the O_DIRECT flag is set
- Synchronous mode: O_SYNC
- Memory mapping mode: mmap()
- Direct I/O mode: O_DIRECT
- Asynchronous mode
16.1 Reading and Writing a File
- Reading a file is page-based: even when only a few bytes are requested, if the data is not in memory the kernel first allocates a page, adds it to the page cache, and then copies the result from the page into user-mode memory. The usual entry point for reads is generic_file_read().
- Writing a file: the usual entry point is generic_file_write().
Reading from a File
generic_file_read() is used for block-based (disk) files.
generic_file_read() => __generic_file_aio_read() => do_generic_mapping_read()
do_generic_file_read() does the following:
- uses filp->f_mapping as the address_space object
- determines the owner of the address_space object, i.e. the host field of the address_space structure; for a block device this is the inode in the bdev special filesystem rather than filp->f_dentry->d_inode
- splits the read into a number of page-sized reads, and for each page:
  - calls page_cache_readahead() if read-ahead is needed
  - calls find_get_page() to get the page from the address_space object; if it returns NULL, allocates a new page and adds it to the address_space object
  - if the page carries the PG_uptodate flag, the data is already up to date and can be used directly
  - otherwise calls lock_page() to lock the page
  - calls the readpage method to read the data in from disk
  - copies the result back to user-mode memory, which is the job of file_read_actor():
    - it first tries to copy from the page's kernel mapping to user-mode memory
    - if that fails, the page lives in high memory and has to be mapped with kmap()
    - the contents of the newly mapped address are then copied to user space
    - kunmap() releases the mapping
The readpage method for regular files
ext3_readpage() => mpage_readpage()
Analysis of mpage_readpage():
- the function is passed a get_block function, which translates a block position inside the file into the corresponding (logical) position on the block device
- if the page's private field is set, the page already has buffer heads attached and its blocks are not contiguous on disk
- get_block is called for each block in turn to check whether the blocks are contiguous on disk
- if they are contiguous, submit_bio() is called once to read all of them from disk in a single request
- if they are not, block_read_full_page() is called, which reads each block individually
The readpage method for block device files
blkdev_readpage() => block_read_full_page
block_read_full_page(): reads each block in the page one by one.
Read-Ahead of Files
Read-ahead helps sequential reads; it does not help random reads.
16.3 Direct I/O Transfers
16.4 Asynchronous I/O
Asynchronous I/O in Linux 2.6
Asynchronous I/O can be implemented even when the operating system has no kernel support for it: aio_read()/aio_write() can spawn a child process (or thread) that performs the synchronous read()/write() on the caller's behalf.
The Linux 2.6 kernel supports the following AIO system calls (a libaio-based usage sketch follows the list):
- io_setup() : Initializes an asynchronous context for the current process
- io_submit() : Submits one or more asynchronous I/O operations
- io_getevents() : Gets the completion status of some outstanding asynchronous I/O operations
- io_cancel() : Cancels an outstanding I/O operation
- io_destroy() : Removes an asynchronous context for the current process
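A hedged user-space sketch using the libaio wrappers around these system calls (assumes libaio is installed; link with -laio; the file path is arbitrary):
#include <fcntl.h>
#include <libaio.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    io_context_t ctx = 0;
    struct iocb cb, *cbs[1] = { &cb };
    struct io_event ev[1];
    char *buf;
    int fd = open("/etc/hostname", O_RDONLY);

    if (fd < 0 || posix_memalign((void **)&buf, 512, 4096) != 0)
        return 1;
    if (io_setup(8, &ctx) < 0)                  /* create the AIO context (kioctx + AIO ring) */
        return 1;
    io_prep_pread(&cb, fd, buf, 4096, 0);       /* describe one asynchronous read */
    if (io_submit(ctx, 1, cbs) != 1)            /* hand it to the kernel */
        return 1;
    if (io_getevents(ctx, 1, 1, ev, NULL) == 1) /* wait for its completion event */
        printf("read %ld bytes\n", (long)ev[0].res);
    io_destroy(ctx);                            /* tear down the context */
    close(fd);
    free(buf);
    return 0;
}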
The asynchronous I/O context
Before io_submit() can be used, an AIO context must be initialized. An AIO context is associated with a kioctx object; all the kioctx objects are linked together in the ioctx_list. Each kioctx is associated with an AIO ring, a buffer in the user-mode address space that is also visible from Kernel Mode.
io_setup() exists precisely to create this AIO context.
Submitting the asynchronous I/O operations
The io_submit() system call does the following:
- checks that the iocb descriptors are valid
Chapter 18. The Ext2 and Ext3 Filesystems
18.2 Ext2 Disk Data Structures
The first block is reserved for booting and is not managed by Ext2; the remaining blocks are partitioned into block groups of equal size.