操作系统实验：Lab6 调度器

2018-05-03 本文已影响0人 wenj1997

清华大学操作系统Lab6实验报告
课程主页：http://os.cs.tsinghua.edu.cn/oscourse/OS2018spring
实验指导书：https://chyyuu.gitbooks.io/ucore_os_docs/content/
github：https://github.com/chyyuu/ucore_os_lab

实验目的

理解操作系统的调度管理机制
熟悉 ucore 的系统调度器框架，以及缺省的Round-Robin 调度算法
基于调度器框架实现一个(Stride Scheduling)调度算法来替换缺省的调度算法

实验内容

实验五完成了用户进程的管理，可在用户态运行多个进程。但到目前为止，采用的调度策略是很简单的FIFO调度策略。本次实验，主要是熟悉ucore的系统调度器框架，以及基于此框架的Round-Robin（ RR）调度算法。然后参考RR调度算法的实现，完成Stride Scheduling调度算法。

练习1：加载应用程序并执行

为了完成Lab6的练习1，首先需要对之前的代码做一些修改。
发生时钟中断时，不需要直接在trap_dispatch中进行调度的设置，而是间接使用sched_class中的函数指针proc_tick来完成：

    case IRQ_OFFSET + IRQ_TIMER:
    sched_class_proc_tick(current);
        break;

此外，还需要在对proc_struct中针对lab6添加的接口初始化。在proc.c的alloc_proc函数中：

        proc -> rq = NULL;
        list_init(&(proc -> run_link));
        proc -> time_slice = 0;

完成之后，运行make grade，结果如下。除了priority测试外，其预测是均能通过。

完成练习1的结果

请理解并分析sched_class中各个函数指针的用法，并接合Round Robin 调度算法描述ucore的调度执行过程

通过阅读sched.c中的代码，可以发现实现调度的主要函数schedule实际上调用的是sched_class_enqueue、sched_class_dequeue、sched_class_pick_next、sched_class_proc_tick和sched_init这五个函数，在这五个函数内部，通过sched_class类的实例中的函数指针来调用具体的调度算法。

sched_class类的定义如下：

// The introduction of scheduling classes is borrrowed from Linux, and makes the 
// core scheduler quite extensible. These classes (the scheduler modules) encapsulate 
// the scheduling policies. 
struct sched_class {
    // the name of sched_class
    const char *name;
    // Init the run queue
    void (*init)(struct run_queue *rq);
    // put the proc into runqueue, and this function must be called with rq_lock
    void (*enqueue)(struct run_queue *rq, struct proc_struct *proc);
    // get the proc out runqueue, and this function must be called with rq_lock
    void (*dequeue)(struct run_queue *rq, struct proc_struct *proc);
    // choose the next runnable task
    struct proc_struct *(*pick_next)(struct run_queue *rq);
    // dealer of the time-tick
    void (*proc_tick)(struct run_queue *rq, struct proc_struct *proc);
    /* for SMP support in the future
     *  load_balance
     *     void (*load_balance)(struct rq* rq);
     *  get some proc from this rq, used in load_balance,
     *  return value is the num of gotten proc
     *  int (*get_proc)(struct rq* rq, struct proc* procs_moved[]);
     */
};

可以看到，其中有一个定义了调度算法名称的字符串和五个函数接口：

init：初始化运行队列
enqueue：将进程p加入到运行队列rq中
dequeue：将进程p从运行队列rq中删除
pick_next：返回运行队列中下一个可执行的进程
proc_tick：time tick的处理函数

如果要实现自己的调度算法，需要按照如上的功能完成这五个函数，并构造sched_class类的实例，将这五个函数传给实例中的函数指针，随后通过这个实例即可使用相应的调度算法。

结合Round Robin算法，具体的调度过程如下：

当调用schedule函数时发生调度。首先清除当前进程需要调度的标记，把当前进程放进运行队列中去。

        current->need_resched = 0;
        if (current->state == PROC_RUNNABLE) {
            sched_class_enqueue(current);
        }

sched_class_enqueue函数调用Round Robin算法的RR_enqueue，也就是把当前进程加入到运行队列的最后，将当前进程的时间片时间重置为最大，更新队列中进程数目。

static void
RR_enqueue(struct run_queue *rq, struct proc_struct *proc) {
    assert(list_empty(&(proc->run_link)));
    list_add_before(&(rq->run_list), &(proc->run_link));
    if (proc->time_slice == 0 || proc->time_slice > rq->max_time_slice) {
        proc->time_slice = rq->max_time_slice;
    }
    proc->rq = rq;
    rq->proc_num ++;
}

随后，通过Round Robin调度算法选择下一个要运行的进程。具体地，sched_class_pick_next函数调用Round Robin算法的RR_pick_next接口，

        if ((next = sched_class_pick_next()) != NULL) {
            sched_class_dequeue(next);
        }

Round Robin算法的RR_pick_next将返回运行队列中的第一个进程交给schedule作为下一个要运行的进程。

static struct proc_struct *
RR_pick_next(struct run_queue *rq) {
    list_entry_t *le = list_next(&(rq->run_list));
    if (le != &(rq->run_list)) {
        return le2proc(le, run_link);
    }
    return NULL;
}

如果可以选出下一个要运行的进程，则将这个进程从运行队列中删除。否则，运行idleproc进程，即返回查询是否有新的进程需要运行。sched_class_dequeue函数会调用Round Robin算法的RR_dequeue函数。

        if ((next = sched_class_pick_next()) != NULL) {
            sched_class_dequeue(next);
        }
        if (next == NULL) {
            next = idleproc;
        }

随后通过proc_run函数切换到新进程并进入新进程执行。proc_run的具体切换过程在Lab5中已经完成。

        next->runs ++;
        if (next != current) {
            proc_run(next);
        }

此外，在每一次时钟tick的时候，都会调用Round Robin的RR_proc_tick函数，将当前进程的所占用的时间片剩余时间减一，当时间片耗尽时，设置为需要调度并等待调度。

static void
RR_proc_tick(struct run_queue *rq, struct proc_struct *proc) {
    if (proc->time_slice > 0) {
        proc->time_slice --;
    }
    if (proc->time_slice == 0) {
        proc->need_resched = 1;
    }
}

请在实验报告中简要说明如何设计实现”多级反馈队列调度算法“，给出概要设计，鼓励给出详细设计

多级反馈队列算法可以实现在不同的队列间使用不同的调度算法。
假设进程一共有4个调度优先级，分别为0、1、2、3，其中0位最高优先级，3位最低优先级。为了支持4个不同的优先级，在运行队列中开4个队列，分别命名为rq -> run_list[0..3]。除此之外，在proc_struct中加入priority成员表示该进程现在所处的优先级，初始化为0。

MLFQ_enqueue(struct run_queue *rq, struct proc_struct *proc)：判断proc进程的时间片proc -> time_slice是否为0，如果为0，则proc -> priority += 1，否则不变。根据proc加入到对应优先级的列表中去。时间片的长度也和优先级有关，低优先级的时间片长度设置为高优先级的两倍。

static void
MLFQ_enqueue(struct run_queue *rq, struct proc_struct *proc) {
    assert(list_empty(&(proc->run_link)));
    if (proc -> time_slice == 0 && proc -> priority != 3) {
        ++(proc -> priority);
    }
    list_add_before(&(rq->run_list[proc ->priority]), &(proc->run_link));
    proc->time_slice = (rq->max_time_slice << proc -> priority);
    proc->rq = rq;
    rq->proc_num ++;
}

MLFQ_dequeue(struct run_queue *rq, struct proc_struct *proc)：将proc进程从相应的优先级运行队列中删除。

static void
MLFQ_enqueue(struct run_queue *rq, struct proc_struct *proc) {
    assert(!list_empty(&(proc->run_link)) && proc->rq == rq);
    list_del_init(&(proc->run_link));
    rq->proc_num --;
}

MLFQ_pick_next(struct run_queue *rq)：为了避免优先级较低的进程出现饥饿现象，对每个优先级设置一定的选中概率，高优先级是低优先级选中概率的两倍，然后选出一个优先级，找到这个优先级中的第一个进程返回。

static struct proc_struct *
MLFQ_pick_next(struct run_queue *rq) {
    int p = rand() % (8 + 4 + 2 + 1);
    int priority;
    if (p >= 0 && p < 8) {
        priority = 0;
    } else if (p >= 8 && p < 12) {
        priority = 1;
    } else if (p >= 12 && p < 14) {
        priority = 2;
    } else priority = 3;
    list_entry_t *le = list_next(&(rq->run_list[priority]));
    if (le != &(rq->run_list[priority])) {
        return le2proc(le, run_link);
    } else {
        for (int i = 0; i < 3; ++i) {
            le = list_next(&(rq->run_list[i]));
            if (le != &(rq -> run_list[i])) return le2proc(le, run_link);
        }
    }
    return NULL;
}

MLFQ_proc_tick(struct run_queue *rq, struct proc_struct *proc)：和RR算法相似。

练习2: 实现 Stride Scheduling 调度算法

Stride算法的原理是对于每一个进程，有一个stride值和一个pass值。每次进行调度时，选择stride最小的进程运行，并将这个进程的stride加上pass。pass越小那么被调度的次数就会越多。在实验中，pass依托优先级实现。优先级越大，pass即用一个大常数除以优先级得到的值就越小，也就意味着被调度的次数越多。具体实现如下

首先，需要设置一个大整数，将来的所有pass值都由这个大整数除以进程的优先级得到：

#define BIG_STRIDE  0x7FFFFFFF

初始化时，需要将列表、斜堆清空，并置运行列表中的进程数为0。

static void
stride_init(struct run_queue *rq) {
// (1) init the ready process list: rq->run_list
    list_init(&(rq -> run_list));
// (2) init the run pool: rq->lab6_run_pool
    rq -> lab6_run_pool = NULL;
// (3) set number of process: rq->proc_num to 0       
    rq -> proc_num = 0;
}

将进程加入运行队列时，插入到斜堆中，并将运行队列的进程计数加一，同时需要在进程的数据结构中关联运行队列。

static void
stride_enqueue(struct run_queue *rq, struct proc_struct *proc) {
#if USE_SKEW_HEAP
// (1) insert the proc into rq correctly
    rq -> lab6_run_pool = skew_heap_insert(rq -> lab6_run_pool, &(proc->lab6_run_pool), proc_stride_comp_f);
#else
    assert(list_empty(&(proc->run_link)));
    list_add_before(&(rq->run_list), &(proc->run_link));
#endif
// (2) recalculate proc->time_slice
    if (proc->time_slice == 0 || proc->time_slice > rq->max_time_slice) {
        proc->time_slice = rq->max_time_slice;
    }
// (3) set proc->rq pointer to rq
    proc -> rq = rq;
// (4) increase rq->proc_num
    rq -> proc_num ++;
}

将进程从运行队列移走时，需要将进程从斜堆中删除，并将运行队列的进程计数减一。

static void
stride_dequeue(struct run_queue *rq, struct proc_struct *proc) {
#if USE_SKEW_HEAP
// (1) remove the proc from rq correctly
    rq -> lab6_run_pool = skew_heap_remove(rq -> lab6_run_pool, &(proc -> lab6_run_pool), proc_stride_comp_f);
#else
    assert(!list_empty(&(proc->run_link)) && proc->rq == rq);
    list_del_init(&(proc->run_link));
#endif
// (2) decrease rq -> proc_num by 1
    rq -> proc_num --;
}

选择接下来调度的进程时，如果使用斜堆只需要取出堆顶，随后要更新进程的stride值。

static struct proc_struct *
stride_pick_next(struct run_queue *rq) {
    struct proc_struct *proc;
#if USE_SKEW_HEAP
    if (rq -> lab6_run_pool == NULL) {
        return NULL;
    }
//  (1) get a  proc_struct pointer p  with the minimum value of stride
    proc = le2proc(rq -> lab6_run_pool, lab6_run_pool);
#else
    list_entry_t *le = list_next(&(rq->run_list));
    if (le == &(rq->run_list)) {
        return NULL;
    }
    proc = le2proc(le, run_link);
    while (le != &(rq -> run_list)) {
        struct proc_struct *temp = le2proc(le, run_link);
        if (proc_stride_comp_f(temp, proc) < 0) {
            proc = temp;
        }
        le = list_next(le);
    }
#endif
//  (2) update p;s stride value: p->lab6_stride
    if (proc -> lab6_priority == 0) {
        proc -> lab6_stride += BIG_STRIDE;
    } else {
        proc -> lab6_stride += BIG_STRIDE / proc -> lab6_priority;
    }
    return proc;
}

处理时钟和RR算法一样，如果time slice大于0，则将值减一。否则以为着时间片用完，需要调度。

static void
stride_proc_tick(struct run_queue *rq, struct proc_struct *proc) {
    if (proc->time_slice > 0) {
        proc->time_slice --;
    }
    if (proc->time_slice == 0) {
        proc->need_resched = 1;
    }
}

覆盖的知识点

进程调度的过程
RR算法和Stride算法

与参考答案的区别

练习1：自己完成。
练习2：自己完成，但是没有加使用列表的情况，后补上。

总结

感觉这个实验的难度并不大，但是在完成实验二的时候遇到了玄学问题，盯了一个小时代码之后就莫名其妙的出正确结果了，目前不知道原因。