Advanced Programming in the UNIX Environment from Scratch (1): Unix Basics

2017-01-10, by 伤口不该结疤

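1. Overview

Advanced Programming in the UNIX Environment describes the services provided by the various versions of the Unix operating system. Which services, exactly? To answer that, we first need to understand the overall architecture of a Unix system.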

1.1 Unix Architecture

The figure below shows the architecture of the Unix operating system:

Figure: Unix operating system architecture

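Hardware: chips, circuit boards, disks, keyboards, and so on.
Kernel: has full access to all hardware, and only the kernel can operate the hardware directly.
System calls: the interface to the kernel.
Shell: a special application program; there are command-line interface (CLI) shells and graphical shells.
Common function libraries: built on top of the system calls.
Applications: can make system calls directly or call functions in the common libraries.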

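As the architecture shows, a program that wants to use the services Unix provides must do so through system calls or library function calls. The book is therefore about how to use the Unix system calls and the common library functions.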

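1.2 System Calls and Library Functions

Library functions fall into two groups. Some are built on top of system calls; malloc, for example, is typically implemented on top of the sbrk(2) system call. Others make no system call at all, such as strcpy and the format-conversion functions.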

Figure: System calls and library functions

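1.3 The Unix Programmer's Manual

You will often see notation such as ls(1), write(2), and group(5).

The number in parentheses is the section of the manual. The Unix Programmer's Manual is divided into 8 sections, each covering a different kind of content.

Figure: Manual sections

Use the man command to look up an entry in a given section: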

man 5 group

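2. Logging In

2.1 Password File (/etc/passwd)

To log in, a user enters a login name and a password. The system looks up the login name in the password file, usually /etc/passwd. Each entry in that file consists of the following fields, separated by colons.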

ckt@ubuntu:~$ cat /etc/passwd
ckt:x:1000:1000:ckt,,,:/home/ckt:/bin/bash

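Login name: ckt
Encrypted password: x (a placeholder; the hashed password itself is kept in a separate file)
User ID: 1000
Group ID: 1000
Comment field: ckt,,,
Home directory: /home/ckt
Shell program: /bin/bash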

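2.2 User Identification

The user ID is the numeric value that identifies a user to the system. The group ID and the supplementary group IDs identify the groups that user belongs to.

Figure: User identification

Inspect the group file /etc/group: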

 ckt@ubuntu:~$ cat /etc/group | grep zhm
 zhm:x:1001:ckt

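According to the manual (man 5 group), each line of /etc/group has the format group_name:password:GID:user_list

Group name: zhm
Password: x
Group ID: 1001
User list: ckt

Sample program
Get the user ID and group ID of the current user.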

#include "apue.h"

int main(int argc, char const *argv[])
{
    int user_id = -1;
    int group_id = -1;
    user_id = getuid();
    group_id = getgid();
    printf("user_id = %d, group_id = %d \n", user_id, group_id );
    return 0;
}

Output
Running the program as user ckt prints 1000 for both user_id and group_id, which matches the entry in /etc/passwd.
Running it again as root prints 0 for both.

ckt@ubuntu:~/work/unix/code$ ./a.out 
user_id = 1000, group_id = 1000 

root@ubuntu:/home/ckt/work/unix/code# ./a.out 
user_id = 0, group_id = 0

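2.3 Shell

A shell is a special application program. There are two kinds: command-line interface (CLI) shells and graphical shells. The shell path in the /etc/passwd entry above, /bin/bash, shows that the shell in use here is the Bourne-Again shell.

Figure: Common shells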

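3. Files and I/O

3.1 Files and Directories

Figure: Files and directories

File descriptors

A file descriptor is a small non-negative integer that the kernel uses to identify a file being accessed by a particular process.

File descriptor sample code
Open a file with open and print the value of the file descriptor.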

#include "apue.h"
#include <fcntl.h>

int main(int argc, char const *argv[])
{
    int file_descr = -1;
    file_descr = open("/home/ckt/work/unix/code/MySignal.c", O_RDONLY);
    printf("file descriptor = %d\n", file_descr);
    return 0;
}

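Output

When open succeeds, it returns a non-negative integer as the file descriptor. The value is 3 in the first run below because descriptors 0, 1, and 2 are already taken by standard input, standard output, and standard error. When the file does not exist, as in the second run, open returns -1.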

ckt@ubuntu:~/work/unix/code$ ./a.out 
file descriptor = 3

ckt@ubuntu:~/work/unix/code$ ./a.out 
file descriptor = -1

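3.2 Input and Output

Figure: Input and output

Unbuffered I/O versus standard I/O
Take read, one of the unbuffered I/O functions, as an example. Its prototype is ssize_t read(int fd, void *buf, size_t count); The caller has to choose the buffer length, count. As the figure below shows, the buffer size affects how long reads and writes take: for the same amount of data, a buffsize that is too small means more calls to read, and the program's system CPU time goes up.

Figure: I/O efficiency (Advanced Programming in the UNIX Environment, Section 3.9)

With the standard I/O functions there is no buffsize to choose. Data passes through the stream's buffer, and the underlying I/O is performed only once the buffer fills.
The standard I/O library buffers precisely so that read and write are called as few times as possible.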

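Figure: Differences between unbuffered I/O and standard I/O

printf is line-buffered - sample program
When standard output is a terminal, printf is line-buffered: the data is written out only when a line is complete.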

#include "apue.h"

int main(int argc, char const *argv[])
{
    printf("This Line has been cached...");
    sleep(3);
    printf("\nEnd by line break...\n");
    return 0;
}

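Output
The first printf does not end with a newline, so its text stays in the buffer and is not shown right away. A line-buffered stream is flushed when a newline is written, not when some time has passed, so the first message appears only after the 3-second sleep, when the second printf writes \n.

Figure: Sample code output

Change the first call to printf("This Line has been cached...\n"); and run again: the first message is printed immediately.

Figure: Output with a newline added to the first printf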

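4. Programs and Processes

4.1 Processes and Programs

Figure: Programs and processes

Process sample program
Create a new process with fork, then print the return value of fork and the current process ID.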

#include "apue.h"

int main(int argc, char const *argv[])
{
    int values = -1;
    printf("current pid = %d\n", getpid());
    values = fork();
    printf("values return by fork : %d, current pid : %d\n", values, getpid()); 
    return 0;
}

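Output
fork is called once but returns twice. In the parent it returns the process ID of the new child, a positive value; in the child it returns 0. If fork fails, it returns -1 to the caller and no child is created.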

ckt@ubuntu:~/work/unix/code$ ./process_test
current pid = 2622
values return by fork : 2623, current pid : 2622
values return by fork : 0, current pid : 2623

4.2 Error Handling

Figure: Error handling

Error-printing functions - sample code
Print error messages with strerror and perror.

#include "apue.h"
#include <errno.h>

int main(int argc, char const *argv[])
{
    printf("The default value of errno : %d\n", errno);
    fprintf(stderr, "%s\n", strerror(EACCES));
    errno = ENOENT;
    perror("printf the last value of errno ");
    return 0;
}

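Output
When a function fails, the error code is stored in errno. The program first prints the initial value of errno, which is 0. It then calls strerror(EACCES) to turn the error constant EACCES into its message string. Finally it sets errno = ENOENT and calls perror, which also turns errno into the matching message. The difference is that strerror takes the error number as an argument and returns the string, whereas perror reads errno itself and writes to standard error the string you pass in, followed by a colon, a space, and the message.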

ckt@ubuntu:~/work/unix/code$ ./a.out 
The default value of errno : 0
Permission denied
printf the last value of errno : No such file or directory

4.3 Signals

Figure: Signals

Function

sighandler_t signal(int signum, sighandler_t handler);

Signal sample code
When the Ctrl+C signal is caught, the custom function MySigHanlder is called. SIGINT is the signal generated by Ctrl+C.

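#include "apue.h"

/* A signal handler receives the signal number as its only argument. */
void MySigHanlder(int signo);

int main(int argc, char const *argv[])
{
    char buf[MAXLINE];
    printf("please input...\n");
    if (signal(SIGINT, MySigHanlder) == SIG_ERR) 
    {
        err_sys("signal error");    /* err_sys is declared in apue.h */
    }
    while (fgets(buf, MAXLINE, stdin) != NULL) 
    {
        printf("please input...\n");
    }
    return 0;
}

void MySigHanlder(int signo) 
{
    /* printf is not async-signal-safe; tolerable only in a small demo */
    printf("\nsignal catch by MySigHanlder\n");
}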

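Catching Ctrl+C - result

Figure: Catching Ctrl+C

Not catching Ctrl+C - result
If the call signal(SIGINT, MySigHanlder) is commented out, the signal gets the system default action, which terminates the program.

Figure: Not catching Ctrl+C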

5. Unix Time

Figure: Unix time

Use the time command to see how long ls takes to run:

ckt@ubuntu:~$ time ls
Android  AndroidStudioProjects  android-x86  ARM_Compiler_5   

real    0m0.006s
user    0m0.000s
sys     0m0.006s

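FreeMind

The complete mind map, organized in the same order as the book, is attached below.

Figure: Unix basics

References

Advanced Programming in the UNIX Environment, 3rd Edition
Real time, user time, and sys time in the output of the Linux time command
Usage of perror
UNIX time
sbrk(2)
Exploring Linux: I/O efficiency
Differences and connections between buffered and unbuffered I/O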