Processes and Threads


What is a process?

An executing instance of a program is called a process. (A process is a collection of resources.)
A program cannot run by itself; it runs only after it has been loaded into memory and the system has allocated resources for it, and such an executing program is what we call a process. The difference between a program and a process: a program is a collection of instructions, a static description of what the process will execute; a process is one execution of that program, a dynamic concept.

With multiprogramming, several programs can be loaded into memory at the same time and, under the operating system's scheduling, executed concurrently. It is this design that greatly improves CPU utilization. Processes give every user the feeling of having the CPU to themselves; the process abstraction was introduced precisely to support multiprogramming on the CPU.

What is a thread?

A thread is the smallest unit of execution that the operating system can schedule. It is contained within a process and is the actual unit of work inside it. A thread is a single sequential flow of control within a process; a process can run multiple threads concurrently, each performing a different task.

A thread is an execution context, which is all the information a CPU needs to execute a stream of instructions.

Suppose you're reading a book, and you want to take a break right now, but you want to be able to come back and resume reading from the exact point where you stopped. One way to achieve that is by jotting down the page number, line number, and word number. So your execution context for reading a book is these 3 numbers.

If you have a roommate, and she's using the same technique, she can take the book while you're not using it, and resume reading from where she stopped. Then you can take it back, and resume it from where you were.

Threads work in the same way. A CPU is giving you the illusion that it's doing multiple computations at the same time. It does that by spending a bit of time on each computation. It can do that because it has an execution context for each computation. Just like you can share a book with your friend, many tasks can share a CPU.

On a more technical level, an execution context (therefore a thread) consists of the values of the CPU's registers.

Last: threads are different from processes. A thread is a context of execution, while a process is a bunch of resources associated with a computation. A process can have one or many threads.

Clarification: the resources associated with a process include memory pages (all the threads in a process have the same view of the memory), file descriptors (e.g., open sockets), and security credentials (e.g., the ID of the user who started the process).

Differences between processes and threads
  1. Threads share the address space of the process that created them; processes have their own address space (see the sketch after this list).
  2. Threads have direct access to the data segment of their process; processes have their own copy of the parent process's data segment.
  3. Threads can communicate directly with other threads of their process; processes must use inter-process communication to talk to sibling processes.
  4. New threads are easily created; new processes require duplication of the parent process.
  5. Threads can exercise considerable control over threads of the same process; processes can only exercise control over child processes.
  6. Changes to the main thread (cancellation, priority change, etc.) may affect the behavior of the other threads of the process; changes to the parent process do not affect child processes.
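As a small illustration of point 1 (a sketch using only the standard library; the counter variable is invented for the example): a global modified by a thread is visible back in the main thread, while a child process only changes its own copy of the data.

import threading, multiprocessing

counter = 0

def bump():
    global counter
    counter += 1

if __name__ == '__main__':
    t = threading.Thread(target=bump)
    t.start()
    t.join()
    print('after thread:', counter)    # 1 - the thread changed the shared global

    p = multiprocessing.Process(target=bump)
    p.start()
    p.join()
    print('after process:', counter)   # still 1 - the child process only changed its own copy

The examples that follow show the two ways the threading module lets you create a thread: direct invocation and subclassing Thread.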
# Direct invocation: pass the worker function to Thread via target
import threading
import time

def show(arg):
    time.sleep(2)
    print('thread %s' % arg)

for i in range(10):
    t = threading.Thread(target=show, args=(i,))
    t.start()

print('main thread stop')   # printed right away; the worker threads are still sleeping
# Inheritance-style invocation: subclass threading.Thread and override run()
import threading
import time

class MyThread(threading.Thread):
    def __init__(self, n):
        super(MyThread, self).__init__()
        self.n = n

    def run(self):
        time.sleep(2)
        print('thread %s' % self.n)

if __name__ == '__main__':
    for i in range(10):
        t = MyThread(i)
        t.start()
    print('main thread stop', threading.current_thread(), threading.active_count())
Join & Daemon
import threading
import time

def show(arg):
    time.sleep(2)
    print('thread %s' % arg)

start_time = time.time()
t_objs = []
for i in range(10):
    t = threading.Thread(target=show, args=(i,))
    t.start()
    t_objs.append(t)  # collect the thread objects

for t in t_objs:    # wait for every thread to finish
    t.join()

print('main thread stop')
print("cost:", time.time() - start_time)

Daemon threads: when the main thread exits, the program does not wait for daemon threads to finish; it exits immediately.

import threading
import time

def show(arg):
    print(arg)
    time.sleep(2)
    print('thread %s' % arg)   # usually never printed: the process exits before the sleep ends

start_time = time.time()

for i in range(10):
    t = threading.Thread(target=show, args=(i,))
    t.daemon = True     # mark as a daemon thread before starting it
    t.start()
print('main thread stop')

Python GIL (Global Interpreter Lock)

In CPython, the global interpreter lock, or GIL, is a mutex that prevents multiple native threads from executing Python bytecodes at once. This lock is necessary mainly because CPython’s memory management is not thread-safe. (However, since the GIL exists, other features have grown to depend on the guarantees that it enforces.)
In other words, no matter how many threads you start, Python lets only one of them execute at any given moment; the threads may well be scheduled onto different CPUs, but only one runs Python bytecode at a time.
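A rough way to observe this (a sketch; N is an arbitrary number and the timings depend on the machine): a CPU-bound function run in two threads usually takes about as long as running it twice in a row, because only one thread can hold the GIL at a time.

import threading, time

def count_down(n):
    while n > 0:
        n -= 1

N = 10000000

start = time.time()
count_down(N)
count_down(N)
print('sequential :', time.time() - start)

start = time.time()
t1 = threading.Thread(target=count_down, args=(N,))
t2 = threading.Thread(target=count_down, args=(N,))
t1.start()
t2.start()
t1.join()
t2.join()
print('two threads:', time.time() - start)   # on CPython, typically no faster than the sequential run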

Thread locks (mutexes)

Threads are scheduled unpredictably, and a thread may be switched out after executing only a few instructions. When several threads modify the same piece of data at the same time, the result can be dirty data. A thread lock (mutex) solves this: once a thread holds the lock, other threads cannot enter the locked section until it is released, so the critical section runs serially.

import time
import threading


def addNum():
    global num  # each thread reads and writes this shared global
    print('--get num:', num)
    # lock.acquire()  # acquiring here would serialize the whole function; keep the locked region as small as possible
    time.sleep(3)
    lock.acquire()
    num -= 1  # the lock keeps this read-modify-write from interleaving with other threads
    lock.release()

num = 100  # shared variable
thread_list = []
lock = threading.Lock()
for i in range(100):
    t = threading.Thread(target=addNum)
    t.start()
    thread_list.append(t)

for t in thread_list:  # wait for all threads to finish
    t.join()

print('final num:', num)

Recursive locks (RLock)

Put simply, a recursive lock lets a lock be acquired again inside code that already holds it: a child lock nested inside an outer lock. An RLock allows this; a plain Lock would deadlock (see the small sketch after the example below).

import threading, time

def run1():
    print("grab the first part data")
    lock.acquire()
    global num
    num += 1
    lock.release()
    return num

def run2():
    print("grab the second part data")
    lock.acquire()
    global num2
    num2 += 1
    lock.release()
    return num2

def run3():
    lock.acquire()   # outer acquire: run1 and run2 will acquire the same lock again
    res = run1()
    print('--------between run1 and run2-----')
    res2 = run2()
    lock.release()
    print(res, res2)

if __name__ == '__main__':
    num, num2 = 0, 0
    lock = threading.RLock()
    for i in range(10):
        t = threading.Thread(target=run3)
        t.start()

    while threading.active_count() != 1:   # busy-wait until only the main thread is left
        print(threading.active_count())
    else:
        print('----all threads done---')
        print(num, num2)
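A minimal sketch of why an RLock is needed here (standard library only): the thread that already holds an RLock may acquire it again, which is exactly what happens when run3 calls run1 and run2 while holding the lock; a plain Lock would block forever in the same situation.

import threading

rlock = threading.RLock()
rlock.acquire()
rlock.acquire()   # the same thread may re-acquire an RLock
rlock.release()
rlock.release()   # it must be released as many times as it was acquired

# lock = threading.Lock()
# lock.acquire()
# lock.acquire()  # a plain Lock would deadlock here: the second acquire never returns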

Semaphores

Semaphore: lets a fixed number of threads access the shared resource at the same time.

import threading, time

def run(n):
    semaphore.acquire()
    time.sleep(1)
    print("run the thread: %s\n" %n)
    semaphore.release()

if __name__ == '__main__':
    semaphore = threading.BoundedSemaphore(3)   # allow at most 3 threads into the critical section at once
    for i in range(12):
        t = threading.Thread(target=run, args=(i,))
        t.start()

    while threading.active_count() != 1:
        pass  # busy-wait until only the main thread remains
    else:
        print('----all threads done----')

Events: an Event object is used to coordinate two or more threads.

Thread events are typically used by one thread (often the main thread) to control the execution of others. An Event mainly provides three methods: set, wait and clear.

How it works: the Event holds an internal flag. While the flag is False, event.wait() blocks; once the flag is True, event.wait() no longer blocks.
• clear: set the flag to False
• set: set the flag to True

import threading, time

event = threading.Event()

def light():
    count = 0
    event.set()
    while True:
        if count < 5:
            print("\033[1;42mGreen light is on..\033[0m")
        elif count >= 5 and count < 10:
            event.clear()
            print("\033[1;41mRed light is on.\033[0m")
        else:
            count = 0
            event.set()
            print("\033[1;42mGreen light is on..\033[0m")
        time.sleep(1)
        count += 1

def car(name):
    while True:
        if event.is_set():   # green light: the event flag is set
            print("[%s] is running" %name)
            time.sleep(1)
        else:
            print("[%s] sees red light, it's waiting" % name)
            event.wait()
            # print("\033[1;34m[%s] sees green light is on, it's keep going." % name)

lights = threading.Thread(target=light)
lights.start()

car1 = threading.Thread(target=car, args=("Jeep",))
car1.start()

Queues (queue) and the producer-consumer model

Queues

Purpose: decoupling and improved efficiency.

queue instantiation (the three orderings are compared in the sketch below):

class queue.Queue(maxsize=0) # FIFO: first in, first out
class queue.LifoQueue(maxsize=0) # LIFO: last in, first out
class queue.PriorityQueue(maxsize=0) # items come out in priority order (lowest value first)
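A quick sketch comparing the three orderings (hand-made data, standard library only):

import queue

fifo = queue.Queue()
lifo = queue.LifoQueue()
prio = queue.PriorityQueue()

for x in (3, 1, 2):
    fifo.put(x)
    lifo.put(x)
    prio.put(x)

print([fifo.get() for _ in range(3)])  # [3, 1, 2]  first in, first out
print([lifo.get() for _ in range(3)])  # [2, 1, 3]  last in, first out
print([prio.get() for _ in range(3)])  # [1, 2, 3]  lowest value (highest priority) first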

Common methods (q = queue.Queue()):
q.qsize() returns the approximate size of the queue
q.empty() returns True if the queue is empty, otherwise False
q.full() returns True if the queue is full, otherwise False
q.full() corresponds to the maxsize given at construction
q.get([block[, timeout]]) removes and returns an item; timeout is how long to wait
q.get_nowait() is equivalent to q.get(False), i.e. non-blocking
q.put(item[, block[, timeout]]) puts an item into the queue; timeout is how long to wait
q.put_nowait(item) is equivalent to q.put(item, False)
q.task_done() signals that an item previously fetched with get() has been fully processed
q.join() blocks until every item in the queue has been fetched and marked done (see the sketch after this list)
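A minimal sketch of the task_done()/join() pairing (the worker function and the None sentinel are conventions invented for this example): q.join() blocks until every item that was put into the queue has been fetched and marked done.

import queue, threading

q = queue.Queue()

def worker():
    while True:
        item = q.get()
        if item is None:      # sentinel value: time to stop
            break
        print('processing', item)
        q.task_done()         # tell the queue this item is finished

t = threading.Thread(target=worker)
t.start()

for i in range(5):
    q.put(i)

q.join()       # blocks until all five items have been marked done
q.put(None)    # ask the worker to exit
t.join()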

The producer-consumer model

What is the producer-consumer pattern?

The producer-consumer pattern uses a container to break the tight coupling between producers and consumers. They never communicate with each other directly; they communicate through a blocking queue. After producing a piece of data, a producer does not wait for a consumer to handle it but simply drops it into the queue; a consumer does not ask a producer for data but takes it straight from the queue. The blocking queue acts as a buffer that balances the processing speed of producers and consumers.

import threading, queue
import time

q = queue.Queue()

def producer(name):
    count = 1
    while True:
        print('[%s] produced bone %s' % (name, count))
        q.put('bone %s' % count)
        count += 1
        time.sleep(0.5)

def consumer(name):
    while True:
        print('[%s] got [%s] and ate it...' % (name, q.get()))
        time.sleep(3)

p = threading.Thread(target=producer, args=('Alex',))
c1 = threading.Thread(target=consumer, args=('dog1',))
c2 = threading.Thread(target=consumer, args=('dog2',))
p.start()
c1.start()
c2.start()