Flask Source Code: WSGI

2023-02-22 will_7c3c

WSGI

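WSGI is the standard interface between Python web servers (containers) and Python web apps.

The server handles the HTTP layer: seen from outside, the server accepts the client's request and returns the HTTP response.

The processing in between is done by calling the web app.

WSGI is the standard for that call: it specifies the interface the app must expose, what the app must return, and which arguments the server passes in.

This way any app and any server that follow the standard can work together.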

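The requirements on the app are:

it is callable
its result is iterable
it accepts two arguments from the server: environ (the environment dict) and start_response (a function that produces the standard HTTP header)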

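callable is easy to understand: the app is essentially the function the server uses to process an HTTP request, so the server must be able to call it, and the app has to expose a call interface.

iterable needs to be understood together with start_response. Writing the iterable result as data[] for short, data[] is the HTTP body: the server keeps iterating over data[] and transmits each chunk.

As for start_response, it is also a callable. It takes two required arguments, status (the HTTP status) and response_headers (the response headers), which together make up the HTTP header.

Inside the WSGI app, start_response must be called before data[] yields its first chunk, but the server does not send the header right away: it waits until iterating data[] produces the first non-empty object.

In other words, if the header went out immediately with status 200 and iterating data[] then failed, there would be no way to take the 200 back.
That is why the header is only sent once the first non-empty object has been produced.

To put it more precisely: the app hands the header to the server through start_response before it hands over any of data[], but the server does not forward the header to the client at that point; it holds it until body content is available.
The server sends a header with no body only when data[] is exhausted without yielding anything (or when Content-Length is explicitly 0).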
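The deferred-header rule above can be sketched from the server's side. This is a minimal illustration, not how any real server is written: `run_app`, `demo_app`, and the `sent` event list are hypothetical names introduced here, and the "wire" is just a Python list.

```python
def run_app(app, environ):
    """Hypothetical mini 'server': returns the events it would send to the client."""
    sent = []      # what goes on the wire, in order
    pending = []   # header handed over by start_response but not yet sent

    def send_body(chunk):
        if pending:
            # First real output: flush the held header before the body
            sent.append(("headers",) + pending.pop())
        sent.append(("body", chunk))

    def start_response(status, response_headers, exc_info=None):
        # Only store the header; nothing is sent yet
        pending[:] = [(status, response_headers)]
        return send_body  # plays the role of the legacy write() callable

    result = app(environ, start_response)
    try:
        for chunk in result:
            if chunk:  # empty bytestrings do not trigger the header
                send_body(chunk)
        if pending:
            # Iterable exhausted without any body: send the header alone
            sent.append(("headers",) + pending.pop())
    finally:
        if hasattr(result, "close"):
            result.close()
    return sent


def demo_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"", b"Hello"]


events = run_app(demo_app, {})
```

Here the header event is recorded only when `b"Hello"` arrives; the leading empty chunk does not cause anything to be sent.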

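start_response also takes an optional argument, exc_info. When an error occurs while handling the request, the app calls start_response again with this argument set. If the headers are still held inside start_response and have not been sent yet, this call replaces them, for example changing the status code to an error code.
If the headers have already been sent to the client, start_response must re-raise the error so that the server handles it and the application is aborted.

To avoid circular references, a start_response implementation must make sure exc_info holds no reference after the call returns. That is, once start_response is done with exc_info, it should execute

exc_info = None

to release the reference.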
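The exc_info handling described above can be sketched as follows. The structure loosely follows the example CGI gateway in PEP 3333; the factory `make_start_response` and the `headers_set` / `headers_sent` lists are names chosen for this illustration, and a real server would manage that state itself.

```python
import sys


def make_start_response(headers_set, headers_sent, write):
    """Build a start_response that honours exc_info (illustrative sketch)."""

    def start_response(status, response_headers, exc_info=None):
        if exc_info:
            try:
                if headers_sent:
                    # Header already went to the client: re-raise the original error
                    raise exc_info[1].with_traceback(exc_info[2])
            finally:
                exc_info = None  # drop the traceback reference to avoid a cycle
        elif headers_set:
            raise AssertionError("Headers already set!")
        # Header not sent yet: store it, replacing any earlier one
        headers_set[:] = [status, response_headers]
        return write

    return start_response


headers_set, headers_sent = [], []
start_response = make_start_response(headers_set, headers_sent, write=lambda data: None)

start_response("200 OK", [("Content-Type", "text/plain")])
try:
    raise ValueError("something broke while building the body")
except ValueError:
    # Header not sent yet, so the 200 is simply replaced by a 500
    start_response("500 Internal Server Error",
                   [("Content-Type", "text/plain")], sys.exc_info())
```

After this runs, `headers_set[0]` holds the 500 status. Had `headers_sent` been non-empty, the same call would have re-raised the ValueError instead.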

Here is a bare-bones example WSGI program

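def application(environ, start_response):
    status = '200 OK'
    output = b'World!'  # body chunks must be bytes on Python 3
    response_headers = [('Content-type', 'text/plain'),
                        ('Content-Length', str(12))]  # len(b'Hello ') + len(b'World!')
    write = start_response(status, response_headers)
    write(b'Hello ')  # legacy imperative output through the returned callable
    return [output]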

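About this write: I checked, and the object returned by start_response, which I had loosely thought of as "the header", is in fact a callable
of the form write(body_data). PEP 333 provides it only to support the imperative output APIs of existing frameworks.
So in principle the following version is more reasonable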

def application(environ, start_response):
    status = '200 OK'
    output = b'Hello,World!'  # body chunks must be bytes on Python 3
    response_headers = [('Content-type', 'text/plain'),
                        ('Content-Length', str(len(output)))]
    start_response(status, response_headers)
    return [output]

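output is the HTTP body
and the start_response call supplies the header, which is easier to follow
Reference for the above:
wsgiref source code analysis: start_response() (original title: wsgiref 源代码分析 --start_response())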
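To try such an app without a real server, one option is to build a test environ with the standard library's `wsgiref.util.setup_testing_defaults` and pass in a stand-in start_response. The helper `call_wsgi` below is a hypothetical name for this sketch, not part of wsgiref.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    body = b"Hello,World!"
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]


def call_wsgi(app, path="/"):
    """Call a WSGI app once and return (status, headers, body)."""
    environ = {}
    setup_testing_defaults(environ)  # fills in REQUEST_METHOD, SERVER_NAME, wsgi.* keys
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, response_headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = response_headers
        return lambda data: None  # stand-in for write(); this sketch ignores it

    result = app(environ, start_response)
    try:
        body = b"".join(result)  # iterate the body, as a server would
    finally:
        if hasattr(result, "close"):
            result.close()
    return captured["status"], captured["headers"], body


status, headers, body = call_wsgi(application)
```

For serving over real HTTP during development, `wsgiref.simple_server.make_server(host, port, application)` returns a server object whose `serve_forever()` runs the same app.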

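A closer look at WSGI's definition of the app side

callable
returns an iterable
accepts environ and start_response

Does the callable have to be a function? Wouldn't a class, or an object that implements __call__, work just as well?
Does the iterable have to be a basic data structure such as [] or {}? Wouldn't anything that implements __iter__ do?

Accepting the arguments is simpler still.

So under the WSGI rules the app does not have to be a function: it can also be an instance, or the class itself, as shown below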

# 1. The callable is a function
def application(environ, start_response):

   # On Python 3 the body must be bytes, so encode the text
   response_body = ('The request method was %s' % environ['REQUEST_METHOD']).encode('utf-8')

   # HTTP response code and message
   status = '200 OK'

   # The response headers are a list; each name/value pair must be a tuple.
   response_headers = [('Content-Type', 'text/plain'),
                       ('Content-Length', str(len(response_body)))]

   # Call the start_response supplied by the server with the two arguments
   start_response(status, response_headers)

   # The return value must be an iterable
   return [response_body]

# 2. The callable is a class
class AppClass:
    """Here the callable is the AppClass class itself; calling it produces an iterable result.
        Usage looks like:
        for result in AppClass(env, start_response):
             do_something(result)
    """

    def __init__(self, environ, start_response):
        self.environ = environ
        self.start = start_response

    def __iter__(self):
        status = '200 OK'
        response_headers = [('Content-type', 'text/plain')]
        self.start(status, response_headers)
        yield b"Hello world!\n"  # bytes on Python 3

# 3. The callable is an instance
class AppClass:
    """Here the callable is an instance of AppClass. Usage looks like:
        app = AppClass()
        for result in app(environ, start_response):
             do_something(result)
    """

    def __init__(self):
        pass

    def __call__(self, environ, start_response):
        status = '200 OK'
        response_headers = [('Content-type', 'text/plain')]
        # start_response arrives as an argument here; there is no self.start
        start_response(status, response_headers)
        yield b"Hello world!\n"  # bytes on Python 3
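As a quick check that all three forms satisfy the same contract, the sketch below drives a function, a class, and an instance through one caller. The apps are trimmed-down stand-ins with hypothetical names (`func_app`, `ClassApp`, `InstanceApp`), and the fake start_response ignores the write() return value for brevity.

```python
def func_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"function"]


class ClassApp:
    # Calling the class creates the iterable; start_response runs on iteration
    def __init__(self, environ, start_response):
        self.start = start_response

    def __iter__(self):
        self.start("200 OK", [("Content-Type", "text/plain")])
        yield b"class"


class InstanceApp:
    # An instance is callable because the class defines __call__
    def __call__(self, environ, start_response):
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [b"instance"]


def collect(app):
    """Call any WSGI app the same way and return (status, body)."""
    seen = []

    def start_response(status, response_headers, exc_info=None):
        seen.append(status)

    body = b"".join(app({"REQUEST_METHOD": "GET"}, start_response))
    return seen[0], body


results = [collect(app) for app in (func_app, ClassApp, InstanceApp())]
```

The caller never needs to know which kind of object it received; it only calls it with two arguments and iterates the result.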

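Strictly speaking, the standard allows nesting, so we can go one step further.

That is, the server calls an app, and that app in turn calls another app.
The app in the middle is effectively middleware: to the server it looks like an app, and to the app beneath it, it looks like a server.

PEP333 calls this kind of nesting middleware and lists some scenarios for it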
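A minimal sketch of that dual role, assuming a middleware that only appends one response header. `AddHeader`, `inner_app`, and the header name are made up for this illustration.

```python
def inner_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"inner"]


class AddHeader:
    """Looks like an app to the server, and like a server to the wrapped app."""

    def __init__(self, app, name, value):
        self.app = app
        self.extra = (name, value)

    def __call__(self, environ, start_response):
        # Server side of the middleware: hand the inner app a wrapped start_response
        def patched_start_response(status, response_headers, exc_info=None):
            headers = list(response_headers) + [self.extra]
            return start_response(status, headers, exc_info)

        # App side of the middleware: return the inner app's iterable unchanged
        return self.app(environ, patched_start_response)


wrapped = AddHeader(inner_app, "X-Handled-By", "AddHeader")

captured = {}


def fake_start_response(status, response_headers, exc_info=None):
    captured["status"] = status
    captured["headers"] = response_headers
    return lambda data: None  # stand-in for write()


body = b"".join(wrapped({}, fake_start_response))
```

Because `wrapped` is itself a WSGI app, it can be wrapped again, which is what makes the nesting composable.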

Routing a request to different application objects based on the target URL, after rewriting the environ accordingly.
Allowing multiple applications or frameworks to run side by side in the same process
Load balancing and remote processing, by forwarding requests and responses over a network
Perform content postprocessing, such as applying XSL stylesheets

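In plain terms:

Route a request to different application objects by URL (URL routing), after rewriting environ accordingly.
Let multiple applications or web frameworks run side by side in the same process.
Load balancing and remote processing: forward requests and responses over the network.
Post-process (filter) the response content.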

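PEP333 gives a middleware example directly, but it is not very intuitive.
Roughly: it first implements an iterator class LatinIter, then a class Latinator that implements __call__ and wraps both environ handling and start_response,
and that holds the application to call next as self.application.

Finally it uses this Latinator as middleware around foo_app.
The server environment is CGI.

PEP333
The example also demonstrates what to do if something goes wrong while the response header is being handled.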

from piglatin import piglatin

class LatinIter:

    """Transform iterated output to piglatin, if it's okay to do so

    Note that the "okayness" can change until the application yields
    its first non-empty string, so 'transform_ok' has to be a mutable
    truth value.
    """

    def __init__(self, result, transform_ok):
        if hasattr(result, 'close'):
            self.close = result.close
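        self._next = iter(result).__next__  # Python 3 iterator protocol
        self.transform_ok = transform_ok

    def __iter__(self):
        return self

    def __next__(self):  # named next() in the original Python 2 version of this example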
        if self.transform_ok:
            return piglatin(self._next())
        else:
            return self._next()

class Latinator:

    # by default, don't transform output
    transform = False

    def __init__(self, application):
        self.application = application

    def __call__(self, environ, start_response):

        transform_ok = []

        def start_latin(status, response_headers, exc_info=None):

            # Reset ok flag, in case this is a repeat call
            del transform_ok[:]

            for name, value in response_headers:
                if name.lower() == 'content-type' and value == 'text/plain':
                    transform_ok.append(True)
                    # Strip content-length if present, else it'll be wrong
                    response_headers = [(name, value)
                        for name, value in response_headers
                            if name.lower() != 'content-length'
                    ]
                    break

            write = start_response(status, response_headers, exc_info)

            if transform_ok:
                def write_latin(data):
                    write(piglatin(data))
                return write_latin
            else:
                return write

        return LatinIter(self.application(environ, start_latin), transform_ok)


# Run foo_app under a Latinator's control, using the example CGI gateway
from foo_app import foo_app
run_with_cgi(Latinator(foo_app))

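I find the example above not very intuitive, since it demonstrates several rules at once. The next example, which implements routing as middleware, is easier to understand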

class Router(object):
    def __init__(self):
        self.path_info = {}
    def route(self, environ, start_response):
        application = self.path_info[environ['PATH_INFO']]
        return application(environ, start_response)
    def __call__(self, path):
        def wrapper(application):
            self.path_info[path] = application
            return application  # keep the decorated name bound to the function
        return wrapper

router = Router()

## Above is the router middleware; router.route is effectively a WSGI app

#here is the application
@router('/hello')    # call the Router instance to register the function in the path_info dict
def hello(environ, start_response):
    status = '200 OK'
    output = b'Hello'  # bytes on Python 3
    response_headers = [('Content-type', 'text/plain'),
                        ('Content-Length', str(len(output)))]
    write = start_response(status, response_headers)
    return [output]

@router('/world')
def world(environ, start_response):
    status = '200 OK'
    output = b'World!'  # bytes on Python 3
    response_headers = [('Content-type', 'text/plain'),
                        ('Content-Length', str(len(output)))]
    write = start_response(status, response_headers)
    return [output]

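#here run the application
# environ, start_response and write are supplied by the server;
# this fragment only shows what the server does with the result
result = router.route(environ, start_response)
for value in result:
    write(value)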
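The fragment above relies on names that only exist inside a server. A self-contained variant of the same idea might look like the sketch below. The 404 fallback, the `call` helper, and the `routes` attribute name are additions for this illustration and are not part of the original example.

```python
class Router:
    def __init__(self):
        self.routes = {}

    def __call__(self, path):
        # Used as a decorator: @router('/hello')
        def register(application):
            self.routes[path] = application
            return application
        return register

    @staticmethod
    def not_found(environ, start_response):
        body = b"Not Found"
        start_response("404 Not Found", [("Content-Type", "text/plain"),
                                         ("Content-Length", str(len(body)))])
        return [body]

    def route(self, environ, start_response):
        # Dispatch on PATH_INFO; unknown paths fall back to the 404 app
        application = self.routes.get(environ.get("PATH_INFO", "/"), self.not_found)
        return application(environ, start_response)


router = Router()


@router("/hello")
def hello(environ, start_response):
    body = b"Hello"
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]


def call(path):
    """Play the server's part: build environ, capture the status, join the body."""
    seen = {}

    def start_response(status, response_headers, exc_info=None):
        seen["status"] = status
        return lambda data: None  # stand-in for write()

    body = b"".join(router.route({"PATH_INFO": path}, start_response))
    return seen["status"], body


hello_result = call("/hello")
missing_result = call("/nope")
```

Passing `router.route` to a WSGI server would then dispatch real requests the same way.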

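The above comes from the blog post "python wsgi简介" (An Introduction to Python WSGI),
which I find easier to follow.
One more thing: when I first saw this routing implementation I was stunned, and wondered whether routing in Flask is implemented the same way. Haha, it is not.
So why not?
That is an open question, and I need to ask my teacher about it.
Is it a matter of efficiency? Or would it decouple things too much?

That said, with a separated front end and back end, where the back end only develops a RESTful API, the back end does not even have to handle page routing. It only needs to deal with the database and expose the API to the front end; node and vue take care of routing.

Next, we start analyzing the flow.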
