Studying the source code of Django's SessionMiddleware

2018-07-13  vckah

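Let's start with a question: how does the backend get a session into the browser?
Django attaches the session to the request object as request.session, and when the response goes out its headers carry Set-Cookie. There are usually two of these headers, one for sessionid and one for csrftoken. Both are key-value pairs that carry some extra attributes.
Django's middleware wraps the session in an object for us, which makes life very easy for developers. So let's look at that layer of wrapping.
In PyCharm, first jump into SessionMiddleware: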

class SessionMiddleware(MiddlewareMixin):
    def __init__(self, get_response=None):
        xxx
    def process_request(self, request):
        xxx
    def process_response(self, request, response):
        xxx

Note: xxx stands for omitted code. The class implements two methods, process_request and process_response. These are two of the five hook methods that a Django middleware can implement.
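
To see how the two hooks fit around a view, here is a minimal sketch in plain Python. HookMiddleware is a hypothetical stand-in written for this post, not Django's MiddlewareMixin, and the request is just a dict, so treat it as an illustration of the calling order only.

```python
class HookMiddleware:
    """Hypothetical stand-in for a middleware base class (not Django's)."""

    def __init__(self, get_response=None):
        self.get_response = get_response

    def __call__(self, request):
        response = None
        # Run the request hook first, if the subclass defines one.
        if hasattr(self, "process_request"):
            response = self.process_request(request)
        # No early response from the hook: call the next layer (the view).
        if response is None:
            response = self.get_response(request)
        # Give the response hook a chance to adjust the response.
        if hasattr(self, "process_response"):
            response = self.process_response(request, response)
        return response


class TraceMiddleware(HookMiddleware):
    def process_request(self, request):
        request["trace"].append("process_request")

    def process_response(self, request, response):
        request["trace"].append("process_response")
        return response


def view(request):
    request["trace"].append("view")
    return {"status": 200}


request = {"trace": []}
response = TraceMiddleware(view)(request)
print(request["trace"])
```

The trace shows process_request running before the view and process_response after it, which is the order the rest of this post relies on.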

# Initializer
def __init__(self, get_response=None):
    self.get_response = get_response
    engine = import_module(settings.SESSION_ENGINE)
    self.SessionStore = engine.SessionStore

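The initializer first stores get_response (None by default) on the instance. engine is the module named by settings.SESSION_ENGINE, which by default is the db backend. Stepping into settings shows that it is a LazySettings object. It does not define SESSION_ENGINE itself, and neither does its parent class LazyObject, so let's look at LazySettings in more detail: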

class LazySettings(LazyObject):
    def _setup(self, name=None):
        settings_module = os.environ.get(ENVIRONMENT_VARIABLE)
        if not settings_module:
            desc = ("setting %s" % name) if name else "settings"
            raise ImproperlyConfigured(
                "Requested %s, but settings are not configured. "
                "You must either define the environment variable %s "
                "or call settings.configure() before accessing settings."
                % (desc, ENVIRONMENT_VARIABLE))

        self._wrapped = Settings(settings_module)
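
The idea of "the real object is only built on first use" can be sketched without Django. LazyConfig and load_settings below are hypothetical names made up for this post, and the real LazyObject is more elaborate, so this is only a rough model of the lazy lookup.

```python
from types import SimpleNamespace


class LazyConfig:
    """Hypothetical lazy proxy, loosely modelled on the LazySettings idea."""

    def __init__(self, loader):
        self._loader = loader
        self._wrapped = None  # the real settings object, built on demand

    def __getattr__(self, name):
        # __getattr__ only runs when normal lookup fails, which is the case
        # for names such as SESSION_ENGINE that live on the wrapped object.
        if self._wrapped is None:
            self._wrapped = self._loader()
        return getattr(self._wrapped, name)


calls = []


def load_settings():
    calls.append("loaded")
    return SimpleNamespace(SESSION_ENGINE="django.contrib.sessions.backends.db")


settings = LazyConfig(load_settings)
loads_before_access = len(calls)        # nothing loaded yet
engine_path = settings.SESSION_ENGINE   # first access triggers the load
engine_path_again = settings.SESSION_ENGINE
print(loads_before_access, len(calls), engine_path)
```

The loader runs once, on the first attribute access, and later accesses reuse the wrapped object.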

_setup stores a Settings object in _wrapped. Stepping into Settings:


class Settings(object):
    def __init__(self, settings_module):
        # update this dict from global settings (but only for ALL_CAPS settings)
        for setting in dir(global_settings):
            if setting.isupper():
                setattr(self, setting, getattr(global_settings, setting))

        # store the settings module in case someone later cares
        self.SETTINGS_MODULE = settings_module

        mod = importlib.import_module(self.SETTINGS_MODULE)

        tuple_settings = (
            "INSTALLED_APPS",
            "TEMPLATE_DIRS",
            "LOCALE_PATHS",
        )
        self._explicit_settings = set()
        for setting in dir(mod):
            if setting.isupper():
                setting_value = getattr(mod, setting)

                if (setting in tuple_settings and
                        not isinstance(setting_value, (list, tuple))):
                    raise ImproperlyConfigured("The %s setting must be a list or a tuple. " % setting)
                setattr(self, setting, setting_value)
                self._explicit_settings.add(setting)
        # xxxx more code omitted

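Notice the name global_settings. Following it shows that it is just a .py file that defines default values for all kinds of settings, much like a project's own settings file. This is where SESSION_ENGINE finally turns up:
SESSION_ENGINE = 'django.contrib.sessions.backends.db'. As you can see, it names the module that stores sessions in the database.
As an aside, the purpose of this code is to collect all the settings so that they can later be read as settings.NAME. You can see that it relies heavily on the introspection helpers getattr and setattr.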
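
The getattr/setattr pattern is easy to try out in isolation. The sketch below uses two throwaway module objects in place of global_settings and a project settings file; MergedSettings and both module names are hypothetical and only mimic the two loops shown above.

```python
import types

# Stand-in for global_settings: the defaults.
defaults = types.ModuleType("fake_global_settings")
defaults.SESSION_COOKIE_NAME = "sessionid"
defaults.SESSION_SAVE_EVERY_REQUEST = False
defaults.helper = "lowercase names are skipped"

# Stand-in for the project's settings module: the overrides.
project = types.ModuleType("fake_project_settings")
project.SESSION_SAVE_EVERY_REQUEST = True


class MergedSettings:
    """Hypothetical sketch: copy ALL_CAPS names, defaults first."""

    def __init__(self, defaults_module, project_module):
        self._explicit_settings = set()
        for name in dir(defaults_module):
            if name.isupper():
                setattr(self, name, getattr(defaults_module, name))
        for name in dir(project_module):
            if name.isupper():
                setattr(self, name, getattr(project_module, name))
                self._explicit_settings.add(name)


merged = MergedSettings(defaults, project)
print(merged.SESSION_COOKIE_NAME, merged.SESSION_SAVE_EVERY_REQUEST)
```

Defaults are copied first and the project values overwrite them, which is why a setting you never define still has a value.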
OK, back to SessionMiddleware. Let's look at import_module:

def import_module(name, package=None):
    if name.startswith('.'):
        if not package:
            raise TypeError("relative imports require the 'package' argument")
        level = 0
        for character in name:
            if character != '.':
                break
            level += 1
        name = _resolve_name(name[level:], package, level)
    __import__(name)
    return sys.modules[name]

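This imports a module from a dotted path given as a string and returns the module object. Attributes such as SessionStore are then read off the returned module.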
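
A quick way to get a feel for this is to try the standard library's importlib.import_module on a stdlib package. load_attribute below is a hypothetical helper written for this post that adds the "module path plus attribute name" step on top.

```python
from importlib import import_module


def load_attribute(dotted_path):
    """Hypothetical helper: import 'pkg.mod.attr' and return attr."""
    module_path, _, attr_name = dotted_path.rpartition(".")
    module = import_module(module_path)
    return getattr(module, attr_name)


# Import modules by their dotted path, given as plain strings.
json_module = import_module("json")
decoder_module = import_module("json.decoder")

# Import a module and then pull one attribute off it.
JSONDecoder = load_attribute("json.decoder.JSONDecoder")
print(json_module.__name__, decoder_module.__name__, JSONDecoder.__name__)
```

This mirrors what the initializer does: import the module named in a string, then read SessionStore off it.
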
Then there is SessionStore, a class whose job is to handle the session operations.
Now let's look at process_request:

def process_request(self, request):
    session_key = request.COOKIES.get(settings.SESSION_COOKIE_NAME)
    request.session = self.SessionStore(session_key)

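process_request runs after the request has been received and before the URL is resolved.
Here session_key is simply the sessionid value taken from the cookies.
SESSION_COOKIE_NAME is defined in the global_settings.py file mentioned above: SESSION_COOKIE_NAME = 'sessionid'. The default for any setting we use later can be found there.
request.session is a SessionStore object. Stepping in: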

class SessionStore(SessionBase):
    """
    Implements database session store.
    """
    def __init__(self, session_key=None):
        super(SessionStore, self).__init__(session_key)
    # xxx more code omitted

Next, step into SessionBase:

class SessionBase(object):
    """
    Base class for all Session classes.
    """
    TEST_COOKIE_NAME = 'testcookie'
    TEST_COOKIE_VALUE = 'worked'

    __not_given = object()

    def __init__(self, session_key=None):
        self._session_key = session_key
        self.accessed = False
        self.modified = False
        self.serializer = import_string(settings.SESSION_SERIALIZER)

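This is mostly plain assignment. Note SESSION_SERIALIZER, which global_settings.py defines as SESSION_SERIALIZER = 'django.contrib.sessions.serializers.JSONSerializer'. If you have used DRF, the idea of a serializer will be familiar: it handles serialization.
When working with a session we usually treat it as a dictionary and assign to it like one. Assigning to a key calls __setitem__, and reading a key calls __getitem__.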

def __getitem__(self, key):
    return self._session[key]
def _get_session(self, no_load=False):
    """
    Lazily loads session from storage (unless "no_load" is True, when only
    an empty dict is stored) and stores it in the current instance.
    """
    self.accessed = True
    try:
        return self._session_cache
    except AttributeError:
        if self.session_key is None or no_load:
            self._session_cache = {}
        else:
            self._session_cache = self.load()
    return self._session_cache
_session = property(_get_session)

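Note the use of property here, which exposes a method as an attribute (it is called as a plain function here rather than applied as a decorator). On the first visit to the server there is no _session_cache attribute yet, session_key is None and no_load is False, so self._session_cache = {} is initialized. In short, on the first visit an empty session dictionary is created.
Now let's look at __setitem__: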

def __setitem__(self, key, value):
    self._session[key] = value
    self.modified = True
def _get_session(self, no_load=False):
    """
    Lazily loads session from storage (unless "no_load" is True, when only
    an empty dict is stored) and stores it in the current instance.
    """
    self.accessed = True
    try:
        return self._session_cache
    except AttributeError:
        if self.session_key is None or no_load:
            self._session_cache = {}
        else:
            self._session_cache = self.load()
    return self._session_cache
_session = property(_get_session)

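This is almost the same as above. An assignment also goes through _get_session first: if _session_cache already exists the value is written straight into it, otherwise the cache is created and then written to.
Note that nothing has been saved to the database at this point.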
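
The read and write paths above can be tried out with a small stand-alone class. LazySession, its storage dict and load_count are hypothetical names for this sketch; it copies the shape of the code above but is not Django's SessionBase.

```python
class LazySession:
    """Hypothetical sketch of lazy loading plus accessed/modified flags."""

    def __init__(self, session_key=None, storage=None):
        self.session_key = session_key
        self.accessed = False
        self.modified = False
        self._storage = storage if storage is not None else {}
        self.load_count = 0  # how many times the backing store was read

    def load(self):
        self.load_count += 1
        return dict(self._storage.get(self.session_key, {}))

    def _get_session(self):
        self.accessed = True
        try:
            return self._session_cache
        except AttributeError:
            # First access: start empty, or load once from storage.
            if self.session_key is None:
                self._session_cache = {}
            else:
                self._session_cache = self.load()
        return self._session_cache

    _session = property(_get_session)

    def __getitem__(self, key):
        return self._session[key]

    def __setitem__(self, key, value):
        self._session[key] = value
        self.modified = True


storage = {"abc": {"user_id": 7}}

returning = LazySession("abc", storage)
first_read = returning["user_id"]
second_read = returning["user_id"]   # served from the cache

fresh = LazySession()
fresh["cart"] = [1]                  # only the in-memory cache changes
print(first_read, returning.load_count, fresh.modified, storage)
```

Reading sets accessed, writing sets modified as well, and the storage dict stays untouched until something explicitly saves.
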
Now let's look at process_response:

def process_response(self, request, response):
    """
    If request.session was modified, or if the configuration is to save the
    session every time, save the changes and set a session cookie or delete
    the session cookie if the session has been emptied.
    """
    try:
        accessed = request.session.accessed
        modified = request.session.modified
        empty = request.session.is_empty()
    except AttributeError:
        pass
    else:
        # First check if we need to delete this cookie.
        # The session should be deleted only if the session is entirely empty
        if settings.SESSION_COOKIE_NAME in request.COOKIES and empty:
            response.delete_cookie(
                settings.SESSION_COOKIE_NAME,
                path=settings.SESSION_COOKIE_PATH,
                domain=settings.SESSION_COOKIE_DOMAIN,
            )
        else:
            if accessed:
                patch_vary_headers(response, ('Cookie',))
            if (modified or settings.SESSION_SAVE_EVERY_REQUEST) and not empty:
                if request.session.get_expire_at_browser_close():
                    max_age = None
                    expires = None
                else:
                    max_age = request.session.get_expiry_age()
                    expires_time = time.time() + max_age
                    expires = cookie_date(expires_time)
                # Save the session data and refresh the client cookie.
                # Skip session save for 500 responses, refs #3881.
                if response.status_code != 500:
                    try:
                        request.session.save()
                    except UpdateError:
                        raise SuspiciousOperation(
                            "The request's session was deleted before the "
                            "request completed. The user may have logged "
                            "out in a concurrent request, for example."
                        )
                    response.set_cookie(
                        settings.SESSION_COOKIE_NAME,
                        request.session.session_key, max_age=max_age,
                        expires=expires, domain=settings.SESSION_COOKIE_DOMAIN,
                        path=settings.SESSION_COOKIE_PATH,
                        secure=settings.SESSION_COOKIE_SECURE or None,
                        httponly=settings.SESSION_COOKIE_HTTPONLY or None,
                    )
    return response

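process_response is called after the view function has run and returned a Response, and among other things it takes care of sending the session back.
It starts with some assignments: accessed and modified are the flags that the code above sets to True when the session is read or written. Then comes empty: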

def is_empty(self):
    "Returns True when there is no session_key and the session is empty"
    try:
        return not bool(self._session_key) and not self._session_cache
    except AttributeError:
        return True

That is, it returns True when there is no session_key and the session data is empty.
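
The branching in process_response is easier to follow as a small pure function. session_cookie_action and its return strings are hypothetical names for this sketch; it only restates the if/else structure shown above and leaves out the expiry and Vary header details.

```python
def session_cookie_action(cookie_sent, empty, modified,
                          save_every_request=False, status_code=200):
    """Hypothetical summary of the decision made in process_response."""
    # The browser sent a session cookie but the session is now empty:
    # the cookie is deleted.
    if cookie_sent and empty:
        return "delete_cookie"
    # The session has data and was changed (or saving is forced):
    # save it and refresh the cookie, except on 500 responses.
    if (modified or save_every_request) and not empty:
        if status_code != 500:
            return "save_and_set_cookie"
    # (The real code also patches the Vary header when the session
    # was accessed; that part is not modelled here.)
    return "nothing"


print(session_cookie_action(cookie_sent=True, empty=True, modified=False))
print(session_cookie_action(cookie_sent=False, empty=False, modified=True))
print(session_cookie_action(cookie_sent=True, empty=False, modified=False))
```

The three calls cover a logged-out user, a first write to a new session, and a request that only reads the session.
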
After that it configures the cookie and writes it to the response. Inside process_response there is also one important operation:

request.session.save()

We know that session is a SessionStore object, so that is where to look for save:

def save(self, must_create=False):
    """
    Saves the current session data to the database. If 'must_create' is
    True, a database error will be raised if the saving operation doesn't
    create a *new* entry (as opposed to possibly updating an existing
    entry).
    """
    if self.session_key is None:
        return self.create()
    data = self._get_session(no_load=must_create)
    obj = self.create_model_instance(data)
    using = router.db_for_write(self.model, instance=obj)
    try:
        with transaction.atomic(using=using):
            obj.save(force_insert=must_create, force_update=not must_create, using=using)
    except IntegrityError:
        if must_create:
            raise CreateError
        raise
    except DatabaseError:
        if not must_create:
            raise UpdateError
        raise

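This is what persists the session, that is, saves it to the database.
On its first visit the browser does not send a session cookie, so session_key is None and the first call executes return self.create(). Stepping into create():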

def create(self):
    while True:
        self._session_key = self._get_new_session_key()
        try:
            # Save immediately to ensure we have a unique entry in the
            # database.
            self.save(must_create=True)
        except CreateError:
            # Key wasn't unique. Try again.
            continue
        self.modified = True
        return

This enforces uniqueness, so that session_key is never duplicated. Next comes _get_new_session_key(), which builds the key with get_random_string():

# crypto.py 
def get_random_string(length=12,
                      allowed_chars='abcdefghijklmnopqrstuvwxyz'
                                    'ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789'):
    """
    Returns a securely generated random string.

    The default length of 12 with the a-z, A-Z, 0-9 character set returns
    a 71-bit value. log_2((26+26+10)^12) =~ 71 bits
    """
    if not using_sysrandom:
        # This is ugly, and a hack, but it makes things better than
        # the alternative of predictability. This re-seeds the PRNG
        # using a value that is hard for an attacker to predict, every
        # time a random string is required. This may change the
        # properties of the chosen random sequence slightly, but this
        # is better than absolute predictability.
        random.seed(
            hashlib.sha256(
                ("%s%s%s" % (
                    random.getstate(),
                    time.time(),
                    settings.SECRET_KEY)).encode('utf-8')
            ).digest())
    return ''.join(random.choice(allowed_chars) for i in range(length))

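When the system's secure random source is not available, this uses sha256 from the hashlib module to hash (not encrypt) the PRNG state, the current time and the SECRET_KEY from the settings file, and reseeds the random number generator with the digest. That is one reason SECRET_KEY matters so much and must never be leaked.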
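
Here is a minimal sketch of the same idea using only the standard library, assuming the OS random source is available (random.SystemRandom). The 32-character length, the lowercase-plus-digits alphabet and the helper names are assumptions made for this example, and existing_keys stands in for a database lookup.

```python
import math
import random
import string

# Assumed key alphabet for this sketch.
KEY_CHARS = string.ascii_lowercase + string.digits


def random_string(length, allowed_chars):
    """Build a string from the OS-backed random source."""
    rng = random.SystemRandom()
    return "".join(rng.choice(allowed_chars) for _ in range(length))


def new_session_key(existing_keys, length=32):
    """Keep drawing keys until one is not already in use."""
    while True:
        key = random_string(length, KEY_CHARS)
        if key not in existing_keys:
            return key


def entropy_bits(length, alphabet_size):
    """Bits of entropy in a uniformly random string."""
    return length * math.log2(alphabet_size)


existing = {"a" * 32}
key = new_session_key(existing)
print(len(key), round(entropy_bits(12, 62), 2))
```

The entropy helper reproduces the arithmetic in the docstring above: 12 characters drawn from 62 symbols give a little over 71 bits.
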
Once the key has been generated, create() saves the session by calling save(must_create=True), the same function as above:

def save(self, must_create=False):
    if self.session_key is None:
        return self.create()
    data = self._get_session(no_load=must_create)
    obj = self.create_model_instance(data)
    using = router.db_for_write(self.model, instance=obj)
    try:
        with transaction.atomic(using=using):
            obj.save(force_insert=must_create, force_update=not must_create, using=using)
    except IntegrityError:
        if must_create:
            raise CreateError
        raise
    except DatabaseError:
        if not must_create:
            raise UpdateError
        raise

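This time the if is skipped because session_key is now set, and execution reaches _get_session. It checks whether the session data is already cached: if so it returns the cached data, and if not it builds the _session_cache dictionary.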
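
The back-and-forth between save() and create() can be modelled with a dict instead of a database table. MemorySessionStore, its table dict and the key_source iterator are hypothetical; the two exception classes only borrow the names used above. The first candidate key is deliberately one that is already taken, to show the retry.

```python
class CreateError(Exception):
    """Raised when a forced insert hits an existing key."""


class UpdateError(Exception):
    """Raised when a forced update finds no existing row."""


class MemorySessionStore:
    """Hypothetical dict-backed sketch of the save()/create() interplay."""

    def __init__(self, table, key_source, session_key=None):
        self.table = table              # session_key -> data
        self._key_source = key_source   # iterator of candidate keys
        self.session_key = session_key
        self.data = {}
        self.attempts = 0

    def create(self):
        while True:
            self.session_key = next(self._key_source)
            self.attempts += 1
            try:
                self.save(must_create=True)
            except CreateError:
                continue  # key already taken, try the next one
            return

    def save(self, must_create=False):
        if self.session_key is None:
            return self.create()
        if must_create:
            if self.session_key in self.table:
                raise CreateError
        elif self.session_key not in self.table:
            raise UpdateError
        self.table[self.session_key] = dict(self.data)


table = {"taken": {}}
store = MemorySessionStore(table, iter(["taken", "fresh"]))
store.data["user_id"] = 7
store.save()  # no key yet, so this goes through create()

stale = MemorySessionStore(table, iter([]), session_key="gone")
try:
    stale.save()
    update_failed = False
except UpdateError:
    update_failed = True
print(store.session_key, store.attempts, update_failed)
```

The second store shows the other branch: updating a session whose row has disappeared raises UpdateError, which is the case process_response turns into SuspiciousOperation.
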
Next comes create_model_instance:

def create_model_instance(self, data):
    """
    Return a new instance of the session model object, which represents the
    current session state. Intended to be used for saving the session data
    to the database.
    """
    return self.model(
        session_key=self._get_or_create_session_key(),
        session_data=self.encode(data),
        expire_date=self.get_expiry_date(),
    )

It returns an instance of self.model. Following that leads to the Session model:

class Session(AbstractBaseSession):
    """
    Django provides full support for anonymous sessions. The session
    framework lets you store and retrieve arbitrary data on a
    per-site-visitor basis. It stores data on the server side and
    abstracts the sending and receiving of cookies. Cookies contain a
    session ID -- not the data itself.

    The Django sessions framework is entirely cookie-based. It does
    not fall back to putting session IDs in URLs. This is an intentional
    design decision. Not only does that behavior make URLs ugly, it makes
    your site vulnerable to session-ID theft via the "Referer" header.

    For complete documentation on using Sessions in your code, consult
    the sessions documentation that is shipped with Django (also available
    on the Django Web site).
    """
    objects = SessionManager()

    @classmethod
    def get_session_store_class(cls):
        from django.contrib.sessions.backends.db import SessionStore
        return SessionStore

    class Meta(AbstractBaseSession.Meta):
        db_table = 'django_session'

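The long docstring explains things clearly. Note db_table = 'django_session' at the bottom, which is the name of the table in the database. The rest is inherited from the parent class.
Back in save(), db_for_write is the code that picks which database to write to. I'll leave that part for now.
After that the database transaction starts, which makes sure the session is written to the database correctly.
Some parts above are still short on detail, and I'll fill them in over time.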
