PostgreSQL DBA(176) - Serializab
PostgreSQL 9.1或以上版本,提供了真正意义的Serializability Isolation,本节主要介绍了Serializability的相关概念,包括Serializability、写偏序(Write Skew)等。
What is Serializability Isolation?
ANSI SQL 1999对Serializability的定义是:
The execution of concurrent SQL-transactions at isolation level SERIALIZABLE is guaranteed to be serializable. A serializable execution is defined to be an execution of the operations of concurrently executing SQL-transactions that produces the same effect as some serial execution of those same SQL-transactions. A serial execution is one in which each SQL-transaction executes to completion before the next SQL-transaction begins.
事务可以理解为按顺序执行的几个基本子操作,Serializability(可串行化)保证了操作以原子方式发生,即事务的子操作不会与其他事务的子操作交织在一起。换句话说,Serializability意味着即使是并发的事务,但从整体上来看仍然是顺序发生的。
比如下面的执行历史H是可串行化的:
H:r1(x) r2(x) w1(y) c1 w2(y) c2
而H'则不是可串行化,因为w1(y)和w2(y)出现了ww冲突:
H':r1(x) w1(y) w2(y) c1 r2(x) c2
What is Write Skew?
使用形式化语言来表示:定义rset(Tn)是事务Tn读数据涉及的元组集合,wset(Tn)是事务Tn写数据涉及的元组集合。注意这里的写包括增删改,并不局限于插入数据。
设有两个并发事务T1和T2,如果rset(T1) ∩ rset(T2) ≠ ∅且wset(T1) ∩ wset(T2) = ∅,则数据库出现了写Write Skew现象。
比如数据表doctor,业务约束要求处于oncall状态的医生人数不能少于1人,但如果存在并发事务同时对不同医生的oncall状态进行更新,则会出现Write Skew:
数据:
name | oncall | remark |
---|---|---|
Alice | true | |
Bob | true | |
Jacky | false | |
Susan | false |
并发事务T1和T2的操作:
时间点 | T1 | T2 |
---|---|---|
t1 | begin; | |
t2 | begin; | |
t3 | select count(*) from doctor into v_count where oncall = true; | |
t4 | select count(*) from doctor into v_count where oncall = true; | |
t5 | if v_count ≥ 2 then update doctor set oncall = false where name = 'Alice'; | |
t6 | if v_count ≥ 2 then update doctor set oncall = false where name = 'Bob'; | |
t7 | commit; | |
t8 | commit; |
T1和T2完成后,数据如下:
name | oncall | remark |
---|---|---|
Alice | false | |
Bob | false | |
Jacky | false | |
Susan | false |
已经没有了处于待命状态的医生,违反了业务约束。