Transaction Isolation

Transaction Isolation

Overview

Serial execution (T1 ; T2; T3 ;...) is the simplest form of isolation.

Problem: serial execution yields poor throughput.

Concurrency control schemes (CCSs) aim for “safe” concurrency.

Abstract view of DBMS concurrency mechanisms:

Serialisability

Consider two schedules S1 and S2 produced by:

  • Executing the same set of transactions T1 .. Tn concurrently.
  • But with a non-serial interleaving of read/write operations.

S1 and S2 are equivalent if StateAfter(S1) == StateAfter(S2).

  • This is a very restrictive requirement.

S is a serializable schedule if S is equivalent to some serial schedule S' of T1 .. Tn.

  • This is a less restrictive requirement.
  • Enough to guarantee consistency.

Classes of Serialisability

Conflict serialisability::

  • Conflicting read/write operations occur in the “right order”.
  • Check via precdence graph; look for absence of cycles.

View serialisability:

  • Read operations see the correct version of data.
  • Check via VS conditions on likely equivalent schedules.

View serialisability is strictly weaker than conflict serialisability.

Checking Conflict Serialisability

construct graph with just nodes, one for each T_i.
for each pair of operations across transactions:
    if T_i and T_j have conflicting ops on variable X:
        put directed edge between T_i and T_j 
        where direction goes from  first tx to access X
        to second tx to access X

    if this new edge forms a cycle:
        return NotConflictSerialisable

return ConflictSerialisable

Checking View Serializability

// T_{C,i} denotes transaction i in concurrent schedule
for each serial schedule S:
    for each shared variable X:
        if T_{C, i} reads same version of X as T_{S, i}
            (either initial value or value written by T_j)
            continue
        else
            give up on this serial schedule
        if T_{C, i} and T_{S, i} write the final version of X:
            continue
        else:
            give up on this serial schedule

    return ViewSerialisable

return NotViewSerialisable

Transaction Isolation Levels

SQl programmers’ concurrency control mechanism:

set transaction
    read only  -- so weaker isolation may be ok
    read write -- suggests stronger isolation needed
isolation level
    -- weakest isolation, maximum concurrency
    read uncommitted
    read committed
    repeatable read
    serializable
    -- strongest isolation, minimum concurrency

Implication of Isolation Levels

Isolation LevelDirty ReadNon-repeatable ReadPhantom Read
Read UncommittedPossiblePossiblePossible
Read CommittedNot PossiblePossiblePossible
Repeatable ReadNot PossibleNot PossiblePossible
SerialisableNot PossibleNot PossibleNot Possible

View of Database

A PostgreSQL tx consists of a sequence of SQL statements:

BEGIN S_1; S_2; ... S_n; COMMIT;

Isolation levels affect view of DB provided to each S_i.

  • Read Committed:
    • Each S_i sees snapshot of DB at start of S_i.
      • This snapshot does NOT include any uncommitted changes from other concurrent tx’s.
      • This snapshot may however include committed changes from other concurrent tx’s.
  • Repeatable Read and Serialisable:
    • Each S_i sees snapshot of DB at start of tx.
      • This snapshot does NOT include even committed changes from other concurrent tx’s.
    • Serialisable checks for extra conditions.

Transactions fail if the system detects violation of isolation level.

Repeatable Read vs Serialisable

Table R(class, value) containing (1,10) (1,20) (2,100) (2,200).

  • T1: X = sum(value) where class=1; insert R(2,X); commit.
  • T2: X = sum(value) where class=2; insert R(1,X); commit.
  • With repeatable read, both transactions commit, giving:
    • Updated table: (1,10) (1,20) (2,100) (2,200) (1,300) (2,30).
  • With serial transactions, only one transaction commits:
    • T1;T2 gives (1,10) (1,20) (2,100) (2,200) (2,30) (1,330).
    • T2;T1 gives (1,10) (1,20) (2,100) (2,200) (1,300) (2,330).
  • PG recognises that committing both gives serialisation violations.

Concurrency Control

Isolation requires some method to control concurrency.

Possible approaches to implementing concurrency control:

  • Lock-based:
    • Synchronise tx execution via locks on relevant part of DB.
  • Version-based (multi-version concurrency control):
    • Allow multiple consistent versions of the data to exist.
    • Each tx has access only to version existing at start of tx.
  • Validation-based (optimistic concurrency control):
    • Execute all tx’s; check for validity problems on commit.
  • Timestamp-based:
    • Organise tx execution via timestamps assigned to actions.