Isolation level in SQL
Ensure transactions are run without interference in SQL.
In a previous blog post on Transactional Outbox, we discussed that writing to multiple tables in a database will be transactional. Isolating a transaction is one of the key and interesting ways to handle transactions in SQL to ensure we do not observe inconsistent data or behaviour in our queries.
There are different ways SQL provides transaction isolation.
READ UNCOMMITTED
READ COMMITTED
REPEATABLE READ
SERIALIZABLE
We’ll discuss the first two in this post.
READ UNCOMMITTED
This is the most basic isolation level where we read any data that is meant to be written to the database, even if it has not yet been committed. This is called a dirty read problem. This leads to very high performance but should not be used unless we know what we are trying to achieve. It is the same as no lock. Let’s take an example of what this looks like.
First, we set the transaction isolation level to be read uncommitted.
> SET SESSION TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
> SELECT @@transaction_isolation;
+-------------------------+
| @@transaction_isolation |
+-------------------------+
| READ-UNCOMMITTED |
+-------------------------+
S1> start transaction;
S1> select * from test where col1 = 1;
+------+------+
| col1 | col2 |
+------+------+
| 1 | 10 |
+------+------+
S1> update test set col2 = 11 where col1 = 1;
S1> select * from test where col1 = 1;
+------+------+
| col1 | col2 |
+------+------+
| 1 | 11 |
+------+------+
S2> select * from test where col1 = 1;
+------+------+
| col1 | col2 |
+------+------+
| 1 | 11 |
+------+------+
In the above SQL snippet, a session(S1) writes some value but hasn't been committed yet, but another session(S2) can read the uncommitted value.
We may not want this behaviour in most cases. This is solved by READ COMMITTED isolation.
READ COMMITTED
Read committed solves the dirty read problem by ensuring that we read only the committed values to the database. It is not the fastest way of isolation but provides a higher level of concurrency than READ UNCOMMITTED. Let’s see this in action!
First things first, we update the isolation level to read committed.
> SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;
> SELECT @@transaction_isolation;
+-------------------------+
| @@transaction_isolation |
+-------------------------+
| READ-COMMITTED |
+-------------------------+
Let’s see the same example as above and observe the changes. Suppose we have two sessions (S1 and S2):
S1> start transaction;
S1> select * from test where col1 = 1;
+------+------+
| col1 | col2 |
+------+------+
| 1 | 10 |
+------+------+
S1> update test set col2 = 11 where col1 = 1;
S1> select * from test where col1 = 1;
+------+------+
| col1 | col2 |
+------+------+
| 1 | 11 |
+------+------+
S2> select * from test where col1 = 1;
+------+------+
| col1 | col2 |
+------+------+
| 1 | 10 |
+------+------+
S1> commit;
S2> select * from test where col1 = 1;
+------+------+
| col1 | col2 |
+------+------+
| 1 | 11 |
+------+------+
In the above SQL snippet, a session(S1) writes some value but hasn't yet been committed. This time another session(S2) is NOT able to read the uncommitted value. Once the transaction is committed, our second connection can read the updated value. Awesome!
For each consistent read in this isolation level, a fresh snapshot is used to query the results. This results in reading the latest committed state.
In this isolation level, MySQL will only lock rows that it needs to update. This reduces the probability of deadlocks happening - this doesn’t necessarily mean deadlocks won’t occur. Since only the rows being updated are locked, we’ll face issues like non-repeatable reads and phantom reads.
Non-repeatable reads problem
Let’s understand this with a simple example. Suppose we have 2 sessions(S1 and S2):
S1> start transaction;
S1> select * from test;
+------+------+
| col1 | col2 |
+------+------+
| 1 | 12 |
| 2 | 100 |
| 3 | 999 |
+------+------+
S2> update test set col2 = 10 where col1 = 1;
S1> select * from test;
+------+------+
| col1 | col2 |
+------+------+
| 1 | 10 |
| 2 | 100 |
| 3 | 999 |
+------+------+
In the same transaction, we read the same rows twice and we get different results. This happens because the second session committed the data before the first one read it again, and we always read the latest committed data in the read-committed isolation. This is something we may not want in most applications. This problem is addressed by the REPEATABLE READ isolation level.