top of page

Database Snapshot

When we create the database snapshot in the live operational database, it takes a database point in time static view and Rollback all uncommitted transactions in the snapshot database so we will not be having any inconsistent data that is yet to be committed.

Database snapshot always exists on the Source database server. The only way to create a SQL Server database snapshot is to use Transact-SQL. SQL Server Management Studio does not support the creation of database snapshots (https://docs.microsoft.com/en-us/sql/relational-databases/databases/create-a-database-snapshot-transact-sql?view=sql-server-ver15)

Database snapshot works on the pages (the fundamental unit of data storage in SQL server is the page). The disk space allocated to a data file in a database is logically divided into pages numbered contiguously from 0 to n.

Disk I/O operations are performed at the page level. That is, SQL server reads or writes whole data pages. Basically, it creates a sparse file and will be pointing to the original databases, written only in case of any insert, update and delete statement is in the source database.

 

Database snapshots are like a view of a database as it was at a certain point in time. It is a read-only copy of the data and the state of the pages, which are made possible using a pointer file called the sparse file. A snapshot contains the original version of pages, and changes in the source since the snapshot was created. However, remember that this is not a full copy of the database (https://www.sqlshack.com/understanding-database-snapshots-vs-database-backups-in-sql-server).

We can see here whichever page gets changed due to any operation, its original page (before modification) is copied into the sparse file (snapshot), so if user operation request to read the snapshot database it works in below ways

  1. If a page is modified, read operation occurs on sparse File

  2. If the page is not modified it still points to the source database page

This is the concept behind the snapshot, thus, its size is small compared to the original database but it all depends on the operation if there are too many operations in the database which gets page modified so the snapshot size will be kept increasing.

So whenever we create the snapshot following operation will be done

  1. It creates an empty file, i.e. sparse file for each source database data file

  2. Uncommitted transactions are rolled back, thus having a consistent copy of the database

  3. All dirty pages will be returned to the disk

  4. The user can query the snapshot database now

Now let’s say there are many database pages being updated so now the snapshot will look like

Here we can see the as there are many operations being performed all pages shown above are modified, thus the snapshot size will be increased. We should always keep an eye on the growth of the snapshot database https://www.sqlshack.com/database-snapshot-in-sql-server

 

The technique used in snapshot is called row versioning. Therefore, snapshot consumes tempdb size, which in turn consumes memory.

Version stores, which are a collection of data pages that hold the data rows that are required to support the features that use row versioning. There are two version stores: a common version store and an online-index-build version store. The version stores contain:

  • Row versions that are generated by data modification transactions in a database that uses read-committed using row versioning isolation or snapshot isolation transactions.

  • Row versions that are generated by data modification transactions for features, such as: online index operations, Multiple Active Result Sets (MARS), and AFTER triggers (https://docs.microsoft.com/en-us/sql/relational-databases/databases/tempdb-database?view=sql-server-ver15).

See my previous blog about Cross-container transactions, it gives an aspect how snapshot cost memory. https://fancy147.wixsite.com/johndatastory/single-post/2020/02/11/Cross-container-transactions

 

Example1:

Your company runs end-of-the-month accounting reports. While the reports run, other financial records are updated in the database.Users report that the reports take longer than expected to run.You need to reduce the amount of time it takes for the reports to run. The reports must show committed data only.What should you do?

  • A. Use the NOLOCK option.

  • B. Execute the DBCC UPDATEUSAGE statement.

  • C. Use the max worker threads option.

  • D. Use a table-valued parameter.

  • E. Set SET ALLOW_SNAPSHOT_ISOLATION to ON.

  • F. Set SET XACT_ABORT to ON.

  • G. Execute the ALTER TABLE T1 SET (LOCK_ESCALATION = AUTO); statement. H. Use the OUTPUT parameters.

Correct Answer: E Snapshot isolation enhances concurrency for OLTP applications. Once snapshot isolation is enabled, updated row versions for each transaction are maintained in tempdb. A unique transaction sequence number identifies each transaction, and these unique numbers are recorded for each row version. The transaction works with the most recent row versions having a sequence number before the sequence number of the transaction. Newer row versions created after the transaction has begun are ignored by the transaction.

References: https://docs.microsoft.com/en-us/dotnet/framework/data/adonet/sql/snapshot-isolation-in-sql-server

 

Example 2:

You are designing a stored procedure for a database named DB1. The following requirements must be met during the entire execution of the stored procedure:

✑ The stored procedure must only read changes that are persisted to the database. statements within the stored procedure should only show changes to the data that are made by the stored procedure.

You need to configure the transaction isolation level for the stored procedure. Which Transact-SQL statement or statements should you run?

  • A. SET TRANSACTION ISOLATION LEVEL READ UNCOMMITED ALTER DATABASE DB1 SET READ_COMMITED_SNAPSHOT ON

  • B. SET TRANSACTION ISOLATION LEVEL READ COMMITED ALTER DATABASE DB1 SET READ_COMMITED_SNAPSHOT OFF

  • C. SET TRANSACTION ISOLATION LEVEL SERIALIZABLE

  • D. SET TRANSACTION ISOLATION LEVEL READ UNCOMMITED ALTER DATABASE SET READ_COMMITED_SNAPSHOT OFF

Correct Answer: B READ COMMITTED specifies that statements cannot read data that has been modified but not committed by other transactions. This prevents dirty reads. Data can be changed by other transactions between individual statements within the current transaction, resulting in nonrepeatable reads or phantom data. This option is the SQL Server default.

Incorrect Answers: A, D: READ UNCOMMITTED specifies that statements can read rows that have been modified by other transactions but not yet committed.

References: https://docs.microsoft.com/en-us/dotnet/framework/data/adonet/sql/snapshot-isolation-in-sql-server

Argument:

I think C is correct: statements within thestored procedure should only show changes made by the stored procedure - that IMHO means no repeatable reads and no phantom reads.

My opinion:

For SERIALIZABLE, cross-container transactions are still possible, which will go to memory. Therefore, answer B seems to be the only way meeting the requirement of the question.

bottom of page