Update (Jan 23, 2010): I have made a new post explaining a method that I think is better than any of those described here.
In part one I talked about how there is no true way to unit test your data access code under the standard definition of a unit test. However, I think it is useful to consider your database and your data access layer as a single unit when it comes to automated testing (read part one if you're wondering why). Everything is a trade off though so there are two important drawbacks to hitting a real database in your tests:
- If a test fails you don't necessarily know right off the bat whether it was your .NET code that failed or something to do with the database.
This can be a pain if you really like your unit tests to point out the exact issue when they fail, but I would say it is a very minor problem. Also, don't you want to know when there's a bug in your database anyways?
- It can be very slow to roll back the database to its original state after every test.
This one is the real kicker. It is important that each test executes as fast as possible, but in order to prevent cross contamination between your tests, the database needs to be restored to its original state after each test. Even though it is very fast to access a local SQL Server instance, the process of rolling back can be slow.
I will go over three methods for rolling back that database, but I only recommend the last one:
- Put each test in a transaction
Rolling back a SQL Server transaction is an appealing option. It is by far the fastest method and seems easy to implement. The problem is that a transaction belongs to a single session, so you can get yourself into deadlock if you have multiple connections. If you begin a transaction on connection A and alter a table with that connection, then try to read from that table with connection B, then you have deadlock. Using the default SQL Server isolation level (read committed) connection B will wait for connection A to end its transaction, but the transaction doesn't end until the test is finished, which requires connection B to read the data.
Basically, using this method puts a major restriction on the actual implementation making it pretty much useless. There are other problems too as your test needs access to the connection so that it can begin and end the transaction. In many cases you want the connection to be internal, so this leads to bad design.
- Rebuild the database after each test
This certainly works, but it is really slow and nobody likes slow unit tests.
- Use SQL Server snapshots
In SQL Server 2005 and later you can make a snapshot of a database, and then restore the database to that snapshot at any time. Restoring the snapshot is much faster than rebuilding the entire database because the snapshot file doesn't store the entire database, it just stores data that has been changed so that it can quickly be reverted. There are a few caveats to this approach. One is that snapshots are not available in SQL Server Express, you will need one of the non-free versions. The other is that you cannot restore a snapshot while the database has active connections, so you need to make sure you kill them before attempting to restore the snapshot after each test.
Using SQL, you can create a snapshot like this:
CREATE DATABASE SnapshotName
ON (NAME=DatabaseFileLogicalName, FILENAME='PATH_TO_NEW_FILE')
AS SNAPSHOT OF DatabaseName
And then you can later restore the database to that snapshot:
RESTORE DATABASE DatabaseName
FROM DATABASE_SNAPSHOT = 'SnapshotName'
Restoring a snapshot usually still has a delay of half a second or so, but it's better than the alternatives.
- Update (Jan 23, 2010): Distributed Transactions
Assuming that the use of MSDTC is an option, this is likely the best choice. In is described in this article.