Converting from SVN to Git

I held out for a while, but I have officially become a Git convert. Now that TortoiseGit has reached version 1.0 I can’t think of a single way in which SVN is better than Git. There are a lot of advantages, but by far my favorite is the distinction between “commit” and “push”. In any version control system it is good to commit as often as possible so that you have better version history, but in SVN every commit gets pushed to the server. This means that if you want to commit, but you are not ready to give the code to other developers (ie: the code does not belong on the trunk of the master copy) then you have to make a branch. In SVN it can be a major pain to do everything in branches because of poor (not to mention slow) merging functionality. With Git you can commit as many times as you want, then when your code is ready for the trunk, you can push it to the master server. Much more civilized! If you haven’t tried Git yet then you should. You will be impressed.

Here’s an article on learning Git for SVN users: http://git.or.cz/course/svn.html

Unit testing an Entity Framework DAL part 2: Rolling back the test database

Update (Jan 23, 2010): I have made a new post explaining a method that I think is better than any of those described here.

In part one I talked about how there is no true way to unit test your data access code under the standard definition of a unit test. However, I think it is useful to consider your database and your data access layer as a single unit when it comes to automated testing (read part one if you’re wondering why). Everything is a trade off though so there are two important drawbacks to hitting a real database in your tests:

  • If a test fails you don’t necessarily know right off the bat whether it was your .NET code that failed or something to do with the database.

    This can be a pain if you really like your unit tests to point out the exact issue when they fail, but I would say it is a very minor problem. Also, don’t you want to know when there’s a bug in your database anyways?

  • It can be very slow to roll back the database to its original state after every test.

    This one is the real kicker. It is important that each test executes as fast as possible, but in order to prevent cross contamination between your tests, the database needs to be restored to its original state after each test. Even though it is very fast to access a local SQL Server instance, the process of rolling back can be slow.

I will go over three methods for rolling back the database, but of these I only recommend the third one (snapshots):

  • Put each test in a transaction

    Rolling back a SQL Server transaction is an appealing option. It is by far the fastest method and seems easy to implement. The problem is that a transaction belongs to a single session, so you can get yourself into deadlock if you have multiple connections. If you begin a transaction on connection A and alter a table with that connection, then try to read from that table with connection B, then you have deadlock. Using the default SQL Server isolation level (read committed) connection B will wait for connection A to end its transaction, but the transaction doesn’t end until the test is finished, which requires connection B to read the data.

    Basically, using this method puts a major restriction on the actual implementation making it pretty much useless. There are other problems too as your test needs access to the connection so that it can begin and end the transaction. In many cases you want the connection to be internal, so this leads to bad design.

  • Rebuild the database after each test

    This certainly works, but it is really slow and nobody likes slow unit tests.

  • Use SQL Server snapshots

    In SQL Server 2005 and later you can make a snapshot of a database, and then restore the database to that snapshot at any time. Restoring the snapshot is much faster than rebuilding the entire database because the snapshot file doesn’t store the entire database. It only stores the data that has been changed, so that it can quickly be reverted. There are a few caveats to this approach. One is that snapshots are not available in SQL Server Express, so you will need one of the non-free editions. The other is that you cannot restore a snapshot while the database has active connections, so you need to make sure you kill them before attempting to restore the snapshot after each test.

    Using SQL, you can create a snapshot like this:

    CREATE DATABASE SnapshotName
    ON (NAME=DatabaseFileLogicalName, FILENAME='PATH_TO_NEW_FILE')
    AS SNAPSHOT OF DatabaseName

    And then you can later restore the database to that snapshot:

    RESTORE DATABASE DatabaseName
    FROM DATABASE_SNAPSHOT = 'SnapshotName'

    Restoring a snapshot usually still has a delay of half a second or so, but it’s better than the alternatives.

  • Update (Jan 23, 2010): Distributed Transactions
    Assuming that the use of MSDTC is an option, this is likely the best choice. It is described in this article.
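
For the snapshot approach, the restore has to get rid of open connections first. Here is a minimal sketch of a helper that could be called from a test teardown. The module, method and parameter names are hypothetical, and it assumes the connection string points at the master database on the local test instance. Switching the database to single-user mode with ROLLBACK IMMEDIATE is one common way to drop the other connections, not necessarily the only one.

```vbnet
Imports System.Data.SqlClient

' Hypothetical helper; all names here are assumptions for illustration.
Public Module SnapshotHelper

    ' Reverts databaseName to the state captured in snapshotName.
    ' The names are placed directly into the SQL text, so only pass trusted constants.
    Public Sub RestoreSnapshot(ByVal masterConnectionString As String, _
                               ByVal databaseName As String, _
                               ByVal snapshotName As String)

        ' Pooled connections to the test database would otherwise count as active connections
        SqlConnection.ClearAllPools()

        Dim sql As String = String.Format( _
            "ALTER DATABASE [{0}] SET SINGLE_USER WITH ROLLBACK IMMEDIATE; " & _
            "RESTORE DATABASE [{0}] FROM DATABASE_SNAPSHOT = '{1}'; " & _
            "ALTER DATABASE [{0}] SET MULTI_USER;", _
            databaseName, snapshotName)

        ' Connect to master rather than to the database that is being restored
        Using connection As New SqlConnection(masterConnectionString)
            connection.Open()
            Using command As New SqlCommand(sql, connection)
                command.ExecuteNonQuery()
            End Using
        End Using
    End Sub

End Module
```

Clearing the connection pool matters because ADO.NET keeps connections open behind the scenes even after the test code has closed them.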

Unit testing an Entity Framework DAL part 1: Just hit the database

As almost anyone who tries to unit test a database application will quickly discover, databases present a huge problem for unit testing. Strictly speaking, if you are testing your C# or VB code and you actually hit a real database, then it isn’t really a unit test. It is actually an integration test. However, I have found that it doesn’t really matter what you call it, the end result is that your tests are much more useful if they actually hit a real database. You don’t have to worry about whether the test failed because you screwed up your mock object or if the actual application is buggy and you get better code coverage because even broken SQL will lead to a failed test.

There are several methods that can be used to prevent your unit tests from actually using a real SQL Server database, but they all have their problems:

  • Using an in-memory provider like SQLite

    There is an Entity Framework provider for SQLite that allows you to interact with a database without using a network or even going to your file system. This could certainly increase the execution speed of your unit tests and makes it easy to prevent cross contamination in your tests, but they are still integration tests. The only difference is that you are now testing whether your code works on SQLite, rather than the DBMS that you will actually use in production. The problem is that all database systems have different behaviors and feature sets, so your tests are no longer valid if you use a different DBMS for testing. There is also currently no system in place to automatically generate the SQLite schema from your entity data model, so you will need to find your own way of doing that, or you have to manually maintain a separate SQLite schema. Gross. If you are going to use another provider, it needs to be specially designed to behave exactly like your production database (ie: a mock SQL Server provider) but to my knowledge no such provider exists (if I’m wrong, please let me know!).

  • Mock the Entity Framework ObjectContext

    If all you want to do is read data, then this works well and is easy to implement. Unfortunately, in the vast majority of cases, we also need to write data and that’s where this method gets tricky. Your mock ObjectContext needs to be able to track changes and save them to an in-memory repository. And again, you have to make sure that it behaves exactly like your production database. Because this method often involves either a huge wrapper or major alterations to auto-generated code (which means you also need to make your own generator or you’ll lose maintainability) the mock object itself is extremely complicated, leaving a high likelihood that it will have errors. Since the mock is so complicated one could argue that you are again doing integration tests, not unit tests. But this time instead of testing your code and the database, you are testing your code and the mock ObjectContext. Just like the SQLite example, this is much worse because you are testing whether your code integrates with something you will not use in production. If you are going to do integration tests anyways, then you might as well integrate with the real thing. This method could lead to faster executing tests, but don’t forget that a local SQL Server instance is actually extremely fast and might be just as good.

  • Encapsulate your data access layer and then mock it

    I see this response on message boards all the time. Whenever someone asks how to unit test their data access code someone will respond “You’re doing it wrong, put all of your data access code into a separate module that you can mock”. There are a couple of problems with this. First of all, you still need to test the code in the data access layer. If you have a function in your DAL that executes a complicated LINQ to Entities query, then you want to test that query. Without using one of the techniques mentioned above, this requires hitting the database. Secondly, making your client code completely unaware of the data access layer’s implementation leads to some issues. Let’s pretend that my data access layer looks like this:

    Public Interface IUsersModel
     
        Function GetUsers() As IEnumerable(Of Users)
        Sub Save()
     
    End Interface
     
    Public Class UsersModel
        Implements IUsersModel
     
        Private _context As New DataTestEntities
     
        Public Function GetUsers() As IEnumerable(Of Users) Implements IUsersModel.GetUsers
            Return _context.Users
        End Function
     
        Public Sub Save() Implements IUsersModel.Save
            _context.SaveChanges()
        End Sub
     
    End Class

    It’s pretty simple, the code just allows you to get a collection of users and save any changes you make. UsersModel correctly implements the interface using the Entity Framework. Then we also have a controller that accesses the DAL. It looks like this:

    Public Class UsersController
     
        Private _usersModel As IUsersModel
     
        Public Sub New(ByVal usersModel As IUsersModel)
            _usersModel = usersModel
        End Sub
     
        Public Sub ChangeFirstUserNameToFoobar()
            _usersModel.GetUsers().First.userName = "foobar"
            _usersModel.Save()
        End Sub
     
    End Class

    UsersController has a dependency on IUsersModel, so when unit testing the ChangeFirstUserNameToFoobar method, we pass in a mock implementation of IUsersModel. However, we cannot simply verify that Save() was called. We also need to know what is going to happen when Save() is called. Specifically, we need some way of checking that the first user’s username was changed to “foobar”. This means that a mocking framework like RhinoMocks or Moq will not be sufficient. There must be a fake implementation of IUsersModel that keeps track of the changes that have been made. Now we are getting back into “mock the ObjectContext” territory because that’s basically what we will have done.
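
To make that last point concrete, here is a rough sketch of what such a hand-written fake might look like. FakeUsersModel and SavedUserNames are hypothetical names. IUsersModel and Users are the types from the example above, and Users is assumed to have a settable userName property.

```vbnet
Imports System.Collections.Generic

' Hypothetical hand-written fake for IUsersModel (not part of the original sample)
Public Class FakeUsersModel
    Implements IUsersModel

    Private ReadOnly _users As List(Of Users)
    Private ReadOnly _savedUserNames As New List(Of String)

    Public Sub New(ByVal users As IEnumerable(Of Users))
        _users = New List(Of Users)(users)
    End Sub

    ' The user names as they were the last time Save() was called
    Public ReadOnly Property SavedUserNames() As IList(Of String)
        Get
            Return _savedUserNames
        End Get
    End Property

    Public Function GetUsers() As IEnumerable(Of Users) Implements IUsersModel.GetUsers
        Return _users
    End Function

    Public Sub Save() Implements IUsersModel.Save
        ' "Persist" by recording the current state of every user
        _savedUserNames.Clear()
        For Each u As Users In _users
            _savedUserNames.Add(u.userName)
        Next
    End Sub

End Class
```

A test would construct the fake with a single user, pass it to UsersController, call ChangeFirstUserNameToFoobar() and then check that SavedUserNames(0) is "foobar". Even this tiny fake has to imitate what "saving" means, and it only gets worse once inserts, deletes and relationships are involved.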

There is a definite trend here: each of the above methods is complicated enough that you lose the benefits of isolating your tests from the database. They are all integration tests. In every case you are testing your client code, plus the repository. Since you have to test a repository, it might as well be the real one. Of course, this presents its own challenges. You will want to use a local instance of SQL Server (or whatever DBMS you use) to keep the tests fast (and isolated from other developers) and you will need to roll back changes after each test. In subsequent articles I will look at how to deal with these issues.

Update: I have posted the second article: Unit testing an Entity Framework DAL part 2: Rolling back the test database

WPF nullable combo box source code

Download the code here: WpfNullableComboBox.zip

A few months ago I wrote this article on making a nullable combo box control in WPF. I had a bunch of requests to see an actual implementation, so here is a sample Visual Studio 2008 project with the source code for the user control. It basically uses the same technique I described in my previous blog post. All the combo box properties (even the obscure ones no one will ever use) are implemented.

Easier PropertyChanged notification with PostSharp

As I described in this previous article raising the PropertyChanged event for classes that implement INotifyPropertyChanged can be a real pain. The biggest problem is that PropertyChangedEventArgs takes the name of the property that changed as a string and as we all know, strings are the root of all evil. Here I will show how to use a simple PostSharp attribute on your properties that need to raise the PropertyChanged event when they are changed so that you don’t manually need to do it and hard code the name of the property as a string. PostSharp is a framework for .NET that allows for aspect oriented programming. You can read all about it at the PostSharp website.

First of all, let’s assume that the classes implementing INotifyPropertyChanged are model view classes in the MVVM pattern. We will use a base class for all model views called BaseModelView that looks like this:

Imports System.ComponentModel
 
''' <summary>
''' Parent class for all model views
''' </summary>
Public Class BaseModelView
    Implements INotifyPropertyChanged
 
    Public Event PropertyChanged( _
        ByVal sender As Object, _
        ByVal e As System.ComponentModel.PropertyChangedEventArgs) _
        Implements System.ComponentModel.INotifyPropertyChanged.PropertyChanged
 
    ''' <summary>
    ''' Raises the <c>PropertyChanged</c> event for the property with the given name.
    ''' </summary>
    ''' <param name="propertyName">The name of the property that has changed.</param>
    ''' <remarks>If there is no property on this class with the given name, then an
    ''' exception will be thrown.</remarks>
    Public Sub OnPropertyChanged(ByVal propertyName As String)
 
        ' Throw an exception if the property doesn't exist
        If Me.GetType().GetProperty(propertyName) Is Nothing Then
            Throw New ArgumentException( _
                String.Format("The property {0} doesn't exist on type {1}.", _
                              propertyName, _
                              Me.GetType().Name))
        End If
 
        RaiseEvent PropertyChanged(Me, New PropertyChangedEventArgs(propertyName))
 
    End Sub
 
End Class

This class is very important. There needs to be a method for property change notification (ie: OnPropertyChanged on BaseModelView) instead of just an event (ie: PropertyChanged on INotifyPropertyChanged) because the attribute cannot directly raise an event, but it can call a public method that raises the event. The PostSharp attribute looks like this:

Imports PostSharp.Laos
Imports System.ComponentModel
 
<Serializable()> _
Public Class NotifyAttribute
    Inherits OnMethodBoundaryAspect
 
    Public Overrides Sub OnExit(ByVal eventArgs As PostSharp.Laos.MethodExecutionEventArgs)
        ' Convert to BaseModelView
        Dim notifier = TryCast(eventArgs.Instance, BaseModelView)
 
        ' If the instance is the wrong type then throw an exception
        If notifier Is Nothing Then
            Throw New InvalidOperationException("Cannot raise PropertyChanged event unless instance inherits from BaseModelView.")
        End If
 
        ' Ignore everything that's not a setter
        If eventArgs.Method.Name.StartsWith("set_") Then
            notifier.OnPropertyChanged(eventArgs.Method.Name.Substring(4))
        End If
    End Sub
 
End Class

Note that when you apply a PostSharp attribute to a property, you are actually applying the attribute to the two methods that are generated for that property. For example, if you have a property called MyProperty then the compiler will actually generate two methods: get_MyProperty and set_MyProperty. Since OnExit() will actually get called for both of these methods when we apply the attribute to a property, the code has to check whether the getter or the setter was called. Using the attribute is very simple:

<Notify()> _
Public Property Text() As String
    Get
        Return _text
    End Get
    Set(ByVal value As String)
        _text = value
    End Set
End Property

The result is that the PropertyChanged event will automatically be raised after the setter finishes executing and there is no need to hard code any strings! Now you are free to change the name of your property and it won’t break any code.

WPF rendering thread synchronization

Download the sample project here: WpfRenderingThreadSynchronization.zip

In most applications it is necessary to offload long running processes to an alternate thread so that the rest of the program does not lock up during that time. However, it’s not so simple when the long running process is the actual rendering. Separate windows can have their own UI threads (as explained here) but to my knowledge there is no way to use multiple rendering threads on a single window.

The second problem is that rendering is done in big chunks. For example, if you have an ItemsControl that is bound to an ObservableCollection and a loop that adds 1000 items to that collection, you will notice that the elements are not drawn one at a time. Instead the UI will stall for a moment and then every element will suddenly appear on screen. During the time that it is loading, the entire window will be completely unusable. Basically, what happens is that UI changes (like adding an element to an ItemsControl’s ObservableCollection) all get put into a queue and then the rendering thread deals with a whole bunch of them all at once.

There are two problems with this behaviour:

  • The rest of the UI is unusable while this loading takes place
  • It’s not obvious what is happening during the loading period. Since absolutely nothing is happening on screen, the user might think the app is broken.

It turns out that in cases like this where we have many small rendering operations that add up to a large amount of time, we can force the rendering thread to flush out the Windows message queue after each element is added to the collection. This will not only allow the user to see progress (ie: items appearing in the ItemsControl one at a time) but between each item being added other UI updates can take place giving the illusion that there are separate UI threads.

The included sample draws 1000 TextBoxes inside an ItemsControl, but after each element is added the Windows message queue is flushed out using the FlushWindowsMessageQueue function. All the function does is tell the dispatcher to invoke a delegate that does nothing. The result is that the code blocks at that line until the specified delegate has been run. But since it is at the end of the queue, everything else has to be dealt with first. The function looks like this:

Private Sub FlushWindowsMessageQueue()
    Application.Current.Dispatcher.Invoke( _
        New Action(AddressOf DummySub), _
        DispatcherPriority.Background, _
        New Object() {})
End Sub
 
Private Sub DummySub()
End Sub

When the sample is run with the FlushWindowsMessageQueue() line commented out the whole UI will lock up for a couple of seconds after you click “Refresh data”. However, when the message queue is emptied after each element is added the UI never locks up, even when it is still drawing TextBoxes.
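
The calling code is not shown here, so this is only a sketch of how the loop might look. It is a fragment for the window's code-behind. Items and RefreshData are hypothetical names, with Items assumed to be the ObservableCollection that the ItemsControl is bound to.

```vbnet
' Fragment for the window's code-behind.
' Requires: Imports System.Collections.ObjectModel
' Items and RefreshData are hypothetical names used for illustration.
Private ReadOnly Items As New ObservableCollection(Of String)

Private Sub RefreshData()
    Items.Clear()

    For i As Integer = 1 To 1000
        Items.Add("Item " & i.ToString())

        ' Let the dispatcher work through pending rendering and input
        ' before the next item is added
        FlushWindowsMessageQueue()
    Next
End Sub
```

Commenting out the FlushWindowsMessageQueue() call in a loop like this gives the "everything appears at once" behaviour described above.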

Unfortunately, there are some drawbacks to this method. The most obvious is that it makes the entire rendering operation actually take longer. The trade off is that the first items appear much earlier, but the last items appear later. The technique also cannot be used when the rendering cannot easily be split into many small chunks.

SVN with Visual Studio

Here are some tips for using SVN with a Visual Studio project:

Choosing a client

There are quite a few SVN clients out there. If you like to have one built in to the IDE, then there’s AnkhSVN and VisualSVN. I have used Ankh but found it to be a little buggy (those issues may have been resolved by now) and I have never used VisualSVN because it costs money. I prefer not to have my SVN client built into the IDE though because I often need to do SVN operations outside of Visual Studio anyways. It can also complicate the process a little bit. For example, if you just want to commit your solution file with Ankh, you would probably right click on the solution and choose commit, but this actually commits everything in the solution, not just the solution file. If you use an external client like TortoiseSVN or RapidSVN then you will get full control over your SVN activity. Both clients are good. Tortoise is a Windows Explorer extension, while Rapid is a standalone app. Personally, I use Tortoise, but some people don’t like anything that messes with Explorer.

What not to commit

Most of the time you do not want to commit your binary files, just the source, so you should tell SVN to ignore the bin and obj folders in each project. Every solution also has a .suo file that stores the state of the IDE (eg: the files you have open). So for example, if user X commits his .suo file and user Y does an update, then when user Y reloads the project it will open the windows user X was viewing, not the ones user Y was viewing. This isn’t normally the desired behaviour, so you should also ignore the .suo file (it’s in the same directory as the .sln file).

External libraries

When your solution and project files are being versioned, you don’t want to have references to DLLs on your hard drive with absolute paths. Instead, it is ideal to include a folder called lib in your project and put all your DLLs in there so that the entire folder can be included in SVN, ensuring that the references will work for everyone.

Updating from an external client

If you use an external SVN client like Tortoise or Rapid then you do not need to close Visual Studio to do an update, but for the love of god, make sure to save your files before you update. If any of the files or projects you have opened are changed by the update you will be asked whether to reload them. Say yes (if you have files with unsaved changes those changes will get overwritten, that’s why you need to save before updating). This can take a few seconds if any large projects need to be reloaded. After it’s done you might see a bunch of compile errors. Usually they will disappear with a simple compile, but sometimes false errors will still be displayed until you do a full rebuild of the solution. If the compile errors persist then it is likely a legitimate error and you need to talk to whoever committed last.

Omitting certain projects in a solution

If you have a project in your solution that you do not want to commit (eg: a test project) then it is not enough to simply ignore the files for that project. If you just ignore the files and commit, then when another user gets your update they will receive an error saying that the project you omitted cannot be found. This is because the .sln file keeps track of the projects in your solution, so now it contains a reference to a project with no files. To prevent this you need to right click on the project in Visual Studio and choose “Remove” before committing. This will just remove the project from the solution, it won’t actually delete any files. Note: removing the project changes the .sln file, but it does not automatically save those changes. In order to save the .sln file you need to recompile the solution or do a “save all” (ctrl+shift+S).

Don’t let “Option Strict Off” make you lazy

VB.NET has the sometimes useful feature of late binding, but this seems to lead to poor code. By default, late binding is enabled (ie: Option Strict is set to Off) allowing for implicit narrowing conversions (no cast). Although there are certainly cases where this is a useful feature that can cut down on the amount of reflection code left up to the programmer, I have found that it is more often a cause of less robust code and needless performance degradation.

With Option Strict Off we can write code like this:

Dim obj As Object = "Hello, World!"
Dim str As String = obj

In this case the code will run just fine, and it saved us the hassle of casting obj to String. However, we will obviously run into problems in a situation like this:

Dim obj As Object = "Hello, World!"
Dim int As Integer = obj

Even though int is an Integer this code will compile, but at runtime there will be an InvalidCastException. This is all pretty simple stuff, but the bottom line is that in this case, Option Strict Off gives a runtime error, while Option Strict On gives a compile error. The value of compile-time errors should not be taken lightly, and in my humble opinion they are a programmer’s best friend. With Option Strict On our first sample only needs a minor change:

Dim obj As Object = "Hello, World!"
Dim str As String = DirectCast(obj, String)

Was it really that difficult just to cast it? Type casting is not an inconvenience, but a necessary precaution requiring the programmer to say to the compiler: “Yes, I did intend to perform a narrowing conversion. It was not an accident”.

As a general rule of thumb, I like to set Option Strict On as the project default (go to Project -> Properties -> Compile) and then add Option Strict Off to code files that require it rather than the other way around.
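
Here is a small sketch of what the earlier samples might look like in a file with Option Strict On. The module name is made up, and TryCast and Integer.TryParse are just one way to make the conversions explicit.

```vbnet
Option Strict On

Imports System

' Hypothetical demo module
Module OptionStrictDemo

    Sub Main()
        Dim obj As Object = "Hello, World!"

        ' TryCast returns Nothing instead of throwing when the type does not match
        Dim str As String = TryCast(obj, String)
        If str IsNot Nothing Then
            Console.WriteLine(str)
        End If

        ' Dim int As Integer = obj   ' no longer compiles: implicit narrowing conversion

        ' Parsing has to be requested explicitly, and failure is handled rather than thrown
        Dim number As Integer
        If Integer.TryParse(str, number) Then
            Console.WriteLine(number)
        Else
            Console.WriteLine("Not a number")
        End If
    End Sub

End Module
```

The point is that every conversion that can fail is now visible in the code, so the failure case gets handled on purpose instead of showing up as an InvalidCastException.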

Verify your property names in INotifyPropertyChanged implementation

Update: I have posted another article here that explains what I think is a better solution to this problem using a simple PostSharp attribute.

When you raise the PropertyChanged event you have to pass it a property name as a string. If there is no property with that name then nothing will happen. The listener will not be notified and no exception will be thrown making the problem very difficult to debug. You can change this behaviour and make the application fail at runtime by adding a simple check to your helper function for the event:

Public Sub NotifyPropertyChanged(ByVal propertyName As String)
    ' Throw an exception if the property doesn't exist
    If Me.GetType().GetProperty(propertyName) Is Nothing Then
        Throw New InvalidPropertyNameException()
    End If
 
    RaiseEvent PropertyChanged(Me, New PropertyChangedEventArgs(propertyName))
End Sub

If you put this in a base class for all of your model views (or controllers, or presenters) then you will automatically get this functionality every time, preventing some potentially very annoying bugs.

This still isn’t the ultimate solution because you don’t find out that the property name doesn’t exist until runtime. Ideally, we would get a compile error when the property does not exist. What I would like to do is call the function like this:

NotifyPropertyChanged(AddressOf MyProperty)

This way you wouldn’t have to use a string at all and the compiler would tell you if MyProperty doesn’t exist. Unfortunately, .NET languages only have delegates for functions/subroutines so there is no way to make a strongly typed pointer to a property. Let’s hope they add that in one day, but until then, we’ll have to use strings.
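
For what it’s worth, later versions of the language did get something close to this: Visual Basic 14 (Visual Studio 2015) added the NameOf operator, which the compiler checks against real members. It is not available in the compiler this article was written for. A minimal sketch, with a made-up class and property:

```vbnet
Imports System.ComponentModel

' Hypothetical model view class; requires Visual Basic 14 or later for NameOf
Public Class PersonModelView
    Implements INotifyPropertyChanged

    Public Event PropertyChanged As PropertyChangedEventHandler _
        Implements INotifyPropertyChanged.PropertyChanged

    Private _name As String

    Public Property Name() As String
        Get
            Return _name
        End Get
        Set(ByVal value As String)
            _name = value
            ' NameOf(Name) evaluates to "Name" at compile time, so renaming or
            ' removing the property turns a stale reference into a compile error
            RaiseEvent PropertyChanged(Me, New PropertyChangedEventArgs(NameOf(Name)))
        End Set
    End Property

End Class
```

The event still receives a string, but the string is no longer typed by hand.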

Nullable combo boxes in WPF

Update: Sample source code demonstrating this technique can be downloaded here: WpfNullableComboBox.zip

By default, combo boxes in WPF have some really annoying behaviour. When the control is initialized, if the SelectedItem is Nothing then the default selection will be blank, but as soon as you choose an item in the combo box, you can not reselect the blank/null option. One quick way around this is to add a null placeholder object to your ItemsSource. There are a few problems with this approach though:

  1. The null placeholder cannot actually be Nothing/null or else selecting the value will have no effect. Instead, it needs to be some object that represents “null”. This means that if you want your setter on the property bound to the SelectedItem to be set to null, you need to convert the object representing null to actually be Nothing/null.
  2. Ideally, you should not have to alter the collection in your model view/controller/presenter that is bound to the ItemsSource just to add a null option. It would be better if we could just specify in XAML that this combo box should have a null option that actually sets the SelectedItem to Nothing.

We don’t want to alter the collection in the controller, and we cannot have a combo box item of Nothing (we need a null place holder object instead). At the same time, we don’t want the SelectedItem property to ever have the null place holder object as its value (we want it to just be Nothing when that is chosen). That leaves two options:

  1. Use two converters: one on the ItemsSource to add in the null place holder object and one on SelectedItem to convert the place holder to Nothing.
  2. Create a user control that acts as a wrapper around the combo box control. All the necessary logic could be handled within the user control.

Option one would look like this:

<ComboBox
    ItemsSource="{Binding MyItems, Converter={StaticResource addNullPlaceHolderConverter}}"
    SelectedItem="{Binding MySelectedItem, Converter={StaticResource placeHolderToNullConverter}}" />

In my opinion, that method really sucks. You have to add the converters in your resources section and then specify them in two places. Another issue is that we could run into some major converter explosion if it turns out that you already need some other converter on one of the properties. Then you have to make a new converter that combines the two. I don’t like it.

Option two looks like this:

<local:NullableComboBox
    ItemsSource="{Binding MyItems}"
    SelectedItem="{Binding MySelectedItem}" />

Much better!

The XAML for the user control is extremely simple. You just need to create a combo box with a name:

<UserControl x:Class="NullableComboBox"
    xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
    xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml">
    <ComboBox x:Name="combo" />
</UserControl>

In the code-behind you need to expose two dependency properties: SelectedItem and ItemsSource so that the control has the same interface as a regular combo box.

By listening to the combo box’s SelectionChanged event you can update the SelectedItem property on the NullableComboBox except with the place holder converted to Nothing.
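
A rough sketch of what that code-behind could look like is below. It is not the downloadable implementation, just an illustration of the two dependency properties and the SelectionChanged handling. The NullPlaceHolder class, the _syncing flag and the UpdateCombo method are hypothetical, and the sketch copies the items once rather than tracking later changes to the source collection.

```vbnet
Imports System.Collections
Imports System.Collections.Generic
Imports System.Windows
Imports System.Windows.Controls

Partial Public Class NullableComboBox

    ' Stand-in for "null" inside the inner combo box; shows up as a blank entry
    Private NotInheritable Class NullPlaceHolder
        Public Overrides Function ToString() As String
            Return String.Empty
        End Function
    End Class

    Private Shared ReadOnly PlaceHolder As New NullPlaceHolder()
    Private _syncing As Boolean

    Public Shared ReadOnly ItemsSourceProperty As DependencyProperty = _
        DependencyProperty.Register("ItemsSource", GetType(IEnumerable), GetType(NullableComboBox), _
            New FrameworkPropertyMetadata(Nothing, _
                New PropertyChangedCallback(AddressOf OnItemsSourceChanged)))

    Public Shared ReadOnly SelectedItemProperty As DependencyProperty = _
        DependencyProperty.Register("SelectedItem", GetType(Object), GetType(NullableComboBox), _
            New FrameworkPropertyMetadata(Nothing, _
                FrameworkPropertyMetadataOptions.BindsTwoWayByDefault, _
                New PropertyChangedCallback(AddressOf OnSelectedItemChanged)))

    Public Property ItemsSource() As IEnumerable
        Get
            Return DirectCast(GetValue(ItemsSourceProperty), IEnumerable)
        End Get
        Set(ByVal value As IEnumerable)
            SetValue(ItemsSourceProperty, value)
        End Set
    End Property

    Public Property SelectedItem() As Object
        Get
            Return GetValue(SelectedItemProperty)
        End Get
        Set(ByVal value As Object)
            SetValue(SelectedItemProperty, value)
        End Set
    End Property

    Public Sub New()
        InitializeComponent()
        AddHandler combo.SelectionChanged, AddressOf OnComboSelectionChanged
    End Sub

    Private Shared Sub OnItemsSourceChanged(ByVal d As DependencyObject, ByVal e As DependencyPropertyChangedEventArgs)
        ' Copy the items and put the place holder first
        Dim items As New List(Of Object)
        items.Add(PlaceHolder)
        If e.NewValue IsNot Nothing Then
            For Each item As Object In DirectCast(e.NewValue, IEnumerable)
                items.Add(item)
            Next
        End If
        DirectCast(d, NullableComboBox).UpdateCombo(items)
    End Sub

    Private Shared Sub OnSelectedItemChanged(ByVal d As DependencyObject, ByVal e As DependencyPropertyChangedEventArgs)
        DirectCast(d, NullableComboBox).UpdateCombo(Nothing)
    End Sub

    ' Pushes the control's state into the inner combo box without echoing it back
    Private Sub UpdateCombo(ByVal items As IList)
        _syncing = True
        Try
            If items IsNot Nothing Then combo.ItemsSource = items
            If SelectedItem Is Nothing Then
                combo.SelectedItem = PlaceHolder
            Else
                combo.SelectedItem = SelectedItem
            End If
        Finally
            _syncing = False
        End Try
    End Sub

    Private Sub OnComboSelectionChanged(ByVal sender As Object, ByVal e As SelectionChangedEventArgs)
        If _syncing Then Return
        ' Convert the place holder back to Nothing before exposing the value
        If combo.SelectedItem Is PlaceHolder Then
            SelectedItem = Nothing
        Else
            SelectedItem = combo.SelectedItem
        End If
    End Sub

End Class
```

The flag is there because replacing the inner ItemsSource can clear the inner selection, and without the guard that would push Nothing back into the bound property.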