LINQ to SQL Gotcha #3: Chaining Where Clauses

In LINQ to SQL you can chain multiple where clauses like this:

Module Module1
 
    Sub Main()
 
        Using context As New TestDataContext
            context.Log = Console.Out
            Dim articles = context.Articles.Where(Function(a) a.articleID > 10) _
                                           .Where(Function(a) a.articleID Mod 2 = 0) _
                                           .ToList()
        End Using
 
        Console.ReadKey()
 
    End Sub
 
End Module

This will generate SQL that looks roughly like this:

SELECT ... FROM Articles WHERE articleID > 10 AND articleID % 2 = 0

Since chained where clauses are equivalent to ANDing multiple expressions in a single WHERE, the above SQL is exactly what you would expect to see. Unfortunately, things get more complicated when one of the expressions cannot be converted to SQL, like in this case:

Module Module1
 
    Sub Main()
 
        Using context As New TestDataContext
            context.Log = Console.Out
            Dim articles = context.Articles.Where(AddressOf FilterArticle).ToList()
        End Using
 
        Console.ReadKey()
 
    End Sub
 
    Function FilterArticle(ByVal a As Article) As Boolean
        Return a.articleID Mod 2 = 0
    End Function
 
End Module

The above code generates SQL that looks like this:

SELECT ... FROM Articles

The query has no where clause, it just loads all the articles and then filters them on the client side. It’s usually optimal to do the filtering on the SQL side, but the behaviour is reasonable. I wouldn’t expect the ORM to be capable of magically converting the contents of the FilterArticle function into SQL (it sure would be nice though). This is still expected behaviour, but here’s an example where things get weird:

Module Module1
 
    Sub Main()
 
        Using context As New TestDataContext
            context.Log = Console.Out
            Dim articles = context.Articles.Where(AddressOf FilterArticle) _
                                           .Where(Function(a) a.articleID > 10) _
                                           .ToList()
        End Using
 
        Console.ReadKey()
 
    End Sub
 
    Function FilterArticle(ByVal a As Article) As Boolean
        Return a.articleID Mod 2 = 0
    End Function
 
End Module

This code generates the same SQL as last time:

SELECT ... FROM Articles

It is filtering both where clauses on the client side even though the second one could have been converted to SQL. If you flip the where clauses like this:

Module Module1
 
    Sub Main()
 
        Using context As New TestDataContext
            context.Log = Console.Out
            Dim articles = context.Articles.Where(Function(a) a.articleID > 10) _
                                           .Where(AddressOf FilterArticle) _
                                           .ToList()
        End Using
 
        Console.ReadKey()
 
    End Sub
 
    Function FilterArticle(ByVal a As Article) As Boolean
        Return a.articleID Mod 2 = 0
    End Function
 
End Module

then you will still get the expected SQL:

SELECT ... FROM Articles WHERE articleID > 10

The where clause that can be converted to SQL is filtered in the SELECT statement, but the clause that cannot be converted is filtered on the client side. I would have hoped that the order of the where clauses would not matter since they are just being ANDed, but that is not the case.

The lesson is that if you need to chain a where clause that cannot be converted to SQL, try to put it at the end of the chain. This can be a real issue if you are using a data access layer that automatically filters queries (eg: for security) with a function that cannot convert to SQL. If all of your LINQ to SQL queries have this built in filter then none of them will ever generate WHERE clauses in the SQL, it will just load the entire table every time.

WPF Responsiveness: Asynchronous Loading Animations During Rendering

Download the code here: WpfLoadingOverlay.zip

A common issue in MDI or TDI style user interfaces is that it can take a long time to render new forms when they are opened. Even a one or two second delay is enough to make an application seem very unresponsive. If there’s nothing you can do to improve the actual performance (ie: the total time to load the form) you can at least improve the perceived performance. This code sample creates a simple form with a tab control and a button. Every time you click the button it adds a new tab with 4000 text boxes. Depending on the speed of your computer the form will probably take around five seconds to load. As soon as you click the button, the new tab appears with a loading animation that continues until the contents have been rendered. This is not a trivial task because we want to render a loading animation at the same time that we are waiting for another long rendering operation to complete. Basically, we need two rendering threads. You can’t have multiple rendering threads in a single window, but you can put your loading animation in a new window with its own rendering thread and make it look like it’s not a separate window.

The loading overlay is a separate, chromeless window that does not appear on the taskbar and disappears as soon as loading is complete. The window is also semi-transparent and automatically positioned exactly over top of the form that is loading, so it looks like it is a part of the existing window.

Download the sample project and try it out.

LINQ to SQL Gotcha #2: GetChangeSet Weirdness

If you attach an entity with a required association that is nulled out, you will be unable to call GetChangeSet(). In my opinion, the expected behaviour is that the entity should show up in the change set as though it is valid, but an exception should be thrown when you attempt to call SubmitChanges() because a foreign key constraint has been violated. In fact, with code like this we will get exactly that result (an exception is thrown on SubmitChanges()):

Using testData As New TestDataContext
    Dim newArticle As New Article With {.title = "Foobar", _
                                        .text = "blah blah blah"}
    testData.Articles.InsertOnSubmit(newArticle)
    Dim changes = testData.GetChangeSet()
    testData.SubmitChanges()
End Using

There is a required association to the Users table that has not been set at all. Using the following snippet, with the User property explicitly set to Nothing an exception will be thrown on GetChangeSet() instead of SubmitChanges():

Using testData As New TestDataContext
    Dim newArticle As New Article With {.title = "Foobar", _
                                        .text = "blah blah blah", _
                                        .User = Nothing}
    testData.Articles.InsertOnSubmit(newArticle)
    Dim changes = testData.GetChangeSet()
    testData.SubmitChanges()
End Using

It gives this error on GetChangeSet():

An attempt was made to remove a relationship between a User and a Article. However, one of the relationship’s foreign keys (Article.userID) cannot be set to null.

It appears that the internal implementation of LINQ to SQL distinguishes between an unset relationship, and one that has specifically been set to Nothing. The awkward thing here is that it is not always easy to avoid this issue since you don’t even have to call InsertOnSubmit. Attaching an entity by setting an association to an already attached object gives the same result.

Using testData As New TestDataContext
    Dim existingUser = testData.Users.First
    Dim newUserGroup As New UserGroup With {.User = existingUser, .Group = Nothing}
    Dim changes = testData.GetChangeSet()
    testData.SubmitChanges()
End Using

In this snippet there are two required associations: User and Group. As soon as User is set, the UserGroup entity is attached to the DataContext. However, since Group is Nothing the ChangeSet is now corrupt.

This bug is described in this forum thread where a Microsoft employee called it a bug and recommended that he post it on Connect (Microsoft’s bug tracking site). The bug report on Connect can be found here. One hour after it was posted Microsoft replied saying this:

We are currently investigating. The investigation process normally takes 7-14 days.

They then went silent for 9 months before posting this:

Hi,

Thank you for taking the time to send this feedback and bug report. We have reviewed the issue and confirmed the behavior, but we will not be fixing this in the next release of LINQ to SQL.

LINQ to SQL Team

That’s Microsoft for ya.

LINQ to SQL Gotcha #1: Unexpected LoadWith Behaviour

By default, LINQ to SQL uses deferred loading. When you want to eager load an entity’s associated data you need to set DataLoadOptions using the LoadOptions property on the DataContext. If you have a one-to-many relationship between Users and Articles you can force LINQ to SQL to eager load Articles with Users like this:

Using testData As New TestDataContext
 
    ' Log SQL queries to the console
    testData.Log = Console.Out
 
    ' Set LoadOptions
    Dim options As New DataLoadOptions
    options.LoadWith(Function(user As User) user.Articles)
    testData.LoadOptions = options
 
    ' Load users with their articles
    Dim users = testData.Users.ToList
    For Each user In users
        Dim articles = user.Articles.ToList
    Next
 
End Using

This will generate a single SELECT statement with a JOIN on the Articles table. The same goes for for one-to-one relationships. You can also use LoadWith as many times as you want. For one-to-one relationships and no more than a single one-to-many relationship this will still generate one query with JOINs to all the LoadWith tables. However, if you want to eager load multiple one-to-many relationships you will get into a select N + 1 situation (or worse). For example, this code eager loads Articles and UserGroups with each User entity:

Using testData As New TestDataContext
 
    ' Log SQL queries to the console
    testData.Log = Console.Out
 
    ' Set LoadOptions
    Dim options As New DataLoadOptions
    options.LoadWith(Function(user As User) user.Articles)
    options.LoadWith(Function(user As User) user.UserGroups)
    testData.LoadOptions = options
 
    ' Load users with their articles
    Dim users = testData.Users.ToList
    For Each user In users
        Dim articles = user.Articles.ToList
        Dim userGroups = user.UserGroups.ToList
    Next
 
End Using

Technically, the behaviour here is correct. It will successfully eager load both the Articles and UserGroups collections for each User, but it will not do it in a single query. When I ran this I got one query that fetched the Users and Articles like last time, but then a separate SELECT for each UserGroup rather than another JOIN. Even though this won’t alter the behaviour of the code, it will definitely make a major impact on performance, especially if there are a lot of users in the database.

Scott Guthrie confirmed this behaviour in a post on David Hayden’s blog. This is what he said:

In the case of a 1:n associations, LINQ to SQL only supports joining-in one 1:n association per query.

Lame.

SQL Server Management Studio and default databases

There is a really annoying bug (or at least what I would call a bug) in SQL Server Management Studio where you cannot login with a user whose default database does not exist. Even if you are already logged in and you rename the default database, you will automatically be logged out and will receive an error every time you try to login again. You can always login as a different user and change the default database, but if you only have access to the one account, you can change the default database using sqlcmd.

First, login with a different database in a command window:

sqlcmd -d master -S server -U username -P password

Then issue the following commands to change the default database:

1> ALTER LOGIN login_name WITH DEFAULT_DATABASE = master
2> GO

You should now be able to login with this account through management studio.

High performance database rollback in automated tests with SQL Server

A couple months ago I wrote this article explaining why I think it is reasonable for unit tests to hit a real database. Subsequently, I wrote a follow up article describing some techniques for rolling back your database to its original state after each test. In that article I found that just using simple transactions did not solve the problem because you need access to all database connections being used, and they all have to be rolled back. I have since found a way around this problem using distributed transactions.

With the Microsoft Distributed Transaction Coordinator (MSDTC) the activity over multiple connections can be lumped into a single transaction using the TransactionScope class. MSDTC needs to be running for this to work, but since this is just for unit tests it doesn’t need to be enabled on your production environment.

In order to use the TransactionScope class your project will need a reference to System.Transactions. Here’s a sample unit test using MSTest and Entity Framework where the database is altered with multiple connections within a transaction and then the changes are rolled back:

Imports System.Transactions
Imports System
Imports System.Text
Imports System.Collections.Generic
Imports Microsoft.VisualStudio.TestTools.UnitTesting
 
<TestClass()> _
Public Class UnitTestSample
 
    <TestMethod()> _
    Public Sub ProofOfConceptTest()
        Using New TransactionScope
            Dim conn1 As New DataTestEntities
            Dim conn2 As New DataTestEntities
 
            Dim row1 As New Users With {.userName = "user1", .password = "pass"}
            Dim row2 As New Users With {.userName = "user2", .password = "pass"}
 
            conn1.AddToUsers(row1)
            conn2.AddToUsers(row2)
 
            conn1.SaveChanges()
            conn2.SaveChanges()
 
            Dim conn3 As New DataTestEntities
            Assert.AreEqual(conn3.Users.Count, 6)
        End Using
    End Sub
 
End Class

Alternatively, if you want every test method inside a test class to be within its own TransactionScope without adding a Using block to every single test, you can use the initialization and cleanup methods like this:

Imports System.Transactions
Imports System
Imports System.Text
Imports System.Collections.Generic
Imports Microsoft.VisualStudio.TestTools.UnitTesting
 
<TestClass()> _
Public Class UnitTestSample
 
    Private _transaction As TransactionScope
 
    <TestInitialize()> _
    Public Sub Setup()
        _transaction = New TransactionScope
    End Sub
 
    <TestCleanup()> _
    Public Sub TearDown()
        _transaction.Dispose()
    End Sub
 
    <TestMethod()> _
    Public Sub ProofOfConceptTest()
        Dim conn1 As New DataTestEntities
        Dim conn2 As New DataTestEntities
 
        Dim row1 As New Users With {.userName = "user1", .password = "pass"}
        Dim row2 As New Users With {.userName = "user2", .password = "pass"}
 
        conn1.AddToUsers(row1)
        conn2.AddToUsers(row2)
 
        conn1.SaveChanges()
        conn2.SaveChanges()
 
        Dim conn3 As New DataTestEntities
        Assert.AreEqual(conn3.Users.Count, 6)
    End Sub
 
End Class

As long as the use of MSDTC is an option, I have found this method to be far better than any of those described in the last article. It guarantees that the state or your database is maintained and is extremely fast (at least on small amounts of data).

Finding all validation errors on an IDataErrorInfo object

WPF data binding has built in support for IDataErrorInfo, so it is easy to display a validation error when a property has invalid data. However, sometimes we need to manually find all the validation errors on an object. A perfect example of this is when trying to save data. Often you will want to verify that there are no validation errors before allowing the save operation to proceed. Using the MVVM pattern it would be ideal to determine whether there are any errors purely within your model view. The default behavior of IDataErrorInfo does not give a collection of all current errors. Instead, it will just tell you if a given property has an error (ie: using IDataErrorInfo.Item), so all we have to do to find all the errors is enumerate through each property on the class and call the Item property with that property name as the argument. The GetValidationErrors function below does exactly that:

Option Strict On
 
Imports System.ComponentModel
 
Public Class ValidationHelper
 
    ''' <summary>
    ''' Checks for errors in <c>validatable</c> and returns all the errors found.
    ''' </summary>
    Public Shared Function GetValidationErrors(ByVal validatable As IDataErrorInfo) As IEnumerable(Of DataError)
 
        Dim errors As New List(Of DataError)
 
        ' Iterate through every property in the class
        For Each prop In validatable.GetType.GetProperties
 
            ' If the property has an error, then add it to the list
            Dim errorMessage = validatable(prop.Name)
            If errorMessage IsNot Nothing Then
                errors.Add(New DataError(prop.Name, errorMessage))
            End If
 
        Next
 
        Return errors
 
    End Function
 
    ''' <summary>
    ''' Represents a single error.  It's a propertyName, errorMessage pair
    ''' </summary>
    Public Class DataError
 
        Private _propertyName As String
        Private _errorMessage As String
 
        Public Sub New(ByVal propertyName As String, ByVal errorMessage As String)
            _propertyName = propertyName
            _errorMessage = errorMessage
        End Sub
 
        ''' <summary>
        ''' The name of the property that has the error.
        ''' </summary>
        Public ReadOnly Property PropertyName() As String
            Get
                Return _propertyName
            End Get
        End Property
 
        ''' <summary>
        ''' A description of the property's error.
        ''' </summary>
        Public ReadOnly Property ErrorMessage() As String
            Get
                Return _errorMessage
            End Get
        End Property
 
    End Class
 
End Class

Specifying expected DataContext type in WPF

Josh Smith just made a blog post about XAML DataContext comments when using the MVVM pattern. He makes a great point, which is that in many cases it is not immediately obvious what the DataContext of a view is intended to be. A simple comment as Josh suggests will go a long way, but the downside is that comments create a maintainability issue. If you rename a model view, or refactor code so that a page, window or user control expects a different DataContext you also need to update the comment. Here’s an example:

<!-- DataContext = SampleModelView -->
<UserControl x:Class="SampleUserControl"
    xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
    xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml" Width="300" Height="300">
    <Grid>
 
    </Grid>
</UserControl>

But now what if I change my mind and decide that instead of SampleModelView, this UserControl will have AlternateModelView as its DataContext? If I forget to update the comment then it is now a source of misinformation. What I would really like to do is somehow specify the expected data context type for a given UI element, so I created an attached property called ExpectedDataContextType. When the element is loaded, it will fail at runtime if the DataContext is not of the desired type. It looks like this:

<UserControl x:Class="TestControl"
    xmlns:local="clr-namespace:ExpectedDataContextType"
    xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
    xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml" Width="300" Height="300"
    local:DataContextHelper.ExpectedDataContextType="{x:Type local:TestModelView}">
    <Grid>
        <TextBlock>Hello</TextBlock>
    </Grid>
</UserControl>

The code is extremely simple. It just attaches a handler to the Loaded event and then checks the type. When the types do not match up it gives you a warning via a message box if you are debugging. I chose not to throw an exception because the exception gets covered up and just ends up as a tiny message on Visual Studio immediate window. When I am debugging and a control has the wrong DataContext I want to make sure I find out about it, hence the message box. You can always replace the message box code with something else though.

Option Strict On
 
Public Class DataContextHelper
 
    Public Shared Function GetExpectedDataContextType(ByVal element As DependencyObject) As Type
        If element Is Nothing Then
            Throw New ArgumentNullException("element")
        End If
 
        Return DirectCast(element.GetValue(ExpectedDataContextTypeProperty), Type)
    End Function
 
    Public Shared Sub SetExpectedDataContextType(ByVal element As DependencyObject, ByVal value As Type)
        If element Is Nothing Then
            Throw New ArgumentNullException("element")
        End If
 
        element.SetValue(ExpectedDataContextTypeProperty, value)
    End Sub
 
    Public Shared ReadOnly ExpectedDataContextTypeProperty As  _
                           DependencyProperty = DependencyProperty.RegisterAttached("ExpectedDataContextType", _
                           GetType(Type), GetType(DataContextHelper), _
                           New FrameworkPropertyMetadata(Nothing, AddressOf OnExpectedDataContextTypeChanged))
 
    Private Shared Sub OnExpectedDataContextTypeChanged(ByVal obj As DependencyObject, ByVal args As DependencyPropertyChangedEventArgs)
        Dim element = DirectCast(obj, FrameworkElement)
        AddHandler element.Loaded, AddressOf OnElementLoaded
    End Sub
 
    Private Shared Sub OnElementLoaded(ByVal sender As Object, ByVal args As RoutedEventArgs)
        Dim element = DirectCast(sender, FrameworkElement)
 
        RemoveHandler element.Loaded, AddressOf OnElementLoaded
 
        ' Compare the expected type to the actual type
        Dim expectedDataContextType = GetExpectedDataContextType(element)
        Dim actualDataContextType = element.DataContext.GetType
        If expectedDataContextType IsNot actualDataContextType Then
 
#If DEBUG Then
            ' The types don't match and debug mode is on so notify the developer that the element
            ' has the wrong data context
            MessageBox.Show(String.Format("DataContext has type {0}. Expected {1}.", _
                                          actualDataContextType.ToString, _
                                          expectedDataContextType.ToString))
#End If
 
        End If
    End Sub
 
End Class

The end result is that you are still specifying what the type of your DataContext should be, but now it is actually enforced. The best part is that if the you rename your model view without updating the ExpectedDataContextType property you will get a compile error because the type no longer exists. If the type still exists then you have to settle for a runtime error.

WPF super converters are bad for unit testing

I recently learned about a WPF converter library on CodePlex. They are all very flexible and it is pretty amazing what you can do with just a few converters. Here’s a sample usage of one of the converters from the project’s documentation:

<MultiBinding>
	<MultiBinding.Converter>
		<con:ExpressionConverterExtension>
			<con:ExpressionConverterExtension.Expression>
				<![CDATA[
				{0} && {1} && {3} && !{2}
				]]>
			</con:ExpressionConverterExtension.Expression>
		</con:ExpressionConverterExtension>
	</MultiBinding.Converter>
</MultiBinding>

This MultiConverter returns true if parameters 0, 1 and 3 are true and parameter 2 is false. At first this looks pretty awesome, but I think it is a little dangerous for any team trying to maximize their test coverage. As soon as you start putting relatively complex logic right into your XAML you are basically writing code that cannot be tested. One of the fundamental principles of MVVM is that your view should be simple. If this {0} && {1} && {3} && !{2} logic is moved to a regular converter, or better yet (maybe) into a ModelView it becomes testable code.

MultiBindings are NOT useless in MVVM

Lately I have been detecting a growing sentiment that any use of MultiBinding is poor practice in MVVM (this thread on WPF disciples for example). However, I think people who say you should never use MultiBinding are missing its point. It is probably better to make a single binding to a FullName property than to multi bind to FirstName and LastName and then convert it into a full name. But there is still a scenario that demands the use of MultiBinding: bindings that need to update when more than one UI property changes. MultiBindings are not just used for passing more information to the converter, they also specify when the value needs to be updated and call the converter again. This is a much more important feature and it requires MultiBinding.