A lesson on using closed source libraries

A challenging question that programmers are often faced with when chosing a library is whether to go with an open source option or a closed source product that may come with professional support. The answer of course is “it depends” and the quality of each library will often be more important than whether or not it is open source. In the past I have not concerned myself too much with whether or not the source code is available to me because I normally don’t plan on ever touching it. I don’t use an existing library because I want to spend weeks digging through source code and internals, I use them because I don’t want to spend much time on that feature. I want it to just work.

In recent projects I have looked closely at both LINQ to SQL and NHibernate for my ORM needs. Since LINQ to SQL is easy to use and meets my requirements it seemed like a no brainer to use a product from Microsoft over NHibernate since I never planned to edit the NHibernate source. However, a bug that I recently found in LINQ to SQL has made me think differrently (I wrote about the bug here). I don’t blame Microsoft for having a bug in their code since all software has bugs, but when I found out that Microsoft does not plan to fix the issue, I suddenly realized that the team had hit a brick wall. We had already made a big committment to LINQ to SQL. Now there is a bug that Microsoft won’t fix, but they also won’t let anyone else fix it! If NHibernate had been chosen then the worst case scenario is that you have to fix it yourself (though it is likely that someone else will fix it first).

The lesson: proprietary libraries are not the safe option. The only way to ensure bugs will be fixed is if you can fix them yourself.

Or maybe I’m wrong and the real lesson is to chose the library that wasn’t implemented by Microsoft.

LINQ queries return queries not data

The title of this article is a pretty obvious statement, but it’s actually pretty easy to forget and it can lead to some painful bugs. Here’s a code snippet whose output may seem surprising:

Module Module1
 
    Sub Main()
        Dim query = 
            From name In {"one", "two", "three"}
            Select New User(name)
 
        Dim x = query.First
        Dim y = query.First
 
        Console.WriteLine(x Is y)
 
        Console.ReadKey()
    End Sub
 
End Module
 
Public Class User
    Public Sub New(ByVal username As String)
        Console.WriteLine("Creating user: " & username)
    End Sub
End Class

At first glance it looks like x and y are the same object, but since query is just a query, not an actual collection, the result will be fetched independently each time you call First. When the code is run, the console output looks like this:

Creating user: one
Creating user: one
False

This shows that the constructor was called twice for the same string, which explains why x and y are actually different objects. In a case like this it is better to put ToList at the end of the query:

        Dim users =
            (From name In {"one", "two", "three"}
             Select New User(name)).ToList()

Bugs like this can be particularly problematic when you pass around an object of type IEnumerable and the programmer assumes that they are dealing with a collection, when they are really dealing with a query. So… watch out!

Visual Basic 10 properties still lag behind C#

Up until Visual Studio 2010, simple property definitions were always ridiculously verbose. For example:

    Private _title As String
    Public Property Title() As String
        Get
            Return _title
        End Get
        Set(ByVal value As String)
            _title = value
        End Set
    End Property

That’s 9 lines of code just to make a simple string property. Luckily, in VB 10 we will be able to write these simple properties in one line:

    Public Property Title As String

This is great, but what if things get a little more complicated? Obviously this syntax just uses the default getter and setter, both of which have the same scope. When I write my properties I like avoid using ReadOnly and instead make the setter private so that I can still encapsulate the setting code within the class. With VB 10 you still have to define the property the old way in order to do this:

    Private _title As String
    Public Property Title() As String
        Get
            Return _title
        End Get
        Private Set(ByVal value As String)
            _title = value
        End Set
    End Property

But now it’s back to 9 lines just because I wanted the setter to be private. In C#, programmers have the luxury of writing properties like this:

    public string Title { get; private set; }

Furthermore, neither language allows you to use a default getter and custom setter (or vice versa) probably because it would make it difficult for them both to use the same backing store, but that’s a whole other tangent. Overall, the new abbreviated syntax for properties in VB 10 is great for the simple cases, but you have to revert to the old, verbose method if you want to deviate from the default in even the slightest way.

LINQ to SQL Gotcha #6: Delete, Save, Insert, CRASH

If you keep the same entity around after it has been deleted and SubmitChanges() is called then you can run into an InvalidOperationException if you try to insert it again.

var data = new DataClasses1DataContext();
var user = new User() { userName = "foo", password = "bar" };
 
data.Users.InsertOnSubmit(user);
data.SubmitChanges();
 
data.Users.DeleteOnSubmit(user);
data.SubmitChanges();
 
data.Users.InsertOnSubmit(user);
data.SubmitChanges();

Here you will actually get an exception on the second InsertOnSubmit() because the data context remembered the entity and will no longer allow you to attach it for some reason. Once an entity has been deleted from a data context it can never go back. To get around this you need to either insert the entity to a different data context or copy the data to a new instance of the same entity class and then insert it. This has been confirmed here.

Note: You are free to call DeleteOnSubmit() and then InsertOnSubmit() all you want as long as you never call SubmitChanges().

Singleton WCF service gotcha: allowing more than 10 connections

By default, an instance of a WCF service can only have 10 connections at a time. If your service is a singleton then there is obviously only one instance, which means that only 10 clients can connect to your service at a time. All subsequent connection attempts will result in timeouts if one of the 10 spots is not freed up. You can easily change this from the default by adding a serviceThrottling entry to your behavior definition in the app.config/web.config file. Here’s an example that cranks the max up to 50:

<serviceThrottling maxConcurrentCalls="100" maxConcurrentSessions="50" maxConcurrentInstances="50" />

Updating and restarting a singleton WCF service hosted in IIS

I recently moved a singleton WCF service to IIS that used to be hosted within a Windows service. When it was hosted in a Windows service restarting it was easy, you just go to the service manager in Windows and click restart, but to my knowledge there is no direct equivalent for a WCF service running inside IIS. Since a singleton service continues to run indefinitely once it starts I was afraid that even after publishing an update or changing the web.config the old code would continue to run until I forced a restart somehow. I didn’t want to have to restart IIS every time there is a change to the config file, so I made a test project and ran some experiments. It turns out that when any change is made to the service (ie: any of the binaries, the svc file, or the web.config are replaced) the service immediately restarts. This means that all of your existing connections will be terminated and the service class will be recreated. Unlike a Windows service you don’t have to manually force a restart when you publish an update. The bad news is that I still don’t know of a way to simply restart the service without making any changes. Aside from restarting IIS (which is obviously a terrible solution) the only way I know is to make some random negligible change to the config file (like changing a comment or maybe even just whitespace). I guess that in theory singleton services are not meant to be restarted, but you’re bound to want to do it at some point.

CruiseControl.NET: Versioning ccnet.config and integrating it into CI

Cruise Control is meant to help manage your builds based on the code in source control, and since ccnet config files are basically code, it only makes sense that they should be a part of this process. Basically, there are two problems I wanted to solve:

  • If I change ccnet.config and introduce an error, I want to be able to roll back to a previous version in SVN.
  • When multiple developers are working on ccnet.config, they should not be fighting over the same file on the CI server. Instead, each developer should be editing a file in their working copy and then merging changes into the repository.

Both of these issues are actually very easy to solve. Using SVN, you can create a repository with only one file: ccnet.config. Then add this project to both the ccnet.config in SVN and the one actually used by Cruise Control (probably in C:\Program Files\CruiseControl.NET or something like that):

  <project name="ccnet config">
    <sourcecontrol type="svn">
      <executable>C:\path\to\svn\on\CI\server\svn.exe</executable>
      <workingDirectory>C:\CI</workingDirectory>
      <trunkUrl>http://localhost/svn/myrepo/trunk</trunkUrl>
      <autoGetSource>true</autoGetSource>
      <username>SVN_USERNAME</username>
      <password>SVN_PASSWORD</password>
    </sourcecontrol>
    <tasks>
      <exec>
        <executable>C:\Windows\System32\xcopy.exe</executable>
        <buildArgs>/Y C:\CI\ccnet-config\ccnet.config "C:\Program Files\CruiseControl.NET\server"</buildArgs>
      </exec>
    </tasks>
  </project>

All this does is check out the config file and then copy it to the location used by Cruise Control. Remember to edit all the paths above with the actual ones for your setup.

That’s all you need to do! The config file is now versioned and Cruise Control is automatically updating itself every time you commit.

LINQ to SQL Gotcha #5: Column Default Values

A common pattern in database design is to use make a column required, give it a default value and then never think about it when doing INSERTs. A perfect example would be a createdDate column on the Users with a default value of GetDate(). Here’s the full table definition:

  • userID (identity key)
  • userName
  • password
  • ts (timestamp)
  • createdDate (default value = GetDate())

In this case we can easily insert into the table without worrying about the createdDate, userID, or ts columns:

INSERT INTO Users (userName, password) VALUES ('asdf', 'qwer')

However, since this is the 21st century, we don’t want to do this in SQL, we want to do it with an ORM. Unfortunately, LINQ to SQL doesn’t do a very good job with this.

Using context = New TestDataContext
 
    ' Output SQL to the console for debugging
    context.Log = Console.Out
 
    ' Attach a new user and submit the changes
    Dim newUser As New User With {.userName = "NewUser", .password = "password"}
    context.Users.InsertOnSubmit(newUser)
    context.SubmitChanges()
 
End Using

The above code generates the following INSERT statement when SubmitChanges() is called (note: I replaced @p0, @p1, etc with their actual values to make the query more readable):

INSERT INTO [dbo].[Users]([userName], [password], [createdDate]) VALUES ('NewUser', 'password', NULL)

This query fails and we get a SqlTypeException because createdDate is NOT NULL and NULL cannot be converted to a valid date. Notice that the generated SQL does not attempt to explicitly set a value for userID or ts. It appears that LINQ to SQL knows how to deal with IDENTITY fields and TIMESTAMPs, but not how to deal with other required columns that happen to have a default value.

I would have expected LINQ to SQL to generate a query that does not explicitly set createdDate so that SQL Server could handle it, but no such luck. You can easily set the createdDate manually like this:

Dim newUser As New User With {.userName = "NewUser", .password = "password", .createdDate = Date.Now}

It really sucks to have to do this every time though, especially if you have many fields to fill in. A possible alternative is to put a partial class on either your DataContext or just on the User class and write some code that will automatically initialize fields like createdDate. If you want to make generic behaviour for this (eg: automatically set columns named “createdDate” to Date.Now when SubmitChanges is called) you can do something like this in the DataContext partial class:

Public Overrides Sub SubmitChanges(ByVal failureMode As ConflictMode)
 
    ' NOTE: this is just a sample to get you started
    For Each insert In GetChangeSet().Inserts
        Dim createdDateProp = insert.GetType.GetProperty("createdDate")
        If createdDateProp IsNot Nothing Then
            createdDateProp.SetValue(insert, Date.Now, Nothing)
        End If
    Next
 
    MyBase.SubmitChanges(failureMode)
 
End Sub

Saving changes to stored procedure results in LINQ to SQL

The great thing about fetching data via a LINQ to SQL query is that you get a nice formatted result and you can easily save back any changes you make with SubmitChanges(). Unfortunately, we all inevitably fall into scenarios where we have to make use of stored procedures for performance or other reasons. If you have a stored procedure whose result set contains columns from just a single table then you can easily map the stored procedure to that table, but in most cases the result set involves multiple tables making things a little more tricky. It’s easy to execute a stored procedure from LINQ to SQL (just drag the SP from the server explorer into the designer and then execute it like a function on the data context) but you lose some of the benefits of LINQ to SQL. First of all, you just get a flat result set instead of a hierarchical result set using the auto generated entity classes. Second, you can’t just make changes to the result and call SubmitChanges. Luckily, with a little extra work, the flat, detached result set can be converted into a hierarchical, attached result set where changes can easily be saved.

If you don’t want to bother reading the whole article and all of the code, here’s the short answer: use the Attach() method.

Below is an example that runs a stored procedure to return all users in the database joined with their articles. The results are converted into an attached list of users, each containing a collection of articles. Notice that not all of the columns need to be known, just the primary key and timestamp are required. For more info on the timestamp, check out this article.

Module Module1
 
    Sub Main()
 
        Using testContext As New TestDataContext
 
            ' Print SQL queries to the console for testing purposes
            testContext.Log = Console.Out
 
            ' Get attached entities
            Dim users = GetAttachedUsersWithGroups(testContext)
 
            ' Make some random changes to prove the concept
            users.First.userName = "foo"
            users.First.Articles.First.text = "bar"
 
            ' Submit the changes to see what SQL gets executed
            testContext.SubmitChanges()
 
        End Using
 
        Console.ReadKey()
 
    End Sub
 
    Public Function GetAttachedUsersWithGroups(ByVal context As TestDataContext) As IEnumerable(Of User)
 
        ' Get some data from a stored procedure
        Dim result = context.GetAllUsersWithArticles
 
        ' Convert flat result set to groups of articles by user
        Dim userGroups = From row In result _
                         Group row By row.userID, row.userTimestamp _
                         Into articles = Group _
                         Select userID, userTimestamp, articles
 
        Dim users As New List(Of User)
 
        ' Create LINQ to SQL entities
        For Each userGroup In userGroups
            Dim user As New User With {.userID = userGroup.userID, _
                                       .ts = userGroup.userTimestamp}
            For Each article In userGroup.articles
                user.Articles.Add(New Article With {.articleID = article.articleID, _
                                                    .title = article.title, _
                                                    .ts = article.articleTimestamp})
            Next
            users.Add(user)
        Next
 
        ' Attach the users to the data context.  This will also attach the articles
        ' because they have been added to each user's Articles collection.
        context.Users.AttachAll(users)
 
        Return users
 
    End Function
 
End Module

LINQ to SQL Gotcha #4: ChangeConflictException on Update of Manually Attached Data

LINQ to SQL has built in optimistic concurrency checking. When you create an unattached entity and then attach it (ie: with the Attach() function) the concurrency check will always fail by throwing a ChangeConflictException unless one of the two are true:

  • The table that the entity belongs to has a timestamp column and its value is exactly the same as it appears in the database.
  • There is no timestamp column but the “no count” feature on SQL Server is off.

Using a timestamp column seems like the more elegant solution, but it does require that you know the timestamp value. This usually means that if you’re attaching the result of a stored procedure so that you can save back the results, your stored procedure needs to return the timestamp in addition to the primary key.