Archive for the ‘LINQ To SQL’ Category.
June 19, 2010, 11:23 am
A challenging question that programmers often face when choosing a library is whether to go with an open source option or a closed source product that may come with professional support. The answer, of course, is “it depends”, and the quality of each library will often be more important than whether or not it is open source. In the past I have not concerned myself too much with whether or not the source code is available to me because I normally don’t plan on ever touching it. I don’t use an existing library because I want to spend weeks digging through source code and internals; I use one because I don’t want to spend much time on that feature. I want it to just work.
In recent projects I have looked closely at both LINQ to SQL and NHibernate for my ORM needs. Since LINQ to SQL is easy to use and meets my requirements, it seemed like a no-brainer to use a product from Microsoft over NHibernate, since I never planned to edit the NHibernate source. However, a bug that I recently found in LINQ to SQL has made me think differently (I wrote about the bug here). I don’t blame Microsoft for having a bug in their code since all software has bugs, but when I found out that Microsoft does not plan to fix the issue, I suddenly realized that the team had hit a brick wall. We had already made a big commitment to LINQ to SQL. Now there is a bug that Microsoft won’t fix, but they also won’t let anyone else fix it! If NHibernate had been chosen, the worst case scenario would have been having to fix it ourselves (though it is likely that someone else would have fixed it first).
The lesson: proprietary libraries are not the safe option. The only way to ensure bugs will be fixed is if you can fix them yourself.
Or maybe I’m wrong and the real lesson is to choose the library that wasn’t implemented by Microsoft.
April 1, 2010, 8:18 pm
If you keep the same entity around after it has been deleted and SubmitChanges() is called then you can run into an InvalidOperationException if you try to insert it again.
var data = new DataClasses1DataContext();
var user = new User() { userName = "foo", password = "bar" };
data.Users.InsertOnSubmit(user);
data.SubmitChanges();
data.Users.DeleteOnSubmit(user);
data.SubmitChanges();
data.Users.InsertOnSubmit(user);
data.SubmitChanges();
Here you will actually get an exception on the second InsertOnSubmit() because the data context remembers the entity and, for some reason, will no longer allow you to attach it. Once an entity has been deleted from a data context it can never go back into that context. To get around this you need to either insert the entity into a different data context or copy the data to a new instance of the same entity class and then insert that. This has been confirmed here.
Note: You are free to call DeleteOnSubmit() and then InsertOnSubmit() all you want as long as you never call SubmitChanges().
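The second workaround (copying the data to a new instance) could look something like the sketch below, written in VB like most of the other samples here. It assumes the same DataClasses1DataContext and User class as the snippet above. The helper name ReinsertAsCopy is made up for illustration, and only the two columns used in the example are copied.

```vb
' A minimal sketch: re-insert a deleted user by copying its values into a
' fresh instance. ReinsertAsCopy is a hypothetical helper, not part of LINQ to SQL.
Sub ReinsertAsCopy(ByVal data As DataClasses1DataContext, ByVal deletedUser As User)
    ' The data context refuses the original instance, so build a new one
    Dim copy As New User With {.userName = deletedUser.userName, _
                               .password = deletedUser.password}
    data.Users.InsertOnSubmit(copy)
    data.SubmitChanges()
End Sub
```

Calling ReinsertAsCopy(data, user) in place of the second InsertOnSubmit(user) should avoid the exception, because the data context has never seen the new instance.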
March 6, 2010, 9:17 am
A common pattern in database design is to make a column required, give it a default value and then never think about it when doing INSERTs. A perfect example would be a createdDate column on the Users table with a default value of GetDate(). Here’s the full table definition:
- userID (identity key)
- userName
- password
- ts (timestamp)
- createdDate (default value = GetDate())
In this case we can easily insert into the table without worrying about the createdDate, userID, or ts columns:
INSERT INTO Users (userName, password) VALUES ('asdf', 'qwer')
However, since this is the 21st century, we don’t want to do this in SQL, we want to do it with an ORM. Unfortunately, LINQ to SQL doesn’t do a very good job with this.
Using context = New TestDataContext
    ' Output SQL to the console for debugging
    context.Log = Console.Out
    ' Attach a new user and submit the changes
    Dim newUser As New User With {.userName = "NewUser", .password = "password"}
    context.Users.InsertOnSubmit(newUser)
    context.SubmitChanges()
End Using
The above code generates the following INSERT statement when SubmitChanges() is called (note: I replaced @p0, @p1, etc. with their actual values to make the query more readable):
INSERT INTO [dbo].[Users]([userName], [password], [createdDate]) VALUES ('NewUser', 'password', NULL)
This query fails and we get a SqlTypeException because createdDate is NOT NULL and NULL cannot be converted to a valid date. Notice that the generated SQL does not attempt to explicitly set a value for userID or ts. It appears that LINQ to SQL knows how to deal with IDENTITY fields and TIMESTAMPs, but not how to deal with other required columns that happen to have a default value.
I would have expected LINQ to SQL to generate a query that does not explicitly set createdDate so that SQL Server could handle it, but no such luck. You can easily set the createdDate manually like this:
Dim newUser As New User With {.userName = "NewUser", .password = "password", .createdDate = Date.Now}
It really sucks to have to do this every time though, especially if you have many fields to fill in. A possible alternative is to put a partial class on either your DataContext or just on the User class and write some code that will automatically initialize fields like createdDate. If you want generic behaviour for this (e.g. automatically set columns named “createdDate” to Date.Now when SubmitChanges is called) you can do something like this in the DataContext partial class:
Public Overrides Sub SubmitChanges(ByVal failureMode As ConflictMode)
    ' NOTE: this is just a sample to get you started
    For Each insert In GetChangeSet().Inserts
        Dim createdDateProp = insert.GetType.GetProperty("createdDate")
        If createdDateProp IsNot Nothing Then
            createdDateProp.SetValue(insert, Date.Now, Nothing)
        End If
    Next
    MyBase.SubmitChanges(failureMode)
End Sub
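For the other option mentioned above, initializing the field on the User class itself, one possible sketch uses the OnCreated partial method that the LINQ to SQL designer declares in each generated entity class and calls from the constructor. It assumes the generated class is named User, lives in the same namespace as this partial class, and exposes the createdDate property from the table above. When a row is loaded from the database the stored value should replace the one set here, but that is worth verifying against your own model.

```vb
Partial Public Class User
    ' OnCreated is declared as a partial method in the designer-generated code
    ' and is called from the entity's constructor.
    Private Sub OnCreated()
        Me.createdDate = Date.Now
    End Sub
End Class
```

Unlike the SubmitChanges override, this only covers the one class, but it avoids reflection and gives the entity a usable createdDate as soon as it is constructed.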
February 9, 2010, 9:23 pm
The great thing about fetching data via a LINQ to SQL query is that you get a nicely formatted result and you can easily save back any changes you make with SubmitChanges(). Unfortunately, we all inevitably fall into scenarios where we have to make use of stored procedures for performance or other reasons. If you have a stored procedure whose result set contains columns from just a single table then you can easily map the stored procedure to that table, but in most cases the result set involves multiple tables, making things a little more tricky. It’s easy to execute a stored procedure from LINQ to SQL (just drag the SP from the server explorer into the designer and then execute it like a function on the data context) but you lose some of the benefits of LINQ to SQL. First of all, you just get a flat result set instead of a hierarchical result set using the auto-generated entity classes. Second, you can’t just make changes to the result and call SubmitChanges. Luckily, with a little extra work, the flat, detached result set can be converted into a hierarchical, attached result set where changes can easily be saved.
If you don’t want to bother reading the whole article and all of the code, here’s the short answer: use the Attach() method.
Below is an example that runs a stored procedure to return all users in the database joined with their articles. The results are converted into an attached list of users, each containing a collection of articles. Notice that not all of the columns need to be known, just the primary key and timestamp are required. For more info on the timestamp, check out this article.
Module Module1
    Sub Main()
        Using testContext As New TestDataContext
            ' Print SQL queries to the console for testing purposes
            testContext.Log = Console.Out
            ' Get attached entities
            Dim users = GetAttachedUsersWithGroups(testContext)
            ' Make some random changes to prove the concept
            users.First.userName = "foo"
            users.First.Articles.First.text = "bar"
            ' Submit the changes to see what SQL gets executed
            testContext.SubmitChanges()
        End Using
        Console.ReadKey()
    End Sub
    Public Function GetAttachedUsersWithGroups(ByVal context As TestDataContext) As IEnumerable(Of User)
        ' Get some data from a stored procedure
        Dim result = context.GetAllUsersWithArticles
        ' Convert flat result set to groups of articles by user
        Dim userGroups = From row In result _
                         Group row By row.userID, row.userTimestamp _
                         Into articles = Group _
                         Select userID, userTimestamp, articles
        Dim users As New List(Of User)
        ' Create LINQ to SQL entities
        For Each userGroup In userGroups
            Dim user As New User With {.userID = userGroup.userID, _
                                       .ts = userGroup.userTimestamp}
            For Each article In userGroup.articles
                user.Articles.Add(New Article With {.articleID = article.articleID, _
                                                    .title = article.title, _
                                                    .ts = article.articleTimestamp})
            Next
            users.Add(user)
        Next
        ' Attach the users to the data context. This will also attach the articles
        ' because they have been added to each user's Articles collection.
        context.Users.AttachAll(users)
        Return users
    End Function
End Module
February 9, 2010, 9:00 pm
LINQ to SQL has built-in optimistic concurrency checking. When you create an unattached entity and then attach it (i.e. with the Attach() function), the concurrency check will always fail by throwing a ChangeConflictException unless one of these two conditions is true:
- The table that the entity belongs to has a timestamp column and the entity’s value for it is exactly the same as the value in the database.
- There is no timestamp column but the “no count” (NOCOUNT) feature on SQL Server is off.
Using a timestamp column seems like the more elegant solution, but it does require that you know the timestamp value. This usually means that if you’re attaching the result of a stored procedure so that you can save back the results, your stored procedure needs to return the timestamp in addition to the primary key.
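As a rough illustration of the timestamp route, the sketch below attaches a detached entity built from nothing but its primary key and timestamp, then saves a change. It assumes the TestDataContext and the User columns (userID, userName, ts) used elsewhere on this blog; the module and the RenameUser helper are hypothetical names.

```vb
Imports System.Data.Linq

Module AttachSketch
    ' Hypothetical helper: rename a user when only its key and timestamp are known
    Sub RenameUser(ByVal userID As Integer, ByVal ts As Binary, ByVal newName As String)
        Using context As New TestDataContext
            ' Build a detached entity from the primary key and timestamp
            Dim user As New User With {.userID = userID, .ts = ts}
            ' Attach it as unmodified, then make the change
            context.Users.Attach(user)
            user.userName = newName
            ' Throws ChangeConflictException if ts no longer matches the row
            context.SubmitChanges()
        End Using
    End Sub
End Module
```

The ts value would typically come from whatever originally read the row, such as a stored procedure result that returns the timestamp alongside the key.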
February 5, 2010, 7:26 pm
In LINQ to SQL you can chain multiple where clauses like this:
Module Module1
    Sub Main()
        Using context As New TestDataContext
            context.Log = Console.Out
            Dim articles = context.Articles.Where(Function(a) a.articleID > 10) _
                                           .Where(Function(a) a.articleID Mod 2 = 0) _
                                           .ToList()
        End Using
        Console.ReadKey()
    End Sub
End Module
This will generate SQL that looks roughly like this:
SELECT ... FROM Articles WHERE articleID > 10 AND articleID % 2 = 0
Since chained where clauses are equivalent to ANDing multiple expressions in a single WHERE, the above SQL is exactly what you would expect to see. Unfortunately, things get more complicated when one of the expressions cannot be converted to SQL, like in this case:
Module Module1
    Sub Main()
        Using context As New TestDataContext
            context.Log = Console.Out
            Dim articles = context.Articles.Where(AddressOf FilterArticle).ToList()
        End Using
        Console.ReadKey()
    End Sub
    Function FilterArticle(ByVal a As Article) As Boolean
        Return a.articleID Mod 2 = 0
    End Function
End Module
The above code generates SQL that looks like this:
SELECT ... FROM Articles
The query has no WHERE clause; it just loads all the articles and then filters them on the client side. It’s usually optimal to do the filtering on the SQL side, but the behaviour is reasonable. I wouldn’t expect the ORM to be capable of magically converting the contents of the FilterArticle function into SQL (it sure would be nice though). This is still expected behaviour, but here’s an example where things get weird:
Module Module1
    Sub Main()
        Using context As New TestDataContext
            context.Log = Console.Out
            Dim articles = context.Articles.Where(AddressOf FilterArticle) _
                                           .Where(Function(a) a.articleID > 10) _
                                           .ToList()
        End Using
        Console.ReadKey()
    End Sub
    Function FilterArticle(ByVal a As Article) As Boolean
        Return a.articleID Mod 2 = 0
    End Function
End Module
This code generates the same SQL as last time:
SELECT ... FROM Articles
It is filtering both where clauses on the client side even though the second one could have been converted to SQL. If you flip the where clauses like this:
Module Module1
    Sub Main()
        Using context As New TestDataContext
            context.Log = Console.Out
            Dim articles = context.Articles.Where(Function(a) a.articleID > 10) _
                                           .Where(AddressOf FilterArticle) _
                                           .ToList()
        End Using
        Console.ReadKey()
    End Sub
    Function FilterArticle(ByVal a As Article) As Boolean
        Return a.articleID Mod 2 = 0
    End Function
End Module
then you will still get the expected SQL:
SELECT ... FROM Articles WHERE articleID > 10
The where clause that can be converted to SQL is filtered in the SELECT statement, but the clause that cannot be converted is filtered on the client side. I would have hoped that the order of the where clauses would not matter since they are just being ANDed, but that is not the case.
The lesson is that if you need to chain a where clause that cannot be converted to SQL, try to put it at the end of the chain. This can be a real issue if you are using a data access layer that automatically filters queries (e.g. for security) with a function that cannot be converted to SQL. If all of your LINQ to SQL queries have this built-in filter then none of them will ever generate WHERE clauses in the SQL; they will just load the entire table every time.
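One way to make the split between server-side and client-side filtering explicit is to call AsEnumerable() at the point where the query should stop being translated. The sketch below reuses the TestDataContext and FilterArticle from the examples above. The expectation that it produces the same WHERE articleID > 10 query is based on the behaviour described in this post, not on a separate test.

```vb
Module Module1
    Sub Main()
        Using context As New TestDataContext
            context.Log = Console.Out
            ' Everything before AsEnumerable() is translated to SQL;
            ' everything after it runs on the client.
            Dim articles = context.Articles.Where(Function(a) a.articleID > 10) _
                                           .AsEnumerable() _
                                           .Where(AddressOf FilterArticle) _
                                           .ToList()
        End Using
        Console.ReadKey()
    End Sub

    Function FilterArticle(ByVal a As Article) As Boolean
        Return a.articleID Mod 2 = 0
    End Function
End Module
```

This doesn’t change what gets executed compared to putting the unconvertible clause last, but it makes the boundary visible to whoever reads the query next.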
January 31, 2010, 8:45 pm
If you attach an entity with a required association that is nulled out, you will be unable to call GetChangeSet(). In my opinion, the expected behaviour is that the entity should show up in the change set as though it is valid, but an exception should be thrown when you attempt to call SubmitChanges() because a foreign key constraint has been violated. In fact, with code like this we will get exactly that result (an exception is thrown on SubmitChanges()):
Using testData As New TestDataContext
    Dim newArticle As New Article With {.title = "Foobar", _
                                        .text = "blah blah blah"}
    testData.Articles.InsertOnSubmit(newArticle)
    Dim changes = testData.GetChangeSet()
    testData.SubmitChanges()
End Using
There is a required association to the Users table that has not been set at all. With the following snippet, where the User property is explicitly set to Nothing, an exception will be thrown on GetChangeSet() instead of SubmitChanges():
Using testData As New TestDataContext
    Dim newArticle As New Article With {.title = "Foobar", _
                                        .text = "blah blah blah", _
                                        .User = Nothing}
    testData.Articles.InsertOnSubmit(newArticle)
    Dim changes = testData.GetChangeSet()
    testData.SubmitChanges()
End Using
It gives this error on GetChangeSet():
An attempt was made to remove a relationship between a User and a Article. However, one of the relationship’s foreign keys (Article.userID) cannot be set to null.
It appears that the internal implementation of LINQ to SQL distinguishes between an unset relationship, and one that has specifically been set to Nothing. The awkward thing here is that it is not always easy to avoid this issue since you don’t even have to call InsertOnSubmit. Attaching an entity by setting an association to an already attached object gives the same result.
Using testData As New TestDataContext
    Dim existingUser = testData.Users.First
    Dim newUserGroup As New UserGroup With {.User = existingUser, .Group = Nothing}
    Dim changes = testData.GetChangeSet()
    testData.SubmitChanges()
End Using
In this snippet there are two required associations: User and Group. As soon as User is set, the UserGroup entity is attached to the DataContext. However, since Group is Nothing the ChangeSet is now corrupt.
This bug is described in this forum thread, where a Microsoft employee called it a bug and recommended that the original poster report it on Connect (Microsoft’s bug tracking site). The bug report on Connect can be found here. One hour after it was posted Microsoft replied saying this:
We are currently investigating. The investigation process normally takes 7-14 days.
They then went silent for 9 months before posting this:
Hi,
Thank you for taking the time to send this feedback and bug report. We have reviewed the issue and confirmed the behavior, but we will not be fixing this in the next release of LINQ to SQL.
LINQ to SQL Team
That’s Microsoft for ya.
January 31, 2010, 4:48 pm
By default, LINQ to SQL uses deferred loading. When you want to eager load an entity’s associated data you need to set DataLoadOptions using the LoadOptions property on the DataContext. If you have a one-to-many relationship between Users and Articles you can force LINQ to SQL to eager load Articles with Users like this:
Using testData As New TestDataContext
    ' Log SQL queries to the console
    testData.Log = Console.Out
    ' Set LoadOptions
    Dim options As New DataLoadOptions
    options.LoadWith(Function(user As User) user.Articles)
    testData.LoadOptions = options
    ' Load users with their articles
    Dim users = testData.Users.ToList
    For Each user In users
        Dim articles = user.Articles.ToList
    Next
End Using
This will generate a single SELECT statement with a JOIN on the Articles table. The same goes for one-to-one relationships. You can also use LoadWith as many times as you want. For one-to-one relationships and no more than a single one-to-many relationship this will still generate one query with JOINs to all the LoadWith tables. However, if you want to eager load multiple one-to-many relationships you will get into a select N + 1 situation (or worse). For example, this code eager loads Articles and UserGroups with each User entity:
Using testData As New TestDataContext
    ' Log SQL queries to the console
    testData.Log = Console.Out
    ' Set LoadOptions
    Dim options As New DataLoadOptions
    options.LoadWith(Function(user As User) user.Articles)
    options.LoadWith(Function(user As User) user.UserGroups)
    testData.LoadOptions = options
    ' Load users with their articles and user groups
    Dim users = testData.Users.ToList
    For Each user In users
        Dim articles = user.Articles.ToList
        Dim userGroups = user.UserGroups.ToList
    Next
End Using
Technically, the behaviour here is correct. It will successfully eager load both the Articles and UserGroups collections for each User, but it will not do it in a single query. When I ran this I got one query that fetched the Users and Articles like last time, but then a separate SELECT for the UserGroups of each User rather than another JOIN. Even though this won’t alter the behaviour of the code, it will definitely have a major impact on performance, especially if there are a lot of users in the database.
Scott Guthrie confirmed this behaviour in a post on David Hayden’s blog. This is what he said:
In the case of a 1:n associations, LINQ to SQL only supports joining-in one 1:n association per query.
Lame.
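If the extra queries become a problem, one possible workaround is to eager load only one of the collections with LoadWith, fetch the other table in a single additional query, and group it in memory. The sketch below is untested and rests on assumptions: a UserGroups table property on the data context and a userID foreign key column on UserGroup are not shown in the examples above and may be named differently in your model.

```vb
Using testData As New TestDataContext
    testData.Log = Console.Out
    ' Eager load the one supported 1:n association with a JOIN
    Dim options As New DataLoadOptions
    options.LoadWith(Function(user As User) user.Articles)
    testData.LoadOptions = options
    Dim users = testData.Users.ToList
    ' Fetch every UserGroup in one extra query and index the rows by user
    ' (the UserGroups table property and userID column are assumed names)
    Dim groupsByUser = testData.UserGroups.ToList.ToLookup(Function(ug) ug.userID)
    For Each user In users
        Dim articles = user.Articles.ToList
        ' Read from the lookup rather than user.UserGroups, which would
        ' trigger a deferred load per user
        Dim userGroups = groupsByUser(user.userID).ToList
    Next
End Using
```

This trades the per-user SELECTs for two queries in total, at the cost of loading the whole UserGroups table and bypassing the user.UserGroups collection.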
January 23, 2010, 9:15 am
A couple of months ago I wrote this article explaining why I think it is reasonable for unit tests to hit a real database. Subsequently, I wrote a follow-up article describing some techniques for rolling back your database to its original state after each test. In that article I found that just using simple transactions did not solve the problem because you need access to all database connections being used, and they all have to be rolled back. I have since found a way around this problem using distributed transactions.
With the Microsoft Distributed Transaction Coordinator (MSDTC) the activity over multiple connections can be lumped into a single transaction using the TransactionScope class. MSDTC needs to be running for this to work, but since this is just for unit tests it doesn’t need to be enabled on your production environment.
In order to use the TransactionScope class your project will need a reference to System.Transactions. Here’s a sample unit test using MSTest and Entity Framework where the database is altered with multiple connections within a transaction and then the changes are rolled back:
Imports System.Transactions
Imports System
Imports System.Text
Imports System.Collections.Generic
Imports Microsoft.VisualStudio.TestTools.UnitTesting
<TestClass()> _
Public Class UnitTestSample
    <TestMethod()> _
    Public Sub ProofOfConceptTest()
        ' Complete() is never called, so everything is rolled back on End Using
        Using New TransactionScope
            Dim conn1 As New DataTestEntities
            Dim conn2 As New DataTestEntities
            Dim row1 As New Users With {.userName = "user1", .password = "pass"}
            Dim row2 As New Users With {.userName = "user2", .password = "pass"}
            conn1.AddToUsers(row1)
            conn2.AddToUsers(row2)
            conn1.SaveChanges()
            conn2.SaveChanges()
            Dim conn3 As New DataTestEntities
            ' Assert.AreEqual takes the expected value first
            Assert.AreEqual(6, conn3.Users.Count)
        End Using
    End Sub
End Class
Alternatively, if you want every test method inside a test class to be within its own TransactionScope without adding a Using block to every single test, you can use the initialization and cleanup methods like this:
Imports System.Transactions
Imports System
Imports System.Text
Imports System.Collections.Generic
Imports Microsoft.VisualStudio.TestTools.UnitTesting
<TestClass()> _
Public Class UnitTestSample
    Private _transaction As TransactionScope
    <TestInitialize()> _
    Public Sub Setup()
        _transaction = New TransactionScope
    End Sub
    <TestCleanup()> _
    Public Sub TearDown()
        ' Disposing without calling Complete() rolls the transaction back
        _transaction.Dispose()
    End Sub
    <TestMethod()> _
    Public Sub ProofOfConceptTest()
        Dim conn1 As New DataTestEntities
        Dim conn2 As New DataTestEntities
        Dim row1 As New Users With {.userName = "user1", .password = "pass"}
        Dim row2 As New Users With {.userName = "user2", .password = "pass"}
        conn1.AddToUsers(row1)
        conn2.AddToUsers(row2)
        conn1.SaveChanges()
        conn2.SaveChanges()
        Dim conn3 As New DataTestEntities
        ' Assert.AreEqual takes the expected value first
        Assert.AreEqual(6, conn3.Users.Count)
    End Sub
End Class
As long as the use of MSDTC is an option, I have found this method to be far better than any of those described in the last article. It guarantees that the state of your database is maintained and is extremely fast (at least on small amounts of data).
February 11, 2009, 9:18 pm
There are a few problems with the data model code that is automatically generated in LINQ to SQL. The most obvious issues are that class names are not capitalized and that tables with two foreign keys to the same table will not have descriptive names. For example, if a table has columns firstUserID and secondUserID which are both foreign keys to the users table then you would probably hope to see the properties FirstUser and SecondUser on that class. Unfortunately, what you will actually get is User and User1 which is pretty much pointless because it is very difficult to tell which one is which. One way to tackle this is to simply change the code after it has been generated so that it looks like you want it to. But then as soon as you change the database and import the model again your changes are lost and have to be redone every time. Ideally, the automatically generated code would be formatted exactly as you want it.
The solution I came up with was to use SqlMetal to generate XML output, then manipulate that output and feed it back into SqlMetal so that it can generate the code from the altered XML. As a naming heuristic, I specified that a foreign key column’s property would have a name derived from the column name without the “ID” suffix and a capitalized first letter. Depending on the extent of the changes you plan on making, you may want to make an XSLT file to translate the XML, or simply use some regular expressions.
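To make the naming heuristic concrete, here is a minimal sketch of it as a standalone VB function. The module and function names are made up for illustration, and this is not the XSLT or regular expressions I actually used; it only shows the rule of dropping the “ID” suffix and capitalizing the first letter.

```vb
Imports System.Text.RegularExpressions

Module NamingSketch
    ' Derive an association property name from a foreign key column name,
    ' e.g. firstUserID becomes FirstUser.
    Function AssociationNameFor(ByVal columnName As String) As String
        ' Drop a trailing "ID" suffix if there is one
        Dim baseName = Regex.Replace(columnName, "ID$", "")
        ' Leave names that consist only of the suffix untouched
        If baseName.Length = 0 Then Return columnName
        ' Capitalize the first letter
        Return Char.ToUpper(baseName(0)) & baseName.Substring(1)
    End Function
End Module
```

With this rule, AssociationNameFor("firstUserID") gives "FirstUser" and AssociationNameFor("secondUserID") gives "SecondUser", which is what you would hope to see instead of User and User1. Applying the result to the association names in SqlMetal’s XML output is a separate step.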
One downside to this approach is that SqlMetal often cannot generate code for stored procedures because it is unable to determine the return type without actually running the procedure. To get around this I told SqlMetal not to generate any code for the stored procedures (just omit the /sprocs argument), then manually incorporated the XML for my stored procedures in the XSLT file so that it would be injected into the second input to SqlMetal that actually generates the code.