Working on an architecture framework for a client of Liberty Fox, I discovered a little known caveat of the Entity Framework. If you implement the repository pattern (or any data access methodology), then I am sure you’ve created a method to retrieve data from the data source based on some business rules.
public IEnumerable<User> GetActiveUsers()
This method, as you may have guessed, returns an enumerable of the active users in the system. Now, because of the simplicity of LINQ, you would think you could do something like this in your presentation layer:
var users = repository.GetActiveUsers().Skip(10).Take(10)
This snippit of code would work just fine. You would end up retrieving users 11 through 20 in the result-set of active users. Now, being that LINQ is intelligently designed, you would think that it would tack-on the Skip() and Take() to the underlying SQL query and limit the result-set returned from the database, but you run the SQL profiler and you see:
SELECT UserId, UserName, LastActiveDate FROM Users
Oh No! Where is the “
TOP(10)“? Where is the “
WHERE rowcount() > 10“? I know LINQ is capable of this! I have seen it work before in my unit tests and in others’ example code. What’s going on?
Well, I spent an unnecessary amount of time figuring this out so that you don’t have to… As efficient developers, we’re taught to program against interfaces, and specifically, the least responsible interface to get the job done. To further clarify, if I’m working with a
List of users, I may pass it to a method as an
IEnumerable if I only need to
foreach over it, or if I need to run a
for loop with it, I may pass it as an
ICollection. So therefore, its not uncommon to want to return the result of my repository method as an
IEnumerable. The problem is that when you downcast a Linq-to-SQL query to an
IEnumerable, it immediately becomes a Linq-to-Objects query and all further filtering is done in memory. Yikes! So if you had a method that returns an
IEnumerable of all the users in the system and you use Linq to tack-on a filter, you’d expect that to be tacked-on as a WHERE clause in the underlying SQL query, but it won’t! It will load the entire table into memory and then do the filtering. This is extremely inefficient!
So, after my long rant of how I spent hours figuring this out, how can you avoid the same fate? Pass the result of your repository method as an
public IQueryable<User> GetActiveUsers()
In your presentation layer, you will still be able to work with it similar to how you would have if it was an
IEnumerable, but now any further limiters WILL be tacked-on to the SQL query, maintaining efficiency. I hope this saves somebody the time I lost. Cheers! 🙂