Now that we’re firmly on .NET 4.0 and EF 4.x has arrived, I’ve begun revisiting our EF v1-style repository pattern to see whether the new code-first and POCO support will help mitigate some of the frustration we’ve encountered with Entity Framework. The pain we’ve experienced over the past couple of years primarily falls into two buckets:
- The models (EDMX and designer files) can become large, unwieldy and idiosyncratic. Refreshing them after changes to the underlying database often results in cryptic errors, and source control merges of these files can be painful.
- Independent Association relationships are tricky and complicated when working in a disconnected mode. Properly navigating these relationships often involves loading entities from two or more contexts, associating them together, and then getting them to properly persist as a complete graph when reattached to a new context.
Over the years we’ve mitigated some of the giant model problems (our primary database contains just shy of 400 tables) by creating sub-models, clustering groups of related tables into their own model. This has helped quite a bit, but has also led to some code duplication in the form of repetitive boilerplate code. It’s likely that could also be mitigated with some T4 code generation, but we never got there. Additionally, some tables belong to more than one model, and querying across models isn’t possible. Lastly, we had to employ some tricks to avoid needing a connection string for each sub-model.
We don’t have a good answer for the contortions we have to go through to manage disconnected relationships. Up to this point we’ve treated EF entities as DTOs, following Rocky Lhotka’s advice, rather than using EF entities to replace our domain objects. On the other hand, Ayende disagrees with this approach, at least in the case of NHibernate. His recent posts about limiting abstractions seem like they’re directed right at us and our repository-over-EF solution.
But I digress…
We have a considerable amount of code invested in this approach, with domain objects mapped to EF entities (as well as straight-up ADO.NET), so before we throw away both our domain and data access layers in one fell swoop, I figured the first step was to see whether code-first POCO objects could assuage our model problem and whether Foreign Key associations could soothe our relationship pains. In the process I hope to get a better sense of whether EF or another O/RM might eventually serve in the domain.
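To make the Foreign Key association idea concrete, here is a minimal sketch of the kind of disconnected update I’m hoping it simplifies. It assumes the EF 4.1 `DbContext` API; `Customer`, `Order`, `SalesContext` and `OrderUpdater` are hypothetical names invented for illustration, not classes from our model.

```csharp
using System.Collections.Generic;
using System.Data;           // EntityState lives in this namespace in EF 4.x
using System.Data.Entity;    // DbContext, DbSet

// Hypothetical entities, not taken from our actual model.
public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
    public virtual ICollection<Order> Orders { get; set; }
}

public class Order
{
    public int Id { get; set; }
    public int CustomerId { get; set; }             // foreign key exposed as a scalar property
    public virtual Customer Customer { get; set; }  // navigation property
}

public class SalesContext : DbContext
{
    public DbSet<Customer> Customers { get; set; }
    public DbSet<Order> Orders { get; set; }
}

public static class OrderUpdater
{
    // Points a detached order at a different customer without
    // loading either Customer entity into a context.
    public static void Reassign(Order detachedOrder, int newCustomerId)
    {
        detachedOrder.CustomerId = newCustomerId;
        detachedOrder.Customer = null; // avoid a stale reference that disagrees with the key

        using (var context = new SalesContext())
        {
            // Attach the order and mark it modified; only the Orders row is updated.
            context.Entry(detachedOrder).State = EntityState.Modified;
            context.SaveChanges();
        }
    }
}
```

With an Independent Association there is no `CustomerId` property to set, so the same change means getting both ends of the relationship into one context first. Whether the key-based version holds up on our larger graphs is exactly what I want to find out.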
Keeping it real
Naturally, I chose our simplest case to start with, a sub-model with only two entities. I began to convert our database-first sub-model to code-first, picking the Fluent API over Data Annotations for a few reasons:
- I’m not fond of polluting the objects with persistence attributes if I can avoid it.
- The Fluent API is a superset of Data Annotations. I’m unlikely to be able to do everything with Data Annotations alone, and rather than have configuration information in two places, I decided to centralize it in EntityConfiguration classes.
- If I was going to consider EF as a replacement for my domain objects, the Fluent API offered the best opportunity to do that with the least amount of change to the domain objects. Keeping the POCOs pure might even make it easier to attempt a similar approach with NHibernate (which I have next to no experience with, so this may be a pipe dream).
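As a rough illustration of keeping the mapping out of the POCO, a configuration class might look like the sketch below. It assumes the EF 4.1 release, where the configuration base class is named `EntityTypeConfiguration<T>`; the entity, table and column names are invented for the example.

```csharp
using System.Data.Entity;
using System.Data.Entity.ModelConfiguration;

// Hypothetical POCO: no EF attributes and no EF base class.
public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// All persistence details live here instead of on the POCO.
// Table and column names are made up for illustration.
public class CustomerConfiguration : EntityTypeConfiguration<Customer>
{
    public CustomerConfiguration()
    {
        ToTable("tblCustomer");
        HasKey(c => c.Id);
        Property(c => c.Id).HasColumnName("CustomerID");
        Property(c => c.Name)
            .HasColumnName("CustomerName")
            .IsRequired()
            .HasMaxLength(100);
    }
}

public class CustomerContext : DbContext
{
    public DbSet<Customer> Customers { get; set; }

    protected override void OnModelCreating(DbModelBuilder modelBuilder)
    {
        // Register one configuration class per entity.
        modelBuilder.Configurations.Add(new CustomerConfiguration());
    }
}
```

The same constraints could be expressed with `[Table]`, `[Column]`, `[Required]` and `[MaxLength]` attributes on `Customer`, which is the pollution I’d like to avoid.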
One of the first things I discovered is that I couldn’t mix code-first in the same assembly with my existing database-first models. After one day, I can’t say that this is necessarily impossible, but I found some indications that it wouldn’t work. Rather than fight it, I created a separate assembly for the code-first data access and began building my two POCO entities, my new DbContext (with which I found it much easier to leverage existing connection strings), and a repository with the same signature as the existing database-first repository.
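The shape of that DbContext and repository can be sketched roughly as follows. This is an illustration under assumptions, not our actual code: the connection string name `PrimaryDatabase`, the `Customer` entity and the `ICustomerRepository` interface are all hypothetical stand-ins.

```csharp
using System.Collections.Generic;
using System.Data.Entity;
using System.Linq;

// Hypothetical entity standing in for one of the two POCOs.
public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class CustomerContext : DbContext
{
    static CustomerContext()
    {
        // The database already exists, so code-first should not try to create it.
        Database.SetInitializer<CustomerContext>(null);
    }

    // "name=..." looks up an existing <connectionStrings> entry in config.
    // "PrimaryDatabase" is an assumed name, not our real one.
    public CustomerContext() : base("name=PrimaryDatabase") { }

    public DbSet<Customer> Customers { get; set; }
}

// Stand-in for the signature the database-first repository already exposes.
public interface ICustomerRepository
{
    Customer GetById(int id);
    IList<Customer> GetAll();
}

public class CodeFirstCustomerRepository : ICustomerRepository
{
    public Customer GetById(int id)
    {
        using (var context = new CustomerContext())
        {
            return context.Customers.Find(id);
        }
    }

    public IList<Customer> GetAll()
    {
        using (var context = new CustomerContext())
        {
            // Entities leave the context detached, so skip change tracking.
            return context.Customers.AsNoTracking().ToList();
        }
    }
}
```

Because a code-first context can use a plain provider connection string with no EDMX metadata section, several contexts can point at the same config entry, which is what made the connection string side so much easier for me.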
By the end of the day I had swapped out my unit tests and domain classes to use the new code-first repository instead of the database-first one and had all my unit tests passing. I then deleted the model, designer, and the rest of my database-first artifacts, just to be sure. The code-first version definitely feels cleaner: fewer files, with less code. One feature of database-first that is both a benefit and a curse is its ability to refresh itself from the database. I don’t see an obvious way to mimic that with code-first, but then again maybe I won’t need to. Overall, day 1 went pretty smoothly, albeit with a trivial example.
Real World/Road Rules Challenge
Next I’m going to pick a more complex sub-model, one with 20+ tables with complex relationships, and begin the conversion. I expect this will take more than a day, but will give me a much better idea of the advantages, feasibility and effort of doing a full conversion to code-first.