Monday, December 7, 2009

The Agony of EF

 

We’ve been working with EF v1 (.NET 3.5) for a little while now, using it sparingly when creating new code requiring CRUD functionality, and refactoring older code with similar characteristics. You may recall, from an earlier post, that our approach was to use EF to create a data-layer to replace the Datasets, Datatables and Datareaders we currently use with type-safe rich Entity Objects.  It was not and is not our intention to replace our existing business objects with EF.  To this end, we employed a repository pattern that serves up ‘detached’ EF objects which are then used to hydrate and persist our business objects.

All too easy

Initially, the results were very positive.  With some of the simpler cases we started with (mapping tables with few or no relationships) an immediate benefit was obvious.  We could replace the SELECT stored procedures with equivalent Linq-to-Entities statements, and remove entirely the INSERT, UPDATE and DELETE stored procedures in favor of EF persistence.  With simple use of the designer to import entities from the database into our model, and the implementation of some boilerplate repository code, we could eliminate at least four stored procedures per entity in our database.
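
For illustration, the shape of that early win looked roughly like this (a minimal sketch, not our production code; the FooEntities context, Customer entity and customerId variable are hypothetical stand-ins, with System.Linq in scope):

// Replaces the SELECT stored procedure with a Linq-to-Entities query.
using (var context = new FooEntities())
{
    var customer = context.Customers.First(c => c.CustomerId == customerId);

    // Replaces the UPDATE stored procedure: change tracking plus SaveChanges.
    customer.CreditLimit = 5000m;
    context.SaveChanges();

    // Replaces the INSERT stored procedure (AddToCustomers is the
    // designer-generated helper on the ObjectContext).
    context.AddToCustomers(new Customer { Name = "Acme" });
    context.SaveChanges();
}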

A dream to some, a nightmare to others

Slowly, as we attempted to work with slightly more sophisticated cases, issues began to emerge; issues that are probably well documented across the web and perhaps most notably in the vote of no confidence.  The biggest issue we encountered was the general lack of tiering/layering support in EF.  Entity objects could be detached, but the consequences were painful. 

  • Relationships were not preserved.  Parent and child objects could not be traversed.  Even worse, because the relationships were exposed as (now null) object references rather than foreign-key columns, even the identifiers of related records were inaccessible, making it difficult to query for and manually load them.  Lazy-loading disconnected relations isn’t possible and doesn’t even make sense.
  • Re-attaching detached objects was not straightforward.  Any non-trivial situation resulted in object graphs where some objects were attached while others were detached, a mixture EF couldn’t handle (a rough sketch of the first problem follows this list).
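
To make that concrete, here is roughly what we ran into (a simplified sketch; the FooEntities context and Order/Customer entities are hypothetical stand-ins, and the behavior described is what we observed with EF v1):

Order order;
using (var context = new FooEntities())
{
    // Load the order with its customer, then detach it for use in another layer.
    order = context.Orders.Include("Customer").First(o => o.OrderId == orderId);
    context.Detach(order);   // detaches only this one entity, not the graph
}

// On the detached instance the relationship is gone: order.Customer comes back
// null, and, as noted above, the foreign key isn't exposed as a plain column,
// so we can't even grab the identifier to re-query the parent ourselves.
bool relationshipLost = (order.Customer == null);   // true, in our experience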

These two alone were enough to make me doubt EF’s usefulness beyond trivial usage (or building a one-tier application), and we hadn’t even gotten to performance, or model merging issues.

Making the simple simpler; the complex more complicated

Nevertheless, I was determined to understand the limitations fully before discarding EF v1 (abandoning O/RM altogether, waiting for the promise of EF4, or switching O/RMs).  While our EF Repository pattern wasn’t mainstream usage, the issues we were facing with our data-layer are the same issues that would be faced by anyone using EF across layers and tiers.  The most likely parallel would be anyone attempting to use EF with WCF to create services.  In this camp we were not alone; there are a considerable number of posts lamenting the difficulties, and a fair number of proposed solutions and attempted workarounds; some clever, some not so.

After investigating dozens of these solutions, and implementing one other candidate with only partial success, I eventually came across Reattaching Entity Graphs with the Entity Framework.  This solution uses serialization to ‘detach’ the object graphs, thereby preserving the relationships.  It also contains some very clever logic for reattaching the object graph, including a way to instruct the attachment process how far to traverse down the various graph pathways when performing persistence.

After about a day or two I was able to integrate this solution into our code and solve our detachment and relationship issues.  Now I can look forward to facing the performance issues (this solution does require eager-loading the related entities using Include), the source control branch merging issues inevitable with one large edmx file, and the many others lying in wait for me.
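
The rough shape of the read side, greatly simplified, is below.  The entity and context names are hypothetical, the real reattachment logic lives in the referenced article’s code rather than here, and the BinaryFormatter clone is just one way to get the ‘serialize to detach’ effect:

// Eager-load the graph we need, then 'detach' it by deep-cloning with the
// BinaryFormatter (the designer-generated EF v1 entities are serializable),
// which preserves the relationships that ObjectContext.Detach throws away.
// Requires System.IO and System.Runtime.Serialization.Formatters.Binary.
public Order GetDetachedOrder(int orderId)
{
    using (var context = new FooEntities())
    {
        Order attached = context.Orders
            .Include("Customer")
            .Include("OrderLines")        // the eager-loading cost mentioned above
            .First(o => o.OrderId == orderId);

        var formatter = new BinaryFormatter();
        using (var stream = new MemoryStream())
        {
            formatter.Serialize(stream, attached);
            stream.Position = 0;
            return (Order)formatter.Deserialize(stream);   // detached copy, graph intact
        }
    }
}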

A game of trade-offs

This experience reinforces one of the lingering doubts I have about O/RMs, and other tools, frameworks and patterns in general.  It often feels like a zero-sum game, where we’re just moving the complexity around rather than reducing it.  Instead of struggling with writing stored procedures or dynamic SQL and mapping the datasets and readers to object properties and back again (which I never found to be that onerous to begin with), we’re struggling with mapping objects to tables via XML and persistence idiosyncrasies.  It doesn’t feel like an easier or better way necessarily, just a different way with a different set of challenges.  Maybe that’s just familiarity with the old method and a lack of familiarity with the new one, but maybe it’s not.

Thursday, December 3, 2009

Dealing with Design Debt Part 2½: The Smell of Fear

It’s probably time for an update on the Dealing with Design Debt series.  I promised Part III would be the conclusion, and it will be, but we’re not quite there yet; nevertheless, there’s some story to tell now that will bridge the gap.

Along with the business case made for the debt reduction project, there were other business decisions that had a direct impact on, or were directly impacted by, the project.  Most significant were the several projects on the schedule with direct overlap with key structures under construction as part of the debt reduction.  To schedule these projects first would mean doing them twice, but not doing them first would mean delaying or missing revenue opportunities.

Measure twice, cut once.

From a purely technical perspective it made sense to complete the debt reduction project first, and tackle the dependent projects after.  This would allow us to design the solution for the new structure and implement once, rather than designing for the old structure, later retrofitting for the new, and implementing twice.  However, the business case ran counter to the technical case.  These projects were estimated to have significant revenue impact; therefore, delaying them until the completion of a long and risky debt reduction project was judged to represent a significant loss in revenue.

“It is a riddle, wrapped in a mystery, inside an enigma” - Churchill

Based on these projections, it was decided that the high revenue impact projects would be done prior to the debt reduction project, acknowledging the implied cost of implementing twice, and the ultimately less clean implementation.  Essentially, we had decided to take on debt during our debt reduction exercise.

Unfortunately, it’s not entirely clear whether that was the right decision.  The high-revenue-impact projects were completed, but they took longer than expected, pushed back the debt reduction project and made it a bit more complex.  Further, the actual revenue impact of those projects isn’t entirely clear, nor is the impact of the additional debt and complexity particularly measurable.

And there upon the rainbow is the answer to our neverending story

I guess the moral to this story is that, at least in this case, debt accumulation was an explicit rather than implicit trade-off decision.


Friday, November 13, 2009

Deep thoughts

 

I’ve recently come across some blog posts which resonate with some of my own ideas.  While these particular quotes are not necessarily related to each other, they are consistent with a larger theme that I’m trying to develop.  I’ve included the salient excerpts (providing my own titles).

 

Negative consequence of [design] patterns

One of the biggest challenges with patterns is that people expect them to be a recipe, when in reality they are just a vague formalization of a concept…In my view a pattern should only be used if its positive consequences outweigh its negative consequences. Many patterns, oddly enough, require extra code and/or configuration over what you’d normally write – which is a negative consequence.

– Rocky Lhotka

Why on earth are we doing so many projects that deliver such marginal value?

To understand control’s real role, you need to distinguish between two drastically different kinds of projects:

  • Project A will eventually cost about a million dollars and produce value of around $1.1 million.
  • Project B will eventually cost about a million dollars and produce value of more than $50 million.


What’s immediately apparent is that control is really important for Project A but almost not at all important for Project B. This leads us to the odd conclusion that strict control is something that matters a lot on relatively useless projects and much less on useful projects. It suggests that the more you focus on control, the more likely you’re working on a project that’s striving to deliver something of relatively minor value. To my mind, the question that’s much more important than how to control a software project is, why on earth are we doing so many projects that deliver such marginal value?

– Tom DeMarco

 

The dirty little secret about simple: It’s actually hard to do

The dirty little secret about simple: It’s actually hard to do. That’s why most people make complex stuff. Simple requires deep thought, discipline, and patience – things that many companies lack. That leaves room for you. Do something simpler than your competitors and you’ll win over a lot of people…You can try to win a features arms race by offering everything under the sun. Or you can just focus on a couple of things and do ‘em really well and get people who really love those things to love your product. For little guys, that’s a smarter route.

When you choose that path, you get clarity. Everything is simpler. It’s simpler to explain your product. It’s simpler for people to understand. It’s simpler to change it. It’s simpler to maintain it. It’s simpler to start using it. The ingredients are simpler. The packaging is simpler. Supporting it is simpler. The manual is simpler. Figuring out your message is simpler. And most importantly, succeeding is simpler.

– Matt (37Signals)

Wednesday, September 23, 2009

Loopback and kiss myself

A consequence of using transactions to roll back and thus clean up after database-dependent tests is that some code, which would not otherwise be run within a transaction context, doesn’t work when it is.  One such situation is the case of a loopback server, which I’ve encountered several times over the years.

When people stop being polite... and start getting real... The Real World

A real world production environment might consist of several databases spread across several machines.  And sometimes, as distasteful as it may sound, those databases are connected via linked servers.  That is exactly the situation we presently find ourselves in.  We have quite a few views and procedures that make use of these linked servers, and those views and procedures invariably get called from within unit tests.  That in and of itself isn’t an issue for transactional unit tests.  The critical factor is that our test environments, and more importantly our development environments, aren’t spread across multiple machines, but instead host several databases on one local SQL Server instance.

There can be only one

In order for code that utilizes linked servers to be executable in development environments, we create linked servers that actually point back to the same SQL Server instance, creating a loopback server.  Presently, loopback servers don’t support distributed transactions.  So what to do with transactional unit tests that call loopback code?

A few options come to mind, though each proved impractical for us:

  1. Turn off transactions on those tests, and write manual test data clean up code.
  2. Use aliasing so that the code doesn’t actually interact with the linked servers directly, and then simply don’t use linked servers in development environments, instead have the alias point directly to the tables, etc.
  3. Use virtualization to mimic physical production configuration.

These seem viable on paper, and #2 actually seems like a pretty good idea, but we have a mixed SQL 2000 and 2005 environment and aliasing is only available on 2005, so we’ve never even tried it.  Option #3, although it would more closely resemble the production configuration, is more practical for a test environment than a development environment.  So while it may solve the former, we still need to solve the latter, without the need for an overly complicated, complete and self-contained virtual environment for each developer.  Option #1 is just a step backwards that we’d like to avoid.

For instance keep your distance

There is a simpler option, and it is the path we recently chose after implementing transactional unit tests and finding numerous tests that immediately failed due to the loopback problem.  We simply set up a second instance of SQL Server on the development machines, and then configured the linked servers to point not back to the same instance but to the two different instances.  For transactions to work, it turns out that the linked servers don’t have to be on two physically separate machines, just two different instances of SQL Server.  This solution may have limited applicability for environments with multiple servers (and multiple links) or that don’t use linked servers at all.  But for us, with essentially only two databases that utilized the linked servers, it solved our issue without forcing us to change any code.  It’s only slightly more complicated in that we have to run two instances of SQL Server.  Eventually we may end up moving to an aliasing solution, though that will require code changes and an upgrade; for now we’ve sidestepped the loopback.
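
For illustration only (the server, instance and linked-server names here are invented), the development-box setup amounts to something like this, run once against the default instance via System.Data.SqlClient:

// The linked server keeps the name our views and procs already reference,
// but its data source is a second local instance rather than a loopback.
using (var cn = new SqlConnection(@"Data Source=.;Integrated Security=SSPI"))
using (var cmd = new SqlCommand(@"
    EXEC sp_addlinkedserver
        @server     = N'WAREHOUSE',       -- name used by existing code
        @srvproduct = N'',
        @provider   = N'SQLNCLI',
        @datasrc    = N'.\DEVINSTANCE2';  -- second local SQL Server instance
    ", cn))
{
    cn.Open();
    cmd.ExecuteNonQuery();
}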

Wednesday, September 16, 2009

Evolution of unit testing

We take unit testing seriously, and make it a cornerstone of our development process.  However, as I alluded to in an earlier post, our approach stretches the traditional meaning of unit testing.  With a system that was not designed for testability, often we have to write a lot of unrelated logic and integrate with external databases and services just to test a ‘unit’.  Further, while the desire would be to practice TDD, what we often practice is closer to Test Proximate Development: rather than starting with the test, we tend to create tests at approximately the same time or in parallel with our development.  Nevertheless, we have evolved and continue to increase the quantity and quality of our tests.  The path to our current methodology may be a familiar one.

Once you start down the dark path, forever will it dominate your destiny

We started out with a code-base that wasn’t necessarily written with testability in mind.  Nonetheless, the system was large enough, complex enough and mission critical enough to require not only testing our changes but that those tests be automated.  If we were to have any hope of introducing the amount of change demanded by the business, at the pace demanded and without introducing excessive amounts of bugs or letting a catastrophic blunder out into the wild, we had to begin building a suite of automated tests.

We made the most obvious choice and began using nUnit to write tests. We’ve used a variety of developer tools to run these tests throughout development, tools like TestDriven.NET, and later ReSharper.  We also set up CruiseControl.NET, which we were already using to automate our builds, to run these tests as part of the continuous integration process.

The biggest challenge, of course, was that there was no clear separation between business logic and data access code.  Therefore, right from the get-go, our ‘unit tests’ weren’t unit tests in the purist sense.  They exercised the code, but also required interaction with the backing store.  Further, the majority of tests required a lot of setup data to either already exist in the database or be created prior to the test run in order for the ‘unit’ of functionality to be testable (e.g. to test an Order Adjustment, a Customer, an Order, an Item, and other transactional records all had to exist first).  In the beginning this meant that the majority of tests randomly retrieved whatever data was in the test system to test with, or otherwise assumed that requisite data would be present allowing the tests to succeed.

That is why you fail

There are a few glaring problems with this approach that quickly exposed themselves. 

  • Tests would fail and succeed erratically. One test might alter test data used by a subsequent test in a way that would make it fail; order and timing mattered.  This left us chasing ghosts, often troubleshooting tests rather than application code.
  • The test database grew indefinitely as test data was dumped into the database on each build but never cleaned.  And builds were frequent.
  • The test database, originally set up on an old, unused PC, saw its performance degrade as the number of tests we wrote increased.  It got to the point where a failed test might take minutes to fix, but an hour to run the test suite to verify the fix.  Often, after waiting an hour or more, we’d find out another failure had occurred.  Fix-and-wait turnaround time became prohibitive.

We tackled these issues as they became productivity drains, in no particular order and with no particular strategy.  At first we addressed our exponential data growth and performance problems with solutions barely adequate to get us by, to keep us writing and running tests.

Of course we threw hardware at the problem, incrementally (more memory here, an extra drive there) as problems arose. Eventually we upgraded to a beefy server, but that was much later.  The bulk of our first phase attempts were concentrated on creating a pruning script to be triggered at the beginning of each test run.

Train yourself to let go of everything you fear to lose

The pruning script attempted to clean out all the data created by the prior run.  This script is rather long and complex, recursively traversing from parent tables to child tables to delete in reverse order (all manually written).  You might ask, why not just wipe the database clean and fill it with known data prior to each run?  This was considered, but ruled out for a reason that boils down to this: DELETE statements work regardless of the columns in a table and INSERT statements don’t, which makes pruning a little more resilient to schema changes.  It seemed to me that a dummy data creation script would be much harder to maintain, but others may question that assumption.

Attachment leads to jealousy. The shadow of greed, that is

Co-dependent tests came next.  We began to refactor our tests (as they became problems) to be properly isolated and autonomous.  These tests were rewritten to create their own setup data, as they should have done in the first place.

Having autonomously runnable tests, and more hardware resources, while continuously tweaking our pruning script, allowed us to grow our test suite to more than 2000 tests.  These tests ran in less than 20 minutes.  But of course these solutions were band-aids living on borrowed time.

At an end your rule is... and not short enough it was

Working toward eliminating the need for a pruning script, we began requiring that each test not only create its own data but also clean up after itself by removing that same data.  My initial solution was for each class to implement a Purge() method which would recursively call the Purge() methods of its children.  That way, each unit test could be wrapped in a try-finally, and within the finally all data created within the test would be purged.
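
The pattern looked roughly like this (simplified, with hypothetical business object types):

[Test]
public void OrderAdjustment_AdjustsOrderTotal()
{
    var customer = new Customer("Test Customer");
    customer.Save();
    try
    {
        // ...create the order and the adjustment, exercise the code, assert...
    }
    finally
    {
        // Purge() recursively purges the customer's children (orders,
        // adjustments, etc.) created during the test, then the customer itself.
        customer.Purge();
    }
}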

We wrote a considerable number of these Purge methods, which encountered some of the same order-of-execution/referential-integrity issues experienced by the pruning script, but they worked, more or less.  A good percentage of tests were now cleaning up after themselves.  But I had a bad feeling about the Purge pattern; every time I wrote a Purge method it was as if millions of voices suddenly cried out in terror, and were suddenly silenced.  Writing unit testing code directly into application code classes can do that to you.  The purge code, in retrospect, was nothing more than hand-coded compensating transactions.  Purge methods weren’t an elegant solution.  Off and on we toyed with the idea of using transactions and Enterprise Services to perform rollbacks, but each time it came up I was sure I had a good reason why it wouldn’t work, though I can’t recall one now.  An eventual epiphany led me to conclude that my Purge endeavor was ill-conceived, and that a more elegant solution would likely be found in the use of transactions.

Mind what you have learned. Save you, it can

I recently went back to the drawing board on our cleanup approach, and decided to look at TransactionScope for a simpler solution.  The idea was an obvious one: wrap our tests in a transaction which always rolls back, thereby obviating the need for Purge methods.  After a few quick proofs, I found that TransactionScope not only worked and was cleaner, but also performed better than the manual Purge methods.  I then encapsulated the transactional behavior in a base class from which all our test classes could inherit.

using System;
using System.Transactions;
using NUnit.Framework;

namespace Foo.Test.Common
{
    [TestFixture]
    public abstract class TransactionalTestBase : IDisposable
    {
        #region Setup/Teardown

        [SetUp]
        public virtual void Setup()
        {
            // Open an ambient transaction before each test; anything the test
            // writes to the database enlists in it automatically.
            trx = new TransactionScope();
        }

        [TearDown]
        public virtual void Teardown()
        {
            // Disposing the scope without calling Complete() rolls everything back.
            Dispose();
        }

        #endregion

        private TransactionScope trx;

        public void Dispose()
        {
            Dispose(true);
            GC.SuppressFinalize(this);
        }

        protected virtual void Dispose(bool disposing)
        {
            if (disposing)
            {
                if (trx != null)
                {
                    trx.Dispose();
                    trx = null;
                }
            }
        }
    }
}
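
Inheriting from it is then about as simple as it gets; something along these lines (the test content is hypothetical):

[TestFixture]
public class OrderAdjustmentTests : TransactionalTestBase
{
    [Test]
    public void OrderAdjustment_AdjustsOrderTotal()
    {
        // Setup data is created inside the ambient transaction opened by SetUp...
        var customer = new Customer("Test Customer");
        customer.Save();

        // ...exercise the code under test and assert as usual...

        // ...and TearDown disposes the scope without Complete(), rolling
        // everything back; no Purge() calls required.
    }
}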



Always there are two. A master and apprentice.



We’ve only recently begun to replace our current Purge methods with the transactional approach, and I think it holds promise for defeating Darth DataCleaner.  But I fear Darth DataSetupious is still out there forcing us inexorably toward repositories and mocking frameworks. Although, in my mind, the need to create and destroy data for testing purposes will always remain, there may be a new hope for bringing balance to our tests.



Wednesday, September 9, 2009

KISS my YAGNI abstraction

I’ve recently been observing what appears to me to be a growing contradiction within the software development community.  On the one hand, popularity and adoption of the various flavors of Agile and its affiliated principles is growing, while at the same time tools, technologies, patterns and frameworks are being pumped out which seek higher levels of abstraction, looser coupling, and greater flexibility.  Agile principles encourage simplicity, less waste and less up-front design, the most familiar of those principles being KISS (Keep It Simple, Stupid) and YAGNI (You Aren’t Gonna Need It).

But are the tools, technologies, patterns and frameworks simpler and necessary?  Does the fact that those recommended tools, technologies, patterns and frameworks continue to change so rapidly undermine any claims that they are simpler or necessary?

Who are you calling stupid?

In contrast to the doctrine of simplicity and just-in-time design, the latest technologies, tools, patterns and frameworks seem to be trending towards ultimate flexibility at the expense of simplicity.  Certainly SOA, N-tier, DDD, MVP, MVC, MVVM, IoC, POCO (and the list goes on) are anything but simple.  Not only aren’t they simple, but they are also likely to fall into the “not gonna need it” category.  If one were to blindly follow best-practice recommendations, then every application would be a Service Oriented, N-tier, Domain Driven, multi-layered, highly abstracted masterpiece, and would be re-written every few months.  But that hardly seems agile, lean, simple or ‘less is more’.  In some ways, Agile principles almost demand architecting after the fact.  On the other hand, if you wait until you need Service Orientation, Inversion of Control or a Domain model, it’s very difficult to add later.

“If Woody had gone straight to the police this would never have happened”

How many successful companies run their businesses on systems that don’t subscribe to any of these concepts, but are instead built with Cobol, FoxPro, Access, VB 6.0 or Classic ASP (or any equivalent ‘old school’ technology)?  And the corollary: how many failed businesses can attribute their failure to a fatal flaw in their LOB application design?  How many companies have said, “if we had only decoupled our inventory system from our purchasing system using a service layer and utilized a POCO capable O/RM tool we’d still be in business”?  My guess would be very few, and those few would likely be software companies or SaaS providers where the technology is the product.  But for the vast majority of companies, where technology is the enabler and not the product, are we being encouraged to over-engineer by the loudest 1% of developers?

The devil made me do it

In some ways developers are snobs.  I think we spend a lot of energy looking for ways to separate the men from the boys.  The classic ranking of developers as professionals, amateurs, hobbyists and hacks gets played out over and over.  Just recalling the C++ vs. VB developer comparisons reveals parallels with each new generation.  C++ developers are professionals while VB developers are hobbyists and amateurs.  C# developers are professionals while VB.NET developers are hobbyists and amateurs.  ASP.NET MVC developers are professionals while WebForms developers are hobbyists and amateurs.  Professionals use an O/RM, IoC, SOA and mocking frameworks; if you don’t, you’re an amateur.

I don’t want to suggest that, when used to solve a particular problem, any one of these technologies, tools, patterns or frameworks can’t in fact simplify a solution or make it more elegant or flexible, because they can and often do.  Nor am I suggesting that I’m not a participant in this snobbery, which I invariably am.  What interests me is when the desire to produce a ‘sophisticated’ or ‘professional’ solution means stuffing it full of the latest technologies, tools, patterns or frameworks and calling it simpler or more elegant.  While these are often interesting learning opportunities for developers and architects, and one more feather in our caps to differentiate ourselves from the outdated riff-raff, it hardly seems lean.  If, when your only tool is a hammer, every problem looks like a nail, then it can also be said that when you have a lot of (cool) tools, every problem seems to require them all.

“It depends on what the meaning of the word 'is' is”

This topic is further obscured by the fact that there is seldom a widely accepted ‘right’ way to do anything in software development.  Almost any approach has its share of debate.

For instance, there’s lots of debate about which O/RM tool is the best, or purest.  Even if you decide, as I recently did, that there is a general consensus in the community that some form of O/RM is the preferred persistence/data-access strategy, and you wade through the debates and pick a tool, inevitably you’ll discover another perspective that throws the decision back into question.  A recent post by Tony Davis, The ORM Brouhaha, did just that for me.

“The IT industry is increasingly coming to suspect that the performance and scalability issues that come from use of ORMs outweigh the benefits of ease and the speed of development”

Benchmarks posted on ORMbattle.NET purport to demonstrate a staggering performance difference between O/RMs and the standard SqlClient.  I mention this just as one example of how there are few right answers, just an endless series of trade-offs.

Which leads me back to my original premise. I don’t think we are necessarily keeping it simple or waiting until we need it, and we are certainly being pulled in two directions.

Monday, August 10, 2009

Prequel to Dealing with Design Debt

Two of my recent posts described how we are currently paying down our design debt in a big chunk.  As I alluded to in those posts, this sort of balloon payment is atypical, and has led us to deviate from our normal development practices.  So what are our typical development practices with regard to technical debt and scar tissue?

Shock the world

I don’t mean to blow your mind, but we refactor as we go, leaving the park a little cleaner than we found it.  In its simplest form, that means for every bug, feature or enhancement worked on we strive for refactoring equivalency.  In other words, we aim to refactor code proportional to the amount of change necessitated by the bug fix or to meet the feature or enhancement requirements.  If we have to add 10 new lines of code for a new feature, then we also need to find 10 lines of code worth of refactoring to do (generally in that same area of the code).

Beyond that simple precept are some practical considerations: we have to have some guidelines or goals around what kind of refactoring to do, some QA capability to provide a safety net that encourages refactoring, some form of code review process to keep us honest and consistent, and hopefully some tools to help identify and automate refactoring opportunities.

In the beginning God created the code-behind

We have guidelines regarding naming conventions, casing and standard exception handling, to name a few; we mostly follow the Microsoft guidelines.  There is a substantial amount of code written before these guidelines were agreed upon, and we change them over time as we learn more and new techniques become available.  There’s always some refactoring work to be done in this regard.

The main goal, however, has been to address the abuse of the code-behind.  Probably not all that unfamiliar to many: as our codebase has moved from classic ASP through the various versions of .NET, it has carried with it the primitive code-behind and Page_Load centric model.  Naturally, the bulk and focus of our refactoring effort goes into moving data access code and business logic out of pages into the business layer, increasing coherence by re-organizing and consolidating business logic into the correct business objects, and converting programmatic presentation logic into its declarative equivalent.  Essentially we want to make the code-behind as lean as possible, and the business objects as coherent as possible.  Not a lofty goal, and it’s a long way from MVC, MVP, DDD and so on, but it’s an essential first step before we can even consider a more well-defined and established pattern or framework.
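
A typical before-and-after, boiled way down (the page, grid and business object names are hypothetical):

// Before: Page_Load built its own SqlCommand, filtered rows and formatted
// columns inline.  After: the page asks a business object and binds declaratively.
protected void Page_Load(object sender, EventArgs e)
{
    if (!IsPostBack)
    {
        // Query logic and business rules now live in the business layer.
        ordersGrid.DataSource = Order.GetOpenOrders(customerId);
        ordersGrid.DataBind();
    }
}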

"So you wanna be a high roller"?

There’s a certain amount of risk in making any changes to the system, and we’re essentially doubling the amount of change with each request by adding the additional refactorings.  There’s a long-term payoff to these refactorings in terms of readability and maintainability (at the very least), but in the short term we have to mitigate the risk of “unnecessary” changes.  The way we do that is multifaceted: short iterations, unit tests, continuous integration, code reviews, and productivity tools.

  • What we mean by short iterations might not necessarily be what you expect.  We do a full system release every two weeks on a known schedule, completely independent of project or task schedules.  Therefore, all work targets a release and is fit into the two week cycle.  In a two week cycle that could mean one developer completes 10 bug fixes or one quarter of a 2 month project, but either way, whatever is complete and releasable is released into the live system.  This keeps the volume of change smaller and more manageable so that when things do go wrong the source of the issue can be more quickly pinpointed.
  • Unit tests are required for unit testable changes (and changes that may not be testable are often refactored to be unit testable).  The pre-existing as well as the new unit tests provide the safety net we need to safely refactor.  As long as the changes don’t break the unit tests, developers can feel reasonably confident about the changes they’ve made and be more aggressive about refactoring than they might be otherwise.  As the volume and completeness of unit tests increases we can be more and more aggressive.
  • We use CruiseControl.NET as our continuous integration server to monitor our Subversion repository and build and run unit tests whenever changes are committed.  This gives us nearly immediate feedback when changes do break something and as early as possible in the cycle.
  • We have an on-line code review process utilizing integration between Fogbugz and Subversion.  All system changes require a Fogbugz case # (large projects may be broken into numerous cases).  Each case includes change notes as well as the versions of the modified code attached.  Cases are then routed from the developer to a reviewer who can view the diffs directly from the case where they can approve the changes or comment and send them back for additional work before allowing them into the release.
  • In addition to the already mentioned tools, Subversion, CruiseControl.NET, Fogbugz, nUnit, our big gun is the widely used ReSharper which identifies refactoring opportunities, provides shortcuts to those refactorings and just does what ReSharper does.  We run a few other tools for informational and trending purposes like fxCop, StyleCop, CloneDetective, StatSvn, etc.  These tools don’t necessarily provide us any productivity gains at this point but some are incorporated into our build processes as indicators.

Let the healing begin

This is just a brief overview of how we approach incremental improvement.  It’s working for us, albeit with a few glaring caveats.  It’s undoubtedly a slow process; some portions of the code base almost never get touched and thus are unlikely to be refactored.  Secondly, we have a hard time measuring our progress.  Code coverage, code duplication rate and lines of code are our best metrics.  Just to give you some idea of what those metrics are: over approximately a two-year period our code coverage has risen from 10% to close to 50%, our duplication rate has dropped approximately 30%, and our number of lines of code has begun trending downward even as we add features and grow the system.  Those are all good indicators that we’re heading in the right direction, even if they don’t tell us how far we’ve come or how far we have left to go.

Tuesday, July 28, 2009

Linq-to-Sql, Entity Framework, NHibernate and what O/RM means to me.

We’ve recently completed an upgrade of our enterprise application suite, or home grown ‘mini-ERP’ for lack of a better term.  We upgraded from .NET 2.0 to .NET 3.5 SP1,  a fairly painless exercise as far as complete system upgrades go.  We didn’t really have to change much code to get there, but with .NET 3.5 we get a whole host of new features and capabilities.  There’s WCF, WF, WPF, Silverlight, ASP.NET MVC, Linq and the Entity Framework just to name a few.  Naturally, the question is which of these new capabilities do we try to take advantage of and in what order?

To use any of them represents a considerable learning curve, a significant change in our architecture and in some cases a paradigm shift in thinking about how we solve certain kinds of problems. As the title implies, in this post I intend to discuss persistence patterns and technologies.

Being persistent

Our current persistence pattern is likely a familiar one to most.  We have a collection of business objects that serve the dual purpose of both business logic layer and data access layer.  That is to say, the objects contain both data and behavior as well as the logic for how to hydrate and persist themselves.  These objects loosely follow an Active Record pattern, mapping roughly one-to-one with the underlying data structure entities.  The persistence logic consists of internal calls to stored procedures which return DataSets/DataReaders that are then manually mapped to the object properties (or private members); in turn, the properties are mapped back to stored procedure parameters for persistence.  This works adequately and has the advantage of being already written, proven and time-tested code.  Why would I consider changing existing working code to utilize an O/RM?
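
In simplified form (the procedure, property and connectionString names are hypothetical), the hydration half of that pattern looks something like this:

// Inside a business object: call the stored procedure and map columns to
// properties by hand; Save() does the reverse, properties back to parameters.
public void Load(int orderId)
{
    using (var cn = new SqlConnection(connectionString))
    using (var cmd = new SqlCommand("dbo.GetOrder", cn))
    {
        cmd.CommandType = CommandType.StoredProcedure;
        cmd.Parameters.AddWithValue("@OrderId", orderId);
        cn.Open();
        using (var reader = cmd.ExecuteReader())
        {
            if (reader.Read())
            {
                OrderId = (int)reader["OrderId"];
                Status  = (string)reader["Status"];
                Total   = (decimal)reader["Total"];
            }
        }
    }
}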

SwwwORM

There’s a multitude of O/RM tools that have been around for a while, are widely used and feature-rich, and have achieved a level of maturity, none of which depend on .NET 3.5.  The most well known and probably most widely used of these is NHibernate.  Admittedly, my only experience with this tool was purging its remnants from a codebase I inherited a number of years back, not because of any deficiency with NHibernate, but because it was haphazardly implemented by a developer who didn’t understand it and then left the company.  At that time we had no experience with NHibernate, and were inundated with performance problems, locking and transactional issues in any code that used it.  Since it was a relatively small percentage of the codebase utilizing NHibernate, we decided that rather than ramp up on NHibernate and fix the implementation, we’d instead favor consistency by replacing the NHibernate code with our tried-and-true DataSet pattern.  We weren’t ready then, but I knew at some point we’d need to revisit O/RM, whether it be NHibernate or another of the multitude out there.  What does .NET 3.5 have to do with revisiting O/RM?

Linq is the word, it’s got groove, it’s got meaning

For me, Linq is the catalyst to challenging our data access approach.  From the get-go, Linq-to-Objects and Linq-to-Xml were no-brainers.  To be able to query collections, XML documents, and DataSets with SQL-like syntax was an obvious and giant leap over nested foreach loops and similar conventions.  Linq-to-Sql and later Linq-to-Entities, however, are intriguing but much less obvious choices.  Considering that we already have hundreds of business objects, the rapid development features of generating code directly from the database schema and coding away aren’t compelling.  These two technologies, if we are to use them, would have to be shoehorned into our existing architecture.  We aren’t about to replace all our business objects with Linq-to-Sql’s ‘dumb’ data objects, throw away all our optimized stored procedures and hope for the best from the generated SQL.  Similarly, we aren’t going to convert all our business objects to EntityObjects and also hope for the best from the Linq-to-Entities SQL generator.  In both instances, we’d also be forced to find a place for all the other code and business logic contained in our business objects that wouldn’t play nice with the code generators, whether that be by using partial classes, inheritance or whatever.  Add to this the fact that L2S is supposedly dead and EF is v1, and I can’t make a strong case to jump on either of these as a business object replacement.  But do I really need L2S or EF to replace my business objects, or even want them to if they could?
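
The Linq-to-Objects leap is easy to see with even a trivial example (the Customer/Order shapes are hypothetical; System.Linq and System.Collections.Generic assumed in scope):

// Before: nested loops to collect overdue orders across customers.
var overdue = new List<Order>();
foreach (Customer c in customers)
    foreach (Order o in c.Orders)
        if (o.DueDate < DateTime.Today && !o.IsPaid)
            overdue.Add(o);

// After: the same thing expressed as a query.
var overdueOrders = (from c in customers
                     from o in c.Orders
                     where o.DueDate < DateTime.Today && !o.IsPaid
                     select o).ToList();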

Rise of the Data Layer

Code generation, tight coupling between database schema and objects, and SQL generation all become much more compelling at the data layer rather than the business layer.  If I view L2S objects, EF entities, NHibernate-generated objects, or any other O/RM objects as data transfer objects (DTOs) rather than business objects, then things get a little more interesting.  Using EF (I actually started out with L2S) to do some prototyping, I used a Repository pattern in conjunction with Linq to query and persist data-centric entities in a ‘detached’ manner.  This approach allows me to replace my DataSet/DataReader mapping code with entity mapping code, with the following benefits:

  • I only need to change the internals of my business objects; their structure, inheritance tree and interfaces are all unchanged.
  • My business objects are now mapped to type-safe entities, rather than DataSets and DataReaders, and with fewer lines of code.
  • Persistence is two-way, handled by mapping back to those same entities, rather than passing a litany of parameters back to a stored procedure.
  • With very few lines of .NET code and no stored procedures I can handle SELECT, INSERT, UPDATE and DELETE, allowing me to drop all the boilerplate stored procedures that perform CRUD now, and not write new ones.
  • This same approach could just as easily have utilized L2S, NHibernate, LLBLGen, etc., giving me the flexibility to choose or change tools without significant impact on the business objects.

This may not be POCO paradise or necessarily what the O/RM vendors intended, but for me it provides a compelling use for the technology while not forcing me to lock in and conform to the particulars of one tool.
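
The prototype repository is roughly this shape (a simplified sketch with hypothetical context and entity names; the full sample is linked below):

public class OrderRepository
{
    // Query with Linq-to-Entities and hand back a detached entity the
    // business object can hydrate from.
    public Order GetOrder(int orderId)
    {
        using (var context = new FooEntities())
        {
            var entity = context.Orders.First(o => o.OrderId == orderId);
            context.Detach(entity);
            return entity;
        }
    }

    // Persist a detached entity: insert if it has never been saved,
    // otherwise re-query and overlay the changed values.
    public void Save(Order entity)
    {
        using (var context = new FooEntities())
        {
            if (entity.EntityKey == null)
            {
                context.AddToOrders(entity);
            }
            else
            {
                context.Orders.First(o => o.OrderId == entity.OrderId);
                context.ApplyPropertyChanges("Orders", entity);
            }
            context.SaveChanges();
        }
    }
}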

Sometimes code is worth a thousand words so I’ve provided sample code. Download

Why use the Entity Framework?

Monday, July 13, 2009

Dealing with Design Debt: Part II

In my previous post I talked about technical debt and scar tissue, and how we used those concepts to identify and justify a project to rework a core part of our mission-critical business applications.  In this post I want to talk about our approach to tackling such a high-risk project, one viewed more as a necessary evil than as an exciting new feature or capability.

Race to Baghdad

There are many ways that systems like ours grow, and I’d hazard a guess that few are architected up front with carefully drawn schematics and blueprints.  In our case, it would be more accurate to describe how we got here as a Race to Baghdad approach.  The original mission was to get the business off the ground and keep it growing by erecting applications and tying them together into a larger whole, always meeting the immediate needs.  In that sense we won the war but not the peace.  The business is thriving, but the system is struggling with the aftermath of being stretched too thin, with a lot of terrain left to go back and fortify.

In the context of our current debt reduction project, the terrain we’re fortifying is not an isolated ‘module’ (a term I use loosely), core but insulated.  Instead it is a hub in a spider web of dependencies.  This means that hundreds of stored procedures, views, queries, reports and all the application logic built on top of them could be, and often are, directly connected to each other and used by other modules.  Further, it means that radically altering the structure of essentially three entities will have a ripple effect across the majority of our enterprise systems.  Naturally, our first instinct was to find a stable wormhole, go back in time, build a loosely coupled modular system, then slingshot around the sun back to the present where we’d then begin our debt reduction project.  After a few Google searches failed to yield any results we switched to Plan B.

An iterative heart transplant?

We generally like to follow an agile-ish approach and develop in short iterations, releasing frequently.  However, this project felt like it wasn’t well suited to that approach.  Many, including those on the team, may question the wisdom of that decision now or in retrospect.  The decision to perform a few large iterations rather than dozens of smaller ones was not made lightly.  The prevailing sentiment, at the outset of the project, was that making a small modification to each of a number of entities one at a time and then sweeping through the entire code-base following the ripples would lead to too many iterations, and more importantly too much repetitive work (sweeping through the code-base over and over again) and a prohibitive duration.  A project like this, which won’t have visible, tangible business benefits until final completion, coupled with the prospect of an extremely long duration, led us to deviate from our norm.  We decided to attempt to shorten the duration by completing fewer sweeps through the code, acknowledging both the longer-than-normal iteration cycles and the danger inherent in massive changes released all at once.  We foresaw three major iterations, each bigger than its predecessor.

“This shit's somethin. Makes Cambodia look like Kansas”

1) Reduce the surface area. Borrowing a term from the security world, we decided that since we didn’t have modularity on our side, at least we could, in the first iteration, eliminate dependencies and hack away the strands of the spider web. This made sense from several perspectives:

  • It allowed us to get a fuller picture of what we were dealing with by going through the entire code-base once, without being overcommitted. If our initial estimates of scope and cost were off, we could adjust them and/or allow the stakeholders to pull the plug on the project.
  • Removing unused and seldom-used code, combining and reducing duplication, and refactoring to circumvent easily avoidable dependencies was in and of itself a debt-reductive activity (or perhaps the analogy of removing scar tissue is more appropriate here).  If the plug were to be pulled on the project due to technical, resource, financial or any of a number of business re-prioritization reasons, completing this activity would be a significant step towards modularity and simplification.  There was immediate ROI.
  • It gave the stakeholders a legitimate evaluation point at which to decide to continue or not without “throwing away” work.

2) Neuter the entity. One critical entity was radically transforming both in terms of its attributes and its relationship to other entities. At this step we aimed to move the attributes to their new homes on other entities or eliminate them entirely, while at the same time preserving the relational structure. This meant that some code would have to change, but a significant number of queries, reports, and the like could remain only marginally affected, as the JOINs would stay viable. It would also mean that the eventual change to the entity’s relationships would be slightly less impactful, because most of its attributes had been removed, leaving it more or less a relational placeholder. At this point we’d also write the data migration scripts.

3) Break the chains. The last step would be to sever the relationships and reorganize the entities, effectively breaking all code that had previously relied on those relationships. Once this was done, there was no going back: no partially affected queries or seemingly unaffected business logic. We struggled to find a way to do this without one massive all-at-once ripple, but couldn’t (without going back to the “long duration” approach).

“Plans are useless: planning is invaluable” (Eisenhower)

Currently we’re working on breaking the chains. We successfully reduced the surface area, reaping more benefit (getting rid of more scar tissue than anticipated) in a shorter time than expected. However, neutering the entity proved elusive, perhaps out of necessity, but equally likely due to a lack of resolve on our part to stay committed to the plan. Some of the work done there wasn’t sufficiently insulated to release; what could not be released safely into the production system has slipped into the current iteration, making it bigger than we’d hoped. The lessons continue to be learned.

The trilogy will conclude with: The Final Iteration

Tuesday, July 7, 2009

Dealing with Design Debt: Part I

There are two related terms that always come to mind when I’m asked to evaluate or implement any non-trivial change or addition to the line-of-business application systems that I work on: technical debt and scar tissue.

First suggested by Ward Cunningham, the idea of technical debt:

Shipping first time code is like going into debt. A little debt speeds development so long as it is paid back promptly with a rewrite.... The danger occurs when the debt is not repaid. Every minute spent on not-quite-right code counts as interest on that debt. Entire engineering organizations can be brought to a stand-still under the debt load of an unconsolidated implementation, object-oriented or otherwise.

And scar tissue as described by Alan Cooper:

As any program is constructed, the programmer makes false starts and changes as she goes. Consequently, the program is filled with the scar tissue of changed code. Every program has vestigial functions and stubbed-out facilities. Every program has features and tools whose need was discovered sometime after construction began grafted onto it as afterthoughts. Each one of these scars is like a small deviation in the stack of bricks. Moving a button from one side of a dialog box to the other is like joggling the 998th brick, but changing the code that draws all button-like objects is like joggling the 5th brick.

…As the programmers proceed into the meat of the construction, they invariably discover mistakes in their planning and flaws in their assumptions. They are then faced with Hobson's choice of whether to spend the time and effort to back up and fix things from the start, or to patch over the problem wherever they are and accept the burden of the new scar tissue—the deviation. Backing up is always very expensive, but that scar tissue ultimately limits the size of the program—the height of the bricks.

Each time a program is modified for a new revision to fix bugs or to add features, scar tissue is added. This is why software must be thrown out and completely rewritten every couple of decades. After a while, the scar tissue becomes too thick to work well anymore.

I will continue with these two analogies, the financial and the biological, as they both suggest dire consequences for applications that grow without accounting for the cumulative effects of all the design decisions and trade-offs made throughout the life of the application.  Yet so many systems grow organically in just that way.  So what do you do with an application that’s already in debt and badly scarred?

The first stage is denial.

Whether it’s the other members of your technology department, technology leadership, or business owners, someone will likely need not only to be convinced of the debt (unlike financial debt, which is more obvious), but also to understand the magnitude of the problem and why it ever need be “paid off”. This is precisely the quandary we found ourselves in several months ago.

For the better part of a year, or maybe longer, several seemingly reasonable large-scale projects were assessed as being too costly (in terms of time and effort) and postponed, only to be periodically re-proposed with a slightly different twist or by another business owner. Over the same time frame, there were also many projects that either turned out to take longer or be more complicated than anticipated, or had their planning padded due to the acknowledgement that they’d likely tread on some of the more heavily scarred tissue within our system. Recognizing this pattern, we were able to identify and raise the visibility of a fundamental design flaw within our application systems. This particular design flaw was hurting our ability to add new functionality, collect accurate business intelligence data and do general application maintenance and enhancements. The module in question (and its structure) is fundamental, critical and core to our business and subsequently to how the business logic has been written; the equivalent of a major organ.

That there was a problem within this module was easy enough for developers to see; they had to tread lightly, create complicated work-arounds, and/or hard-code band-aids in order to bridge gaps between the current design and all those newly requested features that were grafted on over the years. Similarly, the problem manifested itself for those writing reports, who felt like they were pushing the limits of SQL just to answer basic questions. However, what was less clear was determining the correct, or more correct, design for this module (more on that in later posts).

Moving to Acceptance

Having identified debt that we were continuing to pay interest on, in the form of added complexity on many smaller projects and prohibitive complexity on larger ones, the case for paying down the principal gained momentum. The trick was to identify how big the debt was, how much we were paying in interest and how much of the principal we needed to pay to get out from under it.

We estimated that we were paying 10% interest on every project (for just this one design issue). This was a swag, arrived at by looking at a sample time period and seeing how many of our projects dealt with, either directly or indirectly, the scarred module in question. This module, being so fundamental to our business, was estimated to be substantially involved in roughly 40% of our projects over the sample period. This estimate may seem high, but another factor, tight coupling, made ripples inevitable. The second part of the estimate was equally rough, in that we estimated paying a 25% complexity cost on those projects. Therefore, our rudimentary debt analysis yielded a cost of 25% on 40% of projects, which for simplicity’s sake became a flat 10% of all projects. Following this to its logical conclusion, we could see that we were already living with substantial debt and implicitly paying interest (without even taking into account the projects that were simply not done because of the perceived complexity). Further, by continuing to build upon the unsure footing of a flawed module we’d be taking on more debt, and could easily be seeing 30%-40% complexity interest on 50%-60% of our projects in the next year or two.

Putting the scarred module in those terms (projects that couldn’t be done, data that couldn’t be mined, and costs incurred on every project that weren’t even addressing the issue) was the basis for justifying a project to correct the design flaw in our core table structures. We were going to try to pay off the debt.

Next up: How do you perform a heart transplant on a runner while they’re running a marathon?


Tuesday, June 30, 2009

Ex Post Facto

This is the launch of a new blog addressing the practical concerns of working with the myriad of systems that are mission-critical to many businesses, yet have grown organically, been cobbled together in one way or another, and been built using whatever technologies and skills were available at the time.

Continuing to maintain, extend, and enhance these systems in the face of ever-changing business requirements is often at odds with the need and desire of developers to retrofit them according to established design patterns, using best practices with the latest technologies and architectures. Add to that the pace at which technology is changing, the articles exhorting us to adopt or die, and a financial climate leaving companies reticent to invest in costly 'infrastructure' projects, and most of us are left in a quandary.

There are numerous blogs, books and articles describing how to use the multitude of new technologies and techniques, but they generally assume a clean slate, which for a lot of us is seldom the case. I don't have the answers, but I intend to share my experiences as I try to strike a balance between the realities of the applications I have and the applications they "should ideally be".