Wednesday, September 23, 2009

Loopback and kiss myself

A consequence of using transactions to roll back, and thus clean up, after database-dependent tests is that some code that would not otherwise run within a transaction context doesn’t work when it does. One such situation is the case of a loopback server, which I’ve encountered several times over the years.

When people stop being polite... and start getting real... The Real World

A real-world production environment might consist of several databases spread across several machines. And sometimes, as distasteful as it may sound, those databases are connected via linked servers. That is exactly the situation we presently find ourselves in. We have quite a few views and procedures that make use of these linked servers, and those views and procedures invariably get called from within unit tests. That in and of itself isn’t an issue for transactional unit tests. The critical factor is that our test environments, and more importantly our development environments, aren’t spread across multiple machines but instead host several databases on one local SQL Server instance.

There can be only one

In order for code that utilizes linked servers to be executable in development environments, we create linked servers that actually point back to the same SQL Server instance, creating a loopback server. Loopback servers, however, presently don’t support distributed transactions. So what to do with transactional unit tests that call loopback code?
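To make the failure concrete, here is a minimal sketch of the kind of test code that trips over it. The connection string, the linked server name (LOOPBACK), and the table names are all hypothetical; the point is just that a write through a four-part name inside a TransactionScope forces a distributed transaction, which a loopback linked server can’t participate in.

using System;
using System.Data.SqlClient;
using System.Transactions;

class LoopbackFailureDemo
{
    static void Main()
    {
        using (var scope = new TransactionScope())
        using (var conn = new SqlConnection(
            "Data Source=.;Initial Catalog=OrdersDb;Integrated Security=SSPI"))
        {
            conn.Open(); // enlists in the ambient transaction

            // The four-part name routes the statement through the linked
            // server. Inside a transaction this must be coordinated by
            // MSDTC, and with a loopback linked server that coordination
            // fails (typically with an error along the lines of
            // "Transaction context in use by another session").
            var cmd = new SqlCommand(
                "UPDATE LOOPBACK.OrdersDb.dbo.Orders SET Status = 1 WHERE OrderId = 42",
                conn);
            cmd.ExecuteNonQuery(); // throws in a loopback configuration

            scope.Complete(); // never reached
        }
    }
}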

A few options come to mind, though each proved impractical for us:

  1. Turn off transactions on those tests, and write manual test data clean up code.
  2. Use aliasing so that the code doesn’t interact with the linked servers directly, then simply don’t use linked servers in development environments; instead, have the alias point directly at the local tables.
  3. Use virtualization to mimic physical production configuration.

These all seem viable on paper, and #2 actually seems like a pretty good idea, but we have a mixed SQL 2000 and 2005 environment and aliasing is only available on 2005, so we’ve never even tried it. Option #3, although it would most closely resemble the production configuration, is more practical for a test environment than a development environment. So while it may solve the former, we still need to solve the latter, without requiring an overly complicated, complete, and self-contained virtual environment for each developer. Option #1 is just a step backwards that we’d like to avoid.

For instance keep your distance

There is a simpler option, and it’s the path we recently chose after implementing transactional unit tests and finding numerous tests that immediately failed due to the loopback problem. We simply set up a second instance of SQL Server on the development machines and configured the linked servers to point not back to the same instance but across the two different instances. For transactions to work, it turns out that the linked servers don’t have to be on two physically separate machines, just two different instances of SQL Server. This solution may have limited applicability for environments with many servers (and many links) or that don’t use linked servers at all. But for us, with essentially only two databases utilizing the linked servers, it solved the issue without forcing us to change any code. It’s only slightly more complicated in that we have to run two instances of SQL Server. Eventually we may move to an aliasing solution, which will require code changes and an upgrade, but for now we’ve sidestepped the loopback.
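The linked server definition itself is a one-time piece of setup per development machine. Something along these lines should do it (the linked server name, instance name, and connection string here are hypothetical):

using System.Data.SqlClient;

class LinkedServerSetup
{
    static void Main()
    {
        // Point the linked server at a second local instance (e.g.
        // DEVBOX\TEST2) rather than back at the default instance, so the
        // distributed transaction has two distinct participants.
        const string sql =
            @"EXEC sp_addlinkedserver
                  @server     = N'WAREHOUSE',   -- the name production code expects
                  @srvproduct = N'',
                  @provider   = N'SQLNCLI',
                  @datasrc    = N'DEVBOX\TEST2' -- second instance, not a loopback";

        using (var conn = new SqlConnection("Data Source=.;Integrated Security=SSPI"))
        using (var cmd = new SqlCommand(sql, conn))
        {
            conn.Open();
            cmd.ExecuteNonQuery();
        }
    }
}

Keep in mind that the Distributed Transaction Coordinator service has to be running on the development machine for the cross-instance transaction to be promoted.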

Wednesday, September 16, 2009

Evolution of unit testing

We take unit testing seriously and make it a cornerstone of our development process. However, as I alluded to in an earlier post, our approach stretches the traditional meaning of unit testing. With a system that was not designed for testability, we often have to write a lot of unrelated logic and integrate with external databases and services just to test a ‘unit’. Further, while the desire would be to practice TDD, what we often practice is closer to Test Proximate Development: rather than starting with the test, we tend to create tests at approximately the same time as, or in parallel with, our development. Nevertheless, we have evolved and continue to increase the quantity and quality of our tests. The path to our current methodology may be a familiar one.

Once you start down the dark path, forever will it dominate your destiny

We started out with a code-base that wasn’t necessarily written with testability in mind. Nonetheless, the system was large enough, complex enough and mission-critical enough to require not only testing our changes but automating those tests. If we were to have any hope of delivering the amount of change demanded by the business, at the pace demanded, without introducing excessive bugs or letting a catastrophic blunder out into the wild, we had to begin building a suite of automated tests.

We made the most obvious choice and began using NUnit to write tests. We’ve used a variety of developer tools to run these tests during development, tools like TestDriven.NET and, later, ReSharper. We also set up CruiseControl.NET, which we were already using to automate our builds, to run the tests as part of the continuous integration process.

The biggest challenge, of course, was that there was no clear separation between business logic and data access code. Therefore, right from the get-go, our ‘unit tests’ weren’t unit tests in the purist sense. They exercised the code, but they also required interaction with the backing store. Further, the majority of tests required a lot of setup data to either already exist in the database or be created prior to the test run in order for the ‘unit’ of functionality to be testable (e.g., to test an Order Adjustment, a Customer, an Order, an Item, and other transactional records all had to exist first). In the beginning this meant that the majority of tests randomly retrieved whatever data happened to be in the test system, or otherwise assumed the requisite data would be present, allowing the tests to succeed.

That is why you fail

A few glaring problems with this approach quickly exposed themselves.

  • Tests would fail and succeed erratically. One test might alter test data used by a subsequent test in a way that would make it fail; order and timing mattered.  This left us chasing ghosts, often troubleshooting tests rather than application code.
  • The test database grew indefinitely as test data was dumped into the database on each build but never cleaned.  And builds were frequent.
  • The test database, originally set up on an old, unused PC, saw its performance degrade as the number of tests we wrote increased. It got to the point where a failed test might take minutes to fix but an hour to run the test suite to verify the fix. Oftentimes, after waiting an hour or more, we’d find another failure had occurred. Fix-and-wait turnaround time became prohibitive.

We tackled these issues as they became productivity drains, in no particular order and with no particular strategy. At first we addressed our runaway data growth and performance problems with solutions barely adequate to get us by, to keep us writing and running tests.

Of course we threw hardware at the problem, incrementally (more memory here, an extra drive there) as problems arose. Eventually we upgraded to a beefy server, but that was much later. The bulk of our first-phase attempts was concentrated on creating a pruning script to be triggered at the beginning of each test run.

Train yourself to let go of everything you fear to lose

The pruning script attempted to clean out all the data created by the prior run. The script is rather long and complex, recursively traversing from parent tables to child tables to delete in reverse order (all manually written). You might ask: why not just wipe the database clean and fill it with known data prior to each run? This was considered but ruled out for a reason that boils down to this: DELETE statements work regardless of the columns in a table, INSERT statements don’t, which makes pruning a little more resilient to schema changes. It seemed to me that a dummy-data creation script would be much harder to maintain, though others may question that assumption.
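To give a feel for the shape of the thing, here is a heavily compressed sketch; the real script is far longer, and the table names and the CreatedBy marker are hypothetical stand-ins:

using System.Data.SqlClient;

// Compressed sketch of the pruning idea: delete child tables before
// parents so foreign keys never block the cleanup. DELETE survives added
// or removed columns, which is what made pruning more resilient to schema
// changes than reseeding with INSERTs would have been.
static class TestDataPruner
{
    private static readonly string[] TablesChildFirst =
    {
        "dbo.OrderAdjustments", // grandchildren first
        "dbo.OrderLines",
        "dbo.Orders",
        "dbo.Customers"         // roots last
    };

    public static void Prune(SqlConnection conn)
    {
        foreach (string table in TablesChildFirst)
        {
            // Only rows flagged as test data are touched (a convention
            // assumed here for illustration).
            using (var cmd = new SqlCommand(
                "DELETE FROM " + table + " WHERE CreatedBy = 'TESTRUN'", conn))
            {
                cmd.ExecuteNonQuery();
            }
        }
    }
}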

Attachment leads to jealousy. The shadow of greed, that is

Co-dependent tests came next. We began to refactor our tests (as they became problems) to be properly isolated and autonomous. These tests were rewritten to create their own setup data, as they should have in the first place.
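The rewritten tests took roughly this shape (the domain types and the TestData helper are hypothetical stand-ins):

using NUnit.Framework;

[TestFixture]
public class OrderAdjustmentTests
{
    [Test]
    public void Adjustment_Reduces_Order_Total()
    {
        // Build the full chain of prerequisite records rather than
        // assuming they already exist in the test database.
        Customer customer = TestData.CreateCustomer();
        Item item = TestData.CreateItem(10m);                  // unit price
        Order order = TestData.CreateOrder(customer, item, 2); // quantity

        order.ApplyAdjustment(-5m);

        Assert.AreEqual(15m, order.Total);
    }
}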

Having autonomously runnable tests and more hardware resources, while continuously tweaking our pruning script, allowed us to grow our test suite to more than 2,000 tests, which ran in less than 20 minutes. But of course these solutions were band-aids, living on borrowed time.

At an end your rule is... and not short enough it was

Working toward eliminating the need for a pruning script, we began requiring that each test not only create its own data but also clean up after itself by removing that same data. My initial solution was for each class to implement a Purge() method, which would recursively call the Purge() methods of its children. Each unit test could thereby be wrapped in a try-finally, and within the finally all data created by the test would be purged.
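In outline, the pattern looked something like this (the types and the gateway are hypothetical):

// Each class knows how to delete its own data, children first.
public partial class Order
{
    public void Purge()
    {
        foreach (OrderLine line in Lines)
            line.Purge();            // children before parents,
        OrderGateway.Delete(Id);     // to satisfy referential integrity
    }
}

[TestFixture]
public class OrderTests
{
    [Test]
    public void Some_Order_Test()
    {
        Order order = TestData.CreateOrder();
        try
        {
            // ... exercise the order ...
        }
        finally
        {
            order.Purge(); // hand-coded compensating transaction
        }
    }
}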

We wrote a considerable number of these Purge methods, which encountered some of the same order-of-execution and referential-integrity issues experienced by the pruning script, but they more or less worked. A good percentage of tests were now cleaning up after themselves. But I had a bad feeling about the Purge pattern; every time I wrote a Purge method it was as if millions of voices suddenly cried out in terror and were suddenly silenced. Writing unit-testing code directly into application classes can do that to you. The purge code, in retrospect, was nothing more than hand-coded compensating transactions. Purge methods weren’t an elegant solution. Off and on we toyed with the idea of using transactions and Enterprise Services to perform rollbacks, but each time it came up I could have sworn I had a good reason why it wouldn’t work, though I can’t recall one now. An eventual epiphany led me to conclude that my Purge endeavor was ill-conceived, and that a more elegant solution would likely be found in transactions.

Mind what you have learned. Save you, it can

I recently went back to the drawing board on our cleanup approach and decided to look at TransactionScope for a simpler solution. The idea was an obvious one: wrap each test in a transaction that always rolls back, thereby obviating the need for Purge methods. After a few quick proofs, I found TransactionScope not only worked but was cleaner and performed better than the manual Purge methods. I then encapsulated the transactional behavior in a base class from which all our test classes could inherit.

using System;
using System.Transactions;
using NUnit.Framework;

namespace Foo.Test.Common
{
    /// <summary>
    /// Base fixture that opens a TransactionScope before each test and
    /// disposes it, without ever calling Complete(), after each test,
    /// so all database work is rolled back automatically.
    /// </summary>
    [TestFixture]
    public abstract class TransactionalTestBase : IDisposable
    {
        private TransactionScope trx;

        #region Setup/Teardown

        [SetUp]
        public virtual void Setup()
        {
            // Any connection opened during the test enlists in this
            // ambient transaction automatically.
            trx = new TransactionScope();
        }

        [TearDown]
        public virtual void Teardown()
        {
            // Disposing the scope without Complete() rolls everything back.
            Dispose();
        }

        #endregion

        public void Dispose()
        {
            Dispose(true);
            GC.SuppressFinalize(this);
        }

        protected virtual void Dispose(bool disposing)
        {
            if (disposing)
            {
                if (trx != null)
                {
                    trx.Dispose();
                    trx = null;
                }
            }
        }
    }
}
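
A derived fixture then needs no cleanup code at all; something like this (the Customer type is a hypothetical stand-in):

using NUnit.Framework;
using Foo.Test.Common;

[TestFixture]
public class CustomerTests : TransactionalTestBase
{
    [Test]
    public void Can_Save_And_Reload_Customer()
    {
        // Everything here runs inside the ambient TransactionScope opened
        // in Setup(); disposing it in Teardown() rolls it all back.
        var customer = new Customer("Acme");
        customer.Save();

        Assert.IsNotNull(Customer.Load(customer.Id));
    }
}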

Always there are two. A master and apprentice.

We’ve only recently begun to replace our Purge methods with the transactional approach, and I think it holds promise for defeating Darth DataCleaner. But I fear Darth DataSetupious is still out there, forcing us inexorably toward repositories and mocking frameworks. Although, to my mind, the need to create and destroy data for testing purposes will always remain, there may be a new hope for bringing balance to our tests.

Wednesday, September 9, 2009

KISS my YAGNI abstraction

I’ve recently been observing what appears to me to be a growing contradiction within the software development community. On the one hand, the popularity and adoption of the various flavors of Agile and its affiliated principles are growing, while at the same time tools, technologies, patterns and frameworks are being pumped out which seek ever higher levels of abstraction, looser coupling, and greater flexibility. Agile principles encourage simplicity, less waste, and less up-front design, the more familiar of those principles being:

  • KISS: Keep It Simple, Stupid
  • YAGNI: You Aren’t Gonna Need It

But are the tools, technologies, patterns and frameworks simpler and necessary?  Does the fact that those recommended tools, technologies, patterns and frameworks continue to change so rapidly undermine any claims that they are simpler or necessary?

Who are you calling stupid?

In contrast to the doctrine of simplicity and just-in-time design, the latest technologies, tools, patterns and frameworks seem to be trending toward ultimate flexibility at the expense of simplicity. Certainly SOA, N-tier, DDD, MVP, MVC, MVVM, IoC, POCO (and the list goes on) are anything but simple. Not only aren’t they simple, they are also likely to fall into the “not gonna need it” category. If one were to blindly follow best-practice recommendations, then every application would be a Service Oriented, N-tier, Domain Driven, multi-layered, highly abstracted masterpiece, and would be rewritten every few months. That hardly seems agile, lean, or simple. In some ways, Agile principles almost demand architecting after the fact; on the other hand, if you wait until you need Service Orientation, Inversion of Control, or a Domain Model, they are very difficult to add later.

“If Woody had gone straight to the police this would never have happened”

How many companies succeed using systems that don’t subscribe to any of these concepts, but instead run their businesses on Cobol, FoxPro, Access, VB 6.0, or Classic ASP (or any equivalent ‘old school’ technology)? And the corollary: how many failed businesses can attribute their failure to a fatal flaw in their LOB application design? How many companies have said, “if we had only decoupled our inventory system from our purchasing system using a service layer and utilized a POCO-capable O/RM tool, we’d still be in business”? My guess would be very few, and those few would likely be software companies or SaaS providers where the technology is the product. But for the vast majority of companies, where technology is the enabler and not the product, are we being encouraged to over-engineer by the loudest 1% of developers?

The devil made me do it

In some ways developers are snobs. I think we spend a lot of energy looking for ways to separate the men from the boys. The classic ranking of developers as professionals, amateurs, hobbyists and hacks gets played out over and over. Just recall the C++ vs. VB comparisons, which find parallels in each new generation: C++ developers are professionals while VB developers are hobbyists and amateurs; C# developers are professionals while VB.NET developers are hobbyists and amateurs; ASP.NET MVC developers are professionals while WebForms developers are hobbyists and amateurs. Professionals use an O/RM, IoC, SOA and mocking frameworks; if you don’t, you’re an amateur.

I don’t want to suggest that, when used to solve a particular problem, any one of these technologies, tools, patterns or frameworks can’t in fact simplify a solution or make it more elegant or flexible, because they can and often do. Nor do I want to suggest that I’m not a participant in this snobbery, which I invariably am. What interests me is when the desire to produce a ‘sophisticated’ or ‘professional’ solution means stuffing it full of the latest technologies, tools, patterns or frameworks and calling it simpler or more elegant. While these are often interesting learning opportunities for developers and architects, and one more feather in our caps to differentiate ourselves from the outdated riff-raff, it hardly seems lean. If, when your only tool is a hammer, every problem looks like a nail, then it can also be said that when you have a lot of (cool) tools, every problem seems to require them all.

“It depends on what the meaning of the word 'is' is”

This topic is further obscured by the fact that there is seldom a widely accepted ‘right’ way to do anything in software development. Almost any approach has its share of debate.

For instance, there’s lots of debate about which O/RM tool is the best, or the purest. Even if you decide that, yes, there is a general consensus in the community that some form of O/RM is the preferred persistence/data-access strategy (as I recently did), wade through the debates, and pick a tool, inevitably you’ll discover another perspective that throws the decision back into question. A recent post by Tony Davis, The ORM Brouhaha, did just that for me.

“The IT industry is increasingly coming to suspect that the performance and scalability issues that come from use of ORMs outweigh the benefits of ease and the speed of development”

Benchmarks posted on ORMbattle.NET purport to demonstrate a staggering performance difference between O/RMs and plain SqlClient. I mention this just as one example of how there are few right answers, just an endless series of trade-offs.

Which leads me back to my original premise. I don’t think we are necessarily keeping it simple or waiting until we need it, and we are certainly being pulled in two directions.