Thursday, September 9, 2010

John MacIntyre's Clean Code Experience #1.

John MacIntyre has written a lovely blog about his first experience with "Clean Code".   The change in the structure of his code, and in his attitude about code, is dramatic.  As an author, there is little that can be more gratifying and inspiring than to see this kind of direct result.  So, thank you John! 

On twitter a few weeks ago, John suggested that I review the code that he wrote.  I was happy to agree to do this.  There's little I like better than reading other people's code. I learn much more from other people's code than I learn by writing my own.  So again, thank you John for letting me pick a few nits.

The first thing I did was to download the zip file from his blog and load up all the source files in to TextMate.  I didn't want to compile or run anything just yet, I simply wanted to look at the difference between the two batches of code.  John had organized it very nicely into two folders named, appropriately enough, DirtyDal and CleanDal

The DirtyDal consisted of a single 300 line class that implemented all the CRUD methods for a Comment table.   The code was simple enough, but was laced with lots of distracting comments and was laden with considerable duplication.  It was the typical kind of CRUD class that built up SQL commands from arguments, executed those commands, and then extracted the results.  Code like this often has a repeating structure since every command follows the same basic form.  1. Construct SQL.  2. Execute SQL.  3. Process results. 

The CleanDal consisted of many very small classes arranged in an pleasant inheritance hierarchy that eliminated much of the duplication.  The base classes provided the basic framework for a SQL operation, and the derivatives implemented them simply and directly. 

So my overall impression was that this was a significant improvement in structure and cleanliness.  John made the point in his blog that this new structure took more lines of code, and was more complicated.  While it's true that more lines are used, most of those lines are boilerplate required by the statically typed nature of C#.  There are fewer actual executable lines, and that's the important measure.  As to whether the new code is more complicated, it certainly has a more complex structure.  But the code within that structure is much simpler.  The structure is visible, and it's contents are simple.  To me, that's a rather significant improvement.

So then I loaded the solution into VS.  Everything came up nicely, and the project built without trouble.  Good!  I hate compile and path errors in downloaded software.

Next, I ran the tests.  John was good enough to include tests!  Indeed, it was from reading those tests that I learned how he intended his inheritance hierarchy to work.  So, well done John! 

Unfortunately the tests did not run.  They failed because they tried to connect to a database that did not exist on my computer.  Why do I need to create a dabase just to run these tests?  The database is irrelevant to the tests!

"Wait!" you say.  "This is a CRUD application, how could the database be irrelevant to the tests?"  Simple.  I believe the database works, so there's no need to test it.  All I care about is the code that John wrote. I just want to be sure that the SQL commands care created properly, and that the returned data is processed appropriately.   The database is a complication that I'd rather not deal with.

My next complaint is that John did not use one of the standard test frameworks, like NUnit.  Instead he just wrote his own little main program that called his functions and made sure they worked.  Worse, those tests were written at a fairly high level.  For example, he tested that InsertComment worked by calling InsertComment, and then by calling SelectComment to make sure the data was inserted.  For my money this test touches way too much code!  It does not prove to me that the SQL is generated properly in each situation, and I am not convinced that complementary bugs in Insert and Select aren't canceling each other out and generating false positives.

To fix these tests I tried to create a mock derivative of SqlConnection but, of course, it was sealed (grumble).  I thought about wrapping SqlConnection in an IConnection interface, but unfortunately SqlConnection returns SqlCommand which returns SqlParameter, all of which are sealed (double grumble!)

I thought about downloading one of the mocking tools, like TypeMock, but my time is limitted, and I'm not an expert .NET programmer.  

So rather than fiddle around with trying to get a mocking framework or a database up and running (triple grumble) I decided just to refactor without running the tests. (Gasp, Horror)  What can you do when the test environment isn't supplied...

So I went back to the test and started to read.  The first thing it does is create a CommentData object, so I looked at the CommentData class.  Here's a small snippet.

  public class CommentData {
    /// <summary>
    /// Id of the comment
    /// </summary>
    public Int32? CommentId { get; set; }



I really hate these kinds of comments.  The comment tells you nothing more than the property name, so why is it there?  So I deleted it and it's ilk.




To be fair to Jonathan, this class is not in his "clean" folder.  It's one of the classes he didn't refactor.  Still, I think the point is an important one so I chose to show it anyway.




Next is this lovely function:

    public bool Equals(CommentData obj) {
      if (!CommentId.Equals(obj.CommentId)) return false;
      if (!Comment.Equals(obj.Comment)) return false;
      if (!CommentorId.Equals(obj.CommentorId)) return false;
      return true;
    }

Wouldn't this read better if it was written like this:

    public bool Equals(CommentData obj) {
      return CommentId == obj.CommentId &&
             Comment.Equals(obj.Comment) &&
             CommentorId == obj.CommentorId;
    }

And doesn't this need a null check for the Comment property?

Next the test calls the static Execute method of CommentInsertCommand

  public static void Execute(CommentData commentData, 
                             int userId, 
                             SqlConnection cn)      
  {
      ThrowExceptionIfExecuteMethodCommentParameterIsInvalid(commentData);
      List<CommentData> comments = new List<CommentData>(1);
      comments.Add(commentData);
      Execute(comments, userId, cn);
  }


This function is nice and small.  However, the creation of the list comes out of nowhere.  It make sense when you look at the Execute statement two lines down,  but I dislike the fact that the reader has to look two lines down to understand what's going on.  So this might be better:

    public static void Execute(CommentData commentData, 
                               int userId, 
                               SqlConnection cn) {
      ThrowExceptionIfExecuteMethodCommentParameterIsInvalid(commentData);
      Execute(toList(commentData), userId, cn);
    }

    private static List<CommentData> toList(CommentData commentData) {
      List<CommentData> comments = new List<CommentData>(1);
      comments.Add(commentData);
      return comments;
    }

 
The function that checks the exceptions looks like this:

    protected static void ThrowExceptionIfExecuteMethodCommentParameterIsInvalid(
                          CommentData commentData) {
      if (null == commentData)
        throw new ArgumentNullException("commentData");
      if (commentData.CommentId.HasValue)
        throw new ArgumentNullException(

          "commentData", 
          "CommentInsertCommand is only for adding new data.");
    }


I understand the old C programmer's trick of inverting equality statements.  It prevents the inadvertent omission of a = from being silent.  But I don't like it.  It doesn't read right.  I prefer:

      if (commentData == null)
 
CommentData is the subject of that sentence, and null is the direct object.  Inverting them doesn't jive well with the way we think.  I dislike any statement in code that causes the reader to do a double-take.  Code that protects the author at the expense of the reader is flawed code.

There is another function in this class that has the following name:

ThrowExceptionIfExecuteMethodCommentsParameterIsInvalid

Can you tell the difference between that function name, and the one we just looked at?  It bothers me that these two function have names that are so very similar.  What's worse is that they use two different conventions.  The latter refers directly to the comments argument of the function that calls it.  However, the former does not use the precise name of the argument it is checking.  Rather it abbreviates CommentData to Comment

Fixing this changes the method names as follows:

ThrowExceptionIfExecuteMethodCommentDataParameterIsInvalid
ThrowExceptionIfExecuteMethodCommentsParameterIsInvalid

That's really not that much better.  So how about this:

ThrowIfCommentDataIsInvalid(CommentData)
ThrowIfListOfCommentsIsInvalid(IEnumerable<CommentData>)

Or maybe even this:

ThrowIfInvalid(CommentData)
ThrowIfInvalid(IEnumerable<CommentData>)

Usually I prefer the longer names.  But in this case the argument type is easily seen as part of the function name.  So I think I like it better.

The implementation of the second function is:
    protected static void ThrowIfInvalid(IEnumerable<CommentData> comments) {
      if (comments == null)
        throw new ArgumentNullException("comments");
      if (AreAllCommentsNew(comments))
        throw new ArgumentNullException(
          "comments", "CommentInsertCommand is only for saving new data.");
    }

    protected static bool AreAllCommentsNew(IEnumerable<CommentData> Comments) {
      return (0 == Comments.Count(x => !x.CommentId.HasValue));
    }


Notice that the implementations of these two functions have some redundancy. They both perform the following check:


commentData.CommentId.HasValue

And they both emit the same (or very similar) exception message.  It seems to me that the check in the first function need not exist.  It can be shortened to:

    protected static void ThrowIfInvalid(CommentData commentData) {
      if (commentData == null)
        throw new ArgumentNullException("commentData");
    }


That's about all I have time for today.  The dog is barking at my door, and my wife has fish cooking upstairs.

9 comments:

  1. First off, I want to thank you for writing Clean Code & reviewing my first Clean Code refactor. I've had a lot of fun sharing the Clean Code concepts and how I applied the principles in my refactor. I really appreciate the review, and hope everybody will find it educational.

    With regards to the unit tests, it only occurred to me after asking you to review it that I should have done it as a series of NUnit tests. But yes, the tests I have are admittedly weak at every level. My apologies, this entire project should have been an NUnit test component.

    With regards to the ThrowExceptionIfExecuteMethodXXXXXParameterIsInvalid() methods; the naming of these methods gave me a lot of angst. The naming was so similar, but it was the only way that made sense to me at the time. I definitely like your recommendations better.

    Also, just one clarification for anybody reading, when I made the point about complexity, I said something to the affect that the code complexity was transferred to the OOP structure. But even while this is true, I started with let’s say a code complexity of 7 out of 10, and a structural complexity of 1 out of 10, transferring that complexity brought my code complexity down to 1, an only raised my structural complexity to maybe a 3 (in my opinion anyway). So even though there was a transfer in complexity from code to structure, complexity dropped overall.

    Thanks again,
    John

    ReplyDelete
  2. I think I see a bug in the original code. I'll describe it in its refactored form because it's easier to explain that way. The overload of ThrowIfInvalid that takes an IEnumerable should check that each individual comment is null in addition to the collection being null.

    And with that, I think that overload could be further simplified to the following:

    protected static void ThrowIfInvalid(IEnumerable comments) {
    if (comments == null)
    throw new ArgumentNullException("comments");

    foreach(var comment in comments)
    ThrowIfInvalid(comment);
    }

    And then leave the original implementation of "ThrowIfInvalid(CommentData comment)" as is, letting it check if the comment is new or not.

    Does that sound reasonable?

    ReplyDelete
  3. @Anna - He is using the LINQ Extension method Count to do the iteration:

    Comments.Count(x => !x.CommentId.HasValue)

    It's a really nice clean implementation.

    ReplyDelete
  4. خصومات الان , وباقل الاسعار الممكن فى مصر من تجهيزات سوبر ماركت تجهيزات مطاعم التى نقدمة الان على اعلى مستوى ممكن الان فى كافة المحافظات الممكن من تجهيزات سوبر ماركت الاننا نقدم افضل الخصومات المختلفة من تجهيزات محلات وكافة المعدات التى نقدمة الان

    ReplyDelete
  5. تعرف الان معنا مع افضل الخدمات التى نقدمة من صيانة الكتروستارالان وعلى اعلى متسوى ممكن من رقم الكتروستار الان وعلى اعلى مستوى ممكن فى مصر

    ReplyDelete
  6. Scintillating information! I always check on your blog to keep myself updated. Your content is really good to be up to date.
    Top 10 CBSE School Meerut
    Dietitian in Meerut
    Satta King Result
    School Management Software Meerut
    Nivia Premier Carbonite

    ReplyDelete
  7. شركتنا افضل شركة تنظيف كنب في ابوظبي التي ساهمت في تسهيل و توفير الخدمات جميعها باعلي مستوي و جودة و باقل الاسعار ،و ارخص الاسعار التي تناسب جميع عملائنا الكرام حيث يتم استيراد الاجهزة و الادوات منا الخارج

    ReplyDelete
  8. تقدم هذه شركة تنظيف كنب في العين خدمات تنظيف المنازل ابوظبي من خلال أساليب تقنية حديثة وفريق عمل محترف ومتخصص في كافة مجالات التنظيف العميق والشامل للمنزل بتفاصيله الصغيرة. كما وتشتهر بأسعارها

    ReplyDelete