Refactoring Java using Clojure with the Eclipse Java development tools (JDT)

March 10, 2013 at 2:17 pm | Posted in Clojure, Programming | 4 Comments

Not a very catchy title this time, since this post will be mostly about hardcore nerdy coding. In my previous post I talked about the business value of elegant code. I argued that cleaning up an existing codebase in the maintenance phase still makes a lot of sense, but only if it can be done cheaply. One of the ways to make it cheap (cost-effective might be a better word if you need to sell this to your management) is of course to automate the refactoring process.

The problem

By automating I mean automating both the detection of a code smell and the fix for it. This is in contrast to tools such as Sonar, which are very good at detecting problems but don’t fix anything. It is also in contrast to most IDEs, which let you select a piece of code and choose, for example, the ‘Extract method’ action from the menu. That certainly is helpful, but your IDE will probably not detect whether it is needed. It will just execute what you tell it to do.

Let me first show you an example of some Java code that I would like to refactor automatically:

public class A {
   int answer() {
      return (42);
   }
}

In case you haven’t noticed already: in the code above the return statement has an extra pair of parentheses. You can find many discussions on the internet on why this is or isn’t a good thing, but personally I don’t like them for the simple reason that return is a keyword and not a function. Extra parentheses just add visual noise. So what I would like to see instead is:

public class A {
   int answer() {
      return 42;
   }
}

This simple refactoring is already surprisingly difficult to do with a set of regular expressions, since you need the context of the return statement. For example, you have to be sure it’s not part of a comment or a string. The alternative is to fully parse the code and create an abstract syntax tree (AST). You could create a parser yourself using, for example, ANTLR and a grammar for the language of your choice. I decided to use the Eclipse Java development tools in combination with Clojure. The scope of my experiment: being able to refactor the above example.
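To make the regular expression problem concrete, here is a minimal sketch of the naive approach. The names source and naive-strip and the sample input are made up for this illustration; they are not part of the refactoring code that follows.

```clojure
(require '[clojure.string :as str])

;; Hypothetical input: a comment, a string literal and one real return statement.
(def source
  (str "// legacy: return (x);\n"
       "String s = \"return (42);\";\n"
       "return (42);\n"))

(defn naive-strip
  "Strip the parentheses after every textual occurrence of `return`,
  without knowing whether it sits in code, a comment or a string."
  [code]
  (str/replace code #"return\s*\((.*)\);" "return $1;"))

(println (naive-strip source))
```

All three lines are rewritten, although only the last one is an actual return statement. The comment and the string literal are changed as well, which is exactly the missing context a parser gives you.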

Preparation

I got the idea for this blogpost from ‘A complete standalone example of ASTParser’. That post lists all the Eclipse libraries you need. You will have to add these libraries (8) to your own local Maven repository. I use Leiningen 2 and followed the steps described by Stuart Sierra. For example, to add an artifact to a local Maven repository called maven_repository within my Leiningen project, I used:

mvn deploy:deploy-file -DgroupId=org.eclipse -DartifactId=text -Dversion=3.5.200 \
-Dpackaging=jar -Dfile=/Users/maurits/development/eclipse/plugins/org.eclipse.text_3.5.200.v20120523-1310.jar \
-Durl=file:maven_repository

I added all the dependencies to my project file. It looks like this:

(defproject ast "0.1.0-SNAPSHOT"
  :description "FIXME: write description"
  :url "http://example.com/FIXME"
  :license {:name "Eclipse Public License"
            :url "http://www.eclipse.org/legal/epl-v10.html"}
  :dependencies [[org.clojure/clojure "1.5.0"]
                 [org.eclipse.core/contenttype "3.4.200"]
                 [org.eclipse.core/jobs "3.5.300"]
                 [org.eclipse.core/resources "3.8.1"]
                 [org.eclipse.core/runtime "3.8.0"]
                 [org.eclipse.equinox/common "3.6.100"]
                 [org.eclipse.equinox/preferences "3.5.0"]
                 [org.eclipse.jdt/core "3.8.2"]
                 [org.eclipse/osgi "3.8.1"]
                 [org.eclipse/text "3.5.200"]]
  :repositories {"local" ~(str (.toURI (java.io.File. "maven_repository")))})

Now that the preparations are done we can finally start to write the actual refactoring code.

Refactor it!

First the code to create the AST from the Java code and the actual example I am going to use:

(ns ast.core
  (:import (org.eclipse.jdt.core.dom ASTParser AST ASTNode)))

(defn create-ast
  "Create AST from a string"
  [s]
  (let [parser (ASTParser/newParser AST/JLS3)]
    (.setSource parser (.toCharArray s))
    (.createAST parser nil)))

(def example
     (create-ast (str
                  "public class A {"
                  "   int foo() {"
                  "      return (42);"
                  "   }"
                  ""
                  "   int bar() {"
                  "      return 13;"
                  "   }"
                  "}")))

As you will notice, creating an AST with the Eclipse JDT takes only a few lines of code: ASTParser/newParser creates a JLS3 (Java Language Specification, third edition) parser, .setSource tells the parser where to get its source (in this case the characters of a string) and .createAST creates the AST. Next I need some helper functions:

(defn parenthesized-expression? [expr]
  (= (.getNodeType expr) ASTNode/PARENTHESIZED_EXPRESSION))

(defn return-statement? [stmt]
  (= (.getNodeType stmt) ASTNode/RETURN_STATEMENT))

(defn parenthesized-return-statement? [stmt]
  (and (return-statement? stmt)
       (parenthesized-expression? (.getExpression stmt))))

(defn method-declaration? [body]
  (= (.getNodeType body) ASTNode/METHOD_DECLARATION))

Details about, for example, ASTNode can be found in the Eclipse JDT API Specification.

Next there are four functions that zoom in on the code we would like to refactor. You will notice that I use doseq quite a lot, since the actual AST manipulation (the refactoring) happens in place and thus has side effects. This is not always avoidable when using Java libraries from within your Clojure code. We could, however, write an immutable version that leaves the original AST intact by returning copies of the AST. Such functionality is supported by the Eclipse JDT.

(defn refactor-block [block]
  (doseq [stmt (filter parenthesized-return-statement? (.statements block))]
    (refactor-return stmt)))

(defn refactor-method [method]
  (refactor-block (.getBody method)))

(defn refactor-type [type]
  (doseq [method (filter method-declaration? (.bodyDeclarations type))]
    (refactor-method method)))

(defn refactor [ast]
  (doseq [type (.types ast)]
    (refactor-type type)))

As you can see, we select the return statements that need refactoring using the parenthesized-return-statement? predicate. The only thing left is to do the actual refactoring:

(defn refactor-return [stmt]
  (let [exp (.getExpression (.getExpression stmt))
        ast (.getAST exp)
        node (ASTNode/copySubtree ast exp)]
    (.setExpression stmt node)))

In this code we first get to the expression within the parentheses, hence the double .getExpression. Note: this code only strips one level of parentheses. Next we make a copy of the expression and finally we assign it back to our return statement, effectively removing the outer parentheses.
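The single-level limitation, and the earlier remark about an immutable version, can be illustrated without the JDT at all. The sketch below is not the Eclipse API: it assumes a hypothetical toy AST made of nested vectors such as [:return [:paren 42]], and the names paren?, strip-parens, refactor-tree and tree are made up for this illustration. It strips every level of parentheses directly under a return and returns a new tree instead of changing the old one.

```clojure
(require '[clojure.walk :as walk])

;; Toy AST (an assumption for this sketch): [:return expr] and [:paren expr]
;; are nodes, everything else is treated as a leaf or a plain container.
(defn paren? [node]
  (and (vector? node) (= :paren (first node))))

(defn strip-parens
  "Unwrap nested [:paren ...] nodes until a non-parenthesized expression is left."
  [expr]
  (if (paren? expr)
    (recur (second expr))
    expr))

(defn refactor-tree
  "Return a copy of tree in which every [:return expr] has lost the
  parentheses around expr. The input tree is left untouched."
  [tree]
  (walk/postwalk
   (fn [node]
     (if (and (vector? node) (= :return (first node)))
       [:return (strip-parens (second node))]
       node))
   tree))

(def tree
  [:class "A"
   [:method "foo" [:return [:paren [:paren 42]]]]
   [:method "bar" [:return 13]]])

(refactor-tree tree)
;; => [:class "A" [:method "foo" [:return 42]] [:method "bar" [:return 13]]]
```

Because Clojure vectors are immutable, tree still contains the parenthesized version afterwards. With the JDT the same idea would mean working on a copy of the AST rather than on plain data.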

This code is easy to test via the REPL. You will see something similar to:

ast.core=> example
#<CompilationUnit public class A {
  int foo(){
    return (42);
  }
  int bar(){
    return 13;
  }
}
>
ast.core=> (refactor example)
nil
ast.core=> example
#<CompilationUnit public class A {
  int foo(){
    return 42;
  }
  int bar(){
    return 13;
  }
}
>
ast.core=>

Finally the complete code:

(ns ast.core
  (:import (org.eclipse.jdt.core.dom ASTParser AST ASTNode)))

(defn create-ast
  "Create AST from a string"
  [s]
  (let [parser (ASTParser/newParser AST/JLS3)]
    (.setSource parser (.toCharArray s))
    (.createAST parser nil)))

(def example
     (create-ast (str
                  "public class A {"
                  "   int foo() {"
                  "      return (42);"
                  "   }"
                  ""
                  "   int bar() {"
                  "      return 13;"
                  "   }"
                  "}")))

(defn parenthesized-expression? [expr]
  (= (.getNodeType expr) ASTNode/PARENTHESIZED_EXPRESSION))
  
(defn return-statement? [stmt]
  (= (.getNodeType stmt) ASTNode/RETURN_STATEMENT))

(defn parenthesized-return-statement? [stmt]
  (and (return-statement? stmt)
       (parenthesized-expression? (.getExpression stmt))))

(defn method-declaration? [body]
  (= (.getNodeType body) ASTNode/METHOD_DECLARATION))

(defn refactor-return [stmt]
  (let [exp (.getExpression (.getExpression stmt))
        ast (.getAST exp)
        node (ASTNode/copySubtree ast exp)]
    (.setExpression stmt node)))

(defn refactor-block [block]
  (doseq [stmt (filter parenthesized-return-statement? (.statements block))]
    (refactor-return stmt)))

(defn refactor-method [method]
  (refactor-block (.getBody method)))

(defn refactor-type [type]
  (doseq [method (filter method-declaration? (.bodyDeclarations type))]
    (refactor-method method)))

(defn refactor [ast]
  (doseq [type (.types ast)]
    (refactor-type type)))

As always, don’t hesitate to leave comments or email if you have questions/remarks/suggestions.

Have fun!

Is there business value in elegant code?

February 18, 2013 at 9:38 pm | Posted in Programming

Recently I ran into the following Java code (actual variable names replaced by x and y to anonymize):

if (x != 0) {
  y = 4 - x;
} else {
  y = 4;
}

So what’s the problem with this code? There is the use of the magic number 4, and maybe we could test x for equality to zero and swap the if and the else branches. And of course the whole test is nonsense: when x is 0, 4 - x is simply 4, so this five-line code snippet could just as well be replaced by:

  y = 4 - x;

But again, is there a problem with the original code? The code works as expected. The compiler will probably optimize this code anyhow and get rid of that test statement. The unit test and functional tests passed. So at least from a business (or end user if you will) point of view no problem at all.

And of course there are costs involved with refactoring the above code. If this happens during active development costs are probably quite low and the benefits will outweigh those costs. But if this is a maintenance project the situation might be different. The actual code is already in production and might be mission critical for the company. And if there is no (fully automated) continuous delivery in place the cost of this seemingly small code improvement can be one or two orders of magnitude higher than during development.

[Image: old measuring instrument for navigation]

Now let’s try to make a business case for the above code change in a maintenance project. If you did an MBA you learned that business case = benefits – costs. An MBA is hardly more than that. Usually this formula results in overly optimistic hockey-stick curves with an ROI of less than a year. But I digress. So we have to do two things here: we have to minimize the costs and maximize the benefits.

To start with the benefits. You can make a case that refactoring your code will usually result in less code and certainly in higher quality. If the code becomes more pleasant to work with it will also result in more motivated and productive developers instead of the attitude “I am not going to touch this mess that John left behind unless I really have to.” In general: cleaning up your code base and reducing it to half its size will leave you with about 1/4 of the maintenance costs. That might sound ambitious but over the last 20 years I have seen a lot of non-trivial pieces of software (ranging from 10k – 500k LOCs) for which this was no problem at all.

On the costs side of the equation you need a couple of things to minimize these costs:

  • Continuous delivery: automated building, unit testing, functional testing, integration testing, performance testing, version control, automated deployment, etc. Push hard to get these into place in your organisation. If you really can’t have continuous delivery on short notice, at least make sure that you have tests covering the code that you are going to change. Avoid the temptation to do ‘easy changes’ without them.
  • Any modern IDE will help with trivial refactoring. I call it trivial because most if not all IDEs only support refactoring on the syntax level. You can extract methods, rename variables, etc. But the IDE has no clue what the code actually means. So refactoring on a semantic or design level is mostly not supported.
  • Automate your refactoring! This seems to contradict the previous point, but there are tools available (usually as plug-ins for Java IDEs) that can take refactoring one step further by looking at the AST (Abstract Syntax Tree) of your code and recognising patterns. I wouldn’t be surprised if they could detect and fix the example I started with.

In a next blogpost I will go into detail on which levels of refactoring exist and how they could (at least theoretically) be automated. Until then:

Have fun refactoring your legacy code base!

(You might be wondering why I inserted the image of an old instrument in this post. My point is that in the past the design of many things went far beyond pure functionality. For example, scientific devices were often real pieces of art. I like code to be more than ‘just functional’ and to have a certain elegance about it.)

Using Reactive Extensions with Mono

November 21, 2012 at 8:44 pm | Posted in C#, Programming | 2 Comments

I first learned about Reactive Extensions (Rx) at the beginning of this month, when it was open sourced by Microsoft. Although I found a few scattered references on the internet on how to get Rx working with Mono, I had to jump through quite a few hoops. This blogpost is a detailed account and will hopefully save you a couple of hours.

Getting Reactive Extensions

When you are using Windows this is pretty straightforward. But then again, in that case you are probably using .NET and not reading this blogpost at all. When you are using Linux or OS X, however, it gets a bit more complicated. In that case your only option is to use NuGet.

Getting NuGet

I didn’t download the recommended version (NuGet.exe Bootstrapper 2.0) but used the NuGet.exe Command Line. This didn’t work out of the box. According to this excellent blog post you first have to import some root certificates so that Mono will trust NuGet:

$ mozroots --import --sync

Next you type:

$ mono NuGet.exe

This will result in output similar to:

NuGet bootstrapper 1.0.0.0
Found NuGet.exe version 2.1.2.
Downloading…
Update complete.

You now have NuGet running. To get help type:

$ mono NuGet.exe help

Getting Rx-Main

Ok, so let’s finally get Rx. I started with the latest and greatest (Rx-Main 2.0.21114 at the moment of writing) but I didn’t get that working. However version Rx-Main 1.0.11226 does seem to work with Mono. To see all available versions enter:

$ mono NuGet.exe list Rx-Main -AllVersions

To install the latest Rx 1.0 enter:

$ mono NuGet.exe install Rx-Main -Version 1.0.11226

This will download Rx-Main into your current working directory. You can find the dll you need as: ./Rx-Main.1.0.11226/lib/Net4/System.Reactive.dll

Compiling your first Rx program

With the downloaded dll we can finally build our first Rx program. As an illustration (you can find more examples and explanation on the Reactive Framework Wiki) I used the following code:

using System;
using System.Reactive;
using System.Reactive.Linq;

class Rx
{
  public static void Main(string[] args)
  {
    var input = Observable.Range(1, 15);

    input.Subscribe(x => Console.WriteLine("The number is {0}", x));
  }
}

If you save this code as rx.cs you are ready to compile your first Rx program. Make sure that you have the System.Reactive.dll in the same directory or set the library path for the Mono compiler using the -lib directive. Assuming the dll is in the same directory as your source, just type:

$ mcs -r:System.Reactive rx.cs

This will create a rx.exe that can of course be executed with:

$ mono rx.exe

Next steps

This is all you need to get Rx and Mono working. I tried with both Mono 2.10.x and 3.0.x on OS X and Linux. As mentioned before, I only got this running with Rx 1.0.x, which uses a single DLL. In Rx 2.0.x this DLL is split up into several DLLs. However, trying to compile against those leads to:

Unhandled Exception:
IKVM.Reflection.MissingMemberException: Member 'System.IComparable`1' is a missing member and does not support the requested operation.

I haven’t investigated this any further yet, but it might very well be a Mono versus .NET incompatibility.

Have fun hacking Rx and Mono and please let me know if you have any questions or remarks.

Lazily stealing..errr. getting data from the Dutch Rijksmuseum

November 10, 2012 at 10:46 pm | Posted in Clojure, Programming

Last year the Dutch Rijksmuseum published an API that allows a developer to retrieve information and images from a collection of more than 110,000 items. You have to register for an API key first.

Unless you are looking for a specific item, the interface returns 100 items at a time in XML format. You will also get a resumption token so you can query for the next 100 items. I imagined it would be useful to abstract from this, using a lazy sequence in Clojure. So let me show you the resulting code and a brief explanation:

(ns rijksmuseum.core
  (:require [clojure.xml :as xml]
            [clojure.zip :as zip]
            [clojure.data.zip.xml :as zf]))

We will have to parse the XML that is returned, so we start by adding some convenient libraries. If you are not familiar with Clojure zippers, please look them up in the documentation and the numerous blogs about them. They make navigating XML almost painless.

(def api-key "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx")
(def base-url (str "https://www.rijksmuseum.nl/api/oai/" api-key))

You will have to replace the value of api-key with your own API key. base-url defines the base URL that is used for all queries.


(defn- build-query [resumption-token]
  (str base-url "/?verb=ListRecords&"
       (if (nil? resumption-token)
         "set=collectie_online&metadataPrefix=oai_dc"
         (str "resumptiontoken=" resumption-token))))

The routine build-query builds up a query. If there is no resumption token yet, the resulting query loads the first data. Otherwise, it will continue with the next batch of records. Currently the Rijksmuseum API supports two kinds of queries. You can either ask for a list of records (using verb=ListRecords) or you can ask for a specific record (using verb=GetRecord and an identifier). The API documentation has all the details.

We will start with a couple of helper routines. Basically they extract the information we are interested in from an XML stream:


(defn- get-records [zipper]
  (zf/xml-> zipper :ListRecords :record))

(defn- get-resumption-token [zipper]
  (zf/xml1-> zipper :ListRecords :resumptionToken zf/text))

get-records extracts the records from the (zipped) XML response. get-resumption-token returns (I’m sure you already guessed) the resumption token for our next query.

Now comes the construction of the lazy sequence:


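(defn- lazy-get-works [resumption-token]
  (lazy-seq
   (let [zipper (zip/xml-zip (xml/parse (build-query resumption-token)))
         works (get-records zipper)
         token (get-resumption-token zipper)]
     ;; A missing or empty token means this was the last batch. Stop here,
     ;; otherwise build-query would start again with the first batch.
     (if (seq token)
       (concat works (lazy-get-works token))
       works))))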

(defn get-works
  "Return all works as a lazy sequence"
  []
  (lazy-get-works nil))

lazy-seq is a macro that creates a lazy sequence out of a body of expressions. Inside it, the first binding of the let builds the query, parses the resulting XML and creates a zip structure, all in a single line of code. All we have to do then is extract the records (called works), extract the resumption token and call ourselves recursively. Don’t be afraid of stack overflows: the lazy-seq macro takes care of this.
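If you want to play with this pattern without an API key, here is a minimal sketch that swaps the HTTP call for an in-memory stand-in. Everything in it (pages, fetch-count, fetch-page, lazy-records and the :next-token key) is hypothetical and only mimics the batch-plus-resumption-token shape described above; it is not the Rijksmuseum API.

```clojure
;; Hypothetical stand-in for the remote API: maps a resumption token to a batch.
(def pages
  {nil {:records [1 2 3] :next-token "a"}
   "a" {:records [4 5 6] :next-token "b"}
   "b" {:records [7 8]   :next-token nil}})

;; Counts how many batches were actually requested.
(def fetch-count (atom 0))

(defn fetch-page [token]
  (swap! fetch-count inc)
  (get pages token))

(defn lazy-records
  "Lazy sequence of all records, fetching the next batch only when needed."
  [token]
  (lazy-seq
   (let [{:keys [records next-token]} (fetch-page token)]
     (if next-token
       (concat records (lazy-records next-token))
       records))))

(take 4 (lazy-records nil))
;; => (1 2 3 4)
```

Realizing the first four records only touches the first two batches; the third one is never requested. Checking @fetch-count after evaluating the take at the REPL is an easy way to convince yourself of that.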

Now we are ready to use our lazy sequence. The next example creates a list with the image URLs of the first 10 items in the collection:


(defn get-image-url [work]
  (zf/xml1-> work :metadata :oai_dc:dc :dc:format zf/text))

(map get-image-url (take 10 (get-works)))

Don’t go overboard with requesting all items in the collection at once. Retrieving 1000 items takes about 1 minute, so the calls to the API are most probably throttled. Anyhow, have fun with lazily stealing works of art!

IT consultancy companies should learn from used car dealers!

June 6, 2012 at 1:49 pm | Posted in Agile, Ramblings

Let’s suppose that your old car one day suddenly stops working and it is beyond repair. So now you are in the market for a new (or second-hand) car. You have a mental model of what you are looking for: it should be roomy enough for your family and 2 dogs, it should be safe and of course your new car should have sufficient power. Together with your wife you decide that you want an MPV. So you walk into the nearest car dealership. To your surprise it is a rather nondescript building, and when you enter it the showroom is completely empty. Luckily there is a friendly car salesman and you start to talk to him: “Hi, we are looking for a new car and …”. Before you can finish your sentence he raises his hand to stop you, smiles at you and walks back to the counter. He returns with a key and hands it over to you. “Here is the key to your car. If you just follow me, I will show it to you.” Completely flabbergasted, you and your wife are led into a parking lot behind the building and he points to a Mazda MX-5 Miata. You start to protest and try to explain that your 3 children and 2 dogs are not going to fit into a convertible, but he ignores your protests. “I’m sorry Sir, as you can see this is the only car we have got. And I think it is just right for you. Good luck!”

Now the above scenario might sound a bit absurd, but this is what I see a lot in IT consulting: your current project has come to an end and in some magic gathering that often goes by the name “Project Allocation Meeting” or “Resource Meeting” the sales guys and girls at a typical IT company have decided that the very first project that comes along miraculously is the perfect fit for you. And you can already start tomorrow. Does that sound familiar? It is that single car for sale in an otherwise empty dealership.

Let’s first have a look at the criteria for a good project allocation process:

  1. Limited waste because of non-billable hours. The business model for IT consulting is rather straightforward: revenue equals the number of billable hours multiplied by the hourly rate of a consultant, summed over all consultants. Neither parameter scales very well, so it is tempting to maximize the number of billable hours by leaving no gaps between projects.
  2. The right fit: projects should be challenging enough so that a consultant can develop his skills. Working below his level is going to make him leave the company eventually. Working far above his level will only result in a burn-out. In general: the assignment should be a good fit in the career path of the consultant. That will make him more valuable to customers so you can ask higher rates later.
  3. Physical location: a project at a nearby customer will limit traveling time and will result in overall happiness. Long traveling times will make it unlikely that the consultant is going to put in some extra hours for either the customer or his own company.
  4. Other criteria, for example: does the customer’s culture fit with that of the consultant? Putting a consultant who thrives on freedom into a restrictive or formal organization like an insurance company is not going to make him very happy.

You will notice that the first point is a rather short-term optimization. The next three points are going to have way more impact on an IT consulting company but the effects are not immediately noticeable. Therefore an IT company will have to make a careful trade-off between optimization for billable hours (which is easy, a straightforward computer algorithm can do this for you!) and long-term sustainability and profit. This is quite difficult and one of the reasons that most consulting companies behave like this weird car dealership I introduced at the start of this blogpost. That brings me to the conclusion.

So we can choose the company we like to work for, the place we want to live, or our own car. And yet in IT consulting others are deciding which assignments fit us best. What I would like to propose is a very simple project allocation process with a minimum amount of overhead: all project information is always visible to every consultant, very similar to a car dealer that has many cars on display. So you can pick the right car or, in this case, the right project for you. That comes with the freedom of waiting for a better opportunity and passing on a project. Of course that also comes with the responsibility for balancing the number of non-billable hours. That could be done by putting a cap on that number or by introducing some kind of rewarding mechanism.

Most important is that an IT consultant can perfectly well make these trade-offs himself. Self organization and responsibility at the lowest possible level instead of old Soviet style planning!

Writing software as easy as installing laminate flooring

May 28, 2012 at 8:09 pm | Posted in Agile, Programming | 2 Comments

At parties I usually try to avoid mentioning that I work in IT. There are a couple of reasons for this. First, I often get a reaction like “Oh cool, I have this weird problem with Windows XP and my new printer. Since you are an expert, you can help me with that!” My answers range from polite (trying to explain what I do for a living, which is not fixing Windows problems) to a simple “No, I can’t help you”.

The second question I often get is more like a remark or even an accusation: “Why are software projects always late, expensive and unpredictable? By now everything is already available as standard libraries, so why is writing software not as simple as clicking existing components together?” These remarks mostly come from people whose software experience is limited to small 100-line Visual Basic programs or from hobbyists who have tinkered a bit with Excel. Often they add examples like building a new house or bridge, arguing that this is way more difficult than building a simple piece of software and yet at the same time very predictable.

Lately I started telling those people (and others when giving a Scrum/Agile training) a story from my own experience, which is about installing laminate flooring in our house.

When I planned for this do-it-yourself activity, my first estimate was: 3 bedrooms and 1 hallway should take at most one weekend. Next I created an initial work breakdown structure (WBS):

  1. Remove old carpet and floor panels: 1 hour
  2. Install 50 m2 underlayment: 2 hours
  3. Install 50 m2 laminate flooring: 8 hours

Nice, if I worked hard enough I could finish this on a single Saturday and have the Sunday for relaxing, spending time with my wife and kids or even writing a blogpost! So I started and everything went more or less according to plan: the first two steps took a little less time than estimated, and in another hour I had installed the first 5 m2 of laminate. Since installing a floor is as easy as writing software I figured that I could scale 5 m2 to 50 m2, so that would take 10 hours instead of the planned 8. Well, not that bad.

But then I discovered that it would look way better if the laminate ran a bit under the skirting board instead of against it, which would leave some visible gaps. This was a bit of a setback since it meant I probably couldn’t finish the job in one day. Luckily I still had the Sunday to finish the work. Then disaster hit: while removing the skirting board I discovered it was quite old and nailed into the wall with really long nails. So two things happened: part of the boards broke, and part of the plaster fell off, damaging the walls.

Note: not my actual wall…

I decided to do a little bit of refactoring and reuse the old floor panels that I had removed from the first bedroom as skirting board. Without all the details my new WBS looked like this:

  1. Remove old carpet and floor panels: 1 hour
  2. Install 50 m2 underlayment: 2 hours
  3. Remove old skirting board: 2 hours
  4. Install 50 m2 laminate flooring: 10 hours
  5. Sawing 40 m floor panels into new skirting boards: 4 hours
  6. Grinding 40 m skirting boards: 2 hours
  7. Using a plunge router to add a nice profile to the skirting boards: 2 hours
  8. Painting the skirting boards twice: 8 hours
  9. Repair damaged walls: 2 hours
  10. Remove remaining nails: 1 hour
  11. Fixing new skirting boards to the walls: 4 hours
  12. Some additional woodwork for the door posts: 8 hours

So my carefully planned 11 hours blew up to 46 hours! That’s about 400% of the original estimate. What’s worse, my initial lead time of 1 day ultimately became 6 months. This simple, seemingly predictable set of tasks behaved like a real software project after all, with lots of unforeseen problems and new functionality added during the project.

If I had foreseen all those problems I might not have started at all. On the plus side I ended up with skirting boards that are way more beautiful than the original ones. And what’s more, I learned how to operate a plunge router, making me a better craftsman, which will be useful in future projects.

Conclusion: writing software indeed is as predictable and easy as installing laminate flooring.

Calculating Bell numbers in a single tweet

May 23, 2012 at 12:48 pm | Posted in Clojure, Programming | 6 Comments

Recently I stumbled upon a white paper by Roger Sessions called “The Mathematics of IT Simplification”. In this paper he describes an approach called synergistic partitioning, which is based on the mathematics of sets, equivalence relations, and partitions. One of the topics that comes up in this article is how many ways there are to partition N elements. The answer to this is well-known and called the Nth Bell number.

The 52 partitions of a set with 5 elements

The Wikipedia page that explains the Bell number also shows two implementations on how to calculate them, one in Ruby and the other one in Python. The Ruby version needs 17 lines of code, the Python version 12 lines. I have the peculiar habit of intriguing/annoying my colleagues at work with the statement that using Clojure the solution to any programming problem fits into one single tweet. In other words: 140 characters.

I’ll start with my first attempt:

(defn append-ele [s x] (concat s (list (+ x (last s)))))

(defn next-row [s]
  (reduce append-ele (list (last s)) s))

(defn bell [n]
  (loop [n n s '(1) b s]
    (if (= n 1)
      (reverse b)
      (recur (dec n) (next-row s) (cons (last s) b)))))

(println (bell 9))

The first function (append-ele) appends a single new element to a row in the Bell triangle (explained in the Wikipedia article). In several ways this implementation is suboptimal: concat can only concatenate sequences, so I have to convert the single element to a list first. And what’s worse, appending at the end of a sequence is expensive since its computation is O(N), while prepending would only take one single operation.

The function next-row calculates the next row in the Bell triangle. This row always starts with the last element of the previous row. I implemented this by a simple reduce over the elements of the previous row.

And finally the function bell returns a sequence with the first n Bell numbers. I use a loop/recur to avoid a stack overflow. The sequence of Bell numbers is constructed in reverse order, so at the end I have to call reverse.

While this version is already pretty short (10 lines, without the printing), this still won’t fit in a single tweet. So the next step was to remove the first two helper functions and inline all the code into a single function. Warning: this is not good coding practice! With this disclaimer, here is the resulting code:

(defn bell [n]
  (loop [n n s '(1) b s]
    (if (= n 1)
      (reverse b)
      (recur (dec n)
             (reduce #(concat % (list (+ %2 (last %)))) (list (last s)) s)
             (cons (last s) b)))))

Without the whitespace this implementation is 170 characters. Almost there! As you can see there are still some ‘expensive’ function names like concat and reverse (6 and 7 characters!) and some annoying conversions from single numbers to lists. The only way to avoid this is to use a specialized data structure like a vector instead of a sequence:

(defn bell [n]
  (loop [n n s [1] b s]
    (if (= n 1)
      b
      (recur (dec n)
             (reduce #(conj % (+ %2 (last %))) [(last s)] s)
             (conj b (last s))))))

Finally the code (without the indentation) fits into 140 characters. One caveat: I rely on conj appending an element to the end of a vector. Where conj adds the element depends on the type of the collection: for a vector it is the end, but for a list the element is prepended. So this version only works with vectors. Like the earlier versions it builds the whole Bell triangle, so calculating the first N Bell numbers takes O(N^2) additions.

I will leave it as an exercise for the reader to implement the Bell number calculation as a lazy sequence so that you can use for example (take 9 (bell)). Have fun.

Generating release notes from JIRA with Google Docs

April 11, 2012 at 2:29 pm | Posted in Agile, Programming | Leave a comment

Recently I did a project using Scrum in short (one week) iterations. The acceptance testers weren’t part of the team and asked for (well, actually demanded) release notes with every increment we shipped. We told them that wasn’t a problem at all, since we keep track of all our user stories and issues in JIRA. We already had created an account for them at the start of the project, so end of story we thought. Almost.

This didn’t work out because they found JIRA a bit too complicated and didn’t want to dig up all the information themselves every Friday. We realized that a basic introduction to JIRA might help, but that we could help them even more by defining a couple of filters. So I asked what they needed, created the filters to come up with this information, showed them how to use them and again concluded: end of story and back to real work. Well, almost.

They still preferred to have a document containing the release notes attached to the email that announced every new release. Mainly because they had always done it like that and also because it was easier to print the document. We decided to take the path of least resistance, use our own JIRA filters, and waste one or two hours per release to copy JIRA issues to Word, format them in tables, fight to get the layout somewhat correct, etc. At least some activities to give a project manager a reason for his existence. End of story. Almost.

Because as IT guys (and girls) we don’t like boring repetitive work, especially not when it has to be done late at night when we have finally got that release shipped. And certainly not when we can’t see the added value of duplicating information from one format (JIRA) to another (Word). Our default solution is Yak shaving, I mean, automating the work. So I came up with this set-up based on JIRA, Google Docs and some Google Apps Script:

The source is JIRA. In the past I already wrote a couple of blog posts on how to import data using Ruby (and SOAP), using JavaScript (directly into a Google Docs spreadsheet) or using ClojureScript (using REST). The report is based on a template that I created in Google Docs. Here you can already include for example the company logos, the disclaimers, etc.

Some sample code (note this doesn’t use JIRA) to create a document using a template:

function createDocFromTemplate() {
  var files = DocsList.find("my-template");
  return DocumentApp.openById(files[0].makeCopy("release-notes").getId());
}
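The DocsList service used above has since been retired by Google. A minimal sketch of the same step with DriveApp could look like this. The optional drive and docs parameters are my own addition (they are not part of any Google API) so that the function can be tried with stand-ins outside Apps Script; inside a script project they default to the built-in DriveApp and DocumentApp services. The file names are the same placeholders as before.

```javascript
// Sketch only: copies the first file named "my-template" and opens the copy.
// `drive` and `docs` are hypothetical injection points for testing.
function createDocFromTemplate(drive, docs) {
  drive = drive || DriveApp;
  docs = docs || DocumentApp;

  // getFilesByName returns an iterator, not an array
  var files = drive.getFilesByName("my-template");
  if (!files.hasNext()) {
    throw new Error("Template 'my-template' not found");
  }

  var copy = files.next().makeCopy("release-notes");
  return docs.openById(copy.getId());
}
```

Unlike the original snippet, this version fails with a clear message when the template is missing instead of tripping over an empty result.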

As you can see in the picture I also use a Timed Trigger. This fires the script every Friday, for example at 6:00 PM. I belong to the minority of people who think that distributing Word documents is not very professional (unless you want to co-author it with others) so I prefer to create a PDF. This is done in the next step. Some sample code on how to do this:

function createAttachments(doc) {
  var mimeType = "application/pdf";
  var blob = doc.getAs(mimeType);
  return [{fileName:"release-notes.pdf", mimeType:mimeType, content:blob.getBytes()}];
}

And finally we have to send the release notes to the right people. For this I created a new Group in Google Contacts. The next code snippet shows how I can read the email addresses from this group and how I create an email with the pdf as an attachment:

function sendDocumentAsPdf(doc) {
  var contacts = getRecipients();
  var attachments = createAttachments(doc);
  contacts.forEach(function(contact) {sendMail(contact, attachments);});
}

function getRecipients() {
  return ContactsApp.getContactGroup("MyGroup").getContacts();
}

function sendMail(contact, attachments) {
  var emails = contact.getEmails();
  if (emails.length === 0) {
    return; // skip contacts that have no email address
  }
  var recipient = emails[0].getAddress();
  var subject = "Release notes";
  var body = "Please find the release notes as attachment.";

  MailApp.sendEmail(recipient, subject, body, {attachments:attachments});
}

This concludes my brief description of how to generate release notes. There is room for improvement. For example right now the email is scheduled at a fixed time. To make your reports look more ‘genuine’ (as in: a lot of work to create) you could use a ClockTriggerBuilder to generate your own triggers that fire at a more or less random time, preferably of course Friday late at night.
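A rough sketch of that idea, under a few assumptions: the function name sendReleaseNotes and the helper pickFridayEveningTime are made up for this example, and the 20:00 to 23:59 window is just my interpretation of ‘late at night’. ScriptApp.newTrigger(...).timeBased() hands you the ClockTriggerBuilder; as far as I know nearMinute is only approximate, which suits the purpose here.

```javascript
// Hypothetical helper: picks an hour (20..23) and a minute (0..59).
// `random` can be passed in to make the choice predictable in a test;
// it defaults to Math.random.
function pickFridayEveningTime(random) {
  random = random || Math.random;
  var hour = 20 + Math.floor(random() * 4);
  var minute = Math.floor(random() * 60);
  return {hour: hour, minute: minute};
}

// Creates a weekly trigger on Friday at the chosen time.
// Assumes the script project contains a function called sendReleaseNotes.
function scheduleReleaseMail() {
  var time = pickFridayEveningTime();
  ScriptApp.newTrigger("sendReleaseNotes")
      .timeBased()
      .onWeekDay(ScriptApp.WeekDay.FRIDAY)
      .atHour(time.hour)
      .nearMinute(time.minute)
      .create();
}
```

Note that this picks the random time once, when the trigger is created. To get a different time every week you would have to delete the trigger and create a new one after each run.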

Final remark: it is almost always better to include testers in your team. Even at the cost of a lot of initial energy and frustration, the end result is worth it. The solution I described is only a patch for a very bad process.

The nonsense of risk management

April 2, 2012 at 8:36 pm | Posted in Programming | 2 Comments

Let me start this blogpost with a little fictitious story about two project managers:

Paul and Stephen were sitting outside the conference room, waiting for the steering committee to call them in. Paul looked at Stephen with a smug expression on his face. “I’m so glad that I spent that extra money on hiring two extra developers right from the start. That was quite expensive, but since one of our developers quit during the project and another one was ill for almost a whole month, this was money well spent. We made the deadline! So how about your project?” Stephen sighed. His project hadn’t been that successful. Their test server had failed a couple of months ago and ordering a new one took quite some time. This had caused a severe delay in his project. If only he had invested in buying a backup server. At the project start the team had created a list of risks. They hadn’t forgotten about this one, but the chance of this happening was considered very low. How wrong they were!

Continue Reading The nonsense of risk management…

Bug reporting: 8 ways to annoy your software development team

March 14, 2012 at 12:04 pm | Posted in Programming, Ramblings | 17 Comments

As usual this blog post should be read with a large grain of salt. It is a collection of bad practices I have seen during many software development projects. There are positive exceptions. For example when the testers are part of the development team and the whole team is committed to delivering valuable software instead of two opposite parties trying to fight each other. Having said that, the ugly situation mostly happens in fixed-price contracts where the bug versus feature discussion often takes place.

Disclaimer: all the examples are made up. Any resemblance with bug reports from projects I did is pure coincidence ;)

So here is some practical advice for acceptance testers on how to maximally help a software development team:

1. Since you have spent a lot of time finding this bug, prioritize it at least as Major. Of course Critical or Blocking is even better so that the development team will realize it is urgent and pick it up immediately. Luckily most bug tracking systems (for example JIRA) have priority Major as the default setting so you don’t have to spend too much time thinking about this. Example: a missing help text on what the field ‘zipcode’ means should be marked as Major because it is going to leave the end user clueless.

2. Make sure that the description of the bug is short. A single word is better than a long descriptive sentence. The advantage is that it forces the developer to open this issue every time to understand what it means. This will help him to not ignore this bug just by reading the description. For example ‘Login’ is a much better description than ‘Login failure when I fill in a non existing username’.

3. Don’t let yourself be fooled and use improvement, change or new feature when filling in the issue type. Everything you find is a bug. Otherwise your organization (the customer) will have to pay for it, especially in fixed-price contracts. There are two situations to be handled here: if it is in the specification, then it obviously is a bug if a feature is missing or not according to the spec. If it is not (clear) in the specification then you can always defend it by saying ‘it is obvious that this is not acceptable for the end user, so it is a bug’.

4. Never provide details in your bug report! This might put developers on the wrong track. It is much better to let them find out themselves what caused this problem and how to reproduce it. After all, they are the experts. That will also give them a strong incentive to deliver better software the next time. A good example is of course the good old ‘Doesn’t work’ or ‘Program crashed’.

5. You can (and should!) often re-open an existing bug report. This is just a practical thing and saves you some time. Especially if you have a good generic bug description (like for example ‘Login’ mentioned before) this is very convenient. But don’t be bothered if the new bug isn’t related to the old one. You are just helping the developers to limit the total number of bugs and re-opening fixed issues gives them a nice historic perspective. Software developers like re-use and DRY (Don’t Repeat Yourself), so should you as a tester.

6. Another practical tip: combine multiple bugs in one single bug report. Again you are helping the developers here because many bug reports might give a bad impression. It also helps to give them focus: suppose the login functionality in your application doesn’t work, then it is also a good moment to combine that with fixing a spelling error on that same page and probably adjust the colors a bit so they conform more closely to the specification.

7. Bug reports are also a great way to communicate, especially with introverted developers. So you can create a Major bug with a question like “What does the specification say about this Cancel button?”. That will help them to think about the spec, enable single/double/triple loop learning, etc. It might take some time, but someday they will appreciate what you did for them.

8. Another great way to stimulate software developers in a positive way is to use bug reports to give them some good advice. For example “Feature xyz is not very easy to use.” This example can be easily marked as ‘Critical bug’ and it will spark the creativity of the development team to come up with a great solution. That is also the reason that you, as a tester, shouldn’t give away any information on how this feature should be improved because you want to empower the developers instead of extinguishing their creativity.

I’m sure there are many more ways to improve the communication between testers and software developers. Don’t hesitate to share them!
