Scala and Voldemort: Types matter
14 February 2010
So I have been taking a good look at Java object and key-value stores recently. One of the more interesting (because it is very aligned with Java rather than being language-agnostic) is Voldemort. However I got it into my head that it would be a good idea to actually play around a bit using the Scala console. That involved a lot of classpath tweaking, made simpler because the Voldemort distribution comes bundled with its dependencies. However I still rather strangely needed the Voldemort Test jar to be able to access Upper Case View class. I’m not whether I am doing something wrong here or whether the packaging has gone wrong during the build.
With the classpath complete I followed the quickstart instructions and quickly found that actually Scala’s strict type-system was revealing something interesting about the Voldemort implementation. If you translate the Quickstart example to Scala literally then you end up with a Store Client that cannot store anything as it is a Client of type [Nothing, Nothing] and if you supply a value you get an error. I realised that I needed to interact with a lower level version of the API and was surprised to find that the signature of the constructor seemed to need a Resolver instead of defaulting one and that I had to supply a Factory instead of the Config that the Factory consumes.
This is the code I came up with (note that you need to paste this into an interactive Scala REPL rather than putting it in a script). However I’m a nub with Voldemort and I’m not exactly fly with Scala either. Any advice or input is appreciated.
After writing this I was wondering how to try and store arbitrary objects, initially I thought it would be as simple as declaring the Client to be [String, Any] but that failed due to a Class Cast exception in the String Serializer, it looks like the server config comes into play here and says how a value should be serialised and stored. It’s an interesting experience, I have also been looking at Oracle’s BDB but Voldemort puts a much nicer (read Java 1.5) interface on top of it.
Java dependency managers
11 October 2009
I am not a big fan of dependency resolvers whether they be standalone like Ivy or built-in to the monstrosity that is Maven. Frankly when it comes to this topic I am positively Luddite and prefer to hand-manage my binary dependencies via library folders that are checked into the project along with the source code. I think dependency repositories create a pain for library providers who are always having to release into them and for consumers who need a fast connection to the repository to be able to build a project. I always prefer to checkout the source code and have everything I need to build the software right there in the checkout.
There is only one usecase I regard as valid for using a dependency manager and that is where you have very complex internal dependencies between different teams in the same organisation. Since internal release times tend to be very short and you often want to pick up the latest versions as they become available it can be valuable to have a tool that is managing that for you.
Even with this I think it is a big question as to it is valid. After all with internal projects it can often be easier to co-ordinate by source control versions and includes than at the binary level. It also implies that internal communications are not as good as they should be.
Low Expectations for the Build
29 July 2009
I attended the talk on Gradle by Hans Doktor tonight and while I found myself agreeing that Maven is wholly unsatisfactory I did end up thinking that actually our expectations of build tools in the Java space are really low. What kind of things does Gradle offer us? Proper event interception, genuine integration with the build lifecycle, build targets dynamically defined at runtime, a directed cyclic dependency graph.
Looking at the list some of things you can’t believe are not part of our standard build package. We should be able to know when a build starts and stops and be able to attach code to those events. We should have decent target resolution that avoids duplication of tasks.
Gradle is head and shoulders about the morass that is Maven and is clearly superior to the ageing but faithful Ant but that it manages to be so on so little functionality is a shame.
The DAO Anti-patterns
11 July 2009
DAO is a venerable pattern and one that managed to escape out of J2EE and continues to be used a lot in Java development particularly where Spring is also in use. I have never been fan, either the first time around in the Spring and later incarnations. It’s worth having a read through the original blueprint. One thing I particularly love is the proposed combination of Factory and DAO, I find it representative of pattern thinking: if at first you don’t succeed, try excess. In fairness though probably no-one had the problem that the Factory/DAO set out to solve.
I feel the DAO actually creates more problems than it solves. Firstly there is the issue of what the DAO is actually encapsulating. Originally it was meant to encapsulate different data access methods, it would provide the same data retrieval irrespective of whether it was using JDBC, EJB, homebrew ORM or so on. However I do not think I have ever seen it implemented that way, in fact I don’t really recall ever seeing a DAO that had more than one implementation.
What DAO quickly came to be in practice was a replacement for EJBs. The most common implementation maps a single table to a DAO class that implements finder methods that return Collections of a Domain object (if you’re lucky and Lists if not). I think this accounts for 100% of Java DAOs I have encountered in my career. This is exactly what EJB definitions used to be like, arguably the only step forward is that you are now configuring in code. The step back is that you are now implementing all those finders by hand.
So a DAO abstracts something that is irrelevant (data access strategy), is linked directly to the underlying data model (usually the table) and provides an API that is just as useless.
The problem with those finders is that unless the DAO author thought of the query you want to perform then you are out of luck. Without the metaprogamming heavy lifting of a Ruby or Groovy then finders are a dead-end implementation strategy for querying data.
So for me DAOs are an anti-pattern that suck design energy out of the development team in the form of interminable discussions of what particular DAO a finder method belongs to and how many finders should be provided by the DAO. They provide zero decoupling from the underlying data and actually hold development teams back by introducing an unnecessary abstraction layer that needs to be understood but which adds no value.
So what are the alternatives? Well my preferred pattern is DataMapper, this doesn’t introduce unnecessary abstraction and shows a bit of respect for the underlying data. It allows you to do some vertical Domain modelling but the mapping gives you the flexibility to deal with legacy data schemes.
Another good alternative is to ditch the finders and introduce SearchCriteria and Repository. I thought this was a pattern too but it doesn’t seem to have a formal write up, the best example is the Hibernate Criteria but I would urge you to judiciously adapt it to your code rather than just straight up copying the the Hibernate model.
The Java Developer’s Dilemmia
31 May 2009
I believe that Java developers are under a tremendous amount of pressure at the moment. However you may not feel it if you believe that Java is going to be around for a long time and you are happy to be the one maintaining the legacy apps in their twilight. Elliotte Rusty Harold has it right in the comments when someone says that there are a lot of Java jobs still being posted. If you enjoy feasting off the corpse then feel free to ignore the rest of this post because it is going to say nothing to you.
Java is in a tricky situation due to competition on all fronts. C# has managed to rally a lot of support. Some people talk nonsense about C# being what Java will look like in the future. C# is what Java would like if you could break backwards compatibility and indeed even runtime and development compatibility in some cases (with Service Packs). C# is getting mind share by leapfrogging ahead technology-wise at the expense of its early adoptors. Microsoft also does a far better job of selling to IDE dependent developers and risk-adverse managers.
Ruby and Python have also eaten Java’s lunch in the web space. When I am working on web project for fun I work with things like Sinatra, Django and Google App Engine. That’s because they are actually fun to work with and highly productive. You focus on your problem a lot sooner than you do in Java.
The scripting languages have also done a far better job of providing solutions to the small constant problems you face in programming. Automating tasks, building and deployment, prototyping. All these things are far easier to do in your favourite scripting language than they are in Java which will have to wait for JDK7 for a decent Filesystem abstraction for example.
Where does this leave Java? Well in the Enterprise server-side niche, where I first started to use it. Even there though issues of concurrency and performance are making people look to things like Erlang and JVM alternatives like Scala and Clojure.
While, like COBOL and Fortran there will always be a market for Java skills and development. The truth is that for Java developers who want to create new applications that lead in their field; a choice about what to do next is fast approaching. For myself I find my Java projects starting to contain more and more Groovy and I am very frustrated about the lack of support for mixed Java/Groovy projects in IDEs (although I know SpringSource is putting a lot of funding into the Eclipse effort to solve the problem).
If a client asks for an application using the now treadworn combination of Spring MVC and Hibernate I think there needs to be a good answer as to why they don’t want to use Grails which I think would increase productivity a lot without sacrificing the good things about the Java stack. Companies doing heavy lifting in Java ought to be investigating languages like Scala, particularly if they are arguing for the inclusion of properties and closures in the Java language spec.
Oracle’s purchase of Sun makes this an opportune moment to assess where Java might be going and whether you are going to be on the ride with it. It is hard to predict what Oracle will do, except that they will act in their perceived economic interest. The painful thing is that whatever you decide to do there is no clear answer at the moment and no bandwagon seems to be gaining discernible momentum. It is a tough time to be a Java developer.
Working with Groovy Tests
14 April 2009
For my new project Xapper I decided to try and write my tests in Groovy. Previously I had used Groovy scripts to generate data for Java tests but I was curious as to whether it would be easier to write the entire test in Groovy instead of Java.
Overall the experience was a qualified “yes”. When I was initially working with the tests it was possible to invoke them within Eclipse via the GUnit Runner. Trying again with the more recent 1.5.7 plugin, the runner now seems to be the JUnit4 one and it says that it sees no tests, to paraphrase a famous admiral. Without being able to use the runner I ended up running the entire suite via Gant, which was less than ideal, because there is a certain amount of spin-up time compared to using something like RSpec’s spec runner.
I would really like all the major IDEs to get smarter about mixing different languages in the same project. At the moment I think Eclipse is the closest to getting this to work. NetBeans and Intellij have good stories around this but it seems to me to be a real pain to get it working in practice. I want to be able to use Groovy and Java in the same project without having Groovy classes be one of the “final products”. NetBeans use of pre-canned Ant files to build projects is a real pain here.
Despite the pain of running them though I think writing the tests in Groovy is a fantastic idea. It really brought it home to me, how much ceremony there is in conventional Java unit test writing. I felt like my tests improved when I could forget about the type of a result and just assert things about the result.
Since I tend to do TDD it was great to have the test run without having to satisfy the compiler’s demand that methods and classes be there. Instead I was able to work in a Ruby style of backfilling code to satisfy the runtime errors. Now some may regard this a ridiculous technique but it really does allow you to provide a minimum of code to meet the requirement and it does give you the sense that you are making progress as one error after another is squashed.
So why use Groovy rather than JRuby and RSpec (the world’s most enjoyable specification framework)? Well the answer lies in the fact that Groovy is really meant to work with Java. Ruby is a great language and JRuby is a great implementation but Groovy does a better job of dealing with things like annotations and making the most of your existing test libraries.
You also don’t have the same issue of context-switching between different languages. Both Groovy and Scala are similar enough to Java that you can work with them and Java without losing your flow. In Ruby, even simple things like puts versus println can throw you off. Groovy was created to do exactly this kind of job.
If the IDE integration can be sorted out then I don’t see any reason why we should write tests in Java anymore.
Three Little Mockers
12 April 2009
At my last client I got the unusual chance to try three Java mocking frameworks within the same project. The project had started to use EasyMock as the project team felt that Mockito didn’t really have any decent documentation (it does but it’s in the Mockito codebase, not on the Google Code wiki). However as a ThoughtWorks team was working next door there was a strong push to use that.
My personal preference is still for JMock so I also selfishly pushed that into the project to round out the selection. With all three there; we were ready for a Mock Off!
The first distinctive thing is that EasyMock and Mockito can mock concrete classes not just interfaces. The second thing that all three have very different methods of constructing and verifying mocks. EasyMock is record-replay, JMock constructs expectations that are automatically verified by a JUnit runner when the test stops executing, Mockito uses a TestSpy where you can verify what happened to the mock whenever you want.
The record-replay paradigm lead to EasyMock being discarded early on. It has two kinds of problems as far as I was concerned. Firstly you have the feeling of “inside out” where the test is copying the internals of the class under test. Secondly I was confused several times as to what state my mock was in and having to switch the mocks to replay mode felt noisy, particularly where you were setting multiple collaborators in the test (do you switch them to replay once their recording is done or do you record all mocks then switch all of the them to replay?).
Mockito’s easy mocking of concrete classes made it a valuable addition to the toolkit and it does deliver the promised noise free setup and verificiation. However its use of global state was frustrating, particularly in that you can create a compiling use of the API that fails at runtime. If you call verify without a following method then you get an error about illegal use of the framework. Although this is meant to happen in the test that follows the test with the illegal construction, in practice this is hideous pain when the test suite is a non-trivial size. It also meant that a test was appearing to pass when actually nothing was being verified and the error appeared in another pair’s test (the one that implemented the next test) making them think that something was wrong with their code.
Mockito also uses a lot of static imports (which I do have a weaknesses for) but it also means that you have to remember the entry points into the framework.
JMock by comparision to Mockito feels quite verbose, the price for having all that IDE support and a discoverable API is that you have to have a Mockery or two in all your classes and you are defining all your Expectations for the Mockery. There’s no denying that even with the IDE autocomplete you’re doing a lot more typing with JMock. From a lazy programmer point of view you are going to go with Mockito every time.
And that is pretty much what happened in the project. Which I think is a shame. Now in explaining this I am going to go into a bit of ThoughtWorks testing geekery so if you are lazy feel free to go off and use Mockito because the pain I’m talking about will happen in the future not now.
I feel that Mockito is a framework for Test Driven Development and JMock is a framework for Test Driven Design. A lot of times you want the former: tight-deadline work and brownfield work. You want to verify the work you are doing but often design is getting pushed to the sidelines or you actually lack the ability to change the design as you might want. Mockito is great for that.
However Mockito doesn’t let the code speak to you. It takes care of so much of the detail that you often lose feedback on your code. One thing that making you type out your Expectations in JMock does is make you think, really think, about the way your code is collaborating. I think you are much more likely to design your code with JMock because it offers lots of opportunities to refactor. For example if two services are constantly being mocked together then maybe they contain services that are so closely related they should be unified, logically or physically. You don’t see that opportunity for change with Mockito.
By using both JMock and Mockito I was able to see that Mockito did tend to make us a bit lazy and our classes tended to grow fatter in terms of functionality and collaborators because there was no penalty for doing so. I was also concerned that sometimes Mockito’s ability to mock concrete classes meant that sometimes we mocked objects that should have been real or stub implementations.
Mockito is a powerful framework and I suspect that Java developers will take to it as soon as they encounter it. However for genuine Test Driven Design I think Mockito suffocates the code and requires developers with a lot more discipline and design-nous, almost certainly not the people that wanted an easy to use framework in the first place.
Java Library Silver Bullet/Golden Hammer Syndrome
10 April 2009
One thing I notice a lot with Java projects is that there is this strong desire to just have One Thing in the code base. One web framework, one testing framework, one mocking library, one logging library, one templating engine and so on and so on.
There is the understandable desire to reduce the complexity required to comprehend the codebase but that often flows over into One True Way-ism. There is One Library because it is The Library that we should all use to Fix Our Problems.
One reason why Java developers argue so fiercely and nervously at the outset of a project is that when they are defining the application stack there is the unspoken assumption that these choices of framework are fixed forever once made. Once we chose Spring then we will use Spring forever with no chance of introducing Guice or PicoContainer. If we make the wrong choice then we will Cripple Our Project Forever.
I actually like Slutty Architectures because they take away this anxiety. If we start out using JUnit 4 and we suddenly get to this point where TestNG has a feature that JUnit doesn’t and having that feature will really help us and save a lot of time; well, let’s use TestNG and JUnit4 and let’s use each where they are strongest.
Similarly if we are using Wicket and we suddenly want some REST-ful Web APIs should we warp our use of Wicket so we get those features? Probably not; let’s chose a framework like JAX-RS that is going to do a lot of the work for us in the feature we want to create.
Of course Slutty Architecture introduces problems too. Instead of the One True Way and hideous API abuse we do have an issue around communication. If we have two test frameworks then when someone joins the team then we are going to have to explain what we test with JUnit 4 and what we test with Test NG. There is a level of complexity but we are going to deal with it explicitly instead of giving a simple statement like “we use JUnit4″ but which has been poisoned by all these unspoken caveats (“but we write all our tests like JUnit3 and you have to extend the right base test class or it won’t work”).
We also need to review our choices: when a new version of a library gets released. Does it include functionality that was previously missing? Perhaps in the light of the new release we should migrate the entire test code to TestNG for example. That kind of continual reassessment takes time but also stops an early technology choice becoming a millstone for everyone involved.
But please, let’s get over the idea that there has to be one thing that is both floor polish and desert topping. It’s making us ill.
Programming to Interfaces Anti-Pattern
31 January 2009
Here’s a personal bete noir. Interfaces are good m’kay? They allow you to separate function and implementation, you can mock them, inject them. You use them to indicate roles and interactions between objects. That’s all smashing and super.
The problem comes when you don’t really have a role that you are describing, you have an implementation. A good example is a Persister type of class that saves data to a store. In production you want to save to a database while during test you save to an in-memory store.
So you have a Persister interface with the method store and you implement a DatabaseStorePersister class and a InMemoryPersister class both implementing Persister. You inject the appropriate Persister instance and you’re rolling.
Or are you? Because to my mind there’s an interface too many in this implementation. In the production code there is only one implementation of the interface, the data only ever gets stored to a DatabaseStorePersister. The InMemory version only appears in the test code and has no purpose other than to test the interaction between the rest of the code base and the Persister.
It would probably be more honest to create a single DatabaseStorePersister and then sub-class it to create the InMemory version by overriding store.
On the other hand if your data can be stored in both a graph database and a relational database then you can legitmately say there are two implementations that share the same role. At this point it is fair to create an interface. I would prefer therefore to refactor to interfaces rather than program to them. Once the interaction has been revealed, then create the interface to capture the discovery.
A typical expression of this anti-pattern in Java is a package that is full of interfaces that have the logical Domain Language name for the object and an accompanying single implementation for which there is no valid name and instead shares the name of the interface with “Impl” added on. So for example Pointless and PointlessImpl.
If something cannot be given a meaningful name then it is a sure sign that it isn’t carrying its weight in an application. Its purpose and meaning is unclear as are its concerns.
Creating interfaces purely because it is convenient to work with them (perhaps because your mock or injection framework can only work with interfaces) is a weak reason for indulging this anti-pattern. Generally if you reunite an interface with its single implementation you can see how to work with. Often if there is only a single implementation of something there is no need to inject it, you can define the instance directly in the class that makes use of it. In terms of mocking there are mock tools that mock concrete classes and there is an argument that mocking is not appropriate here and instead the concrete result of the call should be asserted and tested.
Do the right thing; kill the Impl.
Better, Faster, Weaker
9 November 2008
I went to the Developers and Startups Meetup last week (look it up on Meetup if you’re interested it was a good event). One thing that struck me was that all the entrepreneurs were mostly interested in PHP and Ruby on Rails (there was a significant rump of ASP and .Net which I can only put down to C#’s language hotness at the moment). Java in particular was poorly regarded.
I was intrigued by what was causing the negative vibe as while I can understand that Ruby on Rails is the new tech hotness I couldn’t really see the appeal of PHP over something like Java that has a decent stack that is practically free. Talking to a few people I got the sense that PHP was seen as a “getting things done” language. Something that got you to a tangible real product very quickly. Rails seemed to have much the same quality but I think a lot of people felt that Rails skills were more expensive and in a lot of cases they wanted a thin web layer over an existing service backend that already existed.
Of course both PHP, Rails and the Java Web Stack all make various compromises to do what they do. Where Rails is opinionated, Java probably isn’t opinionated enough. If you put all your code into page (Model 1 stylee) then I’m pretty certain that writing JSPs in J2EE 5 is probably as quick to put something together.
Of course one thing that hugely hampers Java as a productive web stack is the compile and deploy loop. There is none of the immediacy of something like RoR that allows you to make changes very quickly in response to feedback. The tradeoff is of course that the code needs to be interpreted and you have to add some kind of caching to avoid the performance hit of always having to evaluate code.
But given that there are a group of people who need to have code quickly so they can sell something to their customers or investors and that they are willing to make all kinds of compromises about scale and maintainance costs is the Java platform really out of the question? I ask this because I think I have been involved in quite responsive Java web teams where feature changes have been matters of days not weeks or months.
I think that perhaps the issue is partially just mental. Enterprise software development tends to take place in a risk-adverse environment where certainty or process is valued over rapid delivery. “Best practice” in this environment tends to sacrifice timeliness. In the smaller business space this might well be a practice turned into dogma.
That said though some Java Frameworks introduce major barriers to just getting stuff done. Any framework that forces you to produce XML based interaction flow is probably distracting you from generating functionality that people want and will pay for. Routing isn’t a trivial part of an application but that’s all the more reason to box clever with it. Any framework that doesn’t make it simple to inject services into classes is making life too hard. Ditto any framework that forces you to work with a particular Expression Language or Templating product.
I am wondering if what I am looking for is actually the weakest Java framework. Rather than having functionality for all circumstances (and having to configure and setup it all up) I want a very basic REST framework which I can then add templating and persistence according to my need. It still wouldn’t have the nice scaffolding of RoR but it would mean that each element of the web app would be absolutely vital to its function. Weakness would generate the power of vitality and reduce the obfuscation of bloat.