Keep the focus on the read
20 February 2012
One of the interesting things about Wazoku’s Startup Challenge app is that a lot of the functionality is created via “out of the box” CouchDB features. In fact, it is often where we haven’t leant heavily on the features of our store and frameworks that we have issues.
One of the interesting things we decided to do in the app relatively late in the day was provide little encouragements to say how many more votes an entry needed to get to the next place in the ladder. As this was a late feature we didn’t really think through where this feature would sit. We had code that re-ranks entries when their vote ordering changes and so when an entry was being re-ranked it also acquired the target to beat at the same time.
With a store like CouchDB you are really aiming to keep reading data and to minimise writes. That’s done via denormalisation and also via strategies to generate related and derived data when you are changing the parent data.
So this placement made sense from that point of view. It was only later that I began to realise that we had chosen the wrong point to read. With hindsight it is only necessary to calculate the target entry when someone looks at the entry. This is because views of the entries are distributed unequally, and because the vote totals already exist as a CouchDB view, so we can do a key lookup to find all entries with more votes than the current entry when needed.
If we wanted to cache that result to avoid needless recalculation we would be better off storing the information in a front-side cache like Memcached or Redis, but in practice key reads in CouchDB are pretty damn fast and low load.
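As a rough sketch of calculating the target at read time: the sorted list below is a hypothetical in-memory stand-in for a CouchDB view keyed by vote total, and the entry ids and the votes_needed function are invented for illustration. The bisect call plays the role of a key range query that starts just above the current entry’s votes.

```python
from bisect import bisect_right

# Hypothetical stand-in for a view keyed by vote total. Rows are kept
# sorted by key, as a view index would be. Entry ids are made up.
VOTE_VIEW = [
    (3, "entry-d"),
    (7, "entry-c"),
    (12, "entry-b"),
    (20, "entry-a"),
]


def votes_needed(view_rows, current_votes):
    """Return (target_id, votes_to_overtake) for the next entry up the
    ladder, or None if no entry has more votes."""
    keys = [votes for votes, _ in view_rows]
    # Find the first row whose key is strictly greater than current_votes,
    # similar in spirit to a range query starting above the current key.
    idx = bisect_right(keys, current_votes)
    if idx == len(view_rows):
        return None
    target_votes, target_id = view_rows[idx]
    # One more vote than the gap is needed to move past the target.
    return target_id, target_votes - current_votes + 1
```

Nothing is written when votes change; the target is only worked out for entries that someone actually looks at.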
So we thought we were saving ourselves problems by denormalising derived data but in fact we were creating a lot more work at a point where it is uncertain that the additional data will ever be consumed.
Sometimes it can be hard to pick the right point to read!
Intuitive versus Reasoning Programmers
3 April 2011
During the last year I’ve been helping run a monthly series of dojos for the London Clojure User Group. In the course of it I have had the chance to watch a lot of people grapple with functional programming. As a result of this and also looking at the way a lot of my colleagues at ThoughtWorks work I think programmers can roughly be divided into two groups: Intuitive and Reasoning.
To characterise both of them a little, I think that Intuitive programmers tend to use domain language a lot, rely heavily on tests and TDD, often find it difficult to articulate what they are doing in their work, they like small source code files because they like to see everything a file does at a glance, they prefer outside-in problem solving and like the opportunity to go back and revise their work.
Reasoning programmers like to discuss a problem before coding and explore edge-cases and what-ifs, they like to work in the REPL or have in-editor code evaluation, they quickly move from a domain to an abstraction and then work in that abstraction, they don’t mind a thousand line source file as long as it is logically structured (that’s what the search function is there for), they prefer “bottom-up” coding where they distil their abstraction to its essence and having implemented that create the required behaviour by composing their abstractions.
Both sets of programmers can be great at their work, but if they are unaware of the characteristics of how they like to work then there can be massive amounts of tension as the two wrestle back and forth. The Intuitive developers feel angsty when the Reasoning programmers start hacking out code rather than refactoring; Reasoning developers get frustrated at the aimlessness and game playing of the Intuitives’ attempts to generate the minimum amount of code to make their tests pass.
Reasoning programmers probably feel that in terms of communicating they have the superior style as they are able to advance arguments and logical constructs that can be interrogated. Those articulated models, though, can feel arid and irrelevant to Intuitive programmers: what does some obscure mathematical formula have to do with trying to make sure the frog doesn’t get run over when it crosses the road?
Intuitive programmers seem to be better at switching contexts and adapting to change as they can quickly see the “outlines” of any problem and their techniques are about honing that initial perception into a functional solution. By contrast a Reasoning programmer is wary of uncertainty and is unhappy drawing unjustified analogies between different situations.
In FP terms Reasoning programmers have their behaviour emerge as a logical consequence of the operation of lower level abstractions; their code looks like algebra and domain data is passed in to the top of their processing chain. Intuitive programmers, on the other hand, fix behaviour in their tests and fill their code with domain language that aims to match the natural language of the organisation; the depth of their function calls is usually smaller and they are swifter to bind calculations to intermediate variables.
I think I am an Intuitive programmer (and therefore I worry that I am miscasting Reasoning programmers through my lack of understanding) and in my work most of my colleagues are as well; it is in the nature of consultancy to have to adapt to constantly changing domains, code bases and expectations. We do have a few Reasoning programmers, though, who do exactly the same work but do it via thinking deeply through problems and drawing logical inferences.
If there are issues between the two groups then I think they occur when there are no concessions between the two; common flashpoints are testing, writing defensive/guard code and giving time to discuss problems and problem solving strategies. The two also agree on a lot of things for very different reasons. For example, both like to rewrite code: refactoring is a formalised practice for Intuitive developers whereas Reasoning developers often want to apply new insight or learned ideas (“This could all just be monads!”).
An important point is that neither approach is “right”: both types of programmer arrive at the same results if their experience, ability and other factors are equal. They are purely styles of working.
The beauty of small things
16 November 2010
I am very interested in the idea of “constellation architecture” and microapps as a new model for both web and enterprise architecture. It feels to me like it is a genuinely new way of looking at things that can deliver real benefit.
It is also not a new way of doing things: it is really just an extension of the UNIX tools idea, taking ideas like service-orientated architecture and some of the patterns of domain-driven design to their logical conclusion.
If I take ls and I pipe it through grep, you wouldn’t find that particularly exciting or noteworthy. However creating a web application or service that does just one thing and then creating applications by aggregating the output of those many small components does seem novel and slightly adventurous to some.
SOA failed before it began and the DDD silos of vertical responsibility seem poorly understood in practice. Both have good aspects though. However both saw their unit of composition as being something much larger than a single function. An SOA architecture for payments for example tended to include a variety of payment functions rather than just offering one service, authorising a payment for example.
There is a current trend to look at a webpage as being composed of widgets, whether they be written as server-side components or as client operated components. I think this is wrong and we need to see a page as being composed of the output of many different webapps.
Logging in is handled by a web application whose only responsibility is to authenticate users; the most popular pages are delivered by an application whose responsibility is to determine which pages are popular.
These applications should be as small as we can make them and still function. Ideally they should be a few lines of domain code linking together libraries and frameworks. They should have acceptance/behaviour tests to guarantee their external functionality and that’s about it.
It seems to me that the only way we are going to get good large-scale functionality is by aggregating useful, small segments of functionality. Building large functional stacks takes a lot of time and doesn’t deliver value exponentially to the effort of its creation.
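A minimal sketch of a page as the aggregated output of small apps, in the same spirit as piping ls through grep. Each “app” here is just a plain function standing in for what would really be a separate web application; the function names, the request shape and the hit counts are all hypothetical.

```python
def auth_app(request):
    """Hypothetical microapp: only knows whether a user is logged in."""
    user = request.get("user")
    return f"Signed in as {user}" if user else "Please log in"


def popular_pages_app(request):
    """Hypothetical microapp: only knows which pages are popular."""
    hits = {"/about": 12, "/blog": 40, "/contact": 3}  # made-up data
    top = sorted(hits, key=hits.get, reverse=True)[:2]
    return "Popular: " + ", ".join(top)


def compose_page(request, apps):
    """Build a page by aggregating the output of each small app."""
    return "\n".join(app(request) for app in apps)


page = compose_page({"user": "ada"}, [auth_app, popular_pages_app])
```

The composing code knows nothing about authentication or popularity; it only knows how to ask each app for its output and join the results.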
Mockito and Scala
15 November 2010
Scala is a language that really cares about types and mocking is the art of passing one type off as another so it is not that surprising that the two don’t get along famously. It is also a bit off probably that we are using Java mocking frameworks with Scala code which means you sometimes need to know too much about how Scala creates its bytecode.
Here are two recent problems: the “ongoing stubbing” issue and optional parameters with defaults (which can generally be problematic anyway as they change the conventions of Scala function calling).
Ongoing stubbing is an error that appears when you want to mock untyped (i.e. Java 1.4) code. You can recognise it by the hateful “?” characters that appear in the error messages. Our example was wanting to mock the request parameters of Servlet 2.4. Now we all know that the request parameters (like everything else in an HTTP request) are Strings. But in Servlet 2.4 they are “?” or untyped. Servlet 2.5 is typed and the first thing I would say about an ongoing stubbing issue is to see if there is a Java 1.5 compatible version of whatever it is you are trying to mock. If it is your own code, FFS, add generics now!
If it is a library that you have no control over (like Servlet) then I have some bad news: I don’t know of any way to get around this issue. Scala knows that the underlying type is unknown, so even if you specify Strings in your mock code it won’t let it compile, and if you don’t specify a type your code still won’t compile. In the end I created a Stub sub-class of HttpServletRequest that returned String types (which is exactly what happens at runtime, thank you very much Scala compiler).
Okay so optional parameters in mocked code? So I love named parameters and default values because I think they are probably 100% (no, perhaps 175%) better at communicating a valid operating state for the code than dependency injection. However when you want to mock and set an expectation on a Scala function that uses a default value you will often get an error message saying that the mock was not called or was not called with the correct parameters.
This is because when the Scala code is compiled you are effectively calling a chain of methods and your mock needs to set matchers for every possible argument, not just the ones that your Scala code is actually passing. The simplest solution is to use the any() matcher on any default argument you will not be explicitly setting. Remember that this means the verification must consist entirely of matchers, so explicit values need to be wrapped in eq() and so on.
What to do when you want to verify that a default parameter was called with an explicit value? I think you do it based on the order of the parameters in the Scala declaration but I haven’t done it myself and I’m waiting for that requirement to become a necessary thing for me to know.
Python as a post-Java language
12 November 2010
I’m a UNIX-based developer and since 2000 I have been working mainly with Java and then JVM languages. When Java 7 slipped I made no real secret of the fact that Java was in a lot of trouble. The post-Oracle world though looks even worse with a lack of clarity of what in the core ecosystem is free, open source and liability free.
Clojure and to a lesser extent Scala are great steps forward so I don’t feel the burning need for a Java 7/8 whatever. However a moribund or tainted JVM is a major problem and so I’m now thinking about what the post-Java escape route looks like. On the web front it is pretty obvious: Python and Ruby are great languages with great frameworks for developing web-based applications. For the server-side heavy lifting it is a lot less clear. People are talking about Google Go but that does feel quite low-level; I’m not sure I’m ready to go back to pointer wrangling even with memory-management. It feels like something you’d build a tool out of, not an application. Mono feels like more of the same problems of wrestling with big companies with vested interests; if you are going to do that then why not try and sort out the OpenJDK?
As the title of the post suggests the language I am most inclined towards right now is Python. It is a really concise but clear language that on UNIX systems comes with an amazingly comprehensive set of libraries and which has a virtual environment and dependency management that is on a par with RVM and gem.
The single issue that comes up is performance. What I have been finding is that for 80% of the work I am doing performance is okay and I’m producing a fraction of the code I would normally have to create. For that last 20% maybe I am going to have to look at something like Go or (god forbid) go back to C but I would much prefer to see a Clojure or Scala that could run on top of something like LLVM. I also have some hope of smarter people than me making progress on a JIT for Python that might take that 20% down to a figure where performance would matter so much to me that I wouldn’t mind sweating to make it happen.
The Helper Anti-Pattern
9 February 2009
You have a class X, you have a class called XHelper. XHelper contains methods that make it easy to use X.
The problem I have with this antipattern is that XHelper does nothing of value. If the methods are truly related to X then they should actually be class methods of X. However if you need “helper” methods to use the API of X chances are what is really required is a refactoring of X to incorporate the enhancements of XHelper invisibly. You shouldn’t need a helper to use an API.
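A small illustration of that refactoring, using invented class names (Temperature, TemperatureHelper and BetterTemperature are hypothetical and only here to show the shape of the change):

```python
# Before: callers need a separate helper to use Temperature comfortably.
class Temperature:
    def __init__(self, celsius):
        self.celsius = celsius


class TemperatureHelper:
    @staticmethod
    def to_fahrenheit(temperature):
        return temperature.celsius * 9 / 5 + 32


# After: the behaviour lives on the class itself, so no helper is needed.
class BetterTemperature:
    def __init__(self, celsius):
        self.celsius = celsius

    @property
    def fahrenheit(self):
        return self.celsius * 9 / 5 + 32
```

With the second version a caller writes BetterTemperature(100).fahrenheit and never has to discover that a helper class exists.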
Take Rails page helpers. A helper to construct the content of a page contains functionality that would be better marshalled in the controller, prior to view rendering. If multiple controllers perform the action then extract it to a service that controllers can invoke on the requests and delegates they are co-ordinating.
What if the Helper class actually refactors common functionality from classes X and Y and is actually called FooHelper because it helps perform Foo in X and Y?
Well, here we are onto something, we have some common functionality which is good and the name of the class reflects its purpose. The same question arises though, could FooHelper’s methods actually reside in Foo? If Foo is purely a function or method call then perhaps all the functionality relating to Foo should be encapsulated in a Foo class that presents the foo method.
Alternatively perhaps there is a better name than “Helper”? As examples, I tend to call collections of class methods that transform instances of one class into instances of another class “Transformers”. Similarly methods that create database connection instances could be called “Providers”. If you cannot make the class a private class instance of the class or classes the Helper is nominally a Helper to, then there is usually a better name for the class lurking around somewhere.
Jude, the Java Documentation Browser
24 April 2008
I recently bought a license to Dave Flanagan’s Jude. I have enjoyed Dave’s books on JavaScript, Ruby and of course the Java in a Nutshell series. I was a little disappointed that the Nutshell Java book didn’t get an update to reflect Java 6 and looking for more information I took a trip to Dave’s site and saw an updated version of Jude. I had tried it before but preferred the paper version of the book (its initial sections on Java features are still models in the genre).
However with no new edition in the offing I decided to buy Jude and process my Java 6 JDK. Jude basically reads the Javadoc and code in the target JDK and then serves that information out via an in-application HTTP server which you point your browser at.
Recently I have been switching between several “post Java” languages: JRuby, Groovy, Scala and Jython. In all of them though I have needed to know exactly what the details of the Java Library API are so I can access them via the host language. The other day I was filing a bug report and needed to know the details of the Charset object. As I was flicking between Jude and the report I wondered, “When did this app become indispensable?”.
It might be argued that all you need is Google and the Sun Javadoc but Jude has a few features that make it much more useful. Obviously it is usable offline; that’s not something to be sniffed at. Secondly its search and browsing features are intuitive and “right” for the domain. I find it a lot easier to browse through hierarchies and leap between classes and packages using Jude’s dedicated tools than via Google. It has also replaced the not inconsiderable heft of Java in a Nutshell from my work bag.
If I could make one change to make it even more useful I would make the default search much more liberal than it is now. The search accepts Java RE which is great for power using but you shouldn’t have to enter /.*http.*/i to find every instance of a class with any variation of HTTP in it, you want to be able to just type “http” Google-style.
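To make the difference concrete, here is a sketch of the two search behaviours. It says nothing about how Jude is actually implemented: the function names are invented, the class list is a small hand-picked sample, and the strict version assumes the pattern has to match the whole name, which is what having to type .*http.* suggests.

```python
import re

# Hand-picked sample of class names, not Jude's real index.
CLASS_NAMES = [
    "java.net.HttpURLConnection",
    "java.net.HttpCookie",
    "javax.servlet.http.HttpServlet",
    "java.lang.String",
    "java.net.URL",
]


def strict_search(pattern, names):
    """Whole-name, case-insensitive regex match (assumed current behaviour)."""
    regex = re.compile(pattern, re.IGNORECASE)
    return [name for name in names if regex.fullmatch(name)]


def liberal_search(term, names):
    """Treat plain text as a case-insensitive substring, Google-style."""
    needle = term.lower()
    return [name for name in names if needle in name.lower()]
```

Under these assumptions liberal_search("http", CLASS_NAMES) returns the same names as strict_search(".*http.*", CLASS_NAMES), while strict_search("http", CLASS_NAMES) finds nothing.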