15.01.2016

Functions as Objectives

Software development undervalues an age-old principle: building complicated stuff from simpler stuff. That's what became clear to me when I saw this advertisement today: Shell is offering an exclusive Lego racing car at its gas stations.

When I saw this my first reaction was "See, Lego is not dead!" But after a couple of seconds I thought: "This is wrong! This is against the original spirit of Lego!"

I grew up with Lego, building cars, houses, rockets, and all sorts of stuff from the proverbial Lego building blocks. But back then Lego was different: it provided just a few small, very generic building blocks. With them I was able to compose very, very different stuff. And that was the whole point of Lego!

Lego was about the ability to combine simple shapes to form more complicated shapes. Thus you were only limited by your creativity.

But Shell's racing car is different. It's built from parts sporting the same connectivity as the original Lego building blocks. But its parts have a very specific form. You'll be hard-pressed to re-use them in a different context. Your creativity is limited by their very concreteness. They make up a nice racing car – but are good for little else.

And this made me think about software development...

A piece of software is like Shell's racing car or the robots and rockets I built when I was a child. It's a whole made of smaller parts. But what kind of parts?

As you can see, for my Lego figures on the left it's the smallest building blocks there are. And it's the same for the car. The smallest building blocks might be used multiple times, but there are no additional levels of abstraction. It's either the whole or atoms. That's it.

For such small wholes that's just fine. But it does not scale. If you want to build larger stuff, especially stuff with moving parts, you want to define levels of part granularity between whole and atom.

Look at the following picture. It shows in a schematic way how I built my toys from atomic Lego building blocks. The form and smoothness of the shape of the whole is limited by the size and shape of the generic Lego parts in relation to the size of the whole. What's built from Lego parts is not continuous but quantized.

It's black boxes all the way down

If you wanted to build more complicated stuff, you would first assemble sub-systems, and then you'd assemble the whole system from those sub-systems. Nobody builds a house from atoms :-)

This is how houses, cars, computers, blow dryers, even organisms are built. But to me it seems software is not built like this. Software developers do not define levels of abstraction in their solutions.

Please note: the layered architecture pattern is not (!) about different levels of abstraction! All layers are on the same level of abstraction, but doing different things.

At the core of abstraction as shown above is nesting. Smaller parts are really hidden inside larger ones. Smaller parts vanish. Once a larger part has been assembled, its constituents are no longer important. That's the whole point of building stuff like this. It's black boxes made of black boxes.

Most software architectures, though, seem to be just three strata of black boxes:

The top stratum is the whole system. It's one big black box containing everything.
Then there is the bottom stratum consisting of the atoms, i.e. programming language statements and API calls.
And between the two there is one stratum of heavily interconnected black boxes.

Software thus might consist of many, many layers – but very often those layers exist on the same level of abstraction.

We simply don't really live by the Lego mantra: build complicated stuff from simple stuff. Build large things from small things. Because this is a recursive mantra. Once something large and complicated has been built, it immediately becomes something small and simple with regard to a yet higher level of abstraction.
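The recursive nature of the mantra can be sketched in a few lines of Python. This is only an illustration under assumptions: the helper `compose` and the names `normalize`, `tokenize`, and `count_words` are hypothetical and do not come from the post. The point is that each assembled part is immediately used as a simple building block on the next level up.

```python
from functools import reduce


def compose(*steps):
    """Combine steps into one function that applies them left to right."""
    def composed(value):
        return reduce(lambda acc, step: step(acc), steps, value)
    return composed


# Atoms: built-in string operations (str.strip, str.lower, str.split, len).
# A small part assembled from atoms:
normalize = compose(str.strip, str.lower)
# A larger part that treats `normalize` as a black box:
tokenize = compose(normalize, str.split)
# A yet larger part that treats `tokenize` as a black box:
count_words = compose(tokenize, len)

print(count_words("  Build Large Things From Small Things  "))  # prints 6
```

Whoever reads `count_words` needs to know nothing about stripping or lower-casing. Those details have vanished inside `tokenize`.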

This is what is hardly ever done. At least from what I see.

Stratified Design

Abelson and Sussman called this stratified design [PDF]. You see, I did not make up the term "stratum". The idea is quite old. And Alan Kay talked about it, too.

Each level of abstraction consists of a number of parts made up of smaller parts from lower levels. They even say each stratum should be defined in its own language. Think of it: software as a system built from lots of languages sitting on top of each other.

This is no science fiction but daily reality. Maybe we've forgotten about it:

On the lowest level there are machine instructions. That's a very, very low-level programming language.
On top of that is (at least in the Java and .NET world) an intermediate language.
On top of that sits a "regular" programming language like Java or C# plus whatever libraries you use.
And on top of that is whatever you code.
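The strata listed above can be made visible with a small experiment. The post names Java and .NET; the sketch below uses Python instead, purely as an assumption for illustration, because CPython ships a disassembler (`dis`) in its standard library and its bytecode plays a role comparable to an intermediate language. The function names `is_equal` and `opnames` are hypothetical.

```python
import dis


def is_equal(a, b):
    # Stratum "regular programming language": one readable comparison.
    return a == b


def opnames(func):
    # Stratum "intermediate language": the names of the bytecode
    # instructions CPython compiled the function into.
    return [instruction.opname for instruction in dis.get_instructions(func)]


# The exact instruction list differs between Python versions,
# so it is printed here rather than spelled out.
print(opnames(is_equal))
```

The printed instruction names belong to a different vocabulary than the source line `a == b`: the same behavior, expressed one stratum lower.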

Programming like that works very well. But interestingly we don't carry this strategy forward into our own programs. We stop at stratum #4. We just define one big language soup from whatever level #3 provides us with.

But why is that?

My guess is: because nobody teaches how to do it differently. Object-orientation provides the means to do that. But most projects don't use it to that end. They are content with just one additional stratum, which they slice and dice into many, many small, heavily interconnected building blocks. That's what then is called a monolith.

Let me repeat: Just because you divide up your code into classes, libraries, or even components does not mean you're defining different strata. Even layering those modules does not mean you're working on different levels of abstraction.

Or to put it more succinctly and maybe provocatively: To modularize does not necessarily mean to abstract.

Behavior over data

For 25 years at least, though, our industry has been obsessed with modularization under the name of object-orientation. As we now can see: this thinking does not cut it. Code still becomes hard to maintain after a short while, sometimes even after minutes.

How can this be changed?

By switching the focus to what can easily be stratified. That's behavior.

Or if you like it more in programming language terms: by switching the focus to functions.

Functions encapsulate behavior encoded in the form of logic, i.e. transformations, control structures, and API calls.

And behavior can be defined on different levels of abstraction. That's what the strata from machine code to high level language code show us. That's what you easily see once you think about any complicated behavior. Take cooking a meal for example:

Top stratum: just one whole behavior "cook meal".
Below the top stratum there are high level behaviors on the next stratum like "prepare starters", "prepare main course", "prepare dessert".
The high level behaviors are in turn made up of, for example, "prepare meat" and "prepare side orders" for "prepare main course".
Then on the next lower stratum it's for example "peel potatoes", "cook potatoes", "place potatoes on plate", "get can of beans", "open can of beans", "cook beans" for "prepare side orders".

If you take together all the partial behaviors on one stratum they define what the whole behavior is about – each stratum on another level of abstraction.
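The "cook meal" refinement above can be sketched as nested functions. This is a minimal Python sketch, not a recipe for real code: every function simply reports the step it stands for, and the behaviors the text does not refine any further ("prepare starters", "prepare meat", "prepare dessert") are placeholders by assumption.

```python
# Lowest stratum: concrete kitchen activities, each reported as one step.
def peel_potatoes():
    return ["peel potatoes"]

def cook_potatoes():
    return ["cook potatoes"]

def place_potatoes_on_plate():
    return ["place potatoes on plate"]

def get_can_of_beans():
    return ["get can of beans"]

def open_can_of_beans():
    return ["open can of beans"]

def cook_beans():
    return ["cook beans"]

# Next stratum up: speaks only in terms of the activities above.
def prepare_side_orders():
    return (peel_potatoes() + cook_potatoes() + place_potatoes_on_plate()
            + get_can_of_beans() + open_can_of_beans() + cook_beans())

def prepare_meat():
    return ["prepare meat"]  # placeholder: not refined further in the text

# Stratum of courses: knows nothing about cans or peeling.
def prepare_starters():
    return ["prepare starters"]  # placeholder: not refined further

def prepare_main_course():
    return prepare_meat() + prepare_side_orders()

def prepare_dessert():
    return ["prepare dessert"]  # placeholder: not refined further

# Top stratum: the whole behavior, expressed in courses only.
def cook_meal():
    return prepare_starters() + prepare_main_course() + prepare_dessert()

for step in cook_meal():
    print(step)
```

Each function body uses only the vocabulary of the stratum directly below it. Reading `cook_meal` tells you what the meal consists of without a single word about potatoes.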

Each stratum has its own vocabulary. "open can" or "peel" or "cook" are activities not known in the top three strata. Likewise "main course" is not a term known in the lowest stratum.

It's like the == of a high-level language: you won't find it in machine code. Conversely, ADC from the x86 instruction set is something you won't find in Java or C#.

What I want to suggest is that you use functions to encode behavior on different levels of abstraction. Don't just split up logic horizontally to fulfill the Single Responsibility Principle. That's not thinking in terms of different levels of abstraction. Instead, consciously refine software behavior top-down as shown above for "cook meal".

That, to me, is the first and most important task during software development. It's not only modularization but also stratification. And once you see behavior on different levels of abstraction, once you have a bunch of stratified functions constituting languages of different fidelity, you can start thinking about how to continue with modularization by bundling them up into classes and libraries.

Yes, I firmly believe that all the hand-wringing and heated discussions about the dire state of codebase maintainability will not advance our industry until we embrace stratified design. Functions need to become our first concern. Splitting up behavior on different levels of abstraction into functions needs to become our main objective.

And what about data? Well, data is gonna take care of itself ;-) Once we get the behavior right, once we get our functions right, data will naturally follow.