"I put them there because ... wait, um, they're, let me see ... they sum up x, y and z and calculate some sort of squared nearest neighbor... yep..."
"So, what do they do?"
"I just told you..."
"See, I just found that I couldn't understand those lines in the current context, so I just wanted to rip out a new method for this stuff..."
"Now what should I call this method?"
"Um... ... call it doCalculation!"
What is "good" abstraction?
Good abstraction should communicate intent unambiguously by utilizing shared knowledge without leaking details.
Abstraction is considered one of the most important concepts of software development. A sequence of bytes can be abstracted into a list of mnemonic assembler instructions which can be abstracted into some higher level language which can be abstracted into some even higher level language which can be abstracted to pressing an "execute" button on your desktop (which is nothing more than yet another visual programming language). *Phew*.
Unfortunately, abstraction is highly context-dependent (see the Wikipedia page on abstraction):
Abstraction uses a strategy of simplification, wherein formerly concrete details are left ambiguous, vague, or undefined; thus effective communication about things in the abstract requires an intuitive or common experience between the communicator and the communication recipient.
So abstraction is used to communicate. This is important. Abstraction is not some self-serving concept; its main use is communication. Whether that means you communicating with a coworker or [you now] communicating with [you, a week from now] doesn't matter. A good abstraction therefore communicates a concept that's hard to understand in detail.
Abstraction is also contextual. To communicate via abstractions you need something in common that makes the abstraction understandable. In the software context this "intuitive or common experience between the communicator and the communication recipient" usually means at least a shared programming language in which you specify your abstractions, plus a shared natural language and cultural background to be able to decipher meaning.
Build your abstractions upon domain knowledge that you share with your coworkers (and yourself in a few days for that matter).
"This is good code. It is modular, it uses dependency injection, it doesn't violate the Law of Demeter, modules are not coupled and it was developed in a test-driven way, so there are lots of unit tests to start with. And on top of it all, not one of my nice code metric tests goes red."
"What does this variable nukaguroAtaki mean?"
"I don't know. This is good code, my consultant toolbox told me so. Good code is self-explanatory."
"But I can't see what nukaguroAtaki does on object nutoTsa in class Adko."
"I don't know. This is good code. Now give me my paycheck."
You may get the top score on every metric, you may abide by the Law of Demeter, you may create low coupling and high cohesion, your code may be perfectly unit testable, it may even be written in some nice fancy functional language - all this is no guarantee your code won't suck.
Structure and abstraction are commonly confused. Structure is a measurable quality of your code. All code metrics I know of wrangle code structure into numbers. Low coupling and high cohesion are structural properties; the Law of Demeter is a typical rule about structural coupling. Structure is easy to measure. This is probably why chicken-hearted people tend to cling to the idea that it's all about structure.
Abstraction is a different concept. A bad structure may hint at bad abstractions, but often there is no visible correlation. A good structure definitely does not hint at good abstractions. It's easy to violate the golden rules of abstraction and still write well-structured code. Good abstraction is about communicating intent.
Of course communicating abstract intent is all about language, so names are an important concept. Using funny names like "yetAnotherSillySelector(...)" is as bad as abbr() that are hrd() to dcphr(). A very interesting approach that is taught for example by Fowler is this:
Write a new function whenever you want to write a comment.
If some lines of code are complex enough that they need a comment to explain their intent, you might as well extract a shiny new method that is named like the comment you wanted to write. Domain Driven Design helps to find the right names for those functions.
An abstraction can never be completely unambiguous. If it were, it would not be an abstraction. That said, ambiguity can render an abstraction useless. Of course this depends on the context. The name doCalculation can be a perfect name describing what should be done next in the context of a LeastSquareSolver class. Without context it is totally useless, as a calculation can be anything.
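As a minimal sketch of that advice, here is how the "squared nearest neighbor" lines from the opening dialogue might look before and after extraction. All function names and the tuple-based point representation are assumptions made up for illustration; nothing here comes from a real code base.

```python
from math import inf

# Hypothetical representation: a 3D point is an (x, y, z) tuple.
Point = tuple[float, float, float]


def nearest_before(point: Point, others: list[Point]) -> float:
    # find the squared distance to the nearest neighbor
    best = inf
    for other in others:
        d = ((point[0] - other[0]) ** 2
             + (point[1] - other[1]) ** 2
             + (point[2] - other[2]) ** 2)
        if d < best:
            best = d
    return best


def squared_distance(a: Point, b: Point) -> float:
    """Sum of the squared coordinate differences of two points."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def squared_distance_to_nearest_neighbor(point: Point, others: list[Point]) -> float:
    """The comment from nearest_before, turned into a function name."""
    return min(squared_distance(point, other) for other in others)
```

The second version needs no comment at the call site, because the name carries what the comment used to say.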
In my experience this is one of the shortcomings of some functional programs I read. While functional languages are excellent tools to build powerful abstractions, the functional code I see is often "too dense".
If you need very little code to express the solution, there's little space for good names, which cuts out levels of abstraction that may help other people to understand your code. Of course this is easy to overcome by introducing intermediate functions that are named by the Law of Intent.
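To illustrate, here is a small sketch of the same computation written once densely and once with named intermediate functions. The invoice example, the tuple layout `(id, amount, due_day, paid)` and every function name are hypothetical and chosen only for this illustration.

```python
from functools import reduce

# Hypothetical invoice layout: (id, amount, due_day, paid)


def dense(invoices, today):
    # Correct, but the reader has to reverse-engineer the intent.
    return reduce(lambda acc, i: acc + i[1],
                  filter(lambda i: i[2] < today and not i[3], invoices), 0)


def amount(invoice):
    return invoice[1]


def is_overdue(invoice, today):
    _, _, due_day, paid = invoice
    return due_day < today and not paid


def total_overdue(invoices, today):
    # Same result as dense(), but each step now has a name.
    return sum(amount(i) for i in invoices if is_overdue(i, today))
```

Both versions return the same number; the second one spends a few more lines to give the reader two extra levels of abstraction to hold on to.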
A useful abstraction provides just enough detail to communicate intent unambiguously enough that a reader can understand the current abstraction level without knowing the details below it.
But this is an ideal goal, too, and will probably never be reached, because abstractions tend to leak through.
Without leaking details!
Have you ever experienced people referring to a web browser as "The Internet"? Many nontechnical people I know use this abstraction. Yesterday my nine-year-old nephew was visiting and the computer we used featured Firefox as its primary browser. He asked: "Where is the internet?", looking for the Internet Explorer icon he knew from home. He didn't find it and was totally stunned.
The abstractions we use can get in our way if details of the underlying concepts shine through. Joel states that abstractions are always leaky. Knowing this, we should aim to keep the leaks small and explicit.
The example Joel uses is TCP. TCP builds on and abstracts IP but has to deal with the unreliable nature of IP. Thus IP leaks into TCP as a side effect: data sent over TCP may still not arrive at all. Even so, this is a nice example of how the complex details of an unreliable transport system can be hidden by acknowledgment and retransmission algorithms. Beautiful. I know how hard programming at the IP level is. In contrast, calling Internet Explorer "The Internet" is an abstraction that just messes with your brain.
Good abstractions express the details that leak through in terms of the higher abstraction level.
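One way to picture a leak that is kept small and explicit is the toy sketch below. It is not how TCP works; it only shows a "reliable" layer that retries over an unreliable one and, when it has to give up, reports the failure in its own vocabulary. The class names, the scripted outcomes and the `max_attempts` parameter are all assumptions invented for this example.

```python
class DeliveryError(Exception):
    """Raised when the reliable layer gives up: the leak, made explicit."""


class UnreliableChannel:
    """Stand-in for a lossy lower layer. Outcomes are scripted
    (True = delivered, False = lost) so the sketch is deterministic."""

    def __init__(self, outcomes):
        self._outcomes = list(outcomes)
        self.delivered = []

    def send(self, message):
        ok = self._outcomes.pop(0) if self._outcomes else False
        if ok:
            self.delivered.append(message)
        return ok


class ReliableChannel:
    """Hides individual losses by retrying, up to max_attempts times."""

    def __init__(self, channel, max_attempts=3):
        self._channel = channel
        self._max_attempts = max_attempts

    def send(self, message):
        for _ in range(self._max_attempts):
            if self._channel.send(message):
                return
        # The lower layer's unreliability leaks through here, but as an
        # error of this layer rather than as raw lower-level details.
        raise DeliveryError(f"gave up after {self._max_attempts} attempts")
```

A caller of `ReliableChannel` never sees a single lost message, only the rare `DeliveryError`, which is the one detail it cannot be spared.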
How to find good abstractions
So what should you do if you want to create good abstractions? Here are some hints to get you started:
- Use a library that provides a solid basis. Unfortunately you still have to figure out which libraries are 'good'. A fine indicator is how long you need to learn a library.
- If you can't use a library, for example because it's in the wrong language or too expensive, then copy some ideas from the library's interfaces.
- If you can't steal your abstractions, make sure that they communicate intent unambiguously by utilizing shared knowledge without leaking details.
In memory of Flups The Hamster, who died on the 18th of March 2007 at the Methuselah-like age of 2 years and 2 months.