Abstraction

Abstraction is a powerful word. The idea of abstraction has taken on a form that is heralded as something like the acme of understanding – the ability to logically or intuitively strip away “redundant” details from any subject matter so that the core, the essence, remains.

Much has been written about the debate between more and less abstraction: mostly about the benefits, some about the shortcomings. People have even tried to suggest how much is “just right”. I shall not get into that. What I would like to say is this: abstraction without foundation is often counter-productive, and it impedes understanding.

Take, for instance, reading a Shakespearean classic, once in the form of the original text as Shakespeare himself wrote it, and once as an abridged version. While the analogy is a bit of a stretch, most of us will agree that it is immediately obvious which version is easier to comprehend – to get the gist of, to come to some level of appreciation of. But it is also abundantly clear that the abridged version loses much of the original flavour – the language and the nuances that the master himself intended his audience to appreciate.

Unfortunately, in the field of computer science, that clarity is often overlooked. In the quest to make the machine (the computer) simple and comfortable, and some would even say intuitive, we have abstracted away our own fundamental creation. We often hear the saying, in some form or another, that “computers will one day be so unobtrusive that they will simply be a part of our lives”. That is perfectly fine in the context of the user. However, that warm and fuzzy feeling does not and cannot pertain to practitioners of the art of computing – those who claim mastery and expert-level status in this domain. The reason is simple. If the “experts” in computing are those who merely understand the abstract form of the machine, how will they be different from someone who claims to be a master of the classics but has only read the abridged version of “A Midsummer Night’s Dream”? When one becomes so comfortable with the abstracted form of the machine that one no longer understands how the machine fundamentally works, nor its quirks, assumptions and failings, one can no longer profess to be the creator, the controller, and indeed, the programmer of the machine.

This is clearly illustrated by the advent of high(er)-level programming languages. Each successive generation of languages seeks to push the machine implementation into deeper and deeper recesses, with the idea of bringing “programming to the masses”. I prefer the phrase “programming to the uninitiated”. What happens is that many computer experts no longer truly recognize the machine they are controlling, and each generation understands it less. Too many speak of objects and classes as though these are concepts the computer natively understands, but it does not. We talk of types (int, char, bool) as though the machine were designed to comprehend them, but it was not. Yet we assume it was. And those are just two minute examples.
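To make that concrete, here is a minimal sketch in C. The particular bit pattern, the 32-bit int and the little-endian byte order are assumptions chosen purely for illustration; the point is only that the same bytes in memory can be read as an int or as characters, and the machine itself holds no opinion on which reading is “correct”, because it has no notion of a type at all.

```c
#include <stdio.h>
#include <string.h>

/* The machine stores only bytes. "int", "char" and "bool" are
 * interpretations the compiler imposes; the hardware never sees a type. */
int main(void)
{
    int n = 1145258561;               /* one particular bit pattern (assumes 32-bit int) */
    char letters[sizeof n + 1] = {0}; /* room for the bytes plus a terminating zero      */

    /* Copy the raw bytes of the int and reinterpret them as characters. */
    memcpy(letters, &n, sizeof n);

    printf("as an int : %d\n", n);
    printf("as chars  : %s\n", letters); /* prints "ABCD" on a little-endian machine */
    return 0;
}
```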

Of course there are outliers, the rarefied few who doggedly pursue the fundamentals of computer science, and indeed, it is easy to spot someone like that. They are the ones people simply turn to when stumped, the ones who speak of the intricacies of the machine as though it were a natural language, and the ones their peers look up to and try to model themselves after. Oftentimes, they possess legendary status.

So in the end, what happens? Assumptions are made of the machine which are not true – assumptions which lead to poorly designed software, to insecure software, to inefficient software. The very machine that we humans designed has somehow been contorted into a much more intelligent machine than it truly is. That should not be a shocker. We were told at the very beginning of learning computer science that the machine is dumb. And yet, because of the work of those who constructed building block upon building block on that dumb machine, it somehow appears magically intelligent, as though it is capable of interacting naturally with a human, recognizing gestures, understanding your voice commands, predicting your habits, and so on. At a lower level, it is tempting to feel that the machine thinks the way you do: that because we humans have a natural tendency to “objectize” things, such as a player or mob in a game, the machine does too.

Because of that comfort, we so easily forget that that “intelligence” is built by having the (high-level) programming language imbue those ideas into the code it generates, allowing us to think in terms of constructs that those before us came up with. However, the danger is that we, as humans, are not attuned to constructing a mental picture of a dumb machine, and that problem is compounded by the fact that we are taught (other than that first lesson telling you the machine is dumb) that the machine is indeed smart. Not directly, but through regularly reinforced implications. How, you may ask? By the way we first dip our toes into programming. Almost any instructor will teach “Programming 101” by leveraging the ease of languages such as Java, C# or Python. They will teach how to manipulate strings, and they will do it without telling you how that string even came to exist.
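For the curious, here is roughly what a string is before any of those layers are stacked on top. This is a hedged sketch, not any particular language’s or library’s implementation: consecutive bytes in memory, with a zero byte marking the end purely by convention.

```c
#include <stdio.h>

/* The machine has no string type; every higher-level string routine
 * ultimately walks bytes like this. */
int main(void)
{
    char greeting[] = { 'H', 'i', '!', '\0' };  /* what the literal "Hi!" boils down to */

    /* Step through the bytes ourselves until the terminating zero. */
    for (const char *p = greeting; *p != '\0'; ++p)
        printf("byte at %p : 0x%02x ('%c')\n",
               (const void *)p, (unsigned char)*p, *p);

    return 0;
}
```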

Simply consider this – how often do you hear a Java or Python programmer speak of strings as though the computer understands what a string is? How often do you hear a C++ programmer speak of run-time resolution of functions (virtual functions) as though the computer just “gets it”? How often do you hear even a C programmer speak of strcpy() as though it were a built-in function of the machine? Yes, those of you with a strong understanding may reason that in this last instance it does come quite close, but consider it in the context of the larger argument. Or perhaps consider this – which university-level course teaches programming by starting with data representation and assembly language?
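Neither of those, of course, is something the machine natively provides. The sketch below is a rough, hand-rolled approximation of what they boil down to: strcpy() as a loop that copies bytes until it hits the terminator, and a “virtual function” as nothing more than an ordinary function pointer that is followed at the moment of the call. The names my_strcpy, struct shape and unit_square_area are invented for the example; real library code and compiler output will differ.

```c
#include <stdio.h>

/* strcpy, stripped of everything: copy bytes until the zero byte. */
static char *my_strcpy(char *dst, const char *src)
{
    char *ret = dst;
    while ((*dst++ = *src++) != '\0')
        ;                                /* nothing else is happening */
    return ret;
}

/* A "virtual function", by hand: run-time dispatch is just an extra
 * pointer that is followed when the call is made. */
struct shape {
    double (*area)(const struct shape *self);   /* a one-entry vtable */
};

static double unit_square_area(const struct shape *self)
{
    (void)self;                          /* this shape carries no data */
    return 1.0;
}

int main(void)
{
    char buf[16];
    my_strcpy(buf, "hello");
    printf("%s\n", buf);

    struct shape s = { unit_square_area };
    printf("%.1f\n", s.area(&s));        /* resolved through the pointer at run time */
    return 0;
}
```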

And then we wonder why we just wrote code that caused that integer overflow.
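In case that closing jab seems abstract in itself, here is the sort of thing I mean, in a few lines of C. The machine adds fixed-width bit patterns and nothing more; the output below is simply what falls out when the pattern runs out of room.

```c
#include <limits.h>
#include <stdio.h>

/* A fixed-width register has no concept of "running out of numbers".
 * Unsigned arithmetic in C wraps around; the analogous signed overflow
 * is undefined behaviour, which is precisely the kind of detail the
 * abstraction hides. */
int main(void)
{
    unsigned int u = UINT_MAX;

    printf("UINT_MAX     : %u\n", u);
    printf("UINT_MAX + 1 : %u\n", u + 1u);   /* wraps around to 0 */
    return 0;
}
```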