Consider the three operating system families we are forced to choose from: Windows, Apple, Other (which I shall refer to as “Linux” despite it technically being more specific).
All of these are built around the same foundational concepts, those of Unix.
Android sits atop the Linux kernel, with iOS (as well as Mac OS) atop Darwin (dunno about Windows Phone, but it doesn’t matter). Linux and Darwin are both descendants of Unix. Windows also bears many similarities to Unix, possibly by descent from it, but whether there is a direct dependency does not affect the point. To me, the key effects of this monopoly are as follows.
1. The interface to OS services is stuck at the “portable assembly” level.
Let us note that Windows, Unix, and hence Linux and Mac, are written in C.
I should clarify: I do know that the application software of all systems is developed using a great variety of languages and platforms. That is not what I am talking about.
It’s also true that much of the Windows OS is written in C++, and Mac/iOS in Objective-C. I won’t talk about Objective-C, as what little I know about it (objects and messaging rather than memory and subroutines) suggests that it does not represent computing quite so outdated as C and C++. And as for C++, well, forgive me if I draw attention to it being largely a superset of C, and that while COM may be considered a part of Windows, the kernel along with the vast majority of system libraries show no evidence of C++.
Finally, because these are operating systems, it’s entirely true that there was “some assembly required”. I would point out that assembly is at least honest about its nature; assembly is low-level, simple as that.
But for some reason, C is still often referred to as a “high-level language”, and even if it is acknowledged as being effectively portable assembly, there’s no law saying there cannot be advances in low-level languages as well as high-level ones. (On this note, assembly is a rather special case, as it is intimately tied to the processor architecture and hence cannot advance faster than the hardware itself.)
Now: Unix is an operating system that was developed in the 1970s. Its basic ideas thus reflect computing as it was in the 1970s. This should come as no surprise.
C was, in fact, developed specifically for Unix, as an alternative to writing system software in non-portable assembly. Hence, it also reflects computing as it was in the early 1970s. Again, this should not be surprising.
And you probably know this already, but computing in the 1970s was a radically different place than it is today. I am always amazed by the constraints imposed upon programmers in past decades and how they coped, compared to what we have now and what we could do with it.
The main constraints were time-based (processor, cache and memory performance) and capacity-based (size of memory). For example, the creators of Unix were dealing with machines having a mere 8K words of memory, many orders of magnitude less than today’s computers. Just read The Development Of The C Language and you’ll get the picture.
In addition to the many features that would simply make life easier (nested functions, better types, proper error handling) not being available in C, the language also omits important safeguards against inevitable human error. Now, again, perhaps this was appropriate for the world of 1971, but think of Moore’s law: surely that rationale was invalid, or at least in need of updating, by the time the 1980s rolled round.
Consider the simple case of string representation. C strings are null-terminated because a length field would have cost too many bytes (see The Development Of The C Language again), even though the terminator condemns basically every string operation, even just finding the length, to an O(n) scan. A textbook definition of short-term advantage at long-term expense.
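To make the cost concrete, here is a minimal sketch of my own (nothing from the Unix sources): because the length is not stored, strlen has to rediscover it by scanning, every single time it is called.

```c
#include <stdio.h>
#include <string.h>

/* Because a C string carries no length field, strlen() must walk the
 * bytes until it finds the '\0' terminator: O(n) on every call. Used
 * as a loop condition, that scan repeats on each iteration, quietly
 * turning one linear pass into an O(n^2) one. */
size_t count_spaces(const char *s) {
    size_t count = 0;
    for (size_t i = 0; i < strlen(s); i++)  /* re-scans s every time round */
        if (s[i] == ' ')
            count++;
    return count;
}

int main(void) {
    printf("%zu\n", count_spaces("the quick brown fox"));  /* prints 3 */
    return 0;
}
```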
The main problem is that C was not only the language of the Unix system; it was also the go-to language for its application software. And such popularity continues to this day, across all of the operating systems that have been foisted upon the general public.
How many billions of dollars in damage have been caused by, say, the simple lack of bounds-checking on memory buffers? How many computer systems have been rendered unusable, their files ransomed or obliterated, by malware that found its way in through buffer overflows, courtesy of C and its standard library? The victims were all sacrificed on the altars of “Inertia” posing as “Efficiency”.
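For anyone who hasn’t seen the failure mode, here is a hedged sketch of it (the function and names are invented for illustration; the pattern itself is everywhere):

```c
#include <stdio.h>
#include <string.h>

/* The classic overflow: strcpy() has no idea how big the destination
 * is, so sufficiently long input silently writes past the end of buf
 * and over whatever happens to live next to it in memory. */
static void greet(const char *name) {
    char buf[16];
    strcpy(buf, name);   /* no bounds check: overflows if name needs more than 16 bytes */
    printf("hello, %s\n", buf);
}

int main(int argc, char **argv) {
    if (argc > 1)
        greet(argv[1]);  /* input length controlled by whoever runs the program */
    return 0;
}
```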
2. We still have to go through a paper-based interface to do anything useful.
Another peculiar feature of computing in the early 1970s was that computers didn’t have graphical displays. What they had instead was a quaint little device known as a teletype.
The teletype would connect via a phone line at something like 110 baud, and output the results sent to it by printing them on a strip of paper that the operator could then tear off and squint at.
Why do so many languages have a function called print, rather than display, write or something else? They were influenced by C. Ever wonder why C called it that? Now you know. It literally printed ink onto paper.
Of course, paper is static, so this style of interaction doesn’t really lend itself to anything that could be called visual or direct-manipulation. What it does naturally encourage is conversational, line-by-line work in which you type text on a command line and get the result immortalised on dead tree pulp:
```
$ please replace line 12 of the file I'm editing with the following line, thanks
```

(clack, beep, whirrr. 2 seconds pass. crackle of paper)

```
$ please replace line 12 of the file I'm editing with the following line, thanks
Changing line 12 from: "The quick brown fox jumped over the lazy dog" to:
$ repeat after me. ed is the standard text editor for Unix
```

(clack, beep, hummmmm. 5 seconds later…)

```
$ please replace line 12 of the file I'm editing with the following line, thanks
Changing line 12 from: "The quick brown fox jumped over the lazy dog" to:
$ repeat after me. ed is the standard text editor for Unix
$ well, did it work?
```

(clack, beep, whirrrrr. 2 seconds later…)

```
$ please replace line 12 of the file I'm editing with the following line, thanks
Changing line 12 from: "The quick brown fox jumped over the lazy dog" to:
$ repeat after me. ed is the standard text editor for Unix
$ well, did it work?
Line 12: repeat after me. ed is the standard text editor for Unix
$ phew.
```
You get the idea. It’s a world in which your mental tools of visualisation and troubleshooting reign supreme. Thank goodness we’ve moved on since then. Right?
Oh, wait a minute, this is still the only way we can get anything done in the world of programming! Typing textual orders into a “terminal” or “console” application, which is nothing more than an emulator for the teletype! (Additionally, if you’ve ever wondered why output flows upwards and the command line sinks to the bottom, there’s your answer.)
This is why programming is full of arcane abbreviations and two-letter shell commands. Our ancestors bequeathed to us SIGSEGV, et al., because these are products of stiff typewriter keyboards that hurt their fingers and jammed all the time. Such ergonomic issues were fixed long ago, yet the old habits die hard, and persist.
(edit: it may also have been related to that 8K of memory. As in: the variable, constant and function names take up memory, so better make them as short as possible!)
3. “Everything is a file”
Let’s look at this fundamental pillar of the Unix philosophy. What, exactly, is a file?
“A stream of bytes”? Really? What sort of a computing abstraction is “everything is a stream of bytes”??
First of all, let’s just realise that handling bytes is an optimisation over handling bits, as is handling words of any size. So the actual content of this is reduced to: “everything is a stream of bits”.
And last time I checked, everything in computing boils down to bits in the end. So half of it clearly contributes nothing more than what we already know.
In fact, the only part of this wonderful idea that actually brings anything new to the table is the emphasis on stream. A data “stream” generalises a data “block” in that the latter has definite, finite size while the former may not. So, general I/O can be treated in the same way as blocks of persistent store. Practically, it also suggests an interface that performs buffering internally, rather than burdening the client with such tasks. But if that’s all, it does seem a bit over-the-top to enshrine it as some sort of brilliant innovation that makes Unix special.
The only other useful aspect is one that ought to be made explicit in the phrase: “Everything is a named stream of bits”. Not only do we have bitstreams, but instead of being identified by numbers (well, except sometimes in the shell…) we can use (comparatively) human-readable names!
The horror! Identifying bitstreams with strings?? But won’t that be terribly inefficient, for such a common operation? You know what, I’m actually glad this is a cornerstone of the Unix philosophy, otherwise it might have been optimised away.
Anyway, once you supply Unix with a name, it hands back to you a stream of bits. Now, despite the fact that this is pretty indisputably the lowest-level picture you could possibly get of anything, at least we can build on top of it. After all, isn’t computing all about structuring the vast swathes of bits that make up a computer’s memory, so we can work with them easily?
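In code, the picture is disarmingly literal. A minimal sketch (the pathnames are assumptions for the sake of the example): the very same calls read from a disk file and from a device.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* "Everything is a named stream of bits": hand Unix a name, get back
 * a file descriptor, read bytes. The same open()/read() calls work on
 * a disk file and a device alike. (Paths are examples only.) */
int main(void) {
    const char *names[] = { "/etc/hostname", "/dev/urandom" };
    char buf[16];
    for (int i = 0; i < 2; i++) {
        int fd = open(names[i], O_RDONLY);
        if (fd < 0) { perror(names[i]); continue; }
        ssize_t n = read(fd, buf, sizeof buf);  /* same call, either source */
        printf("%s: read %zd bytes\n", names[i], n);
        close(fd);
    }
    return 0;
}
```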
But in Unix land, this is a taboo. Binary files are opaque, say the Unix ideologues. They are hard to read and write. Instead, we use Text Files, for it is surely the path of true righteousness we have taken.
4. An obsession with unstructured text.
Allow me to investigate why binary files are hard to read and write. But first, what even is a “binary” file?
Such a term would ordinarily be a tautology, since, as we just learned, a file is a stream of bits—and something that consists of bits is definitely binary. Spot the difference:
FF D8 FF E0 00 10 4A 46 49 46 00 01 02 01 00 48 00 48 00 00 FF E1
43 65 63 69 20 6E 27 65 73 74 20 70 61 73 20 75 6E 20 62 69 6E 61 72 79 20 66 69 6C 65
In this context, the concept of a “binary file” is only given meaning by being “not a text file”. So what is a text file? A text file is just a stream where each byte is an ASCII character, representing unformatted text, in a language using the Latin alphabet, letter-by-letter.
The above two files both look like fluent gibberish, so part of the job of a text editor is to recognise structures corresponding to displayable characters and show the shapes on screen as text.
And obviously, if you try and do that to a “binary file”, you will still end up with gibberish, because it didn’t conform to the text-file format in the first place.
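To drive the point home, here is a tiny sketch of my own that takes the first four bytes of the second “file” above and shows them both ways; the “difference” lives entirely in the reading program.

```c
#include <stdio.h>

/* The same four bytes (the start of the second "file" above), shown
 * two ways. Whether they are "binary" or "text" is purely a matter of
 * how the reader chooses to interpret them. */
int main(void) {
    const unsigned char bytes[] = { 0x43, 0x65, 0x63, 0x69 };
    for (size_t i = 0; i < sizeof bytes; i++)
        printf("%02X ", bytes[i]);  /* as a hex dump: 43 65 63 69 */
    putchar('\n');
    for (size_t i = 0; i < sizeof bytes; i++)
        putchar(bytes[i]);          /* as "text": Ceci */
    putchar('\n');
    return 0;
}
```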
Recall what happened: we decided to represent characters of text, including punctuation and whitespace, using fixed numbers (a practical, though rather dubious, decision; again, a product of early computing). We then developed tools to display said text and modify it using input devices.
And then we complained that binary files are hard to read and write, using these tools. Well pardon my French, but no shit, Sherlock!
And so, the entirety of the Unix text-over-binary preference is thus revealed to be one big circle. We should use text files instead of binary files, because binary files are hard to read and write with text editors. We make text editors because there are so many text files. And voilà—a recipe for unchallenged mediocrity fixed, by ubiquity, at the very foundation of computer programming well into the 21st century.
The big problem with text-as-ASCII (or Unicode, or whatever) is that it only has structure for the humans who read it, not the computer. To the computer it is still just a long list of numbers; barely any more structured than a stream of bits. So, in order to do anything with it (such as compile source code), it needs to be parsed.
Now, this is always going to be necessary for anything resembling language, as at some level there are always lists of ‘things’—letters, words, clauses, sentences, paragraphs—that can be structured by humans.
But the result of parsing is an abstract syntax tree, or AST, which is a “binary” structure amenable to computer processing. Even though, if we really wanted, we could represent ASTs as text, this does not seem to have caught on, or even to have been considered. Instead, unstructured text is, de facto, the only format for text on persistent storage.
Thus, it is necessary to parse text into an AST in memory to do anything with it, and then serialise this back into text if you wish to keep whatever transformations you’ve made. So parsing must happen every time one wishes to use a file; the results are simply thrown away when the program finishes (see the next bit on processes), instead of being cached for next time, or for other programs that don’t want to duplicate the work.
Unix culture, and programming in general, seems to thrive on inventing new text file languages for every conceivable purpose (e.g. configuration files). Each of these, of course, then needs people to spend their time writing parsers and serialisers for it, in every programming language that wants to read it. This is an enormous waste of time and resources that could be better spent not propping up optimisations from half a century ago.
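The treadmill in miniature (filename and format invented for illustration): every run re-parses the same text into memory, does a trivial bit of actual work, then serialises everything straight back out, and the parsed structure is thrown away at exit.

```c
#include <stdio.h>

/* The parse/serialise ritual: read a "config" file into an in-memory
 * value, do the work, write the text back out. Every run repeats the
 * parse from scratch; nothing is cached for next time. (Filename and
 * format are made up for the example.) */
int main(void) {
    long counter = 0;
    FILE *in = fopen("counter.txt", "r");
    if (in) {
        if (fscanf(in, "counter = %ld", &counter) != 1)  /* text -> in-memory value */
            counter = 0;
        fclose(in);
    }
    counter++;                                 /* the actual work: one increment */
    FILE *out = fopen("counter.txt", "w");
    if (!out) { perror("counter.txt"); return 1; }
    fprintf(out, "counter = %ld\n", counter);  /* in-memory value -> text again */
    fclose(out);
    return 0;
}
```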
The tragedy is that, because all of this is so ingrained as the natural way to do things that has “worked so far”, the virus perpetuates itself, becoming ever more standard, until I probably sound like a crazy person for criticising programming-as-language and text. “What else could it be?? Have you seen the efforts made to do ‘visual’ programming? They all suck!”
I don’t disagree, and I will sketch out an alternative in a future post. For now, see Bret Victor’s Learnable Programming. But even if I had no idea, our efforts should be spent on at least having a go at thinking outside this box, because it’s a big one. Unstructured text is not the only way even to do language, let alone software construction. The next time you find yourself attributing a problem to your programming language or even programming in general, just ask yourself whether it might have anything to do with your editor or environment, especially if they are a text editor and the shell.
5. Processes, applications and debugging kinda suck.
The concept of processes as virtual address spaces and processors is an interesting one that is at least as old as Unix.
By dividing memory into virtual address spaces, we separate all the simultaneous threads of execution into little groups, each of whose members can only stomp on the memory of the other members of their own group. But the division of work into processes seems very coarse-grained; one widely replicated practice is having roughly one process per end-user application, although I do not know how much this is the case with mobile operating systems.
Processes are a bit like computer technicians: they arrive where called, set up their equipment, perform their task and then dismantle their equipment, pack it away and drive home. If they want to leave the place in any different state than they found it then they must either stay forever, or explicitly persist stuff to permanent storage. Otherwise, by default, any work the process does is thrown away.
As I mentioned before, some sort of translator program like a compiler must first parse its source data into an internal AST, process it, and then serialise it out to disk. If the source file is changed even slightly, then the translator parses the entire thing again, does mostly the same things to the AST, and serialises the entire thing out again. But not only this: all programs contain “initialisation” and “cleanup” code.
Bret Victor’s term “destroy-the-world programming” seems apt here (even though he means something slightly different), because each process spends lots of time constructing a world in which it can do its job, before tearing it all down again when it’s finished for the time being. An average shell session, where each command more or less spawns a process, involves constant building-up and throwing-away of processes, often redundantly.
In a text editor, the edited portion of text is stored in the process’ virtual memory. It must take explicit steps to serialise this out for the user to keep their changes. If something goes wrong, say the editor accesses invalid memory, then the entire process is killed, along with the user’s data that was not manually persisted. This is because there is usually not enough information for a human to figure out the cause of the problem, patch it, and continue.
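A toy illustration (everything here is invented for the example): the document exists only in process memory, and one stray pointer takes the whole thing down before the explicit persistence step is ever reached.

```c
#include <stdio.h>

/* A toy "editor": the user's text lives only in process memory. One
 * invalid access and the OS kills the entire process, unsaved buffer
 * and all. (Deliberately broken, for illustration.) */
int main(void) {
    char document[] = "hours of unsaved work";
    char *oops = NULL;
    printf("editing: %s\n", document);
    *oops = 'x';   /* SIGSEGV: the whole process dies right here */
    /* never reached: the explicit step that would have saved the data */
    FILE *f = fopen("document.txt", "w");
    if (f) { fputs(document, f); fclose(f); }
    return 0;
}
```

What we would want instead is for execution to pause at that line, the culprit to be identified and patched, and the process to carry on.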
For that is known as debugging. And for it to be a human-friendly experience, it needs to have access to debugging information: namely, those parts of the program’s description that are not necessary for the computer (or humans) to blindly follow instructions, yet are essential for human beings to understand what the program’s actually doing, past what its user interface tells them.
Debugging information is mainly what the creator of the program himself used to understand it. It largely consists of the names and properties of structures in the source code. But these, we are told, add unnecessary bloat and performance overhead to what must, logically, be un-bloated and elegant machine-language, and as such it is established practice to remove such information as a normal part of the distribution of the program to its users. Again, because efficiency.
And because no mortal human would ever be prepared to make a sudden switch from text editing to literally tracing the execution of machine instructions on pure, un-annotated memory, there is little point in offering the option. Instead, the OS pulls out a gun and shoots the process in the head, leaving the user to wonder faintly, as the blood pools around them, “why on earth did that happen? If only I knew, I could fix it or tell the author” before they snap back to sense, thinking “Ah well. Those computers again. Sometimes they just break, y’know?”
But even if debugging information were retained, all of the other issues I have described come into play to make it utter torture.
First is the misanthropy of many programming languages. Even with debugging information, a 3D game written in C would be impossible to debug because of the sheer number of internal high-level concepts it would have to invent on top of the language, which suggests that the developer should build a debugger into the game itself. This is a step in the right direction, but would just require extra work to ensure it wasn’t compiled away, and would probably be very specific to the game, hence requiring duplicated effort across the board.
Even if we are debugging higher-level languages, we have the use of the telety– ahem I mean terminal. Terminal-based debuggers suffer from interactivity, text, and visualisation problems that perhaps bite harder in debugging than other areas. And remember, every terminal application is its own little language (or not so little as the case may be) to learn and get used to. Further, there will be a separate debugger for each programming language, and they are all likely to be terminal applications, compounding the problem.
And even if we debug with GUIs, which are actually usable, we run into the limitations of the GUI!
6. The GUI reflects the environment that produced it.
In all mainstream systems in use, the GUI is little more than a thin veneer over whatever broken programming model we used to make it. If we’re talking about the system-wide GUIs of, say, Windows or Linux, then surprise surprise, they’re going to have been written in C, and thus provide no more functionality than what the library writers could originally envision.
But this also holds in most cases of GUIs not written in C. They’re just graphical versions of the terminal; inflexible, customisable only as far as the developers saw to explicitly design for. It’s no wonder that many either include their own teletype interface for a scripting language, or even just embed the shell itself.
If you want to increase the font size of some window, then the application designer must have already explicitly accounted for this. If they didn’t, then all you have is a big block of bytes that you know must use other blocks of bytes, but what for and where and why is completely unknown and opaque, because any structure that is unnecessary for the computer has been optimised into oblivion. And if this is the debugger’s GUI we’re talking about, then you’d need to debug it with another debugger to have any hope of getting what you want! And even if you succeeded in changing the process at runtime, getting those changes to persist past the “end of the world” is another matter entirely.
It’s all because GUI applications, like command-line ones, are designed like so: anticipate the specific things that the user might want to do, build just enough flexibility to allow them this, and then compile everything else away into bytes. Put this way, we see that many limitations attributed to GUIs are not in fact a necessary consequence of graphics or direct manipulation. They run far deeper than that.
Now, I’m no expert in human-computer interaction or the history of the GUI. But see what the inventor of the GUI (the original concepts, not what we are stuck with today) has to say about it. Just read the 2006 STEPS proposal, look at all these talks, to see what the GUI, and computing in general, could be like.
I hope that one day, it is like that. I certainly hope that, in twenty years’ time, Unix and its influence will have been forever banished to the history books, where it should already live. Or, all our other options will have been exhausted and found to be worse. But I’ll let you be the judge of how likely that is.
My aim in all this is not to say that we haven’t managed to do great things with computers or Unix, or that everyone is stupid except me who’s a genius. I certainly don’t want to give the impression that I know how to solve all these problems, or that I have all the answers, or know what to do—although I do have some ideas I’d like to try. I suppose my intention is to thoroughly demolish the conservative attitude that can crop up whenever we complain about problems that aren’t going away by themselves—that easy, thoughtless conclusion of “well, they haven’t been solved yet, so maybe they’re inevitable”.
It may be tempting to retort with what Fred Brooks made famous: there is no silver bullet. But to that, I leave you with Bret Victor, who responds:
Maybe we don’t need a silver bullet. We just need to take off our blindfolds to see where we’re firing.
(edit: After re-watching Bret Victor’s Inventing On Principle yet again, I realised I probably got this spark about terminals from the sequence starting at this point in the video. As well as the spark about text, although much of that comes from personal experience having to parse or serialise yet another csv, yet another json, yet another bloody “config” file.
Coming soon to a blog near you: “On the notion of ‘Configuration’ “.)