Notes infodump: programming

The primary aim of expository work should not be to build up 100% watertight proofs or refutations of things we suspect are true. Instead the real value is in giving people ideas: writing material that naturally suggests and brings out certain lines of inquiry. The hard parts, filling in the details, are also the least urgent (unless, maybe, you have a person to convince). But they can always be left till later 🙂

Programming

The sensible way to allocate limited R&D

  • You know the best piece of advice I learned in CS? It was in a course on microprocessor architecture; we were told:
  • “Optimise for the common case!”

  • Along with an explanation of Amdahl’s Law. (A quick illustration follows this list.)
  • Of course, if we really took this seriously, then we would realise how utterly pointless it is to come up with new “features”, libraries or even programming languages when it is the ecosystems and environments that we are constantly fighting!
  • People think this principle is “about” hardware design. This could not be further from the truth…
  • What is the commonest of common cases in our interaction with software? The area where even a small improvement would have the biggest ramifications?
  • How about all the time wasted imagining and Greenspunning and “playing computer”, when we could be seeing and experimenting and doing?
  • (The solution you seek, is not to be found in yet another programming language.)
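
A quick illustration of what that advice plus Amdahl’s Law implies; the percentages and speedups below are invented, purely to show the shape of the argument.

```typescript
// Amdahl's Law: overall speedup from speeding up a fraction p of the
// workload by a factor s.
function amdahl(p: number, s: number): number {
  return 1 / ((1 - p) + p / s);
}

// Invented numbers: a 10x improvement to a rare case (5% of the time)
// barely registers, while a modest 2x on the common case (80%) pays off.
console.log(amdahl(0.05, 10)); // ≈ 1.05
console.log(amdahl(0.80, 2));  // ≈ 1.67
```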

Partial evaluation, code generation, constant / variable, form / content

  • (x ↦ y ↦ x + y) 2 ⟶ y ↦ 2 + y. (A concrete sketch of this, in code, closes this section.)
  • Take this further and apply it elsewhere.
  • Consider the instruction: Treat box #452 as the address of another box; put the value 23 into this box. i.e. MEM[MEM[452]] <- 23
  • The effect of this instruction is different for different contents of box #452.
  • Thus, I want to say that the instruction itself is spread over more than one place. Box #452 is part of the instruction.
  • However, it might be misleading to say that box #452 controls the effect of the instruction. Because it only varies the behaviour along a very narrow range — namely, which box will receive the value 23. No matter how hard we try, box #452 cannot turn this instruction into a conditional jump.
  • It seems natural to call the constant or static part of the instruction (its properties) its shape, and the run-time, dynamically-determined part… something else. The “variable” part, say. And the real value of this analysis will show when it is used to study scales larger than the individual “low-level machine instruction”.
  • I think the “code” vs “data” thing is a relative, er, relation.
  • For a start, something is not “data” in some absolute sense, but relative to some other given “data”, which we call the “code” in this relation.
  • I think the key is this: if code and data changed at the same rate, then we wouldn’t be able to tell which is which.
  • So at least in a relative sense, “code” is fixed and unchanging. The only mistake made here — though it is a big one — is to insist that code be fixed on an absolute scale — such as “runtime of a UNIX process”.
  • Some examples:
  • M[452] <- 23. Same effect every time. Effectively a description of a word of memory. Say for example that this instruction is encoded as the contiguous bytes AB CD EF; then the potential areas of system “state” (i.e. memory) that could affect this instruction are limited to those three bytes.
  • M[M[452]] <- 23. I want to say this instruction “lives” at possibly non-contiguous memory locations. The “shape” lives in one area, and there is memory location #452 which we expect to change arbitrarily.
  • M[M[M[452]]] <- 23. Similarly to the last, but another layer of indirection has been added. Location #452 is part of the variable part of the instruction. But the specific run-time contents thereof — that is to say, the contents the next time the instruction is executed (not just “when the UNIX process is started up” or anything silly like that) — determine another variable part of the instruction — and we don’t know where that is except at instruction execution time.
  • It does make sense to have (relatively) fixed instruction shapes along with variable data or parameters. Because often we want to do the same, or a very similar, thing, to different data.
  • As the “constant” “shape” gets simpler and simpler, the system as a whole — perhaps unsurprisingly — gets harder to reason about. More things become “emergent”. Cf. cellular automata like the Game of Life or Rule 30.
  • This is all worth considering in the world of dispatch.
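
A minimal sketch of the above in TypeScript; the numbers and the memory model are mine, purely illustrative. Partially applying the function bakes the constant into the residual “shape”, and the indirect store shows how part of an instruction can live in ordinary mutable memory.

```typescript
// (x ↦ y ↦ x + y) applied to 2 yields the residual function y ↦ 2 + y.
const add = (x: number) => (y: number) => x + y;
const add2 = add(2);   // the constant part, 2, is now baked into the "shape"
console.log(add2(40)); // 42: only the "variable" part remains to be supplied

// The indirect store, with memory modelled as a plain array. The instruction's
// shape ("store 23 through the pointer in box #452") is fixed; box #452 is the
// variable part that decides which box actually receives the value.
const MEM: number[] = new Array(1024).fill(0);
MEM[452] = 700;        // run-time contents of box #452
MEM[MEM[452]] = 23;    // same shape, different effect whenever MEM[452] changes
console.log(MEM[700]); // 23
```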

Dispatch

  • Procedural programming: We turn the name foo into the address of some constant executable code structure. The behaviour of this is then affected by the arguments.
  • Procedural-OOP: The “this” pointer is one such argument: there is still a constant code-structure identified by the name foo, and the “this” argument is simply another argument that parameterises this constant code structure.
  • Traditional OOP: A.k.a. single-dispatch. The executable code structure is no longer obtained from the name foo alone. Instead, the “receiver” object is also used in the lookup. We map the pair (receiver, foo) to a fixed executable code structure, which then has access to the arguments, including “this”.
  • Multiple dispatch: generalising, the pattern is to use some of the arguments to obtain a “method implementation”, i.e. a fixed shape of executable code — the “dispatch” — while the rest merely parameterise it and do not contribute to selecting a different code structure.
  • Obviously these can all simulate each other in a slightly yucky way. This involves turning the dynamic associative-array data structure used to implement method lookup into a static associative array written out as code: say, a switch/case or if/else block. (Sketched below.)
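
A rough sketch of the lookup structures described above, with invented names and string keys standing in for real class/type objects.

```typescript
// Single dispatch: (receiver's class, name) -> fixed executable code structure.
const singleDispatch = new Map<string, () => string>([
  ["Circle.area", () => "πr²"],
  ["Square.area", () => "s²"],
]);

// Multiple dispatch: a tuple of argument "types" -> fixed code structure.
const multipleDispatch = new Map<string, () => string>([
  ["collide(Ship,Asteroid)", () => "explode"],
  ["collide(Ship,Ship)", () => "bounce"],
]);

// The "yucky" simulation mentioned above amounts to freezing one of these
// dynamic associative arrays into a switch/case inside a single procedure.
console.log(singleDispatch.get("Circle.area")!());          // "πr²"
console.log(multipleDispatch.get("collide(Ship,Ship)")!()); // "bounce"
```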

“Data structures” and “Databases” are just dual implementations of knowledge rep

  • Fundamental building-block is the assertion: object o has field k set to value v.
  • o[k] = v. Set one of these as “unknown” to obtain a question to answer
  • ?[k] = v. Which object(s) have field k equal to v? A.k.a: “select” query. Pre-compute index on k, e.g. by exploiting some ordering on the values it can take, to efficiently answer this question.
  • o[?] = v. Which fields in object o have the value v? Not really used.
  • o[k] = ?. Object o has what value for field k? Group key/value pairs by object to answer this efficiently. A.k.a: “data structure”. (Both views are sketched in code at the end of this list.)
  • A static “schema” is simply a half-baked type system; type systems are in turn nothing more than half-baked expert systems. It follows that schemas are only 1/4-baked.
  • Hmm, on second thoughts — schemas seem to have the power that many type systems ought to have. E.g. primary and foreign key constraints, other constraints between things, not null, etc…
  • Maybe one day, the Haskellers will realise that their quest to guarantee more and more things about the entire life of their running programs, must end only with them coding many useless programs guaranteed to produce a fixed output, each of which is to be used in a slightly different situation.
  • Types and schemas (i.e. simple systems of logical inference, i.e. primitive forms of automated reasoning and constraint maintenance, i.e. early-bound AI) have to be “run-time” constructs if they are to scale.
  • The stark difference between “compile-time” and “run-time” must disappear, because the concept of “applications” must disappear, along with all the other historical cruft of Unix.
  • Unix, then, has produced its own gravediggers… no, stop.
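
A tiny sketch of the duality, assuming nothing beyond plain associative arrays (the objects and fields are invented): the same set of o[k] = v assertions is indexed two ways, one answering ?[k] = v and one answering o[k] = ?.

```typescript
// The fundamental assertion: object o has field k set to value v.
type Assertion = { o: string; k: string; v: string };

const facts: Assertion[] = [
  { o: "alice", k: "city", v: "Leeds" },
  { o: "bob",   k: "city", v: "York" },
  { o: "alice", k: "role", v: "admin" },
];

// ?[k] = v : which objects have field k equal to v? (the database index)
const byKV = new Map<string, string[]>();
for (const { o, k, v } of facts) {
  const key = `${k}=${v}`;
  byKV.set(key, [...(byKV.get(key) ?? []), o]);
}

// o[k] = ? : what value does object o have for field k? (the data structure)
const byObject = new Map<string, Map<string, string>>();
for (const { o, k, v } of facts) {
  if (!byObject.has(o)) byObject.set(o, new Map());
  byObject.get(o)!.set(k, v);
}

console.log(byKV.get("city=Leeds"));             // ["alice"]
console.log(byObject.get("alice")?.get("role")); // "admin"
```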

This knowledge rep is really nothing more than [extensional] functions.

  • I am using the word “function” here in the mathematical sense, i.e. as simply a “mapping” from some “inputs” to some “outputs”.
  • An intensional definition of a function is a rule that applies to a potentially infinite set of inputs, i.e. a computer program: f(x) = sin(x) + 2.
  • An extensional definition of a function is simply a listing of the input/output pairs:
  • f(1) = 2, f(2) = 3, f(3) = 4, f(5) = 10.
  • The following different uses, between which we waste vast swathes of our time converting:
    • Record r has column c equal to value v
    • Object o has field k set to value v
    • Directory d contains a file called f with contents c
    • Module m has a sub-module called n, which has a class called c, which has a method called f
    • DOM node n has attributes a=A and b=B and children, the first of which is c
    • HTML, JSON, XML, YAML, …
  • etc. are all just clumsy special cases of something that really ought to be implemented once and for all, i.e. extensionally specified functions.
  • By the way — this is fully compatible with cyclic reference graphs, so do not be fooled by certain implementations e.g. hierarchical tree / DAG filesystems.
  • One huge benefit of JavaScript and Lua is their low concrete-syntax cost for this ubiquitous concept of the associative array (which is available as a “data structure” in most languages).
  • But associative arrays aren’t enough. There are many things that extensional definitions cannot account for.
  • Such as default values: f(1) = 2, f(2) = 3, and f(anything else) = 0.
  • Or indeed any property of inputs that cannot be finitely enumerated — i.e. a property which needs some rule / computer program to test it, that doesn’t just involve looking it up in a list. E.g: f(odd number) = 1, f(even number) = 5.
  • If we start with an intensional definition e.g. f(x) = sin(x) + 3, then it’s all well and good until we want to make little changes, or specify exceptions, to the general rule. E.g: f(x) = sin(x) + 3, unless x = 52.5 in which case it is 0.
  • Break functions down into a finite set of programs or rules, i.e. into an extension of intensions. This is quite possibly the principle behind dispatch.
  • There is an uneasy sense that simply calling for a single implementation of extensionally-specified functions will not be enough. Can I add intensions to it somehow, without the whole project collapsing into “everything is a Turing machine”??
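
One possible, non-committal shape for an “extension of intensions”, sketched in TypeScript (the representation and names are mine): a function is an ordered list of (guard, rule) pairs, so finite exceptions are just guards that test for one specific input, and looking up the first matching clause is the dispatch.

```typescript
// An "extension of intensions": an ordered list of (guard, rule) pairs.
// Finite exceptions are just guards that test for one specific input.
type Clause = { when: (x: number) => boolean; then: (x: number) => number };

const clauses: Clause[] = [
  { when: x => x === 52.5,  then: _ => 0 },               // the exception
  { when: x => x % 2 === 1, then: _ => 1 },               // f(odd) = 1
  { when: x => x % 2 === 0, then: _ => 5 },               // f(even) = 5
  { when: _ => true,        then: x => Math.sin(x) + 3 }, // the general rule
];

function lookup(clauses: Clause[], x: number): number {
  for (const c of clauses) if (c.when(x)) return c.then(x); // first match wins
  throw new Error(`no clause matches ${x}`);
}

console.log(lookup(clauses, 52.5)); // 0
console.log(lookup(clauses, 7));    // 1
console.log(lookup(clauses, 4));    // 5
```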

Changes, deltas, dirty rectangles vs. re-rendering the entire frame

  • DRAM has to constantly be “refreshed” to work, though this is carried out by the hardware.
  • Imagine if there was a tiny area of memory that kept its values from one CPU cycle to the next. And the rest of memory has to be manually refreshed, in code, that you have to write, which lives in the tiny area.
  • Quite possibly, imperative programming would not have evolved here. Instead, we would write code to “render” all of memory to the state it should be in, every few gazillion CPU cycles. We’d use a “pure function” taking “current state of all memory” to “next state of all memory”. Functional programming in letter if not in spirit.
  • And the assignment statement representing a small “local” change to a piece of memory would not exist, except as a “higher-level” pattern on top.
  • We don’t write code to ensure a memory word is still the value it was previously (unless we think it has been changed in the meantime by another thread, say.)
  • But we do write code to completely re-render a scene, because we regularly clear the framebuffer.
  • Each of these styles gives rise to its own patterns …
  • The only reason we can write “update listener”, change-based code is because the “frame axiom” (that untouched state stays that way) is handled for us already.
  • Thus if I need to keep the box on the mouse cursor, I can write the position at regular intervals — or I can choose to write the position only on a “pointer moved” event.
  • If the pointer hasn’t moved, then there’s no point writing the same value to the box position — because we know that the contents of memory at time t always carry through to time t+1 unless explicitly changed.
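
The two styles side by side, as a minimal sketch (the box-follows-cursor example from above; all names are invented).

```typescript
type BoxState = { boxX: number; boxY: number };

// Style 1: re-render everything, every frame, from the current inputs.
function renderAll(mouseX: number, mouseY: number): BoxState {
  return { boxX: mouseX, boxY: mouseY }; // recompute the whole "frame"
}

// Style 2: rely on the frame axiom (untouched state persists) and write
// only when a "pointer moved" event arrives.
let state: BoxState = { boxX: 0, boxY: 0 };
function onPointerMoved(mouseX: number, mouseY: number): void {
  state = { boxX: mouseX, boxY: mouseY }; // a local delta; nothing else is touched
}

console.log(renderAll(3, 4)); // { boxX: 3, boxY: 4 }
onPointerMoved(3, 4);
console.log(state);           // { boxX: 3, boxY: 4 }
```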

Parsing and serialising, linearization and trees

  • The (or an) essential characteristic of “text” as we know it is that it is nothing more than an “in-order” traversal of the AST. This makes it trivial to display on a screen and edit character-by-character, and not much else.
  • An in-order traversal needs to be parsed from “left” to “right” in a way that still works even when the “type” of a node is distributed around it (e.g. parentheses).
  • And remember this: when we’re talking about sequential tasks like parsing, “left” and “right” on our screen are really “before” and “after” to the computer. Our eyes don’t scan text like machines do, at least not on the character level — possibly on the word level.
  • And now it’s clear why in-order parsing is so complex — it’s about waiting for information that hasn’t appeared yet.
  • A pre-order traversal could be much easier to parse, and may end up resembling binary tag-length-payload file formats. Perhaps that is what these are: a cousin of text files, differing in the general order of their node traversal (plus the fact that text files are limited to ASCII while such file formats tend not to be). A tiny sketch at the end of this section contrasts the two.
  • TODO: Must turn this into a Stupendously Witty Polemic:
  • Imagine a commercial graphics application that is “industry-standard” …
  • … and it’s based on poking pixels, and moving around rectangular blocks thereof.
  • (Like classic MS Paint)
  • Imagine trying to explain the concept of “vector graphics” to people who are firmly entrenched in working in this way.
  • Imagine being able to work on entire shapes or groups of shapes at once: you could drag a selection box around the circle, and the circle would be selected. The triangle behind the circle that partially intersected the selection box would be ignored. And then you could drag the circle away from the triangle.
  • “But you can already do this by just keeping shapes sufficiently far apart that their bounding boxes don’t overlap!”
  • “What else could there be? Pixels are human-visible!”
  • “And how do you ‘render’ your so-called vector graphics? By turning them into pixels — exactly what we have now! It’s all pixels in the end!”
  • And we are all so used to advertisements and magazine covers and company logos that look as if all of their shapes have invisible and non-overlapping bounding boxes around them that we forget that it’s possible for shapes to overlap, etc.
  • Except that wealthy companies do have nice graphics because they can afford to pay legions of pixel artists to get the job done by brute force.
  • Zooming is only available at fixed levels because each one has been drawn again from scratch, or simply makes the image look blocky.
  • People pay for extra zoom levels, and the quantity of different zoom levels is regarded as a “feature” above and beyond what one should normally expect.
  • And artists are paid in proportion to the pixel resolution of the images they create, and charge extra for moving individual shapes around, changing colours, etc. They even have to do this because it is so labour-intensive!
  • Well, ASCII characters in source code are the pixels, and the abstract syntax structures they are a rendering of are the trivially-transformable-and-rasterisable vector shapes.
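
To make the traversal contrast concrete, here is a toy expression tree serialised both ways. The layout is invented, only meant to suggest the tag-length-payload flavour.

```typescript
type TreeNode = { tag: string; children: TreeNode[] };

// The expression (1 + 2): in text, the "+" node's type sits between its children.
const tree: TreeNode = {
  tag: "+",
  children: [{ tag: "1", children: [] }, { tag: "2", children: [] }],
};

// In-order traversal: ordinary infix text; needs parentheses (or precedence
// rules) because the node's type is distributed around its children.
function toText(n: TreeNode): string {
  return n.children.length === 2
    ? `(${toText(n.children[0])} ${n.tag} ${toText(n.children[1])})`
    : n.tag;
}

// Pre-order traversal: tag first, then child count, then the children;
// the flavour of tag-length-payload binary formats, trivial to parse.
function toPreorder(n: TreeNode): string[] {
  return [n.tag, String(n.children.length), ...n.children.flatMap(toPreorder)];
}

console.log(toText(tree));     // "(1 + 2)"
console.log(toPreorder(tree)); // ["+", "2", "1", "0", "2", "0"]
```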

The curse of “context-free”

  • Recognised as tree / taxonomy / hierarchy obsession by some.
  • But it goes far further than that.
  • Anything that is to do with spatial containment — such as indentation or textual coding in general — suffers from the curses of nesting and context-freedom.
  • I.e. the inability to express anything that isn’t a tree
  • And the inability for a sub-tree to know about the role it plays in the grand scheme of things.
  • Life isn’t context-free. Nor are literally any programming languages. But nested text is.
  • Many ideas / constructs that are hard to express, or inexpressible, can be traced back to programming via flat text. But some problems are still not fixed by insisting instead on Abstract Syntax Trees.
  • The idea of breaking one big “routine” into multiple “sub-routines” is a good one and just an instance of “recursive design” in the world of “sequential tasks”.
  • But how does this idea manifest in programming languages? “Procedures” that cannot access information available at their call sites, and instead must be explicitly and manually passed all information as local context, i.e. “parameters” or “arguments”.
  • A true “sub-routine” would have at least potential access to the local variables of its calling context.
  • And of course we are just talking about which names to let through, and which to override / hide / whatever, which is a question concerning — surprise surprise — extensional functions!
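
A rough sketch of the contrast; TypeScript closures stand in for “access to the caller’s locals” here, and true dynamic scope or explicit environment-passing would go further.

```typescript
// The usual "procedure": all context must be passed in explicitly as arguments.
function drawLabelProc(text: string, x: number, y: number, colour: string): void {
  console.log(`draw "${text}" at (${x},${y}) in ${colour}`);
}

// A nested "sub-routine" defined at its call site simply sees the local
// context it is part of; no manual plumbing of parameters is needed.
function drawForm(): void {
  const x = 10, y = 20, colour = "red";    // local context of the caller
  function drawLabel(text: string): void { // a true sub-routine of drawForm
    console.log(`draw "${text}" at (${x},${y}) in ${colour}`);
  }
  drawLabel("Name:");
  drawLabel("Address:");
}

drawLabelProc("Name:", 10, 20, "red");
drawForm();
```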

Minimalism obsession

  • Every time I bind functionality to a ready-made physical keyboard key in my project instead of implementing a virtual button on the screen, it feels like cheating.
  • If I was missing the V key on my keyboard, this shouldn’t magically stop me from being able to bootstrap a self-sustaining system.
  • A counter-force to my runaway imagination:
  • Imagine the most primitive possible computer with the most primitive possible interaction hardware.
  • A box with a light and a single button.
  • You interpret the light on/off patterns in accordance with your mental model of the system (knowing the initial user interface embedded in the firmware).
  • You press the button to make state-changes to the system.
  • No human being could be expected to do anything useful with this.
  • Thus, it is OK to rely on a certain set of assumed physical capabilities — such as positional input (mouse), text input plus random other buttons (keyboard), and visual feedback (a teletype at minimum, a raster display in the 21st century).

Design Patterns

  • Had a quick leaf through my copy of the GoF book the other day.
  • Perfect demonstration of great ideas being tied up with their particular form in the unambitious landscape: as mechanisms primarily for easier management of textual source code.
  • Most of the patterns have far more value as more abstract systems patterns. In this aspect they can be fully grasped by ignoring the huge bulk of material with extremely narrow utility pertaining to classes, inheritance, naming things, syntax, languages, C++, etc.
  • Signal-to-noise ratio very low. Actual useful content (for designing systems) is dwarfed by the rather specific technical details (of organising code).
  • (I’m not confused about whom the book was aimed at, etc.; I’m just articulating the value that someone like me can get out of it — because there is clearly something useful behind these patterns.)
  • What would they look like if taken further? What are they a mere stunted form of; what do they grasp out for?
  • Iterator: imposing a total order on subobjects
  • Mediator: centralised Observer i.e. liveness
  • Memento: versioning? Naming and synchronisation in a decentralised computer system?
  • Prototype: knowledge rep and sharing, the Universal Design Pattern.
  • Observer: live state. Reactive programming. Literally the shifting of the burden of constraint maintenance over time from the programmer to the computer.
  • State: disguised method dispatch?
  • Strategy: reified co-operative or pre-emptive processes? Coroutines? Yes, the “first-class function” but what exactly is that?
  • Template Method: late-bound code construction / assemblage, dynamic code modification.
  • Visitor: multiple dispatch.
  • Flyweight: encapsulated composite / cell? Optimisation management?
  • All rather rushed and speculative interpretations. I need to properly read it again with these new eyes.

Simulations

  • That’s what it’s all about. Simulating “systems”. But what is “simulation”? What is a “system”?
  • Perhaps etymology can help us? Simulation: “that which is similar to / resembles something else”; system: “those which hang together”.
  • By this definition, a simulation of a painting is a visually similar painting, e.g. a parody or a forgery. A simulation of a piece of music is another piece of music that unfolds over time similarly to the original.
  • Aha: we have picked up an important concept — time evolution.
  • A simulation of a moving picture is something that visually unfolds over time in a way that resembles the original — an animation.
  • And now: systems we can reach into and poke around with.
  • A simulation of paintball is a first-person shooter where your actions on the mouse and keyboard produce visual and auditory effects over time, as if those devices were proxies for your legs, neck and fingers.
  • A physics simulation tool allows us to affect the behaviour of the system by suddenly repositioning objects, changing the laws of physics, etc.
  • Continuous evolution of the real world is just what we observe.
  • Discrete time evolution of a simulation of the world is based on the ‘atom’ of the smallest possible state-change to a small part of the entire system.
  • “Apply the next atomic state-change” is a regular signal originating outside the simulated reality.
  • Even if the timing of this is not strictly regular, a separate “timer” signal can regularly act as input to the simulation to allow time measurement consistent with ourselves on the outside.
  • And finally, arbitrary other external signals from the real world act as further inputs.
  • In general: use the current state and any active external signals to obtain the next state. (A minimal sketch of this loop closes this section.)
  • “Active external signals” could be taken to be part of the state of the system — except that this state is “read-only” and changes mysteriously, viewed from within.
  • The next direction for analysis needs to be: what exactly is the “state” of the system?
  • I mean, what can we see as humans to tell us how we’re affecting things?
  • The externally visible state.
  • “Hidden” state, i.e. the state of memory, is somehow different… we do not develop software by looking at the memory cells through a microscope, but by looking at shapes in a GUI.
  • Even if we are inspecting memory in a hex editor, the only reason we are able to do this is because of the pixels that light up in ways that tell us what’s there.
  • In fact, the widespread teaching of “digital electronics memory” could be an elaborate cover-up for a hidden goblin creature that merely implements the “memory” metaphor in a transparent manner — and this would not make the slightest difference to how I develop software.
  • Let’s look at space and graphics to start with.
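
The “current state + active external signals → next state” loop, as a minimal sketch; everything here is invented for illustration.

```typescript
type SimState = { t: number; x: number };
type Signal = { kind: "tick" } | { kind: "nudge"; dx: number };

// Use the current state and any active external signals to obtain the next state.
function step(state: SimState, signals: Signal[]): SimState {
  let next = { ...state };
  for (const s of signals) {
    if (s.kind === "tick") next = { ...next, t: next.t + 1 };     // the external timer
    if (s.kind === "nudge") next = { ...next, x: next.x + s.dx }; // us reaching in and poking
  }
  return next;
}

let sim: SimState = { t: 0, x: 0 };
sim = step(sim, [{ kind: "tick" }]);
sim = step(sim, [{ kind: "tick" }, { kind: "nudge", dx: 5 }]);
console.log(sim); // { t: 2, x: 5 }
```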
