Saturday, May 9, 2026

Conceptual Walkthrough: Distributed Applications

It may be surprising, since I am here to talk about distributed computing, but I hate process forking - making copies of a running application, each running the same code but each dedicated to a different task.  I'm sure I'm not alone in this, because it's not as though people are running around calling fork() for every little thing; multithreaded and multiprocess apps are not exactly common, not unless you're doing something that really requires such a technology.

Frankly, it's a headache to try to understand.  In theory, it should be simple; there's a copy of your program, but you know which copy is which, so you just do different things with them.  If before that, you open up a pipe, both copies of the application will have access to the same file descriptor for that pipe, and they can communicate through it.  But all of those things being technically possible isn't a plan.  When that pattern is laid out in a book, it leaves you wondering: what am I supposed to do with it?
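To make the pattern concrete, here is a minimal sketch of the fork-plus-pipe arrangement in Python, assuming a POSIX system. The function name `fork_with_pipe` is invented for illustration; only `os.pipe`, `os.fork`, `os.read`, `os.write`, and `os.waitpid` are real library calls. It shows the mechanics and nothing more, which is rather the point: the pipe works, but it carries no plan for what to do with it.

```python
import os

def fork_with_pipe(message: bytes) -> bytes:
    """Fork once; the child sends `message` back to the parent through a pipe."""
    read_fd, write_fd = os.pipe()   # opened before the fork, so both copies inherit it
    pid = os.fork()
    if pid == 0:
        # Child copy: same code, but fork() returned 0 here.
        os.close(read_fd)
        os.write(write_fd, message)
        os.close(write_fd)
        os._exit(0)                 # exit immediately, skipping the parent's cleanup
    # Parent copy: fork() returned the child's process id.
    os.close(write_fd)
    chunks = []
    while True:
        chunk = os.read(read_fd, 4096)
        if not chunk:               # empty read means the child closed its end
            break
        chunks.append(chunk)
    os.close(read_fd)
    os.waitpid(pid, 0)              # reap the child
    return b"".join(chunks)
```

Calling `fork_with_pipe(b"hello")` in the parent returns the bytes the child wrote. Everything beyond that, such as what the bytes mean and who acts on them, is left to the programmer.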

I have talked in several places about programmers being obliged to reinvent the wheel, and process forking is a good example; forking wouldn't be such a terrible thing if it were paired with a functional remote procedure call mechanism.  When two copies of the same program are connected with a pipe (or two), you can use it to pass data, but that data transfer is mostly pointless without context.  Adding context, labeling the data, makes it essentially a very simple, poorly implemented remote procedure call, where the "function" that is being "called" is simply an index specifying what to do with the data, out of a set of options fixed at compile time.  While process forking would still be a bit confusing even if it came with a built-in RPC mechanism, at least the programmer would not have to wrestle with all the complexity of implementing their own RPC stack from scratch while also trying to build program logic around it.

Although I wasn't thinking of it when I designed the system, the ADA distributed applications model is basically exactly this, and to help you understand it, let's walk through the analogy a little more closely.

As I see it, forking has two fundamental problems with it; first, the fork is undifferentiated.  It is easy to overlook the fact that it is a feature of programming languages that we can attach variable names to data in the first place; when I say that forks are undifferentiated, what I mean is that this deliberate feature and mechanism of programming languages no longer functions.  In each half of the program, a large number of symbols exist but point to the wrong data, and it is entirely up to the programmer to recognize this and keep track of which symbols are now incorrect.

This is in fact the reason why I have specified that under the ADA, programs should be written and compiled such that each module that is to be deployed separately shall have its own variables confined to a single namespace.  A namespace in this context may be nothing more than a collection of variables given a group name, but for our purposes, it draws an explicit distinction between what is, and is not, accessible within a specific context.  You could, of course, do the same thing with a general forked application - but it still leaves us with the problem of needing to create a new remote procedure call mechanism.

Let's take the "remote" part out of the equation for now, and only talk about a forked application communicating with another copy of itself.  A procedure call mechanism in this context is a pipe that you periodically check (or block and wait), looking for a data stream in a fixed format.  That data stream will tell you what function to call, or what variable to write to, or what variable to read from, and it may also include parameters for the function (or an index to an array, etc) and then you return any out-values through the pipe to the other side with a similar mechanism.  Likewise, the calling side of the pipe will send a packet in the expected format down the pipe, and then block and wait for a return value (or merely an acknowledgement that the function is being run, or that it has completed successfully, depending on the context).
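As a rough sketch of the hand-built version, the following Python frames each request as a length-prefixed JSON packet and dispatches it against a small fixed table of commands. The packet layout, the `COMMANDS` table, and the function names are all assumptions made up for this example, and in-memory buffers stand in for the two pipes so the exchange can be followed in a single process.

```python
import io
import json
import struct

def send_packet(stream, payload):
    """Write one packet: a 4-byte big-endian length, then a JSON body."""
    body = json.dumps(payload).encode("utf-8")
    stream.write(struct.pack("!I", len(body)) + body)

def recv_packet(stream):
    """Read one packet written by send_packet."""
    (length,) = struct.unpack("!I", stream.read(4))
    return json.loads(stream.read(length).decode("utf-8"))

def add(a, b):
    return a + b

# The fixed set of "commands" this half of the program agrees to answer.
COMMANDS = {"add": add}

def serve_one(requests, replies):
    """Listening side: read one request, run it, send back the out-value."""
    request = recv_packet(requests)
    handler = COMMANDS.get(request["call"])
    if handler is None:
        send_packet(replies, {"ok": False, "error": "unknown command"})
    else:
        send_packet(replies, {"ok": True, "value": handler(*request["args"])})

def call(name, *args):
    """Calling side: send a request, then wait for the reply packet."""
    requests, replies = io.BytesIO(), io.BytesIO()  # stand-ins for two pipes
    send_packet(requests, {"call": name, "args": list(args)})
    requests.seek(0)
    serve_one(requests, replies)
    replies.seek(0)
    return recv_packet(replies)
```

Here `call("add", 2, 3)` returns `{"ok": True, "value": 5}`, and a name that is not in the table comes back as an error packet instead of a value.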

You as the programmer will most likely want to select some highly specific subset of functions or variables that you want to use with this mechanism - when building it yourself, you will most likely only accept some very few "commands" that you listen for and respond to from the other half of your application.  Because both halves are the same application, though, you could also use a lookup table that lets you unambiguously access any variable or function present on your half of the application - there are, after all, a fixed number of those functions and variables, and you know for a fact that all of them have unique identifiers, because you use those unique identifiers while programming.  And once you have selected a function and/or variable to access, any parameters that get sent along by the request can be type-checked, because you know the method signature or variable type.

Doing all of that work by hand would be an awful experience, and most if not all of that hand-done work would be wasted, as you are unlikely to call most of the functions or access most of the variables.  If however, it were a function of the programming language, compiler, and linker - that is eminently possible, because that is, essentially, the compiler's whole job.  Translating an identifier to a function, checking the number and type of the parameters, all of that is necessary to create the program in the first place.

Suppose, then, that you had a compiler that created an indexing function.  These kinds of functions exist when you use reflection tools - the indexing function would take a variable or function name and some arbitrary parameters, call the function with those parameters, and then return the variable or return value when it's done.  Let's suppose at first that exactly one indexing function is created for your whole program.  You as the programmer might simply pass anything that comes down the fork pipe to the indexing function (unless you just called a function, in which case you would wait for the return value and parse that) - in fact, ideally, there would be a built-in function that just reads commands from the pipe in exactly this way, and a paired function that writes to the pipe in exactly the way the reader expects.
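A reflection-based indexing function of this sort might look roughly like the sketch below, which leans on Python's runtime reflection in place of a compiler-generated table. The `Namespace` class and `index_call` are hypothetical names for illustration; `getattr` and `inspect.signature` are the standard library's own tools. The type check only handles plain class annotations such as `int`, which is an assumption made to keep the example short.

```python
import inspect

class Namespace:
    """Hypothetical stand-in for one module's variables and functions."""
    counter = 10

    @staticmethod
    def scale(value: int, factor: int) -> int:
        return value * factor

def index_call(namespace, name, *args):
    """Look up `name`, check any parameters, and return the value or result."""
    if name.startswith("_"):
        raise AttributeError(f"{name} is not an exported identifier")
    target = getattr(namespace, name)        # AttributeError if the name is unknown
    if not callable(target):
        if args:
            raise TypeError(f"{name} is a variable and takes no parameters")
        return target
    signature = inspect.signature(target)
    bound = signature.bind(*args)            # TypeError if the parameter count is wrong
    for param_name, value in bound.arguments.items():
        expected = signature.parameters[param_name].annotation
        if expected is not inspect.Parameter.empty and not isinstance(value, expected):
            raise TypeError(f"{name}: {param_name} must be {expected.__name__}")
    return target(*args)
```

With this in place, `index_call(Namespace, "counter")` reads a variable, `index_call(Namespace, "scale", 3, 4)` calls a function, and a request with the wrong number or type of parameters is rejected before the function ever runs.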

This is still relatively unsafe; there is a lot of potential for race conditions and so on, and there is nothing stopping you from requesting data from the fork that is actually stored somewhere else.  But assuming you had some need to fork an application and run it, this reflection-based indexing function would free the programmer from having to figure out how to pass messages back and forth, free them from having to write any middleware, and simply allow the programmer to use the capabilities of the other process, its data and functions.

But let's go back to the namespace thing.  You see, we are still left with the problem I started out on - that a forked process has duplicate identifiers, half of which are simply wrong, because the actual memory location where they are stored is in another process.  But, you know which process they are stored in, and you have a well-established mechanism for requesting data or calling functions in that process.  If you knew for certain at compile time that some subset of those identifiers would be in the forked process, the compiler might translate any requests for those identifiers to instead use the pipe-send function.

But wait, there's a problem: both forks of the application are running the same code, so at compile time, it's impossible to distinguish whether the request is going to be requesting data from the original or forked process.  This is where namespaces come in - you may not know whether a given namespace is in one process or the other, but you should know for certain that the contents of a specific namespace will all be in the same place.  It's only when you exit the namespace that there is any chance you will be trying to reach data that exists in another process.  That's simple, then: in the forking function, allow the programmer to specify which namespaces are being detached, resulting in two collections of namespaces, one associated with the original process, and one with the fork.  This will be a runtime check, in case there is any reason to change which namespaces get separated in different circumstances - because it is a runtime check, a checking function will be called every time any bit of code requests data or functions outside its own namespace.  (There are ways to reduce the overhead, but ignore that for now.)

Well, if you can do all this once, why not do it multiple times?  Say you have a program with five total namespaces and you want each in its own process.  There's no reason why that should be difficult, should it?  You would simply need the runtime check, the one that knows which process contains a given piece of data, to also select the appropriate pipe to use.  There is, admittedly, one complication - making sure that all forked processes receive updated routing tables - but since you've built a robust framework for sharing data already, that hardly seems like much of an imposition.

It's worth pointing out, though, that if this system works correctly, it should also work correctly if you never fork the application at all.  In that case, there is no confusion about which process owns which variable - all of them are stored in the same process.  So, your program will truly not care whether it is currently running in a single process, two processes, or five processes; when you as the programmer insist on calling a function or fetching data, the function will be called or the data fetched, no matter which process currently stores that particular piece of data.
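The runtime check described over the last few paragraphs can be sketched as a small routing table. Everything here is a hypothetical illustration: the `Router` class, the process names, and the example namespaces are invented, and `send_remote` only returns a description of the request where a real system would write to the appropriate pipe and wait for a reply.

```python
class Audio:
    """Example namespace holding a variable."""
    volume = 7

class Files:
    """Example namespace holding a function."""
    @staticmethod
    def size(name):
        return len(name)

class Router:
    def __init__(self, local_process, namespaces):
        self.local_process = local_process
        self.namespaces = namespaces  # namespace name -> object holding its identifiers
        # Before any fork, every namespace is owned by the local process.
        self.owner = {name: local_process for name in namespaces}

    def detach(self, names, new_process):
        """Record that these namespaces now live in a forked process."""
        for name in names:
            self.owner[name] = new_process

    def access(self, namespace, identifier, *args):
        """The runtime check made whenever code reaches outside its own namespace."""
        process = self.owner[namespace]
        if process == self.local_process:
            target = getattr(self.namespaces[namespace], identifier)
            return target(*args) if callable(target) else target
        return self.send_remote(process, namespace, identifier, args)

    def send_remote(self, process, namespace, identifier, args):
        # Placeholder: a real system would pick the pipe for `process` here.
        return ("remote", process, namespace, identifier, args)
```

The calling code is the same in every configuration. Before any `detach`, `router.access("files", "size", "abc")` runs locally; after `router.detach(["files"], "worker")`, the identical call is routed toward the `worker` process instead, and detaching further namespaces to other process names works the same way.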

Now, forking a process is necessarily something that happens in the same place where your application currently is, which sharply limits the usefulness of everything I've just described.  Not to say that it's useless - I think anyone who is interested in forking processes would appreciate having all of this functionality - but what I've just described is not actually the functionality that I want from the system.  A distributed application, as I've described it, is one where this forking process happens between two computers - where you can run part of the program logic on an entirely separate machine.  And conceptually, once you've done everything I've said above, that's not actually a difficult proposition... to an extent, at least.  You need some permissions, and you need to make certain that dependencies and the like are synchronized, but the technicalities of starting a process somewhere else, and binding a pipe that connects the two processes over the network, are not particularly difficult.

I've said this in other places, but to reiterate here, the value of doing this fork over the network is allowing your program to place specific bits of code where they are needed, without the program itself needing to care that those bits are not on the same machine.  Generally, these "specific bits" are in one of three categories: input, output, or computation.  The benefit I'm talking about is not accelerating your program with parallelization (you can do that if you have a whole lot of computing to do).  No, the benefit I'm talking about is a kind of program that we don't currently possess - one that replaces the idea of remote desktops and SSH tunneling with running an application in one place, and handling the input and output in other places.

This category of applications is something that should be simple.  If you want to edit files, you generally want the file editor to be where the files are, because it will involve a lot of reading from and writing to the disk; compared to that, GUI updates and keyboard and mouse inputs are much less sensitive.  I for instance have an NFS volume that holds many of my files attached to this computer, but if I want to unzip files on that machine, I tend to open a remote shell and perform the action local to where the files are, rather than having my desktop wait on network traffic in both directions for every file operation.  Likewise, some tasks that depend on accelerators, such as video editing, are best done when the data and the accelerator are in one place, but that doesn't necessarily require that the user is in that same place.  As long as input and output latency is within acceptable parameters, you can do a lot remotely - and it's only easier if the entire output stack is being run on the machine closest to the output itself, for the same reason.  A lot of system GUI updates, especially, are highly redundant; you can summarize the changes with a few bytes, while a single new image frame may be several kilobytes, and sending every single frame in an easily-reproduced GUI animation may be millions of times more expensive than doing it locally.

Instead of asking what is required to live in a world where this is possible, it is better to ask why we don't.  The legacy of computing simply assumes that everything will be in one place, and comparatively little of our development has been focused on splitting a program up into pieces distributed across a larger system.  We are encountering this need more and more, and so more and more people are getting exposed to it, but until the default way we write programs allows this kind of cross-computer access natively, accessing a program running somewhere else will continue to be a tedious affair.

And to be fair, this "conceptual walkthrough" is more difficult than it sounds, in that it requires tweaks to compilers and programming languages, on top of providing a mechanism to deploy those forked programs on another machine.  Those things happen, and there are experts in those fields, but if anyone is wondering why I haven't written my own code... rewriting programming languages is simply a task that's beyond me!  Tasks that are much simpler for the compiler and linker would be an enormous headache to write by hand, but that doesn't mean that it's easy to change how compilers work.

But for the people who actually know how to do all these things... I imagine that if they just knew what they were trying to accomplish, they could change the world.  There is a lot that's possible, but which simply won't happen without the right support and tooling.

Saturday, March 28, 2026

Generic Runtime Linking

My last post, about a Distributed Application Model, was an attempt to separate one of several good ideas buried in Project MAD in hopes of making them more accessible to others.  This post is similar; in many ways it is reframing what I said last time, with a specific focus on explaining the core mechanism and how it flavors everything else that is going on in the MAD OS and MAD/ADA system.

We are talking, today, about dynamic linking under Project MAD, or if you prefer, MAD/Libs.  And to begin, let's talk about how this normally works, as I understand it.

Preface: Standard Runtime Linking

Programs (let's leave aside scripts for now) are normally written in a human-readable language and then translated into machine code in a process called compiling; frequently, there will be multiple, separate bits of machine code generated when you compile a program, as programmers like to split things up into logical segments.  Because the machine code is still structurally similar to the layout of the human-readable program, you can still recognize that in the resulting machine code, for example, a particular function starts here, and a particular bit of data is there.  An index of these facts about the machine code - where things are and what each of them is - is included with this compiled object.

When one of these compiled objects wants to reference another, you must necessarily be very precise and specific, in two parts: you must know what other object file contains the thing you want, and then know how to find that specific thing among all the other things inside of that file.  During the process of compilation, it's easy to keep track of this all automatically; a mapping is made between an identifier and where exactly to find it, in both human-readable and machine code forms.  Thus if you want to combine several objects together, you simply go through the object's dependencies, and find which file to look at, and when you combine the objects together, you ensure that the function call that was previously left as a dependency is translated into an actual function call within the final object.  This process is known as linking.

There is a form of linking at compile-time which depends on code that the user themselves did not create (which is called a library); this process is known as static linking, and it is, ultimately, the same as it is when you link together objects that the user created, except that the library isn't something that you just created and so you have to know where it is.  The process of specifying and finding these libraries is standard, and the details get into operating system and filesystem standards; common libraries are usually stored in standard places, and you only need to reference them unambiguously.

If, at compile time, the library you say you need cannot be found, the compiler makes a fuss and everything stops.  Mostly, the list of these library files is stored in configuration files along with all the human-readable program source files, and the linker program just checks with the OS to see if it can find the files you want.  And of course, if someone else wants to compile your program (assuming you share the source files), they can reference this list and go looking for exactly the libraries they need to compile the program; assuming they have or get all those libraries, they will resolve all the dependencies your program has and come away with a finished, working executable.

A dynamic or run-time linking model alters this last bit.  It allows you to create an executable that still has dependencies; the executable is considered “finished”, but when you go to run it, the process is interrupted until the system can find a copy of the libraries it depends on and link them in.  Because this process happens on computers that the programmer doesn't control, if anything goes wrong at this stage, it can be a little harder to figure out and fix, but we generally have the process figured out nowadays.  (In theory.  Mostly.  Kind of.)

Other than that, dynamic linking is only slightly different than static linking.  Because these links are designed to be made when the program is loaded rather than having the canonical on-disk file altered, they are often done with function pointers - which treat the concept of where to find the function as data which can be modified, as opposed to a static linking model where these offsets may be metaphorically set in stone.  Even then, though, the system must find and load the file containing those functions, before it's possible to update those function pointers so that they will work correctly for the program that depends on them.

There are a few added complications, but by and large, this is representative of the process that we call dynamic linking, and to summarize one last time, it requires you to keep track of identifiers (as originally written in human-readable language) and what file to look for them in, and it requires those files to provide a listing that translates the identifiers into pointers for functions and data which can be stuck into the program to allow it to run.  Ultimately, this is what it takes for a program to call a compiled function stored in another file; to put it in a word, it's just recordkeeping.
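As a loose model of that recordkeeping (not of any real loader), the sketch below treats an executable's dependencies as a list of (library, identifier) pairs and fills in a table of callables at load time. The names `link`, `UnresolvedDependency`, and `libtext` are invented for illustration, and Python callables stand in for function pointers.

```python
class UnresolvedDependency(Exception):
    """Raised when a required library or identifier cannot be found."""

def link(imports, search_path):
    """Resolve each (library, identifier) pair into an entry in a call table.

    imports:     list of (library_name, identifier) pairs the program needs
    search_path: dict mapping library_name -> {identifier: callable}
    """
    table = {}
    for library_name, identifier in imports:
        library = search_path.get(library_name)
        if library is None:
            raise UnresolvedDependency(f"library not found: {library_name}")
        if identifier not in library:
            raise UnresolvedDependency(f"{library_name} has no symbol {identifier}")
        table[identifier] = library[identifier]  # the "function pointer" is filled in
    return table
```

For example, with `search_path = {"libtext": {"shout": str.upper}}`, calling `link([("libtext", "shout")], search_path)` yields a table whose `"shout"` entry can be called like any other function; if the library is missing, loading stops with an error before the program runs.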

Problems With the Extant Model

The model as we use it today assumes that linking is and must indefinitely remain highly specific.  It simply does not do for someone to link any other object file in place of the requested one; it is assumed that you are referencing some complex bit of code that works on every client machine in exactly the same way as it worked on the developer's machine, and nothing else will do.  This is not without reason; it's really problematic for a program to work differently on some random computer than it works for the programmer.

The problem is, because the linking process is very specific, operating systems and programming tools lack the features that would be necessary to facilitate generic linking.  The concept does appear in scripting languages, where linking isn't a thing, and there are concepts in programming that do the same thing within a given project (using dynamic function and data tables, again), but it remains a presumption that everything will be done exactly the way the programmer originally intended - and again, this is not without reason.

The problem is, this non-generic structure is remarkably flimsy.  Too many things are assumed and never tested; there is a lack of formal structure that allows you to describe things in unambiguous, but still flexible, ways.  As a result, there are tons and tons of small, specific things, many of which depend on other small specific things, and if any of those small specific things break, much larger things that you would not expect to be fragile may also break.

Anyone who uses Linux (and related systems) to any degree has encountered situations where they go to update their systems and receive a deluge of small updates.  If one were to examine many of these updates, it can be alarming just how many are redundant; there are libraries that do the same or similar things but have different interfaces, libraries that do the same thing in different programming languages (especially scripts, which are wholly incompatible with one another), thin front-end executables that expose library functionality to shells (usually exposing only one specific library per executable), services that just sit around exposing a library's functionality, and libraries that compete with one another and may have some incentive to be incompatible.

What makes this system fragile is that none of these copies or similar functions can be easily swapped out.  If a given library used by a given program becomes a problem for whatever reason, everything that depends on that specific library fails, because there is no easy way to transition from using that library to using an equivalent library.  You can roll back to an older version… but that's only a patchwork, a temporary fix.

Why are they all incompatible?  The easy answer is because nothing is standardized, and that's correct, but I will say it differently.  There is no standard language for describing the dependencies of a program.  Enumerating them, yes; any given executable will tell you that it depends on function X in library Y.  But that enumeration of the dependencies does not describe anything.

So long as we cannot describe our dependencies, it is impossible to offer a generic alternative.

A Standard API Is A Generic Dependency

It's probably clear to existing readers of my blog that I have been describing, for a while now, the idea of standardizing APIs.  Anyone who is new will come to know this concept as the System API Directory, also known as the MAD/SAD subproject.  But what specifically am I advocating for?

Suppose that your program wants to play a sound file; suppose that sound file is provided by the user, and you do not know in advance how the sound is encoded.  When I talk about describing a dependency, I mean that your program wants to be able to say, “I want to link to a library that will play a sound, as it was stored in a file.”  There are extant libraries that provide programmers a “one-stop shop” for various sound file formats, but depending on such a library is not the same as describing the intent of your program.  You remain dependent on that one specific thing, which itself has many additional dependencies, some of which your own program will never use.

Just as your program cannot be linked with just any library, the library was not written to be used in just any arbitrary way.  The library was designed to be used some specific way, and that specific way becomes the structure of your program.  Once your program is compiled for one library, there is no going back; even if you had another library with identically named functions, it is impossible to know for certain that they are compatible.  (Technically, yes, you could try to fool the linker, and perhaps even succeed - but it is not how the system is designed.)

When I talk about a standard API, I am talking about something that exists to guide the library programmer as much or more than it guides the applications programmer that will depend on it.  It does more than simply telling the library programmer what names to call their functions, and the applications programmer what the function name to call is.  It informs the structure of both projects; it describes when memory is allocated and when it is released, what guarantees can be made, what may or may not be taken for granted, and how errors should be handled.  Only when both sides agree on these kinds of terms can a specific library be replaced with a generic alternative.

The language of APIs was always meant to describe exactly these sorts of problems, but to date it has largely been left to creators to decide what's in them.  And the point I'd like to make right now, is that until standards exist, libraries cannot be tested against and held to those standards.  And so long as libraries cannot be held to standards, you cannot use them generically, only in the highly specific ways they were designed to be used.

When you look at the state of code libraries, you will see that they must, will, and have created their own ad hoc standards for almost every task.  What is perhaps most distressing, however, is how commonly these ad hoc standards are literally the first thing that came to the creator's mind.  Once the library has been written, and applications linked against it, it becomes painful to change the API and force all the applications using it to rewrite their own code.  Even before it sees any kind of widespread usage, altering the API may mean making large structural changes to how the library was originally laid out.  If there is no impetus to do so, it is most likely that the programmers never will, and whatever form the idea first had, it will continue to have indefinitely.

I choose to place the blame for the “fragility” of non-generic interfaces and linking schemes on this problem: too many have unique and bespoke interfaces, with no agreement and no impetus to agree on what those interfaces should look like.  Programs that depend on some arbitrary set of libraries may find that in different places in their code, the same task is performed by different sub-dependencies, each working slightly differently, and those differences may require different structure around each function call.

I do not intend to disparage individuality in programming methodology; having multiple ways of doing a similar task is valuable, and arguably critical for long-term success.  However, having these multiple methods be fundamentally incompatible, especially where it need not be so, is problematic.  Indeed, I have long thought that one of the benefits of this kind of generic linking is giving users (not merely programmers) the ability to swap in a more optimized version of some algorithm when it is discovered, or to swap in a more transparent version if debugging or monitoring a program.  (There is a security argument to be had here at another time.)

I am also not necessarily suggesting that there be one ultimate and irrevocable interface for some given class of problems moving forward.  Rather, I am talking about deciding upon some minimum reasonable standard which defines the very concept of utilizing a technique, algorithm, process, resource, or device.  For sounds, for example, this minimum standard may include playing the file, pausing and stopping, playing repeatedly, getting or setting the current playback position, and changing the volume.  This list by no means describes everything you may want to do with a sound file, but it is a fair minimum standard (and I would gladly listen to arguments for expanding or shrinking it).
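To give a feel for what such a minimum standard could look like when written down, here is a tentative sketch as a Python abstract class, together with a toy implementation and a tiny conformance check. The method names, and the specific behaviors the check expects (pausing keeps the position, stopping rewinds to zero), are assumptions chosen for illustration and not part of any agreed standard.

```python
from abc import ABC, abstractmethod

class SoundPlayer(ABC):
    """Hypothetical minimum standard for playing a sound stored in a file."""
    @abstractmethod
    def play(self, repeat=False): ...
    @abstractmethod
    def pause(self): ...
    @abstractmethod
    def stop(self): ...
    @abstractmethod
    def get_position(self): ...
    @abstractmethod
    def set_position(self, seconds): ...
    @abstractmethod
    def set_volume(self, level): ...

class SilentPlayer(SoundPlayer):
    """Toy implementation that tracks state without producing any audio."""
    def __init__(self):
        self.playing = False
        self.repeat = False
        self.position = 0.0
        self.volume = 1.0
    def play(self, repeat=False):
        self.playing, self.repeat = True, repeat
    def pause(self):
        self.playing = False
    def stop(self):
        self.playing, self.position = False, 0.0
    def get_position(self):
        return self.position
    def set_position(self, seconds):
        self.position = float(seconds)
    def set_volume(self, level):
        self.volume = max(0.0, min(1.0, float(level)))  # clamp to 0..1

def meets_minimum_standard(player):
    """Tiny conformance check; the expected behaviors are assumptions."""
    player.set_position(2.5)
    if player.get_position() != 2.5:
        return False
    player.play()
    player.pause()
    if player.get_position() != 2.5:      # assumed: pausing keeps the position
        return False
    player.stop()
    return player.get_position() == 0.0   # assumed: stopping rewinds to the start
```

A program written against `SoundPlayer` does not care which library supplies the implementation, and any candidate library can be run through the same check before it is swapped in.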

Of course, going beyond the minimum is a matter which brings its own concerns.

Expanding the Language of APIs

It was never my intent, and it remains not my intent, to claim to know how best to organize APIs, or to lay down firm standards without discussion.  As such I hope you will take the following section as informal, tentative, and conceptual.

I have talked in the past about the MAD System API Directory as containing a hierarchical tree of types, from most generic to most specific, and to this concept I still hold.  The nature of this kind of tree makes it technically possible that all possible interfaces can be categorized into rough categories, but this alone may not prove good enough for the task of describing a large, diverse, and robust system.

Originally, I envisioned that the first few layers of this tree were broad categorizations, and once you pass a certain point, then programmers are free to create various child standards with whatever additions to the parent standard they wish to add.  This has the problem, however, of not providing granular extensions to an interface; part of the intent behind the broad categorization is to establish simple baseline APIs that allow programmers to avoid interacting with unused features, for example when reading documentation or when experimenting with the functions of a library.

If you don't have some method for granular extensions, then you are obliged to create complex child standards almost immediately.  For audio files, for example, one program might have a specific need to pitch-shift an audio file (for example, to tune a musical instrument sample to different pitches), but otherwise, you might only use basic capabilities of an audio playback library.  Another program might need to cleanly loop audio, potentially with defined entry and exit audio segments (for example background music, with an intro and coda).  If these functions are not part of the baseline, in a child-only model, it stands to reason that someone will immediately create an “everything and the kitchen sink” child API that includes both of these features and tons more.  Worse, two similar but unequal child APIs may be mostly but not completely compatible, and there is no language to describe their intersection or extremities.

Although my experience with API design is minimal, I cannot help but think that the language of API design itself needs to be extended.  The above problem, for example, seems like it is begging for a concept of “features” which describe granular additions to an API; a child API may provide, or a program may require, some set of those features, but more to the point, the baseline API standard can be read and understood absent technical discussion of advanced features.  Whether or not features in this sense are the right way to accomplish this, I believe that finding some solution to this problem is laudable.
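One way to picture "features" in this sense, offered purely as a sketch: each library claims a baseline API plus a set of named features, each program states which features it requires, and matching is a subset test. The class name, the API string `audio/playback`, and the feature and library names below are all invented for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class LibraryClaim:
    """What a library says it provides: a baseline API plus optional features."""
    name: str
    api: str
    features: frozenset = frozenset()

def usable_libraries(claims, api, required_features=frozenset()):
    """Names of libraries that implement `api` and every required feature."""
    required = frozenset(required_features)
    return [c.name for c in claims if c.api == api and required <= c.features]

# Hypothetical libraries, all claiming the same baseline API.
CLAIMS = [
    LibraryClaim("basic_audio", "audio/playback"),
    LibraryClaim("sampler_audio", "audio/playback", frozenset({"pitch-shift"})),
    LibraryClaim("music_audio", "audio/playback",
                 frozenset({"pitch-shift", "seamless-loop"})),
]
```

A program that needs only the baseline can use any of the three; one that requires `pitch-shift` can use the last two, and one that requires `seamless-loop` can use only `music_audio`. The intersection of two libraries' capabilities is then just the intersection of their feature sets.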

I have other similar thoughts - enough that it would complicate this section and blog post.  Some are already discussed elsewhere on this site, others will be described at some point.

But to summarize this for right now, remember that we are talking about making libraries into something predictable and testable.  That's all that's really required for generic linking; as long as you know what will happen when your program calls a function, then you can move away from depending only and very specifically on the same file from the same vendor as you originally designed the program to use.

But for all that I've talked so far about the problems with what we have, I haven't talked much (in this post at least) about the benefits of a modular, genericized library structure.

I Did Call It MADLibs

I talked in the last post about making devices accessible using inter-process communication and remote procedure calls.  Without rehashing (or if I'm being honest, rereading) that whole argument, I'd like to point out that the idea of generic linking fits directly into the idea that devices (in the OS sense, which may include things like files) have inherent capabilities.

Generic linking suggests that anyone writing library code knows how to expose functionality in such a way that if a program depends on some specific capability (in the API sense), it can know whether your library provides that specific capability.  Devices necessarily have some kind of driver, even boring devices like files; the driver describes the capabilities of that device.  Thus it stands to reason that a library should be able to claim that it provides a device-specific capability when paired with some device of that type.

At this point, we have returned to this post about assigning types to every filesystem object.  And while that kinda is the point, I'd like to phrase it a certain way right now:

Libraries which provide device-specific capabilities are verbs that act upon a noun.  Any given program may start with a specific verb and allow the users to fill in a noun, but you can also create a list of all device-specific capabilities that anyone claims to offer for a specific class of device.  In other words, a highly generic program such as a user shell (file browser, etc) can start with a noun and allow the user to select any appropriate verb, provided by any library that is available on the system.  The user, of course, has a lot of files and devices available to them; they have a lot of nouns, each with a comprehensive list of verbs.
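As a rough sketch of that noun-and-verb lookup, assuming a simple in-memory registry: the VerbRegistry class, its method names, and the library and device-class names below are all hypothetical, chosen only to show the two directions of lookup (start with a noun and list verbs, or start with a verb and list nouns).

```python
from collections import defaultdict


class VerbRegistry:
    """Hypothetical index of which libraries claim which verbs for which nouns."""

    def __init__(self):
        self._claims = defaultdict(set)  # noun -> {(verb, library), ...}

    def claim(self, library, verb, noun):
        """A library announces that it can apply `verb` to devices of class `noun`."""
        self._claims[noun].add((verb, library))

    def verbs_for(self, noun):
        """Start with a noun: every verb anyone offers for it."""
        return sorted({verb for verb, _ in self._claims.get(noun, ())})

    def nouns_for(self, verb):
        """Start with a verb: every noun it can act on."""
        return sorted(noun for noun, claims in self._claims.items()
                      if any(v == verb for v, _ in claims))


registry = VerbRegistry()
registry.claim("lib_tuner", "pitch_shift", "audio_file")
registry.claim("lib_player", "play", "audio_file")
registry.claim("lib_player", "play", "video_file")

print(registry.verbs_for("audio_file"))  # ['pitch_shift', 'play']
print(registry.nouns_for("play"))        # ['audio_file', 'video_file']
```

A shell or file browser would call something like verbs_for when the user selects a file; a single-purpose program would go the other way.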

I said in my last post that “it may surprise people who don't look at how OSes work that an average programmer probably doesn't know how to get a list of all devices present on a given machine, can't tell what a random device is capable of, and can't figure out how to control that device.”  This is the culmination of that indictment of modern computing: A Madlib-style system, mine or otherwise, that allows you to cultivate a list of verbs that work with nouns and nouns that work with verbs is absolutely possible, it just doesn't exist in systems as we have them today.  Certainly not in the generic case, where you expect the same methodology to work as well with services and hardware devices as it does with documents and media files; the “file associations” structure that modern operating systems use is primitive in comparison.

It is also important that the Madlib-style system provides a rigorous way for the system itself to know how to provide a given capability for a given device.  Remember that Project MAD is distributed; if you want to consume a capability on a remote piece of hardware, you need to know what, if anything, you need to bring with you when you send a program agent out to consume that capability, and you need to know what to expect when you get there.  If for example a remote node has a camera, and some nearby tertiary node contains a library that can provide a specific capability for that camera, you may need to copy the library over to the camera host and load/run it, in order to get the required capability.

It is important to consider this, because the library you're talking about may have multiple dependencies.  The camera capability you want to consume, for example, may also require an AI accelerator chip, which may not be present on the camera node, but it may exist in the larger system.  If you want to consume this AI analysis of the camera feed, a separate setup process must first begin which connects the camera to the AI accelerator, and then your program to the output of the AI accelerator.
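The setup order in that camera example is, at heart, a dependency ordering problem.  Here is a small Python sketch of it, using the standard library's graphlib module (Python 3.9 and later).  The capability names, node names, and the two dictionaries are made up for this example; a real system would discover them rather than hard-code them.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical requirements: each item maps to what must be set up first.
requires = {
    "my_program": {"camera_ai_analysis"},
    "camera_ai_analysis": {"camera_feed", "ai_accelerator"},
    "camera_feed": set(),
    "ai_accelerator": set(),
}

# Hypothetical locations of each piece within the larger system.
hosted_on = {
    "camera_feed": "camera_node",
    "ai_accelerator": "accelerator_node",
    "camera_ai_analysis": "accelerator_node",
    "my_program": "local_node",
}

# static_order() yields every dependency before the things that need it.
setup_order = list(TopologicalSorter(requires).static_order())

for step in setup_order:
    print(f"set up {step} on {hosted_on[step]}")
```

The camera feed and the accelerator come out first (in either order), then the analysis that joins them, and the requesting program last.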

Under MAD, this scenario makes sense.  It is reasonable to expect this stack of programs, libraries, and drivers to deploy automatically when you ask to consume what appears to be merely the capability of the camera.  It makes sense that you can simply use a verb on a noun.

If someone else finds a way to do the same… I welcome it.  Ultimately, I really just want computers to be better.  In the meantime... I hope this sparks at least some thought on the subject.

Saturday, February 28, 2026

Half-MAD: A Shorter-Term Vision

I think that one of the reasons why it has been difficult to enunciate exactly what it is I want Project MAD to accomplish is that, to an alarming extent, I want to do things that we are not doing in a modern machine, but extended to encompass all of several machines at once.  Thus to my chagrin, I end up effectively making a list of demands that might be considered absurd if they were to happen in a contained, well-understood environment, and then adding “…to everything, everywhere” at the end of the list.  And to be clear, Computer Science as a discipline is well versed in problems that do not scale well.  Proposing something difficult and then demanding that it scale is, itself, extremely arrogant.

The only reason why it doesn't approach being farcical is that I am proposing each sub-project with methods already in mind, if not in hand.  Ultimately, when it comes to actually implementing the ideas, new problems will crop up that we have never experienced before.  But to better understand why those projects are worth the problems they cause, it's worth discussing some of them in a smaller, more general context.

What is a Device and What Do They Do?

I talk a great deal about how Project MAD is designed to let you access devices or resources or capabilities of multiple pieces of hardware.  This is a strange topic for an application programmer; if you haven't already delved into microcontrollers and peripheral interfaces, you may wonder why anyone would care about all the devices attached to a system anyway.  When all you are doing is building some random bit of application logic and wrapping a user interface around it, using fixed and well-defined OS concepts or third party libraries, well, if you end up needing to talk about drivers for literally any device then someone has done their job wrong.  And indeed, if I wanted the average, user-facing application developer's life to change, I would probably be making their life worse, not better.  Part of the design of MAD/ADA is that, in fact, they don't need to worry about devices.

But also, devices will absolutely be used to control how their application works and how it is distributed across the system.  That's all fine, assuming they don't have to worry about it - but of course, eventually, a professional developer will run into every bug that comes within a mile of their code, let alone the ones actually inside it.  So if they do, technically, need to worry about devices, it's worth asking: what are they, why do we care, and what do the changes I'm making actually do for us?

Obviously, at its simplest, a “device” either changes the world outside the computer, or produces data based on the world outside the computer, or else it makes some data-to-data transformation task easier.  Those are, basically, the only things a device could even possibly provide: output, input, or services.  If you want one of those three things, you are going to be touching a device, even if that's only with an OS-standard ten foot pole mounted to the nearest wall.

An application developer's life is usually pretty simple: if we need input, we take it, if we need to output, then we output, if we need some service, then we talk to it.  Usually, someone else simply handles these things, in fairly standard ways.

The trouble starts whenever the OS itself either doesn't care about a device, or makes faulty assumptions about how you want or need to use it.  Say you have a TV across the room and you want to cast video to it. Modern OSes tend to only do two things with monitors: extend the desktop onto it, or copy some other monitor's output onto it, and neither of those may serve your needs well.  That's why in 2013 Google put together a tiny little dongle with a surprisingly important feature: you could just send video to it.  It took input and produced output, entirely separate from the normal desktop (or phone) metaphor that we had become familiar with.  And while I won't say that has or hasn't changed the world, I for one am grateful for such a device.

You can understand devices like that with a simple, generic model.  They have some requirements (the Chromecast needs internet and video out), have some capabilities (it will display internet video streams on the TV), and have some way to control them (a remote, or network packets).  When I talk about devices, I pretty much mean anything that fits that extremely broad definition of input, output, and control.
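That generic model is small enough to write down directly.  The following Python sketch is only an illustration: the Device class, its field names, and the usable_with helper are invented here, and the Chromecast entries simply restate the example above.

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    """Hypothetical record for the generic device model described above."""
    name: str
    requirements: set = field(default_factory=set)   # what it needs to function
    capabilities: set = field(default_factory=set)   # what it can do for you
    controls: set = field(default_factory=set)       # how you tell it what to do

    def usable_with(self, available):
        """A device is only usable if everything it requires is available."""
        return self.requirements <= set(available)


chromecast = Device(
    name="Chromecast",
    requirements={"internet", "video_out"},
    capabilities={"display_internet_video"},
    controls={"remote", "network_packets"},
)

print(chromecast.usable_with({"internet", "video_out", "power"}))  # True
print(chromecast.usable_with({"video_out"}))                       # False
```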

So it may surprise people who don't look at how OSes work that an average programmer probably doesn't know how to get a list of all devices present on a given machine, can't tell what a random device is capable of, and can't figure out how to control that device.

On all major operating systems today, the device model can generally be broken up into two categories: if the OS itself uses it, then the OS takes care of it and you live with the OS model for better and for worse.  Otherwise, it may be technically available but you'd better find a library or application that knows what it is and how to use it, because they ain't touchin' it.

And that's… that's kind of it.  There isn't a whole lot of nuance here.  Either the OS takes care of it, or you're on your own.  There are some cases that are better or worse, but broadly… yeah.

Why?  Well, that's not hard to understand.  Someone has to invent a way to deal with it.  If that someone works for Microsoft, then Windows can handle it; if they work for Apple, then Macs can handle it; if they enjoy open source projects, Linux can probably handle it.  If they are part of some other company, such as the manufacturer of some device, they will provide an application and/or library that works and consider their job done.

Unless the device is really, mind-bogglingly important (and/or you convince people that it is), nobody is going to reshape how the OS works in order to help your device fit into the ecosystem better.  And… it has been that way since the dawn of computing.  Generally speaking, everyone involved would rather make small changes than big ones, not least because if you make big changes, consumers may not like it and you may be out of business in anywhere from two months to five years.

Something like MAD's OS project doesn't necessarily need to reinvent the wheel so hard that it changes the way the world itself turns.  I may end up sounding like that's what I'm doing, because I want to paint a picture of what MAD is capable of, but you could just… add some of MAD's features to existing OSes, or as a third party structure on top of them.  To wit, devices.

But first: IPC.

What is IPC/RPC and Why Should I Care?

Inter-Process Communication and Remote Procedure Calls are both fancy ways of saying that two programs talk to one another.  Generally, IPC happens between two programs on the same computer, and RPC is the same but going between machines; there are some differences there, including important ones, but let's set that aside for the moment and just talk about how computers work in general.

I said in the last segment that you can divide device support into “The OS Does It” and “You're on your own”.  Categorically, the same is true with talking to other programs.  If the OS does it (and by it, I mean, making some given program do something specific for you), then there is a function, one that probably works very similarly to every other OS function that you have had to use while learning to program, which talks to that system program and then gets back to you.  If the OS doesn't care about that specific thing, at best you will find a library or programming language that makes the task of talking to some specific programs just as easy.  (There is an aside here for scripting, but that requires a whole ecosystem to set up and so, like OSes themselves, can be difficult to make pivot if a change is necessary.)

However, it is at least as likely, if not more, that you will have to do a whole lot of research into how that specific program works and how developing IPC/RPC mechanisms works in general, and then do a whole lot of work to make the mechanism talk to the specific program - oftentimes to a specific version of that specific program, because that might change.  And this is why we have kind of standardized on IP and web services nowadays - everywhere you see them pop up, there are a whole lot of tools that almost everyone involved did not need to invent.  Reinvent, sometimes, but not from scratch.

IPC/RPC mechanisms, however, do still need to be made.  If you have a program, say, one that takes your company's new fancy keyboard with per-key LED lights, and exposes the light-output capabilities of that keyboard to the user (letting them make all the colors their own brand of pretty), you will probably not design a server into that program which lets other programs talk with it.  And why would you?  That sounds like a lot of work!

But look at it from another app developer's standpoint - if they want to make a key flash when a notification comes in, say, they have three general options: learn how the keyboard driver works (which may be, if I may summarize, very custom), interact with the manufacturer's application (which we just ruled out), or interact with some third party library which can do one of the other two options for them.  If nobody has or wants to make that third party library, they're also out of luck.

MAD takes a tack that may be very alarming, which is that all programs should by default have some IPC mechanisms - it's part of making them distributable across multiple pieces of hardware, after all.  But you could also… you know, just, use that same mechanism to expose some functionality.  That exposure doesn't need to be reckless; you can maintain some controls, make sure the user is involved, something like that.  But if we decided that IPC/RPC should be easy, maybe we should expect you to interact with other applications more often.

And the point of deciding that it should be easy, is that having made that proclamation, there ought to be a very standard way to do RPC, such that it is just as easy to work with as when the system takes over handling that task for you, because you're talking to a system component and they know how to do that.  If we want programmers to take it at all seriously, you have to make it easier for programs to interact fairly regularly.

And if you're going to interact more often, it might be best to have standards for communication.  And if you wanted, perhaps, to make that OS/DIY dichotomy even less severe… then OS IPC should work in the same way general application IPC works, adhering to the same standards.  That way, people who are forced to learn OS methods when they start programming, don't need to learn a whole new separate thing when they start dealing with new and more interesting tasks.

Which brings us back to devices, or rather, device drivers.

Making Devices Accessible: The System API Directory

Again, this is all stuff I've talked about before, but let's dial it back.  You have devices on your machine.  How do you interact with them?

Well, generally, every device you'd like to talk to exists on the far end of some kind of connector, and what is exposed at the very bottom of the CPU/Kernel system is the near end of that connector, not the device itself.  Many of these connectors require a little bit of finesse to use properly, so there may be a relatively complex driver just to ensure that any message you send across that bus gets where it's going, meaning you will be using a slightly higher level API that accounts for the bus itself.  Only once you are in contact with that other device can you talk about, you know, talking to the device itself.  Now, modern CPUs aren't just bit-banging busses (meaning the CPU itself isn't hot-looping to make sure that every bit gets sent at the correct time, or anything similar); they use support chips, microcode, and other conveniences that might be relatively specific to the processor/motherboard combo you are using, but there are standards and firmware to sort of even out the complexities of making that part of the system work.

Once you can talk across a bus to a device, you have to know what language it speaks.  If you know exactly what device you're talking to, this shouldn't be hard - the manufacturer should list all the commands, what they do, what the return data means, and so on.  A few manufacturers prefer to keep some secrets… but let's not get into that.  Some things that you will be interacting with on the other side of a bus are themselves programmable, and you may be sending code to them (which is a whole other thing that is only partly covered by MAD/ADA), but frequently, you are just issuing commands that make it do whatever it is you want to do, and listening to the bus for both expected replies and updates caused by an event.

The software which converts requests from a programmer to commands that arrive at the device, and responses from the device to normal software events and function returns, can be understood as drivers.  This is where the OS/DIY dichotomy comes into play: the OS will keep track of devices it cares about, and ensure that there are very standard ways to interact with drivers it cares about.  Devices that the OS doesn't care about are not well kept track of, and the ways to interact with them are not terribly standard - at least, not outside of the specifications of whatever bus they are on the other side of.

The MAD/SAD proposes to tackle these two problems at the same time.  It asks device drivers to expose a standard API and an implementation library; programs that utilize the API can link to that library and they will be interacting with the device itself.  Then, all device drivers are listed in a filesystem directory, indexed by a key representing the API, and linking to that library.  If more than one device implements an API, even if they use the same library to do so, the implementations are listed separately and configured differently, so when your app links to a given device's library, you will be talking to that specific device when you make API calls to it.
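A minimal sketch of that directory, assuming an in-memory dictionary in place of the real filesystem: the ApiDirectory and Listing names, the library file name, and the device addresses below are hypothetical.  The point it illustrates is that two devices sharing one library still get two separate, preconfigured listings under the same API key.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Listing:
    """One configured implementation: a library already bound to one device."""
    api_key: str
    library: str
    device: str


class ApiDirectory:
    """Hypothetical in-memory stand-in for the filesystem directory."""

    def __init__(self):
        self._entries = {}  # api_key -> list of Listing

    def publish(self, api_key, library, device):
        """List one device's configured implementation under an API key."""
        listing = Listing(api_key, library, device)
        self._entries.setdefault(api_key, []).append(listing)
        return listing

    def implementations(self, api_key):
        """Every configured implementation of one API, one per device."""
        return list(self._entries.get(api_key, []))


sad = ApiDirectory()
sad.publish("char/keyboard", "libusbkbd.so", "usb:1-2")
sad.publish("char/keyboard", "libusbkbd.so", "usb:1-4")  # same library, other device

for listing in sad.implementations("char/keyboard"):
    print(listing.library, "->", listing.device)
```

A program that picks one of those listings would be talking to that one device, with no separate "ask the system which devices exist" step.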

That on its own represents a massive difference to how libraries and devices exist in any modern operating system.  Today, libraries are just things that may or may not exist, and devices are just things that may or may not exist.  You can't just link to a library and be talking to a device - you link to a library and then it will let you ask the system if any devices exist, because the library itself isn't configured, but is instead generic.  Likewise, you can't just point to a device and get a library that interacts with it.

When I talk about “You can just add shit from MAD”, I'm talking about shit like this.

But I go a little further with MAD/SAD.  I mentioned the libraries are indexed by an API key (in the filesystem, but you can ignore that for the moment), and if you've read any of my articles, you know that the API keys are in a tree, corresponding to an inheritance model.  And since I'm trying to explain all this simply, let's touch on that for a moment.

The API itself promises that the library attached to it will have certain functions; if you call those functions, you will get certain results, and the API makes all that very clear.  That doesn't mean that those are the only functions in the library - it's only a promise that if you go looking for certain ones, you will find them.  So naturally, a library may implement more than one API, each time exposing different functions.

Likewise, an API may provide different levels of what amounts to the same API.  It may provide a dirt-simple one, and a slightly more complex one, and then a far more complex one.  Generally, the simplest one gives you the fewest ways to customize it, but it also means you need to know the least about what kind of specific device you're using; the most complex one requires you to know exactly what the device is, but may essentially give you direct access to everything the device is capable of.  Since you choose the API when you write the program, you get to decide how specific the API you're targeting is.

Thus, in the MAD/SAD API tree, the folders closest to the root of the tree contain very simple APIs that require you to know very little about the devices, and which apply to far more devices, but which also expose very few features - and then there are more complex APIs that require you to know what you're asking for, but give you more control, until eventually you get to a library which may provide you direct access to the device with no hand holding whatsoever.

But in an inheritance model, you don't necessarily hide the API that came before - the parent and child APIs coexist, and the child API is obliged to expose every function the parent does.

Suppose you have a device that acts as a source of character data - that may be, for example, a keyboard, but it also describes files, network streams, and some busses like serial ports.  In any of those cases, the API will provide you with a generic “read” function that provides you with a “firehose” of data sent directly from the device, and probably also a generic “write” function to shout character data back down the pipe to be heard at the other end.  Questions like “What does the data mean?” and “What should I say to this device?” are not answered at all by this level of API.  If you use that level of API, you will not know what device is on the other end, or if there even is a device at the other end, not when you are writing the program itself - the user may know, but that's not your business.  Maybe the ‘device’ is not even hardware; maybe it's software, or maybe it's a black hole that says nothing and eats all input (aka /dev/null).

That same low-level API can be more useful to a programmer that knows what the device is, but when you know what device it is, you can usually also provide better service than just “read” and “write” functions.  Not always - some things really are that dumb and simple - but usually, you can provide an extension to that character device which provides new functionality.  Under MAD/SAD, that extended API will only be presented if the device promises that the library, and the device, implement those functions.  It will also still be listed among the low-level, raw character device drivers, because the extension adds to the lower level, instead of replacing it.

Suppose the character device in question is your fancy backlit keyboard, for example; it may have three or four levels of API just in the ‘character device’ tree, to say nothing of (eg) a ‘raw usb device’ tree.  First, your keyboard is a generic character device; second, it is a keyboard (the output should be understood as keys, a keymap may be involved, and you can send commands to turn on capslock and so on); third, it may implement some standard “backlit keyboard” API, if there is one, and fourth, there may be a vendor-specific “This is our backlit keyboard” API that has some juicy extras the generic API does not.  This last API extends the other three; the third API extends the first two; the Keyboard API extends the character one.  But if you really want, you can just open the keyboard as a character device and listen to every keystroke that comes in down the line, and shout random character strings down the line just to see what happens.
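The four levels map naturally onto ordinary class inheritance.  In this Python sketch, the class names, method names, and API key strings (other than char/keyboard) are assumptions made up for illustration, and the method bodies are placeholders rather than real driver code.

```python
class CharDevice:
    api_key = "char"

    def read(self):
        return "raw data"            # placeholder for the firehose

    def write(self, data):
        return len(data)             # placeholder: pretend it was all sent


class Keyboard(CharDevice):
    api_key = "char/keyboard"

    def set_capslock(self, on):
        return on


class BacklitKeyboard(Keyboard):
    api_key = "char/keyboard/backlit"

    def set_key_color(self, key, rgb):
        return (key, rgb)


class VendorBacklitKeyboard(BacklitKeyboard):
    api_key = "char/keyboard/backlit/vendor"

    def play_animation(self, name):
        return name


def listed_under(driver_class):
    """Every API level a driver appears under, simplest first."""
    levels = [c.api_key for c in driver_class.__mro__ if "api_key" in vars(c)]
    return list(reversed(levels))


print(listed_under(VendorBacklitKeyboard))
# ['char', 'char/keyboard', 'char/keyboard/backlit', 'char/keyboard/backlit/vendor']

keyboard = VendorBacklitKeyboard()
print(keyboard.read())  # the plain character-device call still works
```

Because each child extends its parent instead of replacing it, the vendor driver shows up at all four levels, and a program written against the plainest one never needs to know the others exist.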

Of course, the OS will recognize keyboards as something special, because the OS knows that most apps will care about keyboard input in one way or another - so it will make use of that second level of API (char/keyboard) automatically.  In fact it will probably create a virtual keyboard device that collects input from all keyboards (which should be a standard itself) and uses that instead of any specific device, because then the user doesn't need to change how anything is configured when they plug a USB keyboard into their laptop.  But the real point is, we have made the generic way to reach devices good enough that the OS can just use this tree to do what the system normally does, instead of making very unique and specific ways for the system to make critically important devices accessible, ways that kind of clash with how we program other applications.

Sounds nice, right?  Maybe a little boring.  But the way I introduced this topic was about APIs and libraries - it has nothing specific about devices in it.  Any persistent application that wants to provide an API and library can do so, even if it's not a driver or even a background service - and that application's APIs will be listed in the directory, same as any other.  So… if you install a new database service, say, you will be able to find it in the list of all databases on the system, even if that database is a foreground application for some reason.

Now, here's a funny little twist (which I've already talked about in another post): when you use IP and ports to connect to running applications and services, you normally locate services by a standard port number, because that's how it was when the internet was created.  Since more than one service can't share that port, other copies of the same service have to have other ports; there isn't a standard for how to do that, so the user or service has to figure something else out.  You may be able to examine the other server processes and find out what ports they have open, or examine all opened ports to see if they sound like that service, but neither of those is a smooth, well-defined operation.  Under MAD/SAD however, if you know what API the service provides, you have a list of all of them that are running, and each item in that list is preconfigured to let you talk to it.

Moreover, you don't necessarily need to publish to the SAD globally.  Remember, I said the API keys are published in the filesystem; you might publish such a configured library file privately, if you had, say, an embedded database in a larger application, and you only wanted the parent application to talk to it.  Linking to that preconfigured library would connect you to the database; it need not be exposed to the network, or configured to deal with the network.  Presumably, an administrator or user could find it if they knew where to look, but that's about all the exposure it would have.

The same mechanism that helps you find and utilize devices also helps you utilize things that are not listed?  Wow, it's almost as though it's just a better way to utilize things.  But what exactly is this mechanism?  I took it for granted that you can dynamically link to a library and it will be preconfigured to talk with a specific device.  That is… also not how things work.

Perhaps more to the point: Just before this I was talking about IPC/RPC and I never got back to it.  Why did I use that segment to introduce this one?  We already have dynamically linked libraries, even if they aren't preconfigured.  Making preconfigured libraries would be a fairly straightforward thing that has nothing to do with IPC/RPC mechanisms.

Well… the point with IPC was that MAD argues every running application should expose IPC endpoints by default, meaning that if you write a service or application, you can interact with that application directly from another application.  Of course, if you want to interact with something via some IPC/RPC mechanism… you really should have standards.  Like an API.

What's the difference between calling a function embedded in a linkable library, and calling a function via an IPC mechanism directly?  Well… the linking bit, I suppose.  And it matters exactly what application and user is “running” the code, as far as the system is concerned.  But if the library mostly exists to provide a translation layer, between the published API and IPC endpoints exposed by a service… what if there just wasn't a need for a translation layer?  Or rather, what if the “translation layer” was a bog-standard system call?

What if your client application, the one that wants to use the API, can't tell the difference between linking to a library and directly talking via IPC with a service?  What if the programmer involved need never care?  What if they simply… use the API?  What if they do so by using a system call that does whatever is needed to reach the target function?
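Here is a toy Python sketch of that idea, under heavy assumptions.  The api_call function stands in for the hypothetical system call; LocalLibrary and IpcService are invented names for the two backends; and the "service" is just a thread on the other end of a socket pair speaking one JSON message per line, which is a made-up wire format chosen only to keep the example short.

```python
import json
import socket
import threading


class LocalLibrary:
    """Backend that calls functions in-process, like a linked library."""

    def __init__(self, functions):
        self._functions = functions

    def invoke(self, name, args):
        return self._functions[name](*args)


class IpcService:
    """Backend that forwards the call over a socket to a running service."""

    def __init__(self, sock):
        self._reader = sock.makefile("r")
        self._writer = sock.makefile("w")

    def invoke(self, name, args):
        self._writer.write(json.dumps({"fn": name, "args": args}) + "\n")
        self._writer.flush()
        return json.loads(self._reader.readline())["result"]


def api_call(backend, name, *args):
    """Stand-in for the single hypothetical system call: same shape either way."""
    return backend.invoke(name, list(args))


def serve(sock, functions):
    """Toy service loop: answer one JSON request per line."""
    reader, writer = sock.makefile("r"), sock.makefile("w")
    for line in reader:
        request = json.loads(line)
        result = functions[request["fn"]](*request["args"])
        writer.write(json.dumps({"result": result}) + "\n")
        writer.flush()


functions = {"add": lambda a, b: a + b}
client_sock, server_sock = socket.socketpair()
threading.Thread(target=serve, args=(server_sock, functions), daemon=True).start()

# The calling code is identical for both backends.
results = [api_call(backend, "add", 2, 3)
           for backend in (LocalLibrary(functions), IpcService(client_sock))]
print(results)  # [5, 5]
```

The caller never learns which backend it got; that decision would live with whoever configured the listing, not with the programmer using the API.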

Bringing It All Together (By Taking It Back Apart)

Now… Does this all sound like hyperbole?  Does it sound like I think this is all easy?  Because this isn't all easy.  Let's look back at a list of things described in this chain of logic that are new:

  • Have all applications use detailed IPC hooks by default
  • System call that translates API call into IPC to a running service
  • Configuring linked libraries at runtime so that they point to a specific device or service, and using such libraries
  • System call that translates API call to loading and running a configured, dynamically linked library
  • The two aforementioned system calls are the same system call
  • Listing APIs in a systemwide tree
  • That systemwide tree is in the filesystem
  • That tree is an interface-inheritance tree (or some other semantic structure)
  • Allowing the API system call to be directed towards libraries that are not in the system tree
  • The system uses the API system call for basic functions (where feasible)

Now, we could quibble, and maybe I've missed some small points, but if I have an idea that requires making ten changes to the way operating systems work, some of them pretty freaking fundamental, then you certainly can't take all of those changes and tack on to the end,

“Now do this for a dozen systems linked together into a gestalt system.”

Yeah, that sounds a little crazy, huh?  About that.

Anything, Anyplace, All at Once

Funny thing about those changes.  One of the key themes is that it lets you address and control devices and services as though they were files.  (In the case of applications, though I've not talked about it here, there's also a big portability thing so that you can run them from anywhere, but same basic idea - an application being accessible means it being runnable when all you have is the “file”.  See MAD/SAD AF)

The point of them being accessible as files was that it's not hard to have a filesystem that represents multiple machines at once - you basically just make a folder with multiple machines in it, and list out the contents of each machine under them.  That's literally all it takes; you can list the entire filesystem, or just what you think is important, or both in different places.  Sticking trees on top of trees… is just how trees work.
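As a tiny illustration of trees on top of trees, here is a Python sketch that uses nested dictionaries in place of real filesystems.  The machine names, folder names, and the resolve helper are all made up for this example.

```python
def resolve(tree, path):
    """Walk a nested-dict tree along a slash-separated path."""
    node = tree
    for part in path.strip("/").split("/"):
        node = node[part]
    return node


# Each machine's own tree (contents are invented for illustration).
laptop = {"devices": {"keyboard": "laptop keyboard driver"}}
camera_host = {"devices": {"camera": "camera driver"}}

# The combined tree is just a folder with one subfolder per machine.
network = {"machines": {"laptop": laptop, "camera_host": camera_host}}

print(resolve(network, "/machines/camera_host/devices/camera"))  # camera driver
print(sorted(resolve(network, "/machines")))  # ['camera_host', 'laptop']
```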

The question becomes: what's the best way to make use of that?  UNIX already has the general philosophy of “everything is a file”, so in theory, you could make a distributed system where they simply… all use a distributed filesystem listing all files everywhere all at once.  But first, modern machines don't actually stick to “everything is a file,” and second, sometimes interacting with things at the file level is very… back and forth.  Which is something you may not want to do over a network, where every back and forth has significantly longer latency than a local call.  Like… thousands of times longer, maybe.  And also, you would need some universal addressing schema for this multi-machine directory so that path strings work the same everywhere and are unique.
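To put rough numbers on the back-and-forth problem, here is a back-of-the-envelope calculation in Python.  The two cost figures are assumptions picked to match the "thousands of times longer, maybe" ballpark, not measurements, and the count of 500 exchanges is arbitrary.  The last line estimates what the same work would cost if the chatty part ran next to the device and only one trip crossed the network.

```python
LOCAL_CALL_US = 1        # assumed cost of one local call, in microseconds
NETWORK_TRIP_US = 1_000  # assumed cost of one network round trip, in microseconds


def total_us(round_trips, cost_per_trip_us):
    """Total time for a number of sequential back-and-forth exchanges."""
    return round_trips * cost_per_trip_us


exchanges = 500  # an arbitrary, chatty file-level conversation

all_local = total_us(exchanges, LOCAL_CALL_US)
all_remote = total_us(exchanges, NETWORK_TRIP_US)
one_trip_then_local = total_us(1, NETWORK_TRIP_US) + total_us(exchanges, LOCAL_CALL_US)

print(all_local)            # 500
print(all_remote)           # 500000
print(one_trip_then_local)  # 1500
```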

But anyway, so you want to do some things locally, to avoid thrashing the network.  That means running a program on the machine that hosts some specific device.  How?

We already have some technologies to do this, and the one that most people will be familiar with is the “serverless” cloud infrastructure made famous by services like AWS Lambda.  The point of the serverless model is that a program can be loaded on any machine that has room, and you just keep track of where it actually is loaded, so that whenever someone goes looking for it, you can direct them.  If that program is part of a larger structure, have it keep track of where other pieces of the same program are, in case they need to talk.
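The bookkeeping described above can be sketched in a few lines. This is only an illustration under simple assumptions: nodes have a fixed number of free slots, placement is first-fit in name order, and the class and method names are hypothetical rather than taken from any real cloud API.

```python
class PlacementRegistry:
    """Hypothetical sketch: load programs wherever there is room,
    and remember where each one ended up."""

    def __init__(self, capacity):
        self.free = dict(capacity)   # node name -> free slots
        self.location = {}           # program name -> node name

    def load(self, program):
        """Place a program on the first node with room (or reuse its placement)."""
        if program in self.location:
            return self.location[program]
        for node in sorted(self.free):
            if self.free[node] > 0:
                self.free[node] -= 1
                self.location[program] = node
                return node
        raise RuntimeError("no node has room")

    def locate(self, program):
        """Direct a caller to wherever the program is actually loaded."""
        return self.location[program]
```

A real scheduler would weigh far more than free slots, but the shape is the same: one table answering "where is it?" for anyone who goes looking.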

And that's basically the ADA (it's not, but never mind).  It's serverless infrastructure designed for an operating system and home network.  All you're really doing is running a program somewhere else, so that when it accesses a file that is actually a driver or service, it is doing so locally instead of across the network.

And that's it.  With that, everything works!  Victory fanfare, parades, we all go home.  Right?  Well, sort of.  No, not really.

Clear the Air (Once Again)

You see, part of the problem with the ADA is that if you do it wrong, people will hate to touch it, let alone depend on it.  They will avoid it, frankly, because people are lazy.  And while you could, maybe probably, have the system still work when programmers avoid it, by doing a bunch of extra legwork across the network, if developers being lazy means that the experience sucks for the user, nobody will want to be a user.

Even if I'm right about how this all should work, if we do a bad job on the implementation, that's it - game over, we lose, things go back to the comfortable old ways with minimal changes.  That's why I did a lot of extra thinking about how certain things should work, most of which sounds crazy out of context, because I'm planning for a future where OSes work very differently than they do today, and then on top of that, computers are combined in some way that developers don't yet understand.  So if I were, today, to ask a developer to operate under MAD rules, or to just assume that my OS will work the way I say it does… they would be totally justified in telling me to piss off.

They'd have to be mad to believe in MAD, before it's proven.  And that's kind of the point, at least of this post.  Some of the things that make the MAD system work, might be implemented on top of a normal operating system.  The API inheritance tree doesn't have to be the only way drivers work; it doesn't even have to be the canonical system way that drivers work.  But if it's there in a form that people can use it, then other things I've said might start to slot into place.

Likewise, you can have an IPC mechanism that applications use, which is separate from the normal system IPC mechanism - it would just have to use some higher-level mechanism, like an IP:Port interface, with extra code wrapped around it.  This IPC mechanism could let applications and services publish APIs into the same tree, and you could talk to them the same way you talk to drivers.

That thing about using a system call to reach the tree… well, if it's not part of the system, you'd simply have to use a standard server on each machine.  Aside from that (and it would be inconvenient), things could work about the same, with it farming API requests off to the appropriate source.
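As a rough sketch of what that per-machine server would do, the following keeps a table of published API paths and farms each request off to whoever published it. Everything here is an assumption for illustration: the `ApiTree` name, the path layout, and the in-process callables standing in for real drivers and services.

```python
class ApiTree:
    """Hypothetical stand-in for the per-machine server: drivers,
    services, and applications publish callables under tree paths."""

    def __init__(self):
        self._handlers = {}

    def publish(self, path, handler):
        """Register a callable at a path; each path has one owner."""
        if path in self._handlers:
            raise ValueError(f"{path} is already published")
        self._handlers[path] = handler

    def call(self, path, *args):
        """Route a request to whichever source published the path."""
        if path not in self._handlers:
            raise LookupError(f"no API published at {path}")
        return self._handlers[path](*args)

tree = ApiTree()
# A pretend driver and a pretend application share the same tree.
tree.publish("/drivers/led0/set", lambda on: "on" if on else "off")
tree.publish("/apps/notes/count", lambda: 3)
```

The caller never learns whether `/drivers/led0/set` is backed by a driver, a service, or an application, which is the property the tree is meant to provide.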

And the distributed, agent-oriented MAD/ADA programming model?  Part of its genius is that you don't write drivers and services to listen on the network - they all expect to be contacted by a process on the same machine (IPC, not RPC), which is the same way they would be used if the ADA doesn't exist in the first place (well, mostly - there's user authentication, security and so on that needs to be addressed, but forget that for now).  Meaning you could design and run such services today, if the rest of the ecosystem existed.

Is all of this still a lot?  Hell yes!  But for all of that, it's still only half-mad, because it's less than half of MAD.  Still feasible, still important, still better.  But easier to understand, less intrusive, and I presume, vastly less intimidating.

Well, we'll have to see.

 

Monday, January 12, 2026

Why MAD?

This is another blog post that I've ended up banging my head against, writing and then discarding multiple, relatively long revisions.  Unlike the last several, however, this is not about describing a technical component of Project MAD; it's about the opposite.  The truth is, when you distill a distributed applications model down to its essence, it sounds so simple as to be irrelevant. And indeed, the model - absent proper support - might do little enough.  It really needs to exist inside of a context that makes it more valuable.

And the funny thing is, a properly supported model for distributed applications is insanely valuable.  There are thousands of examples, spread across hundreds or thousands of different domains, where we are having difficulty writing applications that combine the resources of two or more computers together.  In the world today, we are expecting and hoping that companies and open-source projects will each create solutions to each of these examples on a case-by-case basis.

Those are what I call special-case solutions to the problem of distributed applications.  What I am searching for is the general-case solution, one that fits, if not all of these examples, most of them.  A single technology would make it easier to create solutions for these problems, but alone, a distributed application model still expects - nay, demands - that programmers solve the hard problems involved in making applications work across long distances.  Those hard problems are the kind of thing that needs to be solved centrally, on an operating system level, so that each application programmer can go about doing the thing they are there to do.

But it's worth laying out the argument, as best I can, piece by piece.

Why Distributed Applications?

I said above that a properly supported model for distributed applications would be insanely valuable.  It's worth running through some examples of things that are wrong with modern computing that come down to, essentially, the lack of general distributed applications.

There are a lot of examples of simply inadequate hardware.  My dear mother has a relatively recent laptop that isn't upgradable, and is now all but useless for a non-technical user.  (I'm sure I could install Linux and it would work fine, but she doesn't want it)  That's an example of something that's too cheap; but modern GPUs are getting too expensive because they are trying to get a single chip to do a massively parallel task all at once.  If someday distributed applications (generically, not counting special cases) can split their GPU load among multiple processors as easily as they can put it on a single one, a lot of people will be a lot happier.  Making applications distributed likewise helps with inadequate memory as much as inadequate CPU threading; you still need the same amount of memory, but today you may already have the excess, perhaps locked up in old laptops or desktops where there's no good way to use it.

It goes beyond computers that are poorly designed.  Smart devices, including TVs, watches, glasses, car dashboards, VR headsets, and so on, are not generally meant to run an entire generic PC for their user, but they have enough hardware to serve as a decent front end - but the process for making good use of them is spotty at best and terrible at worst.  It isn't even, to my knowledge, acknowledged that these are all examples of the same class of problem: finding a good way to make use of compute, display, and sensors distributed across many devices.  That's part of what I'm trying to solve, frankly.

There's also ownership of data.  Home cloud systems exist (I use one), but larger companies put a lot more development work into the proprietary versions, so those end up more advanced and user-friendly.  A large part of making these cloud-compute systems, however, involves solving the fundamental problems of distributed applications, if only in a specific case - finding a free compute node, authenticating users, deploying and updating applications, error handling, and so on.  If there is a general-case solution, and one that users can deploy at home or in small businesses with relative ease, then the value proposition of cloud computing isn't as strong.  This not only relieves end users, who don't have to trust the companies, but it takes companies out of the business of both safeguarding and somehow monetizing other people's data, which is an unfortunate contradiction to be in charge of.

The last example I'll point out for now is administration.  It's not trivial for non-technical users to manage home servers, not least because managing the server is simply a separate task from managing your PC, so it's something that home users need to explicitly remember to do.  Where your PC might (with varying levels of obnoxiousness) force you to update apps, drivers, and services, currently any machine that isn't in the foreground must either do things automatically or… not do things automatically, and leave the user to their own devices.  Distributed administration means that the state of the entire system is always in the foreground; if some part needs updating, it is the same whether that part is local or remote.

In the same vein, because the distributed application model that I'm proposing is about an application expanding to use remote resources, the applications are not split into separate client and server components that must be updated and managed separately.  A unified model means that local and remote pieces are understood as parts of the same whole, and that includes installation, deployment, and updates.  Between distributed OS updates and application updates, the entire environment that is being used - even though it is on multiple pieces of hardware - is all in scope at once, and all managed at once.

That, however, is all about distributed applications.  You could make a distributed application that manages existing operating systems. But MAD is more than just the app model, which raises the next question.

Why a Distributed Operating System?

When I originally created MAD, I took it for granted that we would need a new operating system to handle the management and operation of a distributed system.  Frankly, I merely assumed it; it's fair to wonder whether or not a distributed-first operating system is ultimately required.  Indeed, if you browse my blog, you will find a lot of concepts that seem to be unrelated distractions from the core concept.  And in some cases, that's probably exactly right; you'll never catch me saying that MAD is perfect.  Given the name, that'd be kind of silly.

The distributed operating system concept is not the distributed application model.  They are separate, and must be argued for separately.  And quite frankly, I can envision a future in which distributed applications are simply run on top of existing operating systems with little or no change to those operating systems.  Frankly, managing distributed applications comes down to a software service; it doesn't require a clean slate.  I imagine that a lot of people, assuming they managed to figure out what I'm saying among all the stumbling explanations here on my blog, would make that point themselves.

There are two fundamental counter-arguments I'll make.

First, the distributed operating system is a collection of standards, and there should be a collection of standards that applies to distributed applications even if they are being run on top of existing operating systems.  To some extent, when I'm talking about the distributed operating system, I'm talking about the environment that exists inside of the distributed ecosystem itself, rather than the physical disk images on top of which the system itself operates.  It's important that applications and the systems onto which they are deployed, can agree on some terms, like what resources are available, what deployment constraints an application fragment needs fulfilled, how standard functions and hooks work, and so on.  It's astonishingly easy to imagine getting that wrong and creating two or more completely incompatible systems, or ones that will require serious shims and compatibility patches to be made to work later on.
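One small piece of that agreement, matching an application fragment's deployment constraints against what each node advertises, might look like the sketch below. The resource names (`cpu_cores`, `memory_gb`, `gpu`) and the "at least this much" rule are assumptions made for the example; a real standard would have to pin down exactly these details.

```python
def eligible_nodes(nodes, constraints):
    """Return the names of nodes whose advertised resources meet every
    minimum in `constraints`. A resource a node does not list counts as 0."""
    return sorted(
        name
        for name, resources in nodes.items()
        if all(resources.get(key, 0) >= minimum
               for key, minimum in constraints.items())
    )

# Made-up inventory for a small home network.
nodes = {
    "desktop": {"cpu_cores": 8, "memory_gb": 32, "gpu": 1},
    "old-laptop": {"cpu_cores": 2, "memory_gb": 4},
    "pi": {"cpu_cores": 4, "memory_gb": 8},
}
```

If two systems disagreed on what `memory_gb` means, or on whether a missing key means zero or "unknown", the same fragment would deploy differently on each, which is the incompatibility the paragraph above warns about.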

The other argument to be made is that operating systems, even open-source and user-focused operating systems, have become bloated - not because of any technical failing on anyone's part, but because they were developed over time, and different components represent different ideologies regarding how a computer should be programmed.  One needs only familiarize themselves briefly with any one of the many arguments about Linux init systems to see that we are operating a complicated system on top of mixed metaphors.  I'm well aware that any OS I create or am involved in, will not be the final operating system, and assuming it survives, it will tend back towards the current state of chaos over time; thus, that itself can't be my argument.

But the various technologies I have placed generally under the OS category of Project MAD are all distillations.  For example, the System API Directory and Application Folders frameworks together unify files on disk with the abstract concept of data, as provided by drivers, services, and applications.  While I expect to have arguments about this framework with experts someday, the point (as I have made it in my last couple blog posts) is having a consistent metaphor that works for all data in a distributed system, from the smallest bits of application state, to LED light indicators exposed by drivers, to database sockets, to embedded applications and libraries.  One possible interpretation of the SAD/AF framework (not necessarily the one I'll push for) implies that any variable stored persistently in any part of any application is addressable (if not necessarily accessible, or even listed) via the directory, uniquely, so that it would be the same from anywhere in the system.  And at the same time, every compiled-in function would be not only addressable but callable from anywhere, complete with full type safety and crash handling.

It's not about actually doing that.  That would be a safety and security nightmare.  But if you have a data accessibility technology built into your distributed operating system that makes that level of access possible, it can handle anything less than that.  If you can go that far, then you can manage every application process on every node of a distributed system, you can bit-bang every GPIO bus on every piece of hardware, and you can read every last byte from every disk.  And critically, you can do it all from the filesystem, where everything accessible may also be listed publicly for you to peruse, so that you may better understand the system you're working on.

A single consistent metaphor that allows you to do everything that a distributed system can do - that's valuable, and it's a quantity that is not at all guaranteed in modern operating systems.  Building the right constraints on top to prevent abuse… I'm not going to say it's a secondary concern.  But I have some confidence that security experts won't have to strain themselves to find ways to block funny looking remote procedure calls, with suspicious senders and destinations.  If nothing else, IT departments are already wrestling with similar problems.

However, the SAD/AF framework only works for applications within the distributed OS's purview.  It is my concept that even a dedicated MAD OS image will have a number of services and drivers that are not accessible remotely under any circumstances, for security reasons - but if you simply have an ADA-compliant server on a random operating system image, almost none of the applications, services, or drivers of the host system will be exposed to the distributed system.  Rather than an intentional screening of things that do not deserve to be accessed remotely, you simply have the common and average fact of incompatibility.  And that's… fine.  It'll probably end up necessary, one way or another.

But if you want to administer systems, if you want to control all their hardware, then you want the operating system image to expose as much as possible to the network (for authorized and authenticated users, of course).  And there's a fairly obvious security hole - it's entirely plausible if you merely have a remote application service, which isn't well tied to the operating system, a remote application could start a local process which the service itself cannot regulate, or even detect, becoming an out-of-band attack vector.  And while you may be able to count on normal OS tools to discover and handle such an incident, and you can have distributed administration tools that handle those OS tools, the added complexity makes the situation more precarious.

Meanwhile, because of the way distributed systems work, many nodes in your network may be used for nothing more than hosting remote processes.  Modern OS images provide a large number of services and drivers that may not be needed or used; a compute-only network node could be trivially small, and have no exposed filesystem (or an immutable one) for remote applications to attack.  While it's possible to set up, e.g., Linux systems to run at this rudimentary level, most systems are still expected to be general-purpose, and their OS images reflect that.  Systems set up for hosted compute only, especially if they have an out-of-band or self-update mechanism, may be all but untouchable to applications in a distributed system, in addition to being faster and simpler to operate.

The reason, of course, that most OS images today contain so many nuances is because the hardware they are running on top of can be so variable, and it's unclear how much of that hardware a given user might actually need.  And that brings me to my last question:

Why Modular Hardware?

When I first envisioned MOS/DCA, adding a processor to a system meant connecting little hardware devices together; the system was made of modules that would daisy-chain infinitely, and each module's hardware included absolutely nothing except its payload, and the backbone networking chip.  No general-purpose networking was involved here; it was dedicated pieces slotting together to make, effectively, a single desktop system out of hardware modules.

Hardware drivers under such a system, you understand, would be quite simple.  A CPU-only chip has no need for any drivers except the chipsets that support the CPU itself (including a fan and temperature sensor), and the backbone.  No generic networking, no printers or keyboards, no serial busses or parallel ports, no USB, no solid-state or spinning platter disk drives, no CD/DVD/Blu-ray drives.  A compute-only node not only need not do anything else, it could not.  No such accessory hardware existed.

In return for that simplicity, these modules were the very definition of commodity hardware.  I imagined they would be churned out by the millions.  In the modern world, I imagine them being built off of license-free technologies like RISC-V, driving the costs ever downwards; twenty years ago, I would have hoped for something similar, even if no such were commercially available at the time.  Nowadays, Raspberry Pi has proven you can deliver a full computer for $30, as long as you don't mind handling exposed pins and solder.

If you could make every computer you own work better with a single $30 purchase, would you really stop at one?  If plugging a Raspberry Pi into my parents' network was all it took to salvage my mother's failing laptop, she would have taken it in a heartbeat.  And the $30 Raspberry Pi comes with a graphics chip, HDMI out, and other accessories that a dedicated compute node may not need.  What really would be the ground floor, if you still want to end up with a unit that's good enough to make your distributed system better?

It goes beyond the compute only nodes.  Outside of general-purpose busses like USB, any task-specialized node only needs a few additional drivers.  Some will be complex, like a graphics unit with a GPU and multiple output ports, but if you only want to add, say, an AI accelerator chip, or a collection of data drives, the system image that runs that entire node has very few additions on top of the base layer; you will be certain that no other capability exists or is being used, because that's all that the system is.

If computer parts become that level of commodity hardware… well, I'll be honest, it will harm research and development of new chips.  That's a nice way of saying that chipmaking companies, which are already on rocky ground nowadays, could lose significant licensing revenue and sales, so long as people can get a cheaper alternative.  Some chipmakers, especially in the GPU space, are cagey about letting others understand how their technology works to preserve their competitive advantage - but commodity GPUs don't need to be individually powerful, if they can work together.  A no-nonsense alternative that's easier to program for and keep updated over the years, which gets to the same place with quantity instead of density, would certainly make that policy of stonewalling your own developers harder to maintain.

I'm trying hard not to paint those outcomes as purely good for the consumer - really, we've benefitted a lot from chip R&D over the years, in terms of increased capability and lower power consumption.  Without looking into it, I take it as given that there's graft and corruption at some levels in those big chip companies, and I don't care one way or the other.  Commodity hardware harms anyone who's trying to make high per-unit margins, whether they have good or bad reasons.  Likewise, commodity hardware comes with a massive fall in quality, inescapably.  My opinion, yours, theirs, or the average consumer's, all make no difference to market capitalism.

What holds us back from that possible future is the need to make and sell full, general-purpose computers, especially ones compatible with Windows.  As soon as China can make royalty-free, network-connected processors, that boost the power of your distributed system simply by existing, you'll be able to buy them in bulk.  If the drivers for those bulk processors are open source and can be maintained for decades to come, that will transform the computing space irrevocably.

Again - that's not an unqualified good, not if quality and R&D both nosedive.  But in our world where we already have oceans of unused hardware sitting around, perhaps it might be for the best that we start to make do with simpler, less wasteful technologies.

So Why MAD?

Everything I've talked about here is tied together by the ultimate goal of having software that expands into multiple pieces of hardware as needed.  If you can do that, you can use processors and memory and GPUs on other, even older, computers.  You can display the result on any random, network-connected display.  You can use any random piece of network-connected hardware with a keyboard as input.  You can access your home applications, not just your home files, from wherever you are.  Our existing models don't work to do all that.

The future in which they can do that is exciting.  The future in which you can control all the devices that you own, all the devices that you can reach, as though they were all in a single logical piece of hardware, that's exciting.  The idea of cheaper hardware, even if it is commodity, is exciting.  The idea of working or even playing like you're at home, whenever you have access to an internet connection, is exciting.

MAD will undoubtedly be wrong in some of its particulars.  It's probably got some fundamental flaws that we'll only discover in implementation, or on thorough review by experts.  And in the end, it's possible that MAD, being my concept alone for how distributed systems could work, will simply not be how things do work.  And I embrace all of that.

Because the future where we can do so much more, is worth a lot more than me being celebrated for writing some silly blog posts.

Friday, December 12, 2025

Beyond Unix: Typed Data

So that whole rant last time was all to make this post somewhat easier.  In truth, on its own, that post is a little bit light on content, for all that it was 3500 words.  I know, I do that; sorry.  The point is that this topic is a bit complicated.  In exactly the same way I said at the top of last post, I have a lot of rewrites of this, and even after paring it down, I'm still not sure I'm satisfied.

The point of the last post was, Unix's “Everything is a file” philosophy is all about setting up a system that people can play with.  It's more “everything is the filesystem;” files as we understand them are honestly not all that important per se, which as I said last time, is good because files kind of suck.  The filesystem doesn't store any type information, which means that you have to intuit what they are based on their name, metadata, or contents - specifically the first few bytes.
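For anyone who hasn't seen it, this is roughly what "intuiting the type from the first few bytes" looks like in practice. The four signatures below are the well-known magic numbers for PNG, PDF, gzip, and ELF; the function name and the short labels it returns are just choices for this sketch.

```python
# Well-known leading byte signatures ("magic numbers").
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"%PDF-", "pdf"),
    (b"\x1f\x8b", "gzip"),
    (b"\x7fELF", "elf"),
]

def sniff(data):
    """Guess a type from the leading bytes; return None if nothing matches."""
    for magic, name in SIGNATURES:
        if data.startswith(magic):
            return name
    return None
```

Note what this does and doesn't tell you: a match is only a hint that the rest of the content follows the format, and a miss tells you nothing at all. That guesswork is what a filesystem with real type information would replace.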

And it's that bit that I want to talk about changing today - types.

We could, in theory, pack in a ton of type information about every node in the filesystem, information that must be queried and parsed every time you look up data on that node.  And it's not a terrible idea to maintain that capability, so that some files or executables can provide custom type information - but the general goal here is to standardize.  If some given data object's type is standardized, it makes more sense to simply name the relevant data type than to embed the whole thing with each object.  To be able to simply name them, that information needs to be available elsewhere - I suppose it's possible to simply name a standard but give the user and system no information about it, forcing them to rely on some external source or unnamed standards body, but that's a terrible way to build a system.  It would be better if all the information you need exists somewhere in your system (at least, for the types that already have an installed handler of some kind, the types known to the system).

Perhaps all of that type data would be organized into a directory.  A system-wide directory centered around how you understand and interact with data types.  You know, a System API Directory.  Like the MAD SAD.

That little bit of snark, however cathartic, is also completely uninformative; it leaves you asking, “Okay, but what does that actually mean?”  And part of the reason why this blog post has taken so long to get out is that it's complicated.  So complicated that I doubt this post will be the end of this topic, just as the last few weren't.  If I'm really lucky, I'll end up confident that I can at least move on to other things for a while, though I assure you, there are always still things left unsaid.

Everything has a Type

The stated goal of data types being used to describe files, pipes, and network data streams means above all else, having an official, canonical language that describes data structures in terms of fundamental data types, just as is done in low-level programming languages.  Since there are already formal descriptive languages like that, this means either canonizing one or revisiting xkcd 927.  Either way, you have fields in some given order, each described as one or more data blocks of a fixed size and meaning - primitive types plus some low-level data structures like arrays and dictionaries, presumably.  This can be slightly simplified over the type APIs used for programming, because data-in-transit and data-at-rest have fixed size and meaning; programming structures meant to expand or be modified in memory cleanly get serialized, and ambiguities get nailed down.  Data fields in a structure that have conditional uses either are present, or are not.
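As a small illustration of "fields in some given order, each of a fixed size and meaning", here is a sketch using Python's standard `struct` module. The record layout (a timestamp, a channel number, and a value) is invented for the example, and the little-endian, unpadded encoding is an assumption; it is not a proposal for the canonical language.

```python
import struct

# Hypothetical type description: ordered fields, each a fixed-size primitive.
# Q = unsigned 64-bit integer, H = unsigned 16-bit integer, d = 64-bit float.
FIELDS = [("timestamp", "Q"), ("channel", "H"), ("value", "d")]

# "<" means little-endian with no padding, so the size is unambiguous.
FORMAT = "<" + "".join(code for _, code in FIELDS)

def encode(record):
    """Serialize a dict to bytes, following the field order exactly."""
    return struct.pack(FORMAT, *(record[name] for name, _ in FIELDS))

def decode(data):
    """Parse bytes back into a dict using the same description."""
    values = struct.unpack(FORMAT, data)
    return dict(zip((name for name, _ in FIELDS), values))

record = {"timestamp": 1700000000, "channel": 3, "value": 21.5}
blob = encode(record)
```

Here the description (`FIELDS`) is data rather than code, so the same list could serve as machine-readable documentation and as the thing both producer and consumer parse against.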

There are three objectives behind adding type data system-wide.  First is documentation; when something claims to have a given type, it's nice to know what that means (and from a local source), in both human-readable and machine-readable formats.  Second is confidence; type claims should (as in must, shall) be unambiguous, so when two programs, libraries, or data sources make the same claim, they are explicitly promising to be compatible with each other.  Providing that kind of confidence is valuable, or to be a bit more partisan, having a system that lacks that kind of confidence… sort of sucks?  Either way, that leads into the third - integration.  Once you have confidence in what something is supposed to look like, toolchains and workflows can be built to enrich the ecosystem, allowing administration, moderation, modification, and monitoring of the various systems involved.  When data types are no longer mysterious, when their meaning isn't buried deep in arcane documentation, then people can more readily make use of the data instead of making a new format that they can better understand.

Assigning type information to data blocks only serves as a claim that it is of that data type; going further and verifying that it actually is what it says, and validating that the data makes sense, requires either describing the verification process in formal language or having code available to do the type checking.  It's fair to argue that going that far is unnecessary - but I'm not so sure.  Either there will be standards around verifying data, or once again, every application developer will be obliged to do the same themselves, which I've explicitly argued against several times.

Moreover, it pays to remember that we're talking about distributed systems.  On the one hand, that means that data will commonly be sent out over a network link, introducing transport errors and attack surfaces; a standardized way to verify that the data not only hasn't been modified, but is what it appears to be, would be useful.  On the other hand, in a distributed system, it's only more critical that agents agree on not only what a data type is, on disk, but what it means.  Data types can change, but more than that, small specifics in how they are interpreted can change, whether that's because the old interpretation was erroneous, or because they want to add, remove, or modify some feature, and that affects all data associated with the system.  Similarly, competing software vendors implementing some data type might disagree on the specifics, as the way one party was using the data when they wrote the file may not match how another will be using the data when they read it.
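A minimal sketch of the "hasn't been modified, and is what it claims to be" check might wrap each payload in an envelope carrying a type name and a digest. The envelope layout, the function names, and the `example/text` type name are all assumptions for illustration. Note also that a bare SHA-256 digest only catches accidental corruption; defending against a deliberate attacker, who could rewrite the header too, would need a keyed MAC or a signature.

```python
import hashlib
import json

def wrap(type_name, payload):
    """Prefix a payload with a header naming its type and its SHA-256 digest."""
    header = json.dumps({
        "type": type_name,
        "sha256": hashlib.sha256(payload).hexdigest(),
    }).encode("utf-8")
    # A 4-byte big-endian length says where the header ends.
    return len(header).to_bytes(4, "big") + header + payload

def unwrap(blob, expected_type):
    """Check the type claim and the digest before handing back the payload."""
    header_len = int.from_bytes(blob[:4], "big")
    header = json.loads(blob[4:4 + header_len])
    payload = blob[4 + header_len:]
    if header["type"] != expected_type:
        raise TypeError(f"expected {expected_type}, got {header['type']}")
    if hashlib.sha256(payload).hexdigest() != header["sha256"]:
        raise ValueError("payload does not match its digest")
    return payload

blob = wrap("example/text", b"hello")
```

This covers the transport half of the problem only. Agreeing on what `example/text` means, as opposed to whether the bytes arrived intact, is the semantic half that the rest of this section is about.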

That means that we need to move past describing the data in terms of its literal contents - though that is also important - and describe the semantic standard, as represented by the library, application, or service that will be making use of it.  In other words, it is important to describe a file type in terms of the library or application that should be used to read it.  That doesn't necessarily mean a type can only be handled by exactly one specific library or application - but the language used to specify the library or application must be an explicit promise of compatibility between adherents.  Only an explicit promise of compatibility allows the system to have a level of confidence high enough that you can build an entire ecosystem on top of it.

Once you have that level of confidence, you reach what is, ultimately, the goal: both files, and file-like data streams, become nothing more than a buffer between the data producer and data consumer, a middleware that can be effectively ignored by applications programmers.  Once you've confirmed the two are speaking the same language, the entire logic train between the function call that writes the data, and the function call that reads it, might as well be a single atomic operation - at least in cases where the time and space between the two events don't matter.  You can pretend, for those intents and purposes, that the producer and consumer are part of the same application, or that they are transferring the data object directly from one memory space to the other instead of using complicated middleware to make the transfer.

Which is a funny thing to say, because one topic I've thus far not talked about much on this blog is the hardware aspect of MAD - and one of my ideals, there, is that the network backbone that connects modules together literally just transfers a memory block from one machine to another.  That was true from the beginning, though it makes more sense nowadays, given the assumptions of the ADA application model.  Specifically, under ADA, most of those data transfers go from one part of a distributed application to another part of the same application, which is vastly more sensible than any alternative - but either way, on a programming level, moving data from one hardware module to another can be seen as an atomic operation, at least in an ideal implementation under ideal circumstances.  You make a single call, and either it succeeds or fails, with no third state that the application itself must manage.  Granted, that's more or less how data transfer was always supposed to work, which is why there's not been a lot of reason to talk about it on the blog.  The point, here, being that these backbone data transfers should be a system call, one that the average program gets to just assume will work, or else error out.

Adding types to files, pipes, and sockets, is simply another way to do the same thing.  If you can be sure that data-in-transit or data-at-rest states do not affect the data, then you can abstract them away.  Ideally, when you write a program, you deal with in-memory data types, and if you serialize it to disk or across the network and then deserialize it somewhere or somewhen else, you should come back with the same in-memory data object, as though all the stuff in between had never happened.
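As a minimal sketch of that round trip, here is what "serialize, then get the same object back" looks like in Python.  The Reading type, its fields, and the byte layout are all invented for illustration; nothing here is a MAD interface.

```python
import struct
from dataclasses import dataclass

# Hypothetical on-disk / on-wire layout: little-endian,
# unsigned 32-bit sensor id followed by a 64-bit float.
FORMAT = "<Id"


@dataclass(frozen=True)
class Reading:
    sensor_id: int
    value: float


def serialize(reading: Reading) -> bytes:
    """Turn the in-memory object into bytes for disk or network."""
    return struct.pack(FORMAT, reading.sensor_id, reading.value)


def deserialize(raw: bytes) -> Reading:
    """Rebuild the in-memory object from those bytes."""
    return Reading(*struct.unpack(FORMAT, raw))
```

If deserialize(serialize(r)) always equals r, then everything between the two calls can be treated as invisible by the program, which is the property the rest of this section is trying to earn.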

The reason why we can't just do that is that there are so many things that can go wrong, and it's generally the programmer's job to handle each and every last one of them, or at least, to stand there and be present while some library or other handles them.  When reading from a file, you need to handle the file not existing, it existing but not being of the correct type, it being of the correct type but malformed (in terms of the data structure itself), and only then can you start to parse its semantics and meaning as they apply to your program.  Those three errors, broadly speaking, can be described as one assumption made by your program: there exists a file of a given type at a given filesystem location.  The same general principle can be said of data-in-transit, with the file-doesn't-exist error being replaced by a null check or similar.

Let's suppose that under MAD, a program's data ingress and egress points must have detached precondition/postcondition blocks, written perhaps with decorators as separate functions instead of part of the ingress function.  Further, let's presume that data types are required to have a validation function, which only checks to ensure the data type isn't malformed.  When data ingress occurs from either a disk or a stream, those three fundamental type errors are checked with a single trivial command: type.validate(data).  If the file doesn't exist or the streamed value is null, it fails; if it claims to be of a different type, it fails; if it has malformed structure, it fails.  The entire rest of the function precondition (if present) is about the meaning of the data - and the main body of the function can be written presuming the precondition has been passed.
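To show roughly what I mean, here is a sketch in Python.  Everything in it is an assumption made up for the example: the Envelope wrapper, the TemperatureType class, the ingress decorator, and the plausible_temperature check are hypothetical names, not a proposed MAD API.  The point is only the shape: one validate call covers the three structural errors, and the precondition lives in its own function that something other than the ingress function can call.

```python
import functools
import struct
from dataclasses import dataclass
from typing import Optional


class ValidationError(Exception):
    """Data is missing, claims another type, or is malformed."""


class PreconditionError(Exception):
    """Data is well-formed but its meaning is rejected."""


@dataclass(frozen=True)
class Envelope:
    type_path: str   # what the data claims to be
    payload: bytes   # the raw bytes


class TemperatureType:
    path = "Sensor/Temperature"        # hypothetical type path
    size = struct.calcsize("<d")       # one 64-bit float

    @classmethod
    def validate(cls, data: Optional[Envelope]) -> None:
        if data is None:
            raise ValidationError("no data (missing file or null value)")
        if data.type_path != cls.path:
            raise ValidationError(f"claims to be {data.type_path!r}")
        if len(data.payload) != cls.size:
            raise ValidationError("malformed structure")


def ingress(data_type, precondition):
    """Attach type validation and a detached precondition to a function."""
    def wrap(func):
        @functools.wraps(func)
        def guarded(data):
            data_type.validate(data)
            if not precondition(data):
                raise PreconditionError(precondition.__name__)
            return func(data)
        # Exposed so a caller (or the system) can run the checks
        # without invoking the ingress function itself.
        guarded.data_type = data_type
        guarded.precondition = precondition
        return guarded
    return wrap


def plausible_temperature(data: Envelope) -> bool:
    (celsius,) = struct.unpack("<d", data.payload)
    return -90.0 <= celsius <= 60.0


@ingress(TemperatureType, plausible_temperature)
def record_temperature(data: Envelope) -> float:
    # The body can assume validation and the precondition have passed.
    (celsius,) = struct.unpack("<d", data.payload)
    return celsius
```

Calling record_temperature(None) fails in validation, while a well-formed envelope holding 500.0 fails in the precondition, so the two kinds of failure stay distinguishable.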

Granted, the idea of pre/post conditions is not new and yet many people don't use them, in part because there aren't good language constructs in many programming languages to handle them.  More than that, sometimes you don't know whether something is valid until you've done a decent chunk of the work that the function exists to do in the first place, or it can be difficult to pry things apart.  Unlike some people, however, I'm not insisting that people's programs work a certain way under the hood as a detached ideal - the idea that these condition blocks are separate means that the system can interface with the precondition/postcondition blocks separately from invoking the actual data ingress function - alongside type validation, it becomes part of the system workflow, verifying that incoming data is correct.

And part of the point of that is ensuring the system itself, separate from end-user applications and even from everyday services, has a workflow for validating data.  If you are building a system on what are essentially remote procedure calls, you need some way to detect not only errors (which may be due to physical problems) but also malfeasance and malice.  If the system starts seeing data packets sent over a network link that are of the correct type but routinely fail precondition checks, that sounds like it might be a malicious actor trying to find an exploitable weakness, and that might be a reason for the system to take action against the sender.
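A very rough sketch of that bookkeeping, in Python, might look like the following.  The IngressMonitor class, the threshold, and the idea of identifying a sender by a plain string are all assumptions for the sake of the example; what "taking action" means is deliberately left out.

```python
from collections import Counter


class IngressMonitor:
    """Counts precondition failures per sender (hypothetical policy)."""

    def __init__(self, threshold: int = 3):
        self.threshold = threshold
        self.failures = Counter()

    def record(self, sender: str, passed: bool) -> bool:
        """Log one check; return True once the sender looks suspicious."""
        if passed:
            return False
        self.failures[sender] += 1
        return self.failures[sender] >= self.threshold
```

A real policy would presumably want time windows and some forgiveness for honest bugs, but even a simple counter like this is only possible if the system can see precondition results separately from the function that consumes the data.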

It's possible that you could do the same by insisting that data ingress points throw errors, and even with condition blocks that would help.  As with a lot of things in project MAD, I'm not necessarily trying to prescribe a single course of action, though I will make arguments in favor of one over the other.  In this case, encouraging separate condition blocks might have a lot of knock-on benefits.  Those condition blocks might be automatically parsed and made part of the documentation of a function or program, for example, which is only more valuable in a distributed system - not that debugging third-party programs is ever easy even with local access, but understanding why things are failing starts with understanding what a given error actually is.  The condition blocks (or some public parts of them) might have debug information available even when the rest of the program does not, so that you know not only that a precondition failed, but exactly which check in the process triggered a fault.

Putting the argument in its most general form, at some point the function preconditions are simply a language for describing the function itself, which should (as in, it would be nice) be a part of its public API.  And that, getting back to the topic, is part of the semantics of data types - a given data format may cover a wide gamut of possible meanings, so making a guarantee that data producers and consumers are speaking the same language may involve getting further into the details rather than just saying that the correct data type is involved.  Condition blocks that the system can make use of, are just something that the system offers to help programmers and administrators debug errors in distributed systems - and it is an offering, not a demand.  Ultimately, if you tried to demand people use condition blocks, some people would still make blank dummy blocks and then do all their value checking in the ingress function itself, and there would be no way for the system to tell this kind of scofflaw from a more nuanced situation in which handling checks with condition blocks is untenable.

But okay - the point of all that is being able to abstract away data transit and data storage.  That's lovely, but it gets away from the subject I started with, which is standards.  You can certainly standardize types, and even have nuanced validation functions, but the problem is still that people often do not agree on standards.  If you want to describe a standard, you may need a fair bit of information - which can be a problem, because the System API Directory as I've described it is literally part of the filesystem.

Which means that I'm obliged to explain a little bit about that.

Everything has an Implementation

The System API Directory can be understood as containing a couple of different things using the same general mechanism.  That mechanism is that data types are specified as a path string - the path string points into a filesystem tree, and in that tree, you will find a shared library object suitable to be dynamically linked into an application, which provides for the capabilities and handlers of that object.

That's it - that's the main mechanism.  To a certain extent, it's just a replacement for the Linux /lib directory.  However, the /lib directory is organized by source, where the SAD is organized by function.  And while there is a lot of complication in that (much of which I am explicitly and with some great frustration cutting from this blog post), the result for our purposes is that a type path is explicitly a promise that the data object (as it is being passed in from the filesystem or network) will provide certain capabilities to your program, even if that capability is just ensuring the data was passed cleanly from one end to the other.  With the libraries schema that we use today, and with some implementations under the MAD SAD, you can only ask for one specific vendor's version of one specific understanding of the data type, all others being irrelevant and unusable… and that's fine.

But the SAD is designed specifically to enable and empower standards-setting organizations, because it is hierarchical.  Although the tree may have top-level divisions that are simply categorical, once you start talking about a given type or programming interface, if you continue down that tree, you shall find child types or interfaces that claim to be fully compatible, while potentially having more features.  If for example, two GUI vendors put their heads together and agree on a common API which is, itself, sufficient for either of them to run the GUI, they can at that point also create branches of the API that have additional features, compatible with the parent standard but which are not compatible with each other.
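The compatibility rule implied by that hierarchy can be sketched in a few lines of Python.  The paths below (API/GUI/Common and its VendorA and VendorB children) are invented for the example, and the sketch assumes both arguments name actual types rather than the purely categorical top-level divisions.

```python
from pathlib import PurePosixPath


def satisfies(provided: str, requested: str) -> bool:
    """True if a library of type `provided` can serve a request for `requested`.

    A type satisfies itself and any ancestor in the tree, because a
    child promises everything its parent does. Siblings do not satisfy
    each other, and a parent does not satisfy a request for a child.
    """
    have = PurePosixPath(provided)
    want = PurePosixPath(requested)
    return have == want or want in have.parents
```

So a program written against the common standard can load either vendor's library, while a program that asks for VendorA's extensions cannot be handed VendorB's.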

In fact, that process comes naturally from the very idea of dynamically linked libraries (it's less clear with data formats specifically).  Library loading matches a function signature against an entry point in the library; all you need is a guarantee that certain functions exist and work a certain way.  If there are extra functions, because the library specifically uses a child standard, then you can safely ignore them just as you ignore parts of any other library that you won't be using.  The only really tricky bit is interoperability between libraries or services, in which case weird assumptions may come into play.

None of that means that application or service providers must hew to a standard at all, much less any given one, but organizing around a standard helps the ecosystem grow.  Even without standards, the SAD helps with discovering libraries that will parse certain files or data blocks - but with standards, suddenly a lot of things become possible, and all specifically because the SAD is designed to list all libraries that implement a specific type.

This discussion is, finally, about device drivers, and the role that the SAD plays in organizing a distributed system.

Towards a Device Model that Makes Sense

I don't know about you, but honestly, I don't understand how device drivers work - me being a computer and OS enthusiast with a CS Bachelor's who has used computers pretty much every day since I was five years old.  I know that vaguely, device drivers register with the core of a system that they are a device driver, and obviously the system forwards some requests to them when things having to do with that device come up, but if you asked me for specifics, I couldn't give them.  Specifically, I can't help wondering how I would create a device driver for, eg, a cheap little USB dongle that I have programmed to do some specific thing, perhaps acting as a data source or API provider.  When I've looked into it, it seems confusing and troublesome.

Under MAD, a device driver is a service with low-level access to hardware.  That's pretty much it; end of story.  Oh, wait, we were talking about the SAD; not quite end of story, then.

You see, I said that the SAD lists data types, and I implied that data types are all about the data structures that actually make up the type as it rests on disk.  But I've also said that the SAD is full of interfaces in the programming sense - they are promises that, when you load a library and point the library at a data object of this type, the capabilities of that object will be available for you to use.  If that interface is a child of a parent interface, then you can understand that as a guarantee that the capabilities of the parent interface are all fully implemented in the child.  Logically, all of that makes some sense when you are talking about real, extant data that is already finalized and set in stone.

But you can also see how this works if the “data object” is nothing more than a reference to a device.  Loading the library is a promise that certain capabilities will be available to you, and specifying the device as a filesystem object means that you know that specific device will be the one the library interfaces with.  If some specific driver needs very special code in order to function, such that a generic implementation of the driver wouldn't work, then that device is registered as a child type of a parent - it exposes all the capabilities of the expected type, but unless another library claims to support that specific child type, no other driver code should be loaded for that specific device except the one that was designed for it.

Now again, I said that every driver is a service, and you should assume by now that every service has an Application Folder, because MAD/SAD AF is a thing I keep coming back to.  The driver service in question won't presume that some driver for their particular device exists somewhere out there in the aether - the driver service will provide its own copy of the interface library as an export.  Whenever someone else goes looking to see who implements the (parent/generic) type of that device, eg API/HW/Keyboard, that device driver, since it is providing access to the library, will be on the list.
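Here is a small Python sketch of how those two lookups could differ.  The Directory class, the service names, and the AcmeSpecial child type are all hypothetical; the only path taken from this post is API/HW/Keyboard.  It assumes that "who implements this type" includes children, while "who may drive this particular device" requires a claim on the device's exact type.

```python
from pathlib import PurePosixPath


class Directory:
    """Toy stand-in for the listing side of the SAD."""

    def __init__(self):
        self._exports = []  # (type path, exporting service) pairs

    def export(self, type_path: str, service: str) -> None:
        """A service exports its interface library under a type path."""
        self._exports.append((PurePosixPath(type_path), service))

    def implementers(self, requested: str) -> list:
        """Everyone exporting the requested type or any child of it."""
        want = PurePosixPath(requested)
        return [s for p, s in self._exports if p == want or want in p.parents]

    def drivers_for(self, device_type: str) -> list:
        """Only services that claim this exact device type."""
        want = PurePosixPath(device_type)
        return [s for p, s in self._exports if p == want]
```

With a generic keyboard driver exported at API/HW/Keyboard and a special one at API/HW/Keyboard/AcmeSpecial, both show up when you ask who implements a keyboard, but only the special one is eligible to drive the special device.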

The next question is - where is the file that represents the device?  Well… why not have that be the driver service's own application folder?  If everything has a type, that means that some things that look like and act like folders can also have types.  They can have interfaces, and they can be subject to standards.  You can validate that a folder is of a certain type by looking at its structure to see if it implements the standard.  Indeed, can't a folder have “functions” that are really programs, as long as those programs fit the expected function signature?  Of course, for that, you would need to be able to pass parameters to a program entrypoint in a type-safe way, but that was the point from the beginning, like it was planned.
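Validating a folder by its structure is easy to sketch.  In the Python below, the DRIVER_FOLDER layout (a "read" entry point and a "lib" directory) is invented purely for illustration, and the check looks only at names and kinds; checking that a program matches an expected function signature is the hard part and is not attempted here.

```python
from pathlib import Path

# Hypothetical folder standard: entry name -> required kind.
DRIVER_FOLDER = {"read": "file", "lib": "dir"}


def validate_folder(root: Path, required: dict) -> list:
    """Return a list of problems; an empty list means the folder conforms."""
    problems = []
    for name, kind in required.items():
        entry = root / name
        present = entry.is_dir() if kind == "dir" else entry.is_file()
        if not present:
            problems.append(f"{name}: expected {kind}")
    return problems
```

Extra entries are ignored on purpose, in the same spirit as ignoring extra functions in a library that implements a child standard.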

Now, reverse the last two things I said.  If you can call a program entrypoint as though it were a function, given that we just added type safety measures, can't you export functions like they were executable program files?  Going further, can't some memory objects - real, programmer-type data objects - be represented as folders in the operating system, as long as the system knows how to decode that data type?  Sure, that might be a highly inefficient way to get access to your data, but you could imagine writing a program that did it, so why not have the system capable of doing it automatically, at least for data objects where that idea makes sense?

What if, for example, you have a device that reads air temperature, humidity, and pressure - when the driver is called, those sensors are queried, and then a data object is returned containing those three values.  Suppose that querying this device driver involves reading some node like a file, which will return a tuple containing those three values.  Because of the type data for that specific file, you know that the first value is temperature, the second humidity, and the third pressure, each having a fixed size and format.  More than that, the system knows these facts about the data object - so if you were to request eg Sensor/Read/Temperature, the system could understand that request without your own program loading the library code.  You would simply request a file, and the output would look like a file - even though it was originally a smaller data chunk within a larger data set.  And you could even do it all without converting the data back and forth from a human readable format, such as strings.
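A sketch of that in Python, assuming each value is a 32-bit little-endian float (the post only says "fixed size and format", so the exact layout is my invention, as are the Humidity and Pressure field names).  The lookup slices raw bytes out of the record using nothing but the type description, and never goes through strings.

```python
import struct

# Hypothetical type data for the sensor record: field name and
# struct format code, in the order the values appear.
SENSOR_LAYOUT = [("Temperature", "f"), ("Humidity", "f"), ("Pressure", "f")]
RECORD_FORMAT = "<" + "".join(code for _, code in SENSOR_LAYOUT)


def read_field(record: bytes, path: str) -> bytes:
    """Return the raw bytes of one field, selected by the last path part."""
    name = path.rsplit("/", 1)[-1]
    offset = 0
    for field, code in SENSOR_LAYOUT:
        size = struct.calcsize("<" + code)
        if field == name:
            return record[offset:offset + size]
        offset += size
    raise KeyError(path)
```

Given a record packed with RECORD_FORMAT, read_field(record, "Sensor/Read/Temperature") hands back just the four bytes of the temperature, which the caller can treat as a tiny file of a known type.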

It's obvious I'm getting ahead of myself, and this blog post is already getting long.  I will say that the above example is probably a good place to use another new path selection operator instead of the filesystem descent operator / - but the specifics don't matter for now.  If I go on much longer I worry I'll bake the brains of anyone still reading.

The point is - by making device drivers out to be services that respond to requests and return results in ordinary ways, we encourage people to make device drivers, allowing new, custom devices to integrate more easily with computer systems.  Why would that be an important topic when talking about a modular and distributed system?  Who knows, must be a passing fancy of mine.  But I will say that the idea of getting new, standards-compliant devices fills me with glee.  I'm not a neophile - I don't like everything that's new - but I am a technophile.  I want to be able to make my machines do things, and I get excited by having new capabilities.

And that, ultimately, is a goal worthy enough that I don't mind being called mad.