I have more rewrites of this post than usual, which tends to mean that I'm trying to say too many things and they're all jamming together and making things more complicated than I wanted them to be. It's why the last couple blog posts had to get split out instead of being unified into a larger ongoing discussion. It's incredibly frustrating, because I feel like I have a very simple thing that I want to explain, and thousands of words later I realize it's been buried in a pile of other things.
Of course, for me, “very simple” tends to mean that I want to talk about a combination of systems each with moving parts that just so happen to mesh nicely together. So, you know, it's not necessarily anything that's actually simple. Writing this blog has been… tricky.
The point is, I've been trying to say something about the Unix “everything is a file” philosophy, and how MAD is going further in that direction than Unix and all its derivatives, which is itself such a hubristic and complicated statement that there are very real chances of merely coming off like an asshole instead of a smart asshole, which is at least closer to the goal. But then I go and wreck that already complicated thing by discussing other complicated things and it all just gets confused.
So let me try this again, from yet another angle that kind of feels sideways to me. But fair warning, I'm not entirely sure I succeed.
Everything is a (File)System
The true point and the true nature of Unix's “everything is a file” philosophy isn't actually about files - and that's good, because files kind of suck. In my blog post about how the Internet Protocol makes for a lousy RPC mechanism, I pointed out that its main disadvantage is how it depends on typeless data streams and has no real plan for anything beyond that. If you want to build on top of typeless data streams, as a system conceit, you are expecting programmers to invent and reinvent methods of typechecking, conversion, validation, verification, bounds checking, error detection, error correction, and error handling - not to mention adding any self-documentation mechanisms they might wish to have, so that people can query the options available or understand why errors happened and what they mean. It's not a terrible place to start, when designing literally the first operating systems humanity has ever had, but it's… incomplete. Primitive. And I want to argue, it's fundamentally insufficient, something that shouldn't make its way unchanged into a more industrial operating system, one that deserves to be used across the globe for decades or centuries to come.
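To make that burden concrete, here is a minimal sketch of what a program has to do by hand when all it receives is raw bytes. The record shape (an "x" and "y" window position) and the bounds are made up for illustration; nothing here is part of MAD or any real system.

```python
import json
from typing import Tuple

MAX_COORD = 10_000  # hypothetical bound, chosen only for this example


def read_window_position(raw: bytes) -> Tuple[int, int]:
    """Turn an untyped byte stream into a validated (x, y) pair."""
    # Conversion: the stream carries no encoding or format information.
    try:
        data = json.loads(raw.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError) as exc:
        raise ValueError(f"not valid UTF-8 JSON: {exc}") from exc

    # Typechecking: nothing guarantees the shape of what arrived.
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    coords = []
    for key in ("x", "y"):
        value = data.get(key)
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            raise ValueError(f"field {key!r} must be an integer")
        # Bounds checking: also the reader's job.
        if not 0 <= value <= MAX_COORD:
            raise ValueError(f"field {key!r} out of range: {value}")
        coords.append(value)
    return coords[0], coords[1]


print(read_window_position(b'{"x": 10, "y": 20}'))
```

Every line past the decode step is work the stream itself gives no help with, and every program reading a similar file has to write its own version of it.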
That complaint applies to IP, and it applies to files and IPC data pipes, and it applies to program entry points, all of which are forced to work on top of highly generic data streams. So it's good that the “file” part of “everything is a file” isn't really the point. I know it seems like it is; it's between 20-25% of the phrase depending on how you count it. But no, it's about the system, the underlying metaphor that goes beyond merely organizing the system and starts to feel like a comfortable home for users to play around in.
The filesystem was Unix's answer to this problem - but that philosophy has not really been embraced by a lot of its descendants. Most of the people and projects that came in after Unix or Linux was already a thing simply created the best mechanism they could think of for their own particular problem and ran with it, Unix philosophy be damned. The windowing system, for example; I can understand them not thinking, especially at first, that there's any reason to involve the filesystem for concepts like desktops, app windows, widgets, and so on. You'd need a compelling reason for them to build that feature into their system, and “it's the philosophy” isn't enough.
The consequences of a system not being cohesive only come out after many years of people depending on it. A whole ecosystem of tools springs up that exists to fill in the gaps left behind by other tools, to make it more convenient to depend on third-party tools and systems, to take the edge off of two systems grinding together, and so on. Eventually the original project lies blanketed under a layer of tools and patches that completely obscure what was originally there. The result works, and it can be made to work well, but only when you plan around the complications, the rough edges and incompatibilities. Any time that things aren't going to plan, suddenly the system that looked so good from afar can become a hellscape of gears refusing to mesh, wires left unconnected, and pipes spilling everywhere.
Preventing complex systems from becoming a disaster is a topic other people understand better than I - but without question it's a matter of technical leadership. Some companies want to be tyrants that force everything to work a certain way, and that can work - so long as people think your product is worth the trouble, and the more trouble you are, the more likely they'll leave someday. Others like the open source community depend on standards bodies, community cohesion, and the willingness of people to take on difficult tasks for the sake of others - and again, the more difficult the task, the less willing people will be. Either way, the ideal result is some plan or system that handles any discrepancy, either forcing or guiding people to do the right thing.
It's foolish to pretend that you'll ever reach that ideal, but it's still what an engineer should aim for, at least one in my exact position. It is, to paraphrase, kind of the job. It will be other people's job to invent the best piece that fits a certain slot or task - it's my job to design an overall system that they'll want to be part of, foresee complications and make sure that there are tolerances built in to handle mistakes and idiosyncrasies. And to a certain extent… the philosophy of the open source movement, which is totally okay with components bolted onto the side of a machine and wires hanging loose, that's a system that's nice to be a part of. It's nice to know you can tinker, nice to know that that's kind of the plan, even if it's a messy one.
When I say, as I've said many times, that maybe MAD won't win but I still think I'm on to something… what I'm saying is that I know I'm not a genius system designer. Creating something clever is one thing - making something that everyone will agree with is another. It's disagreements that lead to pieces bolted on the sides, people going their own way instead of following along - or worse, they lead to them never joining in the first place, projects never getting off the ground. Agreement is the heart of technical leadership - agreeing that things are right, because they're right, and because the process isn't too big of a hassle to manage.
I think if people saw the problems the way I do, we would at least all be moving in the general direction I'm pointing. And if I'm wrong… well, I won't find that out without at least explaining the system for others to react to. And that, ultimately, is why we're here.
A Universe in your Pocket: The SAD
This blog post is part of the “Beyond Unix” set and I've been talking about the filesystem, so it shouldn't be much of a surprise that I'm here to talk about its replacement in the MAD paradigm. The “System API Directory” or SAD defines the MAD operating system, and it's important to understand that it is a protocol, not a place. (I've explained the API part of it before and I'll revisit it again soon, but this blog post is already long enough as-is.) Under Unix, the root filesystem is in fact stored on disk, meaning that if they wanted and had the authority, a user could simply do whatever they wanted with it. User here being distinct as a concept from OS designer, or from anyone who knows what they're doing. Any random person with the root password can break the system just by renaming folders in the root directory; indeed, if you want to prevent that, you have to add features, not remove them, because under Unix the root directory is a place, not a protocol.
There is a good reason why the root of the MAD/SAD is not a literal filesystem directory - the OS is designed to be distributed among many computers, and it's incorrect to try to say that any one of the computers is the “root” or “home” for the whole system. In a distributed system there are many “roots”; the term becomes ambiguous. The root filesystem operator in principle is simply a command to start your filesystem query in one specific place, and under MAD, the correct place to start a generic query is in an abstract space that exposes important concepts as quickly as possible. (There are some more specific queries you will want to start in different places, but that's for another time.)
Part of the point of a filesystem, part of what makes it valuable, is that it is in some ways self-documenting. Listing a directory tells you what's in it; the names of files and folders tell you something about them. A file path has meaning, especially starting from the root; you know something about the file you're trying to reach from where it is, not just from what its own name is. More to the point, you know where something should belong by knowing what it is, and who it belongs to. As much as everyday users brush up against these problems when organizing their documents and/or desktop, the stakes are much higher for anyone whose work will be used by others.
In a way, this philosophy comes directly from Unix; the /etc directory doesn't simply give you a predictable place to put configuration files, it also makes the filepath for system configuration files as short as possible, which is as much about semantics as string length. You don't have to go looking for system configuration at the tail end of some involved filesystem tree; it is one of the most important things you may go looking for, so it is indexed near the root. (If you've ever had to go looking for some specific configuration file buried under /usr/share/someThirdPartyLibrary/config, or had to edit the C:\Windows\System32\drivers\etc\hosts file, you understand. Even if you can understand the filepath once you see it, you might not be able to guess it, and quite frankly there are frequently multiple “correct” places it might be. All of this obscurity can sometimes make a necessary task harder.)
Under MAD the root directory contains dynamic collections of concepts deemed important; a collection of actual hardware modules, for one, and a collection of users for another, though there are more. All those modules and all those users are themselves collections of other important things, and many of those collections are themselves full of other collections. Importantly, as you go down these trees, you will frequently be changing who exactly you are making filesystem requests to. Indeed, a large part of the root filesystem is about pointing you towards whomever you need to ask the rest of your question - the more of that can be cached, the fewer questions need to be asked and fewer answers awaited with bated breath… but that's not always practical. The root protocol handler can be done locally (must be, because the system is decentralized), and the top-level contents of those modules might be cached, but if you want to get a list of what applications a user has running, for example, that will be something you have to ask, as it may change moment to moment.
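As a rough sketch of that hand-off, here is a toy resolver in which the root is answered locally and everything deeper is forwarded to whoever owns the longest matching prefix. The class, the handler signature, and the /user collection name are all hypothetical; the real SAD protocol is not specified in this post, so read this as an illustration of the idea only.

```python
from typing import Callable, Dict, List

# Hypothetical handler type: given the path segments below its own prefix,
# a handler returns the names found there.
Handler = Callable[[List[str]], List[str]]


class RootProtocol:
    """Answers the root listing locally and delegates everything else."""

    def __init__(self) -> None:
        self._delegates: Dict[str, Handler] = {}
        self.asked: List[str] = []  # which delegate answered each query

    def register(self, prefix: str, handler: Handler) -> None:
        self._delegates[prefix.strip("/")] = handler

    def list_dir(self, path: str) -> List[str]:
        parts = [p for p in path.split("/") if p]
        if not parts:
            # The root itself is a protocol answer, built locally.
            return sorted({prefix.split("/")[0] for prefix in self._delegates})
        # Longest-prefix match decides who gets asked the rest of the question.
        for cut in range(len(parts), 0, -1):
            prefix = "/".join(parts[:cut])
            if prefix in self._delegates:
                self.asked.append("/" + prefix)
                return self._delegates[prefix](parts[cut:])
        raise FileNotFoundError(path)


def cached_module_index(rest: List[str]) -> List[str]:
    # Stands in for a locally cached list of hardware modules.
    return ["othermod.234", "somemod.123"] if not rest else []


def somemod_live(rest: List[str]) -> List[str]:
    # Stands in for a live question put to the module itself.
    return ["myApp"] if rest == ["apps"] else []


root = RootProtocol()
root.register("/module", cached_module_index)
root.register("/user", lambda rest: [])
root.register("/module/somemod.123", somemod_live)

print(root.list_dir("/"))
print(root.list_dir("/module"))
print(root.list_dir("/module/somemod.123/apps"))
print(root.asked)
```

The point of the `asked` log is that the second and third queries were answered by different parties, even though the caller used one continuous path.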
This is one of many parts of a distributed system that is less optimized than its counterpart in a monolithic system, but it's not possible for a distributed system to have a central authority that can answer all your questions at once, if for no other reason than scalability. A distributed system has a theoretically infinite maximum scope, insofar as the back-end allows; a large enough system could have problems simply indexing the hardware (in terms of memory and computing requirements, if nothing else), to say nothing of the many devices and software endpoints each one may export. In an infinitely large system, it's important that the scope narrows down to only what you need to know; for an application, that means narrowing the entire universe down to only the machines it is running on, and the machine its authorizing user is running on. (Well, unless I've missed something, plus there's at least one asterisk not worth going into now.)
In my last two posts I talked about the ADA distributed application model, and application folders; for our purposes here, these concepts are expanding on the same fundamental data abstraction. Under the ADA, when you want to access a remote resource (including files), your application sends an Agent to the piece of hardware where that resource resides, and then the application Agent acts as a proxy for access to that resource. This model fundamentally means that applications, once deployed, can focus only on the machines they are already on instead of understanding the system as a whole; even if they accept client connections, those clients are obliged to be present locally, and so aren't remote from the server's perspective. Application Folders push this chain of custody into the system directory; the exposure makes the relationships easier for developers and users to understand and audit.
A simple example. Suppose there are two hardware modules; your application lives on one, /module/somemod.123/apps/myApp, and it wants to access a file on another module, /module/othermod.234/files/data.json. Your application will deploy a file-access agent to module 234; this deployment process also involves the filesystem, so your agent will have some canonical, physical url, e.g. /module/othermod.234/agents/myApp/fileAccess; the agent will also be mounted under your app, as /module/somemod.123/apps/myApp/agents/fileAccess. This agent negotiates with the file server on module 234, and if it is allowed to open the file, it exposes that file to the rest of your application as /module/othermod.234/agents/myApp/fileAccess/data.json. And because the agent is mounted under your application, the same file is also mounted as ./agents/fileAccess/data.json.
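The chain of mounts in that example can be sketched as a small mount table, using the /module/ prefix throughout. The Namespace class is hypothetical, and treating the agent's exposure of the file as just another mount is my simplification; negotiation with the file server and permissions are left out entirely.

```python
import posixpath
from typing import Dict


class Namespace:
    """Toy mount table: each mount point redirects to a target path."""

    def __init__(self) -> None:
        self._mounts: Dict[str, str] = {}

    def mount(self, point: str, target: str) -> None:
        self._mounts[point.rstrip("/")] = target.rstrip("/")

    def resolve(self, path: str) -> str:
        """Follow mounts until the path no longer falls under any of them."""
        seen = set()
        while path not in seen:
            seen.add(path)
            matches = [m for m in self._mounts
                       if path == m or path.startswith(m + "/")]
            if not matches:
                return path  # nothing left to follow: canonical path
            point = max(matches, key=len)  # most specific mount wins
            path = self._mounts[point] + path[len(point):]
        raise RuntimeError("mount loop at " + path)


APP = "/module/somemod.123/apps/myApp"
AGENT = "/module/othermod.234/agents/myApp/fileAccess"
FILE = "/module/othermod.234/files/data.json"

ns = Namespace()
ns.mount(APP + "/agents/fileAccess", AGENT)  # agent mounted under the app
ns.mount(AGENT + "/data.json", FILE)         # agent exposes the opened file


def from_app(relative: str) -> str:
    """Turn an application-relative path like ./agents/... into a full one."""
    return posixpath.normpath(posixpath.join(APP, relative))


print(ns.resolve(from_app("./agents/fileAccess/data.json")))
```

Resolving the application-relative path walks through the agent's canonical url on module 234 before landing on the physical file, which is the same chain of custody the paragraph describes.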
This alone may seem like a comfortable level of abstraction - but what if you don't necessarily know what agent will be retrieving that data file, or what the agent will be named? You may have, for instance, many instances of the fileAccess agent, each of which is present on a different module, each exposing some other file; some of those other files may even be titled data.json on the native filesystem. Yet another layer of abstraction wouldn't be hard; simply mount the same data file one more time at, for example, ./imports/data.json or ./imports/renamedDataFile.json, depending on exactly how fancy you want to get. This last file mount contains enough information in the filesystem itself that you can reach the actual file, but the url that you actually use to access it (as stored in and exposed by the filesystem) becomes descriptive.
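A small sketch of that extra layer, assuming two fileAccess agents that each expose a file natively called data.json. The agent instance names (fileAccess.1, fileAccess.2), the third module (thirdmod.345), and the Import record are all hypothetical; only the ./imports/ names come from the text above.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Import:
    agent_path: str  # where the agent exposes the file under the app
    canonical: str   # where the file physically lives


# Hypothetical import table for one application.
IMPORTS: Dict[str, Import] = {
    "./imports/data.json": Import(
        agent_path="./agents/fileAccess.1/data.json",
        canonical="/module/othermod.234/files/data.json",
    ),
    "./imports/renamedDataFile.json": Import(
        agent_path="./agents/fileAccess.2/data.json",
        canonical="/module/thirdmod.345/files/data.json",
    ),
}


def roles_of(canonical: str) -> List[str]:
    """Reverse lookup: which import names point at this physical file?"""
    return sorted(name for name, imp in IMPORTS.items()
                  if imp.canonical == canonical)


# Both native files are called data.json, but the import names tell them apart.
print(IMPORTS["./imports/renamedDataFile.json"].canonical)
print(roles_of("/module/othermod.234/files/data.json"))
```

The application code only ever needs the import name; the agent path and the canonical url stay available for anyone who wants to trace where the data really came from.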
Is that important? Well maybe.
Consider if you are debugging an application. It opens a file, and the file doesn't contain what you expect, perhaps because the write process got corrupted by another instance or another application. The actual file that you are reading has the canonical url /module/othermod.234/files/data.json, but that url may not explain its role within your application even a little bit. If it's not a default part of your application, it's probably being opened due to a configuration or user input; you may only understand the file's purpose when you find that particular line of configuration or where in the code the input occurred.
The source of the file doesn't change its role in your program - it's just that with existing programming models, the role of a file is often stored entirely in variable names, which at best is data stored in debug information. The application folders framework gives us the option of having that naming convention persist when the application is running; you don't always want that, for some secure applications, but it's a useful abstraction for anything you may want to understand and reconfigure. Choosing to cleverly rename the file gives you information that's available at a glance. If it's named ./imports/previousWindowPosition.json or ./imports/customIcons.json, for example, you'll know what's supposed to be in that file in a heartbeat, even if it turns up empty or malformed.
After all of this fussing about the specifics, it almost feels weird to come back around to where this started: making a theoretically infinite system shrink down until all you need to focus on is the application itself. But the truth is that most applications have fixed needs that are determined by their configuration. Once those needs are filled, an application only needs to look inwards, and the infinite system beyond its borders is no longer relevant. If that raises the question about how you fill those needs, I invite you to go back to the post on Plan 9 and resource distribution - but the short answer is that you use configured defaults where set, and then use a list of constraints to narrow down the infinite system to only those modules which might be useful, and pick arbitrarily among them, or have the user pick, whichever is appropriate.
Constraining the list of modules by their capabilities is part of the reason why the list of modules is a system root directory, and why capabilities are root directories of each module. The root protocol handler for the SAD gets handled locally, which means that each client must get enough information, and quickly, to build a list of what modules provide what capabilities. Much of this can be cached, and may be provided at intervals to ensure everyone is on the same page - but the point is that the SAD plants a flag saying that these details are important and central to the model, thus they are available immediately. If you were in an infinite system, at least each query you made to a nearby machine would be short and over with quickly, all while still using a directory structure that users themselves can investigate, explore, and audit, in order to understand how their own machines work.
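The selection order described above (configured default first, then constraints, then an arbitrary or user-made pick) can be sketched against a cached capability index. The index contents, the capability names, and both function names are hypothetical, and treating a default as usable whenever the module is known at all is my assumption.

```python
from typing import Callable, Dict, FrozenSet, List, Optional, Set

# Hypothetical cached index: module name -> capabilities it advertises.
INDEX: Dict[str, FrozenSet[str]] = {
    "somemod.123": frozenset({"apps", "display"}),
    "othermod.234": frozenset({"files", "agents"}),
    "thirdmod.345": frozenset({"files", "agents", "gpu"}),
}


def candidates(index: Dict[str, FrozenSet[str]], required: Set[str]) -> List[str]:
    """Narrow the whole system down to modules offering every required capability."""
    return sorted(name for name, caps in index.items() if required <= caps)


def choose(index: Dict[str, FrozenSet[str]],
           required: Set[str],
           default: Optional[str] = None,
           picker: Optional[Callable[[List[str]], str]] = None) -> str:
    # 1. A configured default wins where set (assumed valid if the module is known).
    if default is not None and default in index:
        return default
    # 2. Otherwise constrain by capability.
    options = candidates(index, required)
    if not options:
        raise LookupError(f"no module provides {sorted(required)}")
    # 3. Let the user pick if a picker is supplied, else pick arbitrarily.
    return picker(options) if picker else options[0]


print(candidates(INDEX, {"files", "agents"}))
print(choose(INDEX, {"gpu"}))
print(choose(INDEX, {"files"}, default="othermod.234"))
```

Because the index is just a mapping built from directory listings, the same narrowing a program does here is something a user could do by browsing the directory by hand.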
Having the directory be browseable is part of what makes it self-documenting, part of what makes the whole system come together, and part of what makes users feel like it is under their control, able to be made use of and modified to better suit their needs. And exposing those pieces of a given module that are important, according to the system, also helps users and developers understand. As you come to understand how adding capabilities to your system makes it more powerful, you can come to understand which capabilities you want to add, and if you are a business, what capabilities you can make money selling to people.
Keep in mind that under MAD, that's a large part of the point - the hardware gestalt gets stronger when you add to it, which means that businesses are incentivized to provide you the hardware capabilities you want added to your gestalt in a useful package, cheaply. MAD's hardware schema assumes people will simply package processors that exist for no other purpose than to be attached to a generic system - it's the reason why we can't assume anything exists on a given machine, unless the machine tells us that it does. These systems will be cheap compared to full PCs - they will have fewer support chips, no USB, no HDMI, and no SATA; there will be much less to license, especially if open projects like RISC-V gain traction. But open projects depend on support, and support depends on popularity; absent that popularity, it's a gamble to get behind new technologies, open or proprietary. The ability to add capabilities to other machines for dirt cheap sounds like a pretty good reason to gamble on an open technology, though, doesn't it?
I admit I haven't gone into the hardware side of MAD pretty much at all, and maybe I'll get to that soon. My Introduction and Index post has a very sad, empty space where posts on that topic should be.
But there's at least one more topic in this “Beyond Unix” chain, one that this post was supposed to be on, but that conversation will be easier when I can just reference this blog post to say that the point of “everything is a file” is about the system. Because we'll be changing that system, for the better - and more than anything else I've talked about, that one's going to be a doozy. Though, again… I've tried to say it all before. It's already out there... if you can decode my ramblings. I'm hoping this string of blog posts will be much more comprehensible, and as such, I'll try to keep a thread of logic going from post to post.
These posts may end up being redundant - but if they help anyone understand, then it's worth it. At least, once people understand what I have and what I'm doing, then we can start talking about the actual concepts, debating them on their technical merits. Until I can explain them well… there aren't any technical merits to debate, as far as anyone else is concerned.
Which is… kind of a lonely place to be, but that's just life.