Sandcastles and skyscrapers
The problem with the Unix lowest-common-denominator model is that it pushes complexity out of the stack and into view: the complexity of stuff that other designs _thought_ about and worked to integrate.
It is very important never to forget the technological context of UNIX: a text-only OS for a tiny, desperately resource-constrained, standalone minicomputer. It was written for a machine that was already obsolete, and it shows.
No graphics. No networking. No sound. Dumb text terminals, hence the obsession with text files being piped to other text files and filtered through things that only handle text files.
At the same time as UNIX evolved, other, bigger OSes for bigger minicomputers were being designed and built to directly integrate things like networking, clustering, notations for accessing other machines over the network, filesystems mounted remotely over the network, file versioning and so on.
I described how VMS pathnames worked in this comment recently: https://news.ycombinator.com/item?id=32083900
People brought up on Unix look at that and see needless complexity, but it isn't.
VMS' complex pathnames are the visible sign of an OS which natively understands that it's one node on a network, that currently-mounted disks can be mounted on more than one network node, even if those nodes are running different OS versions on different CPU architectures. It's an OS that understands that a node name is a flexible concept that can apply to one machine, or to a cluster of them, and every command from (the equivalent of) `ping` to (the equivalent of) `ssh` can be addressed to a cluster and the nearest available machine will respond, and the other end need never know it's not talking to one particular box.
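To give a flavour (an illustrative spec of my own, not one lifted from that comment), a full VMS file specification looks something like:
CLUSTR::DUA0:[PROJECT.SRC]MAIN.C;3
Node or cluster alias, device, directory path, filename, type and version number, all in one name, and the same specification works whether CLUSTR is a single machine or a whole cluster.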
50 years later and Unix still can't do stuff like that. It needs tons of extra work with load-balancers and multi-homed network adaptors and SANs to simulate what VMS did out of the box in the 1970s in 1 megabyte of RAM.
The Unix way only looks simple because the implementors didn't do the hard stuff. They ripped it out in order to fit the OS into 32 kB of RAM or something.
The whole point of Unix was to be minimal, small, and simple.
Only it isn't any more, because now we need clustering and network filesystems and virtual machines and all this baroque stuff piled on top.
The result is that an OS which was hand-coded in assembler and was tiny and fast and efficient on non-networked text-only minicomputers now contains tens of millions of lines of unsafe code in unsafe languages and no human actually comprehends how the whole thing works.
Which is why we've built a multi-billion-dollar industry constantly trying to patch all the holes and stop the magic haunted sand leaking out and the whole sandcastle collapsing.
It's not a wonderful inspiring achievement. It's a vast, epic, global-scale waste of human intelligence and effort.
Because we built a planetary network out of the software equivalent of wet sand.
When I look at 2022 Linux, I see an adobe and mud-brick construction: https://en.wikipedia.org/wiki/Great_Mosque_of_Djenn%C3%A9#/m...
When we used to have skyscrapers.
You know how big the first skyscraper was? 10 floors. That's all. This is it: https://en.wikipedia.org/wiki/Home_Insurance_Building#/media...
The point is that it was 1885 and the design was able to support buildings 10× as big without fundamental change.
The Chicago Home Insurance building wasn't very impressive, but its design was. Its design scaled.
When I look at classic OSes of the past, like in this post, I see miracles of design which did big, complex, hard tasks, built by tiny teams of a few people, and which still work today.
When I look at massive FOSS OSes, mostly, I see ant-hills. They're impressive, but it's so much work to build anything big with sand that the impressive part is that they work at all... and that to build something so big, you need millions of workers and constant maintenance.
If we stopped using sand, and abandoned our current plans, and started over afresh, we could build software skyscrapers instead of ant hills.
But everyone is so focussed on keeping our sand software working on our sand-hill OSes that they're too busy to learn something else and start over.
no subject
In the same way, IBM mainframe operating systems, which IBM has had the freedom to develop since the mid-1960s, remain arcane and difficult to use. They do some things very well, but many others quite badly. They're steadily losing their share of mainframe runtime to Linux, because Linux is easier to build systems on top of.
People have written all kinds of research operating systems in the past few decades, but none of them have had enough advantages to achieve commercial acceptance.
no subject
All fair points, and I have no disagreement with any of them. :-)
Including the point about VMS. ;-)
Difficult judgement
Thanks for the write-up (again).
I think I can follow your general thoughts here.
It is said that Saint-Exupéry came up with the phrase: "Technology always develops from the primitive via the complicated to the simple". I do like that statement very much.
Regarding your comments on hiding things versus allowing the full complexity to be seen, I am quite torn about following your judgement, however. For me, it is not easy to distinguish between the "primitive", the "sophisticated" and the "simple" here.
Maybe it all depends on a specific set of requirements? In my tool-choice process, different requirements most likely lead to different tool choices; in this case, different file system requirements. Most files never need versioning at all, for example. I think that most of the time, the host/user should not be an integral part of a file-name abstraction layer.
It's not that black and white, though. I can think of situations where the explicit VMS file path mentioned above clearly has its advantages.
I guess we would all need to work with such a concept for a longer period of time in order to really get the ideas, advantages and concepts behind it.
Re: Difficult judgement
That's fair.
Also, note vicarage's comment below and, more to the point, my (hasty and therefore too long) answer to it.
The basic Unix model, which is also the Linux model, is good because it is accessible to, and very powerful for, a lot of people, even if that "lot of people" is only 1% of the population.
Plan 9 is smaller and cleaner, with a more complex but more complete and powerful conceptual model. Forget Sun marketing: in Plan 9, the network is the computer.
But I've tried it and it breaks my brain completely. I can't use it, at all, even for basic stuff.
Plan 9 did the "right thing" but it did it in a weird complicated way that made it too hard for, I suspect, 99% of that 1% of people who get and like and value the Unix way.
That's fatal. It's too much. It condemned it to a niche forever.
(Aside: Inferno, at the graphical level, is much easier, but it's much easier in ways that are not apparent to the 1% of the 1% of people who grok Plan 9.)
The Unix CLI model, as I said in my reply to John B, is too hard for most people, but it works for 1% of people and that's enough to mean it was a huge success.
[For clarity: I am pulling these numbers out of the air; they are not real percentages. I am just trying to express the difficulty of some concepts and the fact that only a very small minority will grasp them, but they will love them because they can do amazing stuff with them.]
Nobody considers this kind of stuff at the planning stage. By the time a product is in alpha or beta and might ship or get cancelled, this is long gone. It's something that happens in the minds of the people who find others with whom to consider maybe planning something. By the planning stage, it's gone; it's over.
The real point here, perhaps, is to do some measurement and estimation: try to work out how many people, of what kind, can handle the DOS conceptual framework, the NT one, the Unix one, the VMS one, the IBM mainframe ones, and actually try to... well, to model them, and maybe calculate, explicitly and rationally, what levels of model are accessible to which people.
A small number of people love Plan 9, but it's too hard for most Unix techies.
A small number of people love Lisp, but it's too hard for most programmers.
A small number love Smalltalk, and Forth, and Oberon, and so on, but it's too hard for most.
Whereas millions loved BASIC, and millions love Python. But the Python folk are Unix folk who probably also know and like C, which is too hard for many -- but they don't know their own skills well enough to know that, resulting in millions of broken unsafe C programs.
(And a large and lucrative software industry.)
Python people scorn BASIC: they can't see the hairy bits of Python that put off some BASIC-lovers.
Hardcore C (and curly-bracket languages in general) people scorn Python and its indentation... but millions love curly-bracket languages.
Lisp people scorn all of them.
I am, in general, interested in trying to enumerate and measure this, and try to work out the sweet spots. The levels of complexity that include or exclude a lot of people.
All programmers rate themselves highly compared to non-programmers. They may evaluate themselves as bad against other programmers, but inside, they all know that they are the demigods who can make sand think and sing.
But some levels of tech are too hard for most techies. Lisp, Plan 9, Forth, etc.
Some levels of tech are easy enough that they can make non-techies into techies. BASIC, MS-DOS, Python, etc.
Can we measure this?
Can we work out what levels of human mind can handle what levels of complexity effectively?
And by doing that, work out how to reduce or hide or (ideally) eliminate some of the complexity from some tools, and make them accessible to more people and thus bring them success?
Because a lot of the tools that we have now, that are loved by legions, are trash. They are dangerous junk, "unsafe at any speed" to borrow Nader's term.
But they are tools that large numbers can learn to use.
There are better tools but they are too hard.
Only if we can measure the complexity gaps can we map them. And only if we can make maps can we work out where the bridges need to be built.
All we have now are desire lines. https://en.wikipedia.org/wiki/Desire_path
Those only work at a low level; you can't build an efficient large network from them.
But they are all we have. Nobody's even worked out we need maps yet.
no subject
That's a valid and important point.
What's good about the classical Unix model is that it's one that is readily apprehended by a lot of people. People can study it, grasp it, and use it fairly easily.
Also, up to a point, it scales to the ability of the user. I am damning myself here, but the classic Unix regex model, combined with filename globbing and how that is handled, is frankly a bit too complicated for me.
I've been trying to get it into my head for ~3 decades now; it hasn't worked, and at this point it's not going to. Admittedly I wasn't trying very hard, just using it and trying to get used to it over all that time, and that hasn't been enough.
I think that is a fault in both me and the model.
My efforts are at fault because, in the late 1980s when I started doing this, I didn't realise what a key skill I was acquiring on Linux, and I treated it as on a par with Netware and 3Com 3+Share and other long-gone things.
And the model is at fault because it has a lump in the learning curve. I know from my own work experience that there are many conceptual models whose learning curves are too steep for many people. DOS had one, which I scaled easily but many people slid off. OS/2 had the same but more so: easy for me, not for others.
But NT had a much lower one, and Win9x a much lower one than NT in some ways, and that helped them to thrive and OS/2 to slowly die out.
Everyone hates the Windows Registry, but most users never need go near it. A way more complicated, hierarchical registry that almost no-one ever need go near is better than a flat plain-text multi-hundred-line CONFIG.SYS file, especially when almost every user has to go into that CONFIG.SYS file and edit it sometimes.
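For anyone who never had the pleasure, a typical (heavily abbreviated, illustrative) CONFIG.SYS of the era looked something like:
DEVICE=C:\DOS\HIMEM.SYS
DEVICE=C:\DOS\EMM386.EXE NOEMS
DOS=HIGH,UMB
FILES=40
BUFFERS=20
DEVICE=C:\DRIVERS\CDROM.SYS /D:MSCD001
Getting a sound card or CD-ROM drive working usually meant editing exactly these sorts of lines by hand, and getting them wrong could leave the machine unbootable.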
DOS and directories was too hard for many people. Windows obviated that, in time, completely. Windows won.
In the past I have shocked many Unix old-timers by saying that I prefer the DOS and classic Windows NT shell. It does what I need better than the Unix shell -- any Unix shell, it doesn't matter. This is profoundly weird and hard to accept for people who mastered the Unix way.
I don't want all the fancy regex. I don't need it. It's too hard and it's very rarely useful.
What I do need, and need fairly regularly, is, say, to take all files called (something) (dot) zero zero (any digit) and move them.
In NT I can go:
move *.00? backups
and it works (so long as the folder `backups` exists). In Unix it doesn't because of some arcane nonsense about how one tool expands wildcards and then passes it to another tool and I DON'T CARE. I just want it to WORK.
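(The arcane nonsense, for anyone who does care: with a typical bash setup, in
mv *.00? backups
it is the shell, not mv, that expands the wildcard. mv just receives whatever list of names the shell produced, and if nothing matches it is handed the literal string *.00? and fails with a baffling error.)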
This is not the Unix way. The Unix way is very powerful, if you can learn it.
The DOS way is much less powerful, but it's more predictable with a trivial mental model and that, for me, is better.
But it's not better enough. It's died out. Unix has thrived.
And now, on Windows, PowerShell has obsoleted the NT shell. PowerShell is horribly complicated, but if you deal in AD and VB and .NET stuff regularly, it's manageably so.
Worse learning curve up front. Rewards those with the specific knowledge.
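To pick up the earlier example, the same move in PowerShell is roughly this (a sketch from memory, untested):
Get-ChildItem *.00? | Move-Item -Destination backups
Objects flowing down a pipeline rather than text, which is exactly the sort of conceptual step up that rewards people who already think in .NET.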
Unix: gentler learning curve than PowerShell, but worse than DOS.
DOS: gentler than Unix, but still way too hard for most ordinary users.
Is there a happy medium? No, there isn't a single optimum.
Is there a general lesson? Yes. CLIs are too hard for most nonspecialists. So it's more important to have a good solid GUI.
Lessons from that: making a GUI good and solid and simple isn't as easy as it looks.
The classic Mac one was superb, a best of breed. Windows 3.x was kinda OK, Win9x was, on balance, better.
The OS X one is much stupider, but behind the scenes it gives you Unix as well. As a combination, that wins, especially for developers. Apple learned non-obvious lessons from observing Win9x.
One of those lessons is that a good-enough GUI needn't be brilliant, inspired and superb, like classic MacOS. As long as it's good enough to be understood with no effort, and users never need to go to the CLI, it will win.
Linux didn't learn that. So it has poor Win9x imitations, but users are encouraged to learn the CLI way, and that is ultimately doomed because that's way too hard for 99% of people.
no subject
The classic thing is image processing. Most people just want images displayed in a GUI and copied to friends; some want to apply a filter to a particular image; a few want to run batch scripts to apply the same filter to many images. Developers want to do amazing things with AI interpretation of images. Gawd knows how Windows users go from modifying one scanned slide to doing many of them. Perhaps they hope their program has yet another home-brew batch system with home-brew filters.
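On the Unix side, the batch case is at least a one-liner, assuming ImageMagick is installed and a filtered/ output directory already exists (a sketch; swap in whatever filter is actually wanted):
for f in *.jpg; do convert "$f" -sharpen 0x2 "filtered/$f"; done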