But where is it leading, beyond that point? I have absolutely no concrete idea. But the end point? I've read one brilliant model.
It's in one of the later Foundation books by Isaac Asimov, IIRC. (Not a series I'm that enamoured of, actually.)
A guy gets (steals?) a space yacht: a small, 1-man starship. (Set aside the plausibility of this.)
He searches the ship's crew quarters. In its few luxury rooms, there is no cockpit. No controls, no instruments, nothing. He is bemused.
He returns to the comfiest room, the main stateroom, i.e. cabin/bedroom. In it there is a large, bare dressing table with a comfy seat in front of it. He sits.
Two handprints appear, projected on the surface of the desk, shaped in light.
He studies them. They're just hand-shaped spots of light. He puts his hands on them.
And suddenly, he is much smarter. He knows the ship's position and speed in space. He knows where all the nearby planetary bodies are, their gravity wells, the speeds needed to reach them and enter orbit.
Thinking of the greater galaxy, he knows where all the nearby stars are, their masses, their luminosities, their planetary systems. Merely thinking of a planet, he knows its cities, ports, where to orbit it, etc.
All this knowledge is there in his mind if he wants it; if he allows his attention to move elsewhere, it's gone.
He sits back, shocked. His hands lift from the prints on the desk, and it all disappears.
That is the ultimate UI. One you don't know is there.
Any UI where there are metaphors and abstractions and controls you must operate is inferior; direct interaction is better. We've moved from text views of marked-up files with arcane names in folder hierarchies to today: hi-res, full-colour, moving images of fully-formatted documents and images. That's great.
Some people are happily directly manipulating these — drawing and stroking screens with all their fingers, interacting naturally. Push up to see the bottom of a document, tap on items of interest. It's so natural pre-toddlers can do it.
But many old hands still like their pointing hardware and little icons on screen that they can twiddle with their special pointing devices, and they shout angrily that it's more precise and it's tried and tested and it works.
Show them something better and no, it's a toy. OK for idly surfing the web, or reading, or watching movies, but no substitute for the "real thing".
It's a toy and the mere idea that these early versions could in time grow into something that could replace their 4-box Real Computer of System Unit, Monitor, Mouse and Keyboard is a nonsensical piece of idiocy.
Which is exactly what their former bosses and their tutors said about the Mac's UI 30 years ago. It's doubtless what they said about the tinker-toy CP/M boxes a decade before that, and so on.
I'm guilty too. I am using a 25-year-old keyboard on my tiny, silent, near-unexpandable 2011 Mac mini, attached via a convertor that cost more than the keyboard and about a third as much as the Mac itself. I don't have a tablet; I don't personally like them much. I like my phablet, though. I gave away my Magic Trackpad - I didn't like it.
(And boy did my friends in the FOSS community curse me out for buying a Mac. I'm a traitor and a coward, apparently.)
But although I personally don't want this stuff, nonetheless, I think it's where we're going.
If adding more layers of abstraction to the system means we can remove layers of abstraction from the human-computer interface, then I'm all for it. The more we can remove, the simpler and easier and clearer the computers we can make, the better. And if we can make them really small and cheap and thus give one to every child in the poorer countries of the world — I'd be delighted.
If the price was putting Microsoft and Apple out of business and destroying the career of everyone working with Windows, and replacing it all with that nasty cancerous GPL and Big-Brother-like services like Google: still worth it.