liam_on_linux: (Default)
Someone at $JOB said that they really wished that rsync could give a fairly close estimate of how long a given operation would take to complete. I had to jump in...

Be careful what you wish for.

Especially that "close" in there, which is a disastrous request!

AIUI...

It can't do that, because the way it works is comparing files on source and destination block-by-block to work out if they need to be synched or not.

To give an estimate, it would have to do that twice, and thus, its use would be pointless. Rsync is not a clever copy program. Rsync exists to synch 2 files/groups of files without transmitting all the data they contain over a slow link; to do the estimate you ask would obviate its raison d'être.
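To make that concrete, here is a minimal Python sketch of a block-by-block comparison. It is emphatically not rsync's real algorithm, which uses rolling checksums and is a good deal cleverer; the function names, the 4-byte block size and the sample data are all made up for illustration. The only point is that you can't know how many blocks need sending until you've read and hashed both sides, and that is most of the work.

```python
import hashlib

BLOCK_SIZE = 4  # deliberately tiny so the example is easy to follow (an assumption)


def block_digests(data: bytes, block_size: int = BLOCK_SIZE) -> list[str]:
    """Hash each fixed-size block of the data."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def blocks_to_send(source: bytes, dest: bytes, block_size: int = BLOCK_SIZE) -> list[int]:
    """Return the indices of source blocks that are missing or different at the destination."""
    src = block_digests(source, block_size)
    dst = block_digests(dest, block_size)
    return [i for i, digest in enumerate(src) if i >= len(dst) or digest != dst[i]]


source = b"aaaabbbbccccdddd"
dest = b"aaaaXXXXccccdddd"
changed = blocks_to_send(source, dest)
# Only now, after hashing everything, do we know how much there is to transfer.
print(f"{len(changed)} of {len(block_digests(source))} blocks differ: {changed}")
```

Working out that number just to show an estimate, and then doing it all again for the real transfer, is the "doing it twice" problem.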

If it just looked at file sizes, the estimate would be wildly pessimistic, and thus make the tool far less attractive, and that would have led to it not being used and never becoming a success.

Secondly, by comparison: people clearly asked for this from the Windows developers, and commercial s/w being what it is, they got it.

That's how on Win10 you get a progress bar for all file operations. Which means deleting a 0-byte file takes as long as deleting a 1-gigabyte file: it has to simulate the action first, in order to show the progress, so everything now has a built-in multi-second-long delay (far longer than the actual operation) so it can display a fancy animated progress bar and draw a little graph, and nothing happens instantly, not even the tiniest operations.

Thus a harmless-sounding UI request completely obviated the hard work that went into optimising NTFS, which for instance stores tiny files inside the file system indices so they take no disk sectors at all, meaning less head movement too.

All wasted because of a UI change.

Better to have no estimate than a wildly inaccurate estimate or an estimate that doubles the length of the task.

Yes, some other tools do give a min/max time estimate.

There are indeed far more technically-complex solutions, like...

(I started to do this in pseudocode but I quickly ran out of width, which tells you something)

* start doing the operation, but also time it
* if the time is more than (given interval)
* display a bogus progress indicator, while you work out an estimate
* then start displaying the real progress indicator
* while continuing the operation, which means your estimate is now inaccurate
* adjust the estimate to improve its accuracy
* until the operation is complete
* show the progress bar hitting the end
* which means you've now added a delay at the end
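For what it's worth, the steps above come out something like this as a rough Python sketch. The class name, the `show_after` parameter and the injectable `clock` are my own inventions for illustration, not anybody's real implementation. Note that it quietly assumes you already know `total_units`, which is exactly the expensive thing to find out.

```python
import time
from typing import Callable, Optional


class ProgressEstimator:
    """Estimate the remaining time from the average rate observed so far."""

    def __init__(self, total_units: int, show_after: float = 2.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.total_units = total_units  # assumed known in advance
        self.show_after = show_after    # the "given interval" before showing anything
        self.clock = clock
        self.start = clock()
        self.done = 0

    def update(self, units: int) -> Optional[float]:
        """Record finished work; return estimated seconds left, or None if it is too early."""
        self.done += units
        elapsed = self.clock() - self.start
        if elapsed < self.show_after or self.done == 0:
            return None  # short operations never show an estimate
        rate = self.done / elapsed  # average units per second so far
        return (self.total_units - self.done) / rate


# A fake clock keeps the example deterministic.
fake_now = [0.0]
estimator = ProgressEstimator(100, show_after=2.0, clock=lambda: fake_now[0])
fake_now[0] = 1.0
print(estimator.update(10))  # None: still inside the initial interval
fake_now[0] = 5.0
print(estimator.update(40))  # 5.0: 50 units in 5 s, so 50 units left at 10 units/s
```

Every call to `update` revises the estimate, so the number on screen keeps changing as the rate changes.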

So you get a progress meter throughout which only shows for longer operations, but it delays the whole job.

This is what Windows Vista did, and it was a pain.

And as we all know, for any such truism, there is an XKCD for it.
https://xkcd.com/612/

That was annoying. So in Win10 someone said "fix it". Result, it now takes a long time to do anything at all, but there's a nice progress bar to look at.

So, yeah, no. If you want a tool that does its job efficiently and as quickly as possible, no, don't try to put a time estimate in it.

Non-time-based, non-proportional progress indicators are fine.

E.g. "processed file XXX" which increments, or "processed XXX $units_of_storage"
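A minimal Python sketch of that kind of indicator, with made-up file sizes and a hypothetical `format_processed` helper. It only ever reports what has been done, never a total or a time.

```python
def format_processed(files_done: int, bytes_done: int) -> str:
    """Report work completed so far, with no total and no time estimate."""
    return f"files processed: {files_done}, data processed: {bytes_done / 1_000_000:.1f} MB"


file_sizes = [1_500_000, 250_000, 4_000_000]  # illustrative sizes in bytes
bytes_done = 0
for count, size in enumerate(file_sizes, start=1):
    bytes_done += size  # stands in for actually copying the file
    print(format_processed(count, bytes_done))
```

Nothing here needs a scan of the whole job before starting, which is why it costs next to nothing.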

But they don't tell you how long it will take, and that annoys people. They ask "if you can tell me how much you've done, can't you tell me what fraction of the whole that is?" Well, no, not without doing a potentially big operation before beginning work which makes the whole job bigger.

And the point of rsync is that it speeds up work over slow links.

Summary:

Estimates are hard. Close estimates are very hard. Making the estimate makes the job take much longer (generally, at a MINIMUM twice as long). Poor estimates are very annoying.

So, don't ask for them.

TL;DR Executive summary (which nobody at Microsoft was brave enough to do):

"No."

This was one of those things that for a long time I just assumed everyone knew... then it became apparent in the last ~dozen years (since Vista) that lots of people didn't know, and indeed, that this lack of knowledge was percolating up the chain.

The time it hit me personally was upgrading a customer's installation of MS Office XP to SR1. This was so big, for the time -- several hundred megabytes, zipped, in 2002 and thus before many people had broadband -- that optionally you could request it on CD.

He did.

The CD contained a self-extracting Zip that extracted into the current directory. So you couldn't run it directly from the CD. It was necessary to copy it to the hard disk, temporarily wasting ¼ GB or so, then run it.

The uncompressed files would have fitted on the CD. That was a warning sign; several people failed in attention to detail and checks.

(Think this doesn't matter? The tutorial for Docker instructs you to install a compiler, then build a copy of MongoDB (IIRC) from source. It leaves the compiler and the sources in the resulting container. This is the exact same sort of lack of attention to detail. Deploying that container would waste a gigabyte or so per instance, and thus waste space, energy, machine time, and cause over-spend on cloud resources.

All because some people just didn't think. They didn't do their job well enough.)

So, I copied the self-extractor, I ran it, and I started the installation.

A progress bar slowly crept up to 100%. It took about 5-10 minutes. The client and I watched.

When it got to 100%... it went straight back to zero and started again.

This is my point: progress bars are actually quite difficult.

It did this seven times.

The installation of a service release took about 45 minutes, three-quarters of an hour, plus the 10 minutes wasted because an idiot put a completely unnecessary download-only self-extracting archive onto optical media.

The client paid his bill, but unhappily, because he'd watched me wasting a lot of expensive time because Microsoft was incompetent at:

[1] Packaging a service pack properly.
[2] Putting it onto read-only media properly.
[3] Displaying a progress bar properly.

Of course it would have been much easier and simpler to just distribute a fresh copy of Office, but that would have made piracy easier, and this product is proprietary software and one of Microsoft's main revenue-earners, so it's understandable that they didn't want to do that.

But if the installer had just said:

Installation stage x/7:
Progress: [XXXXXXXXXX..........]

That would have been fine. But it didn't. It went from 0 to 100%, seven times over, probably because first the Word team's patch was installed, then the Excel team's patch, then the Powerpoint team's patch, then the Outlook team's patch, then the Access team's patch, then the file import/export filters team's patch, etc. etc.
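It isn't hard to do, either. Here's a minimal Python sketch that renders exactly that display; the `render_stage` name and the 20-character bar width are my own choices for illustration, and it has nothing to do with how the real installer worked.

```python
def render_stage(stage: int, total_stages: int, fraction: float, width: int = 20) -> str:
    """Render a stage counter plus a text progress bar for the current stage."""
    fraction = min(max(fraction, 0.0), 1.0)  # clamp so the bar can never overrun
    filled = int(fraction * width)
    bar = "X" * filled + "." * (width - filled)
    return f"Installation stage {stage}/{total_stages}:\nProgress: [{bar}]"


print(render_stage(3, 7, 0.5))
# Installation stage 3/7:
# Progress: [XXXXXXXXXX..........]
```

The bar still resets seven times, but the stage counter tells the user why.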

Poor management. Poor attention to detail. Lack of thought. Lack of planning. Major lack of integration and overview.

But this was just a service release. Those are unplanned; if the apps had been developed and tested better, in a language immune to buffer overflows and which didn't permit pointer arithmetic and so on, it wouldn't have been necessary.

But the Windows Vista copy dialog box, as parodied in XKCD -- that's taking orders from poorly-trained management who don't understand the issues, because someone didn't think it through or explain it, or because someone got promoted to a level they were incompetent for.

https://en.wikipedia.org/wiki/Peter_principle

These are systemic problems. Good high-level management can prevent them. Open communications, where someone junior can point out issues to someone senior without fear of being disciplined or dismissed, can help.

But many companies lack this. I don't know yet if $DAYJOB has sorted these issues. I can confirm from bitter personal experience that my previous FOSS-centric employer suffered badly from them.

Of course, some kind of approximate estimate, or incremental progress indicator for each step, is better than nothing.

Another answer is to concede that the problem is hard, and display a "throbber" instead: show an animated widget that shows something is happening, but not how far along it is. That's what the Microsoft apps team often does now.

Personally, I hate it. It's better than nothing but it conveys no useful information.

Doing an accurate estimator based on integral speed tests is also significantly tricky and can slow down the whole operation. Me personally, I'd prefer an indicator that says "stage 6 of 15, copying file 475 of 13,615."

I may not know which files are big or small, which stages will be quick or slow... but I can see what it's doing, I can make an approximate estimate in my head, and if it's inaccurate, well, I can blame myself and not the developer.

And nobody has to try to work out what percent of an n stage process with o files of p different sizes they're at. That's hard for someone to work out, and it's possible that someone can't tell them a correct number of files or something... so you can get progress bars that go to 87% and then suddenly end, or that go to 106%, or that go to 42% and then sit there for an hour, and then do the rest in 2 seconds.

I'm sure we've all seen all of those. I certainly have.
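The arithmetic behind the 87% and 106% cases is trivial, which is rather the point. A tiny Python sketch, with invented file counts and a hypothetical `percent_shown` helper: the percentage is only as good as the total it is divided by.

```python
def percent_shown(done: int, estimated_total: int) -> int:
    """Percentage the progress bar displays, given a possibly wrong total."""
    return round(100 * done / estimated_total)


actual_files = 530  # what the job really turned out to contain

print(percent_shown(actual_files, 500))  # 106: the total was underestimated
print(percent_shown(actual_files, 609))  # 87: the total was overestimated, so the bar ends early
```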
I stumbled across an old article of mine earlier, and tweeted it. Sadly, the server seems to have noticed and slapped a paywall onto it. So, on the basis that I wrote the bally thing anyway, here's a copy of the text for posterity, grabbed from Google's cache. Typos left from the original.

FEATURE - Server integration - Window onto Unix

If you want to access a Unix box from a Windows PC you might feel that the world is against you. Although Windows wasn't designed with Unix integration in mind there is still a range of third-party products that can help. Liam Proven takes you through a selection of the better-known offerings.

10 March 1998

Although Intel PCs running some variant of Microsoft Windows dominate the desktop today, Unix remains strong as a platform for servers and some high-end graphics workstations. While there's something to be said in favour of desktop Unix in cost-of-ownership terms, it's generally far cheaper to equip users with commodity Windows PCs than either Unix workstations or individual licences for the commercial Unix offerings, such as Sun's Solaris or SCO's products, that run on Intel PCs.

The problem is that Windows was not designed with Unix integration as a primary concern. Granted, the latest 32-bit versions are provided with integrated Internet access in the form of TCP/IP stacks and a web browser, but for many businesses, a browser isn't enough.

These power users need more serious forms of connectivity: access to Unix server file systems, text-based applications and graphical Unix programs.

These needs are best met by additional third-party products. Most Unix vendors offer a range of solutions, too many to list here, so what follows is a selection of the better-known offerings.

Open access

In the 'Open Systems' world, there is a single, established standard for sharing files and disks across Lans: Network File System (NFS). This has superseded the cumbersome File Transfer Protocol (FTP) method, which today is mainly limited to remote use, for instance in Internet file transfers.

Although, as with many things Unix, it originated with Sun, NFS is now the de facto standard, used by all Unix vendors. In contrast to FTP, NFS allows a client to mount part of a remote server's filesystem as if it were a local volume, giving transparent access to any program.

It should come as no surprise that no version of Windows has built-in NFS support, either as a client or a server. Indeed, Microsoft promotes its own system as an alternative to NFS under the name of CIFS. Still, Microsoft does include FTP clients with its TCP/IP stacks, and NT Server even includes an FTP server. Additionally, both Windows 95 and NT can print to Unix print queues managed by the standard LPD service.

It is reasonably simple to add NFS client support to a small group of Windows PCs. Probably the best-regarded package is Hummingbird's Maestro (formerly from Beame & Whiteside), a suite of TCP/IP tools for Windows NT and 95. In addition to an NFS client, it also offers a variety of terminal emulations, including IBM 3270 and 5250, Telnet and an assortment of Internet tools. A number of versions are available including ones to run alongside or independently of Microsoft's TCP/IP stack. DOS and Windows 3 are also provided for.

There is also a separate NFS server to allow Unix machines to connect to Windows servers.

If there are a very large number of client machines, though, purchasing multiple licences for an NFS package might prove expensive, and it's more cost-effective to make the server capable of serving files using Windows standards. Effectively, this means the Server Message Block (SMB) protocol, the native 'language' of Microsoft's Lan Manager, as used in everything from Windows for Workgroups to NT Server.

Lan Manager - or, more euphemistically, LanMan - has been ported to run on a range of non-MS operating systems, too. All Microsoft networking is based on LanMan, so as far as any Windows PCs are concerned, any machine running LanMan is a file server: a SCO Unix machine running VisionFS, or a Digital Unix or OpenVMS machine running PathWorks. For Solaris systems, SunLink PC offers similar functionality.

It's completely transparent: without any additional client software, all network-aware versions of Windows (from Windows 3.1 for Workgroups onwards) can connect to the disks and printers on the server. For DOS and Windows 3.1 clients, there's even a free LanMan (Dos-based) client available from Microsoft. This can be downloaded from www.microsoft.com or found on the NT Server CD.

Samba in the server

So far, so good - as long as your Unix vendor offers a version of LanMan for its platform. If not, there is an alternative: Samba. This is a public domain SMB network client and server, available for virtually all Unix flavours. It's tried and tested, but traditionally-minded IT managers may still be biased against public domain software. Even so, Samba is worth a look; it's small and simple and works well. It only runs over TCP/IP, but this comes as standard with 32-bit Windows and is a free add-on for Windows 3. A Unix server with Samba installed appears in "Network Neighborhood" under Windows as another server, so use is completely transparent.

File and print access is fine if all you need to do is gain access to Unix data from Windows applications, but if you need to run Unix programs on Windows, it's not enough. Remote execution of applications is a built-in feature of the Unix operating system, and works in three basic ways.

The simplest is via the Unix commands rexec and rsh, which allow programs to be started on another machine across the network. However, for interactive use, the usual tools are Telnet, for text-terminal programs, and the X Window System (or X) for GUI applications.

Telnet is essentially a terminal emulator that works across a TCP/IP network, allowing text-based programs to be used from anywhere on the network. A basic Telnet program is supplied free with all Windows TCP/IP stacks, but only offers basic PC ANSI emulation. Traditional text-based Unix applications tend to be designed for common text terminals such as the Digital VT220 or Wyse 60, and use screen controls and keyboard layouts specific to these devices, which the Microsoft Telnet program does not support.

A host of vendors supply more flexible terminal emulators with their TCP/IP stacks, including Hummingbird, FTP Software, NetManage and many others. Two specialists in this area are Pericom Software and J River.

Pericom's Teem range of terminal emulators is probably the most comprehensive, covering all major platforms and all major emulations. J River's ICE range is more specific, aiming to connect Windows PCs to Unix servers via TCP/IP or serial lines, providing terminal emulation, printing to Unix printers and easy file transfer.

Unix moved on from its text-only roots many years ago and modern Unix systems have graphical user interfaces much like those of Windows or the MacOS. The essential difference between these and the Unix GUI, though, is that X is split into two parts, client and server. Confusingly, these terms refer to the opposite ends of the network than in normal usage: the X server is the program that runs on the user's computer, displaying the user interface and accepting input, while the X client is the actual program code running on a Unix host computer.

The X factor

This means that all you need to allow PCs to run X applications is an X server for MS Windows - and these are plentiful. While Digital, Sun and other companies offer their own X servers, one of the best-regarded third-party offerings, Exceed, again comes from Hummingbird. With an MS Windows X server, users can log-in to Unix hosts and run any X-based application as if they were using a Unix workstation - including the standard X terminal emulator xterm, making X ideal for mixed graphical and character-based work.

The only drawback of using terminal emulators or MS Windows X servers for Unix host access is the same as that for using NFS: the need for multiple client licences. However, a radical new product from SCO changes all that.

The mating game

Tarantella is an "application broker": it shifts the burden of client emulation from the desktop to the server. In short, Tarantella uses Java to present a remote desktop or "webtop" to any client computer with a Java-capable web browser. From the webtop, the user can start any host-based application to which they have rights, and Tarantella downloads Java code to the client browser to provide the relevant interface - either a terminal emulator for character-based software or a Java X emulator for graphical software.

The host software can be running on the Tarantella server or any other host machine on the network, meaning that it supports most host platforms - including Citrix WinFrame and its variants, which means that Tarantella can supply Windows applications to all clients, too.

Tarantella is remarkably flexible, but it's early days yet - the first version only appeared four months ago. Currently, Tarantella is confined to running on SCO's own UnixWare, but versions are promised for all major Unix variants and Windows NT.

There are plenty of ways to integrate Windows and Unix environments, and it's a safe bet that whoever your Unix supplier is they will have an offering - but no single product will be perfect for everyone, and those described here deserve consideration. Tarantella attempts to be all things to all system administrators, but for now, only if they are running SCO. It's highly likely, though, that it is a pointer to the way things will go in the future.

USING WINDOWS FROM UNIX

There are a host of solutions available for accessing Unix servers from Windows PCs. Rather fewer go the other way, allowing Unix users to use Windows applications or data stored on Windows servers.

For file-sharing, it's easiest to point out that the various solutions outlined in the main article for accessing Unix file systems from Windows will happily work both ways. Once a Windows machine has access to a Unix disk volume, it can place information on to that volume as easily as it can take it off.

For regular transfers, or those under control of the Unix system, NFS or Samba again provide the answer. Samba is both a client and a server, and Windows for Workgroups, Windows 95 and Windows NT all offer server functionality.

Although a Unix machine can't access the hard disk of a Windows box which is only running an NFS client, most NFS vendors also offer separate NFS servers for Windows. It would be unwise, at the very least, to use Windows 3 or Windows 95 as a file server, so this can reasonably be considered to apply mainly to PCs running Windows NT.

Here, the licensing restrictions on NT come into play. NT Workstation is only licensed for 10 simultaneous incoming client connections, so even if the NFS server is not so restricted, allowing more than this violates Microsoft's licence agreement. Different versions of NT Server allow different numbers of clients, and additional licences are readily available from Microsoft, although versions 3.x and 4 of NT Server do not actually limit connections to the licensed number.

There are two routes to running Windows applications on Unix workstations: emulating Windows itself on the workstation, or adding a multi-user version of Windows NT to the Unix network.

Because there are so many applications for DOS and Windows compared to those for all other operating system platforms put together, several companies have developed ways to run Windows, or Windows programs, under Unix. The simplest and most compatible method is to write a Unix program which emulates a complete Intel PC, and then run an actual copy of Windows on the emulator.

This has been done by UK company Insignia, whose SoftWindows was developed with assistance from Microsoft itself. SoftWindows runs on several Unix architectures including Solaris, IRIX, AIX and HP-UX (as well as the Apple Macintosh), and when running on a powerful workstation is very usable.

A different approach was tried by Sun with Wabi. Wabi once stood for "Windows Application Binary Interface", but for legal reasons, this was changed, and now the name doesn't stand for anything. Wabi translates Windows API calls into their Unix equivalents, and emulates an Intel 386 processor for use on RISC systems. This enables certain 16-bit Windows applications, including the major office suites, to run under Unix, without requiring an actual copy of Microsoft Windows. However, it isn't guaranteed to run any Windows application, and partly due to legal pressure from Microsoft, development was halted after the 16-bit edition was released.

It's still on sale, and versions exist for Sun Solaris, SCO Unix and Caldera OpenLinux.

Both these approaches are best suited to a small number of users who don't require high Windows performance. For many users and high-performance, Insignia's NTrigue or Tektronix' WinDD may be better answers. Both are based on Citrix WinFrame, which is a version of Windows NT Server 3.51 licensed from Microsoft and adapted to allow true multi-user access. While WinFrame itself uses the proprietary ICA protocol to communicate with clients, NTrigue and WinDD support standard X Windows, allowing Unix users to log-in to a PC server and remotely run 32-bit Windows software natively on Intel hardware.
