[haiku-commits] Re: r35129 - haiku/trunk/src/kits/tracker

  • From: Ingo Weinhold <ingo_weinhold@xxxxxx>
  • To: haiku-commits@xxxxxxxxxxxxx
  • Date: Mon, 18 Jan 2010 00:25:03 +0100

On 2010-01-17 at 21:19:15 [+0100], Axel Dörfler <axeld@xxxxxxxxxxxxxxxx> 
wrote:
> Stephan Assmus <superstippi@xxxxxx> wrote:
> > On 2010-01-17 at 20:00:35 [+0100], Axel Dörfler <axeld@xxxxxxxxxxxxxxxx>
> > wrote:
> > > superstippi@xxxxxx wrote:
> > > > Implemented display of current size/total size copied and current
> > > > copy speed.
> > > > It will begin to display after a short time (10 seconds).
> > > Nice! But isn't 10 seconds much too long? Many operations won't even
> > > take that long, and I would still like to know the copy speed in
> > > those cases - it might already be more or less accurate much earlier
> > > (like 2 seconds I would guess).
> > Unfortunately, it's quite erratic with our current I/O scheduler. With
> > 2 seconds intervals, it may change for example between 70 MB/s and 7
> > MB/s (same harddisk). Also, if the whole process lasts less than 10
> > seconds anyway, I don't really see the point in knowing how fast it
> > is. In any case, it is possible to display it almost immediately by
> > changing the algorithm slightly. I'll play with that. It will
> > certainly be at the expense of raising false hopes in the beginning of
> > a copy process... :-D
> 
> Thanks! I'd say we could keep it like this for a while, and if we then
> decide waiting a bit more is the better idea after all, then we could
> just change it back (or find something inbetween that works good
> enough).

A few weeks ago I was thinking about how one could estimate reasonable 
latencies for media nodes that need to do I/O. It would be relatively easy 
for the I/O scheduler to collect statistics and to provide an API to make 
those accessible to userland. Unfortunately it's not trivial to infer from 
a given file path what the underlying responsible I/O scheduler is. Even 
worse, for a network file system that doesn't help at all. So while the I/O 
scheduler stats are already helpful info (also for gadgets like 
ProcessController and ActivityMonitor), a FS interface extension to 
report/estimate stats would be needed as well.

Back to the issue at hand: Such data provided by the file system could be 
used to compute worst/average case estimates for the copy operation even 
before starting. In fact those could be way more reliable than estimates 
computed from measuring the first seconds of the process.
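
As a rough illustration of such an up-front estimate, here is a minimal 
sketch. Everything in it is hypothetical: no FS statistics API exists, so 
ThroughputStats and estimate_copy_seconds are made-up names, and the 
assumption that the slower of the two volumes bounds the sustained rate is 
a simplification.

```python
from dataclasses import dataclass


@dataclass
class ThroughputStats:
    """Hypothetical per-volume figures a file system might report."""
    average_bytes_per_sec: float
    worst_bytes_per_sec: float


def estimate_copy_seconds(total_bytes, source, target):
    """Return (average_case, worst_case) copy duration in seconds.

    Assumption: the slower of the two volumes limits the sustained rate.
    """
    average_rate = min(source.average_bytes_per_sec,
                       target.average_bytes_per_sec)
    worst_rate = min(source.worst_bytes_per_sec,
                     target.worst_bytes_per_sec)
    return total_bytes / average_rate, total_bytes / worst_rate
```

Both figures would be available before the first byte is copied, which is 
the point of the idea above.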

> The I/O scheduler and most importantly, the write back strategy will
> definitely change in the future, so that might bring more stable
> numbers, too.

The main problem will remain, though: If you start a copy operation with 
empty caches and the source and target FS lie on different drives, the copy 
speed of the first seconds should be bound only by the read speed of the 
source. With a lot of RAM that initial phase might actually last considerably 
longer than just a few seconds. Of course there are other important 
factors -- like what kinds of files are copied (small ones with lots 
of attributes vs. huge ones) or whether other I/O on the same disk is going 
on in parallel -- but I guess one will mostly get too optimistic estimates 
when computing them based on the actual progress made in the first few 
seconds.
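
The caching effect can be sketched with a deliberately simplified model. 
It assumes a write cache of fixed size, constant read and write rates, and 
no other I/O; the function names are made up for illustration and do not 
describe how the Haiku caches actually behave.

```python
def seconds_until_cache_full(cache_bytes, read_rate, write_rate):
    """Time until the write cache fills, i.e. how long the burst lasts."""
    if read_rate <= write_rate:
        # The target keeps up with the source; the cache never fills.
        return float("inf")
    return cache_bytes / (read_rate - write_rate)


def copied_bytes_seen(elapsed, cache_bytes, read_rate, write_rate):
    """Bytes the copy appears to have processed after `elapsed` seconds."""
    burst = seconds_until_cache_full(cache_bytes, read_rate, write_rate)
    if elapsed <= burst:
        # Progress is bound only by the read speed of the source.
        return read_rate * elapsed
    # Once the cache is full, progress is bound by the write speed.
    return read_rate * burst + write_rate * (elapsed - burst)
```

With a 630 MB cache, a 70 MB/s source and a 7 MB/s target, the burst in this 
model lasts 10 seconds, and an estimate based on those 10 seconds would be 
off by a factor of ten.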

Anyway, an FS performance estimate API doesn't exist yet, and since it will 
be quite a bit of work, I've decided not to work on something like this for 
the time being. So unless someone else makes it happen, it won't be 
available anytime soon. To improve the ETAs, the following could be 
considered:

* Assume that the transfer rate for the first seconds is too optimistic and 
use a lower value (e.g. start with 1/2) for computing estimates. The total 
memory size could be taken into account to guess when the measured figures 
will not be influenced by short-term caching effects anymore.

* When the estimate suggests a short total time, I'd start displaying it 
earlier, e.g. after 10% of the estimated time (with an absolute minimum of 
1 or 2 seconds).

* Tracker could store copy stats and use them for estimates for later 
operations. Possibly even persistently, though that might be over the top.
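
The three suggestions above could be sketched roughly as follows. This is 
not Tracker code: all names are hypothetical, the 1/2 discount and the 10% / 
2 second thresholds are simply the example values from the list, and 
warmup_seconds is left as a parameter that a caller might derive from the 
total memory size.

```python
def estimate_remaining_seconds(bytes_done, bytes_total, elapsed,
                               warmup_seconds, discount=0.5):
    """ETA that distrusts the measured rate during the warm-up phase."""
    if elapsed <= 0 or bytes_done <= 0:
        return None  # nothing measured yet
    rate = bytes_done / elapsed
    if elapsed < warmup_seconds:
        # Early figures are assumed too optimistic (caching effects).
        rate *= discount
    return (bytes_total - bytes_done) / rate


def should_display(elapsed, estimated_total_seconds,
                   fraction=0.1, minimum_seconds=2.0):
    """Show the estimate after a fraction of the expected total time."""
    return elapsed >= max(minimum_seconds,
                          fraction * estimated_total_seconds)


class CopyHistory:
    """Accumulates stats of finished copies for later estimates."""

    def __init__(self):
        self._bytes = 0
        self._seconds = 0.0

    def record(self, bytes_copied, seconds):
        self._bytes += bytes_copied
        self._seconds += seconds

    def average_rate(self):
        """Overall bytes per second, or None if nothing was recorded."""
        if self._seconds <= 0:
            return None
        return self._bytes / self._seconds
```

A stored average rate from CopyHistory could serve as the initial estimate 
until enough of the current operation has been measured.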

CU, Ingo
