4/04/2013

04-04-13 - Tabdir

I made a fresh build of "tabdir", my old (old) utility that does a recursive dir listing in tabbed-text format for "tabview".

Download : tabdir 320k zip

tabdir -?
usage : tabdir [opts] [dir]
options:
 -v  : view output after writing
 -p  : show progress of dir enumeration (with interactive keys)
 -w# : set # of worker threads
 -oN : output to name N [r:\tabdir.tab]

This new tabdir is built on Oodle so it has a multi-threaded dir lister for much greater speed. (*)

Also note to self : I fixed tabview so it works as a shell file association. I hit this all the time and always forget it : if something works on the command line but not as a shell association, it's probably because the shell passes you quotes around file names, so you need a little code to strip quotes from args.
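A minimal sketch of that quote-stripping step, in Python rather than whatever tabview is actually written in. The helper names `strip_quotes` and `clean_args` are made up for illustration, and it assumes the program receives the arguments with the quotes still attached:

```python
import sys

def strip_quotes(arg: str) -> str:
    """Remove one pair of double quotes wrapping an argument, if present."""
    if len(arg) >= 2 and arg[0] == '"' and arg[-1] == '"':
        return arg[1:-1]
    return arg  # unquoted (or a lone quote character): leave untouched

def clean_args(argv):
    """Apply strip_quotes to every argument in a list."""
    return [strip_quotes(a) for a in argv]

if __name__ == "__main__":
    # Show what the program would see after cleaning its own arguments.
    print(clean_args(sys.argv[1:]))
```

Only one matching outer pair is removed, so a quote in the middle of a name is left alone.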

Someday I'd like to write an even faster tabdir that reads the NTFS volume directory information directly, but chances are that will never happen.

One odd thing I've spotted with this tabdir is that the Windows SxS Assembly dirs take a ton of time to enumerate on my machine. I dunno if they're compressed or WTF the deal is with them (I pushed it on the todo stack to investigate), but they're like 10X slower than any other dir. (could just be the larger number of files in there; but I mean it's slow *per file*)

I never did this before because I didn't expect multi-threaded dir enumeration to be a big win; I thought it would just cause seek thrashing, and if you're IO bound anyway then multi-threading can't help, can it? Well, it turns out the Win32 dir enum functions have quite a lot of CPU overhead, so multi-threading does in fact help a bit (about 26% less elapsed time going from 1 worker to 4) :

nworkers| elapsed time
1       | 12.327
2       | 10.450
3       | 9.710
4       | 9.130
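For illustration, here is a rough sketch of a multi-threaded directory lister. This is not the Oodle lister tabdir uses; it is a Python stand-in using `os.scandir` and a thread pool, with the made-up names `list_one_dir` and `list_tree`. It walks the tree one level at a time and hands each directory of a level to a worker. Python's GIL limits how much CPU work can overlap, so don't expect it to reproduce the timings above.

```python
import os
from concurrent.futures import ThreadPoolExecutor

def list_one_dir(path):
    """Enumerate a single directory; return (files, subdirs)."""
    files, subdirs = [], []
    try:
        with os.scandir(path) as it:
            for entry in it:
                if entry.is_dir(follow_symlinks=False):
                    subdirs.append(entry.path)
                elif entry.is_file(follow_symlinks=False):
                    size = entry.stat(follow_symlinks=False).st_size
                    files.append((entry.path, size))
    except OSError:
        pass  # unreadable directory: skip it and report nothing
    return files, subdirs

def list_tree(root, num_workers=4):
    """Recursively list (path, size) for all files under root."""
    all_files = []
    frontier = [root]  # directories still to enumerate at this depth
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        while frontier:
            next_frontier = []
            # Each directory in the current level goes to a worker thread.
            for files, subdirs in pool.map(list_one_dir, frontier):
                all_files.extend(files)
                next_frontier.extend(subdirs)
            frontier = next_frontier
    return all_files
```

Going level by level keeps the code simple, at the cost of idle workers at the end of each level; a shared work queue would keep them busier.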

(* = actually the big speed win was not multi-threading; it was that the old tabdir did something rather dumb in the file enum. It would enum all files, and then do GetInfo() on each one to get the file sizes. The new one just uses the file infos that are returned as part of the Win32 enumeration, which is massively faster.)
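The same difference can be sketched in Python (an analogy, not tabdir's code; the function names `sizes_slow` and `sizes_fast` are invented for illustration). On Windows the Win32 enumeration (FindFirstFile / FindNextFile) fills in a record per entry that already includes the file size, and Python's `os.scandir` exposes that cached info through `DirEntry.stat()`, so the second version avoids a separate query per file there. On other platforms `stat()` may still cost a system call.

```python
import os

def sizes_slow(path):
    """Enumerate names first, then query each file separately for its size."""
    out = {}
    for name in os.listdir(path):
        full = os.path.join(path, name)
        if os.path.isfile(full):
            out[name] = os.path.getsize(full)  # one extra query per file
    return out

def sizes_fast(path):
    """Take sizes from the entries produced by the enumeration itself."""
    out = {}
    with os.scandir(path) as it:
        for entry in it:
            if entry.is_file(follow_symlinks=False):
                # On Windows this is served from data the enumeration returned.
                out[entry.name] = entry.stat(follow_symlinks=False).st_size
    return out
```

Both return the same name-to-size mapping for a directory of ordinary files; they can differ on symlinks, which the first version follows and the second does not.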

9 comments:

brucedawson said...

I notice that tabdir defaults to writing to r:\tabdir.tab which seems... odd. It also doesn't give any error messages if (for some odd reason) you don't have an R: drive mounted. I hacked around this with subst, but defaulting to %temp% and adding an error message would help the out-of-box experience.

But, looks useful.

brucedawson said...

I found this oddly hilarious. I've been running clockres more frequently lately so I noticed when my system's timer interval was dropped to 1 ms -- thus wasting power. I investigated with powercfg -energy. Guess who:

Platform Timer Resolution:Outstanding Timer Request

A program or service has requested a timer resolution smaller than the platform maximum timer resolution.

Requested Period 10000
Requesting Process ID 11864
Requesting Process Path \Device\HarddiskVolume3\bin\tabdir.exe

How come tabdir is requesting a lower timer resolution? Is this intentional, or some odd side-effect?

cbloom said...

I always timeBeginPeriod(1) at startup.

I'd never heard of anyone caring. In fact I did some tests at one point where I queried the timer interval *before* setting it, and I always found it was already 1 because some app had set it.

Seems like an unnecessary headache to fight that. The whole idea that it's variable but global and apps get to set it is super broken, BTW.

cbloom said...

Updated. Still defaults output to r:\ cuz that's what I like, but if that fails it writes to temp.

Unknown said...

timeBeginPeriod(1) runs down your laptop's battery.

Chrome is* careful to adjust it based on whether there are pending tasks that need it:
https://code.google.com/p/chromium/issues/detail?id=46531

It seems likely the users of tabdir are different than Chrome's though.


* Looking at the bug tracker, it seems to maybe have regressed since that point.

cbloom said...

I suspect that enumerating your entire disk affects battery more than the tbp(1) ;)

Certainly I agree in principle that things like casual games which might just sit running in the background while a user browses the web should be careful about not consuming excessive resources. Game developers can get stuck in this philosophy that they own the whole machine and get to brutalize it.

johnb said...

"Someday I'd like to write an even faster tabdir that reads the NTFS volume directory information directly, but chances are that will never happen."

Apparently, SwiftSearch does something like this.

http://sourceforge.net/projects/swiftsearch/

(Not a recommendation; I've never used it)

brucedawson said...

I agree that increasing the timer frequency while tabdir is running is not a problem, because it keeps the system busy anyway. However, I would urge you not to routinely set the timer to 1 kHz at startup, because for other programs it is a big deal. Some programs (Chrome, Starry Night, etc.) are on my kill list for when I'm running on battery because they increase the timer frequency unnecessarily.

Timer frequency is, of course, not the only thing that affects power draw, but burning ~0.3 W for no reason seems like a bad idea.

My timer interval is currently 15.6 ms.

cbloom said...

Yeah, totally agree with you that programs which sit idle need to minimize their resource usage.
