[SLL] limit on number of files in a directory and hashed dir vs. flat dir file access time

Chuck Wolber chuckw at quantumlinux.com
Wed Oct 31 23:13:55 PDT 2007


On Thu, 1 Nov 2007, Ana wrote:

> > It would be an interesting data analysis problem to undertake. I've 
> > informally undertaken a test and I do not see that the value creeps up 
> > as the directory gets "fuller". My data set was too low to produce any 
> > useful results, but it does show that if there is an effect, it is 
> > subtle.
> 
> It is a very interesting problem.

Indeed.


> The test you describe will measure "seek-and-open" time as it relates to 
> single directory size, right? That's probably the most telling test 
> you could make.

It's a self-consistency issue. Stuff like drive caching, etc., cancels out. 
All we care about is the slope of the graph (access time versus directory 
size), not the actual values on the graph. My hypothesis is that the slope 
stays near zero as the directory size increases.

Interestingly, if the slope is greater than zero, that does not necessarily 
point to an ext3 inefficiency. It could very well be the drive itself. 
Thus the test should be run on a variety of drives.
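
As a rough sketch of how the slope could be checked: assume the
measurements are available as two-column text, one "directory-size
access-time" pair per line. That layout and the function name slope are
assumptions made for illustration, not part of the informal test
mentioned above. An ordinary least-squares fit then gives the slope:

```shell
# slope: read "size time" pairs on stdin and print the least-squares
# slope of time against size. The name and input layout are hypothetical.
slope() {
    awk '
        NF >= 2 { n++; sx += $1; sy += $2; sxx += $1 * $1; sxy += $1 * $2 }
        END {
            d = n * sxx - sx * sx
            if (n < 2 || d == 0) {
                print "need at least two distinct sizes" > "/dev/stderr"
                exit 1
            }
            printf "%.6g\n", (n * sxy - sx * sy) / d
        }'
}

# Usage, assuming results.txt holds the measurements:
#   slope < results.txt
```

A result close to zero on every drive tested would support the
hypothesis; a clearly positive one would not, by itself, say whether the
filesystem or the drive is responsible.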


> Of course, one thing you always have to take into account is the 
> software you're going to be using.  For instance, given what you've told 
> us about rm, if we know that rm is the only tool we have then a tiered 
> directory structure might be best no matter how good ext3 is.

Software is a red herring. The test simply calls for opening and closing 
the files to prove that they can in fact be accessed, and then measuring 
how long that access took. You could expand the test by writing something 
to the files while they're open. Something as simple as the following 
should do:

echo "Hello World" > "$file"
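
A minimal sketch of the whole open-and-close test, under a few
assumptions: GNU date (for the %N nanosecond format), a scratch directory
that is safe to fill, and file names of the form f1, f2, ... The function
names fill, time_opens and run_test are made up for illustration. The
same fixed sample of files is opened after each growth step, so only the
directory size changes between measurements:

```shell
# fill DIR FROM TO: create empty files DIR/fFROM .. DIR/fTO.
fill() {
    i=$2
    while [ "$i" -le "$3" ]; do
        : > "$1/f$i"
        i=$((i + 1))
    done
}

# time_opens DIR COUNT: open and close DIR/f1 .. DIR/fCOUNT for reading
# and print the elapsed time in nanoseconds (assumes GNU date).
time_opens() {
    start=$(date +%s%N)
    i=1
    while [ "$i" -le "$2" ]; do
        : < "$1/f$i"          # the redirection opens, then closes, the file
        i=$((i + 1))
    done
    end=$(date +%s%N)
    echo $((end - start))
}

# run_test DIR STEP MAX SAMPLE: grow DIR by STEP files at a time up to
# MAX files, printing "size nanoseconds" after each step.
# SAMPLE must not exceed STEP, so the sampled files always exist.
run_test() {
    dir=$1 step=$2 max=$3 sample=$4 size=0
    while [ "$size" -lt "$max" ]; do
        fill "$dir" $((size + 1)) $((size + step))
        size=$((size + step))
        echo "$size $(time_opens "$dir" "$sample")"
    done
}

# Usage, in a throwaway directory:
#   run_test "$(mktemp -d)" 1000 100000 100
```

To include writes, the read-only redirection in time_opens could be
replaced with the echo line above. Absolute timings from a shell loop
are noisy; what matters is how they change as the directory grows.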

..Chuck..

-- 
http://www.quantumlinux.com
 Quantum Linux Laboratories, LLC.
 ACCELERATING Business with Open Technology

"Stay Hungry. Stay Foolish."
	-The Whole Earth Catalog

