[SLL] limit on number of files in a directory and hashed dir vs. flat dir file access time
Chuck Wolber
chuckw at quantumlinux.com
Wed Oct 31 23:13:55 PDT 2007
On Thu, 1 Nov 2007, Ana wrote:
> > It would be an interesting data analysis problem to undertake. I've
> > informally undertaken a test and I do not see that the value creeps up
> > as the directory gets "fuller". My data set was too small to produce any
> > useful results, but it does show that if there is an effect, it is
> > subtle.
>
> It is a very interesting problem.
Indeed
> The test you describe will measure "seek-and-open" time as it relates to
> single directory size, right? That's probably the most telling test
> you could make.
It's a self-consistency issue. Stuff like drive caching, etc. cancels out.
All we care about is the slope of the graph, not the actual values on the
graph. My hypothesis is that the slope is near zero as size increases.
Interestingly, if the slope is greater than zero, it may not necessarily
point to an ext3 inefficiency. It could very well be the drive itself.
Thus the test should be done over a variety of drives.
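A rough sketch of how the slope could be estimated from the measurements, assuming they have been collected as "size time" pairs, one per line. The function name slope and the input layout are illustrative choices, not part of any existing tool. It fits an ordinary least-squares line with awk and prints only the slope, since the absolute values do not matter here.

```shell
#!/bin/sh
# slope: read "directory_size access_time" pairs from stdin (one per line)
# and print the least-squares slope of time against size.
# The name and input format are assumptions made for this sketch.
slope() {
    awk '
        { n++; sx += $1; sy += $2; sxx += $1 * $1; sxy += $1 * $2 }
        END {
            d = n * sxx - sx * sx
            if (n < 2 || d == 0) {
                # Fewer than two points, or every size identical: no line to fit.
                print "need at least two distinct sizes" | "cat 1>&2"
                exit 1
            }
            printf "%.6g\n", (n * sxy - sx * sy) / d
        }
    '
}
```

A slope near zero would support the hypothesis. Repeating the run on several drives would help separate a filesystem effect from a drive effect.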
> Of course, one thing you always have to take into account is the
> software you're going to be using. For instance, given what you've told
> us about rm, if we know that rm is the only tool we have then a tiered
> directory structure might be best no matter how good ext3 is.
Software is a red herring. The test simply calls for opening and closing
the files to prove that they can in fact be accessed and then assessing
how long that access took. You could expand the test by writing something
to the files while they're open. Something as simple as the following
should do:
echo "Hello World" > "$file"
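One way the whole test might be wrapped up, as a minimal sketch rather than a finished benchmark. The function name time_dir and the file naming scheme are made up for illustration. It assumes GNU date (for the %N nanosecond format) and a mktemp that honors TMPDIR, so TMPDIR should point at the filesystem under test. It fills a scratch directory with COUNT files, writes to each one by name, and prints "COUNT average_nanoseconds_per_file".

```shell
#!/bin/sh
# time_dir COUNT: create COUNT files in a scratch directory, then time
# opening, writing to, and closing each one by name.
# Assumes GNU date (%N) and that TMPDIR points at the filesystem under test.
time_dir() {
    n=$1
    [ "$n" -gt 0 ] 2>/dev/null || { echo "usage: time_dir COUNT" >&2; return 1; }
    dir=$(mktemp -d) || return 1

    # Populate the directory first so every timed access sees it at full size.
    i=0
    while [ "$i" -lt "$n" ]; do
        : > "$dir/file$i"
        i=$((i + 1))
    done

    # Timed pass: look each file up by name and write to it.
    start=$(date +%s%N)
    i=0
    while [ "$i" -lt "$n" ]; do
        file="$dir/file$i"
        echo "Hello World" > "$file"
        i=$((i + 1))
    done
    end=$(date +%s%N)

    echo "$n $(( (end - start) / n ))"
    rm -rf "$dir"
}
```

Running it for a range of sizes, for example "for n in 1000 10000 100000; do time_dir $n; done", gives one data point per directory size. The timing includes shell loop overhead, but that is constant per file and so should not change the slope.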
..Chuck..
--
http://www.quantumlinux.com
Quantum Linux Laboratories, LLC.
ACCELERATING Business with Open Technology
"Stay Hungry. Stay Foolish."
-The Whole Earth Catalog