[SLL] limit on number of files in a directory and hashed dir vs. flat dir file access time

Xeno Campanoli xcampanoli at gmail.com
Wed Oct 31 18:34:02 PDT 2007


Chuck Wolber wrote:
> On Wed, 31 Oct 2007, Adam Monsen wrote:
> 
>> Is there a limit on the number of files in a directory other than just 
>> the inode limit of a particular partition? I'm using the ext3 
>> filesystem. Ubuntu 7.10.
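
For reference, the inode limit mentioned in the question can be read for any mounted filesystem with the standard `statvfs` call. A minimal sketch, assuming a Unix-like system; the helper name `inode_usage` is made up for illustration:

```python
import os


def inode_usage(path="."):
    """Return total, free, and used inode counts for the filesystem holding path."""
    st = os.statvfs(path)  # Unix-only wrapper around statvfs(3)
    return {
        "total": st.f_files,   # inodes the filesystem was created with
        "free": st.f_ffree,    # inodes still unallocated
        "used": st.f_files - st.f_ffree,
    }


if __name__ == "__main__":
    print(inode_usage("/"))
```

Some filesystems allocate inodes dynamically and may report zero for these fields, so treat the numbers as meaningful mainly on ext2/ext3-style filesystems.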

Sorry man, but we tested that limit recently and it's exactly as I
documented in my post, which, oddly, hasn't come through.  Hey,
moderator!  Fix yourself!
> 
> In old 2.4 kernels, ext3 performance was inversely proportional to the 
> number of files in a given directory. So in that sense there was a limit. 
> They fixed that in 2.6 with a tree-like data structure. "ls" will still be 
> dog slow because it reads (and by default sorts) the entire directory 
> before printing anything, but file accesses should be fine.
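
The point about "ls" can be worked around by reading directory entries lazily instead of collecting and sorting the whole listing first. A rough sketch using `os.scandir` from the Python standard library; `first_entries` is a hypothetical helper name, and the order of results is whatever the filesystem hands back:

```python
import os
from itertools import islice


def first_entries(path, n=10):
    """Return up to n entry names from path without reading the whole directory."""
    # os.scandir yields entries as the OS returns them, unsorted,
    # so stopping early avoids walking a million-entry directory.
    with os.scandir(path) as entries:
        return [entry.name for entry in islice(entries, n)]


if __name__ == "__main__":
    print(first_entries(".", 5))
```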
> 
> As for a physical limit on the number of files in a directory, if there 
> is one, it would probably have something to do with how many bits are used 
> to represent the tree structure. That number is likely so astronomically 
> large that you'll exceed some other filesystem limit before you hit it, 
> but that's just a guess. 
> 
> Have you tried asking that question on the ext3 developers list? I don't 
> know if this is the *OFFICIAL* list, but Theodore Tso posts there all the 
> time:
> 
> https://www.redhat.com/mailman/listinfo/ext3-users
> 
> Also, have you considered other types of filesystems for what you're 
> trying to do? IIRC ReiserFS actually has some advantages over ext3 for 
> lots of small files, or maybe that was xfs...
> 
> 
>> Follow up question: is it more efficient (as far as reads are concerned) to
>> "hash" files into subdirectories rather than just throw them all in a single
>> directory? For instance, say I have 1 million 100 kilobyte JPEG images named
>> as follows:
>> 000000.jpg
>> 000001.jpg
>> 000002.jpg
>> ...
>> 999999.jpg
>>
>> Would it speed up read time for a particular image if images were placed in
>> directories like:
>>
>> 0/0/0/0/0/1/000001.jpg and
>> 0/2/3/6/1/2/023612.jpg
>>
>> and so on?
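
The layout described in the question can be sketched as a small path-mapping function. This is only an illustration under stated assumptions: `hashed_path` and its `depth` parameter are made-up names, and it simply uses the leading characters of the file name as directory levels, as in the two example paths:

```python
import os
import posixpath


def hashed_path(name, depth=6):
    """Map '023612.jpg' to '0/2/3/6/1/2/023612.jpg' (one directory per leading character)."""
    stem = os.path.splitext(name)[0]
    if len(stem) < depth:
        raise ValueError("file name is shorter than the requested depth")
    # Each of the first `depth` characters becomes one directory level.
    return posixpath.join(*stem[:depth], name)


if __name__ == "__main__":
    print(hashed_path("000001.jpg"))
    print(hashed_path("023612.jpg", depth=2))
```

With six-digit names, `depth=6` leaves exactly one file per leaf directory, so a shallower setting is probably closer to what one would deploy: `depth=2` gives 100 directories of 10,000 files each, and `depth=3` gives 1,000 directories of 1,000 files each.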
> 
> Not according to anything I've ever seen, but it's probably worth a test. 
> Perhaps traversing the logical tree in a directory might add *some* 
> overhead as the tree gets larger, but the 2.6 kernel really did fix a lot 
> of the annoying performance problems in ext. Chopping the images into a 
> directory hierarchy may make sense for other reasons though.
> 
> 
> ..Chuck..
> 
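
Since this is "probably worth a test", here is a rough sketch of one way to time it. Everything in it is an assumption for illustration: the function names, the tiny one-byte files standing in for 100 KB JPEGs, and the default counts. It also measures warm-cache reads only, so a serious run would use far more files and drop the kernel caches between passes.

```python
import os
import random
import tempfile
import time


def make_tree(root, count, depth):
    """Create `count` small files under root, nested `depth` levels by leading digits."""
    paths = []
    for i in range(count):
        name = f"{i:06d}.jpg"
        subdir = os.path.join(root, *name[:depth])  # depth=0 means a flat directory
        os.makedirs(subdir, exist_ok=True)
        path = os.path.join(subdir, name)
        with open(path, "wb") as f:
            f.write(b"x")  # placeholder payload, not a real image
        paths.append(path)
    return paths


def time_reads(paths, samples=1000, seed=0):
    """Time opening and reading `samples` randomly chosen files."""
    rng = random.Random(seed)
    chosen = [rng.choice(paths) for _ in range(samples)]
    start = time.perf_counter()
    for path in chosen:
        with open(path, "rb") as f:
            f.read()
    return time.perf_counter() - start


def compare(count=2000, depth=3):
    """Return elapsed seconds for the same random reads on flat and nested layouts."""
    with tempfile.TemporaryDirectory() as tmp:
        flat = make_tree(os.path.join(tmp, "flat"), count, 0)
        nested = make_tree(os.path.join(tmp, "nested"), count, depth)
        return {"flat": time_reads(flat), "nested": time_reads(nested)}


if __name__ == "__main__":
    print(compare())
```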


-- 
The only sustainable organizing methods focus not on scale,
but on good design of the functional unit,
not on winning battles, but on preservation.

