Archive

Posts Tagged ‘FAT’

How to optimize reading of large amounts of files

February 21st, 2010 3 comments

Conventional hard drives keep getting faster. While 10 years ago an average transfer speed of 40 MiB/sec was considered ‘fast’, today there are mainstream hard drives which deliver average transfer speeds beyond 100 MiB/sec.
So far, so good. What hasn’t increased much is the average access time. Especially when reading large numbers of relatively small files, a lot of time is spent just waiting for the hard disk to position its arm over the relevant data. No big change compared to a decade ago.

Note: In case you are developing software that exclusively operates on Solid State Disks (SSD) you may stop reading now…

The following idea has been developed for an image indexing engine, which is supposed to scan a large amount of images and extract information within the least possible amount of time. The entire indexing operation was disk-bound. For the main test case, a hard disk with roughly 100,000 image files was chosen.

In order to minimize the time spent waiting for each file being read, it would be worthwhile to read all files in the same physical order in which they are residing on the hard disk. So the key information we need is the physical location of each file on the hard disk.

For this purpose, the FSCTL_GET_RETRIEVAL_POINTERS operation can be used:

struct MyFile { PTSTR sFilename; DWORD64 Lcn; DWORD dwFragments; };
 
void GetStartCluster( PCTSTR sFilename, MyFile& f )
{
    f.sFilename = sFilename;
    f.dwFragments = f.Lcn = 0;
 
    HANDLE hFile = CreateFile( sFilename, GENERIC_READ, FILE_SHARE_READ, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL );
    if ( INVALID_HANDLE_VALUE != hFile )
    {
        const DWORD MAX_CLUSTERS = 100; // actually we only care about the first cluster number
        const DWORD RETRIEVAL_POINTERS_BUFFER_SIZE = sizeof( RETRIEVAL_POINTERS_BUFFER ) + ( 2 * ( MAX_CLUSTERS - 1 ) * sizeof( LARGE_INTEGER ) );
        BYTE output[ RETRIEVAL_POINTERS_BUFFER_SIZE ];
 
        STARTING_VCN_INPUT_BUFFER input;
        input.StartingVcn.QuadPart = 0;
 
        DWORD dwBytesReturned;
        if ( DeviceIoControl( hFile, FSCTL_GET_RETRIEVAL_POINTERS, &input, sizeof( input ), &output, sizeof( output ), &dwBytesReturned, NULL ) )
        {
            RETRIEVAL_POINTERS_BUFFER* p = (RETRIEVAL_POINTERS_BUFFER*)&output;
            if ( p->ExtentCount > 0 && p->StartingVcn.QuadPart == 0 )
            {
                f.dwFragments = p->ExtentCount;
                f.Lcn = p->Extents[ 0 ].Lcn.QuadPart;
            }
            else
                _tprintf( _T("Error: ExtentCount = %d StartingVcn = %I64d\r\n"), p->ExtentCount, p->StartingVcn.QuadPart );
        }
        else
            _tprintf( _T("Error: DeviceIoControl = 0x%08X\r\n"), GetLastError() );
        CloseHandle( hFile ); 
    }
}

Initially, the cluster number and the number of extents (fragments) is retrieved for every file. In a second step, all files are sorted according to the number of fragments and their cluster number. Then they are processed in exactly this new order which leads to a minimized amount of disk ‘thrashing’.
The actual retrieval of file locations using DeviceIoControl is relatively fast, since it only queries information from the file system and not from the file’s data location. Typically this operation is CPU bound and consumes less than 5% of all indexing time.

Taken into account the time needed to retrieve position data and sort files, this approach can save between 20 and 50% of read time, depending on average file size, fragmentation and spacial data distribution. There is no benefit from this method if the majority of files is fragmented. However, fragmentation is more likey to happen with larger files, and the average file size in this case was only 149 KiB. Generally speaking, the performance gain this method can provide increases with decreasing file sizes.

Benchmarking

  • Hard disk: 1 TB 7200 rpm SATA II, avg. access time: 17 ms
  • Windows XP Professional 32 bit booted in protected mode
  • Number of files scanned: 87,418
  • Total file size: 12.46 GiB
  • Average file size: 149 KiB
  • Time for reading files in filesystem order: 506 seconds
  • Average throughput reading files in filesystem order: 25.24 MiB/s
  • Time for reading file position data: 12 seconds
  • Time for sorting files according to their position: 54 ms
  • Time for reading files in sorted order: 347 seconds
  • Total time using new approach: 359 seconds
  • Total throughput using new approach: 35.58 MiB/s
  • Performance increase: 147 seconds or 29 %
  • Reading files in filesystem order takes 1.42 times as long as a pre-sorted read