Archive

Posts Tagged ‘Win32’

How to optimize reading of large amounts of files

February 21st, 2010 3 comments

Conventional hard drives keep getting faster. While 10 years ago an average transfer speed of 40 MiB/sec was considered ‘fast’, today there are mainstream hard drives which deliver average transfer speeds beyond 100 MiB/sec.
So far, so good. What hasn’t increased much is the average access time. Especially when reading large numbers of relatively small files, a lot of time is spent just waiting for the hard disk to position its arm over the relevant data. No big change compared to a decade ago.

Note: In case you are developing software that exclusively operates on Solid State Disks (SSD) you may stop reading now…

The following idea has been developed for an image indexing engine, which is supposed to scan a large amount of images and extract information within the least possible amount of time. The entire indexing operation was disk-bound. For the main test case, a hard disk with roughly 100,000 image files was chosen.

In order to minimize the time spent waiting for each file being read, it would be worthwhile to read all files in the same physical order in which they are residing on the hard disk. So the key information we need is the physical location of each file on the hard disk.

For this purpose, the FSCTL_GET_RETRIEVAL_POINTERS operation can be used:

struct MyFile { PTSTR sFilename; DWORD64 Lcn; DWORD dwFragments; };
 
void GetStartCluster( PCTSTR sFilename, MyFile& f )
{
    f.sFilename = sFilename;
    f.dwFragments = f.Lcn = 0;
 
    HANDLE hFile = CreateFile( sFilename, GENERIC_READ, FILE_SHARE_READ, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL );
    if ( INVALID_HANDLE_VALUE != hFile )
    {
        const DWORD MAX_CLUSTERS = 100; // actually we only care about the first cluster number
        const DWORD RETRIEVAL_POINTERS_BUFFER_SIZE = sizeof( RETRIEVAL_POINTERS_BUFFER ) + ( 2 * ( MAX_CLUSTERS - 1 ) * sizeof( LARGE_INTEGER ) );
        BYTE output[ RETRIEVAL_POINTERS_BUFFER_SIZE ];
 
        STARTING_VCN_INPUT_BUFFER input;
        input.StartingVcn.QuadPart = 0;
 
        DWORD dwBytesReturned;
        if ( DeviceIoControl( hFile, FSCTL_GET_RETRIEVAL_POINTERS, &input, sizeof( input ), &output, sizeof( output ), &dwBytesReturned, NULL ) )
        {
            RETRIEVAL_POINTERS_BUFFER* p = (RETRIEVAL_POINTERS_BUFFER*)&output;
            if ( p->ExtentCount > 0 && p->StartingVcn.QuadPart == 0 )
            {
                f.dwFragments = p->ExtentCount;
                f.Lcn = p->Extents[ 0 ].Lcn.QuadPart;
            }
            else
                _tprintf( _T("Error: ExtentCount = %d StartingVcn = %I64d\r\n"), p->ExtentCount, p->StartingVcn.QuadPart );
        }
        else
            _tprintf( _T("Error: DeviceIoControl = 0x%08X\r\n"), GetLastError() );
        CloseHandle( hFile ); 
    }
}

Initially, the cluster number and the number of extents (fragments) is retrieved for every file. In a second step, all files are sorted according to the number of fragments and their cluster number. Then they are processed in exactly this new order which leads to a minimized amount of disk ‘thrashing’.
The actual retrieval of file locations using DeviceIoControl is relatively fast, since it only queries information from the file system and not from the file’s data location. Typically this operation is CPU bound and consumes less than 5% of all indexing time.

Taken into account the time needed to retrieve position data and sort files, this approach can save between 20 and 50% of read time, depending on average file size, fragmentation and spacial data distribution. There is no benefit from this method if the majority of files is fragmented. However, fragmentation is more likey to happen with larger files, and the average file size in this case was only 149 KiB. Generally speaking, the performance gain this method can provide increases with decreasing file sizes.

Benchmarking

  • Hard disk: 1 TB 7200 rpm SATA II, avg. access time: 17 ms
  • Windows XP Professional 32 bit booted in protected mode
  • Number of files scanned: 87,418
  • Total file size: 12.46 GiB
  • Average file size: 149 KiB
  • Time for reading files in filesystem order: 506 seconds
  • Average throughput reading files in filesystem order: 25.24 MiB/s
  • Time for reading file position data: 12 seconds
  • Time for sorting files according to their position: 54 ms
  • Time for reading files in sorted order: 347 seconds
  • Total time using new approach: 359 seconds
  • Total throughput using new approach: 35.58 MiB/s
  • Performance increase: 147 seconds or 29 %
  • Reading files in filesystem order takes 1.42 times as long as a pre-sorted read

Do You have Code Pages? And if not, how many?

November 13th, 2009 No comments

The code page system is a means of representing extended character sets in binary form without leaving the 8 (or 7) Bit based storage and API conventions. There are more than a hundred different code pages, and depending on your system, more or less are supported by the Win32 API.

If you are planning to handle different encodings within your application, it might be useful to know which code pages are available on your system, so you can encode/decode text based on them. Not every System supports every code page though. Just recently I found that an English Windows Mobile 5 would support CP 1252, but not the (very common) Latin-1 character set.
Especially when converting from or to some less common code pages it can be worthwhile to check whether or not there is built in support available.

To enumerate all the installed Code Pages, you can use the EnumSystemCodePages API. It accepts a callback function as a parameter and a flag that specifies whether to enumerate all installed or all supported code pages. Although the docs are not clear about the latter, it seems that the installed code pages are the pages which are ready for conversion, while the supported code pages seem to contain also pages which fail to convert any characters.

Strangely, the callback function gets called with the code page number in string representation, so you might want to do  a _ttoi() in order to get the code page number in use with other APIs.

CAtlArray<CPINFOEX> asCodePagesInstalled;
 
BOOL CALLBACK EnumCodePagesInstalledProc( LPTSTR szCodePageString )
{
    CPINFOEX cpInfoEx;
     if ( 0 != GetCPInfoEx( _ttoi( szCodePageString ), 0, &cpInfoEx ) )
         asCodePagesInstalled.Add( cpInfoEx );
    return TRUE;
}
//...
EnumSystemCodePages( EnumCodePagesInstalledProc, CP_INSTALLED );
//...

It is difficult to determine the lowest common denominator of all code pages supported on every system, and it gets even more complicated when Win 9x or mobile versions need to be supported. So in case your application will demand more than just the bare minimum conversions, you might want to resort to an external library, such as ICU or libiconv.

How a ListView Can Waste Your Time

September 28th, 2009 No comments

When debugging a C++ app that made trivial use of a ListView, the first click inside the ListView produced a noticeable delay before giving a visual feedback such as selecting an item or popping a menu. Debug output produced the following:

Loaded 'C:\WINDOWS\system32\winmm.dll'
Loaded 'C:\WINDOWS\system32\wdmaud.drv'
Loaded 'C:\WINDOWS\system32\setupapi.dll'
Loaded 'C:\WINDOWS\system32\wintrust.dll'
Loaded 'C:\WINDOWS\system32\crypt32.dll'
Loaded 'C:\WINDOWS\system32\msasn1.dll'
Loaded 'C:\WINDOWS\system32\imagehlp.dll'
Unloaded 'C:\WINDOWS\system32\setupapi.dll'
Unloaded 'C:\WINDOWS\system32\wdmaud.drv'
Loaded 'C:\WINDOWS\system32\wdmaud.drv'
Loaded 'C:\WINDOWS\system32\setupapi.dll'
Unloaded 'C:\WINDOWS\system32\setupapi.dll'
Loaded 'C:\WINDOWS\system32\msacm32.drv'
Loaded 'C:\WINDOWS\system32\msacm32.dll'
Loaded 'C:\WINDOWS\system32\midimap.dll'

To clarify, this happened with a standard Win32 ListCtrl, and the app did not make use of any sounds or multimedia APIs at all. Additionally, all system sounds were turned off in the Windows XP control panel. What was going on?
It turned out that a ListView is supposed to emit a sound upon item selection, and due to a bug that still shows up in Windows Vista, it even loads the necessary libraries when sounds are completely turned off at a system level.

The problem is easy to solve on an individual basis, but its not trivial to prevent it from occurring on other systems the app is run. All that needs to be done is delete this (empty) key in the current user’s registry hive:

HKEY_CURRENT_USER\AppEvents\Schemes\Apps\.Default\CCSelect\.Current

It strikes me that such a bug, which is probably affecting a huge number of users, is not fixed. Whenever a ListView is clicked for the first time within an app, the DLL delayloading feature will steal a few milliseconds of the user’s time and induce a noticeable lag in UI reaction.

The following KB post describes this bug as Vista specific, although it can be reproduced on XP as well:
http://support.microsoft.com/kb/944150