
Examining the Windows 95 Layered File System

Adding functionality to block devices

by Mark Russinovich and Bryce Cogswell

The authors are researchers in the computer science department at the University of Oregon. Mark can be reached at mer@cs.uoregon.edu and Bryce at cogswell@cs.uoregon.edu.
One major difference between Windows 95 and its predecessors, Windows 3.1 and Windows for Workgroups (WFW) 3.11, is how Windows 95 implements its file systems. Windows 95 introduces a "layered" approach to file-system management, dividing the translation of a high-level file access into an actual physical request into multiple, distinct parts. Unfortunately, this new organization has created a plethora of new terminology and APIs. In addition, the Windows 95 Device Driver Kit (DDK) documentation is often vague, incomplete, and misleading.

In this article, we'll briefly discuss how Windows 3.1 and WFW 3.11 implement their file systems, then present an overview of the Windows 95 file system. Our exploration of the file system focuses on the vendor supplied driver (VSD) layer. VSDs are virtual devices (VxDs) that can hook onto the path of device accesses for any block-based device such as a hard disk, CD-ROM, or floppy drive. Microsoft designed the VSD layer to let third-party vendors add functionality to the file system. An extensive API was added to the file system so that VSDs can alter device requests (or create new ones), making it possible to develop VSDs to perform functions ranging from block-device monitoring and data encryption to mirrored or RAID disk management.

To demonstrate how a VSD is built, we'll describe the design and implementation of a monitoring VSD that interfaces with a Win32 program to display information about block-device accesses. Besides serving as a basis for your own custom VSDs, the application will return useful information about your block-device performance and show how to connect a Windows GUI program with a virtual device.

Out With the Old

Windows 3.1 has the simplest file system of the Windows incarnations. When a Windows or DOS application makes a request to read data from a file, for instance, the request is sent to DOS, which then passes it to the BIOS. If you're lucky, you have what's called a "Fast Disk"-compatible hard disk. (Open the 386 Enhanced icon in the Control Panel and choose the virtual-memory settings; a check box will tell you whether fast-disk access is possible and, if so, whether it is turned on.) In that case, instead of using the BIOS to do disk I/O, a virtual device called "WDCTRL.386" handles the request in 32-bit protected mode, bypassing the slower real-mode BIOS. Here, your file request is translated to a physical request by real-mode DOS, which has the request serviced in protected mode by WDCTRL.386.

WFW 3.11 introduced the prototype of Windows 95 block-device management. When a WFW or DOS program requests a file, the request is passed to a virtual device called "IFSMgr.386," which passes it to VFAT.386, a virtual device that implements the DOS file system in protected mode. After VFAT.386 has converted the request to a logical device request, it sends it to IOS.386, the I/O system supervisor. If the target hard disk is Fast Disk compatible, the request is serviced in WDCTRL.386; otherwise, it is sent to the BIOS. Thus, in WFW 3.11, if you have a Fast Disk-compatible disk, your file accesses are handled entirely by VxDs, bypassing real-mode DOS completely, and giving you maximum performance.

In With the New

Windows 95 takes the concept of protected-mode disk access a step further than WFW. To maximize Windows 95 performance, Microsoft made it easy for hard-disk manufacturers to write their own versions of WDCTRL.386-type drivers so that disk access can bypass the BIOS. Microsoft also wanted Windows 95 to seamlessly integrate any new or odd block-device hardware (a flash-memory card used as a disk, for instance) into Windows' file-system management scheme. Therefore, Microsoft divided the WFW block-request path, which extends from the application to the hardware, into much more specialized layers. The new scheme is called the "Installable File System" (IFS).

[IMAGE]

Figure 1: File System Layers

The IFS is made up of 32 logical layers, each containing one or more virtual devices, through which block-device requests pass. Fortunately for performance, most layers are empty for typical hardware. For hard disks, a file-system request will usually pass through only about five virtual devices on the way to the hardware. Figure 1 shows how the layers are organized, while Figure 2 shows a typical request path. The smallest numbers represent the higher layers of abstraction, with the topmost layer being the entry point to the file system. Higher numbers are closer to the hardware, with the bottom layer containing the virtual devices that access the hardware directly.

IFS mgr <-> File system drvr <-> type-specific drvr <-> vendor drvr <-> port drvr <-> Hard drive

Figure 2: Typical file-system request chain

The I/O Supervisor (IOS) manages requests as they pass through the file-system hierarchy. Each device on the chain can select requests based on the logical or physical drive to which the request is directed. The devices can also view the result of a request as it passes back up the chain to the application. Furthermore, the VxDs on the chain can service requests themselves and not pass them to lower levels, or they can generate requests of their own. The VFAT virtual device handles many requests by reading or writing a memory cache via the VCACHE virtual device.

Layers, Layers, Layers

At this point, we'll provide an overview of what occurs (or can occur) at each level of the file system (again, see Figure 1). Remember that most block devices do not require an entry at each level in the chain.

In general, the upper layers are written by Microsoft, while the lower layers are provided by disk-drive manufacturers. The layer for programmers to play with is the VSD.

The Devmon Application

Devmon (short for "DEVice MONitor") is a block-device monitoring application that demonstrates the design of a VSD. This application consists of a VSD virtual device that monitors and times all block-device requests passing through the VSD layer, and a Windows 95 32-bit GUI program that reads the monitored data and displays it textually in a window. Besides serving as the basis for your own VSD designs, Devmon (see Figure 3) contains a useful example of how a virtual device and a 32-bit Windows 95 program can communicate. In addition, Devmon will tell you about the characteristics of all the block devices in your system, allow you to enable and disable monitoring of requests to the various devices, and tell you how long each request takes. (Complete source code, executables, and other binaries for Devmon are available electronically.)

[Screen Shot]

Figure 3: Running the Devmon program.

The Devmon Windows program initiates communication with the Devmon VSD through the Win32 DeviceIoControl interface. This interface provides the only means by which a Win32 program can communicate with a virtual device. The first step in establishing communication is a call to CreateFile. The filename parameter for this call must be the name of the virtual device to be opened. Virtual-device names differ from regular filenames: they begin with two backslashes followed by a period, another backslash, and then the name of the virtual device. For example, the name for the Devmon VSD is \\.\devmon.vxd. Note that in a C string a backslash is a special character, so to specify one backslash, you must enter two in the string; for example, \\\\.\\devmon.vxd.

After the file has been opened, the program can send commands to the virtual device by calling DeviceIoControl (see Example 1) with the handle returned by the CreateFile call. By using the buffers, the program can pass arbitrary amounts of information back and forth with the device. The dwIoControlCode parameter is a VxD-specific function code used to specify the operation to be carried out by the VxD.

BOOL DeviceIoControl(
    HANDLE      hDevice,           // handle of the device
    DWORD       dwIoControlCode,   // control code of operation to perform 
    LPVOID      lpvInBuffer,       // address of buffer for input data
    DWORD       cbInBuffer,        // size of input buffer
    LPVOID      lpvOutBuffer,      // address of output buffer
    DWORD       cbOutBuffer,       // size of output buffer
    LPDWORD     lpcbBytesReturned, // address of actual bytes of output
    LPOVERLAPPED    lpoOverlapped  // address of overlapped structure
    );

Example 1: Format of DeviceIoControl call.

Upon starting, Devmon opens communication with the Devmon VSD and requests that the VSD pass it the device control blocks (DCBs) of the physical devices configured in the system. Devmon then creates a menu that allows the user to select interesting information about the DCB and to enable and disable monitoring of the DCB's device.

Several times a second, the application performs a DeviceIoControl on the VSD, asking it to pass copies of the latest device-access requests sent through the IFS. These are printed in the main window and include a number for the request, the request type (read, write, and so on), the logical drive to which the request is directed (C:, for example), the sector at which the request is directed (if appropriate), the number of sectors associated with the request, and, finally, the time required to service the request.

The Devmon VSD

The VSD layer is created by the IOS, which, at boot time, looks for VxDs in the system\iosubsys directory under the Windows 95 main directory. It tries to load as a dynamic VxD any file in that subdirectory with the extension VXD. When the VSD receives the SYS_DYNAMIC_DEVICE_INIT call, it responds by registering itself with the IOS. This is accomplished by calling the IOS_Register VxD service and passing in a device registration packet (DRP). The DRP structure appears in Listing One, and the code registering our monitoring VSD with the IOS is in Listing Two.

What makes a VSD a VSD, and not a member of some other layer of the file system, is the load-group number (DRP_LGN) it returns in the DRP. This tells the IOS at which layer to put the VxD in the hierarchy. VSDs have nine levels to choose from: 8-10 and 12-17. The placement of a VSD within these levels is somewhat arbitrary, although the DDK offers some placement guidelines.

The IOS responds to a VSD's registration by passing back a return value indicating what it should do. If all is well with the DRP, the return value will simply indicate success. If something is wrong, the VSD can be told to unload itself. The registration also lets the VSD provide the IOS with the address of the procedure to call to service asynchronous messages. This address is passed as an entry in the DRP. After the registration, this asynchronous event procedure (AEP) will act as the communications channel from the IOS to the VSD. Another communications channel is returned in the DRP by the IOS and is the address of the IOS procedure that the VSD can call with its own requests. This is located in the DRP_ilb (IOS linkage block) field of the DRP.

Once the registration has successfully completed, the VSD begins receiving messages from the IOS through the AEP. The IOS can send about two dozen different types of messages, but most VSDs will only be interested in a handful. Usually, the most interesting are the AEP_CONFIG_DCB and AEP_BOOT_COMPLETE messages. At system boot time, the IOS performs a handshaking initialization with the file-system drivers. In this phase, the system determines which physical and logical block devices are attached. The IOS will send the VSD an AEP_CONFIG_DCB whenever it registers a new physical device and provides the DCB.

The DCB (see Listing Three) contains information about the physical device, including its bus type and unit number, the number of heads, tracks, and sectors contained on the device, and flags that specify device behavior. A VSD can ignore the AEP_CONFIG_DCB messages or, if it wants to receive requests that are associated with that particular DCB, it can send the IOS a request message, asking it to insert the VSD on the device's call-down chain. IOS request messages are data structures passed on the stack to the IOS's request procedure (taken from the ILB). The Devmon VSD monitors all block devices, so it has itself put on the call-down chain for every physical device that is configured.

The AEP_BOOT_COMPLETE message tells the VSD that the file system has finished initializing and that all devices have been registered. VSDs associated with only certain types of devices can tell the IOS to unload the VSD if none of the devices are present in the system.

After initialization, each block device in the system has a DCB associated with it. That DCB specifies call-down and call-back chains pointing to the VSDs through which file-system requests for that device will pass.

When the boot sequence is complete, the VSD begins getting requests through the call-down chains on which it inserted itself. (The routine that's called with requests was indicated by the VSD in the call-down insertion commands.) The call-down routine receives a pointer to a data structure called the I/O Packet (IOP), which contains another data structure called the I/O Request (IOR). The IOP and IOR contain all the information about the particular request including its type, pointers to buffers associated with it (such as read or write buffers), and parameters indicating the physical sector on the device that the request wants to access.

Each VSD is responsible for calling the next VSD in the chain. The IOP contains a pointer to the DCB associated with the request, and the DCB contains a pointer to the current position in the call-down chain. To call the next VSD, its address is read from the chain, the chain pointer in the DCB is moved to the next location, and the address is called. The call-down data structure is in Listing Four, and a code fragment demonstrating these steps is in Listing Five.

Many VSDs will want to view requests not only as they pass down to the hardware, but also as they return up the chain to the original caller. To do this, the VSD must insert itself on the call-back chain. This chain is managed by a list of data structures pointed to by the IOP_callback_ptr entry of the IOP. To insert itself on the chain, the VSD must set the IOP_cb_address of the call-back entry to the address of its call-back procedure. It must then move the pointer to the next call-back entry for the layer just above it in memory. These steps are in Listing Six. The Devmon VSD inserts itself on the call-back chain for all requests so that it can determine how long a request took to be serviced by the layers below it (which are essentially the device drivers and hardware).

If the VSD is servicing requests itself, it can indicate immediately to layers above it that the request was serviced or postpone this notification until later. To inform the layers above that a request has been serviced, the VSD does not call the layer below; instead it simply calls up the call-back chain. The IOP_callback_ptr is adjusted so that it points at the call-back entry for the layer above the VSD, and then the VSD calls the IOP_cb_address procedure with the IOP on the stack. If the VSD wishes to complete the request later, it simply returns 0 in the eax register and performs the callback when it wants to indicate request completion. Listing Seven provides the callback data structure, and Listing Eight demonstrates the code for performing the callback.

The Devmon VSD uses an IOS feature called the "expansion area." This is a block of data that the IOS allocates for each IOP for use by the devices in the IOP's call-down chain. A VSD must tell the IOS that it wants some expansion area allocated for it when it inserts itself on a DCB's call-down chain at DCB-configuration time. The expansion area can be used for whatever purpose the VSD desires; in the case of Devmon, it is used to pass a time stamp from the call-down procedure to the call-back procedure. Thus, the VSD can determine the time it took to complete a request by comparing the current time in the call-back procedure with the time stored in the request's expansion area. The address of the expansion area is computed by adding the offset stored in the DCB_cd_entry's DCB_cd_expan_offset field to the IOP's address.

If a VSD has put itself on the call-back chain, it must service call-back calls by continuing to call up the chain using the same method as that described for initiating the call-back chain. The Devmon VSD's call-back procedure stores a copy of the request, along with a time stamp, in a buffer that it provides to the Devmon Windows program.

VSDs can initiate new device requests themselves. If disk mirroring were desired, for example, a VSD would let requests to the primary drive pass through as normal, but it would also initiate identical requests for writes to the secondary drive. Initiating a new request requires that the VSD call the IOS service that allocates a new IOP, fill in the IOP with the correct parameters, and then initiate the request. The VSD can send the request to all virtual devices on the target device's call-down chain, or just the devices beneath itself. If the request is a new SCSI request that cannot be constructed by copying a similar request, the VSD must make sure that the SCSI-izer layer processes the request as described earlier.

Conclusion

The Windows 95 file system provides opportunities for third-party drivers to add new functionality to block devices. Without a doubt, increasing use of Windows 95 will mean a growing desire for data encryption and protection. The Devmon VSD is a framework you can extend to take advantage of these coming needs.

DDJ


Copyright © 1997, Dr. Dobb's Journal