NT Internals / Mark Russinovich / March 2000
Inside Storage Management, Part 1
Drive-Letter Assignments
After the I/O Manager initializes the disk storage drivers, it invokes the internal function IoAssignDriveLetters. This function creates a symbolic link under \?? in the form of a drive letter for each disk partition, as well as for CD-ROMs and 3.5" disks. The drive-letter symbolic links refer to associated partition device objects. The I/O Manager's drive-letter assignment follows a default formula, but you can override the formula by explicitly assigning drive letters in Disk Administrator. After you start Disk Administrator, the program scans the partitions on the system's hard disks and generates a random signature for each partition that Disk Administrator hasn't seen in previous executions. Disk Administrator stores a partition's signature in the partition's boot sector and also in the Registry value HKEY_LOCAL_MACHINE\SYSTEM\DISK\Information. The Information value includes a data structure for each disk partition that incorporates the Disk Administrator signature and the partition's drive letter, if you've assigned one. IoAssignDriveLetters reads the Information value and honors the drive letters you've specified before performing default assignments. The function reads partition signatures and matches them with the data that the Information value stores to correlate partitions with their assigned drive letters.
After IoAssignDriveLetters assigns explicitly specified drive letters, the function starts with the letter C (or the first unassigned letter higher than C) and assigns letters to the first active primary partition of each disk. If a disk has no active primary partition, IoAssignDriveLetters assigns a letter to the first primary partition. In the subsequent phase of assignment, IoAssignDriveLetters gives letters to each partition that is in each disk's extended partitions. Finally, IoAssignDriveLetters creates letters for the remaining unassigned primary partitions.
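The ordering described above can be modeled in a few lines of Python. This is a minimal sketch rather than NT's implementation: the Partition class, the assign_letters function, and the partition labels are hypothetical names invented for illustration, and explicit assignments are keyed by label instead of by the signatures stored in the Information value. The sketch also assumes that a partition that already has an explicit letter is simply skipped in later phases.

```python
from dataclasses import dataclass
from string import ascii_uppercase
from typing import Optional


@dataclass
class Partition:
    name: str                     # hypothetical label, e.g. "disk0/part1"
    primary: bool                 # False = logical drive in an extended partition
    active: bool = False
    letter: Optional[str] = None


def assign_letters(disks, explicit):
    """disks: list of disks, each a list of Partition objects.
    explicit: {partition name: letter} chosen by the administrator."""
    used = set()

    # Honor explicit assignments before any default assignment.
    for disk in disks:
        for part in disk:
            if part.name in explicit:
                part.letter = explicit[part.name]
                used.add(part.letter)

    def give(part):
        # Hand out the first unused letter at or above C.
        if part.letter is not None:
            return
        for ch in ascii_uppercase[2:]:
            if ch not in used:
                used.add(ch)
                part.letter = ch
                return

    # Phase 1: the first active primary partition of each disk,
    # or the first primary partition if none is active.
    for disk in disks:
        primaries = [p for p in disk if p.primary]
        if primaries:
            give(next((p for p in primaries if p.active), primaries[0]))

    # Phase 2: every partition inside each disk's extended partitions.
    for disk in disks:
        for part in disk:
            if not part.primary:
                give(part)

    # Phase 3: the remaining unassigned primary partitions.
    for disk in disks:
        for part in disk:
            if part.primary:
                give(part)

    return {p.name: p.letter for disk in disks for p in disk}
```

With two disks and one explicit assignment, the logical drives receive their letters before the leftover primary partition does, which is why adding a disk can reshuffle default letters.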
After IoAssignDriveLetters has created drive-letter symbolic links for hard disk partitions, the function gives letters to 3.5" disks and then to CD-ROMs. The first two 3.5" disks get the letters A and B, and any others receive the next available letter. You can assign letters to CD-ROMs in Disk Administrator, but rather than storing those assignments in the Information value, Disk Administrator stores the assignments in separate values that share the names of the device objects NT uses to represent the CD-ROMs. For example, a system with one CD-ROM that has an assigned drive letter will have a Registry value \device\cdrom0 beneath HKEY_LOCAL_MACHINE\SYSTEM\DISK that specifies the CD-ROM's assigned drive letter. Screen 2, page 68, shows the contents of a system's Object Manager \?? directory and highlights the C drive's symbolic link.
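The removable-media phase can be sketched the same way. The function name, its parameters, and the floppy device names below are illustrative assumptions, not NT's code. The sketch also assumes that "next available" means the first free letter at or above C, and that explicitly assigned CD-ROM letters are reserved before any default letters are handed out; the text above doesn't spell out either detail.

```python
from string import ascii_uppercase


def assign_removable_letters(used, floppy_count, cdroms, cdrom_explicit=None):
    """used: letters already taken by hard disk partitions.
    cdroms: CD-ROM device names in enumeration order.
    cdrom_explicit: {device name: letter} from Disk Administrator."""
    cdrom_explicit = cdrom_explicit or {}
    taken = set(used) | set(cdrom_explicit.values())
    result = {}

    def next_free():
        # Assumption: default letters start at C.
        for ch in ascii_uppercase[2:]:
            if ch not in taken:
                taken.add(ch)
                return ch
        return None

    # The first two 3.5" disks get A and B; any others get the next free letter.
    for index in range(floppy_count):
        name = rf"\device\floppy{index}"   # illustrative device name
        if index < 2:
            letter = "AB"[index]
            taken.add(letter)
        else:
            letter = next_free()
        result[name] = letter

    # CD-ROMs come last; an explicit assignment overrides the default.
    for name in cdroms:
        result[name] = cdrom_explicit.get(name) or next_free()

    return result
```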
File-System Mounting
The fact that NT assigns a drive letter to a partition doesn't mean that the partition contains data organized in a file-system format NT recognizes. In the volume-recognition process, a file system claims ownership of a partition; that process takes place the first time the kernel, a device driver, or an application accesses a file or directory on a partition. After a file-system driver signals its responsibility for a partition, the I/O Manager directs all IRPs aimed at the partition to the owning driver. Mount operations in NT 4.0 consist of three components: file-system driver registration, Volume Parameter Blocks (VPBs), and mount requests.
The I/O Manager oversees the mount process and is aware of available file-system drivers because all file-system drivers register with the I/O Manager when they initialize. The I/O Manager provides the IoRegisterFileSystem function to local disk (rather than network) file-system drivers for this registration. When a file-system driver registers, the I/O Manager stores a reference to the driver in a list that the I/O Manager uses during mount operations.
Every device object contains a VPB data structure, but the I/O Manager treats VPBs as meaningful only for partition device objects. A VPB serves as the link between a partition device object and the device object that a file-system driver creates to represent a mounted file-system instance for that partition. If a VPB's file-system reference is empty, then no file system has mounted the partition. The I/O Manager checks a partition device object's VPB whenever an open API that specifies a filename or directory name on a partition device object executes. For example, if the I/O Manager assigns drive letter D to the second partition on a system's first hard disk, IoAssignDriveLetters creates a \??\D: symbolic link that resolves to the device object \device\harddisk0\partition2. A Win32 application that attempts to open the \test file on the D drive specifies the name D:\test, which the Win32 subsystem converts to \??\D:\test before invoking NtCreateFile, the kernel's file-open routine. NtCreateFile uses the Object Manager to parse the name, and the Object Manager encounters the \device\harddisk0\partition2 device object with the path \test still unresolved. At that point, the I/O Manager checks to see whether \device\harddisk0\partition2's VPB references a file system. If not, the I/O Manager uses a mount request to ask each registered file-system driver whether the driver recognizes the format of the partition in question as the driver's own. If a file-system driver signals affirmatively, the I/O Manager fills in the VPB and passes the open request with the remaining path (i.e., \test) to the file-system driver. The file-system driver completes the request by using its file-system format to interpret the data that the partition stores. After a mount fills in a partition device object's VPB, the I/O Manager hands subsequent open requests aimed at the partition to the mounted file-system driver. 
If no file-system driver claims a partition, then RAW, a file-system driver built into NT, claims the partition and fails all requests to open files on the partition. Figure 2 shows a simplified example (i.e., the figure omits the file-system driver's interactions with the NT Cache Manager) of the path that I/O directed at a mounted partition follows.
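To make the interplay of registration, VPBs, and mount requests concrete, here is a toy Python model. Every class and method name is invented for illustration, and the "format" string is a stand-in for whatever on-disk structures a real driver inspects; the real I/O Manager works with IRPs and kernel data structures, not Python objects.

```python
class Vpb:
    def __init__(self):
        self.file_system = None          # empty until a driver mounts the partition


class PartitionDevice:
    def __init__(self, name, on_disk_format):
        self.name = name
        self.on_disk_format = on_disk_format   # stand-in for on-disk structures
        self.vpb = Vpb()


class FileSystemDriver:
    def __init__(self, name, recognized_format):
        self.name = name
        self.recognized_format = recognized_format

    def mount(self, device):
        # Answer the mount request: is this partition in my format?
        return device.on_disk_format == self.recognized_format

    def open(self, device, path):
        return f"{self.name} opened {path} on {device.name}"


class RawDriver(FileSystemDriver):
    def __init__(self):
        super().__init__("RAW", None)

    def mount(self, device):
        return True                      # RAW claims whatever nobody else wants

    def open(self, device, path):
        raise OSError(f"unrecognized file system on {device.name}")


class IoManager:
    def __init__(self):
        self.registered = []             # filled in as drivers initialize
        self.raw = RawDriver()

    def register_file_system(self, driver):
        self.registered.append(driver)

    def open(self, device, remaining_path):
        # Mount on first access: ask each registered driver, then fall back to RAW.
        if device.vpb.file_system is None:
            for driver in self.registered + [self.raw]:
                if driver.mount(device):
                    device.vpb.file_system = driver
                    break
        # Later opens go straight to the driver recorded in the VPB.
        return device.vpb.file_system.open(device, remaining_path)
```

In this model, opening \test on a FAT-formatted partition fills in the VPB on the first call, and an unrecognized partition ends up owned by the RAW stand-in, which fails the open.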
Aside from the boot volume, which a driver mounts while the kernel is initializing, file-system drivers mount most volumes when Chkdsk runs during the blue-screen portion of the boot sequence. Chkdsk accesses each drive letter to see whether the volume associated with the letter requires a consistency check. Mounting can occur more than once for the same drive if the drive uses removable media (e.g., a 3.5" disk drive). CD-ROM File System (CDFS) and FAT, NT's two file-system drivers that support removable media, respond to media changes by querying the disk's volume identifier. If either driver sees the volume identifier change, the driver dismounts the disk and attempts to remount it.
Fault Tolerance
NT's I/O architecture permits a powerful feature: dynamic layering of device objects. A device driver can create a device object and attach it to a target device object. The I/O Manager routes requests directed at a target device object to the object's attached device object, if one exists. Device drivers use this mechanism to monitor or change the behavior of device objects that belong to other device drivers. A driver that relies on layering is a filter driver, and when a filter driver receives an IRP aimed at a target device, the filter has full control over the request. The filter can fail the request, create new subrequests, or pass the unmodified request to the target device. NT storage drivers commonly use layering in three places. At the highest level, file-system filter drivers attach to the device objects that file-system drivers create to represent mounted partitions. A file-system filter driver therefore intercepts requests aimed at mounted volumes so that the driver can implement functionality such as monitoring, encryption, or on-access virus scanning.
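The layering idea is easy to model outside the kernel. In the Python sketch below, DeviceObject, attach, send_request, and both dispatch routines are hypothetical names, and a request is just a dictionary rather than an IRP. The ".blocked" rule is an arbitrary stand-in for whatever policy a real filter might enforce.

```python
class DeviceObject:
    def __init__(self, name, dispatch):
        self.name = name
        self.dispatch = dispatch     # callable(device, request) -> result
        self.attached = None         # device layered on top of this one
        self.lower = None            # device this one is attached to


def attach(filter_device, target):
    # Attach to the top of whatever stack already sits on the target.
    top = target
    while top.attached is not None:
        top = top.attached
    top.attached = filter_device
    filter_device.lower = top


def send_request(target, request):
    # Route the request to the topmost attached device, if any.
    top = target
    while top.attached is not None:
        top = top.attached
    return top.dispatch(top, request)


def volume_dispatch(device, request):
    return f"{device.name} handled {request['op']} {request['path']}"


seen = []   # requests observed by the monitoring filter


def monitor_dispatch(device, request):
    seen.append((request["op"], request["path"]))
    if request["path"].endswith(".blocked"):
        return "access denied"       # the filter fails the request itself
    # Otherwise pass the unmodified request to the device below.
    return device.lower.dispatch(device.lower, request)
```

Callers keep addressing the volume's device object; once the filter attaches, the same call reaches the filter first, which is what lets a monitoring or virus-scanning driver see traffic without the file-system driver's cooperation.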
If you've installed NT disk performance counters by executing the Diskperf -y command, then you've installed the DiskPerf filter driver. DiskPerf attaches to the device objects that represent physical disks (e.g., \device\harddisk0\partition0) so that DiskPerf can generate performance-related statistics for Performance Monitor to present. If you create a nonstandard volume (such as a volume set, mirrored drive, stripe set, or stripe set with parity) in Disk Administrator, you enable the FtDisk filter driver.
A volume set is a volume that uses two or more partitions to create the image of one contiguous partition. A systems administrator can use partitions from different disks to create a volume set that is larger than any given physical disk on a computer. A mirror is a volume that maintains copies of its data on two partitions. In a mirror, all write operations take place on both partitions, but read operations take place only from one partition. Mirrors tolerate single-disk failures; operation continues on the surviving half of the mirror. A stripe set is a multipartition volume whose data is interleaved between partitions. NT uses a stripe unit of 64KB. The system stores the first 64KB of file-system data on the first partition of the stripe, stores the second 64KB on the second partition, and so on, thus wrapping back to the first partition. Stripe sets can improve performance when the partitions are on different disks because I/O operations can proceed in parallel on different disks. Finally, stripe sets with parity are stripe sets with an extra 64KB block of data for each 64KB stripe spread across the set's partitions. The extra block stores parity information that NT can use to recover the data stored on one of the set's partitions if the disk on which the partition is located fails. Stripe sets with parity are also known as RAID 5 volumes.
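The recovery property of a stripe set with parity can be demonstrated with exclusive-OR, the conventional parity computation for RAID 5. The sketch below assumes XOR parity and uses tiny 4-byte blocks in place of 64KB stripe units; the function names are illustrative.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(surviving_blocks, parity):
    """Recover the one missing data block from the survivors and the parity."""
    return xor_blocks(list(surviving_blocks) + [parity])


# One stripe row across three data partitions, plus its parity block.
row = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(row)

# Pretend the disk holding row[1] failed and rebuild its contents.
rebuilt = rebuild_missing([row[0], row[2]], parity)
```

Because XOR is its own inverse, combining the parity block with every surviving data block cancels them out and leaves the missing block. The same property is why the scheme tolerates only one failed disk at a time.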
Disk Administrator stores advanced volume-configuration information in the HKEY_LOCAL_MACHINE\SYSTEM\DISK\Information value, with the partition drive-letter and signature information, and the FtDisk driver reads this information during the boot process. A data structure that FtDisk manages in the Information value associates partitions that belong to the same volume. Because file-system drivers expect a volume's contents to reside on one partition, without FtDisk, file-system drivers typically don't recognize a volume that consists of multiple partitions. FtDisk therefore attaches itself to every partition device object in a system to manipulate requests aimed at the device objects that constitute advanced volumes.
Some examples of FtDisk's operations will help clarify its role. If a striped volume consists of \device\harddisk0\partition2 and \device\harddisk1\partition3, as Figure 3 shows, and an administrator has assigned drive letter D to the stripe, then the I/O Manager defines the link \??\D: to reference \device\harddisk0\partition2. If FtDisk were not present, an application opening a file on the stripe would receive an error because no file system would understand or mount the partial volume that \device\harddisk0\partition2 represents. With FtDisk present, an FtDisk device object intercepts file-system disk I/O aimed at \device\harddisk0\partition2, and the FtDisk driver adjusts the request before passing it to the disk class driver. The adjustment FtDisk makes configures the request to refer to the correct offset of the request's target stripe on either \device\harddisk0\partition2 or \device\harddisk1\partition3.
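The offset adjustment for a stripe set comes down to simple arithmetic. The following Python sketch assumes the 64KB stripe unit and a round-robin layout across the member partitions; map_offset and split_request are hypothetical helpers, not FtDisk routines, and members are identified by index (0 for the first partition of the stripe, 1 for the second, and so on).

```python
STRIPE_UNIT = 64 * 1024   # 64KB stripe unit


def map_offset(volume_offset, member_count, unit=STRIPE_UNIT):
    """Translate a byte offset on the striped volume into
    (member index, byte offset on that member)."""
    stripe_index, within = divmod(volume_offset, unit)
    row, member = divmod(stripe_index, member_count)
    return member, row * unit + within


def split_request(volume_offset, length, member_count, unit=STRIPE_UNIT):
    """Break a request into pieces that each stay inside one stripe unit.
    Returns a list of (member index, member offset, piece length)."""
    pieces = []
    while length > 0:
        member, member_offset = map_offset(volume_offset, member_count, unit)
        chunk = min(length, unit - volume_offset % unit)
        pieces.append((member, member_offset, chunk))
        volume_offset += chunk
        length -= chunk
    return pieces
```

On a two-member stripe, the first 64KB of the volume lands on member 0, the next 64KB on member 1, and byte 131072 wraps back to member 0 at offset 65536. A request that straddles a stripe-unit boundary becomes two subrequests, one for each member.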
In the case of writes to a mirrored volume, FtDisk splits each request so that each half of the mirror receives the full write operation. For mirrored reads, FtDisk reads from one half of the mirror and relies on the other half only when the read operation fails.
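The mirror behavior can be sketched as follows. MirrorHalf and MirrorVolume are invented names, and blocks live in a dictionary rather than on disk. The sketch assumes that a write succeeds as long as one half accepts it, in keeping with the earlier statement that operation continues on the surviving half of the mirror.

```python
class MirrorHalf:
    def __init__(self, name):
        self.name = name
        self.blocks = {}
        self.failed = False          # set to True to simulate a dead disk

    def _check(self):
        if self.failed:
            raise OSError(f"{self.name} is unavailable")

    def write(self, block_number, data):
        self._check()
        self.blocks[block_number] = data

    def read(self, block_number):
        self._check()
        return self.blocks[block_number]


class MirrorVolume:
    def __init__(self, first, second):
        self.halves = (first, second)

    def write(self, block_number, data):
        # Each half receives the full write; tolerate one failed half.
        successes = 0
        for half in self.halves:
            try:
                half.write(block_number, data)
                successes += 1
            except OSError:
                pass
        if successes == 0:
            raise OSError("both halves of the mirror failed")

    def read(self, block_number):
        # Read from one half and fall back to the other only on failure.
        first, second = self.halves
        try:
            return first.read(block_number)
        except OSError:
            return second.read(block_number)
```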
The Dynamics of Win2K
Next month, I'll continue with a look inside the Win2K LDM. I'll also discuss reparse points.