Chapter One: What the heck is going on here?

I have this PLX 9054 PCI device that I've been working on that does bus-master scatter/gather DMA through a dual-channel DMA controller. It's installed on a fairly recent single-processor Pentium III system running at 450 MHz. I was under the impression that I ought to be able to get some rather decent data transfer throughput with this device. If you are familiar with the PLX 9054 RDK, let me state right up front that no, I am not using their software package with its essentially user-mode access methods. I threw all that stuff out for a variety of reasons, none of them disparaging at all to PLX or its partners. Instead I grovelled through the programmer's manuals, which were actually quite good as far as this type of document goes, and built my own driver using (of course, here comes the plug) my home-grown W2K C++ class library, HTS Generic WDM Shell. (This is of course what I do simply to amuse myself in my spare time, because I certainly am not making any money out of this. Sort of pathetic, isn't it? But I digress.)

So it didn't really take very long to get the driver up and running and to poof up a silly test app to drive the thing. The first thing I discovered was that I couldn't get more than a truly awful 20 MB/s out of the device. I assumed that this was a case of stupid programmer on device, and set out to instrument my driver to find out where I was looping around doing linear search operations while holding a spinlock. I considered using WMI to do this, and then I considered having to write a WMI client, and then I just used a custom IOCTL to fetch instrumented data from my driver. What I discovered, to my horror, was that the latency from ISR to DPC was around 80000 CPU timer ticks (the ones at 3.5 MHz fetched via KeQueryPerformanceCounter). What an appalling number! I stopped looking for my locked linear search and started looking at the hardware configuration of my test system.
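
To put that tick count in more familiar units, here is a minimal sketch of the conversion, assuming the rounded 3.5 MHz counter frequency quoted above (a real driver would use the frequency that KeQueryPerformanceCounter reports rather than a constant). The function names are made up for illustration and are not part of my driver or class library.

```cpp
#include <cassert>
#include <cstdint>

// Convert a performance-counter tick delta into microseconds.
// frequency_hz is the counter frequency; the 3.5 MHz used below is the
// rounded figure from the text, not a value read from the hardware.
double ticks_to_microseconds(std::uint64_t ticks, double frequency_hz) {
    assert(frequency_hz > 0.0);
    return static_cast<double>(ticks) * 1.0e6 / frequency_hz;
}

// Hypothetical helper: the observed ISR-to-DPC latency of 80000 ticks,
// expressed in milliseconds.
double observed_latency_ms() {
    return ticks_to_microseconds(80000, 3.5e6) / 1000.0;
}
```

Under that assumption, 80000 ticks works out to a bit under 23 milliseconds between the ISR and its DPC, which is an eternity for a bus-master DMA device.
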

Device Manager revealed something very interesting. It is so interesting that I thought I'd share it with you:

Isn't that pretty? Now look closely at the last 5 devices. Starting from the bottom, that would be the Realtek PCI Fast Ethernet Adapter, using IRQ 7; my PCI9054RDK-860 device, using IRQ 7; the Intel 8237 USB controller, using IRQ 7; and the Highpoint dual-channel DMA66 IDE controller, with both 'slots' using, yup, you guessed it, IRQ 7. Well, you say, surely that is because my system is loaded up with ISA devices. But look again there, bucky: the only ISA devices are the ones I can't actually get rid of, as they are more or less integrated with the south bridge. The point being that I've got, for a PC platform, buckets of free interrupt vectors. Is there something wrong today with 10 or 11 or 12 that caused my 'PnP manager' to do such a truly lousy job of balancing my resources? So I rebooted into the BIOS, steered all my PCI slots to separate interrupts, restarted NT, and brought up the device (mis)manager and, well, it looked just the same as the picture above.

Chapter Two: What the heck is going on here?

I searched the MSDN Knowledge Base for something relevant and came up with the following interesting article, titled Q252420 - IRQ Sharing in Windows 2000. In the spirit of fair use, I'd like to quote a few relevant items from this article.

"In Windows 2000, peripheral component interconnect (PCI) devices can share interrupts (IRQs) by design. Per the Plug and Play capability that is defined by the PCI specification, adapters are configured by the computer's BIOS, and are then examined by the operating system and changed if necessary. It is normal behavior for PCI devices to have IRQs shared among them, especially for Advanced Configuration and Power Interface (ACPI) computers with Windows 2000 ACPI support enabled."

And furthermore:

"In Windows 2000, some or all of the devices on your ACPI motherboard may be listed on the Resources tab in Device Manager as using the same IRQ (IRQ 9). You cannot change the IRQ setting because the setting is unavailable. This occurs because Windows 2000 takes advantage of the ACPI features of the motherboard, including advanced PCI sharing. IRQ 9 is used by the PCI bus for IRQ steering. This feature lets you add more devices without generating IRQ conflicts."

And finally, my favorite:

"Note that Windows 2000 does not have the ability to rebalance resources as does Microsoft Windows 98."

Chapter Three: Unintended Consequences

Picking myself up off the floor, I did some work to deserialize my two DMA channels on the off chance that I could get 40 MB/s from that simple gesture. I reloaded my driver and re-ran my tests. Oddly enough, my latency had dropped down to 30 ticks. When I mucked with the BIOS settings, NT rebuilt its configuration on reboot, and my PCI device landed at the front of the share list rather than the back. Aren't I lucky?
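
For anyone wondering why position on the share list should matter at all, here is a deliberately simplified model of a shared, level-triggered interrupt: the handlers registered on the vector are called in list order until one of them claims the interrupt. This is a toy sketch under that assumption, not the actual NT dispatcher, and every name in it (SharedIsr, dispatch_shared, make_chain, the device labels) is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// One entry on a shared interrupt vector. service() returns true when this
// device raised the interrupt (a stand-in for a real ISR's return value).
struct SharedIsr {
    std::string name;
    std::function<bool()> service;
};

// Walk the chain in order until some ISR claims the interrupt.
// Returns how many ISRs were called, including the one that claimed it.
std::size_t dispatch_shared(const std::vector<SharedIsr>& chain) {
    std::size_t calls = 0;
    for (const SharedIsr& isr : chain) {
        ++calls;
        if (isr.service()) {
            break;
        }
    }
    return calls;
}

// Build a five-entry chain with the interrupting device at index `position`
// and four other devices that decline the interrupt.
std::vector<SharedIsr> make_chain(std::size_t position) {
    assert(position < 5);
    const char* others[] = {"ide0", "ide1", "usb", "ethernet"};
    std::vector<SharedIsr> chain;
    std::size_t next = 0;
    for (std::size_t i = 0; i < 5; ++i) {
        if (i == position) {
            chain.push_back({"plx9054", [] { return true; }});
        } else {
            chain.push_back({others[next++], [] { return false; }});
        }
    }
    return chain;
}
```

In this model, dispatch_shared(make_chain(4)) has to call all five handlers before mine gets a look in, while dispatch_shared(make_chain(0)) calls just one. The model says nothing about how long each of the other handlers takes, which is where the real damage is done.
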