Identifying unknown PCI devices

If I ever want to identify an unknown device installed in a system…

First I’ll attempt to obtain a PCI device ID:

Under Linux, I’ll use lspci.
Under Windows, I’ll use Unknown Devices (and ignore any other piece of software that claims to be called “Unknown Device Identifier” and that was stolen from Mike Moniz).
Under Mac OS X, I’ll use system_profiler.
Under Solaris, I’ll use /usr/X11/bin/scanpci -v.
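Under Linux, for example, the numeric IDs can be cut straight out of lspci -n output. A sketch that works on a canned sample line rather than a live system:

```shell
# A canned sample line in the format `lspci -n` prints
# (bus address, class code, then vendor:device, all in hex):
line="06:04.0 0c00: 1106:3044 (rev 46)"

# Pull out the last hhhh:hhhh pair, which is the vendor:device ID.
ids=$(printf '%s\n' "$line" |
    sed -n 's/.*\([0-9a-f]\{4\}\):\([0-9a-f]\{4\}\).*/vendor \1 device \2/p')
echo "$ids"
```

On a live system, lspci -n -s 06:04.0 narrows the listing to one device, and lspci -v additionally prints the subsystem IDs that pcidevs.txt records as S entries.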

Then I’ll cross-reference the PCI device ID against the Canonical list of PCI device IDs from Craig’s site.

At that point I can grep pcidevs.txt and learn exciting things.
For example, suppose I wonder what an unknown device in a PowerMac G5 is.

From System Profiler I know this:

pci8086,1012:

  Type:	Ethernet Controller
  Bus:	PCI
  Slot:	SLOT-3
  Vendor ID:	0x8086
  Device ID:	0x1010
  Subsystem Vendor ID:	0x8086
  Subsystem ID:	0x1012
  Revision ID:	0x0001

So I do a bit of grepping:

stany@gilva:~/Desktop[01:54 PM]$ grep V.*8086 pcidevs.txt 
V       8086    Intel Corporation
stany@gilva:~/Desktop[01:54 PM]$ grep ^S.*1012 pcidevs.txt 
S       1012    SiS650 GUI 2D/3D Accelerator
S       1012    DFE-580TX 4-Port Server Adapter
S       1012    PRO/1000 MT Dual Port Server Adapter
S       1012    PRO/1000 MF Dual Port Server Adapter
S       1012    PRO/100 S Server Adapter (D)
S       1012    PRO/100 S Server Adapter (D)
S       1012    Realtek AC'97 Audio
S       1012    Intel USB 2.0 Enhanced Host Controller
S       1012    PRO/Wireless 3945ABG Network Connection
stany@gilva:~/Desktop[01:54 PM]$ grep ^D.*1010 pcidevs.txt 
D       0020    LSI53C1010-33 PCI to Dual Channel Ultra160 SCSI Multifunction Controller
D       0021    LSI53C1000/1000R/1010R/1010-66 PCI to Ultra160 SCSI Controller
D       1010    SST-128P Adapter
D       1010    Duet 1S(16550)+1P
D       1010    C101/PCI Super Sync Board
D       1010    82546EB Dual Port Gigabit Ethernet Controller (Copper)
D       0003    SG1010 6 Port Serial Switch & PCI to PCI Bridge
stany@gilva:~/Desktop[01:54 PM]$ 

(V stands for vendor ID, S for subsystem ID, and D for device ID.)

So logic would imply that this is an Intel Corporation PRO/1000 MT Dual Port Server Adapter, specifically an 82546EB Dual Port Gigabit Ethernet Controller (Copper).
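The three greps can be wrapped into a small helper. This is a sketch, not part of the pcidevs.txt distribution: it assumes the single-letter V/D/S record format shown above, and unlike the flat greps it anchors the match to the ID column (note that the real file nests D and S records under their vendor, which a flat grep can’t see, hence the false positives above):

```shell
# pci_lookup FILE VENDOR DEVICE [SUBSYSTEM]
# Greps out the V (vendor), D (device) and S (subsystem) records,
# matching only in the ID column.
pci_lookup() {
    grep -E "^V[[:space:]]+$2[[:space:]]" "$1"
    grep -E "^D[[:space:]]+$3[[:space:]]" "$1"
    [ -n "$4" ] && grep -E "^S[[:space:]]+$4[[:space:]]" "$1"
}

# A three-line stand-in for pcidevs.txt, just for demonstration:
cat > /tmp/pcidevs-sample.txt <<'EOF'
V	8086	Intel Corporation
D	1010	82546EB Dual Port Gigabit Ethernet Controller (Copper)
S	1012	PRO/1000 MT Dual Port Server Adapter
EOF

out=$(pci_lookup /tmp/pcidevs-sample.txt 8086 1010 1012)
echo "$out"
```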

An exercise for the reader is to identify the following device:
pci bus 0x0006 cardnum 0x04 function 0x00: vendor 0x1106 device 0x3044

VIA Technologies Inc VT6306 VIA Fire II IEEE-1394 OHCI Link Layer Controller

Untitled

The MSI P4N SLI motherboard has a built-in nVidia nForce4 NIC. OpenSolaris doesn’t have a driver for it; however, a driver can be downloaded from Masayuki Murayama’s Free NIC drivers for Solaris page (the drivers there are SPARC/x86 capable; one might need a functional 64-bit compiler to rebuild them for a given platform).

His driver will work out of the box, as long as the PCI device ID matches one of the IDs in the adddrv.sh script. To verify that, run /usr/X11/bin/scanpci -v and check the PCI ID. In my case, the PCI ID was pci10de,38, which was not in the adddrv.sh script, but is in fact an nForce4 ethernet controller.
After I added the ID to the script, the driver worked right away.
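The compatible-name string Solaris wants can be derived from scanpci’s vendor/device pair. A sketch (the adddrv.sh name comes from Murayama’s package; add_drv is the stock Solaris command, shown only as a comment since it has to run on the Solaris box itself):

```shell
# Build a Solaris "compatible" name, e.g. pci10de,38, from the
# vendor/device pair that scanpci reports.  Solaris drops the
# leading zeros from the device part.
vendor=10de
device=0038
compat="pci${vendor},$(printf '%x' "0x${device}")"
echo "$compat"

# On the Solaris box, one would then check adddrv.sh for the ID and,
# if it is missing, add it along these lines (hypothetical invocation):
#   grep "$compat" adddrv.sh || add_drv -i "\"$compat\"" nfo
```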

root@dara:/[07:49 PM]# cd ; /usr/X11/bin/scanpci -v
[...]
pci bus 0x0000 cardnum 0x0e function 0x00: vendor 0x10de device 0x0038
 nVidia Corporation MCP04 Ethernet Controller
 CardVendor 0x3462 card 0x7160 (Card unknown)
  STATUS    0x00a0  COMMAND 0x0007
  CLASS     0x06 0x80 0x00  REVISION 0xa2
  BIST      0x00  HEADER 0x00  LATENCY 0x00  CACHE 0x00
  BASE0     0xfe9fc000  addr 0xfe9fc000  MEM
  BASE1     0x0000c481  addr 0x0000c480  I/O
  MAX_LAT   0x14  MIN_GNT 0x01  INT_PIN 0x01  INT_LINE 0x05
  BYTE_0    0x62  BYTE_1  0x34  BYTE_2  0x60  BYTE_3  0x71

[...]
root@dara:/[07:50 PM]# modinfo | grep nfo
 Id Loadaddr   Size Info Rev Module Name
 44 feabbbc4   1e50  15   1  mntfs (mount information file system)
141 febc78d4   4768  88   1  devinfo (DEVINFO Driver 1.73)
219 f946c000   fc40 207   1  nfo (nVIDIA nForce nic driver v1.1.2)
root@dara:/[07:50 PM]# dmesg | grep -v UltraDMA

Sat Nov 25 19:50:28 EST 2006
Nov 25 19:38:58 dara.NotBSD.org nfo: [ID 306776 kern.info] nfo0: doesn't have pci power management capability
Nov 25 19:38:58 dara.NotBSD.org nfo: [ID 130221 kern.info] nfo0: nForce mac type 11 (MCP04) (vid: 0x10de, did: 0x0038, revid: 0xa2)
Nov 25 19:38:58 dara.NotBSD.org nfo: [ID 451511 kern.info] nfo0: MII PHY (0x01410cc2) found at 1
Nov 25 19:38:58 dara.NotBSD.org nfo: [ID 426109 kern.info] nfo0: PHY control:0, status:7949<100_BASEX_FD,100_BASEX,10_BASE_FD,10_BASE,XSTATUS,MFPRMBLSUPR,CANAUTONEG,EXTENDED>, advert:de1, lpar:0
Nov 25 19:38:58 dara.NotBSD.org nfo: [ID 119377 kern.info] nfo0: xstatus:3000<1000BASET_FD,1000BASET>
Nov 25 19:38:58 dara.NotBSD.org nfo: [ID 716252 kern.info] nfo0: resetting PHY
Nov 25 19:38:58 dara.NotBSD.org gld: [ID 944156 kern.info] nfo0: nVIDIA nForce nic driver v1.1.2: type "ether" mac address 00:13:d3:5f:53:2f
Nov 25 19:38:58 dara.NotBSD.org npe: [ID 236367 kern.notice] PCI Express-device: pci1462,7160@e, nfo0
Nov 25 19:38:58 dara.NotBSD.org genunix: [ID 936769 kern.notice] nfo0 is /pci@0,0/pci1462,7160@e
Nov 25 19:38:58 dara.NotBSD.org unix: [ID 954099 kern.info] NOTICE: IRQ21 is being shared by drivers with different interrupt levels.
Nov 25 19:38:58 dara.NotBSD.org This may result in reduced system performance.
Nov 25 19:38:58 dara.NotBSD.org last message repeated 1 time
Nov 25 19:38:58 dara.NotBSD.org last message repeated 1 time
Nov 25 19:38:59 dara.NotBSD.org nfo: [ID 831844 kern.info] nfo0: auto-negotiation started
Nov 25 19:39:04 dara.NotBSD.org nfo: [ID 503627 kern.warning] WARNING: nfo0: auto-negotiation failed: timeout
root@dara:/[07:50 PM]# 

ZFS (Part 1)

Over the last year I’d been getting more and more curious/excited about OpenSolaris. Specifically, I got interested in ZFS, Sun’s new filesystem/volume manager.

So I finally got my act together and gave it a whirl.

Test system: Pentium 4, 3.0GHz, in an MSI P4N SLI motherboard. Three ATA Seagate ST3300831A hard drives and one Maxtor 6L300R0 ATA drive (all nominally 300 gigs; see the previous post on slight capacity differences). One Western Digital WDC WD800JD-60LU 80 gig SATA hard drive. Solaris Express Community Release (SXCR) build 51.

Originally I started this project running SXCR 41, but back then I only had three 300 gig drives, and that was interfering with my plans for RAID 5 greatness. In the end the wait was worth it, as ZFS has been revved since.

A bit about the MSI motherboard. I like it. For a PC system I like it a lot. It has two PCI slots, two full-length PCI Express slots (16x), and one PCIe 1x slot. Technically it supports two ATI CrossFire or nVidia SLI capable cards; however, in that case both full-length slots run at 8x (a single card runs at 16x). There are two dual-channel IDE connectors, four SATA connectors, built-in high-end audio with S/PDIF, a built-in GigE NIC based on a Marvell chipset/PHY, serial, parallel, and built-in IEEE 1394 (iLink/FireWire) with 3 ports (one on the back of the board; two more can be brought out). Plenty of USB 2.0 connectors (4 on the back of the board; 6 more can be brought out from connector banks on the motherboard). Overall, pretty shiny.

My setup consists of four IDE hard drives on the IDE bus, and an 80 gig WD on SATA bus for the OS. Motherboard BIOS allowed me to specify that I want to boot from the SATA drive first, so I took advantage of the offer.

Installation of SXCR was from an IDE DVD (a pair of hard drives was unplugged for the duration).
SXCR recognized pretty much everything in the system except the built-in Marvell GigE NIC. Shit happens; I tossed in a PCI 3Com 3c509C NIC that I had kicking around, and restarted. There was a bit of a hold-up with the SATA drive: Solaris didn’t recognize it, and wanted the geometry (number of heads and number of cylinders) so that it could create an appropriate volume label. Luckily WD makes an identical drive in IDE configuration, for which it actually provides the heads/cylinders/sectors information, so I plugged those numbers in, and format and fdisk cheered up.

Other than that, it was a normal Solaris install. I did a console/text install just because I am a lot more familiar with it; however, the Radeon Sapphire X550 PCIe video card was recognized, and the system happily boots into OpenWindows/CDE if you want it to.

So I proceeded to create a ZFS pool.
First thing I wanted to check is how portable ZFS is. Specifically, Sun claims that it’s endianness neutral (i.e. I can connect the same drives to a little endian PC or a big endian SPARC system, and as long as both run an OS that recognizes ZFS, things will work). I also wondered how it deals with device numbers. Traditionally Solaris is very picky about device IDs, and changing things like controllers or SCSI IDs on a system can be tricky.

Here I wanted to know if I can just create, say, a “travelling zfs pool”, where I’ll have an external enclosure with a few SATA drives, an internal PCI SATA controller card, and if things go wrong in a particular system, I could always unplug the drives, and move them to a different system, and things will work. So I wanted to find out if ZFS can deal with changes in device IDs.

In order for ZFS to work reliably, it should be given whole drives. It, in turn, writes an EFI disk label on each drive, with a unique identifier. Note that certain PC motherboards choke on EFI disk labels and refuse to boot; luckily this is usually fixable with a BIOS update.

root@dara:/[03:00 AM]# uname -a
SunOS dara.NotBSD.org 5.11 snv_51 i86pc i386 i86pc
root@dara:/[03:00 AM]# zpool create raid1 raidz c0d0 c0d1 c1d0 c1d1
root@dara:/[03:01 AM]# zpool status
  pool: raid1
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        raid1       ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            c0d0    ONLINE       0     0     0
            c0d1    ONLINE       0     0     0
            c1d0    ONLINE       0     0     0
            c1d1    ONLINE       0     0     0

errors: No known data errors
root@dara:/[03:02 AM]# zpool list
NAME                    SIZE    USED   AVAIL    CAP  HEALTH     ALTROOT
raid1                  1.09T    238K   1.09T     0%  ONLINE     -
root@dara:/[03:02 AM]# df -h /raid1 
Filesystem             size   used  avail capacity  Mounted on
raid1                  822G    37K   822G     1%    /raid1
root@dara:/[03:02 AM]# 

Here I created a raidz1 pool (the ZFS equivalent of RAID 5 with one parity disk, giving me (N-1) * [capacity of a drive]; raidz can survive the death of one hard drive). A zfs pool can also be created with the raidz2 keyword, which uses two parity disks (akin to RAID 6); such a configuration can survive the death of two disks.

Note the difference in capacity that zpool list and df report: zpool list shows capacity without subtracting parity, while df shows the more traditional usable disk space. Using df will likely cause less confusion in normal operation.
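The two numbers are consistent with the parity math. A rough sanity check (raw capacity taken from zpool list; the remaining gap between this estimate and df’s 822G is presumably metadata overhead plus the fact that 1.09T is itself rounded):

```shell
# With 4 disks in raidz1, one disk's worth of space holds parity,
# so usable space should be roughly 3/4 of the raw 1.09T.
usable=$(awk 'BEGIN {
    raw_tb = 1.09; disks = 4; parity = 1
    printf "%.0f", raw_tb * 1024 * (disks - parity) / disks
}')
echo "about ${usable}G usable"   # df reported 822G
```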

So far so good.

Then I proceeded to create a large file on the ZFS pool:

root@dara:/raid1[03:04 AM]# time mkfile 10g reely_beeg_file

real    2m8.943s
user    0m0.062s
sys     0m5.460s
root@dara:/raid1[03:06 AM]# ls -la /raid1/reely_beeg_file 
-rw------T   1 root     root     10737418240 Nov 10 03:06 /raid1/reely_beeg_file
root@dara:/raid1[03:06 AM]#

While this was running, I had zpool iostat -v raid1 10 going in a different window.

               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
raid1        211M  1.09T      0    187      0  18.7M
  raidz1     211M  1.09T      0    187      0  18.7M
    c1d0        -      -      0    110      0  6.26M
    c1d1        -      -      0    110      0  6.27M
    c0d0        -      -      0    110      0  6.25M
    c0d1        -      -      0     94      0  6.23M
----------  -----  -----  -----  -----  -----  -----

               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
raid1       1014M  1.09T      0    601      0  59.5M
  raidz1    1014M  1.09T      0    601      0  59.5M
    c1d0        -      -      0    364      0  20.0M
    c1d1        -      -      0    363      0  20.0M
    c0d0        -      -      0    355      0  19.9M
    c0d1        -      -      0    301      0  19.9M
----------  -----  -----  -----  -----  -----  -----

[...]
               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
raid1       8.78G  1.08T      0    778    363  91.1M
  raidz1    8.78G  1.08T      0    778    363  91.1M
    c1d0        -      -      0    412      0  30.4M
    c1d1        -      -      0    411  5.68K  30.4M
    c0d0        -      -      0    411  5.68K  30.4M
    c0d1        -      -      0    383  5.68K  30.4M
----------  -----  -----  -----  -----  -----  -----

10 gigabytes written in about 129 seconds: roughly 80 megabytes a second of continuous writes. I think I can live with that.
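The arithmetic behind that figure, as a quick check (mkfile’s 10g is 10 GiB, i.e. the 10737418240 bytes that ls reported; the wall time was 2m8.9s):

```shell
# bytes / seconds, expressed in binary megabytes per second
rate=$(awk 'BEGIN {
    bytes = 10737418240; seconds = 128.9
    printf "%.1f", bytes / seconds / (1024 * 1024)
}')
echo "${rate} MB/s"
```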

Next I wanted to run md5 digests of some files on /raid1, then export the pool, shut the system down, switch around the IDE cables, boot the system back up, reimport the pool, and re-run the md5 digests. This would simulate moving a disk pool to a different system, screwing up the disk ordering in the process.

root@dara:/[12:20 PM]# digest -a md5 /raid1/*
(/raid1/reely_beeg_file) = 2dd26c4d4799ebd29fa31e48d49e8e53
(/raid1/sunstudio11-ii-20060829-sol-x86.tar.gz) = e7585f12317f95caecf8cfcf93d71b3e
root@dara:/[12:23 PM]# zpool status
  pool: raid1
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        raid1       ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            c0d0    ONLINE       0     0     0
            c0d1    ONLINE       0     0     0
            c1d0    ONLINE       0     0     0
            c1d1    ONLINE       0     0     0

errors: No known data errors
root@dara:/[12:23 PM]# zpool export raid1
root@dara:/[12:23 PM]# zpool status
no pools available
root@dara:/[12:23 PM]#

The system was shut down, the IDE cables were switched around, and the system was rebooted.

root@dara:/[02:09 PM]# zpool status
no pools available
root@dara:/[02:09 PM]# zpool import raid1
root@dara:/[02:11 PM]# zpool status
  pool: raid1
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        raid1       ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            c1d0    ONLINE       0     0     0
            c1d1    ONLINE       0     0     0
            c0d0    ONLINE       0     0     0
            c0d1    ONLINE       0     0     0

errors: No known data errors
root@dara:/[02:11 PM]# 

Notice that the order of the drives changed: it was c0d0 c0d1 c1d0 c1d1, and now it’s c1d0 c1d1 c0d0 c0d1.

root@dara:/[02:22 PM]# digest -a md5 /raid1/*
(/raid1/reely_beeg_file) = 2dd26c4d4799ebd29fa31e48d49e8e53
(/raid1/sunstudio11-ii-20060829-sol-x86.tar.gz) = e7585f12317f95caecf8cfcf93d71b3e
root@dara:/[02:25 PM]#

Same digests.

Oh, and a very neat feature… Want to know what has been happening with your disk pools?

root@dara:/[02:12 PM]# zpool history raid1
History for 'raid1':
2006-11-10.03:01:56 zpool create raid1 raidz c0d0 c0d1 c1d0 c1d1
2006-11-10.12:19:47 zpool export raid1
2006-11-10.12:20:07 zpool import raid1
2006-11-10.12:39:49 zpool export raid1
2006-11-10.12:46:14 zpool import raid1
2006-11-10.14:09:54 zpool export raid1
2006-11-10.14:11:00 zpool import raid1

Yes, zfs logs the last bunch of commands onto the zpool devices themselves, so even if you move the pool to a different system, the command history will still be with you.

Lastly, some versioning history for ZFS:

root@dara:/[02:19 PM]# zpool upgrade raid1 
This system is currently running ZFS version 3.

Pool 'raid1' is already formatted using the current version.
root@dara:/[02:19 PM]# zpool upgrade -v
This system is currently running ZFS version 3.

The following versions are supported:

VER  DESCRIPTION
---  --------------------------------------------------------
 1   Initial ZFS version
 2   Ditto blocks (replicated metadata)
 3   Hot spares and double parity RAID-Z

For more information on a particular version, including supported releases, see:

http://www.opensolaris.org/os/community/zfs/version/N

Where 'N' is the version number.
root@dara:/[02:19 PM]# 

Power consumption and hard drives

Some numbers about the power consumption of hard drives…

Maxtor DiamondMax 10 6L300R0, 7200 RPM, 300 gig (279.48GB formatted) ATA hard drive has the following power consumption: +5V 740 mA, +12V 1500 mA.

Seagate Barracuda ST3300831A, 7200 RPM, 300 gig (279.45GB formatted) ATA hard drive has the following power consumption: +5V 460 mA, +12V 560 mA.

Seagate’s tech spec sheet claims that their ‘cudas also draw 2.8 amps of +12V to spin up. Maxtor doesn’t have a useful spec sheet for their product.
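Multiplying the rated currents out per rail (P = V * I) makes the gap concrete. A sketch using the label figures above, which are worst-case ratings rather than measured draw:

```shell
# Watts = 5V rail draw + 12V rail draw, from the label ratings above.
out=$(awk 'BEGIN {
    maxtor  = 5 * 0.740 + 12 * 1.500
    seagate = 5 * 0.460 + 12 * 0.560
    printf "Maxtor %.1f W, Seagate %.1f W", maxtor, seagate
}')
echo "$out"
```

By these ratings the Maxtor can draw well over twice the power of the Seagate, which is what the observations below are about.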

Observations: Seagate has a 5 year warranty on their drives. Lower power consumption means lower power dissipation, and thus a cooler system. Lower power consumption also means that you can get away with a smaller power supply (or more drives in a system), and thus reduce your electricity costs (more of an issue in a 24/7 environment) and air conditioning/cooling costs.

Conclusions: One should spec hard drives not only on cost (WD is cheap, but in my experience dies like a butterfly under a cold spell), but also on warranty and power consumption. Sadly, vendors do not provide power consumption information in their spec sheets, so the only way to find it is to go to a computer store, ask to look at an OEM drive, and read the numbers off the label.