Linux和PCI的那点事

Posted by 云谷计算 on June 4, 2019

PCI & PCI-E拓扑结构

PCI采用的是总线型拓扑结构,一条PCI总线上挂着若干个PCI终端设备或者PCI桥设备,大家共享该条PCI总线,哪个人想说话,必须获得总线使用权,然后才能发言。

PCIe是点对点结构。一个典型的结构是一个root port和一个endpoint直接组成一个点对点连接对,而Switch可以同时连接几个endpoint。一个root port和一个endpoint对就需要一个单独的PCI bus。而PCI是在同一个总线上的设备共享同一个bus number。过去主板上的PCI插槽都公用一个PCI bus,而现在的PCIe插槽却连在芯片组不同的root port上。

PCI设备识别

每个 PCI 外设有一个总线号, 一个设备号, 一个功能号标识. PCI 规范允许单个系统占用多达 256 个总线, 但是因为 256 个总线对许多大系统是不够的, Linux 现在支持 PCI 域. 每个 PCI 域可以占用多达 256 个总线. 每个总线占用 32 个设备, 每个设备可以是一个多功能卡(例如一个声音设备, 带有一个附加的 CD-ROM 驱动)有最多 8 个功能. 因此, 每个功能可在硬件层次被一个 16-位地址或者 key , 标识.

  • Simple BDF notation, 即 bus, device, 和function;
  • BDF Notation Extension, 即 domain, bus, device, 和function;

Linux的lspci命令会列出所有设备的bdf, 比如下面网卡的bus=18, device=00, function=[0,1]。

/sys/class/pci_bus# lspci -v|grep Ethernet
18:00.0 Ethernet controller: Intel Corporation Ethernet Controller X710 for 10GbE SFP+ (rev 02)
	Subsystem: Intel Corporation Ethernet Server Adapter X710-DA2 for OCP
18:00.1 Ethernet controller: Intel Corporation Ethernet Controller X710 for 10GbE SFP+ (rev 02)
	Subsystem: Intel Corporation Ethernet Converged Network Adapter X710

Linux PCI枚举

程序实现:

/*
 * Enum all pci device via the PCI config register(CF8 and CFC).
 */

#include <stdio.h>

#include <stdlib.h>

#include <unistd.h>

#include <sys/io.h>


#define PCI_MAX_BUS 255         /* 8 bits (0 ~ 255) */

#define PCI_MAX_DEV 31          /* 5 bits (0 ~ 31) */

#define PCI_MAX_FUN 7           /* 3 bits (0 ~ 7) */

#define CONFIG_ADDRESS 0xCF8

#define CONFIG_DATA 0xCFC

#define PCICFG_REG_VID 0x00     /* Vendor id, 2 bytes */

#define PCICFG_REG_DID 0x02     /* Device id, 2 bytes */

#define PCICFG_REG_CMD 0x04     /* Command register, 2 bytes */

#define PCICFG_REG_STAT 0x06    /* Status register, 2 bytes */

#define PCICFG_REG_RID 0x08     /* Revision id, 1 byte */


void list_pci_devices()
{
    unsigned int bus, dev, fun;
    unsigned int addr, data;

    //printf("BB:DD:FF VID:DID\n");

    for (bus = 0; bus <= PCI_MAX_BUS; bus++) {
        for (dev = 0; dev <= PCI_MAX_DEV; dev++) {
            for (fun = 0; fun <= PCI_MAX_FUN; fun++) {
                addr = 0x80000000L | (bus << 16) | (dev << 11) | (fun << 8);
                outl(addr, CONFIG_ADDRESS);
                data = inl(CONFIG_DATA);

                /* Identify vendor ID */
                if ((data != 0xFFFFFFFF) && (data != 0)) {
                    printf("%02X:%02X:%02X ", bus, dev, fun);
                    printf("%04X:%04X", data & 0xFFFF, data >> 16);
                    addr =
                        0x80000000L | (bus << 16) | (dev << 11) | (fun << 8) |
                        PCICFG_REG_RID;
                    outl(addr, CONFIG_ADDRESS);
                    data = inl(CONFIG_DATA);
                    if (data & 0xFF) {
                        printf(" (rev %02X)\n", data & 0xFF);
                    } else {
                        printf("\n");
                    }
                }
            }  // end func
        }  // end device
    } // end bus
}

int main()
{
    int ret;

    /* Enable r/w permission of all 65536 ports */
    ret = iopl(3);
    if (ret < 0) {
        perror("iopl set error");
        return 1;
    }

    list_pci_devices();

    /* Disable r/w permission of all 65536 ports */
    ret = iopl(0);
    if (ret < 0) {
        perror("iopl set error");
        return 1;
    }

    return 0;
}

通过lspci -tv按照bdf枚举所有的pci设备:

# lspci -tv
-+-[0000:d7]-+-00.0-[d8]--
 |           +-01.0-[d9]--
 |           +-02.0-[da]--
 |           +-03.0-[db]--
 |           +-05.0  Intel Corporation Sky Lake-E VT-d
 |           +-05.2  Intel Corporation Sky Lake-E RAS Configuration Registers
 |           +-05.4  Intel Corporation Sky Lake-E IOxAPIC Configuration Registers
 |           +-0e.0  Intel Corporation Sky Lake-E KTI 0
 |           +-0e.1  Intel Corporation Sky Lake-E UPI Registers
 |           +-0f.0  Intel Corporation Sky Lake-E KTI 0
 |           +-0f.1  Intel Corporation Sky Lake-E UPI Registers
 |           +-12.0  Intel Corporation Sky Lake-E M3KTI Registers
 |           +-12.1  Intel Corporation Sky Lake-E M3KTI Registers
 |           +-12.2  Intel Corporation Sky Lake-E M3KTI Registers
 |           +-15.0  Intel Corporation Sky Lake-E M2PCI Registers
 |           +-16.0  Intel Corporation Sky Lake-E M2PCI Registers
 |           \-16.4  Intel Corporation Sky Lake-E M2PCI Registers
 +-[0000:ae]-+-05.0  Intel Corporation Sky Lake-E VT-d
 |           +-05.2  Intel Corporation Sky Lake-E RAS Configuration Registers
 ...

PCI-E带宽

PCI-E的连线是由不同的lane来连接的,这些lane可以合在一起提供更高的带宽。譬如2个1lane可以合成2lane的连接,写作x2。两个x2可以变成x4。

PCIe版本 1x 4x 8x 16x
1.0 250MB/s 1GB/s 2GB/s 4GB/s
2.0 500MB/s 2GB/s 4GB/s 8GB/s
3.0 984.6MB/s 3.938GB/s 7.877GB/s 15.754GB/s
4.0 1.969GB/s 7.877GB/s 15.754GB/s 31.508GB/s

PCI配置寄存器

所有的 PCI 设备都有至少一个 256-字节 地址空间. 前 64 字节是标准的, 而剩下的是依赖设备的.

lspci查看(二进制):

# lspci -s 18:00.0 -x
18:00.0 Ethernet controller: Intel Corporation Ethernet Controller X710 for 10GbE SFP+ (rev 02)
00: 86 80 72 15 06 04 10 00 02 00 00 02 08 00 80 00
10: 0c 00 00 9f 00 00 00 00 00 00 00 00 0c 80 00 a0
20: 00 00 00 00 00 00 00 00 00 00 00 00 86 80 0b 00
30: 00 00 f8 ff 40 00 00 00 00 00 00 00 0b 01 00 00

lspci查看(文本格式):

# lspci -s 18:00.0 -v
18:00.0 Ethernet controller: Intel Corporation Ethernet Controller X710 for 10GbE SFP+ (rev 02)
	Subsystem: Intel Corporation Ethernet Server Adapter X710-DA2 for OCP
	Flags: bus master, fast devsel, latency 0, IRQ 34, NUMA node 0
	Memory at 9f000000 (64-bit, prefetchable) [size=16M]
	Memory at a0008000 (64-bit, prefetchable) [size=32K]
	Expansion ROM at 9d800000 [disabled] [size=512K]
	Capabilities: [40] Power Management version 3
	Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
	Capabilities: [70] MSI-X: Enable+ Count=129 Masked-
	Capabilities: [a0] Express Endpoint, MSI 00
	Capabilities: [e0] Vital Product Data
	Capabilities: [100] Advanced Error Reporting
	Capabilities: [140] Device Serial Number 80-a0-65-ff-ff-fe-fd-3c
	Capabilities: [1a0] Transaction Processing Hints
	Capabilities: [1b0] Access Control Services
	Capabilities: [1d0] #19
	Kernel driver in use: i40e
	Kernel modules: i40e

参考