PCIe 2.0 is still the generation most commonly seen in enterprise gear. Each lane runs at 5.0 GT/s, and every 8 bits of data are sent as 10 bits on the wire (8b/10b encoding) to keep the transmission reliable, so the usable data bandwidth is only 80% of the raw rate:
5.0GT/s * 0.8 = 4Gb/s = 0.5GB/s
From there it simply scales with the number of lanes (a quick sketch of the arithmetic follows this list):
PCIe x2 = 8 Gb/s = 1 GB/s
PCIe x4 = 16 Gb/s = 2 GB/s
PCIe x8 = 32 Gb/s = 4 GB/s
PCIe x16 = 64 Gb/s = 8 GB/s
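If you want to double-check that arithmetic, a tiny shell loop like the one below (a sketch assuming the PCIe 2.0 numbers above and that bc is installed) reproduces the list:

# Effective PCIe 2.0 bandwidth per link width: 5.0 GT/s raw * 0.8 (8b/10b overhead) * lanes
for width in 1 2 4 8 16; do
    echo "PCIe 2.0 x${width} = $(echo "5.0 * 0.8 * ${width}" | bc) Gb/s per direction"
done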
| Version | Data rate (per lane) | Bandwidth per lane (one direction) | x16 bandwidth (both directions) | Raw transfer rate | Power | Release date |
|---------|----------------------|------------------------------------|---------------------------------|-------------------|-------|--------------|
| 1.0     | 2 Gb/s               | 250 MB/s                           | 8 GB/s                          | 2.5 GT/s          | -     | 2002-07-22   |
| 1.0a    | 2 Gb/s               | 250 MB/s                           | 8 GB/s                          | 2.5 GT/s          | -     | 2003-04-15   |
| 1.1     | 2 Gb/s               | 250 MB/s                           | 8 GB/s                          | 2.5 GT/s          | 77 W  | 2005-03-28   |
| 2.0     | 4 Gb/s               | 500 MB/s                           | 16 GB/s                         | 5.0 GT/s          | 225 W | 2006-12-20   |
| 2.1     | 4 Gb/s               | 500 MB/s                           | 16 GB/s                         | 5.0 GT/s          | -     | 2009-03-04   |
| 3.0     | 8 Gb/s               | 1 GB/s                             | 32 GB/s                         | 8.0 GT/s          | -     | 2010-11-10   |
| 4.0     | -                    | -                                  | -                               | 16.0 GT/s         | -     | -            |
| System Bus  | Version | Data Rate | Encoding | x1 lane  | x4 lane   | x8 lane   | x16 lane   |
|-------------|---------|-----------|----------|----------|-----------|-----------|------------|
| PCI Express | 1.0     | 2.5 GT/s  | 8/10     | 2.0 Gb/s | 8.0 Gb/s  | 16.0 Gb/s | 32.0 Gb/s  |
| PCI Express | 2.0     | 5.0 GT/s  | 8/10     | 4.0 Gb/s | 16.0 Gb/s | 32.0 Gb/s | 64.0 Gb/s  |
| PCI Express | 3.0     | 8.0 GT/s  | 128/130  | 7.9 Gb/s | 31.5 Gb/s | 63.0 Gb/s | 126.0 Gb/s |
So for a 10Gb NIC that has two ports:

10 Gb/s x 2 = 20 Gb/s => needs PCIe 2.0 x8 (x4 only provides 16 Gb/s)

When doing the bandwidth math, also check that the PCIe slot spec can keep up; otherwise a high-end NIC like this can end up throttled by the slot, which would be a waste.
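A small script along these lines (a hypothetical helper, not from the original post, using the generation and encoding efficiency values quoted above) can tell you whether a given link configuration covers the throughput you need:

#!/bin/bash
# pcie_fit.sh (hypothetical helper): does a PCIe link of a given generation and width
# cover a required throughput? Requires bc.
# Usage: ./pcie_fit.sh <gen: 1|2|3> <lanes> <required Gb/s>   e.g. ./pcie_fit.sh 2 8 20
gen=$1; lanes=$2; need=$3
case "$gen" in
  1) raw=2.5; eff=0.8    ;;   # 2.5 GT/s, 8b/10b encoding
  2) raw=5.0; eff=0.8    ;;   # 5.0 GT/s, 8b/10b encoding
  3) raw=8.0; eff=0.9846 ;;   # 8.0 GT/s, 128b/130b encoding
  *) echo "unknown PCIe generation: $gen"; exit 1 ;;
esac
bw=$(echo "$raw * $eff * $lanes" | bc -l)
echo "PCIe ${gen}.0 x${lanes} ~= ${bw} Gb/s per direction"
if (( $(echo "$bw >= $need" | bc -l) )); then
    echo "OK: enough for ${need} Gb/s"
else
    echo "Not enough for ${need} Gb/s"
fi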
The following uses an LSI RAID card as an example of how to use lspci to look up the PCIe link speed.
# First find the slot number of the LSI card; the number at the start of the line is its ID
wistor@ubuntu:~$ sudo lspci
04:00.0 RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS 2108 [Liberator] (rev 05)
07:00.0 Ethernet controller: Intel Corporation 82575EB Gigabit Network Connection (rev 02)
07:00.1 Ethernet controller: Intel Corporation 82575EB Gigabit Network Connection (rev 02)

# Then use that ID to get the detailed info; remember to use sudo, otherwise the Speed fields are not shown
wistor@ubuntu:~$ sudo lspci -vvv -s 04:00
04:00.0 RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS 2108 [Liberator] (rev 05)
        Subsystem: LSI Logic / Symbios Logic MegaRAID SAS 9260-4i
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 256 bytes
        Interrupt: pin A routed to IRQ 26
        Region 0: I/O ports at d000 [size=256]
        Region 1: Memory at fbdfc000 (64-bit, non-prefetchable) [size=16K]
        Region 3: Memory at fbd80000 (64-bit, non-prefetchable) [size=256K]
        Expansion ROM at fbd40000 [disabled] [size=256K]
        Capabilities: [50] Power Management version 3
                Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [68] Express (v2) Endpoint, MSI 00
                DevCap: MaxPayload 4096 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
                        ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+
                DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
                        RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+ FLReset-
                        MaxPayload 128 bytes, MaxReadReq 512 bytes
                DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr- TransPend-
                LnkCap: Port #0, Speed 5GT/s, Width x8, ASPM L0s, Latency L0 <64ns, L1 <1us
                        ClockPM- Surprise- LLActRep- BwNot-
                LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk+
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 5GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
                DevCap2: Completion Timeout: Range BC, TimeoutDis+
                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-
                LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-, Selectable De-emphasis: -6dB
                         Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                         Compliance De-emphasis: -6dB
                LnkSta2: Current De-emphasis Level: -6dB
        Capabilities: [d0] Vital Product Data
pcilib: sysfs_read_vpd: read failed: Connection timed out
                Not readable
        Capabilities: [a8] MSI: Enable- Count=1/1 Maskable- 64bit+
                Address: 0000000000000000  Data: 0000
        Capabilities: [c0] MSI-X: Enable+ Count=15 Masked-
                Vector table: BAR=1 offset=00002000
                PBA: BAR=1 offset=00003800
        Capabilities: [100 v1] Advanced Error Reporting
                UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
                CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
                CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
                AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
        Capabilities: [138 v1] Power Budgeting <?>
        Kernel driver in use: megaraid_sas
        Kernel modules: megaraid_sas
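If you only care about the link speed, you do not have to wade through the whole dump; a filter like the following (a shortcut added here, not part of the original session) pulls out just the two relevant lines:

# Show only the link capability and link status lines
wistor@ubuntu:~$ sudo lspci -vvv -s 04:00.0 | grep -E 'LnkCap:|LnkSta:'
                LnkCap: Port #0, Speed 5GT/s, Width x8, ASPM L0s, Latency L0 <64ns, L1 <1us
                LnkSta: Speed 5GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-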
LnkCap is the maximum speed the card itself supports; for the LSI card above that is 5GT/s (PCIe v2), Width x8.
LnkSta is the speed the link is actually running at; normally it should match LnkCap, otherwise you are not getting the full bandwidth.
If LnkSta reports a lower speed, you need to dig into whether the slot itself is the limit; or maybe you just want to confirm that the slot's bandwidth is being fully used.
Either way, how do you figure out which slot the card is plugged into?
# View the PCI topology as a tree
wistor@ubuntu:~$ sudo lspci -vt
\-[0000:00]-+-00.0  Intel Corporation 5520 I/O Hub to ESI Port
            +-01.0-[07]--+-00.0  Intel Corporation 82575EB Gigabit Network Connection
            |            \-00.1  Intel Corporation 82575EB Gigabit Network Connection
            +-03.0-[06]--
            +-04.0-[05]--
            +-05.0-[04]----00.0  LSI Logic / Symbios Logic MegaRAID SAS 2108 [Liberator]
            +-07.0-[03]--
From the tree above you can see that the LSI card hangs off port 5 (device 05.0) of the 5520 I/O Hub.
The same method can be used to check the bandwidth of that root port:
wistor@ubuntu:~$ sudo lspci -vvv -s 00:05.0
00:05.0 PCI bridge: Intel Corporation 5520/X58 I/O Hub PCI Express Root Port 5 (rev 22) (prog-if 00 [Normal decode])
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 256 bytes
        Bus: primary=00, secondary=04, subordinate=04, sec-latency=0
        I/O behind bridge: 0000d000-0000dfff
        Memory behind bridge: fbd00000-fbdfffff
        Prefetchable memory behind bridge: 00000000fff00000-00000000000fffff
        Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- <SERR- <PERR-
        BridgeCtl: Parity- SERR+ NoISA- VGA- MAbort- >Reset- FastB2B-
                PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
        Capabilities: [40] Subsystem: Intel Corporation Device 0000
        Capabilities: [60] MSI: Enable- Count=1/2 Maskable+ 64bit-
                Address: 00000000  Data: 0000
                Masking: 00000000  Pending: 00000000
        Capabilities: [90] Express (v2) Root Port (Slot+), MSI 00
                DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
                        ExtTag+ RBE+ FLReset-
                DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
                        RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
                        MaxPayload 128 bytes, MaxReadReq 128 bytes
                DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
                LnkCap: Port #0, Speed 5GT/s, Width x8, ASPM L0s L1, Latency L0 <512ns, L1 <64us
                        ClockPM- Surprise+ LLActRep+ BwNot+
                LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk+
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 5GT/s, Width x8, TrErr- Train- SlotClk+ DLActive+ BWMgmt+ ABWMgmt-
                SltCap: AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug- Surprise-
                        Slot #6, PowerLimit 25.000W; Interlock- NoCompl-
                SltCtl: Enable: AttnBtn- PwrFlt- MRL- PresDet- CmdCplt- HPIrq- LinkChg-
                        Control: AttnInd Off, PwrInd Off, Power- Interlock-
                SltSta: Status: AttnBtn- PowerFlt- MRL- CmdCplt- PresDet+ Interlock-
                        Changed: MRL- PresDet+ LinkState+
                RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna- CRSVisible-
                RootCap: CRSVisible-
                RootSta: PME ReqID 0000, PMEStatus- PMEPending-
                DevCap2: Completion Timeout: Range BCD, TimeoutDis+ ARIFwd+
                DevCtl2: Completion Timeout: 260ms to 900ms, TimeoutDis- ARIFwd-
                LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-, Selectable De-emphasis: -6dB
                         Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                         Compliance De-emphasis: -6dB
                LnkSta2: Current De-emphasis Level: -6dB
        Capabilities: [e0] Power Management version 3
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
                Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [100 v1] Advanced Error Reporting
                UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
                CESta:  RxErr+ BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
                CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
                AERCap: First Error Pointer: 00, GenCap- CGenEn- ChkCap- ChkEn-
        Capabilities: [150 v1] Access Control Services
                ACSCap: SrcValid+ TransBlk+ ReqRedir+ CmpltRedir+ UpstreamFwd+ EgressCtrl- DirectTrans-
                ACSCtl: SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-
        Kernel driver in use: pcieport
        Kernel modules: shpchp
Here LnkCap is 5GT/s, Width x8 and LnkSta is also 5GT/s, Width x8, so none of the slot's bandwidth is going to waste.
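If you want to audit every device at once instead of chasing one slot at a time, a loop along these lines (a sketch using plain lspci and grep, not from the original post; it compares Speed only, not Width) flags any link running below what the card advertises:

# Flag devices whose negotiated link speed (LnkSta) is lower than their capability (LnkCap)
for dev in $(lspci | awk '{print $1}'); do
    cap=$(sudo lspci -vvv -s "$dev" | grep -oP 'LnkCap:.*Speed \K[^, (]+')
    sta=$(sudo lspci -vvv -s "$dev" | grep -oP 'LnkSta:.*Speed \K[^, (]+')
    if [ -n "$cap" ] && [ "$cap" != "$sta" ]; then
        echo "$dev: capable of $cap but running at $sta"
    fi
done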
# Another very handy command
wistor@ubuntu:~$ sudo lshw -businfo
Bus info          Device     Class      Description
===================================================
pci@0000:04:00.0  scsi6      storage    MegaRAID SAS 2108 [Liberator]
pci@0000:07:00.0  eth0       network    82575EB Gigabit Network Connection
pci@0000:07:00.1  eth1       network    82575EB Gigabit Network Connection
pci@0000:00:1f.2  scsi0      storage    82801JI (ICH10 Family) SATA AHCI Controller
scsi@0:0.0.0      /dev/sda   disk       250GB ST9250610NS
scsi@0:0.0.0,1    /dev/sda1  volume     56GiB EXT4 volume
scsi@0:0.0.0,2    /dev/sda2  volume     175GiB Extended partition
                  /dev/sda5  volume     175GiB Linux swap / Solaris partition
Reference:
https://noc.sara.nl/wiki/Server_Performance_Tuning
http://benjr.tw/node/663