# Status of Compute Node development







Jingzhou ZHAO, Zhen-An LIU

Trigger Lab, IHEP

PANDA China Group meeting

#### Outline



- PANDA DAQ system requirements
- 3 versions of CN
- Status of CN upgrade
- Summary

## Requirements for the DAQ system



- High event rate: up to 20 MHz.
- Event size: 1.5 kB to 4.5 kB.
- Considering electronic noise, background and signal accumulation, the DAQ must be able to process about 200 GB/s.

Handling this much data with no dead time is a major challenge for the DAQ system.
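As a rough cross-check of these numbers, the event rate and event size alone give a raw rate of 30 to 90 GB/s; the 200 GB/s requirement additionally covers noise, background and signal accumulation. The sketch below only reproduces that arithmetic and assumes 1 kB = 1000 bytes and 1 GB = 1e9 bytes, which the slides do not specify.

```python
# Raw data rate from the event rate and event size quoted above.
EVENT_RATE_HZ = 20e6                 # up to 20 MHz
EVENT_SIZE_BYTES = (1.5e3, 4.5e3)    # 1.5 kB to 4.5 kB (assuming 1 kB = 1000 bytes)


def raw_rate_gb_per_s(event_rate_hz: float, event_size_bytes: float) -> float:
    """Return the raw data rate in GB/s (assuming 1 GB = 1e9 bytes)."""
    return event_rate_hz * event_size_bytes / 1e9


low, high = (raw_rate_gb_per_s(EVENT_RATE_HZ, size) for size in EVENT_SIZE_BYTES)
print(f"raw rate: {low:.0f} to {high:.0f} GB/s")
```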



#### New concept of DAQ system



- Trigger-less streaming DAQ with event filtering.
- FEE ADC is self-triggered.
  - Detector signals are self-triggered and sampled in the FEE. The self-trigger is based on signal amplitude and on the time interval with respect to the collision point.
- FEE pipeline, no dead time.
  - Pipeline mode is used for FEE ADC data readout to avoid dead time.
- Global time distribution for time stamping.
  - Readout data with the same time stamp are defined as belonging to the same event.
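The time-stamp rule above can be illustrated with a minimal sketch that groups hits by identical time stamp. The `Hit` fields, their units and the `build_events` helper are hypothetical names chosen for illustration, not part of the PANDA firmware or software.

```python
from collections import defaultdict
from typing import NamedTuple


class Hit(NamedTuple):
    timestamp: int    # global time stamp attached in the FEE (hypothetical units)
    channel: int      # hypothetical channel identifier
    amplitude: float  # hypothetical sampled amplitude


def build_events(hits):
    """Group hits that carry the same time stamp into one event."""
    events = defaultdict(list)
    for hit in hits:
        events[hit.timestamp].append(hit)
    # Return the events ordered by time stamp.
    return dict(sorted(events.items()))


hits = [Hit(100, 3, 0.8), Hit(101, 7, 1.2), Hit(100, 5, 0.4)]
events = build_events(hits)
for timestamp, event_hits in events.items():
    print(timestamp, [h.channel for h in event_hits])
```

Exact time-stamp equality is used here because that is the rule stated on the slide; a real event builder may well match hits within a time window instead.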



#### PANDA DAQ system



- Global time distribution for time stamping.
- L1 network:
  - Extracts particle information such as energy, position and momentum.
- L2 network:
  - Performs a preliminary reconstruction of physics events.
- Event selection is based on the research topics of the PANDA experiment.



#### PANDA TDAQ Requirement







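#### Functional requirements:

- High-speed data transmission: the data volume is huge (200 GB/s) and must be read out at high speed in real time.
- High-speed data interconnect: event reconstruction is based on the data of the whole detector, so the system components need to share data.
- Data buffering: event reconstruction takes a certain processing time, so the system needs a very large data buffering capacity.
- Data processing: the data volume is large, and event reconstruction and selection require strong data processing capability.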
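The buffering requirement follows from the input rate multiplied by the time the data must wait for processing. The sketch below shows that relation for the 200 GB/s figure; the latency values are purely illustrative assumptions, since the slides do not quote a processing time.

```python
INPUT_RATE_GB_PER_S = 200.0  # input rate from the requirement above


def buffer_size_gb(latency_s: float,
                   rate_gb_per_s: float = INPUT_RATE_GB_PER_S) -> float:
    """Data that accumulates while waiting latency_s seconds, in GB."""
    return rate_gb_per_s * latency_s


# ASSUMPTION: these latencies are illustrative only, not PANDA numbers.
for latency_s in (1e-3, 10e-3, 100e-3):
    print(f"latency {latency_s * 1e3:g} ms -> buffer {buffer_size_gb(latency_s):g} GB")
```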

# What is a Compute Node



- Provides:
  - High-speed data readout
  - Large data buffering
  - High-performance data processing
- Based on:
  - xTCA
  - High-performance FPGAs
  - DDR memory
  - RocketIO and optical links



- Function and performance
  - ATCA standard
  - 5 x Virtex-4 FX60 with PowerPC 405
  - 16 MGT channels connected to the backplane, 3.125 Gbps
  - 8 optical ports x 3.125 Gbps
  - 5 x 2 GB DDR2
  - 6 Gigabit Ethernet ports
    - One to ATCA Z2 basic port
  - 64 MB Flash for each FPGA
  - UART hub
  - IPMC
  - Full mesh connection between the FPGAs via 32-bit bus









- Function and performance
  - ATCA standard
  - 5 x Virtex-4 FX60 with PowerPC 405
  - 16 MGT channels connected to the backplane, 3.125 Gbps
  - 8 SFP x 3.125 Gbps
  - 5 x 2 GB DDR2
  - 6 Gigabit Ethernet ports
    - One to ATCA Z2 basic port
  - 64 MB Flash for each FPGA
  - UART hub
  - IPMC
  - Full mesh connection between the FPGAs via 32-bit bus
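The channel counts and line rate listed above translate into aggregate raw bandwidth as sketched below. The payload figure assumes 8b/10b line coding, which is an assumption made here for illustration; the slides do not state the coding scheme.

```python
LINE_RATE_GBPS = 3.125  # per-channel line rate from the list above

# Channel counts from the list above.
CHANNELS = {"backplane MGT": 16, "SFP optical": 8}

# ASSUMPTION: 8b/10b line coding (8 payload bits per 10 line bits).
# The slides do not state the coding scheme.
ASSUMED_CODING_EFFICIENCY = 8 / 10


def aggregate_gbps(n_channels: int,
                   line_rate_gbps: float = LINE_RATE_GBPS) -> float:
    """Aggregate raw line rate of n identical channels, in Gbps."""
    return n_channels * line_rate_gbps


for name, n_channels in CHANNELS.items():
    raw = aggregate_gbps(n_channels)
    payload = raw * ASSUMED_CODING_EFFICIENCY
    print(f"{name}: {raw:.1f} Gbps raw, {payload:.1f} Gbps payload (assumed 8b/10b)")
```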







#### CN V3





#### CN V3: Carrier Board



#### Function and performance

- ATCA standard
- Virtex-4 FX60 with PowerPC 405
- Embedded Linux system for slow control
- 16 MGT channels connected to the backplane, 3.125 Gbps
- 2 GB DDR2
- 2 Gigabit Ethernet ports
  - One to ATCA Z2 basic port
  - One to RTM RJ45
- 64 MB Flash
- JTAG, UART hub
- IPMC
- AMC full mesh connection



#### Backplane Full Mesh for CN



- Full mesh backplane lets each CN share data with every other node.
- Point-to-point via one MGT channel.
- Line rate up to 3.125 Gbps.
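In a full mesh with one point-to-point channel per pair of nodes, each of the n nodes needs n - 1 channels and the backplane carries n(n - 1)/2 links in total. The sketch below evaluates this for a 14-slot shelf; the slot count is an assumption for illustration only, as the slides do not give the shelf size.

```python
LINE_RATE_GBPS = 3.125  # per-channel line rate from the list above


def full_mesh(n_nodes: int, line_rate_gbps: float = LINE_RATE_GBPS):
    """Return (channels per node, total links, raw Gbps per node) for a full mesh."""
    links_per_node = n_nodes - 1
    total_links = n_nodes * links_per_node // 2
    per_node_gbps = links_per_node * line_rate_gbps
    return links_per_node, total_links, per_node_gbps


# ASSUMPTION: 14 nodes is a hypothetical shelf size, not taken from the slides.
links_per_node, total_links, per_node_gbps = full_mesh(14)
print(links_per_node, total_links, per_node_gbps)
```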





## CN Carrier Full mesh for AMC



- Full mesh connection for AMC.
- Point-to-point directly via the Carrier board:
  - One MGT port
  - Two general LVDS links
- Line rate up to 3.125 Gbps.



# xFP (xTCA-based FPGA Processor)

#### Function and performance

- Virtex-5 FX70T with PowerPC 440
- Embedded Linux system for slow control
- 8 MGT channels
  - 2 x SFP+ ports, 6.25 Gbps/ch
  - 6 channels to the AMC connector, 6.25 Gbps/ch
- 12 x 600 Mbps LVDS
- 2 x 2 GB DDR2
- 1 Gigabit Ethernet port
- 64 MB Flash
- PROM for FPGA configuration
- 2 UART ports
- MMC





#### Idea for Compute Node upgrade



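- Reasons for the upgrade:
  - The Virtex-4/5 FPGA models are too old.
  - Performance needs to be improved further.
- Goals for the new Compute Node:
  - Modular design: makes the system easy to extend.
  - Data processing: Kintex UltraScale chips.
  - High-speed data transmission:
    - Optical throughput of 800 Gbps per board.
    - Support for 10 Gigabit Ethernet ports.
  - Data buffering: 48 GB of DDR4 per board.
  - AMC boards use the double-width form factor.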



#### Block of CN\_V4.0: Carrier Board



- **FPGA:** Virtex-4 FX60 -> Kintex UltraScale XCKU060
- **RAM:** 2 GB DDR2 SODIMM -> 16 GB DDR4
- **MGTs:** 3.125 Gbps -> 16.3 Gbps
  - 4 links to each AMC card (currently: 4 x 600 Mbps LVDS)
  - 12 links to ATCA backplane
  - 1 link to RTM (10G Ethernet)
- **GbE switch:**
  - 4 AMCs
  - 1 switch FPGA
  - 1 uplink to ATCA Base Interface
  - 1 RTM RJ45
- 10 Gigabit Ethernet to RTM (SFP+)



#### Block of CN\_V4.0: Carrier Board



- **Configuration:** Flash/CPLD (slave serial) -> automatic from NOR Flash (master BPI)
- Programmable MGT clock
- CPLD as JTAG hub
- Keep:
  - I2C buses, sensors
  - IPMC/MMC



## CN V4.0: Carrier Board



- PCB layout and 3D view of the Carrier board.
- Design work is ongoing.



#### Block of CN\_V4.0: Daughter board



#### AMC daughter board

2021/7/7

- AMC single width -> double width
- Virtex-5 -> Kintex UltraScale XCKU060
- MGT: 31 channels, 6.25 Gbps -> 16.3 Gbps
  - 5 channels to the AMC backplane
  - 2 channels to QSFP
  - 24 channels to Firefly (optical fiber)
- 4 GB DDR2 -> 16 GB DDR4
- BPI flash
- 128 MB Flash
- UART
- External port for clock
- MMC
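As a consistency check on the MGT figures above, the three channel groups add up to the quoted 31 channels, and the aggregate raw line rate follows from the 16.3 Gbps per channel. The sketch below only reproduces this arithmetic; it gives raw line rates, not usable payload bandwidth, and treats the QSFP and Firefly channels as the optical ones.

```python
LINE_RATE_GBPS = 16.3  # per-channel line rate from the list above

# MGT channel breakdown from the list above.
MGT_CHANNELS = {"AMC backplane": 5, "QSFP": 2, "Firefly": 24}

total_channels = sum(MGT_CHANNELS.values())
aggregate_gbps = total_channels * LINE_RATE_GBPS
# Raw line rate of the QSFP and Firefly channels only.
optical_gbps = (MGT_CHANNELS["QSFP"] + MGT_CHANNELS["Firefly"]) * LINE_RATE_GBPS

print(f"{total_channels} channels, {aggregate_gbps:.1f} Gbps raw in total, "
      f"{optical_gbps:.1f} Gbps raw on optical channels")
```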



# CN\_V4.0: Daughter board



- PCB layout and 3D view of the Daughter board.
- Design work is ongoing.



#### Summary



- Three versions of the Compute Node have been developed.
- The performance of CN\_V3 needs to be upgraded.
- PCB layout for the CN\_V4 upgrade is ongoing.
- Next step:
  - Apply for funding for CN upgrade prototype production and testing.



# Thanks for your attention.