

# **CEPC TDAQ and Online**

Fei Li, on behalf of the CEPC TDAQ Group



Institute of High Energy Physics, Chinese Academy of Sciences

#### Oct. 21<sup>st</sup>, 2024, CEPC Detector Ref-TDR Review



- Introduction
- Requirements
- Technology survey and our choices
- Technical challenges
- Previous experience on large facilities
- R&D efforts and results
- Detailed design
- Research team and working plan
- Summary

### Introduction

- This talk is about the design and development of the TDAQ and online
- This talk relates to the Ref-TDR Ch 12.
- Questions for physics and simulation
  - What kind of events need to be saved?
  - How to identify these events?
  - Background level?
- Questions for each detector and its electronics
  - Raw data readout bandwidth?
  - How much latency is acceptable?
  - What trigger primitives can be provided?
  - Intrinsic noise level?
  - Control and monitoring requirements?

- 12.1 Introduction
- 12.2 Requirements and Design Considerations
  - 12.2.1 Requirements
  - 12.2.2 Event rate & background rate estimation
  - 12.2.3 Technology survey
  - 12.2.5 TDAQ Interface with electronics
- 12.3 Trigger Simulation and Algorithms
- 12.4 Hardware Trigger
  - 12.4.1 Previous experience on large facilities
  - 12.4.2 System architecture
  - 12.4.3 Common Trigger Board
  - 12.4.4 Trigger Control and Distribution
  - 12.4.5 Resource cost estimation
- 12.5 Software and High Level Trigger
- 12.6 Data Acquisition System
  - 12.6.1 Previous experience on large facilities
  - 12.6.2 Overview of System Functionality
  - 12.6.3 Detector Readout
  - 12.6.4 Dataflow
  - 12.6.5 Network
  - 12.6.6 Online Software
- 12.7 Detector Control System
- 12.8 Experiment Control System
- 12.9 Summary

### **Requirements: Physical Event Rate**

- 8 Hz @ Higgs, 240 GeV (50 MW)
  - Bunch crossing rate: 2.889 MHz
  - Higgs: ~0.02 Hz
- 82 kHz @ Z pole, 91 GeV (50 MW)
  - Bunch crossing rate: 43.3 MHz
- Physics event rates are very low relative to the bunch crossing rate.
- Keep as many physics events as possible
  - By a rough selection of the relevant objects (jet, e, muon, tau, ν, ...) and their combinations.
  - Requires detailed signal feature extraction and simulation studies.

|                                                                 | Higgs       | Z         | W          | tt̄            |
|-----------------------------------------------------------------|-------------|-----------|------------|---------------|
| SR power per beam (MW)                                          | 50          |           |            |               |
| Bunch number                                                    | 446         | 13104     | 2162       | 58            |
| Bunch spacing (ns)                                              | 346.2 (×15) | 23.1 (×1) | 138.5 (×6) | 2700.0 (×117) |
| Train gap (%)                                                   | 54          | 9         | 10         | 53            |
| Luminosity per IP ($10^{34}$ cm<sup>-2</sup> s<sup>-1</sup>)    | 8.3         | 192       | 26.7       | 0.8           |
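The bunch crossing rates quoted on this slide follow from the bunch spacing in the table as f = 1/Δt. A minimal Python sketch of that arithmetic is given here; the dictionary and function names are illustrative assumptions, and the train gap is ignored, so these are peak rates within a bunch train.

```python
# Bunch spacing per operation mode in ns, copied from the table above.
BUNCH_SPACING_NS = {"Higgs": 346.2, "Z": 23.1, "W": 138.5, "ttbar": 2700.0}


def crossing_rate_mhz(spacing_ns: float) -> float:
    """Peak bunch crossing rate in MHz for a bunch spacing given in ns."""
    return 1.0e3 / spacing_ns  # 1 / (1 ns) corresponds to 1000 MHz


rates_mhz = {mode: crossing_rate_mhz(dt) for mode, dt in BUNCH_SPACING_NS.items()}

for mode, rate in rates_mhz.items():
    print(f"{mode:>5}: {rate:7.3f} MHz")
```

With the table values this gives about 2.889 MHz for the Higgs mode and about 43.3 MHz for the Z mode, matching the numbers on the slide.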



### **Requirements: Data Rate**

- Data rate before trigger
  - ~1 TB/s @ Higgs
  - Several TB/s @ Z
- L1 trigger rate
  - O(1k) Hz @ Higgs
  - O(100k) Hz @ Z
- Event size < 2 MB
  - Related to occupancy and readout window
- Storage rate after HLT
  - < 100 Hz (200 MB/s) @ Higgs
  - 100 kHz (200 GB/s) @ Z
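The storage figures above are the product of the accepted event rate and the event size. The sketch below reproduces them under the assumption that every event has the 2 MB upper-bound size and that decimal units are used (1 GB = 1000 MB); the function and variable names are illustrative only.

```python
EVENT_SIZE_MB = 2.0  # upper bound on the event size quoted on the slide


def bandwidth_mb_per_s(rate_hz: float, event_size_mb: float = EVENT_SIZE_MB) -> float:
    """Data bandwidth in MB/s for a given event rate and event size."""
    return rate_hz * event_size_mb


# Storage rate after HLT
storage_higgs_mb_s = bandwidth_mb_per_s(100)        # 100 Hz at Higgs
storage_z_gb_s = bandwidth_mb_per_s(100e3) / 1e3    # 100 kHz at Z

# Bandwidth out of the L1 trigger, using the order-of-magnitude L1 rates
l1_higgs_gb_s = bandwidth_mb_per_s(1e3) / 1e3       # O(1k) Hz at Higgs
l1_z_gb_s = bandwidth_mb_per_s(100e3) / 1e3         # O(100k) Hz at Z

print(storage_higgs_mb_s, storage_z_gb_s, l1_higgs_gb_s, l1_z_gb_s)
```

Because the event size is an upper bound, these numbers are upper bounds as well.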

|                                  | Vertex                            | Pix (ITKB)                | Strip (ITKE)             | OTKB                     | OTKE                    | TPC                         | ECAL-B                                 | ECAL-E     | HCAL-B                             | HCAL-E                             | Muon                                        |
|----------------------------------|-----------------------------------|---------------------------|--------------------------|--------------------------|-------------------------|-----------------------------|----------------------------------------|------------|------------------------------------|------------------------------------|---------------------------------------------|
| Channels per chip                | 512*1024                          | 512*128                   | 1024                     | 128                      |                         | 128                         | 8~16                                   |            |                                    |                                    |                                             |
| Data width / hit                 | 32 bit                            | 42 bit                    | 32 bit                   | 48 bit                   |                         | 48 bit                      | 48 bit                                 |            |                                    |                                    |                                             |
| Avg. data rate / chip            | 0.18 Gbps/chip, 1 Gbps/chip inner | 3.53 Mbps/chip            | 21.5 Mbps/chip           | 2.9 Mbps/chip            | 38.8 Mbps/chip          | ~70 Mbps/module (innermost) | 10 kHz/ch                              | 100 kHz/ch | 3 kHz/channel                      | 25 kHz/channel                     | 10 kHz/channel, 20 kHz/inner endcap         |
| Detector channels / modules      | 1882 chips @ Stch & Ladder        | 30,856 chips, 2204 modules | 23008 chips, 1696 modules | 83160 chips, 3780 modules | 11520 chips, 720 modules | 492 modules                 | 0.96M ch, ~60000 chips, 480 modules    | 0.39M ch   | 3.38M ch, 5536 aggregation boards  | 2.24M ch, 1536 aggregation boards  | 43,176 ch (inner endcap 6912), 288 modules  |
| Avg. data volume before trigger  | 474.2 Gbps                        | 101.7 Gbps                | 298.8 Gbps               | 249.1 Gbps               | 27.9 Gbps               | 34.4 Gbps                   | 460.8 Gbps                             | 1.87 Tbps  | 811.2 Gbps                         | 2.688 Tbps                         | 24 Gbps                                     |
| Occupancy                        | 0.22e-4                           | 2.5e-4                    |                          |                          |                         | 2.8e-4                      | 58e-4                                  |            |                                    | 19.5e-4                            |                                             |
| Sum                              | 7.1 Tbps                          |                           |                          |                          |                         |                             |                                        |            |                                    |                                    |                                             |

#### Preliminary background and data rate estimation
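As a cross-check of the table, the sketch below recomputes the ECAL barrel volume from channels × hit rate × bits per hit, and sums the per-detector volumes. The names are illustrative assumptions. The per-detector numbers are the rounded table entries, so the total (about 7.04 Tbps) comes out slightly below the quoted 7.1 Tbps.

```python
# Average data volume before trigger in Gbps, copied from the table above.
AVG_VOLUME_GBPS = {
    "Vertex": 474.2, "ITKB": 101.7, "ITKE": 298.8, "OTKB": 249.1, "OTKE": 27.9,
    "TPC": 34.4, "ECAL-B": 460.8, "ECAL-E": 1870.0, "HCAL-B": 811.2,
    "HCAL-E": 2688.0, "Muon": 24.0,
}


def channel_volume_gbps(n_channels: float, hit_rate_hz: float, bits_per_hit: int) -> float:
    """Data volume in Gbps for channel-based readout."""
    return n_channels * hit_rate_hz * bits_per_hit / 1e9


# ECAL barrel: 0.96M channels, 10 kHz per channel, 48 bits per hit
ecal_b_gbps = channel_volume_gbps(0.96e6, 10e3, 48)

total_gbps = sum(AVG_VOLUME_GBPS.values())
total_tbyte_per_s = total_gbps / 8 / 1e3  # Gbps -> GB/s -> TB/s (decimal units)

print(f"ECAL-B: {ecal_b_gbps:.1f} Gbps")
print(f"Sum: {total_gbps / 1e3:.2f} Tbps = {total_tbyte_per_s:.2f} TB/s")
```

The total of roughly 0.9 TB/s is consistent with the "~1 TB/s @ Higgs" requirement stated earlier.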

### **Technology survey**



#### ATLAS Phase II



#### CMS Phase II



- A few common backend boards (ATCA)
- Network or PCIe bus readout
- GPU/FPGA acceleration at HLT
  - GPU computing power has increased roughly 1,000-fold in the last decade
- Full software trigger @ LHCb
  - Handles higher occupancy and allows more accurate tracking.




### **Our choices**

#### Fewer and cleaner physics processes @ CEPC

#### Electronics framework schema

- Full data transmission from Front-End Elec.
- Connect trigger with Back-End Elec.

#### Trigger solutions

- Hardware trigger (L1) + high-level trigger (HLT)
  - A single type of common hardware trigger board
    - Collects trigger primitives from BEE common boards
    - Sends the trigger accept signal back to BEE
  - Provides fast and normal trigger menus
  - Network readout



### **Main Technical Challenges**

- High-efficiency algorithms for triggering and background suppression
  - 2.889 MHz -> O(1k) Hz @ Higgs
  - 43.3 MHz -> O(100k) Hz @ Z
- Synchronizing trigger primitives while the data readout from the electronics is asynchronous
  - Manage data disorder caused by transfer queuing and delays
  - Align sub-detector data of each bunch crossing within limited time and resources
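The alignment problem in the last two bullets can be illustrated with a small sketch: fragments arrive out of order from several sources, are buffered by bunch crossing ID, and are released once every source has reported. This is not the CEPC design. The class `BxAligner`, its parameters, and the eviction policy (drop the oldest incomplete bunch crossing when the buffer is full) are all assumptions made for illustration.

```python
from collections import OrderedDict


class BxAligner:
    """Collect per-detector fragments and release them grouped by bunch crossing."""

    def __init__(self, sources, max_pending=1024):
        self.sources = frozenset(sources)
        self.max_pending = max_pending  # bound on buffered bunch crossings
        self.pending = OrderedDict()    # bx_id -> {source: payload}
        self.dropped = []               # bx_ids evicted while still incomplete

    def add(self, source, bx_id, payload):
        """Store one fragment; return (bx_id, fragments) once all sources reported."""
        if source not in self.sources:
            raise ValueError(f"unknown source {source!r}")
        fragments = self.pending.setdefault(bx_id, {})
        fragments[source] = payload
        if set(fragments) == self.sources:
            return bx_id, self.pending.pop(bx_id)
        if len(self.pending) > self.max_pending:
            # Limited resources: evict the oldest incomplete bunch crossing.
            old_bx, _ = self.pending.popitem(last=False)
            self.dropped.append(old_bx)
        return None


aligner = BxAligner(["ECAL", "HCAL", "Muon"])
results = [
    aligner.add("HCAL", 7, b"h7"),  # arrives first
    aligner.add("ECAL", 8, b"e8"),  # a later crossing overtakes
    aligner.add("ECAL", 7, b"e7"),
    aligner.add("Muon", 7, b"m7"),  # completes bunch crossing 7
]
print(results[-1])
```

A real system would also need a time-based flush and would run in firmware or across many nodes; the sketch only shows the bookkeeping.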

### **Previous experience with TDAQ Hardware**

- Designed the BESIII trigger system
  - Comprehensive trigger simulation, hardware design, and core trigger firmware development
- GSI PANDA TDAQ R&D
  - Designed the HPCN board for TDAQ
- Designed Belle2Link and HPCN V3 as ONSEN for Belle II
- Designed the CPPF system for CMS Phase-I
  - Designed the MTCA board; cluster finding and fanout to EMTF/OMTF
- Designing the iRPC/RPC backend/trigger for CMS Phase-II
  - ATCA common backend and trigger board

Extensive experience in TDAQ system design, algorithm and hardware development




### **Previous experience with DAQ&DCS**

- BESIII DAQ & DCS
  - Running since 2008
- Daya Bay experiment DAQ & DCS
  - Operated from 2011 to 2020
- LHAASO DAQ
  - Operated since 2019
  - Full software trigger
- JUNO DAQ & DCS
  - Two types of data stream
    - HW trigger for waveform
    - Software trigger for TQ hits
  - Online event classification


Extensive experience in DAQ&DCS development and operation, including software trigger

### **Previous experience with ML algorithm**

#### Neural network used in ATLAS global trigger

- Example: tau reconstruction at the hardware trigger level
- Train the neural network (NN) with regions of interest (ROI)
- Use hls4ml to convert the NN model into an HLS project



#### HLT Acceleration on FPGA platform



Some experience in ML algorithm development on FPGA for L1 and HLT

### **R&D efforts and results**

Started the design of an ATCA common trigger board for CEPC

- Based on a series of previously designed xTCA boards



### **Streaming Software Framework – RADAR**

heteRogeneous Architecture of Data Acquisition and pRocessing

- **V1:** deployed in LHAASO (3.5 GB/s data rate), software trigger mode
- **V2:** upgraded for JUNO (40 GB/s data rate), mixed trigger mode, containerized running
- **V3:** CEPC-oriented (~TB/s data rate), under development
- **Motivation:** high-throughput data acquisition and processing
- **Current status:**
  - Over a decade of work has led to significant progress, validated in experiments
  - **Recent focus:**
    - Heterogeneous online processing platforms with GPU
    - Real-time data processing acceleration solutions
  - **Expansion:**
    - Application across various domains (DAQ, triggering, control, etc.)
    - Integration of AI technologies (ML, NLP, expert systems, etc.)

Figure: RADAR data flow (ROS, Data Assemble, Data Flow Manager, Farm, Data Storages).

#### Started developing a new version with GPU acceleration.



- **General-purpose distributed framework**
- **Lightweight structure**
- **Plug-in modules design**
- **Microservices architecture**



### **R&D efforts and results**



Acceleration progress for waveform reconstruction and software trigger algorithm.

### **Preliminary Trigger Simulation with Cal.**

#### Physics event signature at ECAL & HCAL

- Energy deposition is relatively large and concentrated
- Trigger primitive and condition
  - Two clusters with the highest energy
  - ECAL/HCAL barrel > 0.5 GeV
  - ECAL endcap > 5 GeV
  - HCAL endcap > 50 GeV
- Trigger efficiency
  - nnaa: 100%
  - nnbb: 100%
  - nnaZ: 99.7%
  - nntautau: 96.7%
  - nnWW: 99.1%
  - nnZZ: 95.8%
  - Beambkg: 4.8%
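The trigger condition above can be written as a short decision function. The slide does not say how the two leading clusters are combined, so the sketch assumes an event is accepted if at least one of the two highest-energy clusters exceeds the threshold of its own calorimeter region. The data layout and function names are illustrative, and the toy events are made up and unrelated to the efficiencies listed above.

```python
# Thresholds in GeV from the slide, keyed by (calorimeter, region).
THRESHOLDS_GEV = {
    ("ECAL", "barrel"): 0.5,
    ("HCAL", "barrel"): 0.5,
    ("ECAL", "endcap"): 5.0,
    ("HCAL", "endcap"): 50.0,
}


def calo_trigger(clusters):
    """Decide on one event given (calorimeter, region, energy_gev) clusters.

    Assumption: accept if either of the two leading clusters passes its threshold.
    """
    leading = sorted(clusters, key=lambda c: c[2], reverse=True)[:2]
    return any(energy > THRESHOLDS_GEV[(cal, region)] for cal, region, energy in leading)


def efficiency(events):
    """Fraction of events accepted by calo_trigger."""
    return sum(calo_trigger(ev) for ev in events) / len(events)


# Toy events, not simulation output.
toy_events = [
    [("ECAL", "barrel", 1.2), ("HCAL", "endcap", 10.0)],  # barrel cluster passes
    [("ECAL", "endcap", 3.0)],                            # below the 5 GeV threshold
    [],                                                   # no clusters at all
]
print(efficiency(toy_events))
```

Requiring both leading clusters to pass would be the stricter alternative; swapping `any` for `all` (and handling empty events) expresses that variant.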



### **Preliminary Trigger Simulation with Muon**

- Left: 2000 background events (10 BX); right: 1000 ZH→nnµµ events
- Top: barrel
  - Number of hits (barrel) > 10
  - nnµµ efficiency: 100%
  - Background: 19%
- Bottom: endcap
  - Higher background hits



#### Much more simulation and study is still needed

### **Design of Hardware Trigger Structure**

#### Trigger primitive (TP)

- Extracted by BEE

#### Local detector trigger

- Sub energy and tracking...

#### Global trigger

- E-sum and tracking
- Fast trigger (FT) and L1A generation on demand

#### TCDS (Trigger Clock Distribution System)

- Distributes clock and fast control signals to BEE

Which detectors participate in the trigger still needs to be studied.



### **Preliminary design of the common Trigger Board**

#### Common Trigger board function list

- ATCA standard
- Virtex-7 FPGA
- Optical channels: 10-25 Gbps/ch
- Number of channels: 36-80
- Optical Ethernet port: 40-100 GbE
- DDR4 for mass data buffering
- SoC module for board management
- IPMC module for power management



### **Preliminary design of TCDS and Readout**

#### TCDS/TTC

- Clock, BC0, trigger, and orbit start signal distribution
- Full and ERR signals are fed back to TCDS/TTC to mask or stop L1A
- Data readout from BEE
  - Read out directly or concentrated by a DCTD board
  - Depending on the data volume
- Abbreviations
  - TCDS: Trigger Clock Distribution System
  - TTC: Trigger, Timing and Control
  - DCTD: Data Concentrator and Timing Distribution
  - BEE: Back-End Electronics
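The Full/ERR feedback described above amounts to a trigger throttle: L1 accepts are held back while any back-end board reports a problem. The sketch below shows that logic only. The class `L1AThrottle`, its method names, and the board label are hypothetical, and a real TCDS would implement this in firmware with fixed latency.

```python
class L1AThrottle:
    """Mask L1 accepts while any back-end board asserts Full or ERR."""

    def __init__(self):
        self.full = set()  # boards currently asserting Full
        self.err = set()   # boards currently asserting ERR
        self.sent = 0      # L1As distributed
        self.vetoed = 0    # L1As masked

    def update(self, board, full=False, err=False):
        """Record the latest Full/ERR status received from one board."""
        (self.full.add if full else self.full.discard)(board)
        (self.err.add if err else self.err.discard)(board)

    def request_l1a(self):
        """Return True if the L1A is distributed, False if it is masked."""
        if self.full or self.err:
            self.vetoed += 1
            return False
        self.sent += 1
        return True


throttle = L1AThrottle()
first = throttle.request_l1a()      # no board is busy
throttle.update("BEE-3", full=True)
second = throttle.request_l1a()     # masked while BEE-3 is full
throttle.update("BEE-3")            # Full released
third = throttle.request_l1a()
print(first, second, third, throttle.sent, throttle.vetoed)
```

The `vetoed` counter gives a simple handle on the dead time introduced by back-pressure.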



### **Architecture Design of DAQ**



- Fully COTS (commercial off-the-shelf) hardware
- Readout interface and protocol
  - 100 Gbps Ethernet
  - TCP or RDMA
- RADAR software framework
- Heterogeneous computing
  - GPU/FPGA acceleration for HLT
- Disk or memory buffer
  - Decouples the computing environments
  - Complete offline algorithms can be run online
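For a sense of scale, the number of 100 Gbps Ethernet links needed to carry the L1-accepted data can be estimated from the L1 rate and the event size given in the requirements. This is a rough sketch, not a network design. The 70% usable link utilization is an assumption introduced here, the 2 MB event size is the upper bound from the requirements, and the function name is illustrative.

```python
import math

LINK_GBPS = 100.0    # Ethernet link speed from the slide
EVENT_SIZE_MB = 2.0  # upper bound on the event size from the requirements


def min_links(l1_rate_hz, event_size_mb=EVENT_SIZE_MB, utilization=0.7):
    """Minimum number of links for L1-accepted data at a given link utilization."""
    gbps = l1_rate_hz * event_size_mb * 8 / 1e3  # MB/s -> Gbps (decimal units)
    return math.ceil(gbps / (LINK_GBPS * utilization))


links_higgs = min_links(1e3)  # O(1k) Hz L1 rate at Higgs
links_z = min_links(100e3)    # O(100k) Hz L1 rate at Z
print(links_higgs, links_z)
```

The Z-pole case dominates: about 1.6 Tbps of accepted data needs on the order of 20 links, before any redundancy or per-detector partitioning is considered.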



### **Preliminary design of DCS**





#### **BESIII Detector Control System**, based on LabVIEW



Framework designed on the basis of existing solutions

### **Preliminary design of ECS**

#### Main components of the Experiment Control System



#### R&D progress from JUNO and BESIII

- 3D Visualization Monitoring
- AI shift assistant based on LLM+RAG (TAOChat)
- ROOT-based Online Visualization System

Unified control and monitoring for all systems

- TDAQ, DCS, electronics, accelerator and others





### **Research Team**

#### 15 staff of IHEP TDAQ group

- DAQ
  - Fei Li (DAQ, team leader)
  - Hongyu Zhang (readout)
  - Xiaolu Ji (online processing)
  - Minhao Gu (software architecture)
- Trigger
  - Zhenan Liu (trigger schema)
  - Jingzhou Zhao (hardware trigger)
  - Boping Chen (simulation/algorithm)
  - Sheng Dong (firmware/DCS)
- DCS/ECS
  - Si Ma

#### IHEP students (20 in total)

- 2 PhD and 3 master's students
- New members planned
  - 1 staff next year
  - 2 postdocs
- Collaborators
  - Qidong Zhou (HLT, SDU)
  - Yi Liu (HLT, ZZU)
  - Junhao Yin (HLT, NKU)
  - 3 students planned
- We are looking for more collaborators

#### Gathering manpower for R&D: 9 staff and 5 students are involved part of the time



### Working plan

#### TDR related

- Basic trigger simulation and algorithm study
  - Background event study and basic algorithm scheme for each detector
- Detailed hardware trigger and interface design
- Finalize TDAQ and online design scheme

#### R&D directions

- Trigger hardware, fast control and clock distribution
- TB/s-level high-throughput software framework (RADAR)
  - FPGA/GPU acceleration and heterogeneous computing
  - Memory-based distributed buffer
- Detailed trigger simulation and algorithm
- ML/AI algorithm application
  - Trigger, data compression, AI operation and maintenance
- RoCE/RDMA readout protocol and smart NIC

Joined DRD WP7.5 (Backend systems and COTS components) as an observer.

### Summary

- Following the sub-detector design and simulation
- Completed the architecture design of TDAQ and online
- Hardware trigger plus high-level trigger is the default choice
  - No show-stopper found for the TDAQ and online scheme
- Challenges: efficient trigger algorithms and handling TB/s-level data rates at a manageable hardware scale
  - More R&D effort is needed to move forward



# Thank you for your attention!


