### 1.1 Introduction (team, history & projects) (Wei Wei, Jingbo Ye)

The main task of the CEPC electronics system is to design the front-end electronics for each sub-detector. This covers amplifying, digitizing, and processing the detector signals, reading out and storing the data, and interfacing with the trigger system. The system also distributes key signals such as clocks, collision points, and controls to synchronize and control all sub-detectors.

In this chapter, the readout requirements of each sub-detector's electronics are introduced in Section \ref{sec:requirement}, where key requirements such as background noise, data rates, and power consumption are summarized from the front-end readout schemes given in the sub-detector chapters. Section \ref{sec:sub-fee} details the front-end readout ASIC designs for each sub-detector, together with more general front-end readout chip designs. Section \ref{sec:global-arch} analyzes the overall requirements of the CEPC electronics system, clarifies the design ideas and technical choices, defines the readout strategy for the electronics and trigger systems, and provides a unified baseline solution and framework. Section \ref{sec:common-elec} introduces the key components of the universal platform, including data interfaces, front-end power modules, and back-end electronics boards. Section \ref{sec:wireless} discusses detector readout based on wireless transmission as an upgrade scheme. Sections \ref{sec:clock}, \ref{sec:power}, and \ref{sec:bee} cover detector clock synchronization, front-end power distribution, and back-end electronics algorithms and interfaces to the trigger system. Section \ref{sec:previous} presents the team members and collaboration partners, the previous R&D on electronics systems for large particle physics experiments, and the team's overall research foundation. Finally, Section \ref{sec:summary} summarizes the electronics system design and outlines the research plans.

### 1.2 Detector requirements (Wei Wei, Jingbo Ye)

For the CEPC electronics system, in addition to meeting the front-end readout requirements of each sub-detector, a complete universal electronics framework needs to be designed for the spectrometer as a whole. This allows the electronics of each sub-detector to be designed with universal interfaces and standards, which further modularizes the sub-detector electronics and improves design independence, plug-and-play capability, and upgradeability. At the system level, it also enables more efficient scheduling and communication among the systems. In the preceding sub-detector chapters, preliminary readout schemes were proposed on the basis of the preliminary universal framework provided by the electronics system. These schemes specify how the front-end chips are organized in modules, taking into account the different operation modes of the CEPC accelerator (such as Higgs, Low LumiZ, etc.) and the background estimates provided by the MDI system. These parameters, which serve as inputs to the design of the universal electronics framework, are summarized in Table \ref{tab:det-summary}.

| Sub-detector | Channels per chip | Ref. signal processing | Data width / hit | Max data rate / chip | Data aggregation | Detector channels / modules | Avg. data volume before trigger |
|--------------|-------------------|------------------------|------------------|----------------------|------------------|-----------------------------|---------------------------------|
| Vertex       | 512\*1024<br>Pixelized | XY addr + BX ID | 32 bit | 2 Gbps/chip @ Triggerless @ Low LumiZ<br>Innermost | 10~20:1 @ 2 Gbps | 1882 chips @ Stch & Ladder | 474.2 Gbps |
| Pix (ITKB)   | 512\*128 | XY addr + timing | 42 bit | Avg. 3.53 Mbps/chip<br>Max. 68.9 Mbps/chip | 14:1 @ O(100 Mbps) | 30,856 chips<br>2204 modules | 101.7 Gbps |
| Strip (ITKE) | 1024 | Hit + TOT + timing | 32 bit | Avg. 21.5 Mbps/chip<br>Max. 100.8 MHz/chip | 22:1 @ O(100 Mbps) | 23008 chips<br>1696 modules | 298.8 Gbps |
| OTKB         | 12 | ADC+TDC/ | 40~4 | Avg. 2.9 Mbps/chip<br>Max. 3.85 Mbps/chip | i. 22:1 @ O(5 Mbps)<br>ii. 7:1 @ O(100 Mbps) | 83160 chips<br>3780 modules | 249.1 Gbps |
| OTKE         | 00 | TOT+TOA | ßbit | Avg. 38.8 Mbps/chip<br>Max. 452.7 Mbps/chip | i. 22:1 @ O(50 Mbps)<br>ii. 10:1 @ O(500 Mbps) | 11520 chips<br>720 modules | 27.9 Gbps |
| TPC          | 128 | ADC + BX ID | 48 bit | ~70 Mbps/module<br>Inmost | 1. 279:1 FEE-0<br>2. 4:1 Module | 492 modules | 34.4 Gbps |
| ECAL-B       |  |  |  | Avg. 0.96 Gbps/module<br>Max. 9.6 Gbps/module | i. 4~5:1 side<br>ii. 7\*4 / 14\*4 back brd @ O(100 Mbps) | 0.96M chn<br>~60000 chips<br>480 modules | 460 Gbps |
| ECAL-E       |  |  |  | Avg. 0.96 Gbps/module<br>Max. 9.6 Gbps/module | i. 4~5:1 side<br>ii. 7\*4 / 14\*4 back brd @ O(100 Mbps) | 0.52M chn<br>~32500 chips<br>260 modules | 250 Gbps |
| HCAL-B       | 8~16<br>@common SiPM ASIC | TOT + TOA / ADC + TDC | 48 bit | Max. 144 Mbps/module-layer | < 10:1<br>(40cm\*40cm PCB - 4cm\*4cm tile - 16chn ASIC) | 3.38M chn<br>5536 aggregation boards | 811.2 Gbps |
| HCAL-E       |  |  |  | Max. 350 Mbps/module-layer | < 10:1<br>(40cm\*40cm PCB - 4cm\*4cm tile - 16chn ASIC) | 2.24M chn<br>1536 aggregation boards | 537.6 Gbps |
| Muon         |  |  |  | Avg. 15.36 Mbps/chip<br>Max. 153.6 Mbps/chip | <=24:1 @ O(400 Mbps) | 43.2k ch.<br>72 aggregation boards | ~24.1 Gbps |

Table \ref{tab:det-summary} shows that five to six front-end ASICs will be developed to read out the physical signals of the different sub-detector systems. Similar requirements have been merged to reduce the variety of ASICs developed in parallel. For example, since the ECAL, HCAL, and Muon detectors will all use SiPM devices, their common requirements will be combined into a single SiPM readout ASIC with enough universality to serve all three detectors. In addition, apart from detector systems such as the Vertex and Inner Tracker that require MAPS technology, the other front-end ASICs are planned to use a unified 55 nm/65 nm CMOS process from the same foundry. This allows circuit modules to be shared between chip designs as much as possible, which accelerates the design process and improves its reliability. This approach embodies the idea of a universal electronics framework in the design of front-end ASICs. Specific ASIC schemes are detailed in Section \ref{sec:sub-fee}.

The "Data Width / hit" and "Max Data rate / chip" columns of the table summarize the data width of the front-end chip outputs for each sub-detector, together with the average and maximum data rate per chip. These values take into account the detector background provided by the MDI system through overall simulation, as well as the arrangement of modules in each sub-detector. The data width is determined from the detector module size, the detector operating mode, and the key information to be detected, combined with the preliminary design of the front-end chip. The data rate per chip or module is then estimated from the detector area served by each channel of the front-end chip, combined with the background counting rate. This estimate is one of the key input parameters for the overall electronics system interface.

The "Data aggregation" column summarizes how the front-end chips connect to the data interface on each detector, based on the chip arrangement on the modules, the module layout, and the way the detector front-end is led out. Where a detector carries a large number of chips with a low average data rate per chip, as in the OTK and the calorimeters, an effective approach is to aggregate the data from multiple chips before readout, which saves data interfaces and cables. This preliminary approach has been considered in the sub-detector readout schemes. The data aggregation chips described in the data interface sections are designed on this basis, and the organization of front-end chips summarized in this table is a key input parameter for that design.

Finally, the "Detector Channel / module" column summarizes the preliminary number of electronics channels provided by each sub-detector, after optimization of the detector size. The "Avg Data Vol before trigger" column summarizes the total data rate of each sub-detector, including the detector background. These two columns serve as key design parameters for the TDAQ and electronics interface, and are discussed in detail in the back-end electronics and trigger system sections.
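As a rough cross-check of the scale that the TDAQ interface has to handle, the per-detector average data volumes can simply be added up. The sketch below copies the values from the last column of the summary table; the dictionary and function names are illustrative only, and the Muon entry is quoted as approximate in the table.

```python
# Average data volume before trigger per sub-detector, in Gbps,
# copied from the last column of the summary table.
AVG_VOLUME_GBPS = {
    "Vertex": 474.2,
    "Pix (ITKB)": 101.7,
    "Strip (ITKE)": 298.8,
    "OTKB": 249.1,
    "OTKE": 27.9,
    "TPC": 34.4,
    "ECAL-B": 460.0,
    "ECAL-E": 250.0,
    "HCAL-B": 811.2,
    "HCAL-E": 537.6,
    "Muon": 24.1,  # quoted as "~24.1 Gbps" in the table
}


def total_volume_gbps(volumes):
    """Sum of the average data volumes over all sub-detectors."""
    return sum(volumes.values())


def ranked(volumes):
    """Sub-detectors sorted from largest to smallest data volume."""
    return sorted(volumes.items(), key=lambda item: item[1], reverse=True)


total = total_volume_gbps(AVG_VOLUME_GBPS)
print(f"Total before trigger: {total:.1f} Gbps")
for name, gbps in ranked(AVG_VOLUME_GBPS):
    print(f"{name:13s} {gbps:7.1f} Gbps  {100 * gbps / total:5.1f} %")
```

With these inputs the total comes to roughly 3.3 Tbps, with the calorimeters and the vertex detector contributing the largest shares.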

### 1.3 Sub-detector Front-end Electronics Design (Yan Xiongbo, Chang Jinfan, Li Huaishen, etc.)

#### 5.3.3 ITK readout electronics

The required bandwidth of the readout e-link of each sensor strongly depends on the radial region it covers. For the monolithic HVCMOS pixel sensor, each hit is encoded in 48 bits. For a sensor area of 2 cm × 2 cm, the average data rate per sensor is 4 Mbps and the maximum is 70 Mbps.



Each module consists of 14 sensors bonded to a Flex Electronics Board (FEB), a choice driven by the material budget. The FEB transfers the digital signals to the data aggregation chips, which then pass them on to the optical fiber. For error-free data transmission, the FEB uses two data aggregation ASICs, the TaoTie chip and the ChiTu chip. The TaoTie chip collects the data from different sensors and transfers it serially. The ChiTu chip collects data from several TaoTie chips. Because the hit rate varies with radius, a dedicated buffer is needed in each ASIC to average out the rate variation and match the best speed of the e-link driver and transceiver inputs. The optical fibers take the data out of the detector and connect to the back-end DAQ. The ChiTu will work at a data rate of 10 Gbps, which is sufficient for the end cap at the smallest radius.
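The figures quoted above (48 bits per hit, 4 Mbps average and 70 Mbps maximum per sensor, 14 sensors per module, a 10 Gbps ChiTu link) can be combined into a quick link-budget estimate. This is a minimal sketch that ignores protocol overhead, headers, and line encoding, none of which are specified here; the function names are illustrative.

```python
BITS_PER_HIT = 48            # bits per hit for the HVCMOS pixel sensor
AVG_RATE_MBPS = 4.0          # average data rate per sensor
MAX_RATE_MBPS = 70.0         # maximum data rate per sensor
SENSORS_PER_MODULE = 14      # sensors bonded to one FEB
CHITU_LINK_MBPS = 10_000.0   # ChiTu output rate (10 Gbps)


def hit_rate_hz(rate_mbps, bits_per_hit=BITS_PER_HIT):
    """Hit rate per sensor implied by a payload data rate."""
    return rate_mbps * 1e6 / bits_per_hit


def module_rate_mbps(sensor_rate_mbps, sensors=SENSORS_PER_MODULE):
    """Payload data rate of one module if every sensor runs at the given rate."""
    return sensor_rate_mbps * sensors


def modules_per_link(sensor_rate_mbps, link_mbps=CHITU_LINK_MBPS):
    """Whole modules that fit on one link, overhead ignored (assumption)."""
    return int(link_mbps // module_rate_mbps(sensor_rate_mbps))


print(f"Average hit rate per sensor: {hit_rate_hz(AVG_RATE_MBPS) / 1e3:.1f} kHz")
print(f"Module rate at maximum:      {module_rate_mbps(MAX_RATE_MBPS):.0f} Mbps")
print(f"Modules per 10 Gbps link:    {modules_per_link(MAX_RATE_MBPS)}")
```

Even if all 14 sensors of a module ran at the maximum rate, the module would produce about 1 Gbps of payload, well within a single 10 Gbps link.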

The low-voltage (LV) and high-voltage (HV) supplies are independent for each module, given a total power consumption of 200 mW/cm$^2$. There will be two DC-DC converters on the flex for each module, since the current at 1.2 V will exceed 10 A once a margin for the conversion efficiency is included. The 48 V from the power crate is stepped down to 12 V in the first stage and to 1.2 V in the second stage, where the output capability of the DC-DC converter is 10 A. A composite cable with an LV pair, an HV pair, and an optical fiber will be used to simplify module assembly.
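The module current can be estimated from the numbers given in this section: 200 mW/cm$^2$, 14 sensors of 2 cm × 2 cm each, and a 1.2 V supply. The sketch below is only an illustration; the margin factor of 1.2 is a hypothetical value chosen for the example, and it reads the two DC-DC converters as sharing the 1.2 V load, which is an interpretation rather than something stated in the text.

```python
import math

POWER_DENSITY_W_PER_CM2 = 0.2   # 200 mW/cm^2
SENSOR_AREA_CM2 = 2.0 * 2.0     # 2 cm x 2 cm sensor
SENSORS_PER_MODULE = 14
V_LOAD = 1.2                    # volts at the sensors
DCDC_MAX_CURRENT_A = 10.0       # output capability of one converter


def module_power_w():
    """Total power dissipated by one module."""
    return POWER_DENSITY_W_PER_CM2 * SENSOR_AREA_CM2 * SENSORS_PER_MODULE


def load_current_a(margin=1.0):
    """Current drawn at 1.2 V, scaled by a design margin (hypothetical)."""
    return margin * module_power_w() / V_LOAD


def converters_needed(margin=1.0):
    """Number of 10 A converters needed to supply the load current."""
    return math.ceil(load_current_a(margin) / DCDC_MAX_CURRENT_A)


print(f"Module power:        {module_power_w():.1f} W")
print(f"Current, no margin:  {load_current_a():.2f} A")
print(f"Current, 20% margin: {load_current_a(1.2):.2f} A")
print(f"Converters needed:   {converters_needed(1.2)}")
```

Without margin the module draws a little over 9 A, just below the 10 A limit of a single converter; any realistic margin pushes it above 10 A, which is consistent with placing two converters on each module.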

For the monolithic CMOS strip sensor, the readout scheme is very similar, although the hit rate per channel in the front-end electronics is higher than for the pixel sensor.

#### 5.4.3 OTK readout electronics

The main contributions to the time resolution of a silicon sensor and its associated readout electronics are given by

$$\sigma_{t}^2 = \sigma_{\mathrm{Landau}}^2 + \sigma_{\mathrm{time\,walk}}^2 + \sigma_{\mathrm{jitter}}^2 + \sigma_{\mathrm{TDC}}^2 + \sigma_{\mathrm{clock}}^2 \qquad (2.1)$$

The first term corresponds to the event-by-event Landau fluctuations of the charge deposition within the sensor. Landau fluctuations set the ultimate (physics) limit to the time resolution. The time walk and jitter components are proportional to the rise time of the signal and inversely proportional to the signal-to-noise ratio (S/N). The fourth term is given by the resolution of the time-to-digital converter (TDC). For a TDC with a 20 ps bin, this resolution is $20\ \mathrm{ps}/\sqrt{12} \approx 5.8$ ps. Based on expression 2.1, the key to precision timing is a large signal with short rise time and low noise, which can be achieved with LGAD sensors. LGADs, shown schematically in Figure 2, are n-on-p silicon detectors with gain from an additional highly doped p-layer that creates a high electric field. LGAD sensors are thin, resulting in a small signal rise time, provide a moderate gain that increases the signal while limiting the noise, and are radiation hard.

The single-pixel readout electronics consists of a preamplifier followed by a discriminator and two TDCs. One TDC measures the time of arrival (TOA), and the other provides time-over-threshold (TOT) information for time-walk corrections. The TOA TDC is a 7-bit high-precision TDC with a 20 ps bin and a range of 2.5 ns. The 9-bit TOT TDC uses the same architecture but has a coarser binning of 40 ps and an expanded range of 20 ns.

The TDC design uses a Vernier configuration with two delay lines of 120 ps and 140 ps per cell, as shown schematically in Figure 2. The TDC time resolution in this configuration is given by the difference between the cell delays of the two lines. When a trigger signal is received, the start signal is sent to the START delay line, where each cell has a delay of 140 ps. The detection of a hit in a pixel sends the stop signal to the faster (120 ps) STOP delay line. As the signals move through the delay cells, the STOP signal eventually reaches and passes the START signal. The TOA is therefore determined by the number of stages needed for the STOP signal to surpass the START signal.
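The Vernier principle described above can be illustrated with a small behavioural model. The sketch assumes ideal, perfectly matched delay cells and integer picosecond arithmetic; the function names and the convention that the code is the first stage at which STOP has caught up with START are assumptions made for the illustration, not a description of the actual circuit.

```python
def vernier_toa_code(dt_ps, slow_ps=140, fast_ps=120, n_bits=7):
    """Number of stages after which the STOP edge has caught up with START.

    dt_ps is the delay (integer picoseconds) of the STOP edge relative to
    the START edge. Returns None if STOP does not catch up within the
    2**n_bits codes available.
    """
    for stage in range(2 ** n_bits):
        start_edge = stage * slow_ps          # START after `stage` slow cells
        stop_edge = dt_ps + stage * fast_ps   # STOP after `stage` fast cells
        if stop_edge <= start_edge:
            return stage
    return None


def code_to_time_ps(code, slow_ps=140, fast_ps=120):
    """Convert a code back to time; the bin is the cell delay difference."""
    return code * (slow_ps - fast_ps)


for dt in (0, 100, 101, 2540, 2541):
    code = vernier_toa_code(dt)
    if code is None:
        print(f"dt = {dt:4d} ps -> out of range")
    else:
        print(f"dt = {dt:4d} ps -> code {code:3d} -> {code_to_time_ps(code)} ps")
```

With 140 ps and 120 ps cells the bin is 20 ps, and 2^7 codes cover 128 × 20 ps = 2.56 ns, in line with the quoted range of about 2.5 ns.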

The jitter is often expressed in the literature as the noise-to-slope ratio, and it can be minimized by a front-end that generates an output with low noise $\sigma_v$ and high slope $\partial V/\partial t$. High slopes are obtained with a high amplitude and a short front-end peaking time (tp,fe) of the analog signal [9]. Important quantities that enter the jitter minimization are the amplifier bandwidth BW, the sensor-front-end impedance matching, and the amplifier gain. The bandwidth affects both the noise ($\sim \sqrt{BW}$) and the slope ($\sim BW$) and, ideally, the higher the BW, the lower the jitter. However, the intrinsic time response ts of UFSD sensors sets the upper limit to the maximum slope that the analog output can exhibit. As a consequence, the bandwidth should be chosen as the minimum value that retains the intrinsic sensor speed while keeping the noise low. The bandwidth defines the signal shaping of the front-end, and its optimum value for timing is obtained when tp,fe equals the sensor peaking time tp,s, which is the time needed by the signal induced by the charge collected in the sensor to reach its maximum [9].
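The bandwidth trade-off can be made concrete with a toy model. The sketch below assumes that the noise grows as the square root of the bandwidth (as for thermal noise), that the slope grows linearly with bandwidth, and that the slope saturates abruptly at a sensor-imposed limit. The hard saturation and every numerical value are hypothetical and chosen only to show the shape of the trade-off.

```python
import math


def jitter_ps(noise_mv, slope_mv_per_ns):
    """Jitter as the noise-to-slope ratio, converted from ns to ps."""
    return 1e3 * noise_mv / slope_mv_per_ns


def toy_jitter_vs_bw(bw_mhz, bw_ref_mhz=100.0, noise_ref_mv=1.0,
                     slope_ref_mv_per_ns=10.0, slope_max_mv_per_ns=50.0):
    """Toy model: noise ~ sqrt(BW), slope ~ BW up to a sensor-limited maximum.

    All reference values are hypothetical placeholders.
    """
    scale = bw_mhz / bw_ref_mhz
    noise = noise_ref_mv * math.sqrt(scale)
    slope = min(slope_ref_mv_per_ns * scale, slope_max_mv_per_ns)
    return jitter_ps(noise, slope)


for bw in (100, 200, 500, 1000, 2000):
    print(f"BW = {bw:5d} MHz -> jitter = {toy_jitter_vs_bw(bw):6.1f} ps")
```

In this model the jitter falls as the bandwidth increases until the slope reaches the sensor limit, and rises again beyond that point because only the noise keeps growing. This is the reason for choosing the minimum bandwidth that still retains the intrinsic sensor speed.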

The sensor-front-end matching is a key ingredient for a successful front-end design. The input time constant of the amplifier, defined as $\tau_{in} = C_{det} \cdot R_{in}$, must be chosen according to $\tau_s$. Fig. 1 shows the important contributions to the matching. The sensor area and thickness affect the sensor capacitance Cdet, whereas the input impedance Rin depends on the front-end. With good matching, most of the current pulse generated by the sensor enters the amplifier, which operates in current mode, defined as Is = Ie. This condition is obtained when $\tau_{in} < \tau_s$ [10]. Considering the typical UFSD value of $\tau_s \sim 400$ ps (defined by the sensor collection time) for a 55 µm thick sensor with a capacitance of Cdet = 4 pF, the front-end should be designed with Rin lower than 100 $\Omega$. For a much larger sensor capacitance, the condition $\tau_{in} < \tau_s$ is difficult to achieve because it would require too small a value of Rin. In cases where Cdet $\cdot$ Rin $\gg \tau_s$, the current Is is entirely integrated on Cdet, generating a voltage pulse at the input of the amplifier. Under these conditions, the maximum signal slope is given by the ratio between the maximum current Imax generated by the sensor and Cdet. Taking as reference a 4 pF, 55 µm thick UFSD coupled to a front-end with Rin $\gg$ 100 $\Omega$, the most probable slope for an impinging MIP (Minimum Ionizing Particle) is about 2.75 mV/ns.
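The two numerical statements in this paragraph follow from simple ratios, which the sketch below reproduces. The function names are illustrative, and the peak current of 11 µA is not given in the text: it is inferred here from the quoted slope of 2.75 mV/ns and the 4 pF capacitance.

```python
def max_input_resistance_ohm(tau_s_ps, c_det_pf):
    """Largest Rin that keeps Cdet * Rin below tau_s (ps / pF = ohm)."""
    return tau_s_ps / c_det_pf


def is_current_mode(r_in_ohm, c_det_pf, tau_s_ps):
    """True when the input time constant is shorter than the sensor time."""
    return r_in_ohm * c_det_pf < tau_s_ps


def voltage_mode_slope_mv_per_ns(i_max_ua, c_det_pf):
    """Maximum slope when the current is integrated on Cdet (uA / pF = mV/ns)."""
    return i_max_ua / c_det_pf


print(max_input_resistance_ohm(400, 4))        # Rin limit for tau_s = 400 ps, 4 pF
print(is_current_mode(90, 4, 400))             # 90 ohm * 4 pF = 360 ps
print(voltage_mode_slope_mv_per_ns(11.0, 4))   # inferred Imax of 11 uA
```

The same ratio shows why large sensors are difficult: at 20 pF the limit on Rin would drop to 20 Ω for the same 400 ps sensor time constant.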

Generally, current-mode amplifiers are the best choice for preserving both the rising and falling edges of the fast sensor pulse. However, this is feasible only with small values of Cdet. Depending on the impedance matching, the peaking time tp,s of the pulse (current or voltage) sent to the amplifier from a 55 $\mu$m thick UFSD ranges between 0.6 and 1.2 ns. If these signals are amplified by a timing-optimized front-end, the expected peaking time, given by Formula (1), will range between 0.85 and 1.7 ns.

$$t_p = \sqrt{t_{p,s}^2 + t_{p,e}^2} \;\xrightarrow{\,t_{p,s} = t_{p,e}\,}\; \sqrt{2} \cdot t_{p,s} \qquad (1)$$
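The quoted range can be checked numerically, assuming that Formula (1) is the quadrature sum of the sensor and electronics peaking times and that a timing-optimized front-end has an electronics peaking time equal to that of the sensor. The function name is illustrative.

```python
import math


def total_peaking_time_ns(tp_sensor_ns, tp_electronics_ns):
    """Quadrature sum of the sensor and electronics peaking times."""
    return math.hypot(tp_sensor_ns, tp_electronics_ns)


for tp_s in (0.6, 1.2):
    # Timing-optimized case: electronics peaking time equals the sensor's.
    tp = total_peaking_time_ns(tp_s, tp_s)
    print(f"tp,s = {tp_s:.1f} ns -> tp = {tp:.2f} ns")
```

For the matched case this reduces to a factor of √2, which maps the 0.6 to 1.2 ns sensor range onto the 0.85 to 1.7 ns range given in the text.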

Fig. 2 shows a simulation of timing optimization performed by using the model of Fig. 1 with a 4 pF sensor capacitance, the typical signals generated by 55  $\mu$ m thick UFSD of moderate gain of about 15 and a two pole amplifier with an Rin equal to 100  $\Omega$ . The optimum bandwidth corresponds to values in the range of 400–800 MHz.

The final important term to consider in the design optimization is the amplified signal amplitude, which should be maximized. In the case of trans-impedance amplifiers (TIA), this can be obtained by using a large feedback resistance Rf. However, since Rin increases with Rf, as shown in Fig. 3, too large an Rf degrades the sensor-front-end matching. For amplifiers with an open-loop gain of 40 dB, a good trade-off is achieved with Rf ~ 5–20 k$\Omega$. In this case, Rin can be reduced by maximizing the input transistor transconductance gm. At constant power consumption, this can be ensured by increasing the transistor width until it enters the weak-inversion region. Special care should be taken during the front-end layout, using techniques that minimize undesired parasitics.

### 3. The front-end architecture of FAST

The three flavors of the FAST (Fast Amplifiers for Silicon detectors for Timing) family explore different front-end designs, each suitable to reach pico-second time resolutions. The flavors are labeled Regular, EVO1 and EVO2 and have been designed by using two architectures with different bandwidths. Regular has a bandwidth of 100 MHz and aims at minimizing the jitter by reducing the noise. More details about this architecture are reported in [11].

The architecture used to design the EVO flavors is shown in Fig. 4. It is based on a pass-band TIA with two amplification stages. The first stage consists of a buffered broadband amplifier in transimpedance configuration, followed by a cascode common-source amplifier. The core of the first stage is a cascode common-source amplifier with an n-type input transistor. A technique based on two branches is used to increase the open-loop gain: the first branch (R1-M2) uses a small current (~80 $\mu$A), which keeps the value of R1 high and is important for increasing the gain. The second branch (M4-M3), consisting of a cascode current mirror, provides a current of up to 1.5 mA to M1, increasing its gm and hence also the open-loop (OL) gain. The input transistor is designed to work in weak inversion to maximize gm. With a nominal current of 1 mA, the OL gain of this stage is 35 dB. The large bandwidth is obtained by combining n-type cascodes (M2 and M8), which minimizes the Miller effect. Moreover, passive loads (R1, R2 and R5) are adopted to avoid the parasitics introduced by active components.

The broadband amplifier is buffered by a wide-band n-type source follower. The TIA is obtained with the resistor Rf connected in feedback. Three values of Rf, 5.3 k$\Omega$, 11.6 k$\Omega$ (nominal) and 31.6 k$\Omega$, can be selected by means of a dedicated configuration register. This allows three different combinations of BW, gain and Rin to be explored, as summarized in Table 1. The architecture allows the bandwidth to be tuned between 230 and 665 MHz, giving a peaking time tp,el between 0.49 and 1.2 ns.

The noise is reduced by using RC filters, R6-M9, and by a proper sizing of the input transistor. In nominal conditions, the noise at first stage output is 770  $\mu$ V RMS where the largest contribution comes from Rf itself.

| Rf [kΩ] | Bandwidth [MHz] | Rin [Ω] |
|---------|-----------------|---------|
| 5.3     | 665             | 90      |
| 11.6    | 580             | 165     |
| 31.6    | 230             | 490     |

Table 1. Main parameters defining the signal shaping of the EVO first-stage amplifier. Rf also defines the low-frequency gain.

The second stage is a cascoded common source with passive load. The input transistor works in weak inversion and its polarization is provided by a dedicated bias (R3 and R4). This stage is AC coupled to the first one, removing the low frequency variations present in the TIA output due to fabrication mismatch or leakage current in the sensors. The gain of the second stage is 13.7 dB and the bandwidth is large enough to maintain the timing properties of the first stage. With the nominal gain, the entire front-end exhibits a gain of  $\sim 5.6 \times 10^4$  V/A, bandwidth of 460 MHz and output noise of 2.7 mV RMS. The EVO flavors are based on the architecture so far described but using different devices: standard and RF transistors have been used for EVO1 and EVO2, respectively.
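The overall gain quoted above can be cross-checked from the two stage gains. The sketch assumes that the low-frequency transimpedance of the first stage is approximately equal to the nominal feedback resistance Rf = 11.6 kΩ, which holds only for a sufficiently large open-loop gain and is an approximation made for this illustration.

```python
def db_to_linear(gain_db):
    """Convert a voltage gain in dB to a linear ratio."""
    return 10 ** (gain_db / 20)


def total_transimpedance_v_per_a(rf_ohm, second_stage_db):
    """First-stage transimpedance (taken as Rf) times the second-stage gain."""
    return rf_ohm * db_to_linear(second_stage_db)


RF_NOMINAL_OHM = 11.6e3   # nominal feedback resistance
SECOND_STAGE_DB = 13.7    # gain of the second stage

gain = total_transimpedance_v_per_a(RF_NOMINAL_OHM, SECOND_STAGE_DB)
print(f"Second-stage gain:     {db_to_linear(SECOND_STAGE_DB):.2f} V/V")
print(f"Total transimpedance:  {gain:.3g} V/A")
```

The result is close to the quoted value of about 5.6 × 10^4 V/A, so the two stage gains and the overall figure are mutually consistent under this approximation.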

#### 3.1. The FAST ASICs

The flavors of the FAST family have been designed and produced in 110 nm CMOS technology. Each chip die fits an area of 1.6 mm $\times$ 5 mm and contains 20 independent channels with an input pitch of 170 µm. The channel chain consists of a front-end (REG, EVO1 or EVO2), a leading-edge discriminator, a pulse width regulator (PWR) and an LVDS driver. The PWR lengthens the time over threshold by a fixed amount to make FAST compatible with commercial TDCs. The LVDS format is used for compatibility with commercial TDCs and FPGAs. More details about the chain can be found in [11]. Fig. 5 shows the FAST chip and a 5 $\times$ 5 UFSD array connected to the test board.

### 4. FAST characterization

### 4.1. System-level simulations

The FAST family is the result of system-level studies that include all terms, from both the sensor and the electronics side, that affect the time resolution. The studies also included a detailed emulation of the MIP charge deposition in the silicon sensors, covering, for instance, the statistical effects in the sensor signal generation and the size of the sensor. An example of these studies is presented in Fig. 6, where the simulated FASTEVO2 front-end response to an 8 fC delta-shaped input signal is plotted for different Cdet values, ranging from 2 to 20 pF. The simulation shows that Cdet significantly affects the signal amplitude because it impacts the impedance matching, whereas the pulse duration does not change appreciably up to 10 pF. These matching effects are studied by connecting the FAST prototypes to different input capacitors while varying the rise time of the input signal.




Fig. 6. FASTEVO2 front-end response to an 8 fC delta-shaped input charge for a sensor capacitance ranging from 2 to 20 pF. Simulation done with two gains: $R_f = 5.3\ \text{k}\Omega$ (left) and $R_f = 11.6\ \text{k}\Omega$ (right).

#### 5.4.3 OTK readout electronics

This section describes the required performance, design, and latest prototype testing of the ASIC, which will have 128 readout channels in the future. The main design challenge is to achieve a high time resolution for the time measurement and a high charge resolution for the position measurement, in order to match the excellent performance of the LGAD. The contribution of the electronics to the time resolution comes mainly from the jitter and the time walk. The most critical aspect for the jitter is the design of the analog front-end electronics, which is composed of a transimpedance amplifier followed by a shaper and a fast discriminator. The measured time of arrival (TOA) and time over threshold (TOT) are digitized by two time-to-digital converters (TDCs) and stored in a local memory at the channel level. The charge is derived from the TOT, so the charge resolution is determined by the time resolution. The time walk is corrected by exploiting the fact that the variations in the TOA of the pulse are related to the TOT. The common digital part of the ASIC is composed of the blocks needed to generate and align the clocks, receive the slow-control commands that configure the ASIC, and transmit the digitized data.

One prototype chip, ZHULONG, has been produced so far. It integrates 8 channels, each with a preamplifier, discriminator and TDC, together with the digital components. The chip will be tested at the end of 2024.

The requirements imposed by the data-taking conditions, the sensor, and the targeted performance are presented first in Section 5.4.3.1. The ASIC architecture is described in Section 5.4.3.2, first going through the single-channel architecture and then the entire ASIC. Section 5.4.3.3 describes in detail the design of the single-channel readout electronics, followed by the description of the ASIC common digital part in Section 5.4.3.4. The radiation tolerance is described in Section 5.4.3.5 and the power distribution in Section 5.4.3.6. The performance results obtained so far on the test bench and in test beam are described in Section 5.4.3.7. The description of the monitoring can be found in Section 5.4.3.8. Lastly, a brief account of the future steps towards the completion of the design and testing of the ASIC is given in Section 5.4.3.9.

##### 5.4.3.1 General requirements

The requirements of the ASIC can be divided into two groups. The first concerns the operational environment of the ASIC, its powering, and its electrical connections; these requirements are summarized in Table 6.1. The second concerns the ASIC performance, driven by the targeted time resolution; a summary of these requirements is presented in Table 6.2.

• The target for the electronics is to be able to read out signals from 16 fC up to 50 fC throughout the lifetime.

• Each readout channel needs to match the sensor strip, with a pitch of 100 $\mu$m. It will be capable of handling up to 5 $\mu$A of leakage current from the sensor.

• The electronics jitter is required to be smaller than 30 ps for an input charge of about 16 fC, which is smaller than the intrinsic dispersion of the LGAD. A detector capacitance of about 4 pF is considered. The TDC bin size for the TOA measurement should be less than 30 ps, so that the contribution from the TDC is negligible. The time walk should be smaller than 10 ps over the dynamic range after correction.

• The charge is measured through the TOT. A resolution of 1.6 fC is necessary for a spatial resolution of 10 $\mu$m. The bin size for the TOT will be the same as for the TOA, since the delay chain is reused.

• The TOA and TOT information is transferred to the data acquisition system, so the data aggregation protocol must be integrated.

• Since the charge generated by a MIP is shared between adjacent strips, it should be possible to set the discriminator threshold at sufficiently small values of input charge. The minimum threshold (4 fC) should provide an efficiency above 95% for an input charge of 16 fC. The cross-talk between channels should be kept below 10% to make such low thresholds possible.

• The ASIC will have to withstand high radiation levels. The expected radiation levels have been presented before; including a safety factor for the electronics, they lead to a maximal TID of 1 MGy.
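The timing requirements listed above can be combined into a single electronics figure. The sketch assumes that the contributions are independent and add in quadrature, and that a TDC with bin width w contributes w/√12. It uses the upper limits from the list (30 ps jitter, 30 ps TDC bin, 10 ps residual time walk) and leaves out the sensor (Landau) and clock terms, for which no numbers are given here.

```python
import math


def bin_resolution_ps(bin_ps):
    """RMS quantization error of a uniform bin of the given width."""
    return bin_ps / math.sqrt(12.0)


def quadrature_sum(*terms_ps):
    """Combine independent contributions in quadrature."""
    return math.sqrt(sum(t * t for t in terms_ps))


JITTER_MAX_PS = 30.0      # required upper limit at 16 fC
TDC_BIN_MAX_PS = 30.0     # required upper limit on the TOA bin size
TIME_WALK_MAX_PS = 10.0   # required upper limit after correction

tdc_ps = bin_resolution_ps(TDC_BIN_MAX_PS)
electronics_ps = quadrature_sum(JITTER_MAX_PS, tdc_ps, TIME_WALK_MAX_PS)

print(f"TDC contribution:        {tdc_ps:.1f} ps")
print(f"Electronics upper bound: {electronics_ps:.1f} ps")
```

Even at the limit of the bin-size requirement, the TDC adds less than 9 ps, so the total stays close to the jitter alone, which is why the TDC contribution can be treated as negligible.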

###### Data transmission bandwidth requirements

The required bandwidth of the readout e-link of each ASIC strongly depends on the radial region it covers, as shown by the distribution of the average number of hits per ASIC in Figure ?. The number of bits per hit is 48, as described in Section ?. Each module, consisting of 8 ASICs, is connected via a flex cable to a Peripheral Electronics Board (PEB), described in Chapter 9. The PEB transfers digital signals from the flex cables to optical fibres connected to the back-end DAQ. Flex cables for modules placed at a radius above 320 mm also carry two differential e-links with luminosity data. For error-free data transmission at the bandwidths required by the expected HGTD data volume, the PEB uses the low-power GigaBit Transmission chip (lpGBT [78]). A dedicated buffer is needed in each ASIC to average the rate variation and match the best speed of the e-link drivers/lpGBT transceiver inputs:

• The largest average hit rate at small radius does not exceed 20 hits per ASIC and per event, equivalent to a rate of 500 Mbit s$^{-1}$ (not including header). In the current design, a bandwidth of up to 1.28 Gbit s$^{-1}$ was considered for the innermost-radius ASICs (up to $r \simeq 150$ mm), which includes a considerable safety margin. If further studies confirm this rate, a lower maximum bandwidth could be considered, thus reducing the number of necessary lpGBTs.

• For larger radii, a 320 Mbit s$^{-1}$ bandwidth can be used.

• For the luminosity measurement, the 12 bits of data for the counts in the larger and smaller windows are expanded to 16 bits using 6b8b encoding (see Section 6.2.1). Therefore, a 640 Mbit s$^{-1}$ e-link driver and lpGBT speed is needed.
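The luminosity link rate in the last item can be reproduced with a short calculation. The sketch assumes that one encoded word is sent per bunch crossing at 40 MHz. This crossing rate is not stated in this section; it is an assumption chosen here because it reproduces the quoted 640 Mbit/s. The function names are illustrative.

```python
def encoded_bits(payload_bits, in_bits=6, out_bits=8):
    """Bits on the link after an in_bits -> out_bits block code such as 6b8b."""
    if payload_bits % in_bits:
        raise ValueError("payload must be a whole number of code blocks")
    return payload_bits // in_bits * out_bits


def link_rate_mbit_s(bits_per_crossing, crossing_rate_mhz):
    """Link rate when one word is sent on every bunch crossing."""
    return bits_per_crossing * crossing_rate_mhz


ASSUMED_CROSSING_RATE_MHZ = 40   # assumption, not taken from this section

word_bits = encoded_bits(12)
print(f"Encoded word: {word_bits} bits")
print(f"Link rate:    {link_rate_mbit_s(word_bits, ASSUMED_CROSSING_RATE_MHZ)} Mbit/s")
```

The 6b8b code adds a 33% overhead, turning the 12-bit payload into a 16-bit word, and one such word per crossing at the assumed rate gives the 640 Mbit/s figure.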

##### 5.4.3.2 ASIC architecture

###### 5.4.3.2.1 Channel architecture

###### 5.4.3.2.2 Readout architecture

##### 5.4.3.3 Single-channel readout electronics

###### 5.4.3.3.1 Preamplifier

###### 5.4.3.3.2 Discriminator

###### 5.4.3.3.3 TDC

###### 5.4.3.3.4 Internal pulser

###### 5.4.3.3.5 Hit processor

##### 5.4.3.4 Data process and digital blocks

###### 5.4.3.4.1 Clock generation unit

###### 5.4.3.4.2 Data readout process

###### 5.4.3.4.3 Slow control

##### 5.4.3.5 Radiation tolerance

##### 5.4.3.6 Power distribution and grounding

##### 5.4.3.7 Prototype performance

###### 5.4.3.7.1 Test bench measurement

###### 5.4.3.7.2 Test beam measurement

###### 5.4.3.7.3 Irradiation tests

##### 5.4.3.8 Monitoring

###### 5.4.3.8.1 Temperature monitoring

###### 5.4.3.8.2 Supply voltages monitoring

###### 5.4.3.8.3 Complete monitoring system

##### 5.4.3.9 Roadmap towards production

### 1.4 Global architecture (Wei Wei, Jingbo Ye)

#### 1.4.1 Consideration on readout strategy

As the electronics system of a next-generation large collider experiment, the CEPC electronics benefits from designing all front-end electronics subsystems against a unified system specification. Under such a specification, the data interfaces, power interfaces, and similar services are supplied in a uniform manner, and the backend electronics can receive data, perform slow control, and configure the front-end electronics of each subdetector through a unified interface, as well as communicate with the TDAQ system. This significantly enhances the unity of the electronics system. It not only facilitates the unified design and management of the different subdetector systems but also allows the entire electronics system to be designed with maximized commonality: the electronics of a subdetector of any scale can be built modularly, simply by increasing the number of common generic modules accordingly.

To achieve this design style, it is necessary to first determine the overall strategy of the electronics and TDAQ systems. In other words, it is essential to clarify whether the electronics and TDAQ systems adopt a front-end trigger scheme or are based on a front-end triggerless readout scheme.

| Characteristics                    | FEE-Triggerless           | FEE-Trigger           | Superiority     |
|------------------------------------|---------------------------|-----------------------|-----------------|
| Where to acquire trigger info      | On BEE                    | On FEE                |                 |
| Trigger latency tolerance          | Medium-to-long            | Short                 |                 |
| Compatibility on Trigger Strategy  | Hardware / software       | Hardware only         | FEE-Triggerless |
| FEE-ASIC complexity on Trigger     | Simple                    | Complex on algorithm  |                 |
| Upgrade possibility on new trigger | High                      | Limited               |                 |
| FEE data throughput                | Large                     | Small                 |                 |
| Maturity                           | Mature but relatively new | Very mature           | FEE-Trigger     |
| Resources needed for algorithm     | High                      | Low                   |                 |
| Representative experiments         | CMS, LHCb                 | ATLAS, BELLE2, BESIII |                 |

Table 1.2: Comparison of the FEE-Triggerless readout and Trigger readout strategy

Table \ref{tab0:comp-trigger} provides a general comparison of the two typical trigger readout schemes. The front-end trigger-based approach is the more traditional one. In this method, while the front-end electronics process the detector signals, they also extract the key information usable for triggering and send it to the trigger system. At the same time, the detector data must be cached in the front-end electronics. Once the trigger system receives the key information, it generates a trigger decision based on the physics model and the corresponding trigger algorithms, and sends the decision back to the front-end electronics. The front-end electronics then compare the cached data with the trigger decision to extract valid physics events and send them to the backend electronics, which route them on to the data acquisition system.

On the other hand, the backend trigger-based approach involves digitizing the detector signals in the front-end electronics and directly transferring them to the backend electronics for caching. The trigger system only communicates with the backend electronics, and the extraction of valid detector events is done solely in the backend electronics and trigger system. The comparison in Table \ref{tab0:comp-trigger} shows that these two main electronics frameworks each have advantages and disadvantages, without a clear superiority. In typical applications, they are adopted by large particle physics experiments such as CMS and LHCb for the former, and ATLAS, BELLE2, and BESIII for the latter.

The traditional front-end trigger scheme effectively eliminates detector background and reduces the pressure on data transmission bandwidth, but it also increases the demand for data caching capacity in the front-end electronics. It usually allows only short trigger delays, and therefore requires fast trigger decisions and simple trigger algorithms. The front-end triggerless readout method, in contrast, reduces the design complexity of the front-end electronics by eliminating trigger-related logic. However, since detector background and valid events are read out together, it increases the pressure on front-end data transmission. Because the backend electronics offer more processing capability and cache space than the front-end electronics, the front-end triggerless readout scheme can implement relatively complex trigger algorithms. This relaxes the requirements on trigger delay and trigger system design, making pure software triggering possible. In China's collider spectrometer experiments, represented by BESIII, the front-end trigger-based approach is commonly adopted. Non-collider experiments, represented by JUNO and LHAASO, generally explore front-end waveform sampling schemes, but overall the implementation of their trigger algorithms still follows relatively traditional approaches such as data compression and detector information extraction.
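The trade-off between the two strategies can be illustrated with a first-order estimate of link bandwidth and front-end buffer depth. The sketch below is only an illustration: the class name, its fields, and all numerical values are hypothetical and are not CEPC design parameters.

```python
from dataclasses import dataclass


@dataclass
class FrontEndChip:
    """Hypothetical per-chip parameters for a first-order comparison."""
    hit_rate_hz: int          # raw hits per second, background included
    bits_per_hit: int
    trigger_rate_hz: int      # accepted triggers per second
    hits_per_trigger: int     # hits shipped per accepted trigger
    trigger_latency_ns: int   # time the data must wait for a decision

    def triggerless_bandwidth_bps(self) -> int:
        # Every digitized hit is shipped to the backend.
        return self.hit_rate_hz * self.bits_per_hit

    def triggered_bandwidth_bps(self) -> int:
        # Only hits matched to an accepted trigger leave the chip.
        return self.trigger_rate_hz * self.hits_per_trigger * self.bits_per_hit

    def triggered_buffer_bits(self) -> int:
        # Hits must be cached on-chip for the full trigger latency.
        return (self.hit_rate_hz * self.trigger_latency_ns
                * self.bits_per_hit) // 1_000_000_000


# Illustrative numbers only.
chip = FrontEndChip(hit_rate_hz=10_000_000, bits_per_hit=48,
                    trigger_rate_hz=100_000, hits_per_trigger=2,
                    trigger_latency_ns=10_000)
print(chip.triggerless_bandwidth_bps(),
      chip.triggered_bandwidth_bps(),
      chip.triggered_buffer_bits())
```

With these illustrative inputs the triggerless link carries far more data, while the triggered chip must hold on-chip every hit that arrives during the trigger latency, which is the qualitative trade-off described above.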

#### 1.4.2 Baseline architecture for the Electronics-TDAQ system

### 1.5 Common Electronics interface

#### 1.5.1 Data interface (Di Guo, Xiaoting Li, Jingbo Ye)

1.5.1.1 General requirements and overall architecture

Table 1 Requirements

Figure x.1 shows the overall architecture of the data transmission interface, comprising four ASICs and an optical module. The ASICs are TaoTie, ChiTu, KinWooLDD, and KinWooTIA, while the optical module, KinWooTRX, consists of four KinWooLDDs, four KinWooTIAs, and VCSEL and PD arrays. TaoTie functions as a data pre-aggregation ASIC for multiple channels of readout electronics in different sub-detectors. ChiTu serves as a bi-directional data transceiver with both an uplink and a downlink. The uplink receives the pre-aggregated data, performs encoding and serialization, and transmits the data to KinWooTRX. The downlink receives clock and slow control signals, performs de-serialization and decoding, and transmits data to blocks on ChiTu and to other chips. KinWooLDD and KinWooTIA are the optical driver and receiver designed specifically for the VCSEL and PD arrays, respectively. They will be integrated and assembled within the optical module.



Figure x.1 Overall architecture of the data interface

### 1.5.1.2 Optical data transmission ASICs

1.5.1.2.1 TaoTie: Front-end data pre-aggregation

TaoTie employs a self-compatible scheme to handle the variety of data channels and rates from the front-end sub-detectors. Each TaoTie can serialize a maximum of 16 channels into 1 channel, with configurable modes that serialize 2, 4, or 8 channels into 1 channel. Moreover, an N-stage TaoTie cascade can serialize 16<sup>N</sup> channels into 1 channel. Based on the current requirements, 2 stages are likely to be sufficient. The final serial output data rate should match the input data rate requirement of ChiTu, which is 1.3856 Gbps based on the 43.3-MHz system clock.
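The number of cascaded TaoTie stages follows directly from the 16-to-1 fan-in. The helper below is a minimal sketch of that arithmetic; the function name is illustrative, and the example channel counts are not taken from any sub-detector requirement.

```python
def taotie_stages(n_channels: int, fan_in: int = 16) -> int:
    """Smallest N such that fan_in**N >= n_channels (0 if a single channel)."""
    if n_channels < 1:
        raise ValueError("need at least one input channel")
    stages, capacity = 0, 1
    while capacity < n_channels:
        capacity *= fan_in   # each extra stage multiplies the reach by fan_in
        stages += 1
    return stages


for channels in (16, 17, 256, 257):
    print(channels, taotie_stages(channels))
```

Two stages cover up to 256 input channels per ChiTu input, which is consistent with the expectation that 2 stages will be sufficient.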

1.5.1.2.2 ChiTu: Bi-direction data interface

ChiTu primarily comprises a flexible high-precision clock system, a high-speed serializer, deserializer, data builder, configuration capabilities, and monitoring. The 8-channel data at 1.3856 Gbps received from TaoTie is processed by D-links in ChiTu. It undergoes alignment by phase aligners, encoding, DC-balancing by a data builder, serialization to a data rate of 11.0848 Gbps, and is ultimately transmitted to KinWooTRX. The high-quality clocks imperative for the serializer are generated by an LC phase-locked loop (PLL), which can also offer several configurable output frequencies and phases externally. The de-serializer in ChiTu receives control signals, including fast command, at a data rate of 2.7712 Gbps. The circuit recovers data and clock signals through an integrated clock data recovery (CDR) mechanism.
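All data rates quoted for the data interface are integer multiples of the 43.3-MHz system clock. The sketch below reproduces them with integer arithmetic; the multipliers (32, 64, 128, 256) are inferred here by dividing the quoted rates by the clock frequency and are not stated explicitly in the text.

```python
SYSTEM_CLOCK_KHZ = 43_300  # 43.3 MHz system clock, from the text

# Multipliers inferred from the quoted rates (an assumption of this sketch).
MULTIPLIERS = {
    "taotie_output": 32,     # 1.3856 Gbps
    "downlink": 64,          # 2.7712 Gbps
    "downlink_fast": 128,    # 5.5424 Gbps
    "uplink": 256,           # 11.0848 Gbps
}


def rate_kbps(name: str) -> int:
    """Line rate in kbit/s for a named link."""
    return SYSTEM_CLOCK_KHZ * MULTIPLIERS[name]


for name in MULTIPLIERS:
    print(f"{name}: {rate_kbps(name) / 1e6:.4f} Gbps")

# Eight TaoTie lanes fill exactly one ChiTu uplink.
uplink_matches_lanes = 8 * rate_kbps("taotie_output") == rate_kbps("uplink")
```

The last line checks that eight 1.3856 Gbps inputs add up to the 11.0848 Gbps uplink rate.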


1.5.1.2.3 KinWooLDD: VCSEL array driver

KinWooLDD is a four-channel vertical-cavity surface-emitting laser (VCSEL) driver. Each channel mainly comprises an input equalizer stage, a pre-driving stage, and an output-driving stage. The pre-driving stage is composed of four stages, with two stages sharing an inductor to extend the bandwidth beyond 10 GHz. The output driver receives fast wide-swing differential signals from the pre-driver and converts them into a single-ended current to drive the VCSEL. This design achieves a transmission data rate of up to 14 Gbps in a 55-nm technology, meeting the current data transmission requirement of the CEPC (11.0848 Gbps per fiber). To increase the driving bandwidth further, active-feedback and feed-forward equalizer (FFE) pre-emphasis techniques can be implemented in the pre-driver and output driver, respectively.

#### 1.5.1.2.4 KinWooTIA: PD array receiver

KinWooTIA is a four-channel photodiode (PD) receiver with a data rate capability exceeding 2.7712 Gbps or 5.5424 Gbps per channel for standard and enhanced requirements, respectively. Each channel consists primarily of a transimpedance amplifier (TIA), limiting amplifier (LA), and driver stage.

#### 1.5.1.2.5 KinWooTRX: Optical module

KinWooTRX is an optical module consisting of a KinWooLDD, a KinWooTIA, a four-channel VCSEL array, a four-channel PD array and a carrier board for the optocoupler devices. The height requirement is xxx mm.

#### 1.5.1.3 Prototype performance

Prototype circuits have been developed to assess functionalities and performance. Figure x.2 illustrates the block diagram of the BDTIC (bi-direction transceiver integrated circuit) chip, which is a prototype design of ChiTu. Figure x.3 presents the test platform comprising a test board with a wire-bonded BDTIC die, a clock board, power supplies, an oscilloscope, a spectrum analyzer, a bit error rate tester (BERT), and a computer. Figure x.4 shows measured performance including jitter and phase noise performance of the PLL (as shown in (a) and (b)), jitter and phase noise performance of the CDR recovered clock (as shown in (c) and (d)), and eye diagrams of the serializer and deserializer (as shown in (e) and (f)). The results indicate that the building blocks meet the characterization requirements of ChiTu.



Figure x.2 Block diagram of the prototype BDTIC chip



Figure x.3 Test platform of the BDTIC chip



(e) 10.24-Gbps eye diagram of the serial output (f) A 160-Mbps eye diagram of the de-serializer

Figure x.4 Performance measurements of the BDTIC chip



1.5.1.3.1 Clock blocks (PLL, CDR, Phase aligner...)

(b) Eye and noise histogram of the divide-by-2 output (#1 board)



1.5.1.3.2 SerDes blocks

1.5.1.3.3 Optical data interface

1.5.1.3.4 Periphery blocks (SPI, I2C...)

1.5.1.3.5 Irradiation tests

1.5.1.4 ... Package? Power consumption?

1.5.1.5 Roadmap towards production

#### 1.5.2 Power module (Jun Hu, Jia Wang, Jingbo Ye)

### 1.8 Power supply distribution

#### 1.8.1 Power supply distribution overview

The power supply is a crucial element in electronic systems, as it has a direct impact on their overall performance and functionality. Figure 1 shows the power distribution scheme for the CEPC low-voltage supply system.

The electronics room, which requires no radiation or magnetic-field protection, houses both the COTS (Commercial Off-The-Shelf) AC-DC power supply and a custom-designed DC-DC converter. The DC-DC converter steps the voltage down from 110V to 48V, supplying the necessary power for the various components within the system.

The frontend detectors require lower voltage levels, specifically 1.2V for the analog readout chips and 2.5V for the digital transmission chips. These low voltages are critical for the operation of the frontend boards, which form part of the detector readout electronics. These boards are typically installed in environments subject to radiation and magnetic fields, so they must not only receive appropriate power levels but also be designed and shielded to withstand these conditions. Proper power distribution and voltage regulation in these environments is vital for the reliable operation of the detector systems and, ultimately, for the accuracy of the data they collect.



Figure 1: CEPC low-voltage power supply distribution

We will implement a two-stage DC-DC conversion architecture to efficiently manage voltage levels for our system.

In the first stage, the Basha48 module will receive a 48V input from the long cable that extends from the electronics room. This module is responsible for stepping down the voltage to 12V. The Basha48 module will be strategically installed near the end cap, reducing the length of cable that must carry high voltage, thereby minimizing potential voltage drop and ensuring efficient power delivery.

In the second stage, the 12V output from the Basha48 module will be delivered to the frontend board via finer, more manageable cables. On the frontend board, the Basha12 module will further convert the 12V supply into the specific lower voltages required for the operation of the frontend detectors, namely 1.2V for the analog readout chips and 2.5V for the digital transmission chips.

This two-stage conversion approach not only optimizes the efficiency of power distribution but also allows for a compact design that minimizes radiation exposure to sensitive electronic components in the frontend. It ensures that the necessary voltages are precisely regulated, maintaining the performance integrity of the detector readout electronics while adapting to the challenging environmental conditions present in radiation-prone areas.
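The benefit of carrying 48V rather than 12V over the long cable can be estimated from the resistive loss, which scales as the square of the current. The sketch below is a first-order estimate only: the load power and cable resistance are hypothetical values chosen for illustration, and the extra current needed to supply the loss itself is neglected.

```python
def cable_loss_w(load_power_w: float, bus_voltage_v: float,
                 cable_resistance_ohm: float) -> float:
    """Resistive loss I^2 * R for a load drawing I = P / V through the cable."""
    current_a = load_power_w / bus_voltage_v
    return current_a ** 2 * cable_resistance_ohm


LOAD_POWER_W = 100.0        # hypothetical load behind the cable
CABLE_RESISTANCE_OHM = 0.5  # hypothetical round-trip cable resistance

loss_48v = cable_loss_w(LOAD_POWER_W, 48.0, CABLE_RESISTANCE_OHM)
loss_12v = cable_loss_w(LOAD_POWER_W, 12.0, CABLE_RESISTANCE_OHM)
print(f"48 V: {loss_48v:.2f} W, 12 V: {loss_12v:.2f} W, "
      f"ratio: {loss_12v / loss_48v:.1f}")
```

For the same delivered power, the loss at 12V is (48/12)² = 16 times the loss at 48V, independent of the assumed load and resistance, which is why the 12V section of the cabling is kept short.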

#### 1.8.2 COTS power supply



Since the first two stages of the power supply are installed in an environment free from radiation and strong magnetic fields, commercial power supplies can be procured. In order to enable power control, the power supply must provide the following interfaces to the DCS system:

- 1) On/Off Control Interface: A simple input to allow the DCS to turn the power supply on or off.
- 2) Voltage Adjustment Interface: An adjustable output voltage interface that lets the DCS set the required voltage levels.
- 3) Current Monitoring Interface: An output that provides real-time current readings to the DCS for monitoring purposes.
- 4) Temperature Monitoring Interface: An interface to relay temperature data to the DCS, allowing for effective thermal management.
- 5) Fan Speed Control Interface: The ability to control and adjust fan speeds based on system requirements to ensure proper cooling.

These interfaces will facilitate seamless integration with the DCS, ensuring effective monitoring and control of the power supply operation.

#### 1.8.3 Basha DC-DC conversion module

To ensure that the power management module operates stably under radiation and provides a consistent voltage for the frontend electronics, as required by the CEPC experiment, we need to study the key technologies for radiation-hardened power modules and design dedicated radiation-hardened power management modules.

Power management modules provide energy for electronic devices and are a cornerstone of their proper operation. The power management modules in common use today are DC-DC converters, which are primarily categorized into buck, boost, and buck-boost converters. Buck converters are among the most prevalent topologies in DC-DC power circuits and are characterized by high efficiency and high power density. They are critical components in the power systems of the various CEPC detectors, where they step the high voltage used for power transmission down to the lower voltages required by the frontend electronics. Transmitting power at high voltage and low current reduces the number of cables, minimizes voltage loss, and enhances the power efficiency of the detector system. However, key blocks of a buck converter, such as the reference, the pre-buck, and the on-chip LDO (Low Dropout Regulator), are susceptible to radiation effects, which may lead to level drift and circuit failures. It is therefore crucial to implement specific radiation-hardening measures for the different blocks.



Since the buck-type DC-DC converter is located within the detector and is subjected to high radiation and magnetic field intensity, magnetic-core inductors cannot be utilized. Additionally, due to area and space constraints, the energy storage inductance values are relatively low. However, the frontend electronics impose stringent requirements on power noise performance, necessitating a sufficiently low ripple voltage. Traditional methods, which involve increasing inductance values to reduce ripple voltage, are no longer feasible. Instead, the ripple voltage can only be minimized by increasing the switching frequency. The proposed DC-DC controller aims to operate at switching frequencies exceeding the MHz range. Furthermore, the radiation environment created during detector operation can induce soft errors or even hard failures, such as burnout, in silicon-based circuits. Hence, the designed DC-DC controller must withstand both ionizing and non-ionizing radiation effects.
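The link between inductance, switching frequency, and ripple can be made concrete with the textbook relations for an ideal buck converter in continuous conduction: the inductor ripple current is ΔI = V_out(1 − V_out/V_in)/(L·f_sw), and the capacitive part of the output ripple is ΔV ≈ ΔI/(8·C·f_sw). The sketch below evaluates them; the inductance and capacitance values are hypothetical, and capacitor ESR and other non-idealities are ignored.

```python
def inductor_ripple_a(v_in: float, v_out: float,
                      inductance_h: float, f_sw_hz: float) -> float:
    """Peak-to-peak inductor current ripple of an ideal buck converter."""
    duty = v_out / v_in
    return v_out * (1.0 - duty) / (inductance_h * f_sw_hz)


def output_ripple_v(ripple_a: float, capacitance_f: float,
                    f_sw_hz: float) -> float:
    """Peak-to-peak output voltage ripple (capacitive term only, no ESR)."""
    return ripple_a / (8.0 * capacitance_f * f_sw_hz)


V_IN, V_OUT = 12.0, 1.2   # second conversion stage, from the text
L_AIR_CORE = 200e-9       # hypothetical small air-core inductor
C_OUT = 100e-6            # hypothetical output capacitance

for f_sw in (1e6, 2e6, 4e6):
    di = inductor_ripple_a(V_IN, V_OUT, L_AIR_CORE, f_sw)
    dv = output_ripple_v(di, C_OUT, f_sw)
    print(f"{f_sw / 1e6:.0f} MHz: dI = {di:.2f} A, dV = {dv * 1e3:.2f} mV")

ripple_1mhz = inductor_ripple_a(V_IN, V_OUT, L_AIR_CORE, 1e6)
ripple_4mhz = inductor_ripple_a(V_IN, V_OUT, L_AIR_CORE, 4e6)
```

With the inductance fixed, the current ripple falls in proportion to 1/f_sw and the capacitive voltage ripple in proportion to 1/f_sw², which is why raising the switching frequency is the remaining handle when large inductors are excluded.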

This project will define the design criteria for the DC-DC voltage converter based on the typical power requirements of frontend electronics circuits. Building on research and a review of current literature regarding radiation-resistant buck-type DC-DC voltage converters, a study will be conducted on the structure of the controller system. The focus will be on parallel output current capabilities, high switching frequency, and efficient circuit structures, along with specific circuit implementation methods. System-level modeling and simulation will be employed to optimize the controller's system architecture. On this foundation, research will delve into key circuit modules, analyzing the switching losses of each module to ensure that the power system maintains an efficiency above 85%. Additionally, protection circuits for overheating, overcurrent, overvoltage, and undervoltage, as well as auxiliary circuits for power status indication, will be developed to ensure that the voltage converter can promptly cut off power to protect itself and its load modules during abnormal operating conditions.

The design of the circuit and layout will simultaneously consider radiation-hardening methods. Specific studies will be undertaken on radiation-sensitive components, such as bandgap references, error amplifiers, and switch timing controllers, focusing on TID (Total Ionizing Dose), SET (Single Event Transient), and SEL (Single Event Latchup) hardening strategies. Verification of the designed circuits will incorporate approaches such as fault injection. Finally, research on radiation testing methods for the DC-DC voltage converter will be conducted, designing TID and SEE radiation testing methods for the buck-type DC-DC voltage converter based on existing radiation testing standards.



Figure 7: Thermal simulation results of the DC-DC module

#### 1.8.4 Cable, connector

1.5.2 Power interface

|                      | Vertex | Pix Tracker | Si Strip | TPC | TOF | ECAL | HCAL |
|----------------------|--------|-------------|----------|-----|-----|------|------|
| Detector for readout | CMOS Sensor | HVCMOS | Si Strip | Pixel PAD | Strip-LGAD | SiPM | SiPM |
| Main Func for FEE    | X+Y | XY + nsT | X | E + nsT | X + 50psT | E + 400psT | E + 400psT |
| Channels per chip    | 512*1024 Pixelized | 768*128 (2cm*2cm@25um*150um) | 128 | 128 | 128 | 16 | 16 |
| Voltage @chip        | 1.2V@65nm | 1.2V@55nm (HVCMOS Pixel) | 1.2V@130nm (unified voltage, low cost) | 1.2V@65nm | 1.2V@55nm (TDC) | 1.2V@55nm (TDC) | 1.2V@55nm (TDC) |
| Power @chip          | 200mW/chip | <200mW/cm2; <0.8W/chip | 336mW/chip | 35mW/chip | <20mW/ch; <2.56W/chip | 15mW/ch | 15mW/ch; 160~320mW/chip |
| chips@module         | 8~29@ladder; 4~25@layer | 14 | 9~22@ladder; 48~299@sector | 1115 | 22 (barrel); 11-23@sector (endcap) | 1000ch*2 | 480ch*5pcb (barrel); 5832ch@sector (endcap) |
| Power @module        | 2.6~6.8W@ladder; 10.4~170W@layer | 11.2W | 4~8.4W@ladder; 23~119W@sector | 39.7W | 56.32W (barrel); 58.9W (endcap) | 31W@module | 37W@module (barrel); 88W@sector (endcap) |
| Number of PW module  | 66 | 2204 | 192 | 496 | 3780 + 720 | 480 + 260 | 5536 + 1536 |
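Several module-level entries in the table follow from multiplying the per-chip power by the number of chips per module. The sketch below repeats that arithmetic for the entries where both factors are given as single values; since the per-chip figures are quoted as upper bounds, the results are upper bounds as well. The function name is illustrative.

```python
def module_power_w(chips_per_module: int, power_per_chip_w: float) -> float:
    """Module power as chips x per-chip power, rounded to 10 mW."""
    return round(chips_per_module * power_per_chip_w, 2)


# TOF: 128 channels per chip at <20 mW per channel.
tof_chip_w = round(128 * 0.020, 2)

pix_tracker_module_w = module_power_w(14, 0.8)        # table: 11.2 W
tof_barrel_module_w = module_power_w(22, tof_chip_w)  # table: 56.32 W
tof_endcap_module_w = module_power_w(23, tof_chip_w)  # table: 58.9 W (max)

print(tof_chip_w, pix_tracker_module_w,
      tof_barrel_module_w, tof_endcap_module_w)
```

The computed values agree with the Pix Tracker and TOF columns of the table to within rounding.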

According to the requirements of the frontend chips of the detector, the power supply module needs to provide a high-quality 1.2V power output. In addition, the data transmission chips require a separate 2.5V digital power supply. The basic parameters are as follows:

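| Parameter      | Rated value            | Actual range                                      |
|----------------|------------------------|---------------------------------------------------|
| Input voltage  | 48V                    | 36V-48V                                           |
| Output voltage | 1.2V                   | 1.2V, 2.5V                                        |
| Output current | 10A                    |                                                   |
| Output ripple  | 10mVpp                 |                                                   |
| Efficiency     | [illegible]            | 80%-85%-80% (light load, rated load, heavy load)  |
| Dimensions     | 50mm × 20mm × 6.7mm    | Including heat sink and shielding                 |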
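The thermal load of one module can be estimated from the rated output and the conversion efficiency. The sketch below assumes the 85% efficiency target mentioned in the text and the rated 1.2V, 10A output and 48V input from the table; the function name and the returned dictionary keys are illustrative.

```python
def converter_budget(v_out: float, i_out: float,
                     efficiency: float, v_in: float) -> dict:
    """Output power, input power, dissipated power and input current."""
    p_out = v_out * i_out
    p_in = p_out / efficiency
    return {
        "p_out_w": p_out,
        "p_in_w": p_in,
        "loss_w": p_in - p_out,   # heat that the module must remove
        "i_in_a": p_in / v_in,
    }


budget = converter_budget(v_out=1.2, i_out=10.0, efficiency=0.85, v_in=48.0)
for key, value in budget.items():
    print(f"{key}: {value:.3f}")
```

At the rated operating point this gives roughly 2 W of heat per module, which is the quantity the heat sink included in the module dimensions has to handle.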



The basic structure of the designed module is shown in the figure. The output signals include:

### 1.6 Backup scheme based on Wireless communication (Jun Hu)

#### Technology study

1) Traditional Wi-Fi Solutions. Traditional Wi-Fi solutions operate primarily in the 2.4 GHz and 5 GHz frequency ranges. Their key advantages are mature technology and widespread adoption, with the most common standards for wireless local area networks defined by the IEEE in the 802.11 series, commonly referred to as Wi-Fi.

However, there are notable drawbacks associated with these solutions. The transmission density is limited due to the small frequency range and restricted number of channels available. Additionally, the inability to miniaturize antennas at high power levels further contributes to lower transmission density. Despite these limitations, traditional Wi-Fi is easier to develop and therefore well-suited for applications where bandwidth requirements are not particularly demanding, such as for slow control data and small system testing.



Test setup based on Raspberry board

2) Millimeter Wave Solutions. Frequency ranges are commonly classified as follows: 0.3-30 GHz is the microwave band, 30-300 GHz is the millimeter wave band, and 0.1-10 THz is the terahertz band. The transmission technologies for future 6G and Wi-Fi 7 are likely to be based on the millimeter wave and terahertz bands. These frequencies are far higher than those of traditional Wi-Fi, which substantially addresses the bandwidth issue and provides a theoretically excellent solution. However, higher bandwidth also brings increased power consumption and cost. More importantly, since these technologies are still in the development stage, the relevant chips have not yet been commercialized on a large scale. The technical difficulties are considerable, and strict technology restrictions from foreign entities raise the barriers and costs of manufacturing. At this stage, this project will focus on collaborating with domestic research institutes or large commercial companies, such as Huawei, to explore the possibility of system development using existing or upcoming millimeter wave and terahertz chips.



3) Research on Wireless Optical Communication. Unlike traditional fiber optic communication, wireless optical communication uses air as the transmission medium: data is loaded onto a light source and transmitted directly by altering properties such as light intensity, phase, and polarization. Depending on the frequency band of the carrier, wireless optical communication can be subdivided into infrared (IR), visible light (VL), and ultraviolet (UV) communication. Among these, visible light communication is currently the fastest-growing area, with transmission rates exceeding Gbps, which can meet our requirements. Additionally, due to the high directionality of laser channels, natural channel isolation can increase the number of available channels. The main research topics in visible light communication include LED and detector materials, modulation coding, signal processing, and heterogeneous networking.

For a simple visible light communication system, key components include an FPGA, which is the most commonly used programmable device. This enables the development of basic network transmission protocols, and some models also feature high-speed serial transceivers (serdes), making them suitable for prototype validation. As for the light sources, LEDs and lasers are commonly employed in visible light communication systems. LEDs emit light through the principle of spontaneous emission, serving as incoherent light sources with narrow modulation bandwidth but wide divergence angles. Conversely, lasers emit light through stimulated emission, allowing for higher bandwidth suitable for high-speed long-distance transmission, although they require precise alignment. Both light sources have unique characteristics, each suitable for specific application scenarios.

Regarding detectors, commonly used types include PIN and APD detectors, as well as image sensors. Generally, PIN and APD detectors are favored in high-speed visible light communication systems, while image sensors are employed in low-speed, multi-input, and multi-output visible light communication systems. According to the properties of semiconductor PN junctions, LEDs can also be utilized as detectors. In this research, a laser combined with an APD detector is planned to be used as the optoelectronic device.





### 1.7 Clocking systems (Jun Hu)

The clock system needs to fulfill two primary functions:

Provide a Reference Clock for Detector Electronics: This clock will serve as the baseline frequency for sampling, time measurement, and energy measurement across all the detectors.

Offer Absolute Timestamping for Detectors: The trigger system must receive temporally correlated physical data. Experience from the Daya Bay reactor neutrino experiment has shown that precise timing information is essential for the effective operation of an experiment.

White Rabbit Technology

White Rabbit technology, initiated by CERN and GSI in 2008, uses Synchronous Ethernet to distribute a common frequency to multiple nodes. It employs the precision time protocol (PTP) for timestamp synchronization and all-digital dual heterodyne phase detectors to enhance the synchronization accuracy to sub-nanosecond levels. This technology offers an excellent solution for the timing system requirements of the Jiangmen neutrino experiment:

Synchronous Ethernet: This can provide a low-jitter clock as the operational clock for the frontend electronics.

Phase Difference Measurement and Compensation: This will ensure that all nodes' operational clocks reach sub-nanosecond phase alignment accuracy.

Precision Time Protocol: It will facilitate the alignment of time stamps for the frontend electronics.

External Reference Integration: The system can incorporate rubidium clocks and GPS as external reference frequencies and timestamps for Coordinated Universal Time (UTC).

This state-of-the-art clock system will significantly enhance the performance and accuracy of various detector operations in the neutrino experiment.
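The timestamp alignment performed by PTP rests on a two-way exchange of four timestamps. The sketch below shows the standard offset and delay calculation under the assumption of a symmetric link; White Rabbit additionally compensates link asymmetry and refines the result with the phase measurement, which is not modelled here. The function name and the example timestamps are illustrative.

```python
def ptp_offset_and_delay(t1: int, t2: int, t3: int, t4: int):
    """Two-way time transfer, assuming equal delay in both directions.

    t1: master sends Sync            (master clock)
    t2: slave receives Sync          (slave clock)
    t3: slave sends Delay_Req        (slave clock)
    t4: master receives Delay_Req    (master clock)
    Returns (slave clock offset, one-way delay) in the timestamp unit.
    """
    forward = t2 - t1    # one-way delay + offset
    backward = t4 - t3   # one-way delay - offset
    offset = (forward - backward) / 2
    delay = (forward + backward) / 2
    return offset, delay


# Example in ns: slave clock is 250 ns ahead, one-way delay is 500 ns.
offset_ns, delay_ns = ptp_offset_and_delay(t1=1000, t2=1750, t3=2000, t4=2250)
print(offset_ns, delay_ns)
```

The slave then subtracts the computed offset from its local clock so that timestamps from all frontend nodes refer to the same time base.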



system main clock

Center clock generator

Clock fanout

### 1.8 Power system (Jun Hu)

### 1.9 General Backend Electronics System (Jun Hu, Wei Wei)

The backend electronics system plays a pivotal role in the overall data acquisition process by receiving raw detector data from the frontend electronics. Once the raw data is received, the backend electronics is responsible for performing initial data processing and compression. This processing not only prepares the data for further analysis but also generates trigger signals that are essential for the trigger system's operation. The connections between the backend electronics system and other interfacing systems are depicted in Figure 2.



Figure 2: General Backend electronics system connection

The trigger signals, once generated, are forwarded to the trigger system to facilitate the triggering process. Concurrently, the backend electronics system caches the raw data for a specified duration, so that the data remains available while the decision is being made. After a predetermined delay, the trigger system provides the backend electronics with the final trigger decision. Using this information, the backend selects the relevant data from its buffer according to the final trigger criteria. The selected data is then packaged and sent to the Data Acquisition (DAQ) system for analysis and storage. Furthermore, the backend electronics system interfaces with the Detector Control System (DCS), ensuring effective monitoring and control of both the frontend detectors and the backend electronics.
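The cache-then-select behaviour of the backend can be sketched with a fixed-depth buffer of timestamped fragments. This is a conceptual illustration only: the class, its methods, the buffer depth, and the matching window are hypothetical, and a real implementation would live in FPGA firmware and DDR memory rather than in software.

```python
from collections import deque


class BackendBuffer:
    """Fixed-depth cache of (timestamp, payload) fragments."""

    def __init__(self, depth: int):
        # The oldest fragments are dropped automatically when the buffer is full.
        self._fragments = deque(maxlen=depth)

    def push(self, timestamp: int, payload: str) -> None:
        self._fragments.append((timestamp, payload))

    def select(self, trigger_time: int, window: int) -> list:
        """Payloads whose timestamp lies within +/- window of the trigger."""
        return [payload for timestamp, payload in self._fragments
                if abs(timestamp - trigger_time) <= window]


buffer = BackendBuffer(depth=4)
for ts in range(0, 60, 10):          # fragments at t = 0, 10, ..., 50
    buffer.push(ts, f"frag@{ts}")

accepted = buffer.select(trigger_time=30, window=10)
too_late = buffer.select(trigger_time=0, window=5)
print(accepted, too_late)
```

The second query returns nothing because the fragment at t = 0 has already been overwritten, which illustrates why the buffer depth must cover the full trigger latency.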

In addition to data handling, the backend electronics system also communicates with the synchronization clock distribution system. This interaction is crucial for ensuring that clock and timing information remains synchronized across all components, enabling precise data acquisition and processing.

All of these connections use optical fiber, which provides the high-speed, high-bandwidth communication required for efficient data transfer in a high-throughput environment. Optical links also mitigate electromagnetic interference (EMI) and allow for the necessary distance between components in radiation-sensitive areas.

#### 1.9.1 Backend Electronics Hardware Design

The backend electronics system is built around the xTCA family of standards, which supports both functionality and scalability. Each rack accommodates two ATCA (Advanced Telecommunications Computing Architecture) crates, and each crate houses ten general backend electronics boards. The remaining rack space is used for optical patch panels, power supplies, and other auxiliary equipment.



Figure 3: General backend board structure

The hardware components of the custom backend electronics board are illustrated in Figure 3. They consist primarily of the following sections:

1) FPGA: Serving as the core processing unit of the backend electronics board, the FPGA (Field Programmable Gate Array) controls and executes the majority of the backend electronics functions.
2) Clock Jitter-Cleaner: This component removes jitter from the timing signals, maintaining signal integrity across the high-speed data connections.
3) DDR Memory: Double data rate (DDR) SDRAM provides the memory needed for data buffering and temporary storage during processing.
4) Power Management: This section manages power distribution so that all hardware components receive stable and adequate power.

Preliminary evaluations with the frontend electronics indicate a fiber optic line rate of 10 Gbps for data transmission to the backend electronics. A custom communication protocol will be implemented to support this throughput. The links to the trigger system and the clock distribution system will operate at the same 10 Gbps line rate. In contrast, communication with the DAQ and the DCS (Detector Control System) will use 40 Gbps QSFP+ optical modules.

These requirements create stringent demands on both the speed and the quantity of high-speed serial transceivers integrated into the FPGA. Furthermore, the FPGA will be tasked with performing initial processing on the raw data, necessitating a considerable level of processing capability to handle the data rates effectively.

To verify that the system can meet these requirements, we compared the specifications of several widely used FPGA models currently on the market. The results are summarized in Table 1. We have initially selected the XC7VX690T-2FFG1158C, which provides 48 transceivers at up to 13.1 Gb/s and therefore satisfies both the high-speed communication requirements and the processing demands of the backend electronics system.

| Parameter                     | XC7K325T-2FFG900C | XCKU040-2FFVA1156E | XC7VX690T-2FFG1158C | XCKU115-2FLVF1924I | KU060          |
|-------------------------------|-------------------|--------------------|---------------------|--------------------|----------------|
| Logic cells (k)               | 326               | 530                | 693                 | 1,451              | 725            |
| DSP slices                    | 840               | 1,920              | 3,600               | 5,520              | 2,760          |
| Memory (Kbit)                 | 16,020            | 21,100             | 52,920              | 75,900             |                |
| Transceivers (max. line rate) | 16 (12.5 Gb/s)    | 20 (16.3 Gb/s)     | 48 (13.1 Gb/s)      | 64 (16.3 Gb/s)     | 32 (16.3 Gb/s) |
| I/O pins                      | 500               | 520                | 350                 | 832                | 624            |

Table 1: Comparison of Common FPGA Performance Parameters
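The transceiver part of this comparison can be expressed as a simple filter. The sketch below copies the transceiver counts and line rates from Table 1 and checks them against a lane budget. The budget itself is an assumption: it takes the twelve SFP+ and two QSFP+ interfaces of the prototype board and counts four lanes per QSFP+ module. The function names are illustrative.

```python
# Transceiver count and maximum line rate (Gb/s), copied from Table 1.
CANDIDATES = {
    "XC7K325T-2FFG900C": (16, 12.5),
    "XCKU040-2FFVA1156E": (20, 16.3),
    "XC7VX690T-2FFG1158C": (48, 13.1),
    "XCKU115-2FLVF1924I": (64, 16.3),
    "KU060": (32, 16.3),
}


def lanes_required(sfp_ports, qsfp_ports):
    """One serial lane per SFP+ port, four lanes per QSFP+ port."""
    return sfp_ports + 4 * qsfp_ports


def viable(candidates, lanes, line_rate_gbps):
    """Names of devices with enough transceivers at a sufficient line rate."""
    return [
        name
        for name, (count, rate) in candidates.items()
        if count >= lanes and rate >= line_rate_gbps
    ]


lanes = lanes_required(sfp_ports=12, qsfp_ports=2)  # assumed port counts
shortlist = viable(CANDIDATES, lanes, line_rate_gbps=10.0)
print(lanes, shortlist)
```

Under these assumptions only the XC7K325T is excluded. The check narrows the list but does not by itself reproduce the final choice, which also weighs logic, DSP, and memory resources.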

#### 1.9.2 FPGA Firmware Algorithm Development

Based on the functional requirements of the backend electronics system, several key functional modules will be implemented in the FPGA, as illustrated in Figure 2:

##### 1) Data Interfaces and Communication

High-speed data communication will use the GTH transceivers integrated in the FPGA, with multiple channels each supporting a line rate of at least 10 Gbps. The links will be designed to carry different upper-layer protocols as required by the system specifications. Specifically, the frontend data will be packaged and transmitted using a custom protocol similar to lpGBT, so the FPGA must implement the corresponding unpacking logic to interpret these data. For communication with the DAQ and DCS systems, the standard TCP/IP protocol will be used, which requires the FPGA to implement a hardware protocol stack to maximize communication efficiency and throughput.
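To make the unpacking step concrete, the sketch below parses a byte stream into frames. The frame layout is purely hypothetical (a big-endian header with a 16-bit channel ID, a 32-bit bunch counter, and a 16-bit payload length); it is neither the lpGBT frame format nor the custom protocol, which is not specified here. Only the general pattern of header decoding and handling of truncated frames is intended to carry over.

```python
import struct

# Hypothetical header: channel id (uint16), bunch counter (uint32),
# payload length in bytes (uint16), all big-endian.
HEADER = struct.Struct(">HIH")


def pack_frame(channel, bunch, payload):
    """Build one frame in the hypothetical layout (used here for testing)."""
    return HEADER.pack(channel, bunch, len(payload)) + payload


def unpack_frames(stream):
    """Split a byte stream into complete frames and return the leftover bytes."""
    frames, pos = [], 0
    while pos + HEADER.size <= len(stream):
        channel, bunch, length = HEADER.unpack_from(stream, pos)
        start = pos + HEADER.size
        end = start + length
        if end > len(stream):
            break  # truncated frame: keep it until more data arrive
        frames.append((channel, bunch, stream[start:end]))
        pos = end
    return frames, stream[pos:]


stream = (
    pack_frame(3, 1000, b"\x01\x02")
    + pack_frame(7, 1001, b"")
    + pack_frame(1, 5, b"abcd")[:-1]  # last frame deliberately cut short
)
frames, rest = unpack_frames(stream)
print(frames, len(rest))
```

Returning the unconsumed tail lets the caller prepend it to the next chunk of data, which mirrors how a streaming receiver has to deal with frames that straddle buffer boundaries.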

##### 2) Data Processing and Packaging

The FPGA will be responsible for categorizing and processing the incoming raw data based on channel and timing information. This involves assigning the data to the corresponding digital signal processing algorithms for further refinement and analysis to generate the required trigger signals. Upon receiving a trigger signal, the FPGA will select the relevant valid data, efficiently package it, and transmit it according to a predefined data structure, ensuring that the information is organized and ready for subsequent stages of handling.

##### 3) Digital Signal Processing Algorithms

Once the FPGA receives the raw data, it will implement a variety of processing algorithms tailored to the specific type of detector in use. For instance, tracking detectors may require the implementation of cluster-finding algorithms, while calorimeters will focus on extracting timing and energy information. In scenarios where considerable noise is present, fast filtering algorithms may be deployed to enhance the signal-to-noise ratio. Additionally, real-time processing algorithms will facilitate the proactive acquisition of key information from physical events. This capability significantly improves the efficiency of the system's triggering mechanisms, reduces trigger latency, and alleviates the workload on both the DAQ and offline processing systems.
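As an example of the kind of algorithm meant here, the following sketch performs one-dimensional cluster finding on a row of channel amplitudes: adjacent channels above threshold are grouped, and each cluster is summarized by its total amplitude and amplitude-weighted centroid. This is a generic textbook approach written in software for readability; the threshold, the sample values, and the function names are assumptions, and a firmware implementation would be structured differently.

```python
def _summarize(hits):
    """Reduce a list of (channel, amplitude) pairs to one cluster record."""
    total = sum(value for _, value in hits)
    centroid = sum(ch * value for ch, value in hits) / total
    return hits[0][0], hits[-1][0], total, centroid


def find_clusters(adc, threshold):
    """Group adjacent above-threshold channels.

    Returns a list of (first_channel, last_channel, total, centroid).
    """
    clusters, current = [], []
    for ch, value in enumerate(adc):
        if value > threshold:
            current.append((ch, value))
        elif current:
            clusters.append(_summarize(current))
            current = []
    if current:  # close a cluster that reaches the last channel
        clusters.append(_summarize(current))
    return clusters


adc = [0, 1, 12, 30, 10, 0, 2, 0, 25, 25]  # illustrative amplitudes
clusters = find_clusters(adc, threshold=5)
print(clusters)
```

Sending only the cluster records instead of every channel is one way such processing reduces the data volume passed to the trigger and DAQ systems.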

##### 4) Clock Synchronization Technology

In high-energy physics experiments, clock synchronization is crucial because it coordinates the timing across multiple channels, detectors, and the data acquisition system, ensuring data consistency and accuracy. With clock synchronization technology tailored to fiber optic transmission, both data and clock signals can be carried on a single optical fiber. This approach reduces the number of fibers required and the overall system complexity.

##### 5) DDR Controller

Considering the need to buffer raw data until the final trigger signal is received from the trigger system, the backend processing module will incorporate a sufficiently large DDR memory. The FPGA will manage the timing control for reading from and writing to the DDR memory, ensuring that data integrity is maintained and that no data is lost during the buffering process.
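A first-order estimate of the required buffer size is the input data rate multiplied by the trigger latency. The sketch below works this out with integer arithmetic. The link count follows the twelve SFP+ inputs of the prototype board and the 10 Gbps line rate quoted earlier; the 1 ms latency and the safety factor of 2 are placeholder assumptions, not design values. Using the raw line rate ignores encoding and protocol overhead, so the result is an upper bound.

```python
def buffer_bytes(n_links, line_rate_gbps, latency_us, safety_factor=2):
    """Upper-bound buffer size in bytes for holding data during the trigger latency.

    1 Gb/s sustained for 1 microsecond corresponds to 1000 bits.
    """
    bits = n_links * line_rate_gbps * 1000 * latency_us
    return bits * safety_factor // 8


# Placeholder numbers: 12 links at 10 Gb/s, 1 ms trigger latency.
size = buffer_bytes(n_links=12, line_rate_gbps=10, latency_us=1000)
print(size / 1e6, "MB")
```

With these placeholder inputs the estimate is 30 MB, far below the capacity of a typical DDR3 module, so under these assumptions the memory bandwidth rather than the capacity is more likely to be the limiting factor.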

##### 6) Slow Control Registers

The backend electronics system must be responsive to monitoring and control commands originating from the slow control system. Acting as a communication bridge between the backend and the frontend detectors, the FPGA will handle the parsing, forwarding, and responding to slow control commands. This functionality ensures that the system remains manageable and that any necessary adjustments can be made in response to operational conditions or changes.
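The bridging role described above can be modeled as a small dispatcher: commands addressed to the backend are answered from a local register map, and commands addressed to a frontend link are forwarded unchanged. Everything in this sketch is hypothetical, including the command fields (`target`, `op`, `address`, `value`), the link naming, and the reply format; it only illustrates the parse, forward, and respond pattern.

```python
class SlowControlBridge:
    """Answers backend register commands and forwards the rest (illustrative)."""

    def __init__(self, frontend_links):
        self.registers = {}  # local backend registers: address -> value
        self.frontend_links = frontend_links  # link id -> callable(command)

    def handle(self, command):
        target = command["target"]
        if target == "backend":
            return self._local(command)
        if target in self.frontend_links:
            # Forward the command unchanged and relay the frontend reply.
            return self.frontend_links[target](command)
        return {"status": "error", "reason": "unknown target"}

    def _local(self, command):
        addr = command["address"]
        if command["op"] == "write":
            self.registers[addr] = command["value"]
            return {"status": "ok"}
        if command["op"] == "read":
            return {"status": "ok", "value": self.registers.get(addr, 0)}
        return {"status": "error", "reason": "unknown op"}


def fake_frontend(command):
    """Stand-in for a frontend link; echoes the requested address."""
    return {"status": "ok", "echo": command["address"]}


bridge = SlowControlBridge({"fe0": fake_frontend})
bridge.handle({"target": "backend", "op": "write", "address": 0x10, "value": 7})
readback = bridge.handle({"target": "backend", "op": "read", "address": 0x10})
forwarded = bridge.handle({"target": "fe0", "op": "read", "address": 0x20})
print(readback, forwarded)
```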

By implementing these functional modules within the FPGA, the backend electronics system will be equipped to handle the rigorous demands of high-speed data acquisition and processing while maintaining reliability and efficiency in a high-energy physics experimental environment.

#### 1.9.3 Prototype performance



The data aggregation processing board is shown in Figure 4-14 (a), and the distribution of components on the board is illustrated in Figure 4-15 (b). The board is equipped with twelve SFP+ interfaces and two QSFP+ interfaces, all connected to the GTH transceivers of the FPGA. These interfaces carry the data between the data aggregation processing board and the host computer, as well as the readout system with its multiple readout channels.

The GTH transceivers offer higher data rates and improved performance compared to GTX transceivers, making them suitable for applications that demand higher data transmission bandwidth. Additionally, the board is equipped with a 204-pin DDR3-SODIMM, which is intended for large data buffering during the algorithm processing phase. This configuration ensures that the board can efficiently manage the substantial data flow necessary for effective data aggregation and processing.



### 1.10 Consideration on Electronics Crates & Cabling (Wei Wei, Zheng Wang)

### 1.11 Previous R&D on Electronics System for Large Particle Physics Experiments (Wei Wei)

### 1.12 Summary (Wei Wei, Jingbo Ye)