

# **TDR: electronics**

# **Technical Design Report of the CEPC Reference Detector**

**Author:** the CEPC study group **Date:** June 9, 2025 **Version:** 0.1





# <span id="page-6-0"></span>**Chapter 1 Electronics System (100% ready for the draft0)**

# <span id="page-6-1"></span>**1.1 Introduction (Wei Wei)**

The main task of the CEPC electronics system is to design the front-end electronics for each sub-detector. This involves amplifying, digitizing, and processing the detector signals, as well as reading out and storing the data and interfacing with the trigger system. The system also handles key signals such as clocks, collision points, and controls to synchronize and control all sub-detectors.

This chapter first introduces the readout requirements of each sub-detector's electronics in Section [1.2.](#page-6-2) Key requirements such as background noise, data rates, and power consumption are summarized based on the front-end readout schemes provided in the sub-detector chapters. Section [1.3](#page-8-0) details the design of the front-end readout ASICs for each sub-detector, as well as more general front-end readout chip designs. Section [1.4](#page-25-0) analyzes the overall requirements of the CEPC electronics system, clarifies the design ideas and technical choices, defines the readout strategy for the electronics and trigger systems, and provides a unified baseline solution and framework structure. Section [1.5](#page-29-0) introduces the design of key components of the universal platform, including data interfaces, front-end power modules, and back-end electronics boards. Section [1.6](#page-48-0) discusses detector readout schemes based on wireless transmission as an upgrade option. Sections [1.7,](#page-55-0) [1.8,](#page-57-0) and [1.9](#page-65-2) cover detector clock synchronization, front-end power distribution, and back-end electronics algorithms and interfaces with the trigger system. The composition of the team and its collaboration partners, the previous R&D on electronics systems for large particle physics experiments, and an overview of the team's research foundation are presented in Section [1.11.](#page-78-1) Finally, Section [1.13](#page-80-1) summarizes the electronics system design and outlines the research plans.

### <span id="page-6-2"></span>**1.2 Detector requirements (Wei Wei)**

For the CEPC electronics system, in addition to meeting the front-end electronics readout requirements of each subdetector, a complete universal electronics framework needs to be designed for the overall spectrometer design. This allows the electronics systems of each sub-detector to be designed with universal interfaces and standards, further modularizing the design of sub-detector electronics and enhancing design independence, plug-and-play capability, and upgradeability. At the system level, this approach also enables more efficient scheduling and communication among various systems. In the previous chapters on each sub-detector, based on the preliminary universality framework provided by the electronics system, corresponding preliminary readout schemes for detectors have been proposed. These schemes include specific organization of front-end electronic chips in modules, taking into account the different operational modes of the CEPC accelerator (such as Higgs, Low LumiZ, etc.), as well as considering the front-end electronic schemes based on background estimates provided by the MDI system. These parameters, as inputs for the design of the universal electronics framework, are summarized in Table [1.1.](#page-7-0)

From Table [1.1,](#page-7-0) it can be seen that 5 to 6 front-end ASICs will be developed for the different sub-detector systems to read out the physical signals of the detectors. Similar requirements have been merged to reduce the variety of ASICs developed in parallel. For example, since the ECAL, HCAL, and Muon detectors will all use SiPM devices, the plan is to combine their common requirements and design a single SiPM readout ASIC with enough generality to meet the needs of all three detectors. Additionally, apart from detector systems such as the Vertex and Inner Tracker that require MAPS technology, the other front-end ASIC designs are expected to adopt a unified 55 nm/65 nm CMOS process from the same foundry. This allows circuit modules to be shared between chip designs as much as possible, thereby accelerating the design process and improving design reliability. This approach embodies the idea of a universal electronics framework in the design of front-end ASICs. Specific ASIC schemes will be detailed in Section [1.3.](#page-8-0)

<span id="page-7-0"></span>

The third and fourth rows of the table also summarize the data width of the front-end chip outputs for each subdetector, as well as the average data rate and maximum data rate per chip. This takes into account the detector background provided by the MDI system through overall simulation, and considers the arrangement of modules in each subdetector. Based on the detector module size, detector operating mode, and key information to be detected, combined with the preliminary design of the front-end chip, the data width is determined. Furthermore, based on the detector size corresponding to each channel of the front-end chip, combined with the background counting rate, an estimate of the data rate per chip or module is provided. This will also serve as one of the key input parameters for the overall electronics system interface.

The fifth row of the table summarizes how the front-end chips connect to the data interface in each detector, based on the chip arrangement on the modules of each subdetector, the module layout, and the way signals are routed out of the detector front-end. Where a detector has a large number of chips but a low average data rate per chip, as for the OTK and the calorimeters, an effective approach is to aggregate the data from multiple chips before readout in order to reduce the number of data interfaces and cables. This preliminary approach has been considered in the subdetector readout schemes. The design of the data aggregation chips, described in the data interface sections, will be based on it, and the organization of front-end chips summarized in this table will be a key input parameter for that design.

Finally, the sixth row of the table summarizes the preliminary number of electronics channels provided by each subdetector, which already takes into account the considerations after the detector has been optimized in size. The seventh row summarizes the total data rate of each subdetector after considering the detector background data rate. These two rows will serve as key design parameters for the TDAQ and electronics interface, and will be discussed in detail in the backend electronics and trigger system sections.

# <span id="page-8-0"></span>**1.3 Sub-detector Front-end Electronics Design (Xiongbo Yan, Jinfan Chang, Huaishen Li, etc)**

#### <span id="page-8-1"></span>**1.3.1 ITK readout electronics**

#### <span id="page-8-2"></span>**1.3.1.1 FE board**

The required bandwidth of the readout e-link of each sensor strongly depends on the radial region it covers. For the monolithic HVCMOS pixel sensor, the number of bits per hit is 48. For a sensor area of 2 cm × 2 cm, the data rate per sensor is distributed between 4 Mbps and a maximum of 70 Mbps.



**Figure 1.1:** Detector module design of the ITK

Each module, consisting of 14 sensors, is bonded to a Flex Electronics Board (FEB) to limit the material budget. The FEB transfers digital signals to the data aggregation chip, which then passes them to the optical fiber. For error-free data transmission, the FEB uses two kinds of ASICs for data aggregation, the TaoTie chip and the ChiTu chip. The TaoTie chip collects the data from 16 channels and transfers them serially at 1.39 Gbps. One TaoTie is enough for 14 sensors because the maximum data rate for 14 sensors is 980 Mbps. The ChiTu chip collects data from the TaoTie chips. The hit rate differs with radius, so a dedicated buffer is needed in each ASIC to average the rate variation and match the best speed of the e-link transceiver inputs. The optical fibers take the data out of the detector and connect to the back-end DAQ. The ChiTu will work at a data rate of 10 Gbps, which is enough for the barrel or the endcap at the smallest radius. The clock is recovered by the Clock and Data Recovery (CDR) circuit in the ChiTu. The ChiTu offers 6 output clock choices, 43.33 MHz, 86.67 MHz, 173.33 MHz, 346.67 MHz, 693.33 MHz and 1.39 GHz, which will be used in the front-end electronics of the different detectors. The slow control of the front-end circuits is configured through the ChiTu downlink, which works at 86.67 Mbps.
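The ChiTu output clock options quoted above are consistent with successive doublings of the base clock. A minimal Python sketch, assuming a 43.33 MHz base and a pure power-of-two relation between outputs (the function name is illustrative and not part of the ChiTu design):

```python
BASE_CLOCK_MHZ = 43.33  # base clock quoted in the text (rounded)


def chitu_clock_options(n_outputs: int = 6) -> list[float]:
    """Return n_outputs frequencies in MHz, each double the previous one."""
    return [BASE_CLOCK_MHZ * 2**k for k in range(n_outputs)]


clock_options_mhz = chitu_clock_options()
for freq in clock_options_mhz:
    print(f"{freq:8.2f} MHz")
```

Because the base value is rounded, the computed entries differ from the quoted ones in the second decimal place.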



**Figure 1.2:** Global Framework of the Frontend Electronics

The low voltage (LV) and high voltage (HV) supplies are independent for each module, given a total power consumption of 200 mW/cm<sup>2</sup>. There will be 2 DC-DC converters on the flex for each module, since the current at 1.2 V will exceed 10 A once the margin for conversion efficiency is included. The 48 V from the power crate is stepped down to 12 V at the first stage and to 1.2 V at the second stage, where the output capability of the DC-DC is 10 A. The DC-DC also provides 2.5 V and 3.3 V for the VCSEL driver. A composite cable with a pair of LV wires, a pair of HV wires and an optical fiber will be used to make module assembly easy. A flexible printed circuit board will be considered if space is limited.
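The DC-DC count per module can be checked with a rough current budget. The sketch below assumes 14 sensors of 2 cm × 2 cm per module (the area quoted for the pixel sensor) and a hypothetical 20 % margin for conversion efficiency; neither the margin nor the variable names come from the design itself.

```python
import math

N_SENSORS = 14               # sensors per module
SENSOR_AREA_CM2 = 2.0 * 2.0  # assumed 2 cm x 2 cm sensor
POWER_DENSITY_W_CM2 = 0.2    # 200 mW/cm^2
SUPPLY_V = 1.2               # front-end supply voltage
DCDC_MAX_A = 10.0            # output capability of one DC-DC
MARGIN = 1.2                 # hypothetical 20 % margin, not from the text

module_power_w = N_SENSORS * SENSOR_AREA_CM2 * POWER_DENSITY_W_CM2
load_current_a = module_power_w / SUPPLY_V
n_dcdc = math.ceil(load_current_a * MARGIN / DCDC_MAX_A)

print(f"module power  : {module_power_w:.1f} W")
print(f"load current  : {load_current_a:.2f} A")
print(f"DC-DCs needed : {n_dcdc}")
```

Under these assumptions the bare load current stays just below 10 A, and it is the margin that pushes the requirement to two converters.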

For the monolithic CMOS strip sensor, the readout scheme is very similar, even though the hit rate for each channel of the front-end electronics is higher than for the pixel sensor. The number of bits per hit is 32. Two TaoTie chips will be placed on the module, because the maximum data rate at the smallest radius is 100 Mbps per strip sensor, or 1.4 Gbps for 14 sensors, which exceeds the bandwidth of a single TaoTie.
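The TaoTie count for the pixel and strip modules follows from comparing the summed sensor rate with the aggregator output rate. A minimal sketch, assuming a TaoTie output of about 1.39 Gbps (the value quoted for the OTK links) and ignoring protocol overhead and the 16-input limit:

```python
import math

TAOTIE_OUTPUT_MBPS = 1390.0  # assumed serial output rate of one TaoTie
SENSORS_PER_MODULE = 14


def taotie_needed(max_rate_per_sensor_mbps: float) -> int:
    """Number of TaoTie chips needed to carry the worst-case module rate."""
    total_mbps = SENSORS_PER_MODULE * max_rate_per_sensor_mbps
    return math.ceil(total_mbps / TAOTIE_OUTPUT_MBPS)


pixel_chips = taotie_needed(70.0)   # 14 x 70 Mbps = 980 Mbps
strip_chips = taotie_needed(100.0)  # 14 x 100 Mbps = 1400 Mbps
print(pixel_chips, strip_chips)
```

The strip module exceeds a single link by only a small amount, so the result is sensitive to the exact link rate and to any header overhead.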

#### <span id="page-9-0"></span>**1.3.2 OTK readout electronics**

#### <span id="page-9-1"></span>**1.3.2.1 Front-end board**

The FE board receives signals from LGAD strips that are processed by sixteen ASICs named JuLoong. No external AC coupling is needed thanks to the internal coupling design, which isolates the circuit from the large leakage current and protects the ASIC from damage by the bias voltage in case of LGAD failure. The JuLoong ASIC is powered by a single 1.2 V supply. This voltage is regulated and filtered by the DC-DC. The DC-DC delivers up to 10 A, which is sufficient for the operation of 8 front-end ASICs. The 2 slides of sensor (almost 4.4 cm × 104 cm) share one FE board, which carries 2 DC-DCs and 16 JuLoong ASICs. In the baseline design, the LGAD bias voltage is provided by external bias channels, each channel serving an LGAD module. Each bias channel has two HV wires routed over the flex power board. The HV wires for each module are independent. The LV wires from different modules are connected by shunt to the Concentrator Card. The FE board interfaces to the Concentrator Card through board side-to-side connectors or soldering, transferring the following signals:

- 1. Clock: sixteen ChiTu e-clocks (43.3 MHz), one for each JuLoong ASIC;
- 2. Data readout: 6 up E-links (1.39 Gbps) per TaoTie connecting to ChiTu on the Concentrator Card;
- 3. Configuration: one down E-link (86.67 Mbps) for JuLoong configuration shared by sixteen JuLoongs;
- 4. Sync/reset: one down E-link (86.67 Mbps) for JuLoong resync/reset shared by sixteen JuLoongs;
- 5. Monitoring: provision for sixteen temperature sensors per FE board (sensor and electronics temperatures);
- 6. Power and ground.

Each FE board is served by 2 DC-DC converters providing 1.2 V. To obtain a more stable and precise regulation of the analog power supplies and to reduce the switching noise introduced by the DC-DC, filters built from passive components will be used. The aggregation chip TaoTie merges up to 16 upstream links at 43.3 Mbps into one 1.39 Gbps output, with configurable modes that allow 2, 4, or 8 channels to be serialized into one channel. One TaoTie is enough for sensors on the barrel or at large radius on the endcap because of the low hit rate, while more TaoTies are needed for sensors at small radius on the endcap. Each FE board receives a dedicated clock from the ChiTu and transmits data over one dedicated uplink to one ChiTu. The sixteen JuLoongs on the FE board share two downstream links that provide configuration, synchronization or reset. To allow several JuLoongs to share the configuration downlinks, the configuration protocol includes a 4-bit address. Each JuLoong chip is given its 4-bit address through dedicated ID pins on the chip.
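The shared-downlink addressing can be illustrated with a small sketch. The frame layout is not specified in the text, so the function below only models the address comparison itself; `accepts_frame` and its arguments are hypothetical names.

```python
ADDRESS_BITS = 4  # address width stated in the text
N_JULOONG = 16    # chips sharing the configuration downlink


def accepts_frame(chip_id_pins: int, frame_address: int) -> bool:
    """A chip accepts a frame only if the address equals its hard-wired ID."""
    mask = (1 << ADDRESS_BITS) - 1
    return (chip_id_pins & mask) == (frame_address & mask)


# A 4-bit address distinguishes exactly 2**4 = 16 chips.
addressable_chips = 1 << ADDRESS_BITS

# Broadcast one frame addressed to chip 0b1010 and see which chips respond.
responders = [chip for chip in range(N_JULOONG) if accepts_frame(chip, 0b1010)]
print(addressable_chips, responders)
```

With sixteen chips per board the 4-bit address space is fully used, so a broadcast address would have to be provided by other means if one is needed.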

#### <span id="page-10-0"></span>**1.3.2.2 Concentrator card and power distribution**

The Concentrator Card is designed to interface the system readout with four FE boards. It sits next to the sensor modules, cooling bar, and FE boards on the barrel or endcap tray. The Concentrator Card uses the ChiTu to provide an interface between the E-links and an opto module. Each opto module has a single receive channel and a single transmit channel. The command downlink (receive) will run at 346.67 Mbps and the data uplink at 9.71 Gbps, fed by 7 input channels of 1.39 Gbps each. The hit rate on the barrel is much lower than on the endcap. On the endcap, the average rate of each data link from the FE board is 600 Mbps, from 16 JuLoongs at 38 Mbps each. This translates to an aggregate rate of about 3 Gbps for each ring on a sector. At the smallest radius, the data rate will reach 4 Gbps per module, with 8 JuLoongs on each module. Another important feature of the ChiTu is that it ensures precise distribution of the clock, received from the downlink, to the front-end system, achieved by the high-frequency clock noise filter in the PLL. The power is distributed from the DC-DC converter module, a step-down converter built from radiation-tolerant components. To power the Concentrator Card components as well as the FE cards, two stages of converters are needed. The first stage converts 48 V to 12 V on the Concentrator Card. The second stage converts 12 V to 1.2 V for the front-end ASICs on the FE board, or to 3.3 V for the VCSEL driver on the Concentrator Card. For the full channel capacity, two DC-DCs will be used on the FE board for reliability.
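The link rates quoted above can be cross-checked with simple arithmetic. This sketch assumes the 1.39 Gbps E-link rate is a rounded 1386.67 Mbps (sixteen times the 86.67 Mbps downlink rate), which is an inference rather than a stated value.

```python
ELINK_MBPS = 1386.67              # assumed unrounded E-link rate
N_UPLINK_INPUTS = 7               # ChiTu input channels
JULOONG_PER_FE = 16               # ASICs per FE board
AVG_RATE_PER_JULOONG_MBPS = 38.0  # average endcap rate per ASIC

uplink_capacity_gbps = N_UPLINK_INPUTS * ELINK_MBPS / 1000.0
fe_link_avg_mbps = JULOONG_PER_FE * AVG_RATE_PER_JULOONG_MBPS
link_utilisation = fe_link_avg_mbps / ELINK_MBPS

print(f"uplink capacity  : {uplink_capacity_gbps:.2f} Gbps")
print(f"FE link average  : {fe_link_avg_mbps:.0f} Mbps")
print(f"link utilisation : {link_utilisation:.0%}")
```

The average endcap load uses less than half of one E-link, which leaves headroom for rate fluctuations.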

#### <span id="page-10-1"></span>**1.3.2.3 Slow control and monitoring**

The ChiTu provides a set of slow control and monitoring features: I2C master controllers, a JTAG master controller, programmable bidirectional IO ports, and a memory-like bus master controller with data and address. The preliminary slow control and monitoring functions are as follows:

- 1. Slow Control bits
	- (a). powering control bits
	- (b). configuration bits of frontend ASIC
- 2. Monitor
	- (a). DC-DC status
	- (b). ASIC status
	- (c). temperature on FE
	- (d). temperature on Concentrator Card
	- (e). leakage current of LGAD

#### <span id="page-10-2"></span>**1.3.2.4 Clock distribution**

<span id="page-10-3"></span>The distribution of a precise clock to the front-end is a major requirement for the OTK. The clocks in the back end system are recovered from the ChiTu downlinks. The clock is synchronized to the 43.3 MHz bunch-crossing rate. To achieve the target timing performance, the clock distribution system should have less than 15 ps rms link-to-link jitter over all clock distribution links. For instance, a 50 ps timing performance obtained from the sensors and readout electronics would be degraded by 2.2 ps by a clock distribution system with 15 ps rms jitter. The high-frequency clock noise is expected to be filtered by the PLL in the ChiTu. Low-frequency clock jitter and possible phase instability, in particular arising from temperature variations or the low-frequency response of the clock chain, will require special attention.
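The 2.2 ps figure follows if the clock jitter is uncorrelated with the sensor and electronics resolution, so that the two add in quadrature. A short sketch under that assumption:

```python
import math


def combined_resolution_ps(detector_ps: float, clock_jitter_ps: float) -> float:
    """Quadrature sum of two uncorrelated timing contributions."""
    return math.hypot(detector_ps, clock_jitter_ps)


total_ps = combined_resolution_ps(50.0, 15.0)
degradation_ps = total_ps - 50.0
print(f"total: {total_ps:.1f} ps, degradation: {degradation_ps:.1f} ps")
```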

#### **1.3.2.5 OTK readout ASIC, JuLoong**

This section describes the required performance, design, and latest prototype testing of the ASIC, which will have 128 readout channels in its final version. The main challenge in the design of this ASIC is to achieve high resolution in both the time measurement and the charge measurement used for position reconstruction, in order to match the excellent performance of the LGAD. The contribution to the time resolution comes mainly from the jitter and the time walk. The most critical aspect concerning the jitter is the design of the analog front-end electronics, which are composed of a transimpedance amplifier followed by a shaper and a fast discriminator. The measured time-of-arrival (TOA) and time-over-threshold (TOT) are digitized using two time-to-digital converters (TDCs) and stored in a local memory at the channel level. The TOT serves as a measure of charge because the charge is related to the time the pulse spends above a given threshold. The charge resolution is determined by the time resolution and the TOT width. The time walk will be addressed by applying a correction based on the fact that the variations in the TOA of the pulse are related to the TOT. The common digital part of the ASIC is composed of clock generation and alignment, slow control configuration, and data transmission. A prototype chip has been produced: JuLoong, which integrates 8 channels with the preamplifier, discriminator, TDC and digital components. The chip will be tested at the end of 2024.

<span id="page-11-0"></span>**1.3.2.5.1 General requirements** The requirements of the ASIC can be divided into two groups. The first covers the operational environment of the ASIC, its powering and its electrical connections; these requirements are summarized in Table 1.2. The second concerns the ASIC performance, driven by the targeted time resolution; these requirements are summarized in Table 1.3.

- **i)** The electronics must be able to read out signals from 16 fC up to 50 fC throughout the lifetime.
- **ii)** Each readout channel needs to match the sensor strip, with a pitch of 100 $\mu$m. It must handle up to 5 $\mu$A of leakage current from the sensor.
- **iii)** The electronics jitter is required to be smaller than 30 ps for an input charge of about 16 fC, which is smaller than the intrinsic dispersion of the LGAD. A detector capacitance of about 4 pF is assumed. The TDC bin size for the TOA measurement should be less than 30 ps, so that the contribution from the TDC is negligible. The time walk should be smaller than 10 ps over the dynamic range after correction.
- **iv)** The charge is measured through the TOT. A resolution of 1.6 fC is necessary for a spatial resolution of 10 $\mu$m. The bin size for the TOT will be the same as for the TOA, thanks to the reuse of the delay chain.
- **v)** The TOA and TOT information is transferred to the data acquisition system, so the data aggregation protocol must be integrated.
- **vi)** Because the charge generated by a MIP is shared between adjacent strips, it should be possible to set the discriminator threshold at small enough values of input charge. The minimum threshold (4 fC) should provide an efficiency above 95% for an input charge of 16 fC. The cross-talk between channels should be kept below 10% to make such low thresholds possible.
- **vii)** The ASIC will have to withstand high radiation levels. The expected radiation levels have been presented before, considering a safety factor for the electronics leading to a maximal TID of ? MGy.

**Table 1.2:** Geometrical, environmental, electrical and power requirements for the OTK ASIC

| Parameter                             | Value                            |
|---------------------------------------|----------------------------------|
| Voltage                               | 1.2 V                            |
| Channels                              | 128                              |
| Channel pitch                         | $< 100$ $\mu$m                   |
| Power dissipation per area (per ASIC) | 300 mW/cm<sup>2</sup>            |
| e-link driver bandwidth               | 320 Mbps, 640 Mbps, or 1.28 Gbps |
| Temperature range                     | −40 °C to 40 °C                  |
| TID tolerance                         | 1.0 MGy                          |

<span id="page-11-1"></span>The values given for the noise, minimum threshold and jitter have been specified considering a detector capacitance $C_d = 4$ pF.

| Parameter                         | Value                   |
|-----------------------------------|-------------------------|
| Maximum leakage current           | 5 $\mu$A                |
| Single channel noise (ENC)        | $< 10000\ e^-$ (1.6 fC) |
| Cross-talk                        | $< 10\%$                |
| Threshold dispersion after tuning | $< 10\%$                |
| Maximum jitter                    | 30 ps at 16 fC          |
| TDC contribution                  | $< 30$ ps               |
| Time walk contribution            | $< 10$ ps               |
| Minimum threshold                 | 4 fC                    |
| Dynamic range                     | 16 fC-50 fC             |
| TDC conversion time               | $< 23$ ns               |
**Table 1.3:** Performance requirements for the OTK ASIC
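The noise entry in Table 1.3 equates 10000 electrons with 1.6 fC. The conversion, and the resulting ratios to the 16 fC most probable charge and the 4 fC minimum threshold, can be sketched as follows (the ratio names are illustrative, not requirement names from the table):

```python
ELEMENTARY_CHARGE_C = 1.602176634e-19  # exact SI value


def electrons_to_fc(n_electrons: float) -> float:
    """Convert a charge expressed in electrons to femtocoulombs."""
    return n_electrons * ELEMENTARY_CHARGE_C * 1e15


enc_fc = electrons_to_fc(10_000)
signal_to_noise_at_mpv = 16.0 / enc_fc    # 16 fC most probable charge
threshold_to_noise = 4.0 / enc_fc         # 4 fC minimum threshold
print(f"ENC = {enc_fc:.2f} fC, S/N at MPV = {signal_to_noise_at_mpv:.1f}, "
      f"threshold/noise = {threshold_to_noise:.1f}")
```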

**1.3.2.5.2 Data transmission bandwidth requirements** The required bandwidth of the readout e-link of each ASIC strongly depends on the radial region it covers in the barrel or endcap, as shown by the distribution of the average number of hits per ASIC in Figure ?. The number of bits per hit is 48, as described in Section ?. Each module consists of 2 slides of sensors, and each sensor needs 8 ASICs for readout. The 16 ASICs sit on one FE board and are connected to an aggregation chip, TaoTie. Four FE boards connect to a Concentrator Card (CC) via flex cables and connectors. The Concentrator Card transfers digital signals from the flex cables to optical fibers connected to the back-end DAQ. A dedicated buffer is needed in each ASIC to average the rate variation and match the best speed of the ChiTu transceiver inputs:

- 1. The largest average hit rate at small radius does not exceed 20 hits per ASIC and per event, equivalent to a rate of 500 Mbps (not including header). In the current design a bandwidth of up to 1.28 Gbps was considered for the innermost radius ASICs.
- 2. For the barrel, taking into account a considerable safety margin, a 160 Mbps bandwidth can be used. For the innermost radius ASICs at endcap, a 640 Mbps e-link driver is needed.

<span id="page-12-0"></span>**1.3.2.5.3 ASIC architecture** Building on the preliminary results of the LGAD, an ASIC, the Outer Tracker Read-Out Chip (JuLoong), is proposed to be developed in a CMOS technology. JuLoong includes 128 channels. The height of each channel should be less than 100 $\mu$m, which matches the pitch of the LGAD strip. Each channel in JuLoong has a preamplifier, a discriminator, and a time-to-digital converter (TDC) for the Time-Of-Arrival (TOA) and Time-Over-Threshold (TOT) measurements. The preamplifier and discriminator are the most critical parts for the jitter contribution. The TOT is used to calculate the charge as well as to correct the time walk due to the Landau distribution of the charge in the LGAD. The power consumption of JuLoong must stay below 2.5 W per chip, or about 20 mW per channel, constrained by the system cooling capacity. This value translates to a power budget of 15 mW for the front-end analog readout circuits in each channel. The time resolution of the Outer Tracker is determined by the LGAD sensor and JuLoong together. The LGAD sensor has a jitter of about 40 ps due to non-uniform charge deposition. The JuLoong contribution should be below 30 ps to achieve 50 ps overall time resolution per hit. The Most Probable Value (MPV) of the charge from the LGAD sensor is around 16 fC. Considering the charge sharing with adjacent strips, the expected operating range of the charges is 8–50 fC. Because the signal from the LGAD sensor is so small, the analog readout circuits, including the preamplifier and the discriminator, dominate the electronics' jitter contribution and are critical for the Outer Tracker precision timing performance.
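The 30 ps budget for JuLoong and the per-channel power figure can both be reproduced from the numbers above. The sketch assumes the sensor and ASIC timing contributions are uncorrelated and add in quadrature:

```python
import math

TARGET_RESOLUTION_PS = 50.0  # overall time resolution per hit
SENSOR_JITTER_PS = 40.0      # LGAD contribution
CHIP_POWER_LIMIT_W = 2.5     # power limit per chip
N_CHANNELS = 128

# Largest ASIC contribution that still meets the target in quadrature.
asic_budget_ps = math.sqrt(TARGET_RESOLUTION_PS**2 - SENSOR_JITTER_PS**2)

# Power available per channel if the chip limit is shared equally.
per_channel_mw = CHIP_POWER_LIMIT_W / N_CHANNELS * 1e3

print(f"ASIC timing budget : {asic_budget_ps:.0f} ps")
print(f"power per channel  : {per_channel_mw:.2f} mW")
```

The equal-share power comes out slightly below 20 mW, so the 20 mW quoted per channel is a rounded figure.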

**Preamplifier** The preamplifier consists of two stages: a cascode amplifier (M1 and M2) as the first stage and a source follower (M3) as the second stage. The bias current of the input transistor (M1) has two components. The constant current IB1 is kept small so that the VGS of M2 does not become too large. The transistor M2 and its gate voltage Vb set the DC operating point of M1; Vb is the replica bias voltage from IB1. The gain and bandwidth depend on the Gm of transistor M1. Most of the current of M1 is provided by a tunable IB2. The feedback resistor Rf is programmable to adjust the gain of the preamplifier. The load capacitance CL of the first stage is also programmable to optimize the bandwidth. The drain current IB3 of M3 is generated by a resistor. The gains of the first and second stages are negative and positive, respectively. Since the preamplifier's input signal is a negative pulse, the output signal is a positive pulse, whose rising edge is the leading edge and whose falling edge is the trailing edge. Both the leading and trailing edges of the preamplifier output should be considered. A faster leading edge can be achieved with a higher bias current of M1 and a smaller load capacitance CL. When IB2 is biased to its highest value, the leading edge time of the preamplifier output can be set to about 1 ns with a bandwidth of 400 MHz. A small load capacitance leads to a fast leading edge but introduces more noise because of the large bandwidth. Therefore, the load capacitance can be selected to fine-tune the jitter. The default setting is to use the smallest load capacitance. The time constant, the product of the total capacitance Cs and the input impedance Rin, determines the trailing edge time of the preamplifier output. The input impedance should also be considered because the 4 cm strip acts as a transmission line for a signal with a leading edge of 1 ns. The input impedance Rin is given by the feedback resistance Rf divided by the open-loop gain. The open-loop gain depends on the bias current. Thus, Rf is programmable with four settings to adjust the gain, the trailing edge time of the preamplifier output and the input impedance. Based on the simulation with LGAD signals, a default feedback resistor will be selected. The bias current IB2 allows different trade-offs between the power consumption and the timing performance. A larger signal slew rate (dV/dt) and a faster rise time tr of the preamplifier output can be achieved at a higher bias current, and thus better performance is expected.

<span id="page-13-0"></span>

**Figure 1.4:** JuLoong schematic
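The preamplifier relations described above can be put into numbers with a short sketch. It uses the common single-pole approximation for the 10–90 % rise time, t_r ≈ 0.35/BW, which is a textbook rule and not a statement about the actual JuLoong response. The feedback resistance and open-loop gain below are purely hypothetical placeholders; only the 400 MHz bandwidth and the 4 pF capacitance come from the text.

```python
def rise_time_ns(bandwidth_mhz: float) -> float:
    """10-90 % rise time of a single-pole system, t_r ~ 0.35 / BW."""
    return 0.35 / (bandwidth_mhz * 1e6) * 1e9


def input_impedance_ohm(rf_ohm: float, open_loop_gain: float) -> float:
    """Rin = Rf / A, as stated for the feedback preamplifier."""
    return rf_ohm / open_loop_gain


def trailing_time_constant_ns(cs_pf: float, rin_ohm: float) -> float:
    """tau = Cs * Rin, which sets the trailing edge of the output pulse."""
    return cs_pf * 1e-12 * rin_ohm * 1e9


t_rise = rise_time_ns(400.0)                          # 400 MHz from the text
r_in = input_impedance_ohm(rf_ohm=5_000.0,            # hypothetical Rf
                           open_loop_gain=50.0)       # hypothetical gain
tau = trailing_time_constant_ns(cs_pf=4.0, rin_ohm=r_in)
print(f"t_rise = {t_rise:.3f} ns, Rin = {r_in:.0f} ohm, tau = {tau:.2f} ns")
```

The estimated rise time is close to the 1 ns leading edge quoted for the highest bias setting. The sketch also shows why a programmable Rf changes both the input impedance and the trailing edge at the same time.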

**Discriminator** The discriminator consists of three stages of fully differential amplifiers, a comparator, and an internal buffer. The three amplifier stages receive the small input pulses and generate larger pulses suitable for the comparator. The overall gain of the three amplifier stages will be 20−40 dB with a bandwidth of around 400 MHz. The comparator discriminates the differential input at the crossing point with programmable hysteresis. The internal buffer, composed of two CMOS inverters, delivers the digital output to the following circuits and relieves the loading pressure. As shown in Figure [1.4,](#page-13-0) the comparator has two stages. The first stage is a high-gain common-source differential amplifier with the MOSFET pair M1 and M2. Its load consists of the transistor pair M3 and M4, which are diode-connected PMOS devices. M11 and M12 provide the bias current from a source current Ibias, shared with the three-stage amplifiers. The first stage digitizes the differential input at the crossing point with a tunable hysteresis. The hysteresis, generated by the transistors M9 and M10, is used to alleviate ringing due to noise. The second stage converts the differential output of the first stage into a single-ended signal. The leading edge of the discriminator output provides the TOA, while the trailing edge, combined with the TOA, provides the TOT. The discriminator threshold is connected to the inverting input and set by an internal 10-bit DAC. The DAC comprises a global 6-bit DAC and a local 4-bit DAC in each channel, since the preamplifier's baseline varies between channels with temperature, bias setting, and irradiation. The DAC output range is from 0.6 to 1 V with an LSB (least significant bit) of 0.4 mV. A reference voltage generator provides a 1 V reference voltage to the DAC. To minimize the DAC noise contribution to the threshold, an RC filter is added to the DAC output.
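The quoted LSB follows from the DAC range and resolution. The sketch below also shows one possible way to combine the global and local codes, with the global 6 bits as the coarse part and the local 4 bits as the per-channel trim. That mapping is an assumption made for illustration; the text does not specify how the two DACs are combined.

```python
V_MIN, V_MAX = 0.6, 1.0        # DAC output range in volts
GLOBAL_BITS, LOCAL_BITS = 6, 4
TOTAL_BITS = GLOBAL_BITS + LOCAL_BITS

lsb_mv = (V_MAX - V_MIN) / 2**TOTAL_BITS * 1e3  # about 0.39 mV


def threshold_voltage(global_code: int, local_code: int) -> float:
    """Threshold in volts, assuming global = upper 6 bits, local = lower 4 bits."""
    if not 0 <= global_code < 2**GLOBAL_BITS:
        raise ValueError("global code out of range")
    if not 0 <= local_code < 2**LOCAL_BITS:
        raise ValueError("local code out of range")
    code = (global_code << LOCAL_BITS) | local_code
    return V_MIN + code * (V_MAX - V_MIN) / 2**TOTAL_BITS


local_trim_range_mv = 2**LOCAL_BITS * lsb_mv  # span covered by the local DAC
print(f"LSB = {lsb_mv:.3f} mV, local trim range = {local_trim_range_mv:.2f} mV")
print(f"mid-scale threshold = {threshold_voltage(32, 0):.3f} V")
```

Under this assumed mapping the local DAC can trim each channel over a few millivolts, which would have to cover the channel-to-channel baseline spread.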

**TDC** The schematic of the TDC core is shown in Figure [1.5.](#page-14-0) The JuLoong design faces two main challenges: the large area required for the OTK and the necessity of achieving both time and charge measurements while maintaining low power consumption. Additionally, the pitch of the LGAD strips is only 100 $\mu$m, which means that the height of the single-channel circuitry must also be less than 100 $\mu$m. To realize this on a smaller area, a single delay line is employed to simultaneously measure the time of arrival (TOA) and the time over threshold (TOT), with each delay cell providing a delay of 30 ps. The flip-flops record the times of the signal's rising edge, falling edge, and the reference clock's rising edge, storing these values sequentially in registers. The chip utilizes a single delay line without a delay-locked loop (DLL), and to reduce the number of delay cells, a cyclic structure is implemented. The delay of the delay line is influenced by process variations, power supply voltage, and temperature (PVT); thus, a pulse self-calibration scheme is necessary to compensate for the effects of PVT variations. This calibration is performed periodically using the system clock to measure and calibrate the delay chain. The data width of the TDC output includes 8 bits for TOT, 9 bits for TOA, 1 bit for hit flag, 8 bits for calibration, 7 bits for channel identification (128 channels), 8 bits for bunch ID, and 7 bits for chip ID, resulting in a total of up to 48 bits.
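The field widths listed above add up to 48 bits, which can be checked by packing them into a single word. The field order in this sketch is an arbitrary assumption, since the text gives only the widths, and the function names are illustrative.

```python
# (name, width in bits); widths from the text, ordering assumed.
FIELDS = [
    ("chip_id", 7),
    ("bunch_id", 8),
    ("channel", 7),
    ("calibration", 8),
    ("hit_flag", 1),
    ("toa", 9),
    ("tot", 8),
]
WORD_BITS = sum(width for _, width in FIELDS)


def pack_hit(**values: int) -> int:
    """Pack one hit into an integer, first field in the most significant bits."""
    word = 0
    for name, width in FIELDS:
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word = (word << width) | value
    return word


def unpack_hit(word: int) -> dict[str, int]:
    """Inverse of pack_hit."""
    out = {}
    for name, width in reversed(FIELDS):
        out[name] = word & ((1 << width) - 1)
        word >>= width
    return out


hit = dict(chip_id=5, bunch_id=200, channel=127, calibration=0,
           hit_flag=1, toa=300, tot=90)
packed = pack_hit(**hit)
print(WORD_BITS, hex(packed), unpack_hit(packed) == hit)
```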

<span id="page-14-0"></span>

**Figure 1.5:** JuLoong TDC core schematic

**Calibration and Internal pulser** The calibration circuit is designed to mimic the injection of input charge and can be used during the production phase to verify the proper functioning of the chip's TOA and TOT measurement capabilities. This calibration circuit acts as a pulse generator, charging a 200 fF capacitor to a certain voltage using a DAC. By directly shorting the DAC output to ground through a switch, it generates a current signal that is similar to that of a sensor. The DAC consists of a 6-bit adjustable current mirror and a 50 kΩ resistor. The dynamic range of the DAC can reach up to 250 mV, which means the injected calibration charge can be as high as 50 fC (with LSB = 0.8 fC). The output of the DAC can be connected to a pad or supplied externally. The rising edge of the calibration signal depends on the speed of the switch signal. Variations in the fabrication process can also lead to inaccuracies in the capacitor and resistor values. The final calibration will be conducted using physical events that provide a reference for the time of arrival, as the leading edge of the pulse generated by the calibration signal differs from that of the signal produced by the actual LGAD. The differing rise and fall times of the pulse result in distinct jitter characteristics in the measured timing between the calibration signal and the LGAD signal.
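The quoted injection charge follows from Q = C·V. The sketch below reproduces the numbers under the assumption of an ideal, linear 6-bit DAC; the constant and function names are illustrative.

```python
C_INJ_F = 200e-15      # injection capacitor from the text (200 fF)
V_MAX_V = 250e-3       # DAC dynamic range from the text (250 mV)
DAC_BITS = 6           # 6-bit adjustable current mirror

q_max_c = C_INJ_F * V_MAX_V            # maximum injected charge, Q = C * V
q_lsb_c = q_max_c / (1 << DAC_BITS)    # charge step, assuming a linear DAC


def injected_charge_fc(code):
    """Injected charge in fC for a DAC code (assumed ideal and linear)."""
    if not 0 <= code <= (1 << DAC_BITS):
        raise ValueError("code out of range")
    return code * q_lsb_c * 1e15


print(f"Q_max = {q_max_c * 1e15:.1f} fC, LSB = {q_lsb_c * 1e15:.2f} fC")
```

The result is a 50 fC full scale and a step of about 0.78 fC, consistent with the rounded 0.8 fC LSB given in the text.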



**Figure 1.6:** JuLoong calibration pulse generator schematic

**Data process and digital blocks** Clock generation unit: The clock generator unit provides clock signals to all functional blocks in the JuLoong chip. The oscillator in the PLL runs at 1.28 GHz, the highest frequency on the chip, from a 43.3 MHz input clock. Distributing the clock with low skew and jitter is a critical design challenge. A binary tree, the most common and conservative clock distribution scheme, is planned. In this scheme, the clock is branched from a central point (root) to all its destination nodes (leaves). The 7-stage tree structure ensures that the clock distribution network is balanced and that all path lengths from the clock source to each channel are equal. Buffers are added along the transmission path to preserve the quality of the clock signal.

Data readout process: The TDC data of each channel is buffered in a circular buffer. A scrambler is used, and a pseudo-random binary sequence (PRBS) block is included for test purposes. The JuLoong chip provides three serial output data rates, selected according to the anticipated occupancy of the chip. The slowest data rate is 320 Mbps, where a single byte is transmitted during each 40 MHz main clock cycle. The next data rate is 640 Mbps, where one 16-bit word is transmitted during each 40 MHz clock cycle. The highest supported data rate is 1.28 Gbps, where a 32-bit double word is transmitted during each 40 MHz clock cycle. Line coding schemes such as 8b/10b or 64b/66b are used on the serial output.
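The three serial rates and the size of the clock tree follow from simple arithmetic. The sketch below only restates the numbers in the text; the variable names are illustrative.

```python
MAIN_CLOCK_HZ = 40e6   # main clock quoted for the serial output
TREE_STAGES = 7        # binary clock tree depth

# Payload transmitted per main clock cycle for each output mode.
BITS_PER_CYCLE = {"byte": 8, "word": 16, "double word": 32}


def serial_rate_bps(bits_per_cycle):
    """Serial data rate when bits_per_cycle bits are sent every main clock cycle."""
    return bits_per_cycle * MAIN_CLOCK_HZ


# A balanced binary tree with 7 stages reaches 2**7 leaves.
clock_tree_leaves = 2 ** TREE_STAGES

for mode, bits in BITS_PER_CYCLE.items():
    print(f"{mode}: {serial_rate_bps(bits) / 1e6:.0f} Mbps")
print("clock tree leaves:", clock_tree_leaves)
```

The 128 leaves match the 128 channels addressed by the 7-bit channel identifier of the TDC word.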

**Slow control** Slow control mechanisms are implemented to configure the registers within the chip. The chip contains registers that manage various functions and operating modes, such as amplifier gain settings, test modes, data acquisition modes, and calibration controls. An I2C link is used in the ASIC and is configured through the I2C master in the ChiTu chip.

<span id="page-15-0"></span>**1.3.2.5.4 Prototype** A prototype chip named FPMROC has been designed. It includes 8 complete readout chains, each comprising a low-noise preamplifier, a discriminator, and a time-to-digital converter (TDC) for both time of arrival (TOA) and time over threshold (TOT) measurements. A data event builder is integrated to manage data flow, along with fast data serialization and a data driver for output. The TOT is employed to correct the time walk effect in the TOA measurements [5]. Peripheral circuits include digital-to-analog converters (DACs) for calibration and threshold adjustment, a phase-locked loop (PLL) to generate high-quality clocks, and a serial peripheral interface (SPI) module for slow control. Additionally, the prototype incorporates a charge injection circuit for testing and calibration.

**Design** Figure 1.7 presents the block diagram of the FPMROC ASIC (application-specific integrated circuit). The eight channels share a data event builder for buffering, framing, scrambling, and encoding the parallel data from the various channels. The ASIC includes a serializer for off-chip data transmission at a rate of 10.24 Gbps, and a low-jitter LC-based PLL that generates the 5.12 GHz and 40 MHz clocks for the serializer and TDCs, respectively. Additionally, an SPI is integrated to provide up to 200 configuration bits.



**Figure 1.7:** Block diagram of the FPMROC

**Front-end circuit** Two types of preamplifiers are implemented in the FPMROC. The first design employs a 4-stage amplifier to saturate the signal, where each stage offers high bandwidth but low gain, as illustrated in Figure 1.8(a). The second design incorporates a classic trans-impedance amplifier (TIA) that provides higher gain but operates at a slower speed, as shown in Figure 1.8(b). Both designs maintain an input impedance of approximately 50 $\Omega$ to ensure proper impedance matching.



**Figure 1.8:** (a) Schematic of the saturated amplifier, (b) Schematic of the trans-impedance amplifier

The discriminator converts analog pulses from the preamplifier into digital signals. It comprises a four-stage preamplifier and a comparator with programmable hysteresis. The preamplifier stages amplify small input pulses to a level suitable for reliable processing by the comparator. The comparator digitizes the differential input at the crossing point, with the hysteresis providing adjustable noise immunity. The hysteresis helps prevent false triggering near the threshold, ensuring robust signal detection.

**Time-to-digital converter** The TDC block uses 11 voltage-controlled differential delay cells, which form a ring oscillator using an interpolator approach, as shown in Figure 1.9. The interpolated differential delay cells provide the fine time resolution, and two sets of ripple counters provide the coarse time measurement. To address propagation variations, a self-calibration mechanism is implemented that records timestamps twice. This approach uses two recorders, constructed as chains of D-flip-flops (DFFs), to capture snapshots of both fine and coarse time for TOA and TOT measurements, as well as for calibration purposes.

**Figure 1.9:** The architecture of the TDC delay line

**Event data builder** The event data builder receives data from eight TDC channels, along with a 320 MHz synchronous clock from the serializer, to generate 64-bit parallel data at a rate of 160 Mbps. Figure 1.10 gives the scheme and data frame architecture. Two-level FIFOs are employed for data storage and aggregation. A frame builder retrieves and combines the data, which is subsequently forwarded to a 64B66B encoder incorporating the scrambling algorithm from 10 Gigabit Ethernet to maintain DC balance within the data stream. Finally, a gearbox ensures alignment with the data rate and width requirements of the high-speed serializer.
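The scrambling step mentioned for the event data builder can be illustrated with the self-synchronous scrambler polynomial used by 64b/66b coding in 10 Gigabit Ethernet, G(x) = 1 + x^39 + x^58. The bit-serial Python sketch below shows the principle only; it is not the FPMROC implementation, which would process 64 bits in parallel, and the function names are illustrative.

```python
STATE_BITS = 58
MASK = (1 << STATE_BITS) - 1


def _taps(state):
    """XOR of the bits delayed by 39 and 58 positions (G(x) = 1 + x^39 + x^58)."""
    return ((state >> 38) & 1) ^ ((state >> 57) & 1)


def scramble(bits, state=MASK):
    """Self-synchronous scrambler: the state holds the last 58 OUTPUT bits."""
    out = []
    for d in bits:
        s = d ^ _taps(state)
        out.append(s)
        state = ((state << 1) | s) & MASK
    return out, state


def descramble(bits, state=MASK):
    """Matching descrambler: the state holds the last 58 RECEIVED bits."""
    out = []
    for s in bits:
        out.append(s ^ _taps(state))
        state = ((state << 1) | s) & MASK
    return out, state


payload = [1, 0, 0, 0] * 50                 # strongly patterned input
tx, _ = scramble(payload)
rx, _ = descramble(tx)
print(rx == payload)
```

Because the descrambler state is built only from received bits, a receiver that starts with the wrong state still recovers the payload after the first 58 bits.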



**Figure 1.10:** The scheme and data frame of the event data builder

**PLL and serializer** Figure 1.11 shows the architecture of the PLL and serializer. Both designs have been silicon-proven and modified for this specific application. The PLL mainly comprises a phase-frequency detector (PFD), a charge pump (CP), an LC-based voltage-controlled oscillator (LC-VCO), a low-pass filter (LPF), and dividers and buffers [7]. The serializer converts 64 bits into a pair of differential serial outputs by employing four low-speed CMOS-logic 16:4 subunits, four CMOS-logic 4:1 subunits, and a high-speed current-mode logic (CML) multiplexer paired with a driver. A PRBS15 (pseudorandom binary sequence) generator is also integrated in the serializer for fast self-tests. Figure 6 shows the overall layout, which occupies an area of  $2.2 \times 3.4$  mm<sup>2</sup>.

**Figure 1.11:** The block diagram of the PLL and serializer

**Simulation Results** The capacitance Cs for the MCP-PMT is estimated to be approximately 4 pF. A stimulus input charge of 16 fC is injected during the simulation, characterized by rise and fall times of 100 ps and a pulse width of 100 ps. A total of 400 transient noise simulations were conducted for both front-end schemes. The total jitter of the discriminator for both schemes is less than 18 ps, as shown in Figure 5. For the preamplifier in the first scheme, the root mean square (RMS) noise is 158 $\mu$V, with a slope of 67 V/$\mu$s. In contrast, the RMS noise for the second scheme is 508 $\mu$V, with a slope of 145 V/$\mu$s.
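A common first-order estimate of the noise-induced timing jitter at a threshold crossing is the RMS noise divided by the signal slope. The sketch below applies that estimate to the preamplifier figures quoted above. It is an illustration of the relation only: it covers the preamplifier contribution alone and is not the method used to obtain the simulated total jitter of the discriminator.

```python
def noise_jitter_ps(rms_noise_uv, slope_v_per_us):
    """First-order jitter estimate sigma_t = sigma_V / (dV/dt), in picoseconds."""
    noise_v = rms_noise_uv * 1e-6          # microvolts -> volts
    slope_v_per_s = slope_v_per_us * 1e6   # V/us -> V/s
    return noise_v / slope_v_per_s * 1e12  # seconds -> picoseconds


scheme_1 = noise_jitter_ps(158, 67)    # saturated multi-stage amplifier
scheme_2 = noise_jitter_ps(508, 145)   # trans-impedance amplifier
print(f"scheme 1: {scheme_1:.2f} ps, scheme 2: {scheme_2:.2f} ps")
```

Under this estimate the two preamplifiers contribute roughly 2.4 ps and 3.5 ps respectively, both well below the 18 ps total jitter bound quoted for the discriminator.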



**Figure 1.12:** Transient noise simulation for output of preamplifier

The transfer function of the TDC is illustrated in Figure 6(a). In the TOA measurement, the arrival time of the input pulse is increased in fixed steps of 1 ps with each clock cycle. A least-squares linear fit of the measured data is also shown in Figure 6(a) (dashed blue line). The fit indicates that the TDC achieves a time resolution of 9.18 ps. The integral nonlinearity (INL) quantifies the deviation of each step center in the transfer function from the expected time. Figure 6(b) shows the INL and differential nonlinearity (DNL) of the TDC. According to the simulations, the INL of the delay line is less than 0.6 least significant bit (LSB), and the DNL is less than 0.7 LSB.
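The INL and DNL definitions can be made concrete with a small sketch. It assumes the code transition times (the edges of the steps in the transfer function) are known, takes the mean step width as the LSB, and uses made-up edge values; the function name and the data are illustrative, not simulation results.

```python
def dnl_inl(edges_ps):
    """Return (lsb, dnl, inl) in LSB units from the code transition times.

    DNL_i is the deviation of step width i from the mean width (the LSB).
    INL_i is the deviation of the center of step i from its ideal position.
    """
    widths = [b - a for a, b in zip(edges_ps, edges_ps[1:])]
    lsb = sum(widths) / len(widths)
    dnl = [(w - lsb) / lsb for w in widths]
    centers = [(a + b) / 2 for a, b in zip(edges_ps, edges_ps[1:])]
    ideal = [edges_ps[0] + (i + 0.5) * lsb for i in range(len(widths))]
    inl = [(c - t) / lsb for c, t in zip(centers, ideal)]
    return lsb, dnl, inl


# Hypothetical transition times in ps, for illustration only.
lsb, dnl, inl = dnl_inl([0.0, 9.0, 19.0, 30.0, 40.0])
print(lsb, [round(x, 3) for x in dnl], [round(x, 3) for x in inl])
```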

Since the PLL has been silicon-proven, detailed performance metrics can be found in reference [7], which reports a total jitter of less than 7.5 ps. Furthermore, a spectrum analyzer was used to characterize the phase noise performance of the frequency-halved output clock (2.56 GHz), with a corresponding measurement shown in Figure 8(a). Figure 8(b) presents a clear eye diagram of the modified serializer, simulated at 10.24 Gbps, while the earlier design has been verified with a measured total jitter of less than 43 ps.

<span id="page-18-0"></span>**1.3.2.5.5 Power distribution and grounding** To ensure the accuracy of signal timing measurements, the allocation of current sources within the chip must be carefully considered. The power and ground for the analog and digital sections are kept independent, with key devices in the analog section located within a deep N-well. Given the large number of channels in the chip and the significant area required for power distribution, it is essential to minimize the impedance of the power lines to reduce IR drop.

<span id="page-18-2"></span><span id="page-18-1"></span>**1.3.2.5.6 Radiation tolerance** In terms of radiation hardness design, the primary considerations are Total Ionizing Dose (TID) effects and Single Event Effects (SEE). TID can lead to an increase in the threshold voltage of MOSFETs, thereby degrading the timing performance of the circuit. To mitigate TID effects, relatively high bias currents are employed in the circuit, and the use of minimum-sized transistors is avoided. Additionally, substrate contacts are increased to prevent latch-up phenomena. The critical components of the digital logic in the chip are designed using Triple Modular Redundancy (TMR) to enhance Single Event Upset (SEU) tolerance.
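The principle of Triple Modular Redundancy can be shown with a bitwise majority voter: a register is stored three times, and a single upset copy is outvoted by the other two. This is a behavioral Python sketch with illustrative names and register values, not the chip's logic.

```python
def majority(a, b, c):
    """Bitwise 2-of-3 majority vote over three copies of a register."""
    return (a & b) | (a & c) | (b & c)


golden = 0b1011_0010                 # hypothetical configuration register value
copy_a = golden
copy_b = golden ^ 0b0000_1000        # one copy hit by a single-bit upset
copy_c = golden

voted = majority(copy_a, copy_b, copy_c)
print(bin(voted), voted == golden)
```

The voter corrects any upset confined to one copy; upsets of the same bit in two copies are not corrected.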

**1.3.2.5.7 Monitoring** The monitoring of the ASIC primarily focuses on operational temperature, voltage, and the leakage current of the sensors. While the chip itself is not sensitive to temperature, the LGAD is sensitive to temperature variations, allowing temperature detection through the channels to ascertain whether cooling has failed. Voltage monitoring within the chip can be used to determine if the chip is functioning properly and to control the shutdown of any malfunctioning modules.

<span id="page-19-0"></span>**1.3.2.5.8 Development plan and schedule** In the second half of 2024, the design of the LGAD readout scheme and the verification of the corresponding ASIC will be conducted. In Q1 of 2025, the ASIC will be submitted for wafer production to validate the performance of the preamplifier, discriminator, and TDC modules, along with the design of the ASIC test system. Performance testing of the ASIC will be carried out by the end of the year, and each module will undergo radiation hardness testing. In Q4 of 2025, the ASIC design will be improved, incorporating a digital logic control section, and the first version of the multi-channel integrated design will be submitted for wafer production. In the first half of 2026, the design of the multi-channel ASIC test system will be completed, alongside performance testing and radiation hardness testing of the ASIC, culminating in the connection and debugging with the LGAD. At the end of 2026, the multi-channel ASIC design will be further refined, and the V1 version of the ASIC will be submitted for wafer production. Simultaneously, the prototype design of the LGAD readout front-end electronic system will be initiated. In the first half of 2027, performance testing of the V2 version of the ASIC will be conducted, along with testing of the LGAD readout electronic system prototype, ensuring coordination with the LGAD. In the second half of 2027, the prototype system will be finalized in preparation for the mass production of the chips.

### <span id="page-19-1"></span>**1.3.3 SiPM readout ASIC**

#### <span id="page-19-2"></span>**1.3.3.1 General requirements**

| <b>Characteristics</b>            | <b>Value</b>                                 |
|-----------------------------------|----------------------------------------------|
| Input dynamic range               | 0.1 MIPs to 3000 MIPs                        |
| Charge resolution                 | 30% @ 0.1 MIPs, 10% @ 1 MIP, 1% @ 100 MIPs   |
| Time resolution                   | 100 ps                                       |
| Single-channel average event rate | 13 kHz/ch                                    |
| Maximum incident rate             | 230 kHz                                      |

**Table 1.4:** General requirements

#### <span id="page-19-3"></span>**1.3.3.2 ASIC architecture**

The ECAL detector requires a readout ASIC capable of detecting signals from 0.1 MIPs to 3000 MIPs, where each MIP corresponds to 200 photoelectrons. Because of the wide overall photon dynamic range, a low SiPM gain is necessary. The NDL EQR06 11-3030D-S SiPM is used, which features a gain of  $8 \times 10^4$, an area of 3 mm × 3 mm, and a capacitance of 5.1 pF/mm<sup>2</sup>. The parasitic capacitance of this SiPM is therefore 45.9 pF.

The conversion of optical signals to electrical signals can be determined using Equation 1-1, where $N$ is the signal amplitude in MIPs:

$$
Q = \frac{N \times 200 \times 1.6 \times 10^{-19}\ \text{C} \times 8 \times 10^{4}}{2}
$$

The dynamic range of the charge signal received by the readout circuit is 128 fC to 3.84 nC, corresponding to an input dynamic range of 30,000.
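The charge range can be reproduced directly from the charge equation. The sketch below evaluates it at both ends of the input range; the constants are taken from the text, including the factor of 1/2 that appears in the equation, and the names are illustrative.

```python
PE_PER_MIP = 200            # photoelectrons per MIP
ELECTRON_CHARGE_C = 1.6e-19
SIPM_GAIN = 8e4
SCALE = 0.5                 # the factor 1/2 that appears in the equation


def signal_charge_c(n_mips):
    """Charge delivered to the readout circuit for a signal of n_mips MIPs."""
    return n_mips * PE_PER_MIP * ELECTRON_CHARGE_C * SIPM_GAIN * SCALE


q_min = signal_charge_c(0.1)
q_max = signal_charge_c(3000)
print(f"{q_min * 1e15:.0f} fC to {q_max * 1e9:.2f} nC, ratio {q_max / q_min:.0f}")
```

This gives 128 fC at 0.1 MIPs and 3.84 nC at 3000 MIPs, a ratio of 30,000.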



Figure 1.1 Block diagram of the CEPC readout circuit design

To enable single-photon calibration, a dedicated path for single-photon input is required. Since single-photon signals are extremely small, the front end employs a low-noise charge-sensitive amplifier (CSA) for charge signal integration and amplification. If the output of the SiPM is connected directly to the CSA, the parasitic capacitance of the SiPM may affect the signal. A Current Buffer is therefore used for isolation, and the noise introduced by its current mirror must be taken into account for the current fed into the CSA.

Front-end readout circuits with large dynamic ranges typically use one of two approaches:

- Using an adjustable external capacitor within a single gain path to achieve different gain levels.
- Employing multiple gain paths to achieve measurements across a wide dynamic range.

In this project, the energy dynamic range of particles generated from collisions is unknown, making it impossible to adjust the gain to the correct level before signal generation. Therefore, a multi-gain path approach is selected to achieve wide dynamic range detection in this design.

In the readout chip's backend, ADCs and TDCs digitize the energy and time information, and a serializer is used for signal readout. A 10-bit ADC is planned for signal quantization. In the standard 55 nm CMOS process, the MOS transistor supply voltage is 1.2 V, so the quantization range of the ADC could span $-1.2$ V to $1.2$ V. Since only positive signals exist in the SiPM readout circuit, the ADC's designed quantization range should be 0 V to 1.2 V. The least significant bit (LSB) of the ADC can be calculated using Equation 2-7:

$$
LSB = \frac{1.2 \text{ V}}{1024} = 1.17 \text{ mV}
$$

The signal quantified by the ADC corresponds to the amplified output signal of the amplifier. If the output stage employs a common-source amplifier, a Vdsat of 200 mV must be reserved to ensure the MOS transistor operates in the saturation region. Thus, the ADC's quantization range is 200 mV to 1 V, with a dynamic range of 800 mV.
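The LSB and the effect of the reserved headroom can be checked numerically. The sketch below uses only the values stated above; the count of usable codes within the 200 mV to 1 V window is derived here for illustration and is not a figure from the design.

```python
FULL_SCALE_V = 1.2
ADC_BITS = 10
V_LOW, V_HIGH = 0.2, 1.0      # usable output window after reserving Vdsat

lsb_v = FULL_SCALE_V / (1 << ADC_BITS)
usable_range_v = V_HIGH - V_LOW
usable_codes = int(usable_range_v / lsb_v)   # whole codes inside the window

print(f"LSB = {lsb_v * 1e3:.2f} mV, usable range = {usable_range_v * 1e3:.0f} mV, "
      f"about {usable_codes} codes")
```

With these numbers the 800 mV window spans roughly 680 of the 1024 codes, so the effective resolution over the usable range is somewhat below 10 bits.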

#### <span id="page-20-0"></span>**1.3.3.3 Single-channel readout electronics**

<span id="page-20-1"></span>**1.3.3.3.1 Current Buffer** In the front-end readout circuit, the Current Buffer isolates the SiPM capacitance and increases the bandwidth of the downstream amplifier, making it an important part of the front-end readout module. The input impedance of the Current Buffer can be determined according to the magnitude of the input current. The structure of the Current Buffer is shown in Figure 3.2.



Figure 3.2 Current Buffer Schematic

The Current Buffer uses a common-gate MOS transistor M2 as the current buffer stage, and M2, M3, and M4 form a negative feedback loop that improves the response speed. When the SiPM outputs a current, the source current of M2 increases. For the current mirror M4 to deliver this larger current, the gate voltage VA drops. The drop in VA increases the gate-source voltage of M3, so the current flowing through M3 also increases. Because M2 and M3 are parallel branches whose total current is set by the current mirror M0, the increase in the M3 branch current reduces the M2 current, which closes the negative feedback loop. The buffered current is copied to the M6, M8, and M12 branches through the current mirror in fixed ratios, so that the signal can be processed at different gain levels.

<span id="page-21-0"></span>**1.3.3.3.2 Front-end amplifier** To limit the variation in output capacitance that a large change in load capacitance would cause, this design uses a variable input capacitance to achieve variable gain. The circuit structure is shown in Figure 3.3.



Figure 3.3 Schematic of the capacitive proportional amplifier

The amplifier gain should be at least 60 dB. In the actual design, non-ideal factors such as process variation, offset voltage, and capacitance mismatch must be considered, so a margin should be left: the amplifier gain should be greater than 70 dB to meet the gain error requirement.

The amplifier must pass the signal without attenuating its high-frequency content, so it needs sufficient bandwidth. As a rule of thumb, the gain-bandwidth product (GBW) of the amplifier is related to the rise time of the signal. For a rise time of 1 ns, the GBW must satisfy:

$$
\text{GBW} \ge \frac{0.35}{t_{rising}} = 350\ \text{MHz}
$$

The performance of the closed-loop amplifier depends largely on the performance of the open-loop amplifier, so the design of the open-loop amplifier is very important. Gain range 2 has the smallest gain and therefore the largest input dynamic range. By the time the signal reaches the amplifier it has been attenuated by a factor of 20, so the input dynamic range of the signal is only 0.4 V. The amplifier therefore does not need rail-to-rail inputs, but its output dynamic range must still be taken into account. This design uses a two-stage amplifier: the first stage is a folded cascode (common-source, common-gate) structure that provides high voltage gain, and the second stage is a common-source amplifier that increases the output swing. The schematic of the op-amp is shown in Figure 3.4.



Figure 3.4 Two-stage folded common-source, common-gate amplifier

<span id="page-21-1"></span>**1.3.3.3.3 Shaper circuit** The time measurement path and the analog measurement path require different shaping circuits, which respectively shorten or lengthen the signal shaping time. Slow shaping circuits are usually implemented with integrating networks or filters in order to smooth out the noise and obtain the total energy of the signal. Fast shaping circuits are mainly used to capture the temporal characteristics of the signal, emphasizing time resolution. They are designed to respond quickly to a signal and determine the time at which it arrives, and are usually based on high-pass filtering or differentiation circuits. Filter-based shaping circuits typically use CR-RC<sup>2</sup> filters, with the CR portion enhancing the fast signal and the RC<sup>2</sup> portion reducing noise and limiting bandwidth. The overall frequency response of this structure combines high-pass and low-pass characteristics to form a bandpass filter. The schematic is shown in Figure 3.5.



Figure 3.5 CR-RC2 shaping circuit

The transfer function of the filter circuit can be expressed as:

$$
H(s) = \frac{sRC}{(1+sRC)^{3}}
$$
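The bandpass behavior of this transfer function can be seen by evaluating its magnitude at s = jω. The sketch below does this for an assumed time constant of 50 ns, which is an illustrative value and not a design parameter; the function name is also illustrative.

```python
import math


def shaper_gain(freq_hz, tau_s):
    """Magnitude of H(s) = s*tau / (1 + s*tau)**3 evaluated at s = j*2*pi*f."""
    s_tau = 1j * 2 * math.pi * freq_hz * tau_s
    return abs(s_tau / (1 + s_tau) ** 3)


TAU_S = 50e-9   # assumed RC time constant, for illustration only

# Setting the derivative of |H| to zero gives a maximum at omega * tau = 1/sqrt(2).
f_peak_hz = 1 / (2 * math.pi * math.sqrt(2) * TAU_S)

for f in (f_peak_hz / 100, f_peak_hz, f_peak_hz * 100):
    print(f"f = {f / 1e6:9.3f} MHz  |H| = {shaper_gain(f, TAU_S):.4f}")
```

The gain falls off on both sides of the peak, where it reaches 2/(3√3) ≈ 0.385: the CR stage attenuates low frequencies and the two RC stages attenuate high frequencies.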

The time constant of the resistor-capacitor network can be expressed as:

$$
\tau = R \cdot C
$$

The total time constant of the response can be expressed as:

$$
\tau_{total} \approx 2.4\,\tau
$$

The full width at half maximum (FWHM) of the response is:

$$
\text{FWHM} \approx 5.28\,\tau
$$

<span id="page-22-0"></span>**1.3.3.3.4 Comparator** The comparator discriminates the arrival time of the signal. Using a current comparator saves the power that a fast shaper circuit would consume, so the signal from the Current Buffer is fed directly into the current comparator. Its structure is shown in Figure 3.6.



Figure 3.6 Fast Current Comparator

The high-speed current comparator is composed of three parts: the input stage, the current positive feedback circuit, and the slew-rate enhancement circuit. The input stage is realized with current mirrors (M1-M2 and M3-M4), where the SiPM current is compared with a reference threshold current. The second stage is a current comparator that amplifies the current difference from the first stage. In this work, a regulated cascode stage is used (M5-M9). The shunt feedback reduces the input impedance by a factor equal to the loop gain, resulting in a lower impedance than a source follower stage and making the stage more sensitive to changes in Iin. In the third stage, an inverter chain amplifies the threshold detection signal and forces it to swing rail-to-rail. A replica bias circuit forces the DC bias voltage at node Y to VDD/2.

When there is no SiPM current, the threshold current mirrored by the transistors M3-M4 flows into the regulated cascode stage, pulling the voltage of node Y below VDD/2. This drives the output of the comparator low. When there is a photon event, transistors M3-M4 mirror the SiPM current to node X. If the SiPM current is lower than the threshold current, the comparator output keeps its default state, and no signal passes through the current amplifier. However, when the SiPM delivers a current larger than the threshold current, the voltage at node Y rises above VDD/2. The inverter with resistive feedback (M10-M11) keeps its transistors in the saturation region, giving a very high voltage gain and a short propagation delay. Biasing node Y at mid-rail also reduces the propagation delay by holding the inverter input bias voltage near the comparator tripping point. The low input impedance of the regulated cascode, followed by the high gain of the inverter, ensures that any change in Iin propagates to the comparator output with minimal delay, even at high frequencies.

<span id="page-23-0"></span>**1.3.3.3.5 Time-to-Digital Converter** The time-to-digital converter measures the arrival time of the signal and converts it into a digital output. The target of this project is a TDC with an accuracy of 100 ps. This accuracy requirement is modest, so a two-step TDC is sufficient to achieve it. The block diagram of the structure is shown in Figure 3.7.



Figure 3.7 Block diagram of TDC structure

The two-step TDC combines coarse counting and fine counting to measure the signal arrival time. The coarse counting is realized by a counter and the fine counting by a delay chain. This structure mitigates the locking problem that a fine delay chain encounters when a signal propagates through it for a long time. The measurement principle of the TDC is shown in Figure 3.8.





<span id="page-23-1"></span>The counter measures the coarse part of the time interval, and the fine counter measures the under-counted time dt1 and the over-counted time dt2. Adding and subtracting these contributions gives the measured time interval. The measurement accuracy depends entirely on the accuracy of the delay unit, so a variable delay unit should be used to ensure that the chip can be tested. Assuming that the coarse counting period is $T_{coarse}$, the coarse count is $N_1$, the dt1 count is $N_2$, and the dt2 count is $N_3$, the measured time $T$ can be expressed as:

$$
T = N_1 \cdot T_{coarse} + N_2 \cdot LSB - N_3 \cdot LSB
$$
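The reconstruction formula can be written as a one-line function. In the sketch below the 100 ps fine LSB comes from the text, while the 6.4 ns coarse period (64 fine steps, matching a 6-bit fine count) and the example counts are assumptions chosen for illustration.

```python
FINE_LSB_PS = 100           # fine step stated in the text
COARSE_PERIOD_PS = 6400     # ASSUMED: 64 fine steps per coarse clock period


def two_step_time_ps(n1, n2, n3,
                     t_coarse_ps=COARSE_PERIOD_PS, lsb_ps=FINE_LSB_PS):
    """T = N1 * T_coarse + N2 * LSB - N3 * LSB, in picoseconds."""
    return n1 * t_coarse_ps + n2 * lsb_ps - n3 * lsb_ps


# Hypothetical counts: 3 coarse periods, dt1 = 10 fine steps, dt2 = 4 fine steps.
print(two_step_time_ps(3, 10, 4), "ps")
```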

**1.3.3.3.6 Sample and hold** In Silicon Photomultiplier (SiPM) readout circuits, switched-capacitor sampling is a critical technique for capturing and processing the fast analog signals generated by the SiPM. The circuit uses a clock-controlled switch to connect the input signal to a sampling capacitor during the sampling phase, allowing the capacitor to charge to the instantaneous voltage of the input signal. Once the sampling phase ends, the switch opens, isolating the capacitor and preserving the sampled voltage as a stable value during the hold phase. This held voltage is then buffered to prevent loading effects and ensure a consistent signal for subsequent processing, such as digitization or integration. By capturing and holding the signal at specific times, switched-capacitor sampling keeps distortion and noise low, which makes it well suited to high-precision SiPM applications where weak signals need to be faithfully preserved and analyzed. The switched-capacitor sample-and-hold array is shown in Figure 3.9.



Figure 3.9 Timing diagram for TDC measurement

<span id="page-24-0"></span>**1.3.3.3.7 Analog-to-digital converter (ADC)** Many ADC architectures have emerged as integrated circuit technology has developed. In high-energy physics, SAR ADCs, pipeline ADCs, and Wilkinson ADCs are commonly used. Considering the consistency required between channels and the fact that the measured signal is not high-frequency, the Wilkinson ADC was chosen for this design. Its circuit block diagram is shown in Figure 3.10.



Figure 3.10 Wilkinson ADC circuit block diagram

<span id="page-24-1"></span>The measurement principle of the Wilkinson ADC is as follows. When an input signal is detected, a current source charges a capacitor so that the voltage at the inverting input node of the comparator reaches the input level. The capacitor is then discharged by a fixed current source, and the width of the resulting pulse is measured by a high-speed counter. The counter output is the digital code word corresponding to the analog signal. Since the project requires a 10-bit ADC, the Wilkinson ADC can be designed by setting the discharge current so that the counter output is exactly 1024 when the capacitor has been charged to the maximum voltage of 1.2 V.
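The sizing rule for the discharge current can be sketched with an idealized model: the discharge time is t = C·V/I, and the counter counts clock periods during that time. The capacitor value and counter clock below are assumptions for illustration, not design values, and the function names are illustrative.

```python
import math

V_MAX = 1.2            # maximum input voltage from the text
FULL_SCALE = 1024      # counter value required at V_MAX (10-bit ADC)
C_HOLD_F = 1e-12       # ASSUMED hold capacitor (1 pF)
F_CLK_HZ = 500e6       # ASSUMED counter clock (500 MHz)


def discharge_current_a(c_f=C_HOLD_F, f_clk_hz=F_CLK_HZ):
    """Discharge current for which V_MAX takes exactly FULL_SCALE clock periods."""
    t_full_s = FULL_SCALE / f_clk_hz
    return c_f * V_MAX / t_full_s


def wilkinson_code(v_in, c_f=C_HOLD_F, f_clk_hz=F_CLK_HZ):
    """Ideal counter output: number of clock periods needed to discharge v_in."""
    t_discharge_s = c_f * v_in / discharge_current_a(c_f, f_clk_hz)
    # The small offset guards against floating-point rounding at exact codes.
    return math.floor(t_discharge_s * f_clk_hz + 1e-6)


print(f"I_discharge = {discharge_current_a() * 1e6:.3f} uA")
print([wilkinson_code(v) for v in (0.3, 0.6, 1.2)])
```

In this model the conversion time at full scale is 1024 clock periods, so a faster counter clock shortens the conversion time and requires a proportionally larger discharge current.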

#### **1.3.3.4 Data process and digital block**

The fine count accuracy of the TDC is 100 ps and the average event rate is 13 kHz, so the TDC requires 15 bits of coarse count and 2 × 6 bits of fine count, for a total of 27 bits. The ADC needs to transmit 12 bits, so a single channel produces 39 bits. With 16 channels in the readout system, a total of 624 bits needs to be transmitted. The schematic diagram of the data processing module is shown in Figure 3.11.



Figure 3.11 Serial output logic

All the data is encoded by the 8B10B encoder: the 78 groups of 8-bit parallel data with frequency f are converted into 10-bit data with a frequency of 78 f. A 5:1 MUX then multiplexes the data into two serial streams with a frequency of 390 f, and a final 2:1 MUX combines them into a single serialized output with a frequency of 780 f, which is sent to the FPGA for processing.
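The bit budget and the serialization ratios can be checked with a few lines of arithmetic. The sketch below restates the numbers from the text, with frequencies expressed as multiples of the frame rate f; the variable names are illustrative.

```python
TDC_BITS = 15 + 2 * 6        # coarse count plus two fine counts
ADC_BITS = 12
CHANNELS = 16

bits_per_channel = TDC_BITS + ADC_BITS
total_bits = bits_per_channel * CHANNELS

# 8B10B encoding maps each 8-bit group onto a 10-bit symbol.
groups_8bit = total_bits // 8
encoded_bits = groups_8bit * 10

# Rates as multiples of the frame rate f.
symbol_rate = groups_8bit             # 10-bit symbols per frame: 78 f
after_5to1_mux = symbol_rate * 5      # two serial streams at 390 f
after_2to1_mux = after_5to1_mux * 2   # one serial stream at 780 f

print(bits_per_channel, total_bits, groups_8bit, encoded_bits,
      after_5to1_mux, after_2to1_mux)
```

The final rate of 780 f equals the number of encoded bits per frame, so one complete frame is shifted out per period of f.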

### <span id="page-25-0"></span>**1.4 Global architecture (Wei Wei)**

#### <span id="page-25-1"></span>**1.4.1 Consideration on readout strategy (Wei Wei)**

For the electronics system of a next-generation large collider experiment, all front-end electronics subsystems should be designed according to a unified system specification, so that their data interfaces, power interfaces, and so on are provided in a uniform manner. The backend electronics should likewise be able to receive data, perform slow control, and configure the front-end electronics of each subdetector through a unified interface, and communicate with the TDAQ system. This will significantly enhance the unity of the electronics system. It will not only facilitate the unified design and management of the different subdetector systems but also allow the entire electronics system to be designed with maximized commonality. That is, the electronics of subdetectors of different scales can be built in a modular way simply by increasing the number of common generic modules accordingly.

To achieve this design style, it is necessary to first determine the overall strategy of the electronics and TDAQ systems. In other words, it is essential to clarify whether the electronics and TDAQ systems adopt a front-end trigger scheme or are based on a front-end triggerless readout scheme.

<span id="page-25-2"></span>

| <b>Characteristics</b>             | <b>FEE-Triggerless</b>    | <b>FEE-Trigger</b>        | <b>Superiority</b> |
|------------------------------------|---------------------------|---------------------------|--------------------|
| Where to acquire trigger info      | On BEE                    | On FEE                    | FEE-Triggerless    |
| Trigger latency tolerance          | Medium-to-long            | Short                     | FEE-Triggerless    |
| Compatibility on Trigger Strategy  | Hardware / software       | Hardware only             | FEE-Triggerless    |
| FEE-ASIC complexity on Trigger     | Simple                    | Complex on algorithm      | FEE-Triggerless    |
| Upgrade possibility on new trigger | High                      | Limited                   | FEE-Triggerless    |
| FEE data throughput                | Large                     | Small                     | FEE-Trigger        |
| Maturity                           | Relatively mature but new | Very mature               | FEE-Trigger        |
| Resources needed for algorithm     | High                      | Low                       | FEE-Trigger        |
| Representative experiments         | CMS, LHCb, ...            | ATLAS, BELLE2, BESIII, ... |                    |

**Table 1.5:** Comparison of the FEE-Triggerless readout and Trigger readout strategy

Table [1.5](#page-25-2) provides a general comparison of the two typical trigger readout schemes. It can be seen that the front-end trigger-based approach is relatively traditional. In this method, while the front-end electronics of the detector process the detector signals, they also need to extract key information usable for triggering from the detector signals and send it to the trigger system. At the same time, detector data needs to be cached in the front-end electronics. Once the trigger system receives the key information, it generates trigger decision information based on the physics model and corresponding trigger algorithms, which is then sent back to the front-end electronics. The front-end electronics compare the cached data with the trigger decision information to extract valid physical events and send them to the backend electronics, which further routes them to the data acquisition system.
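The buffering step described above can be illustrated with a toy model. The following is only a sketch: the class name `TriggeredFrontEnd`, the buffer depth, and the hit format are hypothetical and do not describe any CEPC front-end design. It shows why a front-end trigger scheme tolerates only short trigger latency: a decision that arrives after the cache has wrapped around finds no data.

```python
from collections import OrderedDict


class TriggeredFrontEnd:
    """Toy model of a front-end that caches hits until a trigger decision arrives."""

    def __init__(self, buffer_depth):
        # buffer_depth: number of bunch crossings the cache can hold (hypothetical value)
        self.buffer_depth = buffer_depth
        self.cache = OrderedDict()  # maps BCID -> list of hits

    def store(self, bcid, hits):
        """Cache the hits of one bunch crossing, dropping the oldest entry when full."""
        self.cache[bcid] = hits
        if len(self.cache) > self.buffer_depth:
            self.cache.popitem(last=False)

    def on_trigger(self, accepted_bcid):
        """Return the cached hits for an accepted crossing, or None if already overwritten."""
        return self.cache.pop(accepted_bcid, None)


fee = TriggeredFrontEnd(buffer_depth=4)
for bcid in range(10):
    fee.store(bcid, [f"hit@{bcid}"])

late = fee.on_trigger(2)      # decision arrives after the data was overwritten
in_time = fee.on_trigger(8)   # decision arrives within the buffer depth
```

In this model the buffer depth sets the maximum trigger latency, which is the trade-off listed in the "Trigger latency tolerance" row of the comparison table.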

On the other hand, the backend trigger-based (front-end triggerless) approach involves digitizing the detector signals in the front-end electronics and directly transferring them to the backend electronics for caching. The trigger system communicates only with the backend electronics, and valid detector events are extracted solely in the backend electronics and trigger system. The comparison in Table [1.5](#page-25-2) shows that these two main electronics frameworks each have advantages and disadvantages, with neither clearly superior. Both have been adopted by large particle physics experiments: the front-end triggerless scheme by CMS and LHCb, and the front-end trigger scheme by ATLAS, BELLE2, and BESIII.

The traditional front-end trigger scheme effectively eliminates detector background and reduces the pressure on data transmission bandwidth, but it increases the demand for data caching capacity in the front-end electronics. It usually allows only short trigger delays, and therefore requires fast trigger decisions and simple trigger algorithms. The front-end triggerless readout method, on the other hand, reduces the design complexity of the front-end electronics by eliminating trigger-related logic. However, since detector background and valid events are read out together, it increases the pressure on front-end data transmission. Because the back-end electronics offer more processing capability and cache space than the front-end electronics, the front-end triggerless readout scheme can implement relatively complex trigger algorithms. This relaxes the requirements on trigger delay and trigger system design, making pure software triggering possible. In China's collider spectrometer experiments, represented by BESIII, the front-end trigger-based approach is commonly adopted. Non-collider experiments, represented by JUNO and LHAASO, generally explore front-end waveform sampling schemes, but overall the implementation of trigger algorithms still follows relatively traditional approaches such as data compression and detector information extraction.

Considering the physics goals and project timeline of CEPC, the electronics-TDAQ framework based on the front-end triggerless readout scheme has been selected as the reference strategy for the electronics system of CEPC. The traditional framework based on front-end trigger readout will serve as a backup plan. This choice is based on the following considerations:

- **i)** Since CEPC will serve as a discovery machine, a front-end triggerless strategy retains the maximum possibility of exploring new physics, as all raw information from the detector, including background and signals, is fully read out. Once a front-end trigger is adopted, the trigger strategy is typically fixed to some extent, usually based on known physics processes, so that unknown physics processes may be discarded. Additionally, because front-end triggerless electronics do not require pre-condition judgments, they are well suited to future detector upgrades, which usually require only upgrades to the signal processing and readout capabilities of the front-end electronics to meet the new requirements of the detector. In contrast, if a front-end trigger strategy is adopted, new trigger strategies may need to be considered, potentially leading to a redesign of the front-end electronics.
- **ii)** Based on the above considerations, the triggerless strategy at the front-end will also effectively accelerate the process of ASIC iteration and qualification. Under this strategy, the front-end ASIC only needs to consider information related to detector detection and background, and provide output data rates, then all key interface parameters will be determined. On the other hand, ASICs based on front-end trigger strategy also need to consider related on-chip trigger algorithms, trigger information output, and trigger-compliant readout designs, which are often difficult to finalize in the early stages of detector overall design, inadvertently increasing the number of ASIC design iterations significantly. The introduction of new digital module design in front-end ASICs will also increase the risk of potential bugs. Considering the possible construction timeline of CEPC, rapidly completing the design, iteration, and qualification of ASICs, and completing the prototype design of the detector within a limited time frame, is undoubtedly the most suitable strategy for achieving the engineering goals of CEPC.
- **iii)** The triggerless front-end electronics readout framework also maximizes the versatility of the electronics system design. Because the interface between front-end electronics and back-end electronics is unified across sub-detectors under this framework, a common data interface design can be adopted to transmit front-end data to the back-end electronics. This data interface does not need to consider the specific trigger inputs and outputs of different sub-detectors, but only the common high-speed data transfer characteristics, clock distribution characteristics, slow control, BCID, etc. Furthermore, in this framework, trigger-related algorithms are implemented only in the FPGA devices of the back-end electronics and trigger system, and the online programmability of FPGAs allows the algorithms to be adjusted as needed, achieving detector independence. This also allows the back-end electronics and trigger system to be implemented in a similar manner with common back-end PCBs and common trigger PCBs. It is only necessary to scale the number of common PCBs according to the data volume of each sub-detector to meet the readout requirements of all sub-detectors. This greatly simplifies the overall design and management of the electronics system, allowing the electronics of different sub-detectors to be unified in a top-down manner, without the customized electronics designs for each sub-detector that traditional front-end trigger schemes require.

- **iv)** The key premise of the triggerless scheme is that the readout capability of the front-end electronics is sufficient to read out all data, including background and critical detector information. As shown in Table [1.1,](#page-7-0) based on the preliminary design of the front-end ASICs of the various sub-detectors and the background evaluation of the MDI system, the expected data rates at the common data interface of each detector module are generally below 9.71 Gbps. Considering the use of balanced encoding for high-speed data transfer, even with the relatively large overhead of 8b10b encoding, the output data rate is within 11.09 Gbps. Referring to the commonly used lpGBT interface chips, this data rate falls within the typical interface capabilities using optical fibers as the transmission medium. Even for typical high data rate detectors, represented by vertex detectors and electromagnetic calorimeters, the number of optical fiber channels of the data interface can be increased by using an MTX module without significantly increasing the interface size (see Section [1.5.1](#page-29-1)). This allows the maximum possible data rate of the detector module to reach 48.6 Gbps, equivalent to an effective data rate of 38.84 Gbps. Even considering the future High-Lumi Z working environment, there is ample capacity and flexible upgrade room in the readout capability. The necessary condition for adopting a triggerless readout scheme at the front end is therefore satisfied.



#### <span id="page-27-1"></span><span id="page-27-0"></span>**1.4.2 Baseline architecture for the Electronics-TDAQ system (Wei Wei)**

**Figure 1.13:** Global Framework of the Electronics-TDAQ System of CEPC

Figure [1.13](#page-27-1) presents the electronic system architecture based on the considerations above, which is built on a front-end triggerless readout scheme. The electronic system can be divided into two main parts: customized front-end electronics and a common platform. Customized front-end electronics are designed according to the specific requirements and layout of each subdetector system, as described in the corresponding sections. However, to maximize versatility, only the front-end readout ASIC responsible for triggerless readout of detector signals and backgrounds is fully customized. The data interface and power supply systems following it are implemented based on a common platform. The data interface platform initially receives multi-channel relatively low-speed data from the front-end ASIC by the data aggregation chip TaoTie, aggregates it into high-speed serial data, encodes the data through the data interface chip ChiTu, and finally converts the high-speed electrical data flow from ChiTu into optical signals using the optoelectronic module KinWoo. The optical signals are then transmitted through optical fibers to the common back-end electronics, completing the front-end data transmission chain. The detailed design of this part will be discussed in Section [1.5.1.](#page-29-1) The front-end power module efficiently converts external high-voltage power supplies into the required power voltages for the front-end chips, with detailed design covered in Section **??**. According to the previous plan, the back-end electronics and trigger system can be implemented using a common PCB approach, which is scalable based on the data volume of different subdetector systems to achieve the overall design. Within this framework, all trigger signals are transmitted between the common back-end board and the common trigger board, allowing flexible trigger algorithms to be implemented based on the physical objectives.

The benefit of dividing the electronics system into customized front-end electronics and a common platform is that the front-end electronics can be fully ASIC-based, achieving the goal of radiation tolerance design. By transmitting front-end data through optical fibers to the back-end electronics, the common platform can be placed in shielded environments further away from the collision point, enabling the utilization of high-performance commercial devices without the need for radiation tolerance considerations.

<span id="page-28-0"></span>

**Figure 1.14:** Backup Plan for the Global Framework of the Electronics-TDAQ System of CEPC

As a backup plan for triggerless front-end readout, falling back to the traditional front-end trigger readout scheme is a reasonable choice, as shown in Figure [1.14.](#page-28-0) With the continuous optimization of the detector design and the MDI background assessment, it is possible that the final detector background will exceed the rated transmission capability of the common data interface. In this case, the front-end electronics can still be adjusted step by step towards the backup plan of front-end trigger readout. The specific plan is as follows:

- **i)** When the detector background data rate exceeds the transmission capability of the fiber interface in the preliminary plan, additional fiber channels, each with an effective data rate of 9.71 Gbps, can be added to the MTX fiber interface to achieve a maximum data interface transmission capacity of 38.84 Gbps.
- **ii)** When the data rate levels of certain detector modules still exceed the above data rate limits, advanced data compression algorithms can be deployed in the front-end ASIC chip, such as extracting key information about hit clusters in the track and reducing redundant timestamp information output and so on, thereby reducing the data rate levels at the front end.
- **iii)** When the above optimization schemes still cannot meet the data transmission requirements, adding a Fast Trigger interface to detectors with data transmission bottlenecks (typically the Vertex Detector) can be considered, so that only this specific detector returns to the traditional trigger mode: the front-end chip first buffers the detector signals and background data; in order to limit the front-end chip buffer space to a reasonable range, the trigger system needs to generate relatively rough fast trigger arbitration information as quickly as possible and send it to the front-end chip through a dedicated channel; after the front-end chip receives the trigger signal, it compares the buffered data with the fast trigger information, and finally transmits the preliminarily arbitrated data to the back-end electronics. This approach can effectively reduce the front-end data rate, but it requires a reasonable fast trigger algorithm based on detector optimization and physics objectives. Further details on this part will be discussed in the relevant sections of the trigger system.
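The three-step fallback above can be summarized as a small decision function. This is a minimal sketch under stated assumptions: the per-channel payload of 9.71 Gbps and the 38.84 Gbps ceiling come from the text, the limit of four fiber channels is inferred from their ratio, and the function name `fallback_step` and the `compression_factor` parameter are hypothetical.

```python
CHANNEL_PAYLOAD_GBPS = 9.71   # effective data rate per fiber channel (from the text)
MAX_FIBER_CHANNELS = 4        # assumption: 38.84 Gbps / 9.71 Gbps per channel


def fallback_step(module_rate_gbps, compression_factor=1.0):
    """Return (step, fiber_channels) for a detector module data rate.

    compression_factor is a hypothetical reduction ratio achieved by
    front-end data compression (step ii); 1.0 means no compression.
    """
    eps = 1e-9  # guards against floating-point rounding at the exact limits
    for channels in range(1, MAX_FIBER_CHANNELS + 1):
        if module_rate_gbps <= channels * CHANNEL_PAYLOAD_GBPS + eps:
            step = "baseline" if channels == 1 else "i) add fiber channels"
            return step, channels
    max_capacity = MAX_FIBER_CHANNELS * CHANNEL_PAYLOAD_GBPS
    if module_rate_gbps / compression_factor <= max_capacity + eps:
        return "ii) front-end data compression", MAX_FIBER_CHANNELS
    return "iii) fast trigger", MAX_FIBER_CHANNELS


plan_low = fallback_step(5.0)
plan_mid = fallback_step(20.0)
plan_compressed = fallback_step(60.0, compression_factor=2.0)
plan_trigger = fallback_step(60.0)
```

The example rates are illustrative only; the actual module rates are those evaluated in Table 1.1.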

# <span id="page-29-0"></span>**1.5 Common Electronics interface**

#### <span id="page-29-1"></span>**1.5.1 Data interface (Di Guo, Xiaoting Li, Jingbo Ye)**

#### <span id="page-29-5"></span><span id="page-29-2"></span>**1.5.1.1 Overall architecture**



**Figure 1.15:** Overall architecture of the data interface

Figure [1.15](#page-29-5) shows the overall architecture of the data link system designed for CEPC. The front-end data from different detectors, with different data rates and channel counts, are transmitted to the back-end through this data link system as the uplink. The clocks, trigger signals, and configuration signals are transmitted to the front-end through the same data link system as the downlink.

This data link system is mainly composed of a series of ASICs and the customized array optical module KinWooTRX. The ASICs include the data pre-processing ASIC TaoTie, the bi-directional data interface ASIC ChiTu, the VCSEL array driver ASIC KinWooLDD, and the transimpedance amplifier receiver ASIC KinWooTIA. TaoTie functions as a data pre-aggregation ASIC for multiple channels of readout electronics in different sub-detectors. ChiTu serves as a bi-directional data transceiver with both an uplink and a downlink. The uplink receives the pre-aggregated data, performs encoding and serialization, and transmits the data to KinWooTRX. The downlink receives clock and slow control signals, performs de-serialization and decoding, and transmits data to blocks inside ChiTu and to devices outside ChiTu. KinWooLDD and KinWooTIA are the optical driver and receiver designed specifically for the VCSEL and photodiode (PD) arrays, respectively. They will be integrated and assembled within the optical module.

More detailed functions and interface descriptions of these ASICs are provided in the following sections.

#### <span id="page-29-3"></span>**1.5.1.2 TaoTie: Front-end Data Pre-process ASIC (Le Xiao)**

#### <span id="page-29-4"></span>**1.5.1.2.1 TaoTie function description and structure**

TaoTie employs a self-compatible scheme to handle a variety of data channels and rates from front-end sub-detectors. Each TaoTie can serialize a maximum of 8 channels into 1 channel, with configurable modes allowing 2 or 4 channels to be serialized into 1 channel. Moreover, an N-stage TaoTie can serialize 8N channels into 1 channel. Based on the current requirements, it is likely that 2 stages will be sufficient. The final serial output data rate should match the input data rate requirement of ChiTu, which is 1.39 Gbps based on the 43.33 MHz system clock.

#### <span id="page-30-0"></span>**1.5.1.2.2 Interface**

<span id="page-30-1"></span>The structure of the TaoTie chip is shown in Figure [1.16.](#page-30-1) There are three input modes: 8-channel input at 173.33 MHz, 4-channel input at 346.67 MHz, or 2-channel input at 693.33 MHz. After intermediate processing, the maximum output speed is 1.39 Gbps.



**Figure 1.16:** Architecture of TaoTie

In the 1.39 Gbps full-speed output mode, the following inputs are supported:

- 1). 173.33 Mbps X 8 Ch
- 2). 346.66 Mbps X 4 Ch
- 3). 693.33 Mbps X 2 Ch
- 4). 86.66 Mbps X 8 Ch (2x oversampling)
- 5). 43.33 Mbps X 8 Ch (4x oversampling)

In the 0.69 Gbps half-speed output mode, the following inputs are supported:

- 6). 86.66 Mbps X 8 Ch
- 7). 173.33 Mbps X 4 Ch
- 8). 346.66 Mbps X 2 Ch
- 9). 43.33 Mbps X 8 Ch (2x oversampling)

In the 0.34 Gbps output mode, the following inputs are supported:

- 10). 43.33 Mbps X 8 Ch
- 11). 86.66 Mbps X 4 Ch
- 12). 173.33 Mbps X 2 Ch

Cascaded combinations of the three output modes are supported.
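As a consistency check on the mode list above, the aggregate input rate of every mode (channel rate times channel count times oversampling factor) should equal the corresponding serial output rate. The sketch below only re-computes the numbers quoted in the list; the helper name `aggregate_gbps` and the 0.01 Gbps tolerance (which absorbs the rounding of the quoted figures) are assumptions.

```python
# (output rate in Gbps, per-channel input rate in Mbps, channels, oversampling factor)
MODES = [
    (1.39, 173.33, 8, 1), (1.39, 346.66, 4, 1), (1.39, 693.33, 2, 1),
    (1.39, 86.66, 8, 2), (1.39, 43.33, 8, 4),
    (0.69, 86.66, 8, 1), (0.69, 173.33, 4, 1), (0.69, 346.66, 2, 1),
    (0.69, 43.33, 8, 2),
    (0.34, 43.33, 8, 1), (0.34, 86.66, 4, 1), (0.34, 173.33, 2, 1),
]


def aggregate_gbps(rate_mbps, channels, oversampling=1):
    """Serial rate needed to carry all channels, including oversampled bits."""
    return rate_mbps * channels * oversampling / 1000.0


# Largest disagreement between a mode's aggregate input and its quoted output rate
worst_mismatch = max(
    abs(aggregate_gbps(rate, ch, factor) - out) for out, rate, ch, factor in MODES
)
all_consistent = worst_mismatch < 0.01  # tolerance for the rounded figures
```

Every listed rate is a power-of-two multiple of the 43.33 MHz system clock, so a slower output mode of one TaoTie matches an input mode of the next one (for example, a 0.34 Gbps output feeding a 346.66 Mbps X 4 Ch input), which is one plausible way to read the cascading option.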

#### <span id="page-31-0"></span>**1.5.1.3 ChiTu: Bi-direction Data Interface ASIC**

ChiTu primarily comprises a flexible high-precision clock system, a high-speed serializer, de-serializer, data builder, configuration capabilities, and monitoring. The 8-channel data at 1.39 Gbps received from TaoTie is processed by D-links in ChiTu. It undergoes alignment by phase aligners, encoding, DC-balancing by a data builder, serialization to a data rate of 11.09 Gbps, and is ultimately transmitted to KinWooTRX. The high-quality clocks imperative for the serializer are generated by an LC phase-locked loop (PLL), which can also provide several configurable output frequencies and phases externally. The de-serializer in ChiTu receives control signals, including fast command, at a data rate of 2.77 Gbps. The circuit recovers data and clock signals through an integrated clock data recovery (CDR) mechanism.

<span id="page-31-3"></span>

| <b>Capability of ChiTu ASIC</b> | <b>Channel Number</b> | <b>Data/Clock Rate</b> | <b>Notes</b> |
|---|---|---|---|
| Uplink Payload Data Channels (input) | Maximum 7 channels | 1.39 Gbps/ch | ChiTu can receive and transmit a maximum of 7 channels of data from TaoTie and front-end. The data rate is fixed at 1.39 Gbps per channel. Maximum uplink payload transmission capability is 9.71 Gbps per ChiTu ASIC. |
| Uplink External Control Channel (input) | 1 channel | 86.67 Mbps | ChiTu can transmit one specific channel at 86.67 Mbps. This channel has specific input pins. The data of this channel is embedded as 2 bits in the uplink frame. |
| Downlink Payload Data Channels (output) | Maximum 16 channels | Maximum 346.67 Mbps/ch; minimum 86.67 Mbps/ch | ChiTu can output a maximum of 16 channels of data to the front-end with the data rate of 86.67 Mbps/ch. The output data rate can be configured to 86.67 Mbps, 173.33 Mbps or 346.67 Mbps per channel with the channel number of 16, 8 or 4, respectively. |
| Downlink External Control Channel (output) | 1 channel | 86.67 Mbps | ChiTu can output one specific channel at 86.67 Mbps. This channel has specific output pins. The data of this channel is embedded as 2 bits in the downlink frame. This data channel is primarily used for slow control from back-end. |
| Provide Clock (output) | 16 channels | 43.33 MHz, 86.67 MHz, 173.33 MHz, 346.67 MHz, 693.33 MHz or 1.39 GHz | ChiTu can provide a maximum of 16 channels of differential clock to the front-end with configurable frequency from 43.33 MHz to 1.39 GHz. 2 channels out of these 16 can provide a phase adjustment function with a resolution of 54.17 ps for all frequencies. |

**Table 1.6:** Data transmission capability of ChiTu ASIC

#### <span id="page-31-1"></span>**1.5.1.3.1 Input/Output Channels in ChiTu**

In the uplink direction, ChiTu receives the multi-channel parallel data from the TaoTie ASIC and the front-end detectors, performs data reception, alignment, scrambling, encoding, frame building, and serialization, and outputs high-speed serial data to the optical module for optical data transmission.

In the downlink direction, ChiTu receives the high-speed serial data from the optical module, performs data alignment, clock recovery, decoding, and de-scrambling, outputs parallel data to the front-end, and also provides clocks and slow control data to the front-end.

<span id="page-31-2"></span>The bi-directional data transmission capability of ChiTu is summarized in Table [1.6.](#page-31-3) In general, ChiTu operates at a serial data rate of 2.77 Gbps for the downlink and at 5.55 Gbps or 11.09 Gbps (configurable) for the uplink, and can provide 16 channels of clock signals to the front-end with configurable frequencies and phases.

#### **1.5.1.3.2 Electrical Signal Interface Specification**

As a bi-directional interface ASIC, ChiTu receives electrical signals from TaoTie or the front-end electronics, and outputs electrical signals and clocks to the front-end. The ChiTu ASIC uses the "Tx module in D-Link" to transmit all data signals and clocks to the front-end, and the "Rx module in D-Link" to receive all data signals from TaoTie and the front-end electronics. The related input and output electrical signal specifications for ChiTu are summarized in Table [1.7](#page-32-1) and Table [1.8.](#page-32-2)

<span id="page-32-1"></span>





<span id="page-32-2"></span>

#### <span id="page-32-0"></span>**1.5.1.3.3 DPC (Data Phase Control)**

The DPC module automatically adjusts the phase/delay of the multi-channel data input to the ChiTu chip, so that the effective edge of the on-chip clock is located exactly in the middle of the data bits. In the current preliminary design, this module can achieve automatic phase alignment for data rates of 173.33 Mbps, 346.67 Mbps, 693.33 Mbps, and 1.39 Gbps. As shown in Figure [1.17,](#page-33-1) the DPC module is mainly composed of three parts: a delay-locked loop, a voltage-controlled delay chain, and digital selection logic. When data with unknown phase are input, they first pass through a VCDL (Voltage-Controlled Delay Line) composed of numerous identical voltage-controlled delay units. During this process, multiple copies of the data with fixed phase differences are generated from the taps of the delay units. These copies are then evaluated by the digital logic, which selects the one with the optimal phase and outputs it through a 16:1 multiplexer (MUX) circuit.

Since the input data rates differ, the delay units must be able to generate different delays. The DLL (Delay-Locked Loop) generates different control voltages when driven by clocks of different frequencies, thereby setting the delays of the four delay chains.

In addition, the DPC module has two different working modes, namely Auto-tracking mode and Training mode. In Auto-tracking mode, all the delay units and taps work normally, with the digital selection logic selecting the data with the optimal phase in real time. In Training mode, the calibration operation is carried out by using the prepared data. After the digital selection logic completes the selection of the taps for the first time, it locks in this result and does not make new selection operations.
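The tap selection and the two working modes can be sketched in a few lines. This is a behavioural illustration only, not the circuit: it assumes that the 16 taps evenly span one bit period (as a locked DLL would arrange) and that the input phase is expressed as the position of the data edge relative to the clock edge, in unit intervals. The names `select_tap` and `DataPhaseControl` are hypothetical.

```python
N_TAPS = 16  # matches the 16:1 multiplexer in the text


def select_tap(data_edge_phase_ui, n_taps=N_TAPS):
    """Pick the tap whose delayed data edge lies half a bit period from the clock edge.

    data_edge_phase_ui: position of the data transition relative to the
    sampling clock edge, as a fraction of one bit period (0.0 to 1.0).
    """
    def distance_to_eye_centre(tap):
        delayed_edge = (data_edge_phase_ui + tap / n_taps) % 1.0
        return abs(delayed_edge - 0.5)

    return min(range(n_taps), key=distance_to_eye_centre)


class DataPhaseControl:
    """Behavioural model of the two DPC working modes."""

    def __init__(self, mode="auto"):
        self.mode = mode          # "auto" (Auto-tracking) or "training"
        self.locked_tap = None

    def update(self, data_edge_phase_ui):
        # Training mode keeps the first selection and ignores later phase changes.
        if self.mode == "training" and self.locked_tap is not None:
            return self.locked_tap
        tap = select_tap(data_edge_phase_ui)
        if self.mode == "training":
            self.locked_tap = tap
        return tap


auto = DataPhaseControl("auto")
training = DataPhaseControl("training")
auto_taps = [auto.update(p) for p in (0.0, 0.25)]
training_taps = [training.update(p) for p in (0.0, 0.25)]
```

In Auto-tracking mode the selection follows a drifting phase, whereas in Training mode the tap chosen on the prepared calibration data stays fixed.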

<span id="page-33-1"></span>

**Figure 1.17:** Structure of the Data Phase Control module

#### <span id="page-33-0"></span>**1.5.1.3.4 Protocol**

ChiTu serves as a bi-directional data transceiver with a downlink bandwidth of 2.77 Gbps and an uplink bandwidth that can be set by the user to 5.55 Gbps or 11.09 Gbps. Both the uplink and downlink use scrambling to limit the DC imbalance of the transmitted data and to enable reliable Clock and Data Recovery (CDR), and Forward Error Correction (FEC) to detect and correct transmission errors. Additionally, the transmitted FEC codes are interleaved to improve the efficiency of the FEC code.

#### **I. Overall structure of the data digital process in ChiTu**

The overall structure of the digital processing is shown in Figure [1.18.](#page-34-0) In the receiver section (downlink), the serial data (2.77 Gbps) is first converted from serial to parallel form by the De-serializer. The resulting interleaved frame IFRAMED is then de-interleaved by the De-Interleaver to obtain FRAMED. Next, the FRAMED data is decoded by the Decoder, with error corrections applied where needed, and finally de-scrambled by the De-scrambler to recover the Header/BDC/UDC/Data.

In the transmitter section (uplink), the data to be transmitted is first scrambled by the Scrambler to achieve DC balance, and then encoded by the Encoder using a Forward Error Correction code. The encoded bits are interleaved to construct the interleaved frame IFRAMEU, which is finally serialized and transmitted.

<span id="page-34-0"></span>

**Figure 1.18:** ChiTu encoding/decoding block diagram

#### **II. Encode/Decode**

Due to factors such as noise, inter-symbol interference, or single-event upsets (SEU), data can be corrupted during transmission. Forward Error Correction (FEC) enables the correction of these errors without requiring the data to be retransmitted. This is accomplished by sending additional "redundant" bits alongside the original data. While this improves the reliability of the transmission, it does so by reducing the available data bandwidth.

In ChiTu, the codes employed are Reed-Solomon codes, which are a class of FEC codes. These codes operate on "non-binary" symbols formed of $m$ bits. A message composed of $k$ symbols is encoded into an $n$-symbol word with $n = 2^m - 1$. The number of redundancy symbols is then $n-k$, which allows the correction of up to $t=(n-k)/2$ symbols (or equivalently $m \cdot t = m(n-k)/2$ bits). To minimize link latency, the complexity of the code is kept to a minimum by correcting only one symbol error per RS block ($t=1$).

The downlink employs an FEC code with $m=3$, and thus the number of symbols in the word is $n = 2^3 - 1 = 7$. The code is designed to correct one symbol error ($t=1$), and thus the number of redundancy symbols is $n-k=2t=2$, allowing for $k=n-2t=5$ data symbols. The code employed is identified as $RS(n,k) = RS(7,5)$.

In the ChiTu, 4 code groups are interleaved allowing to correct up to 4t=4 symbols or, equivalently, 4mt=12 bits. One code group can handle 15 information bits, so 4 code groups can handle up to 60 information bits. However, only 40 bits are effectively employed and transmitted due to the limited size of the frame (64 bits), which includes the BDC, UDC and Data fields. The remaining 20 bits must be padded and fed to the encoder. These padding bits are not transmitted but are set to a default value. At the receiver, these "known" bits are employed to feed the FEC decoder. However, all the FEC code bits must be transmitted.
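The bit budget of the downlink frame described above can be written out explicitly, using only the numbers quoted in the text ($m=3$, $n=7$, $k=5$, 4 code groups, 40 transmitted information bits):

$$
\begin{aligned}
\text{encoder input capacity} &= 4 \times k \times m = 4 \times 5 \times 3 = 60 \ \text{bits},\\
\text{padding bits} &= 60 - 40 = 20 \ \text{bits},\\
\text{FEC bits} &= 4 \times (n-k) \times m = 4 \times 2 \times 3 = 24 \ \text{bits},\\
\text{transmitted frame} &= 40 + 24 = 64 \ \text{bits}.
\end{aligned}
$$

Assuming one frame is transmitted per 43.33 MHz system clock cycle, a 64-bit frame corresponds to $64 \times 43.33\ \text{MHz} \approx 2.77$ Gbps, which agrees with the quoted downlink line rate.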

<span id="page-34-1"></span>For the uplink, the ChiTu allows the user to choose between two FEC codes and two data rates. The ASIC configuration impacts the user bandwidth, error correction strength, the maximum number of available uplink D-Links and their bandwidth. This is summarized in Table [1.9.](#page-34-1)

| Link             | <b>Downlink</b> | <b>Uplink</b> | <b>Uplink</b> | <b>Uplink</b> | <b>Uplink</b> |
|------------------|-----------------|---------------|---------------|---------------|---------------|
| Data rate [Gbps] | 2.77            | 5.55          | 5.55          | 11.09         | 11.09         |
| FEC              | FEC12           | FEC5          | FEC12         | FEC5          | FEC12         |
| RS               | RS(7,5)         | RS(31,29)     | RS(15,13)     | RS(31,29)     | RS(15,13)     |
| n [symbols]      | 7               | 31            | 15            | 31            | 15            |
| k [symbols]      | 5               | 29            | 13            | 29            | 13            |
| t [symbols]      | 1               | 1             | 1             | 1             | 1             |
| m [bits]         | 3               | 5             | 4             | 5             | 4             |
| Code groups      | 4               | 1             | 3             | 2             | 6             |
| FEC [bits]       | 24              | 10            | 24            | 20            | 48            |
| Correction [bits]| 12              | 5             | 12            | 10            | 24            |

**Table 1.9:** Downlink and Uplink FEC code parameters vs data rate
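The FEC and correction rows of Table 1.9 follow from the Reed-Solomon parameters of each mode. The short Python sketch below is only a consistency check, not part of the ChiTu design: it assumes t = (n - k)/2 for every code, and the class and field names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RsMode:
    """One column of Table 1.9 (class and field names are illustrative)."""
    label: str
    n: int       # code word length [symbols]
    k: int       # information symbols per code word
    m: int       # bits per symbol
    groups: int  # interleaved code groups

    @property
    def t(self) -> int:
        # A Reed-Solomon code corrects up to (n - k) // 2 symbol errors.
        return (self.n - self.k) // 2

    @property
    def fec_bits(self) -> int:
        # Parity symbols per group, times symbol width, times group count.
        return self.groups * (self.n - self.k) * self.m

    @property
    def correctable_bits(self) -> int:
        # Best case: every corrected symbol is fully corrupted.
        return self.groups * self.t * self.m


MODES = [
    RsMode("downlink 2.77 Gbps FEC12", n=7, k=5, m=3, groups=4),
    RsMode("uplink 5.55 Gbps FEC5", n=31, k=29, m=5, groups=1),
    RsMode("uplink 5.55 Gbps FEC12", n=15, k=13, m=4, groups=3),
    RsMode("uplink 11.09 Gbps FEC5", n=31, k=29, m=5, groups=2),
    RsMode("uplink 11.09 Gbps FEC12", n=15, k=13, m=4, groups=6),
]

for mode in MODES:
    print(f"{mode.label}: t={mode.t}, FEC={mode.fec_bits} bits, "
          f"correction={mode.correctable_bits} bits")
```

The names FEC5 and FEC12 correspond to the number of correctable bits at the lower uplink rate; at 11.09 Gbps the doubled number of code groups doubles both figures.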

To enable the error correction algorithm to handle SEU error events that affect two adjacent bits, which may span different symbols, Reed-Solomon blocks are interleaved. This approach ensures that even if an SEU or burst error corrupts multiple bits across symbol boundaries, the errors can still be corrected because the affected symbols are distributed across different RS blocks.

For the downlink, ChiTu performs de-interleaving. Since the downlink FEC code can only correct single symbol errors  $(t=1)$  of 3 bits, and four encoders are used to cover the entire frame (36 data bits), the interleaving of the codes from the four encoders allows error correction for up to four consecutive symbols or, equivalently, up to 12 consecutive erroneous bits. The specific method of interleaving for the downlink frame is detailed in the Data Frame section.

For the uplink, ChiTu handles the interleaving process. Since a single encoder is used for FEC5 at 5.55 Gbps, no interleaving is performed for this case, and the frame is transmitted as-is. For other cases, interleaving is applied as described in the Data Frame section.
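The benefit of interleaving can be illustrated with a small counting model. The sketch below is a simplification under stated assumptions: symbols are assigned to code groups in a plain round-robin order, which is not the actual ChiTu bit mapping (that mapping is given by the interleaved frame figures), and the function names are hypothetical. It counts how many corrupted symbols each code group sees for a burst of consecutive erroneous bits on the downlink (3-bit symbols, 4 groups, t = 1).

```python
from collections import Counter

M_BITS = 3   # bits per symbol on the downlink
GROUPS = 4   # number of downlink code groups
T = 1        # correctable symbols per code group


def errors_per_group(burst_start, burst_len, interleaved, n_symbols=8):
    """Count corrupted symbols per code group for one burst of bit errors.

    Symbols are numbered in transmission order. With interleaving, symbol s
    belongs to group s % GROUPS (round-robin, an assumed mapping). Without
    it, each group owns a contiguous run of symbols.
    """
    first = burst_start // M_BITS
    last = (burst_start + burst_len - 1) // M_BITS
    per_group = n_symbols // GROUPS
    counts = Counter()
    for s in range(first, last + 1):
        group = s % GROUPS if interleaved else s // per_group
        counts[group] += 1
    return dict(counts)


def correctable(counts):
    """A burst is correctable if no group sees more than T bad symbols."""
    return all(c <= T for c in counts.values())


with_interleaving = errors_per_group(0, 12, interleaved=True)
without_interleaving = errors_per_group(0, 12, interleaved=False)
print(with_interleaving, correctable(with_interleaving))
print(without_interleaving, correctable(without_interleaving))
```

In this model a symbol-aligned burst of 12 bits places exactly one corrupted symbol in each of the four groups when the symbols are interleaved, whereas the same burst overloads two groups when each group occupies contiguous symbols.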

#### **III. Scramble**

In the ChiTu, the Scrambler/De-scrambler employs a self-synchronizing architecture, meaning that no synchronization pattern or reset signal is required for the de-scrambler to synchronize.

The self-synchronizing scrambler and de-scrambler in ChiTu utilize a parallel equivalent implementation of the serial scrambler and de-scrambler. This design is optimized to match the ASIC N-parallel input and helps reduce latency and minimize the operating frequency. In a serial implementation, N clock cycles are needed to process N data inputs and generate N scrambled outputs. In contrast, the parallel implementation requires only one clock cycle, though it comes at the cost of increased logic complexity and larger silicon area.

For the downlink, the number of bits to be scrambled/de-scrambled is 36: UDC[1:0], BDC[1:0] and Data[31:0]. A 36-bit scrambler/de-scrambler (order 36) is used that implements the following recursive scrambling equation: $S_i = D_i \ \mathrm{xnor}\ S_{i-25} \ \mathrm{xnor}\ S_{i-36}$.

<span id="page-35-0"></span>For the uplink, scrambling depends on the data rate (which sets the number of scramblers used) and on the FEC code (which sets the scrambling equation used). This is summarized in Table [1.10.](#page-35-0)

| Link                   | <b>Downlink</b> | <b>Uplink</b> | <b>Uplink</b> | <b>Uplink</b> | <b>Uplink</b> |
|------------------------|-----------------|---------------|---------------|---------------|---------------|
| Data rate [Gbps]       | 2.77            | 5.55          | 5.55          | 11.09         | 11.09         |
| <b>FEC</b>             | FEC12           | FEC5          | FEC12         | FEC5          | FEC12         |
| Data [bits]            | 36              | 116           | 102           | 232           | 204           |
| Scrambler width [bits] | 36              | 58            | 51            | 58            | 51            |
| Scrambler order        | 36              | 58            | 49            | 58            | 49            |
| Number of scramblers   | 1               | 2             | 2             | 4             | 4             |
| Recursive equation     | Eq1             | Eq2           | Eq3           | Eq2           | Eq3           |
| Eq1                    | $S_i = D_i \ \mathrm{xnor}\ S_{i-25} \ \mathrm{xnor}\ S_{i-36}$ |  |  |  |  |
| Eq2                    | $S_i = D_i \ \mathrm{xnor}\ S_{i-39} \ \mathrm{xnor}\ S_{i-58}$ |  |  |  |  |
| Eq3                    | $S_i = D_i \ \mathrm{xnor}\ S_{i-40} \ \mathrm{xnor}\ S_{i-49}$ |  |  |  |  |

**Table 1.10:** Downlink and Uplink scrambling vs FEC and data rate

For testing purposes, the scrambler/de-scrambler can be bypassed.
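The self-synchronizing behaviour can be illustrated with a bit-serial Python model of the downlink equation (Eq1). This is only a sketch: ChiTu uses a parallel equivalent, the left-to-right evaluation of the two XNOR operations is an assumption, and the function names are illustrative. The de-scrambler keeps a history of received (scrambled) bits only, so it recovers the data after 36 bits whatever its initial state.

```python
import random
from collections import deque


def xnor(a: int, b: int) -> int:
    return 1 ^ a ^ b


def scramble(bits, taps=(25, 36), state=None):
    """Serial model of S_i = D_i xnor S_{i-t0} xnor S_{i-t1}."""
    order = max(taps)
    # history[-j] holds S_{i-j}; it is built from previous scrambled outputs.
    history = deque(state if state is not None else [0] * order, maxlen=order)
    out = []
    for d in bits:
        s = xnor(xnor(d, history[-taps[0]]), history[-taps[1]])
        out.append(s)
        history.append(s)
    return out


def descramble(bits, taps=(25, 36), state=None):
    """Inverse operation; the history is built from received bits only."""
    order = max(taps)
    history = deque(state if state is not None else [0] * order, maxlen=order)
    out = []
    for s in bits:
        d = xnor(xnor(s, history[-taps[0]]), history[-taps[1]])
        out.append(d)
        history.append(s)
    return out


rng = random.Random(1)
data = [rng.randint(0, 1) for _ in range(200)]
tx_state = [rng.randint(0, 1) for _ in range(36)]  # unknown to the receiver

tx = scramble(data, state=tx_state)
rx = descramble(tx)  # receiver starts from an all-zero state

# After 36 received bits both histories are identical, so the data match.
assert rx[36:] == data[36:]
```

The uplink equations of Table 1.10 can be modelled in the same way by passing `taps=(39, 58)` or `taps=(40, 49)`.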

#### **IV. Data Frame**

The downlink frame is composed of 64 bits, resulting in a data rate of 2.77 Gbps. The frame structure is shown in Figure [1.19.](#page-35-1)

<span id="page-35-1"></span>

**Figure 1.19:** Downlink Frame (64 bits) before interleaving

The uplink frame length depends on the data rate: the frame is 128 bits long for 5.55 Gbps transmission and 256 bits long for 11.09 Gbps. In addition, because the length of the FEC field depends on the error correction strength (FEC5 or FEC12), the field lengths differ among the four modes. The exceptions are the H, BDC and UDC fields, which have the same length in all four modes of operation. The Uplink 11.09 Gbps FEC12 frame structure is shown in Figure [1.20.](#page-36-0)

<span id="page-36-0"></span>

**Figure 1.20:** Uplink 11.09 Gbps FEC12 Frame(256 bits) before interleaving

The frame is organized as follows:

- **i)** H-field: A 4-bit (downlink) or 2-bit (uplink) Header that delimits the start of the frame. The header pattern is chosen to guarantee the DC balance of the header code and to implement header redundancy, allowing robust header detection in the presence of noise and/or single event upsets;
- **ii)** BDC-field: Composed of 2 bits, implements the downlink/uplink of the Bidirectional Control (BDC) channel used to control the ChiTu itself (only operational in transceiver mode). The data rate is 86.67 Mbps;
- **iii)** UDC-field: Composed of 2 bits, implements the downlink/uplink of the Unidirectional Control (UDC) channel. The data rate is 86.67 Mbps;
- **iv)** D-field: The field has variable length (depending on data rate and FEC code used). The length of this field is given in Table [1.11.](#page-36-1)
- <span id="page-36-1"></span>**v)** FEC-field: The field carries the Forward Error Correction code to detect and correct transmission errors due to noise or Single Event Upsets (SEU). The length of this field is given in Table [1.11.](#page-36-1)

| Link                        | <b>Downlink</b> | <b>Uplink</b> | <b>Uplink</b> | <b>Uplink</b> | <b>Uplink</b> |
|-----------------------------|-----------------|---------------|---------------|---------------|---------------|
| Data rate [Gbps]            | 2.77            | 5.55          | 5.55          | 11.09         | 11.09         |
| <b>FEC</b>                  | FEC12           | FEC5          | FEC12         | FEC5          | FEC12         |
| Frame [bits]                | 64              | 128           | 128           | 256           | 256           |
| Header [bits]               | 4'b1001         | 2'b10         | 2'b10         | 2'b10         | 2'b10         |
| <b>BDC</b> [bits]           | 2               | 2             | 2             | 2             | 2             |
| UDC [bits]                  | 2               | 2             | 2             | 2             | 2             |
| LM (zeros + DownBDC) [bits] | 0+0             | 0+0           | 0+2           | 4+2           | 8+2           |
| Data [bits]                 | 32              | 112           | 96            | 224           | 192           |
| FEC [bits]                  | 24              | 10            | 24            | 20            | 48            |

**Table 1.11:** Downlink and Uplink frame field lengths vs FEC and data rate

- **vi)** LM-field: The "Latency Measurement" field is a special field that allows the round-trip latency of the transceiver link (excluding D-Links) to be estimated. In this field, the two bits of the Downlink BDC-field are returned by the transmitter, so the field is only valid when the ASIC operates as a transceiver. Depending on the transmitter data rate and FEC encoding, the field is padded with a different number of leading zeros. This field is not available when operating at 5.55 Gbps with FEC5 encoding; see Table [1.11](#page-36-1) for details.
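The field lengths of Table 1.11 can be cross-checked against the frame sizes and the quoted data rates. The Python sketch below is a consistency check only. It assumes that one frame is transmitted per cycle of the 43.33 MHz system clock and that the exact clock value is 130/3 MHz; both are inferred from the rounded figures in this section and are not stated design parameters. The dictionary keys are illustrative.

```python
FRAME_CLOCK_MHZ = 130 / 3  # assumed exact value behind the quoted 43.33 MHz

# Field lengths in bits, copied from Table 1.11 (keys are illustrative).
FRAMES = {
    "downlink 2.77 Gbps FEC12": {"frame": 64, "header": 4, "bdc": 2, "udc": 2,
                                 "lm": 0, "data": 32, "fec": 24},
    "uplink 5.55 Gbps FEC5": {"frame": 128, "header": 2, "bdc": 2, "udc": 2,
                              "lm": 0, "data": 112, "fec": 10},
    "uplink 5.55 Gbps FEC12": {"frame": 128, "header": 2, "bdc": 2, "udc": 2,
                               "lm": 2, "data": 96, "fec": 24},
    "uplink 11.09 Gbps FEC5": {"frame": 256, "header": 2, "bdc": 2, "udc": 2,
                               "lm": 6, "data": 224, "fec": 20},
    "uplink 11.09 Gbps FEC12": {"frame": 256, "header": 2, "bdc": 2, "udc": 2,
                                "lm": 10, "data": 192, "fec": 48},
}


def field_sum(fields):
    """Total length of all fields except the frame size itself."""
    return sum(bits for name, bits in fields.items() if name != "frame")


def rate_mbps(bits_per_frame):
    """Throughput of a field, assuming one frame per system clock cycle."""
    return bits_per_frame * FRAME_CLOCK_MHZ


for label, fields in FRAMES.items():
    assert field_sum(fields) == fields["frame"], label
    print(f"{label}: line rate {rate_mbps(fields['frame']) / 1000:.2f} Gbps, "
          f"user data {rate_mbps(fields['data']) / 1000:.2f} Gbps, "
          f"BDC {rate_mbps(fields['bdc']):.2f} Mbps")
```

Under these assumptions the fields of every mode add up to the frame length, and the 2-bit BDC and UDC fields give the 86.67 Mbps quoted for the control channels.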

<span id="page-36-2"></span>

| <b>IFRAMED</b>  | [63:56]                        | [55:24] | [23:0]     |
|-----------------|--------------------------------|---------|------------|
| <b>Function</b> | H, BDC, H, BDC, H, UDC, H, UDC | Data    | <b>FEC</b> |

**Figure 1.21:** Downlink Interleaved Frame

<span id="page-37-0"></span>

**Figure 1.22:** Downlink Frame before interleaving



**Figure 1.23:** Uplink 5.55 Gbps FEC5 Frame

<span id="page-37-3"></span><span id="page-37-1"></span>

| <b>IFRAMEU</b> | <b>Function</b> | <b>Bits</b> | <b>Group</b> |
|----------------|-----------------|-------------|--------------|
| [127:126] | Header | [0:1] | |
| [125:124] | BDC | [0:1] | 2 |
| [123:120] | Data | [66:67], [32:33] | 1, 0 |
| [119:116] | UDC, DownBDC | [0:1], [0:1] | 2, 2 |
| [115:84] | Data | [62:65], [28:31], [92:95], [58:61], [24:27], [88:91], [54:57], [20:23] | 1, 0, 2, 1, 0, 2, 1, 0 |
| [83:24] | Data | [84:87], [50:53], [16:19], [80:83], [46:49], [12:15], [76:79], [42:45], [8:11], [72:75], [38:41], [4:7], [68:71], [34:37], [0:3] | 2, 1, 0, 2, 1, 0, 2, 1, 0, 2, 1, 0, 2, 1, 0 |
| [23:0] | FEC | [20:23], [12:15], [4:7], [16:19], [8:11], [0:3] | 2, 1, 0, 2, 1, 0 |

**Figure 1.24:** Uplink 5.55 Gbps FEC12 Interleaved Frame

| <b>FRAMEU</b>   | [127:126] | [125:124]  | [123:122]  | [121:120]      | [119:24] | [23:0] |
|-----------------|-----------|------------|------------|----------------|----------|--------|
| <b>Bits</b>     | [0:1]     | [0:1]      | [0:1]      | [0:1]          | [95:0]   | [23:0] |
| <b>Function</b> | Header    | <b>BDC</b> | <b>UDC</b> | <b>DownBDC</b> | Data     | FEC    |

**Figure 1.25:** Uplink 5.55 Gbps FEC12 Frame before interleaving

<span id="page-37-2"></span>The ChiTu Clock and Data Recovery system requires the incoming serial data stream to have a high frequency of "0-to-1" and "1-to-0" transitions. However, this is not guaranteed for the BDC, UDC, and D fields initially. To ensure this requirement is met, these three fields are scrambled before being inserted into the frame for transmission. The FEC codes, along with the data, are then calculated based on the scrambled BDC, UDC, and D fields. The resulting frame structure, after all these operations, is depicted in Figures [1.22,](#page-37-0) [1.23,](#page-37-1) [1.25,](#page-37-2) [1.27,](#page-38-0) and [1.29.](#page-39-0)

The interleaving process for both downlink and uplink frames is illustrated in Figures [1.21,](#page-36-2) [1.24,](#page-37-3) [1.26,](#page-38-1) and [1.28.](#page-38-2) In these figures, IFRAMED[63:0]/IFRAMEU[255/127:0] represents the frame as it is transmitted over the optical fiber, with IFRAMED[63]/IFRAMEU[255/127] being sent first and IFRAMED[0]/IFRAMEU[0] last. These figures show the relationship between the frame bits and the FEC codes, Data, UDC, BDC fields, and Header, as well as the code group each field belongs to.

### **1.5.1.3.5 On-chip clock System**

<span id="page-38-1"></span>

**Figure 1.26:** Uplink 11.09 Gbps FEC5 Interleaved Frame

<span id="page-38-0"></span>![](_page_38_Picture_30.jpeg)

**Figure 1.27:** Uplink 11.09 Gbps FEC5 Frame before interleaving

<span id="page-38-2"></span>

| <b>IFRAMEU</b> | <b>Function</b> | <b>Bits</b> | <b>Group</b> |
|----------------|-----------------|-------------|--------------|
| [255:254] | Header | [0:1] | |
| [253:252] | BDC | [0:1] | |
| [251:250] | UDC | [0:1] | |
| [249:240] | Data | [167:168], [134:135], [100:101], [66:67], [32:33] | 4, 3, 2, 1, 0 |
| [239:236] | 0 | 0000 | 5 |
| [235:216] | Data | [164:167], [130:133], [96:99], [62:65], [28:31] | 4, 3, 2, 1, 0 |
| [215:212] | 0 | 0000 | 5 |
| [211:192] | Data | [160:163], [126:129], [92:95], [58:61], [24:27] | 4, 3, 2, 1, 0 |
| [191:190] | DownBDC | [0:1] | 5 |
| [189:96] | Data | [190:191], [156:159], [122:125], [88:91], [54:57], [20:23], [186:189], [152:155], [118:121], [84:87], [50:53], [16:19], [182:185], [148:151], [114:117], [80:83], [46:49], [12:15], [178:181], [144:147], [110:113], [76:79], [42:45], [8:11] | 5, 4, 3, 2, 1, 0, 5, 4, 3, 2, 1, 0, 5, 4, 3, 2, 1, 0, 5, 4, 3, 2, 1, 0 |
| [95:48] | Data | [174:177], [140:143], [106:109], [72:75], [38:41], [4:7], [170:173], [136:139], [102:105], [68:71], [34:37], [0:3] | 5, 4, 3, 2, 1, 0, 5, 4, 3, 2, 1, 0 |
| [47:0] | FEC | [44:47], [36:39], [28:31], [20:23], [12:15], [4:7], [40:43], [32:35], [24:27], [16:19], [8:11], [0:3] | 5, 4, 3, 2, 1, 0, 5, 4, 3, 2, 1, 0 |

**Figure 1.28:** Uplink 11.09 Gbps FEC12 Interleaved Frame

<span id="page-39-0"></span>

| <b>FRAMEU</b>   | [255:254] | [253:252]  | [251:250]  | [249:242] | [241:240] | [239:48] | [47:0]     |
|-----------------|-----------|------------|------------|-----------|-----------|----------|------------|
| <b>Bits</b>     | [0:1]     | [0:1]      | [0:1]      | 8'b0      | [0:1]     | [191:0]  | [47:0]     |
| <b>Function</b> | Header    | <b>BDC</b> | <b>UDC</b> |           | DownBDC   | Data     | <b>FEC</b> |

**Figure 1.29:** Uplink 11.09 Gbps FEC12 Frame before interleaving

The clock system mainly comprises a phase-locked loop (PLL), a clock and data recovery (CDR) circuit and a phase adjustment block, as shown in Figure 1.30. The PLL receives a 43.33 MHz system clock and generates the clocks required by ChiTu. It also provides up to 16 differential clock outputs, with configurable frequency on all 16 channels and configurable phase on 2 channels. The available output clock frequencies for each channel are 43.33, 86.67, 173.33, 346.67, 693.33, and 1386.67 MHz. The phases are regulated by a phase adjustment block (CPC, clock phase control), which provides a resolution of 45.08 ps for all six frequencies with a full range of 23.07 ns. The CDR operates at 2.77 Gbps, receiving serial data from the back-end and recovering both the data and the clock for the de-serializer block in ChiTu.

![](_page_39_Figure_4.jpeg)

**Figure 1.30:** Block diagram of the on-chip clock system

**I. PLL** The PLL consists of a voltage-controlled oscillator (VCO), a phase-frequency detector (PFD), a differential charge pump (CPp), a low-pass filter (LPFp), an input divider with a division of 2 (or 3) to keep the duty cycle, a feed-back divider chain with a total division of 256 (or 128), several buffers for clock transmission, and a frequency selector.

To achieve low-jitter performance in the PLL, an LC-tank VCO is preferred for its high Q-factor. The LC-VCO architecture includes a 3-terminal inductor, two p-type varactors, a pair of cross-coupled NMOS transistors, 1-bit controlled Metal-Oxide-Metal capacitors (MOM-CAPs), a current source, an enable switch, and two bias voltage filters. The 3-terminal inductor is selected over the 2-terminal one at 5 GHz because of its superior Q-factor, as stated in the process documentation. Complementary negative-resistance units are utilized for improved symmetry and reduced phase noise. In this implementation, an NMOS cross-coupled pair is used within the constraints of the process power supply (1.2 V) to compensate for energy losses in the LC tank. A PMOS current mirror is favored because of its lower flicker noise compared to an NMOS current mirror. MOM capacitors are employed to cover the lower frequency band. Additionally, the LC-VCO integrates two low-pass filters to attenuate noise originating from the bias circuits, enabling a phase noise of -113 dBc/Hz (at a 1 MHz offset from 5.12 GHz) at a Kvco of approximately 0.74 GHz/V.

The Phase Frequency Detector (PFD) employs an edge-detection structure for phase and frequency error detection. The Charge Pump (CP) features a differential design with complementary current sources and switches. Cascode current mirrors are implemented to minimize channel-length modulation effects, while a unity-gain buffer ensures consistent DC operating points for both branches. The loop bandwidth (LBW) is adjustable from 250 kHz to 1.55 MHz through programmable resistors in the Low Pass Filter (LPF) and charging current control, allowing trade-offs between locking time and noise contribution while accommodating process variations.

**II. CDR** The CDR consists of a bang-bang phase detector (PD), a frequency detector (FD), a differential charge pump (CPc) and a low-pass filter (LPFc), and it shares the VCO with the PLL. Only one of the PLL and the CDR operates at a time. During CDR operation, the divide-by-2 clocks are looped back to the PD and FD to generate the Vctrlc signal that adjusts the VCO. In this case, the CDR works at 2.77 Gbps while the VCO operates at 5.55 GHz, which is required for data serialization; the 2.77 GHz clock is required by the deserializer.

**III. CPC (Clock Phase Control)** The clock phase control of each output channel is implemented using two-stage Delay-Locked Loops (DLLs). To support frequencies ranging from 43.33 to 1386.67 MHz, the first stage comprises 32 coarse voltage-controlled delay cells (VCDCs), where the feedback delay clock is selected based on the frequency. Each coarse VCDC comprises 16 fine VCDCs and has a cell delay of 721.15 ps, giving a phase adjustment range of about 23.1 ns with 32 cells. The resulting phases are selected for output through a programmable coarse multiplexer. The fine voltage-controlled delay line (VCDL), which is controlled by a DLL built from the same VCDC, consists of 16 fine VCDCs and enables a fine delay adjustment of 45.08 ps.
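The clock figures quoted in this section are related by simple arithmetic, which the Python sketch below reproduces. It assumes an exact reference frequency of 130/3 MHz behind the rounded 43.33 MHz and assumes that the 32 coarse cells span one reference period; the function names and the coarse/fine code split are illustrative and do not describe the ChiTu register map.

```python
BASE_CLOCK_MHZ = 130 / 3   # assumed exact value behind the quoted 43.33 MHz
COARSE_CELLS = 32          # coarse delay cells in the first DLL stage
FINE_CELLS = 16            # fine delay cells per coarse cell


def output_frequencies_mhz():
    """Selectable output frequencies: the reference times 1, 2, 4, ..., 32."""
    return [round(BASE_CLOCK_MHZ * 2 ** k, 2) for k in range(6)]


def coarse_delay_ps():
    """Delay of one coarse cell if 32 cells span one reference period."""
    period_ps = 1e6 / BASE_CLOCK_MHZ
    return period_ps / COARSE_CELLS


def fine_delay_ps():
    """Delay of one fine cell (one coarse cell split into 16 steps)."""
    return coarse_delay_ps() / FINE_CELLS


def phase_code(delay_ps):
    """Split a requested delay into (coarse, fine) step counts."""
    steps = round(delay_ps / fine_delay_ps()) % (COARSE_CELLS * FINE_CELLS)
    return divmod(steps, FINE_CELLS)


print(output_frequencies_mhz())
print(f"coarse step {coarse_delay_ps():.2f} ps, fine step {fine_delay_ps():.2f} ps")
print(phase_code(1000.0))
```

With these assumptions the coarse step evaluates to 721.15 ps and the fine step to about 45.07 ps, within rounding of the quoted 45.08 ps, and the full range equals one period of the reference clock.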

### **1.5.1.3.6 Serializer**

Figure [1.31](#page-41-0) shows the overall block diagram of the serializer. The chip employs two cascaded stages of 4:1 MUX circuits to serialize 16 channels of 693.33 Mbps parallel input data into a single 11.09 Gbps high-speed serial data output. The design mainly consists of a low-speed 4:1 MUX, a high-speed 4:1 MUX, a clock control module, a receiver module (Rx), a transmitter module (Tx), a PRBS self-test module, and other components.

The first stage comprises four low-speed 4:1 MUXs, which convert the 16 channels of 693.33 Mbps parallel input data into 4 channels of 2.77 Gbps serial data output. The second stage consists of a high-speed 4:1 MUX that converts the 4 channels of 2.77 Gbps parallel input into a single channel of 11.09 Gbps serial data output.

The clock MUX3 selection module receives a pair of 5.55 GHz differential clocks. This clock control module provides four multi-phase 693.33 MHz clock signals to the low-speed 4:1 MUX and four multi-phase 2.77 GHz clock signals to the high-speed 4:1 MUX. The receiver module (Rx) receives 16 pairs of 693.33 Mbps differential CML signals and converts them into single-ended full-amplitude voltage signals for serialization by the low-speed 4:1 MUX.

The transmitter module (Tx) sends a pair of 11.09 Gbps differential CML signals off-chip to drive external loads. For convenient testing and verification of the chip's logic, a PRBS self-test module is designed. Sixteen switches (SW) are placed between the receiver module (Rx) and the four low-speed 4:1 MUXes to select whether the input data is provided externally or by the on-chip PRBS self-test module.

When the switch (SW) is set to "1", the chip's input data is provided by an external signal source and is converted by the receiver module (Rx) into single-ended full-amplitude voltage signals for subsequent parallel-to-serial conversion. When the switch (SW) is set to "0", the chip's input data is provided by the internal PRBS source. The 16 channels of single-ended full-amplitude 693.33 Mbps parallel data generated by the PRBS source undergo parallel-to-serial conversion in the core circuit of the chip, producing a high-speed 11.09 Gbps serial PRBS15 sequence. This sequence is then transmitted off-chip via the data transmission circuit (Tx), enabling the self-test functionality of the chip.

### **1.5.1.3.7 Deserializer**

The structure of the 2.77 Gbps 1:16 deserializer is shown in Figure [1.32.](#page-41-1) It mainly consists of a data receiver circuit (DRX), a 1:4 demultiplexer (DEMUX), a single-ended-to-differential (S2D) circuit, an LVDS output circuit, a clock receiver circuit (CRX), and a divide-by-4 circuit (Divider/4). The input data rate and clock frequency are 2.77 Gbps and 2.77 GHz, respectively. The input data is first converted into single-ended 2.77 Gbps CMOS data by the DRX circuit. This data then passes through the first-stage 1:4 DEMUX, producing four parallel channels of 693.33 Mbps data. Each of these four channels is then processed by one of four parallel 1:4 DEMUXs, achieving a 4:16 demultiplexing operation. This results in 16 channels of single-ended 173.33 Mbps data. The S2D circuit converts the single-ended signals into complementary CMOS signals. These signals are then fed into the LVDS driver circuit, which outputs 16 parallel channels of differential 173.33 Mbps LVDS data.

<span id="page-41-0"></span>![](_page_41_Figure_1.jpeg)

**Figure 1.31:** Block diagram of the serializer

<span id="page-41-1"></span>

![](_page_41_Figure_4.jpeg)

**Figure 1.32:** Block diagram of the deserializer
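The two-stage structure of the deserializer can be mirrored in a few lines of Python. This is a behavioural sketch only: it assumes ideal round-robin 1:4 demultiplexers and takes the 2.77 Gbps input as 64 × 130/3 MHz, an assumption that is consistent with the quoted 693.33 Mbps and 173.33 Mbps lane rates.

```python
from fractions import Fraction


def demux_1_to_4(bits):
    """Ideal 1:4 demultiplexer: output k takes every fourth bit, starting at bit k."""
    return [bits[k::4] for k in range(4)]


def demux_1_to_16(bits):
    """Two cascaded 1:4 stages, as in the deserializer description."""
    lanes = []
    for stage1_lane in demux_1_to_4(bits):
        lanes.extend(demux_1_to_4(stage1_lane))
    return lanes


# Assumed input rate in Mbps: 64 x (130/3) MHz, quoted as 2.77 Gbps.
input_rate = Fraction(130, 3) * 64
stage1_rate = input_rate / 4    # about 693.33 Mbps per lane
output_rate = stage1_rate / 4   # about 173.33 Mbps per lane

# Label each serial bit with its index to see where it ends up.
lanes = demux_1_to_16(list(range(64)))
first_bits = [lane[0] for lane in lanes]
```

In this naive cascade the lanes do not come out in serial-bit order (`first_bits` starts 0, 4, 8, 12, 1, ...), so a real design has to fix the lane-to-bit mapping explicitly. How the chip orders its outputs is not stated in the text.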

# **1.5.1.4 KinWooLDD: VCSEL Driver**

KinWooLDD is a four-channel vertical-cavity surface-emitting laser (VCSEL) driver. Each channel mainly comprises an input equalizer stage, a limiting amplifier stage, and an output-driving stage. The design specifications of this VCSEL driving ASIC are listed in Table [1.12](#page-42-0).

The block diagram of the 11.09 Gbps/ch VCSEL driving ASIC is shown in Figure [1.33](#page-42-1). It consists of an equalizer stage, a limiting amplifier stage and a novel output driver stage. The equalizer stage receives a pair of 11.09 Gbps/ch differential signals. The limiting amplifier stage receives the signals from the equalizer stage and further amplifies the differential signals with sufficient gain and bandwidth for the output driver stage. The output driver stage converts the amplified differential voltage signals from the limiting amplifier stage into a single-ended high-speed current signal and drives the external VCSEL.

<span id="page-42-0"></span>

| <b>Parameter</b>                       | <b>Design indicators</b>                       |
|----------------------------------------|------------------------------------------------|
| Power supply voltage                   | 1.2 V and 3.3 V                                |
| Power consumption                      | typical 50 mW/ch, 200 mW @ 4 × 11.09 Gbps/ch   |
| Bit rate                               | 11.09 Gbps/ch                                  |
| Differential input signal amplitude    | Min. 200 mV p-p                                |
| Differential input impedance           | $100 \Omega$                                   |
| Maximum equalization strength          | $> 7$ dB                                       |
| Limiting amplifier stage gain          | $> 12$ dB @typical                             |
| Limiting amplifier stage bandwidth     | $> 9.8$ GHz @typical                           |
| Output current amplitude               | 5 mA @typical                                  |
| Maximum pre-emphasis strength          | $> 2.5$ dB                                     |
| Simulated ISI jitter                   | $< 15$ ps                                      |

**Table 1.12:** Specifications of the KinWooLDD ASIC

<span id="page-42-1"></span>![](_page_42_Figure_3.jpeg)

**Figure 1.33:** Block diagram of the VCSEL driver

The equalizer stage and limiting amplifier stage are shown in Figure [1.34](#page-43-0). Figure [1.34](#page-43-0) (a) shows the equalizer stage circuit. To compensate for the high-frequency losses in the printed circuit board (PCB) traces, bonding wires and input pads, a programmable continuous-time linear equalizer (CTLE) structure is added in the equalizer stage. The limiting amplifier stage is composed of two amplifier blocks, each of which adopts a shared-inductor technique to obtain sufficient gain and bandwidth, as shown in Figure [1.34](#page-43-0) (b).

The output driver stage is shown in Figure [1.35](#page-43-1). Due to the relatively high VCSEL threshold voltage (>1.6 V), the output driver stage uses a 3 V power supply to obtain sufficient voltage headroom. To avoid the use of thick-oxide (high-voltage) transistors, a stacked PMOS current source (M1, M2, M3 and M4) built from 1.2 V core transistors is adopted. Because the voltage of the output node is always larger than 1.6 V, an on-chip AC coupling circuit is used to raise the DC point of the input signals (at the gates of M5 and M6) and to ensure the safety of the 1.2 V core transistors M5 and M6. To match the raised DC point, the tail current part also adopts a stacked current source structure (M8, M9, M11 and M12).

The output driver stage in the conventional design uses one feed-forward capacitor to enhance the bandwidth and reduce the jitter. In the proposed output driver, because the PMOS current source uses the stacked structure (M1, M2, M3 and M4), two feed-forward capacitors (C1 and C2) are added between the left branch and the gates of M2 and M4. The passive inductor L1 and the programmable CTLE pre-emphasis structure (R4 and C5) are also used to further enhance the bandwidth.

To increase the driving bandwidth further, active-feedback and feed-forward equalizer (FFE) pre-emphasis techniques can be implemented in the pre-driver and output driver, respectively.

# **1.5.1.5 KinWooTIA: Trans-impedance Receiver**

KinWooTIA is a four-channel photodiode (PD) receiver with a data rate capability exceeding 2.77 Gbps or 5.55 Gbps per channel for standard and enhanced requirements, respectively. Each channel consists primarily of a transimpedance amplifier (TIA), a limiting amplifier (LA), and a driver stage. Table [1.13](#page-44-0) shows the specifications.

<span id="page-43-0"></span>![](_page_43_Figure_1.jpeg)

**Figure 1.34:** Equalizer and limiting amplifier stage (a) equalizer stage (b) limiting amplifier stage

<span id="page-43-1"></span>![](_page_43_Figure_3.jpeg)

**Figure 1.35:** Output driving stage

<span id="page-44-0"></span>![](_page_44_Picture_201.jpeg)

**Table 1.13:** Specifications of the KinWooTIA ASIC

Figure [1.36](#page-44-1) shows the block diagram of the whole design. The chip has four channels and adopts a differential architecture. For each channel, a PIN photodiode is AC coupled to the transimpedance amplifier using on-chip capacitors. An on-chip biasing circuit ensures proper biasing of the photodiode at high irradiation levels. A fully differential cascade TIA with programmable feedback resistance is designed to achieve low noise and adjustable bandwidth. The limiting amplifier is composed of four stages to achieve high gain and high bandwidth simultaneously. An output buffer is integrated in the chip to interface with an external 100  $\Omega$  differential output load. The offset cancellation circuit controls the DC current in the input stage, which cancels the DC offset along the amplifying chain.

<span id="page-44-1"></span>The main design specifications are shown in Table [1.13](#page-44-0). The TIA is designed to achieve a high transimpedance gain (>20 k $\Omega$ ), high bandwidth (>2 GHz), and low input-referred noise (<2  $\mu$ A RMS).

![](_page_44_Figure_5.jpeg)

**Figure 1.36:** Block diagram of the TIA receiver

# **1.5.1.6 KinWooTRX: Customized Optical Module**

KinWooTRX is an optical module consisting of a KinWooLDD, a KinWooTIA, a four-channel VCSEL array, a four-channel PD array and a carrier board for the optocoupler devices. The height requirement is xxx mm.

#### **1.5.1.6.1 Module Introduction**

### **1.5.1.6.2 Electrical Interface**

### **1.5.1.6.3 Optical Interface**

The optical transceiver chosen for data transmission is a four-transmitter (TX), four-receiver (RX) module assembled in Chip-on-Board (COB) format, as illustrated in Figure [1.37](#page-45-0) (a). For detector readout over a few hundred meters, 850 nm multi-mode technology is the suitable optical link. The COB assembly is compact enough to house up to 12 channels. Figure [1.37](#page-45-0) (b) shows the prism used for connection to an MT-type fiber connector. A prototype transceiver equipped with 4 TX and 4 RX channels is shown in Figure [1.37](#page-45-0) (c). It measures 10 × 20 mm<sup>2</sup>, and the prism mounted on it is 2.75 mm high.

Plugging the optical transceiver into an electrical connector on the motherboard will require further consideration of the fiber connector fixture and of heat dissipation.

- [1] B. Deng et al., JINST 17 (2022) C05005.
- <span id="page-45-0"></span>[2] D. Hall et al., JINST 7 (2012) C01047.

![](_page_45_Figure_7.jpeg)

**Figure 1.37:** (a) The Chip-on-Board assembly of the opto-electronics. The VCSEL and PD are aligned to the prism lenses for light coupling, and guide pins align the prism to the MT-type fiber connector. (b) The prism, made of PEI plastic, is 9.5 mm wide and 2.75 mm high. (c) A prototype transceiver (QTRx [2]) using a Hirose electrical connector.

Radiation tolerance of the optical link components is required in the collider environment. The best-known fiber type is F-doped fiber, for which an attenuation of 0.02 dB/m at 100 kGy(Si) has been reported [1]. However, F-doped fiber is of OM2 type, intended for 1 Gbps speeds, and has a large propagation loss. The most popular multi-mode fibers for high speed (>10 Gbps, OM3 and OM4) are Ge-doped fibers, available from many manufacturers. For a distance of 150 m, the specified propagation loss is 0.38 dB. The radiation performance of Ge-doped fibers can differ considerably depending on the dopant composition and the fabrication method. The Total Ionizing Dose (TID) characteristics of Ge-doped samples were investigated with Co60 gamma rays, taking into account the known dependence on temperature, dose rate, and annealing.

[1] D. Hall et al., JINST 7 (2012) C01047.

In Figure [1.38](#page-46-0) (a), a bare-fiber sample for the TID test was inserted in a water tank maintained at a constant temperature between -15 °C and 45 °C, at a dose rate controlled between 3 Gy/hr and 1.5 kGy/hr. The Radiation Induced Attenuation (RIA) of a fiber as a function of total ionizing dose (TID) is expressed by

$$\mathrm{RIA} = \frac{IL(T) - IL(0)}{\mathrm{Length}}$$

$$IL\,(\mathrm{dB}) = 10 \times \log_{10}\left(P_t / P_r\right)$$

where $IL$ is the insertion loss in dB, $P_t$ is the transmitted optical power and $P_r$ is the received optical power. The RIA gives the attenuation per unit length after receiving a dose $T$.
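The two definitions translate directly into a small helper, sketched below. The power readings and the 100 m sample length are hypothetical values chosen for illustration, not measured data.

```python
import math


def insertion_loss_db(p_transmitted, p_received):
    """IL(dB) = 10 * log10(P_t / P_r); both powers in the same linear unit."""
    return 10.0 * math.log10(p_transmitted / p_received)


def ria_db_per_m(il_after_db, il_before_db, length_m):
    """RIA = [IL(T) - IL(0)] / Length, in dB/m."""
    return (il_after_db - il_before_db) / length_m


# Hypothetical readings for a 100 m sample (not measured data).
il_0 = insertion_loss_db(1.00, 0.80)   # before irradiation
il_t = insertion_loss_db(1.00, 0.50)   # after receiving dose T
ria = ria_db_per_m(il_t, il_0, 100.0)  # about 0.0204 dB/m
```

Because the RIA is a difference of two insertion losses, losses that are constant over the measurement (connectors, splices) cancel out, provided the transmitted power is stable.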

The Co60 irradiation was conducted during working hours over one week. The source-off hours show annealing behavior. The temperature dependence is illustrated in Figure [1.38](#page-46-0) (b) for four samples tested at 1.4 kGy(SiO<sub>2</sub>)/hr, at temperatures between 12 °C and 42 °C. The deviation in RIA at higher temperature is much less than at low temperature.

<span id="page-46-0"></span>![](_page_46_Figure_1.jpeg)

**Figure 1.38:** (a) A bare fiber on a reel is stored in an air-tight bag with a water cooling plate for the TID test. The fiber, terminated with SC ferrules, is connected to a laser light source and measured once per minute in a Co60 facility. (b) The fiber temperatures and RIAs at 1.4 kGy(SiO<sub>2</sub>)/hr are plotted for four samples over a day. The RIAs rise quickly in response to 5 to 9 hours of irradiation and anneal within a few hours after the Co60 source is removed. At colder temperatures the deviation during source-on periods (solid lines) is twice as large as at warmer temperatures.

<span id="page-46-1"></span>![](_page_46_Figure_4.jpeg)

**Figure 1.39:** RIA measurements compiled for tests at various temperatures and dose rates for (a) a non-radiation-hard Ge-doped fiber type, and (b) a radiation-resistant type at 32 °C. The dashed lines show the optical power during irradiation and the solid lines (points) show it after 10 hours of annealing.

The radiation responses of different Ge-doped fibers are very different. The RIAs of a non-radiation-hard type are compiled in Figure [1.39](#page-46-1) (a). This type of fiber shows little annealing after irradiation and little temperature dependence. Its RIA rises to 1 dB/m after 100 kGy(SiO<sub>2</sub>). Two of the four tested fiber types show radiation resistance. Figure [1.39](#page-46-1) (b) shows the type with the lowest RIA. The samples were irradiated at different dose rates at 32 °C. The RIA rises to 0.02 dB/m during the first 10 kGy(SiO<sub>2</sub>), and then rises slowly to 0.05 dB/m at 300 kGy(SiO<sub>2</sub>). The RIAs measured at lower temperatures are compatible with these values.

### **1.5.1.7 Related Designs and Prototype**

Prototype circuits have been developed to assess functionality and performance. Figure [1.40](#page-47-0) illustrates the block diagram of the BDTIC (bi-directional transceiver integrated circuit) chip, which is a prototype design of ChiTu. Figure [1.41](#page-47-1) presents the test platform, comprising a test board with a wire-bonded BDTIC die, a clock board, power supplies, an oscilloscope, a spectrum analyzer, a bit error rate tester (BERT), and a computer. Figure 1.42 shows the measured performance, including the jitter and phase noise of the PLL (panels (a) and (b)), the jitter and phase noise of the CDR recovered clock (panels (c) and (d)), and the eye diagrams of the serializer and deserializer (panels (e) and (f)). The results indicate that the building blocks function well with good performance and meet the characterization requirements of ChiTu.<span id="page-47-0"></span>

![](_page_47_Figure_2.jpeg)

**Figure 1.40:** Block diagram of the prototype BDTIC chip

<span id="page-47-1"></span>![](_page_47_Picture_4.jpeg)

**Figure 1.41:** Test platform of the BDTIC chip

![](_page_47_Figure_6.jpeg)

**Figure 1.42:** Performance measurements of the BDTIC chip

Figure [1.43](#page-48-0) shows another prototype design of the LC-PLL, which also employs a classical charge pump architecture.

TMR designs have been implemented for some DFF dividers and the PFD design to resist SEU effects, although the CML dividers have not been considered for this purpose yet. The measurement results presented in Figure [1.44](#page-48-1) demonstrate good jitter performance. Currently, the PLL has been subjected to X-ray irradiation, and no significant abnormalities were observed after approximately 46 Mrad (Si) TID exposure at an average dose rate of about 1.057 Mrad/h (Si).

<span id="page-48-0"></span>![](_page_48_Figure_2.jpeg)

<span id="page-48-1"></span>**Figure 1.44:** Performance measurements of the PLL

# **1.6 Alternative scheme based on Wireless communication (Jun Hu)**

# **1.6.1 Motivation**

High-energy physics experiments necessitate the transmission of a substantial volume of detector signals, a task traditionally accomplished using cables and optical fibers. However, the reliance on an excessive number of cables and fibers can lead to several critical issues. Firstly, it contributes to increased dead zones within the detectors, which in turn diminishes detection efficiency. Furthermore, the added material budget required for these extensive wiring systems exacerbates the problem, significantly escalating system errors and elevating the risk of misidentifying new particles. This misidentification can have profound implications for experimental outcomes and the validity of theoretical predictions.

In addition to these technical challenges, the costs associated with cables themselves can be quite high. The limited routing space in experimental setups complicates cabling efforts, resulting in increased installation time and labor. These factors highlight the need for alternative solutions that can address these inherent limitations.

Transitioning to wireless communication systems offers a viable alternative with numerous benefits:

- 1. Reduction in Cable and Fiber Usage: By minimizing the reliance on physical cabling, we can achieve a more flexible system installation process. This ultimately leads to cost savings and enables easier modifications or expansions of the experimental setup, accommodating evolving research needs without extensive rewiring.
- 2. Easier Control of System-Level Crosstalk: Wireless systems can provide enhanced management of interference and signal overlap, improving the overall quality of data transmission. This is particularly crucial in high-energy physics experiments where precise measurements are essential.
- 3. Improved Reliability and Radiation Resistance: Wireless communication often demonstrates greater resilience against environmental factors, including radiation. This makes it a safer choice for high-energy experiments that demand robust data integrity in harsh conditions.
- 4. Reduced Overall System Space Requirements and Material Mass: The elimination or reduction of cables and fibers not only streamlines the physical footprint of the experimental apparatus but also lessens the overall material mass. This is especially important in particle detector designs, where size and weight can be significant factors in system performance.
- 5. Broadcast Control and Clock Distribution: For distributed detector systems, the capability to implement broadcast control and clock distribution eliminates the need for additional cable connections. This allows for seamless synchronization across multiple detectors, simplifying system architecture while enhancing operational efficiency.
- 6. More Flexible Configuration of Data Acquisition and Triggering Systems: Wireless solutions allow for greater adaptability in configuring data acquisition and triggering systems. Researchers can reconfigure data channels and settings without the need for physical changes to the wiring. This flexibility facilitates rapid testing of different configurations and experimental setups, enabling researchers to optimize their systems efficiently.

Given these numerous advantages, wireless communication represents a compelling alternative for high-energy physics experiments. By addressing the limitations associated with traditional cabling systems, this approach not only enhances operational efficiency and data integrity but also fosters innovation in experimental design and execution. Embracing wireless technology could significantly advance our capabilities in high-energy physics research, paving the way for discoveries that expand our understanding of fundamental particles and forces.

### **1.6.2 Wireless application proposal for CEPC**

Taking the inner tracker as an example (Figure [1.45](#page-50-0)), the baseline design scheme aggregates data at an intermediate collection unit, after which the data is transmitted to the end cap via optical fibers. This design leads to a substantial amount of optical fiber distributed within the multilayer tracker. As a result, the layout and routing of the optical fibers must be considered carefully, since the complexity of this setup ultimately contributes to a higher material budget.

In contrast, a wireless solution offers the potential for a more streamlined design by using different data pathways. In this proposal, data is transmitted radially between layers instead of relying solely on fixed fiber connections. Once the data has been collected in the outermost layer, it is transmitted axially toward the end cap. This strategy reduces the volume of cabling needed within the internal structure and simplifies the overall architecture by reducing the risk of interference that can occur in a densely wired environment.

As shown in Figure [1.46](#page-50-1), the specific implementation of this wireless system uses a transmission node to relay data from the inner layer to the intermediate layer. In the intermediate layer, a repeater receives the inner-layer data and forwards it to the outer layer. The outermost layer is expected to host the largest number of nodes; current design estimates suggest that a single node can achieve a maximum bandwidth of 6.25 Gbps. To support this configuration, approximately 15,000 links will be required, with transmission distances ranging from 10 cm to 25 cm. Millimeter-wave transmission technology is proposed to enhance data transfer rates and overall system performance.

Another transmission method is axial transmission, for instance, transmitting data from the barrel section to the end cap, or from the end cap to the outside, for high-bandwidth data transmission. This method can be considered a direct replacement for optical fiber: it uses free-space optical transmission, with different collimation positions and DWDM technology to enhance the data rate of the optical path. Figure [1.47](#page-51-0) shows a proposal that involves designing a fixed, independent optical pathway. This design allows the data from a barrel-section ladder to be aggregated and transmitted to the end cap using wireless optical methods. Given the narrow beam width of optical propagation, this approach significantly minimizes interference with other optical pathways, which enables a high-density arrangement of the optical components without compromising performance. Additionally, the bandwidth of optical transmission is high, which supports high-bandwidth data transfer after aggregation.

<span id="page-50-0"></span>![](_page_50_Figure_1.jpeg)

**Figure 1.45:** Radial readout for inner tracker

<span id="page-50-1"></span>![](_page_50_Figure_3.jpeg)

**Figure 1.46:** Data transmission between layers with repeater

<span id="page-51-0"></span>![](_page_51_Figure_2.jpeg)

**Figure 1.47:** Optical Wireless Communication

# **1.6.3 Technology study**

# **1.6.3.1 Traditional Wi-Fi Solutions**

Traditional Wi-Fi solutions operate primarily in the 2.4 GHz and 5 GHz frequency ranges. Their key advantages include a mature technology and widespread adoption, with the most common standards for wireless local area networks defined by the IEEE in the 802.11 series—commonly referred to as Wi-Fi.

A wide variety of similar modules are available on the market, featuring universal interfaces such as SPI, I2C, USB, and PCIe. Most of these modules come with readily available software and drivers for easy installation. This eliminates the need for complex designs with Wi-Fi client chips.

For testing, we use a commercial board, the Raspberry Pi Zero 2 W, which is built around a mature system on a chip (SoC) and supports 2.4 GHz 802.11b/g/n Wi-Fi. The module measures 6.5 cm × 3 cm and consumes approximately 2 W. The module communicates with an FPGA board over two SPI buses, one for receiving and one for transmitting. When tested with a PC, it achieves up to 22.03 Mbps in both uplink and downlink over the wireless link. Further testing with multiple channels is required. When multiple endpoints connect to a single switch, they share the bandwidth. However, two independent endpoints spaced more than 2 meters apart generally do not interfere with each other.

From the test results, there are notable drawbacks associated with these solutions. The transmission density is limited due to the low frequency range and restricted number of channels available. Additionally, the inability to miniaturize antennas at high power levels further contributes to lower transmission density. Despite these limitations, traditional Wi-Fi is easier to develop and therefore well-suited for applications where bandwidth requirements are not particularly demanding, such as for slow control data and small system testing.

![](_page_52_Figure_1.jpeg)

**Figure 1.48:** Test setup based on Raspberry Pi boards

### **1.6.3.2 Millimeter Wave**

Frequency ranges are commonly categorized as follows: 0.3 to 30 GHz is known as the microwave band, 30 to 300 GHz as the millimeter wave band, and 0.1 to 10 THz as the terahertz band. Transmission technologies for future 6G are expected to use the millimeter wave and terahertz bands. These bands offer far more bandwidth than traditional Wi-Fi and are, in principle, a strong solution to the bandwidth challenge. However, higher bandwidth also leads to increased power consumption and cost.

Furthermore, since these technologies are still under development, relevant chips have not yet been widely introduced in the market. The technical challenges are considerable, and strict regulations from foreign entities create significant barriers and expenses in manufacturing. At this stage, this project will focus on collaborating with domestic research institutes or major commercial companies, to explore the possibility of developing systems using available or upcoming millimeter wave and terahertz chips.

![](_page_52_Figure_6.jpeg)

**Figure 1.49:** mm-wave test

The testing is conducted using the SK202 evaluation board, which is based on STMicroelectronics' commercial 60 GHz RF transceiver chip, the ST60A2. The chip supports data transmission rates of up to 6.25 Gbit/s. Its power consumption is measured at 44 mW during transmission (TX) and 27 mW during reception (RX).

#### 1. Transmission distance and bandwidth test:

The test procedure involved placing two test boards face-to-face and connecting them to a computer. The distance between the two boards was continuously adjusted to evaluate network connectivity, transmission bandwidth, and packet loss rate. The testing results are summarized in Table [1.14.](#page-53-0)

<span id="page-53-0"></span>At distances up to 6 cm, the link sustained about 915 Mbps with a packet loss rate of at most 0.13%, showing that the chip maintains a high data rate over short distances. Beyond 6 cm, no link could be established, which indicates the limited effective transmission range of this device.

| Distance (cm) | <b>Bandwidth</b> (Mbps) | Packet loss rate |
|---------------|-------------------------|------------------|
|               | 914                     | 0.031%           |
| 3             | 917                     | 0.061%           |
| 5             | 915                     | 0.05%            |
| 6             | 913                     | 0.13%            |
| > 6           | No link                 | No link          |

**Table 1.14:** Transmission distance and bandwidth

Based on the test results, the transmission distance and speed of commercial devices do not meet the required specifications, indicating a need for further research and development efforts. However, other testing activities can leverage this commercial module as a foundation for experimentation and analysis.
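To give a feel for how much harder the required 10 cm to 25 cm range is than the 6 cm at which the link was last observed, the sketch below evaluates the Friis free-space path loss at 60 GHz. It assumes far-field propagation between isotropic antennas, which is only a rough approximation at these distances; the antenna gains and the link margin of the evaluation board are not modelled.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s


def free_space_path_loss_db(distance_m, frequency_hz):
    """Friis free-space path loss between isotropic antennas, in dB."""
    return 20.0 * math.log10(
        4.0 * math.pi * distance_m * frequency_hz / SPEED_OF_LIGHT
    )


FREQ = 60e9  # Hz, carrier band of the tested transceiver
loss_6cm = free_space_path_loss_db(0.06, FREQ)    # longest distance with a link
loss_10cm = free_space_path_loss_db(0.10, FREQ)   # shortest required distance
loss_25cm = free_space_path_loss_db(0.25, FREQ)   # longest required distance
extra_loss_db = loss_25cm - loss_6cm              # about 12.4 dB
```

Under these assumptions, reaching 25 cm requires roughly 12 dB more link budget than the 6 cm case, which would have to come from antenna gain, transmit power, or receiver sensitivity.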

#### 2. Penetration test:

To evaluate the penetration capabilities of millimeter waves, various materials were selected for testing. The outcomes of these tests are summarized in Table [1.15](#page-53-1). The testing revealed that metallic materials, in particular, demonstrated a significant ability to block millimeter wave transmission. This finding aligns with the fundamental principle that metals reflect electromagnetic waves, thereby preventing them from penetrating the material. As a result, the effectiveness of millimeter wave communication is reduced when faced with obstacles made of metal.

<span id="page-53-1"></span>**Table 1.15:** Penetration test at 3 cm distance

| Material | Thickness | <b>Penetration Ability</b> |
|----------|-----------|----------------------------|
| Paper    | 2 mm      | Yes                        |
| Plastic  | 2 mm      | Yes                        |
| FR4 PCB  | 1.6 mm    | No                         |
| Flex     | 0.2 mm    | No                         |

In contrast, non-metallic materials, such as plastics, glass, and wood, showed varying degrees of permeability to millimeter waves, allowing for some degree of transmission. These materials can be more conducive to the propagation of millimeter waves, thereby making them more suitable for scenarios where unobstructed wave transmission is essential.

#### 3. Crosstalk test:

To evaluate the impact of millimeter waves on the detector, the Vertex pixel prototype chip TAICHU3 was employed to investigate the crosstalk between the pixel chip and millimeter waves. Various thresholds and noise levels were assessed from multiple directions and distances to ascertain their effects on the chip's performance. The test setup is shown in Figure [1.50](#page-54-0).

The findings in Figure [1.51](#page-54-1) suggest that identifying the specific effects of 60 GHz millimeter waves on the pixel chip is notably difficult. Such investigations will help to clarify the interaction mechanisms where millimeter wave signals are present.

### **1.6.3.3 Optical Wireless Communication**

Optical wireless communication, often referred to as free-space optical communication (FSO), is a method of transmitting data using light waves in the visible, ultraviolet, and infrared spectrum. This technology leverages the properties of light to send information over distances without the need for physical cables. Unlike traditional wired communication, which relies on physical media such as optical fiber or copper, wireless optical communication uses air or vacuum as the transmission medium. FSO requires a clear line of sight between the transmitter and receiver, as obstacles can interfere with the optical signal, so the optical path within the detector must be planned very carefully. Optical frequencies offer a much larger bandwidth than conventional radio frequencies, enabling high data rates.

<span id="page-54-0"></span>![](_page_54_Picture_1.jpeg)

**Figure 1.50:** Crosstalk test between mm-wave and TAICHU3

<span id="page-54-1"></span>![](_page_54_Figure_3.jpeg)

![](_page_54_Figure_4.jpeg)

<span id="page-55-0"></span>![](_page_55_Figure_2.jpeg)

**Figure 1.52:** Optical wireless test setup

A demonstration setup has been established to validate wireless optical transmission. The structure and physical components of the setup are illustrated in Figure [1.52](#page-55-0). The optical transceivers are standard SFP (Small Form-factor Pluggable) optical modules. Data is generated using Xilinx GTX transceivers, which operate at a single-channel line rate of 10 Gbps. In our configuration, 12 Dense Wavelength Division Multiplexing (DWDM) channels are combined on a single optical link, giving an aggregate data rate exceeding 100 Gbps. The setup demonstrates the feasibility of high-capacity data transmission using wireless optical methods and points to future applications that require high-speed data transmission.
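The aggregate rate follows from the channel count and the per-channel line rate, as the sketch below shows. The channel plan is purely illustrative: it assumes a 100 GHz grid anchored at 193.1 THz (the ITU-T G.694.1 convention), whereas the wavelengths actually used in the setup are not given. The `coding_efficiency` parameter is likewise a hypothetical placeholder for line-coding overhead.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s


def dwdm_channel_plan(n_channels, anchor_thz=193.1, spacing_ghz=100.0):
    """Return (frequency in THz, wavelength in nm) for each assumed grid channel."""
    plan = []
    for n in range(n_channels):
        f_thz = anchor_thz + n * spacing_ghz / 1000.0
        wavelength_nm = SPEED_OF_LIGHT / (f_thz * 1e12) * 1e9
        plan.append((f_thz, wavelength_nm))
    return plan


def aggregate_rate_gbps(n_channels, line_rate_gbps, coding_efficiency=1.0):
    """Total rate over all channels; coding_efficiency scales line rate to payload."""
    return n_channels * line_rate_gbps * coding_efficiency


plan = dwdm_channel_plan(12)               # 12 assumed channels
raw_rate = aggregate_rate_gbps(12, 10.0)   # 12 x 10 Gbps line rate
```

With 12 channels at a 10 Gbps line rate the raw aggregate is 120 Gbps, which is consistent with the statement that the link exceeds 100 Gbps.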

# **1.7 Clocking systems (Jun Hu)**

# **1.7.1 Basic structure**

The clock system needs to fulfill two primary functions:

- 1. Provide a Reference Clock for Detector Electronics: This clock serves as the baseline frequency for sampling, time measurement, and energy measurement across all the detectors. In the CEPC design, the clock frequency is 43.33 MHz (130/3 MHz). The stability of the timing distribution network has to reach the picosecond level.
- 2. Offer Absolute Timestamping for Detectors: It is crucial for the trigger system to receive temporally correlated physical data. The baseline uses optical links to deliver the Timing, Trigger and Control (TTC) information, as shown in the figure. The back-end part of the timing distribution network comprises FPGAs with multiple transceivers.
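The link rates quoted elsewhere in this chapter appear to be integer multiples of the 130/3 MHz reference clock. The sketch below makes that relation explicit. The multipliers are inferred from the quoted numbers and are an assumption, not a statement from the design.

```python
from fractions import Fraction

REF_CLOCK_MHZ = Fraction(130, 3)  # 43.33 MHz reference clock

# Multipliers inferred from the rates quoted in this chapter (assumption).
MULTIPLIERS = {
    "deserializer output lane (Mbps)": 4,
    "serializer input lane (Mbps)": 16,
    "deserializer input (Mbps)": 64,
    "serializer input clock (MHz)": 128,
    "serializer output (Mbps)": 256,
}

rates = {name: float(REF_CLOCK_MHZ * m) for name, m in MULTIPLIERS.items()}
clock_period_ns = float(1000 / REF_CLOCK_MHZ)  # about 23.08 ns

for name, value in rates.items():
    print(f"{name:34s} {value:10.2f}")
```

Deriving all rates from one reference in this way is what would allow the front-end chips to recover a clock that is phase-related to the system clock.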

The structure of TTC distribution is shown in Figure [1.53](#page-56-0). For each subsystem, there is a TTC distribution crate that serves as a central hub for communication with the trigger system. Inside this crate, a top-level TTC distributor (TTCD) board receives TTC information from the trigger & clock system. It then distributes this information to all Level 2 TTCDs within the crate via the backplane. However, in subsystems that utilize fewer than 10 Backend Electronics (BEE) crates, there are no Level 2 TTCDs.

<span id="page-56-0"></span>![](_page_56_Figure_1.jpeg)

**Figure 1.53:** TTC distribution

The Level 2 TTCDs are responsible for transferring TTC information to the BEE crates using optical fiber connections from the front panel. Each BEE crate is equipped with its own TTC distributor, which receives TTC information from the Level 2 TTCDs and distributes it to all BEE boards within the crate through the backplane. This design ensures low-latency communication, which is essential for maintaining the precision of timestamping across the system.

On the front-end side, the KinWoo and ChiTu custom chips will recover the synchronized clock from the data received from the BEE boards. This process ensures that the timing information is accurately extracted and aligned for further processing, enabling precise synchronization across the entire system.

### **1.7.2 Clock Synchronization methods**

The foundational principle of our clock synchronization system is grounded in the Precision Time Protocol (PTP), specifically the IEEE 1588 standard. This protocol allows for precise synchronization of clocks across networked systems, ensuring that all devices operate in concert with a high degree of accuracy. Figure [1.54](#page-56-1) shows a concept of a timing compensation scheme.

<span id="page-56-1"></span>![](_page_56_Figure_8.jpeg)

**Figure 1.54:** Concept of a timing compensation scheme

Central to our approach is the measurement of phase differences, which is carried out using the Digital Dual Mixer Time Difference (DDMTD) technique. This method uses digital signal processing to measure the time offset between the reference clock and the incoming timing signals. Analyzing these phase relationships yields the precise measurements needed for effective synchronization.
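As a rough illustration of why a Digital Dual Mixer Time Difference measurement can reach picosecond resolution, the sketch below computes the beat frequency and time resolution for a helper clock slightly offset from the 43.33 MHz reference. The helper-clock ratio N/(N+1) and the value N = 16384 are illustrative assumptions, not CEPC design parameters, and `ddmtd_parameters` is a hypothetical helper name.

```python
from fractions import Fraction

def ddmtd_parameters(f_clk_hz: Fraction, n: int):
    """Return (helper frequency, beat frequency, time resolution in s).

    Assumes the helper clock runs at f_clk * n / (n + 1); n is a
    hypothetical design choice, not a value from the CEPC design.
    """
    f_helper = f_clk_hz * n / (n + 1)   # helper clock, slightly slower
    f_beat = f_clk_hz - f_helper        # beat note after digital mixing
    # A time shift dt between the two inputs appears stretched by (n + 1)
    # in the beat signals. Counting helper-clock periods therefore
    # resolves dt in steps of 1 / (n * f_clk).
    resolution_s = Fraction(1) / (n * f_clk_hz)
    return f_helper, f_beat, resolution_s

F_CLK = Fraction(130_000_000, 3)        # 43.33 MHz reference clock
f_helper, f_beat, res = ddmtd_parameters(F_CLK, n=16384)
print(f"beat frequency: {float(f_beat):.1f} Hz")
print(f"resolution    : {float(res) * 1e12:.2f} ps")
```

With these assumed numbers the resolution comes out at the picosecond scale, which is the order of magnitude required of the timing distribution network.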

Timing compensation plays a critical role in this process. Following the principles of the IEEE 1588 PTP protocol, we calculate a timing compensation value that reflects the discrepancies identified during phase measurement. This value determines how far the phase of the local clock must be adjusted to align it accurately with the reference clock.
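The arithmetic behind such a compensation value can be sketched with the standard IEEE 1588 delay request-response exchange. This is a minimal sketch that assumes a symmetric link. The timestamps are invented for illustration, and the function name is hypothetical.

```python
def ptp_offset_and_delay(t1, t2, t3, t4):
    """IEEE 1588 delay request-response arithmetic (symmetric link assumed).

    t1: master sends Sync          (master time)
    t2: slave receives Sync        (slave time)
    t3: slave sends Delay_Req      (slave time)
    t4: master receives Delay_Req  (master time)
    """
    master_to_slave = t2 - t1
    slave_to_master = t4 - t3
    mean_path_delay = (master_to_slave + slave_to_master) / 2
    offset_from_master = (master_to_slave - slave_to_master) / 2
    return offset_from_master, mean_path_delay

# Illustrative timestamps in picoseconds:
# 250 ns one-way link delay, slave clock 3.2 ns ahead of the master.
offset_ps, delay_ps = ptp_offset_and_delay(0, 253_200, 1_000_000, 1_246_800)
compensation_ps = -offset_ps   # shift to apply to the local clock phase
print(offset_ps, delay_ps, compensation_ps)
```

Any asymmetry between the two fiber directions enters the offset directly, which is why the fine phase measurement is needed in addition to the timestamp exchange.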

The phase adjustment process involves employing a high-performance jitter cleaner, which is pivotal in mitigating short-term jitter. By effectively reducing jitter, the system ensures that the timing signals delivered to the front-end are stable and reliable. This not only improves the overall synchronization accuracy but also enhances the performance of time-sensitive applications.

# **1.8 Low voltage power system (Jun Hu)**

# **1.8.1 Power supply distribution overview**

The power supply is a crucial element of any electronics system, as it directly affects overall performance and functionality. Figure [1.55](#page-57-0) shows the power distribution scheme of the CEPC low-voltage supply system. The electronics room, which does not require radiation or magnetic-field protection, houses both the COTS (Commercial Off-The-Shelf) AC-DC power supply and a custom-designed DC-DC converter. The AC-DC power supply converts an AC input of 220 V to 480 V into 110 V DC, which improves the efficiency of the subsequent transmission. The DC-DC converter then steps the voltage down from 110 V to 48 V, supplying the power for the downstream components of the system.

The front-end detectors require high-quality low-voltage supplies: specifically, 1.2 V for the analog readout chips and 2.5 V for the digital transmission chips. These voltages power the front-end boards, which are part of the detector readout electronics. These custom modules are usually installed in environments exposed to radiation and magnetic fields, so they must receive the correct power levels and must be designed and shielded to endure the harsh conditions caused by radiation exposure. Proper power distribution and voltage regulation in these environments are critical for the reliable functioning of the detector systems and, ultimately, for the accuracy of the data they collect.

<span id="page-57-0"></span>![](_page_57_Figure_7.jpeg)

**Figure 1.55:** CEPC low-voltage power supply distribution

We will implement a two-stage DC-DC conversion architecture to manage the voltage levels efficiently. In the first stage, the Basha48 module receives a 48 V input from the long cable that extends from the electronics room and steps it down to 12 V. The Basha48 module will be installed near the end cap, so that the long cable run operates at 48 V and the lower-voltage 12 V run stays short, minimizing voltage drop and ensuring efficient power delivery. In the second stage, the 12 V output of the Basha48 module is delivered to the frontend board via finer, more manageable cables. On the frontend board, the Basha12 module converts the 12 V supply into the lower voltages required by the frontend detectors, namely 1.2 V for the analog readout chips and 2.5 V for the digital transmission chips.

This two-stage approach not only optimizes the efficiency of power distribution but also allows for a compact design that minimizes radiation exposure to sensitive electronic components in the frontend. It ensures that the necessary voltages are precisely regulated, maintaining the performance of the detector readout electronics under the challenging conditions of radiation-prone areas.
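The benefit of keeping the long cable at 48 V can be illustrated with a simple resistive-loss estimate. The load power and cable resistance below are hypothetical values chosen only for illustration, and the model ignores the effect of the cable voltage drop on the converter input.

```python
def cable_loss_w(load_power_w: float, bus_voltage_v: float,
                 cable_resistance_ohm: float) -> float:
    """Resistive loss I^2 * R for a given delivered power and bus voltage."""
    current_a = load_power_w / bus_voltage_v
    return current_a ** 2 * cable_resistance_ohm

LOAD_POWER_W = 120.0       # assumed power drawn at the far end of the cable
CABLE_RESISTANCE = 0.5     # assumed round-trip cable resistance in ohm

for bus_v in (48.0, 12.0):
    loss = cable_loss_w(LOAD_POWER_W, bus_v, CABLE_RESISTANCE)
    print(f"{bus_v:4.0f} V bus: {LOAD_POWER_W / bus_v:5.2f} A, "
          f"cable loss {loss:6.3f} W")
```

For the same delivered power and cable, the loss scales with the inverse square of the bus voltage, so a 48 V run dissipates 16 times less than a 12 V run.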

# **1.8.2 COTS power supply**

![](_page_58_Picture_2.jpeg)

**Figure 1.56:** Test COTS power supply module

Since the first two stages of the power supply are installed in an environment free from radiation and strong magnetic fields, commercial power supplies can be procured. Their main parameters should meet the specifications in Tables [1.16](#page-59-0) and [1.17.](#page-59-1)

In order to enable power control, the power supply must provide the following interfaces to the DCS system:

- 1. On/Off Control Interface: A simple input to allow the DCS to turn the power supply on or off.
- 2. Voltage Adjustment Interface: An adjustable output voltage interface that lets the DCS set the required voltage levels.
- 3. Current Monitoring Interface: An output that provides real-time current readings to the DCS for monitoring purposes.
- 4. Temperature Monitoring Interface: An interface to relay temperature data to the DCS, allowing for effective thermal management.
- 5. Fan Speed Control Interface: The ability to control and adjust fan speeds based on system requirements to ensure proper cooling.

These interfaces will facilitate seamless integration with the DCS, ensuring effective monitoring and control of the power supply operation.

# **1.8.3 Serial powering**

In serial powering, the loads are connected in series and powered by a constant current source. A shunt regulator generates the supply voltage from the constant current and is followed by a linear regulator that provides a low-noise supply voltage, as shown in Fig. [1.57.](#page-60-0) Any excess current is sunk by the shunt transistor. The combination of shunt regulator and linear regulator is called a shuntLDO. Since all modules are powered by the same current source, the power dissipated in the cables does not increase with the number of modules, and the power efficiency is improved. Essentially only one cable is needed in serial powering. Furthermore, the shuntLDO can be fully integrated in the sensor or front-end readout chip. Consequently, the material budget is small, which is beneficial for detector integration. The shuntLDO is a linear device: unlike a switching supply, it does not suffer from EMI or ripple voltage. Its power noise is low, even lower than with independent powering.
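The statement that cable losses do not grow with the number of modules can be checked with a short estimate. The sketch compares a serial chain with a hypothetical alternative in which all modules share one cable at module voltage without any DC-DC conversion. The module current and cable resistance are assumed values, and the shunt headroom current is ignored.

```python
def serial_cable_loss_w(i_module_a: float, r_cable_ohm: float) -> float:
    """Serial chain: the same current flows through every module,
    so the cable loss is independent of the number of modules."""
    return i_module_a ** 2 * r_cable_ohm

def parallel_cable_loss_w(i_module_a: float, r_cable_ohm: float,
                          n_modules: int) -> float:
    """Direct parallel supply at module voltage: the module currents
    add up in the shared cable."""
    return (n_modules * i_module_a) ** 2 * r_cable_ohm

I_MODULE_A = 2.0     # assumed current per module
R_CABLE_OHM = 0.1    # assumed cable resistance

for n in (1, 4, 8):
    print(n,
          round(serial_cable_loss_w(I_MODULE_A, R_CABLE_OHM), 3),
          round(parallel_cable_loss_w(I_MODULE_A, R_CABLE_OHM, n), 3))
```

In the direct parallel case the loss grows with the square of the module count, which is why parallel schemes deliver power at a higher voltage and convert it near the load.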

<span id="page-59-0"></span>![](_page_59_Picture_173.jpeg)

![](_page_59_Picture_174.jpeg)

<span id="page-59-1"></span>![](_page_59_Picture_175.jpeg)

**Table 1.17:** DC-DC Power Supply Parameters

Serial powering is a promising scheme, especially for large load currents. The pixel detectors of the ATLAS and CMS experiments, where extremely high current density is required, have chosen serial powering.

However, the power chain is interrupted if one module fails, so reliability is an important issue. One mitigation is to power two or more modules in parallel, so that the current of a failed module can be taken over by the others and the chain continues to work. Moreover, the modules sit at different ground potentials, so control and output signals must be transferred through AC coupling or optical coupling. For the same reason, stitching cannot be supported. Serial powering is a novel method that requires many changes to the traditional power system.

### **1.8.4 Parallel powering**

Parallel powering is the traditional method used in electrical systems, in which the loads are connected in parallel to one or more power sources. It is commonly used in applications such as power distribution networks, data centers, and renewable energy systems, where reliability, redundancy, and efficiency are critical. To decrease the power dissipated in the cables, the supply is delivered at a high voltage level, as shown in Fig. [1.58.](#page-60-1) This is similar to mains electricity delivery: the power is transmitted at high voltage and low current, so the power dissipated in the cables is greatly reduced. The number of cables can also be reduced because the power supply is shared. Point-of-load (PoL) converters located in the detector modules step the high voltage down to the voltage required by the load. Since the converter is close to the load, low noise can be achieved and any deviation induced by the load can be quickly corrected. PoL conversion is a popular method in power systems.

Parallel powering is attractive owing to its compatibility with the traditional power system, and it requires less extra effort in the sensor module design. Because the loads are powered in parallel, a faulty load cannot bring down the whole power supply system. Compared with serial powering, parallel powering requires one power cable and one ground cable, so the material budget is slightly larger. The critical issue is the PoL converter. Radiation is strong when the beams collide, and the DC-DC converter is near the collision point, so it must survive in such a strong radiation environment. The other issue is the power efficiency, an important performance parameter of DC-DC converters. Since the number of power modules is very large, the power efficiency must be sufficiently high.

<span id="page-60-0"></span>![](_page_60_Figure_1.jpeg)

**Figure 1.57:** Structure of serial powering

<span id="page-60-1"></span>![](_page_60_Figure_3.jpeg)

**Figure 1.58:** Structure of parallel powering

### **1.8.5 BUCK DC-DC converter**

Power management modules provide energy for electronic devices and are a cornerstone of their proper operation. The most commonly used power management modules are DC-DC converters, which convert one DC voltage to another. They are primarily categorized into buck, boost, and buck-boost converters. Buck converters are among the most prevalent topologies in DC-DC power circuits, characterized by high efficiency and high power density. They are critical components in the power systems of the various CEPC detectors, where they step the high voltage used for power transmission down to the lower voltages required by the frontend electronics. Transmitting power at high voltage and low current reduces the number of cables, minimizes voltage loss, and enhances the power efficiency of the detector system. However, key modules of a buck converter, such as the reference, pre-buck, and on-chip LDO (Low Dropout Regulator), are susceptible to radiation effects, which may lead to level drift and circuit failures. It is therefore crucial to implement specific radiation-hardening measures for the different modules.

![](_page_61_Figure_4.jpeg)

**Figure 1.59:** Schematic of the DC-DC Controller

Since the buck-type DC-DC converter is located within the detector and is subjected to high radiation and magnetic field intensity, magnetic-core inductors cannot be utilized. Additionally, due to area and space constraints, the energy storage inductance values are relatively low. However, the frontend electronics impose stringent requirements on power noise performance, necessitating a sufficiently low ripple voltage. Traditional methods, which involve increasing inductance values to reduce ripple voltage, are no longer feasible. Instead, the ripple voltage can only be minimized by increasing the switching frequency. The proposed DC-DC controller aims to operate at switching frequencies exceeding the MHz range. Furthermore, the radiation environment created during detector operation can induce soft errors or even hard failures, such as burnout, in silicon-based circuits. Hence, the designed DC-DC controller must withstand both ionizing and non-ionizing radiation effects.
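The trade-off between inductance and switching frequency can be made concrete with the textbook ripple expressions for an ideal buck converter in continuous conduction. This is only a sketch: the inductance, capacitance, and frequencies are assumed values, and capacitor ESR and other parasitics are ignored.

```python
def inductor_ripple_a(v_in: float, v_out: float,
                      l_h: float, f_sw_hz: float) -> float:
    """Peak-to-peak inductor current ripple of an ideal buck converter."""
    duty = v_out / v_in
    return v_out * (1.0 - duty) / (l_h * f_sw_hz)

def output_ripple_v(v_in: float, v_out: float, l_h: float,
                    c_f: float, f_sw_hz: float) -> float:
    """Peak-to-peak output voltage ripple (ideal capacitor, no ESR)."""
    return inductor_ripple_a(v_in, v_out, l_h, f_sw_hz) / (8.0 * c_f * f_sw_hz)

L_AIR_CORE_H = 200e-9   # assumed air-core inductance
C_OUT_F = 20e-6         # assumed output capacitance

for f_sw in (1e6, 2e6, 4e6):
    ripple_mv = output_ripple_v(12.0, 1.2, L_AIR_CORE_H, C_OUT_F, f_sw) * 1e3
    print(f"{f_sw / 1e6:.0f} MHz: {ripple_mv:.2f} mV peak-to-peak")
```

In this idealized model the output ripple falls with the square of the switching frequency, so doubling the frequency compensates for a fourfold smaller inductance.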

Building on research and a review of current literature regarding radiation-resistant buck-type DC-DC voltage converters, a study will be conducted on the structure of the controller system. The focus will be on parallel output current capabilities, high switching frequency, and efficient circuit structures, along with specific circuit implementation methods. System-level modeling and simulation will be employed to optimize the controller's system architecture. On this foundation, research will delve into key circuit modules, analyzing the switching losses of each module to ensure that the power system maintains an efficiency above 85%. Additionally, protection circuits for overheating, overcurrent, overvoltage, and undervoltage, as well as auxiliary circuits for power status indication, will be developed to ensure that the voltage converter can promptly cut off power to protect itself and its load modules during abnormal operating conditions.

The design of the circuit and layout will simultaneously consider radiation-hardening methods. Specific studies will be undertaken on radiation-sensitive components, such as bandgap references, error amplifiers, and switch timing controllers, focusing on TID (Total Ionizing Dose), SET (Single Event Transient), and SEL (Single Event Latchup) hardening strategies. Verification of the designed circuits will incorporate approaches such as fault injection.

# **1.8.5.1 Design specification**

The power requirements of the detectors have been surveyed. Most of the electronics are powered at 1.2 V, owing to the 65 nm CMOS process, while a few blocks may be powered at 1.8 V or 3.3 V. Because the switch on-time would be too short at a high conversion ratio, the proposed DC-DC conversion is divided into two stages. As shown in Fig. [1.60,](#page-62-0) the 48 V supply is first stepped down to 12 V, and the 12 V is then stepped down to 1.2 V. Two-stage conversion may consume more power, so a single-stage topology with more switches will be investigated in the future.
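The on-time argument can be illustrated for an ideal buck converter, where the duty cycle equals the conversion ratio. The 2 MHz switching frequency and the per-stage efficiency used below are assumptions for illustration only.

```python
def on_time_ns(v_in: float, v_out: float, f_sw_hz: float) -> float:
    """Switch on-time of an ideal buck converter: t_on = (Vout / Vin) / f_sw."""
    return (v_out / v_in) / f_sw_hz * 1e9

F_SW_HZ = 2e6  # assumed switching frequency

print("48 V -> 1.2 V:", round(on_time_ns(48.0, 1.2, F_SW_HZ), 2), "ns")
print("48 V -> 12 V :", round(on_time_ns(48.0, 12.0, F_SW_HZ), 2), "ns")
print("12 V -> 1.2 V:", round(on_time_ns(12.0, 1.2, F_SW_HZ), 2), "ns")

# Cascading two stages multiplies their efficiencies, which is the
# price paid for the longer on-times.
ETA_STAGE = 0.85  # assumed efficiency of each stage
print("two-stage efficiency:", round(ETA_STAGE * ETA_STAGE, 4))
```

Under these assumptions a direct 48 V to 1.2 V conversion leaves only about 12.5 ns of on-time per cycle, whereas each of the two cascaded stages has at least 50 ns.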

<span id="page-62-0"></span>![](_page_62_Figure_4.jpeg)

**Figure 1.60:** Power conversion stage

<span id="page-62-1"></span>The final specification is listed in Table [1.18.](#page-62-1)

| <b>Parameter</b>            | <b>Specification</b>                                 |
|-----------------------------|------------------------------------------------------|
| Input Voltage Range         | 36 V to 48 V                                         |
| Output Voltage              | 12 V, 1.2 V (option: 1.8 V, 3.3 V) DC                |
| Maximum Output Current      | 1 A, 10 A                                            |
| Efficiency                  | 80% to 85%                                           |
| Output Ripple Voltage       | < 5 mV                                               |
| Protection Features         | Over voltage, over current, short circuit protection |
| Operating Temperature Range | $-10\,^{\circ}$C to $+65\,^{\circ}$C                 |
| Dimensions                  | 50 mm × 20 mm × 6.7 mm (including shielding)         |

**Table 1.18:** DC-DC converter specification

The DC-DC converters are used as PoL converters, and their design specifications are listed in Table [1.18.](#page-62-1) The converters are intended as a common power supply chip for almost all the detectors. The output voltage is 1.2 V, with 2.5 V or 3.3 V as alternatives, according to the design requirements of the chips on the detectors. To reduce the size of the power module while supplying more current, an output current of up to 10 A is required at the 1.2 V output.

DC-DC converters operate by switching, which transfers energy from the input to the output, and ripple therefore appears at the output. The power noise must be sufficiently low that the performance of the front-end readout chips and sensors is not degraded. The output peak-to-peak ripple voltage is required to be lower than 10 mV.

Efficiency is an important performance parameter of the power system: higher efficiency means that more current can be supplied to the load. The efficiency is 85% in the normal working mode. The power module must be smaller than $50\,\text{mm} \times 20\,\text{mm} \times 6.7\,\text{mm}$ because of the limited space in the detectors. In addition, the DC-DC converter, which is close to the collision point, works in a radiation environment. A total dose of 5 Mrad (Si) is estimated, which the power chips must tolerate. The detectors are surrounded by the magnet, so the DC-DC converters work in a strong magnetic field of 3 T.
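To see what the efficiency figure means for the thermal load inside the module, the dissipated power can be estimated from the output power and the efficiency. This is a simple sketch using the 1.2 V, 10 A operating point and the 80% to 85% efficiency range quoted in this section. The function name is hypothetical.

```python
def converter_loss_w(v_out: float, i_out: float, efficiency: float) -> float:
    """Power dissipated in the converter: P_in - P_out with P_in = P_out / eta."""
    p_out = v_out * i_out
    return p_out / efficiency - p_out

for eta in (0.80, 0.85):
    loss = converter_loss_w(1.2, 10.0, eta)
    print(f"efficiency {eta:.0%}: {loss:.2f} W dissipated for 12 W delivered")
```

A few watts dissipated in a module of this size is a significant heat load, which is consistent with the attention given to the thermal simulation.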

The floor plan of the Basha module containing the customized DC-DC converter is shown in Fig. [1.61.](#page-63-0) Most of the area is occupied by the inductor. Moreover, the size of the capacitors will increase to withstand the high voltage. Since a high current is required, the heat map has been simulated, and the result is shown in Fig. 1.62. The highest temperature, above 100 °C, appears near the power transistors. It should be noted that cooling is not considered in this simulation. To make matters worse, a smaller area will degrade the cooling. The module design should therefore be considered carefully.

<span id="page-63-0"></span>![](_page_63_Figure_2.jpeg)

**Figure 1.61:** Floor plan of Basha DC-DC module

![](_page_63_Figure_4.jpeg)

**Figure 1.62:** Heat Simulation Result of the DC-DC Module

# **1.8.5.2 Structure of DC-DC converter**

The DC-DC converter is a buck converter. As shown in Fig. [1.63,](#page-64-0) the output voltage is sampled and compared with a reference voltage, and the switch is turned on and off to regulate the output voltage. The inductor and output capacitor store energy and supply current to the load while the switch is off. The control scheme can be pulse width modulation (PWM), pulse frequency modulation (PFM), hysteretic control, and so on. Although PFM achieves high efficiency at low load current, the frequency of its ripple voltage varies, which makes the output noise difficult to filter. In PWM mode the switching frequency is fixed, which simplifies the filter design. Therefore, PWM will be used.

The proposed structure of the DC-DC converter, which works in voltage-mode PWM, is shown in Fig. [1.64.](#page-64-1) The output voltage and current are sampled and compared with the reference voltage. PWM pulses are generated to adjust the on-time of the switches, which forms a negative feedback loop. The PWM controller contains an error amplifier, comparator, driver, dead-time controller, ramp generator, oscillator, and soft-start circuit. One challenge of this mode is the frequency compensation of the loop: parasitic resistance, capacitance, and inductance can influence the stability. Moreover, the transient performance is limited. To overcome these problems, current-mode PWM is considered as the backup structure.

<span id="page-64-0"></span>![](_page_64_Figure_1.jpeg)

**Figure 1.63:** Buck DC-DC converter

<span id="page-64-1"></span>![](_page_64_Figure_3.jpeg)

**Figure 1.64:** Structure of DC-DC converter with voltage mode of PWM
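The negative feedback loop of a voltage-mode PWM controller can be sketched with a heavily simplified averaged model. The sketch assumes an ideal buck stage (output equals duty cycle times input), an integrating error amplifier with an arbitrary gain, and a 1 V sawtooth ramp. It deliberately ignores the LC filter dynamics, so it says nothing about the loop compensation discussed above.

```python
def pwm_duty(v_control: float, v_ramp_peak: float = 1.0) -> float:
    """Comparing the control voltage with a sawtooth ramp yields a
    duty cycle proportional to the control voltage, clamped to [0, 1]."""
    return min(max(v_control / v_ramp_peak, 0.0), 1.0)

def regulate(v_in: float, v_target: float,
             steps: int = 200, ki: float = 0.02):
    """Iterate the averaged loop: error amplifier -> PWM -> ideal buck stage."""
    v_control = 0.0
    v_out = 0.0
    for _ in range(steps):
        error = v_target - v_out             # reference minus sampled output
        v_control += ki * error              # integrating error amplifier (assumed)
        v_out = pwm_duty(v_control) * v_in   # averaged ideal buck: Vout = D * Vin
    return v_out, pwm_duty(v_control)

v_out, duty = regulate(v_in=12.0, v_target=1.2)
print(f"Vout = {v_out:.4f} V at duty cycle {duty:.4f}")
```

The loop settles where the duty cycle equals the conversion ratio. A real design would add the filter dynamics, dead time, and soft start listed for the controller.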

Since the magnetic field is very strong, an air-core inductor is proposed. With the limited size, the inductance cannot be very high; however, the noise increases at low inductance, so the switching frequency should be increased. GaN power transistors are planned, owing to the design requirements of high power efficiency, high switching frequency, and high output current. The power transistors, the inductor, and other issues are introduced in the following subsections.

# **1.8.5.3 GaN power transistor**

GaN power transistors, which feature a wide band gap, have developed rapidly in recent years. Compared with power MOSFETs, GaN transistors offer low on-resistance, high switching frequency, and high current density. Their switching loss is low, which improves the power efficiency. GaN transistors are therefore very promising for switching power supplies, and the designed DC-DC converters will adopt them.

<span id="page-65-0"></span>![](_page_65_Figure_5.jpeg)

**Figure 1.65:** Measured threshold voltage of INN40FQ015A irradiated by proton source with energy of 80 MeV.

To evaluate the radiation tolerance of GaN transistors, a commercial GaN transistor from INNOSCIENCE, the INN40FQ015A, has been irradiated with the proton beam at the China Spallation Neutron Source. The gate-source voltage, drain-source voltage, and drain current are measured online. The threshold voltage of the measured transistor decreases to 0.45 V when the 80 MeV proton beam is turned on, as shown in Fig. [1.65.](#page-65-0) It then remains stable until the flux increases to $1.7 \times 10^{13}\ \mathrm{p/cm^2/s}$, which indicates that the GaN transistor can work in the radiation environment. Note that the threshold voltage increases at a flux of about $1 \times 10^{13}\ \mathrm{p/cm^2/s}$. This was caused by a mistake in the test bench that let a high current flow through the transistor for a long time, raising the temperature and with it the threshold voltage. After the mistake was corrected, the threshold voltage returned to its previous value.

### **1.8.5.4 Air core inductor**

Since the detectors are surrounded by the solenoid magnet, an inductor with a magnetic core cannot work in such a strong magnetic field. An air-core inductor appears to be the only choice, but it is difficult to achieve high inductance in a compact size, and the space left for the power module is strongly limited by each detector. Commercial air-core solenoids are available and relatively small. However, their magnetic field is open, which leads to unwanted EMI (electromagnetic interference). The air-core toroid has proved to be a good candidate, so a customized toroid will be investigated. The shielding method should also be studied in view of size, material budget, and magnetic field.
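The difficulty of reaching a high inductance in a small volume can be illustrated with the ideal formula for an air-core toroid of rectangular cross-section, $L = \mu_0 N^2 h \ln(r_\text{out}/r_\text{in}) / (2\pi)$. The turn count and dimensions below are hypothetical and only chosen to fit roughly within the module height. Leakage and winding effects are ignored.

```python
import math

MU_0 = 4.0e-7 * math.pi  # vacuum permeability in H/m

def toroid_inductance_h(turns: int, height_m: float,
                        r_inner_m: float, r_outer_m: float) -> float:
    """Ideal inductance of an air-core toroid with rectangular cross-section."""
    return (MU_0 * turns ** 2 * height_m
            * math.log(r_outer_m / r_inner_m) / (2.0 * math.pi))

# Assumed geometry: 20 turns, 4 mm high, 3 mm inner and 8 mm outer radius.
inductance = toroid_inductance_h(20, 4e-3, 3e-3, 8e-3)
print(f"L = {inductance * 1e9:.1f} nH")
```

With these assumed dimensions the inductance is only a few hundred nanohenries, which shows why a high switching frequency is needed to keep the ripple low.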

# **1.9 Backend Electronics (Jun Hu)**

The backend electronics system is integral to the overall data acquisition process, serving as the bridge between the raw data captured by the frontend electronics and the higher-level data processing systems. Upon receiving the raw detector data, the backend electronics performs several critical functions to ensure effective data management and analysis. The connections between the backend electronics system and the other interfacing systems are depicted in Figure [1.66.](#page-66-0) The key functions of the backend electronics system are listed below.

<span id="page-66-0"></span>![](_page_66_Figure_3.jpeg)

**Figure 1.66:** General Backend electronics system connection

- 1. Data Reception : The backend electronics receives raw data directly from the frontend electronics, which encompasses sensors or detectors that convert physical signals into digital data.
- 2. Initial Data Processing and Compression : Once the raw data is received, the backend system processes it to filter out noise, enhance signal quality, and perform data compression. This processing condenses the data without significant loss of information, preparing it for more in-depth analysis.
- 3. Trigger Signal Generation : The backend electronics generates trigger signals that are forwarded to the trigger system. These signals are crucial for initiating specific operations and ensuring that the timing of data acquisition aligns with relevant events.
- 4. Data Caching : The backend system temporarily caches the raw data for a defined duration, ensuring that it can access this information while waiting for the trigger system's decision. This feature is vital for maintaining data integrity and accessing relevant data even as the system processes decisions.
- 5. Final Trigger Decision Processing : After a predetermined delay, the trigger system sends a final trigger decision back to the backend electronics. Based on this decision, the backend can intelligently filter through the buffered data to select the relevant information that meets the final trigger criteria.
- 6. Data Packaging and Transmission : Once the relevant data is identified, it is appropriately packaged and sent to the Data Acquisition (DAQ) system for comprehensive analysis and long-term storage.
- 7. Communication with the Data Control System (DCS) : The backend electronics system exchanges status monitoring and control signals with the DCS, ensuring effective oversight and management of both the frontend detectors and the backend electronics. This dialogue is essential for maintaining operational integrity and system performance.
- 8. Synchronization Clock Distribution : The backend electronics also communicates with the synchronization clock distribution system, which ensures that all components maintain precise timing. This synchronization is critical for accurate data acquisition and processing, especially in high-speed applications.

Overall, the backend electronics system is responsible for critical data management processes that enhance the efficiency and effectiveness of the data acquisition system. Its ability to generate trigger signals, perform initial data processing, maintain synchronization, and ensure smooth communication between various components reinforces its pivotal role in high-performance data acquisition environments.
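The caching and trigger-selection steps listed above can be sketched as a fixed-depth latency buffer indexed by bunch-crossing number. This is only a software illustration of the idea. The class name, the buffer depth, and the payload format are hypothetical, and the real implementation would be FPGA firmware backed by DDR memory.

```python
from collections import OrderedDict

class LatencyBuffer:
    """Holds raw data per bunch crossing until a trigger decision arrives."""

    def __init__(self, depth: int):
        self.depth = depth              # number of crossings kept in the cache
        self._store = OrderedDict()     # bx_id -> payload, oldest first

    def push(self, bx_id: int, payload: bytes) -> None:
        """Cache raw data and evict crossings older than the buffer depth."""
        self._store[bx_id] = payload
        while len(self._store) > self.depth:
            self._store.popitem(last=False)   # drop the oldest crossing

    def on_trigger(self, bx_id: int):
        """Return the data of an accepted crossing, or None if it expired."""
        return self._store.pop(bx_id, None)

buffer = LatencyBuffer(depth=4)
for bx in range(6):
    buffer.push(bx, f"raw-{bx}".encode())

print(buffer.on_trigger(3))   # still cached, would be packaged for the DAQ
print(buffer.on_trigger(0))   # already evicted, the decision came too late
```

The buffer depth must cover the trigger latency at the raw data rate, which is what drives the memory requirement of the backend board.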

### **1.9.1 Backend Electronics Hardware Design**

The backend electronics system is built around ATCA (Advanced Telecommunications Computing Architecture) racks, optimizing both functionality and scalability. Each rack accommodates two ATCA crates, each of which houses ten general backend electronics boards. The remaining rack capacity is used for optical patch panels, power supplies, and other auxiliary equipment.

<span id="page-67-0"></span>![](_page_67_Figure_3.jpeg)

**Figure 1.67:** General backend electronics board structure

The hardware components of the custom backend electronics are illustrated in Figure [1.67,](#page-67-0) and they primarily consist of the following key sections:

- 1. FPGA: Serving as the core processing unit of the backend electronics board, the FPGA (Field Programmable Gate Array) is responsible for controlling and executing the majority of the backend electronics functions.
- 2. Clock Jitter-Cleaner: This component ensures that timing signals are free from jitter, maintaining signal integrity across high-speed data connections.
- 3. DDR Memory: Double Data Rate (DDR) synchronous dynamic random-access memory provides the memory resources needed for data buffering and temporary storage during processing.
- 4. Power Management: This section manages power distribution to ensure that all hardware components receive stable and adequate power.

From preliminary evaluations conducted with the frontend electronics, it is estimated that the fiber optic line rate for data transmission to the backend electronics will be 10 Gbps. To accommodate this high data throughput, a custom communication protocol will be implemented. Additionally, connections to the trigger system and the clock distribution system will also utilize fiber optic links that operate at the same 10 Gbps line rate. In contrast, communication with the DAQ (Data Acquisition) and DCS (Data Control System) will leverage 40 Gbps QSFP+ optical modules to enable robust and high-speed connectivity.

These requirements create stringent demands on both the speed and the quantity of high-speed serial transceivers integrated into the FPGA. Furthermore, the FPGA will be tasked with performing initial processing on the raw data, necessitating a considerable level of processing capability to handle the data rates effectively.

To ensure that our system meets these operational criteria, we have conducted a comparative analysis of the performance specifications of several widely-used FPGA models currently available on the market. The results of this comparison are summarized in Table [1.19,](#page-68-0) leading us to initially select the XC7VX690T-2FFG1158C as our preferred FPGA model. This choice is based on its ability to satisfy the high-speed communication requirements and processing demands of the backend electronics system, ensuring efficient operation and data integrity in a high-throughput environment.

<span id="page-68-0"></span>

| <b>FPGA</b>           | <b>XC7K325-2FFG900C</b> | <b>XCKU040-2FFVA1156E</b> | <b>XC7VX690T-2FFG1158C</b> | <b>XCKU115-2FLVF1924I</b> | <b>KU060</b>   |
|-----------------------|-------------------------|---------------------------|----------------------------|---------------------------|----------------|
| <b>Logic Cells (k)</b> | 326                    | 530                       | 693                        | 1451                      | 725            |
| <b>DSP Slices</b>     | 840                     | 1920                      | 3600                       | 5520                      | 2760           |
| <b>Memory (Kbits)</b> | 16020                   | 21100                     | 52920                      | 75900                     |                |
| <b>Transceivers</b>   | 16 (12.5 Gb/s)          | 20 (16.3 Gb/s)            | 48 (13.1 Gb/s)             | 64 (16.3 Gb/s)            | 32 (16.3 Gb/s) |
| <b>I/O Pins</b>       | 500                     | 520                       | 350                        | 832                       | 624            |

**Table 1.19:** Comparison of Common FPGA Performance Parameters

### **1.9.2 FPGA Firmware Algorithm Development**

Based on the functional requirements of the backend electronics system, several key functional modules will be implemented in the FPGA, as illustrated in Figure [1.67:](#page-67-0)

- 1. Data Interfaces and Communication High-speed data communication will be facilitated through the GTH transceivers integrated within the FPGA, supporting a multichannel transmission bandwidth of at least 10 Gbps. The communication protocols will be designed to accommodate various upper-layer protocols in accordance with system specifications. Specifically, the frontend data will be packaged and transmitted using a custom protocol akin to lpGBT, necessitating the implementation of appropriate unpacking logic within the FPGA to effectively interpret this data. Additionally, for communication with the DAQ and DCS systems, the standard TCP/IP protocol will be utilized, which requires the FPGA to implement a hardware protocol stack to maximize communication efficiency and throughput.
- 2. Data Processing and Packaging The FPGA will be responsible for categorizing and processing the incoming raw data based on channel and timing information. This involves assigning the data to the corresponding digital signal processing algorithms for further refinement and analysis to generate the required trigger signals. Upon receiving a trigger signal, the FPGA will select the relevant valid data, efficiently package it, and transmit it according to a predefined data structure, ensuring that the information is organized and ready for subsequent stages of handling.
- 3. Digital Signal Processing Algorithms Once the FPGA receives the raw data, it will implement a variety of processing algorithms tailored to the specific type of detector in use. For instance, tracking detectors may require the implementation of cluster-finding algorithms, while calorimeters will focus on extracting timing and energy information. In scenarios where considerable noise is present, fast filtering algorithms may be deployed to enhance the signal-to-noise ratio. Additionally, real-time processing algorithms will facilitate the proactive acquisition of key information from physical events. This capability significantly improves the efficiency of the system's triggering mechanisms, reduces trigger latency, and alleviates the workload on both the DAQ and offline processing systems.
- 4. Clock Synchronization Technology In high-energy physics experiments, effective clock synchronization is crucial as it coordinates the timing across multiple channels, detectors, and the data acquisition system to ensure data consistency and accuracy. By researching and developing clock synchronization technology tailored to fiber optic transmission systems, it is possible to transmit both data and clock signals through a single fiber optical cable. This approach significantly minimizes the required number of optical cables and reduces overall system complexity.
- 5. DDR Controller Considering the need to buffer raw data until the final trigger signal is received from the trigger system, the backend processing module will incorporate a sufficiently large DDR memory. The FPGA will manage the timing control for reading from and writing to the DDR memory, ensuring that data integrity is maintained and that no data is lost during the buffering process.
- 6. Slow Control Registers The backend electronics system must be responsive to monitoring and control commands originating from the slow control system. Acting as a communication bridge between the backend and the frontend detectors, the FPGA will handle the parsing, forwarding, and responding to slow control commands. This functionality ensures that the system remains manageable and that any necessary adjustments can be made in response to operational conditions or changes.

By implementing these functional modules within the FPGA, the backend electronics system will be equipped to handle the rigorous demands of high-speed data acquisition and processing while maintaining reliability and efficiency in a high-energy physics experimental environment.

# <span id="page-69-0"></span>**1.9.3 Prototype performance**

![](_page_69_Figure_5.jpeg)

**Figure 1.68:** Backend electronics board prototype

The physical overview of the data aggregation processing board is presented in Figure [1.68](#page-69-0). The board is equipped with twelve SFP+ interfaces and two QSFP+ interfaces, which connect to the GTH transceivers of the FPGA. These interfaces serve as the communication pathways for data transmission between the data aggregation processing board and the upper computer, as well as the readout system that encompasses multiple readout channels. The total power consumption is about 40 W.

The GTH transceivers offer higher data rates and improved performance compared to GTX transceivers, making them suitable for applications that demand higher data transmission bandwidth. Additionally, the board is equipped with a 204-pin DDR3-SODIMM, which is intended for large data buffering during the algorithm processing phase. This configuration ensures that the board can efficiently manage the substantial data flow necessary for effective data aggregation and processing.

We conducted loopback tests to assess the transmission quality of 12 GTH links. The loopback tests used a Pseudo-Random Binary Sequence (PRBS) at an aggregate rate of 120 Gbps (12 channels at 10 Gbps each, PRBS $2^{31} - 1$). The tests were carried out until the measured bit error rate (BER) reached the $10^{-15}$ level (the Ethernet transmission standard for 10 Gbps requires a BER below $10^{-12}$). Figure [1.69](#page-70-0) displays the eye diagram obtained from the loopback test of channel 1 under 10 Gbps link conditions.

<span id="page-70-0"></span>![](_page_70_Figure_1.jpeg)

**Figure 1.69:** GTH eye diagram

# **1.10 Consideration on Electronics Crates & Cabling (Wei Wei, Zheng Wang)**

#### **1.10.1 Electronics Crate and Rack Design Specifications**

The requirements for crates and racks related to the electronics system mainly cover data transmission, low-voltage power supply, and high-voltage power supply. Because the electronics system is designed uniformly from top to bottom, crates and racks should also be designed and planned according to common specifications. Since crates and racks are usually based on current industry standards and existing products, we consider adopting the ATCA industrial standard, which is mature today and is expected to remain up to date over the coming period. As the standard rack height is typically 42U, the maximum number of crates that can be installed in a single rack is estimated from this height for the backend electronics, electronics power supply, and detector high-voltage power supply.

### **1.10.1.1 Backend Electronics Crate and Rack Considerations**

The backend electronics in this section mainly relate to data interfaces. According to the previous sections, the data rate of a single optical fiber channel is designed to be approximately 11.09 Gbps, with an effective data rate of about 9.71 Gbps. Considering the $\mu$TCA standard for designing backend data crates, the height of a general backend electronics PCB can be designed as 9U, with 16 input data channels. According to the $\mu$TCA standard, each crate has 14 plugin slots, with two reserved for control plugins, one for the clock synchronization plugin (TTC), and one upgradeable slot, allowing each crate to accommodate 10 general backend PCBs. To fully utilize ATCA rack resources, the height of the cooling space for each backend electronics crate is 2U, allowing for the installation of rack fans and accessories. Therefore, each rack can accommodate three backend crates, with the remaining space reserved for the data exchange switches of the DAQ system.
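The slot and channel bookkeeping above can be written out as a short sketch. All constants are the figures quoted in this section; the variable names are chosen here for illustration.

```python
# Figures quoted in the text for one backend data crate.
SLOTS_PER_CRATE = 14
CONTROL_SLOTS = 2
TTC_SLOTS = 1
UPGRADE_SLOTS = 1
CHANNELS_PER_BOARD = 16
LINE_RATE_GBPS = 11.09      # raw rate of one optical fiber channel
PAYLOAD_RATE_GBPS = 9.71    # effective data rate of one channel
CRATES_PER_RACK = 3

boards_per_crate = SLOTS_PER_CRATE - CONTROL_SLOTS - TTC_SLOTS - UPGRADE_SLOTS
channels_per_crate = boards_per_crate * CHANNELS_PER_BOARD
channels_per_rack = channels_per_crate * CRATES_PER_RACK
payload_per_crate_gbps = channels_per_crate * PAYLOAD_RATE_GBPS
link_efficiency = PAYLOAD_RATE_GBPS / LINE_RATE_GBPS

print(f"boards per crate:   {boards_per_crate}")
print(f"channels per crate: {channels_per_crate}")
print(f"channels per rack:  {channels_per_rack}")
print(f"payload per crate:  {payload_per_crate_gbps:.1f} Gbps")
print(f"payload fraction of line rate: {link_efficiency:.3f}")
```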

### **1.10.1.2 Power Supply Crate and Rack Considerations**

The power supply crate mainly involves two design considerations:

- **i)** Power supply for detector frontend electronics. As discussed earlier, the power supply for detector frontend electronics will use high-voltage DC for long-distance transmission and utilize frontend DC-DC modules for voltage conversion. The high-voltage DC part is planned to be provided by a 110V commercial DC power supply crate. The number of power channels has some customization flexibility, mainly limited by the total power of the crate. However, due to the different actual performance requirements of each detector, the power consumption of detector module frontend electronics varies significantly. If a unified output power standard is followed, it can only be based on the highest power detector frontend module, limiting the number of output channels of the power supply crate and reducing its space utilization. Therefore, based on the statistics of detector frontend electronics and modules, the power supply crate output channels are divided into high-power and low-power, with a single channel power limit of 100W for high power and a capacity of 48 channels per crate; and a single channel power limit of 40W for low power, with a capacity of 96 channels per crate. Furthermore, considering the height of the power supply crate as 3U and the cooling space as 1U, a 42U rack can accommodate 10 power supply crates.
- **ii)** Power supply for the power supply crate itself. A commercial high-voltage crate connected to the AC power grid converts AC 380 V to DC 110 V and supplies the DC power supply crates. The typical power supply capacity is 60-70 kW per crate, with a height of 6U and a capacity of 10 output power channels. Considering a cooling space of 2U for each crate, a 42U rack can accommodate 5 of the aforementioned high-voltage AC crates.

### **1.10.1.3 Detector High-Voltage Crate and Rack Considerations**

The high-voltage bias required for detector operation will also be based on the same crate standard, provided uniformly from the underground hall room. Combining the preliminary design of sub-detectors, most detectors operate at voltages ranging from tens of volts to 200 volts, with relatively low power, so the main constraint on space requirements comes from the number of crate channels. Considering the parameters of the detector high-voltage crate provided to the ATLAS-HGTD project, each high-voltage crate contains a total of 224 channels, with the high voltage value of each channel independently adjustable. Each high-voltage crate has a height of 8U, with 2U reserved for cooling height, allowing each 42U rack to accommodate 4 detector high-voltage crates. In addition, considering that some detectors (such as TPC) require higher voltages, the safety spacing on the high-voltage crate plugins needs to be increased accordingly. For such SHV cases, the crate height will increase to 10-12U, with the cooling height remaining at 2U, allowing each 42U rack to accommodate 3 SHV crates.
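The rack occupancy figures quoted in Sections 1.10.1.1 to 1.10.1.3 all follow from dividing the 42U rack height by the crate height plus its cooling space. A minimal sketch is given below; it assumes the backend crate height equals the 9U board height, and the dictionary labels are hypothetical.

```python
RACK_HEIGHT_U = 42


def crates_per_rack(crate_u: int, cooling_u: int, rack_u: int = RACK_HEIGHT_U) -> int:
    """Whole crates (each with its cooling space) that fit in one rack."""
    return rack_u // (crate_u + cooling_u)


# (crate height, cooling height) in rack units, as quoted in the text.
CRATE_TYPES = {
    "backend data": (9, 2),     # assumes crate height = 9U board height
    "DC power": (3, 1),
    "AC-DC supply": (6, 2),
    "detector HV": (8, 2),
    "SHV, 10U": (10, 2),
    "SHV, 12U": (12, 2),
}

capacity = {name: crates_per_rack(h, c) for name, (h, c) in CRATE_TYPES.items()}
for name, n in capacity.items():
    print(f"{name:13s}: {n} crates per rack")

# Space left in a backend rack, e.g. for the DAQ data exchange switches.
backend_spare_u = RACK_HEIGHT_U - capacity["backend data"] * (9 + 2)
print(f"spare height in a backend rack: {backend_spare_u}U")
```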

## **1.10.2 Detector Frontend Electronics Cabling Considerations**

As discussed earlier, the electronics system adopts a uniform style, based on a common platform that completes the design from top to bottom. For the frontend modules of the sub-detectors, a "one light, one power" style is adopted for data and power connections. For data, all uplink and downlink traffic, including slow control, clock, etc., is carried through a bundle of optical fibers based on the frontend data interface. The number of fiber channels will be designed according to the data rate of the frontend modules, with a maximum capacity of 4 uplink fiber channels (corresponding to a data rate of 40 Gbps), without affecting the overall mechanical size. For the frontend electronics power supply, each detector module is fed by a high-voltage DC supply, which is converted to the required low voltages through DC-DC modules. When considering the total power consumption of the frontend, in addition to the power consumption of the frontend electronics chips themselves, the chipsets responsible for data transmission also consume power. The power consumption of the data chip ChiTu is 0.75 W, and the total power consumption of the optoelectronic interface KinWoo chipset is 0.25 W, totaling 1 W. The DC-DC modules converting the high-voltage DC also dissipate power. Taking the design target efficiency of the DC-DC modules in Section \*\* as 85%, an additional 18% is added to the power consumption of the frontend chips and data interface to obtain the total power consumption of the frontend electronics.
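The 18% overhead follows from the efficiency target. As a short worked relation (the symbols $P_{\mathrm{chip}}$, $P_{\mathrm{link}}$ and $\eta$ are notation introduced here for illustration, not taken from the report):

$$
P_{\mathrm{link}} = 0.75\,\mathrm{W} + 0.25\,\mathrm{W} = 1\,\mathrm{W}, \qquad
P_{\mathrm{total}} = \frac{P_{\mathrm{chip}} + P_{\mathrm{link}}}{\eta}, \qquad
\frac{1}{\eta} - 1 = \frac{1}{0.85} - 1 \approx 0.176 \approx 18\%.
$$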

This section will mainly review the cabling situation of the frontend electronics of each sub-detector, providing considerations for data readout and frontend power for each sub-detector.

### **1.10.2.1 The data and power considerations for the vertex detector**

According to the reference scheme of the vertex detector (refer to Section XX), it will consist of four Stitching layers and one double-sided Ladder layer, for a total of six chip layers. For simplicity of discussion, the repeat unit (RSU) of the Stitching layers and the independent chip of the Ladder layer are both treated as the typical 1024×512 pixel array of the vertex detector, with a pixel size of $25\ \mu m \times 25\ \mu m$.

**i)** Data connection

Based on the estimated background event rate of each layer of the vertex detector provided by MDI, the output data rate of each layer and each chip can be further calculated.

Note that in Table [1.20](#page-72-0), the BX rate is 1.34 MHz, the safety factor is 1.5, and the cluster size estimated from the technology is 3 pixels.
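The two hit-density columns of Table 1.20 are related through the BX rate. The sketch below multiplies the per-BX values by 1.34 MHz and compares the result with the quoted per-second values. The agreement is only at the few-percent level, presumably because the per-BX column is rounded to two decimals; that explanation is an assumption. The safety factor and cluster size are not applied in this conversion.

```python
BX_RATE_HZ = 1.34e6

# (hits/cm^2/BX, quoted kHits/cm^2/s) per layer, as listed in Table 1.20.
TABLE_1_20 = {
    "Stitching 1": (0.65, 870),
    "Stitching 2": (0.43, 580),
    "Stitching 3": (0.09, 116),
    "Stitching 4": (0.08, 110),
    "Ladder 5": (0.05, 70),
    "Ladder 6": (0.05, 68),
}


def khits_per_cm2_per_s(hits_per_bx: float, bx_rate_hz: float = BX_RATE_HZ) -> float:
    """Convert a hit density per bunch crossing into kHits/cm^2/s."""
    return hits_per_bx * bx_rate_hz / 1e3


relative_diff = {}
for layer, (per_bx, quoted) in TABLE_1_20.items():
    computed = khits_per_cm2_per_s(per_bx)
    relative_diff[layer] = abs(computed - quoted) / quoted
    print(f"{layer:12s} computed {computed:6.1f}  quoted {quoted:4d}")
```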

| Layer       | <b>Hit Density (Hits/cm<sup>2</sup>/BX)</b> | <b>Hit Density (kHits/cm<sup>2</sup>/s)</b> |
|-------------|---------------------------------------------|---------------------------------------------|
| Stitching 1 | 0.65                                        | 870                                         |
| Stitching 2 | 0.43                                        | 580                                         |
| Stitching 3 | 0.09                                        | 116                                         |
| Stitching 4 | 0.08                                        | 110                                         |
| Ladder 5    | 0.05                                        | 70                                          |
| Ladder 6    | 0.05                                        | 68                                          |

**Table 1.20:** Hit density estimation from the background rate

According to the operating mode of CEPC, the vertex detector will be compatible with the Higgs and LowLumi Z operation modes in the first phase. The processing capacity of the chip must therefore be designed for the highest event rate. According to the LowLumi Z estimate, the data rate of the innermost layer is 2 Gbps per chip unit, and the rates of the subsequent layers are reduced in proportion to their hit densities. Combining this with the arrangement of chips in each layer of the detector gives the following table:

| Layer       | Data rate/chip (Gbps) | Chips/row | Data rate/row (Gbps) | <b>Rows</b>       | Links@10Gbps                  |
|-------------|-----------------------|-----------|----------------------|-------------------|-------------------------------|
| Stitching 1 | 2                     | 8         | 16                   | $2 \times 2 = 4$  | $2 \times 4 = 8$ (2 fibers/chn)  |
| Stitching 2 | 1.3                   | 12        | 15.6                 | $3 \times 2 = 6$  | $2 \times 6 = 12$ (2 fibers/chn) |
| Stitching 3 | 0.27                  | 16        | 4.3                  | $4 \times 2 = 8$  | $1 \times 8 = 8$              |
| Stitching 4 | 0.25                  | 20        | 5                    | $5 \times 2 = 10$ | $1 \times 10 = 10$            |
| Ladder 5    | 0.16                  | 29        | 4.64                 | 25                | $1 \times 25 = 25$            |
| Ladder 6    | 0.16                  | 29        | 4.64                 | 25                | $1 \times 25 = 25$            |

**Table 1.21:** Data link estimation of the Vertex detector

For the inner four layers, which use the Stitching structure, data readout is currently assumed to be single-ended for each row of detectors because of chip manufacturing constraints. Since the maximum effective data rate for each dual-fiber channel is 9.71 Gbps, each row of the inner 2 layers needs two fiber channels to meet the readout target data rate. For the outer layers, one fiber channel per row is sufficient to meet the readout requirements.

From the table, it can be seen that the vertex detector requires a total of 88 fibers, and a total of 6 universal backend boards are needed. The frontend data communication task can be completed using 1 data crate.
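The link count can be reproduced with a short sketch. The per-chip rates, chips per row, and row counts are taken from Table 1.21 (with the 2 Gbps innermost-layer rate from the text); the rule of rounding each row up to a whole number of 9.71 Gbps links is an assumption that happens to reproduce the quoted totals.

```python
import math

LINK_PAYLOAD_GBPS = 9.71     # effective data rate of one fiber channel
CHANNELS_PER_BOARD = 16      # input channels of one backend board

# (layer, data rate per chip in Gbps, chips per row, rows)
LAYERS = [
    ("Stitching 1", 2.0, 8, 4),
    ("Stitching 2", 1.3, 12, 6),
    ("Stitching 3", 0.27, 16, 8),
    ("Stitching 4", 0.25, 20, 10),
    ("Ladder 5", 0.16, 29, 25),
    ("Ladder 6", 0.16, 29, 25),
]


def fibers_for_layer(rate_per_chip: float, chips_per_row: int, rows: int) -> int:
    """Fibers for one layer: whole links per row times the number of rows."""
    rate_per_row = rate_per_chip * chips_per_row
    links_per_row = math.ceil(rate_per_row / LINK_PAYLOAD_GBPS)
    return links_per_row * rows


fibers = {name: fibers_for_layer(r, c, n) for name, r, c, n in LAYERS}
total_fibers = sum(fibers.values())
backend_boards = math.ceil(total_fibers / CHANNELS_PER_BOARD)

for name, n in fibers.items():
    print(f"{name:12s}: {n} fibers")
print(f"total fibers: {total_fibers}, backend boards: {backend_boards}")
```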

## **ii)** Power connection

For power connections, for simplicity, the power consumption of chips in each layer is estimated based on the power consumption of the innermost layer at 40 mW/cm<sup>2</sup> (refer to Section x of the vertex detector chapter). Based on the chip unit area of 2.6 cm × 1.6 cm, the power consumption per chip unit is taken as approximately 200 mW. The table below is obtained based on the arrangement of chips in each layer.

| Layer       | Chips/row | Power/row (W) | <b>Rows</b>       | <b>Chip Power/Layer (W)</b> | <b>Total Power/Layer (Chip+Link)+85% (W)</b> |
|-------------|-----------|---------------|-------------------|-----------------------------|----------------------------------------------|
| Stitching 1 | 8         | 1.6           | $2 \times 2 = 4$  | 6.4                         | 12.2 (6.4+4)                                 |
| Stitching 2 | 12        | 2.4           | $3 \times 2 = 6$  | 14.4                        | 24 (14.4+6)                                  |
| Stitching 3 | 16        | 3.2           | $4 \times 2 = 8$  | 25.6                        | 39.5 (25.6+8)                                |
| Stitching 4 | 20        | 4             | $5 \times 2 = 10$ | 40                          | 58.8 (40+10)                                 |
| Ladder 5    | 29        | 5.8           | 25                | 145                         | 200 (145+25)                                 |
| Ladder 6    | 29        | 5.8           | 25                | 145                         | 200 (145+25)                                 |

**Table 1.22:** Power estimation of the Vertex detector
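The per-layer entries of Table 1.22 can be reproduced as follows. The sketch assumes 0.2 W per chip, one 1 W data interface per row (which is what the link term in the table implies), and an 85% DC-DC efficiency; the function name is hypothetical.

```python
CHIP_POWER_W = 0.2        # approximate power per chip unit, from the text
LINK_POWER_W = 1.0        # ChiTu 0.75 W + KinWoo chipset 0.25 W
DCDC_EFFICIENCY = 0.85    # design target efficiency of the DC-DC modules

# (layer, chips per row, rows), as in Table 1.22
LAYERS = [
    ("Stitching 1", 8, 4),
    ("Stitching 2", 12, 6),
    ("Stitching 3", 16, 8),
    ("Stitching 4", 20, 10),
    ("Ladder 5", 29, 25),
    ("Ladder 6", 29, 25),
]


def layer_power(chips_per_row: int, rows: int) -> tuple[float, float]:
    """Return (chip power, total power including links and DC-DC loss) in W."""
    chip_w = chips_per_row * CHIP_POWER_W * rows
    link_w = rows * LINK_POWER_W          # assumption: one interface per row
    return chip_w, (chip_w + link_w) / DCDC_EFFICIENCY


power = {name: layer_power(c, r) for name, c, r in LAYERS}
for name, (chip_w, total_w) in power.items():
    print(f"{name:12s}: chips {chip_w:6.1f} W, total {total_w:6.1f} W")
```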

In the design of the 1st to 4th layers based on the Stitching scheme, each layer consists of two semicircles of Stitching, with each semicircle using a power module powered by a cable at one end. For the outer layers using the Ladder scheme, due to relatively higher power consumption, each Ladder is powered from both ends with a power module provided at each end.

Taking into account the number of data interfaces and the additional power consumption, combined with the efficiency of the power modules, the total power consumption of the frontend of the vertex detector is approximately 449.8W, requiring a total of 66 power channels, powered by two power crates.

**iii)** High Voltage Connection

Since the vertex detector will be designed based on CMOS Sensor technology, it does not have an actual high voltage requirement. However, in order to improve detection efficiency, the detector substrate may need to be biased at a negative voltage, typically within -10 V. Considering all sub-detectors, this may be the only negative voltage bias requirement. Although the voltage is not high and could be provided by DC-DC modules, specially developed radiation-resistant negative voltage modules would be needed only in small quantities and would not be cost-effective, so the bias is instead planned to be provided through a "high voltage power supply".

The specific high voltage channel connection method is consistent with the power connection, with no special requirements for the power of the high voltage power supply. Therefore, a total of 66 high voltage channels are needed, powered by 1 high voltage crate.

## **1.10.2.2 The data and power considerations for the TPC**

The TPC detector will consist of only two endcaps. Referring to the design scheme of the sub-detector in Section X, each endcap is composed of 248 modules.

For data connection, according to MDI estimates, in Higgs mode the data rate of each module is between 30 and 100 Mbps, much lower than the data capacity of a single fiber optic cable. In addition, due to the use of low-power design, the total power consumption of each endcap of the TPC is limited to 10 kW, with each module's power consumption limited to 40.3 W. Considering the power consumption of the data interface at 1 W, and an 85% power module efficiency, the total power consumption of each module is approximately 42 W.

To maintain system reliability, the implementation is still based on one fiber optic cable and one power supply cable per module. Therefore, the fiber optic data channels, power channels, and high voltage channels are all 496. Designed independently for the two endcaps, they require 32 backend boards, 4 data crates, 6 power crates, and 4 high voltage crates for the frontend cabling.
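The board and crate counts follow from rounding up separately for each endcap. The sketch below uses the capacities given in Section 1.10.1 (16 channels per backend board, 10 boards per data crate, 224 channels per high-voltage crate); the use of the 96-channel power crate is an assumption inferred from the quoted total of 6 power crates, and the function name is hypothetical.

```python
import math

CHANNELS_PER_BOARD = 16
BOARDS_PER_DATA_CRATE = 10
POWER_CHANNELS_PER_CRATE = 96   # assumption: 96-channel crate type
HV_CHANNELS_PER_CRATE = 224


def count_resources(modules_per_side: int, sides: int = 2) -> dict:
    """Resources for a detector built from `sides` independent halves,
    with one fiber, one power channel and one HV channel per module."""
    boards = math.ceil(modules_per_side / CHANNELS_PER_BOARD)
    return {
        "channels": modules_per_side * sides,
        "backend_boards": boards * sides,
        "data_crates": math.ceil(boards / BOARDS_PER_DATA_CRATE) * sides,
        "power_crates": math.ceil(modules_per_side / POWER_CHANNELS_PER_CRATE) * sides,
        "hv_crates": math.ceil(modules_per_side / HV_CHANNELS_PER_CRATE) * sides,
    }


tpc = count_resources(modules_per_side=248)
print(tpc)
```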

#### **1.10.2.3 The data and power considerations for the Inner Tracker**

In addition to the vertex detector and TPC, the other sub-detector systems will be composed of barrel and endcap sections, with significant differences in the arrangement and organization of detector modules. Therefore, the cabling-related issues of each sub-detector will be discussed separately for the barrel and endcap sections.

**i)** ITK Barrel

According to the ITK design scheme in Section X, it will consist of 3 layers of barrel detectors and 4 layers of double-sided endcap detectors. In the barrel section of the ITK, the frontend chips will be implemented based on HVCMOS pixel chips, with an area of 2 cm × 2 cm and a data width of 42 bits as provided in Section X. Each detector module will consist of 2×7 = 14 chips. Based on MDI's background estimates for the three layers of the barrel section, the data rate of chips and modules can be obtained as shown in the table below. Even under extreme conditions, the maximum data rate of ITK barrel detector modules remains within 1 Gbps, leaving a large margin in the data channels. On the other hand, the power consumption estimate provided by the frontend chip design is 200 mW/cm<sup>2</sup>. Combining this with the chip area, the power consumption per chip is 0.8 W, and the net power consumption of the module is 11.2 W. Both the data rate and the power consumption of the module are therefore modest. With the one-fiber, one-power strategy, and including the 1 W data interface and the 85% DC-DC efficiency, the total power consumption of the module is 14.4 W.

According to the module arrangement in the table below for each layer of the ITK, the total number of modules is 2204. Therefore, the data channels, power channels, and high voltage channels are all 2204, requiring 139 backend boards, 14 data crates, 29 power crates, and 10 high voltage crates to achieve all electronics connections. table x

#### **ii)** ITK Endcap

For the ITK endcap, a dual-layer overlapping design is adopted for each endcap to achieve seamless coverage. Furthermore, unlike the uniform module design used in the barrel section, the size of the endcap modules and the number of chips they contain vary significantly, resulting in different data rates and power consumption for detector modules at different positions. In terms of electronics connections, it is necessary to determine the layout of each layer of the endcap based on the specific arrangement of the ITK endcap. This, combined with the counting rate estimates provided by background analysis, allows for an assessment of data rates. From Table X, considering the maximum number of chips and counting rates for each layer of endcap modules, the maximum data rate of endcap modules is around 2.2 Gbps, which is still within the capacity of a single data channel. Therefore, the approach of using one fiber per module can still handle the current data rates.

| <b>ITKE</b> | <b>Ladder Max Chips</b> | <b>Bkgrd Rate Avg (Hz/cm<sup>2</sup>)</b> | <b>Bkgrd Rate Max (Hz/cm<sup>2</sup>)</b> | <b>Module Rate Avg (Mbps)</b> | <b>Module Rate Max (Mbps)</b> | <b>Modules/Ladders (Fibers)</b> |
|-------------|-------------------------|-------------------------------------------|-------------------------------------------|-------------------------------|-------------------------------|---------------------------------|
|             | 9                       | 39k                                       | 230k                                      | 47.2                          | 278.4                         | 224                             |
|             | 13                      | 160k                                      | 380k                                      | 279.7                         | 664.3                         | 320                             |
|             | 22                      | 89k                                       | 750k                                      | 263.3                         | 2218.9                        | 608                             |
|             | 22                      | 24k                                       | 63k                                       | 71.0                          | 186.4                         | 544                             |

**Table 1.24:** Data rate estimation of the Inner Tracker Endcap

The power consumption of the frontend chips at the ITK endcap is 80 mW/cm<sup>2</sup>, with a chip area of 2.1 cm × 2.3 cm, resulting in a power consumption of 336 mW per frontend chip. Therefore, considering the maximum number of chips per endcap module, the maximum net power consumption of a module's frontend is 7.4 W, which is relatively low. To save power crate resources, the power supply interfaces are consolidated at the sector level, with power connections made to the 8 sectors of each endcap layer and distributed to each module after DC-DC conversion. Furthermore, for the outer endcap, due to its larger total area and higher total power consumption, two power channels are provided for each sector, as shown in Table X. Considering that each module requires a certain level of independent adjustment and compensation capability for high voltage, a separate high-voltage power channel is still provided for each module, with an adjustment range of 50 to 200 V.

| <b>ITKE</b> | <b>Ladder Max Chips</b> | <b>Ladder Max Pwr (W)</b> | <b>Chips per Sector</b> | <b>Sector Chip Pwr (W)</b> | <b>Sector Link Power (W)</b> | <b>Sector Power (Chip+Link)+85% (W)</b> | <b>Layer Power (×8×2×2) (W)</b> | <b>Power Chns (1 Chn/Sect)</b> |
|-------------|-------------------------|---------------------------|-------------------------|----------------------------|------------------------------|-----------------------------------------|---------------------------------|--------------------------------|
|             | 9                       | 3.02                      | 48                      | 16.1                       | 7                            | 27.2                                    | 870.6                           | $1 \times 32$                  |
|             | 13                      | 4.36                      | 98                      | 32.9                       | 10                           | 50.5                                    | 1616.0                          | $1 \times 32$                  |
|             | 22                      | 7.39                      | 299                     | 100.5                      | 18                           | 139.4                                   | 4497.4                          | $2 \times 32$                  |
|             | 22                      | 7.39                      | 274                     | 92.1                       | 15                           | 126                                     | 4105.9                          | $2 \times 32$                  |

**Table 1.25:** Power estimation of the Inner Tracker Endcap

Overall, for the ITK endcap, there are a total of 1696 fiber optic channels for data transmission, requiring 106 backend boards and a total of 12 data crates for readout. There are 192 power channels, supplied by 2 power crates, and a total of 1696 high-voltage channels for the detector, requiring 8 high voltage crates.
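These totals can be cross-checked against Tables 1.24 and 1.25. The sketch below sums the fiber and power-channel columns and rounds the crate counts up separately for each of the two endcaps. The per-endcap rounding and the 96-channel power crate are assumptions that reproduce the quoted numbers.

```python
import math

N_ENDCAPS = 2
FIBERS_PER_LAYER = [224, 320, 608, 544]       # Table 1.24, both endcaps
POWER_CHANNELS_PER_LAYER = [32, 32, 64, 64]   # Table 1.25, both endcaps

CHANNELS_PER_BOARD = 16
BOARDS_PER_DATA_CRATE = 10
POWER_CHANNELS_PER_CRATE = 96   # assumption: 96-channel crate type
HV_CHANNELS_PER_CRATE = 224

total_fibers = sum(FIBERS_PER_LAYER)
total_power_channels = sum(POWER_CHANNELS_PER_LAYER)

# One fiber and one HV channel per module; counts per endcap, then doubled.
fibers_per_endcap = total_fibers // N_ENDCAPS
power_per_endcap = total_power_channels // N_ENDCAPS
boards_per_endcap = math.ceil(fibers_per_endcap / CHANNELS_PER_BOARD)

backend_boards = boards_per_endcap * N_ENDCAPS
data_crates = math.ceil(boards_per_endcap / BOARDS_PER_DATA_CRATE) * N_ENDCAPS
power_crates = math.ceil(power_per_endcap / POWER_CHANNELS_PER_CRATE) * N_ENDCAPS
hv_crates = math.ceil(fibers_per_endcap / HV_CHANNELS_PER_CRATE) * N_ENDCAPS

print(f"fibers {total_fibers}, boards {backend_boards}, data crates {data_crates}")
print(f"power channels {total_power_channels}, power crates {power_crates}")
print(f"HV channels {total_fibers}, HV crates {hv_crates}")
```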

## **1.10.2.4 The data and power considerations for the Outer Tracker**

**i)** OTK Barrel

Based on the layout scheme of the OTK detector discussed in Section X, its barrel section adopts a three-tier structure, with a 6-meter-long stave containing six 1-meter-long Ladders, each of which includes 7 modules, and each module contains 22 frontend ASIC chips. Through data aggregation and power distribution at each level, the data readout and power conversion modules will be located on the second-level aggregator board of each ladder.

According to the background simulation results, the average background counting rate for the OTK is 7 kHz/cm<sup>2</sup>, with a maximum rate of 9 kHz/cm<sup>2</sup>. With a module area of 14 cm $\times$ 14 cm and a 48-bit data width for the ASIC chips, the average data rate per module is 65.9 Mbps, with a maximum rate of 84.7 Mbps, indicating low data volume. The corresponding total data rates of the Ladders are 461.3 Mbps and 1355.2 Mbps, providing sufficient margin for fiber optic channels. Therefore, setting the data channels at the ladder level is more appropriate and can reduce the number of frontend fiber optics.

Regarding frontend power consumption, the power consumption per channel of the frontend ASIC is 20 mW, totaling 2.56 W for 128 channels per chip. Therefore, the frontend module consumes 56.32 W, a relatively high value, necessitating power channels to be provided per module rather than per ladder.

Similarly, considering the ability of each sensor in the LGAD detector to independently adjust high voltage, high-voltage channels should also be provided at the module level, with a voltage range of 150-200 V.

Overall, the barrel section of the OTK requires a total of 540 fiber optic channels, corresponding to 34 backend PCBs and 4 data crates. A total of 3780 power channels are required, corresponding to 79 power crates. The detector high-voltage channels also number 3780, corresponding to 17 high-voltage crates.
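The module-level rates, the power figures, and the crate counts of the barrel section can be reproduced with the sketch below. It assumes one 48-bit word per hit and the 48-channel (high-power) power crate, the latter inferred from the module power exceeding the 40 W low-power limit; only quantities that follow unambiguously from the quoted inputs are computed.

```python
import math

DATA_WIDTH_BITS = 48
MODULE_AREA_CM2 = 14 * 14
RATE_AVG_HZ_PER_CM2 = 7e3
RATE_MAX_HZ_PER_CM2 = 9e3
MODULES_PER_LADDER = 7
CHIPS_PER_MODULE = 22
CHANNELS_PER_CHIP = 128
POWER_PER_CHANNEL_W = 0.020
N_LADDERS = 540                 # one fiber per ladder, from the text


def module_rate_mbps(rate_hz_per_cm2: float) -> float:
    """Module data rate, assuming one 48-bit word per hit."""
    return rate_hz_per_cm2 * MODULE_AREA_CM2 * DATA_WIDTH_BITS / 1e6


rate_avg = module_rate_mbps(RATE_AVG_HZ_PER_CM2)
rate_max = module_rate_mbps(RATE_MAX_HZ_PER_CM2)
ladder_rate_avg = rate_avg * MODULES_PER_LADDER

chip_power_w = CHANNELS_PER_CHIP * POWER_PER_CHANNEL_W
module_power_w = CHIPS_PER_MODULE * chip_power_w

n_modules = N_LADDERS * MODULES_PER_LADDER
backend_boards = math.ceil(N_LADDERS / 16)      # 16 channels per board
data_crates = math.ceil(backend_boards / 10)    # 10 boards per crate
power_crates = math.ceil(n_modules / 48)        # assumption: 48-channel crate
hv_crates = math.ceil(n_modules / 224)          # 224 channels per HV crate

print(f"module rate: {rate_avg:.1f} / {rate_max:.1f} Mbps (avg / max)")
print(f"ladder rate (avg): {ladder_rate_avg:.1f} Mbps")
print(f"chip {chip_power_w:.2f} W, module {module_power_w:.2f} W")
print(f"modules {n_modules}, boards {backend_boards}, data crates {data_crates}, "
      f"power crates {power_crates}, HV crates {hv_crates}")
```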

#### **ii)** OTK Endcap

For the ITK endcap, the frontend ASIC design will be consistent with the barrel section. Background simulations provide estimates of 3 kHz/cm<sup>2</sup> and 35 kHz/cm<sup>2</sup> for the counting rates. According to the mechanical layout design of the ITK endcap detector, each endcap consists of 24 Pedals, each of which can be further divided into 10 rings according to the radius, with the inner 5 rings containing one sector per ring and the outer 5 rings containing two sectors per ring. With an endcap area of 19.4 m2, the area of each Pedal is calculated to be 4041.7 cm<sup>2</sup>. The total data rates at the Pedal level are an average of 582 Mbps and a peak of 6.79 Gbps, with the peak data rate approaching the limit of a single fiber optic channel. Considering the potential higher data rates in Low LumiZ mode compared to Higgs operation and leaving room for future upgrades, the data interfaces will still be deployed at the sector level. Based on the specific arrangement of different sectors on each ring, the maximum number of chips per sector is 23, resulting in a frontend chip power consumption of 58.9 W per sector. Therefore, the power channels will also need to be provided at the sector level and cannot be aggregated at a higher level.

The detector high-voltage situation is similar to that of the barrel section, with provision at the sector level. Therefore, the data interfaces, power channels, and detector high voltage for the OTK endcap all correspond to the sector level. With a total of 720 sectors for both endcaps, 720 fiber optic channels, 45 backend boards, and 6 data crates are needed, along with 720 power channels and 16 power crates, as well as 720 detector high-voltage channels and 4 high-voltage crates.
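The endcap figures above can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only and rests on two assumptions that are not stated in this paragraph: that the 19.4 m<sup>2</sup> covers both endcaps (48 Pedals), and that each hit occupies 48 bits, as quoted for the calorimeter and muon ASICs. With these assumptions the quoted sector count, Pedal area, and data rates are reproduced.

```python
# Cross-check of the endcap sector count and Pedal-level data rates.
# Assumptions (not stated in the paragraph above): the 19.4 m^2 area covers
# both endcaps, and each hit is encoded in 48 bits.
PEDALS_PER_ENDCAP = 24
N_ENDCAPS = 2
INNER_RINGS, OUTER_RINGS = 5, 5      # one sector per inner ring, two per outer ring
TOTAL_AREA_CM2 = 19.4e4              # 19.4 m^2 expressed in cm^2
BITS_PER_HIT = 48                    # assumed hit word width


def total_sectors() -> int:
    """Number of sectors summed over both endcaps."""
    sectors_per_pedal = INNER_RINGS * 1 + OUTER_RINGS * 2
    return sectors_per_pedal * PEDALS_PER_ENDCAP * N_ENDCAPS


def pedal_area_cm2() -> float:
    """Area of a single Pedal in cm^2."""
    return TOTAL_AREA_CM2 / (PEDALS_PER_ENDCAP * N_ENDCAPS)


def pedal_data_rate_bps(rate_hz_per_cm2: float) -> float:
    """Data rate of one Pedal for a given hit rate density."""
    return pedal_area_cm2() * rate_hz_per_cm2 * BITS_PER_HIT


if __name__ == "__main__":
    print(total_sectors())                          # sectors for both endcaps
    print(round(pedal_area_cm2(), 1))               # cm^2 per Pedal
    print(pedal_data_rate_bps(3e3) / 1e6, "Mbps")   # average rate
    print(pedal_data_rate_bps(35e3) / 1e9, "Gbps")  # peak rate
```

Since each Pedal holds 15 sectors, reading out at the sector level leaves each fiber far below the Pedal-level peak.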

#### **1.10.2.5 The data and power considerations for the ECAL**

#### **i)** ECAL Barrel

According to the ECAL detector design, the barrel section is divided into 480 modules. Each module contains 992 or 1000 crystal bars, depending on their orientation. The electronics on each module are organized into module side boards and module back boards. The height of the module side boards is limited to 3.2 mm to preserve detector efficiency. Each crystal is read out from both ends, corresponding to two electronics channels. Each side board does not exceed the 1 cm crystal width in height and is 40 cm long, so only small ASIC chips can be placed on it, not optical fiber interfaces or power modules. Therefore, the data from all side boards will be aggregated on the module back boards, read out through fiber optic interfaces, and powered by DC-DC modules. For ECAL, HCAL, and Muon, a unified front-end ASIC design will be used, with a data width of 48 bits.

Background simulations give an average ECAL counting rate of around 100 kHz per crystal. For each module, the total data rate of 1000 dual-readout crystal bars is therefore 9.6 Gbps. This exceeds the transmission capacity of a single data interface, so two fiber optic channels are planned for each ECAL module. This results in a total of 960 optical fibers, corresponding to 60 backend PCBs and 6 data crates. For the power supply, since each front-end ASIC channel consumes 15 mW, the front-end power consumption of each module is 30 W, which is a moderate level. Including the power consumption of each module's data interface and the power conversion efficiency, the total power consumption of the module is 36.5 W. A total of 480 power channels are required, corresponding to 10 power crates.
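As a minimal sketch, the ECAL barrel module numbers above follow from the channel count, the hit rate, and the 48-bit data width. The value of 16 fibers per backend board is an assumption inferred from the quoted 960 fibers and 60 backend PCBs; it is not stated in this paragraph.

```python
import math

# Inputs quoted in the text for one ECAL barrel module.
BARS_PER_MODULE = 1000
READOUTS_PER_BAR = 2            # each crystal is read out from both ends
RATE_HZ_PER_CRYSTAL = 100e3     # average background counting rate
BITS_PER_HIT = 48               # unified front-end ASIC data width
ASIC_W_PER_CHANNEL = 0.015      # 15 mW per channel
N_MODULES = 480
FIBERS_PER_MODULE = 2
FIBERS_PER_BACKEND_BOARD = 16   # assumption inferred from 960 fibers / 60 boards


def module_data_rate_bps() -> float:
    """Raw data rate of one module with dual-ended readout."""
    return BARS_PER_MODULE * READOUTS_PER_BAR * RATE_HZ_PER_CRYSTAL * BITS_PER_HIT


def module_asic_power_w() -> float:
    """Front-end ASIC power of one module, before interface and DC-DC losses."""
    return BARS_PER_MODULE * READOUTS_PER_BAR * ASIC_W_PER_CHANNEL


def barrel_fibers_and_boards() -> tuple[int, int]:
    """Total optical fibers and backend boards for the ECAL barrel."""
    fibers = N_MODULES * FIBERS_PER_MODULE
    return fibers, math.ceil(fibers / FIBERS_PER_BACKEND_BOARD)


if __name__ == "__main__":
    print(module_data_rate_bps() / 1e9, "Gbps per module")
    print(module_asic_power_w(), "W per module")
    print(barrel_fibers_and_boards())
```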

The current target SiPM devices require a high-voltage bias of around 24 V. One global high-voltage input is provided for each module, while the variation between SiPM devices within the module will be compensated by the corresponding ASIC chip. Therefore, a total of 480 high-voltage channels are required, corresponding to 2 high-voltage crates.

**ii)** ECAL Endcap

The module organization of the ECAL endcaps is similar to that of the barrel section, with electronics side boards and back boards. Data and power will be provided from the back board. Although the counting rate of the endcaps is higher than that of the barrel section, using 2 fiber optic channels for each module is still considered reasonable.

Based on the current ECAL layout, each endcap contains approximately 130 modules, requiring a total of 520 optical fibers, corresponding to 34 backend boards and 4 data crates. In addition, 260 power channels are needed, corresponding to 4 power crates, and 260 HV channels are required, corresponding to 2 high-voltage crates.

#### **1.10.2.6 The data and power considerations for the HCAL**

**i)** HCAL Barrel

According to the arrangement of the HCAL barrel section, the entire barrel is divided into 16 sectors, with each sector divided into 10 divisions in the z direction and composed of 48 layers of modules stacked in the $\phi$ direction. Because only 3.2 mm of space is available for electronics on each layer, fiber optic and power modules cannot be placed within a layer. Instead, data aggregation and power distribution must first be done at the end of the detector through end PCBs before connecting to the electronics interconnections. Data interfaces and power channels will therefore be deployed on the end PCBs for each layer. Furthermore, depending on the $\phi$ position of each layer, different numbers of PCBs are needed to maximize coverage. For layers 1 to 19, 3 PCBs are arranged per layer, while for layers 20 to 48, 4 PCBs are arranged per layer. Each PCB has a uniform length in the z direction, corresponding to 15 scintillating glasses, and is available in three different lengths in the $\phi$ direction: 6, 7, and 8 scintillating glasses.

According to the background estimate provided by MDI, the average counting rate for each scintillating glass in the HCAL barrel section is 5 kHz. With a maximum of 120 channels per PCB (15 × 8 scintillating glasses) and a 48-bit data width, the maximum data rate is 28.8 Mbps per PCB. This data rate is reasonable for signal transmission over a distance of 3 meters. The total data rate for each end board is the aggregation of data from 5 PCBs, totaling 144 Mbps. This is still far below the capacity of a single fiber optic channel, leaving significant room for upgrades for High LumiZ operation.

For power consumption, the ChoMin ASIC chip at the frontend of the SiPM consumes 15 mW/channel, resulting in a total power consumption of 1.8 W per PCB and 9 W for the frontend chips served by each end board. Considering the power consumption of 1 W for data interfaces and the efficiency of DC-DC conversion, the total power consumption for the end PCB is 11.8 W.
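The power budget above has a simple structure: front-end ASIC power plus interface power, divided by the DC-DC conversion efficiency. The sketch below assumes an efficiency of 0.85. This value is not stated in this paragraph and is chosen here only because it reproduces the quoted 11.8 W; the function name and defaults are illustrative.

```python
def board_input_power_w(
    n_channels: int,
    asic_w_per_channel: float = 0.015,   # 15 mW per channel, from the text
    interface_w: float = 1.0,            # data interface power, from the text
    dcdc_efficiency: float = 0.85,       # ASSUMED value, not given in this paragraph
) -> float:
    """Input power that must be supplied to one aggregation (end) board."""
    if not 0.0 < dcdc_efficiency <= 1.0:
        raise ValueError("efficiency must be in (0, 1]")
    load_w = n_channels * asic_w_per_channel + interface_w
    return load_w / dcdc_efficiency


if __name__ == "__main__":
    channels_per_pcb = 15 * 8          # maximum PCB size in scintillating glasses
    pcbs_per_end_board = 5
    n = channels_per_pcb * pcbs_per_end_board
    print(round(board_input_power_w(n), 1), "W per HCAL barrel end board")
```

Keeping the efficiency as an explicit parameter makes it easy to see how sensitive the per-channel power budget is to the DC-DC module performance.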

Therefore, the HCAL barrel section requires a total of 5536 optical fibers for data interfaces, corresponding to 346 backend boards and 36 data crates. Similarly, 5536 power channels are required, corresponding to 116 power crates, along with 26 detector high-voltage crates.
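The fiber count can be reproduced from the layer layout described earlier. The following sketch assumes that end boards sit at both z ends of each sector (each aggregating 5 of the 10 z divisions) and that one backend board serves 16 fibers. Both are interpretations inferred from the quoted totals rather than statements from the text.

```python
import math

SECTORS = 16
LAYERS = 48
ENDS_PER_SECTOR = 2             # assumption: end boards at both z ends
FIBERS_PER_BACKEND_BOARD = 16   # assumption inferred from 5536 fibers / 346 boards


def pcbs_in_layer(layer: int) -> int:
    """Number of PCBs across phi for a 1-based layer index, as given in the text."""
    if not 1 <= layer <= LAYERS:
        raise ValueError("layer must be between 1 and 48")
    return 3 if layer <= 19 else 4


def barrel_end_boards() -> int:
    """End boards (one fiber and one power channel each) for the whole barrel."""
    per_sector_end = sum(pcbs_in_layer(layer) for layer in range(1, LAYERS + 1))
    return per_sector_end * ENDS_PER_SECTOR * SECTORS


if __name__ == "__main__":
    fibers = barrel_end_boards()
    print(fibers, "fibers")
    print(math.ceil(fibers / FIBERS_PER_BACKEND_BOARD), "backend boards")
```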

**ii)** HCAL Endcap

Similar to the barrel section, each endcap of the HCAL is also divided into 16 sectors and 48 layers. Because the detector background in the endcaps is higher than in the barrel section, two end PCBs are provided for each sector on each layer of the endcaps to facilitate data aggregation and power distribution at the edges of the endcaps. According to the background estimate provided by MDI simulations, the average counting rate for each scintillating glass in the HCAL endcaps can reach up to 50 kHz. With approximately 1459 units on each layer of each sector of the endcaps, the total data rate for each sector on each layer is 3.5 Gbps, corresponding to a data rate of 1.75 Gbps for each aggregation board. At this data rate, it is appropriate to use one fiber optic readout per end PCB.

In terms of power supply, the total power consumption for each sector on each layer of the endcaps is 22 W. Taking into account the 1 W power consumption of the data interfaces on each aggregation board and the power conversion efficiency, the total power consumption of each aggregation board is approximately 14.2 W, which is a reasonable level.

Therefore, the HCAL endcaps require a total of 3072 fiber optic cables  $(48\times2\times16\times2)$  for data interfaces, corresponding to 192 backend boards and 20 data crates. Additionally, 3072 power channels correspond to 32 power crates and 14 high-voltage crates.
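A short sketch of the endcap arithmetic is given below. It uses the 48-bit data width quoted for the unified front-end ASIC. The figure of 16 fibers per backend board is an assumption inferred from the quoted 3072 fibers and 192 backend boards.

```python
import math

UNITS_PER_SECTOR_LAYER = 1459   # approximate, from the text
RATE_HZ_PER_UNIT = 50e3         # upper estimate of the counting rate
BITS_PER_HIT = 48
LAYERS = 48
END_PCBS_PER_SECTOR_LAYER = 2
SECTORS = 16
N_ENDCAPS = 2
FIBERS_PER_BACKEND_BOARD = 16   # assumption inferred from 3072 fibers / 192 boards


def sector_layer_rate_bps() -> float:
    """Data rate of one sector on one layer."""
    return UNITS_PER_SECTOR_LAYER * RATE_HZ_PER_UNIT * BITS_PER_HIT


def end_pcb_rate_bps() -> float:
    """Data rate carried by each of the two end PCBs."""
    return sector_layer_rate_bps() / END_PCBS_PER_SECTOR_LAYER


def endcap_fibers() -> int:
    """One fiber per end PCB, summed over layers, sectors, and both endcaps."""
    return LAYERS * END_PCBS_PER_SECTOR_LAYER * SECTORS * N_ENDCAPS


if __name__ == "__main__":
    print(sector_layer_rate_bps() / 1e9, "Gbps per sector and layer")
    print(end_pcb_rate_bps() / 1e9, "Gbps per end PCB")
    print(endcap_fibers(), math.ceil(endcap_fibers() / FIBERS_PER_BACKEND_BOARD))
```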

#### **1.10.2.7 The data and power considerations for the Muon Detector**

**i)** Muon Barrel

The Muon barrel detector is divided into two halves along the z direction, with each half consisting of 12 modules and each module composed of 6 layers of detectors, totaling 144 modules. Because the data rate and power consumption are relatively low, one data aggregation PCB will be deployed on the outer end face of each module, and an additional aggregation PCB will be deployed for readout on the outer trapezoidal inclined surface of each module.

The barrel section has a total of 23,976 channels, and the background estimate provided by MDI simulations indicates a counting rate of less than 1 kHz per channel. If the same SiPM and ASIC as for ECAL and HCAL are used, the total data rate is only 1.15 Gbps, so the average data rate for each aggregation board is less than 10 Mbps. A single optical fiber per aggregation board provides ample bandwidth, maintains the uniformity of the overall electronics framework, and ensures good timing performance.

Additionally, considering the ASIC power consumption of 15 mW/channel and a total frontend ASIC net power consumption of 359.6 W, the net power consumption of the frontend on each aggregation board is on the order of 1.25 W. Taking into account data interfaces and power efficiency, the total power consumption of each aggregation board is 2.64 W, which is relatively low.

Therefore, the Muon barrel section requires a total of 288 fiber optic cables for data interfaces, corresponding to 18 backend boards and 2 data crates. Additionally, 288 power channels correspond to 3 power crates. The overall functionality can be achieved with 2 high-voltage crates for detector high voltage supply.

#### **ii)** Muon Endcap

The Muon endcap detector is divided into inner and outer endcaps, each containing 48 modules with 6,912 and 12,288 channels, respectively.

Similar to the barrel design, providing a single fiber optic for data readout for each module can meet the requirements, with a total data rate of less than 921.6 Mbps and an average of 9.6 Mbps per aggregation board.

The net power consumption of the frontend ASIC is 288 W, with a net power consumption of 3 W per aggregation board. Taking into account data interfaces and power efficiency, the total power consumption of each aggregation board is 4.71 W.

Therefore, the Muon endcap requires a total of 96 fiber optic cables for data interfaces, corresponding to 6 backend boards and 1 data crate. Additionally, 96 power channels are needed, corresponding to 1 power crate. The detector high voltage supply can be achieved with 1 high-voltage crate for the entire design.
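The Muon barrel and endcap figures quoted above all derive from the channel count. The sketch below is a rough cross-check. It assumes the 1 kHz per-channel upper bound also applies to the endcap (the text states it only for the barrel) and uses the 48-bit data width of the shared ASIC; the function and field names are illustrative.

```python
BITS_PER_HIT = 48
ASIC_W_PER_CHANNEL = 0.015      # 15 mW per channel
RATE_HZ_PER_CHANNEL = 1e3       # upper bound; assumed to hold for the endcap too


def summarize(channels: int, aggregation_boards: int) -> dict:
    """Upper-bound data rate and net ASIC power, in total and per board."""
    total_rate = channels * RATE_HZ_PER_CHANNEL * BITS_PER_HIT
    total_power = channels * ASIC_W_PER_CHANNEL
    return {
        "total_rate_bps": total_rate,
        "rate_per_board_bps": total_rate / aggregation_boards,
        "asic_power_w": total_power,
        "asic_power_per_board_w": total_power / aggregation_boards,
    }


if __name__ == "__main__":
    barrel = summarize(channels=23_976, aggregation_boards=288)
    endcap = summarize(channels=6_912 + 12_288, aggregation_boards=96)
    for name, result in (("barrel", barrel), ("endcap", endcap)):
        print(name, result)
```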

#### **1.10.3 Consideration of the Electronics Room**

Based on the analysis of the cables and crates of each subdetector's electronics, Table X can be obtained. The table shows that the CEPC electronics system requires a total of 110 data crates, 237 power crates, and 91 detector high-voltage crates, corresponding to 37 data racks, 24 power racks, and 23 detector high-voltage racks, respectively.

Furthermore, based on the raw data volume of the detector front ends and the trigger rate estimate derived from the CEPC physics goals, a preliminary estimate of the TDAQ system's requirements can be made: 160 general trigger boards, corresponding to 20 ATCA crates or 10 racks.

Additionally, each low-voltage rack must be fed by a high-voltage power crate that converts 380 V AC to 110 V DC. Each such crate provides an output power of 60-70 kW and a maximum of 10 power channels. With a crate height of 6U plus 2U of cooling space, each 42U rack can accommodate 5 such crates. Therefore, the 94 low-voltage racks summarized above require a total of 10 high-voltage AC crates for power supply, corresponding to 2 high-voltage power racks.

Hence, the minimum total number of racks required for the electronics and TDAQ systems is 96. Because the electronics are located in an underground hall with limited space that cannot be expanded in the future, it is necessary to account for potential data volume fluctuations due to inaccurate background estimates and for possible High LumiZ upgrades after ten years of collider operation. Preliminary considerations suggest doubling the number of racks for redundancy, giving a total capacity of approximately 200 racks for the electronics area.
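The total of 96 can be traced back to the individual counts quoted above. The sketch below assumes that the figure of 94 is the sum of the data, power, detector high-voltage, and TDAQ racks, and that each of them takes one power channel from a high-voltage AC crate. This reading reproduces the quoted numbers but is an interpretation, not a statement from the text.

```python
import math

# Rack counts quoted in this section.
RACKS = {"data": 37, "power": 24, "detector_hv": 23, "tdaq": 10}
CHANNELS_PER_AC_CRATE = 10      # maximum power channels per AC crate
RACK_HEIGHT_U = 42
AC_CRATE_HEIGHT_U = 6
COOLING_HEIGHT_U = 2


def rack_budget() -> dict:
    """Derive AC crates, their racks, and the overall rack total."""
    fed_racks = sum(RACKS.values())
    ac_crates = math.ceil(fed_racks / CHANNELS_PER_AC_CRATE)
    crates_per_rack = RACK_HEIGHT_U // (AC_CRATE_HEIGHT_U + COOLING_HEIGHT_U)
    ac_racks = math.ceil(ac_crates / crates_per_rack)
    return {
        "fed_racks": fed_racks,
        "ac_crates": ac_crates,
        "crates_per_rack": crates_per_rack,
        "ac_racks": ac_racks,
        "total_racks": fed_racks + ac_racks,
    }


if __name__ == "__main__":
    print(rack_budget())
```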

Assume a rack footprint of 0.5 m x 0.5 m and, to allow for cooling, a side-to-side spacing of 1.5 meters (center-to-center distance of 2 meters) and a front-to-back spacing of 2 meters (center-to-center distance of 2.5 meters). Each floor can then accommodate 100 racks arranged in a 10 x 10 grid within an area of 25 meters x 20 meters. The electronics area can therefore be laid out as 2 floors of 500 square meters each (1000 square meters in total).
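The floor dimensions follow directly from the grid size and the center-to-center pitches. The sketch below uses only the numbers quoted above; the function name and argument names are illustrative.

```python
def floor_dimensions_m(
    columns: int,
    rows: int,
    unit_side_m: float,
    side_gap_m: float,
    front_gap_m: float,
) -> tuple[float, float]:
    """Floor width and depth for a regular grid, using center-to-center pitches."""
    side_pitch = unit_side_m + side_gap_m     # 0.5 m + 1.5 m = 2.0 m
    front_pitch = unit_side_m + front_gap_m   # 0.5 m + 2.0 m = 2.5 m
    return columns * side_pitch, rows * front_pitch


if __name__ == "__main__":
    width, depth = floor_dimensions_m(10, 10, 0.5, 1.5, 2.0)
    floors = 2
    print(width, depth)                 # metres
    print(width * depth)                # square metres per floor
    print(floors * width * depth)       # total square metres
    print(floors * 10 * 10)             # total unit capacity
```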




# **1.11 Previous R&D on Electronics System for Large Particle Physics Experiments (Wei Wei)**

The CEPC electronics research team brings together the main groups in the field of high-energy physics electronics in China, covering a wide range of research directions including front-end ASICs, back-end high-speed readout, and system-level electronics, as well as advanced electronics technologies.

Starting from BESIII, the team has successively led and undertaken the development of electronics systems for several major high-energy physics facilities in China, including the BESIII experiment, the Daya Bay Neutrino Experiment, LHAASO, CSNS, and JUNO. The electronics systems of these large-scale particle physics experiments have achieved long-term stable operation, laying a reliable foundation for obtaining physics results. In addition, significant contributions have been made to a series of international collaborations such as ATLAS and nEXO.

Currently, efforts are being made to further expand the scope of international cooperation. By actively participating in DRD7, we aim to attract foreign teams to join the future development of CEPC electronics.

## **1.12 Main schedule for future works beyond the detector Ref-TDR**



**Figure 1.70:** Main schedule for future works

For the Ref-TDR, the main goal of the electronics system is to establish a preliminary, feasible baseline scheme for the CEPC electronics. Table X therefore provides a research plan for the next five years. With the release of the Ref-TDR as the first milestone, the electronics system will provide its final design scheme, based on the detector requirements from background simulations and the final design scheme, along with estimates of data volume, power consumption, etc., as inputs to the trigger system and mechanical system. Subsequently, short-term and long-term goals will be set at 3 years and 5 years, respectively, to complete the prototype verification for each subsystem and ultimately finalize the development of each frontend chip and the universal electronics interface. Detailed research plans for each frontend ASIC can be found in the respective subsystem chapters. This section mainly summarizes the key technology research and development related to the electronics system, particularly ASIC development, for unified management in the subsequent engineering implementation phase.

## **1.13 Summary (Wei Wei)**

This chapter presents the overall framework of the CEPC electronics system, which, based on the counting rate estimates from background simulations, adopts a triggerless frontend readout approach. With a strategy based on backend triggering, it can meet the CEPC requirements for new physics searches. Within this framework, the electronics system is divided into customized frontends, together with universal data interfaces, universal power modules, and universal backend electronics. Early-stage R&D results for each part indicate that the scheme is technically feasible, with no showstopper found for the electronics framework and the sub-detector readout. Additionally, a backup scheme based on conventional triggers and an innovative scheme based on wireless communication have been considered as more conservative and more aggressive approaches, respectively. Furthermore, beyond the Ref-TDR release plan, the electronics system will set short-term and long-term goals at 3 years and 5 years, respectively, to achieve the overall objectives of electronics system development.

## **1.14 Chip Naming**

A unified naming convention has been adopted for the main front-end ASIC chips currently involved in the electronics system of the CEPC. The names are not composed of abbreviations of English words, but directly derived from Chinese mythology, as explained below:

**Taichu**: In Chinese mythology, the origin of all things, meaning "very beginning". Corresponding to the Vertex Detector, it is the first detector that particles pass through in the entire detector system.

**JuLoong**: A divine beast in Chinese mythology that governs time. As the front-end ASIC of the OTK detector, its primary design goal is to achieve the highest time resolution performance of the entire detector, corresponding to the function governed by JuLoong.

**ChoMin**: A mythical bird in ancient Chinese legends. This bird has two eyeballs in each eye, hence the name "Double Vision". As the front-end ASIC for SiPMs, it will be used as a general-purpose ASIC in the ECAL, HCAL, and Muon detectors, reflecting the meaning of ChoMin: seeing signals from multiple detectors. In addition, its main design challenge is to meet the large dynamic range of the ECAL, so the chip will adopt dual-range amplification in the front end, which again matches the meaning of ChoMin as multiple visions.

**ChiTu**: The Data Link chip. The most famous horse in Chinese tales, ridden by the Chinese God of War, Guan Yu. It is in charge of transportation at ultra-fast speed, just as a GBTx-like chip is.

**KinWoo**: The opto-electrical conversion chip series. The bird that lives in the sun in Chinese tales, an avatar of the sun and in charge of light, just as the VTRx chip converts electrical signals to and from optical signals.

**TaoTie**: The Data Aggregation chip. A mythical animal in Chinese tales that can swallow anything, just as the chip collects all the input data streams.

**BaSha**: The frontend power module series. One of the nine sons of the Chinese Loong, famous for its strength and always depicted bearing a monument, just like the powering system, which is the foundation and support of all electronics.