# INTERNATIONAL CONFERENCE ON TECHNOLOGY AND INSTRUMENTATION IN PARTICLE PHYSICS

# BEIJING. 22-26 MAY 2017

# TOM JAMES, IMPERIAL COLLEGE LONDON

# TRACK FINDING FOR THE LEVEL-1 TRIGGER OF THE CMS EXPERIMENT







# INTRODUCTION **INTRODUCTION TO COMPACT MUON SOLENOID (CMS)**

- Large, all-purpose detector, designed to investigate a wide range of physics (Higgs, Supersymmetry, Dark Matter). 20-40 simultaneous collisions (pileup) at 40 MHz
  - 1. Silicon strip tracker (~1.2 m radius, 200 m<sup>2</sup> area), largest silicon tracker in operation
  - 2. Within **3.8 T** superconducting solenoid. **Transverse momentum (p<sub>T</sub>)** measured with curvature in **B**-field
  - 3. Level-1 (L1) trigger, latency O(3-4 µs), rejects uninteresting events, rate reduction O(x400)
  - 4. Data size/rate from tracker **too large** to use in L1-trig

$p_T$ = momentum transverse to the beam, measured from track curvature

$\eta$ = angle in longitudinal direction to beam



# INTRODUCTION HIGH LUMINOSITY LHC

- By 2026 the LHC will be upgraded in luminosity -> 2-3x improved statistics by 2035 vs no upgrade
- Silicon strip tracker will be replaced (radiation damage) for Phase II CMS
  - Challenging high-occupancy conditions, ~15,000 tracks per bx; must perform at least as well as at present
  - Need a completely new handle at the L1 trigger to keep rate < 750 kHz, while maintaining sensitivity to interesting physics
  - New tracker design will allow read-out of some data at 40 MHz



Higher statistics allow **precision** measurements, **push** search limits, and give access to **rare** processes





# INTRODUCTION **CMS TRACKER UPGRADE**

See A. Konig, The CMS Tracker Phase II Upgrade for the HL-LHC era

- High-**p**<sub>T</sub> tracks are signs of interesting physics (decays of high-mass particles)
- Novel tracking modules utilise two silicon sensors spaced 1.6-4.0 mm apart to discriminate p<sub>T</sub> > 2-3 GeV
  - Forward the resulting stubs (correlated hit pairs) to off-detector trigger electronics: rate reduction O(10)
- "PS" Pixel + Strip Modules: 20 < r < 60 cm
- "2S" 2 Strip Modules: r > 60 cm

Imperial College. Prototype modules under beamtesting

**Proposed upgraded tracker geometry**
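To see why two closely spaced sensors can select on p<sub>T</sub>, the sketch below estimates how far a track moves sideways between the two sensors of one module. It assumes a track from the beam line, unit charge, and the 3.8 T field quoted in the introduction. The function names and the example radius and spacing are illustrative choices, not values taken from the module design.

```python
import math

B_FIELD_T = 3.8  # solenoid field quoted in the introduction


def bend_radius_m(pt_gev, b_tesla=B_FIELD_T):
    """Radius of curvature in metres: R = pT / (0.3 * B) for unit charge."""
    return pt_gev / (0.3 * b_tesla)


def stub_displacement_mm(pt_gev, r_m, spacing_mm):
    """Lateral shift between the two sensors of a module at radius r_m.

    A track from the beam line crosses radius r at an angle alpha to the
    radial direction, with sin(alpha) = r / (2R). Over a sensor spacing d
    the hit moves sideways by roughly d * tan(alpha).
    """
    sin_alpha = r_m / (2.0 * bend_radius_m(pt_gev))
    if sin_alpha >= 1.0:
        raise ValueError("track does not reach this radius")
    return spacing_mm * math.tan(math.asin(sin_alpha))


# Illustrative numbers only: r = 0.6 m, spacing = 1.8 mm
for pt in (1.0, 2.0, 3.0, 10.0):
    shift = stub_displacement_mm(pt, 0.6, 1.8)
    print(f"pT = {pt:4.1f} GeV -> shift = {shift:.3f} mm")
```

Low-p<sub>T</sub> tracks shift further, so a window cut on the shift acts as a p<sub>T</sub> threshold on the module itself.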







# LEVEL-1 TRACK FINDING **TRACK TRIGGER PROPOSAL**

- ~4 µs available for track finding (~12.5 µs total L1)
- Very high data rates (~20,000 stubs per bx)
- **Proposal:** A Track Finding Processor (TFP)
  - Data-stream FPGA-based processing board
  - Processes 1/8 of the tracker in **φ** and **1/(time multiplex period)** of the events in time
- **Objective**: Build hardware demonstrator to prove feasibility and capability of such an object
  - Each TFP operates independently
  - One TFP becomes the *demonstrator slice*

**Field-Programmable Gate Array (FPGA)** - programmable digital integrated circuit

pileup 140 simulation



**Choice** of physical/time segmentation dictated by data rates out of the detector and into FPGA board
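The segmentation in φ and time can be sketched as a simple routing rule. This is a minimal illustration assuming round-robin assignment of bunch crossings to time slots. The function names and the example time-multiplex period are hypothetical; the talk fixes only the 1/8 split in φ.

```python
N_PHI_REGIONS = 8  # each TFP handles 1/8 of the tracker in phi


def tfp_for(bx, phi_region, tm_period):
    """Return a (phi_region, time_slot) label for the TFP handling this data.

    Events are dealt out round-robin in time, so consecutive bunch
    crossings go to different time slots and each TFP receives one event
    every tm_period bunch crossings.
    """
    if not 0 <= phi_region < N_PHI_REGIONS:
        raise ValueError("phi_region must be in 0..7")
    return (phi_region, bx % tm_period)


def total_tfps(tm_period):
    """Number of independent TFPs needed to cover the full tracker."""
    return N_PHI_REGIONS * tm_period


# Hypothetical time-multiplex period of 18, for illustration only
print(tfp_for(bx=19, phi_region=3, tm_period=18))
print(total_tfps(18))
```

Because no TFP shares data with another, one board chain processing a single (φ region, time slot) pair is representative of the whole system.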







## **DEMONSTRATOR HARDWARE** THE IMPERIAL MASTER PROCESSOR 7 (MP7)

- A generic, high-performance, high-bandwidth FPGA card for data-stream processing (double-width AMC)
- Currently used widely in the CMS trigger
- Xilinx Virtex-7 690 FPGA
- 72 optical transmitters/receivers running at 10.3 Gbps with 8b/10b encoding
  - Usable optical bandwidth **0.55 Tbps** each way
- Infrastructure firmware, which provides transceiver buffering, I/O formatting, and external communication and configuration, is kept separate from the algorithm space





Backplane has low latency LVDS IO as well as standard SerDes (e.g. GbE, PCIe, etc.)











# **DEMONSTRATOR HARDWARE DEMONSTRATOR SYSTEM**

### **11 MP7s in MicroTCA crate**



- CERN rack provides turbine, 3-phase power, air deflector, water cooling/heat exchangers
- MicroTCA carrier hub (MCH) for GbE communication via backplane
- AMC13 for synchronisation, timing & control

### LOCATION: TRACKER INTEGRATION FACILITY, CERN








# LEVEL-1 TRACK FINDING THE TRACK FINDING PROCESSOR (TFP)

> TFP firmware is divided into logical algorithm elements; for the demonstration, each element is implemented on a separate MP7, which allows extrapolation to future FPGA resources



**FPGA firmware** - generated by a specialised computer language used to describe structure and behaviour of electronic circuits












# **DEMONSTRATOR ALGORITHMS 2D HOUGH TRANSFORM (HT)**

- Widely used feature extraction technique to find imperfect instances of objects within a space, e.g. tracks in our tracker hit map
- Search for **primary tracks in the r-φ plane**, using the **parameterisation** (q/p<sub>T</sub>, φ<sub>0</sub>)
  - Stub positions correspond to straight lines in **Hough Space**
  - Where 4 or more lines intersect -> track candidate

 $q/p_T$  is the free parameter, but  $p_T$  estimate from stacked modules used to constrain allowed  $q/p_T$  space
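The idea can be sketched in software as follows. This is a minimal illustration, not the demonstrator firmware. It assumes a small-angle track model φ = φ<sub>0</sub> - 0.15·B·(q/p<sub>T</sub>)·r with r in metres, a particular sign convention, and hypothetical ranges for q/p<sub>T</sub> and φ<sub>0</sub>. The 32 x 64 array size and the "4 or more" threshold follow the slides, and the p<sub>T</sub> constraint from the stacked modules is left out for brevity.

```python
import math

B_FIELD_T = 3.8               # solenoid field quoted in the talk
N_QPT, N_PHI = 32, 64         # array size quoted for one HT sub-sector
QPT_MAX = 1.0 / 3.0           # assumed |q/pT| limit in 1/GeV (pT > 3 GeV)
PHI_MIN, PHI_MAX = 0.0, 0.4   # assumed phi0 range of the sub-sector in rad


def qpt_of_bin(iq):
    """Centre of q/pT column iq."""
    return -QPT_MAX + (iq + 0.5) * 2.0 * QPT_MAX / N_QPT


def phi0_of_bin(iphi):
    """Centre of phi0 row iphi."""
    return PHI_MIN + (iphi + 0.5) * (PHI_MAX - PHI_MIN) / N_PHI


def stub_phi(phi0, qpt, r_m):
    """Assumed small-angle track model: phi = phi0 - 0.15 * B * (q/pT) * r."""
    return phi0 - 0.15 * B_FIELD_T * qpt * r_m


def hough_transform(stubs, min_layers=4):
    """stubs: iterable of (r_m, phi_rad, layer). Returns candidate cells."""
    cells = [[set() for _ in range(N_PHI)] for _ in range(N_QPT)]
    for r_m, phi, layer in stubs:
        # Each stub is a straight line in (q/pT, phi0): one cell per column.
        for iq in range(N_QPT):
            phi0 = phi + 0.15 * B_FIELD_T * qpt_of_bin(iq) * r_m
            iphi = math.floor((phi0 - PHI_MIN) / (PHI_MAX - PHI_MIN) * N_PHI)
            if 0 <= iphi < N_PHI:
                cells[iq][iphi].add(layer)
    # A cell crossed by lines from enough distinct layers is a candidate.
    return [(iq, iphi)
            for iq in range(N_QPT)
            for iphi in range(N_PHI)
            if len(cells[iq][iphi]) >= min_layers]
```

Counting distinct layers rather than stubs keeps several hits in one layer from faking a candidate. Neighbouring cells can also pass the threshold for the same particle, which is one source of duplicate candidates.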





~270 track candidates







# **DEMONSTRATOR FIRMWARE** HOUGH TRANSFORM (HT)

Each sub-sector implemented as a fully independent, pipelined 32 x 64 array



1) Book keeper receives stubs and propagates them to each q/p<sub>T</sub> column in turn.

2) In each Column (Col.), the corresponding $\phi_0$ of the stub for the column is calculated and the appropriate cell(s) are marked.

3) Candidates marked with stubs from > 4 layers propagate back to the Book Keeper for read-out.

p<sub>T</sub> estimate from stacked modules used to constrain allowed q/p<sub>T</sub> space

**One HT Column, 32 per sub-sector**

**One HT sub-sector, 18 per MP7**







# **DEMONSTRATOR ALGORITHMS 3D KALMAN FILTER (KF)**

- Commonly used iterative algorithm; series of measurements containing inaccuracies and noise -> estimates of **unknown variables**
  - 1. Initial estimate of track parameters (HT seed) & their uncertainties
  - 2. Stub used to update state (weighted average)
  - 3. $\chi^2$ calculated, used to reject false candidates, incorrect stubs on genuine candidates
  - 4. Repeat until all stubs are added

Coarse track parameters

state = estimated track parameters

measurements = stubs on track candidate
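The update loop in steps 1-4 can be illustrated with a deliberately simplified scalar Kalman filter. This is a sketch only: the state is a single number rather than a set of track parameters, and the `chi2_cut` value and function names are hypothetical.

```python
def kalman_update(x, var_x, meas, var_meas):
    """One scalar Kalman update: a variance-weighted average of x and meas."""
    residual = meas - x
    var_res = var_x + var_meas            # expected variance of the residual
    gain = var_x / var_res                # weight given to the measurement
    x_new = x + gain * residual
    var_new = (1.0 - gain) * var_x
    chi2_inc = residual * residual / var_res
    return x_new, var_new, chi2_inc


def fit(seed, var_seed, measurements, chi2_cut=10.0):
    """Add measurements one by one, skipping those that look inconsistent."""
    x, var, chi2 = seed, var_seed, 0.0
    used = []
    for meas, var_meas in measurements:
        x_try, var_try, d_chi2 = kalman_update(x, var, meas, var_meas)
        if d_chi2 > chi2_cut:
            continue                      # treat as an incorrect stub
        x, var, chi2 = x_try, var_try, chi2 + d_chi2
        used.append(meas)
    return x, var, chi2, used


# Seed 0.0 +/- 1.0; the middle measurement is an obvious outlier
print(fit(0.0, 1.0, [(0.2, 1.0), (50.0, 1.0), (-0.1, 1.0)]))
```

Each accepted measurement shrinks the state variance, and the accumulated χ² gives a quality figure that can be used to reject a whole candidate.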







# **DEMONSTRATOR FIRMWARE**

later retrieval





# DEMONSTRATOR OVERVIEW **DEMONSTRATOR DATA TAKING**

**Objective** - Run **Monte-Carlo physics samples** emulating **conditions at HL-LHC** through the hardware demonstrator







### **DEMONSTRATOR RESULTS TRACK FINDING PERFORMANCE**

- ~1% p<sub>T</sub> resolution, 2 mm z<sub>0</sub> resolution in barrel
  - With 1-2 extra bits to encode stub position, z<sub>0</sub> resolution improves to 1 mm in barrel
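The effect of extra encoding bits can be illustrated with the usual quantisation arithmetic. This is a sketch assuming uniform binning over a fixed coordinate range. The range used in the example is hypothetical, and the measured resolution also depends on effects other than quantisation.

```python
import math


def lsb(full_range, n_bits):
    """Width of one quantisation step for a uniformly binned coordinate."""
    return full_range / (2 ** n_bits)


def quantisation_rms(full_range, n_bits):
    """RMS error of uniform quantisation: step / sqrt(12)."""
    return lsb(full_range, n_bits) / math.sqrt(12.0)


# Hypothetical coordinate range of 1.0 (arbitrary units)
for bits in (12, 13, 14):
    print(bits, lsb(1.0, bits), quantisation_rms(1.0, bits))
```

Each extra bit halves the step size, so two extra bits reduce the quantisation contribution by a factor of four.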



p<sub>T</sub> resolution -> precision of mass estimate of decaying particle

z<sub>0</sub> resolution -> ability to match tracks to initial decay vertices





# **DEMONSTRATOR RESULTS** LATENCY MEASUREMENTS

- Fixed latency, independent of the event
- Plenty of margin in design
- Room to increase clock speed, link speed, and to improve utilisation -> lower latency possible





Figure labels: first out, last out, Level-1 target





### CONCLUSIONS **SUMMARY**

- Highly flexible track-finder/pattern recognition algorithm
- Highly scalable, time/physical segmentation could be as large/small as required based on data rates



- Proven, with currently available hardware, that a Level-1 track trigger based on FPGA processing boards is a **feasible** and **safe** solution
- Plenty of time to improve and optimise algorithms for global trigger requirements

Thanks for listening

I look forward to answering your questions




















### CONCLUSIONS ACKNOWLEDGEMENTS

We gratefully thank the following sources for their support of this work

- Science & Technology Facilities Council (STFC)
- EU FP7-PEOPLE-2012-ITN project nr. 317446, INFIERI, "Intelligent Fast Interconnected and Efficient Devices for Frontier Exploration in Research and Industry"
- Maxeler Technologies
- The Worshipful Company of Scientific Instrument Makers (UK)





### BACKUP **GEOMETRIC PROCESSOR (GP)**



- Assigns stubs to 2 φ x 18 η sub-sectors, duplicating across boundaries when necessary
  - Simplifies the track-finding task and allows for increased parallelisation downstream
- Routes all stubs in a given sector to **dedicated** output link pairs
  - 72 inputs -> 36 outputs
- Highly configurable arbitrator blocks
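Duplication across sub-sector boundaries can be sketched in one dimension. This is an illustration only: it assumes equal-width sub-sectors and a fixed overlap margin, whereas the actual boundaries and duplication criteria are not given in the slides. The function name and example numbers are hypothetical.

```python
def assign_subsectors(value, lo, hi, n_bins, margin):
    """Return indices of all equal-width bins within `margin` of `value`.

    A stub close to a boundary is returned for both neighbouring bins,
    so each bin can later be processed without looking at its neighbours.
    """
    width = (hi - lo) / n_bins
    first = max(0, int((value - margin - lo) // width))
    last = min(n_bins - 1, int((value + margin - lo) // width))
    return list(range(first, last + 1))


# Two sub-sectors over a hypothetical range 0..2 with margin 0.1
print(assign_subsectors(0.50, 0.0, 2.0, 2, 0.1))
print(assign_subsectors(0.95, 0.0, 2.0, 2, 0.1))
```

The price of duplication is extra data volume; in return, every sub-sector becomes an independent unit of work for the stages downstream.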




### BACKUP **RESOURCE USAGE & FUTURE FPGA CHIPS**

- Overall **resource usage** of the system shows that, with some optimisations, a future TFP could feasibly consist of **two** KU115 or **one** VU11P FPGA
- Future work will be to build a system using the KU115 FPGA and 16.3 Gbps optical links with a target latency of 2 µs

|                                      | LUTs [10<sup>3</sup>] | DSPs | FFs [10<sup>3</sup>] | BRAM (36 Kb) |
|--------------------------------------|-----------------------|------|----------------------|--------------|
| GP                                   | 121                   | 1056 | 205                  | 222          |
| HT                                   | 244                   | 2304 | 299                  | 1188         |
| KF and DR                            | 398                   | 5112 | 316                  | 1776         |
| Infrastructure per MP7               | 90                    | 0    | 91                   | 291          |
| **TFP Total (excl. infrastructure)** | 763                   | 8472 | 820                  | 3186         |
| Virtex 7 690                         | 433                   | 3600 | 866                  | 1470         |
| Kintex Ultrascale 115                | 633                   | 5520 | 1266                 | 2160         |
| Virtex Ultrascale+ 11P               | 1296                  | 9216 | 2592                 | 2016         |
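The feasibility statement can be checked against the table with a few lines of arithmetic. This sketch uses only the numbers listed in the table and divides the TFP total (excluding infrastructure) by the capacity of each device option. The dictionary keys and function name are illustrative.

```python
# TFP total excluding infrastructure, as listed in the table
TFP_TOTAL = {"LUT_k": 763, "DSP": 8472, "FF_k": 820, "BRAM": 3186}

# Device capacities, as listed in the table
DEVICES = {
    "Virtex 7 690": {"LUT_k": 433, "DSP": 3600, "FF_k": 866, "BRAM": 1470},
    "Kintex Ultrascale 115": {"LUT_k": 633, "DSP": 5520, "FF_k": 1266, "BRAM": 2160},
    "Virtex Ultrascale+ 11P": {"LUT_k": 1296, "DSP": 9216, "FF_k": 2592, "BRAM": 2016},
}


def utilisation(device, n_chips=1):
    """Fraction of each resource the TFP total would occupy on n_chips devices."""
    capacity = DEVICES[device]
    return {key: TFP_TOTAL[key] / (n_chips * capacity[key]) for key in TFP_TOTAL}


for name, chips in (("Kintex Ultrascale 115", 2), ("Virtex Ultrascale+ 11P", 1)):
    fractions = utilisation(name, chips)
    summary = ", ".join(f"{key} {value:.0%}" for key, value in fractions.items())
    print(f"{chips} x {name}: {summary}")
```

With these numbers every resource fits within two KU115 devices. On a single VU11P the 36 Kb BRAM count is exceeded by the unoptimised total, which is consistent with the "with some optimisations" caveat.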



# BACKUP SIMULATION RESULTS

| Stage      | Efficiency [%] | # of tracks | # of fakes            | # of duplicates            |
|------------|----------------|-------------|-----------------------|----------------------------|
| HT         | 97.1           | 331         | 139                   | 126                        |
| KF         | 95.1           | 190         | 27                    | 103                        |
| DR         | 94.4           | 79          | 16                    | 3                          |
| Full chain | 94.4           | 79          | 16                    | 3                          |



Figure 22: Total number of tracks per event reconstructed in the tracker when processing $t\bar{t}$ events superimposed on 0, 140, and 200 PU events. These results are obtained from emulation, and are shown both with and without the effects of truncation caused by excess data flow through the system.



### BACKUP **SIMULATION RESULTS**



Figure 21: Relative $p_T$ resolution, $\phi$ resolution [rad], $z_0$ resolution [cm] and $\cot \theta$ resolution measured for single isolated muons with $5 < p_T^{\mu} < 15 \text{ GeV}$, obtained from emulation using different levels of precision in simulation: default encoding (10-bit $r$, 12-bit $z$, 15-bit $\phi$ stub coordinates); improved encoding (12-bit $r$, 14-bit $z$, 15-bit $\phi$ stub coordinates); and full floating-point simulation.



### BACKUP TRACK TRIGGER PROPOSAL

- Total available L1 time will be 12.5 µs, but only ~4 µs is available for track finding
- Must construct a track finder that is capable of processing very high data rates (~20,000 stubs per event) down to the O(10) genuine/interesting tracks expected on average, within this latency target
- **Proposal:** Track Finding Processors (TFPs) (FPGA data-stream processing boards) receive data links from adjacent detector octants in φ
- Fully **time-multiplexed** system
  - Processing of subsequent events done on parallel independent nodes
  - No further duplication or sharing between regions is required downstream
  - Highly scalable system

Each TFP processes 1/8 in φ and 1/tmp (time multiplex period) in time

**One TFP becomes the demonstrator slice unit**

x N, where N = TM period

