### A Probability-Optimized Fast Timing Trigger for the Belle II Time of Propagation Detector

L. Macchiarulo, X. Gao (Department of Electrical Engineering, University of Hawaii); K. Nishimura, G. Varner (Department of Physics, University of Hawaii)

Belle II Trigger/DAQ 25-JAN-2011 Beijing

# Agenda

- Introduction
- Basic Idea of the Trigger Algorithm
- Firmware Implementation and Test
- Conclusion

# Introduction

- Because of its intrinsically fast timing, the time-of-propagation (TOP) counter has been chosen as the primary particle identification (PID) device in the barrel region of Belle II.
- Sub-100 *ps* time resolution is obtained only in offline reconstruction.
- A trigger with a time resolution of a couple of *ns* would help reduce the data volume from out-of-time hits in the Silicon Vertex Detector.
- Requirements of the trigger:
  - *Processing time (a couple of μs).*
  - *Timing resolution (a couple of ns).*

# Hardware Overview

- Combine hits:
  - 16 detector sectors
  - 16 logical staves

[Figure: hardware overview]




# Basic Idea of the Algorithm





- Estimating the interaction point and time from the received photon pattern is not straightforward:
  - A small separation between the horizontal borders of the counters folds the Cherenkov cone into degenerate patterns.
  - The estimation must be done in a couple of μs.
- Our solution: take advantage of the information contained in the probability density functions (PDFs) of the time and space data.

Processing flow:

- Receive time information from the photon detector.
- Compare it with the candidate PDFs and find the PDF that best matches the received time pattern.
- Report the event time and position.
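The matching step can be illustrated with a minimal Python sketch. It assumes that each candidate PDF is stored as a table of per-time-bin log-probabilities and that a candidate's score is the sum of the table entries at the received hit times. The names `make_pdf` and `best_candidate`, the Gaussian toy PDFs, and the scoring rule are illustrative assumptions and are not taken from the firmware.

```python
import math

def make_pdf(peak_bin, n_bins=64, width=2.0, floor=1e-3):
    """Toy candidate PDF (assumption): Gaussian bump over a flat background, normalized."""
    raw = [floor + math.exp(-0.5 * ((b - peak_bin) / width) ** 2) for b in range(n_bins)]
    total = sum(raw)
    return [v / total for v in raw]

def best_candidate(hit_bins, log_pdfs):
    """Return (index, score) of the candidate with the largest summed log-probability."""
    best_idx, best_score = -1, -math.inf
    for idx, table in enumerate(log_pdfs):
        score = sum(table[b] for b in hit_bins)  # one table lookup per received hit
        if score > best_score:
            best_idx, best_score = idx, score
    return best_idx, best_score

# Hypothetical candidates: one PDF per candidate event time (peaks at bins 10, 20, 30, 40).
peaks = [10, 20, 30, 40]
log_pdfs = [[math.log(p) for p in make_pdf(pk)] for pk in peaks]

# Received hit times in 1 ns bins: a cluster near bin 30 plus two background hits.
hits = [29, 30, 30, 31, 32, 5, 55]
idx, score = best_candidate(hits, log_pdfs)
print(peaks[idx])  # 30
```

Storing log-probabilities turns the product of per-hit probabilities into a sum, which is one plausible reason a lookup-table implementation suits an FPGA; whether the firmware uses exactly this scoring is not stated on the slides.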

# Algorithm evaluation



Evaluation conditions of the four plots:

- Time and space information used; time quantization: 1 ns; background noise: 10 MHz.
- Only time information used; time quantization: 1 ns; background noise: 10 MHz.
- Time and space information used; time quantization: 2 ns; background noise: 10 MHz.
- Only time information used; time quantization: 1 ns; background noise: 40 MHz.

# Firmware Implementation and Test

- Firmware Implementation
- Firmware Test

# Firmware Implementation

Implemented in a Virtex 4 FPGA (XC4VFX40-10FFG672I).



# Firmware Implementation (cont.)

- The pipelined sorter is based on the merge-sort algorithm.
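As a software analogy for a merge-sort-based pipeline, the sketch below sorts a list by passing it through successive 2-way merge stages, where stage k merges neighbouring sorted runs of length 2^k. The function names and the stage organization are assumptions made for illustration; the actual firmware architecture is only shown in the slide's figure.

```python
def merge(left, right):
    """Merge two ascending lists into one ascending list (one 2-way merger)."""
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i])
            i += 1
        else:
            out.append(right[j])
            j += 1
    out.extend(left[i:])   # at most one of these two tails is non-empty
    out.extend(right[j:])
    return out

def merge_sort_stages(words):
    """Sort `words` with successive merge stages; return (sorted list, stage count)."""
    runs = [[w] for w in words]  # every word starts as a sorted run of length 1
    stages = 0
    while len(runs) > 1:
        runs = [merge(runs[i], runs[i + 1]) if i + 1 < len(runs) else runs[i]
                for i in range(0, len(runs), 2)]
        stages += 1
    return (runs[0] if runs else []), stages

print(merge_sort_stages([7, 3, 9, 1, 4, 8, 2, 6]))  # ([1, 2, 3, 4, 6, 7, 8, 9], 3)
```

Eight words need three merge stages. In hardware, each stage could be a separate pipeline step so that new data enters while older data is still being merged downstream.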





#### Resource usage (32-bit width):

- No. of Slices: 360 (1%).
- No. of Slice FFs: 310 (0%).
- No. of 4 input LUTs: 663 (1%).

#### Timing info

# Firmware Implementation (cont.)

- Basic implementation of the trigger

[Figure: trigger block diagram. Legible labels: timing info and MAX position; LUTs: 200 x 64 x 20 bits (200 LUTs).]

- To save resources, only 100 correlators are used to perform the 200 correlation operations.
- The trigger block is clocked at twice the frequency of the sorter to avoid a throughput bottleneck.
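The resource sharing described above can be modelled in a few lines of Python. The sketch assumes that correlator k serves PDFs 2k and 2k+1 on alternating trigger-clock cycles; this mapping (`pdf_for`) and the function `simulate` are hypothetical, chosen only to show why a 2x clock lets 100 correlators keep pace with one time word per sorter clock.

```python
N_PDFS = 200          # candidate PDFs (one LUT each, per the slide)
N_CORRELATORS = 100   # physical correlators
CLOCK_RATIO = 2       # trigger clock frequency / sorter clock frequency

def pdf_for(correlator, phase):
    """Hypothetical mapping: correlator k serves PDFs 2k and 2k+1 on alternating cycles."""
    return correlator * CLOCK_RATIO + phase

def simulate(n_words):
    """Return (PDFs updated for each word, sorter-clock periods consumed)."""
    updated = {word: set() for word in range(n_words)}
    fast_cycles = 0
    for word in range(n_words):             # one sorted time word per sorter clock
        for phase in range(CLOCK_RATIO):    # two trigger-clock cycles per word
            for k in range(N_CORRELATORS):  # all correlators operate in parallel
                updated[word].add(pdf_for(k, phase))
            fast_cycles += 1
    return updated, fast_cycles / CLOCK_RATIO

updated, sorter_periods = simulate(n_words=5)
assert all(len(pdfs) == N_PDFS for pdfs in updated.values())
print(sorter_periods)  # 5.0
```

Every word is correlated against all 200 PDFs, and five words consume exactly five sorter-clock periods, so the shared correlators do not slow the data stream.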

#### Resource usage:

- No. of Slices: 9453 (50%).
- No. of Slice FFs: 14364 (38%).
- No. of 4 input LUTs: 9766 (26%).
- No. of RAMB16s: 100 (69%).


# Firmware Implementation (cont.)

#### Resource usage of other parts:



- Resource usage of the Aurora cores and the Aurora RX stream interfaces at the receiving interface:
  - No. of Slices: 3200 (17%).
  - No. of Slice FFs: 4344 (11%).
  - No. of 4 input LUTs: 5152 (13%).

- Resource usage of the Aurora core and the Aurora TX stream interface at the sending interface:
  - No. of Slices: 416 (2%).
  - No. of Slice FFs: 534 (1%).
  - No. of 4 input LUTs: 689 (1%).

# Firmware Implementation (summary)

- Implemented in a Virtex 4 FPGA (XC4VFX40-10FFG672I).
- Overall resource usage:
  - No. of Slices: 14255 (76%);
  - No. of Slice FFs: 21112 (56%);
  - No. of 4 input LUTs: 16923 (45%);
  - No. of RAMB16s: 110 (76%).



# Firmware Test

- RTL (register transfer level) simulation test
- In-chip test (via Xilinx Chipscope)

# RTL Simulation Test



- 5000 timing patterns are generated by Monte Carlo simulation (GEANT4).
- The whole design (including all the Aurora cores, stream interfaces, sorter, trigger, and FIFOs) is tested simultaneously.
- Test results:
  - Test passed.
  - 100% code coverage is reported by ModelSim.
  - Detailed functional coverage is under construction (it needs a well-defined description of the protocol).


# In-Chip Test (via Xilinx Chipscope)

- Since we currently have only one board, the Aurora interfaces and the other parts are tested separately.
- Aurora interface test
- Sorter-trigger test

# Aurora Interface Test

- Test covers: Aurora core, Aurora RX stream interface, and Aurora TX stream interface.



- Throughput: 2.4 Gbps
- Latency: 0.6 μs


# Sorter-trigger Test

- Test covers: sorter logic, trigger logic, and FIFOs.

[Figure: test setup showing the units under test and the signals observed by Chipscope.]

- Throughput: 75M time words per second
- Latency: 0.8 μs
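As a rough consistency check, the word rate above can be converted to a bit rate and compared with the measured Aurora throughput. The sketch assumes that one time word is 32 bits wide, matching the data width quoted for the sorter; the slides do not state the word width explicitly, so treat this as an assumption.

```python
WORD_RATE_HZ = 75e6       # sorter-trigger throughput in time words per second
WORD_WIDTH_BITS = 32      # assumption: one time word equals the 32-bit sorter width
AURORA_RATE_BPS = 2.4e9   # measured Aurora interface throughput

word_stream_bps = WORD_RATE_HZ * WORD_WIDTH_BITS
print(word_stream_bps / 1e9)  # 2.4 (Gbit/s)
```

Under this assumption, the sorter-trigger chain consumes data at the same rate the Aurora link delivers it.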

# Firmware Test (summary)

- Overall Performance
  - Throughput: 75M time words per second;
  - Latency:



# Status

- We have prototyped a probability-optimized fast timing trigger for the Belle II TOP detector.
- The algorithm has been implemented and tested in a Xilinx Virtex 4 FPGA.
- Future work will optimize the trigger performance under various background noise and experimental conditions, including overlapping hits from multiple tracks and the use of event correlations between counters.
- The trigger is ready to be tested on the first iTOP counter once it becomes available.