


Nuclear Inst. and Methods in Physics Research, A

journal homepage: www.elsevier.com/locate/nima

# Embedded readout electronics R&D for the large PMTs in the JUNO experiment

M. Bellato<sup>a</sup>, A. Bergnoli<sup>a</sup>, A. Brugnera<sup>a,b</sup>, S. Chen<sup>c</sup>, Z. Chen<sup>d</sup>, B. Clerbaux<sup>e</sup>, F. dal Corso<sup>a</sup>, D. Corti<sup>a</sup>, J. Dong<sup>c</sup>, G. Galet<sup>b</sup>, A. Garfagnini<sup>a,b,\*</sup>, A. Giaz<sup>a,b</sup>, G. Gong<sup>c</sup>, C. Grewing<sup>f</sup>, J. Hu<sup>d</sup>, R. Isocrate<sup>a</sup>, X. Jiang<sup>d</sup>, F. Li<sup>d</sup>, F. Li<sup>c</sup>, I. Lippi<sup>a</sup>, F. Marini<sup>a,b</sup>, Z. Ning<sup>d</sup>, A. Olshevskiy<sup>g</sup>, D. Pedretti<sup>a</sup>, P.A. Petitjean<sup>e</sup>, M. Robens<sup>f</sup>, V. Shutov<sup>g</sup>, A. Stahl<sup>h</sup>, J. Steinmann<sup>h</sup>, Y. Sun<sup>d</sup>, S. van Waasen<sup>f</sup>, Y. Wang<sup>d</sup>, Z. Wang<sup>d</sup>, W. Wei<sup>d</sup>, X. Yan<sup>d</sup>, Y. Yang<sup>e</sup>, A. Aiello<sup>i</sup>, A. Andronico<sup>i</sup>, V. Antonelli<sup>j</sup>, W. Bandini<sup>k</sup>, A. Brigatti<sup>j</sup>, A. Barresi<sup>m,n</sup>, A. Budano<sup>o</sup>, R. Bruno<sup>i</sup>, A. Cabrera<sup>p</sup>, A. Cammi<sup>m,q</sup>, R. Caruso<sup>i,l</sup>, D. Chiesa<sup>m,n</sup>, C. Clementi<sup>r</sup>, S. Costa<sup>i,l</sup>, X. Ding<sup>j,s</sup>, S. Dusini<sup>a</sup>, A. Fabbri<sup>o</sup>, M. Fargetta<sup>i,l</sup>, G. Fiorentini<sup>k</sup>, R. Ford<sup>j,t</sup>, A. Formozov<sup>j</sup>, M. Giammarchi<sup>j</sup>, M. Grassi<sup>a,b</sup>, C. Landini<sup>j</sup>, P. Lombardi<sup>j</sup>, C. Lombardo<sup>i,l</sup>, Y. Malyshkin<sup>g</sup>, F. Mantovani<sup>k</sup>, S.M. Mari<sup>o</sup>, C. Martellini<sup>o</sup>, A. Martini<sup>u</sup>, E. Meroni<sup>j</sup>, M. Mezzetto<sup>a</sup>, L. Miramonti<sup>j</sup>, P. Montini<sup>o</sup>, M. Montuschi<sup>k</sup>, M. Nastasi<sup>m,n</sup>, F. Ortica<sup>r</sup>, A. Paoloni<sup>u</sup>, S. Parmeggiano<sup>j</sup>, N. Pelliccia<sup>r</sup>, E. Previtali<sup>m,n</sup>, G. Ranucci<sup>j</sup>, D. Riondino<sup>o</sup>, A.C. Re<sup>j</sup>, B. Ricci<sup>k</sup>, A. Romani<sup>r</sup>, P. Saggese<sup>j</sup>, G. Salamanna<sup>o</sup>, F.H. Sawy<sup>a,b</sup>, A. Serafini<sup>k</sup>, G. Settanta<sup>o</sup>, C. Sirignano<sup>a,b</sup>, M. Sisti<sup>m,n</sup>, L. Stanco<sup>a</sup>, V. Strati<sup>k</sup>, C. Tuvé<sup>i,l</sup>, G. Verde<sup>i,l</sup>, L. Votano<sup>u</sup>, J. Zhang<sup>d</sup>

- <sup>a</sup> INFN Sezione di Padova, Padova, Italy
- <sup>b</sup> Università di Padova, Dipartimento di Fisica e Astronomia, Padova, Italy
- <sup>c</sup> Tsinghua University, Beijing, China
- <sup>d</sup> Institute of High Energy Physics, Beijing, China
- <sup>e</sup> Université Libre de Bruxelles, Brussels, Belgium
- <sup>f</sup> Forschungszentrum Jülich GmbH, Central Institute of Engineering, Electronics and Analytics - Electronic Systems (ZEA-2), Jülich, Germany
- <sup>g</sup> Joint Institute for Nuclear Research, Dubna, Russia
- <sup>h</sup> III. Physikalisches Institut B, RWTH Aachen University, Aachen, Germany
- <sup>i</sup> INFN Sezione di Catania, Catania, Italy
- <sup>j</sup> INFN Sezione di Milano e Università di Milano, Dipartimento di Fisica, Milano, Italy
- <sup>k</sup> INFN Sezione di Ferrara e Università di Ferrara, Dipartimento di Fisica e Scienze della Terra, Italy
- <sup>l</sup> Università di Catania, Dipartimento di Fisica e Astronomia, Catania, Italy
- <sup>m</sup> INFN Sezione di Milano Bicocca, Milano, Italy
- <sup>n</sup> Università di Milano Bicocca, Dipartimento di Fisica, Milano, Italy
- <sup>o</sup> INFN Sezione di Roma Tre e Università di Roma Tre, Dipartimento di Matematica e Fisica, Roma, Italy

- <sup>p</sup> IJC Laboratory, CNRS/IN2P3, Université Paris-Saclay, 91405 Orsay, France
- <sup>q</sup> Politecnico di Milano, Dipartimento di Energetica, Milano, Italy
- <sup>r</sup> INFN Sezione di Perugia e Università di Perugia, Dipartimento di Chimica, Biologia e Biotecnologie, Perugia, Italy
- <sup>s</sup> Gran Sasso Science Institute, L'Aquila, Italy
- <sup>t</sup> SNOLAB, Lively, Ontario, Canada
- <sup>u</sup> INFN Laboratori Nazionali di Frascati, Italy

# ARTICLE INFO

Keywords: Electronics; Photomultiplier; Large scale neutrino experiment

# ABSTRACT

The Jiangmen Underground Neutrino Observatory (JUNO) is a next generation liquid scintillator neutrino experiment under construction in South China. Thanks to the anti-neutrinos produced by the nearby nuclear power plants, JUNO will be able to study the neutrino mass hierarchy, one of the key open questions in neutrino physics. One key ingredient for a successful measurement is the use of high speed, high resolution sampling electronics located very close to the detector signal. Linearity in the response of the electronics is another important ingredient for the success of the experiment. During the initial design phase of the electronics, a custom design with the Front-End and Read-Out electronics located very close to the detector analog signal has been developed and successfully tested. The present paper describes the electronics structure and the first tests performed on the prototypes. The prototypes show a good linearity response, with a maximum deviation of 1.3% over the full dynamic range (1-1000 p.e.), fulfilling the JUNO experiment requirements.

\* Correspondence to: Università di Padova, Dipartimento di Fisica e Astronomia, via F. Marzolo 8, I-35131 Padova, Italy. E-mail address: alberto.garfagnini@pd.infn.it (A. Garfagnini).

https://doi.org/10.1016/j.nima.2020.164600

Received 20 June 2020; Received in revised form 30 August 2020; Accepted 1 September 2020; Available online 19 September 2020. 0168-9002/© 2020 Elsevier B.V. All rights reserved.

## 1. Introduction

The Jiangmen Underground Neutrino Observatory (JUNO) [1] is a next generation neutrino experiment under construction in South China. Thanks to the nearby Yangjiang and Taishan nuclear power plants, JUNO will attack the open question of the neutrino mass hierarchy by measuring the inverse beta decay interactions of reactor anti-neutrinos in the detector. The JUNO detector structure [2] is quite simple but impressive: a large acrylic sphere (34.5 m diameter), kept in position by a stainless steel truss, contains almost 20 kton of ultra pure liquid scintillator, made of Linear Alkyl Benzene as solvent, the scintillating fluor PPO (2,5-diphenyloxazole) and a diluted wavelength shifter (bis-MSB). The stainless steel support structure holds the inner vessel, almost 20 000 large (20-inch) PMTs and about 25 000 small (3-inch) PMTs [3]. The central detector will be placed inside an instrumented water pool that will act both as a Cherenkov muon veto and as a shield against environmental radiation (gammas and neutrons) coming from the rock. Finally, a top tracker made with the plastic scintillator detectors of the former OPERA [4] experiment at Gran Sasso [5] will be placed on top of the water pool.

A key ingredient for the measurement of the neutrino mass hierarchy is an excellent, and challenging, energy resolution of the central detector: 3% at 1 MeV or better is required. Moreover, independently of the energy resolution and thanks to the large statistics, JUNO will precisely measure the neutrino mixing parameters $\theta_{12}$, $\Delta m_{21}^2$ and $\Delta m_{ee}^2$ with an ultimate sensitivity below the 1% level [1]. Beyond the mass hierarchy and the precision determination of the neutrino oscillation parameters, a large liquid scintillator detector can give access to valuable data on many topics in astroparticle physics, like supernova burst and diffuse supernova neutrinos, solar neutrinos, atmospheric neutrinos, geo-neutrinos, nucleon decay, indirect dark matter searches and a number of additional exotic searches. A description of the rich JUNO physics program can be found elsewhere [1].

The Front-End and Read-Out electronics for the large PMT system are an important component and their performance is crucial for the success of the measurements. This translates into a very good resolution both in single photon detection and in multi photon signals. The overall requirements coming from physics are the following:

- signal range: from 1 p.e. to 100 p.e. with a linear response and charge resolution from 0.1 p.e. to 1 p.e.; this requires that the noise level remains below 0.1 p.e. for single p.e. detection;
- background range: from 100 p.e. to 1000 p.e. with a resolution of 1 p.e.;
- signal rise time around 2.5 ns. The requirement translates into a bandwidth of about 400 MHz and therefore a sampling rate of 1 Gsample/s is appropriate.

To achieve such goals, considering the detector structure and topology, the Read-Out electronics has to be positioned very close to the PMTs. This concept, novel compared to legacy large scintillator based neutrino experiments (see for instance [6] and [7]), makes it possible to reach the best performance in terms of signal to noise ratio, since the analog signal is digitized at a very early stage. Moreover, the data readout throughput is lowered thanks to the reduced number of cables needed to communicate with the back-end electronics. Since local data storage is possible, complex signal pre-processing tasks can be performed locally, before data is sent to the DAQ. On top of that, several constraints affect the electronics design: for example, it must satisfy high reliability criteria since it cannot be repaired or replaced in case of malfunction or breakdown. Furthermore, it has to minimize the single channel power consumption and fit in the limited space available for the installation.

According to [2] the following guidelines have been identified for the electronics design:

- positioning of the Front-End and Read-Out electronics close to the PMT output signal;
- usage of high speed and high resolution waveform digitizers with large bandwidth;
- exploitation of signal processing and local data storage, very close to the PMT;
- interface to the DAQ and Trigger electronics through Ethernet cables;
- Power over Ethernet and synchronous signals transport (Clock and Trigger) through the same Ethernet cable;
- single channel power consumption not greater than 10 W;
- high reliability of the PMT electronics: less than 0.5% of malfunctioning or broken channels in six years of data taking.

The present paper reports on the result of an R&D effort carried out inside the JUNO collaboration to design and test an electronics readout scheme as a possible candidate for the final large PMT electronics.

## 2. The electronics scheme

The structure of the electronics is shown in Fig. 1.

The electronics is split into two parts: one located on the PMT, inside the water pool, and henceforth referred to as 'wet' electronics, and the 'dry' electronics in the electronics room.

The 'wet' electronics is made of the following components (see Fig. 1, from left to right):

- Base: the PMT voltage divider and splitter;
- High Voltage Unit (HVU): a programmable module which provides the bias voltage to the voltage divider;
- Global Control Unit (GCU): the intelligent part of the 'wet' electronics. It receives the analog signal, digitizes it and processes the digital output.
- Power and Communication Board (PB): the interface to the 'dry' electronics. It drives the synchronous Clock (CLK) and Trigger (TRG) links and provides power to the 'wet' components.

The 'dry' electronics is composed of: the Back End Card (BEC), which receives and sends the digital data, distributes the power, and handles the synchronous signals (CLK and TRG); the trigger electronics, which will be described elsewhere; the central JUNO clock, synchronized to GPS; and the power supplies.

The communication between the dry and wet parts uses a standard CAT5e cable. The four twisted pairs of the cable accommodate:

- an asynchronous down-link using the 100BASE-TX Fast Ethernet communication standard;
- an asynchronous up-link using the same 100BASE-TX Fast Ethernet standard;
- a synchronous 62.5 MHz clock signal which is derived from the central JUNO clock;
- a Trigger input sending a digital "1" every 16 ns if a photon is detected by the PMT. The trigger decision, initiating the readout of the PMT, is distributed through the asynchronous down-link.

Fig. 1. Electronics scheme of the JUNO large PMT electronics. The 'wet' electronics (left) is connected to the 'dry' electronics (right) by means of a 100 m long CAT5e cable (middle).

Power is transmitted through a static voltage difference between the two wires of the twisted pairs. The digital power is transmitted on both asynchronous links at a voltage of 24 V; the analog power uses the clock line with the same voltage.

A realization of the prototype boards assembled in a castle-like configuration, before being coupled to the PMT, is shown in Fig. 2. In the lateral view (Fig. 2, left), from top to bottom the following boards can be seen: PB, GCU, an empty shielding board, and the PMT base. The HVU is on top of the PMT base, touching the shielding board. The diameter of the boards is about 140 mm, while the height of the assembly is about 100 mm. The Ethernet socket, which was used to test the prototypes, is visible on the top of Fig. 2, left.

A full view of the PB, with all the components, is available on the right plot of Fig. 2. Connections between the different boards are made with cables soldered on to the PCBs.

In the following sections, a description of the different boards is given.

### 2.1. PMT voltage divider and High Voltage Unit

JUNO will deploy, in total, about 20 000 large size PMTs of two different types [8]:

- about 5000 dynode PMTs, model R12860, from Hamamatsu Photonics;
- and about 15 000 Micro-Channel Plate Photomultipliers (MCP-PMT), produced by North Night Vision Technology.

The Hamamatsu R12860 PMT is based on a "Venetian-blind" dynode structure, while the NNVT PMTs use one micro-channel plate; the two types need different voltage dividers. Fig. 3 shows the electrical schemes: the high voltage supply to the anode is positive while the photocathode is grounded. The signal output is doubled and the maximum signal amplitude is limited to about 8 V to protect the downstream electronics from over-voltage. One side of the board is soldered directly to the PMT pins. The HVU is mounted on the other side of the board.

The high voltage is generated by the HVU, a custom module that converts a 24 V DC voltage to a high DC voltage (HV) using a cascade of half-wave doublers (Cockcroft–Walton multipliers). Such a system does not need any HV cables or connectors. The module is equipped with an embedded micro-controller, which monitors all operations and provides an RS485 half-duplex interface to the GCU.

The properties of the HVU are:

- range of HV: 1500 V-3000 V in steps of 0.5 V;
- ripple: 10 mV peak-to-peak;
- HV long term stability: 0.05%;
- temperature coefficient: 100 ppm/°C;
- maximum output current: 300 µA.

### 2.2. Preamplifier and analog-digital unit

To allow maximum flexibility during the design of the readout electronics, an FMC [9] low-pin-count connector has been mounted on the GCU board<sup>1</sup>, which makes it possible to plug in different Analog-Digital Units (ADUs) during the prototyping and testing phase. The ADUs were mounted on an FMC mezzanine board (see Fig. 4(a)). The ADU receives the input charge, converts it into a voltage, digitizes the waveform and sends it to the GCU for further processing. The input signal is connected to the ADU through an SMA connector. As can be seen from Fig. 4(b), the ADU consists of a custom Front-End Chip (FEC), two commercial Trans-Impedance Amplifiers (TIA), two drivers, and two custom ASIC ADCs. The ADC is a high speed converter with a maximum sampling frequency of 1 GHz and a 14-bit resolution. It adopts a multi-bit pipeline architecture, which keeps the power consumption low. A differential clock input is used to control all internal conversion cycles. The circuit is designed in a 65 nm CMOS process. A logical diagram of the main ADC parts is given in Fig. 5. The left part of the figure shows the clock differential inputs (CKIP/CKIN), the analog signal differential inputs (AIP/AIN), the common mode voltage input (VCMI) and the ADC reference voltages (VCM/VREF). The SPI control part of the ADC is marked at the bottom of the figure, and the power supplies and grounds for the analog and digital parts of the circuit are at the top. Finally, the differential output digital signal lines (DOP[0:13]/DOC[0:13]) and the clock output synchronized with the output data (DCOP/DCON) are on the right. A description of the ADC design and performance will be available in Ref. [10].

The circuit is completed by a Phase Locked Loop (PLL) and some peripheral circuits. The two drivers preceding the ADCs amplify the signals with different gains: a low gain with a dynamic range from 0 to 7.5 V, equivalent to about 1000 p.e., and a high gain with a reduced range from 0 to 960 mV, equivalent to 128 p.e. The output link uses a 14-bit Double Data Rate (DDR) parallel bus, with the data synchronized to a 500 MHz clock. The sampling clock is generated by an external PLL mounted on the ADU: it receives the 62.5 MHz system clock from the GCU and provides a low jitter (100 fs RMS) 1 GHz clock to the ADC. Finally, a Test Pulse Circuit can generate a programmable test pulse to check the status of the full electronics chain.

### 2.3. Global Control Unit

The Global Control Unit (GCU) is the core of the JUNO readout electronics; Fig. 6 shows a top (left) and bottom (right) photograph of one of the GCU prototypes. Its main task is the acquisition of the PMT waveform, its processing (local trigger generation, charge reconstruction, and timestamp tagging) and its temporary storage before it is sent to the data acquisition (DAQ) upon a trigger request.

<sup>1</sup> It can be seen on the right part of Fig. 6(b).



Fig. 2. Assembly of the prototype boards: lateral (left) and top (right) views.



Fig. 3. Top: Hamamatsu PMT voltage divider schematics. Bottom: MCP-PMT voltage divider schematics.



Fig. 4. (a): ADU top side; the main components have been highlighted. (b): ADU logical scheme.



Fig. 5. ADC block diagram. A description is given in the text.



Fig. 6. GCU prototypes picture: top side (A) and bottom side (B).

A block diagram of the GCU can be seen in Fig. 7. The core of the board is a Xilinx Kintex-7 FPGA (XC7K160T), which is a good compromise between number of available I/O ports, power consumption, performance and cost. A continuous stream of 14-bit data, sampled at 1 Gsample/s, is transferred from the ADC to the FPGA via 14 LVDS lines (500 MHz DDR). The FPGA is able to handle all the data packaging, processing and buffering. Metadata, containing for instance the timestamp and the trigger number, is attached to the event segment packages and stored in the board memory. Upon a trigger request, validated waveforms are sent to the DAQ event builder via Fast Ethernet. The IPbus Core [11] protocol is used for data transfer, slow control monitoring, and control operations. It allows a transparent manipulation of the FPGA registers across the Ethernet, bridging the Ethernet network to the I2C, SPI, and UART local buses of the GCU.

A typical slow control operation can be either the setting of the PMT High Voltage through the HVU, which is connected through an optically isolated RS485 interface to the Kintex-7 FPGA, or the readout of the local GCU temperature sensors. The synchronization and communication protocol, running on the two synchronous links, is based on the Timing, Trigger and Control (TTC) [12] protocol, developed at CERN. It provides the capability to exchange data between the GCU and the BEC, such as trigger timestamps and calibration information, as well as sending trigger input upstream to the Central Trigger Processor (CTP).

A description of the TTC implementation on the current hardware and discussion of the results can be found in Ref. [13]. The data streams are DC-balanced. A Clock Data Recovery (CDR) chip in the GCU recovers the master clock of the experiment from the data stream. The synchronization is a key feature: it guarantees that all the 20 000 local clocks are aligned with the global time within a system clock period of 16 ns.
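The clock figures quoted in this section are mutually consistent, which can be checked with a few lines of arithmetic. The sketch below is only an illustration: the constant and function names are invented here and do not come from the JUNO firmware.

```python
# Clock values quoted in the text.
SYSTEM_CLOCK_HZ = 62.5e6    # synchronous clock distributed to each GCU
ADC_SAMPLING_HZ = 1.0e9     # sampling clock delivered by the PLL on the ADU
DDR_BUS_CLOCK_HZ = 500e6    # clock of the ADC-to-FPGA DDR bus


def clock_summary():
    """Return (system clock period in ns, PLL multiplier, words/s per DDR line)."""
    period_ns = 1e9 / SYSTEM_CLOCK_HZ
    pll_multiplier = ADC_SAMPLING_HZ / SYSTEM_CLOCK_HZ
    # Double Data Rate: one word is transferred on each clock edge.
    words_per_second = 2 * DDR_BUS_CLOCK_HZ
    return period_ns, pll_multiplier, words_per_second


period_ns, pll_multiplier, words_per_second = clock_summary()
print(f"system clock period: {period_ns:.1f} ns")
print(f"PLL multiplication factor: {pll_multiplier:.0f}")
print(f"DDR transfer rate per line: {words_per_second / 1e9:.1f} Gword/s")
```

Under these numbers one system clock period spans 16 ADC samples, so the 16 ns alignment quoted above corresponds to 16 samples of the digitized waveform.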

A critical point in the readout scheme is the capability to handle 8 Gbit/s of raw data (14 LVDS lines at 500 MHz DDR) continuously from the ADC. The waveforms need to be stored while waiting for triggers from the CTP. We expect a trigger latency of about 100 $\mu$s. Upon a trigger, a readout window of pre-defined length will be extracted from the local buffer and sent to the DAQ through the asynchronous link. A circular, level-1 cache is allocated inside the main FPGA memory. The available block RAM in the Kintex-7 is 11 700 Kbit, which can store up to 1.4 ms of data, well above the required latency.
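The buffer depth follows directly from the memory size and the raw data rate. The short sketch below reproduces the estimate, assuming that the whole block RAM is used for the circular buffer and that 1 Kbit means 1000 bit; both are simplifications made here, and the names are illustrative.

```python
RAW_RATE_BIT_PER_S = 8e9           # raw ADC data rate quoted in the text
BLOCK_RAM_BITS = 11_700 * 1_000    # 11 700 Kbit, taking 1 Kbit = 1000 bit (assumption)
TRIGGER_LATENCY_S = 100e-6         # expected trigger latency


def buffer_depth_s(capacity_bits, rate_bit_per_s):
    """Time span of continuous data that fits in a buffer of the given size."""
    return capacity_bits / rate_bit_per_s


depth_s = buffer_depth_s(BLOCK_RAM_BITS, RAW_RATE_BIT_PER_S)
margin = depth_s / TRIGGER_LATENCY_S
print(f"buffer depth: {depth_s * 1e3:.2f} ms")
print(f"margin over trigger latency: {margin:.1f}x")
```

The result, slightly above 1.4 ms, is more than ten times the expected trigger latency. In practice part of the block RAM is needed for other firmware functions, so the usable depth will be smaller.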

Fig. 7. GCU Block diagram. A description of the different parts is given in the text.

In normal operation mode a trigger rate of about 1 kHz is expected. In case of a supernova explosion, the data rate will rapidly increase by orders of magnitude. The FPGA's internal cache would be too small to handle the data; therefore a dedicated 2 GB DDR3 memory has been added to the GCU. The memory controller supports write operations up to about 21.3 Gbit/s, which is sufficient to handle the incoming data rate and to store two seconds of continuous data. The use of a data compression algorithm would further increase the effectively available memory.

Since the GCU will no longer be accessible after the detector is filled with water and liquid scintillator, the only interface, Fast Ethernet, has to provide both data readout and remote FPGA reprogramming. Therefore the GCU is equipped with a second, smaller FPGA (Spartan-6) whose purpose is to ensure a fail-safe reconfiguration of the Kintex-7 by means of a virtual JTAG connection over the IPbus, eliminating the need for a dedicated JTAG connector and cable. As can be seen in Fig. 7, the two FPGAs are connected to the physical 100BASE-T Ethernet switch and interconnected via JTAG. The virtual JTAG also makes it possible to use the Xilinx debugging tools (Impact and Chipscope). A custom Xilinx virtual cable server, XVC [14], opens a TCP port for the Xilinx tools and provides support for the IPbus/UDP protocol, bridging the JTAG commands to the GCU's JTAG chain via Fast Ethernet through the IPbus core instantiated in the Spartan-6.
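For the DDR3 memory introduced earlier in this section, the two seconds of storage and the write-bandwidth headroom can be checked in the same way. The sketch assumes 1 GB = 10<sup>9</sup> bytes and uncompressed data; the names are illustrative.

```python
DDR3_BYTES = 2e9                 # 2 GB, taking 1 GB = 1e9 bytes (assumption)
RAW_RATE_BIT_PER_S = 8e9         # raw ADC data rate quoted in the text
DDR3_WRITE_BIT_PER_S = 21.3e9    # maximum write rate of the memory controller

# Time needed to fill the memory with a continuous, uncompressed stream.
fill_time_s = DDR3_BYTES * 8 / RAW_RATE_BIT_PER_S

# Ratio between what the controller can write and what the ADC produces.
write_headroom = DDR3_WRITE_BIT_PER_S / RAW_RATE_BIT_PER_S

print(f"continuous storage: {fill_time_s:.1f} s")
print(f"write bandwidth headroom: {write_headroom:.2f}x")
```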

### 2.4. Power and communication board

The Power and Communication Board (PB) provides the power to the 'wet' electronics and the interface to the CAT5e cable that connects to the 'dry' electronics. Power is transmitted through the asynchronous data links using a custom Power Over Ethernet (POE) approach: the standard POE [15] technology is adopted for the power rails, but with a lower voltage (24 V instead of 48 V<sup>2</sup>) and without the overhead of the POE protocol. Analog power is conveyed through the clock link, with the CLK signal AC coupled onto a power rail. A diagram showing how power is decoupled from the signal is given in Fig. 8. The circuit works as a low pass filter for the DC supply and a high pass filter for the differential signal. The voltage of both power lines can be adjusted independently to compensate for the power losses over the 100 m long cable.

The PB is connected to the GCU. Data links and a dedicated 12 V (1 A, max) power rail are provided. From the 12 V power rail, the GCU will internally generate all the required voltages. The PB also connects to the HVU through a low ripple power line. The voltage is in the range between 23 V and 30 V with a maximum allowed current of 80 mA. Three separate ground potentials are provided. They are connected to the 'dry' electronics through the shields of the corresponding CAT5e cable. There is a digital ground and an analog ground, which are connected to each other at a single point in the 'wet' electronics, in the ADU. A third ground is transported on the outer shield of the CAT5e cable and connected to the steel housing of the 'wet' electronics for electrical shielding.

During the assembly of the boards the CAT5e cable is soldered onto the PB. Cable ties are foreseen to hold the cable in place to protect the solder joints from possible stress. The cable will be split into its pairs which are then soldered close to the corresponding drivers/receivers located in different positions on the PCB. A scheme of assembly of the three boards with the signal and power connections is given in Fig. 9.

### 2.5. Back end card

The Back End Card (BEC) is the first board of the 'dry' electronics. It is used as a concentrator and a bridge between the 'wet' electronics and the DAQ and trigger systems. The main tasks of the BEC are the handling of the data links from and to the 'wet' electronics, the reception of the trigger input, and the distribution of the power and the clock to the 'wet' electronics. One BEC connects to 48 GCUs. Since JUNO will deploy around 20 000 large PMTs, about 420 BECs will be needed. A schematic view with a focus on the role of the BEC is presented in Fig. 10.
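The number of BECs follows from a ceiling division of the channel count by the ports per card. The sketch below uses the round figure of 20 000 PMTs as a stand-in for the exact channel count, which is not given here.

```python
N_LARGE_PMTS = 20_000    # round figure from the text; the exact count is an assumption
PORTS_PER_BEC = 48       # one BEC connects to 48 GCUs


def boards_needed(channels, ports_per_board):
    """Smallest number of boards that provides at least `channels` ports."""
    return -(-channels // ports_per_board)   # integer ceiling division


n_bec = boards_needed(N_LARGE_PMTS, PORTS_PER_BEC)
spare_ports = n_bec * PORTS_PER_BEC - N_LARGE_PMTS
print(f"BECs needed: {n_bec}, unused ports: {spare_ports}")
```

The minimum obtained this way is slightly below the figure of about 420 quoted above, which leaves room for spare ports.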

The BEC consists of two parts: the baseboard and the Trigger and Timing (TTIM) FMC mezzanine card. The baseboard routes all the signals. It compensates the losses due to the long cables on the incoming signals and connects to trigger, DAQ system, central clock and power supplies. The readout and slow control data streams which are transmitted over Ethernet are passively routed to a commercial POE switch. The BEC baseboard design is shown in Fig. 11 (left part).

The PCB is equipped with 48 RJ45 connectors located on the bottom side of the baseboard to provide the connections to the 'wet' electronics. Close to the connectors, 48 equalizers are mounted to handle the incoming trigger inputs. The output from the 48 differential pairs is connected to two custom-defined LPC connectors with two serial 0 Ohm resistors in each path. The two LPC connectors are situated in the middle part of the baseboard and provide the connection to the TTIM. On the top side of the LPC connectors another 48 differential pairs connect back to the RJ45 connectors for the down-link trigger validation signals. In total 96 differential pairs are connected to the two LPC connectors. In the middle of the top part of the baseboard there is the power connector for the BEC. It is separated from the power supplies for the 'wet' electronics to allow for flexibility in the grounding. Since one BEC has 48 identical ports and each port supports bi-directional data transfer, two ports can be cross-connected for testing. The TTIM can be used to generate 250 Mb/s PRBS data. The eye diagram in Fig. 11 (right part) shows a stable bi-directional data transfer, obtained by connecting two channels on a BEC board through a 100 m long Ethernet cable.

<sup>2</sup> Due to the high reliability design constraint.

Fig. 8. Schematic of how the clock signal is decoupled from the power in the synchronous link.

Fig. 9. Connection scheme of the electronics boards. The routing of the different signals and power rails is indicated. Details are given in the text.

Fig. 10. Logical diagram of the JUNO large PMT electronics, with the BEC logical scheme highlighted.



Fig. 11. Left: BEC baseboard design. Right: Synchronization tests results, eye diagram.



Fig. 12. Development of the failure rate throughout the lifetime of an electronic component [16].

## 3. Reliability

The 'wet' electronics cannot be accessed after liquid scintillator filling. As mentioned in the introduction, JUNO requires less than 1% channel failures during the first six years of operation. We assume that half of the failures stem from PMTs and their bases, so that less than 0.5% of the electronics may fail, taking also into account failures of the cables and leakage into the electronics housings. The failure of electronics over time can be described by three major phases in the so-called bathtub curve (see Fig. 12). In the beginning of the operation the failure rate is dominated by infant mortality. During this phase, devices or components with small defects, like bad solder joints, fail. In high reliability electronics, infant mortality can be overcome with careful screening and burn-in. Throughout the useful lifetime of a device random failures are dominant, leading to a constant failure rate. All discussions and definitions in the following sections refer to this phase dominated by random failures. At the end of the lifespan the risk increases again due to aging effects like decreasing chemical stability [16]. In Table 1 the relevant acronyms used in reliability engineering are specified. The essential value is the failure rate $\lambda$, expressed in failures in time (FIT), which is assumed to be constant over the useful lifetime. The failure probability can be calculated using an exponential function (Eq. (1)):

$$P(\text{fail}) = 1 - e^{-\lambda \cdot t} \tag{1}$$

The failure rate  $\lambda$  is usually normalized to 10<sup>9</sup> h of operation, which shifts typical electronics to FIT-values of  $\mathcal{O}(1)$ .
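Eq. (1) can be inverted to turn the requirement of less than 0.5% failed electronics channels in six years into a failure-rate budget. The sketch below does this under the assumptions of continuous operation and 8760 hours per year; the function names are illustrative.

```python
import math

HOURS_PER_YEAR = 8760.0   # 365 days of continuous operation (assumption)


def fit_from_failure_fraction(p_fail, years):
    """Invert Eq. (1): failure rate in FIT that gives probability p_fail after `years`."""
    t_hours = years * HOURS_PER_YEAR
    lam_per_hour = -math.log(1.0 - p_fail) / t_hours
    return lam_per_hour * 1e9   # FIT = failures per 1e9 h


def failure_probability(fit, years):
    """Eq. (1) with the failure rate given in FIT."""
    lam_per_hour = fit / 1e9
    return 1.0 - math.exp(-lam_per_hour * years * HOURS_PER_YEAR)


budget_fit = fit_from_failure_fraction(0.005, 6.0)
print(f"failure-rate budget: {budget_fit:.1f} FIT")
print(f"P(fail) at 95 FIT after 6 years: {failure_probability(95.0, 6.0):.4%}")
```

The resulting budget is slightly above 95 FIT, in line with the target value quoted for the 'wet' electronics.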

### 3.1. Calculating the reliability

A device's failure rate can be described by the sum of the failure rates of all its components. The military handbook MIL-HDBK-217F [17] Notice 2 was used as our baseline and the FIDES guide [18] served as a cross-check. The military handbook is a well established tool for estimating the reliability of a device. It is based on data obtained during operation and uses simple assumptions to create easily usable models. For the reliability calculation, two different methods are introduced for different stages of the project: the "part count" and the "part stress" methods. The part count method is a conservative approach that can be used in the early phase of a project to get an initial estimate of the reliability. The part stress method refines this estimate at a subsequent stage of the development, when all part parameters, e.g. voltage stress and temperature, are known. The results are conservative but reasonable for most devices and components [19]. However, for some components whose processing has improved significantly over the last few years, like CMOS microcircuits, the reliability results are too pessimistic. Additionally, SMD components are missing in these models, although they play a crucial part in modern electronics. On top of the failure rate of the components, we need to consider failures of the PCB assembly. This is calculated with the FIDES guide [18]. The failure rate depends mainly on the technology, the number of solder joints, the environment of the final assembly and the reliability of the manufacturer.

Table 1. Definition of acronyms used in reliability engineering [16].

| Term | Definition |
|------|------------|
| Failure Rate $\lambda$ | Number of failures per unit time for one component, assuming a constant failure rate. $\lambda$ is given in units of FIT. |
| Failure In Time (FIT) | Number of failures per $10^9$ device-hours, e.g. $\lambda = 100$ FIT = 100 failures in $10^9$ h. |
| Mean Time To Failure (MTTF) | Mean lifetime under operation before a defect occurs; it is consequently the inverse of the failure rate, $\lambda = \frac{1}{\text{MTTF}}$. Mean Time Between Failures (MTBF) is a synonym if the device is repairable. |

One may either test every component individually or the entire device in a single measurement. But with the entire device, the problem may arise that a failing component leads to a cascade of other components failing and the origin of the failure may not be identified. Alternatively, testing all components by themselves is a valid method too, but as the failure rate of standard components is very low, many components and a long testing time are needed. A common way to accelerate the tests is to increase the stress on the component, for example increasing the temperature to accelerate chemical aging. The simplest way to describe the probability of a device to fail is by an exponential function.
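As an illustration of the exponential model, the short Python sketch below converts a failure rate given in FIT into an MTTF and into the cumulative failure probability $1 - e^{-\lambda t}$. It is a minimal sketch assuming a constant failure rate; the function names and the 100 FIT example value are hypothetical and only serve the illustration.

```python
import math

HOURS_PER_YEAR = 8766.0  # 365.25 days, used only for the example printout


def fit_to_rate_per_hour(fit):
    """Convert a failure rate in FIT (failures per 1e9 h) to failures per hour."""
    return fit * 1e-9


def mttf_hours(fit):
    """Mean Time To Failure in hours, the inverse of the constant failure rate."""
    return 1.0 / fit_to_rate_per_hour(fit)


def failure_probability(fit, hours):
    """Probability that a device has failed after `hours` of operation,
    assuming an exponential lifetime distribution: 1 - exp(-lambda * t)."""
    return -math.expm1(-fit_to_rate_per_hour(fit) * hours)


if __name__ == "__main__":
    example_fit = 100.0  # hypothetical value, not a measured result
    print(f"MTTF: {mttf_hours(example_fit):.3g} h")
    for years in (1, 10):
        p = failure_probability(example_fit, years * HOURS_PER_YEAR)
        print(f"P(fail within {years} y) = {p:.4%}")
```

For small $\lambda t$ the failure probability is approximately $\lambda t$, which is why FIT values can be compared directly as long as the operation time is much shorter than the MTTF.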

Some assumptions have to be made: the failure rate of the device has to be constant, which is valid only after infant mortality and before being worn out. The failure rate can be calculated as

$$\lambda = \frac{\chi^2 (2 \cdot (f+1), \text{CL}) \cdot 10^9 \text{ h}}{2 \cdot t \cdot d \cdot AF} , \qquad (2)$$

where $\lambda$ is the failure rate in FIT (failures per $10^9$ h), *f* is the number of devices which failed, and $\chi^2$ is the $\chi^2$ value for $2 \cdot (f + 1)$ degrees of freedom at a given confidence level CL. Finally, *t* is the test duration in hours, *d* is the total number of devices tested and *AF* is the acceleration factor, defined for thermal stress as in Eq. (3):

$$AF = \exp\left(\frac{E_a}{k_B}\left(\frac{1}{T_{\text{use}}} - \frac{1}{T_{\text{stress}}}\right)\right) . \qquad (3)$$

Here $E_a$ is the activation energy (in eV), $k_B$ is Boltzmann's constant, and $T_{\text{use}}$ and $T_{\text{stress}}$ are the absolute temperatures (in Kelvin) of normal use and of the accelerated test, respectively. If a large number of failures is observed, the term $\chi^2/2$ may be approximated by the number of failures, but usually the number of failures is small. Typically, a confidence level of 60% is used. The factor of $10^9$ h normalizes the result to FIT. During the test all devices have to be operational, i.e. under power. Early failures result from defects that occur in production and assembly, and they need to be excluded from the calculation. We foresee a screening for early failures with some thermal cycling to suppress infant mortality. The target value for all of the 'wet' electronics is 95 FIT.
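Eqs. (2) and (3) can be evaluated with a few lines of Python. The sketch below is illustrative only: it uses the standard library, computes the $\chi^2$ quantile for an even number of degrees of freedom from the closed-form (Poisson sum) expression of the cumulative distribution and a bisection, and takes $T_{\text{stress}}$ as the elevated test temperature. All function names and the numerical inputs in the example (activation energy, temperatures, test duration, number of devices) are hypothetical.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K


def chi2_quantile_even(dof, cl):
    """Quantile of the chi-square distribution for an even number of
    degrees of freedom, found by bisection on the closed-form CDF."""
    if dof <= 0 or dof % 2 != 0:
        raise ValueError("dof must be a positive even integer")
    if not 0.0 < cl < 1.0:
        raise ValueError("confidence level must be in (0, 1)")
    k = dof // 2

    def cdf(x):
        # P(chi2_{2k} <= x) = 1 - exp(-x/2) * sum_{i<k} (x/2)^i / i!
        half = x / 2.0
        term = math.exp(-half)
        total = term
        for i in range(1, k):
            term *= half / i
            total += term
        return 1.0 - total

    lo, hi = 0.0, 1.0
    while cdf(hi) < cl:  # expand the bracket until it contains the quantile
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if cdf(mid) < cl:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def acceleration_factor(e_a_ev, t_use_k, t_stress_k):
    """Thermal acceleration factor of Eq. (3)."""
    return math.exp(e_a_ev / K_B_EV * (1.0 / t_use_k - 1.0 / t_stress_k))


def failure_rate_fit(failures, hours, devices, af=1.0, cl=0.6):
    """Upper limit on the failure rate in FIT according to Eq. (2)."""
    chi2 = chi2_quantile_even(2 * (failures + 1), cl)
    return chi2 * 1e9 / (2.0 * hours * devices * af)


if __name__ == "__main__":
    # Hypothetical test campaign: 100 devices, 1000 h at 125 degC, no failures.
    af = acceleration_factor(e_a_ev=0.7, t_use_k=313.15, t_stress_k=398.15)
    rate = failure_rate_fit(failures=0, hours=1000.0, devices=100, af=af)
    print(f"AF = {af:.1f}, lambda < {rate:.1f} FIT at 60% CL")
```

With zero observed failures the quantile reduces to $-2\ln(1-\text{CL})$, so the upper limit scales inversely with the accumulated device hours multiplied by the acceleration factor.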

#### 3.2. Power and communication board reliability

As an example, the details of the reliability calculation for the PB are presented below. We modified the design, and especially the selection of the components, through several iterations to minimize its failure rate. It was decided to use only components which are qualified by the manufacturers and for which FIT values are provided. We use a conservative approach: all components are classified as critical for the operation of the board. The failure of a temperature sensor is assumed to have the same impact as the failure of a truly critical component like the Ethernet transformer. A dedicated code, *ReliabilityCalc*<sup>3</sup>, was developed. The program calculates the reliability using the manufacturer's data or the military handbook, including temperature dependencies and stress levels. The failure rate of the PB with all of its 266 components is estimated at

$$\lambda < 40.4 \text{ FIT} \qquad (4)$$

at a temperature of 40 °C for every component. The temperature was measured with a dummy board potted in oil, with simple resistors simulating the heat dissipation. The contribution of the different parts, after optimization, can be seen in Fig. 13. The failure rate is dominated by one passive component: the PoE coil.<sup>4</sup> Unfortunately no alternative with a better failure rate could be found. The right plot of Fig. 13 shows the FIT value as a function of temperature. The exponential rise is dominated by silicon chips due to their high activation energy.

The failure rate of the PB corresponds to about 42% of the budget available to all of the 'wet' electronics. Considering that the PB holds most of the power electronics, this might be acceptable.

Table 2 presents the estimated failure rates of all electronics. The major contribution from capacitors is somewhat expected, given the relatively large number of capacitors on the board (see Fig. 6); moreover, many of those are tantalum capacitors, which cannot be replaced with ceramic capacitors because the latter would be too large. Experience from the BESIII experiment [20] shows that failures in the electronics boards over ten years of running are dominated by tantalum capacitors, which supports their dominant role in our reliability calculations.

Obviously some more optimization is needed to fully reach the goal of 95 FIT. In parallel to the estimate of the failures in normal operation, we investigated a number of exceptional events, such as power cuts. We ensured that none of those exceptional events constitutes a significant risk of failure.

Table 2

Electronics estimated failure rates.

| Unit            | Failure rate (FIT) | Comment                              |
|-----------------|--------------------|--------------------------------------|
| HVU             | 50                 | Dominated by the HV filter capacitor |
| GCU             | 107                | Dominated by many capacitors         |
| PB              | 40                 | See text                             |
| Dry electronics | 0                  | Replaceable                          |
| Cables          | 30                 |                                      |
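The comparison of the estimates in Table 2 with the 95 FIT target can be sketched as follows. The sketch assumes that the units form a series system, so that their constant failure rates simply add, that the values of Table 2 are given in FIT, and that the 'wet' electronics comprise the HVU, the GCU and the PB. This grouping and all names in the code are assumptions made for illustration.

```python
# Estimated failure rates from Table 2, assumed to be in FIT.
TABLE_2_FIT = {
    "HVU": 50.0,
    "GCU": 107.0,
    "PB": 40.0,
    "Dry electronics": 0.0,
    "Cables": 30.0,
}

WET_UNITS = ("HVU", "GCU", "PB")  # assumed grouping of the 'wet' electronics
TARGET_FIT = 95.0                 # target for all 'wet' electronics
PB_FIT = 40.4                     # PB upper limit from Eq. (4)


def series_failure_rate(rates):
    """Failure rate of a series system: constant rates of the parts add up."""
    return sum(rates)


wet_total = series_failure_rate(TABLE_2_FIT[unit] for unit in WET_UNITS)
pb_share = PB_FIT / TARGET_FIT

print(f"Sum over assumed 'wet' units: {wet_total:.0f} FIT (target {TARGET_FIT:.0f} FIT)")
print(f"PB share of the target budget: {pb_share:.1%}")
```

Under these assumptions the sum exceeds the target, in line with the statement that further optimization is needed.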

#### 4. Prototyping and tests

Several prototypes of all components of the 'wet' and 'dry' electronics were produced. After extensive standalone tests, the 'wet' electronics was assembled into the stack seen in Fig. 9 and connected through a 100 m CAT5e cable to the prototype of a BEC. Commercial units provided LV power and the clock signal to the BEC. A preliminary version of the DAQ was used to communicate with the electronics. The JUNO central trigger was not yet included in the tests. Due to a mistake in the routing, a cable was needed to patch the Ethernet connection between PB and GCU. The sockets are visible in Fig. 2. The 'wet' electronics was mounted on a dynode PMT (Hamamatsu R12860) to create a complete vertical slice of one channel. The PMT assembly was located in a light-tight box. The vertical slice was intensively tested. Afterwards, the electronics was coupled to an MCP-PMT (NNVT) and potted into its watertight housing, and everything was tested again. All the initial tests were performed in Italy, where at that time only the Hamamatsu PMT was available; tests with the encapsulated electronics were performed in China, where both Hamamatsu and NNVT PMTs were available. Both PMTs used in the tests were equalized to the same gain. All the results of the tests are presented below.

# 4.1. Initial tests

#### 4.1.1. Linearity

To test the linearity of the response, the input was connected to a CAEN Fast Digital Detector Emulator DT5810. It provides pulses with a fixed rise and decay time, but with a programmable amplitude. The left panel of Fig. 14 shows a sample of simulated signals fed to the electronics: the amplitudes have been varied from 5 mV up to 200 mV with a default rise and fall time of 30 ns and 120 ns, respectively.<sup>5</sup> For reference, a single photon from the PMT creates a pulse with a typical amplitude of 10 mV. An external trigger, provided by the DT5810, was used and data were acquired through the whole electronics chain. The average charges of more than 10 000 pulses per injected charge are plotted on the right side of Fig. 14, against the input amplitude. The plot shows excellent linearity. The maximal deviation from a linear fit is 1.3%. The result is well within the JUNO requirements.
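A linearity check of this kind can be sketched in Python as a least-squares straight-line fit followed by the largest relative residual. The data in the example are synthetic and purely illustrative, the function names are hypothetical, and the definition of the deviation (residual divided by the fitted value) is an assumption, since the exact definition used for the quoted 1.3% is not spelled out here.

```python
def fit_line(x, y):
    """Unweighted least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def max_relative_deviation(x, y):
    """Largest |measured - fitted| / |fitted| over all points."""
    slope, intercept = fit_line(x, y)
    return max(
        abs(yi - (slope * xi + intercept)) / abs(slope * xi + intercept)
        for xi, yi in zip(x, y)
    )


# Synthetic example: input amplitudes in mV and charges in arbitrary units,
# with an artificial +/-0.5% alternating distortion (not measured data).
amplitudes_mv = [5.0, 10.0, 20.0, 50.0, 100.0, 150.0, 200.0]
charges = [2.0 * a * (1.0 + 0.005 * (-1) ** i) for i, a in enumerate(amplitudes_mv)]

print(f"max deviation: {100.0 * max_relative_deviation(amplitudes_mv, charges):.2f} %")
```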

#### 4.1.2. Single photo-electron measurements

All the following tests have been performed with a dynode PMT from Hamamatsu, with a gain of $(1.75 \pm 0.12) \cdot 10^7$. The left plot of Fig. 15 shows a few pulses of different amplitude recorded through the full vertical slice. The data show a stable baseline with no overshoot or wiggles at the tail of the pulses. The rise time of the signal is around 7 ns and the decay time around 30 ns. The pulses were integrated over 50 ns. The charge spectrum is shown in the right plot of Fig. 15. The mean amplitude for single p.e. was measured to be $9.39 \pm 0.03$ mV, while the average noise level is $0.45 \pm 0.04$ mV. A signal-to-noise ratio of $20.9 \pm 1.9$ was extracted for single p.e. The fit presented in Fig. 15 gives a peak-to-valley ratio of 3.8. The single p.e. resolution is around 31%. The vertical slice was in stable operation for a few days without any loss of data.

<sup>3</sup> Available from RWTH Aachen University through https://github.com/JochiSt/ReliabilityCalc, DOI 10.5281/zenodo.1134161.

<sup>4</sup> Ethernet magnetics, 749012013, from Würth Electronics.

<sup>5</sup> The measurements were not performed for larger amplitudes since the signal rise times increase dramatically, going to the $\mu$s domain.



Fig. 13. Power and Communication Board reliability. Left: single components contribution. Right: FIT value temperature dependence.



Fig. 14. Left: DT5810 input signals (smallest amplitude: 5 mV, highest amplitude: 200 mV). Right: reconstructed charge as a function of the input amplitude.



Fig. 15. Left: reconstructed pulses for different p.e. values. Right: Single p.e. spectrum. The spectrum was fitted with 3 Gaussians and an exponential function for the background (black line): one Gaussian for the noise peak centered at zero, one for the single p.e. contribution (green line) and the last one for the two p.e component (magenta line). (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)



Fig. 16. Stability over time. Left: GCU temperatures; a stable temperature of about 40 °C to 50 °C is reached about 16 h (about 1000 min) after power-on. Center: Baseline pedestal stability, monitored for one week. Right: Noise level as a function of time, monitored for one week.

# 4.2. Tests after potting

For the tests with the potted electronics, an MCP-PMT from NNVT was used. The electronics was encased in an air-filled, stainless steel housing, which in turn was glued to the neck of the PMT with several types of epoxy. The glue joint and the cable feed-through were covered with a heat-shrinkable tube as a second layer of leakage protection. The performance was measured again after the potting procedure. The gain of the PMT was adjusted to $(1.67 \pm 0.13) \cdot 10^7$. The temperature of the GCU was monitored with four temperature sensors inside the FPGA. The temperature trend is shown in the left plot of Fig. 16. After a fast initial increase the temperature stabilizes around 40 °C to 50 °C. There is no significant change over the next 220 h of measurement. During the operation the outside air temperature was stabilized with ventilation to 25 °C. Slightly better cooling is expected in water. Position and width of the baseline were stable during the whole period. From the width we extract a noise level corresponding to $0.57 \pm 0.04$ mV. Both the waveform baseline pedestal and the noise level were stable in time, as can be seen from the two rightmost plots of Fig. 16. With a measured mean amplitude for single p.e. of $10.78 \pm 0.09$ mV, a signal-to-noise ratio of $18.9 \pm 1.3$ has been extracted, a value which is compatible, within the statistical error, with that obtained in the tests before potting.
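The signal-to-noise ratios quoted before and after potting can be cross-checked with standard error propagation. The following Python sketch assumes uncorrelated uncertainties on the single p.e. amplitude and on the noise level; this assumption and the function name are made here for illustration and are not stated as the method used in the measurement.

```python
import math


def ratio_with_uncertainty(signal, signal_err, noise, noise_err):
    """Return signal/noise and its uncertainty for uncorrelated inputs."""
    ratio = signal / noise
    relative_err = math.hypot(signal_err / signal, noise_err / noise)
    return ratio, ratio * relative_err


# Amplitudes and noise levels in mV, as quoted in the text.
snr_before, err_before = ratio_with_uncertainty(9.39, 0.03, 0.45, 0.04)
snr_after, err_after = ratio_with_uncertainty(10.78, 0.09, 0.57, 0.04)

# Difference of the two ratios in units of the combined uncertainty.
pull = abs(snr_before - snr_after) / math.hypot(err_before, err_after)

print(f"before potting: {snr_before:.1f} +/- {err_before:.1f}")  # 20.9 +/- 1.9
print(f"after potting:  {snr_after:.1f} +/- {err_after:.1f}")    # 18.9 +/- 1.3
print(f"difference: {pull:.2f} sigma")
```

Under this assumption the uncertainty of the ratio is dominated by the relative uncertainty of the noise level, and the two ratios differ by less than one combined standard deviation.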

We recorded single photon spectra with a pulsed LED in front of the photo cathode. A trigger generated by the pulse generator was sent to the GCU. The data were recorded through the full vertical slice. There was no visible change in the rise or decay time of the pulses after potting. Again the pulses were integrated over 50 ns. The charge spectrum is shown in Fig. 17. The fit of the charge spectrum and its different contributions are explained in the caption of Fig. 15. The single p.e. resolution is around 34%, which is also compatible with the results before the potting procedure.
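The fit model described in the caption of Fig. 15 (three Gaussians plus an exponential background) can be written down compactly. The Python sketch below only illustrates the shape of such a model and how a peak-to-valley ratio and a resolution could be derived from it; the parameter values are invented, the charge is in arbitrary units, and taking the resolution as $\sigma/\mu$ of the single p.e. Gaussian is an assumption.

```python
import math


def gaussian(q, amplitude, mean, sigma):
    """Unnormalized Gaussian evaluated at charge q."""
    return amplitude * math.exp(-0.5 * ((q - mean) / sigma) ** 2)


def spectrum_model(q, noise, spe, two_pe, background):
    """Noise peak + single p.e. + two p.e. Gaussians + exponential background.
    Each Gaussian is given as (amplitude, mean, sigma), the background as
    (amplitude, decay scale)."""
    bkg_amplitude, bkg_scale = background
    return (
        gaussian(q, *noise)
        + gaussian(q, *spe)
        + gaussian(q, *two_pe)
        + bkg_amplitude * math.exp(-q / bkg_scale)
    )


def peak_to_valley(model, q_low, q_high, steps=2000):
    """Scan [q_low, q_high]: the valley is the minimum of the model, the peak
    is the maximum found at charges above the valley."""
    grid = [q_low + (q_high - q_low) * i / steps for i in range(steps + 1)]
    values = [model(q) for q in grid]
    valley_index = min(range(len(values)), key=values.__getitem__)
    return max(values[valley_index:]) / values[valley_index]


# Invented parameters in arbitrary charge units (not fit results).
noise = (1000.0, 0.0, 0.1)
spe = (100.0, 1.0, 0.3)
two_pe = (10.0, 2.0, 0.45)
background = (5.0, 0.5)


def model(q):
    return spectrum_model(q, noise, spe, two_pe, background)


ptv = peak_to_valley(model, q_low=noise[1], q_high=spe[1] + spe[2])
resolution = spe[2] / spe[1]  # assumed definition: sigma / mean of the SPE peak
print(f"peak-to-valley = {ptv:.1f}, resolution = {resolution:.0%}")
```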

# 5. Conclusions

JUNO will be the largest liquid scintillator detector ever built for neutrino physics. The scientific goals put stringent constraints on the performance of the readout electronics. Especially challenging are the excellent energy resolution required for the determination of the mass hierarchy, the large data rate from supernova events due to the large mass of the detector and the handling of the huge signals of cosmic muons. The readout electronics of the large PMTs is an essential ingredient for the success of the experiment. A novel design of the electronics has been presented. The electronics is mounted on the back end of the PMTs, close to the PMT output signal, embedded in the watertight steel housing. A substantial effort has gone into optimizing the reliability of the system. The tests confirm the expected performance of the whole system. It was verified that the potting does not degrade the performance.



Fig. 17. Single p.e. spectrum, after potting.

# CRediT authorship contribution statement

The JUNO detector has been designed and is being constructed by the JUNO Collaboration over the span of more than 5 years. The JUNO Collaboration sets the science goals, discusses and approves the scientific results. The results presented in the paper are the result of several years of work and discussion with contributions from the whole JUNO electronics group. The manuscript was prepared by a subgroup of authors based on the work done by all the JUNO electronics groups. Finally, all the authors reviewed and discussed the results, read and approved the submitted version of the manuscript.

## Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

# Acknowledgments

Part of this work has been supported by the Italian-Chinese collaborative research program jointly funded by the Italian Ministry of Foreign Affairs and International Cooperation (MAECI) and the National Natural Science Foundation of China (NSFC). We also acknowledge the support by the Deutsche Forschungsgemeinschaft, DFG, Germany, FG 2319 and of the F.R.S-FNRS funding agency (Belgium).

#### References

- [1] F. An, et al., J. Phys. G: Nucl. Part. Phys. 43 (2016) 030401.
- [2] T. Adam, et al., JUNO Conceptual Design Report, 2015, arXiv:1508.07166.

## Nuclear Inst. and Methods in Physics Research, A 985 (2021) 164600

- [3] M. He, Double calorimetry system in JUNO, in: Proceedings To the IV International Conference on Technology and Instrumentation in Particle Physics, TIPP2017, arXiv:1706.08761.
- [4] N. Agafonova, et al., Phys. Rev. Lett. 115 (2015) 121802.
- [5] T. Adam, et al., Nucl. Instrum. Methods A 577 (2007) 523.
- [6] G. Alimonti, et al., Nucl. Instrum. Methods A 600 (2009) 568.
- [7] A. Suzuki, Eur. Phys. J. C 74 (2014) 3094.
- [8] L.J. Wen, et al., Nucl. Instrum. Methods A 947 (2019) 162766, arXiv:1903.12595.
- [9] FPGA Mezzanine Card (FMC) is an ANSI/VITA (VMEbus International Trade Association) 57.1 standard.
- [10] JUNO electronics group, paper, in preparation.
- [11] C. Ghabrous Larrea, et al., J. Instrum. 10 (2015) C02019.
- [12] Timing, Trigger and Control (TTC) Systems for the LHC, http://ttc.web.cern.ch/ttc/.

- [13] D. Pedretti, et al., IEEE Trans. Nucl. Sci. 66 (2019) 1151, arXiv:1806.04586v2.
- [14] Alvi Clark, Lui Bielich, Xilinx virtual cable running on zynq-7000 using the petalinux tools, 2015, Xilinx Application Note XAPP1251 (v1.0).
- [15] Power Over Ethernet, IEEE 802.3af and 802.3at standards.
- [16] ReliaSoft Corporation, Life Data Analysis Reference, Vol. 1, 2015, p. 5.
- [17] MIL-HDBK-217F, Reliability Prediction of Electronic Equipment, U.S. Department of Defense, 1991 (Notice 2, 1995).
- [18] FIDES group, Reliability methodology for electronic systems snls235h, 2009, p. 9, https://www.fides-reliability.org/.
- [19] J.J. Marin, R.W. Pollard, Experience report on the FIDES reliability prediction method, in: Annual Reliability and Maintainability Symposium, 2005. Proceedings, 2005.
- [20] Z. Ning, private communications.