The VULOM-based trigger logic replaces several crates of NIM and CAMAC electronics used for trigger decisions, counting and dead-time locking, and opens up many new possibilities for handling pending and calibration triggers. As the TRIDI modules need similar kinds of logic, and could in fact serve as small local trigger systems for stand-alone test operations, the same code is used for both, with minor tweaks for the different inputs and outputs. All logic runs on a single 100 MHz clock.
The bold intention is that this firmware should be suitable for (almost) any triggered nuclear physics experiment. If you can think of something useful which it cannot do, but that would make it useful also for you, do not hesitate to contact the author --- customisations handled by general extensions are of interest.
The layout of the VME register address space depends on the configuration of the VHDL code, i.e. the number of various resources implemented. To make use easy, the locations of all registers are calculated at compile time, and corresponding C structures that can be used for addressing are generated as trlo_defs.h. (The use of a base pointer and offsets is anyhow not an approach recommended by the author.) The version of the code is identified by an md5sum of the entire VHDL source, to avoid incompatibilities between versions.
The TRLO firmware consists of several parts:
Any destination can use any source.
There is also the possibility to connect any module output to receive from any input without clocking, i.e. without clocking jitter.
This replaces the previous chain of alignment delays, LMUs and trigger boxes.
Each detector signal is connected to one of the 16 VULOM ECL inputs (or the 8 ECL and 8 NIM TRIDI inputs) and is clocked.
In the figure below, ..... denotes clock boundaries, i.e. FPGA pipe-line stages. Note that none of the diverted signals is time critical; they are therefore separated by clocking at the diversion.
ECL-INPUT (1-16)
.....  |  .....
DELAY-LINE (alignment, shift register)
       |
       +--> SOFTSCOPE (for alignment)
       |
STRETCHER (countdown, restartable)
       |
       +--> SCALER (before LMU, own l.e. detect)
       |
+-- LMU INPUT --+
|               |---< LMU AUX IN (from general logics)
+- LMU OUTPUTS -+
.....  |  .....
       +--> SCALER (after LMU / before dead-time)
       |
       +--> OR (to avoid inhibit release at trailing trigger)
       |
DEADTIME VETO (inhibit signal from trigger state machine)
       |
L.E. DETECT (make single-cycle pulses)
       |
       +--> SCALER (after dead-time)
       |
REDUCTION (counter + mux of selected factor 2^n)
       |
       +--> SCALER (after reduction)
       |
       +--> TPAT (to trigger state machine)
       |
OR (any accepted trigger makes an accepted event)
       |
STRETCHER (countdown)
.....  |  .....
MASTER START (stretched, selectable output (via mask))
Note that the number of LMU outputs = number of TPAT bits need not be the same as the number of LMU inputs.
Note that any signal that makes it past the dead-time veto, i.e. possibly to the master start (provided the downscale is right), *is* an accepted trigger. The trigger state machine arranges this by letting enough cycles pass to ensure that all such signals can combine into the recorded TPAT. This is also carefully respected in the case of internal (pending) triggers.
By applying the leading-edge detect (which turns long multi-clock signals into single-clock pulses) before the scaler before DT, all remaining logics, in particular scalers and the reduction counter, can work directly on the propagated information.
By running the fast-path at a relatively slow clock (100 MHz) instead of something faster it is possible to combine the major part of the operations into two pipe-line stages, instead of many more which then also would have incurred more overhead for each clock cycle (flip-flop sample-and-hold) and be more uneven in the usage of the cycles.
The logic matrix is modelled after the LeCroy 2365 CAMAC module. Each output j is independent. For each output, it can be selected which inputs shall affect the result and how, according to:
out(j) = reg_not(j) XOR OR_OVER_ALL_i ( (reg_and(i,j) AND in(i)) OR (reg_not_and(i,j) AND NOT in(i)) )
The setup is done via the control registers trig_lmu_not (reg_not) and trig_lmu_[n]and (reg_and and reg_not_and). The above formula can produce a simple OR function by setting reg_not and reg_not_and to 0 and letting reg_and select which inputs to include. To achieve the functionality of a coincidence/anti-coincidence unit (as is usually wanted for triggering), set reg_not to 1, reg_not_and to 1 for all signals required in the coincidence and reg_and to 1 for all signals that shall be absent (veto). This achieves the desired function by generating a 1 from the OR_OVER_ALL_i whenever some signal is in a state that shall prevent the final output, i.e. either a veto is present, or a required coincidence is missing. Input signals having 0 for both reg_not_and and reg_and are ignored (no-care). Setting both reg_not_and and reg_and is a useless setting. Unused outputs must have reg_not set to 0, since otherwise, if no input is selected by either reg_and or reg_not_and, the output will give a continuous signal.
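The formula and the two recipes above (plain OR, and coincidence with veto) can be checked with a small software model. This is only a sketch: the function name and the way the per-output bits are passed as plain masks are illustrative assumptions, not the register layout generated in trlo_defs.h.

```c
#include <assert.h>
#include <stdint.h>

/* Software model of one LMU output j (illustrative, not firmware code).
 *   in       : bit i = current state of LMU input i
 *   reg_and  : bit i = reg_and(i,j)
 *   reg_nand : bit i = reg_not_and(i,j)
 *   reg_not  : reg_not(j), 0 or 1
 */
static int lmu_output(uint32_t in, uint32_t reg_and, uint32_t reg_nand,
                      int reg_not)
{
    assert(reg_not == 0 || reg_not == 1);
    /* OR over all i of (reg_and AND in) OR (reg_not_and AND NOT in) */
    int any = ((reg_and & in) | (reg_nand & ~in)) != 0;
    return reg_not ^ any;
}
```

With reg_not = 1, reg_nand = 0x3 and reg_and = 0x4 the model describes "inputs 0 and 1 in coincidence, input 2 as veto". With reg_not = 1 and both masks zero it returns 1 for every input pattern, which is the continuous output the text warns about for unused outputs.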
Below, i loops over the LMU inputs, j over the LMU outputs, k over the mux inputs and l over the aux inputs.
|trig_delay[i]||The distance between the read and write pointers of the delay RAM to multiplex into the stretcher. Due to implementation details, the minimum value 0 corresponds to a delay of 3 cycles.|
|trig_delay_mode[i]|| Use the direct input instead of the delay-line
When two copies are needed with different delays, the previous input can be used before the delay instead of the current (TRLO_TRIG_INPUT_MODE_xxx):
It is selectable if the stretcher only restarts at the leading edge, or as long as the pulse is active (TRLO_TRIG_RESTART_MODE_xxx):
|trig_stretch[i]||Length of stretched signal. Should be longer than the coincidence/trigger acceptance window.|
|trig_mux[k]||Select module input for multiplexed trigger input, i.e. one of TRLO_MUX_SRC_xxx that marks a real module input.|
|trig_mux_delay[k]||Analogous to trig_delay for the multiplexed inputs.|
|trig_mux_delay_mode[k]||Analogous to trig_delay_mode. The THIS and PREV flags do however not apply.|
|trig_mux_stretch[k]||Analogous to trig_stretch.|
| One bit (1 << i) in each register for each i -> j
connection, together with 1 in bit j of
trig_lmu_not gives [and,nand]:
[0,0] - don't care,
Together with 0 in bit j of trig_lmu_not:
[0,0] - no output,
|Analogous, for the multiplexed inputs, i.e. bits (1 << k)|
|Analogous, for the auxiliary inputs, i.e. bits (1 << l).|
|trig_lmu_not||j bits of inversion for the outputs. With bit (1 << j) set, output j constructs an AND condition (coincidence) instead of an OR condition. A zero bit must be used to disable an output which has no coincidence requirements at all.|
|tpat_enable||Bitmask of triggers (j) to enable.|
|trig_red[j]||Select the downscale factor 2^n.|
These scalers (32-bits) are latched on each event:
|before_lmu[i]||Pulses before the lmu.|
|before_lmu_mux[i]||Pulses before the lmu, multiplexed inputs.|
|before_lmu_aux[i]||Pulses before the lmu, auxiliary inputs.|
Write access to these registers executes actions:
|trig_pending||Set pending triggers for the bit-pattern written.|
|trig_clear_pending||Remove pending triggers for the bit-pattern.|
|pulse|| Actions as per (TRLO_PULSE_xxx):
TRIG_SCALER_RESET - reset the trigger scalers.
TRIG_SCALER_LATCH - latch the trigger scalers.
MASTER_START - fire one master start pulse. (Useful to generate gates for special triggers, e.g. pedestal determination.)
Several control signals are provided to and from the multiplexer.
To the fast-path:
From the fast-path:
The data acquisition cycle control is in charge of the system dead-time, which is both produced internally (quickly in response to accepted triggers to prevent multi-triggering) and externally (from TRIVA, and busy from converter modules).
Once the DAQ is in running mode, it is normally in the IDLE state. When a trigger has managed to pass through the fast_path and survived downscale, it becomes an accepted trigger. The accepted trigger makes the trigger state machine go into the (programmable) coincidence acceptance window, during which further TPATs are accepted. As the window reaches the end, the internal dead-time (inhibit) is activated to prevent fast_path from letting further TPATs through. When using multiple TPATs, the acceptance window should be long enough to allow for any timing jitter present between different coincidence conditions that may be related to the same event. Inhibit will then be on until we reach IDLE mode again.
After no further TPATs can arrive (a few cycles after the acceptance window closure are allowed for, to let the inhibit propagate to the trigger state machine TPAT input), the mapping from received TPATs to TRIVA trigger numbers (1-15) is done. Then the highest trigger is selected (priority) and also encoded, to be sent to the TRIVA (send time is 10 cycles, i.e. 100 ns). Note that in most cases the priority encoding should be thought of as a clear rule for tie-breaking in the rare cases of multiple triggers being generated within the acceptance window, rather than as a statement that some triggers are more important.
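The mapping and priority selection can be modelled in a few lines. This is a sketch under two assumptions: the mapping table corresponds to the tpat_trig[j] control registers, and "highest trigger" means the numerically largest trigger number. The function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Model of TRIG_SELECT + PRIO_ENCODE (illustrative only).
 *   tpat      : bit j set = TPAT j was accepted within the window
 *   tpat_trig : trigger number (0-15) provoked by TPAT j, 0 = none
 *   n_tpat    : number of TPAT bits in use (at most 32)
 * Returns the selected trigger number, or 0 if no trigger is provoked.
 */
static unsigned select_trigger(uint32_t tpat, const uint8_t *tpat_trig,
                               unsigned n_tpat)
{
    assert(n_tpat <= 32);
    unsigned best = 0;
    for (unsigned j = 0; j < n_tpat; j++) {
        if ((tpat >> j) & 1u) {
            unsigned t = tpat_trig[j] & 0xfu;
            if (t > best)
                best = t;   /* assumed: larger number wins the tie-break */
        }
    }
    return best;
}
```

A return value of 0 corresponds to the multi-event case, where no trigger is sent to the TRIVA.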
To allow the TRIVA (or equivalent) time to deliver its dead-time, an internal (programmable) busy counter is started. Once this reaches the end, the WAIT_TRIVA state is entered, where it stays until the external dead-time (from the TRIVA) is released. Upon release of the dead-time, it is checked that busy inputs also are clear and that no trailing end of a trigger is present at the LMU output. After this, IDLE state is again active. The start-up mode for the state machine is WAIT_TRIVA.
                   LOOP  INH
IDLE                x    off   Wait for TPAT (or pending trigger, see below).
  |
START_WINDOW             off   Load counter for coincidence window.
  |
WINDOW              x    off   Countdown of coincidence window.
  |
END_WINDOW               on    Last call, no further orders.
  |
TRIG_SELECT              on    All TPATs arrived, do TPAT -> TRIG mapping.
  |
PRIO_ENCODE              on    Select priority trigger, encode it.
  |
START_SEND_TRIG          on    Load counter for send timer. Pulse accepted.
  |
SEND_TRIG           x    on    Send the selected trigger and accepted TPATs.
  |
BUSY_START               on    Load counter for internal fast dead-time.
  |
BUSY                x    on    Internal fast dead-time.
  |
WAIT_TRIVA          x    on    Wait for the DAQ (TRIVA) to finish processing.
  |
TRIVA_DONE          x    on    Check for pending triggers while inhibit is
  |                            still active.
  |                            Wait for release of any converter module busy.
  |                            Wait for absence of trailing triggers from LMU.
  |
(IDLE)                  (off)
For multi-event operation, note that it is allowed for a TPAT to map to TRIVA trigger 0, i.e. in this case the TRIVA will not see a trigger at all and only the internal dead-time as well as busy signals wired from the converter modules will hinder the system from reaching IDLE mode again.
Note that the busy from multi-event modules must be wired to the busy input and not to the dead-time input, as the (TRIVA) dead-time will prevent pending triggers from being treated, while the busy input will allow pending triggers to be processed. And the pending trigger is what would be used if a module with an IRQ signals that it is full and therefore needs readout. (Such full modules are expected to issue continuous busy signals until at least one event is read out.)
When releasing the inhibit (i.e. allowing triggers to pass from the LMU to the reduction counters of the fast_path), there is a chance that an event just happened some while ago (such that its coincidence signals are still present).
If it was very recently, i.e. only a few tens of ns ago, and if the leading-edge detect of fast_path were before the inhibit veto, we would lose any TPAT bits that correspond to LMU outputs that were then vetoed, but later ones would survive. I.e. we would get a wrong TPAT. Thus, leading-edge before inhibit veto is not good.
Likewise, if the trigger was longer ago, but the leading-edge detect of fast_path were after the inhibit veto, we would in this case possibly also lose TPAT bits, for those with shorter coincidences. Moreover, this event will likely fail to record any good times, as the master start will be generated unusually late. Ergo, leading-edge after inhibit veto is also not good.
To remedy this, the trigger state machine will not pass from TRIVA_DONE to IDLE mode (i.e. release the inhibit) while any signal is still active from the LMU. Incomplete triggers are thus ignored by extended dead-time. As this affects different kinds of events equally, it does not affect the ratios of collected events. To get complete TPATs in the case where a new trigger comes just the cycle after the inhibit was released, the leading-edge detect is placed after the inhibit veto. This allows these signals (that then are delayed by at most 1-2 cycles) to also be completely recorded.
Note: this has the consequence that if any incoming fast_path detector signal is so noisy that it is constantly on, _and_ if it uses the TRLO_TRIG_DELAY_MODE_WHEN_PRESENT mode, it can completely block the acquisition by never allowing the inhibit to be released. (The display and VME registers can be used to detect such situations.)
The exception to the normal control cycle is the occurrence of pending triggers, either software or hardware generated. A pending trigger is a request to generate a particular trigger, which will not go away until that particular trigger has been generated and accepted by the priority encoder. There are two ways of reaching these triggering states, depending on if the system was IDLE when the request was made.
If non-idle, i.e. already processing a trigger, then a check for pending triggers is made after the TRIVA has released its dead-time, but before we have released the inhibit to the fast_path, i.e. there is no chance for any TPATs to be accepted.
TRIVA_DONE          on    Check for pending triggers while inhibit is
  |                       still on.
  |
PULSE_SELECT        on    Triggers requested are those that are pending.
  |
(PRIO_ENCODE)      (on)   Select priority trigger, as usual.
If the system is in IDLE mode as the trigger arrives, there exists a chance of a real (detector) trigger arriving and sneaking through the fast-path while the pending trigger is being accepted. The issue is that if it is accepted by the fast_path, a master start will have been generated, which may not be compatible with the wishes of the pending trigger. This is handled by realising that the pending trigger can wait until the accepted trigger has been dealt with the normal way. This is done with states resembling those that follow the closure of the normal coincidence window. If a TPAT is seen, control is transferred to the normal closing procedure, and the ordinary trigger is handled.
A pulsed trigger is another way of making a trigger, but it will only be accepted if the system is IDLE while the pulse happens. This is also subject to the safety measures preventing spurious master starts. Note: pulsed triggers are currently disabled. No use for them appeared so far.
IDLE                  off   Pending (or pulsed) trigger is checked for.
  |
PENDING_PULSE_TRIG    on    Wait in case TPAT sneaked through.
  |                         (if it sneaked, go to TRIG_SELECT)
  |
PULSE_SELECT          on    Triggers requested are those that are pending
  |                         or pulsed.
  |
(PRIO_ENCODE)        (on)   Select priority trigger, as usual.
If the TRIVA suddenly starts DT, the best we can do is to go to WAIT_TRIVA state and wait for the DT to go off. Even though we could check for this condition also in START_WINDOW and WINDOW, it will not prevent spurious triggers from leaking through in those cases... So only the cases that will anyhow not (soon) end up in WAIT_TRIVA must do this check, i.e. IDLE and TRIVA_DONE.
It is ensured that all master start signals that are issued are accompanied by a taken trigger and thus also an accept pulse. For that reason, if a master start is issued at the same time as a suddenly appearing deadtime or busy signal, the system will linger for one cycle in SUDDEN_DT or SUDDEN_BUSY and go to START_WINDOW instead of WAIT_TRIVA.
IDLE                off   Sudden deadtime and busy is checked for.
  |
SUDDEN_DT or              Wait one cycle in case TPAT sneaked through.
SUDDEN_BUSY         on    (if it sneaked, go to START_WINDOW)
  |
WAIT_TRIVA          on    Wait for the DAQ (TRIVA) to finish processing.
Note that in both cases, if the DAQ or acquisition modules actually are busy, the thus generated trigger may not be completely processed. Such errors should be caught by the readout programs, and errors mean that either the deadtime or busy cabling is leaky.
This also applies to the 'stop acquisition' software generated trigger 15 event. Using also the GO bit from the TRIVA would not help. Even if the acquisition software as the first action of 'stop acq' removes the GO bit and then continuously holds either DT or !GO, during the setting of !GO, we may generate a spurious master start... Therefore, in order to have a completely leak-free system, it would be necessary to send the stop acq trigger 15 via us... No other way.
This just shows that one and only one point must act as the hour-glass waist in terms of fan-in and fan-out of the system dead-time, in analogy to what also applies to the master start for common timing, being distributed to all systems. As far as the master start goes, the output of the TRLO is not that point! (Since e.g. the TCAL module start is fanned-in at a later point.)
The time (63 bits) and TPAT/trigger/count (i.e. contents of trig_tpat_cnt, see below) are for every trigger stored as three 32-bit words into a multi-entry buffer. The first word contains the low 32 bits of the time. The second holds the next 31 bits together with an overflow marker in the highest bit. The third word contains the TPAT/trigger/count. The buffer has 512 entries and thus can store information of up to 170 triggers. If it becomes full, and trigger information thereby cannot be stored and is lost, the following stored trigger will have the high bit of the second word (time hi) set to one. A control word allows to set an 'almost-full' level, at which an output at the multiplexer will go active. This can be connected to a pending trigger to enforce readout, even if using the multi-event (trigger = 0) mode.
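The three-word layout described above can be unpacked as in the following sketch. The struct and function names are hypothetical. The positions of the trigger number (bits 24-27) and the 4-bit event counter (bits 28-31) follow the trig_tpat_cnt description; treating the remaining low 24 bits as the TPAT is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Unpacked form of one multi-trigger buffer entry (illustrative). */
struct multi_trig_entry {
    uint64_t time;      /* 63-bit time stamp */
    int      overflow;  /* 1 = trigger information was lost before this entry */
    uint32_t tpat;      /* low 24 bits of the third word (assumed TPAT field) */
    unsigned trig;      /* encoded trigger number, bits 24-27 */
    unsigned count;     /* 4-bit event counter, bits 28-31 */
};

/* w[0] = time lo, w[1] = overflow marker + time hi, w[2] = TPAT/trigger/count */
static struct multi_trig_entry decode_multi_trig(const uint32_t w[3])
{
    assert(w != NULL);
    struct multi_trig_entry e;
    e.time     = ((uint64_t)(w[1] & 0x7fffffffu) << 32) | w[0];
    e.overflow = (int)(w[1] >> 31);
    e.tpat     = w[2] & 0x00ffffffu;
    e.trig     = (w[2] >> 24) & 0xfu;
    e.count    = (w[2] >> 28) & 0xfu;
    return e;
}
```

Since each trigger occupies three words, a 512-word buffer holds 512 / 3 = 170 complete entries, which matches the figure quoted above.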
Reading from any address within the output array will provide the next value from the FIFO. A status word with the number of 32-bit words (not triggers) left must be consulted before reading, as there is no unique no-valid-data marker (0x5a5aa5a5 will however be delivered). The status word also has a 16-bit XOR checksum (XOR of low and high 16-bit word parts) of the data currently in the buffer. After reading the number of data words stated, the checksum read together with the word count should match the data. If the checksum is non-zero when no more data words are available, then the internal memory array has suffered a bit-flip.
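A readout routine can recompute the checksum over the words it has read and compare it with the value from the status word. The sketch below covers only the XOR folding as described above; how the word count and checksum are packed into the status word is not assumed here, and the function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 16-bit XOR checksum over n 32-bit words: for every word, the low and
 * high 16-bit halves are XORed into the running sum. */
static uint16_t multi_trig_xor16(const uint32_t *words, size_t n)
{
    assert(words != NULL || n == 0);
    uint16_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        uint16_t lo = (uint16_t)(words[i] & 0xffffu);
        uint16_t hi = (uint16_t)(words[i] >> 16);
        sum = (uint16_t)(sum ^ lo ^ hi);
    }
    return sum;
}
```

Because XOR is its own inverse, folding in the same words twice gives zero. That is why an empty buffer is expected to report a zero checksum, and a non-zero value with no data left points to a bit-flip in the memory.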
|tpat_trig[j]||For each tpat (j) determine which trigger (i) (if any) it provokes. Use 0 for multi-event mode.|
|max_multi_trig||Limit the number of events that do not produce a trigger (tpat_trig[j] = 0). 0 = disable. Note that any trigger resets the internal counter, i.e. readout must be performed on all triggers.|
|multi_trigger||Fire this trigger upon reaching max_multi_trig.|
|accept_window_len||Length of the coincidence acceptance window, 10 ns units.|
|fast_busy_len||Length of the internally generated dead-time, 10 ns units.|
|trig_control|| The global control register has two bits telling
if an accepted trigger should set the internal
dead-time and/or busy flip-flops. Intended for
testing purposes (TRLO_TRIG_CONTROL_xxx):
The trigger scalers can be reset after latching each accepted event:
ACCEPT_RESETS_TRIG_SCALER - Reset each event.
|trig_time||Latched timing counter of the current accepted event, lo and hi 32-bit word in  and .|
|trig_tpat_cnt||TPAT (i.e. LMU out, reduction accepted pattern) of current accepted event. The encoded trigger number is stored in bits 24-27 and a 4-bit event counter in bits 28-31.|
|trig_count||The 32-bit event counter.|
|trig_checksum|| Checksum of the previous two words, useful to
verify correct VME transfer. Use only when in
dead-time, as the contents otherwise may change
while the words are read. To also detect
double-errors on individual data lines, it is
trig_checksum = ((trig_tpat_cnt >> 1) | (trig_tpat_cnt << 31)) ^ ((trig_count >> 2) | (trig_count << 30));
|trig_status|| Bit-mask giving the status of the trigger state
machine, and the reasons why it is where it is
DT_IN - Dead-time input sees signal.
The values describing the STATE above are (TRLO_TRIG_STATUS_STATE_xxx):
IDLE - Idle.
The values describing the REASON above (to have inhibit on) are (TRLO_TRIG_STATUS_REASON_xxx):
IDLE - Idle (no inhibit).
|pending||Bit-mask of triggers still pending.|
|lmu_stuck_in||Technically from fast_path. Bit-mask of the LMU inputs that have been stuck active for more than 100 us.|
|lmu_stuck_out||As above, but for LMU outputs, also see LMU_STUCK_OR above.|
|lmu_enabled_stuck_out||As above, but only including enabled LMU outputs, also see LMU_ENABLED_STUCK_OR above.|
|multi_trig_buf_status||Status of the multi-trigger buffer, number of 32-bit words available and XOR checksum (TRLO_MULTI_TRIG_BUF_STATUS_xxx):|
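The trig_checksum expression given in the register description above is a right-rotation of trig_tpat_cnt by one bit XORed with a right-rotation of trig_count by two bits. A readout program could verify a transfer along the lines of this sketch; the helper names are hypothetical, and the three values are assumed to have been read while in dead-time.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 32-bit value right by n bits, n in 1..31. */
static uint32_t rotr32(uint32_t v, unsigned n)
{
    assert(n >= 1 && n <= 31);
    return (v >> n) | (v << (32u - n));
}

/* Expected trig_checksum for the given register contents. */
static uint32_t trig_checksum_expected(uint32_t trig_tpat_cnt,
                                       uint32_t trig_count)
{
    return rotr32(trig_tpat_cnt, 1) ^ rotr32(trig_count, 2);
}

/* Returns 1 if the three values read over VME are consistent. */
static int trig_transfer_ok(uint32_t trig_tpat_cnt, uint32_t trig_count,
                            uint32_t trig_checksum)
{
    return trig_checksum_expected(trig_tpat_cnt, trig_count) == trig_checksum;
}
```

The different rotation amounts mean that the same data line carries different bit positions of the two words, so a fault on a single line does not cancel out in the XOR.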
Several signals are provided to and from the multiplexer.
To the state machine:
From the state machine:
|ACCEPT_TRIG(i)||Signal with the accepted trigger, only one bit at a time. Signal is 10 cycles long, i.e. 100 ns. (0) indicates a multi-event trigger 0.|
|ENCODED_TRIG(i)||Signal with the encoded accepted trigger, suitable to be sent to the TRIVA. The signals are 10 cycles long.|
|pulse|| Actions as per (TRLO_PULSE_xxx):
MULTI_TRIG_BUF_CLEAR - clear the multi-event buffer
SET_INT_DT - set internal deadtime flip-flop
CLEAR_INT_DT - clear internal deadtime flip-flop
SET_INT_BUSY - set internal busy flip-flop
CLEAR_INT_BUSY - clear internal busy flip-flop
The tracer can provide information about the time-wise relationship between signals, much like an oscilloscope. It runs autonomously: recording is self-triggered by user-defined signal coincidences.
The tracer first stores the input bit-pattern on every clock cycle into a circular buffer provided by a memory block (128 entries x max 24 values). This runs continuously, both before and after a trace-trigger condition is detected. On a trigger, the compactification starts from the beginning of the buffer, i.e. ~128 entries back, and continues for 256 cycles. The trigger condition thus ends up in the middle (minus 2 for technical reasons, i.e. at 126).
The 24 values for the trigger tracer are the trigger signals after the delay and stretcher, before the LMU starting at bit 0. The aux trigger signals are available starting at bit 20.
The information to be read out is compressed by copying only the patterns that change, together with their time information, into a new memory buffer, with the time in the high 8 bits and the bit-pattern in the lower bits. This way, the number of VME transfers is reduced (compared to reading the full non-changing history). The compressed trace always begins with a 31-bit time-stamp corresponding to the time of the trigger condition, marked with a 1 in the highest bit. Then follows the state at time 0, and then the state for each time at which the pattern changed. It ends with a checksum of the data (16-bit XOR of the two halves of the data words, right-rotate-1-xor for each item), marked by all 16 high bits being zero. This distinguishes it from the data, as any data following the first pattern (with time 0) has a non-zero time-stamp.
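A reader of the compacted buffer has to classify words by their position within a trace, since a sample with a time of 128 or more also has its highest bit set. The following sketch shows one way to do that; the enum and function names are hypothetical, and the 24-bit pattern width is taken from the maximum of 24 traced values. The checksum algorithm itself is not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum trace_word_kind {
    TRACE_HEADER,    /* 31-bit time-stamp of the trigger condition */
    TRACE_SAMPLE,    /* 8-bit relative time + bit-pattern */
    TRACE_CHECKSUM,  /* 16-bit checksum, ends the trace */
    TRACE_INVALID    /* first word lacks the marker bit */
};

/* Classify and unpack one word. index is the position of the word within
 * the current trace (0 = first word). */
static enum trace_word_kind decode_trace_word(uint32_t w, unsigned index,
                                              uint32_t *time, uint32_t *value)
{
    assert(time != NULL && value != NULL);
    *time = 0;
    *value = 0;
    if (index == 0) {
        if ((w >> 31) != 1u)
            return TRACE_INVALID;
        *time = w & 0x7fffffffu;
        return TRACE_HEADER;
    }
    /* Only after the time-0 pattern can zero high bits mean "checksum". */
    if (index >= 2 && (w >> 16) == 0u) {
        *value = w & 0xffffu;
        return TRACE_CHECKSUM;
    }
    *time = w >> 24;
    *value = w & 0x00ffffffu;
    return TRACE_SAMPLE;
}
```

After a TRACE_CHECKSUM word the caller would reset index to 0, as the next word in the buffer (if any) starts a new trace.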
The maximum length of a stored trace is 258 32-bit words. The compacted buffer can handle at most 1024 entries. By only starting to store data if it has < 512 items, the check can be done at one place before accepting a new trigger without any risk of overflow during compression. If the full condition is reached, the tracer goes into idle state, and has to be restarted via VME (after some readout or clearing).
Once the compactification copying is done, the tracer can immediately trigger again, as the ring-buffer was filled during compactification. Due to the compactification into a second RAM block, many traces can be collected autonomously.
IDLE              Waiting for start command from VME.
  |
START             Starting. If buffer space insufficient, go to idle.
  |
INIT_FILL         Make sure ring buffer has recent values, by looping
  |               for 128 cycles.
  |
ACTIVE            Wait for coincidence condition.
  |
COINCIDENCE       Coincidence found. Store current time-stamp.
  |
COMPACTING_FIRST  Store first bit-pattern, i.e. value stored into
  |               ring buffer 128 cycles ago.
  |
COMPACTING        For 255 cycles, check if ring buffer value stored
  |               128 cycles ago differs from last bit-pattern written
  |               to output buffer. If so, store.
  |
COMPACTED         Store checksum. Next state is start.
The trigger condition is controlled by setting a number of multiplexers, which will monitor their selected inputs. As each input sees a signal, a local shift-register flag is set, which will persist until cleared. Clearing happens every n cycles, and the shift register has four slots, allowing a maximum coincidence window of 3n to 4n-1, depending on when the flag was set. n is a user parameter, at most settable to 255. Values larger than 31 may cause parts of the coincidence condition to be earlier than the first sample stored.
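The window arithmetic can be written out as below. The 3n and 4n-1 bounds are taken directly from the text. Relating the limit of 31 to the 126 samples stored before the trigger position is only one reading of the text, and the names used are hypothetical.

```c
#include <assert.h>

/* Samples stored before the trigger position, as quoted for the tracer. */
#define TRACER_PRETRIGGER_SAMPLES 126u

struct coinc_window {
    unsigned min_cycles;  /* flag set just before a clear: 3n */
    unsigned max_cycles;  /* flag set just after a clear: 4n-1 */
};

/* Coincidence window bounds for a clear interval of n cycles (1..255). */
static struct coinc_window tracer_coinc_window(unsigned n)
{
    assert(n >= 1 && n <= 255);
    struct coinc_window w = { 3u * n, 4u * n - 1u };
    return w;
}

/* 1 if even the longest window lies within the stored pre-trigger samples. */
static int coinc_window_fits_trace(unsigned n)
{
    return tracer_coinc_window(n).max_cycles <= TRACER_PRETRIGGER_SAMPLES;
}
```

For n = 10 this gives a window of 30 to 39 cycles, i.e. 300 to 390 ns at the 100 MHz clock. With n = 31 the longest window is 123 cycles, while n = 32 gives 127 cycles, which exceeds the 126 stored samples.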
By setting several multiplexers to the same source, the number of required coincidences is reduced (as they become self-coincidences). Each multiplexer can be set as an anti-coincidence requirement by setting its 6th bit. The coincidence shift registers cannot be explicitly cleared, but by setting the clear clock to 1 (or 0), they will be flushed within 4 cycles, i.e. for all practical purposes immediately. (This is much less than the minimum VME access time.)
The only control register holds the 5-bit multiplexer values, the anti-coincidence bits as well as the coincidence timeout power. Writing to it immediately changes the coincidence requirements, but does otherwise not affect any running acquisition. The tracer is started by a VME pulse, and can also be stopped/cleared by another.
For every read-out anywhere in the read-out array, the read-out pointer is advanced by one. One register holds the number of values still available in the buffer, and must be consulted before readout, as there exists no unique buffer-is-empty special value. It also has two bits telling if data collection is active, and if a trace is being compacted, respectively. If the latter is active, the number of available data words will increase, as at least the checksum will be appended. Spurious reads on an empty buffer will not move the internal read pointer.
|tracer_control|| Select trigger signals and condition. Set by
a combination (or) of:
TRLO_TRACER_COINC_MUX(i,no) selects signal no for coincidence mux i
TRLO_TRACER_ANTI_COINC(i) sets anti-coincidence requirement for mux i
TRLO_TRACER_COINC_COUNTER(t) set the coincidence shift-register shift interval to t cycles
|tracer_status||The number of 32-bit data words available in the output buffer, if the tracer is active, and if it is compacting data (TRLO_TRACER_STATUS_xxx):|
The TRLO is also equipped with several other small function blocks. Besides direct triggering, these allow it to perform other experiment-specific decisions and recordings based on (other / related) logic signals. To allow general routing of the signals from the inputs to these functions, and then to the trigger cycle control and vice versa, each consumer can choose whichever signal producer to use as a source for its actions.
This means that each consumer is fed by a multiplexer sourcing all producers. This is implemented by aliasing all producers and consumers as two arrays, respectively. The setup register for each destination tells which source to use (with xxx from the tables below):
mux[TRLO_MUX_DEST_xxx(j)] = TRLO_MUX_SRC_xxx(i);
Additionally, each module output can be directly connected to the signal of any module input, without clocking (latching), in order to transport timing signals. This would be mostly of interest to the TRIDI together with its signal bus. Together with this, the direct input can alternatively be ANDed with the selected clocked output (that can come from any function generator) to allow for logical conditions deciding if the signal should be sent at all. By making sure that the direct non-latched signal comes after the logic decision, the timing would be defined by the direct signal.
Implementation note: the multiplexing unfortunately uses two clock cycles. This is due to the large fan-out required for each source in combination with the rather long multiplexing chains.
To partially overcome the connection latency, the outputs of some functional units (and module inputs) can be directly connected to the input of some (other) functional units (and module outputs). These direct connections go between signal numbers with the same index only. The inputs are or'ed together with any signal chosen via the multiplexer.
|mux[j]|| Tell which source (i) should be used by each
destination (j), i.e. one of TRLO_MUX_SRC_xxx.
j is one of TRLO_MUX_DEST_xxx.
|nonlatched_mux[k]|| Tell which non-clocked module input (l) should
be used by module output (k), i.e. one of
TRLO_MUX_SRC_xxx that marks a real input (not a
k is one of TRLO_MUX_DEST_xxx that marks a real output.
|nonlatched_or[m]||Bitmask telling which unclocked module inputs (1 << l) should be or'ed together to form direct OR m.|
|nonlatched_mode[k]|| Two bits for each output (TRLO_NONLATCHED_MODE_xxx):
The direct input can be either of (TRLO_NONLATCHED_IN_xxx):
|direct_mask[j][i]|| Bitmask telling which signals (indices) of the
destination functional unit (j) shall receive
direct signals from source functional unit (i).
Destination (j) is one of (TRLO_DIRECT_DEST_xxx):
Source (i) is one of (TRLO_DIRECT_SRC_xxx):
Note that the master start can be connected faster (i.e. without the multiplexing delay) with the sum_out_mask control register (see fast_path) to (m)any module output(s).
The logic functions use one or more signals as given by the multiplexer to generate new signals that again can be accessed by the multiplexer. They are described in the following, with their respective control registers (the index i just denotes that there are several of each kind):
Generate a one-clock pulse every period clock cycles. Please use a gate-and-delay generator to make it longer.
To synchronise a pulser to some external periodic event, first set a start time and then set which pulser(s) to restart at that time.
|pend_restart_wait||Bitmask of pulsers waiting for restart.|
|period[i]|| Period in 10 ns steps. (TRLO_PERIOD_xxx):
VALADD - The minimum period (encoded by a 0 value).
|restart_at||Restart selected pulsers at given time (c.f. timing_tick).|
Pseudo-random sequence generator. Uses an LFSR (linear feedback shift register). The two units are independent with periods 2^63-1 and 2^60-1.
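The quoted periods are what a maximal-length LFSR of 63 and 60 bits produces: every non-zero state is visited once, giving 2^n-1 steps. The feedback taps used in the firmware are not stated here, so the sketch below is a generic two-tap Fibonacci LFSR with hypothetical names, demonstrated on small widths where the period can actually be counted.

```c
#include <assert.h>
#include <stdint.h>

/* One step of a two-tap Fibonacci LFSR of the given width (2..63 bits).
 * The new bit is the XOR of the oldest bit (position width) and the bit
 * at position tap, with positions counted from 1 at the newest bit. */
static uint64_t lfsr_step(uint64_t state, unsigned width, unsigned tap)
{
    assert(width >= 2 && width <= 63);
    assert(tap >= 1 && tap < width);
    uint64_t mask = (UINT64_C(1) << width) - 1;
    uint64_t bit = ((state >> (width - 1)) ^ (state >> (tap - 1))) & 1u;
    return ((state << 1) | bit) & mask;
}

/* Count the steps until the start state 1 recurs. Only practical for
 * small widths; a 63-bit register cannot be checked this way. */
static uint64_t lfsr_period(unsigned width, unsigned tap)
{
    uint64_t state = 1, steps = 0;
    do {
        state = lfsr_step(state, width, tap);
        steps++;
    } while (state != 1);
    return steps;
}
```

With width 4 and tap 3 the register cycles through all 15 non-zero states before repeating. The all-zero state is a fixed point and must be avoided as a seed.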
Generate an output signal every n'th (leading edge) pulse.
Independently delay and stretch a signal. The delay-line is implemented as a shift register, i.e. it will not lose pulses.
|stretch[i]||Output signal length.|
|restart_mode[i]||Restart mode of the stretcher (TRLO_RESTART_MODE_xxx):|
A logic matrix behaving the same as the one for the fast_path (see that for details).
| One bit per register for each i -> j connection.
Each output can optionally be passed through a delay- & stretch-unit, like the pure unit described above. This incurs an additional 2 cycle delay due to the delay unit when enabled.
|lmu_stretch[j]||Output signal length.|
|lmu_restart_mode[j]|| Restart mode of the stretcher
Enable the delay- & stretch (TRLO_LMU_RESTART_GATE_xxx):
Convert start- and stop pulses to a long gate. Use to e.g. implement a spill mimic.
As output, one can inspect the state of the flip-flops:
|edge_gate||Bit-mask of the edge-gate generator flip-flops.|
The flip-flops can also be pulsed via VME:
|pulse|| As per (TRLO_PULSE_xxx):
EDGE_GATE_START(i) - start gate for generator i.
EDGE_GATE_START_ALL - convenience bitmask to start all gate generators.
EDGE_GATE_STOP(i) - end gate for generator i.
EDGE_GATE_STOP_ALL - convenience bitmask to stop all gate generators.
Make an OR of selected mux source signals. Useful to e.g. combine many module inputs containing busy-signals from converter modules.
|all_or_mask[i][j]||Mask telling which mux sources should be used for OR signal i. j denotes that several 32-bit registers are needed to cover the full mux source array. It is suggested to use the TRLO_ALL_OR_xxx helper macros.|
Makes an output signal when the sum of selected LMU inputs is greater than or equal to a selected level. The LMU inputs are re-used as inputs, as this unit otherwise would need a lot of multiplexers itself, and it is not likely to be used in many cases.
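The multiplicity condition can be modelled in a few lines. This is only a behavioural sketch; the function and parameter names are made up for illustration and do not correspond to register names.

```python
def multiplicity_coincidence(lmu_inputs, mask, level):
    """Return True when at least `level` of the selected inputs are active.

    lmu_inputs: bit pattern of currently active LMU inputs
    mask:       bit pattern selecting which inputs take part in the sum
    level:      required number of active, selected inputs
    """
    selected = lmu_inputs & mask
    return bin(selected).count("1") >= level


if __name__ == "__main__":
    # Inputs 0, 1 and 3 active; inputs 0-2 selected; need at least two.
    print(multiplicity_coincidence(0b1011, 0b0111, 2))
```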
A second set of coincidence units uses bitmasks of the module inputs.
Each of these outputs can optionally be passed through a delay- & stretch-unit. This incurs an additional 2 cycle delay due to the delay unit when enabled.
|input_coinc_stretch[i]||Output signal length.|
|input_coinc_restart_mode[i]|| Restart mode of the stretcher
Enable the delay- & stretch (TRLO_INPUT_COINC_RESTART_GATE_xxx):
The multiplexer sources and destinations can be pulsed via VME. First set up the respective bit-pattern arrays, then issue a pulse. It is suggested to use the TRLO_ALL_OR_xxx helper macros.
|pulse_mux_src_mask[i]||Bit-mask of TRLO_MUX_SRC_xxx signals to pulse.|
|pulse_mux_dest_mask[i]||Bit-mask of TRLO_MUX_DEST_xxx signals to pulse.|
Count events on the input signal. The scalers can be latched in blocks (common for several scalers) on different signals, or by software.
|scaler_mode[i]|| A combination (OR) of two settings:
What kind of events are counted (TRLO_SCALER_MODE_xxx):
LEADING_EDGE - number of pulses, leading edge
When to latch (TRLO_SCALER_LATCH_xxx):
|SCALER(i)||Signal i to count.|
|SCALER_LATCH(j)|| Signal to reset generic scaler block j.
The latch inputs are common for blocks of TRLO_SCALER_LATCH_BLOCK scalers, i.e. scaler[0] to scaler[TRLO_SCALER_LATCH_BLOCK-1] are latched by the first latch input, and so on.
As output, the scaler delivers a 32-bit latched count value.
The scalers can also be latched and reset via VME pulses:
Even if the scalers can be reset, there is no need to reset the latched values - they will be new whenever latched. Furthermore, it is often easier to never reset scalers, but rather let them run continuously. Differences between two readings are easy to calculate in software.
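A free-running 32-bit scaler wraps around, but the difference between two readings stays correct when computed modulo 2^32 (as long as fewer than 2^32 counts occurred in between). A minimal sketch with a hypothetical helper name:

```python
def scaler_delta(previous, current):
    """Counts between two latched 32-bit readings, tolerant of one wrap-around."""
    return (current - previous) & 0xFFFFFFFF


if __name__ == "__main__":
    print(scaler_delta(5, 12))             # plain case
    print(scaler_delta(0xFFFFFFF0, 0x10))  # counter wrapped between readings
```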
Each multiplexer source (TRLO_MUX_SRC_xxx) is also directly connected to a scaler, counting the number of leading edge pulses. They must be latched via VME before readout. They are intended for debugging or monitoring.
As output, the scaler delivers 32-bit latched values.
Latch the timing count on an event of the selected signal.
The readout consists of two 32-bit values (lo and hi) of the latched timer. (In a previous version, there was a latch count in the 4 high bits (nibble) telling how many times the latch had latched. This was possibly useful to detect missed latches.)
The time stamps (30 bits) of the timer latches are also recorded in individual multi-entry buffers (512 entries). If a buffer becomes full (either due to slow read-out, or discard-on-full, see below), the high bit of the data-word following lost entries is set to one. 60-bit time stamps can also be recorded in two consecutive words, starting with the low 30 bits. Bit 31 is 1 for the hi data words.
A control word allows setting an 'almost-full' level, at which an output at the multiplexer will go active. This can e.g. be connected to a pending trigger to enforce readout. An input from the multiplexer allows selecting a discard-on-full mode, where old entries are discarded, thereby keeping the latest entries instead of the oldest. By some trickery with an edge-to-gate generator, this can be used to discard old entries until a trigger is accepted (preferably with some delay added), to ensure that data around the trigger is kept.
Reading from any address within the output array will provide the next value. A status word with the number of words (not triggers) left must be consulted before reading, as there is no unique no-valid-data word (0x5a5a5a5a will however be delivered).
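For the two-word mode, the 60-bit time stamp can be reassembled in software roughly as sketched below. The sketch assumes that the 30-bit payload sits in bits 0-29 of each word and only checks the bit-31 marker of the hi word; the marker for lost entries is not decoded here, since its exact bit position is not spelled out above. The function name is hypothetical.

```python
MASK30 = (1 << 30) - 1    # 30 payload bits per data word (assumed bits 0-29)
HI_WORD_FLAG = 1 << 31    # bit 31 is 1 for the hi data words


def join_timestamp(lo_word, hi_word):
    """Combine a lo/hi pair of buffer words into one 60-bit time stamp."""
    if not hi_word & HI_WORD_FLAG:
        raise ValueError("second word is not marked as a hi data word")
    return ((hi_word & MASK30) << 30) | (lo_word & MASK30)


if __name__ == "__main__":
    print(hex(join_timestamp(0x12345678, HI_WORD_FLAG | 0x1)))
```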
|multi_latch_control[i]|| Almost-full level of the multi-entry buffer
TWO_WORDS - Record 60-bit time stamps.
|multi_latch_status[i]|| Number of data words available
LOST_WRITE - Data lost due to full buffer. Set until a new entry is written in the buffer.
|multi_latch[i]||Next data word.|
|MULTI_LATCH_DISCARD_OLD(i)||Multiplexer input control signal.|
|MULTI_LATCH_ALM_FULL(i)||Multiplexer output control signal.|
Latch the values of all mux source signals on the leading edge of the selected signal, and store in output registers.
|pattern_latch[i][j]||Latched bit-pattern with all mux sources. j denotes that several 32-bit registers are needed to cover the full mux source array.|
|version_md5sum|| md5sum of the full VHDL code. Available as
(TRLO_MD5SUM_xxx) constant in the C interface
STAMP - 32 low bits of the stamp.
FULL - String with the entire md5sum.
|compile_time||Time of the compile (seconds since 1970-01-01).|
|Parity bits for all setup registers, 32 in each word.|
|timing_tick||Current value of the full-speed 64-bit timing counter, lo and hi 32-bit word in  and , latched via VME (TRLO_PULSE_TIMER_LATCH).|
|deadtime_tick||Count of full-speed timing ticks when deadtime (internal inhibit) was active (cf. the 'timing_tick' output register). Not latched; if using the high word, make sure it did not change while reading the low.|
|pulse|| Action as per (TRLO_PULSE_xxx):
TIMER_RESET - Reset the full-speed timing counter. Note that due to implementation details, if a timer channel has been recently latched (up to and including this cycle), the latched value will be negative. 'Recently' means a few tens of clock cycles (precisely r=2*2^n, where 2^n >= the total number of timing latches). The reset is only expected to be used on startup, and most often not even then.
TIMER_LATCH - Latch the full-speed timing counter.
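For a counter that is not latched, such as deadtime_tick, a common way to obtain a consistent 64-bit value is to read the high word before and after the low word and retry if it changed. The sketch below shows the pattern with two caller-supplied read functions; these are placeholders for whatever register access the readout code uses, not part of the TRLO interface.

```python
def read_counter64(read_lo, read_hi):
    """Read an unlatched 64-bit counter exposed as two 32-bit words.

    read_lo / read_hi are hypothetical callables returning the current
    low and high 32-bit words. The read is repeated until the high word
    is the same before and after the low word was read.
    """
    while True:
        hi = read_hi()
        lo = read_lo()
        if read_hi() == hi:
            return (hi << 32) | lo


if __name__ == "__main__":
    # Simulated roll-over of the low word during the first attempt.
    hi_reads = iter([0, 1, 1, 1])
    lo_reads = iter([0xFFFFFFFF, 0x5])
    print(hex(read_counter64(lambda: next(lo_reads), lambda: next(hi_reads))))
```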
The front-panel LEDs can show the status of any signal from the multiplexer. The last two LEDs are hardwired to show if the serial timestamp and the triva mimic have a lock on their respective serial input bitstreams.
The current value of the local full-speed timing counter can be sent via a serial protocol to other modules. The receiving ends automatically synchronise to the signal and store time-stamps when receiving independent latch signals. The precision of the latched time-stamps depends only on the clock frequency of the receiving module, with a sigma of about 0.35 clock cycles and a worst deviation of 2 clock cycles observed during testing. The serial signal can be transported using 'any' cable means. The top front-panel LED on a VULOM module, or the second from the bottom on a TRIDI module, indicates receiver protocol lock.
Each serial message super-cycle consists of 16 data-cycles, each with 32 symbols of synchronisation pattern and 32 symbols of data payload. The payload is Hamming encoded to enable forward error correction, with a total of 11 payload bits and 5 parity bits. 4 bits are used for the time information and 7 for a set of multiplexed signals. The synchronisation pattern is also used to monitor the signal integrity by counting bad symbols. (The synchronisation pattern is carefully crafted such that there is a margin of 2 symbols until it is equally possible that a suspected wrongly received synchronisation in fact is any other possible pattern sequence including encoded data.)
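The symbol budget of a super-cycle follows directly from these numbers. The sketch below works it out and, assuming that the symbol length set by serial_timestamp_speed (8 * 2^n) is counted in 10 ns clock cycles, also gives the super-cycle duration. The function names are illustrative only.

```python
DATA_CYCLES = 16       # data-cycles per super-cycle
SYNC_SYMBOLS = 32      # synchronisation symbols per data-cycle
PAYLOAD_BITS = 11 + 5  # payload bits plus parity bits
CLOCK_NS = 10          # 100 MHz clock


def payload_symbols():
    """Manchester encoding uses two symbols per encoded bit."""
    return 2 * PAYLOAD_BITS


def super_cycle_symbols():
    return DATA_CYCLES * (SYNC_SYMBOLS + payload_symbols())


def super_cycle_duration_us(speed_n):
    """Duration for symbol length 8 * 2**speed_n clock cycles (unit assumed)."""
    symbol_cycles = 8 * 2 ** speed_n
    return super_cycle_symbols() * symbol_cycles * CLOCK_NS / 1000.0


if __name__ == "__main__":
    print(super_cycle_symbols(), "symbols per super-cycle")
    for n in range(4):
        print("n =", n, "->", super_cycle_duration_us(n), "us")
```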
The serial protocol is designed such that exactly half of the signal symbols are 0 and the other half 1. This is achieved by Manchester encoding of the data and a balanced synchronisation pattern. There are at most two symbols with the same value next to each other; sequences of such pairs are delivered by the synchronisation pattern, as well as sequences of fast-flipping symbols. Both these are used by the receiver when doing its initial frequency search (i.e. symbol length determination), which may take about one second.
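The two properties claimed for the data part (equal numbers of 0 and 1 symbols, and never more than two equal symbols in a row) are general properties of Manchester encoding and can be checked with a small model. Which symbol pair represents a 0 bit and which a 1 bit is not stated here; the mapping below is an arbitrary assumption.

```python
def manchester_encode(bits):
    """Encode each bit as two symbols (assumed mapping: 1 -> 1,0 and 0 -> 0,1)."""
    symbols = []
    for bit in bits:
        symbols.extend((1, 0) if bit else (0, 1))
    return symbols


def longest_run(symbols):
    """Length of the longest run of identical adjacent symbols."""
    longest = 0
    run = 0
    previous = None
    for symbol in symbols:
        run = run + 1 if symbol == previous else 1
        longest = max(longest, run)
        previous = symbol
    return longest


if __name__ == "__main__":
    encoded = manchester_encode([1, 0, 0, 1, 1, 1, 0])
    print(encoded)
    print("ones:", sum(encoded), "of", len(encoded), "longest run:", longest_run(encoded))
```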
After rough frequency lock, the receiver goes into a phase-tracking mode using the 0-1 transitions with slow frequency correction. Symbols are sampled at the middle of each slot. Once the receiver has locked onto the bit-stream, it will stay locked even if a number of transitions should be missing. The phase-tracking essentially works by employing a fractional counter such that it can predict when the next 0-1 transition should occur. Each actual transition leads to a correction of half a receiver clock cycle. When a counter has seen 4 more corrections in one direction than in the other, the frequency is adjusted slightly. When the frequency estimate is spot-on, the phase will just correct slightly back and forth.
The result of the phase tracking is used to provide the low part of the received time whenever a latch is requested. The high part is provided by the time information received every message super-cycle. Time information in adjacent message super-cycles must match for the timing receiver to consider time as good. Super-cycles with known transmission errors are overcome by the phase track counting.
|serial_timestamp_speed||Symbol-length of the serial message. Period = 8 * 2^n, i.e. 8, 16, 32 or 64.|
|serial_timestamp_buf_control||Almost-full level of the multi-trigger buffer (TRLO_MULTI_TRIG_BUF_CONTROL_xxx):|
|serial_timestamp_status|| Number of data words available in bits 0-9.
Bit 15 marks desynchronised serial message
reception. Bit 16 marks bitstream sync, and
bit 17 that bitstream has had sync loss.
Bit 18 marks data pattern sync, and bit 19
that the data pattern has had sync loss.
Bits 20-23 are a counter of bad bits, and
bits 24-31 is a checksum (XOR) of the data
currently in the output buffer.
Bits 17, 19 and 20-23 are cleared by a pulse, see below.
|serial_tstamp|| Next data word from multi-timestamp buffer.
|SERIAL_TSTAMP_IN||Serial message input to receiver (decoder).|
|SERIAL_TSTAMP_LATCH||Latch a timestamp. (Creates 2-word entry in buffer).|
|SERIAL_TSTAMP_ALM_FULL||Timestamp output buffer is almost full.|
|SERIAL_TSTAMP_DESYNC||Signal marking reception desynchronisation.|
|pulse|| Action as per (TRLO_PULSE_xxx):
SERIAL_TSTAMP_BUF_CLEAR - Clear the latched timestamp buffer.
SERIAL_TSTAMP_FAIL_CLEAR - Clear the flip-flops remembering reception desynchronisation and bad bits.
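The bit layout of serial_timestamp_status listed above can be unpacked in software as in the following sketch. The field names in the returned dictionary are invented for readability; only the bit positions are taken from the description.

```python
def decode_serial_timestamp_status(status):
    """Split a 32-bit serial_timestamp_status value into its fields."""
    return {
        "words_available": status & 0x3FF,               # bits 0-9
        "desync": bool(status & (1 << 15)),               # bit 15
        "bitstream_sync": bool(status & (1 << 16)),       # bit 16
        "bitstream_sync_lost": bool(status & (1 << 17)),  # bit 17 (sticky)
        "pattern_sync": bool(status & (1 << 18)),         # bit 18
        "pattern_sync_lost": bool(status & (1 << 19)),    # bit 19 (sticky)
        "bad_bits": (status >> 20) & 0xF,                 # bits 20-23
        "checksum": (status >> 24) & 0xFF,                # bits 24-31
    }


if __name__ == "__main__":
    example = (0xAB << 24) | (3 << 20) | (1 << 18) | (1 << 16) | 5
    for name, value in decode_serial_timestamp_status(example).items():
        print(name, "=", value)
```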
Seven signals can also be multiplexed within the serial message. They are delivered every data-cycle and updated in the multiplexer of the receivers. In case the receiver loses lock of the serial message, the output signals will be output as 0.
15 auxiliary data bits are also transmitted along with the time. They can be used to transport some information from the sender to the receivers, or e.g. be used to identify the sending stream.
The signals are introduced by the sender:
Available at the receiving end:
|SERIAL_SIGNALS_OUT(i)||Multiplexed signal i.|
|serial_timestamp_aux_status|| Value of auxiliary data bits in low bits
0-14, at last end of super-cycle. Bits
16-22 contain the serial signals, at last
end of data-cycle. Bit 23 marks
desynchronised serial message reception.
In order to rather easily share a common time reference with foreign DAQ systems that have no means to use e.g. the serial time protocol, a simple "speaking clock" protocol is implemented.
Provided that the foreign DAQ system is able to locally timestamp a received logical signal, it can receive the periodic signals of the heimtime protocol, and during analysis the common time scale (as provided by the TRLO II) can be recovered.
The protocol consists of two parts. Every 2^19 local clock cycles a pulse is generated. With the local clock of 100 MHz this means every 5.24288 ms (or 190.7 Hz of signals). In order to tell time, for 32 pulses starting every 2^26 ticks (or 128 pulses, or about 0.671 s apart), it delivers two additional pulses. They either have a separation of 0.16384 or 0.65536 ms. The short separation means 0, and the long separation means 1, in a 32-bit time stamp. The 32-bit time-stamp starts at local bit 24.
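The numbers above can be cross-checked with a few lines of arithmetic. The sketch also shows one possible way for an analysis code to turn a measured pulse separation into a bit; placing the decision threshold midway between the two nominal separations is an assumption, not something prescribed by the protocol.

```python
CLOCK_HZ = 100_000_000      # local clock, 100 MHz
PULSE_TICKS = 2 ** 19       # ticks between regular pulses
MESSAGE_TICKS = 2 ** 26     # ticks between starts of time messages
SHORT_MS = 0.16384          # separation meaning 0
LONG_MS = 0.65536           # separation meaning 1


def ticks_to_ms(ticks):
    return ticks / CLOCK_HZ * 1000.0


def classify_separation(separation_ms):
    """Return the bit value for a measured separation (midpoint threshold assumed)."""
    threshold = (SHORT_MS + LONG_MS) / 2
    return 1 if separation_ms > threshold else 0


if __name__ == "__main__":
    print("pulse period [ms]:", ticks_to_ms(PULSE_TICKS))
    print("pulse rate [Hz]:", CLOCK_HZ / PULSE_TICKS)
    print("pulses per message period:", MESSAGE_TICKS // PULSE_TICKS)
    print("message period [s]:", MESSAGE_TICKS / CLOCK_HZ)
```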
For analysis, reception of one full time message would be enough, as the analysis can then keep time by counting the pulses (dead reckoning). It is naturally recommended to continuously verify that the received timestamps match the previous ones.
The reason for having both this and the serial timestamp protocol is that the serial protocol lends itself to easy FPGA decoding and precision following, while this Heimtime protocol allows for rather straightforward handling in analysis, without requiring tremendous amounts of data to be recorded by the foreign DAQ system.
In order to be able to precisely track another time scale, the speed and offset of the local source of the serial and Heimtime protocols can be adjusted. A user-specified value is added to the slewed time counter every clock cycle. Note that the 24 low bits are added below the value being sent. They provide precision control, as their contribution accumulates over many clock cycles. The absolute value of the counter can also be shifted by setting and then adding an offset.
|slew_counter_add||Value to add to the slewed time counter every clock cycle.|
|slew_counter_offset||Value to add to the slew time counter on a pulse.|
|slew_counter|| Action pulse to add the offset, as per
ADD_OFFSET_LO - Add to low 32 (real) bits.
ADD_OFFSET_HI - Add to high 32 bits.
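The effect of the 24 fractional bits can be illustrated with a small fixed-point model. It assumes that an add value of 2^24 corresponds to exactly one sent tick per clock cycle; this is an inference from the statement that the 24 low bits lie below the value being sent, not a documented register value. All names are hypothetical.

```python
FRACTION_BITS = 24  # low bits below the value being sent


def slew_add_value(ppm):
    """Per-cycle add value for a relative speed change given in parts per million.

    Assumes 1 << FRACTION_BITS means nominal speed (one tick per clock cycle).
    """
    return round((1 << FRACTION_BITS) * (1 + ppm * 1e-6))


def sent_time_after(cycles, add_value, start=0):
    """Value of the sent (integer) part after a number of clock cycles."""
    return (start + cycles * add_value) >> FRACTION_BITS


if __name__ == "__main__":
    print(sent_time_after(1_000_000, slew_add_value(0)))
    print(sent_time_after(1_000_000, slew_add_value(100)))
```

One step of the add value changes the speed by 2^-24, i.e. about 0.06 ppm, which is what makes the fine tracking possible.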
When used with the MBS, the encoded trigger from the trigger state machine is connected to the master TRIVA module trigger inputs, and the deadtime from the TRIVA sent back to the trigger state machine. The task of the TRIVA is to notify the controlling processor of each readout event and keep deadtime until the readout process has requested its release. The notification can be either by interrupt or by register polling by the readout program.
In a multi-branch setup, the TRIVA module of each slave crate also forms the interface with the local readout processor. The trigger bus between the TRIVA modules delivers the triggers from the master system and collects the deadtime from all slaves. It also ensures synchronous event-wise operation of the entire multi-branch system.
The TRIMI part of the TRLO II performs the same function as a TRIVA, but within the same module. It has the same register layout, making it immediately compatible with the MBS data acquisition system.
For a multi-branch system, the TRIMI uses a unidirectional serial protocol to distribute triggers and deadtime return by simple on/off signalling: serial trigger link with common deadtime (STL/CDT). The receiving (slave) end automatically synchronises to the (STL) signal. The bottom front-panel LED on a VULOM or TRIDI module indicates receiver protocol lock. The serial link-signal can be transported using 'any' cable means. It may be fanned out directly at the master module, or in a tree-like fashion at slave or other modules. Likewise, the deadtime return signal may be or'ed at the TRIMI module or at earlier points. When the TRIMI module inputs are used to collect the deadtime signals, some specialised monitoring facilities can simplify debugging.
For larger systems, the serial trigger link can be arranged in a tree-like fashion. If a TRIMI module of an intermediate node is used for STL fan-out and CDT fan-in, it can do individual monitoring of the deadtime from the subsystems. It can also be reconfigured as a local master without any recabling. Independently of how the STL is organised as a tree in one or several layers, the MBS can still treat the system as a flat topology.
The typical failure mode (trigger desynchronisation) of a multi-branch TRIMI system is either that the STL is being run with too short a symbol period or that the deadtime is improperly wired. Both cases are detected by the slave systems, which will notice the missing event counters and report trigger mismatch.
The serial protocol is similar to the one used by the serial timestamp distribution. A 32-bit idle pattern is sent continuously, which serves both as help during setup, and allows the receiver to lock onto the signal. At every second symbol, the transmitter may send a special 18-symbol start pattern, indicating that a trigger will follow. The trigger payload is a 4-bit trigger number, a 4-bit event counter and 2 bits of a multi-event counter. It is Hamming encoded to enable forward error correction, leading to a total of 15 bits, or 30 symbols due to the Manchester encoding of the data. Trigger link reset commands, as well as side-band 8-bit messages, are transmitted in the same way, but with a different start pattern.
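To show how a 15-bit Hamming code can correct any single flipped bit, here is a textbook Hamming(15,11) model. The bit ordering, the placement of the parity bits and the packing of trigger number, event counter and multi-event counter into the data word are all assumptions made for this example; the actual layout on the link is not described here.

```python
PARITY_POSITIONS = (1, 2, 4, 8)  # 1-based positions holding parity bits
DATA_POSITIONS = [p for p in range(1, 16) if p not in PARITY_POSITIONS]


def pack_trigger(trigger, event_counter, multi_event):
    """Pack the 10 payload bits into one word (field order is an assumption)."""
    return (trigger & 0xF) | ((event_counter & 0xF) << 4) | ((multi_event & 0x3) << 8)


def hamming15_encode(data):
    """Encode up to 11 data bits into a list of 15 code bits."""
    code = [0] * 16  # index 0 unused, positions 1..15
    for k, position in enumerate(DATA_POSITIONS):
        code[position] = (data >> k) & 1
    for parity in PARITY_POSITIONS:
        for position in range(1, 16):
            if position != parity and position & parity:
                code[parity] ^= code[position]
    return code[1:]


def hamming15_decode(bits):
    """Return (data, syndrome); a non-zero syndrome is the corrected position."""
    code = [0] + list(bits)
    syndrome = 0
    for position in range(1, 16):
        if code[position]:
            syndrome ^= position
    if syndrome:
        code[syndrome] ^= 1  # correct the single flipped bit
    data = 0
    for k, position in enumerate(DATA_POSITIONS):
        data |= code[position] << k
    return data, syndrome


if __name__ == "__main__":
    word = pack_trigger(trigger=0xA, event_counter=0x5, multi_event=0x2)
    damaged = hamming15_encode(word)
    damaged[6] ^= 1  # simulate one transmission error
    print(hamming15_decode(damaged) == (word, 7))
```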
Trigger latency is 18 + 30 symbol periods. The minimum receivable symbol period is 6.75 local clock cycles, and 10 is the usual choice; with a transmitter clock of 100 MHz this corresponds to 4.8 us. While not fully implemented yet, the system is prepared to allow the slave systems to issue the interrupt to the processor after the header word, i.e. after 1.8 us. With long connection distances, signal dispersion may require longer symbol lengths. An alternative is to use optical transceivers, which also prevent grounding issues.
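The latency figures can be reproduced as follows; the function name and parameters are illustrative only.

```python
START_SYMBOLS = 18    # start pattern (header)
PAYLOAD_SYMBOLS = 30  # 15 Hamming-encoded bits, Manchester encoded


def link_time_us(symbols, symbol_cycles=10, clock_mhz=100):
    """Time needed to transmit a number of symbols, in microseconds."""
    return symbols * symbol_cycles / clock_mhz


if __name__ == "__main__":
    print("full trigger message [us]:", link_time_us(START_SYMBOLS + PAYLOAD_SYMBOLS))
    print("header only [us]:", link_time_us(START_SYMBOLS))
```

With a longer symbol period, e.g. 20 clock cycles for a long cable, both numbers scale proportionally.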
|link_status||STL/CDT receiver status.|
|link_serial_in||Select front-panel for STL input signal.|
Output signals may be directed to any front-panel output i, set by the bitmask bit (i%16) in conn_out[i/16]:
Deadtime from slaves can be received from any front-panel input i, set by the bitmask bit (i%16) in dt_in[i/16]:
|advisory|| Enable bitmask for advisory deadtimes. These are
not subject to the checking for missing or
double-pulsing. Useful for deadtimes from
time-stamped 'slave' systems, which legally both
may ignore (miss) triggers, or issue their own
additional local triggers.
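The i/16 and i%16 addressing used for conn_out and dt_in can be sketched as below. A plain Python list stands in for the register array, and the helper names are hypothetical.

```python
def mask_location(i, bits_per_entry=16):
    """Return (array index, bit mask) for front-panel signal number i."""
    return i // bits_per_entry, 1 << (i % bits_per_entry)


def enable_signal(mask_array, i):
    """Set the bit for signal i in a list standing in for the register array."""
    index, bit = mask_location(i)
    mask_array[index] |= bit


if __name__ == "__main__":
    dt_in_enable = [0, 0]  # stand-in for two 16-bit mask entries
    for signal in (0, 3, 19):
        enable_signal(dt_in_enable, signal)
    print([hex(word) for word in dt_in_enable])
```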
Further entries in dt_in[i/16] can be used to diagnose the deadtime return signals:
|current||Current status bitmask of deadtime. Note: all inputs are shown, regardless of the enable/advisory bitmask, to allow checking of systems that are not yet connected (enabled).|
|good||Input has delivered deadtime for this trigger before the fast slave deadtime elapsed. For all inputs, regardless of the enable bitmask. (Reset each trigger).|
The start and end of the global deadtime, as well as the end of the local deadtime, and the end of all enabled deadtime receiver inputs are recorded in a 512-entry buffer.
The front-panel 24x36 LCD-display is used to show information about the status of the fast_path and the trigger state-machine. Note that this information also is available over VME, in many cases also as scalers for much better rate-estimates.
The X-column, read from below, shows double dots for active fast_path LMU inputs; the top four mark the AUX LMU inputs.
The Y-column, from below, shows the active LMU outputs.
The Z-column, from below, shows accepted TPAT bits, i.e. after reduction.
The two A-columns, from below, show accepted triggers, 0-7 in the left column and 8-15 in the right.
Dd. - marks (D) DT from triva input, (d) internal DT flip-flop, (i) internal inhibit (internal fast dead-time), (X) when an active LMU output prevents the trigger state machine from becoming IDLE, or (.) for none.
Bb. - marks (B) busy input, (b) internal busy flip-flop, or (.) if none.
lo. - marks the (1-based hexadecimal) number of the first stuck LMU output that is enabled, or (.) if none. See lmu_enabled_stuck_out.
I. - marks inhibit (trigger state-machine veto) (I) or (.) for none. A skull (☠) is shown when active LMU outputs prevent normal triggering.
A. - shows the accepted trigger number in hexadecimal, or (.) if none.
re - shows the reason for the current trigger, i.e. which path it took in the state-machine.
st - shows the trigger state machine state. (1) is IDLE, (B=11) is WAIT_TRIVA, (C=12) is TRIVA_DONE (i.e. wait for busy release).
VM and VK - are the high byte of the VME address, i.e. module setting. Above these, two rows of bits show the 20 lowest bits of the trigger counter. The leftmost bit of the lower row flips back-and-forth every 2^10 = 1024 events, and the leftmost bit of the upper row every 2^20 events. These bits blinking show if events are being processed and give an idea of the rate.
TM - triva mimic info: (H) if not in go mode, otherwise the current trigger number when in (global) deadtime or (.) if idle. Active local deadtime is shown as a line below the trigger number. To the right, four markers from top to bottom: locked on serial trigger bus signal, slave mode, irq pending and mismatch.
The TCAL and CLOCK triggers are generated by pending triggers, in turn provided by pulsers from the general logics. Different rates in- and off-spill can be selected by using the general LMU. Thus, these triggers no longer eat TPAT/trigger-box entries. By using the pending instead of pulsed trigger input, rates can be guaranteed.
Spill-mimic from the accelerator BOS and EOS signals is handled by the general edge-to-gate conversion. The in/off-spill coincidence signal is provided by the aux input to the fast-path LMU, thereby not consuming a separate input for spill-is-on. BOS and EOS triggers are handled by the pending trigger system to guarantee delivery.
(Post) pile-up rejection is supported by using the general stretcher which can provide the pile-up veto signal to an auxiliary fast-path LMU input.
Monitoring of the per-event begin and end of dead-time is provided by the timer-latches. It can also be measured for individual systems with the TRIDIs.
Multi-event mode is used to make the most of the awkward (REX)-ISOLDE duty cycle.
Clock latches are used to record the times of T1 and T2 (protons on target and REX-pulse, respectively). It is no longer necessary to make a trigger for these events.
(If one still wants to make an EBIS trigger, the delay generators can be used to delay T2 to make the trigger happen after the REX pulse, so that the read-out does not occur during the ion burst.)