Timestamps

Data from multiple systems can be combined using timestamps. This may be necessary when distances are too large to operate hard dead-time domains using e.g. a trigger bus. It can also simplify operations by coupling systems loosely, since restarts and other operations need not be coordinated over the entire setup. This is especially helpful during experiment preparation. (It should be noted that it is still necessary for the master trigger system to respect local busy and dead-time signals, lest many events be globally incomplete.)

The drawback of timestamps is the difficulty of ensuring event synchronisation, i.e. that correlated data can be uniquely assigned as such during analysis. Within a hard dead-time domain, this is strictly ensured already by the hardware trigger bus, and then (redundantly) verified during event-building. Such verification is inherently not possible with timestamps: due to the loose coupling, events intentionally are not guaranteed to have a 1:1 correspondence, so strict checks cannot be employed.

The only sign of malfunctioning timestamp distribution and recording (which is the task of the readout, not drasi itself) is usually trouble in the time-sorter stage, which for most problems simply stalls. Discerning the root cause of such issues is often difficult. Furthermore, shifts and drifts are usually initially small, and thus have no visible effect on the time-sorting, while they may still have severe impact on the analysis.

In short: monitoring is required!

Monitoring timestamp alignment

Drasi provides several means to continuously monitor the timestamp alignment of recorded data. The most precise methods effectively perform a correlation analysis. Since they only consider timestamp differences, they are however less useful for debugging systems with large offsets; that is better done using absolute times.

Precision correlations from all events

The timestamps of all events from all time-sorter sources are fed to a separate analysis thread. The timestamps are already in numeric order, since this is the basic operating principle of the time sorter itself. The assumption behind the analysis is that events from different systems have been recorded in response to common triggers, and thus the timestamp values should be clustered. One system is chosen as reference, and the timestamp stream of each other system is individually associated with the reference system. The difference to the closest reference timestamp is calculated and kept in a short recent-history list (approximately the 2000 most recent differences). By the assumption, if a system records data at the same time as the reference system, these differences will be centred around a certain offset value, with a standard deviation given by the timestamp precision. Various cable lengths etc. contribute to the offset value, whose absolute value is of little importance. The offset should however be stable over time.
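
As an illustration, the association could be sketched as follows (this is not drasi's actual code; 64-bit timestamps are assumed), exploiting that both streams arrive in numeric order, so a single forward-moving cursor suffices:

    /* Sketch: associate each timestamp of a monitored system with the
     * closest timestamp of the reference system, keeping the differences
     * in a short recent-history ring buffer.
     */
    #include <stdint.h>
    #include <stddef.h>

    #define HISTORY 2000          /* approx. number of recent differences */

    typedef struct {
      int64_t diff[HISTORY];      /* signed differences to closest reference */
      size_t  num;                /* number of valid entries */
      size_t  next;               /* ring buffer write index */
    } diff_history;

    static uint64_t dist(uint64_t a, uint64_t b)
    {
      return a > b ? a - b : b - a;
    }

    /* Both streams are in numeric order, so the closest reference
     * timestamp is found by advancing a cursor while it gets closer.
     */
    void record_diff(diff_history *h,
                     const uint64_t *ref, size_t nref, size_t *ref_i,
                     uint64_t ts)
    {
      while (*ref_i + 1 < nref &&
             dist(ref[*ref_i + 1], ts) < dist(ref[*ref_i], ts))
        (*ref_i)++;

      h->diff[h->next] = (int64_t) (ts - ref[*ref_i]);
      h->next = (h->next + 1) % HISTORY;
      if (h->num < HISTORY)
        h->num++;
    }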

For each monitoring update (twice per second), the list of recent differences for each non-reference system is analysed to find the offset and standard deviation. The analysis is able to ignore some amount of random coincidences. The values are reported in the monitoring tree view, together with the number of outliers.
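
A minimal sketch of such an analysis, here using iterative 3-sigma trimming to separate the clustered differences from random coincidences (the actual rejection scheme may differ):

    #include <stdint.h>
    #include <stddef.h>
    #include <math.h>

    typedef struct {
      double offset;              /* centre of the clustered differences */
      double stddev;              /* spread, i.e. the timestamp precision */
      size_t outliers;            /* entries treated as random coincidences */
    } align_stats;

    align_stats analyse(const int64_t *diff, size_t n)
    {
      align_stats s = { 0.0, 0.0, 0 };
      double lo = -HUGE_VAL, hi = HUGE_VAL;

      for (int pass = 0; pass < 3; pass++) {
        double sum = 0.0, sum2 = 0.0, var;
        size_t used = 0;

        for (size_t i = 0; i < n; i++) {
          double d = (double) diff[i];
          if (d < lo || d > hi)
            continue;             /* outside trim window: outlier */
          sum += d;  sum2 += d * d;  used++;
        }
        if (used < 2)
          break;
        s.offset   = sum / used;
        var        = sum2 / used - s.offset * s.offset;
        s.stddev   = var > 0 ? sqrt(var) : 0.0;
        s.outliers = n - used;
        lo = s.offset - 3 * s.stddev;   /* tighten window for next pass */
        hi = s.offset + 3 * s.stddev;
      }
      return s;
    }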

This method can show the timestamp alignment to its full precision (often a few tens of ns).

Clean precision correlations from synchronisation triggers

Due to the presence of random coincidences between auxiliary events in the systems, and the close proximity of timestamps between events within each system at higher event rates, the results of the above analysis are often not clear enough to directly answer whether the timestamp alignment is good.

A much cleaner outcome is obtained by also providing a lower-rate stream of events that are known to occur with at least a certain minimum interval in each system, and analysing it separately.

These lower-rate triggers should have a regularly recurring minimum interval (e.g. be periodic), since a filter before the analysis continuously determines this interval, and does not correlate timestamps with the reference system when the difference is larger than 1/10 of the minimum interval in the reference system. This allows the synchronisation triggers to be randomly missing from any system, including the reference.
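
A sketch of such a filter (names and structure are illustrative): the minimum reference interval is tracked continuously, and only tight correlations are accepted:

    #include <stdint.h>
    #include <stdbool.h>

    typedef struct {
      uint64_t prev_ref;        /* previous reference sync timestamp */
      uint64_t min_interval;    /* smallest reference interval seen so far */
    } sync_filter;

    /* Feed each synchronisation trigger of the reference system. */
    void sync_ref(sync_filter *f, uint64_t ref_ts)
    {
      if (f->prev_ref) {
        uint64_t ival = ref_ts - f->prev_ref;
        if (!f->min_interval || ival < f->min_interval)
          f->min_interval = ival;
      }
      f->prev_ref = ref_ts;
    }

    /* Accept a correlation only if it is tight; a rejected timestamp is
     * most likely paired with a sync trigger missing from the reference.
     */
    bool sync_accept(const sync_filter *f, int64_t diff_to_closest_ref)
    {
      uint64_t mag = diff_to_closest_ref < 0 ?
        (uint64_t) -diff_to_closest_ref : (uint64_t) diff_to_closest_ref;

      return f->min_interval && mag < f->min_interval / 10;
    }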

Alignment verification vs. absolute time (computer clock and NTP)

By noting the local CPU time of a readout node in conjunction with a subset of event timestamps, it is possible to do a linear fit of the timestamp scale vs. the local CPU time. If the local CPU time in turn is globally synchronised (e.g. using NTP), then such measurements provide a relationship between the timestamps recorded in individual systems and an absolute timescale, and the systems can thus be compared using the absolute time as reference. This becomes even easier when the timestamp values themselves are based on absolute time, as is the case for White Rabbit. This can be used to find systems with offsets above a few us (with good NTP servers and discipline) or more. (Note that this method is not precise enough to determine the quality of the timestamps. When timestamp distribution and recording is working well, this monitoring will just report the local CPU clock misalignment vs. NTP.) To handle cases where the local CPU clock is not, or only roughly, NTP-synchronised, drasi can itself query NTP servers to provide a more accurate timescale relationship (see the --ntp option).
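
For illustration, a plain incremental least-squares fit relating the two timescales could look like this (a sketch under the stated assumptions, not drasi's implementation):

    #include <stddef.h>

    typedef struct {
      double sx, sy, sxx, sxy;  /* accumulated sums */
      size_t n;
    } ts_fit;

    /* For numerical stability, cpu_s and ts_s should be taken relative
     * to the first sample (raw timestamps easily exhaust double precision).
     */
    void fit_add(ts_fit *f, double cpu_s, double ts_s)
    {
      f->sx  += cpu_s;          f->sy  += ts_s;
      f->sxx += cpu_s * cpu_s;  f->sxy += cpu_s * ts_s;
      f->n++;
    }

    /* Solve ts = slope * cpu + icept; returns 0 if underdetermined. */
    int fit_solve(const ts_fit *f, double *slope, double *icept)
    {
      double det = (double) f->n * f->sxx - f->sx * f->sx;

      if (f->n < 2 || det == 0.0)
        return 0;
      *slope = ((double) f->n * f->sxy - f->sx * f->sy) / det;
      *icept = (f->sy - *slope * f->sx) / (double) f->n;
      return 1;
    }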

The linear fit of this method has a second purpose: while only a few timestamps are correlated to local CPU time (since CPU time sampling costs about 1 us), all timestamps are verified for basic sanity. Each timestamp shall be larger than the previously reported one, but shall not be much larger than what is expected from the linear fit. When a larger timestamp is recorded, the local CPU time is sampled again, to possibly move the allowance forward. If any of these checks fail, a report is made to alert the user.
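
The per-timestamp checks can be sketched as follows; the margin parameter is illustrative:

    #include <stdint.h>

    enum ts_check { TS_OK, TS_NOT_MONOTONIC, TS_BEYOND_ALLOWANCE };

    /* expected_ts comes from evaluating the linear fit at the current CPU
     * time; margin is the allowed excess (an illustrative parameter).  On
     * TS_BEYOND_ALLOWANCE, the CPU time would first be re-sampled to move
     * the allowance forward, and a report made only if the check still fails.
     */
    enum ts_check check_ts(uint64_t ts, uint64_t prev_ts,
                           double expected_ts, double margin)
    {
      if (ts <= prev_ts)
        return TS_NOT_MONOTONIC;
      if ((double) ts > expected_ts + margin)
        return TS_BEYOND_ALLOWANCE;
      return TS_OK;
    }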

Forgiving time sorter

The drasi time sorter is designed to allow operation even when some sources are missing, not delivering data, or abruptly disconnect and then perhaps become available again. This is an explicit design choice and goal since it simplifies operation, especially during experiment preparation.

The time-sorter can however also be operated such that the presence of all sources is required (check details). This is not recommended.

Unaligned sources

When a source provides timestamps with large offsets, it will affect basic time-sorting operation (see the sketch after this list):

  • If a system reports too large (i.e. too new) timestamps, its data will never be considered for sorting, since data from other systems is always taken first.

    Since the data from this system is not processed, it will eventually clog its data buffers, all the way up to the time sorter. This can however take a long time if the buffers are large, which may lead to a large loss of data.

  • If a system reports too small (i.e. too old) timestamps, then this system will be the only system whose data passes through time sorting, since the other systems have seemingly newer data.

    Data from all other systems will clog up in their data buffers.
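
Both failure modes follow directly from the basic selection rule, as a minimal sketch of it makes plain:

    #include <stdint.h>
    #include <stddef.h>

    typedef struct {
      uint64_t next_ts;     /* timestamp of the pending (unsorted) event */
      int      has_event;   /* pending event available? */
    } ts_source;

    /* Emit from the source with the smallest pending timestamp.  Returns
     * -1 if some source has no pending event: sorting must then stall
     * (see "Missing sources" below).
     */
    int select_source(const ts_source *src, size_t n)
    {
      int best = -1;

      for (size_t i = 0; i < n; i++) {
        if (!src[i].has_event)
          return -1;
        if (best < 0 || src[i].next_ts < src[best].next_ts)
          best = (int) i;
      }
      return best;
    }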

The time-sorter cannot fix or circumvent the above problems, since the issue is wrong timestamps. It does however try to heuristically detect and report when timestamps from different sources differ wildly, such that the user can take appropriate (debug) actions early.

To further aid in this process, the tree view monitor also reports the timestamp offset of the next (unsorted) event in the queue of each time sorter source.

Not implemented yet: additionally, absolute timestamps which are too early or too late compared to the current actual time could be regarded as broken. That would prevent individual faulty timestamps from causing the above issues for other systems.

Missing sources

When a source is missing (e.g. due to failure to establish a connection to it), it will not be considered by the time sorter.

  • A similar situation occurs when a source does not deliver data (but allows connection). The time sorter must have one pending (next, unsorted) event from each source available, in order to choose which (smallest) to process next. The permanent absence of data from one source therefore stalls the entire sorting process. To avoid this, sources which are completely drained and do not deliver any new data for a specific time can be disabled using the ts-disable option, until they provide further data. This is indicated with an ‘X’ marker in the tree view monitor.
  • When a source gives broken timestamps, they are for sorting purposes regarded as 0, i.e. the events are emitted directly (the original timestamp data is still written). This however means that only such sources are sorted, and when they provide no further data, sorting of the others cannot continue, since the next timestamp of this source is not known. This is avoided by ignoring sources which only give broken timestamps during periods when they provide no further events (see the sketch after this list). This is indicated with a ‘B’ (for broken) marker in the tree view monitor.
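
The following sketch illustrates how sources could be taken out of the sorting while they deliver nothing; the actual decision logic in drasi is more involved:

    #include <stdint.h>

    typedef struct {
      int      has_event;       /* pending event available? */
      int      broken_ts;       /* only broken timestamps seen so far? */
      uint64_t last_data_time;  /* when data was last received */
      int      ignored;         /* shown as 'X' (or 'B') in the tree view */
    } src_state;

    void update_participation(src_state *s, uint64_t now, uint64_t ts_disable)
    {
      if (s->has_event) {
        s->ignored = 0;                  /* data again: participate */
        s->last_data_time = now;
      } else if (s->broken_ts ||
                 now - s->last_data_time > ts_disable) {
        s->ignored = 1;                  /* do not stall the other sources */
      }
    }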

Timeout handling

When a source is included to participate in the time sorting (again), it will likely first provide some data with timestamps that are before the time up to which all other (active) sources have already been sorted. That data will therefore be immediately emitted to the output, but out-of-order with respect to the other sources (as the earlier data from those has already been emitted).

If this happens during startup, e.g. when no actual correlations are present yet, it does not matter. If it however happens during data collection, due to some source being (repeatedly) removed from the time sorting after temporarily not delivering data (or delivering obviously wrong timestamps) for too long, then the data, even if eventually recorded, will not be in order. This defeats the purpose of the time sorter:

Caution

Careless disabling (ts-disable, --merge-ts-disable) of timesort sources can lead to loss of data, or to recorded data needing to be re-sorted.

To assist in monitoring that timesort sources are only disabled when they are genuinely malfunctioning, and not just lacking triggers for a while, a sequence of timeout values is employed. Essentially, it is a chain of promises from the original data sources (readouts) all the way to the time sorter that uses the data.

Task           Timeout                 Description
-------------  ----------------------  -----------------------------------------
Readout        Keep-alive              Triggers and events provided by user.
Readout        In-spill stuck timeout  Disable delayed event building.
Merger, EB/TS  Input max ev. interval  Promise from source (check on connect).
Time sorter    Disable timeout         Disable time-sorter source.
Any            Max event interval      Promise to next merger (EB, TS) in chain.

  • Keep-alive triggers/events are typically required in each readout in order to always fulfil a promise to deliver data regularly, even in circumstances when no normal triggers are generated.

  • When delayed event building is employed, data is intentionally not transmitted over the network for a while. This time must be smaller than the promise to any following system (the event builder) of how often data will be delivered.

    To fulfil the promise even if the spill state logic is not working correctly, delayed event building is disabled when suspected to be stuck, controlled by the --inspill-stuck-timeout option.

    (Disabling the delayed event building only impacts performance, by sending data over the network at suboptimal times. If this were not done, the resulting temporary lack of data at a later time sorter would lead to out-of-order sorting when the source is temporarily disabled.)

  • For each merger source connected using the drasi protocol, the merger is upon connection establishment told what promise (if any) the user has made in that system, using the --max-ev-interval option, regarding the maximum time between delivered events.

    If the merger (as time sorter) uses a disable timeout for a source, and/or the merger itself makes a promise to a next stage, the source-reported interval is required to be smaller than any of these, with a margin.

  • A disable timeout (ts-disable) must be smaller, with a margin, than a promise (if any) the merger itself makes to a next stage (--max-ev-interval).

  • The maximum event interval that is reported to any destination will have the network transport flush timeout added.

In summary: when a time-sorter source uses the ts-disable option, all earlier data stages before that source must make and fulfil --max-ev-interval promises. Otherwise, --max-ev-interval is not needed, but can be useful in readout nodes to monitor any keep-alive trigger.
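
As an illustration of these margin requirements, a consistency check along the chain could be sketched as follows (the margin factor and names are assumptions):

    #include <stdio.h>

    #define MARGIN 0.5   /* promises must stay below half of the next limit */

    /* All values in seconds; 0 means "not used / no promise made". */
    int check_timeout_chain(double src_max_ev_interval, /* from source     */
                            double flush_timeout,       /* network flush   */
                            double ts_disable,          /* disable timeout */
                            double own_max_ev_interval) /* own promise     */
    {
      /* The interval reported to a destination includes the flush timeout. */
      double reported = src_max_ev_interval + flush_timeout;
      int ok = 1;

      if (ts_disable > 0 && reported > MARGIN * ts_disable) {
        fprintf(stderr, "source promise %g s too close to ts-disable %g s\n",
                reported, ts_disable);
        ok = 0;
      }
      if (own_max_ev_interval > 0 && ts_disable > 0 &&
          ts_disable > MARGIN * own_max_ev_interval) {
        fprintf(stderr, "ts-disable %g s too close to own promise %g s\n",
                ts_disable, own_max_ev_interval);
        ok = 0;
      }
      if (own_max_ev_interval > 0 && reported > MARGIN * own_max_ev_interval) {
        fprintf(stderr, "source promise %g s too close to own promise %g s\n",
                reported, own_max_ev_interval);
        ok = 0;
      }
      return ok;
    }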