Promoted TLOG message about incomplete TriggerRecords to ers::error #477

Merged
bieryAtFnal merged 2 commits into develop from kbiery/incomplete_TR_error_message
Feb 13, 2026

@bieryAtFnal (Collaborator)

Description

At the SWI&T meeting on 10-Feb-2026, we talked briefly about modifying the TRBModule code so that it issues an error or warning message, including the numbers of expected and present Fragments, whenever an incomplete TriggerRecord is sent to the DataWriter.

This PR implements that change.

Some notes:

  • I made the upgraded message an error message (instead of just a warning) because missing fragments seem to be truly an error. If others feel differently, we can certainly discuss this.
  • I did not simply add the numbers of expected and present Fragments to the existing error message that complains about a timeout in building a TriggerRecord, mainly because the existing TLOG message is in a different part of the code. It seemed to me that this different part of the code could be called from somewhere other than the code that enforces the timeout, so there seemed (to me) to be value in keeping the two messages separate.

Here are the two messages that are printed out for each timed-out TriggerRecord, when the changes in this PR are included:

trigger id: 2-0/202 generate at: 110676539179764645 timed out
sending incomplete TriggerRecord downstream  (trigger/run_number=2-0/202, 6 of 10 fragments included)
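
The fragment counts in the second message can be pulled out of a log with standard tools. The following is only a minimal sketch, assuming the message keeps exactly the format shown above; the function name `missing_fragments` is hypothetical and is not part of dfmodules.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of dfmodules): reads log lines on stdin and,
# for each "sending incomplete TriggerRecord" message, prints the trigger id
# and how many fragments were missing. Assumes the message format shown above.
missing_fragments() {
  sed -n -E 's/.*sending incomplete TriggerRecord downstream +\(trigger\/run_number=([^,]+), ([0-9]+) of ([0-9]+) fragments included\).*/\1 \2 \3/p' |
  while read -r id present expected; do
    echo "${id}: $((expected - present)) of ${expected} fragments missing"
  done
}

# Sample line copied from the output above.
sample='sending incomplete TriggerRecord downstream  (trigger/run_number=2-0/202, 6 of 10 fragments included)'
printf '%s\n' "$sample" | missing_fragments
```

In a test area, the same function could be fed the run logs instead of the sample line, for example `cat log*.txt | missing_fragments`.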

Here are sample instructions for demonstrating the upgraded message:

DATE_PREFIX=$(date '+%d%b')
TIME_SUFFIX=$(date '+%H%M')

source /cvmfs/dunedaq.opensciencegrid.org/setup_dunedaq.sh
setup_dbt latest
dbt-create -n NFD_DEV_260211_A9 ${DATE_PREFIX}FDDevTest_${TIME_SUFFIX}
cd ${DATE_PREFIX}FDDevTest_${TIME_SUFFIX}/sourcecode

git clone https://github.com/DUNE-DAQ/daqsystemtest.git -b develop
git clone https://github.com/DUNE-DAQ/dfmodules.git -b develop
git clone https://github.com/DUNE-DAQ/fdreadoutlibs.git -b develop
git clone https://github.com/DUNE-DAQ/fdreadoutmodules.git -b develop
git clone https://github.com/DUNE-DAQ/trigger.git -b develop
git clone https://github.com/DUNE-DAQ/hsilibs.git -b develop
cd ..

cd sourcecode/dfmodules/plugins
sed -i 's,TLOG_DEBUG(27),usleep(300000);\n     TLOG_DEBUG(27),' FragmentAggregatorModule.cpp
cd ../../../

dbt-workarea-env
dbt-build -j 12
dbt-workarea-env

daqconf_set_connectivity_service_port local-1x1-config config/daqsystemtest/example-configs.data.xml
daqconf_set_rc_controller_port local-1x1-config config/daqsystemtest/example-configs.data.xml

mkdir -p rundir
cd rundir

drunc-unified-shell ssh-standalone config/daqsystemtest/example-configs.data.xml local-1x1-config ${USER}-local-test boot wait 2 conf wait 2 start --run-number 201 wait 3 enable-triggers wait 10 disable-triggers wait 2 drain-dataflow wait 2 stop-trigger-sources stop scrap terminate

grep -E -i 'error|warning' log*.txt | grep -E 'timed out|sending incomplete'

echo ""
echo -e "\U1F535 \U2705 Note that the existing error messages when the building of a TriggerRecord times out don't include information about the number of fragments received and expected. \U2705 \U1F535"
echo ""
echo ""
sleep 3

cd ../sourcecode/dfmodules
git checkout kbiery/incomplete_TR_error_message
cd ../../

dbt-workarea-env
dbt-build -j 12
dbt-workarea-env

cd rundir

drunc-unified-shell ssh-standalone config/daqsystemtest/example-configs.data.xml local-1x1-config ${USER}-local-test boot wait 2 conf wait 2 start --run-number 202 wait 3 enable-triggers wait 10 disable-triggers wait 2 drain-dataflow wait 2 stop-trigger-sources stop scrap terminate

grep -E -i 'error|warning' log*.txt | grep -E 'timed out|sending incomplete'

echo ""
echo -e "\U1F535 \U2705 Note that the new messages do include information about the number of fragments received and expected. \U2705 \U1F535"
echo ""
echo ""
sleep 3

echo ""
echo -e "\U1f7e1 \U1f7e1 Please be careful using this software area for anything other than this special test. \U1f7e1 \U1f7e1 "
echo -e "\U1f7e1 \U1f7e1 It has been modified to include a very atypical extra delay, and it may produce \U1f7e1 \U1f7e1 "
echo -e "\U1f7e1 \U1f7e1 unexpected results when used for normal testing. \U1f7e1 \U1f7e1 "
echo ""
echo ""
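
The `sed` step in the instructions above is what forces the timeouts: it puts a 300 ms `usleep` call in front of each line containing `TLOG_DEBUG(27)` in FragmentAggregatorModule.cpp. The sketch below replays the same substitution on a scratch file so its effect can be inspected without touching a work area. The C++ line in it is a made-up stand-in, not the actual contents of that file, and the command relies on GNU sed (`-i` without a suffix, `\n` in the replacement).

```shell
#!/usr/bin/env bash
# Scratch-file demonstration; the C++ line below is a made-up stand-in,
# not the real contents of FragmentAggregatorModule.cpp.
tmpfile=$(mktemp)
printf '%s\n' '    TLOG_DEBUG(27) << "example message";' > "$tmpfile"

# Same substitution as in the instructions (GNU sed assumed).
sed -i 's,TLOG_DEBUG(27),usleep(300000);\n     TLOG_DEBUG(27),' "$tmpfile"

# Show the modified text and count the lines that now carry the delay.
cat "$tmpfile"
delayed_lines=$(grep -c 'usleep(300000);' "$tmpfile")
total_lines=$(wc -l < "$tmpfile")
echo "lines with injected delay: ${delayed_lines}"
rm -f "$tmpfile"
```

Because the delay sits in front of every matching debug statement, the modified build is only suitable for this demonstration, as the warning at the end of the instructions says.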

Type of change

  • Optimization (non-breaking change that improves code/performance)

Testing checklist

  • Unit tests pass (e.g. dbt-build --unittest)
  • Minimal system quicktest passes (pytest -s minimal_system_quick_test.py)
  • Full set of integration tests pass (daqsystemtest_integtest_bundle.sh)

@bieryAtFnal requested a review from eflumerf on February 11, 2026 15:57
@bieryAtFnal merged commit a0f5780 into develop on Feb 13, 2026
4 checks passed
@bieryAtFnal deleted the kbiery/incomplete_TR_error_message branch on February 13, 2026 13:59