Refactor control of Source in EventProcessor #50843

Open
Dr15Jones wants to merge 17 commits into cms-sw:master from Dr15Jones:refactorEventProcessor

@Dr15Jones
Contributor

Dr15Jones commented Apr 30, 2026

PR description:

  • moved the details of handling the Source into its own dedicated class, SourceCoordinator
  • less code now runs within the Source's serial task queue
  • simplified some of the scheduling
  • Run and Lumi merging now happens within the routine that reads the initial transition
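
As a rough illustration of the first two bullets, and not the code in this PR: the sketch below keeps only the source read inside a serial queue and defers everything else to continuations that run outside it. `SerialQueue`, `ToySource`, `ToySourceCoordinator`, `Transition`, and every member function shown are hypothetical names invented for the example; the real `SourceCoordinator` interface is not shown in this conversation, and the sketch is single-threaded for brevity.

```cpp
#include <cstddef>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical transition kinds a source can hand out.
enum class Transition { Run, Lumi, Event, Stop };

// Hypothetical stand-in for a serial task queue: tasks run one at a time, in
// push order. Not thread-safe; a real queue would synchronize access.
class SerialQueue {
public:
  void push(std::function<void()> task) {
    tasks_.push(std::move(task));
    if (draining_) return;  // a task pushed from inside a task waits its turn
    draining_ = true;
    while (!tasks_.empty()) {
      auto next = std::move(tasks_.front());
      tasks_.pop();
      next();
    }
    draining_ = false;
  }

private:
  std::queue<std::function<void()>> tasks_;
  bool draining_ = false;
};

// Hypothetical source that returns a fixed list of transitions, then Stop.
struct ToySource {
  std::vector<Transition> items;
  std::size_t next = 0;
  Transition readNext() { return next < items.size() ? items[next++] : Transition::Stop; }
};

// Hypothetical coordinator: the only place that touches the source queue.
class ToySourceCoordinator {
public:
  explicit ToySourceCoordinator(ToySource source) : source_(std::move(source)) {}

  // Only the read happens on the queue; the handler is deferred.
  void readNextAsync(std::function<void(Transition)> handler) {
    queue_.push([this, handler = std::move(handler)]() mutable {
      const Transition t = source_.readNext();
      continuations_.push_back([handler = std::move(handler), t]() { handler(t); });
    });
  }

  // Runs the deferred handlers outside the source queue.
  void runContinuations() {
    std::vector<std::function<void()>> ready;
    ready.swap(continuations_);
    for (auto& c : ready) c();
  }

private:
  ToySource source_;
  SerialQueue queue_;
  std::vector<std::function<void()>> continuations_;
};
```

In this sketch a caller requests reads with `readNextAsync` and later calls `runContinuations`; handlers see the transitions in read order, followed by `Transition::Stop` once the list is exhausted. In a multi-threaded framework the continuations would presumably be scheduled as independent tasks rather than collected in a vector.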

PR validation:

Code compiles, all framework unit tests pass.

resolves cms-sw/framework-team#2190

Dr15Jones and others added 13 commits April 17, 2026 10:04
Throw exception in endJob not during destructor.
Co-authored-by: Copilot <copilot@github.com>
- moved looper setup out of source queue
- improved function names

Co-authored-by: Copilot <copilot@github.com>
This isolates all calls to the source queue to be in dedicated functions.

Co-authored-by: Copilot <copilot@github.com>
Co-authored-by: Copilot <copilot@github.com>
Co-authored-by: Copilot <copilot@github.com>
Co-authored-by: Copilot <copilot@github.com>
Co-authored-by: Copilot <copilot@github.com>
- move error handling out of source queue
- merge run/lumi before calling looper

Co-authored-by: Copilot <copilot@github.com>
@cmsbuild
Contributor

cmsbuild commented Apr 30, 2026

cms-bot internal usage

@cmsbuild
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-50843/49191

@cmsbuild
Contributor

A new Pull Request was created by @Dr15Jones for master.

It involves the following packages:

  • FWCore/Framework (core)
  • FWCore/Services (core)

@Dr15Jones, @cmsbuild, @makortel, @smuzaffar can you please review it and eventually sign? Thanks.
@fwyzard, @makortel, @wddgit this is something you requested to watch as well.
@ftenchini, @mandrenguyen, @sextonkennedy you are the release manager for this.

cms-bot commands are listed here

@Dr15Jones
Contributor Author

please test

@cmsbuild
Contributor

-1

Failed Tests: ClangBuild
Size: This PR adds an extra 136KB to repository
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-23ba8e/52990/summary.html
COMMIT: 749e367
CMSSW: CMSSW_17_0_X_2026-04-30-1100/el8_amd64_gcc13
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/50843/52990/install.sh to create a dev area with all the needed externals and cmssw changes.

Failed Clang Build

I found a compilation warning while trying to compile with clang. Command used:

USER_CUDA_FLAGS='--expt-relaxed-constexpr' USER_CXXFLAGS='-Wno-register -fsyntax-only' /usr/bin/time -v scram build -k -j 32 COMPILER='llvm compile'

See details on the summary page.

@cmsbuild
Contributor

cmsbuild commented May 1, 2026

Pull request #50843 was updated. @Dr15Jones, @cmsbuild, @makortel, @smuzaffar can you please check and sign again.

@Dr15Jones
Contributor Author

please test

@cmsbuild
Contributor

cmsbuild commented May 1, 2026

-1

Failed Tests: UnitTests
Size: This PR adds an extra 64KB to repository
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-23ba8e/53000/summary.html
COMMIT: 930683a
CMSSW: CMSSW_17_0_X_2026-05-01-1600/el8_amd64_gcc13
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/50843/53000/install.sh to create a dev area with all the needed externals and cmssw changes.

Failed Unit Tests

I found 1 error in the following unit tests:

---> test test_PedeConversion had ERRORS

Comparison Summary

There are some workflows for which there are errors in the baseline:
1000.0 step 2
1001.0 step 2
101.0 step 1
10224.0 step 1
11634.0 step 1
12434.0 step 1
12834.0 step 1
12846.0 step 1
13034.0 step 1
1306.0 step 1
13234.0 step 1
1330.0 step 1
135.4 step 1
136.731 step 2
136.793 step 2
136.874 step 2
139.001 step 2
140.56 step 2
14034.0 step 1
14234.0 step 1
16834.0 step 1
17034.0 step 1
18434.0 step 1
18634.0 step 1
2022.0010001 step 2
2023.0020001 step 2
2024.0000001 step 2
2024.0010001 step 2
2024.0020001 step 2
2024.0030001 step 2
2024.0040001 step 2
2024.0050001 step 2
2024.0060001 step 2
2024.0070001 step 2
2025.0000002 step 2
2025.0010001 step 2
25.0 step 1
2500.3001 step 2
250202.181 step 1
25202.0 step 1
312.0 step 1
34434.0 step 1
34434.75 step 1
34434.911 step 1
34496.0 step 1
34500.0 step 1
34634.999 step 1
4.22 step 2
4.53 step 2
5.1 step 1
7.3 step 1
8.0 step 1
9.0 step 1
The results of the comparisons for these workflows could be incomplete.
This most likely means that the IB has errors in the relvals. The error does NOT come from this pull request.

Summary:

  • You potentially added 21760 lines to the logs
  • Reco comparison results: 0 differences found in the comparisons
  • DQMHistoTests: Total files compared: 1
  • DQMHistoTests: Total histograms compared: 0
  • DQMHistoTests: Total failures: 0
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 0
  • DQMHistoTests: Total skipped: 0
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0 KiB( 0 files compared)
  • Checked 74 log files, 0 edm output root files, 1 DQM output files

@Dr15Jones
Contributor Author

please test

@cmsbuild
Contributor

cmsbuild commented May 2, 2026

-1

Failed Tests: UnitTests
Size: This PR adds an extra 64KB to repository
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-23ba8e/53018/summary.html
COMMIT: 930683a
CMSSW: CMSSW_17_0_X_2026-05-02-1100/el8_amd64_gcc13
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/50843/53018/install.sh to create a dev area with all the needed externals and cmssw changes.

Failed Unit Tests

I found 1 error in the following unit tests:

---> test test_PedeConversion had ERRORS

Comparison Summary

Summary:

  • You potentially added 2 lines to the logs
  • Reco comparison results: 5 differences found in the comparisons
  • DQMHistoTests: Total files compared: 53
  • DQMHistoTests: Total histograms compared: 4187168
  • DQMHistoTests: Total failures: 46
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 4187102
  • DQMHistoTests: Total skipped: 20
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 52 files compared)
  • Checked 227 log files, 197 edm output root files, 53 DQM output files
  • TriggerResults: found differences in 1 / 51 workflows

@Dr15Jones
Contributor Author

The failing unit test has the following output:

%MSG-i Alignment:  (NoModuleName) BzeroReferenceTrajectoryFactory()  02-May-2026 15:54:10 CEST pre-events
mass: 0.105658
momentum: 10
%MSG
%MSG-i Alignment:  (NoModuleName) MillePedeAlignmentAlgorithm()  02-May-2026 15:54:10 CEST pre-events
Start in mode 'pedeRead' with output directory ''.
%MSG
%MSG-i Alignment:  (NoModuleName) AlignmentProducer::AlignmentProducer()  02-May-2026 15:54:10 CEST pre-events

%MSG
%MSG-i Alignment:  AlignmentProducer:AlignmentProducer@callESModule  AlignmentProducer::produceTracker() 02-May-2026 15:54:11 CEST  PostGlobalBeginRun

%MSG
%MSG-i Alignment:   GEMGeometryESModule:GEMGeometryAlignmentProducerAsAnalyzer@callESModule  GeometryAligner::applyAlignments() 02-May-2026 15:54:11 CEST PostGlobalBeginRun
Starting to apply alignments.
%MSG
%MSG-i Alignment:   GEMGeometryESModule:GEMGeometryAlignmentProducerAsAnalyzer@callESModule  GeometryAligner::applyAlignments() 02-May-2026 15:54:11 CEST PostGlobalBeginRun
Finished to apply 1422 alignments with 0 non-zero APE.

while in the IB the same unit test has:

%MSG
%MSG-i Alignment:  (NoModuleName) MillePedeAlignmentAlgorithm()  02-May-2026 12:33:22 CEST pre-events
Start in mode 'pedeRead' with output directory ''.
%MSG
%MSG-i Alignment:  (NoModuleName) AlignmentProducer::AlignmentProducer()  02-May-2026 12:33:22 CEST pre-events

%MSG
%MSG-i Alignment:  AlignmentProducer:AlignmentProducer@callESModule  AlignmentProducer::produceTracker() 02-May-2026 12:33:23 CEST  PostBeginProcessBlock

%MSG
%MSG-i Alignment:   GEMGeometryESModule:GEMGeometryAlignmentProducerAsAnalyzer@callESModule  GeometryAligner::applyAlignments() 02-May-2026 12:33:23 CEST  PostBeginProcessBlock
Starting to apply alignments.
%MSG
%MSG-i Alignment:   GEMGeometryESModule:GEMGeometryAlignmentProducerAsAnalyzer@callESModule  GeometryAligner::applyAlignments() 02-May-2026 12:33:23 CEST  PostBeginProcessBlock
Finished to apply 1422 alignments with 0 non-zero APE.

Note that in the IB these messages are logged in the begin ProcessBlock transition (PostBeginProcessBlock), while in the PR they are logged in global begin run (PostGlobalBeginRun).
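
To make this kind of comparison less manual, the sketch below pulls the trailing context token from each `%MSG-` header line so the two logs can be compared transition by transition. It assumes the context is the last whitespace-separated token of the header, as it is in the excerpts above; `messageContext` is a hypothetical helper name, not part of the MessageLogger.

```cpp
#include <optional>
#include <sstream>
#include <string>

// Returns the last whitespace-separated token of a "%MSG-" header line
// (assumed here to be the transition context, e.g. "PostGlobalBeginRun"),
// or std::nullopt for any line that is not such a header.
std::optional<std::string> messageContext(const std::string& line) {
  if (line.rfind("%MSG-", 0) != 0) {
    return std::nullopt;  // message body or closing "%MSG" line
  }
  std::istringstream in(line);
  std::string token;
  std::string last;
  while (in >> token) {
    last = token;
  }
  return last;
}
```

Applying this to the same header in both logs would give `PostBeginProcessBlock` for the IB and `PostGlobalBeginRun` for the PR, which is the difference pointed out above.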

@Dr15Jones
Contributor Author

More tellingly, the IB has:

%MSG-i Alignment:   DTGeometryESModule:DTGeometryAlignmentProducerAsAnalyzer@callESModule  GeometryAligner::applyAlignments() 02-May-2026 12:33:23 CEST  PostBeginProcessBlock
Finished to apply 3650 alignments with 0 non-zero APE.
%MSG
%MSG-i Alignment:  @finishedCallESModule AlignmentProducer::beginOfJob()  02-May-2026 12:33:23 CEST PostBeginProcessBlock

%MSG
%MSG-i Alignment:  @finishedCallESModule  AlignmentProducerBase::initAlignmentAlgorithm() 02-May-2026 12:33:23 CEST  PostBeginProcessBlock
Begin
%MSG
%MSG-i Alignment:  @finishedCallESModule GeometryAligner::applyAlignments()  02-May-2026 12:33:23 CEST PostBeginProcessBlock
Starting to apply alignments.
...
%MSG
%MSG-i Alignment:  @finishedCallESModule AlignmentProducer::startingNewLoop()  02-May-2026 12:33:23 CEST PostBeginProcessBlock
Starting loop number 0
%MSG
%MSG-i Alignment:  PostGlobalBeginRun AlignmentProducerBase::beginRunImpl()  02-May-2026 12:33:23 CEST PostGlobalBeginRun
EventSetup-Record changed.
%MSG
Begin processing the 1st record. Run 1, Event 1, LumiSection 1 on stream 0 at 02-May-2026 12:33:23.580 CEST

while the PR has:

%MSG-i Alignment:   DTGeometryESModule:DTGeometryAlignmentProducerAsAnalyzer@callESModule  GeometryAligner::applyAlignments() 02-May-2026 15:54:11 CEST PostGlobalBeginRun
Finished to apply 3650 alignments with 0 non-zero APE.
%MSG
%MSG-i Alignment:  @finishedCallESModule AlignmentProducerBase::beginRunImpl()  02-May-2026 15:54:11 CEST PostGlobalBeginRun
EventSetup-Record changed.
%MSG
Begin processing the 1st record. Run 1, Event 1, LumiSection 1 on stream 0 at 02-May-2026 15:54:11.360 CEST

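So it looks like beginJob for AlignmentProducerBase is not being called: the AlignmentProducer::beginOfJob(), initAlignmentAlgorithm(), and startingNewLoop() messages that appear in the IB log are missing from the PR log.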

@Dr15Jones
Contributor Author

please test

@cmsbuild
Contributor

cmsbuild commented May 2, 2026

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-50843/49215

@cmsbuild
Contributor

cmsbuild commented May 2, 2026

Pull request #50843 was updated. @Dr15Jones, @makortel, @smuzaffar can you please check and sign again.

@cmsbuild
Contributor

cmsbuild commented May 2, 2026

+1

Size: This PR adds an extra 44KB to repository
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-23ba8e/53021/summary.html
COMMIT: 19d682c
CMSSW: CMSSW_17_0_X_2026-05-02-1100/el8_amd64_gcc13
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/50843/53021/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • You potentially added 3 lines to the logs
  • Reco comparison results: 0 differences found in the comparisons
  • DQMHistoTests: Total files compared: 53
  • DQMHistoTests: Total histograms compared: 4187168
  • DQMHistoTests: Total failures: 20
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 4187128
  • DQMHistoTests: Total skipped: 20
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 52 files compared)
  • Checked 227 log files, 197 edm output root files, 53 DQM output files
  • TriggerResults: no differences found

Development

Successfully merging this pull request may close these issues.

Refactor scheduling details in CMSSW

2 participants