Fix flaky ChangelogBackendTestCase timeout in assertChangelogAttributesInRootDSE #607

Merged
vharseko merged 2 commits into copilot/fix-inverted-logic-substring-check from copilot/fix-changelog-backend-test-issue
Mar 31, 2026

Copilot AI commented Mar 31, 2026

searchInChangeNumberModeOnOneSuffixMultipleTimes was intermittently failing with firstchangenumber: expected "1" but was "0" because change number assignment is asynchronous and the 3-second retry window in assertChangelogAttributesInRootDSE was too short on slower runners (notably macOS).

Change

  • ChangelogBackendTestCase.java: Increase TestTimer.maxSleep in assertChangelogAttributesInRootDSE from 3 s to 10 s, matching the timeout already used by searchChangelog and other retry loops in the same class.
// Before
TestTimer timer = new TestTimer.Builder()
    .maxSleep(3, SECONDS)
    .sleepTimes(100, MILLISECONDS)
    .toTimer();

// After
TestTimer timer = new TestTimer.Builder()
    .maxSleep(10, SECONDS)
    .sleepTimes(100, MILLISECONDS)
    .toTimer();
Original prompt

Problem

The CI build on PR #599 is failing with a test failure in ChangelogBackendTestCase.searchInChangeNumberModeOnOneSuffixMultipleTimes. The failure is a timing/race condition unrelated to PR #599's changes (which fix AttributeValuePasswordValidator).

Failing assertion

ChangelogBackendTestCase.searchInChangeNumberModeOnOneSuffixMultipleTimes:589
  -> assertChangelogAttributesInRootDSE:861
  -> assertAttributeValue:1431

firstchangenumber: expected "1" but was "0"

The test publishes 4 update messages to the replication server and then immediately asserts that firstchangenumber in the Root DSE equals 1. However, change number assignment is asynchronous — the replication server needs time to process changes. The current 3-second timeout in assertChangelogAttributesInRootDSE is insufficient, especially on macOS runners.
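
The race described above can be reproduced in isolation. The following is a minimal sketch, not the OpenDJ TestTimer source: `PollingTimer` and `AsyncChangeNumberDemo` are hypothetical names, and the background thread only simulates the replication server assigning a change number after a delay.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for a polling retry timer; NOT the OpenDJ TestTimer source. */
final class PollingTimer {
    private final long maxSleepNanos;
    private final long sleepMillis;

    PollingTimer(long maxSleep, TimeUnit maxSleepUnit, long sleepTime, TimeUnit sleepUnit) {
        this.maxSleepNanos = maxSleepUnit.toNanos(maxSleep);
        this.sleepMillis = sleepUnit.toMillis(sleepTime);
    }

    /** Re-runs the check until it stops throwing AssertionError or the window expires. */
    void repeatUntilSuccess(Runnable check) {
        final long deadline = System.nanoTime() + maxSleepNanos;
        while (true) {
            try {
                check.run();
                return;
            } catch (AssertionError e) {
                if (System.nanoTime() - deadline >= 0) {
                    throw e; // window exhausted: surface the last failure
                }
            }
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", ie);
            }
        }
    }
}

final class AsyncChangeNumberDemo {
    /** Returns true if the simulated firstchangenumber reached 1 within the retry window. */
    static boolean observedWithin(long assignDelayMillis, long maxSleepMillis) {
        final AtomicLong firstChangeNumber = new AtomicLong(0);
        // Simulates asynchronous assignment: the value flips from 0 to 1 after a delay.
        Thread assigner = new Thread(() -> {
            try {
                Thread.sleep(assignDelayMillis);
                firstChangeNumber.set(1);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        assigner.setDaemon(true);
        assigner.start();

        PollingTimer timer =
            new PollingTimer(maxSleepMillis, TimeUnit.MILLISECONDS, 10, TimeUnit.MILLISECONDS);
        try {
            timer.repeatUntilSuccess(() -> {
                long actual = firstChangeNumber.get();
                if (actual != 1) {
                    throw new AssertionError(
                        "firstchangenumber: expected \"1\" but was \"" + actual + "\"");
                }
            });
            return true;
        } catch (AssertionError e) {
            return false;
        } finally {
            assigner.interrupt();
        }
    }
}
```

With these assumptions, `AsyncChangeNumberDemo.observedWithin(300, 50)` should return false because the retry window closes before the simulated assignment happens, while `observedWithin(50, 2000)` should return true. The assertion itself is identical in both calls; only the window differs, which mirrors the 3-second versus 10-second situation in the test.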

Root Cause

In opendj-server-legacy/src/test/java/org/opends/server/backends/ChangelogBackendTestCase.java, the method assertChangelogAttributesInRootDSE (line 854) uses a TestTimer with only a 3-second maxSleep:

TestTimer timer = new TestTimer.Builder()
    .maxSleep(3, SECONDS)       // <-- too short
    .sleepTimes(100, MILLISECONDS)
    .toTimer();

Other similar retry loops in the same test class (e.g., searchChangelog at line 1124) already use 10 seconds. The 3-second timeout is inconsistent and causes flaky failures.
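
To put the two windows in perspective, the rough retry budget can be computed from the builder arguments. This is only a back-of-the-envelope sketch: `RetryBudget` is a hypothetical helper, and it assumes each check takes negligible time, which may not hold for a real Root DSE search.

```java
import java.util.concurrent.TimeUnit;

final class RetryBudget {
    /** Upper bound on the number of sleeps that fit in the window, ignoring time spent in the check. */
    static long maxSleeps(long maxSleep, TimeUnit maxSleepUnit, long sleepTime, TimeUnit sleepUnit) {
        return maxSleepUnit.toNanos(maxSleep) / sleepUnit.toNanos(sleepTime);
    }
}
```

Under that assumption, a 3-second window with 100 ms sleeps allows at most 30 sleeps between attempts, and a 10-second window allows at most 100. A passing run still returns as soon as the assertion succeeds, so the longer window should only add wall-clock time to runs that would otherwise have failed.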

Fix

In opendj-server-legacy/src/test/java/org/opends/server/backends/ChangelogBackendTestCase.java, change line 858 in the assertChangelogAttributesInRootDSE method from:

.maxSleep(3, SECONDS)

to:

.maxSleep(10, SECONDS)

This aligns the timeout with the rest of the test infrastructure in this class and gives the replication server sufficient time to assign change numbers before asserting.

The following is the prior conversation context from the user's chat exploration (may be truncated):

User: Please find a solution for this failing job. Use the logs, job definition, and any referenced files where the failure occurred. Keep your response focused on the solution and include code suggestions when appropriate.

Assistant:

I'll start by fetching the job logs and the workflow definition to understand the failure.

Assistant:

Failing Job Analysis

The build is failing because of one genuine test failure; the PgSQL skips are expected and are not the cause of the build failure.

Root Cause

The failing test is in ChangelogBackendTestCase.searchInChangeNumberModeOnOneSuffixMultipleTimes at line 589, which calls assertChangelogAttributesInRootDSE (line 861), which in turn calls assertAttributeValue (line 1431):

ChangelogBackendTestCase.searchInChangeNumberModeOnOneSuffixMultipleTimes:589
  -> assertChangelogAttributesInRootDSE:861
  -> assertAttributeValue:1431

firstchangenumber: expected "1" but was "0"

What's Happening

In the test method searchInChangeNumberModeOnOneSuffixMultipleTimes (line 581), the very first call at line 589 is:

assertChangelogAttributesInRootDSE(1, 4);

This asserts that firstchangenumber in the Root DSE equals 1. However, the Root DSE is returning 0 instead.

Looking at assertChangelogAttributesInRootDSE (line 854), there's a TestTimer that retries for up to 3 seconds (line 858). The change number assignment is asynchronous — the replication server needs time to assign change numbers after update messages are published. In this case, the 3-second timeout is insufficient; the firstchangenumber attribute hasn't been updated from 0 to 1 before the timer expires.

Solution

Increase the timeout in assertChangelogAttributesInRootDSE to give the replication server more time to assign change numbers. Change line 858 from 3 seconds to a longer duration (e.g., 10 seconds):

// File: opendj-server-legacy/src/test/java/org/opends/server/backends/ChangelogBackendTestCase.java
// Line 854-860

  private List<SearchResultEntry> assertChangelogAttributesInRootDSE(
      final int expectedFirstChangeNumber, final int expectedLastChangeNumber) throws Exception
  {
    TestTimer timer = new TestTimer.Builder()
-     .maxSleep(3, SECONDS)
+     .maxSleep(10, SECONDS)
      .sleepTimes(100, MILLISECONDS)
      .toTimer();

This is a timing/race condition issue. The test at line 583 sets purge delay to 0:

replicationServer.getChangelogDB().setPurgeDelay(0);

Then publishes 4 update messages and immediately expects the change numbers to be assigned. On macOS runners (where this job ran), the...

This pull request was created from Copilot chat.

Copilot AI changed the title from "[WIP] Fix timing issue in ChangelogBackendTestCase test" to "Fix flaky ChangelogBackendTestCase timeout in assertChangelogAttributesInRootDSE" Mar 31, 2026
Copilot AI requested a review from vharseko March 31, 2026 15:31
@vharseko vharseko marked this pull request as ready for review March 31, 2026 16:25
@vharseko vharseko merged commit 8f999cf into copilot/fix-inverted-logic-substring-check Mar 31, 2026
10 of 12 checks passed
@vharseko vharseko deleted the copilot/fix-changelog-backend-test-issue branch March 31, 2026 16:26
