[DSIP-95][API] Complete the functionality of using dependencies in the complement data #18003

Open
det101 wants to merge 35 commits into apache:dev from det101:DSIP-95

Conversation

Contributor

@det101 det101 commented Feb 27, 2026

Was this PR generated or assisted by AI?

Yes. The unit tests were generated by AI, and the main functional code was AI-assisted.

Purpose of the pull request

close #17748

Brief change log

Backfill (complement data) now supports workflow dependencies: during a backfill run, it identifies the workflow's dependency relationships and recursively triggers the downstream dependent workflows.

Verify this pull request

This pull request is code cleanup without any test coverage.

(or)

This pull request is already covered by existing tests, such as (please describe tests).

(or)

This change added tests and can be verified as follows:

(or)

Pull Request Notice

If your pull request contains an incompatible change, you should also add it to docs/docs/en/guide/upgrade/incompatible.md

Contributor

Copilot AI left a comment

Pull request overview

Implements downstream workflow triggering for complement/backfill runs in the API layer, adding support for “trigger dependent workflows” behavior and accompanying unit tests.

Changes:

  • Implemented doBackfillDependentWorkflow to fetch downstream workflow definitions and trigger backfill runs for them.
  • Added visited-code tracking intended to prevent self/cyclic triggering and duplicate downstream triggers.
  • Added BackfillWorkflowExecutorDelegateTest with basic scenarios for downstream triggering and filtering.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| dolphinscheduler-api/src/main/java/org/apache/dolphinscheduler/api/executor/workflow/BackfillWorkflowExecutorDelegate.java | Adds dependent workflow backfill triggering logic and wiring for lineage + workflow definition lookup. |
| dolphinscheduler-api/src/test/java/org/apache/dolphinscheduler/api/executor/workflow/BackfillWorkflowExecutorDelegateTest.java | Adds unit tests for the new dependent backfill triggering logic (single-hop scenarios). |

Member

@SbloodyS SbloodyS left a comment

You didn't fill in the content according to the PR template. Please fix it.

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.


@det101 det101 force-pushed the DSIP-95 branch 2 times, most recently from ca11eb6 to 1841eaf March 25, 2026 08:51
@det101 det101 requested a review from Copilot March 25, 2026 08:51
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

Comments suppressed due to low confidence (1)

dolphinscheduler-api/src/main/java/org/apache/dolphinscheduler/api/executor/workflow/BackfillWorkflowExecutorDelegate.java:119

  • In parallel mode, expectedParallelismNumber can be 0 (the validator only rejects values < 0). If it is 0, splitDateTime(listDate, expectedParallelismNumber) will divide by zero and throw ArithmeticException. Treat 0 the same as null (default to listDate.size()), or explicitly guard against <= 0 before calling splitDateTime.
        final BackfillWorkflowDTO.BackfillParamsDTO backfillParams = backfillWorkflowDTO.getBackfillParams();
        Integer expectedParallelismNumber = backfillParams.getExpectedParallelismNumber();

        List<ZonedDateTime> listDate = backfillParams.getBackfillDateList();
        if (expectedParallelismNumber != null) {
            expectedParallelismNumber = Math.min(listDate.size(), expectedParallelismNumber);
        } else {
            expectedParallelismNumber = listDate.size();
        }
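
The guard proposed in this comment could look roughly like the sketch below. `ParallelismGuard` and `resolveParallelism` are hypothetical names that do not appear in the PR. The sketch assumes that a null or non-positive value should fall back to the number of backfill dates, as the comment suggests, and additionally assumes that an empty date list should be rejected before any splitting happens.

```java
// Hypothetical helper, not part of the PR: normalizes the requested parallelism
// before the backfill date list is split into chunks.
final class ParallelismGuard {

    private ParallelismGuard() {
    }

    // Returns a chunk count in [1, dateCount].
    // null and values <= 0 fall back to dateCount (assumption taken from the review comment).
    static int resolveParallelism(Integer expected, int dateCount) {
        if (dateCount <= 0) {
            // Assumption: an empty date list is treated as invalid input.
            throw new IllegalArgumentException("backfill date list must not be empty");
        }
        if (expected == null || expected <= 0) {
            return dateCount;
        }
        return Math.min(dateCount, expected);
    }
}
```

With a guard of this kind, the value passed on to the splitting step is never 0, so the division that the comment warns about cannot fail for that reason.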

Comment on lines +413 to +444
public void testDoParallelBackfillWorkflow_ShouldIsolateVisitedCodesAcrossChunks() {
    long upstreamCode = 500L;
    WorkflowDefinition upstreamWorkflow =
            WorkflowDefinition.builder().code(upstreamCode).releaseState(ReleaseState.ONLINE).build();
    List<ZonedDateTime> dates = Arrays.asList(
            ZonedDateTime.parse("2026-02-01T00:00:00Z"),
            ZonedDateTime.parse("2026-02-02T00:00:00Z"),
            ZonedDateTime.parse("2026-02-03T00:00:00Z"));
    BackfillWorkflowDTO.BackfillParamsDTO params = BackfillWorkflowDTO.BackfillParamsDTO.builder()
            .runMode(RunMode.RUN_MODE_PARALLEL)
            .backfillDateList(dates)
            .expectedParallelismNumber(2)
            .backfillDependentMode(ComplementDependentMode.ALL_DEPENDENT)
            .allLevelDependent(true)
            .executionOrder(ExecutionOrder.ASC_ORDER)
            .build();
    BackfillWorkflowDTO dto = BackfillWorkflowDTO.builder()
            .workflowDefinition(upstreamWorkflow)
            .backfillParams(params)
            .build();
    Set<Long> baseVisitedCodes = new HashSet<>(Collections.singleton(upstreamCode));
    List<Set<Long>> visitedSnapshotPerChunk = new java.util.ArrayList<>();

    doAnswer(invocation -> {
        Set<Long> chunkVisited = invocation.getArgument(2);
        visitedSnapshotPerChunk.add(new HashSet<>(chunkVisited));
        chunkVisited.add(9000L + visitedSnapshotPerChunk.size());
        return null;
    }).when(backfillWorkflowExecutorDelegate).doBackfillDependentWorkflowForTesting(any(), any(), any());

    List<Integer> result = backfillWorkflowExecutorDelegate.executeWithVisitedCodes(dto, baseVisitedCodes);

Copilot AI Mar 25, 2026

This test calls executeWithVisitedCodes, which will run the real doBackfillWorkflow and attempt to use registryClient / Clients.withService(IWorkflowControlClient) to contact a master. Since neither is mocked/stubbed in this test, it will fail with NPE or a ServiceException before exercising the visited-codes isolation assertions. Consider refactoring to unit-test the chunk visited-code cloning without invoking the master trigger, or add a test seam/mocking for the backfill trigger step.
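
One way to read the "test seam" suggestion is sketched below, under stated assumptions. `ChunkVisitedIsolation` and `forEachChunk` are hypothetical names, not code from the PR. The per-chunk work is injected as a callback, so a unit test can check that every chunk starts from its own copy of the base visited set without contacting a master.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.BiConsumer;

// Hypothetical sketch, not the PR's code: the per-chunk trigger is a callback,
// which lets a test observe visited-code isolation in isolation from the master.
final class ChunkVisitedIsolation {

    private ChunkVisitedIsolation() {
    }

    // Calls chunkHandler once per chunk, each time with a fresh copy of baseVisited,
    // so additions made while handling one chunk are not seen by the next chunk.
    static <T> void forEachChunk(List<List<T>> chunks,
                                 Set<Long> baseVisited,
                                 BiConsumer<List<T>, Set<Long>> chunkHandler) {
        for (List<T> chunk : chunks) {
            Set<Long> chunkVisited = new HashSet<>(baseVisited);
            chunkHandler.accept(chunk, chunkVisited);
        }
    }
}
```

A test against this shape would pass a recording callback, as the `doAnswer` stub in the quoted test does, and assert that each recorded snapshot equals the base set.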

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.


Member

@SbloodyS SbloodyS left a comment

Please fill in the content according to the PR description template. Many parts of your code were written by AI, but the PR does not state this.

Contributor Author

det101 commented Apr 3, 2026

> Please fill in the content according to the PR description template. Many parts of your code were written by AI, but the PR does not state this.

done

@SbloodyS SbloodyS closed this Apr 7, 2026
@SbloodyS SbloodyS reopened this Apr 7, 2026
SbloodyS previously approved these changes Apr 7, 2026
Member

@SbloodyS SbloodyS left a comment

LGTM

Contributor Author

det101 commented Apr 7, 2026

@caishunfeng Please help review

@SbloodyS SbloodyS requested a review from ruanwenjun April 7, 2026 07:48
Member

@ruanwenjun ruanwenjun left a comment

It would be better to split this into two methods.
First, find all downstream workflows (this method might be useful in other APIs).
Second, trigger them.

Contributor Author

det101 commented Apr 8, 2026

> It would be better to split this into two methods. First, find all downstream workflows (this method might be useful in other APIs). Second, trigger them.

@ruanwenjun Could you confirm whether “all downstream” here means only immediate dependents (same as today, just refactored), or all reachable downstream workflows (transitive closure)? I can go either way once I know which you want.

Split downstream discovery from execution, add transitive resolution coverage, and keep parallel backfill triggering downstream dates by isolating visited state per chunk.

Made-with: Cursor
Contributor Author

det101 commented Apr 8, 2026

> It would be better to split this into two methods. First, find all downstream workflows (this method might be useful in other APIs). Second, trigger them.

I split the backfill dependent logic into resolve downstream + trigger downstream, and fixed the parallel case so downstream workflows no longer miss dates across chunks. @ruanwenjun
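
The resolve/trigger split discussed here can be illustrated with a minimal sketch. All names (`DownstreamBackfillSketch`, `resolveDownstream`, `triggerAll`) are hypothetical and the lineage lookup is reduced to a plain function; the PR's real methods, signatures, and offline-workflow filtering are not reproduced. The sketch assumes the lookup never returns null, and it uses a visited set so that cycles and the root workflow itself are skipped.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;
import java.util.function.LongConsumer;

// Hypothetical sketch of "resolve downstream, then trigger"; not the PR's code.
final class DownstreamBackfillSketch {

    private DownstreamBackfillSketch() {
    }

    // Step 1: collect downstream workflow codes breadth-first.
    // directDownstream stands in for the lineage lookup and must not return null.
    // With allLevels == false only immediate dependents are returned.
    static Set<Long> resolveDownstream(long rootCode,
                                       Function<Long, List<Long>> directDownstream,
                                       boolean allLevels) {
        Set<Long> result = new LinkedHashSet<>(); // keeps discovery order
        Set<Long> visited = new HashSet<>();
        visited.add(rootCode);                    // never re-trigger the root
        Deque<Long> queue = new ArrayDeque<>();
        queue.add(rootCode);
        while (!queue.isEmpty()) {
            long current = queue.poll();
            for (Long child : directDownstream.apply(current)) {
                if (visited.add(child)) {         // false if already seen (cycle or duplicate)
                    result.add(child);
                    if (allLevels) {
                        queue.add(child);
                    }
                }
            }
        }
        return result;
    }

    // Step 2: trigger each resolved workflow; the actual trigger is injected.
    static void triggerAll(Set<Long> codes, LongConsumer trigger) {
        for (long code : codes) {
            trigger.accept(code);
        }
    }
}
```

Keeping resolution free of side effects is what would make it reusable from other API endpoints, which is the motivation given in the review.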

@det101 det101 requested a review from ruanwenjun April 9, 2026 06:30
Member

@ruanwenjun ruanwenjun left a comment

Basically LGTM

boolean filterOfflineWorkflow) {

Set<Long> resultCodes = new LinkedHashSet<>();
Set<Long> expandedCodes = new HashSet<>();
Member

Suggested change
- Set<Long> expandedCodes = new HashSet<>();
+ Set<Long> visitedWorkflowCodes = new HashSet<>();

@ruanwenjun ruanwenjun closed this Apr 14, 2026
@ruanwenjun ruanwenjun reopened this Apr 14, 2026
@sonarqubecloud

Quality Gate failed

Failed conditions
0.0% Coverage on New Code (required ≥ 60%)

See analysis details on SonarQube Cloud

4 participants