Skip to content

[refactor] Centralize batch ingestion job spec constants into SegmentGenerationJobUtils#18413

Open
Akanksha-kedia wants to merge 1 commit intoapache:masterfrom
Akanksha-kedia:centralize-batch-ingestion-job-spec-constants
Open

[refactor] Centralize batch ingestion job spec constants into SegmentGenerationJobUtils#18413
Akanksha-kedia wants to merge 1 commit intoapache:masterfrom
Akanksha-kedia:centralize-batch-ingestion-job-spec-constants

Conversation

@Akanksha-kedia
Copy link
Copy Markdown
Contributor

@Akanksha-kedia Akanksha-kedia commented May 4, 2026

Description

Centralizes duplicated string constants (SEGMENT_GENERATION_JOB_SPEC, DEPENDENCY_JAR_DIR, STAGING_DIR) from HadoopSegmentGenerationJobRunner and SparkSegmentGenerationJobRunner into the shared SegmentGenerationJobUtils class in pinot-batch-ingestion-common.

These three config-key strings were independently declared as private literals in each runner. Because they must stay in sync (they reference the same executionFrameworkSpec.extraConfigs keys), having separate copies creates a risk of silent drift if one runner is updated but the other is not.

Changes Made

  • SegmentGenerationJobUtils.java -- added three public static final String constants: SEGMENT_GENERATION_JOB_SPEC, DEPENDENCY_JAR_DIR, and STAGING_DIR.
  • HadoopSegmentGenerationJobRunner.java -- removed the private DEPS_JAR_DIR_FIELD and STAGING_DIR_FIELD literals; replaced all usages with SegmentGenerationJobUtils.DEPENDENCY_JAR_DIR and SegmentGenerationJobUtils.STAGING_DIR. The existing public SEGMENT_GENERATION_JOB_SPEC field is preserved as a delegating alias (= SegmentGenerationJobUtils.SEGMENT_GENERATION_JOB_SPEC) so that any external callers continue to compile without changes.
  • SparkSegmentGenerationJobRunner.java -- removed the private DEPS_JAR_DIR and STAGING_DIR literals; replaced all usages with the shared constants from SegmentGenerationJobUtils.

Backward Compatibility

  • HadoopSegmentGenerationJobRunner.SEGMENT_GENERATION_JOB_SPEC remains a public field with the same value, so downstream code that references it is unaffected.
  • No config key values were changed -- only where the string literals are defined.

Testing Done

  • Verified the project compiles (mvn install -DskipTests on the affected modules)
  • Existing unit tests for pinot-batch-ingestion-hadoop and pinot-batch-ingestion-spark-3 continue to pass
  • Manual inspection confirms all usages now reference the centralized constants

Checklist

  • Code follows project style guidelines
  • Self-review completed
  • No new warnings introduced
  • Backward compatible -- no API/config changes

…obUtils

Move duplicated string constants (SEGMENT_GENERATION_JOB_SPEC,
DEPENDENCY_JAR_DIR, STAGING_DIR) from HadoopSegmentGenerationJobRunner
and SparkSegmentGenerationJobRunner into the shared
SegmentGenerationJobUtils class in pinot-batch-ingestion-common.

This eliminates silent drift between the Hadoop and Spark runners,
where the same config keys were independently declared as private
literals. The Hadoop runner's public SEGMENT_GENERATION_JOB_SPEC
field is retained as a delegating alias for backward compatibility.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented May 4, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 63.51%. Comparing base (fd45173) to head (1712885).

Additional details and impacted files
@@             Coverage Diff              @@
##             master   #18413      +/-   ##
============================================
- Coverage     63.52%   63.51%   -0.01%     
  Complexity     1709     1709              
============================================
  Files          3250     3250              
  Lines        198949   198949              
  Branches      30826    30826              
============================================
- Hits         126381   126364      -17     
- Misses        62488    62496       +8     
- Partials      10080    10089       +9     
Flag Coverage Δ
custom-integration1 100.00% <ø> (ø)
integration 100.00% <ø> (ø)
integration1 100.00% <ø> (ø)
integration2 0.00% <ø> (ø)
java-21 63.51% <ø> (-0.01%) ⬇️
temurin 63.51% <ø> (-0.01%) ⬇️
unittests 63.51% <ø> (-0.01%) ⬇️
unittests1 55.57% <ø> (-0.01%) ⬇️
unittests2 34.96% <ø> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@Akanksha-kedia
Copy link
Copy Markdown
Contributor Author

@xiangfu0

@Akanksha-kedia
Copy link
Copy Markdown
Contributor Author

@xiangfu0 please review

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants