@FerriolCalvet
Collaborator

AI summary

This pull request introduces several important updates to configuration, documentation, and processing scripts, focusing on standardizing parameter names related to subgenic region analysis, improving output organization, and refining process naming for clarity. It also adds new output handling and a process for mutation-specific QC plots.

Parameter and Naming Standardization:

  • Replaces all occurrences of omega_withingene, omega_subgenic_bedfile, omega_autodomains, and omega_autoexons with standardized parameters: create_subgenic_regions, subgenic_bedfile, autodomains, and autoexons across configuration files, documentation, and scripts for consistency. [1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11]

  • Updates process and parameter references in expand_regions and its tests to use the new standardized names. [1] [2] [3]
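For users with existing parameter files, the rename can be sketched as a small migration helper. This is a minimal illustration, not part of the pull request: the function name `migrate_params` is hypothetical, and the old-to-new pairing assumes the two lists in the summary are given in matching order.

```python
# Old-to-new parameter names, paired in the order the summary lists them
# (the pairing is an assumption based on that ordering).
RENAMED_PARAMS = {
    "omega_withingene": "create_subgenic_regions",
    "omega_subgenic_bedfile": "subgenic_bedfile",
    "omega_autodomains": "autodomains",
    "omega_autoexons": "autoexons",
}


def migrate_params(params):
    """Return a copy of `params` with the deprecated keys renamed.

    Raises ValueError if both the old and the new name are present,
    because it is unclear which value should win.
    """
    migrated = {}
    for key, value in params.items():
        new_key = RENAMED_PARAMS.get(key, key)
        if new_key != key and new_key in params:
            raise ValueError(f"both {key!r} and {new_key!r} are set")
        migrated[new_key] = value
    return migrated
```

Keys that are not in the mapping pass through unchanged, so the helper can be run over a whole parameter dictionary.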

Process and Output Improvements:

  • Renames several processes from SUBSET* to QUERY* for better clarity, and updates output variable references accordingly. [1] [2] [3]

  • Adds a new process PLOT_MUTATION_SPECIFIC for generating mutation-specific QC plots, with appropriate outputs and version tracking.

  • Updates the PLOT_DEPTHS process to emit new outputs for average depth per gene and per gene/sample.

  • Changes process naming in mutational density analysis from SYNMUTREADSRATE to SYNMUTREADSDENSITY for consistency, and updates documentation and output listings accordingly. [1] [2] [3]

Output Organization and Publication:

  • Introduces a new configuration file, results_outputs.config, to centralize and standardize the publication directories for plots and QC outputs across multiple processes. [1] [2]

  • Adds and configures new processes in panels.config to publish expanded region outputs for various panel enrichment scenarios.

Schema and Documentation Updates:

  • Updates error messages and required fields in assets/schema_input.json for clarity, and removes bam from the required fields.

  • Updates documentation to reflect parameter renaming and process changes, ensuring accuracy and clarity for users. [1] [2] [3] [4] [5] [6]

Other Configuration Enhancements:

  • Adds a new CUSTOMBEDFILE process and refines GROUPGENES to use the new create_subgenic_regions parameter.

  • Updates file staging in PLOT_SATURATION_PROPORTIONS for improved output handling.

Contributor

Copilot AI left a comment

Pull request overview

This pull request is a broad refactoring focused on parameter standardization, process renaming, and workflow improvements. The changes replace the omega_*-prefixed parameters with more general names (create_subgenic_regions, subgenic_bedfile, autodomains, autoexons), rename SUBSET* processes to QUERY* for clarity, and standardize the channel factory from Channel to lowercase channel. Additionally, the PR makes BAM files optional in the input samplesheet, adds new QC plotting capabilities, and reorganizes output publishing through a new configuration file.

Changes:

  • Standardized parameter names from omega_withingene, omega_autodomains, omega_autoexons, omega_subgenic_bedfile to create_subgenic_regions, autodomains, autoexons, subgenic_bedfile across all configuration files, workflows, modules, and documentation
  • Renamed processes from SUBSET* to QUERY* (e.g., SUBSETMUTATIONS → QUERYMUTATIONS) for better semantic clarity
  • Standardized channel factory methods from Channel to lowercase channel throughout the codebase
  • Made BAM files optional in the input samplesheet when using custom depths, with corresponding validation updates
  • Added new results_outputs.config for centralized output directory management and PLOTTINGQC subworkflow with omega QC filtering
  • Updated default parameter values including consensus_panel_min_depth (500→200), mutation_depth_threshold (40→100), and hotspot_expansion (30→0)
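Because three defaults change, runs that relied on the old values will behave differently after this PR. The sketch below shows one way to pin the previous values explicitly. The helper name `pin_previous_defaults` is hypothetical, and the old and new values are taken only from the list above.

```python
# Defaults changed in this PR as (old, new) pairs, copied from the review summary.
CHANGED_DEFAULTS = {
    "consensus_panel_min_depth": (500, 200),
    "mutation_depth_threshold": (40, 100),
    "hotspot_expansion": (30, 0),
}


def pin_previous_defaults(user_params):
    """Return a copy of `user_params` with the pre-PR defaults filled in.

    Values the user already set explicitly are left untouched.
    """
    pinned = dict(user_params)
    for name, (old, _new) in CHANGED_DEFAULTS.items():
        pinned.setdefault(name, old)
    return pinned
```

The resulting dictionary could be dumped to JSON and passed to Nextflow with `-params-file`, which keeps the override visible in version control.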

Reviewed changes

Copilot reviewed 44 out of 52 changed files in this pull request and generated 9 comments.

| File | Description |
| --- | --- |
| workflows/deepcsa.nf | Updated process names (SUBSET→QUERY, SYNMUTREADSRATE→SYNMUTREADSDENSITY), channel factory standardization, BAM handling logic, and PLOTTINGQC integration |
| subworkflows/local/*.nf | Consistent renaming of SUBSET to QUERY processes, channel factory standardization, removal of unnecessary .first() calls on process outputs |
| modules/local/expand_regions/main.nf | Updated parameter references from omega_* to standardized names |
| nextflow.config | Renamed parameters, reordered filter criteria, updated default thresholds for depth and mutation filtering |
| nextflow_schema.json | Updated schema with new parameter names, descriptions, and reorganized options sections |
| conf/tools/*.config | Updated process names and added publish directories for expanded regions outputs |
| conf/results_outputs.config | New file centralizing output publication paths for plots and QC processes |
| bin/*.py | Updated mutation density metric names (MUTREADSRATE→MUTREADSDENSITY), added omega flagging support, improved BAM column handling |
| docs/*.md | Updated documentation to reflect new parameter names and process changes |
| subworkflows/nf-core/utils_*.nf | Channel factory standardization and function name updates |

  //
  if (check_conda_channels) {
-     checkCondaChannels()
+     checkCondachannels()

Copilot AI Jan 22, 2026

The function name checkCondachannels is inconsistent with the standard naming convention. The lowercase 'c' in 'channels' breaks the camelCase pattern. This should be checkCondaChannels to maintain proper camelCase naming.

  // When running with -profile conda, warn if channels have not been set-up appropriately
  //
- def checkCondaChannels() {
+ def checkCondachannels() {

Copilot AI Jan 22, 2026

The function name checkCondachannels is inconsistent with the standard naming convention. The lowercase 'c' in 'channels' breaks the camelCase pattern. This should be checkCondaChannels to maintain proper camelCase naming.
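One plausible cause of this rename (an assumption, the thread does not confirm it) is a blanket Channel → channel substitution that also matched the substring inside checkCondaChannels. A minimal sketch of a narrower substitution follows. The helper name `lowercase_channel_factories` is hypothetical, and the pattern only targets `Channel` used as a standalone identifier followed by a dot, as in `Channel.of(...)`.

```python
import re

# Match `Channel` only when it starts at a word boundary and is followed by a
# dot (e.g. `Channel.of(...)`). Identifiers that merely contain the substring,
# such as `checkCondaChannels`, do not match.
CHANNEL_FACTORY = re.compile(r"\bChannel(?=\.)")


def lowercase_channel_factories(source):
    """Rewrite `Channel.<method>` to `channel.<method>` in a source string."""
    return CHANNEL_FACTORY.sub("channel", source)
```

Applied to the snippet under review, a substitution of this kind would leave `def checkCondaChannels() {` unchanged while still rewriting factory calls.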


- PLOTSATURATIONPROPORTIONS(groups_mutations, panel, full_panel_rich.first(), expanded_panel)
+ PLOTSATURATIONPROPORTIONS(groups_mutations, panel, full_panel_rich, expanded_panel)

Copilot AI Jan 22, 2026

The removal of .first() from full_panel_rich may cause issues if this channel emits multiple items. The receiving process PLOTSATURATIONPROPORTIONS may not handle multiple items correctly, potentially leading to the process being executed multiple times unexpectedly. Verify that this channel only emits a single item or ensure the downstream process can handle multiple emissions.
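The concern can be illustrated with a small Python model of how a Nextflow process pairs its inputs. This is not Nextflow code: `task_inputs` is a hypothetical stand-in that treats a queue channel as an iterator consumed item by item, and the result of `.first()` as a value channel that is reused for every task. Whether full_panel_rich is already a value channel in this pipeline is not stated in the thread.

```python
from itertools import repeat


def task_inputs(samples, panel, panel_is_value):
    """Model the input tuples a two-input process would receive.

    panel_is_value=True mimics `panel.first()`: the first item is reused
    for every sample. False mimics a queue channel: items are consumed one
    by one and the process stops when the shorter input runs out.
    """
    panel_source = repeat(panel[0]) if panel_is_value else iter(panel)
    return list(zip(samples, panel_source))


samples = ["s1", "s2", "s3"]
with_first = task_inputs(samples, ["p"], panel_is_value=True)
queue_single = task_inputs(samples, ["p"], panel_is_value=False)
queue_multi = task_inputs(samples, ["p1", "p2"], panel_is_value=False)
```

In this model, `with_first` yields one task per sample, `queue_single` yields a single task, and `queue_multi` pairs each panel with a different sample. Either queue outcome would be a behavior change relative to the `.first()` version, which is why checking the channel type is worthwhile.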

@FerriolCalvet
Collaborator Author

update filter to flag

- add contamination plot with numbers
- add storing of all contamination sources
- fix misnamed relative mutability outputs
Fixes #334
@FerriolCalvet linked an issue on Jan 31, 2026 that may be closed by this pull request