add methurator gtestimator and plot modules #11110

Open
edogiuili wants to merge 11 commits into nf-core:master from edogiuili:methurator

@edogiuili

In this PR, I add methurator gtestimator and plot modules.

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool - have you followed the module conventions in the contribution docs
  • If necessary, include test data in your PR.
  • Remove all TODO statements.
  • Broadcast software version numbers to topic: versions - See version_topics
  • Follow the naming conventions.
  • Follow the parameters requirements.
  • Follow the input/output options guidelines.
  • Add a resource label
  • Use BioConda and BioContainers if possible to fulfil software requirements.
  • Ensure that the test works with either Docker / Singularity. Conda CI tests can be quite flaky:
    • For modules:
      • nf-core modules test <MODULE> --profile docker
      • nf-core modules test <MODULE> --profile singularity
      • nf-core modules test <MODULE> --profile conda
    • For subworkflows:
      • nf-core subworkflows test <SUBWORKFLOW> --profile docker
      • nf-core subworkflows test <SUBWORKFLOW> --profile singularity
      • nf-core subworkflows test <SUBWORKFLOW> --profile conda

@famosab (Contributor) left a comment

Thank you for your contribution to nf-core! We really appreciate it. I added a few comments to your PR.


output:
tuple val(meta), path("methurator_*.yml") , emit: summary_report
path "versions.yml" , emit: versions
Contributor

Can you update the versions output to utilize topics? More information about that can be found in the docs.

Comment on lines +10 to +13
input:
tuple val(meta), path(bam)
tuple val(meta2), path(bai)
tuple val(meta3), path(fasta)
Contributor

Can we put all these inputs into one tuple? That guarantees the files always arrive together in the right combination.

Author

Since methurator uses MethylDackel internally, I stuck to the inputs of the MethylDackel nf-core module. Do you think it's still better to put them all into one tuple?

Contributor

It's not something I will force you to do :D but it's something we are trying to move towards because it has some advantages (although it will become obsolete once Nextflow changes its input syntax). So I would say this is up to you, and keeping it consistent with MethylDackel is a good argument for leaving it as it is.

Contributor

You can run

nextflow lint -format -sort-declarations -spaces 4 -harshil-alignment

on this file to clean this up nicely.

"""

stub:
def prefix = task.ext.prefix ?: "methurator_summary_${meta.id}"
Contributor

Why is it called _summary here and just methurator in the output?

Comment on lines +3 to +4
Methurator is a Python package designed to estimate CpGs saturation
for DNA methylation sequencing data.
Contributor

Please use distinct descriptions for the module and the overall tool.

Author

Changed.

Contributor

See the comments on the nf.test file above.

Contributor

You can run

nextflow lint -format -sort-declarations -spaces 4 -harshil-alignment

on this file to clean this up nicely.


output:
tuple val(meta), path("plots/*.html") , emit: plots
path "versions.yml" , emit: versions
Contributor

Can you update the versions output to utilize topics? More information about that can be found in the docs.

Comment on lines +33 to +36
def prefix = task.ext.prefix ?: "plots/${meta.id}.html"
"""
mkdir plots/
touch ${prefix}
Contributor

Let's not create a directory if it only contains one output.

Author

This is necessary because the process writes its output into the plots/ folder, so the stub has to create a file in that folder too. Otherwise the stub run fails with the following error:

Missing output file(s) `plots/*.html` expected by process `METHURATOR_PLOT (test_bam)`

But maybe there is a more elegant way of doing this :D
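For illustration, the stub behaviour discussed here can be reproduced in plain shell. This is only a sketch under assumptions: `make_stub_plot` is a hypothetical helper, `test_bam` stands in for `meta.id`, and `mkdir -p` is used in place of `mkdir` so that a rerun in the same directory does not fail.

```shell
# Hypothetical stand-in for the stub body; the argument mimics meta.id.
make_stub_plot() {
    sample_id="$1"
    mkdir -p plots                    # -p: no error if plots/ already exists
    touch "plots/${sample_id}.html"   # empty placeholder matching plots/*.html
}

# Work in a scratch directory so nothing is left behind in the current one.
workdir="$(mktemp -d)"
cd "$workdir" || exit 1

make_stub_plot test_bam
ls plots/*.html                       # prints: plots/test_bam.html
```

The glob `plots/*.html` only matches once a file exists inside `plots/`, which is why the stub cannot simply skip the directory while the output declaration still points at it.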

tuple val(meta), path(summary_report)

output:
tuple val(meta), path("plots/*.html") , emit: plots
Contributor

How many *.html files do you expect?

@edogiuili (Author), Apr 2, 2026

Only one in this case, but there might be many if a user passes several BAM files at once to methurator/gtestimator and then to methurator/plot. So I think it is safer to keep the *.

@edogiuili (Author)

Thanks for your suggestions Famke! I've addressed them all, apart from:

  • the inputs of methurator/gtestimator (for now)
  • the plots/ dir being created in the methurator/plot stub (find my answer here)

For the additional snapshots, instead of capturing the full files, which are unstable, I snapshot the content of lines 3 to 9 of the methurator/gtestimator summary reports and the HTML filename of the methurator plots. What do you think?
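As a rough shell illustration of the "lines 3 to 9" idea: the real assertion lives in the nf-test file and is not shown in this thread, so this sketch only demonstrates selecting a stable line range from a report. `stable_lines` and the generated 12-line file are hypothetical.

```shell
# Hypothetical helper: print only lines 3 to 9 of the given file.
stable_lines() {
    sed -n '3,9p' "$1"
}

# Build a 12-line dummy report; the real summary report is a YAML file.
report="$(mktemp)"
i=1
while [ "$i" -le 12 ]; do
    echo "line $i"
    i=$((i + 1))
done > "$report"

stable_lines "$report"   # prints "line 3" through "line 9", one per line
```

Snapshotting a fixed slice like this keeps the test sensitive to the report's core content while ignoring lines that change between runs.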

@edogiuili (Author)

A small fix in the eval() expression that extracts the software version. I changed the expression from `methurator --version` to `methurator --version | sed 's/.* //'` so that it extracts only the version number (e.g. 2.1.1) and not the entire output (e.g. methurator, 2.1.1).
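To see what the `sed` expression does without installing the tool, here is a minimal shell sketch. It feeds in the sample output quoted above rather than calling `methurator --version`; `extract_version` is a hypothetical helper name used only for illustration.

```shell
# Hypothetical helper: drop everything up to and including the last space.
extract_version() {
    sed 's/.* //'
}

# Simulated tool output, taken from the example in this comment.
printf 'methurator, 2.1.1\n' | extract_version   # prints: 2.1.1
```

Because `.*` is greedy, the pattern removes everything up to the last space, so this relies on the version being the final space-separated token of the output.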

2 participants