Releases: itmat/Normalization

PORT v0.8.1-beta 6/15/2016

15 Jun 14:15

  • News
    1. added geneSymbol column to the ensembl annotation files so it can be used for Exon-Intron-Junction level normalization
    2. "convert_gtf_to_PORT_geneinfo.transcripts.pl" script also outputs a geneSymbol column
    3. changed default maxjobs number to 1000 for both lsf and sge in config
  • Bug fixes
    1. -alt_out option bug fixed
    2. cigar2span fixed to work with GSNAP output (^#S#D and #D#S$ case)
    3. cleanup script bug fixed
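
The cigar2span fixes concern how a CIGAR string is turned into reference spans. The sketch below is not PORT's cigar2span code; it is a minimal illustration that assumes standard SAM CIGAR semantics and makes its own choices (labeled in the comments) for the edge cases named above: a deletion is bridged only when aligned bases sit on both sides of it, so a deletion next to a soft clip at either end does not extend a span. The function name `cigar_to_spans` is hypothetical.

```python
import re

# One CIGAR operation: a length followed by an operation code.
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")

def cigar_to_spans(pos, cigar):
    """Return 1-based inclusive (start, end) reference spans for an alignment.

    Assumptions of this sketch (not taken from PORT):
      * M, = and X consume the reference and produce aligned bases.
      * D consumes the reference; it is bridged only between aligned blocks.
      * N consumes the reference and always splits spans.
      * I, S, H and P do not consume the reference.
    """
    spans = []
    ref = pos
    joinable = False  # True if no N has been seen since the last aligned block
    for length, op in CIGAR_RE.findall(cigar):
        n = int(length)
        if op in "M=X":
            if spans and joinable:
                spans[-1][1] = ref + n - 1   # extend across insertions/deletions
            else:
                spans.append([ref, ref + n - 1])
            ref += n
            joinable = True
        elif op == "D":
            ref += n                         # leading/trailing D adds no bases
        elif op == "N":
            ref += n
            joinable = False                 # skipped region: start a new span
    return [tuple(s) for s in spans]

print(cigar_to_spans(100, "5S2D10M"))       # deletion right after a soft clip
print(cigar_to_spans(100, "10M2D5S"))       # deletion right before a soft clip
print(cigar_to_spans(100, "10M2D100N10M"))  # DN case
```

With these choices, a deletion that touches a skipped region (the ND and DN cases) is absorbed into the gap rather than into either span.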

PORT v0.8-beta 6/6/2016

06 Jun 18:37

  • News
    • PORT infers read length from unaligned files (uses average read length)
    • inferred introns size cutoff set at 75000
    • reads mapping to highly expressed features (gene, exon, intron) are handled separately, and the resampled reads are put back into the final sam/bam (not just at the spreadsheet level).
    • Implemented the '-alt_out' option. Users can redirect the normalized data to an alternate location.
    • sam2mappingstats reports number of Non-Unique alignments and reads instead of percentages.
    • bam to sam step omitted for bam input
    • script available for ensembl gtf -> gene info file conversion.
  • bug fixes
    • cigar2spans now accounts for ND, DN cases
    • genepercents calculation fixed (the sum of all min counts was previously used as the total; the total number of gene mappers is now used instead)
    • making list of high expressors for novel exon case fixed
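
As an illustration of inferring read length from unaligned files, the sketch below averages the sequence lengths in a FASTQ stream. It is not PORT's implementation: it assumes strict four-line FASTQ records (no wrapped sequences), and the function name `average_read_length` is hypothetical.

```python
import io

def average_read_length(handle):
    """Mean sequence length over a four-line-per-record FASTQ stream."""
    total = 0
    count = 0
    for i, line in enumerate(handle):
        if i % 4 == 1:              # second line of each record is the sequence
            total += len(line.strip())
            count += 1
    if count == 0:
        raise ValueError("no reads found")
    return total / count

# Two reads of length 4 and 6.
example = io.StringIO("@r1\nACGT\n+\nIIII\n@r2\nACGTAC\n+\nIIIIII\n")
print(average_read_length(example))
```

A FASTA input would need a different record parser; only the averaging step carries over.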

PORT v0.7.5-beta 10/20/2015

20 Oct 21:25

news:

  • "-v" flag outputs version of PORT.
  • ribo percents now uses the total number of reads, not the number of mapped reads, for computing the stats.

bug fixes:

  • fixed a typo in name_of_job for the quantifygenes_gnorm2 step in runall_normalization.pl.
  • check_samformat.pl outputs the problematic reads with the error.

PORT v0.7.4-beta 10/7/2015

07 Oct 19:16

news

  • expected_num_reads.txt for Exon-Intron-Junction level normalization now includes the exon-inconsistent read information.

bug fixes

  • getstats.pl: PORT no longer throws an error when the sam2mappingstats output file is missing stats.
  • runall_normalization.pl: checksam step was missing -se flag for single end data, fixed.
  • PORT can handle unique read normalization.

PORT v0.7.3-beta 8/31/2015

31 Aug 21:17

news

  • takes bam input
  • provides breakdown file for exon-intron-junction normalization
  • predict number of reads: provide a comma-separated list of sample ids
  • checks sam/bam format to make sure:
    • sam/bam has proper tags
    • (paired-end) mated alignments are in adjacent lines
  • infers paired/single-end, sam/bam, gzipped, and fasta/fastq (options removed)
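
The adjacency check for paired-end data can be pictured as follows. This is a sketch and not check_samformat.pl: it assumes every record belongs to a pair whose two mates share a QNAME and carry the standard SAM FLAG bits 0x40 (first in pair) and 0x80 (second in pair), and it reads only the first two SAM columns. The function name `find_unpaired_reads` is hypothetical.

```python
def find_unpaired_reads(sam_lines):
    """Return QNAMEs of records whose mate is not on the adjacent line."""
    records = [
        line.rstrip("\n").split("\t")
        for line in sam_lines
        if line.strip() and not line.startswith("@")   # skip header lines
    ]
    problems = []
    i = 0
    while i < len(records):
        name, flag = records[i][0], int(records[i][1])
        paired = False
        if i + 1 < len(records):
            next_name, next_flag = records[i + 1][0], int(records[i + 1][1])
            # Mates must share a name and be one "first" plus one "second" read.
            mate_bits = {flag & 0xC0, next_flag & 0xC0}
            paired = next_name == name and mate_bits == {0x40, 0x80}
        if paired:
            i += 2
        else:
            problems.append(name)
            i += 1
    return problems

good = ["@HD\tVN:1.0", "r1\t99\tchr1\t100", "r1\t147\tchr1\t250"]
bad = ["r1\t99\tchr1\t100", "r2\t99\tchr1\t300", "r1\t147\tchr1\t250", "r2\t147\tchr1\t450"]
print(find_unpaired_reads(good))
print(find_unpaired_reads(bad))
```

Reporting the offending read names alongside the error mirrors what later releases of check_samformat.pl are described as doing.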

bugfixes

  • sam2cov: outputs forward and reverse, not sense and antisense; fixed the output file names
  • catshuffiles step bug fixed in runall_normalization.pl

PORT v0.7.2-beta 7/16/2015

16 Jul 18:36

News:

  • restarts only the failed jobs when the -resume or -resume_at option is used
  • delete intermediate files from blast step when cleanup is set to true
  • all merging steps and filter highly expressed genes step run at sample level to cut unnecessary wait time
  • default lsf queue names in config file set to new PMACS cluster queue names
  • undetermined reads renamed to exon inconsistent reads
  • modified runall scripts to avoid too many jobs getting submitted to one node

Bug fixes:

  • compress step bug fix; pipeline now waits until all jobs are completed

v0.7.1-beta

12 Apr 17:19

Bug fixes

  • properly checks input file format
  • genefilter.pl script name replacement step fixed (doesn't use regex anymore)
  • quants2spreadsheet now uses 6G queue when # samples > 200.

v0.7-beta 4/9/2015

09 Apr 21:11

News:

  1. Users can pre-filter the ribosomal reads prior to running PORT and skip BLAST step.
  2. rRNA FASTA files for non-mammalian organisms (Drosophila melanogaster (dm), Zebrafish (danRer) and C. elegans) are available.
  3. No longer need to provide two gene info files (gene info and annotation file) for EXON-INTRON-JUNCTION level normalization.
  4. Users need to provide chromosome names (for non standard names) and mitochondrial chromosome name.
  5. sam2cov now supports data aligned with GSNAP

Bug fixes:

  1. sam2junctions was not working properly when the genome fasta file was not in one-line format. PORT now checks the file and converts it into the correct format before generating junctions files.
  2. high expressers were not getting put back in when -cutoff_highexp was used with the same cutoff value in both PART1 and PART2. It now works properly.
  3. renamed stranded coverage file names (sense and antisense instead of fwd and rev)
  4. STATS files provide more descriptive header/footer. All STATS files in % notation.
  5. filtering of high expressers was not working properly for single-end data; fixed.
  6. fixed issue with list for intronquants (when -novel_off flag was used).
  7. BLAST loops through query file to avoid memory problems.
  8. PORT checks input file formats before starting
  9. two PORT runs with the same STUDY name cannot be run at the same time.
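
"One-line format" in the sam2junctions fix means each FASTA record holds its whole sequence on a single line. A minimal sketch of such a conversion is shown below; it is an illustration rather than PORT's own converter, and the function name `fasta_to_one_line` is hypothetical.

```python
def fasta_to_one_line(lines):
    """Join wrapped sequence lines so each record is a header plus one sequence line."""
    out = []
    seq = []
    for line in lines:
        line = line.strip()
        if not line:
            continue                      # ignore blank lines
        if line.startswith(">"):
            if seq:                       # flush the previous record's sequence
                out.append("".join(seq))
                seq = []
            out.append(line)
        else:
            seq.append(line)
    if seq:
        out.append("".join(seq))
    return out

wrapped = [">chr1", "ACGT", "TTAA", ">chrM", "GG"]
print(fasta_to_one_line(wrapped))
```

For a full genome it would be better to stream records to a file than to collect them in a list; the list keeps the example short.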

v0.6.3-beta 2/2/2015

02 Feb 19:37

  • New blast - faster.
  • Name of default/normal queue added to config file.

Exon-Intron-Junction Norm

  • Novel introns (inferred from junctions file)
  • Flanking regions quantified
  • Novel exons runs in parallel

Gene Norm

  • sam2gene runs in chunks for shorter runtime
  • highly expressed genes (if filtered out) get put back into the final spreadsheet

11/20/2014

20 Nov 21:22

bug fixes:

  • runall_shuf.pl (stranded non-unique data)
  • quantification modified (does not double count anymore)