As I've dug further into this package, I've realized that it isn't exactly what I thought it was. BioAlignments.jl aligns two sequences supplied as strings. Do we have any Julia packages for aligning FASTX files against each other? We could convert the FASTX records to strings and then use BioAlignments.jl, but I wasn't sure whether the package is optimized for working with such long strings. What have you done in the past? Do you usually just call a command-line tool from a bash script?
Preview at https://biojulia.github.io/BioTutorials/29
@kescobo I have finished the rough draft of my overview of the BioAlignments package. However, before wrapping up the tutorial, I would love some insight on the above comment/the best example to display (whether that's an alignment between some strings, or actual fastx files). |
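For reference, the convert-then-align route mentioned above could look roughly like this. It is a minimal sketch rather than a recommended workflow: it assumes FASTX.jl (v2) and BioSequences.jl are installed alongside BioAlignments.jl, the two FASTA files and their sequences are made up so the snippet is self-contained, the helper `read_first_sequence` is hypothetical, and the scoring parameters are arbitrary.

```julia
using BioAlignments, BioSequences, FASTX

# Hypothetical helper: return the first record of a FASTA file as a DNA sequence.
function read_first_sequence(path::AbstractString)
    reader = FASTAReader(open(path))
    try
        record = first(reader)
        return sequence(LongDNA{4}, record)
    finally
        close(reader)
    end
end

# Write two tiny, made-up FASTA files so the sketch runs on its own.
dir = mktempdir()
query_path = joinpath(dir, "query.fasta")
reference_path = joinpath(dir, "reference.fasta")
write(query_path, ">query\nACGTTGCAAC\n")
write(reference_path, ">reference\nACGTAGCAAC\n")

query = read_first_sequence(query_path)
reference = read_first_sequence(reference_path)

# Arbitrary scoring choices for illustration.
scoremodel = AffineGapScoreModel(EDNAFULL, gap_open=-5, gap_extend=-1)
result = pairalign(GlobalAlignment(), query, reference, scoremodel)

println("score: ", score(result))
println(alignment(result))
```

Pairwise alignment of this kind is dynamic programming, so time and memory grow with the product of the two sequence lengths. That is worth keeping in mind before feeding it very long sequences.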
In a pairwise alignment, there is one reference sequence, and one query sequence,
Pairwise alignment differs from multiple sequence alignment (MSA) because
it only aligns two sequences, while MSAs align three or more.
In a pairwise alignment, there is one reference sequence and one query sequence,
While true, this is really an implementation detail. Most of the time, it makes no difference which sequence is used as the reference. The same is not true for MSA.
- GlobalAlignment: global-to-global alignment
- `GlobalAlignment`: global-to-global alignment
  - Aligns sequences end-to-end
  - Best for sequences that are already very similar
It might be nice to give examples here. For instance, you might have a particular gene from two closely related bacteria, or you might be comparing alleles of a gene between two individuals.
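One way to illustrate the allele case is a global alignment of two short sequences that differ by a single substitution. This is only a sketch: the two "alleles" are invented for illustration, and the scoring parameters (`EDNAFULL`, `gap_open=-5`, `gap_extend=-1`) are arbitrary choices, not values taken from the tutorial.

```julia
using BioAlignments, BioSequences

# Two made-up alleles of the same short region, differing by one substitution.
allele_a = dna"ATGGCCATTGTAATGGGCCGC"
allele_b = dna"ATGGCCATTGTTATGGGCCGC"

# Arbitrary scoring choices for illustration.
scoremodel = AffineGapScoreModel(EDNAFULL, gap_open=-5, gap_extend=-1)

# End-to-end alignment suits sequences that are already very similar.
result = pairalign(GlobalAlignment(), allele_a, allele_b, scoremodel)
aln = alignment(result)

println(aln)
println("matches:    ", count_matches(aln))
println("mismatches: ", count_mismatches(aln))
```

Because the sequences are the same length and nearly identical, the end-to-end alignment should surface the substitution as a mismatch and not open any gaps.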
It imposes an affine gap penalty for insertions and deletions,
which means that it penalizes the opening of a gap more than a gap extending.
This aligns (pun intended!!) with the biological principle that creating a gap is a rare event,
while extending an already existing gap is less so.
I would word this differently: "creating a gap" and "extending" it are analysis terms, not biological ones. Maybe something like "Deletions are rare mutations, but if there's a deletion, the length of the deletion is variable. Longer deletions are less likely than short ones only because they change the structure of the encoded protein more."
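To make the scoring side of this concrete, here is a small sketch of how an affine penalty treats one long gap versus several short ones. It assumes the convention in which a gap of length `k` costs `gap_open + k * gap_extend`. The function name `affine_gap_penalty` and the parameter values are hypothetical and chosen only for illustration.

```julia
# Hypothetical helper: penalty for a single gap of `len` positions under an
# affine model where the gap pays `gap_open` once and every gapped position
# pays `gap_extend` (assumed convention).
affine_gap_penalty(len::Integer; gap_open::Integer = 5, gap_extend::Integer = 1) =
    len == 0 ? 0 : gap_open + len * gap_extend

one_long_gap = affine_gap_penalty(6)          # one deletion of 6 bases
three_short_gaps = 3 * affine_gap_penalty(2)  # three separate deletions of 2 bases each

println("one gap of length 6:    ", one_long_gap)
println("three gaps of length 2: ", three_short_gaps)
```

The same six gapped positions cost less as one event than as three. That is the scoring counterpart of the point above: a deletion is a single rare event whose length can vary.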
Explaining how to use BioAlignments.jl and providing some examples!