Help ABI Chroma Align

This application aligns one or more ABI chromatogram sequences (Sanger method; maximum 4) against a reference sequence.
Aligning the chromatograms and the sequences (text format) makes them easier to compare. The program displays and highlights any inconsistencies.
The program is particularly useful when you sequence a gene whose reference sequence is known and want to check for possible mutations, in either the homozygous or the heterozygous state.
The graphic display of the chromatograms, aligned to the reference sequence, helps in interpreting uncertain base calls. Background noise in a chromatogram is sometimes automatically called as a heterozygous polymorphism. Because the aligned chromatograms can be zoomed to a particular region, you can, for example, visually check how reliable the call of a particular mutation is.


File ABI

If you work with a local client, you can insert up to 4 chromatograms in ABI format; otherwise, because of data transmission problems, you can insert only two.
You cannot insert new files if you come from other applications on this platform.

The sequence of the first file represents the leader sequence and will be considered the seed to align all the other ABI sequences.
The textual sequences of the ABI files will be assembled to form a single 'contig' used for comparison with the reference sequence.


REFERENCE SEQUENCE

One or more reference sequences can be inserted in FASTA format. Among these, the sequence that best aligns with the contig will be considered the main reference.
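
As an illustration of choosing a main reference among several FASTA records, here is a minimal Python sketch. It does not reproduce the program's own alignment: the similarity ratio from the standard `difflib` module is swapped in as a stand-in score, and the function names `parse_fasta` and `pick_main_reference` are hypothetical.

```python
from difflib import SequenceMatcher


def parse_fasta(text):
    """Parse FASTA text into a dict mapping each header to its sequence."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip()
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


def pick_main_reference(fasta_text, contig):
    """Return the header of the record most similar to the contig.

    Assumption: difflib's ratio stands in for a real alignment score.
    """
    records = parse_fasta(fasta_text)

    def score(name):
        # autojunk=False stops difflib from ignoring very frequent bases.
        matcher = SequenceMatcher(None, records[name].upper(),
                                  contig.upper(), autojunk=False)
        return matcher.ratio()

    return max(records, key=score)
```

For example, given two records `refA` and `refB`, the call `pick_main_reference(fasta_text, contig)` returns the header of whichever record scores higher against the contig.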

In the reference sequence, polymorphisms can be entered using the IUPAC characters. This will facilitate the comparison between the chromatograms and the reference sequence.
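
To make the IUPAC notation concrete, the following Python sketch maps each IUPAC nucleotide code to the bases it stands for and tests whether an observed symbol is compatible with a reference symbol. The table is the standard IUPAC nucleotide code; the helper names are hypothetical and are not taken from the program.

```python
# Standard IUPAC nucleotide codes and the bases each one represents.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def bases_for(symbol):
    """Return the set of bases represented by an IUPAC symbol."""
    return set(IUPAC[symbol.upper()])


def is_polymorphic(symbol):
    """True if the symbol stands for more than one base."""
    return len(bases_for(symbol)) > 1


def compatible(ref_symbol, observed_symbol):
    """True if the two symbols share at least one base."""
    return bool(bases_for(ref_symbol) & bases_for(observed_symbol))
```

With this table, writing `R` in the reference declares a known A/G polymorphism, so an observed `G` at that position is compatible while an observed `C` is not.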

 


Automatic translation

Ticking the 'Automatic translation' box lets you see, directly in the alignment, the effect of any mutations in the coding regions and the resulting change in the protein sequence.
To use this option, you must enter only one reference sequence and write it in a particular way (the notation used by the UCSC Genome Browser has been adopted):
- the transcribed regions must be written in uppercase, while the non-transcribed ones (e.g. introns or UTR) must be written in lowercase.
- the beginning and end of the translation must be marked with square brackets '[ ]'.
The program issues a 'warning' if the first codon is not 'ATG' (methionine) or if the last codon is not a STOP codon.
A 'warning' is also issued if the length of the coding region is not divisible by three. In this case the coding sequence is shortened accordingly.
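
The rules above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `extract_coding_sequence` is hypothetical, the coding sequence is taken to be the uppercase letters between '[' and ']', and a length that is not a multiple of three is trimmed from the end (the program may shorten it differently).

```python
STOP_CODONS = ("TAA", "TAG", "TGA")


def extract_coding_sequence(annotated):
    """Return (coding_sequence, warnings) for a bracket-annotated reference."""
    seq = "".join(annotated.split())  # drop spaces and line breaks
    start = seq.find("[")
    end = seq.find("]")
    if start == -1 or end == -1 or end < start:
        raise ValueError("expected '[' before ']' to delimit the translation")
    inside = seq[start + 1:end]
    # Keep only uppercase letters; lowercase stretches are skipped.
    coding = "".join(ch for ch in inside if ch.isupper())
    warnings = []
    remainder = len(coding) % 3
    if remainder:
        warnings.append("coding region length is not divisible by three")
        # Assumption: trim the extra bases from the 3' end.
        coding = coding[:len(coding) - remainder]
    if coding[:3] != "ATG":
        warnings.append("first codon is not ATG")
    if coding[-3:] not in STOP_CODONS:
        warnings.append("last codon is not a stop codon")
    return coding, warnings
```

For instance, `extract_coding_sequence("ccgtTCA[ATGAGCgtaagtcagGTTTGA]ACTgg")` skips the lowercase stretch inside the brackets and returns `ATGAGCGTTTGA` with no warnings.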

If the complete sequence of a gene is not entered, the marked start of the translation must coincide with the start of a codon; otherwise the translation will be read in the wrong frame.

The required format is easy to obtain, for example from the UCSC Genome Browser.

example
tagtgaccTCAATGTTCGCAGTCGACA[ATGAGCAGAGTT GATGTACCGCACGGATGTGGGTCGCCACACaaccttatat ccttagcgcagcgcgaaacggcagcgnn........... .......AGATCCGGGTGCCCCACTATCACTGA]ACTGG

highlight
In this line, symbols are inserted to highlight the polymorphisms and/or discrepancies between the reference sequence and the contig sequence.

Symbols used:
'*': Same polymorphism in Reference and contig
'*': Different polymorphisms in Reference and contig
'§': Polymorphism only in the ABI sequence, not in Reference
'?': Different bases in two or more ABI sequences
'#': Non-polymorphic and different bases between contig and Reference
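
The legend can be read as a per-position classification. The Python sketch below is one possible reading, not the program's actual logic. The function name `highlight_symbol` and the order in which the cases are tested are assumptions, and '?' is taken to mean that the ABI sequences disagree with each other. A polymorphic base is taken to be any IUPAC ambiguity code.

```python
# IUPAC ambiguity codes, treated here as "polymorphic" symbols.
AMBIGUOUS = set("RYSWKMBDHVN")


def highlight_symbol(ref, contig, abi_bases=()):
    """Return the legend symbol for one aligned position (assumed rules)."""
    ref, contig = ref.upper(), contig.upper()
    # Assumption: '?' marks disagreement among the individual ABI reads.
    if len({b.upper() for b in abi_bases}) > 1:
        return "?"
    ref_poly = ref in AMBIGUOUS
    contig_poly = contig in AMBIGUOUS
    if ref_poly and contig_poly:
        # The legend lists '*' for both same and different polymorphisms.
        return "*"
    if contig_poly:
        return "§"
    if not ref_poly and ref != contig:
        return "#"
    # Matching bases, or a case the legend does not cover.
    return " "
```

Under these assumptions, a reference `A` against a contig `R` yields '§', and a reference `A` against a contig `G` yields '#'.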


highlight in translate
Only if you have chosen 'Automatic translation' (see above): the 'Translate Reference' and 'Translate contig' lines show the symbols of the aligned amino acids.
If a stop codon is encountered, the '-' symbol is inserted.
The amino acid is written in red if the one translated from the contig differs from the one in the reference.
In the presence of polymorphisms (different amino acids), the symbol '#' is written.
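
A minimal Python sketch of the comparison described above: both coding sequences are translated with the standard genetic code, stop codons are written as '-', and the positions where the amino acids differ are reported. The function names are hypothetical, and the 'X' used for codons that cannot be translated (for example, codons containing ambiguity codes) is an assumption rather than the program's behaviour.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(coding):
    """Translate a coding sequence; stop codons are written as '-'."""
    protein = []
    usable = len(coding) - len(coding) % 3  # ignore a trailing partial codon
    for i in range(0, usable, 3):
        # Assumption: 'X' marks a codon that cannot be translated.
        aa = CODON_TABLE.get(coding[i:i + 3].upper(), "X")
        protein.append("-" if aa == "*" else aa)
    return "".join(protein)


def changed_positions(ref_coding, contig_coding):
    """Return (position, reference_aa, contig_aa) where the proteins differ."""
    pairs = zip(translate(ref_coding), translate(contig_coding))
    return [(i + 1, r, c) for i, (r, c) in enumerate(pairs) if r != c]
```

The entries returned by `changed_positions` correspond to the amino acids that would be shown in red; positions are 1-based amino acid indices.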