[mira_talk] how to convert gap4 format to fasta multiple sequence alignment format
- From: Jorge.DUARTE@xxxxxxxxxxxx
- To: mira_talk@xxxxxxxxxxxxx
- Date: Fri, 6 Nov 2009 16:45:11 +0100
Hello,
I'm trying to find a way to convert several Staden files (edited in gap4)
into FASTA multiple sequence alignment files.
I managed to use gap2caf to convert the gap4 database to CAF format.
From there I can use Bastien's utilities to convert CAF into several other
formats, such as ACE.
I then tried to use the BioPerl Bio::Assembly::IO package to print each
contig alignment in FASTA format,
but without success (code attached: the start and end of the LocatableSeq
objects don't seem to be set).
Does anyone know of a script that could do this?
Or should I write a Perl script to parse the CAF format myself?
Thanks for your help,
jorge.
#!/usr/local/bin/perl
use strict;
use warnings;
use Bio::Assembly::IO;

my $input = shift
    or die "Usage: staden2fasta input_file > output_file\n";

# Read the assembly from an ACE file (converted from CAF).
my $in    = Bio::Assembly::IO->new(-file => $input, -format => 'ace');
my $assem = $in->next_assembly;
print STDERR "there are ", $assem->get_nof_contigs,
    " contigs in this assembly\n";

foreach my $contig ($assem->all_contigs) {
    print STDERR "\tthere are ", $contig->no_sequences,
        " sequences in this contig\n";
    foreach my $seq ($contig->each_seq) {
        print ">", $seq->id, "\n";
        # Left-pad with gaps up to the read's start position.
        # This relies on $seq->start, which does not seem to be set.
        my $i = 1;
        while ($i < $seq->start) {
            print "-";
            $i++;
        }
        print $seq->seq;
        # Right-padding up to the consensus length (disabled):
        # my $j = $seq->end;
        # while ($j < $contig->get_consensus_length) {
        #     print "-";
        #     $j++;
        # }
        print "\n";
    }
}
---
Jorge Duarte
Bioinformatics Research Engineer
BIOGEMMA - Upstream Genomics Group
Z.I. Du Brézet
8, Rue des Frères Lumière
63028 CLERMONT FERRAND Cedex 2
FRANCE
Tel : +33 (0)4 73 42 79 70 (Accueil)
Fax : +33 (0)4 73 42 79 81
E-mail : jorge.duarte@xxxxxxxxxxxx