Hi Peter,

That is one rapid reply! Thank you.

Read lengths vary from 111 to 322 bases.

"You could also show us the first (five) reads, or at least their IDs"
I did not understand this part.

"Perhaps you are seeing IUPAC ambiguity codes?"
You are so right! IUPAC ambiguity codes simply did not come to mind. I checked the files, and the only unexpected characters I can find are IUPAC codes. That is the beauty of brainstorming (well, this might not fit the definition, though).

BTW, I checked the MIRA manual, and B.C. added "-CO:fnicpst" for this kind of issue.

Thank you again and again.

> Date: Thu, 29 Sep 2011 18:54:58 +0100
> Subject: [mira_talk] Re: awkward letters in assembled data
> From: p.j.a.cock@xxxxxxxxxxxxxx
> To: mira_talk@xxxxxxxxxxxxx
>
> 2011/9/29 Visam Gültekin <teutara@xxxxxxxxxxx>:
> > Hi,
> >
> > I have been using MIRA for a while, and I am pretty satisfied. I did some
> > genome assemblies, and everything went well.
> > Right now I have to assemble some EST data, and I do not know which
> > technology was used for sequencing.
> >
> > First of all, is there any suggestion for this kind of data?
> >
> > (By changing the file name and tweaking the command a little, I did some
> > trials for different technologies. There is no big difference.)
>
> Do a histogram of the read lengths, that would be a big clue.
>
> What file format are your reads in?
>
> You could also show us the first (five) reads, or at least their IDs,
> that is often enough for a good guess.
>
> > For the next issue, after assembly I manually checked the result files.
> > There are "wwww", "rqk", etc. awkward letters in contigs. How is that
> > possible? I am lucky that I checked and noticed that in the results,
> > otherwise...
>
> Perhaps you are seeing IUPAC ambiguity codes?
>
> Peter
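
Two of the checks discussed in this thread (a histogram of read lengths, and a scan for IUPAC ambiguity codes in the assembled contigs) can be sketched in a few lines of Python. This is only a minimal sketch, assuming the reads and contigs are available as plain FASTA text; the helper names (`parse_fasta`, `length_histogram`, `classify_symbols`), the bin size, and the demo records are made up for illustration and are not part of MIRA.

```python
from collections import Counter
import io

# IUPAC nucleotide ambiguity codes, i.e. everything beyond plain A, C, G, T/U.
IUPAC_AMBIGUITY = set("RYSWKMBDHVN")
PLAIN_BASES = set("ACGTU")

# Made-up demo records (not real data), using the kind of letters
# reported in the thread.
DEMO_FASTA = """
>contig_1 demo
ACGTwwwwACGT
ACGrkT
>contig_2 demo
ACGTACGTAC
"""


def parse_fasta(handle):
    """Yield (record_id, sequence) pairs from a FASTA-formatted text handle."""
    record_id, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if record_id is not None:
                yield record_id, "".join(chunks)
            header = line[1:].split()
            record_id, chunks = (header[0] if header else ""), []
        else:
            chunks.append(line)
    if record_id is not None:
        yield record_id, "".join(chunks)


def length_histogram(records, bin_size=50):
    """Count sequences per length bin; keys are the lower edge of each bin."""
    hist = Counter()
    for _, seq in records:
        hist[(len(seq) // bin_size) * bin_size] += 1
    return dict(sorted(hist.items()))


def classify_symbols(records):
    """Return (ambiguity_counts, other_counts) for all non-ACGTU symbols."""
    ambiguous, other = Counter(), Counter()
    for _, seq in records:
        for ch in seq.upper():
            if ch in PLAIN_BASES:
                continue
            if ch in IUPAC_AMBIGUITY:
                ambiguous[ch] += 1
            else:
                other[ch] += 1  # e.g. gap characters, or letters that are not IUPAC codes
    return ambiguous, other


if __name__ == "__main__":
    records = list(parse_fasta(io.StringIO(DEMO_FASTA)))
    print("first IDs:", [rid for rid, _ in records[:5]])
    print("length histogram:", length_histogram(records, bin_size=10))
    ambiguous, other = classify_symbols(records)
    print("IUPAC ambiguity codes:", dict(ambiguous))
    print("other symbols:", dict(other))
```

To run it on real data, replace `io.StringIO(DEMO_FASTA)` with an open file handle, for example `open("reads.fasta")` (a hypothetical file name). Any symbol that lands in the second counter is neither a plain base nor an IUPAC ambiguity code and deserves a closer look.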