[mira_talk] Re: 454 cleaning

  • From: Robin Kramer <kodream@xxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Thu, 18 Nov 2010 14:37:50 -0700

Never mind, I forgot to turn on:

 -CL:msvs=on

That'll help!!!!
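
For the contig check in the quoted message below, note that a plain grep can undercount: FASTA sequence lines are wrapped, so a primer that straddles a line break is missed. Here is a minimal Python sketch that joins each record before searching both orientations. The function names are hypothetical helpers written for this illustration, and the three-record demo is toy data, not MIRA output.

```python
from typing import Dict, Iterable, List

# Primer from the quoted message, with the 454 key "tcag" removed.
ADAPTER = "AAGCAGTGGTATCAACGCAGAGTACGGGGG"


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.translate(str.maketrans("ACGTacgt", "TGCAtgca"))[::-1]


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA lines into {header: sequence}, joining wrapped lines."""
    parts: Dict[str, List[str]] = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip()
            parts[name] = []
        elif name is not None:
            parts[name].append(line)
    return {n: "".join(p).upper() for n, p in parts.items()}


def contigs_with_adapter(records: Dict[str, str],
                         adapter: str = ADAPTER) -> List[str]:
    """List the records containing the adapter in either orientation."""
    fwd = adapter.upper()
    rev = reverse_complement(fwd)
    return [name for name, seq in records.items() if fwd in seq or rev in seq]


# Toy input: c1 carries the primer split across two lines, c3 carries
# its reverse complement, c2 is clean.
demo = [
    ">c1", "TTTTAAGCAGTGGTATCAACGC", "AGAGTACGGGGGTTTT",
    ">c2", "ACGTACGTACGT",
    ">c3", "AACCCCCGTACTCTGCGTTGATACCACTGCTTAA",
]
flagged = contigs_with_adapter(read_fasta(demo))
print(len(flagged), "of", len(demo) // 2, "toy contigs contain the primer")
```

On a real file, the same call would take an open file handle in place of the `demo` list.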

On 11/18/10, Robin Kramer <kodream@xxxxxxxxx> wrote:
> So after I do the assembly with the SSAHA2 mappings, when I grep
> through the file I still find thousands of contigs containing the
> vector.
>
> Would it be more appropriate to clip the vectors completely first, so
> that mira doesn't have to worry about these things?
>
> Sincerely yours,
>
> Robin
>
> On 11/18/10, Robin Kramer <kodream@xxxxxxxxx> wrote:
>> I can track down the forward primers pretty easily.
>>
>> They are hard to miss, considering 1/4 of the reads start with:
>>
>> tcagAAGCAGTGGTATCAACGCAGAGTACGGGGG
>>
>> I think that GGGGG at the end is the leading G problem, hypothetically
>> due to linker chemistry.
>>
>> I have tried it with four-G, five-G, and six-G homopolymers, and it
>> seems there is a significant drop at the sixth.
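
One rough way to tabulate that drop is to count the reads whose 5' end is the key plus primer followed by at least k Gs. This is only a sketch: where the primer stops and the G run starts is an assumption made for illustration, the function name is hypothetical, and the reads are made-up toy strings rather than data from the SRA run. If the reads have already had the "tcag" key stripped, the prefix constant needs shortening to match.

```python
from typing import Dict, Iterable

# "tcag" key + primer, with the trailing G run removed (assumed split point).
KEY_AND_PRIMER = "TCAGAAGCAGTGGTATCAACGCAGAGTAC"


def count_leading_g_runs(reads: Iterable[str], max_g: int = 8) -> Dict[int, int]:
    """Map k to the number of reads starting with the prefix plus >= k Gs."""
    counts = {k: 0 for k in range(max_g + 1)}
    for read in reads:
        seq = read.upper()
        if not seq.startswith(KEY_AND_PRIMER):
            continue
        tail = seq[len(KEY_AND_PRIMER):]
        run = len(tail) - len(tail.lstrip("G"))  # length of the leading G run
        for k in range(min(run, max_g) + 1):
            counts[k] += 1
    return counts


# Toy reads (not real data): G runs of 5, 6 and 4, plus one unrelated read.
toy_reads = [
    KEY_AND_PRIMER + "GGGGG" + "ATT",
    KEY_AND_PRIMER.lower() + "GGGGGG" + "CTA",
    KEY_AND_PRIMER + "GGGG" + "T",
    "ACGTACGT",
]
for k, n in count_leading_g_runs(toy_reads).items():
    print(f">= {k} G: {n} reads")
```

A sharp fall between two consecutive values of k in the real data would suggest where to cut the primer before handing it to a screening tool.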
>>
>> Is there any problem with feeding that into SSAHA2?
>>
>> Also I found the reverse complement of this sequence, hence our chimeras.
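
That head-to-head arrangement is visible in the first contig pasted further down this thread, where the reverse complement of the primer runs straight into a forward copy. Below is a small sketch that reports every primer hit with its strand, run on a 120-base excerpt copied from SRR054580_Asha_rep_c1. The helper names are illustrative only.

```python
from typing import List, Tuple

# Primer from the message above, with the 454 key "tcag" removed.
ADAPTER = "AAGCAGTGGTATCAACGCAGAGTACGGGGG"


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.translate(str.maketrans("ACGTacgt", "TGCAtgca"))[::-1]


def find_adapter_hits(seq: str, adapter: str = ADAPTER) -> List[Tuple[int, str]]:
    """Return sorted (0-based start, strand) pairs for every adapter match."""
    seq = seq.upper()
    hits = []
    for strand, pattern in (("+", adapter.upper()),
                            ("-", reverse_complement(adapter.upper()))):
        start = seq.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(pattern, start + 1)
    return sorted(hits)


# Two consecutive 60-base lines from contig c1 as posted in this thread.
EXCERPT = (
    "TGTTTGTGGTGTCCCCCGTACTCTGCGTTGATACCACTGCTTAAGCAGTGGTATCAACGC"
    "AGAGTACGGGGGTGGACCCAATGACACCATTTTCATTTATTATTCGGATCATGGTGCTCC"
)

for start, strand in find_adapter_hits(EXCERPT):
    print(f"strand {strand}: bases {start}..{start + len(ADAPTER) - 1}")
```

In this excerpt the minus-strand hit ends exactly where the plus-strand hit begins, which is what a junction between two independently primed cDNAs would look like.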
>>
>> Is there potentially a B adapter floating around in the mix?  I didn't
>> obviously see another adapter in the sequence.
>>
>> Also FYI, the adapter was an exact match for many sequences in the TSA.
>>
>> Sincerely yours,
>>
>> Robin
>>
>> On 11/17/10, Robin Kramer <kodream@xxxxxxxxx> wrote:
>>> It seems the SFFs are available from the SRA, but only through the .SRA
>>> file.
>>>
>>> Robin
>>>
>>> On 11/12/10, Gao, Guangtu <Guangtu.Gao@xxxxxxxxxxxx> wrote:
>>>> Hi Robin,
>>>>
>>>> You might consider checking for adaptors and contaminants using
>>>> SeqTrim. I also downloaded some EST sequences from NCBI for assembly
>>>> before. I found that the adaptors had not been completely cleaned
>>>> from those sequences, and they produced chimeras.
>>>>
>>>> Guang
>>>>
>>>> -----Original Message-----
>>>> From: mira_talk-bounce@xxxxxxxxxxxxx
>>>> [mailto:mira_talk-bounce@xxxxxxxxxxxxx] On Behalf Of Robin Kramer
>>>> Sent: Thursday, November 11, 2010 12:44 PM
>>>> To: mira_talk@xxxxxxxxxxxxx
>>>> Subject: [mira_talk] 454 cleaning
>>>>
>>>> Hi,
>>>>
>>>> I am doing an assembly from data publicly available at NCBI.
>>>>
>>>> The data are available here:
>>>>
>>>> http://www.ncbi.nlm.nih.gov/sra/SRX021565?report=full
>>>>
>>>> It is 454 data, but unfortunately neither the SFF nor the XML files
>>>> are available.
>>>>
>>>> I assembled the data with MIRA, using the no-XML flags, which appeared
>>>> to give a nice assembly.
>>>>
>>>> However, when I BLASTN and BLASTX the first and fourth contigs, there
>>>> appear to be problems with the data.
>>>>
>>>> With the first contig, when I BLASTX it, it gives strong hits to two
>>>> different genes on different ends, as if it were a chimera.  When I
>>>> BLASTN the sequence, I get a strong hit on one side, then in the middle
>>>> I get a section with multiple hits to different species in the TSA.
>>>> When I look at the pileup, there is a thin place in the gene and a huge
>>>> drop-off in coverage from one side.
>>>>
>>>> I think this is due to an adapter trimming problem with the
>>>> 454 data.
>>>>
>>>> The fourth contig, when run through BLASTN, has a very strong gene hit
>>>> from a closely related species, but at the end has another small
>>>> stretch that matches many other sequences in TSA that are distantly
>>>> related.  The adapter-looking portion has low coverage, with a giant
>>>> change in coverage at the strong region.
>>>>
>>>> To me it appears as if some of the adapters are consistently not
>>>> getting trimmed (in this set and in TSA).
>>>>
>>>> Here is a relevant thread in seqanswers.
>>>> http://seqanswers.com/forums/showthread.php?t=3462
>>>> As well as a link out to previous discussions in this list.
>>>> http://www.freelists.org/post/mira_talk/454-adaptor-clipping
>>>>
>>>> Is there any consensus on recleaning the 454 adapters?  I don't even
>>>> know what the sequences would be to expect.
>>>>
>>>> The assemblies of the two contigs are pasted after this message.
>>>>
>>>> Sincerely yours,
>>>>
>>>> Robin
>>>>
>>>> >SRR054580_Asha_rep_c1
>>>> AGTTTCTTAACACTTGGACCAATATATTATTTTCCTTTGTTTTGCAAGAAGGATAAAAGA
>>>> AAGAAACAAASGWMDAAAAGAGTTTACTAGAAAACTCATCGAGCTAGTTTCTCCACTTAT
>>>> TTATTTTATGCTTTTCCCGCAAAAGTTTGGTGACTCATACATGAGGATAGATACACATAG
>>>> ACTCACGTTATTTTACACACGTATATATATAAGGAAAGGCAGGCTAAGCCTTTGATTTAT
>>>> TTGATTATTGATCCGCGCACTATTGGCAAAAAGACAGTAGTGGGGTAGCACAGCAGATGC
>>>> AAAAAGATGAAGCATAGCTCTAAGCCACATATCTCATTTGAGAGTGACGAGGAGGTGGAA
>>>> CGAGAAAGTTGAAAGGGTTGTTGTTCTTGATCTGCCTGGCCTGTTCGCTATGGAGGTTGA
>>>> AAGTGTGTTGAATGACTTCCTCAGGCAATGCGTTTAACAAAGAGTTTCTACCTGCAAGTG
>>>> TGCCAGTCACAGGTATATCATGGGTCTTGAATGCCACGTACTTGAAGTTGTTGCTCTGCG
>>>> ATTTTGCAGCCACCGCAAAGTTTTGTGGCACGATCAGCACTTGTCCCTCTTGCAGCTCCC
>>>> CATCAAACACTCTATCACCAGTGCAATTCACCACTTGCATCATCGCCCTCCCTTCCAATG
>>>> CGTATACTATGCTGTTTGCGTTCAGGTTGTAGTGAGGCACGAACATGGCATTCTTGCGGA
>>>> GAGATCCGAACTGAGCACTGAGTTTGAGGAGCAAGAGGGCTGGGAAGTCAAGGCCGGTGG
>>>> CGGTTGTAATGCTACCAGCTTGAGGGTTGAAGAAGTCAGGCGATGAAGTTTGACCAATGT
>>>> TGTGGCGAAGTCTCATTGTGCAAATGGTTTCATCAATGCCATTTCTGCTCTTGCTTTTGC
>>>> TCTTCTGTGGCTTCTCTTCCTCTTCATCGTCGTCATCTTCTTCCTCTTCCTCTGCTCTCT
>>>> GTTGCTGCTTTCTCGTTGGTGGAGCTGTCACGCTCAGACCTCCCTCCACTTTCACAATGG
>>>> CTCCTTTCTCTTCGTCCTCGTTCACACCTTGGAGGTTTTTCACTATCTTCCTGTCCACGT
>>>> TCAACGCTTGTTCCAAGAATTCTGGGGTGAAGCCACTGAATATGTTGCCGCCTTCATTAT
>>>> CTTCTTCTTGTTCTTGATGTTGTTTTCCCTTCTGGCTTTGGCTTTGCTGATATTGTACGA
>>>> ACTCTTGCTCTTGGTTCCCAGCAAGATAGAATCTCCTAGGCATCTGATCGAGCTGGTTCT
>>>> GTAAGCTGTTGGTGTGAATAAGAGAAACTGCAACAACGGGAGTGTCTTGATTGTTGAACA
>>>> TCCAGAAAGCAGCACCGGTAGGCACTGCGATCAAATCACCCTCTCTAAAGTGATACACCT
>>>> TTTGGTGACGGTCTTGAGGCTTCTGGCTTTGTCCTCTTTGAGTTGGCTCTTCAAAAGTCT
>>>> GAGGACAACCGGAGAAAATGATGCCAAAAATACCACTACCTTGTTGAATGAAGATTTGCT
>>>> GGGGAGCGTTGGTGAAGAATGGTCTGCGGAGGCCATTGCGTTGGAGGGTGCAGCGAGAGA
>>>> GGGCAACACCGGCACACTGGAAAGGCTTGCTGTTAGGGTTCCATGTCTCTATGAACCCAC
>>>> CTTCCGACTCTATACGGTTATCGGGTTTGAGGGCATTCATGCGTTGGAGTTGGCACTCAT
>>>> ATTCATTTTGCTGTGGCTGCTGTGTCTTATCTTTGCTAGCGAAGCACCCACTCAAAAGCA
>>>> CAAGACAAAGGGAAAGAGATAGCGCAAGAAGCTTAGCCATGGATATGAATATGATTGATT
>>>> TGTTTGTGGTGTCCCCCGTACTCTGCGTTGATACCACTGCTTAAGCAGTGGTATCAACGC
>>>> AGAGTACGGGGGTGGACCCAATGACACCATTTTCATTTATTATTCGGATCATGGTGCTCC
>>>> TGGTCTTGTCACCATGCCAGTAGGGGGAATATGTCATGGCCAACGATTTTGTGAATGTCT
>>>> TGAAGAAGAAACATGATGCTAAATCCTACAAAAAGATGGTGATATACTTGGAAGCATGTG
>>>> AATCTGGGAGCATGTTTGAAGGGATACTACCTAATAACATAAGCATATATGCGACCACAG
>>>> CTTCCAACGCAGATGAGGATAGTTTTGCATATTATTGTCCTCATTCCTACCCTTCTCCTC
>>>> CAACTGAGTACACCACTTGTTTGGGAGATGTGTACAGCATTTCGTGGTTAGAAGATAGTG
>>>> ACAAAAATGACATGACAATAGAAACGCTGCAGCAACAATATGAAACCGTTCGCCGAAGAA
>>>> CGTTAATTGGTAATGTCGACACCTCTTCTCATGTGAAACAATACGGAGATAGAAAATTCG
>>>> AGAACGATACTCTTGCTACCTACATTGGTGCACCTGTTAAAACCAACCCCACCAACTCTG
>>>> CAAATGCATATTCCTTTGAACCATATAGTCCTCAAACTAGACATGTTAGCCAACGAGATG
>>>> CTCATTTACTCTACCTTAAGCTAGAGTTGCAAAAAGCCCCGGATGGTTCTATGGAAAAGT
>>>> TGAAAGCTCAAATAGAGTTGGATGATGAAATTGCACATAGGAAGCATTTAGATAGTGTTT
>>>> TCCATCTCATAGGGGATCTCTTGTTTGGAGAAGAGAATAATATCTCTACCATGTTGCTCC
>>>> ATGTTCGTCCACCAGGCCAGCCTCTTGTCGATGATTGGGATTGTTTCAAGACCCTTATAA
>>>> AAACTTACGAGAGCAATTGCGGTAAATTGTCAATCTATGGAAGGAAATACACAAGAGCCT
>>>> TTGCTAACATGTGCAATGCTGGCATTTCTGAGGAGCAAATGGTAGTAGCCTCTTCACAAG
>>>> CTTGTCCCAAGGAAAATCCTTCTTAAATTAATTCGTTAAGTTGATAATGTAATAACCAAT
>>>> ATATATCATGAAAGATTAAAAATTGTGCTTTCATTCTACAAAATGGATTATAATCCTTTG
>>>>
>>>> >SRR054580_Asha_rep_c4
>>>> TCTCCGACTCAGAAGCAGTGGTATCAACGCAGAGTCTTGGGGAACTGGAATTGACGATCA
>>>> AGTTGGTCACACCTGTTGCTCCAGCAACATAGTGCAGAAATTGCATGTGTCCAATGTGTA
>>>> GATCTCTAACAAGATCATAATTATAACATTCTATGTGTAGTTGACTCTTGCTTTTGATTA
>>>> ACTCCTGCATAGATGTTTCTACCAAAAATGAAAAAAAAAATCATTAATAGATGCATATTG
>>>> CAGCTAAATTTAGCAGTGAGTTGGTGATACCTCATCCCCCAGTTAGATAAAAGCCACTAG
>>>> AAGCTGCATTTTCAAATCAACAAGTAGTGATTTATGGCTTCTTTGGGTTTTATGGTGTGT
>>>> TTTGTAGAAAATTTGTCCTTCATTTTAGCTATGAGCATTCATTGGGTATTGCATAAGTTT
>>>> TGATGCTATTGTATTGATTTTGATATAAGAAAAGAAAAGTTGTAATGCGTTTGTTTCAAT
>>>> TATTTTTTTTTAAAGAAATGATATTTTTAACTTGTGGAGAGTTTTAAGAGATTTAGATAA
>>>> CTTGTAAGGTAACAGATTGTAGAAGTATAAATTACTCTGCCATAAATGAAGCTTTAAGTG
>>>> CACTACAAGTAAACAACT
>>>>
>>>> --
>>>> You have received this mail because you are subscribed to the mira_talk
>>>> mailing list. For information on how to subscribe or unsubscribe,
>>>> please
>>>> visit http://www.chevreux.org/mira_mailinglists.html
>>>>
>>>
>>
>

