Hi Yi,
Huang Yi wrote:
Dear All,
Thanks very much for all your help. This virus is a single-stranded RNA virus.
We didn't do any deletion on the 3' end, and the raw data came from a host+viral
mixture. I don't know how to set MIRA to assemble a circular or a linear
virus. I would appreciate it if any of you could let me know.
In the result folder, there are two unpadded fasta files, one is Largecontig.fasta
and the other is out.fasta. I found there were some short fragments which
map to the 3'UTR. Although there are some short gaps, I think they are also part of
the assembled genome, right?
One other question: when I blast the 5'UTR sequence, its direction is
opposite to that of other close relatives. This 5'UTR comes from a Largecontig
which contains an ORF in the correct direction. It is not a short fragment. Is
there any method I can use to check how mira assembled this part, so that I can
tell whether this is a special characteristic of this virus or something wrong
with the data?
Thanks again!
Yi
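One way to double-check the orientation of the 5'UTR independently of blast is to count exact k-mers shared with a relative's genome in both orientations. The following is only a rough sketch: the function names and the k-mer size of 11 are illustrative assumptions, it is not part of MIRA, and it will fail on sequences that are too divergent to share exact k-mers.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA/RNA sequence (output uses T, not U)."""
    table = str.maketrans("ACGTUNacgtun", "TGCAANtgcaan")
    return seq.translate(table)[::-1]


def kmers(seq, k):
    """Return the set of all k-mers in seq, upper-cased and with U treated as T."""
    seq = seq.upper().replace("U", "T")
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def guess_orientation(query, reference, k=11):
    """Compare k-mers shared with the reference in both orientations.

    Returns (label, forward_hits, reverse_hits), where label is
    "forward", "reverse" or "undetermined". k=11 is an arbitrary choice.
    """
    ref_kmers = kmers(reference, k)
    fwd = len(kmers(query, k) & ref_kmers)
    rev = len(kmers(reverse_complement(query), k) & ref_kmers)
    if fwd == rev:
        return "undetermined", fwd, rev
    return ("forward" if fwd > rev else "reverse"), fwd, rev
```

Calling guess_orientation() once with the 5'UTR and once with the ORF from the same contig, each against the same relative's genome, would show whether the two parts of the contig disagree in orientation. If they do, that points at the contig itself rather than at the blast report.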
---------- Forwarded message ----------
From: Martin MOKREJŠ <mmokrejs@xxxxxxxxx>
Date: 2015-10-04 0:51 GMT+08:00
Subject: [mira_talk] Re: no 3'end
To: mira_talk@xxxxxxxxxxxxx
Hi Yi,
Huang Yi wrote:
Hello,
Question again. I am working on a small virus genome now. The data are Illumina
reads. When I used de novo assembly, mira quickly made a strain which shares over 95%
nucleotide identity with a reference virus genome. But that de novo assembled strain
didn't contain the 3'UTR. If I used reference assembly, mira gave me a "complete"
strain, which is highly similar to the reference (~99%). Many reads map very well to
the reference genome's 3'UTR region, which is around 600nt.
You did not say whether it is a virus with a circular or a linear genome. Also,
given that Adrian Pelin proposed inspecting SNPs ... well, first of all, is this
an RNA or DNA virus? You speak of a 3'-UTR so I guess it is a +RNA virus. What is
the target genome length? What coverage do you have in de novo vs. mapping modes?
I prefer to trust the de novo assembled strain because the virus was
isolated from a different host. It may not be the same as the reference. But I am
curious why mira didn't assemble the 3'UTR region. Does it mean my studied
virus doesn't have that 600nt long 3'UTR, or is there a parameter I didn't set
correctly? Thanks!
It could be that mira discarded the ends of reads because they were too close to
adapters. Did you use some custom adapters, e.g. for reverse transcription of
viral RNA? Did you remove them from your data?
I could also imagine it was a virus with a circular genome but assembled in a
linear assembly mode - or mira discarded reads which seemed to map to both
linear ends. You should provide more details.
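As a rough check for the circular case: if a circular genome was assembled as a linear contig, the contig sometimes ends with the same bases it starts with. The sketch below looks for such a terminal repeat in a contig sequence. The function name and the 20 nt minimum are assumptions made for illustration, and a negative result does not prove the genome is linear.

```python
def terminal_overlap(seq, min_len=20):
    """Length of the longest suffix of seq that is identical to its prefix.

    Only overlaps between min_len and half the sequence length are
    considered. Returns 0 if no such exact overlap is found.
    """
    seq = seq.upper()
    for n in range(len(seq) // 2, min_len - 1, -1):
        if seq[:n] == seq[-n:]:
            return n
    return 0
```

A non-zero result on the large contig would be a hint worth following up, e.g. by looking at whether reads span the junction between the contig's end and its start.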
Martin
--
Martin Mokrejs, Ph.D.
Adapter/artefact removal from datasets based on the following technologies:
454 / IonTorrent / Evrogen MINT / Clontech SMART / ..., Illumina
http://www.bioinformatics.cz/software/supported-protocols/