[mira_talk] Assembly after vector screening

  • From: Nestor Zaburannyi <nestor@xxxxxxxxxxxxx>
  • To: mira_talk@xxxxxxxxxxxxx
  • Date: Wed, 31 Aug 2011 12:33:35 +0200

Dear all,

After assembling my organism, which was sequenced from a pool of cosmids, I
have thousands of IUPAC bases and unresolved repeat positions. Can anyone
recommend some adjustments to improve this assembly?

  Average consensus quality:                    70
  Consensus bases with IUPAC:                   3387    (you might want to check these)
  Strong unresolved repeat positions (SRMc):    1817    (you might want to check these)
  Weak unresolved repeat positions (WRMc):      24      (you might want to check these)
  Sequencing Type Mismatch Unsolved (STMU):     0       (excellent)
  Contigs having only reads wo qual:            0       (excellent)
  Contigs with reads wo qual values:            0       (excellent)


mira -project=sp_0047 -job=denovo,genome,accurate,solexa \
COMMON_SETTINGS -GE:not=40 \
SOLEXA_SETTINGS -GE:uti=1 -CL:bsqc=1 -CL:msvs=1:msvsmfg=1000:msvsmeg=1000 \
-LR:ft=fastq:mxti=0 -FN:fqi=illumina.fastq -GE:tismin=266:tismax=494:tpbd=-1

The smalt commands that I use:

smalt_x86_64 index -k 7 -s 1 vector vector.fasta
smalt_x86_64 map -f ssaha -d -1 -m 19 -p -n 8 vector illumina.fastq \
  > sp_0047_smaltvectorscreen_in.txt

I have also experimented with "-m" values from 13 to 37. The results differ,
but there are always many IUPAC bases.
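
For reference, the "-m" sweep can be sketched as a small shell loop. This is
only a dry run that prints one command per value rather than running smalt;
the step of 6 between values and the per-value output file names are
assumptions for illustration, not necessarily what was actually run.

```shell
#!/bin/sh
# Dry run: print one smalt map command per "-m" value instead of executing it.
# All smalt options are copied from the command above; only "-m" varies.
# The output file names (suffix _m<value>) are hypothetical, chosen so that
# the result of one run does not overwrite another.
print_smalt_sweep() {
    for m in 13 19 25 31 37; do
        echo "smalt_x86_64 map -f ssaha -d -1 -m $m -p -n 8 vector illumina.fastq > sp_0047_smaltvectorscreen_m$m.txt"
    done
}

print_smalt_sweep
```

Piping the printed lines to "sh" would execute them, assuming the "vector"
index has already been built.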

Do these two problems stem from vector leftovers of only a few bp at the ends
of the reads? If so, how can MIRA deal with them?

Sincerely yours
Nestor


