[mira_talk] Questions about TCS file fields

From: Robert Bruccoleri <bruc@xxxxxxxxxxxxxxxxxxxxx>
To: mira_talk@xxxxxxxxxxxxx
Date: Sat, 19 Mar 2011 17:01:45 -0400

I'm working on a mapping project where I have two regions of the human genome and hundreds of millions of Illumina reads (from multiple samples) sequenced against them. I'm breaking up the Illumina reads into manageable chunks and using the Mapping option of Mira to map them against the genomic DNA. The genomic DNA is being read in as a backbone strain.

I'd like to combine results for multiple maps together, but I'm really confused about some fields in the TCS file. First, what is field 8? According to the documentation: "total coverage in number of reads. This number can be higher than the sum of the next five columns if Ns or IUPAC bases are present in the sequence of reads." However, when I look at an entry for a mapping where there are no reads, just the backbone, this field has a value of 5.
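
One way to see where field 8 disagrees with the per-base columns is to compute the difference for each line. The following is only a rough sketch: it assumes whitespace-separated fields, with field 8 (1-based, counting any "|" separators as fields) holding the total coverage and the next five fields holding the per-base coverages, as the quoted documentation suggests. The function name and the sample line are invented for illustration and do not come from a real TCS file.

```python
def coverage_gap(line, total_field=8, n_base_columns=5):
    """Return total coverage minus the sum of the per-base coverage columns.

    total_field is 1-based; the column layout is an assumption taken from
    the documentation quoted above, not a verified description of TCS.
    """
    fields = line.split()
    total = int(fields[total_field - 1])
    # The per-base columns are assumed to directly follow the total column.
    per_base = [int(x) for x in fields[total_field:total_field + n_base_columns]]
    return total - sum(per_base)


# Hypothetical line, made up for this example only.
sample = "contig1 100 100 | A 30 | 7 3 2 0 0 1 | 30 25 0 0 10 | : |"
print(coverage_gap(sample))
```

A non-zero result on a line with no Ns or IUPAC bases in the mapped reads would be the kind of entry worth inspecting more closely.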

Second, in regions where there are no reads mapped, I'm finding coverages of more than 1, and quality scores for bases that aren't in the reference. Shouldn't the lines corresponding to reference sequences with no reads just have the default quality score for the backbone and coverage of 1 for the base in the corresponding position in the backbone?

Finally, is there any more documentation on the format besides what's in the manual?


Thanks.

--Bob Bruccoleri


Follow-Ups:
- [mira_talk] Re: Questions about TCS file fields
  - From: Bastien Chevreux
