Hi Bastien,

thank you so much for all the free support you are giving to the community.

I'm trying to make a "definitive" assembly of my ESTs for SNP mining, but a few concerns arose from the coverage values and the repeat histogram.

I sequenced three different samples (i.e. varieties) of my plant, which is tremendously heterozygous: about 0.6 M 454 Titanium reads each, 1.7 M in total (+36k Sanger). My first thought was to assemble each variety separately and then cluster the contigs together again (maybe using a less stringent alignment because of the diversity between varieties). Then I figured it should not make much difference compared with assembling everything in one go, so that is how I proceeded. Am I correct?

What do you think about the repeat histogram? The high count at level 0 seems odd to me: with so many reads, all coming from normalized libraries, I expected more at the average coverage (level 1). Do you think I need to switch on -SK:mnr and -SK:nrr even though I'm looking at deeply covered genes (to be able to mine heterozygous SNPs within the same sample)? Can these switches be useful in EST clustering too?

Moreover, while a single sample gave an estimated coverage of 9x with hstat, all three together only increased it to 15x (less than double). Does that make sense to you? Or are my parameters not permissive enough to cluster genes that have diverged quite a bit between the three samples? When I assembled two of the samples separately I got 35k-40k contigs each, whereas assembled together they rose to 70k contigs, so I'm afraid that reads coming from different samples may be splitting apart.

Could you please take a quick look at my parameters and give me your opinion, with hints on how to tune them?

Thanks so much!

COMMON_SETTINGS
  -GE:not=8
  -AS:sep=yes:ugpf=no
  -SK:not=8:pr=85:mnr=no
  -CO:mr=yes:asir=yes
  -OUT:ora=yes:org=no
  -SB:lsd=yes
  -CL:ascdc=yes

SANGER_SETTINGS
  -LR:wqf=no
  -AS:epoq=no:bdq=20
  -CL:cpat=no
  -OUT:sssip=yes
  -AL:mo=50:ms=50:mrs=93:egp=no
  -CO:fnicpst=yes
  -ED:ace=no

454_SETTINGS
  -OUT:sssip=yes
  -AL:mo=50:ms=50:mrs=93:egp=no
  -CL:cpat=no:qc=yes:qcmq=15:qcwl=20
  -CO:fnicpst=yes:rodirs=10
  -DP:ure=yes
  -ED:ace=no

Measured avg. frequency coverage: 15

Deduced thresholds:
-------------------
Min normal cov: 6.0
Max normal cov: 24.0
Repeat cov: 28.5
Heavy cov: 120.0
Crazy cov: 300.0
Mask cov: 1500

Repeat ratio histogram:
-----------------------
0	50239170
1	16063758
2	4565828
3	2193594
4	1285108
5	856230
6	606206
7	446770
8	336404
9	262926
10	210342
11	171042
12	136860
...
...
...
885	2
927	2
972	2
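
In case it helps to put a number on "high level 0", here is a minimal Python sketch that tallies the histogram pasted above and reports the share of each level. The function names (parse_histogram, level_fractions) are my own and not part of MIRA, and since the log elides the levels between 12 and 885, the shares are only relative to the levels shown, not to the full histogram.

```python
# Counts copied from the "Repeat ratio histogram" in this post.
# The levels between 12 and 885 are elided in the log and therefore missing here.
HISTOGRAM_TEXT = """
0 50239170
1 16063758
2 4565828
3 2193594
4 1285108
5 856230
6 606206
7 446770
8 336404
9 262926
10 210342
11 171042
12 136860
885 2
927 2
972 2
"""


def parse_histogram(text):
    """Parse 'level count' lines into a dict {level: count}."""
    counts = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        level, count = fields
        counts[int(level)] = int(count)
    return counts


def level_fractions(counts):
    """Share of each level, relative to the levels present in counts."""
    total = sum(counts.values())
    return {level: n / total for level, n in counts.items()}


counts = parse_histogram(HISTOGRAM_TEXT)
fractions = level_fractions(counts)

if __name__ == "__main__":
    # Print the first few levels only; the tail is tiny by comparison.
    for level in sorted(counts)[:5]:
        print(f"level {level}: {counts[level]:>10d}  {fractions[level]:.1%}")
```

On the levels shown, level 0 accounts for roughly two thirds of the entries, which is what prompted my question.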