14 Jun 2012 22:23
blat: sim4 Vs psl form
Mbandi S.K <mbandi <at> sanbi.ac.za>
2012-06-14 20:23:10 GMT
Dear all,

Firstly, I'm happy to join this mailing list. I do not know if this group is the right place for my question, so kindly bear with me if it is trivial or has been dealt with already.

I have recently settled on BLAT v. 34 for a portion of my project, to screen for ESTs (cDNA) that align well to my reference sequence. However, I find it hard to understand the effects of -minIdentity and -fastMap on the output. I also noticed that just changing the output format affects what is reported in the output file: more ESTs are reported in sim4 format than in psl format. I want to write a parser to calculate coverage, identity, etc. in order to build a filtering matrix.

Attached here are two test files: query.fa and target.fa. Query.fa contains mutated and unmodified versions of seq1 from the target.fa file. I'm aware -fastMap is for DNA-DNA, but just for test purposes I ran:

    blat target.fa -t=dna query.fa -q=dna -out=psl -minIdentity=100 -fastMap -dots=1 test.psl

and

    blat target.fa -t=dna query.fa -q=dna -out=psl -fastMap -dots=1 test.psl

However, in the first instance I do not find the hits I expected, even though the default -minIdentity is 90, which is less stringent than 100. When -out=sim4 is used, the hits are totally different.

Has anyone experienced strange results like this? Which output is better, from experience? I would appreciate clarity in this regard.

Many thanks,
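
For the parser mentioned above, here is a minimal Python sketch, assuming the standard 21-column tab-separated PSL layout that -out=psl writes. The names `PslHit` and `parse_psl` are hypothetical, chosen for illustration only. The two formulas are simple working definitions (identity = matching bases over aligned bases, coverage = aligned query span over query length), not necessarily the percent-identity calculation BLAT itself uses when filtering with -minIdentity.

```python
from typing import Iterable, Iterator, NamedTuple


class PslHit(NamedTuple):
    """One PSL alignment line reduced to the values needed for filtering."""
    q_name: str
    t_name: str
    identity: float   # percent, simple definition (see parse_psl)
    coverage: float   # percent of the query spanned by the alignment


def parse_psl(lines: Iterable[str]) -> Iterator[PslHit]:
    """Yield one PslHit per alignment line, skipping the PSL header.

    Assumes the 21-column PSL layout: matches, misMatches, repMatches,
    nCount, four insert columns, strand, qName, qSize, qStart, qEnd,
    tName, tSize, tStart, tEnd, blockCount, blockSizes, qStarts, tStarts.
    """
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # Header and separator lines do not start with an integer field.
        if len(fields) < 21 or not fields[0].isdigit():
            continue

        matches = int(fields[0])
        mismatches = int(fields[1])
        rep_matches = int(fields[2])
        q_name = fields[9]
        q_size = int(fields[10])
        q_start = int(fields[11])
        q_end = int(fields[12])
        t_name = fields[13]

        # Assumed definition: matching bases / aligned (non-gap) bases.
        aligned = matches + mismatches + rep_matches
        identity = 100.0 * (matches + rep_matches) / aligned if aligned else 0.0

        # Assumed definition: query span of the alignment / query length.
        coverage = 100.0 * (q_end - q_start) / q_size if q_size else 0.0

        yield PslHit(q_name, t_name, identity, coverage)
```

A filter can then be applied with thresholds of your own choosing, for example `[h for h in parse_psl(open("test.psl")) if h.identity >= 95 and h.coverage >= 80]`; the cut-offs shown here are placeholders, not recommendations.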