Scott Cain | 24 Sep 23:58

genbank2gff.pl choking on CONTIG sections

Hi all,

The BioPerl script bp_genbank2gff.pl, which will either convert a
Genbank record to GFF or load it directly to a Bio::DB::GFF database,
is choking on GenBank records with CONTIG sections.  Since I don't
think these would ever be useful for generating GFF or loading into a
database (ie, the user will want to get all of the features on the
parts, not know what the parts are), is there a way to force a
Bio::DB::WebDBSeqI/Bio::DB::BioFetch to get the full record (like
specifying view=gbwithparts in the url at ncbi)?

Thanks,
Scott

-- 
------------------------------------------------------------------------
Scott Cain, Ph. D. cain.cshl <at> gmail.com
GMOD Coordinator (http://gmod.org/) 216-392-3087
Cold Spring Harbor Laboratory

Jason Stajich | 25 Sep 01:05

Re: genbank2gff.pl choking on CONTIG sections

It should already, if it is using Bio::DB::GenBank. Do you have an
example of a failure?  The biofetch code seems to default to EMBL as
the source in places, so it might be worth twiddling.

From the Bio::DB::GenBank documentation:

Note that when querying for GenBank accessions starting with 'NT_' you
will need to call $gb->request_format('fasta') beforehand, because
in GenBank format (the default) the sequence part will be left out
(the reason is that NT contigs are rather annotation with references
to clones).
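
For illustration, the call order described in that note might look like the sketch below. It is not taken from bp_genbank2gff.pl; the accession is only a placeholder, and the script assumes BioPerl is installed and NCBI is reachable.

```perl
use strict;
use warnings;
use Bio::DB::GenBank;    # ships with BioPerl; talks to NCBI over the network

my $gb = Bio::DB::GenBank->new();

# Switch from the default GenBank format to FASTA *before* fetching,
# so that the sequence of an NT_ contig is actually returned.
$gb->request_format('fasta');

# 'NT_006732' is a placeholder accession used only for this sketch.
my $seq = $gb->get_Seq_by_acc('NT_006732');
printf "%s: %d bp\n", $seq->display_id, $seq->length if $seq;
```

Note that FASTA carries no feature table, so this recovers the sequence only; it does not by itself give the features needed for GFF.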

Some work has been done to automatically detect and retrieve whole NT_
clones when the data is in that format (NCBI RefSeq clones). The former
behavior prior to bioperl 1.6 was to retrieve these from EBI, but now
these are retrieved directly from NCBI. The older behavior can be
regained by setting the 'redirect_refseq' flag to a value evaluating to
TRUE.
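
A minimal sketch of setting that flag follows. It assumes 'redirect_refseq' is exposed as a getter/setter method on the database object (inherited from Bio::DB::NCBIHelper); check the POD of your installed BioPerl version before relying on it. The accession is again a placeholder.

```perl
use strict;
use warnings;
use Bio::DB::GenBank;

my $gb = Bio::DB::GenBank->new();

# Assumption: redirect_refseq() is a boolean accessor. A true value is
# meant to restore the older behavior of fetching RefSeq records via EBI.
$gb->redirect_refseq(1);

# Placeholder RefSeq contig accession, for illustration only.
my $seq = $gb->get_Seq_by_acc('NT_006732');
print $seq->display_id, "\n" if $seq;
```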

On Sep 24, 2008, at 3:00 PM, Scott Cain wrote:

> Hi all,
>
> The BioPerl script bp_genbank2gff.pl, which will either convert a
> Genbank record to GFF or load it directly to a Bio::DB::GFF database,
> is choking on GenBank records with CONTIG sections.  Since I don't
> think these would ever be useful for generating GFF or loading into a
> database (ie, the user will want to get all of the features on the