Erich Schwarz | 29 Sep 07:56

exporting contigs with CDSes, stored via Bio::DB::GFF, into individual GenBank records?

Hi all,

    I have newly sequenced contigs, with CDS predictions, loaded
into a MySQL database via Bio::DB::GFF.  I'd like to export each
contig, with its annotated CDSes, into its own GenBank-formatted
record (in order to be able to submit this stuff to GenBank without
having to waste time with Sequin).  Is there some straightforward
way of getting Bio::DB::GFF to do that?

    Some time ago, when I last had to decipher BioPerl, I came up 
with code that would let me export protein translations of the 
contigs' CDSes in GenBank format:

-------------------------------------------------------------------

#!/usr/bin/env perl

use strict;
use warnings;
use Bio::Seq;
use Bio::SeqIO;
use Bio::DB::GFF;

my $query_database = $ARGV[0];
my $dna = q{};
my $db = Bio::DB::GFF->new( -dsn => $query_database);

my $gb_file = 'example.gb';
my $seq_out = Bio::SeqIO->new( -file => ">$gb_file", -format => 'genbank', );
[...]

Jason Stajich | 30 Sep 18:48

Re: exporting contigs with CDSes, stored via Bio::DB::GFF, into individual GenBank records?

Erich,
CC-ing GBrowse since this concerns the GBrowse data store.

I've definitely done exactly this, although I remember I had to tweak
the features a bit to make sure I had some of the necessary stuff
for Sequin.

If you want to get a specific segment you just do what you already  
have in your code:
my $segment = $db->segment($contig_name);

Or you can iterate through all the features - it depends on how you named
your segments/contigs/chromosomes. I named mine "contig:scaffold" for
type:source:
   my $iterator = $db->get_seq_stream(-type=>'scaffold');
   while (my $s = $iterator->next_seq) {
       # work with each scaffold segment $s here
   }

Now you *should* be able to pass this segment object straight to
$seqio->write_seq($segment);
However, Bio::DB::GFF::Feature doesn't implement the whole SeqI API, so
you probably have to create your own sequence and move the features
over:

   my $iterator = $db->get_seq_stream(-type=>'scaffold');
   while (my $s = $iterator->next_seq) {
	my $seq = Bio::Seq->new();
	$seq->primary_seq($s->seq);
	for my $feature ( $s->features('processed_transcript') ) {
		my $f = Bio::SeqFeature::Generic->new(-location => $feature->location,
[...]
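
For anyone who wants the "create your own sequence and move the features
over" idea in one self-contained piece, here is a minimal sketch. It is not
the code from either message in this thread. contig_to_seq is a hypothetical
helper name, and the sketch assumes each CDS row should become its own CDS
feature (a multi-exon gene would need a split location instead). Start and
end are sorted before use, as a defensive assumption in case a feature
class reports start > end on the minus strand.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Bio::Seq;
use Bio::SeqIO;
use Bio::SeqFeature::Generic;

# Hypothetical helper: build a standalone Bio::Seq from a contig name,
# its DNA string, and a list of feature objects that answer
# start/end/strand (for example, what $segment->features('CDS') returns).
sub contig_to_seq {
    my ($name, $dna, @features) = @_;

    my $seq = Bio::Seq->new(
        -id       => $name,
        -seq      => $dna,
        -alphabet => 'dna',
    );

    for my $feature (@features) {
        # Normalise coordinates so that start <= end in every case.
        my ($low, $high) = sort { $a <=> $b } ($feature->start, $feature->end);

        # Copy only the coordinates and strand; tags such as /product
        # or /translation are left out of this sketch.
        my $copy = Bio::SeqFeature::Generic->new(
            -start       => $low,
            -end         => $high,
            -strand      => $feature->strand,
            -primary_tag => 'CDS',
        );
        $seq->add_SeqFeature($copy);
    }
    return $seq;
}

# Assumed usage against the database (untested here; the 'CDS' feature
# type and the use of $segment->dna for the raw sequence string are
# assumptions about how the data were loaded):
#
#   my $db      = Bio::DB::GFF->new( -dsn => $ARGV[0] );
#   my $out     = Bio::SeqIO->new( -file => '>contigs.gb', -format => 'genbank' );
#   my $segment = $db->segment($contig_name);
#   $out->write_seq(
#       contig_to_seq( $contig_name, $segment->dna, $segment->features('CDS') )
#   );
```

Keeping the copying step separate from the database calls means it can be
tried on hand-made features first, before pointing it at the real contigs.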

