Mark Johnson | 18 Aug 22:52

Bio::Annotation issues with BioSQL

    I'm presently refactoring an in-house protein annotation pipeline
and converting it to use BioSQL as a data store.  I've noticed some
slightly screwy behavior with regard to how some of the
Bio::Annotation classes are handled:

-Instances of Bio::Annotation::SimpleValue and
Bio::Annotation::StructuredValue attached to the annotation collection
for a sequence feature (Bio::SeqFeature::Generic) are converted to
tags/values on the feature.
-Instances of Bio::Annotation::DBLink with attached comments lose the comment.

    I'm storing and retrieving things thusly:

use Bio::DB::BioDB;
use Bio::Seq;

# Connect to the BioSQL schema on the Oracle instance
my $dbadp = Bio::DB::BioDB->new(
    -database => 'biosql',
    -user     => $user,
    -pass     => $pass,
    -dbname   => $ora_instance,
    -driver   => 'Oracle',
);

# Persistence adaptor for sequence objects
my $adp = $dbadp->get_object_adaptor("Bio::SeqI");

my $seq = Bio::Seq->new(
    -id               => 'DEBUG001',
    -accession_number => 'DBG001',
    -desc             => 'Debug Sequence',
    -seq              => 'GATTACA',
    -namespace        => 'DEBUG',
);
[...]

Hilmar Lapp | 19 Aug 19:56

Re: [BioSQL-l] Bio::Annotation issues with BioSQL


On Aug 18, 2008, at 4:53 PM, Mark Johnson wrote:

>    I'm presently refactoring an in-house protein annotation pipeline
> and converting it to use BioSQL as a data store.  I've noticed some
> slightly screwy behavior with regard to how some of the
> Bio::Annotation classes are handled:
>
> -Instances of Bio::Annotation::SimpleValue and
> Bio::Annotation::StructuredValue attached to the annotation collection
> for a sequence feature (Bio::SeqFeature::Generic) are converted to
> tags/values on the feature.
>
> -Instances of Bio::Annotation::DBLink with attached comments lose
> the comment.
> [...]
> $query->where(["s.display_id like DEBUG%'"]);

There's a single quote missing here, but I'm assuming that's a result  
of copy/paste editing?

> [...]
>    Is bioperl-db / BioSQL trying to tell me that I shouldn't be using
> Bio::Annotation::SimpleValue and Bio::Annotation::StructuredValue?

Your example code doesn't contain an example for where you are getting  
the B::A::StructuredValue object from. If you didn't create that  
yourself, it would be good to know what you did to end up with that.  
Chris Fields has written B::A::Tagtree, which would be the way forward, and
if you created the object yourself, can you take a look at that and see
whether that class wouldn't serve your purpose as well or even better?
[...]

Mark Johnson | 20 Aug 20:43

Re: [BioSQL-l] Bio::Annotation issues with BioSQL

On Tue, Aug 19, 2008 at 12:56 PM, Hilmar Lapp <hlapp <at> gmx.net> wrote:
> On Aug 18, 2008, at 4:53 PM, Mark Johnson wrote:
> There's a single quote missing here, but I'm assuming that's a result of
> copy/paste editing?

Yes, I was a bit sloppy with the example.

> Your example code doesn't contain an example for where you are getting the
> B::A::StructuredValue object from. If you didn't create that yourself, it
> would be good to know what you did to end up with that. Chris Fields has
> written B::A::Tagtree which would be way forward, and if you created the
> object yourself, can you take a look at that and see whether that class
> wouldn't serve your purpose as well or even better?

I created the B::A::StructuredValue myself.  I'm using it to store the
output from PSORTb, which gives a cellular localization and a score
for a protein sequence (gene), which I'm trying to keep paired
together, if possible.  I'll take a look at B::A::Tagtree; that's
probably a better fit.

> In order to be stored in BioSQL, structured (hierarchical, nested) annotation
> is flattened into a string representation, because BioSQL can't store nested
> annotation collections natively. Right now, if I am not mistaken, upon
> retrieval this is not converted back into a B::A::Tagtree object but rather
> left flat. This is being worked on, though; we've just discussed some issues
> connected with that.

The data I have isn't really deeply nested.  I just like to keep
related annotation in one object, if possible.

[...]

Chris Fields | 20 Aug 22:25

Re: [BioSQL-l] Bio::Annotation issues with BioSQL


On Aug 20, 2008, at 1:43 PM, Mark Johnson wrote:

> ...
>
>> I could make B::A::StructuredValue work the same way, but I'm not sure
>> what it provides that B::A::Tagtree doesn't. The latter uses Data::Stag
>> under the hood, which is much cleaner, and more extensible in the future.
>
> Perhaps B::A::StructuredValue should be deprecated?

Probably.  The only place it was used in core was SeqIO::swiss (and  
now that uses Tagtree in bioperl-live).

Let me know if you have any problems with Bio::Annotation::Tagtree.  I  
am planning on doing some more work with it soon.

chris
