Mark Livingstone | 17 Jul 2012 03:49

[Biopython] The PDBParser Permissive setting

Hi Guys,

In my code I am experimenting with different ways of doing RMSD
calculations. In addition to the normal CA-based RMSD, my code can do
(CA & CB) RMSD and also sidechain RMSD. On a perfect PDB file this
works well. Unfortunately, the PDB files in my curated set are fairly
average / poor in quality :-( and I only find out when one of my many
Try/Except blocks falls over.

I need a better way to find out sooner if a PDB file is missing data.

I am therefore wondering: if I set PERMISSIVE=0 for PDBParser and,
after selecting the relevant models and chains etc., I did

wt_atoms = Bio.PDB.Selection.unfold_entities(wtc, 'A')

If this successfully works without throwing an Exception, can I assume
that this unfolded chain is perfect, or are there ways that I could
still be tripped up?

Alternatively, can anyone suggest code that I can employ in my
curation process that will give me a decent sanity check of PDB
quality, so I can get on writing experimental code - and not
Try/Except blocks :-(

Thanks in advance,

MarkL
_______________________________________________
Biopython mailing list  -  Biopython <at> lists.open-bio.org

João Rodrigues | 17 Jul 2012 08:42

Re: [Biopython] The PDBParser Permissive setting

Hey Mark,

What kind of validation do you want?

Cheers,

João

Mark Livingstone | 17 Jul 2012 10:54

Re: [Biopython] The PDBParser Permissive setting

Hi João,

I guess it would be good if I could get a data structure that had no
discontinuities, no missing data points and no unknowns. I would also
like to be able to tell it to ignore HOH or other irrelevancies.

My use case as I mentioned is RMSD and similar algorithms, so one
continuous structure with all the data attached that I can iterate
through, selecting atoms / residues as needed, and get the names and
coordinates as I go.

So I guess I want a PDB diagnostic type of program that lets me find
exemplary PDB files to use during the initial stages of development
while I do proof of concept, since finding edge-case PDBs for later
work is, it seems, not as hard as finding good ones ;-) Maybe the
simplest way to think of the sort of PDBs I want is that you can run
your software on them and you don't need any try / except blocks for
Biopython to work well
:-D
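
A diagnostic of the kind described above could start out as the minimal
sketch below. It is not an existing Biopython tool: the function name
chain_report and the choice of required atoms are assumptions made for
illustration. It relies only on chain iteration, residue.get_id(),
residue.get_resname() and the "name in residue" test that Bio.PDB
entities support. A jump in residue numbering is treated as a possible
break, which is only a heuristic, since some files legitimately skip
numbers.

```python
BACKBONE = ("N", "CA", "C", "O")  # assumed minimum set; extend for sidechain work


def chain_report(chain, required=BACKBONE):
    """Return (gaps, incomplete) for one chain.

    gaps       -- list of (previous_resseq, next_resseq) numbering jumps
    incomplete -- list of (resseq, resname, missing_atom_names)
    """
    gaps, incomplete = [], []
    previous = None
    for residue in chain:
        hetflag, resseq, icode = residue.get_id()
        if hetflag != " ":  # skip HOH and other hetero groups
            continue
        missing = [name for name in required if name not in residue]
        if missing:
            incomplete.append((resseq, residue.get_resname(), missing))
        if previous is not None and resseq - previous > 1:
            gaps.append((previous, resseq))
        previous = resseq
    return gaps, incomplete
```

A file for which both returned lists are empty is a reasonable first
candidate for the "exemplary" set. For CA & CB RMSD the required tuple
would also need "CB", with glycine handled separately.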

Cheers,

MarkL


João Rodrigues | 17 Jul 2012 11:35

Re: [Biopython] The PDBParser Permissive setting

You mean, for example, no chain breaks? And no missing atoms in residues?
You can check the first one with a warning catcher (I think I answered
something like this a while ago here on the mailing list). The second
one is trickier: you'll need a sort of topology to know which atoms belong
to each residue. I have something like that in my GSoC branch but it's very
very very experimental..

Is this what you mean? Which others would you be looking for? I think that
for RMSD alone you need only to make sure that you match equivalent atoms.
That should be easy enough without major modifications or endless
try/excepts :)
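
Matching equivalent atoms can be sketched as follows. The function name
pair_atoms is an assumption made for illustration. It relies only on
chain and residue iteration, residue.get_id() and atom.get_name(), and
it assumes both chains use the same residue numbering. Atoms present in
only one chain are dropped rather than raising.

```python
def pair_atoms(chain_a, chain_b, atom_names=("CA",)):
    """Return two equal-length atom lists holding atoms found in both chains."""

    def index(chain):
        table = {}
        for residue in chain:
            res_id = residue.get_id()
            if res_id[0] != " ":  # skip waters and other hetero residues
                continue
            for atom in residue:
                if atom.get_name() in atom_names:
                    table[(res_id, atom.get_name())] = atom
        return table

    a, b = index(chain_a), index(chain_b)
    shared = sorted(a.keys() & b.keys())  # same order for both lists
    return [a[key] for key in shared], [b[key] for key in shared]
```

The two lists can then be handed to Bio.PDB.Superimposer via
set_atoms(fixed, moving), after which its rms attribute holds the RMSD.
Comparing the list length with the number of residues shows how much
was skipped.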
_______________________________________________
Biopython mailing list  -  Biopython <at> lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/biopython

