Egon Willighagen | 31 Jul 15:59 2008

Re: The Blue Obelisk is challenged: predicting solubility

On Thu, Jul 31, 2008 at 3:43 PM, Rajarshi Guha <rguha@...> wrote:
> Reproducible is definitely a worthy model. But equally important is
> quality.

Surely we aim at the most statistically sound *and* most predictive
model. But anyone can do that.

> You're right that given the small training set, it might be a bit tricky
> to get a good quality model. But are you thinking of straight-up
> QSAR?

Not generally... I am not aware of polymorph prediction tools that are
open source, but I am not saying that we should not take that
approach... actually...

> Recent papers [1,2] suggest that crystal lattice energies
> should be taken into account - but for that they used structures from
> the CCDC.

I fully agree... the solubility is certainly not just a function of
the molecular structure, but of the (polymorph) crystal structure
too...

Indeed, it's an interesting challenge :)

Polymorph prediction in itself already is a challenge (there is a
biannual(?) competition for that), and correlating those polymorphs to
solubility is a whole other story... force fields for crystal
structures have their limitations...


Rajarshi Guha | 31 Jul 16:10 2008

Re: The Blue Obelisk is challenged: predicting solubility


On Jul 31, 2008, at 9:59 AM, Egon Willighagen wrote:
> On Thu, Jul 31, 2008 at 3:43 PM, Rajarshi Guha <rguha@...> wrote:
>> Reproducible is definitely a worthy model. But equally important is
>> quality.
>
> Surely we aim at the most statistically sound *and* most predictive
> model. But anyone can do that.

That is debatable :)

>> Recent papers [1,2] suggest that crystal lattice energies
>> should be taken into account - but for that they used structures from
>> the CCDC.
>
> I fully agree... the solubility is certainly not just a function of
> the molecular structure, but of the (polymorph) crystal structure
> too...
>
> Indeed, it's an interesting challenge :)
>
> Polymorph prediction in itself already is a challenge (there is a
> biannual(?) competition for that), and correlating those polymorphs to
> solubility is a whole other story... force fields for crystal
> structures have their limitations...
>
> So, a simple QSAR approach would be rather quick-and-dirty, but might
> not do that bad...


Egon Willighagen | 31 Jul 16:19 2008

Re: The Blue Obelisk is challenged: predicting solubility

On Thu, Jul 31, 2008 at 4:10 PM, Rajarshi Guha <rguha@...> wrote:
> First need the appropriate structures

See:

http://www-jmg.ch.cam.ac.uk/data/solubility/

BTW, the file has 3D coordinates, and look at the first entry, the
planar naphthol:

------------
1-Naphthol
  MOE2005           3D

 19 20  0  0  0  0  0  0  0  0999 V2000
   -0.0090    0.0000   -0.0140 C   0  0  0  0  0  0  0  0  0  0  0  0
------------

Yes, they made the smart choice of orienting planar molecules in the XZ
plane (not XY). Go MOE!
(Remember our discussion the other day :)
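
A minimal sketch of how one could check this for every entry rather than
by eye, assuming the records are plain V2000 molblocks with fixed-width
coordinate columns. The names planar_plane and make_molblock are made up
for illustration, and only the first atom row below comes from the file;
the other two rows are invented test values.

```python
def planar_plane(molblock, tol=0.05):
    """Return 'XY', 'XZ' or 'YZ' if all atoms lie (within tol Angstrom)
    in that coordinate plane, else None. Only axis-aligned planes are
    detected; a tilted planar molecule returns None."""
    lines = molblock.splitlines()
    natoms = int(lines[3][0:3])          # counts line: first 3 chars = atom count
    coords = [
        (float(row[0:10]), float(row[10:20]), float(row[20:30]))
        for row in lines[4:4 + natoms]   # atom block follows the counts line
    ]
    # Spread of the atoms along x, y and z
    spreads = [max(c[i] for c in coords) - min(c[i] for c in coords)
               for i in range(3)]
    flat = min(range(3), key=lambda i: spreads[i])
    if spreads[flat] > tol:
        return None
    return ("YZ", "XZ", "XY")[flat]      # plane perpendicular to the flat axis


def make_molblock(name, atoms):
    """Build a bare-bones V2000 molblock (no bonds) for testing."""
    head = [name, "  sketch            3D", ""]
    counts = "%3d%3d  0  0  0  0  0  0  0  0999 V2000" % (len(atoms), 0)
    rows = ["%10.4f%10.4f%10.4f %-3s 0  0  0  0  0  0  0  0  0  0  0  0"
            % (x, y, z, el) for el, x, y, z in atoms]
    return "\n".join(head + [counts] + rows + ["M  END"])


# First row is the atom shown above; the other two are invented.
atoms = [("C", -0.0090, 0.0, -0.0140),
         ("C", 1.2000, 0.0, 0.7000),
         ("C", 2.4000, 0.0, -0.1000)]
print(planar_plane(make_molblock("demo", atoms)))
```

The tolerance is arbitrary; for a real check one would probably fit a
least-squares plane instead of relying on axis alignment.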

I'm beefing up the atom typing stuff in the CDK, and will report
results (and hopefully Sybyl atom types too) asap. I don't expect
anything wrong, but one never knows about a typo somewhere in the
input.

Egon


Egon Willighagen | 31 Jul 16:21 2008

Re: The Blue Obelisk is challenged: predicting solubility

On Thu, Jul 31, 2008 at 4:19 PM, Egon Willighagen
<egon.willighagen@...> wrote:
> On Thu, Jul 31, 2008 at 4:10 PM, Rajarshi Guha <rguha@...> wrote:
>> First need the appropriate structures
>
> See:
>
> http://www-jmg.ch.cam.ac.uk/data/solubility/
>
> BTW, the file has 3D coordinates, and look at the first entry, the
> planar naphthol:

And the SD files also include the InChIs of the structures.
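
A rough sketch of pulling those out without any toolkit, assuming the
usual SD layout ("> <TAG>" header, value lines, blank line, "$$$$"
between records). The tag name "InChI" and the function name
read_sd_data_fields are assumptions, and the methane record is a
made-up test case, not an entry from the data set.

```python
def read_sd_data_fields(sd_text):
    """Return one dict of data fields per SD record."""
    records, fields, name = [], {}, None
    for line in sd_text.splitlines():
        if line.strip() == "$$$$":               # record separator
            records.append(fields)
            fields, name = {}, None
        elif line.startswith(">") and "<" in line:
            # Data header such as "> <InChI>": take the text between < and >
            name = line[line.index("<") + 1:line.rindex(">")]
            fields[name] = ""
        elif name is not None:
            if line.strip():                     # value line (may span several)
                fields[name] = (fields[name] + "\n" + line) if fields[name] else line
            else:                                # blank line ends the value
                name = None
    if fields:                                   # last record without "$$$$"
        records.append(fields)
    return records


# Made-up single-record SD file for testing.
sample = "\n".join([
    "methane",
    "  sketch            3D",
    "",
    "  1  0  0  0  0  0  0  0  0  0999 V2000",
    "    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0",
    "M  END",
    "> <InChI>",
    "InChI=1S/CH4/h1H4",
    "",
    "$$$$",
])
print(read_sd_data_fields(sample))
```

With the real files one would first check which tag the InChI is
actually stored under.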

Egon

--

-- 
----
http://chem-bla-ics.blogspot.com/

-------------------------------------------------------------------------
This SF.Net email is sponsored by the Moblin Your Move Developer's challenge
Build the coolest Linux based applications with Moblin SDK & win great prizes
Grand prize is a trip for two to an Open Source event anywhere in the world
http://moblin-contest.org/redirect.php?banner_id=100&url=/
Rajarshi Guha | 31 Jul 16:37 2008

Re: The Blue Obelisk is challenged: predicting solubility


On Jul 31, 2008, at 10:19 AM, Egon Willighagen wrote:
> On Thu, Jul 31, 2008 at 4:10 PM, Rajarshi Guha <rguha@...> wrote:
>> First need the appropriate structures
>
> See:
>
> http://www-jmg.ch.cam.ac.uk/data/solubility/
>
> BTW, the file has 3D coordinates, and look at the first entry, the
> planar naphthol:

Hmm, where did they come from? They can't be CCDC structures (?). If
not, how were they generated? Was solvent used, etc.?

-------------------------------------------------------------------
Rajarshi Guha  <rguha@...>
GPG Fingerprint: D070 5427 CC5B 7938 929C  DD13 66A1 922C 51E7 9E84
-------------------------------------------------------------------
Does Ramanujan know Polish?
                --  E.B. Ross

Egon Willighagen | 31 Jul 16:47 2008

Re: The Blue Obelisk is challenged: predicting solubility

On Thu, Jul 31, 2008 at 4:37 PM, Rajarshi Guha <rguha@...> wrote:
>> http://www-jmg.ch.cam.ac.uk/data/solubility/
>>
>> BTW, the file has 3D coordinates, and look at the first entry, the
>> planar naphthol:
>
> Hmm, where did they come from? They can't be CCDC structures (?). If not,
> how were they generated, was solvent used etc?

Valid questions; can't quickly find it in the paper...

Egon

--

-- 
----
http://chem-bla-ics.blogspot.com/

Rajarshi Guha | 31 Jul 16:49 2008

Re: The Blue Obelisk is challenged: predicting solubility


On Jul 31, 2008, at 10:47 AM, Egon Willighagen wrote:
> On Thu, Jul 31, 2008 at 4:37 PM, Rajarshi Guha <rguha@...> wrote:
>>> http://www-jmg.ch.cam.ac.uk/data/solubility/
>>>
>>> BTW, the file has 3D coordinates, and look at the first entry, the
>>> planar naphthol:
>>
>> Hmm, where did they come from? They can't be CCDC structures (?). If not,
>> how were they generated, was solvent used etc?
>
> Valid questions; can't quickly find it in the paper...

In any case, there is no harm in rerunning the 3D coordinate generation
and the conformational analysis.

-------------------------------------------------------------------
Rajarshi Guha  <rguha@...>
GPG Fingerprint: D070 5427 CC5B 7938 929C  DD13 66A1 922C 51E7 9E84
-------------------------------------------------------------------
Q: What do you get when you cross a Post Modernist with a Mafioso?
A: An offer you can't understand.

