Raymond Hettinger | 10 Oct 01:08

Documentation idea

Background
----------
In the itertools module docs, I included pure python equivalents for each
of the C functions.  Necessarily, some of those equivalents are only
approximate but they seem to have greatly enhanced the docs.  Something
similar is in the builtin docs for any() and all().  The new
collections.namedtuple() factory function also includes a verbose option
that prints a pure python equivalent for the generated class.  And in the
decimal module, I took examples directly from the spec and included them in
doc-testable docstrings.  This assured compliance with the spec while
providing clear examples to anyone who bothers to look at the docstrings.

For itertools docs, I combined those best practices and included sample
calls in the pure-python code (see the current docs for itertools to see
what I mean -- perhaps look at the docs for a new tool like
itertools.product() or itertools.izip_longest() to see how useful it is).

Bright idea
----------
Let's go one step further and do this just about everywhere and instead of
putting it in the docs, attach an exec-able string as an attribute to our C
functions.  Further, those pure python examples should include doctests so
that the user can see a typical invocation and calling pattern.

Say we decide to call the attribute something like ".python", then you could write something like:


Christian Heimes | 10 Oct 01:33

Re: Documentation idea

Raymond Hettinger wrote: lots of cool stuff!

The idea sounds great!

Are you planning to embed the pure python code in C code? That's going to 
increase the data segment of the executable. It should be possible to 
disable and remove the pure python example with a simple ./configure 
option and some macro magic. File size and in-memory size are still 
critical for embedders.

Christian

_______________________________________________
Python-Dev mailing list
Python-Dev <at> python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/python-python-dev%40m.gmane.org

Raymond Hettinger | 10 Oct 01:50

Re: Documentation idea

[Christian Heimes]
> The idea sounds great!
>
> Are you planning to embed the pure python code in C code?

Am experimenting with a descriptor that fetches the attribute string from a
separate text file.  This keeps the C build from getting fat.  More
importantly, it lets us write the exec-able string in a more natural way
(it bites to write C style docstrings using \n and trailing backslashes).
The best part is that people without C compilers can still submit patches
to the text files.

Raymond 
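
A minimal sketch of the descriptor idea described above, written for
Python 3.  The class name TextFileAttribute, the stand-in AllFunction
class, and the temporary text file are all hypothetical; the message does
not say how the experimental descriptor is actually built.

```python
import os
import tempfile


class TextFileAttribute:
    """Descriptor that lazily reads an attribute's text from a file.

    Hypothetical helper: the name, the file layout, and the caching
    policy are illustrative assumptions, not part of the proposal.
    """

    def __init__(self, path):
        self.path = path
        self._cache = None

    def __get__(self, obj, objtype=None):
        # Read the file on first access only, so nothing is loaded
        # until somebody actually asks for the text.
        if self._cache is None:
            with open(self.path, encoding="utf-8") as f:
                self._cache = f.read()
        return self._cache


# Write a stand-in text file holding the pure Python equivalent.
source = (
    "def all(iterable):\n"
    "    for element in iterable:\n"
    "        if not element:\n"
    "            return False\n"
    "    return True\n"
)
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w", encoding="utf-8") as f:
    f.write(source)


class AllFunction:
    """Stand-in for a C function object; it wraps the builtin all()."""

    python = TextFileAttribute(path)

    def __call__(self, iterable):
        return all(iterable)


my_all = AllFunction()
namespace = {}
exec(my_all.python, namespace)  # the fetched text is exec-able
os.remove(path)                 # the text is cached, so the file can go
```

Because the text lives in an ordinary file, it can be edited without
touching or rebuilding any C source.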


Lisandro Dalcin | 10 Oct 04:13

Re: Documentation idea

On Thu, Oct 9, 2008 at 8:50 PM, Raymond Hettinger <python <at> rcn.com> wrote:
> [Christian Heimes]
>>
>> The idea sounds great!
>>
>> Are you planning to embed the pure python code in C code?
>
> Am experimenting with a descriptor that fetches the attribute string from a
> separate text file.

Have you ever considered the same approach for docstrings in C code?
As reference, NumPy already has some trickery for maintaining
docstrings outside C sources. Of course, descriptors would be a far
better way of implementing and supporting this in core Python and other
projects...

>  This keeps the C build from getting fat.  More
> importantly, it lets us write the exec-able string in a more natural way (it
> bites to write C style docstrings using \n and trailing backslashes).  The
> best part is that people without C compilers can still submit patches to the
> text files.
>
>
> Raymond

Raymond Hettinger | 10 Oct 04:39

Re: Documentation idea

Yes, I'm looking at a couple of different approaches to loading the strings.
For now though, I want to focus on the idea itself, not the implementation.
The important thing is to gather widespread support before getting into
the details of how the strings get loaded.

Raymond

----- Original Message ----- 
From: "Lisandro Dalcin" <dalcinl <at> gmail.com>

Have you ever considered the same approach for docstrings in C code?
As reference, NumPy already has some trickery for maintaining
docstrings outside C sources. Of course, descriptors would be a far
better way of implementing and supporting this in core Python and other
projects...

>  This keeps the C build from getting fat.  More


Brett Cannon | 10 Oct 03:41

Re: Documentation idea

On Thu, Oct 9, 2008 at 4:12 PM, Raymond Hettinger <python <at> rcn.com> wrote:
[SNIP]
> Bright idea
> ----------
> Let's go one step further and do this just about everywhere and instead of
> putting it in the docs, attach an exec-able string as an attribute to our C
> functions.  Further, those pure python examples should include doctests so
> that the user can see a typical invocation and calling pattern.
>
> Say we decide to call the attribute something like ".python", then you could
> write something like:
>
>   >>> print(all.python)
>   def all(iterable):
>       '''Return True if all elements of the iterable are true.
>
>       >>> all(isinstance(x, int) for x in [2, 4, 6.13, 8])
>       False
>       >>> all(isinstance(x, int) for x in [2, 4, 6, 8])
>       True
>       '''
>
>       for element in iterable:
>           if not element:
>               return False
>       return True
>
> There you have it, a docstring, doctestable examples, and pure python
> equivalent all in one place.  And since the attribute is distinguished from
> __doc__, we can insist that the string be exec-able (something we can't

glyph | 10 Oct 05:37

Re: Documentation idea


On 9 Oct, 11:12 pm, python <at> rcn.com wrote:
>Background
>----------
>In the itertools module docs, I included pure python equivalents for 
>each of the C functions.  Necessarily, some of those equivalents are 
>only approximate but they seem to have greatly enhanced the docs.

Why not go the other direction?

Ostensibly the reason for writing a module like 'itertools' in C is 
purely for performance.  There's nothing that I'm aware of in that 
module which couldn't be in Python.

Similarly, cStringIO, cPickle, etc.  Everywhere these diverge, it is (if 
not a flat-out bug) not optimal.  External projects are encouraged by a 
wealth of documentation to solve performance problems in a similar way: 
implement in Python, once you've got the interface right, optimize into 
C.

So rather than have a C implementation, which points to Python, why not 
have a Python implementation that points at C?  'itertools' (and 
similar) can actually be Python modules, and use a decorator, let's call 
it "C", to do this:

    @C("_c_itertools.count")
    class count(object):
        """
        This is the documentation for both the C version of itertools.count

Jared Grubb | 10 Oct 05:54

Re: Documentation idea

This is a really interesting idea. If extra memory/lookup overhead is  
a concern, you could enable this new feature by default when the  
interactive interpreter is started (where it's more likely to be  
invoked), and turn it off by default when running scripts/modules.

Jared

On 9 Oct 2008, at 20:37, glyph <at> divmod.com wrote:

>
> On 9 Oct, 11:12 pm, python <at> rcn.com wrote:
>> Background
>> ----------
>> In the itertools module docs, I included pure python equivalents  
>> for each of the C functions.  Necessarily, some of those  
>> equivalents are only approximate but they seem to have greatly  
>> enhanced the docs.
>
> Why not go the other direction?
>
> Ostensibly the reason for writing a module like 'itertools' in C is  
> purely for performance.  There's nothing that I'm aware of in that  
> module which couldn't be in Python.
>
> Similarly, cStringIO, cPickle, etc.  Everywhere these diverge, it is  
> (if not a flat-out bug) not optimal.  External projects are  
> encouraged by a wealth of documentation to solve performance  
> problems in a similar way: implement in Python, once you've got the  
> interface right, optimize into C.
>

Brett Cannon | 10 Oct 19:27

Re: Documentation idea

On Thu, Oct 9, 2008 at 8:37 PM,  <glyph <at> divmod.com> wrote:
>
> On 9 Oct, 11:12 pm, python <at> rcn.com wrote:
>>
>> Background
>> ----------
>> In the itertools module docs, I included pure python equivalents for each
>> of the C functions.  Necessarily, some of those equivalents are only
>> approximate but they seem to have greatly enhanced the docs.
>
> Why not go the other direction?
>
> Ostensibly the reason for writing a module like 'itertools' in C is purely
> for performance.  There's nothing that I'm aware of in that module which
> couldn't be in Python.
>
> Similarly, cStringIO, cPickle, etc.  Everywhere these diverge, it is (if not
> a flat-out bug) not optimal.  External projects are encouraged by a wealth
> of documentation to solve performance problems in a similar way: implement
> in Python, once you've got the interface right, optimize into C.
>
> So rather than have a C implementation, which points to Python, why not have
> a Python implementation that points at C?  'itertools' (and similar) can
> actually be Python modules, and use a decorator, let's call it "C", to do
> this:
>
>   @C("_c_itertools.count")
>   class count(object):
>       """
>       This is the documentation for both the C version of itertools.count

Terry Reedy | 10 Oct 22:45

Re: Documentation idea

glyph <at> divmod.com wrote:
> 
> On 9 Oct, 11:12 pm, python <at> rcn.com wrote:
>> Background
>> ----------
>> In the itertools module docs, I included pure python equivalents for 
>> each of the C functions.  Necessarily, some of those equivalents are 
>> only approximate but they seem to have greatly enhanced the docs.
> 
> Why not go the other direction?
> 
> Ostensibly the reason for writing a module like 'itertools' in C is 
> purely for performance.  There's nothing that I'm aware of in that 
> module which couldn't be in Python.
> 
> Similarly, cStringIO, cPickle, etc.  Everywhere these diverge, it is (if 
> not a flat-out bug) not optimal.  External projects are encouraged by a 
> wealth of documentation to solve performance problems in a similar way: 
> implement in Python, once you've got the interface right, optimize into C.
> 
> So rather than have a C implementation, which points to Python, why not 
> have a Python implementation that points at C?  'itertools' (and 
> similar) can actually be Python modules, and use a decorator, let's call 
> it "C", to do this:
> 
>    @C("_c_itertools.count")
>    class count(object):
>        """
>        This is the documentation for both the C version of itertools.count
>        and the Python version - since they should be the same, right?

Brett Cannon | 10 Oct 23:57

Re: Documentation idea

On Fri, Oct 10, 2008 at 1:45 PM, Terry Reedy <tjreedy <at> udel.edu> wrote:
> glyph <at> divmod.com wrote:
>>
>> On 9 Oct, 11:12 pm, python <at> rcn.com wrote:
>>>
>>> Background
>>> ----------
>>> In the itertools module docs, I included pure python equivalents for each
>>> of the C functions.  Necessarily, some of those equivalents are only
>>> approximate but they seem to have greatly enhanced the docs.
>>
>> Why not go the other direction?
>>
>> Ostensibly the reason for writing a module like 'itertools' in C is purely
>> for performance.  There's nothing that I'm aware of in that module which
>> couldn't be in Python.
>>
>> Similarly, cStringIO, cPickle, etc.  Everywhere these diverge, it is (if
>> not a flat-out bug) not optimal.  External projects are encouraged by a
>> wealth of documentation to solve performance problems in a similar way:
>> implement in Python, once you've got the interface right, optimize into C.
>>
>> So rather than have a C implementation, which points to Python, why not
>> have a Python implementation that points at C?  'itertools' (and similar)
>> can actually be Python modules, and use a decorator, let's call it "C", to
>> do this:
>>
>>   @C("_c_itertools.count")
>>   class count(object):
>>       """

Terry Reedy | 11 Oct 06:46

Re: Documentation idea

Brett Cannon wrote:
> On Fri, Oct 10, 2008 at 1:45 PM, Terry Reedy <tjreedy <at> udel.edu> wrote:

>> The advantage of the decorator version is that the compiler or module loader
>> could be special cased to recognize the 'C' decorator and try it first
>> *before* using the Python version, which would serve as a backup.  There
>> could be a standard version in builtins that people could replace to
>> implement non-standard loading on a particular system.  To cater to other
>> implementations, the name could be something other than 'C', or we could
>> define 'C' to be the initial of "Code" (in the implementation language).
>>  Either way, other implementation could start with a do-nothing "C"
>> decorator and run the file as is, then gradually replace with lower-level
>> code.
>>
> 
> The decorator doesn't have to require any special casing at all
> (changing the parameters to keep the code short)::
> 
>   def C(module_name, want):
>       def choose_version(ob):
>           try:
>               module = __import__(module_name, fromlist=[want])
>               return getattr(module, want)
>           except (ImportError, AttributeError):
>               return ob
>       return choose_version
> 
> The cost is purely during importation of the module and does nothing
> fancy at all and relies on stuff already available in all Python VMs.
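
The decorator quoted above can be exercised roughly as follows (Python 3).
This is only a sketch: the choice of math.sqrt as the "accelerated"
version and the module name _no_such_accelerator are assumptions made so
that both code paths can be shown.

```python
def C(module_name, want):
    """Return a decorator that prefers module_name.want when importable."""
    def choose_version(ob):
        try:
            module = __import__(module_name, fromlist=[want])
            return getattr(module, want)
        except (ImportError, AttributeError):
            # Fall back to the decorated pure Python object.
            return ob
    return choose_version


@C("math", "sqrt")
def sqrt(x):
    """Pure Python fallback (hypothetical, for illustration only)."""
    return x ** 0.5


@C("_no_such_accelerator", "count")
def count(n=0):
    """Pure Python version, kept because the import above fails."""
    while True:
        yield n
        n += 1
```

After this runs, the name sqrt is bound to math.sqrt, while count is still
the generator function defined in the file.  The lookup happens once, when
the decorator is applied at import time.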


Brett Cannon | 11 Oct 08:53

Re: Documentation idea

On Fri, Oct 10, 2008 at 9:46 PM, Terry Reedy <tjreedy <at> udel.edu> wrote:
> Brett Cannon wrote:
>>
>> On Fri, Oct 10, 2008 at 1:45 PM, Terry Reedy <tjreedy <at> udel.edu> wrote:
>
>>> The advantage of the decorator version is that the compiler or module
>>> loader
>>> could be special cased to recognize the 'C' decorator and try it first
>>> *before* using the Python version, which would serve as a backup.  There
>>> could be a standard version in builtins that people could replace to
>>> implement non-standard loading on a particular system.  To cater to other
>>> implementations, the name could be something other than 'C', or we could
>>> define 'C' to be the initial of "Code" (in the implementation language).
>>>  Either way, other implementation could start with a do-nothing "C"
>>> decorator and run the file as is, then gradually replace with lower-level
>>> code.
>>>
>>
>> The decorator doesn't have to require any special casing at all
>> (changing the parameters to keep the code short)::
>>
>>  def C(module_name, want):
>>      def choose_version(ob):
>>          try:
>>              module = __import__(module_name, fromlist=[want])
>>              return getattr(module, want)
>>          except (ImportError, AttributeError):
>>              return ob
>>      return choose_version
>>

Fernando Perez | 16 Oct 06:41

Re: Documentation idea

Raymond Hettinger wrote:

> Bright idea
> ----------
> Let's go one step further and do this just about everywhere and instead of
> putting it in the docs, attach an exec-able string as an
> attribute to our C functions.  Further, those pure python examples should
> include doctests so that the user can see a typical invocation and calling
> pattern.
> 
> Say we decide to call the attribute something like ".python", then you
> could write something like:
> 
>     >>> print(all.python)
>     def all(iterable):
>         '''Return True if all elements of the iterable are true.
> 

[...]

+1 from the peanut gallery, with a note: since ipython is a common way for
many to use/learn python interactively, if this is adopted, we'd
*immediately* add to ipython's '?' introspection machinery the ability to
automatically find this information.  This way, when people type "all?"
or "all??" we'd fetch the doc and source code.

A minor question inspired by this: would it make sense to split the
docstring part from the code of this .python object?  I say this because in
principle, the docstring should be the same of the 'parent', and it would
simplify our implementation to eliminate the duplicate printout. 

Scott Dial | 16 Oct 20:13

Re: Documentation idea

Raymond Hettinger wrote:
> * It will assist pypy style projects and other python implementations
> when they have to build equivalents to CPython.
> 
> * Will eliminate confusion about what functions were exactly intended to
> do.
> 
> * Will confer benefits similar to test driven development where the
> documentation and  pure python version are developed first and doctests
> gotten to pass, then the C version is created to match.

I haven't seen anyone comment about this assertion of "equivalence".
Doesn't it strike you as difficult to maintain *two* versions of every
function, and ensure they match *exactly*? The utility to PyPy-style
projects is minimized if the two versions aren't identical. And while
it's possible to say, "the tests say they are equivalent, so they are;"
history is quite clear about people depending on "features" that are
untested and were unintended side-effects of the manner in which
something was implemented. I think it would be a dilution of developer
man-hours to force them to maintain two versions in lock-step, and it
significantly adds to the burden of writing and reviewing potential
bugfixes.

While I applaud the idea of documenting C functions in this manner,
let's not confuse documentation with equivalence. If the standard
distribution of Python exports the C version, then all bets are off
whether the Python version is a drop-in replacement (even if the
buildbots regularly test them). I feel so strongly about this that I
think that the consideration of adding this should be framed /solely/ as
a documentation tool and nothing more.

Brett Cannon | 16 Oct 21:59

Re: Documentation idea

On Thu, Oct 16, 2008 at 11:13 AM, Scott Dial
<scott+python-ideas <at> scottdial.com> wrote:
> Raymond Hettinger wrote:
>> * It will assist pypy style projects and other python implementations
>> when they have to build equivalents to CPython.
>>
>> * Will eliminate confusion about what functions were exactly intended to
>> do.
>>
>> * Will confer benefits similar to test driven development where the
>> documentation and  pure python version are developed first and doctests
>> gotten to pass, then the C version is created to match.
>
> I haven't seen anyone comment about this assertion of "equivalence".
> Doesn't it strike you as difficult to maintain *two* versions of every
> function, and ensure they match *exactly*?

More time-consuming than difficult. Raymond is currently talking about
things like built-ins and methods on types that do not exactly change
very often.

> The utility to PyPy-style
> projects is minimized if the two versions aren't identical. And while
> it's possible to say, "the tests say they are equivalent, so they are;"
> history is quite clear about people depending on "features" that are
> untested and were unintended side-effects of the manner in which
> something was implemented.

Right, and when we find out that there is a difference, we typically
standardize on a specific version and developers using the bogus

Raymond Hettinger | 16 Oct 23:11

Re: Documentation idea

> Raymond Hettinger wrote:
>> * It will assist pypy style projects and other python implementations
>> when they have to build equivalents to CPython.
>>
>> * Will eliminate confusion about what functions were exactly intended to
>> do.
>>
>> * Will confer benefits similar to test driven development where the
>> documentation and  pure python version are developed first and doctests
>> gotten to pass, then the C version is created to match.
>
> I haven't seen anyone comment about this assertion of "equivalence".
> Doesn't it strike you as difficult to maintain *two* versions of every
> function, and ensure they match *exactly*?

Glad you brought this up.  My idea is to present rough equivalence
in unoptimized python that is simple and clear.  The goal is to provide
better documentation where code is more precise than English prose.
That being said, some subset of the existing tests should be runnable
against the rough equivalent and the python code should incorporate doctests.
Running both sets of tests should suffice to maintain the rough equivalence.
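
As a rough illustration of running one set of checks against both
versions, here is a minimal Python 3 sketch.  The names py_all, CASES and
check are hypothetical, and the handful of cases stands in for "some
subset of the existing tests"; it is not taken from the real test suite.

```python
def py_all(iterable):
    """Rough pure Python equivalent of the builtin all()."""
    for element in iterable:
        if not element:
            return False
    return True


# Shared cases: (factory for a fresh argument, expected result).
# Factories are used so that one-shot generators are rebuilt per run.
CASES = [
    (lambda: [], True),
    (lambda: [1, 2, 3], True),
    (lambda: [1, 0, 3], False),
    (lambda: (x > 0 for x in [2, 4, -6]), False),
]


def check(implementation):
    """Run every shared case against one implementation."""
    for make_arg, expected in CASES:
        assert implementation(make_arg()) is expected


# The same checks run against the C builtin and the rough equivalent.
for implementation in (all, py_all):
    check(implementation)
```

This only checks results on ordinary inputs, which matches the stated goal
of rough rather than exact equivalence.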

The notion of exact equivalence should be left to PyPy folks who can attest
that the code can get convoluted when you try to simulate exactly when
error checking is performed, read-only behavior for attributes, and making
the stacktraces look the same when there are errors.  In contrast, my
goal is an approximation that is executable but highly readable and expository.

My thought is to do this only with tools where it really does enhance the
documentation.  The exercise is worthwhile in and of itself.  For example,

Doug Hellmann | 16 Oct 23:28

Re: Documentation idea


On Oct 16, 2008, at 5:11 PM, Raymond Hettinger wrote:

>> Raymond Hettinger wrote:
>>> * It will assist pypy style projects and other python  
>>> implementations
>>> when they have to build equivalents to CPython.
>>>
>>> * Will eliminate confusion about what functions were exactly  
>>> intended to
>>> do.
>>>
>>> * Will confer benefits similar to test driven development where the
>>> documentation and  pure python version are developed first and  
>>> doctests
>>> gotten to pass, then the C version is created to match.
>>
>> I haven't seen anyone comment about this assertion of "equivalence".
>> Doesn't it strike you as difficult to maintain *two* versions of  
>> every
>> function, and ensure they match *exactly*?
>
> Glad you brought this up.  My idea is to present rough equivalence
> in unoptimized python that is simple and clear.  The goal is to  
> provide
> better documentation where code is more precise than English prose.
> That being said, some subset of the existing tests should be runnable
> against the rough equivalent and the python code should incorporate  
> doctests.
> Running both sets of tests should suffice to maintain the rough  

Raymond Hettinger | 16 Oct 23:37

Re: Documentation idea

From: "Doug Hellmann" <doug.hellmann <at> gmail.com>
> This seems like a large undertaking.

Not necessarily.  It can be done incrementally, starting with things like
str.split() that almost no one understands completely.  It should be put
here and there where it adds some clarity.

> I'm sure you're not underestimating the effort, but I have the sense that
> you may be overestimating the usefulness of the results (or maybe I'm
> underestimating them through some lack of understanding).  Would it be
> more optimal (in terms of both effort and results) to extend the existing
> documentation and/or docstrings with examples that use all of the
> functions so developers can see how to call them and what results to
> expect?

The idea includes pure python code augmented by doctestable docstrings
with enough examples.  So, we're almost talking about the same thing.
There is one difference; since the new attribute is guaranteed to be
executable, it can be reliably run through doctest.  The same is *not* true
for arbitrary docstrings.

Raymond
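
To make the doctest point concrete, here is a minimal Python 3 sketch of
exec-ing such a string and running its embedded examples.  The constant
PYTHON_SOURCE and the helper run_embedded_doctests are hypothetical names;
the thread does not settle on how the attribute would be stored or
checked.

```python
import doctest

# Stand-in for the text that a ".python" attribute might hold.
PYTHON_SOURCE = '''
def all(iterable):
    """Return True if all elements of the iterable are true.

    >>> all(isinstance(x, int) for x in [2, 4, 6.13, 8])
    False
    >>> all(isinstance(x, int) for x in [2, 4, 6, 8])
    True
    """
    for element in iterable:
        if not element:
            return False
    return True
'''


def run_embedded_doctests(source, name):
    """Exec the source, then run the doctests of the object called name."""
    namespace = {}
    exec(source, namespace)
    func = namespace[name]
    # Build a DocTest from the docstring; the examples run with the
    # exec-ed namespace as globals, so they exercise the pure Python
    # function rather than the builtin of the same name.
    test = doctest.DocTestParser().get_doctest(
        func.__doc__, namespace, name, None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    return runner.run(test)


results = run_embedded_doctests(PYTHON_SOURCE, "all")
```

The returned value has failed and attempted counts, so a build or test
step could reject any string whose examples do not pass.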
