Gabriel Gilder | 20 Jan 2010 19:23

Docs for String.split

Hi there,
Just joined to point out a small oversight in the Ruby docs - I looked into the process for contributing documentation changes but it was a little over my head. :) So I figured someone here could probably take care of it really quickly.

So, the docs for String.split cover splitting by regex, but they don't mention that if you have a capturing subpattern (or several) in your regex, that gets included in the returned array.

For example:

"1, 2.34,56, 7".split(%r{(,\s*)}) #=> ["1", ", ", "2.34", ",", "56", ", ", "7"]
"word :separator: word".split(/(:(\w+):)/) #=> ["word ", ":separator:", "separator", " word"]
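
As a minimal sketch of the same behavior, the snippet below splits one string three ways: with no group, with a capturing group, and with a non-capturing group `(?:...)`, which groups without adding the delimiters to the result. The variable names are only illustrative.

```ruby
csv = "1, 2.34,56, 7"

# No group: the delimiters are dropped from the result.
plain = csv.split(/,\s*/)

# Capturing group: the text matched by the group is kept in the array.
with_captures = csv.split(/(,\s*)/)

# Non-capturing group: groups the pattern but keeps nothing extra.
non_capturing = csv.split(/(?:,\s*)/)

p plain          #=> ["1", "2.34", "56", "7"]
p with_captures  #=> ["1", ", ", "2.34", ",", "56", ", ", "7"]
p non_capturing  #=> ["1", "2.34", "56", "7"]
```

So if the grouping is only needed for alternation or repetition, `(?:...)` should avoid the extra array elements.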

I've tested this in Ruby 1.8.7, but I imagine it works the same way in Ruby 1.9 as well, so both versions of the docs should probably be updated.

Thanks, please let me know if you have any questions!

-Gabriel


--------------------------------
Gabriel Gilder
Graphic Design & Web Programming
http://gabrielgilder.com
gabriel <at> gabrielgilder.com

Roger Pack | 20 Jan 2010 19:47

Re: Docs for String.split

> I've tested this in Ruby 1.8.7, but I imagine it works the same way in Ruby
> 1.9 as well, so both versions of the docs should probably be updated.

Looks like it may be mentioned in the 1.9 docs...

http://rubydoc.ruby-forum.com/doc/ruby-1.9.1-p129/classes/String.html#M000305

-r

Gabriel Gilder | 20 Jan 2010 19:53

Re: Docs for String.split

Ah, good catch, I didn't notice that one extra line. Perhaps some examples like the ones I posted could still help though? It's easy to miss that note.

-Gabriel




Hugh Sasse | 20 Jan 2010 20:13

Re: Docs for String.split

On Wed, 20 Jan 2010, Gabriel Gilder wrote:

> Hi there,
> Just joined to point out a small oversight in the Ruby docs - I looked into
> the process for contributing documentation changes but it was a little over
> my head. :) So I figured someone here could probably take care of it really
> quickly.
> 
> So, the docs for String.split cover splitting by regex, but they don't
> mention that if you have a capturing subpattern (or several) in your regex,
> that gets included in the returned array.
> 
> For example:
> 
> "1, 2.34,56, 7".split(%r{(,\s*)}) #=> ["1", ", ", "2.34", ",", "56", ", ", "7"]
> "word :separator: word".split(/(:(\w+):)/) #=> ["word ", ":separator:", "separator", " word"]
> 
> I've tested this in Ruby 1.8.7, but I imagine it works the same way in Ruby
> 1.9 as well, so both versions of the docs should probably be updated.

Maybe a patch a bit like this:

--- ruby-1.8.7-p173/string.c.orig	2009-02-17 02:59:26.000000000 +0000
+++ ruby-1.8.7-p173/string.c	2010-01-20 19:10:04.718058600 +0000
@@ -3515,7 +3515,9 @@
  *     
  *  If <i>pattern</i> is a <code>Regexp</code>, <i>str</i> is divided where the
  *  pattern matches. Whenever the pattern matches a zero-length string,
- *  <i>str</i> is split into individual characters.
+ *  <i>str</i> is split into individual characters. If
+ *  <i>pattern</i> includes one or more capturing subpatterns,
+ *  these will be returned in the array returned by split.
  *     
  *  If <i>pattern</i> is omitted, the value of <code>$;</code> is used.  If
  *  <code>$;</code> is <code>nil</code> (which is the default), <i>str</i> is
@@ -3532,6 +3534,8 @@
  *     " now's  the time".split(' ')   #=> ["now's", "the", "time"]
  *     " now's  the time".split(/ /)   #=> ["", "now's", "", "the", "time"]
  *     "1, 2.34,56, 7".split(%r{,\s*}) #=> ["1", "2.34", "56", "7"]
+ *     "1, 2.34,56".split(%r{(,\s*)})  #=> ["1", ", ", "2.34", ",", "56"]
+ *     "wd :sp: wd".split(/(:(\w+):)/) #=> ["wd ", ":sp:", "sp", " wd"]
  *     "hello".split(//)               #=> ["h", "e", "l", "l", "o"]
  *     "hello".split(//, 3)            #=> ["h", "e", "llo"]
  *     "hi mom".split(%r{\s*})         #=> ["h", "i", "m", "o", "m"]

keeps the alignment of existing examples.  I've not run these to test
them, by the way.
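
A quick way to check the two new examples is a short script along these lines. It is only a sketch: the `checks` table and the output messages are made up for illustration, and the expected arrays are copied from the patch above.

```ruby
# Each entry: input string, split pattern, result claimed in the patch.
checks = [
  ["1, 2.34,56", %r{(,\s*)},  ["1", ", ", "2.34", ",", "56"]],
  ["wd :sp: wd", /(:(\w+):)/, ["wd ", ":sp:", "sp", " wd"]],
]

# Compare what split actually returns with the documented result.
results = checks.map do |str, pattern, expected|
  str.split(pattern) == expected
end

puts results.all? ? "all examples match" : "some example differs"
```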
> 
> Thanks, please let me know if you have any questions!
> 
> -Gabriel
> 
        Hugh

Roger Pack | 21 Jan 2010 23:22

Re: Docs for String.split

> Maybe a patch a bit like this:

Looks good--submit a patch to core with it :)
-r

Hugh Sasse | 22 Jan 2010 01:32

Re: Docs for String.split

On Thu, 21 Jan 2010, Roger Pack wrote:

> > Maybe a patch a bit like this:
> 
> Looks good--submit a patch to core with it :)

Awaiting comment from Gabriel Gilder.....
> -r
> 
        Hugh

Gabriel Gilder | 22 Jan 2010 01:44

Re: Docs for String.split

Looks great to me!



Hugh Sasse | 25 Jan 2010 20:18

Re: Docs for String.split


On Thu, 21 Jan 2010, Gabriel Gilder wrote:

> Looks great to me!
> 
> On Thu, Jan 21, 2010 at 4:32 PM, Hugh Sasse <hgs <at> dmu.ac.uk> wrote:
> 
> > On Thu, 21 Jan 2010, Roger Pack wrote:
> >
> > > > Maybe a patch a bit like this:
> > >
> > > Looks good--submit a patch to core with it :)
> >
> > Awaiting comment from Gabriel Gilder.....
> > > -r
> > >
> >         Hugh

The patch was accepted, along with a number of supporting test cases
as a separate patch.

        Hugh

