Re: Docs for String.split
Hugh Sasse <hgs <at> dmu.ac.uk>
2010-01-20 19:13:39 GMT
On Wed, 20 Jan 2010, Gabriel Gilder wrote:
> Hi there,
> Just joined to point out a small oversight in the Ruby docs - I looked into
> the process for contributing documentation changes but it was a little over
> my head. :) So I figured someone here could probably take care of it really
> quickly.
>
> So, the docs for String.split cover splitting by regex, but they don't
> mention that if you have a capturing subpattern (or several) in your regex,
> the text it matches gets included in the returned array.
>
> For example:
>
> "1, 2.34,56, 7".split(%r{(,\s*)}) #=> ["1", ", ", "2.34", ",", "56", ", ",
> "7"]
> "word :separator: word".split(/(:(\w+):)/) #=> ["word ", ":separator:",
> "separator", " word"]
>
> I've tested this in Ruby 1.8.7, but I imagine it works the same way in Ruby
> 1.9 as well, so both versions of the docs should probably be updated.
Maybe a patch a bit like this:
--- ruby-1.8.7-p173/string.c.orig 2009-02-17 02:59:26.000000000 +0000
+++ ruby-1.8.7-p173/string.c 2010-01-20 19:10:04.718058600 +0000
@@ -3515,7 +3515,9 @@
*
* If <i>pattern</i> is a <code>Regexp</code>, <i>str</i> is divided where the
* pattern matches. Whenever the pattern matches a zero-length string,
- * <i>str</i> is split into individual characters.
+ * <i>str</i> is split into individual characters. If
+ * <i>pattern</i> contains one or more capturing subpatterns, the text
+ * matched by each of them is also included in the returned array.
*
* If <i>pattern</i> is omitted, the value of <code>$;</code> is used. If
* <code>$;</code> is <code>nil</code> (which is the default), <i>str</i> is
@@ -3532,6 +3534,8 @@
* " now's the time".split(' ') #=> ["now's", "the", "time"]
* " now's the time".split(/ /) #=> ["", "now's", "", "the", "time"]
* "1, 2.34,56, 7".split(%r{,\s*}) #=> ["1", "2.34", "56", "7"]
+ * "1, 2.34,56".split(%r{(,\s*)}) #=> ["1", ", ", "2.34", ",", "56"]
+ * "wd :sp: wd".split(/(:(\w+):)/) #=> ["wd ", ":sp:", "sp", " wd"]
* "hello".split(//) #=> ["h", "e", "l", "l", "o"]
* "hello".split(//, 3) #=> ["h", "e", "llo"]
* "hi mom".split(%r{\s*}) #=> ["h", "i", "m", "o", "m"]
This keeps the alignment of the existing examples. I've not run the new
examples to test them, by the way.
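
For anyone who wants to check the two new examples before applying the
patch, something along these lines should do. It is only a sketch: the
variable names are mine, and the last case (a non-capturing group) is not
part of the patch, it is there just for contrast.

```ruby
# Plain separator pattern: the separators are discarded.
plain    = "1, 2.34,56".split(%r{,\s*})

# One capturing subpattern: each matched separator is kept in the result.
captured = "1, 2.34,56".split(%r{(,\s*)})

# Nested capturing subpatterns: every capture appears, outer one first.
nested   = "wd :sp: wd".split(/(:(\w+):)/)

# Non-capturing group (?:...): behaves like the plain pattern.
grouped  = "1, 2.34,56".split(%r{(?:,\s*)})

p plain     #=> ["1", "2.34", "56"]
p captured  #=> ["1", ", ", "2.34", ",", "56"]
p nested    #=> ["wd ", ":sp:", "sp", " wd"]
p grouped   #=> ["1", "2.34", "56"]
```

If the separators need grouping but should not show up in the result,
the (?:...) form is the one to use.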
>
> Thanks, please let me know if you have any questions!
>
> -Gabriel
>
Hugh