Mateusz Loskot | 11 Jun 2012 16:58

Result of boost::split against empty string changed in post-1.45 release

Hi,

I have a simple program that runs boost::split against an empty string.
I'm testing with Boost 1.45 and later versions such as 1.46 and 1.49.
As stated in the // comments below, the behaviour differs
depending on the Boost version:

// https://gist.github.com/2910461
#include <cassert>
#include <string>
#include <vector>
#include <boost/algorithm/string/split.hpp>
#include <boost/algorithm/string/classification.hpp>

int main()
{
    std::wstring s;
    std::vector<std::wstring> a;

    assert(s.empty());
    boost::split(a, s, boost::is_any_of(L", "), boost::token_compress_on);
    // Boost 1.45:  true
    // Boost 1.46+: false
    //   Assertion failed: s.empty() == a.empty(), file boost_split_empty_string.cpp, line 16
    assert(s.empty() == a.empty());
    return 0;
}

IMHO, the behaviour exposed in newer versions is a bug.
Could anyone shed light on what happened in post-1.46?

Olaf van der Spek | 11 Jun 2012 17:03

Re: Result of boost::split against empty string changed in post-1.45 release

On Mon, Jun 11, 2012 at 4:58 PM, Mateusz Loskot <mateusz <at> loskot.net> wrote:
> IMHO, the behaviour exposed in newer versions is a bug.
> Could anyone shed light on what happened in post-1.46?

What does a contain?

-- 
Olaf

_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost

Mateusz Loskot | 11 Jun 2012 17:05

Re: Result of boost::split against empty string changed in post-1.45 release

On 11 June 2012 16:03, Olaf van der Spek <ml <at> vdspek.org> wrote:
> On Mon, Jun 11, 2012 at 4:58 PM, Mateusz Loskot <mateusz <at> loskot.net> wrote:
>> IMHO, the behaviour exposed in newer versions is a bug.
>> Could anyone shed light on what happened in post-1.46?
>
> What does a contain?

a	[1]("")

Best regards,
-- 
Mateusz Loskot, http://mateusz.loskot.net


Pavol Droba | 11 Jun 2012 20:09

Re: Result of boost::split against empty string changed in post-1.45 release


On Mon, 11 Jun 2012 16:58:40 +0200, Mateusz Loskot <mateusz <at> loskot.net>  
wrote:


Olaf van der Spek | 11 Jun 2012 22:05

Re: Result of boost::split against empty string changed in post-1.45 release

On Mon, Jun 11, 2012 at 8:09 PM, Pavol Droba <droba <at> topmail.sk> wrote:
>
> On Mon, 11 Jun 2012 16:58:40 +0200, Mateusz Loskot <mateusz <at> loskot.net>
> wrote:

Mateusz Loskot | 11 Jun 2012 22:26

Re: Result of boost::split against empty string changed in post-1.45 release

On 11 June 2012 19:09, Pavol Droba <droba <at> topmail.sk> wrote:
> On Mon, 11 Jun 2012 16:58:40 +0200, Mateusz Loskot <mateusz <at> loskot.net> wrote:

Marshall Clow | 12 Jun 2012 05:48

Re: Result of boost::split against empty string changed in post-1.45 release

On Jun 11, 2012, at 1:26 PM, Mateusz Loskot wrote:
> On 11 June 2012 19:09, Pavol Droba <droba <at> topmail.sk> wrote:
>> On Mon, 11 Jun 2012 16:58:40 +0200, Mateusz Loskot <mateusz <at> loskot.net> wrote:

Mateusz Loskot | 12 Jun 2012 10:19

Re: Result of boost::split against empty string changed in post-1.45 release

On 12 June 2012 04:48, Marshall Clow <mclow.lists <at> gmail.com> wrote:
> On Jun 11, 2012, at 1:26 PM, Mateusz Loskot wrote:
>> On 11 June 2012 19:09, Pavol Droba <droba <at> topmail.sk> wrote:
>>> On Mon, 11 Jun 2012 16:58:40 +0200, Mateusz Loskot <mateusz <at> loskot.net> wrote:

Pavol Droba | 16 Jun 2012 22:03

Re: Result of boost::split against empty string changed in post-1.45 release

Hello,

On Mon, 11 Jun 2012 22:26:33 +0200, Mateusz Loskot <mateusz <at> loskot.net>  
wrote:

>
> Pavol,
>
> You are right, there are some discussions about this behavior.
> For example:
>
> http://lists.boost.org/Archives/boost/2005/01/79380.php
> https://svn.boost.org/trac/boost/ticket/534
>
> and others. Thanks for pointing that out.
>
> I have to admit that to me the current behavior is not intuitive.
> Also, I have failed to find any example of it in the docs.
>

I will not argue whether the behavior is intuitive or not. Unfortunately,
what counts as intuitive differs from person to person.

What matters is that the current behavior is "correct": it follows a
well-defined constraint and it is deterministic.

The rationale behind it is actually quite simple. Imagine that you
are parsing a CSV file. You need to get exactly the same number
of elements regardless of whether the last token is empty or not.
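
To make the CSV argument concrete, here is a minimal sketch in plain C++.
The helper split_keep_empty is a hypothetical stand-in written for this
illustration, not Boost's implementation, and it assumes std::string and a
single delimiter character for brevity. It only mirrors the rule described
above: N delimiters always yield N + 1 tokens, so an empty last field, or an
empty line, still produces a token.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a splitter that keeps empty tokens.
// It is NOT boost::split; it only illustrates the "N delimiters give
// N + 1 tokens" rule discussed in this thread.
std::vector<std::string> split_keep_empty(const std::string& line, char delim)
{
    std::vector<std::string> tokens;
    std::size_t start = 0;
    for (;;)
    {
        const std::size_t pos = line.find(delim, start);
        if (pos == std::string::npos)
        {
            // Last token: everything after the final delimiter, possibly empty.
            tokens.push_back(line.substr(start));
            break;
        }
        tokens.push_back(line.substr(start, pos - start));
        start = pos + 1; // continue after the delimiter just consumed
    }
    return tokens;
}
```

Under these assumptions, split_keep_empty("a,b,c", ',') and
split_keep_empty("a,b,", ',') both return three tokens, while
split_keep_empty("", ',') returns a single empty token.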

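It is much easier to remove the empty tokens than to guess whether
they were supposed to be in the result or not.

On the other hand, you are right, the documentation should be more
explicit.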

Olaf van der Spek | 16 Jun 2012 23:23

Re: Result of boost::split against empty string changed in post-1.45 release

On Sat, Jun 16, 2012 at 10:03 PM, Pavol Droba <droba <at> topmail.sk> wrote:
> The rationale behind it is actually quite simple. Imagine that you
> are parsing a CSV file. You need to get exactly the same number
> of elements regardless of whether the last token is empty or not.
>
> It is much easier to remove the empty tokens than to guess whether
> they were supposed to be in the result or not.
>
> On the other hand, you are right, the documentation should be more
> explicit.

It's not about that, is it? It's about the case where the input is empty.
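
For callers who do want an empty result for empty input, one possible
approach is to filter out empty tokens after splitting. The sketch below
is only an illustration: remove_empty_tokens is a hypothetical helper, not
part of Boost, and it uses std::string rather than the std::wstring of the
original program.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: drop every empty token from a split result
// using the erase-remove idiom.
void remove_empty_tokens(std::vector<std::string>& tokens)
{
    tokens.erase(
        std::remove_if(tokens.begin(), tokens.end(),
                       [](const std::string& t) { return t.empty(); }),
        tokens.end());
}
```

Applied to the one-element result [1]("") reported earlier in the thread,
this leaves the vector empty, so a check such as s.empty() == a.empty()
would hold again. The cost is that empty fields in the middle of a line
are dropped too, which is not what the CSV use case wants.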

-- 
Olaf


Mateusz Loskot | 17 Jun 2012 02:42

Re: Result of boost::split against empty string changed in post-1.45 release

On 16 June 2012 22:23, Olaf van der Spek <ml <at> vdspek.org> wrote:
> On Sat, Jun 16, 2012 at 10:03 PM, Pavol Droba <droba <at> topmail.sk> wrote:
>> The rationale behind it is actually quite simple. Imagine that you
>> are parsing a CSV file. You need to get exactly the same number
>> of elements regardless of whether the last token is empty or not.
>>
>> It is much easier to remove the empty tokens than to guess whether
>> they were supposed to be in the result or not.
>>
>> On the other hand, you are right, the documentation should be more
>> explicit.
>
> It's not about that, is it? It's about the case where the input is empty.

Indeed, there seems to be more than one valid or practical approach.
By the way, it looks like Python has chosen a similar approach to Boost:

>>> "".split(',')
['']

>>> len("".split(','))
1

Best regards,
-- 
Mateusz Loskot, http://mateusz.loskot.net


Mateusz Loskot | 17 Jun 2012 02:38

Re: Result of boost::split against empty string changed in post-1.45 release

On 16 June 2012 21:03, Pavol Droba <droba <at> topmail.sk> wrote:
> On Mon, 11 Jun 2012 22:26:33 +0200, Mateusz Loskot <mateusz <at> loskot.net> wrote:
>>
>> You are right, there are some discussions about this behavior.
>> For example:
>>
>> http://lists.boost.org/Archives/boost/2005/01/79380.php
>> https://svn.boost.org/trac/boost/ticket/534
>>
>> and others. Thanks for pointing that out.
>>
>> I have to admit that to me the current behavior is not intuitive.
>> Also, I have failed to find any example of it in the docs.
>
>
> I will not argue whether the behavior is intuitive or not.

Sure, and I'm not trying to either.

> Unfortunately, what counts as intuitive differs from person to person.
>
> What matters is that the current behavior is "correct": it follows a
> well-defined constraint and it is deterministic.

Yes, it's only a matter of being aware of this behaviour.

> The rationale behind it is actually quite simple. Imagine that you
> are parsing a CSV file. You need to get exactly the same number
> of elements regardless of whether the last token is empty or not.

