Vinay Sajip | 13 Dec 2011 23:56
Picon
Favicon
Gravatar

lib2to3 and matching of fixers

I'm trying to write a custom lib2to3 fixer which, when processing the source

x = b'foo' b'bar'

will match the right hand side of the expression. From what I can see I should
be able to match it with 'atom' or 'power', but it doesn't seem to match either
of those. The individual byte-string tokens are matched using STRING, as
expected.

What should I use to match the right-hand side, for example if I want to do
constant folding and transform to something like b'foobar'?

Regards,

Vinay Sajip
Benjamin Peterson | 14 Dec 2011 02:30
Favicon
Gravatar

Re: lib2to3 and matching of fixers

2011/12/13 Vinay Sajip <vinay_sajip@...>:
> I'm trying to write a custom lib2to3 fixer which, when processing the source
>
> x = b'foo' b'bar'
>
> will match the right hand side of the expression. From what I can see I should
> be able to match it with 'atom' or 'power', but it doesn't seem to match either
> of those. The individual byte-string tokens are matched using STRING, as
> expected.

What are you using exactly? I would expect "atom< STRING+>" to work.

>
> What should I use to match the right-hand side, for example if I want to do
> constant folding and transform to something like b'foobar'?

--

-- 
Regards,
Benjamin
Vinay Sajip | 14 Dec 2011 10:23
Picon
Favicon
Gravatar

Re: lib2to3 and matching of fixers

> What are you using exactly? I would expect "atom< STRING+>" to
> work.

Thanks, that worked. I tried "atom" and "STRING+" separately - sorry, I'm unfamiliar with the pattern-matching language, but hopefully I'll get more familiar as I work with it!

Regards,

Vinay Sajip
_______________________________________________
Python-porting mailing list
Python-porting@...
http://mail.python.org/mailman/listinfo/python-porting
Vinay Sajip | 14 Dec 2011 12:43
Picon
Favicon
Gravatar

Fw: lib2to3 and matching of fixers

>> What are you using exactly? I would expect "atom< STRING+>" to 

>> work.
>
>Thanks, that worked. I tried "atom" and "STRING+" separately - sorry, I'm unfamiliar with the pattern-matching
> language, but hopefully I'll get more familiar as I work with it!

I'm now trying to match a tuple using the following pattern:

PATTERN = "atom< '(' args=testlist_gexp< any+ > ')' >"

This is a subset of the pattern used in the isinstance fixer that Armin Ronacher wrote. However, the pattern
fails to match the source

x = (int, long)

The parse tree for this source looks like this:

Node(file_input, [Node(simple_stmt, [Node(expr_stmt, [Leaf(1, 'x'), Leaf(22, '='), Node(atom,
[Leaf(7, '('), Node(testlist_gexp, [Leaf(1, 'int'), Leaf(12, ','), Leaf(1, 'long')]), Leaf(8,
')')])]), Leaf(4, '\n')]), Leaf(0, '')])

Since it contains an "atom" node with children consisting of a parenthesised "testlist_gexp" node, I'm
not sure why there is no match. What am I missing?

Thanks and regards,

Vinay Sajip
Benjamin Peterson | 14 Dec 2011 18:35
Favicon
Gravatar

Re: Fw: lib2to3 and matching of fixers

2011/12/14 Vinay Sajip <vinay_sajip@...>:
>>> What are you using exactly? I would expect "atom< STRING+>" to
>
>>> work.
>>
>>Thanks, that worked. I tried "atom" and "STRING+" separately - sorry, I'm unfamiliar with the pattern-matching
>> language, but hopefully I'll get more familiar as I work with it!
>
>
> I'm now trying to match a tuple using the following pattern:
>
> PATTERN = "atom< '(' args=testlist_gexp< any+ > ')' >"
>
>
> This is a subset of the pattern used in the isinstance fixer that Armin Ronacher wrote. However, the
pattern fails to match the source
>
> x = (int, long)
>
>
> The parse tree for this source looks like this:
>
> Node(file_input, [Node(simple_stmt, [Node(expr_stmt, [Leaf(1, 'x'), Leaf(22, '='), Node(atom,
[Leaf(7, '('), Node(testlist_gexp, [Leaf(1, 'int'), Leaf(12, ','), Leaf(1, 'long')]), Leaf(8,
')')])]), Leaf(4, '\n')]), Leaf(0, '')])

You have to account for the newline.

--

-- 
Regards,
Benjamin
Vinay Sajip | 14 Dec 2011 21:38
Picon
Favicon
Gravatar

Re: Fw: lib2to3 and matching of fixers

Benjamin Peterson <benjamin <at> ...> writes:

> 
> You have to account for the newline.
> 

Thanks for the response, but it's not clear how or why. As to the "why": the
portion of the tree I get by looking at

tree.children[0].children[0].children[2]

is

Node(atom, [
    Leaf(7, '('),
    Node(testlist_gexp, [
        Leaf(1, 'int'),
        Leaf(12, ','),
        Leaf(1, 'long')
    ]),
    Leaf(8, ')')
])

which is the bit I expect to match. It does not contain the newline (that's
present as the last child the enclosing grandparent node (simple_stmt) obtained
by looking at tree.children[0]:

Node(simple_stmt, [
    Node(expr_stmt, [
        Leaf(1, 'x'),
        Leaf(22, '='),
        Node(atom, [
            Leaf(7, '('),
            Node(testlist_gexp, [
                Leaf(1, 'int'),
                Leaf(12, ','),
                Leaf(1, 'long')
            ]),
            Leaf(8, ')')
        ])
    ]),
    Leaf(4, '\n')
])

The pattern I used appears consistent with the subset of the pattern used in
the is_instance fixer to match a tuple in the isinstance call, other than
whitespace in the pattern string itself:

    power<
        'isinstance'
        trailer< '(' arglist< any ',' atom< '('
            args=testlist_gexp< any+ >
        ')' > > ')' >
    >

in the is_instance fixer, as against my fixer's

    atom< '(' args=testlist_gexp< any+ > ')' >

So I can't see why the newline needs to be accounted for, nor how to do it.
Arranging the whitespace in my pattern to exactly match that segment in the
is_instance fixer's pattern made no difference, as I would have expected.

So, how do I account for the newline, and why do I need to when it's two levels
up in the tree?

Thanks for your patience with lib2to3 newbie.

Regards,

Vinay Sajip
Benjamin Peterson | 14 Dec 2011 23:18
Favicon
Gravatar

Re: Fw: lib2to3 and matching of fixers

2011/12/14 Vinay Sajip <vinay_sajip@...>:
> Benjamin Peterson <benjamin <at> ...> writes:
>
>>
>> You have to account for the newline.
>>
>
> Thanks for the response, but it's not clear how or why. As to the "why": the
> portion of the tree I get by looking at

Sorry. I was trying to read the parse tree from the linear version in
your last mail and failed. You are correct.

Does your fixer us BM_compatible = True? Try setting that to false.

--

-- 
Regards,
Benjamin
Vinay Sajip | 15 Dec 2011 00:01
Picon
Favicon
Gravatar

Re: Fw: lib2to3 and matching of fixers

> 

> Does your fixer us BM_compatible = True? Try setting that to false.
> 

With BM_compatible = False, the match is indeed made, and my fixer's transform method called. Thanks!

Is the meaning of the BM_compatible flag, or the overall matching and transformation algorithm,
documented somewhere? Most of the fixers I looked at had BM_compatible = True, and I took my cue from the
fix_isinstance fixer as it appeared closest to what I was trying to do.

Regards,

Vinay Sajip
Benjamin Peterson | 15 Dec 2011 00:05
Favicon
Gravatar

Re: Fw: lib2to3 and matching of fixers

2011/12/14 Vinay Sajip <vinay_sajip@...>:
>>
>
>> Does your fixer us BM_compatible = True? Try setting that to false.
>>
>
> With BM_compatible = False, the match is indeed made, and my fixer's transform method called. Thanks!
>
> Is the meaning of the BM_compatible flag, or the overall matching and transformation algorithm,
documented somewhere? Most of the fixers I looked at had BM_compatible = True, and I took my cue from the
fix_isinstance fixer as it appeared closest to what I was trying to do.

No, unfortunately. The BM matcher is faster, but has weird behavior on
some things (as you can see).

--

-- 
Regards,
Benjamin

Gmane