Terrence Brannon | 28 Aug 21:22

re.compile versus r''

Hello, I'm using a tool (PLY) which apparently expects the tokens to
be created using r''

But because one token is a rather complex regular expression, I want
to create the regular expression programmatically.

How can I generate a string and then create something of the same type
that the r'' function does?

Concretely, in the program below, consonant is not the same type as
t_NAME, but I assume that it needs to be for PLY to use it for
tokenizing:

import re

t_NAME       = r'[a-zA-Z_][a-zA-Z0-9_]*'

guttural   = 'kh?|gh?|\"n'
palatal    = '(?:chh?|jh?|\~n)'
cerebral   = '\.(?:th?|dh?|n)'
dental     = '(?:th?|dh?|n)'
semivowel  = '[yrlv]'
sibilant   = '[\"\.]?s'
aspirant   = 'h'

consonant = re.compile('|'.join([guttural, palatal, cerebral,
                                 dental, semivowel, sibilant, aspirant]))

print(consonant)
print(t_NAME)

Terrence Brannon | 28 Aug 21:27

Re: re.compile versus r''

Oh my god, how embarrassing. The r'' notation just creates a raw string:
<http://www.swc.scipy.org/lec/glossary.html#gdef-raw_string>

I thought it was some form of blessing a string into a regular
expression class.
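
For the original question, this suggests that a pattern built with ordinary string operations is already the same type as t_NAME. The sketch below assumes, as the thread implies, that PLY only needs a plain string for such a rule; the name t_CONSONANT is hypothetical, and the component patterns are the ones from the first post rewritten as raw strings.

```python
import re

t_NAME = r'[a-zA-Z_][a-zA-Z0-9_]*'

# Component patterns from the original post, written as raw strings.
guttural = r'kh?|gh?|"n'
palatal = r'(?:chh?|jh?|\~n)'
cerebral = r'\.(?:th?|dh?|n)'
dental = r'(?:th?|dh?|n)'
semivowel = r'[yrlv]'
sibilant = r'["\.]?s'
aspirant = r'h'

# Hypothetical token name: the joined result is a plain str, like t_NAME.
t_CONSONANT = '|'.join([guttural, palatal, cerebral,
                        dental, semivowel, sibilant, aspirant])

# Compiling is optional; the compiled object keeps the source string
# in its .pattern attribute.
compiled = re.compile(t_CONSONANT)

print(type(t_CONSONANT) is type(t_NAME))  # True
print(compiled.pattern == t_CONSONANT)    # True
```

If a compiled object is already at hand, its .pattern attribute gives back the string it was built from.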

--
http://mail.python.org/mailman/listinfo/python-list

Fredrik Lundh | 28 Aug 21:29

Re: re.compile versus r''

Terrence Brannon wrote:

> Hello, I'm using a tool (PLY) which apparently expects the tokens to
> be created using r''
> 
> But because one token is a rather complex regular expression, I want
> to create the regular expression programmatically.
> 
> How can I generate a string and then create something of the same type
> that the r'' function does?

r'' is an alternative syntax for string literals that affects how escape 
sequences are interpreted; there's no separate string type for strings 
created by this syntax.
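
A minimal illustration of that point, using made-up variable names: the prefix changes which characters end up in the string, not the type of the object.

```python
plain = '\n'   # one character: a newline
raw = r'\n'    # two characters: a backslash followed by 'n'

print(type(plain) is type(raw))  # True: both are ordinary str objects
print(len(plain), len(raw))      # 1 2
print(raw == '\\n')              # True: same text as the doubled-backslash form
```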

</F>


Chris Rebert | 28 Aug 21:32

Re: re.compile versus r''

On Thu, Aug 28, 2008 at 12:23 PM, Terrence Brannon <metaperl <at> gmail.com> wrote:
> Hello, I'm using a tool (PLY) which apparently expects the tokens to
> be created using r''
>
> But because one token is a rather complex regular expression, I want
> to create the regular expression programmatically.
>
> How can I generate a string and then create something of the same type
> that the r'' function does?

The "r" prefix isn't a function or a type, it's merely a special
literal syntax for strings that's handy when you're writing regexes
and therefore have to deal with another level of backslash escaping.
See the second-to-last paragraph of
http://docs.python.org/ref/strings.html for more info.
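
A small sketch of the two escaping levels, with an example string chosen purely for illustration: matching one literal backslash takes four backslashes in a normal literal but only two in a raw one.

```python
import re

text = r'C:\temp'  # contains a single literal backslash

# Without the r prefix each backslash is escaped twice:
# once for the string literal and once for the regex engine.
no_prefix = re.search('\\\\temp', text)

# With the r prefix only the regex-level escaping remains.
with_prefix = re.search(r'\\temp', text)

print('\\\\temp' == r'\\temp')                   # True: identical pattern text
print(no_prefix.group() == with_prefix.group())  # True
```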

Regards,
Chris

>
> Concretely, in the program below, consonant is not the same type as
> t_NAME, but I assume that it needs to be for PLY to use it for
> tokenizing:
>
> import re
>
> t_NAME       = r'[a-zA-Z_][a-zA-Z0-9_]*'
>
> guttural   = 'kh?|gh?|\"n'
> palatal    = '(?:chh?|jh?|\~n)'