Bart Kastermans | 28 Aug 22:01

Negative regular expressions (searching for "i" not inside command)

I have a file in which I am searching for the letter "i" (actually
a bit more general than that, arbitrary regular expressions could
occur) as long as it does not occur inside an expression that matches
\\.+?\b (something started by a backslash and including the word that
follows).

More concrete example, I have the string "\sin(i)" and I want to match
the argument, but not the i in \sin.

Can this be achieved by combining the regular expressions?  I do not
know the right terminology involved, therefore my searching on the
Internet has not led to any results.

I can achieve something like this by searching for all i and then
throwing away those i that are inside such expressions.  I am now just
wondering if these two steps can be combined into one.
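
A minimal sketch of the two-step approach described above, assuming Python's re module. The names COMMAND, TARGET and matches_outside_commands are made up for illustration, and the literal "i" stands in for an arbitrary target expression.

```python
import re

COMMAND = re.compile(r'\\.+?\b')  # the command pattern from the question
TARGET = re.compile(r'i')         # stand-in for an arbitrary target pattern

def matches_outside_commands(text):
    # Step 1: record the (start, end) span of every command.
    command_spans = [m.span() for m in COMMAND.finditer(text)]
    # Step 2: keep only target matches that overlap no command span.
    return [
        m for m in TARGET.finditer(text)
        if not any(m.start() < end and start < m.end()
                   for start, end in command_spans)
    ]

hits = matches_outside_commands(r'\sin(i)')
print([m.start() for m in hits])
```

Each kept match still carries its original position in the text, which is the main thing this version offers over splitting the text first.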

Best,
Bart
-- 
http://www.bartk.nl/
--
http://mail.python.org/mailman/listinfo/python-list

Guilherme Polo | 28 Aug 23:04

Re: Negative regular expressions (searching for "i" not inside command)

On Thu, Aug 28, 2008 at 5:04 PM, Bart Kastermans
<kasterma <at> math.wisc.edu> wrote:
> I have a file in which I am searching for the letter "i" (actually
> a bit more general than that, arbitrary regular expressions could
> occur) as long as it does not occur inside an expression that matches
> \\.+?\b (something started by a backslash and including the word that
> follows).
>
> More concrete example, I have the string "\sin(i)" and I want to match
> the argument, but not the i in \sin.
>
> Can this be achieved by combining the regular expressions?  I do not
> know the right terminology involved, therefore my searching on the
> Internet has not led to any results.

Try searching again for the term "lookahead" or "negative lookahead".
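
Lookaround assertions are one route, but note that Python's re module only accepts fixed-width lookbehind patterns, so "not preceded by a backslash and some letters" cannot be written directly as a lookbehind. As a hedged alternative, here is a single-pass sketch that uses alternation instead: the command pattern is tried first at each position and consumes the whole command, and the target is only reported when it is captured in group 1. The names COMBINED and find_outside_commands are hypothetical, and "i" again stands in for the real target.

```python
import re

# Commands are tried first at every position; the target sits in group 1.
COMBINED = re.compile(r'\\.+?\b|(i)')

def find_outside_commands(text):
    # Keep only the matches where the target alternative (group 1) fired.
    return [m.start(1) for m in COMBINED.finditer(text)
            if m.group(1) is not None]

print(find_outside_commands(r'\sin(i)'))
```

This relies on the regex engine scanning left to right, so any "i" inside a command is swallowed by the first alternative before the second one gets a chance to match it.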

>
> I can achieve something like this by searching for all i and then
> throwing away those i that are inside such expressions.  I am now just
> wondering if these two steps can be combined into one.
>
> Best,
> Bart


Terry Reedy | 28 Aug 23:45

Re: Negative regular expressions (searching for "i" not inside command)


Bart Kastermans wrote:
> I have a file in which I am searching for the letter "i" (actually
> a bit more general than that, arbitrary regular expressions could
> occur) as long as it does not occur inside an expression that matches
> \\.+?\b (something started by a backslash and including the word that
> follows).

You should either make sure that the opposite, a match of \\.+?\b inside 
a match of your target re, cannot occur, or consider what you want to 
happen if it can.

> More concrete example, I have the string "\sin(i)" and I want to match
> the argument, but not the i in \sin.
> 
> Can this be achieved by combining the regular expressions?  I do not
> know the right terminology involved, therefore my searching on the
> Internet has not led to any results.
> 
> I can achieve something like this by searching for all i and then
> throwing away those i that are inside such expressions.

If you do not need the original position in the text of each match, and 
you are not concerned about target matches encompassing splitter 
matches, you could switch the order of searching.

import re

for fragment in re.split(r'\\.+?\b', text):  # re.split takes the pattern first
    ...  # search fragment for target
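
A runnable sketch of this split-then-search idea, under the same caveats (match positions in the original text are lost, and a target match can never span a command). The function name search_fragments and its target parameter are assumptions made for illustration.

```python
import re

def search_fragments(text, target=r'i'):
    found = []
    # Cut the text at every command, then search what is left.
    for fragment in re.split(r'\\.+?\b', text):
        found.extend(re.findall(target, fragment))
    return found

print(search_fragments(r'\sin(i) + \pi i'))
```

If the target pattern contains capturing groups, re.findall returns the groups rather than the whole match, so re.finditer may be the better choice in that case.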

> I am now just wondering if these two steps can be combined into one.