kaltheat | 2 Aug 2012 13:20

buggy awk regex handling?


Hi,

I tried to replace three letters with three other letters in awk using the
sub() function. I assumed that my regular expression means the following:

match if three consecutive letters of the alphabet occur anywhere in the input

$ echo AbC | awk '{sub(/[[:alpha:]]{3}/,"cBa"); print;}'
AbC

As you can see the result was unexpected.
When I try doing it for at least one letter, it works:

$ echo AbC | awk '{sub(/[[:alpha:]]+/,"cBa"); print;}'
cBa

Same problem without the POSIX character class:

$ echo AbC | awk '{sub(/[A-Za-z]{3}/,"cBa"); print;}'
AbC

$ echo AbC | awk '{sub(/[A-Za-z]+/,"cBa"); print;}'
cBa

I thought that it might have something to do with the curly braces. But escaping
them doesn't do the trick.

What am I doing wrong?
Or is awk buggy?

RW | 2 Aug 2012 15:17

Re: buggy awk regex handling?

On Thu, 02 Aug 2012 13:20:52 +0200
kaltheat wrote:

> 
> 
> Hi,
> 
> I tried to replace three letters with three other letters in awk using
> the sub() function. I assumed that my regular expression means the
> following:
> 
> match if three consecutive letters of the alphabet occur anywhere in
> the input
> 
> $ echo AbC | awk '{sub(/[[:alpha:]]{3}/,"cBa"); print;}'
> AbC
> 
> As you can see the result was unexpected.
> When I try doing it for at least one letter, it works:
> 
> $ echo AbC | awk '{sub(/[[:alpha:]]+/,"cBa"); print;}'
> cBa
> ...
> What am I doing wrong?
> Or is awk buggy?

Traditional awk implementations don't support {n}, but I think POSIX
implementations should. 
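
A minimal sketch of a workaround, assuming the awk in use has no interval
expressions: write the bracket expression out three times instead of using
{3}. This form should work in traditional and POSIX awks alike. The helper
name replace_three is made up for illustration and is not from the thread.
The value returned by sub() (the number of substitutions made) is printed
in front of the line so a failed match is easy to spot.

```shell
# Hypothetical helper: replaces the first run of three ASCII letters
# with "cBa" and prints sub()'s return value (0 or 1) before the line.
replace_three() {
    awk '{ n = sub(/[A-Za-z][A-Za-z][A-Za-z]/, "cBa"); print n, $0 }'
}

echo AbC | replace_three
```

With the input above this prints "1 cBa". An input with fewer than three
consecutive letters is left unchanged and reported with a leading 0.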

Warren Block | 2 Aug 2012 16:04

Re: buggy awk regex handling?

On Thu, 2 Aug 2012, RW wrote:

> On Thu, 02 Aug 2012 13:20:52 +0200
> kaltheat wrote:
>
>> I tried to replace three letters with three other letters in awk using
>> the sub() function. I assumed that my regular expression means the
>> following:
>>
>> match if three consecutive letters of the alphabet occur anywhere in
>> the input
>>
>> $ echo AbC | awk '{sub(/[[:alpha:]]{3}/,"cBa"); print;}'
>> AbC
>>
>> As you can see the result was unexpected.
>> When I try doing it for at least one letter, it works:
>>
>> $ echo AbC | awk '{sub(/[[:alpha:]]+/,"cBa"); print;}'
>> cBa
>> ...
>> What am I doing wrong?
>> Or is awk buggy?
>
> Traditional awk implementations don't support {n}, but I think POSIX
> implementations should.

Using gawk instead of awk agrees with that.  Printing the result of the 
sub (the number of substitutions performed) makes it a little more 
clear: