Alex Taylor | 30 May 2009 02:29

setlocale() and codepage (3.3.6)

Using GCC 3.3.6, I don't know if this applies to later versions...

Calling
  char *loc = setlocale(LC_TIME, "");
is supposed to set the time conventions to the current locale defined in
the environment, which it does.

However, it seems to automatically append the codepage specifier ISO8859-1,
regardless of what the current codepage is.

For instance, running under Japanese OS/2, this code:
  char *loc = setlocale(LC_TIME, "");
  printf("Locale:     %s\n", loc );

reports:
  ja_JP.ISO8859-1
which isn't very useful, as ISO8859-1 doesn't support Japanese text.  If
I then try to print a localized time string from strftime, the Japanese
characters are all replaced by 0x1A (even though the current process 
codepage supports them).  
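
A fuller version of the test above, as a minimal sketch. The helper names
count_sub_bytes() and format_local_names() are made up for illustration;
only setlocale(), localtime() and strftime() are standard C. It formats the
weekday and month names under the environment's LC_TIME locale and counts
how many bytes came back as 0x1A, which makes the substitution easy to spot.

```c
#include <assert.h>
#include <locale.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical helper: count 0x1A (SUB) bytes in a string. */
size_t count_sub_bytes(const char *s)
{
    size_t n = 0;

    assert(s != NULL);
    for (; *s != '\0'; ++s) {
        if ((unsigned char)*s == 0x1A)
            ++n;
    }
    return n;
}

/* Hypothetical helper: format the current weekday and month names using
 * the LC_TIME locale taken from the environment.  Returns the number of
 * bytes written to buf, or 0 if the time could not be formatted. */
size_t format_local_names(char *buf, size_t size)
{
    time_t now = time(NULL);
    struct tm *tm_now = localtime(&now);

    if (tm_now == NULL)
        return 0;

    setlocale(LC_TIME, "");   /* same call as in the snippet above */
    return strftime(buf, size, "%A %B", tm_now);
}
```

Calling format_local_names() and then count_sub_bytes() on the result
should give a non-zero count in the situation described here, and zero once
the locale carries a usable charset.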

Is there any graceful way to tell setlocale() to use the current process 
codepage?

KO Myung-Hun | 31 May 2009 15:19

Re: setlocale() and codepage (3.3.6)

Hi/2.

Unfortunately, kLIBC does not fall back to the current codepage when the
locale name has no charset part.

So you should also specify the charset (IBM-xxx, where xxx is the codepage)
in an environment variable such as 'LANG'.

For example, for Korean:

    set LANG=ko_KR.IBM-949

Alternatively, with kLIBC you can use 'SYSTEM' to mean the current codepage.

    set LANG=ko_KR.SYSTEM

setlocale() looks up the default locale in environment variables, in the
following order:

    1. LC_ALL
    2. LC_xxxx, in this case, LC_TIME
    3. LANG
    4. assume 'C' locale
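
The lookup order in the list above can be sketched in C as follows. This is
only an illustration, not kLIBC's actual code: the function names are made
up, and treating an empty variable the same as an unset one is an
assumption borrowed from POSIX.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: pick the locale name from the three candidate
 * values, in the order LC_ALL, LC_xxxx, LANG, else "C".
 * Assumption: NULL and "" both count as "not set". */
const char *pick_default_locale(const char *lc_all,
                                const char *lc_category,
                                const char *lang)
{
    if (lc_all != NULL && *lc_all != '\0')
        return lc_all;
    if (lc_category != NULL && *lc_category != '\0')
        return lc_category;
    if (lang != NULL && *lang != '\0')
        return lang;
    return "C";
}

/* The same lookup for LC_TIME, reading the real environment. */
const char *default_time_locale(void)
{
    const char *name = pick_default_locale(getenv("LC_ALL"),
                                           getenv("LC_TIME"),
                                           getenv("LANG"));
    assert(name != NULL);
    return name;
}
```

So with LC_ALL unset, LC_TIME unset and LANG=ko_KR.IBM-949, the name handed
to the locale code would be ko_KR.IBM-949.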

So you should either pass an explicit locale string to setlocale(), or set
the appropriate env. var. before calling it, for example with putenv().
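
Both options, as a minimal sketch. set_time_locale() and
export_lang_system() are made-up names, and the fallback chain is an
assumption added so the code still ends up with some locale on a C library
that does not understand the 'SYSTEM' charset.

```c
#define _XOPEN_SOURCE 700   /* for putenv() on POSIX-style headers */
#include <assert.h>
#include <locale.h>
#include <stddef.h>
#include <stdlib.h>

/* Option 1: pass an explicit locale string, e.g. "ko_KR.SYSTEM".
 * Falls back to the environment's locale, then to "C", if that fails. */
const char *set_time_locale(const char *preferred)
{
    const char *result = NULL;

    if (preferred != NULL)
        result = setlocale(LC_TIME, preferred);
    if (result == NULL)
        result = setlocale(LC_TIME, "");
    if (result == NULL)
        result = setlocale(LC_TIME, "C");
    return result;
}

/* Option 2: set LANG from inside the program, then call
 * setlocale(LC_TIME, "") as before.  putenv() keeps the pointer it is
 * given, so the string must stay valid; hence the static buffer. */
int export_lang_system(void)
{
    static char lang_setting[] = "LANG=ko_KR.SYSTEM";

    return putenv(lang_setting);
}
```

Note that LC_ALL and LC_TIME still take precedence over LANG, so option 2
has no effect on setlocale(LC_TIME, "") if either of them is already set.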

Alex Taylor wrote:
> Using GCC 3.3.6, I don't know if this applies to later versions...
>
> Calling

Alex Taylor | 1 Jun 2009 13:23

Re: setlocale() and codepage (3.3.6)

On Sun, 31 May 2009 13:19:57 UTC, KO Myung-Hun <komh <at> chollian.net> wrote:

> Unfortunately, kLIBC does not assume current codepage for locale unless 
> charset is specified.
> 
> So you should specify charset(IBM-xxx, where xxx is codepage) to some 
> env. var. such as 'LANG', too.
> 
> Or, you can use 'SYSTEM' for the current codepage with kLIBC.
> 
>     set LANG=ko_KR.SYSTEM

Ah, that sounds like what I want.  I'll give it a try, thanks!

Paul Smedley | 30 May 2009 23:13

Re: setlocale() and codepage (3.3.6)

Hi Alex,

On Sat, 30 May 2009 00:29:02 UTC, "Alex Taylor" 
<mail.me <at> reply.to.address> wrote:

> Using GCC 3.3.6, I don't know if this applies to later versions...
> 
> Calling
>   char *loc = setlocale(LC_TIME, "")
> is supposed to set the time conventions to the current locale defined in
> the environment, which it does.
> 
> However, it seems to automatically append the codepage specifier ISO8859-1,
> regardless of what the current codepage is.
> 
> For instance, running under Japanese OS/2, this code:
>   char *loc = setlocale(LC_TIME, "")
>   printf("Locale:     %s\n", loc );
> 
> reports:
>   ja_JP.ISO8859-1
> which isn't very useful, as ISO8859-1 doesn't support Japanese text.  If
> I then try to print a localized time string from strftime, the Japanese
> characters are all replaced by 0x1A (even though the current process 
> codepage supports them).  
> 
> Is there any graceful way to tell setlocale() to use the current process 
> codepage?

I don't know - but FWIW - this problem also affects recent versions of
PostgreSQL - which complain about the lack of UTF8 support, as it
thinks the locale is en_US.ISO8859-1

I guess this is a bug in setlocale in libc063 (gcc version is
irrelevant here)

Alex Taylor | 31 May 2009 07:46

Re: setlocale() and codepage (3.3.6)

On Sat, 30 May 2009 21:13:02 UTC, "Paul Smedley" 
<pauldespam <at> despamsmedley.id.au> wrote:

> > For instance, running under Japanese OS/2, this code:
> >   char *loc = setlocale(LC_TIME, "")
> >   printf("Locale:     %s\n", loc );
> > 
> > reports:
> >   ja_JP.ISO8859-1
> > which isn't very useful, as ISO8859-1 doesn't support Japanese text. 
> > 
> > Is there any graceful way to tell setlocale() to use the current process 
> > codepage?
>
> I don't know - but FWIW - this problem also affects recent versions of
> PostgreSQL - which complain about the lack of UTF8 support, as it 
> thinks the locale is en_US.ISO8859-1
> 
> I guess this is a bug in setlocale in libc063 (gcc version is 
> irrelevant here) 

OK, thanks.  I guess it's good to know it's not just me...
