19 May 2010 17:29
wchar_t encoding?
Paul Koning <Paul_Koning <at> Dell.com>
2010-05-19 15:29:38 GMT
Gents,

I'm working on a patch to gdb 7.1 to make it work on NetBSD. The issue is that GDB 7 uses iconv to handle character strings, and uses wide chars internally so it can handle various non-ASCII scripts.

The trouble for NetBSD is that it asks iconv to translate to a character set named "wchar_t". That means "whatever the encoding is for the wchar_t data type". GNU libiconv supports that, so on platforms that use that library things are fine. NetBSD supports iconv, but it doesn't know the "wchar_t" encoding name.

So I proposed a patch that substitutes what appears to be used instead, namely UCS-4 in platform native byte order (so "ucs-4le" on x86, for example). This seems to work.

The trouble is that I'm getting pushback on the patch, because of concerns that the encoding used for wchar_t is not actually UCS-4. In particular, there is this article:

http://www.gnu.org/software/libunistring/manual/libunistring.html#The-wchar_005ft-mess

which says that on Solaris and FreeBSD the encoding of wchar_t is "undocumented and locale dependent". (Ye gods!)

Now, NetBSD is not FreeBSD... so... what is the answer for NetBSD? Is it like FreeBSD? (If so, it would be good to fix that.) Or is it a fixed encoding, and if so, is it indeed ucs-4?

Thanks,
paul
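
For illustration only, here is a rough sketch of the substitution described above. It is not the actual gdb patch. The helper names (wchar_charset_for, native_wchar_charset, wchar_is_iso10646) are made up for this example, "ucs-4be" is assumed by analogy with the "ucs-4le" name mentioned above, and the whole thing rests on the very assumption being questioned, namely that wchar_t holds UCS-4 in native byte order. The only standard piece is the C99 macro __STDC_ISO_10646__, which an implementation defines when wchar_t values are ISO 10646 code points.

```c
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper: pick an iconv charset name for a wide character
   type of the given width (in bytes) and byte order.  Assumes a 4-byte
   wide character holds UCS-4; returns NULL for any other width rather
   than guessing.  */
static const char *
wchar_charset_for (size_t width, int little_endian)
{
  if (width == 4)
    return little_endian ? "ucs-4le" : "ucs-4be";
  return NULL;
}

/* Hypothetical helper: the same choice for the host's own wchar_t,
   with the byte order detected at run time.  */
static const char *
native_wchar_charset (void)
{
  const unsigned int probe = 1;
  /* On a little-endian host the low-order byte is stored first.  */
  int little_endian = *(const unsigned char *) &probe == 1;

  return wchar_charset_for (sizeof (wchar_t), little_endian);
}

/* Returns 1 if the C implementation promises that wchar_t values are
   ISO 10646 code points (C99 __STDC_ISO_10646__), 0 if it makes no
   such promise.  */
static int
wchar_is_iso10646 (void)
{
#ifdef __STDC_ISO_10646__
  return 1;
#else
  return 0;
#endif
}
```

A caller would pass the result of native_wchar_charset() to iconv_open() in place of "wchar_t", and could consult wchar_is_iso10646() first. When that returns 0 the substitution is only a guess, which is exactly the concern raised against the patch.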