Dmitry Bogatov | 5 Aug 14:00 2014

Dealing with encodings

Hello!

How do I teach hxt to handle the "KOI8-R" encoding of an input file?

It also seems that many great packages (such as hxt, feed, and curl) use String.
Is there any work in progress to port them to Text?

--
Best regards, Dmitry Bogatov <KAction <at> gnu.org>,
Free Software supporter, esperantisto and netiquette guardian.
GPG: 54B7F00D
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe <at> haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
Ivan Lazar Miljenovic | 5 Aug 14:09 2014

Re: Dealing with encodings

On 5 August 2014 22:00, Dmitry Bogatov <KAction <at> gnu.org> wrote:
> Hello!
>
> How do I teach hxt to handle the "KOI8-R" encoding of an input file?

http://hackage.haskell.org/package/text-icu ?

>
> It also seems that many great packages (such as hxt, feed, and curl) use String.
> Is there any work in progress to port them to Text?

Because no-one has changed them to do so.  Some of them probably also
predate the rise in popularity of text.

-- 
Ivan Lazar Miljenovic
Ivan.Miljenovic <at> gmail.com
http://IvanMiljenovic.wordpress.com

MigMit | 5 Aug 17:12 2014

Re: Dealing with encodings

Please don't do that. The overabundance of Cyrillic encodings caused great pain in the past; don't let
this genie out of the bottle again, especially since KOI8-R is the worst of the Cyrillic encodings.

Dmitry Bogatov | 5 Aug 19:06 2014

Re: Dealing with encodings

* MigMit <miguelimo38 <at> yandex.ru> [2014-08-05 19:12:19+0400]
> Please don't do that. The overabundance of Cyrillic encodings caused
> great pain in the past; don't let this genie out of the bottle
> again, especially since KOI8-R is the worst of the Cyrillic encodings.

I totally agree that using anything but UTF-8 is a crime. But the fact
remains: I need to parse an HTML page that is KOI8-encoded. What should I do?

--
Best regards, Dmitry Bogatov <KAction <at> gnu.org>,
Free Software supporter, esperantisto and netiquette guardian.
GPG: 54B7F00D
Carter Schonwald | 5 Aug 19:58 2014

Re: Dealing with encodings

Use text-icu together with text to decode it into Text, then use the standard tools.
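
A minimal sketch of that suggestion, assuming the text-icu package (and the
ICU C library it binds to) is installed. The helper name decodeWith is
hypothetical; open and toUnicode come from Data.Text.ICU.Convert. Treat it as
an illustration under those assumptions, not a definitive recipe.

```haskell
import qualified Data.ByteString as B
import qualified Data.Text as T
import Data.Text.ICU.Convert (open, toUnicode)

-- | Decode raw bytes in the named legacy encoding (for example "KOI8-R")
-- into Text. 'open' looks the converter up by name in ICU; passing
-- 'Nothing' keeps ICU's default fallback behaviour.
-- (decodeWith is a hypothetical helper name, not part of text-icu.)
decodeWith :: String -> B.ByteString -> IO T.Text
decodeWith encodingName bytes = do
  converter <- open encodingName Nothing
  pure (toUnicode converter bytes)
```

For the case in this thread, one would read the page with
Data.ByteString.readFile and call decodeWith "KOI8-R" on the result; the
decoded Text (or its String form via T.unpack) can then be handed to the
HTML parser.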


Danilov Alexander | 6 Aug 07:25 2014

Re: [OBORONA-SPAM] Dealing with encodings

05.08.2014 16:00, Dmitry Bogatov wrote:
> Hello!
>
> How do I teach hxt to handle the "KOI8-R" encoding of an input file?
>
> It also seems that many great packages (such as hxt, feed, and curl) use String.
> Is there any work in progress to port them to Text?
Prelude> :search encoding
Searching for: encoding
package encoding
Data.Text.Encoding module Data.Text.Encoding
Data.Text.Lazy.Encoding module Data.Text.Lazy.Encoding
GHC.IO.Encoding module GHC.IO.Encoding
System.IO hGetEncoding :: Handle -> IO (Maybe TextEncoding)
GHC.IO.Handle hGetEncoding :: Handle -> IO (Maybe TextEncoding)
System.IO hSetEncoding :: Handle -> TextEncoding -> IO ()
GHC.IO.Handle hSetEncoding :: Handle -> TextEncoding -> IO ()
System.IO localeEncoding :: TextEncoding
GHC.IO.Encoding localeEncoding :: TextEncoding
System.IO mkTextEncoding :: String -> IO TextEncoding
GHC.IO.Encoding mkTextEncoding :: String -> IO TextEncoding

You can read a file in any encoding available on the system and then convert it to Text.
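
A minimal sketch of that approach, using mkTextEncoding and hSetEncoding from
the listing above. The helper name readFileWithEncoding is hypothetical, and
whether a name such as "KOI8-R" is accepted depends on the platform's
converter (iconv on most Unix systems), so this is an assumption about the
environment rather than a guarantee.

```haskell
import System.IO
import qualified Data.Text as T
import qualified Data.Text.IO as TIO

-- | Read a whole file as Text, decoding it with the named encoding.
-- mkTextEncoding throws an exception if the system does not know the name.
-- (readFileWithEncoding is a hypothetical helper name.)
readFileWithEncoding :: String -> FilePath -> IO T.Text
readFileWithEncoding encodingName path = do
  encoding <- mkTextEncoding encodingName
  withFile path ReadMode $ \handle -> do
    -- Decode bytes from this handle with the chosen encoding.
    hSetEncoding handle encoding
    -- Data.Text.IO.hGetContents honours the handle's encoding and reads
    -- strictly, so the Text is complete before withFile closes the handle.
    TIO.hGetContents handle
```

Calling readFileWithEncoding "KOI8-R" "page.html" would then yield the page
contents as Text; the file name here is only a placeholder.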

Dmitry Bogatov | 6 Aug 14:14 2014

Re: [OBORONA-SPAM] Dealing with encodings

* Danilov Alexander <danilovalexalex <at> yandex.ru> [2014-08-06 09:25:54+0400]
> 05.08.2014 16:00, Dmitry Bogatov wrote:
> > Hello!
> >
> > How do I teach hxt to handle the "KOI8-R" encoding of an input file?
> >
> > It also seems that many great packages (such as hxt, feed, and curl) use String.
> > Is there any work in progress to port them to Text?
> Prelude> :search encoding
> Searching for: encoding
> package encoding
> Data.Text.Encoding module Data.Text.Encoding
> Data.Text.Lazy.Encoding module Data.Text.Lazy.Encoding
> GHC.IO.Encoding module GHC.IO.Encoding
> System.IO hGetEncoding :: Handle -> IO (Maybe TextEncoding)
> GHC.IO.Handle hGetEncoding :: Handle -> IO (Maybe TextEncoding)
> System.IO hSetEncoding :: Handle -> TextEncoding -> IO ()
> GHC.IO.Handle hSetEncoding :: Handle -> TextEncoding -> IO ()
> System.IO localeEncoding :: TextEncoding
> GHC.IO.Encoding localeEncoding :: TextEncoding
> System.IO mkTextEncoding :: String -> IO TextEncoding
> GHC.IO.Encoding mkTextEncoding :: String -> IO TextEncoding

The problem is that I have an XML file from an unknown source. I do not know
the encoding a priori; it is specified in the file itself, and to get it I need
to parse the XML.

--
Best regards, Dmitry Bogatov <KAction <at> gnu.org>,
Free Software supporter, esperantisto and netiquette guardian.
GPG: 54B7F00D
Danilov Alexander | 6 Aug 18:09 2014

Re: [OBORONA-SPAM] Dealing with encodings

06.08.2014 16:14, Dmitry Bogatov wrote:
> The problem is that I have an XML file from an unknown source. I do not know
> the encoding a priori; it is specified in the file itself, and to get it I need
> to parse the XML.
Usually the encoding is specified in the XML file, and the XML parser may recode the text data itself.
I showed you the API for recoding text in case the XML parser is unable to do so.
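
When the parser does not recode by itself, the encoding name can usually be
read from the XML declaration before any real parsing happens, because the
declaration consists of ASCII characters and KOI8-R (like UTF-8 and the
ISO-8859 family) is ASCII-compatible. The sketch below illustrates that idea
under those assumptions; sniffXmlEncoding is a hypothetical helper, it ignores
byte-order marks and UTF-16/UTF-32 input, and it is not how hxt or any other
library does it.

```haskell
import qualified Data.ByteString.Char8 as B

-- | Extract the value of the encoding pseudo-attribute from a leading
-- XML declaration such as <?xml version="1.0" encoding="KOI8-R"?>.
-- Returns Nothing when the input does not start with "<?xml" or the
-- declaration names no encoding.
-- (sniffXmlEncoding is a hypothetical helper name.)
sniffXmlEncoding :: B.ByteString -> Maybe String
sniffXmlEncoding bytes
  | not (B.pack "<?xml" `B.isPrefixOf` bytes) = Nothing
  | otherwise =
      let -- Inspect only the start of the input, up to the closing "?>".
          (declaration, _) = B.breakSubstring (B.pack "?>") (B.take 200 bytes)
          -- Everything from the word "encoding" onwards (empty if absent).
          (_, fromKeyword) = B.breakSubstring (B.pack "encoding") declaration
          -- Drop the keyword itself, then whitespace and '=' characters.
          afterEquals = B.dropWhile (`elem` " \t\r\n=") (B.drop 8 fromKeyword)
      in case B.uncons afterEquals of
           Just (quote, value)
             | quote == '"' || quote == '\'' ->
                 let name = B.takeWhile (/= quote) value
                 in if B.null name then Nothing else Just (B.unpack name)
           _ -> Nothing
```

The resulting name can be passed to mkTextEncoding (or to an ICU converter) to
decode the whole file, after which the decoded text can be given to the XML
parser.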
