luisicozgz | 10 Sep 2007 16:09

Re: Problem with UTF-8

Hello,

The console shows the following:

[Fatal Error] :521:16: Invalid byte 2 of 2-byte UTF-8 sequence.
org.apache.xmlrpc.client.XmlRpcClientException: Failed to parse servers response: Invalid byte 2 of 2-byte UTF-8 sequence.
	at org.apache.xmlrpc.client.XmlRpcStreamTransport.readResponse(XmlRpcStreamTransport.java:177)
	at org.apache.xmlrpc.client.XmlRpcStreamTransport.sendRequest(XmlRpcStreamTransport.java:145)
	at org.apache.xmlrpc.client.XmlRpcHttpTransport.sendRequest(XmlRpcHttpTransport.java:94)
	at org.apache.xmlrpc.client.XmlRpcSunHttpTransport.sendRequest(XmlRpcSunHttpTransport.java:44)
	at org.apache.xmlrpc.client.XmlRpcClientWorker.execute(XmlRpcClientWorker.java:53)
	at org.apache.xmlrpc.client.XmlRpcClient.execute(XmlRpcClient.java:166)
	at org.apache.xmlrpc.client.XmlRpcClient.execute(XmlRpcClient.java:136)
	at org.apache.xmlrpc.client.XmlRpcClient.execute(XmlRpcClient.java:125)
	at comunicacion.<init>(comunicacion.java:86)
	at Main.main(Main.java:9)
Caused by: org.xml.sax.SAXParseException: Invalid byte 2 of 2-byte UTF-8 sequence.
	at 

Stephane Bortzmeyer | 10 Sep 2007 16:31

Re: Problem with UTF-8

On Mon, Sep 10, 2007 at 02:09:38PM -0000,
 luisicozgz <luisicozgz <at> yahoo.com> wrote 
 a message of 518 lines which said:

> <member><name>direccion</name>

Is it really direccion and not dirección? Because the result you
posted has only ASCII characters so I really do not see how it could
raise UTF-8 problems.

 
Yahoo! Groups Links

<*> To visit your group on the web, go to:
    http://groups.yahoo.com/group/xml-rpc/

<*> Your email settings:
    Individual Email | Traditional

<*> To change settings online go to:
    http://groups.yahoo.com/group/xml-rpc/join
    (Yahoo! ID required)

<*> To change settings via email:
    mailto:xml-rpc-digest <at> yahoogroups.com 
    mailto:xml-rpc-fullfeatured <at> yahoogroups.com

<*> To unsubscribe from this group, send an email to:
    xml-rpc-unsubscribe <at> yahoogroups.com


John Wilson | 10 Sep 2007 20:13

Re: Re: Problem with UTF-8


On 10 Sep 2007, at 15:31, Stephane Bortzmeyer wrote:

> On Mon, Sep 10, 2007 at 02:09:38PM -0000,
>  luisicozgz <luisicozgz <at> yahoo.com> wrote
>  a message of 518 lines which said:
>
>> <member><name>direccion</name>
>
> Is it really direccion and not dirección? Because the result you
> posted has only ASCII characters so I really do not see how it could
> raise UTF-8 problems.

If it is dirección, then what's probably happening is that the server 
is encoding the response in ISO 8859-1 but, as we can see from the 
message dump, it is not emitting an XML declaration that says so.

This is a bug in your server implementation.

John Wilson
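
To make the failure mode concrete, here is a minimal Java sketch (not taken from the thread; the class and method names are made up for illustration). It encodes "ANTÓN" as ISO 8859-1 and then decodes the bytes strictly as UTF-8, which is in effect what an XML parser does when no encoding is declared.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, for illustration only.
final class Latin1AsUtf8Demo {

    /** Decodes bytes as UTF-8 and fails on malformed input instead of substituting U+FFFD. */
    static String decodeStrictUtf8(byte[] bytes) throws CharacterCodingException {
        return StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(bytes))
                .toString();
    }

    public static void main(String[] args) {
        // "\u00D3" is Ó; in ISO 8859-1 it is the single byte 0xD3.
        byte[] latin1 = "ANT\u00D3N".getBytes(StandardCharsets.ISO_8859_1);
        try {
            System.out.println(decodeStrictUtf8(latin1));
        } catch (CharacterCodingException e) {
            // 0xD3 announces a 2-byte UTF-8 sequence, but the next byte
            // (0x4E, 'N') is not a continuation byte, so decoding fails.
            System.out.println("Not valid UTF-8: " + e);
        }
    }
}
```

The byte pattern is the same one behind "Invalid byte 2 of 2-byte UTF-8 sequence": the accented letter looks like the start of a two-byte sequence, and the plain ASCII letter after it cannot be the second byte.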

 

luisicozgz | 13 Sep 2007 16:24

Re: Problem with UTF-8

Hello,

I have discovered that I have a problem with accented characters. I 
received this response:

<methodResponse>
<params>
<param>
<value><array>
<data>
<value><struct>
<member><name>nombre</name>
<value><string>HOSTAL ANTÓN</string></value>
</member>
<member><name>direccion</name>
<value><string>ANTÓN</string></value>
</member>
<member><name>codigo_postal</name>
<value><string>50619</string></value>
</member>
<member><name>zona</name>
<value><int>1</int></value>
</member>
<member><name>poblacion</name>
<value><int>1</int></value>
</member>
</struct></value>
</data>
</array></value>
</param>

John Wilson | 13 Sep 2007 17:11

Re: Re: Problem with UTF-8


On 13 Sep 2007, at 15:24, luisicozgz wrote:

> Hello,
>
> I have discovered that I have a problem with accented characters. I
> received this response:

Answered on the Apache list.

Basically you have to fix the server so that it either really uses 
UTF-8 or emits a header (an XML declaration) naming the encoding it 
actually uses.

You can't fix it at the client end.

John Wilson

 

luisicozgz | 14 Sep 2007 09:41

Re: Problem with UTF-8

Hello,

On the PHP server, I use the following line to create the XML file:

<?xml version="1.0" encoding="utf-8"?>

Is this a valid header, or what else do I need?

The Java source file is encoded in UTF-8.

The XML-RPC response:

<?xml version="1.0"?>      <----- Must encoding="utf-8" appear here?
<methodResponse>
<params>
<param>
<value><array>
<data>
<value><struct>
<member><name>nombre</name>
<value><string>HOTEL ANTÓN</string></value>
</member>
<member><name>direccion</name>
<value><string>HOTEL ANTÓN</string></value>
</member>
<member><name>codigo_postal</name>
<value><string>50619</string></value>
</member>
<member><name>zona</name>
<value><int>1</int></value>

Stephane Bortzmeyer | 14 Sep 2007 11:26

Re: Problem with UTF-8

On Fri, Sep 14, 2007 at 07:41:30AM -0000,
 luisicozgz <luisicozgz <at> yahoo.com> wrote 
 a message of 464 lines which said:

> On the PHP server, I use the following line to create the XML file:

I do not understand. Is the XML-RPC server written in Java or in PHP?

> The Java source file is encoded in UTF-8.

Irrelevant.

> The XML-RPC response:
> 
> <?xml version="1.0"?>      <----- Must encoding="utf-8" appear here?

No, this is the default in XML.

As explained by Gaetano Giunta, there is a discrepancy between the
*claim* (no encoding, which means encoding is UTF-8) and the *reality*
(Latin1 actually used).

To fix the discrepancy, you can change the claim or you can change the
reality. As Gaetano said, choose the easiest way.
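
The claim/reality distinction can be checked with the JDK's own XML parser. The following is only a sketch under assumptions: the class name, method name and sample element are invented for illustration, and it uses the JAXP DOM API rather than Apache XML-RPC. It parses the same text three ways: Latin-1 bytes with no declared encoding (mismatch), Latin-1 bytes with a matching declaration (claim changed), and UTF-8 bytes with no declared encoding (reality changed).

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

// Hypothetical demo class, for illustration only.
final class ClaimVersusReality {

    static final String BODY = "<value><string>HOTEL ANT\u00D3N</string></value>";

    /** Parses raw bytes; the parser takes the encoding from the XML declaration (default UTF-8). */
    static String parseText(byte[] xml)
            throws ParserConfigurationException, SAXException, IOException {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // Reality: the bytes are Latin-1. Claim: none, so the parser assumes UTF-8.
        byte[] mismatch = ("<?xml version=\"1.0\"?>" + BODY)
                .getBytes(StandardCharsets.ISO_8859_1);
        try {
            parseText(mismatch);
        } catch (SAXException | IOException e) {
            System.out.println("Mismatch rejected: " + e.getMessage());
        }

        // Change the claim: declare the encoding that is really used.
        byte[] declared = ("<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>" + BODY)
                .getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(parseText(declared));

        // Change the reality: keep the default claim and really send UTF-8.
        byte[] utf8 = ("<?xml version=\"1.0\"?>" + BODY)
                .getBytes(StandardCharsets.UTF_8);
        System.out.println(parseText(utf8));
    }
}
```

Both fixes produce the same parsed text. The parser is handed a byte stream rather than a Reader, so it is the parser, not the calling code, that decides how to decode the bytes.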

 

Gaetano Giunta | 14 Sep 2007 11:36

Re: Re: Problem with UTF-8

If the server is in PHP, AND you are using the standard xmlrpc library 
(that is, the native xmlrpc extension) to build it (as far as I can 
tell), you can:
- use utf8_encode() on your PHP data before sending it to the xmlrpc 
encoding layer
- use a header() call in your server PHP script to set Content-type: 
text/xml; charset=iso-8859-1 (note: syntax is from memory, it might be a 
little different)
- send a modified XML prologue, as you proposed, using "iso-8859-1".
Any of these will work.

If you are using a different PHP xmlrpc implementation on the server 
side, chances are there is some support for charset encoding in your 
toolkit.
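
For readers on the Java side, the conversion that utf8_encode() performs in PHP (ISO 8859-1 in, UTF-8 out) can be sketched as follows. This is an illustrative equivalent, not PHP's implementation, and the class and method names are invented.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, for illustration only.
final class Latin1ToUtf8 {

    /** Re-encodes ISO 8859-1 bytes as UTF-8; every Latin-1 byte maps to exactly one character. */
    static byte[] transcode(byte[] latin1) {
        String text = new String(latin1, StandardCharsets.ISO_8859_1);
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] latin1 = "ANT\u00D3N".getBytes(StandardCharsets.ISO_8859_1);
        byte[] utf8 = transcode(latin1);
        // Ó is one byte (0xD3) in Latin-1 and two bytes (0xC3 0x93) in UTF-8.
        System.out.println(latin1.length + " bytes -> " + utf8.length + " bytes");
    }
}
```

The conversion is only correct if the input really is ISO 8859-1. Applying it to data that is already UTF-8 would double-encode the accented characters.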

> On Fri, Sep 14, 2007 at 07:41:30AM -0000,
> luisicozgz <luisicozgz <at> yahoo.com> wrote
> a message of 464 lines which said:
>
> > On the PHP server, I use the following line to create the XML file:
>
> I do not understand. Is the XML-RPC server written in Java or in PHP?
>
> > The Java source file is encoded in UTF-8.
>
> Irrelevant.
>
> > The XML-RPC response:
> >
> > <?xml version="1.0"?> <----- Must encoding="utf-8" appear here?
>

John Wilson | 14 Sep 2007 13:50

Re: Re: Problem with UTF-8


On 14 Sep 2007, at 10:36, Gaetano Giunta wrote:

> If the server is in PHP, AND you are using the standard xmlrpc library
> (that is, the native xmlrpc extension) to build it (as far as I can
> tell), you can:
> - use utf8_encode() on your PHP data before sending it to the xmlrpc
> encoding layer
> - use a header() call in your server PHP script to set Content-type:
> text/xml; charset=iso-8859-1 (note: syntax is from memory, it might be a
> little different)
> - send a modified XML prologue, as you proposed, using "iso-8859-1".
> Any of these will work.
>
> If you are using a different PHP xmlrpc implementation on the server
> side, chances are there is some support for charset encoding in your
> toolkit.

Setting the charset on the HTTP Content-type header violates the 
XML-RPC spec and, in general, will have no effect on the behaviour of 
the client.

Most software I know of which consumes XML over HTTP will ignore the  
charset. The problem is that it is almost always wrong (e.g. it is  
omitted but the encoding of the document is not US-ASCII).

I'm pretty sure that Apache XML-RPC ignores the Content-type encoding.

John Wilson

 

Ulrich Schaefer | 14 Sep 2007 14:01

Re: Re: Problem with UTF-8

John Wilson schrieb:
> Most software I know of which consumes XML over HTTP will ignore the  
> charset. The problem is that it is almost always wrong (e.g. it is  
> omitted but the encoding of the document is not US-ASCII).
>
> I'm pretty sure that Apache XML-RPC ignores the Content-type encoding.
>
>   
I agree.

(The following comes from my fairly lengthy experiments with XML-RPC 
interoperability several months ago.)

Due to an "underspecification" in the initial XML-RPC specification, 
there is no guarantee that different implementations really handle 
non-ASCII strings in encodings such as UTF-8, ISO-8859-X or EUC-JP 
correctly in both directions.
The only thing that is defined for the string data type is the exchange 
of 7-bit US-ASCII characters. Playing with headers may help, but not 
necessarily, depending on the implementations used (I'm talking about 
*existing* XML-RPC implementations; using the same Java XML-RPC 
implementation at both ends, for example, is no problem).
My solution (without patching the XML-RPC libraries) for properly 
connecting Python clients to a Java server was to transfer text via 
the binary data type (base64-encoded), encoding from Unicode at the 
sending end and decoding back to Unicode at the receiving end. This 
is bad because of the transcoding overhead, but it was the only 
solution for bidirectional Unicode text exchange that worked correctly, 
even for Japanese characters :-). I can send Java and Python code 
examples if requested.
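
The transformation behind this workaround can be sketched in Java as below. This is not the code offered above, only an illustration under assumptions: the class and method names are invented, and a real client or server would normally hand the byte array to its XML-RPC library and let the library produce the base64 element.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical demo class, for illustration only.
final class Base64TextTransport {

    /** Sender side: Unicode text -> UTF-8 bytes -> ASCII-only base64 payload. */
    static String wrap(String text) {
        return Base64.getEncoder().encodeToString(text.getBytes(StandardCharsets.UTF_8));
    }

    /** Receiver side: base64 payload -> UTF-8 bytes -> Unicode text. */
    static String unwrap(String payload) {
        return new String(Base64.getDecoder().decode(payload), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Sample text with an accented letter and three Japanese characters.
        String original = "HOTEL ANT\u00D3N \u65E5\u672C\u8A9E";
        String payload = wrap(original);
        System.out.println(payload);
        System.out.println(unwrap(payload).equals(original));
    }
}
```

Because the payload contains only ASCII characters, it survives any XML-RPC implementation that handles 7-bit strings correctly. The cost is the extra encoding step and a payload roughly one third larger than the UTF-8 bytes.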


John Wilson | 14 Sep 2007 14:57

Re: Re: Problem with UTF-8


On 14 Sep 2007, at 13:01, Ulrich Schaefer wrote:

> John Wilson schrieb:
>> Most software I know of which consumes XML over HTTP will ignore the
>> charset. The problem is that it is almost always wrong (e.g. it is
>> omitted but the encoding of the document is not US-ASCII).
>>
>> I'm pretty sure that Apache XML-RPC ignores the Content-type
>> encoding.
>>
>>
> I agree.
>
> (the following is from my longish experiments with XML-RPC
> interoperability several months ago)
>
> Due to an "underspecification" in the initial XML-RPC specification,
> there is no guarantee that different implementations really treat
> 8 bit encoded strings such as UTF-8, ISO-8859-X or EUC-JP correctly
> in both directions.
> The only thing that is defined for the string data type is exchange
> of 7 bit US-ASCII characters. Playing with headers may help, but not
> necessarily, depending on the implementations used (I'm talking about
> *existing* XML-RPC implementations; using the same Java XML-RPC
> implementation at both ends e.g. is no problem).
> My solution (without patching the XML-RPC libraries) for properly
> connecting Python clients with a Java Server was to implement the

Gaetano Giunta | 13 Sep 2007 16:32

Re: Re: Problem with UTF-8

Either will work. Find out which change is the easiest to make.

> Hello,
>
> I have discovered that I have a problem with accented characters. I
> received this response:
>
> <methodResponse>
> <params>
> <param>
> <value><array>
> <data>
> <value><struct>
> <member><name>nombre</name>
> <value><string>HOSTAL ANTÓN</string></value>
> </member>
> <member><name>direccion</name>
> <value><string>ANTÓN</string></value>
> </member>
> <member><name>codigo_postal</name>
> <value><string>50619</string></value>
> </member>
> <member><name>zona</name>
> <value><int>1</int></value>
> </member>
> <member><name>poblacion</name>
> <value><int>1</int></value>
> </member>
> </struct></value>
> </data>

