[jira] Commented: (UIMA-387) XMI Serializer can write invalid control characters


    [ https://issues.apache.org/jira/browse/UIMA-387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12492934 ]

Marshall Schor commented on UIMA-387:
-------------------------------------

I don't think we should (silently) change user data (e.g., by replacing funny characters with spaces).  I
would prefer the XML 1.1 approach, unless someone has a reason that 1.0 is needed.

That still leaves the 0x00 character, which is not valid even in XML 1.1.  Could we output something that is
valid XML but that our deserializer could convert back to 0x00 when reading it in?  I suppose if we came up
with such a mechanism, it could also be used in XML 1.0 for all the "bad" characters.  Maybe something like
outputting a special XML element we define, which holds a hex representation of the bad character(s)?
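
To make the idea concrete, here is a rough sketch of what such a round trip might look like.  The element
name "badchar", its "hex" attribute, and the class and method names are all made up for illustration; none
of this is existing UIMA or XMI API.  It only covers element text content (an element cannot appear inside
an attribute value, so attributes would need some other encoding), and it uses the XML 1.0 character ranges.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch only: "badchar" is a hypothetical element, not part of UIMA or XMI.
final class ControlCharEscaper {
    private static final Pattern BAD_CHAR =
        Pattern.compile("<badchar hex=\"([0-9A-Fa-f]+)\"/>");

    // Character ranges allowed by the XML 1.0 "Char" production.
    public static boolean isValidXml10(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Turns raw text into an XML fragment; invalid characters become
    // <badchar hex="..."/> instead of being dropped or replaced.
    public static String escape(String raw) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < raw.length(); ) {
            int cp = raw.codePointAt(i);
            i += Character.charCount(cp);
            if (!isValidXml10(cp)) {
                out.append("<badchar hex=\"")
                   .append(Integer.toHexString(cp))
                   .append("\"/>");
            } else if (cp == '&') {
                out.append("&amp;");
            } else if (cp == '<') {
                out.append("&lt;");
            } else if (cp == '>') {
                out.append("&gt;");
            } else {
                out.appendCodePoint(cp);
            }
        }
        return out.toString();
    }

    // Reverses escape(): restores the original characters from the fragment.
    public static String unescape(String xml) {
        Matcher m = BAD_CHAR.matcher(xml);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            out.append(unescapeEntities(xml.substring(last, m.start())));
            out.appendCodePoint(Integer.parseInt(m.group(1), 16));
            last = m.end();
        }
        out.append(unescapeEntities(xml.substring(last)));
        return out.toString();
    }

    // "&amp;" must be handled last so "&amp;lt;" is not decoded twice.
    private static String unescapeEntities(String s) {
        return s.replace("&lt;", "<").replace("&gt;", ">").replace("&amp;", "&");
    }
}
```

Because a literal "<" in user data is always written as "&lt;", a string that merely looks like the marker
element cannot be mistaken for one when reading back.  A real deserializer would of course see the marker as
a parsed element rather than match it with a regex; the regex just keeps the sketch short.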

How does EMF handle this?

-Marshall

> XMI Serializer can write invalid control characters
> ---------------------------------------------------
>
>                 Key: UIMA-387
>                 URL: https://issues.apache.org/jira/browse/UIMA-387
>             Project: UIMA
>          Issue Type: Bug
>          Components: Core Java Framework
>    Affects Versions: 2.1
>            Reporter: Adam Lally