Paul Taylor | 9 Apr 2010 13:48

Which byte order should be used when using UTF16 BOM with ID3v23

Hi, ID3v2.3 doesn't support UTF-8, but it does support UTF-16 with a BOM, i.e. 2 
bytes per character, stored either Most Significant Byte (MSB) first or 
Least Significant Byte (LSB) first, as indicated by the BOM, which is either 
0xFF 0xFE (LSB first) or 0xFE 0xFF (MSB first).

The trouble is that 0xFF 0xFE matches the MPEG synchronization pattern, 
and if you unsynchronize the tag to avoid that, many applications don't 
understand unsynchronization. Whereas if you use 0xFE 0xFF you don't need 
unsynchronization, but I don't think Windows likes this byte order.
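
For reference, a minimal Python sketch of the issue described above. It assumes the ID3v2.3 rule that a false sync is 0xFF followed by a byte whose top three bits are set, and that unsynchronization inserts a 0x00 after such a 0xFF (and after any 0xFF that is followed by 0x00). The function names are made up for illustration and do not come from any tagging library.

```python
def has_false_sync(data: bytes) -> bool:
    """Return True if data contains 0xFF followed by a byte 111xxxxx."""
    return any(a == 0xFF and (b & 0xE0) == 0xE0 for a, b in zip(data, data[1:]))


def unsynchronise(data: bytes) -> bytes:
    """Insert 0x00 after every 0xFF that is followed by 111xxxxx or by 0x00.

    Sketch only: a trailing 0xFF at the very end of the data is left alone.
    """
    out = bytearray()
    for i, byte in enumerate(data):
        out.append(byte)
        if byte == 0xFF and i + 1 < len(data):
            nxt = data[i + 1]
            if nxt == 0x00 or (nxt & 0xE0) == 0xE0:
                out.append(0x00)
    return bytes(out)


LE_BOM = b"\xff\xfe"  # LSB first
BE_BOM = b"\xfe\xff"  # MSB first

print(has_false_sync(LE_BOM))  # True
print(has_false_sync(BE_BOM))  # False
print(unsynchronise(LE_BOM))   # b'\xff\x00\xfe'
```

Note that once a tag is unsynchronized for any reason, every 0xFF 0x00 pair in it also has to be expanded, so the choice of BOM only helps if the rest of the tag is free of false syncs too.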

Has anybody else had similar problems and come up with the best-supported 
solution?

Thanks, Paul
Mathias Kunter | 10 Apr 2010 12:14

Re: Which byte order should be used when using UTF16 BOM with ID3v23

Yes, unsynchronized tags aren't supported very well. However, virtually all software and hardware MP3 players support ID3 version 2 tags today, at least to the extent of skipping them if they're present in an MP3 file. It should therefore rarely be necessary to unsynchronize an ID3 tag at all.

If you need to ensure compatibility with (old) software or hardware MP3 implementations that don't support ID3 version 2 tags and therefore actually scan the tag for an MP3 synchronization pattern, I would avoid the 0xFF 0xFE byte order mark. You could then either omit the byte order mark altogether and encode the string as big endian (as specified by the Unicode standard), or explicitly use the big-endian 0xFE 0xFF byte order mark. Most applications that support UTF-16 should also be able to decode a big-endian string!
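
A rough Python sketch of the explicit big-endian option, assuming the ID3v2.3 text layout of one encoding byte (0x01 for Unicode with BOM) followed by the BOM and the string. The helper name `encode_v23_text` is hypothetical. Python's plain "utf-16" codec picks the platform's byte order for the BOM, so the BOM is written by hand here.

```python
import codecs


def encode_v23_text(text: str, big_endian: bool = True) -> bytes:
    """Build an ID3v2.3-style text payload: 0x01, BOM, UTF-16 data.

    Hypothetical helper for illustration; frame header and size are not included.
    """
    if big_endian:
        return b"\x01" + codecs.BOM_UTF16_BE + text.encode("utf-16-be")
    return b"\x01" + codecs.BOM_UTF16_LE + text.encode("utf-16-le")


print(encode_v23_text("Hi").hex(" "))                    # 01 fe ff 00 48 00 69
print(encode_v23_text("Hi", big_endian=False).hex(" "))  # 01 ff fe 48 00 69 00
```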

Mathias K.


---------------------------------------------------------------------
To unsubscribe, e-mail: id3v2-unsubscribe <at> id3.org
For additional commands, e-mail: id3v2-help <at> id3.org


Paul Taylor | 10 Apr 2010 23:16

Re: Which byte order should be used when using UTF16 BOM with ID3v23

Hi Mathias
Mathias Kunter wrote:
> Yes, unsynchronized tags aren't supported very well. However, virtually 
> all software and hardware MP3 players support ID3 version 2 tags 
> today, at least to the extent of skipping them if they're present in 
> an MP3 file. It should therefore rarely be necessary to unsynchronize 
> an ID3 tag at all.
>
If you have an APIC frame, it's very likely to contain bytes that fall foul 
of the unsynchronization scheme, and if you don't unsynchronize, 
that image WILL NOT display correctly in iTunes. So this is one 
example where unsynchronization is needed for newer software: not for 
the music to play okay, but for the metadata to display okay.
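
To illustrate why image data tends to trip the pattern, here is a small Python sketch that lists false-sync offsets in the first bytes of a typical JFIF-style JPEG file (the SOI marker 0xFF 0xD8 followed by the APP0 marker 0xFF 0xE0). The function name is made up for this example, and it says nothing about how iTunes itself parses the frame.

```python
from typing import List


def false_sync_offsets(data: bytes) -> List[int]:
    """Offsets of each 0xFF that is followed by a byte with its top three bits set."""
    return [
        i
        for i in range(len(data) - 1)
        if data[i] == 0xFF and (data[i + 1] & 0xE0) == 0xE0
    ]


# First bytes of a JFIF file: SOI, APP0 marker, segment length, "JFIF\0".
JFIF_HEADER = b"\xff\xd8\xff\xe0\x00\x10JFIF\x00"

print(false_sync_offsets(JFIF_HEADER))  # [2]
```

The SOI marker itself is harmless (0xD8 is 110xxxxx), but the APP0 marker right after it already matches.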
> If you need to ensure compatibility with (old) software or hardware 
> MP3 implementations that don't support ID3 version 2 tags and 
> therefore actually scan the tag for an MP3 synchronization pattern, I 
> would avoid the 0xFF 0xFE byte order mark. You could then either omit 
> the byte order mark altogether and encode the string as big endian (as 
> specified by the Unicode standard), or explicitly use the big-endian 
> 0xFE 0xFF byte order mark. Most applications that support UTF-16 
> should also be able to decode a big-endian string!
OK, I'll take another look at BE (I thought it caused problems for 
Windows, but perhaps my diagnosis was wrong), but I don't think you can 
just drop the BOM; that's breaking the ID3 standard.

>
> Mathias K.
Paul
Mathias Kunter | 11 Apr 2010 17:53

Re: Which byte order should be used when using UTF16 BOM with ID3v23

> If you have an APIC frame, it's very likely to contain bytes that fall foul of the unsynchronization scheme,
> and if you don't unsynchronize, that image WILL NOT display correctly in iTunes. So this is
> one example where unsynchronization is needed for newer software: not for the music to play okay, but
> for the metadata to display okay.

Yes, it's a pain...

> OK, I'll take another look at BE (I thought it caused problems for Windows, but perhaps my diagnosis was
> wrong), but I don't think you can just drop the BOM; that's breaking the ID3 standard.

Ah yes, ID3 specifies that a BOM must be present (the ISO specification of UTF-16 doesn't; I remembered incorrectly). Windows Media Player stores strings within ID3 tags as little endian (as most Unicode strings on the Windows platform are stored), but I'm not aware of any problems caused by big-endian strings. However, I haven't tested this with all common versions of Windows Media Player.
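
On the reading side, a tolerant decoder can simply honour whichever BOM it finds. A minimal Python sketch, assuming the bytes passed in are the string data after the encoding byte; `decode_bom_utf16` is a hypothetical name, and rejecting a missing BOM is a choice made for this example rather than something the thread prescribes.

```python
import codecs


def decode_bom_utf16(raw: bytes) -> str:
    """Decode UTF-16 data using the byte order announced by its BOM."""
    if raw.startswith(codecs.BOM_UTF16_LE):
        return raw[2:].decode("utf-16-le")
    if raw.startswith(codecs.BOM_UTF16_BE):
        return raw[2:].decode("utf-16-be")
    # ID3v2.3 requires a BOM, so this sketch refuses to guess the byte order.
    raise ValueError("UTF-16 string has no byte order mark")


little = b"\xff\xfeH\x00i\x00"
big = b"\xfe\xff\x00H\x00i"

print(decode_bom_utf16(little))  # Hi
print(decode_bom_utf16(big))     # Hi
```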

Mathias


