Benjamin Wolff Bohl | 5 Jun 2012 19:36

TEI, XSL-FO, and Glyphs

Hi everybody,
I was wondering whether anybody has a best-practice way of handling 
glyphs (especially Asian ones) when transforming TEI files to PDF using 
oXygen. Of course I could select a font containing the glyphs, but that 
would make me check every glyph beforehand. Is there a way of 
falling back for glyphs not available in the font my XSL-FO file specifies?

Best wishes,
Benjamin

-- 
Benjamin Wolff Bohl

***********************************************************
Edirom - Projekt "Digitale Musikedition"
Musikwissenschaftliches Seminar Detmold/Paderborn
Gartenstraße 20
D – 32756 Detmold

Tel. +49 (0) 5231 / 975-669
Fax: +49 (0) 5231 / 975-668

http://www.edirom.de
***********************************************************

Marcus Bingenheimer | 6 Jun 2012 06:46

TEI, XSL-FO, and Glyphs

Hi Benjamin,

It has been some years since I used XSL-FO with Chinese, so things might have changed, but here are my five cents anyway:
 
> I was wondering, whether anybody has a best practice way of handling Glyphs (especially asian ones) when transforming TEI files to PDF using oXygen. Of course if I select a font containing the glyphs but that would make me check on every glyph beforehand.

If by "glyphs" you mean CJK ideographs, any large font like PMingLiU (新細明體) (on Windows) should work.
If you must assume that your document contains (very) rare characters (Unicode CJK Unified Ideographs Extension B, C or D), then you should expect to see a square in the output wherever the font lacks the glyph.
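
To find out in advance whether a text contains such rare characters at all, a small script can help. The following is only a minimal sketch in Python, not something from an actual project: the function name `rare_cjk` is made up for illustration, and the block boundaries are the Unicode block ranges for Extensions B, C and D.

```python
# Unicode block ranges for the rare CJK extensions mentioned above.
EXTENSION_BLOCKS = {
    "Extension B": (0x20000, 0x2A6DF),
    "Extension C": (0x2A700, 0x2B73F),
    "Extension D": (0x2B740, 0x2B81F),
}


def rare_cjk(text):
    """Return a dict mapping block name to the set of characters found in it.

    `rare_cjk` is a hypothetical helper name, not part of any library.
    """
    found = {}
    for ch in text:
        cp = ord(ch)
        for name, (lo, hi) in EXTENSION_BLOCKS.items():
            if lo <= cp <= hi:
                found.setdefault(name, set()).add(ch)
    return found


if __name__ == "__main__":
    sample = "abc \U00020000 新細明體"
    for block, chars in rare_cjk(sample).items():
        # Print the codepoints so they can be checked against a font.
        print(block, sorted(f"U+{ord(c):05X}" for c in chars))
```

If the result is empty, an ordinary large CJK font is probably enough; otherwise the listed codepoints are the ones to check against an Extension B (or later) font.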
 
> Is there a way of fallback handling glyphs not available in the font my xml-fo file specifies?

I seem to remember that the default XSL-FO engine in oXygen (Apache FOP) would not work with the PMingLiU Extension B font (PMingLiU-ExtB), but you could try the Han Nom A and B fonts.
In any case, the default engine is relatively weak. I am involved with two projects that include text with lots of rare CJK characters. We produce PDFs from TEI via OpenOffice, which provides sufficient formatting options for our purposes, certainly more than XSL-FO, but of course fewer than LaTeX or InDesign.

all the best

marcus

--
Dr. Marcus Bingenheimer 馬德偉
Department of Religion, Temple University

Conal Tuohy | 6 Jun 2012 07:33

Re: TEI, XSL-FO, and Glyphs

Hi Benjamin

Years ago I remember generating font-metrics files and building a 
userconfig.xml file for FOP, in order to be able to print special 
glyphs. I don't remember all the details, but this, for instance, might 
be helpful: http://www.firebirdsql.org/manual/fontembed.html

FOP also has a font-substitution feature which may be helpful: you could 
group all your fonts into one substitution group and rely on this 
feature to choose the best font: 
http://xmlgraphics.apache.org/fop/1.0/fonts.html#substitution

In the worst case you may have to explicitly map characters to the 
appropriate fonts:

You could specify in your XSL-FO document which font to use for a 
particular character. You could do this with an XSLT that post-processes 
the XSL-FO, matching particular characters and wrapping them in 
<fo:inline> elements in order to specify a font with a matching glyph. 
See http://xmlgraphics.apache.org/fop/faq.html#pdf-characters
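
The wrapping step described above might look roughly like this. Note that this is a sketch in Python with the standard-library ElementTree rather than the XSLT suggested above, simply because it is easy to try out. The function name `wrap_runs`, the character ranges and the font name are assumptions for illustration, and it only handles the direct text of one element, not the text following child elements.

```python
import re
import xml.etree.ElementTree as ET

FO_NS = "http://www.w3.org/1999/XSL/Format"
ET.register_namespace("fo", FO_NS)

# Assumed ranges: roughly the CJK blocks of the BMP plus the
# supplementary ideographic extensions. Adjust to your own data.
CJK_RUN = re.compile("([\u2e80-\u9fff\uf900-\ufaff\U00020000-\U0002fa1f]+)")


def wrap_runs(block, font_family):
    """Wrap CJK runs in the direct text of `block` in fo:inline children."""
    # With a capturing group, split() returns an odd-length list:
    # [other, cjk, other, cjk, ..., other]
    parts = CJK_RUN.split(block.text or "")
    block.text = parts[0]
    inlines = []
    for i in range(1, len(parts), 2):
        inline = ET.Element(f"{{{FO_NS}}}inline", {"font-family": font_family})
        inline.text = parts[i]      # the CJK run itself
        inline.tail = parts[i + 1]  # the ordinary text that follows it
        inlines.append(inline)
    # The new inlines come before any children the block already had.
    for pos, inline in enumerate(inlines):
        block.insert(pos, inline)


if __name__ == "__main__":
    fo = f'<fo:block xmlns:fo="{FO_NS}">abc 新細明體 def</fo:block>'
    block = ET.fromstring(fo)
    wrap_runs(block, "PMingLiU")
    print(ET.tostring(block, encoding="unicode"))
```

A complete post-processor would also have to walk the whole tree and treat the tail text of child elements in the same way.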

To identify which fonts contain a glyph for a particular character, you 
could use the FOP font metrics tools, or you could try this (Microsoft 
Windows) utility program: http://wiki.digitalclassicist.org/Find_Glyph

I hope that's helpful!

Conal

-- 
Conal Tuohy
eResearch Business Analyst
Victorian eResearch Strategic Initiative
+61-466324297

Benjamin Wolff Bohl | 7 Jun 2012 09:44

Re: TEI, XSL-FO, and Glyphs

Hi Marcus,
hi Conal,

thank you for your advice.
As I understand from what you've said, I will have to "know", i.e. modify my data.
What I've done so far is generate font-metrics files for all the fonts I 
want to embed and reference them directly in my fop-config.xml. Moreover, 
I decided on using the "Arial Unicode MS" font for anything "foreign". In 
order to find the glyphs and automatically tag them with <foreign>, I 
searched all my TEI files with regular expressions matching certain 
Unicode codepoint ranges.
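
For anyone wanting to try the same, the regular-expression tagging could be sketched like this in Python. The codepoint ranges below (Greek, Cyrillic, kana and the main CJK block) are only placeholders and not the ranges actually used here, and `tag_foreign` is a made-up name. The sketch works on plain text content; it does not guard against runs that are already tagged or that sit inside attribute values.

```python
import re

# Placeholder ranges (an assumption): Greek, Cyrillic, kana, CJK ideographs.
FOREIGN_RUN = re.compile("[\u0370-\u03ff\u0400-\u04ff\u3040-\u30ff\u4e00-\u9fff]+")


def tag_foreign(text):
    """Wrap each run of 'foreign' characters in a <foreign> element."""
    return FOREIGN_RUN.sub(lambda m: f"<foreign>{m.group(0)}</foreign>", text)


if __name__ == "__main__":
    print(tag_foreign("Titel 音楽 und Text"))
```

The stylesheet can then select a font with wide coverage for every <foreign> element and leave the main text font alone.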

Thanks,
Benjamin




