Andrew Coppin | 12 Oct 12:18

Pandoc questions

There doesn't seem to be any option to make Pandoc produce actual MathML 
output. Is there a reason for this?

(The only option I can see is to spit out raw LaTeX plus a 70KB 
JavaScript program to transform this into MathML at the client end --- 
which seems a little silly to me. There's also no way to style the raw 
LaTeX differently in case JavaScript is unavailable.)

Also, while Markdown *almost* does what I want, there are a few small 
constructs it doesn't have. For example, I'd like to have some way to 
denote a "term" the first time I use it. I could just use italics, but 
I'd prefer some way to visually indicate that this isn't just an 
emphasised word, it's a new technical term. In HTML, I'd use a style 
class, and in LaTeX I'd define a new command. But I can't see a way to 
do something that will still allow Pandoc to generate correct LaTeX 
*and* correct HTML from a single Markdown source... Any hints?

David Menendez | 12 Oct 17:29

Re: Pandoc questions

On Sun, Oct 12, 2008 at 6:21 AM, Andrew Coppin
<andrewcoppin <at> btinternet.com> wrote:
> Also, while Markdown *almost* does what I want, there are a few small
> constructs it doesn't have. For example, I'd like to have some way to denote
> a "term" the first time I use it. I could just use italics, but I'd prefer
> some way to visually indicate that this isn't just an emphasised word, it's
> a new technical term. In HTML, I'd use a style class,

Markdown allows arbitrary HTML tags, so you can just put the terms in
a <dfn> element.

I don't know if that will work with the LaTeX conversion. Markdown is
specifically designed to produce HTML, so it's not clear to me how
Pandoc does any of the non-HTML output formats.

-- 
Dave Menendez <dave <at> zednenem.com>
<http://www.eyrie.org/~zednenem/>

Andrew Coppin | 12 Oct 17:45

Re: Pandoc questions

David Menendez wrote:
> Markdown allows arbitrary HTML tags, so you can just put the terms in
> a <dfn> element.
>
> I don't know if that will work with the LaTeX conversion. Markdown is
> specifically designed to produce HTML, so it's not clear to me how
> Pandoc does any of the non-HTML output formats.
>   

Well, no... Markdown is a way of marking up text so that it is still 
moderately readable to human beings. You can turn it into any markup 
format in principle. The trouble is, as soon as Pandoc doesn't 
understand the markup, you can't really expect it to handle the 
translation any more...
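
To make the "any markup format in principle" point concrete, here is a
small sketch of one source representation rendered to both HTML and
LaTeX. The Inline type, the "term" class name and the \term command are
all assumptions made up for this example; they are not Pandoc's data
structures or options, and the LaTeX side assumes a definition such as
\newcommand{\term}[1]{\textbf{#1}} in the preamble.

```haskell
-- Hypothetical inline elements; not Pandoc's own types.
data Inline
  = Plain String  -- ordinary text
  | Emph String   -- emphasised text
  | Term String   -- first use of a technical term

-- Render one element as HTML. No escaping is done, to keep the sketch short.
toHtml :: Inline -> String
toHtml (Plain s) = s
toHtml (Emph s)  = "<em>" ++ s ++ "</em>"
toHtml (Term s)  = "<dfn class=\"term\">" ++ s ++ "</dfn>"

-- Render one element as LaTeX. \term is an assumed user-defined command.
toLaTeX :: Inline -> String
toLaTeX (Plain s) = s
toLaTeX (Emph s)  = "\\emph{" ++ s ++ "}"
toLaTeX (Term s)  = "\\term{" ++ s ++ "}"

-- A sample sentence that distinguishes a new term from plain emphasis.
example :: [Inline]
example = [Plain "A ", Term "functor", Plain " is ", Emph "not", Plain " a container."]

htmlOut, latexOut :: String
htmlOut  = concatMap toHtml example
latexOut = concatMap toLaTeX example
```

Because Term is a separate constructor, each writer can style it
differently from plain emphasis. That distinction is what tends to get
lost once the markup is something the converter does not understand.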

David Menendez | 12 Oct 22:30

Re: Pandoc questions

On Sun, Oct 12, 2008 at 11:45 AM, Andrew Coppin
<andrewcoppin <at> btinternet.com> wrote:
> David Menendez wrote:
>>
>> Markdown allows arbitrary HTML tags, so you can just put the terms in
>> a <dfn> element.
>>
>> I don't know if that will work with the LaTeX conversion. Markdown is
>> specifically designed to produce HTML, so it's not clear to me how
>> Pandoc does any of the non-HTML output formats.
>>
>
> Well, no... Markdown is a way of marking up text in a way which is still
> moderately readable to human beings. You can turn it into any markup format
> in principle. The trouble is, as soon as Pandoc doesn't understand the
> markup, you can't really expect it to handle the translation any more...

The first sentence on the Markdown web page is: "Markdown is a
text-to-HTML conversion tool for web writers."

<http://daringfireball.net/projects/markdown/>

The Markdown syntax guide states: "Markdown's syntax is intended for
one purpose: to be used as a format for writing for the web. ... For
any markup that is not covered by Markdown's syntax, you simply use
HTML itself."

Markdown text may contain arbitrary HTML blocks. Any attempt to
produce LaTeX or PDF from Markdown must therefore be able to convert
arbitrary HTML.

John MacFarlane | 16 Oct 23:49

Re: Pandoc questions

+++ Andrew Coppin [Oct 12 08 11:21 ]:
> There doesn't seem to be any option to make Pandoc produce actual MathML  
> output. Is there a reason for this?

1. Nobody has written the LaTeX -> MathML code yet, and I've been too
lazy.  Anyone who is interested in doing this should get in touch.

2. Not all browsers can process MathML. The current system (using the
LaTeXMathML.js javascript) has the advantage of "falling back" to raw
LaTeX in browsers that don't support MathML.

John

Andrew Coppin | 17 Oct 19:30

Re: Pandoc questions

John MacFarlane wrote:
> +++ Andrew Coppin [Oct 12 08 11:21 ]:
>   
>> There doesn't seem to be any option to make Pandoc produce actual MathML  
>> output. Is there a reason for this?
>>     
>
> 1. Nobody has written the LaTeX -> MathML code yet, and I've been too
> lazy.  Anyone who is interested in doing this should get in touch.
>   

Well, I'd certainly be "interested". I use mathematics *a lot* in my 
writing. Presumably modifying a large program like Pandoc is intractably 
difficult though?

It strikes me that perhaps using LaTeX to enter mathematical markup is 
rather against the spirit of Markdown. Surely there should be an option 
to include raw LaTeX, but a more "natural" encoding that covers "most" 
mathematics would be nice also. Of course, that means somebody has to 
design it first...

> 2. Not all browsers can process MathML. The current system (using the
> LaTeXMathML.js javascript) has the advantage of "falling back" to raw
> LaTeX in browsers that don't support MathML.
>   

It's been a while since I looked, but I believe the spec offers a way 
to provide an "alternative" block of XML, similar to the 'alt' attribute 
of the <img> element, for precisely this reason. (And if there was a math 
converter, rather than raw LaTeX you could provide something a little 
[...]

John MacFarlane | 18 Oct 03:12

Re: Pandoc questions

>> 1. Nobody has written the LaTeX -> MathML code yet, and I've been too
>> lazy.  Anyone who is interested in doing this should get in touch.
>>   
>
> Well, I'd certainly be "interested". I use mathematics *a lot* in my  
> writing. Presumably modifying a large program like Pandoc is intractably  
> difficult though?

Just write a separate library that parses LaTeX input and returns MathML
output. Pandoc could then use this library. So you wouldn't need to know
anything about pandoc's internals. Just write a function 

teXMathToMathML :: String -> String. 

This would be a great contribution!  You could get a head start by
looking at the LaTeXMathML.js code.
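
As a rough illustration of the shape such a library function could
take, here is a minimal sketch. It is not code from Pandoc or from
LaTeXMathML.js; it assumes a deliberately tiny subset of TeX math
(single-letter identifiers, integers, the operators + - = ( ), brace
groups, and ^ / _ scripts), and every helper name other than
teXMathToMathML is made up for the example.

```haskell
import Data.Char (isAlpha, isDigit, isSpace)

-- | Convert a tiny subset of TeX math to Presentation MathML.
-- Input after an unmatched top-level '}' is ignored in this sketch.
teXMathToMathML :: String -> String
teXMathToMathML s =
  "<math xmlns=\"http://www.w3.org/1998/Math/MathML\">"
    ++ mrow (fst (items s)) ++ "</math>"

-- Wrap a body in an element with the given name.
tag :: String -> String -> String
tag name body = "<" ++ name ++ ">" ++ body ++ "</" ++ name ++ ">"

mrow :: [String] -> String
mrow = tag "mrow" . concat

-- Parse items until end of input or a closing '}'.
items :: String -> ([String], String)
items str = case dropWhile isSpace str of
  ""        -> ([], "")
  r@('}':_) -> ([], r)
  r         -> let (x, r')   = scripts (atom r)
                   (xs, r'') = items r'
               in (x : xs, r'')

-- Attach any ^ or _ scripts that directly follow a base.
scripts :: (String, String) -> (String, String)
scripts (base, c:r)
  | c == '^' = attach "msup"
  | c == '_' = attach "msub"
  where
    attach name = let (arg, r') = atom (dropWhile isSpace r)
                  in scripts (tag name (base ++ arg), r')
scripts done = done

-- Parse one atom: a brace group, a number, a letter or an operator.
atom :: String -> (String, String)
atom ('{':r) = let (xs, r') = items r in (mrow xs, drop 1 r') -- skip '}'
atom s@(c:r)
  | isDigit c        = let (n, r') = span isDigit s in (tag "mn" n, r')
  | isAlpha c        = (tag "mi" [c], r)
  | c `elem` "+-=()" = (tag "mo" [c], r)
  | otherwise        = (tag "mtext" (escape c), r) -- unknown: keep as text
atom "" = (mrow [], "")

-- Escape the characters that are special in XML text.
escape :: Char -> String
escape '<' = "&lt;"
escape '>' = "&gt;"
escape '&' = "&amp;"
escape c   = [c]
```

For example, teXMathToMathML "x^2+1" puts an <msup> for x squared, an
<mo> for the plus sign and an <mn> for 1 inside a single <mrow>.
Control sequences such as \frac or \alpha are not recognised here; a
real implementation would need a proper tokenizer and a symbol table.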

> It strikes me that perhaps using LaTeX to enter mathematical markup is  
> rather against the spirit of Markdown. Surely there should be an option  
> to include raw LaTeX, but a more "natural" encoding that covers "most"  
> mathematics would be nice also. Of course, that means somebody has to  
> design it first...

I think it makes good sense to use LaTeX, which is already designed to
be natural but flexible, and is already known by most mathematicians.
My guess is that in designing a more natural format, one would
eventually reinvent something like LaTeX...

John

Andrew Coppin | 18 Oct 10:21

Re: Pandoc questions

John MacFarlane wrote:
>>> 1. Nobody has written the LaTeX -> MathML code yet, and I've been too
>>> lazy.  Anyone who is interested in doing this should get in touch.
>>>   
>>>       
>> Well, I'd certainly be "interested". I use mathematics *a lot* in my  
>> writing. Presumably modifying a large program like Pandoc is intractably  
>> difficult though?
>>     
>
> Just write a separate library that parses LaTeX input and returns MathML
> output. Pandoc could then use this library. So you wouldn't need to know
> anything about pandoc's internals. Just write a function 
>
> teXMathToMathML :: String -> String. 
>
> This would be a great contribution!  You could get a head start by
> looking at the LaTeXMathML.js code.
>   

OK. I'll give that a go at some point...

> I think it makes good sense to use LaTeX, which is already designed to
> be natural but flexible, and is already known by most mathematicians.
>   

Seems like a valid argument.

> My guess is that in designing a more natural format, one would
> eventually reinvent something like LaTeX...

Andy Smith | 24 Oct 03:21

Re: Re: Pandoc questions

2008/10/17 Andrew Coppin <andrewcoppin <at> btinternet.com>:
> It strikes me that perhaps using LaTeX to enter mathematical markup is
> rather against the spirit of Markdown. Surely there should be an option to
> include raw LaTeX, but a more "natural" encoding that covers "most"
> mathematics would be nice also. Of course, that means somebody has to design
> it first...

Here's something along those lines, which I found recently on the W3C
MathML software page:

http://www1.chapman.edu/~jipsen/asciimath.html

It's a converter from an ASCII syntax to Presentation MathML, written
in JavaScript to allow mathematical notation on web pages to be
converted to MathML in browsers that support it, or kept as ASCII in
browsers that don't. There's a specification of the ASCII syntax which
would be a good starting point if you want to write another
implementation:

http://www1.chapman.edu/~jipsen/mathml/asciimathsyntax.html

Presumably this can't express everything that MathML can (and it
doesn't deal with Content MathML), so it would be useful to support
MathML in the source, like Markdown allows inline HTML, or LaTeX.

Andy
