Gabriel Wicke | 28 Dec 2011 13:35

Re: Refactoring in progress with parserTests

On 12/28/2011 05:45 AM, Neil Kandalgaonkar wrote:
> I pulled out most of the parser-y parts from the parserTests, leaving 
> behind just tests.

Very good, this was really needed.

> However, the parser is still a bit of a monster object, hence the 
> deliberately silly name, ParserThingy.
> 
> I'm trying to decompose it into a chain, roughly like:

The current implementation already operates as a chain, as documented in
https://www.mediawiki.org/wiki/Future/Parser_development:

PEG wiki/HTML tokenizer         (or other tokenizers / SAX-like parsers)
    | Chunks of tokens
    V
Token stream transformations
    | Chunks of tokens
    V
HTML5 tree builder
    | HTML5 DOM tree
    V
DOM Postprocessors
    | HTML5 DOM tree
    +------------------> (X)HTML serialization
    |
    V
DomConverter
    | WikiDom

Neil Kandalgaonkar | 28 Dec 2011 14:24

Re: Refactoring in progress with parserTests

Sorry, I didn't mean to imply that this division was my idea or 
anything. The phases of parsing are explicit already. By 'monster' 
object I don't mean that it is large or incomprehensible, but that it 
has a few too many responsibilities to be easy to test.

For instance, right now it's returning its output as a property of 
itself, and the serializer is sort of added on later. The pipeline 
should be a bit clearer and more stateless.
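To make the "stateless" point concrete, here is a minimal sketch of a pipeline in which every stage is a plain function that returns its output instead of storing it as a property. Everything in it is an assumption made for illustration: the names tokenize, transformTokens, buildTree, serialize and parseToHtml are hypothetical, the toy tokenizer only understands '' as an italic toggle, and none of it is the actual parser code.

```typescript
// Hypothetical, heavily simplified token and tree types.
type Token = { kind: "text"; text: string } | { kind: "toggle"; tag: string };
type TreeNode = { tag: string; children: Array<TreeNode | string> };

// Stage 1: toy tokenizer. Only '' (italic toggle) is recognised.
function tokenize(source: string): Token[] {
  return source
    .split(/('')/) // the capture group keeps the '' separators in the result
    .filter((part) => part.length > 0)
    .map((part): Token =>
      part === "''" ? { kind: "toggle", tag: "i" } : { kind: "text", text: part });
}

// Stage 2: one example token transformation (collapse whitespace runs).
function transformTokens(tokens: Token[]): Token[] {
  return tokens.map((token): Token =>
    token.kind === "text"
      ? { kind: "text", text: token.text.replace(/\s+/g, " ") }
      : token);
}

// Stage 3: toy tree builder. A toggle opens the element, or closes it if it is current.
function buildTree(tokens: Token[]): TreeNode {
  const root: TreeNode = { tag: "body", children: [] };
  const ancestors: TreeNode[] = [];
  let current = root;
  for (const token of tokens) {
    if (token.kind === "text") {
      current.children.push(token.text);
    } else if (current.tag === token.tag) {
      const parent = ancestors.pop();
      if (parent) current = parent;
    } else {
      const node: TreeNode = { tag: token.tag, children: [] };
      current.children.push(node);
      ancestors.push(current);
      current = node;
    }
  }
  return root;
}

// Stage 4: serializer as just another stage (no escaping; this is only a sketch).
function serialize(node: TreeNode | string): string {
  if (typeof node === "string") return node;
  return `<${node.tag}>${node.children.map((c) => serialize(c)).join("")}</${node.tag}>`;
}

// The whole pipeline is function composition; nothing is kept between calls.
function parseToHtml(source: string): string {
  return serialize(buildTree(transformTokens(tokenize(source))));
}

console.log(parseToHtml("a ''b''  c")); // <body>a <i>b</i> c</body>
```

Because each stage only depends on its argument, a test can feed tokens straight into buildTree or a tree straight into serialize without constructing a parser object first.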

Anyway this is easily fixed, and will be soon...

On 12/28/11 4:35 AM, Gabriel Wicke wrote:
> On 12/28/2011 05:45 AM, Neil Kandalgaonkar wrote:
>> I pulled out most of the parser-y parts from the parserTests, leaving
>> behind just tests.
>
> Very good, this was really needed.
>
>> However, the parser is still a bit of a monster object, hence the
>> deliberately silly name, ParserThingy.
>>
>> I'm trying to decompose it into a chain, roughly like:
>
> The current implementation already operates as a chain, as documented in
> https://www.mediawiki.org/wiki/Future/Parser_development:
>
> PEG wiki/HTML tokenizer         (or other tokenizers / SAX-like parsers)
>      | Chunks of tokens
>      V
> Token stream transformations

Gabriel Wicke | 28 Dec 2011 15:11
Favicon

Re: Refactoring in progress with parserTests

> By 'monster'
> object I don't mean that it is large or incomprehensible, but that it 
> has a few too many responsibilities to be easy to test.

Yeah, there is definitely a bit of cruft left that should disappear
after eventification. I'll convert the TokenTransformDispatcher to an
event listener / emitter while refactoring it. That will remove the big
callback that obscures the pipeline flow a bit.
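
As a rough illustration of what a listener / emitter style stage could look like, here is a minimal sketch. It is not the TokenTransformDispatcher: TinyEmitter is a hand-rolled stand-in for an event emitter, and TransformStage, listenTo and the "chunk" / "end" event names are hypothetical names chosen for this example.

```typescript
type Token = { type: string; value: string };
type Listener = (tokens: Token[]) => void;

// Minimal stand-in emitter, just enough for this sketch.
class TinyEmitter {
  private listeners: { [event: string]: Listener[] } = {};

  on(event: string, listener: Listener): this {
    (this.listeners[event] = this.listeners[event] || []).push(listener);
    return this;
  }

  emit(event: string, tokens: Token[]): void {
    for (const listener of this.listeners[event] || []) {
      listener(tokens);
    }
  }
}

// Hypothetical stage: listens for token chunks upstream, emits transformed
// chunks downstream. No stage needs a callback into the stage before it.
class TransformStage extends TinyEmitter {
  constructor(private transform: (token: Token) => Token) {
    super();
  }

  listenTo(upstream: TinyEmitter): this {
    upstream.on("chunk", (tokens) =>
      this.emit("chunk", tokens.map((t) => this.transform(t))));
    upstream.on("end", () => this.emit("end", []));
    return this;
  }
}

// Wiring: tokenizer -> upperCase -> exclaim -> collector.
const tokenizer = new TinyEmitter(); // stands in for the real tokenizer
const upperCase = new TransformStage((t) => ({ type: t.type, value: t.value.toUpperCase() }));
const exclaim = new TransformStage((t) => ({ type: t.type, value: t.value + "!" }));
const received: string[] = [];

upperCase.listenTo(tokenizer);
exclaim.listenTo(upperCase);
exclaim.on("chunk", (tokens) => {
  for (const t of tokens) received.push(t.value);
});
exclaim.on("end", () => {
  received.push("<end>");
});

tokenizer.emit("chunk", [{ type: "text", value: "foo" }]);
tokenizer.emit("chunk", [{ type: "text", value: "bar" }]);
tokenizer.emit("end", []);
// received is now ["FOO!", "BAR!", "<end>"]
```

With this shape the pipeline order is visible in the listenTo calls alone, which is the readability gain hoped for from dropping the single large callback.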

> For instance, right now it's returning its output as a property of 
> itself, and the serializer is sort of added on later. The pipeline 
> should be a bit clearer and more stateless.
> 
> Anyway this is easily fixed, and will be soon...

Awesome!

Gabriel
