Marcin Junczys-Dowmunt | 3 Aug 2012 10:43

Adding compact phrase table to trunk

Hi all,
I have created a new branch "minphr" in which I am going to prepare a 
version of my compact phrase table that can be easily merged into the 
trunk. I have several questions, so that I do not cause any trouble when 
merging. I also want to include usage instructions in my MTM2012 paper, 
so it would be nice if I could make sure none of the program names, 
extensions or options change after publication.

So if you have objections or better ideas concerning any of the points 
below, please let me know.

1) Phrase table type: When specifying the phrase table format in the 
config file, the phrase table type has to be set. I would like to use 
12, which as far as I can see is free. Unless I can get 3 :)

2) Binary names: the programs that generate the phrase table and the 
corresponding lexical reordering models will be named 
"processPhraseTableMin" and "processLexicalTableMin".

3) Extensions: phrase tables and lexical reordering tables are compiled 
into single files. I would like to use "*.minphr" and "*.minlexr" as the 
extensions for these, respectively.

4) Lexical reordering model: I would like to make Moses load a 
"*.minlexr" reordering model if one is available, checking for that 
before the standard binary format and other formats.

5) Options: I would like to add the following boolean options: 
"-minphr-memory" and "-minlexr-memory" which tell Moses to load the 
models into memory instead of reading them from disk.
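
The lookup order described in point 4 could be sketched roughly as follows. This is only an illustration in Python, not the Moses implementation: the function name `find_reordering_model` and the fallback suffixes `.binlexr.idx` and `.gz` are assumptions made for the sketch; only `.minlexr` is taken from the proposal above.

```python
from pathlib import Path
from typing import Optional

# Candidate suffixes in lookup order. ".minlexr" comes from the proposal;
# the remaining entries are assumptions made for this sketch.
LOOKUP_ORDER = (".minlexr", ".binlexr.idx", "", ".gz")


def find_reordering_model(base: str) -> Optional[Path]:
    """Return the first existing candidate file for `base`, or None."""
    for suffix in LOOKUP_ORDER:
        candidate = Path(base + suffix)
        if candidate.is_file():
            return candidate
    return None
```

With this ordering, a compact model sitting next to an older binary or text model with the same base name would be picked up first, so the config file would not need to change.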

Hieu Hoang | 3 Aug 2012 11:29

Re: Adding compact phrase table to trunk

Hi Marcin,

Glad to hear it, a compact phrase table could be really useful. All your 
points seem good.

I think the code has changed a bit since you forked, so it may take a while 
to merge. Before you git push, please run the regression tests:
    ./bjam ...... -a --with-regtest=[moses-reg-test-data]

Let me know if you have any other questions.
