Nickolay V. Shmyrev | 5 Sep 2007 15:49

Re: Raised f0 in clustergen

On Tue, 04/09/2007 at 07:48 -0400, Alan W Black wrote:
> Nickolay V. Shmyrev wrote:
> > This leads to a more general question: how can we model speech
> > parameters better? Probably we should model the logarithm of f0 instead
> > of f0 itself, as in HTS, and adjust the distance matrix for mcep. What
> > should the base of the logarithm be, then?
> > 
> > Are there articles on this topic?
> > 
> 
> Yes, they are appropriate topics.  I had noticed before that tuning the 
> F0 by hand could make the voice sound better (even if the generated F0 
> was not actually close to the source speaker).
> 
> The F0 generation in clustergen is really quite different from HTS: a 
> smoothed F0 over the whole sentence is generated (through unvoiced 
> regions), which is then predicted by a model separate from the mcep 
> model.  This is producing pretty good values (correlation and RMSE) 
> compared to other F0 models I've done in the past.  However, I've not 
> really done listening tests on them.
> 
> Though I have seen on other systems that playing with the F0 values can 
> even improve the sound of the voice.  Log F0 vs absolute F0 may make 
> a difference, though in some experiments I've found it makes a flatter 
> F0 (smaller variance), which I believe is the bigger problem.  I've not 
> done listening tests here, but we are deep in the process of building new 
> prosodic models for Festival (clustergen and otherwise) based on the new 
> story data we now have access to.

Very interesting, thanks
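
On the question of the logarithm base raised above, a minimal sketch may help. It is not clustergen or HTS code: the function names and the toy F0 values are hypothetical, and it assumes unvoiced frames are marked with 0 and that only frames voiced in both tracks are compared. Since log_b(x) = ln(x) / ln(b), changing the base only rescales log F0 by a constant, so the correlation between predicted and reference contours is the same for any base, while the RMSE is scaled by that constant.

```python
import math

def log_pairs(ref, pred, base=math.e):
    """Log-transform frames that are voiced (f0 > 0) in both tracks."""
    pairs = [(r, p) for r, p in zip(ref, pred) if r > 0 and p > 0]
    return ([math.log(r, base) for r, _ in pairs],
            [math.log(p, base) for _, p in pairs])

def rmse(ref, pred):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((r - p) ** 2 for r, p in zip(ref, pred)) / len(ref))

def correlation(ref, pred):
    """Pearson correlation between two equal-length sequences."""
    n = len(ref)
    mean_r = sum(ref) / n
    mean_p = sum(pred) / n
    cov = sum((r - mean_r) * (p - mean_p) for r, p in zip(ref, pred))
    var_r = sum((r - mean_r) ** 2 for r in ref)
    var_p = sum((p - mean_p) ** 2 for p in pred)
    return cov / math.sqrt(var_r * var_p)

# Toy contours in Hz; 0.0 marks an unvoiced frame (hypothetical data).
reference = [110.0, 120.0, 0.0, 135.0, 150.0, 140.0]
predicted = [115.0, 118.0, 0.0, 130.0, 142.0, 138.0]

for base in (math.e, 2.0, 10.0):
    ref_log, pred_log = log_pairs(reference, predicted, base)
    print(base, correlation(ref_log, pred_log), rmse(ref_log, pred_log))
```

The base therefore matters only for how distances and variances are scaled, for example when log F0 is combined with other parameters in one distance measure.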

Nickolay V. Shmyrev | 19 Sep 2007 14:43

Re: Raised f0 in clustergen

Hello all

Continuing my f0 investigations, I'm very dissatisfied with the
pitchmark extraction algorithm in festival (the pitchmark program and
the make_pm_wave script). In my opinion it performs very badly and thus
causes a critical decrease in voice performance, both in f0 prediction
and in clunits voices, where MCEPs are pitch-synchronous. It also
requires tuning of filter parameters, which is easily avoidable with
dynamic programming as in pda.

So I see the following options:

1) Extend pda with arguments to output pitchmarks and replace pitchmark
with pda.

2) Adopt some free algorithm with dynamic programming to get pitchmarks.
For example, I'm currently looking at wavesurfer's jkGetF0 code and it
looks rather reliable.

3) Use some existing free algorithm.

Any opinions on this?
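
Option 1 above (getting pitchmarks out of an F0 tracker) can be sketched roughly as follows. This is a simplified illustration, not the code of pda or pitchmark: it assumes a per-frame F0 track with 0 marking unvoiced frames, and the function name, the 16 kHz sample rate and the 80-sample frame shift are all hypothetical. It only spaces marks one pitch period apart inside voiced regions and does not align them to glottal closure instants, which a real pitchmarker would also have to do.

```python
def pitchmarks_from_f0(f0, sample_rate=16000, frame_shift=80):
    """Return pitchmark times in seconds for a per-frame F0 track.

    f0          -- F0 in Hz for each frame, 0 for unvoiced frames
    sample_rate -- samples per second (assumed value)
    frame_shift -- frame shift in samples (assumed value)
    """
    marks = []
    pos = 0                               # current position in samples
    total = len(f0) * frame_shift
    while pos < total:
        frame = pos // frame_shift
        if f0[frame] > 0:
            # Voiced: emit a mark and advance by one pitch period.
            marks.append(pos / sample_rate)
            pos += max(1, round(sample_rate / f0[frame]))
        else:
            # Unvoiced: emit nothing and jump to the next frame.
            pos = (frame + 1) * frame_shift
    return marks

# Two unvoiced frames, four voiced frames at 100 Hz, two unvoiced frames.
print(pitchmarks_from_f0([0, 0, 100, 100, 100, 100, 0, 0]))
```

Working in integer sample positions keeps the loop free of floating-point rounding problems at frame boundaries.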

Alan W Black | 19 Sep 2007 16:09

Re: Raised f0 in clustergen

Nickolay V. Shmyrev wrote:
> Hello all
> 
> Continuing my f0 investigations, I'm very dissatisfied with the
> pitchmark extraction algorithm in festival (the pitchmark program and
> the make_pm_wave script). In my opinion it performs very badly and thus
> causes a critical decrease in voice performance, both in f0 prediction
> and in clunits voices, where MCEPs are pitch-synchronous. It also
> requires tuning of filter parameters, which is easily avoidable with
> dynamic programming as in pda.
> 
> So I see the following options:
> 
> 1) Extend pda with arguments to output pitchmarks and replace pitchmark
> with pda.

Actually, pitchmark and pda are basically the same algorithm packaged in 
different binaries.
> 
> 2) Adopt some free algorithm with dynamic programming to get pitchmarks.
> For example, I'm currently looking at wavesurfer's jkGetF0 code and it
> looks rather reliable.
> 
> 3) Use some existing free algorithm.

I would strongly recommend ESPS epoch; we used to use it when it was 
proprietary, but it is now free and there are scripts in festvox that 
can use it.

Alan

Nickolay V. Shmyrev | 19 Sep 2007 17:36

Re: Raised f0 in clustergen

On Wed, 19/09/2007 at 10:09 -0400, Alan W Black wrote:
> Nickolay V. Shmyrev wrote:
> > Hello all
> > 
> > Continuing my f0 investigations, I'm very dissatisfied with the
> > pitchmark extraction algorithm in festival (the pitchmark program and
> > the make_pm_wave script). In my opinion it performs very badly and thus
> > causes a critical decrease in voice performance, both in f0 prediction
> > and in clunits voices, where MCEPs are pitch-synchronous. It also
> > requires tuning of filter parameters, which is easily avoidable with
> > dynamic programming as in pda.
> > 
> > So I see the following options:
> > 
> > 1) Extend pda with arguments to output pitchmarks and replace pitchmark
> > with pda.
> 
> Actually, pitchmark and pda are basically the same algorithm packaged in 
> different binaries.
> > 
> > 2) Adopt some free algorithm with dynamic programming to get pitchmarks.
> > For example, I'm currently looking at wavesurfer's jkGetF0 code and it
> > looks rather reliable.
> > 
> > 3) Use some existing free algorithm.
> 
> I would strongly recommend ESPS epoch; we used to use it when it was 
> proprietary, but it is now free and there are scripts in festvox that 
> can use it.
> 

Alan W Black | 19 Sep 2007 17:34

Re: Raised f0 in clustergen

Nickolay V. Shmyrev wrote:
> On Wed, 19/09/2007 at 10:09 -0400, Alan W Black wrote:
>> Nickolay V. Shmyrev wrote:
>>> Hello all
>>>
>>> Continuing my f0 investigations, I'm very dissatisfied with the
>>> pitchmark extraction algorithm in festival (the pitchmark program and
>>> the make_pm_wave script). In my opinion it performs very badly and thus
>>> causes a critical decrease in voice performance, both in f0 prediction
>>> and in clunits voices, where MCEPs are pitch-synchronous. It also
>>> requires tuning of filter parameters, which is easily avoidable with
>>> dynamic programming as in pda.
>>>
>>> So I see the following options:
>>>
>>> 1) Extend pda with arguments to output pitchmarks and replace pitchmark
>>> with pda.
>> Actually, pitchmark and pda are basically the same algorithm packaged in 
>> different binaries.
>>> 2) Adopt some free algorithm with dynamic programming to get pitchmarks.
>>> For example, I'm currently looking at wavesurfer's jkGetF0 code and it
>>> looks rather reliable.
>>>
>>> 3) Use some existing free algorithm.
>> I would strongly recommend ESPS epoch; we used to use it when it was 
>> proprietary, but it is now free and there are scripts in festvox that 
>> can use it.
>>
> 
> Great thanks. I suppose it's the same wavesurfer code I was talking

Nickolay V. Shmyrev | 21 Sep 2007 00:52

Re: Raised f0 in clustergen

On Wed, 19/09/2007 at 11:34 -0400, Alan W Black wrote:

> I'm not sure what the license is of wavesurfer, but ESPS has a BSD-like 
> license.  However, I don't see that it should be included in speech_tools, 
> just better documented as an option.

It might be funny, but I was not able to find binaries for ESPS and
failed to compile it from source (I tried for two hours). So I have to
write a handler in speech tools anyhow. It's attached. I've also ported
the f0 determination program (the ESPS analog of pda); I'll send it later
too. It also uses dynamic programming and should be rather reliable.

Only one question has come up. It seems that epochs works fine on voiced
segments, but it inserts too many unneeded pitchmarks on unvoiced ones
and even on silence, because it looks for a relative maximum. Is this just
a disadvantage of the algorithm, or is there a way to avoid these
pitchmarks with some parameter?
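
To illustrate what dynamic programming buys in F0 tracking, here is a generic Viterbi-style sketch. It is not the ESPS or wavesurfer algorithm: the function name, the candidate format, the cost values and the log-ratio jump penalty are all assumptions made for illustration. Each frame offers several F0 candidates with a local cost, and the search picks the path that minimises local cost plus a penalty for large frame-to-frame jumps, so an isolated octave error is rejected even when it is the locally cheapest candidate.

```python
import math

def track_f0(candidates, jump_weight=1.0):
    """Pick one F0 candidate per frame by dynamic programming.

    candidates  -- list of frames, each a list of (f0_hz, local_cost) pairs
    jump_weight -- weight of the |log(f0 ratio)| transition penalty
    """
    # best[i][j]: cheapest total cost of a path ending at candidate j of frame i
    best = [[cost for _, cost in candidates[0]]]
    back = []   # back[i][j]: best predecessor index in frame i for frame i + 1
    for prev, cur in zip(candidates, candidates[1:]):
        costs, pointers = [], []
        for f0, local in cur:
            options = [best[-1][k] + jump_weight * abs(math.log(f0 / prev_f0))
                       for k, (prev_f0, _) in enumerate(prev)]
            k = min(range(len(options)), key=options.__getitem__)
            costs.append(options[k] + local)
            pointers.append(k)
        best.append(costs)
        back.append(pointers)
    # Trace back from the cheapest final candidate.
    j = min(range(len(best[-1])), key=best[-1].__getitem__)
    path = [j]
    for pointers in reversed(back):
        j = pointers[j]
        path.append(j)
    path.reverse()
    return [candidates[i][j][0] for i, j in enumerate(path)]

# In the middle frame the octave-doubled candidate (204 Hz) is locally cheaper.
frames = [
    [(100.0, 0.1), (200.0, 0.5)],
    [(102.0, 0.2), (204.0, 0.1)],
    [(101.0, 0.1), (50.5, 0.6)],
]
print(track_f0(frames))  # [100.0, 102.0, 101.0]
```

A per-frame choice of the cheapest candidate would have produced 100, 204, 101 here; the transition penalty is what removes the octave jump without any hand-tuned filter parameters.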
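
Whether epochs has such a parameter is not answered in this message. One possible workaround, sketched here as an assumption and not taken from ESPS, is to post-filter the pitchmarks with a voicing decision from a separate F0 track: marks that fall in frames with F0 equal to 0 are dropped. The function name, the 5 ms frame shift and the sample values are hypothetical.

```python
def drop_unvoiced_marks(marks, f0, frame_shift=0.005):
    """Keep only pitchmarks (in seconds) that fall in voiced frames.

    marks       -- pitchmark times in seconds
    f0          -- per-frame F0 in Hz, 0 for unvoiced or silent frames
    frame_shift -- frame shift of the F0 track in seconds (assumed value)
    """
    kept = []
    for t in marks:
        frame = int(t / frame_shift)
        if 0 <= frame < len(f0) and f0[frame] > 0:
            kept.append(t)
    return kept

f0_track = [0.0, 0.0, 120.0, 120.0, 0.0]        # frames 2 and 3 are voiced
raw_marks = [0.002, 0.007, 0.011, 0.019, 0.023]
print(drop_unvoiced_marks(raw_marks, f0_track))  # [0.011, 0.019]
```

This relies on the voicing decision of the F0 tracker being trustworthy, so errors in that track carry over to the filtered pitchmarks.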

Attachment (pitchmarks-epochs.tar.gz): application/x-compressed-tar, 9 KiB
Attachment URL (pitchmarks-epochs.tar.gz): https://lists.berlios.de/pipermail/festlang-talk/attachments/20070921/44d27109/attachment.bin

