Huw Selley | 1 Dec 12:30 2010

Read request throughput

Hi,

I have been doing some performance testing with couch and am hoping someone here will be able to help me
ascertain if/how I can get higher throughput.

Scenario:

I am trying to measure max couch throughput - for these tests I'm happy with just repeatedly requesting the
same document.
I have some reasonable boxes to perform these tests - they have dual quad core X5550 CPUs with
HyperThreading enabled and 24GB RAM.
These boxes have a stock install of Oracle Enterprise Linux 5 on them (which is pretty much RHEL5).
The Oracle-supplied Erlang version is R12B5 and I am using couch 1.0.1 built from source.

The database is pretty small (just under 100K docs) and I am querying a view that includes some other docs
(the request contains include_docs=true) and using jmeter on another identical box to generate the traffic.
The total amount of data returned from the request is 1467 bytes.
For all of my tests I capture system state using sadc and there is nothing else happening on these boxes.

In my initial round of testing I found that I was only getting ~126 requests/s throughput, which surprised me
somewhat. Looking at the generated graphs from the test run, there were plenty of resources to go round - the
disk controller was nowhere near busy and neither was the CPU.

Before coming here to question my findings I took a 3rd box (same spec) and built couch from the tip of the
1.1.x branch (rev 1040477). After compiling couch and installing it I found that it didn't start up (or log
anything useful). After a bit of digging I figured it was probably due to the age of the Erlang version being
used - I upgraded to OTP R14B and rebuilt couch against it. This gave me a working install again.

I got an immediate throughput increase to ~500 requests/s, which was nice, but the data being collected via
sadc still showed that the CPU was at most 20% utilised and the disk controller was doing next to nothing (I
(Continue reading)

Adam Kocoloski | 1 Dec 14:30 2010

Re: Read request throughput

Hi Huw, thanks for this detailed report.  I'll respond with a few suggestions inline:

On Dec 1, 2010, at 6:30 AM, Huw Selley wrote:

> Hi,
> 
> I have been doing some performance testing with couch and am hoping someone here will be able to help me
> ascertain if/how I can get higher throughput.
> 
> Scenario:
> 
> I am trying to measure max couch throughput - for these tests I'm happy with just repeatedly requesting the
> same document.
> I have some reasonable boxes to perform these tests - they have dual quad core X5550 CPUs with
> HyperThreading enabled and 24GB RAM.

So the Erlang VM starts 16 schedulers by default, right?  Some people have reported improvements in Erlang
application performance with HyperThreading disabled, but I've not heard of any CouchDB-specific
tests of that option yet.
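
As a generic sanity check (nothing CouchDB-specific, and assuming you can open an Erlang shell on the node), the scheduler counts the VM actually came up with can be read back with standard erlang:system_info/1 calls:

    erlang:system_info(schedulers).          %% scheduler threads created at startup (16 on a dual quad-core box with HT)
    erlang:system_info(schedulers_online).   %% schedulers actually in use; can be set lower with erl's +S option
    erlang:system_info(logical_processors).  %% logical processors the VM detected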

> These boxes have a stock install of Oracle Enterprise Linux 5 on them (which is pretty much RHEL5).
> The Oracle-supplied Erlang version is R12B5 and I am using couch 1.0.1 built from source.

Newer versions of Erlang have much, much better symmetric multiprocessing performance, so it's not too
surprising that you saw a big boost when you upgraded.

> The database is pretty small (just under 100K docs) and I am querying a view that includes some other docs
> (the request contains include_docs=true) and using jmeter on another identical box to generate the traffic.

include_docs=true is definitely more work at read time than embedding the docs in the view index.  I'm not
(Continue reading)

Huw Selley | 2 Dec 12:29 2010

Re: Read request throughput

Thanks for the response Adam :) 
Some updates below:

On 1 Dec 2010, at 13:30, Adam Kocoloski wrote:
<snip>
> 
> So the Erlang VM starts 16 schedulers by default, right?  Some people have reported improvements in Erlang
> application performance with HyperThreading disabled, but I've not heard of any CouchDB-specific
> tests of that option yet.

Yeah, that's right - 16:16 by default.

<snip>

> 
>> The database is pretty small (just under 100K docs) and I am querying a view that includes some other docs
> (the request contains include_docs=true) and using jmeter on another identical box to generate the traffic.
> 
> include_docs=true is definitely more work at read time than embedding the docs in the view index.  I'm not
> sure  about your application design constraints, but given that your database and index seem to fit
> entirely in RAM at the moment you could experiment with emitting the doc in your map function instead ...
> 
>> The total amount of data returned from the request is 1467 bytes.
> 
> ... especially when the documents are this small.

Sure, but I would have expected that to only really help if the system was contending for resources? I am
using linked docs so not sure about emitting the entire doc in the view.
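
For completeness, the alternative Adam describes - emitting the document itself as the view value so the read path skips the extra lookup that include_docs=true performs - would look roughly like the sketch below if written for CouchDB's optional native Erlang query server (the JavaScript equivalent is a one-line emit(doc._id, doc)). It is only an illustration: the native query server has to be enabled in the config, and as noted above it can't replace linked docs.

    %% Sketch of a map function for CouchDB's optional Erlang query server.
    %% The doc arrives as {PropList}; emitting it as the row value means the
    %% view result already carries the doc, so include_docs=true is not needed.
    fun({Doc}) ->
        Id = proplists:get_value(<<"_id">>, Doc),
        Emit(Id, {Doc})
    end.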

<snip>
(Continue reading)

Adam Kocoloski | 2 Dec 15:41 2010

Re: Read request throughput

On Dec 2, 2010, at 6:29 AM, Huw Selley wrote:

>> include_docs=true is definitely more work at read time than embedding the docs in the view index.  I'm not
>> sure  about your application design constraints, but given that your database and index seem to fit
>> entirely in RAM at the moment you could experiment with emitting the doc in your map function instead ...
>> 
>>> The total amount of data returned from the request is 1467 bytes.
>> 
>> ... especially when the documents are this small.
> 
> Sure, but I would have expected that to only really help if the system was contending for resources? I am
> using linked docs so not sure about emitting the entire doc in the view.

Didn't realize you were using linked docs.  You're certainly right, there's no way to emit those directly.

>> Hmm, I've heard that we did something to break compatibility with 12B-5 recently.  We should either fix it
>> or bump the required version.  Thanks for the note.
> 
> COUCHDB-856?

Ah, right. That one was my fault.  But Filipe fixed it in r1034380, so it shouldn't have caused you any trouble here.

>> Do you know if the CPU load was spread across cores or concentrated on a single one?  One thing Kenneth did
>> not mention in that thread is that you can now bind Erlang schedulers to specific cores.  By default the
>> schedulers are unbound; maybe RHEL is doing a poor job of distributing them.  You can bind them using the
>> default strategy for your CPUs by starting the VM with the "+sbt db" option.
> 
> It was using most of 2 cores. I had a go with "+sbt db" and it didn't perform as well as "-S 16:2".
> 
> WRT disabling HT - I need to take a trip to the datacentre to disable HT in the bios but I tried disabling some
(Continue reading)
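
Related to the "+sbt db" / "-S 16:2" exchange quoted above: whether a binding actually took effect, and where each scheduler ended up, can be checked with two more standard erlang:system_info/1 calls from a shell on the node - again just a generic sketch rather than output from these tests:

    erlang:system_info(scheduler_bind_type).  %% bind strategy in effect (e.g. thread_no_node_processor_spread), or 'unbound'
    erlang:system_info(scheduler_bindings).   %% per-scheduler logical processor ids, with 'unbound' for unbound schedulers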

Filipe David Manana | 8 Dec 20:24 2010

Re: Read request throughput

Huw,

Today trunk was patched to increase both read and write performance
when there are several requests in parallel to the same database/view
index file.

The corresponding ticket is https://issues.apache.org/jira/browse/COUCHDB-980

Would be much appreciated if you could try the latest trunk and report back :)

best regards,

On Thu, Dec 2, 2010 at 2:41 PM, Adam Kocoloski <kocolosk@...> wrote:
> On Dec 2, 2010, at 6:29 AM, Huw Selley wrote:
>
>>> include_docs=true is definitely more work at read time than embedding the docs in the view index.  I'm
>>> not sure  about your application design constraints, but given that your database and index seem to fit
>>> entirely in RAM at the moment you could experiment with emitting the doc in your map function instead ...
>>>
>>>> The total amount of data returned from the request is 1467 bytes.
>>>
>>> ... especially when the documents are this small.
>>
>> Sure, but I would have expected that to only really help if the system was contending for resources? I am
>> using linked docs so not sure about emitting the entire doc in the view.
>
> Didn't realize you were using linked docs.  You're certainly right, there's no way to emit those directly.
>
>>> Hmm, I've heard that we did something to break compatibility with 12B-5 recently.  We should either
>>> fix it or bump the required version.  Thanks for the note.
(Continue reading)

Huw Selley | 9 Dec 11:09 2010

Re: Read request throughput

Hi,

On 8 Dec 2010, at 19:24, Filipe David Manana wrote:

> Huw,
> 
> Today trunk was patched to increase both read and write performance
> when there are several requests in parallel to the same database/view
> index file.

Great news :)

> 
> The corresponding ticket is https://issues.apache.org/jira/browse/COUCHDB-980
> 
> Would be much appreciated if you could try the latest trunk and report back :)

WOW - I built from svn rev 1043651 (again with Erlang R14B) this morning and have just performed the same
jmeter tests with some good results.
I am still seeing the same throughput score from jmeter, ~500 requests/s, but what is interesting is that I
can now drive the thread pool count in jmeter up from 25 (the value I had for my last round of testing) to 750
with no errors - just increased request latency (which is to be expected).

Processor utilisation also looks more like I would expect:

09:16:43 AM  CPU   %user   %nice    %sys %iowait    %irq   %soft  %steal   %idle    intr/s
09:16:48 AM  all   19.82    0.00    9.19    0.00    0.04    0.25    0.00   70.70   6136.20
09:16:48 AM    0   43.80    0.00    5.00    0.00    0.00    0.00    0.00   51.20   1000.20
09:16:48 AM    1   38.40    0.00   21.00    0.00    0.00    0.40    0.00   40.20    460.80
09:16:48 AM    2   44.49    0.00   14.43    0.00    0.00    0.00    0.00   41.08      8.80
(Continue reading)

Filipe David Manana | 9 Dec 12:04 2010

Re: Read request throughput

Hi Huw,

Great news!

I don't expect you to see any significant performance differences
between 1.0.x and 1.1.x, however.

Thanks for letting us know about your tests.

best regards,

On Thu, Dec 9, 2010 at 10:09 AM, Huw Selley <huw.selley@...> wrote:
> Hi,
>
> On 8 Dec 2010, at 19:24, Filipe David Manana wrote:
>
>> Huw,
>>
>> Today trunk was patched to increase both read and write performance
>> when there are several requests in parallel to the same database/view
>> index file.
>
> Great news :)
>
>>
>> The corresponding ticket is https://issues.apache.org/jira/browse/COUCHDB-980
>>
>> Would be much appreciated if you could try the latest trunk and report back :)
>
> WOW - I built from svn rev 1043651 (again with Erlang R14B) this morning and have just performed the same
(Continue reading)

Filipe David Manana | 11 Dec 18:20 2010

Re: Read request throughput

Huw,

I backported that ticket's patch into the branch 1.0.x (from which
1.0.2 will be based).
I just found that 1.1.x (and trunk) have worse read and write
performance compared to the latest 1.0.x. The issue seems to be a major
Mochiweb version upgrade. We're trying to find out what changed.

Therefore I recommend you use 1.0.x.

regards,

On Thu, Dec 9, 2010 at 11:04 AM, Filipe David Manana
<fdmanana@...> wrote:
> Hi Huw,
>
> Great news!
>
> I don't expect you to see any significant performance differences
> between 1.0.x and 1.1.x, however.
>
> Thanks for letting us know about your tests.
>
> best regards,
>
> On Thu, Dec 9, 2010 at 10:09 AM, Huw Selley <huw.selley@...> wrote:
>> Hi,
>>
>> On 8 Dec 2010, at 19:24, Filipe David Manana wrote:
>>
(Continue reading)

Huw Selley | 14 Dec 13:27 2010

Re: Read request throughput

Hi,

Sorry about the late response, been pretty busy.

On 11 Dec 2010, at 17:20, Filipe David Manana wrote:

> Huw,
> 
> I backported that ticket's patch into the branch 1.0.x (from which
> 1.0.2 will be based).

Awesome :)

> I just found that 1.1.x (and trunk) have worse read and write
> performance compared to the latest 1.0.x. The issue seems to be a major
> Mochiweb version upgrade. We're trying to find out what changed.
> 
> Therefore I recommend you use 1.0.x.

Will do! Any idea when 1.0.2 gets released?

Thanks and regards
Huw

Filipe David Manana | 16 Dec 19:31 2010

Re: Read request throughput

On Tue, Dec 14, 2010 at 12:27 PM, Huw Selley <huw.selley@...> wrote:
> Hi,
>
> Will do! Any idea when 1.0.2 gets released?

Hopefully soon. There's just one issue remaining to fix for the 1.0.2 release.

regards,

>
> Thanks and regards
> Huw
>

-- 
Filipe David Manana,
fdmanana@..., fdmanana@...

"Reasonable men adapt themselves to the world.
 Unreasonable men adapt the world to themselves.
 That's why all progress depends on unreasonable men."

