2 Apr 2012 16:44
Re: slow mapred_search key lookups for single terms
Ryan Zezeski <rzezeski <at> basho.com>
2012-04-02 14:44:08 GMT
Hi Michael, you'll find my responses inline...
On Sat, Mar 31, 2012 at 5:04 PM, Michael Radford <mrad <at> blorf.com> wrote:
I'm seeing very slow performance from Riak search even when querying
single terms, and I'd appreciate any advice on how to get insight into
where the time is going.
Right now, I'm using this function to time queries with the Erlang pb client:
TS =
    fun (Pid, Bucket, Query) ->
        T0 = now(),
        {ok, Results} = riakc_pb_socket:search(Pid, Bucket, Query),
        MuSec = timer:now_diff(now(), T0),
        io:format(user, "~b results, ~f sec~n", [length(Results), MuSec/1000000])
    end.
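Just an FYI, you should check out `timer:tc`, which runs a function and returns the elapsed time in microseconds along with the function's result.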
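As a rough sketch of that suggestion, the timing helper could be rebuilt on `timer:tc/1` as shown here. The names `Timed` and `TS2` are made up for illustration and appear nowhere in the original messages. The sketch assumes `riakc_pb_socket:search/3` returns `{ok, Results}`, as the `TS` function above does.

```erlang
%% Hypothetical helper: time any zero-arity fun with timer:tc/1,
%% which returns {ElapsedMicroseconds, ReturnValue}.
Timed =
    fun (Fun) ->
        {MuSec, Result} = timer:tc(Fun),
        {MuSec / 1000000, Result}
    end.

%% Hypothetical variant of TS built on Timed; prints the same report.
TS2 =
    fun (Pid, Bucket, Query) ->
        {Sec, {ok, Results}} =
            Timed(fun () -> riakc_pb_socket:search(Pid, Bucket, Query) end),
        io:format(user, "~b results, ~f sec~n", [length(Results), Sec])
    end.
```

Usage would match the original helper, e.g. `TS2(Pid, Bucket, <<"full_text:flower">>).`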
The bucket I'm querying currently has ~300k keys total (each 16
bytes). (The whole cluster has maybe 1.5M entries in about a dozen
buckets. It's running 1.0.2, 4 nodes on 4 8-core c1.xlarge EC2
instances.)
For single-term queries that return 10k+ keys, I'm routinely seeing
times above 6 seconds to run the above function. Intermittently,
however, I'll see the same queries take only 2 seconds:
> TS(Pid,Bucket,<<"full_text:flower">>).
12574 results, 6.094149 sec
ok
> TS(Pid,Bucket,<<"full_text:flower">>).
12574 results, 1.938894 sec
ok
> TS(Pid,Bucket,<<"full_text:flower">>).
12574 results, 1.981492 sec
ok
> TS(Pid,Bucket,<<"full_text:flower">>).
12574 results, 6.120589 sec
ok
> TS(Pid,Bucket,<<"full_text:red">>).
13461 results, 6.572473 sec
ok
> TS(Pid,Bucket,<<"full_text:red">>).
13461 results, 6.626049 sec
ok
> TS(Pid,Bucket,<<"full_text:red">>).
13461 results, 2.155847 sec
ok
Queries with fewer results are still slow, but not quite as slow as 6 seconds:
> TS(Pid,Bucket,<<"full_text:ring">>).
6446 results, 2.831806 sec
ok
> TS(Pid,Bucket,<<"full_text:ring">>).
6446 results, 3.037162 sec
ok
> TS(Pid,Bucket,<<"full_text:ring">>).
6447 results, 0.780944 sec
ok
And queries with no matches only take a few milliseconds:
> TS(Pid,Bucket,<<"full_text:blorf">>).
0 results, 0.003269 sec
ok
During the slow queries, none of the 4 machines seems to be fully
taxing even one CPU, or doing much disk I/O at all.
What does the network traffic look like, both within the cluster and between the client and the cluster?
As far as I can tell from looking at the riak_kv/riak_search source,
my query should only be hitting the index and streaming back the keys,
not trying to read every document from disk or sort by score. Is that
correct?
It will not read the documents at all, but it will sort on score. Currently there is no way to disable sorting.
Assuming that's the case, I can't imagine why it takes so long to look
up 10k keys from the index for a single term, or why the times seem to
be bimodal (which seems like a big clue). Any pointers welcome!
Where is your client located relative to your cluster? Is it on the same local network? Could you try attaching to one of your Riak nodes, running the query there, and comparing the results?
e.g.
riak attach
> search:search(Bucket, Query).
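To make the comparison easier, the node-local query could be timed over several runs, which would also show whether the bimodal pattern persists without the client and network in the path. This is only a sketch: `Sample` is a hypothetical helper, and it ignores the return value of `search:search/2` because nothing above specifies its shape.

```erlang
%% Hypothetical helper: run Fun N times and return the elapsed
%% time of each run in seconds (timer:tc/1 reports microseconds).
Sample =
    fun (Fun, N) ->
        [element(1, timer:tc(Fun)) / 1000000 || _ <- lists:seq(1, N)]
    end.

%% On the attached node, with Bucket and Query already bound:
%% Sample(fun () -> search:search(Bucket, Query) end, 10).
```

If the node-local timings cluster around a single value while the client-side timings stay bimodal, that would point toward the client connection or the network rather than the index lookup.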
-Ryan
_______________________________________________
riak-users mailing list
riak-users <at> lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com