Russell Blau | 10 Jul 2012 16:19

s1 replag, some observations

As most of us already know, replag on enwiki has been going up and up
since about 30 June. As it says on status.toolserver.org, "Hight replag
because of inserting of many SHA1-hashes."  (Note to DaB.: the first
word should be spelled "High".)

I asked DaB. on IRC how long this might go on, and he replied one to two
weeks.  However, I've since done some independent investigation that
suggests that his estimate might be a little low.

It turns out that there are three large blocks of consecutive entries in
the revision database that need to be populated with SHA1 hashes. 
Apparently there are three processes running in parallel on the WMF
servers that are filling in each of these blocks from the bottom, by
numerical order of rev_id.  Knowing this, we can estimate how many
revisions still need to be populated at any given point; and, taking
such estimates at various points in time, can estimate how long the
process will take.  (Needless to say, this is only an estimate since the
rate at which database changes are processed on the toolserver side is
variable; also, the blocks of rev_ids are not actually consecutive due
to deletions, but we can assume for our purposes that the deleted
revisions are distributed uniformly throughout the database.)
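
The extrapolation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the shape of the observation data are hypothetical, each block is treated as a plain range of rev_ids (deleted revisions are ignored, in line with the uniform-distribution assumption), and the rate is taken from the first and last observations only.

```python
from datetime import datetime, timedelta


def remaining(lower_bounds, upper_bounds):
    """Count rev_ids still unpopulated, summed over all blocks.

    lower_bounds: lowest unpopulated rev_id in each block (moves up over time)
    upper_bounds: fixed upper end of each block
    """
    return sum(max(0, hi - lo) for lo, hi in zip(lower_bounds, upper_bounds))


def estimate_completion(observations, upper_bounds):
    """Linearly extrapolate when the remaining count reaches zero.

    observations: list of (datetime, lower_bounds) tuples, oldest first.
    Returns (rev_ids populated per hour, estimated completion datetime).
    """
    (t0, lows0), (t1, lows1) = observations[0], observations[-1]
    r0 = remaining(lows0, upper_bounds)
    r1 = remaining(lows1, upper_bounds)
    hours = (t1 - t0).total_seconds() / 3600
    rate = (r0 - r1) / hours
    if rate <= 0:
        raise ValueError("no progress between observations")
    return rate, t1 + timedelta(hours=r1 / rate)
```

Because the toolserver-side processing rate is variable, an estimate like this is worth recomputing whenever a new observation is available.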

It further turns out that it is only possible to compute this estimate
for sql-s1-user (thyme), because the enwiki_p view on sql-s1-rr
(rosemary) does not have the rev_sha1 field at all (!).  It appears that
the server on rosemary is receiving millions of database updates each
day from WMF and throwing them in the bit bucket.

Anyway, based on four observations spaced at 6 hour intervals, it
appears that thyme is populating about 353,000 revisions per hour, or
[...]

Petr Onderka | 10 Jul 2012 16:33

Re: s1 replag, some observations

Isn't there anything we (or TS admins) can do about this?
Like asking WMF to populate the SHA1s at a slower rate?

Petr Onderka
[[en:User:Svick]]

On Tue, Jul 10, 2012 at 4:19 PM, Russell Blau <russblau <at> imapmail.org> wrote:
> As most of us already know, replag on enwiki has been going up and up
> since about 30 June. As it says on status.toolserver.org, "Hight replag
> because of inserting of many SHA1-hashes."  (Note to DaB.: the first
> word should be spelled "High".)
>
> I asked DaB. on IRC how long this might go on, and he replied one to two
> weeks.  However, I've since done some independent investigation that
> suggests that his estimate might be a little low.
>
> It turns out that there are three large blocks of consecutive entries in
> the revision database that need to be populated with SHA1 hashes.
> Apparently there are three processes running in parallel on the WMF
> servers that are filling in each of these blocks from the bottom, by
> numerical order of rev_id.  Knowing this, we can estimate how many
> revisions still need to be populated at any given point; and, taking
> such estimates at various points in time, can estimate how long the
> process will take.  (Needless to say, this is only an estimate since the
> rate at which database changes are processed on the toolserver side is
> variable; also, the blocks of rev_ids are not actually consecutive due
> to deletions, but we can assume for our purposes that the deleted
> revisions are distributed uniformly throughout the database.)
>
> It further turns out that it is only possible to compute this estimate
[...]

DaB. | 10 Jul 2012 20:35

Re: s1 replag, some observations

Hello,
At Tuesday 10 July 2012 20:34:41 Petr Onderka wrote:
> Like asking WMF to populate the SHA1s at a slower rate?

yes, that would be helpful. I had already asked in #wikimedia-tech during a 
talk, but got no response (but it could be that it was just overlooked).

Sincerely,
DaB.

-- 
Userpage: [[:w:de:User:DaB.]] — PGP: 2B255885

DaB. | 10 Jul 2012 16:39

Re: s1 replag, some observations

Hello,
At Tuesday 10 July 2012 16:36:29 russblau wrote:
> It appears that
> the server on rosemary is receiving millions of database updates each
> day from WMF and throwing them in the bit bucket.

I'm not sure what "throwing them in the bit bucket." means, but I guess 
something like "throwing away"; this is not the case, the field was just not 
visible to users – I am changing that now. I will also fix the typo in the
status message soon.

Sincerely,
DaB.

-- 
Userpage: [[:w:de:User:DaB.]] — PGP: 2B255885

Russell Blau | 11 Jul 2012 16:14

Re: s1 replag, some observations

On Tue, Jul 10, 2012, at 04:39 PM, DaB. wrote:
> Hello,
> At Tuesday 10 July 2012 16:36:29 russblau wrote:
> > It appears that
> > the server on rosemary is receiving millions of database updates each
> > day from WMF and throwing them in the bit bucket.
> 
> I'm not sure what "throwing them in the bit bucket." means, but I guess 
> something like "throwing away";

"bit bucket" == /dev/null   :-)

> this is not the case, the field was just not
> visible to users – I am changing that now.

I'm glad to know that the appearance was deceiving.  Having data from
rosemary now, I come up with an estimated completion date there of
July 30, assuming that the rate at which updates are received from WMF
does not change.

-- 
  Russell Blau
  russblau <at> imapmail.org

_______________________________________________
Toolserver-l mailing list (Toolserver-l <at> lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/toolserver-l
Posting guidelines for this list: https://wiki.toolserver.org/view/Mailing_list_etiquette

Platonides | 12 Jul 2012 01:29

Re: s1 replag, some observations

On 11/07/12 16:14, Russell Blau wrote:
> I'm glad to know that the appearance was deceiving.  Having data from
> rosemary now,
> I come up with an estimated completion date there of July 30, assuming
> that the rate at which updates are received from WMF does not change.

I thought it would be worse. How are you measuring the sha1-populated
boundaries?

Russell Blau | 12 Jul 2012 15:38

Re: s1 replag, some observations

On Thu, Jul 12, 2012, at 01:29 AM, Platonides wrote:
> On 11/07/12 16:14, Russell Blau wrote:
> > I'm glad to know that the appearance was deceiving.  Having data from
> > rosemary now,
> > I come up with an estimated completion date there of July 30, assuming
> > that the rate at which updates are received from WMF does not change.
> 
> I thought it would be worse. How are you measuring the sha1-populated
> boundaries?

I do a series of queries in the form "SELECT rev_sha1 FROM revision
WHERE rev_id = NNN", with the NNN's selected by a binary search
algorithm, to find the lowest rev_id for which rev_sha1 is "".
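
A minimal sketch of such a search, in Python, might look like the following. The function name and the `is_populated` callback are hypothetical; in practice the callback would wrap the SELECT query above and return True when rev_sha1 is non-empty. The sketch assumes that, within the range searched, every populated rev_id lies below every unpopulated one, and it leaves the handling of rev_ids with no row (deleted revisions) to the callback.

```python
def first_unpopulated(is_populated, lo, hi):
    """Return the smallest rev_id in [lo, hi] that is not yet populated.

    is_populated: callable taking a rev_id and returning True or False
                  (hypothetical; would run the SELECT for that rev_id).
    Returns hi + 1 if every rev_id in the range is populated.
    """
    while lo <= hi:
        mid = (lo + hi) // 2
        if is_populated(mid):
            lo = mid + 1   # boundary is above mid
        else:
            hi = mid - 1   # mid is unpopulated; boundary is at or below mid
    return lo
```

Each probe halves the range, so even a block spanning hundreds of millions of rev_ids needs only about 30 queries per observation.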

Of course, this was preceded by some other queries to establish that
there are, in fact, three consecutive blocks of unpopulated revisions,
and that the upper boundaries of these blocks have not changed.

-- 
  Russell Blau
  russblau <at> imapmail.org

