Robin H. Johnson | 12 Jul 01:13

wget abuse of sources.g.o must stop

I was looking into why sources.gentoo.org is so slow at times, and
gives errors, timeouts etc. I found that wget is being used
against the site, with a lot of rapid-fire requests for the *checkout*
URLs - most often for entire directories. And very often, right when
wget is being used, sources.g.o performance sucks badly.

The wget requests account for 25-30% of the daily requests to
sources.gentoo.org. The site is meant for browsing, not pulling.

If you are pulling multiple files, or some file on a regular basis, you
should be using the anoncvs/anonsvn systems instead.

export ANON=":pserver:anonymous@anoncvs.gentoo.org:/var/cvsroot"

Listing the files for a package:
# cvs -z0 -d $ANON rls gentoo-x86/$CAT/$PN/

Grabbing an entire package:
# T=`mktemp -d`
# cvs -z0 -d $ANON co -d $T gentoo-x86/$CAT/$PN

Grabbing a single file, without a temp:
# cvs -z0 -d $ANON co -p gentoo-x86/$CAT/$PN/$FILENAME >$OUTPUTFILE
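
For scripts that currently fetch single files with wget, the single-file command above could be wrapped roughly as follows. This is only a sketch: the function name fetch_ebuild_file and the DRY_RUN switch are made up for illustration, and the cvs invocation is the same one shown above.

```shell
#!/bin/sh
# Hypothetical helper; fetch_ebuild_file and DRY_RUN are illustrative names,
# not part of any Gentoo tool.
ANON=":pserver:anonymous@anoncvs.gentoo.org:/var/cvsroot"

# Usage: fetch_ebuild_file CAT PN FILENAME OUTPUTFILE
# With DRY_RUN=1 the cvs command is printed instead of executed.
fetch_ebuild_file() {
    if [ "$#" -ne 4 ]; then
        echo "usage: fetch_ebuild_file CAT PN FILENAME OUTPUTFILE" >&2
        return 2
    fi
    path="gentoo-x86/$1/$2/$3"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "cvs -z0 -d $ANON co -p $path"
        return 0
    fi
    # co -p sends the file contents to stdout, so no working copy is created.
    cvs -z0 -d "$ANON" co -p "$path" >"$4"
}
```

Calling it with DRY_RUN=1 first is a cheap way to check the path before anything touches the server.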

Pursuant to the above, any useragent matching /^Wget/ will be
blocked from the 'gentoo' and 'gentoo-x86' repos of sources.gentoo.org
as of July 14th.

Either change to using the proper anonymous service, or change your
useragent to describe what you are doing with the service, so that I can
specifically ban your user-agent if it's causing too much load.
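
For those who stay on wget for occasional fetches, setting a descriptive user agent might look like the sketch below. The names ua_matches_block and polite_fetch, and the example user-agent string, are assumptions for illustration only; --user-agent and -O are standard wget options.

```shell
#!/bin/sh
# Hypothetical names throughout: ua_matches_block, polite_fetch and the
# example user-agent string are illustrations, not part of any Gentoo tool.

# Succeeds (returns 0) if the user agent starts with "Wget", the shell-glob
# equivalent of the /^Wget/ rule described above.
ua_matches_block() {
    case "$1" in
        Wget*) return 0 ;;
        *)     return 1 ;;
    esac
}

# The user agent should name the tool and give a way to reach its owner.
UA="${UA:-my-script/0.1 (contact: admin@example.org)}"

# Usage: polite_fetch URL OUTPUTFILE
polite_fetch() {
    if ua_matches_block "$UA"; then
        echo "user agent '$UA' would be blocked; pick a descriptive one" >&2
        return 1
    fi
    wget --user-agent="$UA" -O "$2" "$1"
}
```

This only covers the occasional single fetch; anything that pulls multiple files or runs on a schedule belongs on anoncvs/anonsvn as described above.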

Fabian Groffen | 12 Jul 08:20

Re: wget abuse of sources.g.o must stop

On 11-07-2008 16:16:42 -0700, Robin H. Johnson wrote:
> I was tracing into why sources.gentoo.org is so slow at times, and
> gives errors, timeouts etc. I found that there is wget being used
> against the site, with a lot of rapid-fire requests for the *checkout*
> URLs - most often for entire directories. And very often, right when
> wget is being used, the sources.g.o performance sucks badly.

I just changed the prefix script "ecopy" to use anon cvs instead of
wget.  I have no clue how often the script is used, but I really hope
it isn't used so often that it could cause the problems on sources.g.o.

-- 
Fabian Groffen
Gentoo on a different level
--

-- 
gentoo-dev@lists.gentoo.org mailing list

Gokdeniz Karadag | 13 Jul 14:44

Re: wget abuse of sources.g.o must stop

Robin H. Johnson wrote:
....
> Pursuant to the above, the any useragent matching /^Wget/ will be
> blocked from the 'gentoo' and 'gentoo-x86' repos of sources.gentoo.org
> as of July 14th.
> 
> Either change to using the proper anonymous service, or change your
> useragent to describe what you are doing with the service, so that I can
> specifically ban your user-agent if it's causing too much load.
> 

I think a post on planet gentoo and/or gentoo.org would be beneficial.

-- 
Gokdeniz Karadag
