Allowing google bot and google preview
2012-01-27 09:52:38 GMT
All,
We are using Apache 2.2.17 on Windows Server 2008 SP2. We run mod_security 2 with core ruleset 2.1.1 - I realise this is a little out of date. Things generally run OK, but we started having issues with our website when we tried to let Google index it. Rule 960015, which stops requests without an Accept header, kicked in and blocked the Googlebot. I searched and found a post by Ryan which suggested filtering incoming requests against a known IP address range rather than just the User-Agent, as the latter can be faked. This is fine, but if Google updates their IP range, there is no mechanism by which they publish the change. I found a post where a Googlebot engineer suggested that the way to ensure a request really comes from Googlebot is to perform a reverse lookup and then a forward lookup on the IP, and make sure it comes from google.com. Is there anything in mod_security that would facilitate this?
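For reference, a rough sketch of how that reverse-then-forward check might be expressed with Apache and ModSecurity 2.x. It assumes that enabling Apache's HostnameLookups Double is acceptable on this server (it adds DNS lookups to request handling), that ModSecurity's REMOTE_HOST variable then carries the verified hostname, and that Google's crawlers resolve under googlebot.com or google.com. The rule id 1000001 and the variable tx.verified_google are made up for illustration, and none of this has been tested against CRS 2.1.1.

```apache
# Assumption: set at server level. "Double" makes Apache do a reverse lookup
# on the client IP and then a forward lookup on the resulting hostname.
HostnameLookups Double

# Hypothetical local rule: flag requests whose verified hostname is Google's.
# If the lookup fails, REMOTE_HOST is expected to hold the bare IP address,
# so the pattern does not match and the flag stays unset.
SecRule REMOTE_HOST "@rx \.(?:googlebot|google)\.com$" \
    "phase:1,t:none,t:lowercase,nolog,pass,id:1000001,setvar:tx.verified_google=1"
```

Later local rules could then test TX:verified_google instead of trusting the User-Agent alone.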
Currently we have the rule
SecRule REQUEST_HEADERS:User-Agent "\+http:\/\/www\.google\.com\/bot\.html" \
"phase:1,nolog,allow,ctl:ruleEngine=Off"
So my first question is: how can we skip just this one rule instead of turning off all of mod_security? I know you can use skipAfter, but doesn't that mean we have to edit the rules files, remember all our edits, and reapply them when updating? I was wondering if there was another mechanism. My second question: shortly after we managed to get the Googlebot through, it seems to have sent its friend "Google Web Preview", which is also tripping rule 960015. The rule output is shown below. Do we just need to add another suitable User-Agent skip rule? My third question: presumably none of this is new, and other sites using mod_security must hit this issue often. So is there some default configuration for letting Google (and other robots) in that we have missed?
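On the first two questions, one possibility is the ctl:ruleRemoveById action, which drops a single rule for the current transaction only and can live in a local configuration file, so the CRS files stay unedited across updates. A minimal sketch, assuming both crawlers can be matched on the User-Agent strings seen so far; the rule id 1000002 is made up for illustration, and because a User-Agent can be faked the match would ideally be combined with an IP or hostname check:

```apache
# Hypothetical local rule, kept outside the CRS files.
# It runs in phase 1, before rule 960015 is evaluated in phase 2, and removes
# only that rule for requests that look like Googlebot or Google Web Preview.
SecRule REQUEST_HEADERS:User-Agent "@rx \+http://www\.google\.com/bot\.html|Google Web Preview" \
    "phase:1,t:none,nolog,pass,id:1000002,ctl:ruleRemoveById=960015"
```

Unlike ctl:ruleEngine=Off combined with allow, every other rule still applies to these requests.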
Thanks for any help
Chris
--84670000-B--
GET /public/image/mainBackgroundStrip.jpg HTTP/1.1
Cookie: JSESSIONID=A7D03B59EED2EEDC28B5CCD3E1524143
Referer: https://www.qnspay.com/server/home;jsessionid=A7D03B59EED2EEDC28B5CCD3E1524143
User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/534.51 (KHTML, like Gecko; Google Web Preview) Chrome/12.0.742 Safari/534.51
Accept-Encoding: gzip,deflate
Host: www.qnspay.com
X-Forwarded-For: 81.143.18.220

--84670000-F--
HTTP/1.1 412 Precondition Failed
Last-Modified: Mon, 14 Feb 2011 16:50:10 GMT
ETag: "2000000003a1c-1f4-49c40d74cca58"
Accept-Ranges: bytes
Vary: Accept-Encoding
Content-Encoding: gzip
Content-Length: 328
Content-Type: text/html

--84670000-H--
Message: Access denied with code 412 (phase 2). Operator EQ matched 0 at REQUEST_HEADERS. [file "D:/apps/Apache2.2/conf/modsecurity2/base_rules/modsecurity_crs_21_protocol_anomalies.conf"] [line "46"] [id "960015"] [rev "2.1.1"] [msg "Request Missing an Accept Header"] [severity "CRITICAL"] [tag "PROTOCOL_VIOLATION/MISSING_HEADER_ACCEPT"] [tag "WASCTC/WASC-21"] [tag "OWASP_TOP_10/A7"] [tag "PCI/6.5.10"]
Action: Intercepted (phase 2)
Stopwatch: 1327656817879800 15600 (0 0 -)
Producer: ModSecurity for Apache/2.5.13 (http://www.modsecurity.org/); core ruleset/2.1.1.
Server: Apache
WebApp-Info: "QNS" "-" "-"

--84670000-K--
SecAction "auditlog,status:412,phase:1,t:none,nolog,pass,setvar:tx.critical_anomaly_score=5,setvar:tx.error_anomaly_score=4,setvar:tx.warning_anomaly_score=3,setvar:tx.notice_anomaly_score=2"
SecAction "auditlog,status:412,phase:1,t:none,nolog,pass,setvar:tx.inbound_anomaly_score_level=5"
SecAction "auditlog,status:412,phase:1,t:none,nolog,pass,setvar:tx.outbound_anomaly_score_level=4"
SecAction "auditlog,status:412,phase:1,t:none,nolog,pass,setvar:tx.paranoid_mode=0"
SecAction "auditlog,status:412,phase:1,t:none,nolog,pass,setvar:tx.max_num_args=255"
SecAction "auditlog,status:412,phase:1,t:none,nolog,pass,setvar:'tx.allowed_methods=GET HEAD POST OPTIONS',setvar:'tx.allowed_request_content_type=application/x-www-form-urlencoded multipart/form-data text/xml application/xml application/x-amf text/x-gwt-rpc',setvar:'tx.allowed_http_versions=HTTP/0.9 HTTP/1.0 HTTP/1.1',setvar:'tx.restricted_extensions=.asa/ .asax/ .ascx/ .axd/ .backup/ .bak/ .bat/ .cdx/ .cer/ .cfg/ .cmd/ .com/ .config/ .conf/ .cs/ .csproj/ .csr/ .dat/ .db/ .dbf/ .dll/ .dos/ .htr/ .htw/ .ida/ .idc/ .idq/ .inc/ .ini/ .key/ .licx/ .lnk/ .log/ .mdb/ .old/ .pass/ .pdb/ .pol/ .printer/ .pwd/ .resources/ .resx/ .sql/ .sys/ .vb/ .vbs/ .vbproj/ .vsdisco/ .webinfo/ .xsd/ .xsx/',setvar:'tx.restricted_headers=/Proxy-Connection/ /Lock-Token/ /Content-Range/ /Translate/ /via/ /if/'"
SecRule "REQUEST_HEADERS:User-Agent" "@rx ^(.*)$" "auditlog,status:412,phase:1,t:none,pass,nolog,t:sha1,t:hexEncode,setvar:tx.ua_hash=%{matched_var}"
SecAction "auditlog,status:412,phase:1,t:none,pass,nolog,initcol:global=global,initcol:ip=%{remote_addr}_%{tx.ua_hash}"
SecRule "REQUEST_METHOD" "@rx ^(?:GET|HEAD)$" "log,auditlog,status:412,phase:1,chain,rev:2.1.1,t:none,block,msg:'GET or HEAD requests with bodies',severity:2,id:960011,tag:PROTOCOL_VIOLATION/EVASION,tag:WASCTC/WASC-21,tag:OWASP_TOP_10/A7,tag:PCI/6.5.10,tag:http://www.w3.org/Protocols/rfc2616/rfc2616-sec4.html#sec4.3"
SecRule "REQUEST_METHOD" "!@rx ^OPTIONS$" "log,auditlog,status:412,phase:2,chain,rev:2.1.1,t:none,block,msg:'Request Missing an Accept Header',severity:2,id:960015,tag:PROTOCOL_VIOLATION/MISSING_HEADER_ACCEPT,tag:WASCTC/WASC-21,tag:OWASP_TOP_10/A7,tag:PCI/6.5.10"
SecRule "&REQUEST_HEADERS:Accept" "@eq 0" "t:none,setvar:tx.msg=%{rule.msg},setvar:tx.anomaly_score=+%{tx.notice_anomaly_score},setvar:tx.protocol_violation_score=+%{tx.notice_anomaly_score},setvar:tx.%{rule.id}-PROTOCOL_VIOLATION/MISSING_HEADER-%{matched_var_name}=%{matched_var}"
--84670000-Z--
Best regards,
Chris
*) reliably: neglecting the fact that IP address might be spoofed.