chris derham | 27 Jan 23:33

Re: mod-security-users Digest, Vol 68, Issue 16


>>> THAT is basically the big question we want to find out - IF IT IS OWNED BY Google.
>>> That cannot reliably be done via reverse-DNS (as you stated above).
>>>
>>> So, what we would need to do now, would be to do a reverse lookup and a
>>> forward lookup on the result, asserting that the forward lookup points to
>>> the original IP address:
>>>
>>>   EVIL-IP              --reverse-lookup--> IP.crawl.google.com
>>>   IP.crawl.google.com  ----dns-lookup----> 1.2.3.4
>>>
>>>   1.2.3.4 =? EVIL-IP
>>>
>>> Thus, to mask your evil IP by your devilish DNS, you'd also have to have
>>> some control over the forward DNS resolver.
>>>
>>> Still doable, but requires more effort.
>>> Does that sound better to you
>>
>> in theory yes, practically what you try to do is not possible
>>
>> it is dangerous, there is no RFC saying A-Record/PTR needs to match
>> and there will never be because it can not match in all cases
>> like a round-robin record below
>>
>> [harry <at> srv-rhsoft:~]$ nslookup www.google.com
>> Server:         127.0.0.1
>> Address:        127.0.0.1#53
>>
>> Non-authoritative answer:
>> www.google.com  canonical name = www.l.google.com.
>> Name:   www.l.google.com
>> Address: 173.194.69.106
>> Name:   www.l.google.com
>> Address: 173.194.69.147
>> Name:   www.l.google.com
>> Address: 173.194.69.99
>> Name:   www.l.google.com
>> Address: 173.194.69.103
>> Name:   www.l.google.com
>> Address: 173.194.69.104
>> Name:   www.l.google.com
>> Address: 173.194.69.105
>>
>> [harry <at> srv-rhsoft:~]$ nslookup 173.194.69.106
>> Server:         127.0.0.1
>> Address:        127.0.0.1#53
>>
>> Non-authoritative answer:
>> 106.69.194.173.in-addr.arpa     name = bk-in-f106.1e100.net.

So, in an effort to help the discussion, here is the original link I referred to, where a Googlebot engineer says this is the way to go: http://googlewebmastercentral.blogspot.com/2006/09/how-to-verify-googlebot.html. In addition, one of the Googlebot attempts to access our site came from IP 66.249.67.172. Performing the reverse and then the forward lookup gives the expected results:

C:\>nslookup  66.249.67.172
Server:  UnKnown
Address:  192.168.2.1

Name:    crawl-66-249-67-172.googlebot.com
Address:  66.249.67.172

C:\>nslookup crawl-66-249-67-172.googlebot.com
Server:  UnKnown
Address:  192.168.2.1

Non-authoritative answer:
Name:    crawl-66-249-67-172.googlebot.com
Address:  66.249.67.172

So while this may not work for round-robin servers, the Google bots do not appear to be load balanced.
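The reverse-then-forward check shown in the nslookup output above can be sketched as follows. This is a minimal illustration in Python rather than the Lua used elsewhere in this thread; the function names and the injectable `reverse`/`forward` parameters are assumptions made for the example. It tests membership rather than equality, so a name that resolves to several addresses still passes as long as the client IP is among them.

```python
import socket

def reverse_lookup(ip):
    """Return the PTR name for ip, or None if the lookup fails."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return None

def forward_lookup(name):
    """Return every IPv4 address the name resolves to (may be several)."""
    try:
        return socket.gethostbyname_ex(name)[2]
    except OSError:
        return []

def is_forward_confirmed(ip, reverse=reverse_lookup, forward=forward_lookup):
    """True if ip -> name -> addresses leads back to ip."""
    name = reverse(ip)
    if name is None:
        return False
    # Membership, not equality: a round-robin name returns several addresses.
    return ip in forward(name)
```

A spoofed PTR record alone does not pass this check, because the forward lookup of the claimed name is answered by whoever controls that name's zone.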

The only problem I see with Chris's approach is that you would have to wait for Google bots to be blocked before you could detect their IPs, perform the reverse/forward lookups and then unblock them. Assuming Google has a large pool of bots, it might take some time before the same bot comes back and can be let in. On the other hand, invoking this double DNS lookup whenever someone presents a suitable user agent sounds like a likely candidate for denial of service.
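One way to limit the cost of lookups on every request is to remember the verdict per IP for a while. The sketch below is only an illustration in Python: `VerdictCache`, its `verify` callable and the one-hour default are hypothetical names and values, not anything from ModSecurity. Failed verdicts are cached as well, so a client that keeps presenting a bot user agent does not trigger a fresh lookup each time.

```python
import time

class VerdictCache:
    """Remembers per-IP verification results for ttl seconds."""

    def __init__(self, verify, ttl=3600.0, clock=time.monotonic):
        self._verify = verify    # hypothetical callable: ip -> bool
        self._ttl = ttl
        self._clock = clock
        self._entries = {}       # ip -> (expires_at, verdict)

    def check(self, ip):
        now = self._clock()
        hit = self._entries.get(ip)
        if hit is not None and hit[0] > now:
            return hit[1]        # still fresh: no DNS traffic
        verdict = self._verify(ip)
        self._entries[ip] = (now + self._ttl, verdict)
        return verdict
```

The sketch does not bound the number of cached entries; a real deployment would also need a size limit, since many distinct source addresses could otherwise grow the table without limit.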

So have I got the wrong end of the stick with this?

Thanks

Chris
------------------------------------------------------------------------------
Try before you buy = See our experts in action!
The most comprehensive online learning library for Microsoft developers
is just $99.99! Visual Studio, SharePoint, SQL - plus HTML5, CSS3, MVC3,
Metro Style Apps, more. Free future releases when you subscribe now!
http://p.sf.net/sfu/learndevnow-dev2
_______________________________________________
mod-security-users mailing list
mod-security-users <at> lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mod-security-users
Commercial ModSecurity Rules and Support from Trustwave's SpiderLabs:
http://www.modsecurity.org/projects/commercial/rules/
http://www.modsecurity.org/projects/commercial/support/
Ryan Barnett | 28 Jan 00:55

Re: mod-security-users Digest, Vol 68, Issue 16

I think Josh recommended a good approach. Check the User-Agent for google-bot and then execute a Lua script
to do the double lookups and alert on mismatches.

This selective lookup check, versus just enabling double hostname lookups in Apache, is a better choice for performance.

Thinking on this more though, we should probably add a new operator for this - @dnsLookup.

Ryan


________________________________
This transmission may contain information that is privileged, confidential, and/or exempt from
disclosure under applicable law. If you are not the intended recipient, you are hereby notified that any
disclosure, copying, distribution, or use of the information contained herein (including any reliance
thereon) is STRICTLY PROHIBITED. If you received this transmission in error, please immediately
contact the sender and destroy the material in its entirety, whether in electronic or hard copy format.

chris derham | 30 Jan 18:50

Re: mod-security-users Digest, Vol 68, Issue 16


> I think Josh recommended a good approach. Check the User-Agent for google-bot and then execute a Lua script to do the double lookups and alert on mismatches.
>
> This selective lookup check, versus just enabling double hostname lookups in Apache, is a better choice for performance.
>
> Thinking on this more though, we should probably add a new operator for this - @dnsLookup.
>
> Ryan

Replied to Ryan only - sorry

All,

So I have played about and have something. I think it is about 90% there; I can't work out how to access variables declared in mod_security and loop over them in Lua. Also, as I said before, I don't know Lua and might be doing something stupid. It will need more robot user agents and domains added for other search engine robots, but presumably this can be done once they trigger false positives on live sites.

SecAction "pass,setvar:'tx.robot_user_agents=google-bot'"
SecAction "pass,setvar:'tx.robot_domains=googlebot.com'"
SecRule &REQUEST_HEADERS:User-Agent "!@within %{tx.robot_user_agents}" \
            "skipAfter:END_SEARCH_ENGINE_BOT_CHECK,phase:2,msg:'searchEngineBotReverseDnsLookup skipped as not search engine bot activity',allow"
SecRuleScript modsecurity2/searchEngineBotReverseDnsLookup.lua "phase:1,msg:'search engine bot allowed',allow,ctl:ruleEngine=Off"
SecMarker END_SEARCH_ENGINE_BOT_CHECK

I wanted to use ctl:removeRuleById:960015, which would turn off only the single rule that is causing issues and leave all the other rules in place. I know their motto is "don't be evil", but you can't trust everyone, right? However, this isn't working for me. Perhaps I have a version of mod_security that doesn't support it? We will update if that is the case. The error message I get is:

Error parsing actions: Missing ctl value for name: removeRuleById:960015

The searchEngineBotReverseDnsLookup.lua file contains the following


require("io");

function trim(s)
  return (s:gsub("^%s*(.-)%s*$", "%1"))
end

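-- Runs nslookup on `source`, finds the hitCountRequired-th output line that
-- contains `search`, and returns that line with its first #search characters
-- dropped and whitespace trimmed. Returns "" if there is no such line.
function dnsLookup(source, search, hitCountRequired)
    -- local, so a failed lookup cannot return the result of an earlier call
    local result = ""
    if source == "" then
        return result
    end
    local n = os.tmpname()

    -- NOTE: `source` is passed to the shell unescaped
    os.execute("nslookup " .. source .. " > " .. n)
    local hitCount = 0

    for line in io.lines(n) do
      if string.match(line, search) then
        hitCount = hitCount + 1
        if (hitCount == hitCountRequired) then
            result = string.sub(line, string.len(search) + 1)
            break
        end
      end
    end
    os.remove(n)
    return trim(result)
end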

function string.ends(String,End)
   return End=='' or string.sub(String,-string.len(End))==End
end

function isKnownRobotDomain(domain)
    local result = true
    local d = m.getvars("tx.robot_domains")
    for i = 1, #d do
        if (string.ends(domain, d[i].value)) then
            return true;
        end
    end
    return false;
end

function main()
    local clientIp = m.getvar("REMOTE_ADDR");
    local reverseDnsName = dnsLookup(clientIp, "Name:", 1)
    local dnsClientIp = dnsLookup(reverseDnsName, "Address:", 2)
    local match = clientIp == dnsClientIp
    local matchDescription = ""
    local robotDomainDescription = "n/a"
    local result = nil;
    if (match) then
        matchDescription = "matches"
        if (isKnownRobotDomain(reverseDnsName)) then
            robotDomainDescription = "match"
            result = "match"
        else
            robotDomainDescription = "mismatch"
        end
    else
        matchDescription = "fails"
    end
    m.log(4, "searchBotReverseDnsLookup: clientIp " .. clientIp .. " " .. matchDescription .." reverse dns lookup: " .. reverseDnsName .. " != " .. dnsClientIp .. " robotDomain: " .. robotDomainDescription);
    return result;
end

The bit that doesn't work is the "local d = m.getvars("tx.robot_domains")" followed by "for i = 1, #d do": I can't get the for loop to start. Any thoughts on what is wrong? Other than that, I think the approach will work OK. During testing, as expected, the first hit takes a while, but then the DNS cache kicks in, so subsequent accesses are very quick.
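The domain check in isKnownRobotDomain can be illustrated outside ModSecurity. The following is a sketch in Python, not a fix for the Lua loop; it assumes, purely for illustration, that the list of robot domains arrives as one comma-separated string, and the function names are hypothetical. It also matches on a dot boundary, because a plain ends-with test would accept a name such as crawl.evilgooglebot.com for the domain googlebot.com.

```python
def parse_domains(value):
    """Split a comma-separated value into clean, lower-case domain names."""
    return [d.strip().lower().strip(".") for d in value.split(",") if d.strip()]

def is_known_robot_domain(hostname, domains):
    """True if hostname equals a listed domain or is a subdomain of one."""
    host = hostname.lower().rstrip(".")
    return any(host == d or host.endswith("." + d) for d in domains)
```

For example, with the list "googlebot.com, example.org", the name crawl-66-249-67-172.googlebot.com is accepted and crawl.evilgooglebot.com is rejected.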

Thanks

Chris

Ryan Barnett | 30 Jan 19:16

Re: mod-security-users Digest, Vol 68, Issue 16

Chris -
For the ctl action, use this instead: ctl:ruleRemoveById=960015

While you do have the one use-case of wanting to remove that one rule when GoogleBot is indexing your site,
you can certainly also use these rules to alert if a client is masquerading as GoogleBot and flag them as a
malicious client.
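The two use cases, relaxing a rule for the real crawler and flagging a client that only claims to be one, come down to a three-way decision. A rough sketch in Python follows; `classify_client`, the label strings and the `verify_ip` callable are hypothetical names for this example, and `verify_ip` stands in for whatever double DNS lookup is used.

```python
def classify_client(user_agent, ip, verify_ip, bot_markers=("googlebot",)):
    """Label a request as 'not-a-bot', 'verified-bot' or 'impostor'."""
    ua = user_agent.lower()
    if not any(marker in ua for marker in bot_markers):
        return "not-a-bot"      # no DNS lookup spent on ordinary clients
    # The client claims to be a crawler, so the claim is worth checking.
    return "verified-bot" if verify_ip(ip) else "impostor"
```

Only requests whose User-Agent contains one of the markers reach `verify_ip`, which keeps the lookup selective.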

I will work on your rules a bit and get back to you.

-Ryan


Christian Bockermann | 30 Jan 12:33

Re: mod-security-users Digest, Vol 68, Issue 16

On 27.01.2012 at 23:33, chris derham wrote:
> So in an effort to help the discussion, here is the original link I referred to where a google bot engineer
> says this is the way to go
> http://googlewebmastercentral.blogspot.com/2006/09/how-to-verify-googlebot.html. In
> addition for one of the google bot attempts to access our site, it came from IP 66.249.67.172. Performing
> the forward/reverse lookup gives the expected results
> 
> C:\>nslookup  66.249.67.172
> Server:  UnKnown
> Address:  192.168.2.1
> 
> Name:    crawl-66-249-67-172.googlebot.com
> Address:  66.249.67.172
> 
> C:\>nslookup crawl-66-249-67-172.googlebot.com
> Server:  UnKnown
> Address:  192.168.2.1
> 
> Non-authoritative answer:
> Name:    crawl-66-249-67-172.googlebot.com
> Address:  66.249.67.172
> 
> So while this may not work for round-robin servers, the google bots do not appear to be load balanced,
> 
> The only problem I see with Chris's approach, is that you would have to wait for google bots to be blocked
> before you could detect their ips, perform the reverse/forward lookups and then block them. Assuming
> google have a large pool of google bots, this might take some time before you could get the same bot back
> again and let them in. On the other hand, invoking this double dns lookup when someone presents a suitable
> user agent sounds like a likely candidate for denial of service.
> 
> So have I got the wrong end of the stick with this?

I've implemented a simple check like this within the latest version of the jwall-tools. Some
more details can be found in a blog post on this:

	https://secure.jwall.org/blog/2012/01/30/1327919340000.html

Currently I am investigating the use of external databases provided by RIPE, Whois, ... to
validate complete network blocks, e.g. with a whois query for 66.249.67.172:

    NetRange:       66.249.64.0 - 66.249.95.255
    CIDR:           66.249.64.0/19
    OriginAS:
    NetName:        GOOGLE
    NetHandle:      NET-66-249-64-0-1
    ...
    OrgName:        Google Inc.
    OrgId:          GOGL

This might be a good indicator for validating the block 66.249.64.0/19 as google-bots.
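Checking an address against such a block needs no DNS traffic at all. The sketch below uses Python's standard ipaddress module with the NetRange from the whois output above; the helper name `in_blocks` is made up for the example. Whether every address in a block registered to Google is really a crawler is an assumption that the whois data alone does not confirm.

```python
import ipaddress

# NetRange from the whois output, converted to CIDR notation.
first = ipaddress.ip_address("66.249.64.0")
last = ipaddress.ip_address("66.249.95.255")
blocks = list(ipaddress.summarize_address_range(first, last))

def in_blocks(ip, networks):
    """True if ip lies inside any of the given networks."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in networks)
```

Here `blocks` holds the single network 66.249.64.0/19, and `in_blocks("66.249.67.172", blocks)` is true.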

Any comments on that?

Chris

