I have a WordPress website that uses AWS, specifically the CloudFront service, to serve CSS, images, and JS from the cloud.

Lately, I have noticed a lot of hits from bots coming from:

  • IP: Hostname: ec2-54-236-71-87.compute-1.amazonaws.com
  • IP: Hostname: ec2-54-147-229-75.compute-1.amazonaws.com
  • IP: Hostname: ec2-34-207-96-105.compute-1.amazonaws.com
  • IP: Hostname: ec2-52-202-239-36.compute-1.amazonaws.com
  • IP: Hostname: ec2-34-203-222-34.compute-1.amazonaws.com
  • ...

GeoIP traces them to Ashburn, USA. They crawl all the RSS feeds of my website (posts, categories) almost every minute, and the referrer is reported as https://www.google.com/.

Their user agents are a bit random:

Browser: undefined
Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9a3pre) Gecko/20070330

or:

Browser: Chrome version 0.0 running on MacOSX
Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_2) AppleWebKit/537.17 (KHTML, like Gecko) Chrome/24.0.1309.0 Safari/537.17

or:

Browser: undefined
Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US; rv:; ) Gecko/20101203

How can I find out what they want? Are they related to cache generation for CloudFront?
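
One way to test the CloudFront hypothesis would be to compare these addresses against the ranges AWS publishes at https://ip-ranges.amazonaws.com/ip-ranges.json, where each entry under "prefixes" carries an "ip_prefix" and a "service" such as "EC2" or "CLOUDFRONT". Below is a minimal sketch in Python. The helper names (ip_from_ec2_hostname, services_for_ip) are my own, it assumes the hostnames follow the usual ec2-A-B-C-D reverse-DNS form, and SAMPLE_PREFIXES holds placeholder entries for illustration only, not real AWS data.

```python
import ipaddress
import re

# Assumption: EC2 reverse-DNS names embed the IPv4 address as ec2-A-B-C-D.
EC2_HOST_RE = re.compile(r"^ec2-(\d+)-(\d+)-(\d+)-(\d+)\.")


def ip_from_ec2_hostname(hostname):
    """Return the IPv4 address embedded in an EC2 hostname, or None."""
    match = EC2_HOST_RE.match(hostname)
    if match is None:
        return None
    return ipaddress.ip_address(".".join(match.groups()))


def services_for_ip(ip, prefixes):
    """Return the sorted service names whose prefix contains the address.

    `prefixes` is a list of dicts shaped like the "prefixes" entries in
    ip-ranges.json: {"ip_prefix": "...", "service": "..."}.
    """
    return sorted(
        {p["service"] for p in prefixes
         if ip in ipaddress.ip_network(p["ip_prefix"])}
    )


# Placeholder entries for illustration only, NOT real AWS data.
# In practice, load the "prefixes" list from the downloaded ip-ranges.json.
SAMPLE_PREFIXES = [
    {"ip_prefix": "54.236.0.0/16", "service": "EC2"},
    {"ip_prefix": "198.51.100.0/24", "service": "CLOUDFRONT"},
]

if __name__ == "__main__":
    host = "ec2-54-236-71-87.compute-1.amazonaws.com"
    ip = ip_from_ec2_hostname(host)
    print(host, ip, services_for_ip(ip, SAMPLE_PREFIXES))
```

If the real list puts these addresses under "EC2" and not under "CLOUDFRONT", that would suggest the requests come from someone's EC2 instances rather than from CloudFront itself, though it would not say who runs them or why.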

