Web crawlers are used to index the internet to help people search the web more efficiently.

Prerequisites

  • Basic knowledge of JavaScript and HTML

  • Basic knowledge of Treasure Data

  • Basic knowledge of Treasure Data JavaScript SDK


Treasure Data recommends that you implement any new features or functionality on your site using the Treasure Data JavaScript SDK version 3 Beta, which manages cookies differently from earlier versions. When following most of these articles, be aware that you need to define the suggested event collectors and use Treasure Data JavaScript SDK version 3 calls in your solutions.

For example, change //cdn.treasuredata.com/sdk/2.5/td.min.js to //cdn.treasuredata.com/sdk/3.0.0-beta/td.min.js.

User-Agents for Google Crawlers

Because the Treasure Data JavaScript SDK tracks all page views, raw data usually contains many accesses from web crawlers. You can use the td_browser parameter to determine whether an access came from a browser.

The td_browser value is derived from the user agent of each request, and it is set on the SDK backend server. td_browser has the following value for each Google crawler.

| Crawler | User agents | User agent in HTTP(S) requests | td_browser |
|---|---|---|---|
| Googlebot (Google Web search) | Googlebot | Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html) | “Googlebot” |
| Googlebot (Google Web search) | Googlebot | (rarely used): Googlebot/2.1 (+http://www.google.com/bot.html) | “Googlebot” |
| Googlebot News | Googlebot-News (Googlebot) | Googlebot-News | “Other” |
| Googlebot Images | Googlebot-Image (Googlebot) | Googlebot-Image/1.0 | “Other” |
| Googlebot Video | Googlebot-Video (Googlebot) | Googlebot-Video/1.0 | “Other” |
| Google Mobile (feature phone) | Googlebot-Mobile | SAMSUNG-SGH-E250/1.0 Profile/MIDP-2.0 Configuration/CLDC-1.1 UP.Browser/6.2.3.3.c.1.101 (GUI) MMP/2.0 (compatible; Googlebot-Mobile/2.1; +http://www.google.com/bot.html) | “UP.Browser” |
| Google Mobile (feature phone) | Googlebot-Mobile | DoCoMo/2.0 N905i(c100;TB;W24H16) (compatible; Googlebot-Mobile/2.1; +http://www.google.com/bot.html) | “Other” |
| Google Smartphone | Googlebot | Mozilla/5.0 (iPhone; CPU iPhone OS 8_3 like Mac OS X) AppleWebKit/600.1.4 (KHTML, like Gecko) Version/8.0 Mobile/12F70 Safari/600.1.4 (compatible; Googlebot/2.1; +http://www.google.com/bot.html) | “Googlebot” |
| Google Mobile AdSense | Mediapartners-Google | [various mobile device types] (compatible; Mediapartners-Google/2.1; +http://www.google.com/bot.html) | “Other” |
| Google Mobile AdSense | Mediapartners (Googlebot) | [various mobile device types] (compatible; Mediapartners-Google/2.1; +http://www.google.com/bot.html) | “Other” |
| Google AdSense | Mediapartners-Google | Mediapartners-Google | “Other” |
| Google AdSense | Mediapartners (Googlebot) | Mediapartners-Google | “Other” |
| Google AdsBot landing page quality check | AdsBot-Google | AdsBot-Google (+http://www.google.com/adsbot.html) | “Other” |
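The values listed above show that only some Google crawlers are reported as “Googlebot”; several appear as “Other” or “UP.Browser”. Filtering on td_browser alone would therefore miss them. The following is a minimal sketch of one way to exclude Google crawler accesses from exported records by also checking the raw user-agent string. It is not part of the Treasure Data JavaScript SDK. The function names, the record shape, and the td_user_agent field name are assumptions made for illustration, and the Chrome user agent in the sample data is invented.

```javascript
// Tokens taken from the user-agent strings listed above.
// "Googlebot" also matches Googlebot-News, -Image, -Video, and -Mobile.
const GOOGLE_CRAWLER_PATTERN = /Googlebot|Mediapartners-Google|AdsBot-Google/i;

// Hypothetical helper. `record` is assumed to look like
// { td_browser: string, td_user_agent: string }.
function isGoogleCrawler(record) {
  // Fast path: the backend already labeled this access as Googlebot.
  if (record.td_browser === 'Googlebot') {
    return true;
  }
  // Fallback: crawlers reported as "Other" or "UP.Browser" are only
  // recognizable from the raw user-agent string.
  const userAgent = record.td_user_agent || '';
  return GOOGLE_CRAWLER_PATTERN.test(userAgent);
}

// Hypothetical helper: keep only records that are not Google crawlers.
function excludeGoogleCrawlers(records) {
  return records.filter((record) => !isGoogleCrawler(record));
}

// Illustrative sample data; the first user agent is made up.
const sample = [
  {
    td_browser: 'Chrome',
    td_user_agent: 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0 Safari/537.36',
  },
  {
    td_browser: 'Googlebot',
    td_user_agent: 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)',
  },
  {
    td_browser: 'Other',
    td_user_agent: 'AdsBot-Google (+http://www.google.com/adsbot.html)',
  },
  {
    td_browser: 'Other',
    td_user_agent: 'DoCoMo/2.0 N905i(c100;TB;W24H16) (compatible; Googlebot-Mobile/2.1; +http://www.google.com/bot.html)',
  },
];

console.log(excludeGoogleCrawlers(sample).map((r) => r.td_browser));
// prints [ 'Chrome' ]
```

The same two-step check (td_browser first, user-agent pattern second) could be expressed as a query condition instead; the pattern only covers the Google crawlers listed here and would need extending for other crawlers.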
