Top 10 Good Bots
Bots and botnets are commonly associated with cybercriminals stealing data,
identities, credit card numbers and worse. But bots can also serve good
purposes. Separating good bots from bad ones can make a big difference in how
you protect your company’s website and ensure that your site gets the
Internet traffic it deserves.
Most good bots are essentially crawlers sent out from the world’s biggest web
sites to index content for their search engines and social media platforms. You
WANT those bots to visit you. They bring you more business! Shutting them out
as part of a strategy to block bad bots is a losing strategy.
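One way to avoid blocking good crawlers along with bad ones is to check each
request's User-Agent header against an allowlist before applying bot-blocking
rules. The sketch below is only an illustration: the function name
identify_good_bot and the table GOOD_BOT_TOKENS are hypothetical, and the
substrings are assumptions about what each crawler sends, so check them against
each vendor's own documentation before relying on them.

```python
# Hypothetical allowlist: substrings ASSUMED to appear in each crawler's
# User-Agent header. Confirm every token with the vendor before use.
GOOD_BOT_TOKENS = {
    "googlebot": "Googlebot",
    "baiduspider": "Baiduspider",
    "bingbot": "Bingbot",
    "msnbot": "MSN Bot",
    "yandex": "Yandex Bot",
    "sosospider": "Soso Spider",
    "exabot": "Exabot",
    "sogou": "Sogou Spider",
    "facebookexternalhit": "Facebook External Hit",
    "feedfetcher-google": "Google Feedfetcher",
}


def identify_good_bot(user_agent):
    """Return the bot's name if the User-Agent matches the allowlist, else None."""
    ua = user_agent.lower()  # match case-insensitively
    for token, name in GOOD_BOT_TOKENS.items():
        if token in ua:
            return name
    return None


# Example: a crawler-style User-Agent versus an ordinary browser.
print(identify_good_bot("Mozilla/5.0 (compatible; Googlebot/2.1)"))
print(identify_good_bot("Mozilla/5.0 (Windows NT 10.0) Chrome/120.0"))
```

A User-Agent string is easy to forge, so a match here is a hint and not proof.
A stricter setup would also confirm that the request comes from an address the
bot's operator actually publishes or resolves to.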
Here, in reverse order of how likely they are to visit any web site, are the 10
most important good bots that you should know about now. Make sure your security
strategy welcomes these bots (or that you at least know why you chose to block
them)!
1. Googlebot :- Googlebot is Google’s web
crawling bot (sometimes also called a “spider”). Googlebot uses an algorithmic
process: computer programs determine which sites to crawl, how often, and how
many pages to fetch from each site. Googlebot’s crawl process begins with a
list of webpage URLs, generated from previous crawl processes and augmented with
Sitemap data provided by webmasters. As Googlebot visits each of these websites
it detects links (SRC and HREF) on each page and adds them to its list of pages
to crawl. New sites, changes to existing sites, and dead links are noted and
used to update the Google index.
2. Baiduspider :- Baiduspider is the web crawler of Baidu (Chinese: 百度;
pinyin: Bǎidù), the leading Chinese search engine for websites, audio files,
and images.
3. MSN Bot/Bingbot :- MSN Bot was retired in October 2010 and rebranded as
Bingbot. It is a web-crawling robot (a type of Internet bot) deployed by
Microsoft that collects documents from the web to build a searchable index for
the Bing search engine.
4. Yandex Bot :- Yandex Bot is the crawler for the Yandex search engine. Yandex
is a Russian Internet company that operates the largest search engine in
Russia, with about 60% market share in that country. As of April 2012, Yandex
ranked as the fifth-largest search engine worldwide, with more than 150 million
searches per day and more than 25.5 million visitors.
5. Soso Spider :- Soso.com is a Chinese search engine owned by Tencent Holdings
Limited, which is well known for its other creation, QQ. As of 13 May 2012,
Soso.com was ranked as the 36th most visited website in the world and the 13th
most visited website in China, according to Alexa Internet. On average,
Soso.com gets 21,064,490 page views every day.
6. Exabot :- Exabot is the crawler for Exalead, out of France. Founded in 2000
by search engine pioneers and now part of Dassault Systèmes, Exalead provides
search and unified information access software.
7. Sogou Spider :- Sogou.com is a Chinese search engine that was launched on
August 4, 2004. As of April 2010, it had a rank of 121 in Alexa’s Internet
rankings. Sogou provides an index of up to 10 billion web pages.
8. Google Plus Share :- Google Plus lets you share
recommendations with friends, contacts and the rest of the web – on Google
search. The +1 button helps initialize Google’s instant share capabilities, and
it also provides a way to give something your public stamp of approval.
9. Facebook External Hit :- Facebook allows its
users to send links to interesting web content to other Facebook users. Part of
how this works on the Facebook system involves the temporary display of certain
images or details related to the web content, such as the title of the webpage
or the embed tag of a video. The Facebook system retrieves this information
only after a user provides a link.
10. Google Feedfetcher :- Used by Google to
grab RSS or Atom feeds when users choose to add them to their Google homepage
or Google Reader. Feedfetcher collects and periodically refreshes these
user-initiated feeds, but does not index them in Blog Search or Google’s other
search services (feeds appear in the search results only if they’ve been
crawled by Googlebot).
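The crawl process described for Googlebot in item 1 (start from a list of URLs,
fetch each page, collect its SRC and HREF links, add new ones to the list, and
note dead links) can be sketched in a few lines. This is a simplified
illustration and not Google's actual implementation: the names LinkExtractor,
crawl, fetch and FAKE_WEB are hypothetical, and the "web" is an in-memory
dictionary so the example runs without network access.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collect the values of HREF and SRC attributes found in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        # html.parser passes attribute names already lowercased.
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(urljoin(self.base_url, value))


def crawl(seed_urls, fetch, max_pages=100):
    """Breadth-first crawl. fetch(url) returns HTML text, or None for a dead link."""
    frontier = deque(seed_urls)   # the list of pages still to crawl
    seen = set(seed_urls)         # avoid queueing the same URL twice
    index, dead = {}, []
    while frontier and len(index) + len(dead) < max_pages:
        url = frontier.popleft()
        html = fetch(url)
        if html is None:
            dead.append(url)      # note the dead link and move on
            continue
        index[url] = html
        parser = LinkExtractor(url)
        parser.feed(html)
        for link in parser.links:
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return index, dead


# Hypothetical in-memory "web" standing in for real HTTP fetches.
FAKE_WEB = {
    "http://example.com/": '<a href="/a">A</a><img src="/logo.png">',
    "http://example.com/a": '<a href="/">home</a><a href="/missing">x</a>',
    "http://example.com/logo.png": "",
}
index, dead = crawl(["http://example.com/"], FAKE_WEB.get)
print(sorted(index))
print(dead)
```

A real crawler would also honor robots.txt, limit its request rate, and decide
how often to revisit each site, none of which is shown here.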