1 Million Free Hits: Preventing crawling

To avoid undesirable content in the search indexes, webmasters can instruct spiders not to crawl certain files or directories through the standard robots.txt file in the root directory of the domain. Additionally, a page can be explicitly excluded from a search engine's database by using a meta tag specific to robots. When a search engine visits a site, the robots.txt located in the root directory is the first file crawled. The robots.txt file is then parsed, and will instruct the robot as to which pages are not to be crawled. As a search engine crawler may keep a cached copy of this file, it may on occasion crawl pages a webmaster does not wish crawled. Pages typically prevented from being crawled include login specific pages such as shopping carts and user-specific content such as search results from internal searches. In March 2007, Google warned webmasters that they should prevent indexing of internal search results because those pages are considered search spam.[28]

Search engines may penalize sites they discover using black hat methods, either by reducing their rankings or eliminating their listings from their databases altogether. Such penalties can be applied either automatically by the search engines' algorithms, or by a manual site review. Infamous examples are the February 2006 Google removal of both BMW Germany and Ricoh Germany for use of deceptive practices.[33] and the April 2006 removal of the PPC Agency BigMouthMedia.[34] All three companies, however, quickly apologized, fixed the offending pages, and were restored to Google's list.[35]

As a marketing strategy

Eye tracking studies have shown that searchers scan a search results page from top to bottom and left to right (for left to right languages), looking for a relevant result. Placement at or near the top of the rankings therefore increases the number of searchers who will visit a site.[36] However, more search engine referrals does not guarantee more sales. SEO is not necessarily an appropriate strategy for every website, and other Internet marketing strategies can be much more effective, depending on the site operator's goals.[37] A successful Internet marketing campaign may drive organic traffic to web pages, but it also may involve the use of paid advertising on search engines and other pages, building high quality web pages to engage and persuade, addressing technical issues that may keep search engines from crawling and indexing those sites, setting up analytics programs to enable site owners to measure their successes, and improving a site's conversion rate.[38]

SEO may generate a return on investment. However, search engines are not paid for organic search traffic, their algorithms change, and there are no guarantees of continued referrals. Due to this lack of guarantees and certainty, a business that relies heavily on search engine traffic can suffer major losses if the search engines stop sending visitors.[39] It is considered wise business practice for website operators to liberate themselves from dependence on search engine traffic.[40] A top-ranked SEO blog Seomoz.org[41] has reported, "Search marketers, in a twist of irony, receive a very small share of their traffic from search engines." Instead, their main sources of traffic are links from other websites.

1 Million Free Hits

Jumat, 13 November 2009

Preventing crawling

Tidak ada komentar:

Posting Komentar

Komentar Anda

Mengenai Saya

Pengikut

Arsip Blog