Duplicate Content
Creating duplicate content, mirrors, or redirects may be one of the
worst things you can do if you want to succeed in the search engines.
When search engines were first getting popular, you could simply point
10 domain names to the same Web site and they would all stack up on the
same page of results for the same keywords. In other words, if you
ranked well for one phrase, all 10 of those sites would do the same.
This was a burden to the search engines, so now they use very
sophisticated algorithms to filter out duplicate content.
They examine all aspects of site structure, image names, and
matching text. When too many of these areas match another web site
it triggers a red flag, and the site is penalized. Beware of mirror
sites, affiliate sites, or any other "cookie-cutter" web marketing
service that promises big profits with little effort. Today's
engines will remove or reject duplicate content, so this usually
leads to failure.
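The engines do not publish how their duplicate filters work, so the
following is only a rough sketch of how "matching text" between two
pages could be measured. It assumes a simple word-shingle comparison
(Jaccard similarity); the function names and the shingle size of three
words are illustrative choices, not anything the engines document.

```python
import re


def shingles(text, size=3):
    """Return the set of overlapping word sequences ("shingles") in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Text shorter than one shingle: treat the whole text as one unit.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(text_a, text_b, size=3):
    """Jaccard similarity of the two shingle sets, from 0.0 to 1.0."""
    a, b = shingles(text_a, size), shingles(text_b, size)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    # Hypothetical page texts that share most of their wording.
    page_one = "We sell quality car parts at low prices with fast shipping"
    page_two = "We sell quality car parts at low prices with free returns"
    print(f"similarity: {similarity(page_one, page_two):.2f}")
```

A score near 1.0 means the two texts are almost identical, and a score
near 0.0 means they share little wording. Running a check like this on
your own pages before publishing can show how close two versions are.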
If you want to survive on the web, make sure your site has original
and unique content.
The safest way to get top search engine placement is to produce real
content.
Following are three examples of the use of duplicate content:
Mirrored Sites: A mirrored site occurs when the web site sits
in one folder on a particular server and has one IP address, but has
two different domains pointing to the same folder. So, when you type in
the two different URLs, they both bring up the same site. This is a
horrible scenario for search engines. Some see this as an attempt to
trick them, rule it duplicate content, and penalize the site.
They can choose to index only one of the domains or to list neither
of them.
Cloned Sites: Sites that are cloned sit in two different
folders and have two different IP addresses and two different domain
names. This scenario is generally OK with the search engines as long
as at least 40% of each site's content is different from the other.
Again, engines will not stand for duplicate content and will
penalize the sites if they are not different. The database of
products can stay the same as long as the total look and feel of
each site is different.
Redirects: Pointing multiple domains to one site seems to be very
popular these days. This is done by registering multiple domains and
redirecting them to one main domain name. Again, this can cause
problems because the engines may think you are trying to trick them by
taking them somewhere else. They treat this just like a clone,
consider it duplicate content, and can penalize the site. If you must
have multiple domains, use a 301 (permanent) redirect on all secondary
domain names pointing to your main URL.
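As a minimal sketch of what a 301 redirect from secondary domains looks
like, the handler below uses Python's standard http.server module. The
main host name www.example.com, the port, and the helper names are
assumptions made for illustration; on a real site the same rule is
normally set in the web server or hosting control panel.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

MAIN_HOST = "www.example.com"  # hypothetical main domain


def redirect_target(host, path, main_host=MAIN_HOST):
    """Return the main-domain URL for a request that arrived on a
    secondary domain, or None if it already arrived on the main domain."""
    hostname = (host or "").split(":")[0].lower()  # drop any :port suffix
    if hostname == main_host:
        return None
    return f"http://{main_host}{path}"


class RedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        target = redirect_target(self.headers.get("Host"), self.path)
        if target is None:
            # Request is already on the main domain: serve the page.
            self.send_response(200)
            self.send_header("Content-Type", "text/plain; charset=utf-8")
            self.end_headers()
            self.wfile.write(b"Main site content\n")
        else:
            # 301 means "Moved Permanently"; Location names the main URL.
            self.send_response(301)
            self.send_header("Location", target)
            self.end_headers()


def run(port=8000):
    """Start the sketch server on localhost (blocks until interrupted)."""
    HTTPServer(("127.0.0.1", port), RedirectHandler).serve_forever()
```

Calling run() starts the server locally. A request whose Host header is
anything other than the main domain is answered with a 301 and a
Location header, keeping the original path.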
Doorways
Doorway pages are an essential aspect of effective search engine
optimization. In an effort to improve rankings, however, some
marketers have spammed the search engines with doorway pages,
generating multiple pages with little information, which has made
doorway pages a topic of much controversy. Search engines have
responded to this practice and are now much stricter in their rules
and requirements. Filters have been created to block the "spammer's
rendition" of doorway pages.
A doorway page, or gateway page, is an alternate entryway to a web
site created in the interest of obtaining a top ranking on
particular keyword phrases in a major search engine. Doorway pages
are often hosted at a different location than the original site. In
other words, a new domain name is registered (usually one that
includes keyword phrases) and the doorway page is created on that
domain name, with links to a destination page on another web site.
Typically, these pages match the look and feel of the original site.
Avoid registering a large number of domains with this tactic because
it could be considered spam by the search engines and could get your
site penalized.
Frames
Frames present some possibilities from a web site design
standpoint, but should be avoided if at all possible when it comes
to search engine optimization and getting your site listed in the
search engines. Many spider-based engines cannot crawl through them,
and specific coding is necessary to make them readable by the
engines. This coding is viewed as spam by most of the search
engines. Spiders want to be able to read and view everything that
the visitor of the site can. If your site does use frames, make sure
that you take advantage of the No Frames content area, the part of
the page that does not use frames. It's a powerful section of the
site, and if used properly it can result in some excellent rankings.
Nevertheless, frames do pose problems, and many spiders can't read
them. Despite the limitations frames pose, many frameset 'issues' can
be turned into frameset 'positives.'
If you are going to use frames for search engine optimization, make
sure that you use them wisely. You can still create a pleasing
interface on a two-frame set by specifying the dimensions of your
top or bottom frame as 5 to 8 pixels or 5 to 8%. That should help
you avoid the spam filters.
Cloaking
Cloaking, aka spoofing, is a method of page delivery where different
pages are served from the same address depending on whether the
visitor is a human or a spider. In other words, browsers such as
Internet Explorer are served one page, and spiders visiting the same
address are served a different page, usually an optimized page. There
are two methods of delivering cloaked pages: IP address and agent name.
There are two reasons people use cloaking techniques:
1. Since viewers never see the page that is viewable to the spider,
the code can't be stolen. In highly competitive markets, the ability
to conceal code from the competition can be an extremely powerful
advantage.
2. Since a human visitor never sees the page that is served to the
spider, the spider page does not have to be pleasing. As a result, it
can be optimized with every trick in the book.
By using cloaking, nobody sees the page except for the spider. That
gives cloaked pages an extremely powerful advantage over web pages
that were optimized to accommodate a professionally appealing
design.
But, cloaking may be one of the most frowned upon techniques among
all engines.
Filters will pick up pages like the following in no time:
Agent Name:
Delivering a specific page based on agent name is a rather simple
but risky task. You simply use code that sends the visitor to one
page and the spider to another. While very effective, agent delivery
is not a foolproof way to hide your code. Someone can easily use an
agent-faking program to report his or her agent name as that of a
spider when visiting your page. They will then see exactly what is
being served to the spider.
Some browsers offer the user a choice of User Agent variables to
submit to any web site they visit. Someone might spoof a search
engine spider's User Agent variable to detect whether you are using
cloaked pages. Whatever the case, any time you use cloaking you take
a very good chance of being labeled a spammer.
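The User Agent check described above can be sketched in a few lines.
This is only an illustration of how easily agent-name delivery is
exposed: the two agent strings and the function names are hypothetical,
and a difference between the two responses is a hint, not proof, since
pages with rotating ads or timestamps also differ between requests.

```python
import urllib.request

# Hypothetical agent names used only for this illustration.
BROWSER_AGENT = "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"
SPIDER_AGENT = "ExampleSpider/1.0"


def fetch(url, agent_name, timeout=10):
    """Download url while reporting agent_name as the User-Agent."""
    request = urllib.request.Request(url, headers={"User-Agent": agent_name})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


def serves_different_pages(url, fetcher=fetch):
    """Return True when the browser agent and the spider agent
    receive different page bodies from the same address."""
    as_browser = fetcher(url, BROWSER_AGENT)
    as_spider = fetcher(url, SPIDER_AGENT)
    return as_browser != as_spider
```

The fetcher parameter lets you pass in a stand-in function, so the
comparison can be tried without making network requests.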
IP Address Delivery:
An Internet Protocol (IP) address is a numeric address that identifies
your connection and presence on the Internet.
In addition to sites having IP addresses, so do the spiders.
Since you can 'sniff' for the IP address when someone visits your
site, you can use this data to push specific pages to the spiders.
This method is more complicated than Agent Name Delivery because it
requires you to maintain a never-ending list of IP addresses, and
that list changes all the time as new addresses are added.
The advantage of IP Address Delivery is that a visitor can't easily
mimic a spider's IP address, making it very difficult for anyone else
to see the code that is presented to the spider.
IP cloaking is abusive because it attempts to manipulate a search
engine's index. Since IP cloaking is deceptive, search engines
routinely purge IP-cloaked pages and in some cases ban them
permanently.
Link Farms
Since so many engines use link popularity as an integral part of
their ranking algorithms, many webmasters responded by joining link
farms and stuffing their sites and others with as many links as
possible. But not all links are good links. In fact, bad linking
strategies may get you banned from some engines.
A link farm is a network of web pages that are heavily
cross-linked with each other for the sole purpose of increasing link
popularity. The web pages usually are in more than one domain or on
more than one server. When a web site joins a link farm, it gets a
link from each of these pages and in turn has to link back
to each of those pages. This will then affect the link popularity of
the site. But search engines can detect the link farms as
well as the web sites participating in them. Google, for
one, disapproves of link farms and labels the links they generate as
spam. In fact, some sites get removed from the index altogether if
they are affiliated with link farms or link stuffing.
Some webmasters have chosen to remove all links going out to other
sites. That is an overreaction that decreases the site's value to
visitors and hurts the Web in general because cross-linking is a
basic tenet of the Internet. Links are fine - even encouraged - if
they are related to your topic, but link farms rarely provide useful
content to visitors. If your site is selling cars, linking to car
parts sites, car forums, and other car-related sites is very safe and
encouraged.
You are only providing access to other sites that are of interest to
your visitors. But, if you signed up with a service that promises to
generate five hundred inbound links to your site only if you agree
to add two hundred outbound links in return, then you are likely
participating in a link farm.
Instead of linking to related information of value to your visitors,
you are sending them to sites with non-relevant and useless
information. Search engines will not penalize you for good, relevant
links, but are quick to punish sites that try to spam them with
unrelated links.
Spider Design Blocks
Despite your best efforts to make your site look unique and
attractive, some of the web's most prized design technology can be a
major stumbling block for a search engine spider. Following are
design attributes that block spiders:
Flash Sites (or Flash introductions) - while beautiful, can't be
read by a spider. Your solution options are to use an entrance page
that is rich in keyword text phrases, create a two-frame frameset
where one frame is only one pixel high and use the No Frames area,
or alternate the use of Flash and static HTML.
Frames - despite the unique design and product capabilities
they present, can be a major problem for search engine spiders. Many
spiders can't read them. The solution is to utilize your No Frames
content to optimize your page or stay away from them altogether.
Image Maps - can pose a problem with some engines. If you plan to
use an image map, make sure there are other links on the page
(perhaps at the bottom) that link to your other pages.
Password Protected Pages - are pages you probably do not want
in the engines anyway. Just be aware that, like a human, the
spider can't enter any area that is protected by a password.
PDF Files - usually viewed with Adobe Acrobat Reader, present
a major stumbling block to most spiders. Some engines are beginning
to index these kinds of pages, but from an SEO perspective this is
one format that you want to avoid.
Search Engine Spamming
Search engine spamming is the use of unethical techniques for
improving the position of a Web site in a search engine. Some Web
site owners use these techniques to try to fool the search engines
into ranking them higher.
Each search engine's objective is to produce the most relevant
results for its visitors. How well an engine produces relevant
results for any particular search query largely determines how
popular it becomes. Every search engine measures relevancy according
to its own algorithm, thereby producing a different set of results.
Search engine spam occurs when anybody tries to artificially
influence a search engine's basis for calculating relevancy.
The following techniques can be considered spamming:
- Code swapping ("bait & switch")
This means optimizing a page for high search engine position,
and then swapping another page in its place once a top rank is
achieved. This technique will not lead to a long-lasting search
engine placement because filters have been implemented across
the board to detect it.
- Content Spam
This technique makes a particular part of a page's content visible
only to the search engines. Some commonly used content spam
techniques are as follows:
Invisible text - Hiding keywords within the background by
using exact or similar font colors is one of the most common
search engine spam techniques to date. This can be done by using
tables or a background with a color different from the
real background of the site.
Keyword stuffing - Another very popular search engine
spam trick, used along with hidden text, is the repetition of
keywords at the bottom of the page in very small fonts. Since
the text is hidden, keywords are crammed into a section of the
site with the intent of capturing the spider's attention.
- E-mail spamming
E-mail spamming means sending unwanted commercial messages to email
addresses from unknown sources. These messages can
include, but are not limited to, chain emails, get-rich scheme
messages, and messages that contain adult-related material.
There are various ways of collecting email addresses. The
easiest is to collect them from newsgroups. Newsgroups are a
great source of information, but spammers collect email
addresses out of the articles posted in the newsgroups with the
help of special software. E-mail spamming can be used to
generate links and solicit search engine submission services.
- Meta spam
Meta data can be used inaccurately or incoherently in a web
resource in order to manipulate a search engine's relevancy
algorithms. Following are the common Meta spam techniques:
Unrelated keywords - In order to fool crawlers, it has
become a common technique to use popular keywords that are not
relevant to the site's content. For a time, one may be
able to trick a few people searching for such words into
clicking on the link, but they will soon leave the site when
they receive irrelevant information on the topic they were
originally searching for. This kind of search engine spam upsets
both the search engines and their users.
Hidden tags - The use of keywords in hidden HTML tags
is, for the most part, considered spam by most search engines
and will warrant penalization. These tags can include, but are
not limited to: title, meta description, http-equivalent,
comment, style, hidden value, font, alt, author, option, and no
frame.
- No content
If sites do not contain any unique and relevant content to offer
visitors, search engines can consider this spam. On that note,
illegal content, duplicate content, and sites consisting largely of
affiliate links are also considered to be of low value to search
engine relevancy.
- Over submitting
Each search engine has its own regulations on how many pages can
be submitted at a time and how often they can be submitted.
Submitting the same page more than once a month to the same
search engine, or submitting too many pages each day, is not
allowed.
Search Engine Optimization
Search engines strive to provide the most relevant results to
their users, but spam swamps their indexes with irrelevant and
misleading information. It is advisable to make no mistakes and
stay clear of anything that could be seen as spam by the
engines. Instead, focus on an ethical approach to SEO. Search
engines will react to the spam techniques when they become a big
enough issue and they are affecting searchers. Banning has
definitely been known to happen.
The following list will give you an idea of the basic "DO NOTs"
for the search engines:
- Do not use text that is the same color as the background, or only
slightly different, to 'hide' keywords.
- Do not repeat the keywords in the Meta tags (use them only once),
and do not use keywords that are unrelated to the site's content.
- Do not create a title like "web hosting, web hosting, web
hosting." This is considered spam.
- Do not repeat the keyword to increase its frequency on a page
(keyword stuffing). Search engines now have the ability to detect
this: they can spider a page and determine whether the frequency is
above a "normal" level in proportion to the rest of the words in the
document. This proportion is also known as keyword density.
- Do not optimize a page for top ranking, and then swap another page
in its place once a top ranking is achieved.
- Do not put misleading words on the page in the hopes of attracting
visitors looking for another topic.
- Do not submit a page to the search engines that, once loaded,
automatically redirects to a page of different information.
- Do not create a page that prohibits the user from using the
browser's back button to return to the search engine results.
- Do not create "doorway pages."
- Do not submit the same page more than once on the same day to the
same search engine.
- Do not put multiple instances of the Title Tag in the HTML code.
- Do not put pages of content in layers and position them off
screen, or practice the same kind of behavior by turning the
visibility of the layers off.
- Do not use small or 'invisible text' in the page.
- Do not send queries to a search engine with an automated 'rank
reporting tool' hundreds of times per day.
- Do not purchase multiple domains and put duplicate copies of the
web site on each domain.
- Do not participate in Link Farm programs.
- Do not submit different versions of the web site in the hope of
getting multiple listings.
- Do not submit more than the allowed number of pages per engine per
day or week. Each engine has a limit on how many pages one can
manually submit to it using its online forms.
- Do not cloak.
- Do not support affiliate sites with the same or similar content
but different site designs.
- Do not create a page that is stuffed with keyword content so far
down the page that it is unlikely anyone will ever scroll down that
far.
- Do not create a plain page specifically designed to rank highly,
and then, once it is indexed, upload a different page to your server.
- Do not put hundreds of 1x1 transparent GIFs on your page and
assign them all the same ALT text. This is rather easy to detect.
- Do not use CSS to set the text size of a particular tag to 0% and
then fill your page with 'invisible text.'
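The keyword density mentioned in the list above is simple to compute
for your own pages. The sketch below assumes one common definition,
the share of a page's words taken up by the keyword phrase; the
function name is illustrative, and what counts as a "normal" level is
not published by the engines, so no threshold is built in.

```python
import re


def keyword_density(text, phrase):
    """Share of the words in text taken up by non-overlapping
    occurrences of phrase, from 0.0 to 1.0."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    target = re.findall(r"[a-z0-9']+", phrase.lower())
    if not words or not target:
        return 0.0
    n = len(target)
    hits, i = 0, 0
    while i <= len(words) - n:
        if words[i:i + n] == target:
            hits += 1
            i += n  # skip past the match so occurrences never overlap
        else:
            i += 1
    return hits * n / len(words)


if __name__ == "__main__":
    # The spammy title from the list above versus an ordinary sentence.
    stuffed = "web hosting, web hosting, web hosting"
    normal = "We offer reliable web hosting for small business sites"
    print(keyword_density(stuffed, "web hosting"))
    print(keyword_density(normal, "web hosting"))
```

Comparing the figure for a suspect page against the figure for
ordinary, readable copy on the same topic gives a rough sense of
whether the keyword is being repeated unnaturally.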
As you can see, there are a lot of ways to fool the search engines,
but just about all of them are detectable - and that makes them very
dangerous.
If you are serious about custom delivery to the engines, there
is really only one way to go - and that is with professional
search engine optimization.
We offer an SEO contract for your site for just $150 per month, or
you can save by signing up for a year for just $1500.