Avoid SEO Penalties by Taking These 3 Steps to Control Duplicate Content
While there are certainly myths exaggerating the dangers of duplicate content, it is still possible to lose rankings and search engine traffic over the issue. You could be losing out on those leads you’ve worked so hard for, even as we speak.
It’s time to handle the situation proactively. Fortunately, doing so is relatively simple. We’ll help you minimize the risks that come from turning a blind eye to duplicate content.
Today, you’ll learn to tackle the most common instances of problematic duplicate content in three simple steps. But first, let’s explore what duplicate content is exactly, and reveal when you should worry about it.
What is Duplicate Content?
Duplicate content is content that exists at multiple URLs across the web. It can be within your own site, or across many domains. Google cares only about long-form content (such as a blog post), not small blurbs or teaser paragraphs.
Most duplicate content is unintentional and probably exists on your own site. This is bad, because it creates conflicting data for search engines. How can they know which page to feature over another? Even without direct penalization, the SEO impact is watered down for every extra page that has the same content.
Thankfully, when your content winds up on other websites, Google generally does a good job of sniffing out the original source. However, that doesn’t always stop other people from finding and linking to those copies instead of your original.
Not all hope is lost. By taking a few simple steps, you can go a long way to future-proofing your site for these situations.
Step 1: Purge Your Website of Duplicate Content
Let’s start by identifying and removing the duplicate content on your own site. This lowers the likelihood that Google is diluting your page rank for keywords across several pages. Once this is fixed, all the power should go to the original, intended post.
Here are the most common scenarios to check for.
Did you know that Google considers domain.com and www.domain.com to be separate websites? Thankfully, WordPress takes care of this for you by default.
However, if you are using an SSL certificate, you also have two additional versions to worry about: http://domain.com and https://domain.com.
Try it out! Can you access both versions of your domain in the browser? If so, there’s some work to be done. What you want is for every one of these versions of your domain to redirect to the single version you prefer. You can handle these redirects right in WordPress.
The first and simplest step is to update your site URL within WordPress. You can do this under Settings > General.
Now, there still may be a few leftover technical gaps from when you first installed the site. To make sure the redirect is working in every situation, use a plugin like Easy HTTPS Redirection.
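If you would rather check all four versions in one go than test each by hand in the browser, the sketch below shows one way to do it. It is only an illustration using Python's standard library: the function names are made up for this example, domain.com stands in for your own domain, and the preferred URL is whichever version you chose under Settings > General.

```python
from typing import Callable, Dict, List
from urllib.request import urlopen


def domain_variants(host: str) -> List[str]:
    """Return the http/https and www/non-www versions of a host name."""
    bare = host[4:] if host.startswith("www.") else host
    return [
        f"{scheme}://{prefix}{bare}/"
        for scheme in ("http", "https")
        for prefix in ("", "www.")
    ]


def final_url(url: str) -> str:
    """Follow any redirects and return the URL the request landed on."""
    with urlopen(url, timeout=10) as response:
        return response.geturl()


def find_stray_variants(
    host: str,
    preferred: str,
    resolve: Callable[[str], str] = final_url,
) -> Dict[str, str]:
    """Map each version that does NOT end up at `preferred` to where it landed."""
    stray = {}
    for url in domain_variants(host):
        landed = resolve(url)
        # Ignore a trailing slash when comparing the two addresses.
        if landed.rstrip("/") != preferred.rstrip("/"):
            stray[url] = landed
    return stray
```

Calling find_stray_variants("domain.com", "https://www.domain.com") should return an empty dictionary once every version redirects correctly. Note that a version that cannot be reached at all (for example, a certificate error) raises an exception in this sketch rather than being reported as a stray.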
No Canonical Links
‘Canonical’ means there is an original, preferred version of something. Canonical links exist to show Google where credit is due, directing SEO juice to the correct source.
By setting a canonical link, you can help avoid any accidental problems that might arise. For example, if your redirects aren’t working the way you expected, Google still knows where to look instead.
Yoast SEO includes these by default, and makes it easy to add any custom canonical links you may need. We recommend it wholeheartedly.
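In a page's source, a canonical link is a link element with rel="canonical" and an href pointing at the preferred URL. To confirm that a page actually outputs one, you could run its HTML through a small check like the sketch below. The class and function names are hypothetical, and the example only uses Python's built-in HTML parser.

```python
from html.parser import HTMLParser
from typing import Optional


class CanonicalFinder(HTMLParser):
    """Remember the href of the first <link rel="canonical"> tag seen."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attributes = dict(attrs)
        # rel can hold several space-separated values, so split before checking.
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attributes.get("href")


def canonical_url(html: str) -> Optional[str]:
    """Return the canonical URL declared in the HTML, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical
```

If canonical_url returns None for one of your posts, or a URL that is not the version you want ranked, that page is worth a closer look in your SEO plugin's settings.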
Improper Archive Handling
Allowing a complete blog post to display in your archives is one of the most common causes of duplicate content. By doing this, your whole site is competing against itself on every category, tag, and date archive. Believe us, that can add up fast.
You can avoid this conundrum in two different ways:
- Always use the excerpt throughout the archives, instead of letting the whole post content display. Start by using the more tag. For old posts, you can enforce this by using the Auto Excerpt Everywhere plugin.
- Don’t allow Google to index archive lists in the first place. This is called a ‘noindex’ page. Once more, Yoast SEO makes it easy to manage noindex pages!
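For the second option, a noindex page typically carries a robots meta tag whose content includes "noindex". The sketch below is one way to verify that an archive page's HTML contains it. The names are invented for this example, and it does not look at HTTP headers such as X-Robots-Tag, which can also carry the same instruction.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collect the directives listed in any <meta name="robots"> tag."""

    def __init__(self) -> None:
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() != "robots":
            return
        content = attributes.get("content") or ""
        # Directives are comma-separated, e.g. "noindex, follow".
        for part in content.split(","):
            if part.strip():
                self.directives.add(part.strip().lower())


def is_noindexed(html: str) -> bool:
    """Return True if the page's robots meta tag asks not to be indexed."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return "noindex" in finder.directives
```

Running is_noindexed on the HTML of a category, tag, or date archive should return True once the setting is in place, and False on the individual posts you do want indexed.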
Step 2: Customize Your RSS Feed to Attribute Your Content
Many spam sites rely on scraping RSS feeds to pull in content as if it were theirs.
Google can usually tell whether your page came first, but you can trick these sites into helping out—just add a link to the original source in your RSS feed output, so it’s always credited by the scrapers! Even if Google already knows, this helps you if anyone else accidentally finds the scraper’s page instead of yours.
Yoast SEO enables you to do this by adding content before or after the post in the RSS feed.
Log into WordPress, then navigate to SEO > Advanced > RSS. From here, you’ll see two text boxes to manage your RSS feed content: one for content placed before each post in the feed, and one for content placed after it.
Here are two example snippets you could use before and after each post, respectively:
- “Having trouble viewing the text? You can always read the original article here: %%POSTLINK%%”
- “We love your comments! Leave one at the post here: %%POSTLINK%%”
Not only does this help with duplicate content, but it optimizes your RSS feed with a call to action for regular subscribers.
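To picture what a scraper ends up republishing, it can help to see the placeholder filled in. The sketch below is not Yoast SEO's code. It simply assumes that %%POSTLINK%% is swapped for an HTML link to the post with the post title as its text, and the function name and sample values are made up for illustration.

```python
from html import escape


def expand_rss_template(template: str, post_url: str, post_title: str) -> str:
    """Replace the %%POSTLINK%% placeholder with a link back to the post."""
    # Escape both values so odd characters in a title cannot break the markup.
    link = f'<a href="{escape(post_url, quote=True)}">{escape(post_title)}</a>'
    return template.replace("%%POSTLINK%%", link)


snippet = expand_rss_template(
    "You can always read the original article here: %%POSTLINK%%",
    "https://domain.com/my-post/",  # hypothetical post URL
    "My Post",                      # hypothetical post title
)
```

Because the link is part of the feed item itself, any site that copies the feed wholesale copies the attribution along with it.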
Step 3: Report External Duplicate Content
According to Google’s support pages, if a site is hosting your content without your permission, you can submit a copyright claim. They don’t take these lightly, so make sure that your report is worthwhile!
The truth is, there are many pieces of duplicate content out there, and most of it won’t hurt your SEO. But if you’ve found a site that seems to be benefiting from your content, it may be time to act. Start by asking the site’s administrators to take the offending content down. If they refuse, then there’s cause to go over their heads.
A complaint is worth submitting when both of the following are true:
- A site is hosting your content in its entirety—i.e. not just a blurb or summary of your posts, but complete articles as if they were theirs.
- That site is ranking in Google for that page and relevant keywords.
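For the first criterion, a rough way to tell a full copy from a short excerpt is to measure how much of your post's wording reappears on the other page. The sketch below compares overlapping five-word sequences. This is a simple heuristic chosen for illustration, not anything Google publishes, and the function names and the sequence length of five are assumptions.

```python
import re
from typing import Set, Tuple


def shingles(text: str, size: int = 5) -> Set[Tuple[str, ...]]:
    """Break text into the set of overlapping word sequences of length `size`."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def copied_fraction(original: str, suspect: str, size: int = 5) -> float:
    """Return the share of the original's word sequences found in the suspect text."""
    original_shingles = shingles(original, size)
    if not original_shingles:
        # The original is shorter than one sequence, so nothing can be measured.
        return 0.0
    shared = original_shingles & shingles(suspect, size)
    return len(shared) / len(original_shingles)
```

A result close to 1.0 suggests the other page reproduces nearly the whole post, while a low value points to a blurb or summary. Where you draw the line is a judgment call, so treat the number as a prompt to look more closely rather than as proof.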
Dealing with duplicate content requires only a few proactive measures. Once you’ve taken the appropriate precautions, you need only keep an eye out for flagrant copyright violations creeping up in Google’s search results.
Follow these steps:
- Check your own site for any potential duplicate content issues
- Send your RSS feed out with a permalink to the original post
- Report severe cases of duplicate content directly to Google
Have you run into duplicate content issues yourself? Let us know your worst duplicate content struggles in the comments section below, so we can help you to avoid them in the future!