Google’s Penguin update, and the unnatural link warnings they’ve been sending out through Webmaster Tools, show that they’re now looking to penalise suspicious and paid links instead of just devaluing them.
But the thing that really interests me is how Google determines which links are paid and which aren’t. If you’re an SEO, it’s usually pretty obvious when a link is unnatural or paid for – but it’s not as simple for a machine to detect.
There’s been some speculation as to what kinds of signals Google is looking at – sites that have received the link warnings apparently tend to have a lot of sitewide links, most likely in footers and sidebars – and they also often have a very high ratio of keyword to brand anchor text.
I’m not convinced that the ratio of anchor text used is enough to flag links as suspicious, at least not on its own. If the site or page doesn’t have that many links, then it’s a small sample size that could be easily skewed, and could lead to a lot of false positives. Another issue is that exact match domains would effectively get a free pass, because their keyword anchor text is also their brand name (although, that might still be true).
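To illustrate the small sample problem, here is a rough sketch in Python. Nothing here is known to be what Google does: the function names, the substring-based brand matching and the choice of a Wilson score interval are all my own assumptions, used only to show how the same ratio can mean very different things at different sample sizes.

```python
import math


def keyword_anchor_ratio(anchors, brand_terms):
    """Count anchors that contain no brand term (treated here as keyword anchors).

    Hypothetical helper: real anchor classification would be far messier.
    Returns (keyword_anchor_count, total_anchor_count).
    """
    brand_terms = [t.lower() for t in brand_terms]
    keyword = sum(
        1 for a in anchors
        if not any(t in a.lower() for t in brand_terms)
    )
    return keyword, len(anchors)


def wilson_lower_bound(successes, total, z=1.96):
    """Lower end of the Wilson score interval for a proportion.

    With few links the bound drops sharply, so a high raw ratio
    on a small sample is not treated as strong evidence.
    """
    if total == 0:
        return 0.0
    p = successes / total
    denom = 1 + z * z / total
    centre = p + z * z / (2 * total)
    margin = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total))
    return (centre - margin) / denom


small = wilson_lower_bound(4, 5)     # 80% keyword anchors, 5 links
large = wilson_lower_bound(80, 100)  # 80% keyword anchors, 100 links
```

Both sites have 80% keyword anchors, but the lower bound is about 0.38 for the five-link site and about 0.71 for the hundred-link site, so a filter that only trusts the lower bound would leave the small site alone. Note that an exact match domain passed its own name as a brand term would score zero keyword anchors, which is the free pass mentioned above.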
Time as a signal
I have a theory – and please note, this hasn’t been proven – that Google is looking at another signal to work out which links are suspicious. One of the big differences between paid links and natural links is when they’re placed. The majority of paid links are added to pages retroactively – e.g. a website has a page that mentions car insurance, and a company might then approach them and offer to pay them on a monthly basis to change that text to a link.
I believe that if Google has crawled a page, and then at a later date recrawls that page and discovers a new link – with hardly any extra content added – that link is now flagged as suspicious. They might devalue it, they might send out a Webmaster Tools message or they might do both – but that link could well be flagged. The exception to this is if the page they’re crawling is the homepage, and potentially category pages, where content might change frequently.
If a reasonable chunk of text is also added at the same time as the link, then it potentially wouldn’t be flagged (so genuine updates to news articles wouldn’t accidentally flag that link).
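The theory above can be sketched as a comparison between two crawls of the same page. This is purely illustrative: the `Snapshot` type, the `suspicious_new_links` function, the 50-word threshold and the `volatile_page` flag are all assumptions of mine, not anything Google has confirmed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    """What a crawler might store for one visit to a page (hypothetical)."""
    text: str          # visible text of the page at crawl time
    links: frozenset   # outbound link URLs found at crawl time


def suspicious_new_links(old, new, min_new_words=50, volatile_page=False):
    """Return links that appeared between two crawls without much new content.

    min_new_words is an arbitrary guess at "a reasonable chunk of text".
    volatile_page covers homepages and category pages, where content
    changes often and the signal would be too noisy to use.
    """
    if volatile_page:
        return set()
    added = new.links - old.links
    if not added:
        return set()
    words_added = len(new.text.split()) - len(old.text.split())
    if words_added >= min_new_words:
        # Looks like a genuine update (e.g. a news article being extended).
        return set()
    return set(added)


first_crawl = Snapshot("Our guide to cheap car insurance for new drivers", frozenset())
recrawl = Snapshot(first_crawl.text, frozenset({"https://insurer.example/"}))
flagged = suspicious_new_links(first_crawl, recrawl)
```

Here the text is unchanged between crawls but a link has appeared, so the link is flagged. If the recrawl had also added a few paragraphs of text, or the page were marked as volatile, nothing would be returned.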
How Google could deal with sitewide links
Other times, a paid link might be added to a sidebar in the form of a banner ad, or in a blogroll link, or as a link in the footer. These are, in 99% of cases, now sitewide links. They’d potentially trip the same filter as above, because those links would appear on pages that Google has already crawled, but there’d also be a higher percentage of false positives here (i.e. good links being flagged as bad) as bloggers often link to sites they genuinely endorse in blogrolls too.
If I were Google, I’d treat those links differently to deal with the increase in false positives. Unless I were confident that the link was classified correctly as either paid or natural, I’d consider silently devaluing that link and not sending out a link warning. After a time limit (maybe 6 months, maybe a year), I’d allow that link to start flowing PageRank. If you’re buying links, you don’t want to pay for them and not have them work for months – you’re more likely to notice that the links you’re building aren’t working and stop renewing them. If it’s a genuine editorial link in a blogroll, then it’s more likely that that site can wait a while before getting the link value – because that link is mainly serving to pass them useful traffic.
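As a rough sketch, that policy for sitewide links might look like the following. Everything here is hypothetical: the `paid_confidence` score, the 0.9 threshold, the 180-day probation and the decision to warn on confidently paid links are my own assumptions about how the idea could be expressed, not a description of a real system.

```python
from datetime import date, timedelta
from enum import Enum


class Action(Enum):
    PASS_VALUE = "pass_value"
    DEVALUE_SILENTLY = "devalue_silently"
    DEVALUE_AND_WARN = "devalue_and_warn"


def sitewide_link_action(first_seen, today, paid_confidence,
                         probation=timedelta(days=180), confident=0.9):
    """Decide what to do with a sitewide link that appeared on crawled pages.

    paid_confidence is an assumed score between 0 (surely natural)
    and 1 (surely paid) from some other classifier.
    """
    if paid_confidence >= confident:
        # Confidently paid: assumed here to be devalued with a warning.
        return Action.DEVALUE_AND_WARN
    if paid_confidence <= 1 - confident:
        # Confidently natural: no reason to hold the link back.
        return Action.PASS_VALUE
    if today - first_seen < probation:
        # Unsure: hold back link value quietly, with no warning sent.
        return Action.DEVALUE_SILENTLY
    # The link outlived the probation period, so let it count.
    return Action.PASS_VALUE


today = date(2012, 6, 1)
new_blogroll_link = sitewide_link_action(date(2012, 4, 1), today, 0.5)
old_blogroll_link = sitewide_link_action(date(2011, 4, 1), today, 0.5)
```

With an uncertain score of 0.5, the two-month-old link is silently devalued while the link that has been in place for over a year passes value. A link buyer paying monthly would have to fund the whole probation period before seeing any benefit.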
“I’d like to get a few paid link reports anyway because I’m excited about trying some ideas here at Google to augment our existing algorithms” – Matt Cutts, 2007
I think this is probably something that Google have been doing for a while, way before the Webmaster Tools warnings were sent out. Matt Cutts has mentioned in the past that they’ve been working on algorithms to automatically detect paid links, and I imagine there are other signals they’re looking at too.
How Google could detect paid links is a post from: Shark SEO.