Google, transparency and our not-so-secret formula

Tuesday, March 2, 2010 | 9:51 AM

Recently the European Commission opened a preliminary inquiry into competition complaints. Part of the complaint alleges that Google operates without sufficient transparency into how and why web sites rank in our search results. The notion that Google isn't transparent is tough for me to swallow. Google has set the standard in how we communicate with web site publishers. Let me tell you about some of the ways we explain to sites how we rank them and why.

One of the most widely-discussed parts of Google's scoring has always been PageRank. That "secret ingredient" is hardly a secret. Here it is. That early paper not only gave the formula for PageRank, but mentioned many of the other signals in Google's ranking, including anchor text, the location of words within documents, the relative proximity of query words in a document, the size and type of fonts used, the raw HTML of each page, and capitalization of words. Google has continued to publish literally hundreds of research papers over the years. Those papers reveal many of the "secret formulas" for how Google works and document essential infrastructure that Google uses. Some of these papers have spurred not only open-source projects but entire companies in their own right.
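
For readers who would like to see the idea in code, here is a minimal sketch of the PageRank formula as stated in the original Brin and Page paper, PR(A) = (1 - d) + d * (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn)), where T1..Tn are the pages linking to A, C(T) is the number of links going out of T, and d is a damping factor (the paper suggests 0.85). The three-page toy web, the function name, and the fixed iteration count are assumptions made for illustration; this is not a description of how Google computes rankings in production.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Repeatedly apply PR(A) = (1 - d) + d * sum(PR(T) / C(T)).

    `links` maps each page to the list of pages it links to.
    This is a toy illustration, not a production ranking system.
    """
    # Collect every page that appears as a source or a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)

    # Start every page with the same rank.
    ranks = {page: 1.0 for page in pages}

    for _ in range(iterations):
        new_ranks = {}
        for page in pages:
            # Each linking page T contributes PR(T) / C(T).
            incoming = sum(
                ranks[src] / len(targets)
                for src, targets in links.items()
                if page in targets
            )
            new_ranks[page] = (1 - damping) + damping * incoming
        ranks = new_ranks
    return ranks


# Hypothetical three-page web: a links to b and c, b links to c, c links to a.
toy_web = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
ranks = pagerank(toy_web)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this toy web, page c is linked from both a and b, so it ends up with the highest score, while b, which receives only half of a's vote, ends up lowest.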

Academic papers are one thing, but Google also aims to engage and educate in many other ways. In 1999, Sergey Brin participated in the first Search Engine Strategies conference for webmasters. In 2001, Google became one of the first search engines to engage online at a publisher forum called WebmasterWorld. One representative (GoogleGuy) has posted over 2800 times, while another (AdWordsAdvisor) has posted almost 5000 times.

Google's efforts at transparency and communication have evolved with the web. We started blogging in May 2004 and have written thousands of posts on our official blog. Google now has over 70 official blogs, including an official webmaster blog specifically to help site owners understand how Google works and help them rank appropriately in our search results. Google publishes more blog posts than almost any other large company. We also provide extensive public documentation on our web site with advice for publishers, in dozens of different languages.

As the head of Google's webspam team (which tries to stop attempts to violate our clearly documented, public webmaster guidelines), I'm often asked questions about how Google works. That's why I started my own personal blog in 2005 and have written hundreds of posts about Google. The topics range from common web site mistakes to advice for new bloggers. I've had the pleasure of speaking to web site owners or doing public web site reviews at over 30 different search conferences. In fact, I'll be answering questions at another search conference this week - along with a dozen or so Google colleagues.

We've tried all sorts of experiments to help site owners understand how Google's search ranking works. We've done multiple live webmaster chats online with hundreds of simultaneous participants. We've experimented with tweeting. We've participated in podcasts. And here's one of my favorite ways we've helped to break out of the black box and give advice to publishers: in the past year, we've taken questions from the public and posted hundreds of video answers on a webmaster video channel. Those videos have been watched over 1.5 million times (!). We also engage online across the blogosphere to answer questions about Google's practices.

The list goes on and on. Google has reached out to other search engines on methods to make life easier for website owners. The resulting standards include specifying preferred web site URL formats as well as Sitemaps, an easy way for webmasters to tell search engines about the pages on their site. Google provides a webmaster forum where both Google employees and helpful outside "superusers" hang out and answer questions about specific sites. We've run in-person website clinics to provide specific one-on-one feedback and advice in locations from San Francisco to India to Russia, as well as virtual site clinics in Spanish. We've even confirmed ranking signals that Google doesn't use in our algorithms, such as the keywords meta tag, which saves site owners from doing needless work and helps avoid frivolous lawsuits.
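
To make the two standards mentioned above a little more concrete, here is a small sketch that builds a Sitemap file in the XML format defined by the sitemaps.org protocol, along with a link element of the kind used to state a preferred URL (presumably the rel="canonical" element is what "preferred web site URL formats" refers to). The example.com URLs and the helper names are hypothetical, chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a minimal Sitemap document listing the given page URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


def canonical_link(preferred_url):
    """Return a <link> element that names the preferred URL for a page."""
    link = ET.Element("link", rel="canonical", href=preferred_url)
    return ET.tostring(link, encoding="unicode")


# Hypothetical site used only for illustration.
sitemap_xml = build_sitemap([
    "https://www.example.com/",
    "https://www.example.com/widgets",
])
canonical_html = canonical_link("https://www.example.com/widgets")
print(sitemap_xml)
print(canonical_html)
```

A real Sitemap can also carry optional fields such as a last-modified date for each URL; they are left out here to keep the sketch short.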

The frustrating thing is that even if all 20,000 employees at Google worked full-time on answering questions from website publishers, we still couldn't talk to every site owner. Why not? Because the web has over 192 million domain names registered. That's why we introduced Google Webmaster Tools, a one-stop location to provide scalable, self-service information and to let webmasters provide us with data. Describing the powerful tools we provide to site owners for free would take an entire other blog post, but a number of the offerings include:
  • Site owners can get recommendations about issues like duplicate meta descriptions or missing title tags.
  • Site owners whose sites we believe have violated our webmaster guidelines, and against which Google has taken corresponding action in our index, can submit a request for reconsideration.
  • Site owners who have been hacked can get details about malware on their site. After they remove the hacked content, they can fetch pages from their site as Googlebot to make sure the malicious content is really gone.
  • Site owners can find out about errors that Google encountered while crawling their site.
A Google employee recently blogged about using these free, public tools to diagnose an issue with his webhost where he had exceeded his bandwidth quota. Millions of webmasters have taken similar advantage of Google's free tools for site owners to get helpful information about their site.
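
As a rough illustration of the first kind of recommendation in the list above, the sketch below scans a handful of pages for missing title tags and for meta descriptions shared by more than one page. It is only a local approximation under stated assumptions: the page contents, the `HeadScanner` class, and the `audit` function are hypothetical, and nothing here reflects how Webmaster Tools is actually implemented.

```python
from collections import defaultdict
from html.parser import HTMLParser


class HeadScanner(HTMLParser):
    """Collect the <title> text and the meta description of one page."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""
        self.description = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self.in_title = True
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.description = attrs.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def audit(pages):
    """Return (urls with no title, descriptions used by more than one url)."""
    missing_titles = []
    by_description = defaultdict(list)
    for url, html in pages.items():
        scanner = HeadScanner()
        scanner.feed(html)
        if not scanner.title.strip():
            missing_titles.append(url)
        if scanner.description:
            by_description[scanner.description.strip()].append(url)
    duplicates = {d: u for d, u in by_description.items() if len(u) > 1}
    return missing_titles, duplicates


# Hypothetical pages used only for illustration.
pages = {
    "/a": '<html><head><title>Widgets</title>'
          '<meta name="description" content="Buy widgets"></head></html>',
    "/b": '<html><head><meta name="description" content="Buy widgets"></head></html>',
    "/c": '<html><head><title>About</title>'
          '<meta name="description" content="About us"></head></html>',
}
missing, dupes = audit(pages)
print("Missing titles:", missing)
print("Duplicate descriptions:", dupes)
```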

At Google, we try to be as open as we can, even to the point of helping users export their data out of Google's products. At the same time, we don't think it's unreasonable for any business to have some trade secrets, not least because we don’t want to help spammers and crackers game our system. If people who are trying to game search rankings knew every single detail about how we rank sites, it would be easier for them to 'spam' our results with pages that are not relevant and are frustrating to users -- including porn and malware sites.

Ultimately, criticizing Google for its "secret formula" is an easy claim to make, but it just isn't true. Google has worked day after day for years to be open, to educate publishers about how we rank sites, and to answer questions from both publishers and our users. So if that's how people choose to define "secret," then ours must be the worst kept secret in the world of search.

Posted by Matt Cutts, Principal Engineer, Search Quality Team

33 comments:

jonraasch said...

But we still can't have the actual algorithm right? :P (Just kidding, and "well put" on several of your points)

admin said...

Can we get any clarification on your 'it doesn't do what you think it does' quote from last summer regarding nofollow, as we both know that it's got very little to do with PageRank sculpting ;-). Point is you're happy to help when it suits you. You're not happy to help when it doesn't - e.g. nofollow - and you have the whole webmaster community on the web running around in a tizz! Personally, if the European Commission forced Google to be more open I'd be pretty much out of a job, so I hope that doesn't happen.

Dave said...

I am particularly interested in the Local Search which changes quite dramatically. That said, there are lots of insightful tips but very few facts. Most of my investigations are done via webmaster tools which I hope will keep up its aggressive improvements.

Sweet Spot Marketing said...

Agreed Matt. Way to stick up for yourself and Google!

The only "competition complaints" I see currently that are legit are business listings in Google Maps. I have seen these become super spammy in the last 6 months. Wish the Google spam report for Google map listings was a little quicker to respond.

I'm currently building my own case study to see how long it takes Google to respond to map spam.

webaficionado said...

I agree that Google makes a good effort to reach out to Webmasters and Google Software/Tools users. More so than many other software products our company subscribes to.

I think short of publishing a white paper/tome on the mechanics of Google's Algorithm/s that determine ranking there is not much more Google could do. Even then it would only be "transparent" to some and a new flow of complainers will emerge.

My only complaint would be lack of information on website penalties and why/when they have been allocated.

Otherwise Keep Up the Good Work

Toby Mason
Rise Digital

Makers of elevator shoes said...

Thank you Matt. I love Google. I trust Google. I'm fighting against a few spammy competitors for my keyword "elevator shoes". I'm sticking to the rules, bro, but they ain't. All the best with your skinhead, looks good mate.

Chris said...

Right-on Matt. This is a great post and tribute to all of the hard work you guys (and gals) do at Google. Every person in a web-related field appreciates your dedication to making search better. Cheers!

Anjar said...

um ... nice article ... maybe I did not notice the really free facilities given the actual google has written everything there.

thank you for reminding

Argentina said...

What I would criticize about google is that in my opinion it is failing to put the better sites on the top.

For example, I would like to see Google ignore sites with ILLEGAL content, illegal torrents, illegal file sharing sites, etc. Those are ranking at the top of Google, while other sites that work hard can hardly rank well.

I would really like to see Google fight spam software and black hat software, but I seriously can't see Google getting hold of them...

Every time I search for a movie, I see all illegal sites in the top 10... I can't understand how those sites manage to rank so well, while others that WORK hard and honestly to rank well can't get more than 10 hits per day from Google... and they also only get 100 hits per day from Google Images hotlinking their images... It's pretty frustrating to work hard for one full year handwriting a site and getting natural backlinks, only to find out that the site gets only 10 hits per day, while illegal sites are at the top with 4000 hits from Google per day... It's very, very frustrating for an honest site owner...

Gus said...

I was also very appreciative of the Google Meet-up in Boston. Not only did "Google" (Adam Lasnik) answer my questions, they bought me a beer!

pSouper said...

Superbly defended - quite why a defense is required I have no idea. Maybe KFC or Coke would like to offer up their secrets too - I think not. What's more, the world is no poorer for it. Is it really the case that if you have a great idea it's yours to keep - but if you have the best idea it's unfair or anti-competitive?

The world is full of inventions that would fall foul in this area and yet time has proven again and again that even the strongest ideas can be deposed and depreciated by the lateral approaches of others.

If this is to be the fate of Google then so be it but don't jump up and down in a pink fit just because Google aren't giving away the golden tickets to a lifetime at the top of the page.

Who do they think would benefit from an extended degree of transparency? The biggest corporates with the biggest budgets to employ the biggest teams - is that so different from today?

It smells a lot more like the work of a fat-cat-backed lobbyist than the real concerns of the everyman.

HALVORBARS said...

It is absolutely hilarious that this has even become an issue at all, much less a lawsuit.

Google's search engine has led the way with transparency. Without this unparalleled transparency, they would not be the leader in nearly every market they operate in. People trust Google's organic search results. Plain and simple.

I've been keeping up to date with all the news surrounding this lawsuit and the company behind it. Foundem has no claim at all. They're an affiliate site. They do not represent any sort of authoritative or even trustworthy vertical search. I attempted to comment on the Search Neutrality's ("initiative") site and my comment has yet to appear. There is one company represented in this initiative (them) and they have a whopping one comment. Doesn't look like their own site is "neutral" enough to allow comments.

fhucho said...

This is only vaguely related to what is discussed in the blog post: in my experience it's almost impossible to communicate with Google. I will give one example - I recently wanted to know when Android Market will expand to more countries (I mean being able to sell/buy apps). There are many questions about this on official Google forums, without any answer from a Google employee. I also asked at Google Developer Day, and was told something like "We're working on it". Why can't someone from Google just give an answer like "For some reason we can't tell you this information" or "Android Market will be expanded to more countries sometime in 2H 2010"?

From my point of view this is quite in contrast with the claim that Google tries to be open.

Bob Shirilla said...

Matt,

I totally agree - The secret stuff isn't very top secret any more. Google wants to deliver the best results for every search.

As a business person, I realize being the best is very challenging. That includes my website.

Thanks for posting this on Twitter - wish you posted all of your activity on Twitter.

Bob Shirilla

mr.g said...

don't forget all the patents too!

Renan said...

Clap-Clap-Clap! Excellent post!

Donovan said...

So when do we find out what the secret sauce formula is Matt?

secretsaucemarketing [@] mac [.] com

Donovan Roddy =oP

Desire Athow said...

The truth is that there is no single secret formula that gets Google ticking. Instead there are a number of very clever people trying to make life simple for everyone else AND at the same time firefighting those very, very bad guys that use search engines for spamming and phishing. Try a search on "download windows 7" and look for those EDU websites to understand what I mean.

Matthew Tillett said...

Great post Matt.

Whatever trade secrets Google has, they need to stay behind closed doors. The instant they are released will be the downfall of Google's accurate search results. The accuracy of Google's search results is the reason why users continue to trust and use Google.

Google has provided vast amounts of information on how to rank and succeed within Google's search results, and you continue to do so - thank you!

The information is there, you just have to be prepared to put in the effort and study. It's not difficult!

The reward is success - if it's just given to you, you won't appreciate your achievement.

Monte Huebsch said...

Great post Matt. Well said and I believe you are very open about most things. I also agree with the comments about the LBC. It is a spammy mess. Can you focus on that for a while?

Sebastian said...

Even without access to Google's secret sauce, avoiding penalties and indexing issues caused by flawed code is a breeze with all the detailed info out there.

Craig said...

I disagree with Argentina. Google shouldn't filter illegal sites. They are there to reflect the information that is available on the web, and if most of it is about torrent sites then the results will have to reflect it. They already have filters for adult-oriented content, but how many people would turn on a filter that hides torrent sites when you could do it yourself with "-torrent" in the search bar? So people don't like your site, tough luck. IMDB doesn't like torrents and it is always the first result when I search for a movie.

JezC said...

It ain't what you say, it's what you do. Examples - the Chilean Earthquake - kudos for creating the resource. But after several days the best you can do for the UK, Germany, Italy, France, etc. is to tell Americans how to call a US resource. Why not tell Brits to call the Foreign Office, the French to call their Department, etc.? Why do you think it is important for the UK server to tell Brits about how important Americans are, and that Brits can look after themselves? The *implicit* message is that we're second class citizens.

How about the high speed broad/fibre experiment? Offered to the US only. So the rest of world uses chipped stones and smoke signals? If you want us to feel valued, then you need to value us.

At *many* opportunities (not all) but far too many, the GOOG plays to a US audience. And that parochial view of the world means we're suspicious that you might have unconscious influences on how you treat the human factors in search ranking. Yes, you use algorithms - but they are informed by human decision making. Humans in California. Rewarded in dollars. And whose first instinct is *frequently* to think of addressing US needs first.

You've done a great job of communicating how you want us to think you work. But the *implicit* message is that the rest of the world is just less important.

Thank you for your attention.

JezC said...

Drat. I worked in the US for almost ten years. I know how that last posting will be received. I should point out that I have previously written an article decrying Foundem's claims, and looking at Ciao, and pointing out that other vertical search companies (Trovit, for example) are present.

Google's problem, IMO, is not a failure to communicate the search algorithm, but a failure to show that Europe, or Africa, or Asia, is as important to Google as the USA. You sit in the US, with US taxes, under US law, offering US beta tests, and even global charity relief with US resources, and then try to tell us that we're wrong when we think that you think of the US first. If you want us to think we have value, then you need to treat us as if we are valued. Otherwise the suspicion will continue that you prefer US companies. Doesn't that seem reasonable to you?

Jeff Ballweg said...

Not sure the EU realises that keeping a few things secret, and thereby keeping search spam to a minimum, actually improves our results, not the other way around. The last thing my customer needs is rubbish out-ranking his genuine business.

Thanks for the post. And your blog. And your Twitter. And Webmaster Tools. And all the papers. Maybe you ought to publish a book?

Jeff Ballweg Web Design
Christchurch, New Zealand

Cliff said...

Well done Matt, you guys have been great. I think people are getting frustrated with the Caffeine rollout. Great job on cleaning up spam. I've noticed. They get erased and don't come back as quickly.

Ed Bloom said...

nice rebuttal Matt.

But is the real story here not the fact that Microsoft has learned from its antitrust days and decided the best way to tackle Google is to try to slow you down via litigation rather than trying to catch up with you by innovating and competing fairly?

just a thought.

Sarah said...

I realize this post is "old news," but I've bookmarked it to give to my own clients who just don't understand "why Google does x," or who think SEO is some huge secret.

Matt, please take my sincere "thanks" back to whomever it concerns for giving us (and by "us" I mean website owners/bloggers/Internet users) a plethora of tools and information!

Victor Tuszing said...

I agree with JezC - I'm myself suspicious that Google prefers US companies and US website owners. I have several travel-related sites (hosted on a server in the USA) on which I have worked hard for 3 years, following all the rules and practices indicated by Google - original content, incoming links from relevant sites, no bad neighborhoods, fast-loading pages, frequent updates, original news, etc. You think this helps? No way. I cannot get serious traffic in any way, and keep in mind that I'm speaking about 26 sites! On the other hand, I worked several years ago for a US company, and what do you think? Their sites, on which I applied exactly the same techniques as on mine, got much more traffic in a much shorter time, even though the sites were pretty new and didn't even have really serious content on them! Isn't that strange? For me, this is enough to think that Google's algorithm contains secret code that shouldn't be there - like a piece of code that gives an advantage to US companies over others, for example. Can anybody be sure that no such formulas are present?

patrick said...

what a great post. I really enjoy reading this post. application migration

Silverstall said...

I have yet to meet anyone in business who does not have a legitimate complaint about the European Commission, from local farmers and fishermen to many in the jewellery business. Its inconsistent and illogical directives are treated with contempt by those whose businesses are affected by its ill-thought-out and ill-advised judgements.
As a QC recently advised, "don't let the bastards wind you up."

fredp said...

My website was banned in 2006 just when my competitor decided to put a lot (a very large amount) of money into AdWords. Now I see some webmasters doing the same thing I did (spamdexing). I was doing light spamdexing (making duplicate pages for synonyms). I asked Google to take a similar attitude and ban those sites. No reaction for 8 months now. If you write to Google, it is the same as writing to Santa Claus: you don't get an answer. Google abuses its dominant position; it cannot introduce distortion of competition, it is in the law. Why does Google not respect EU law?

San Diego DUI Lawyer said...

Well said...Google has always been innovative and diplomatic. As the web has changed, Google has adjusted to maintain position in search. The inherent power in #1 is NOT abused. G is very business friendly.
If only our government could be as "together".
Why can't the feds run the country as smoothly as Google runs the web?