Google won’t comment on potentially massive leak of its search algorithm documentation

Google’s search algorithm is perhaps the most important system on the Internet, dictating which sites live and die and what the web’s content looks like. But exactly how Google ranks websites has long been a mystery, pieced together by journalists, researchers and people working in search engine optimization.

Now, an explosive leak purporting to show thousands of pages of internal documents appears to offer unprecedented insight into how search works – and suggests Google hasn’t been entirely honest about it for years. So far, Google has not responded to multiple requests for comment on the legitimacy of the documents.

Rand Fishkin, who worked in SEO for more than a decade, says a source shared 2,500 pages of documents with him in hopes that reporting on the leak would counter ‘lies’ that Google employees had shared about how the search algorithm works. The documents describe Google’s search API and detail the information available to employees, according to Fishkin.

The details Fishkin shares are dense and technical, probably more readable for developers and SEO experts than for the layman. The leaked content also does not necessarily prove that Google uses the specific data and signals mentioned for search rankings. Instead, the leak describes the data that Google collects about web pages, sites and searchers and offers indirect guidance to SEO experts about what Google seems to be interested in, as SEO expert Mike King wrote in his overview of the documents.

The leaked documents cover topics like what type of data Google collects and uses, which sites Google elevates for sensitive topics like elections, how Google handles small websites, and much more. Some information in the documents appears to conflict with public statements by Google representatives, according to Fishkin and King.

“‘Lied’ is harsh, but it’s the only accurate word to use here,” King writes. “While I don’t necessarily blame Google’s public representatives for protecting their proprietary information, I do take issue with their efforts to actively discredit people in the worlds of marketing, technology and journalism who have presented reproducible findings.”

Google did not respond to The Verge’s requests for comment regarding the documents, including a direct request to refute their legitimacy. Fishkin told The Verge in an email that the company did not dispute the veracity of the leak, but that an employee asked him to change some language in the post regarding how an event was characterized.

Google’s secret search algorithm has given rise to an entire industry of marketers who closely follow Google’s public guidelines and execute them for millions of businesses around the world. These pervasive and often annoying tactics have led to a general narrative that Google search results are deteriorating, filled with garbage that website operators feel obligated to produce in order for their sites to be visible. In response to The Verge’s previous reporting on SEO-driven tactics, Google representatives often fall back on a familiar defense: That’s not what Google’s guidelines say.

But some details in the leaked documents call into question the accuracy of Google’s public statements about how search works.

One example cited by Fishkin and King is whether Google Chrome data is used in ranking. Google representatives have repeatedly indicated that they do not use Chrome data to rank pages, but Chrome is specifically mentioned in sections about how websites appear in search. In the screenshot below, which I captured as an example, the links appearing under the main vogue.com URL may be created in part using Chrome data, according to the documents.

Chrome is mentioned in a section explaining how additional links are created.
Image: Google

Another question raised is what role, if any, EEAT plays in ranking. EEAT stands for Experience, Expertise, Authoritativeness and Trustworthiness, a Google metric used to evaluate the quality of results. Google representatives have previously stated that EEAT is not a ranking factor. Fishkin notes that he didn’t find much in the documents mentioning EEAT by name.

King, however, detailed how Google appears to collect a page’s authorship data and has a field to track whether an entity on the page is the author. Part of the documents shared by King indicates that the field was “primarily developed and optimized for news articles… but is also populated for other content (e.g., scientific articles).” While this doesn’t confirm that bylines are an explicit ranking metric, it does show that Google at least keeps track of this attribute. Google representatives have previously insisted that author bylines are something website owners should add for readers, not for Google, because they don’t impact rankings.

While these documents aren’t exactly smoking gun proof, they provide an in-depth, unfiltered look into a closely guarded black box system. The US government’s antitrust case against Google – which revolves around search – has also led to the release of internal documentation, offering additional insight into how the company’s core product works.

Google’s general secrecy about how search works has led to websites that look the same, as SEO marketers try to outsmart Google based on the hints the company offers. Fishkin also calls out publications that credulously repeat Google’s public claims as true, without further analysis.

“Historically, some of the search industry’s loudest and most prolific publishers have been happy to uncritically repeat Google’s public statements. They write headlines like ‘Google says XYZ is true’ rather than ‘Google says XYZ; the evidence suggests otherwise,’” writes Fishkin. “Please do better. If this leak and the DOJ lawsuit can create one change, I hope this is it.”

News Source : www.theverge.com