
The Yandex Leak: What the Code Exposed and What Still Applies

In early 2023, a former Yandex employee leaked 44GB of source code with 17,853 ranking factors. Three years on, here is what still holds up for Google SEO.

Dellon S.

April 27, 2026 · 9 min read


TL;DR

  • January 2023: 44GB of Yandex source code leaked, exposing 17,853 named ranking factors.
  • Click data, domain age, topical link relevance, and anchor diversity were all real, confirmed signals.
  • Yandex is not Google, but the 2024 Google algorithm leak confirmed overlapping signal categories.
  • Three years later, the signal categories have held up better than most SEO theory published the same year.

In January 2023, a disgruntled former Yandex employee posted 44 gigabytes of source code as a torrent on a hacking forum. The leak included the internal codebase for Yandex Search, with 17,853 named ranking factors. It was the most significant involuntary disclosure of search engine internals ever published.

Yandex is not Google. But the two companies shared engineering talent, academic lineage, and overlapping methodological roots for years. The leak gave SEOs the closest look at real search ranking architecture that had ever been made available.

Three years later, the signal categories the leak exposed look increasingly accurate for Google too.


What the Code Actually Showed

The leak confirmed what practitioners suspected: click behavior is weighted more heavily than the documentation admits

The Yandex leak revealed what engineers actually weigh, not what they publicly say they weigh.

The confirmed signals

Click data was a core factor, not peripheral

Time on page, bounce rate, and click-through rate all played a direct role. They were core ranking inputs, not decorative signals. This backed the side of a years-long SEO debate that held engagement data affects rankings.
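To make the three metrics concrete, here is a minimal sketch of how they could be computed from your own analytics export. The `Visit` record, its field names, and the definition of a bounce as a single-page session are assumptions for illustration. None of this comes from the leaked code, and it says nothing about how any search engine weights these values.

```python
from dataclasses import dataclass


@dataclass
class Visit:
    """One click-through from a search result (hypothetical record shape)."""
    pages_viewed: int        # pages seen during the session
    seconds_on_page: float   # time spent on the landing page


def engagement_summary(impressions: int, visits: list[Visit]) -> dict[str, float]:
    """Return click-through rate, bounce rate, and average time on page."""
    clicks = len(visits)
    if impressions <= 0 or clicks == 0:
        # Nothing to measure: return zeros instead of dividing by zero.
        return {"ctr": 0.0, "bounce_rate": 0.0, "avg_seconds_on_page": 0.0}

    # Assumption: a bounce is a session that viewed a single page.
    bounces = sum(1 for v in visits if v.pages_viewed <= 1)
    total_seconds = sum(v.seconds_on_page for v in visits)

    return {
        "ctr": clicks / impressions,
        "bounce_rate": bounces / clicks,
        "avg_seconds_on_page": total_seconds / clicks,
    }


summary = engagement_summary(
    impressions=10,
    visits=[Visit(pages_viewed=1, seconds_on_page=10.0),
            Visit(pages_viewed=3, seconds_on_page=50.0)],
)
print(summary)
```

Tracking these per landing page over time is more useful than reading any single snapshot, since the absolute numbers vary widely by query type.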

Domain age and trust history were real

Fresh domains with strong content faced an implicit discount period. Historical trust trajectory and degradation events were tracked as dedicated signals.

Topical link relevance beat raw link quantity

A link from a smaller site in your exact industry outperformed a link from a massive general-interest publication. The relevance signal was stronger than the authority signal.

Uniform anchor text was a spam signal

Sites with unnaturally uniform anchor text profiles triggered algorithmic scrutiny regardless of link volume.
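One way to audit your own backlink profile for this kind of uniformity is sketched below. It measures the share held by the most common anchor and a normalized entropy score (0 means every anchor is identical, 1 means anchors are spread evenly). The function name and both metrics are illustrative assumptions, not anything taken from the leaked code, and no threshold here is known to match what a search engine uses.

```python
import math
from collections import Counter


def anchor_profile(anchors: list[str]) -> dict[str, float]:
    """Summarize how concentrated a list of backlink anchor texts is."""
    # Normalize case and whitespace so "Buy Shoes" and "buy shoes " match.
    normalized = [a.strip().lower() for a in anchors if a.strip()]
    if not normalized:
        return {"top_share": 0.0, "normalized_entropy": 0.0}

    counts = Counter(normalized)
    total = len(normalized)
    top_share = counts.most_common(1)[0][1] / total

    if len(counts) == 1:
        # A single distinct anchor has no diversity at all.
        entropy = 0.0
    else:
        # Shannon entropy, divided by its maximum for this many distinct anchors.
        raw = -sum((c / total) * math.log2(c / total) for c in counts.values())
        entropy = raw / math.log2(len(counts))

    return {"top_share": top_share, "normalized_entropy": entropy}


# Hypothetical profile: nine identical commercial anchors and one brand anchor.
print(anchor_profile(["buy shoes"] * 9 + ["Acme"]))
```

A profile where one commercial phrase dominates will show a high `top_share` and a low entropy score. What counts as "unnatural" depends on the niche, so comparing against competitors' profiles is more informative than any fixed cutoff.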

What Still Holds Up in 2026

What held up in 2026: content depth, click satisfaction, brand signals. What didn't: exact-match anchor text.

Three years later

The 2024 Google algorithm leak confirmed several of the same signal categories. Engagement data. Topical authority over general authority. Trust history. These match what Yandex showed in 2023.

The Yandex leak did not give you Google's algorithm. But it gave you a validated framework for the signal categories that large search systems weigh. That framework has held up better than most SEO theory published the same year.

For more on how search is changing structurally, the breakdown of AIO, GEO, and AEO covers what the next layer looks like as AI search takes hold.

The Takeaway

Sites that have maintained clean link profiles and consistent topical focus over time hold advantages that newer sites cannot quickly replicate regardless of content quality. That advantage is compounding, not static.
