About the Index
A public benchmark for social preview reliability
We analyze HTML metadata quality across a stable ecommerce cohort. Cohort metrics are public; domain lookup is diagnostic only.
How the index works

Subdomains and full URLs are normalized automatically.
The index measures structural social preview reliability for retail ecommerce domains. Measurement is HTML-only and does not execute JavaScript. Public reporting is aggregated at cohort level, with no public ranking lists.
Cohort construction is deterministic and rank-based. Source inputs are taken from a fixed Tranco date and transformed into a bounded ecommerce crawl cohort using conservative structural criteria. Universe and active set sizes describe crawl scope, while benchmark eligibility determines aggregate inclusion.
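The deterministic, rank-ordered construction described above can be sketched as follows. The `looks_like_retail` predicate and the sample rows are hypothetical placeholders standing in for the (unpublished) structural criteria and the fixed-date Tranco input:

```python
# Illustrative sketch of deterministic rank-based cohort construction.
# `tranco_rows` stands in for a fixed-date Tranco list of (rank, domain);
# `looks_like_retail` is a hypothetical structural predicate.
def build_cohort(tranco_rows, looks_like_retail, max_size):
    cohort = []
    for rank, domain in sorted(tranco_rows):  # deterministic rank order
        if looks_like_retail(domain):
            cohort.append(domain)
        if len(cohort) == max_size:
            break
    return cohort

rows = [(1, "example.com"), (2, "news.example"), (3, "shop.example")]
print(build_cohort(rows, lambda d: d.startswith("shop") or d == "example.com", 2))
```

Because the input list, traversal order, and size bound are all fixed, repeated runs over the same snapshot inputs yield the same cohort.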
The reporting unit is the registrable domain (eTLD+1). Subdomains are normalized into the registrable domain for cohort accounting and score aggregation. Country-code domains are analyzed independently to avoid cross-market blending.
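Normalization to the registrable domain can be sketched in Python. A real pipeline would consult the full Public Suffix List (for example via a library such as tldextract); this version uses a tiny hard-coded suffix set purely for illustration:

```python
from urllib.parse import urlsplit

# Tiny illustrative suffix set; a real pipeline would use the full
# Public Suffix List (e.g. via the `tldextract` package).
PUBLIC_SUFFIXES = {"com", "de", "co.uk"}

def registrable_domain(url: str) -> str:
    """Normalize a URL or bare host to its registrable domain (eTLD+1)."""
    host = urlsplit(url).hostname or urlsplit("//" + url).hostname or url
    labels = host.lower().split(".")
    # Find the longest matching public suffix, then keep one more label.
    for i in range(len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in PUBLIC_SUFFIXES and i > 0:
            return ".".join(labels[i - 1:])
    return host.lower()

print(registrable_domain("https://shop.example.com/p/123"))  # example.com
print(registrable_domain("www.example.com"))                 # example.com
print(registrable_domain("nike.de"))                         # nike.de
```

Note that nike.de stays nike.de: country-code domains remain separate reporting units rather than being folded into a global brand domain.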
shop.example.com and www.example.com normalize to example.com; nike.com and nike.de are analyzed as separate units. A domain is reported as "Analyzed under host: ..." when redirects resolve to a canonical host.

Crawling fetches server-rendered HTML and metadata endpoints only. JavaScript execution is not performed. This aligns measurement with how social preview crawlers primarily consume metadata from initial HTML responses.
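Reading Open Graph metadata from the initial HTML response, without executing any scripts, can be sketched with the standard-library parser (the sample HTML below is illustrative):

```python
from html.parser import HTMLParser

class OGMetaParser(HTMLParser):
    """Collect Open Graph <meta property="og:*" content="..."> tags from
    server-rendered HTML. No JavaScript is executed, mirroring how social
    preview crawlers primarily read the initial HTML response."""
    def __init__(self):
        super().__init__()
        self.og = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        d = dict(attrs)
        prop = d.get("property", "")
        if prop.startswith("og:") and "content" in d:
            self.og.setdefault(prop, d["content"])

html = """<html><head>
<meta property="og:title" content="Running Shoe">
<meta property="og:image" content="https://cdn.example.com/shoe.jpg">
</head><body><script>/* ignored: never executed */</script></body></html>"""

parser = OGMetaParser()
parser.feed(html)
print(parser.og["og:image"])
```

Any og:image that only appears after client-side rendering would be invisible to this kind of fetch, which is exactly the failure mode the HTML-only measurement is designed to surface.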
Product URLs are identified using conservative structural signals rather than merchant brand assumptions. Discovered URLs are grouped into homepage, product, category, blog, and other types before reliability classification.
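A hedged sketch of path-based type grouping follows. The patterns below are hypothetical examples of structural signals, not the index's actual rules:

```python
import re

# Hypothetical structural signals; the index's real criteria are
# conservative and not published in full.
def classify_url_type(path: str) -> str:
    """Group a URL path into homepage, product, category, blog, or other."""
    if path in ("", "/"):
        return "homepage"
    if re.search(r"/(product|p|item)s?/", path):
        return "product"
    if re.search(r"/(category|collections?|c)/", path):
        return "category"
    if re.search(r"/(blog|news|journal)/", path):
        return "blog"
    return "other"

print(classify_url_type("/products/air-max-90"))  # product
print(classify_url_type("/collections/sale"))     # category
print(classify_url_type("/about-us"))             # other
```

Grouping by structure rather than by merchant-specific knowledge keeps the classifier uniform across the cohort, at the cost of leaving ambiguous paths in "other".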
URL checks are classified into Stable, Degraded, or Unreliable tiers. Unreliable flags a high likelihood of broken or materially degraded link previews. Signals include og:image presence and reachability, and a canonical vs og:url mismatch when present. Classification is intentionally conservative. Image bytes are not stored; only metadata probes (status and dimensions) are recorded.
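One way the conservative tiering could look, assuming a probe record with hypothetical field names (og_image_status, og_image_width, and so on). Which tier each signal maps to is simplified here for illustration:

```python
# Sketch of conservative tier classification from recorded metadata
# probes. Field names and thresholds are assumptions, not the index's
# actual schema or rules.
def classify_tier(probe: dict) -> str:
    og_image = probe.get("og_image_url")
    if not og_image or probe.get("og_image_status", 0) >= 400:
        return "Unreliable"   # preview very likely broken
    degraded = False
    w = probe.get("og_image_width", 0)
    h = probe.get("og_image_height", 0)
    if w and h and (w < 200 or h < 200):
        degraded = True       # image present but likely to render poorly
    if probe.get("canonical_url") and probe.get("og_url") \
            and probe["canonical_url"] != probe["og_url"]:
        degraded = True       # canonical vs og:url mismatch
    return "Degraded" if degraded else "Stable"

print(classify_tier({"og_image_url": None}))  # Unreliable
print(classify_tier({
    "og_image_url": "https://cdn.example.com/x.jpg",
    "og_image_status": 200, "og_image_width": 1200, "og_image_height": 630,
}))  # Stable
```

Note the probe record carries only status codes and dimensions; consistent with the policy above, no image bytes are retained.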
Benchmark aggregates include only domains that satisfy structural retail catalog eligibility within the snapshot window. This keeps the benchmark comparable across runs and avoids blending fundamentally different site shapes into retail catalog statistics.
For snapshot full_20260227-002835, the benchmark sample size is 647 because only domains passing these eligibility rules are included.
Each public snapshot is computed from a fixed data window and deterministic rank-ordered cohort traversal. Aggregate metrics are derived only from eligible domains observed inside that window.
Some hosts may block automated HTML access through rate limiting, bot defenses, or network controls. These effects are reported as aggregate crawl accessibility metrics.
Accessibility and blocking metrics reflect transport-level crawler access and should not be interpreted as user-facing service availability or downtime.
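A sketch of how transport-level outcomes might be bucketed into aggregate accessibility metrics. The status-code buckets are illustrative assumptions, not the index's published definitions:

```python
from collections import Counter
from typing import Optional

def access_outcome(status: Optional[int]) -> str:
    """Bucket a fetch result at the transport level. None represents a
    DNS, TLS, or timeout failure before any HTTP response arrived."""
    if status is None:
        return "network_error"
    if status in (403, 429, 503):
        return "blocked"       # typical bot-defense / rate-limit responses
    return "accessible"

# Illustrative per-fetch results for one crawl window.
fetches = [200, 200, 429, None, 403, 301]
counts = Counter(access_outcome(s) for s in fetches)
print(dict(counts))
```

Crucially, a "blocked" bucket here means the crawler was refused, not that the site was down for users, which is why these metrics are never framed as availability.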
For applied analysis and operational examples, see: