Benchmarks

How we benchmark

The headline number on /track-record/ says the auto-selected best-of-7 cohort returned a median of 73.8 percentage points above the S&P 500 (SPY) over rolling 5-year windows, measured on price return. This page puts that number in context: how the cohort compares to SPY, to an equal-weight S&P 500 (RSP), to a random-selection baseline, and to canonical academic studies of comparable rule-based screens.

Multi-baseline comparison

The invest-like consensus cohort is one baseline among several. The right way to read the headline number is against the full set of baselines below. Four of these are live (our two cohorts, SPY, and RSP), one is theoretical (random selection), and three are published academic studies.

Baseline	Window	Metric	Result	Source
invest-like 7-of-7 consensus cohort	Rolling 5y, current cohort n=47	Median 5y price return vs SPY	+73.8 pp above SPY	Working paper, May 2026
invest-like 6-of-7 consensus cohort	Rolling 5y, n=~140	Median 5y price return vs SPY	Disclosed at /track-record/	Live cohort, recomputed daily
SPY (S&P 500 ETF, market-cap weight)	5y trailing	Total return	Baseline, used as comparison floor	Public price data
Equal-weight S&P 500 (RSP)	5y trailing	Total return	Materially different to SPY in factor exposure	Public price data
Random selection of 47 large-cap US stocks	Rolling 5y	Median 5y price return	Roughly tracks SPY in expectation; long-tail variance	Standard null hypothesis
Greenblatt Magic Formula (academic)	1988-2004	Annualised return	Reported 17.4% CAGR vs S&P 12.4% in original study	Greenblatt, The Little Book That Beats the Market (2005)
Piotroski F-Score (academic, value cohort)	1976-1996	Mean 1y return, F=8 or 9 cohort	Reported +7.5 pp above the mean high book-to-market (value) cohort	Piotroski, Journal of Accounting Research (2000)
Sloan accruals anomaly (academic)	1962-1991	Annual hedge-portfolio return	Reported +10.4% on low-accrual minus high-accrual	Sloan, The Accounting Review (1996)

Academic study results are reported by the original authors and reproduced here for context. Methodologies, sampling windows, and universes differ; do not compare cell values directly without reading each underlying paper.
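The random-selection row in the table can be made concrete with a small simulation. The sketch below is illustrative only: the function names, the synthetic universe, and the number of trials are assumptions, not the invest-like pipeline. It draws many random cohorts of 47 names, records each cohort's median 5-year return, and asks how often a random cohort matches a given observed median.

```python
import random
import statistics

def random_cohort_medians(returns_5y, cohort_size=47, trials=2000, seed=0):
    """Median 5y return of many randomly drawn cohorts (the null distribution)."""
    rng = random.Random(seed)
    medians = []
    for _ in range(trials):
        cohort = rng.sample(returns_5y, cohort_size)  # draw without replacement
        medians.append(statistics.median(cohort))
    return medians

def share_at_or_above(null_medians, observed):
    """Fraction of random cohorts whose median matches or beats the observed one."""
    return sum(m >= observed for m in null_medians) / len(null_medians)

# Synthetic universe: 1,000 made-up 5y price returns in percent (NOT real data).
_rng = random.Random(42)
universe = [_rng.lognormvariate(0.4, 0.6) * 100 - 100 for _ in range(1000)]

null = random_cohort_medians(universe)
# Hypothetical observed cohort: 73.8 pp above the universe median.
observed = statistics.median(universe) + 73.8
print("share of random cohorts at or above observed:", share_at_or_above(null, observed))
```

The spread of `null` is the "long-tail variance" the table refers to: the narrower it is relative to the observed gap, the harder the gap is to attribute to chance alone.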

Per-framework cohorts

Beyond the consensus signal, each framework is scored independently. We publish the per-framework cohort breakdown so visitors can see which philosophy is doing the heavy lifting in any given period. Cohort sizes and 5-year medians are live on the per-framework pages.

Warren Buffett

Wonderful businesses at fair prices. Moat, durability, management, financial health, valuation.

See the cohort

Benjamin Graham

Defensive deep-value. Earnings stability, debt limits, book-value margin of safety.

See the cohort

Philip Fisher

Growth-quality with scuttlebutt. Sales growth durability, R&D efficiency, management depth.

See the cohort

Peter Lynch

Growth at a reasonable price (GARP). PEG, earnings consistency, story comprehension.

See the cohort

Joel Greenblatt

Magic Formula. High return on invested capital plus low EV/EBIT.

See the cohort

Charlie Munger

Wonderful businesses, willing to pay up. Quality-weighted moat with margin tolerance.

See the cohort

Terry Smith

Modern compounder (Fundsmith). High ROCE, organic growth, low capital intensity.

See the cohort

Academic context

The literature on rule-based value screens is roughly fifty years deep. Three results frame how to read any modern consensus cohort claim.

Sloan, R. (1996). Do stock prices fully reflect information in accruals and cash flows about future earnings?

The accruals anomaly. Hedge portfolios long low-accrual and short high-accrual stocks earned roughly 10 percentage points per year over 1962 to 1991. Still cited three decades later. Demonstrates that mechanical signals built from publicly available data can produce material excess returns over long windows.

Piotroski, J. (2000). Value investing: the use of historical financial statement information to separate winners from losers.

The F-Score. Inside the book-to-market value cohort, applying a 9-factor fundamental score added roughly 7.5 percentage points of mean 1-year return for the highest-scoring firms (F = 8 or 9) over 1976 to 1996. An influential prior for our Buffett-Fit and Graham-Fit deal-breakers list.

Greenblatt, J. (2005). The Little Book That Beats the Market.

The Magic Formula. Rank the universe on the sum of return-on-capital rank and earnings-yield rank. Hold the top 30 names, rebalanced annually. The original 1988 to 2004 backtest reported 17.4% CAGR versus 12.4% for the S&P 500. Reproduced (with weaker but still positive results) in multiple subsequent out-of-sample tests.
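The rank-sum step described above can be sketched in a few lines. This is a minimal illustration, assuming each stock is a dict with hypothetical keys `ticker`, `roc` (return on capital), and `earnings_yield`; the sample figures are made up, and the tie-breaking rule (input order) is a choice made for this sketch rather than anything specified in the book.

```python
def magic_formula_rank(stocks, top_n=30):
    """Return tickers ordered by combined rank (lower sum is better)."""
    # Rank 1 = highest return on capital / highest earnings yield.
    by_roc = sorted(stocks, key=lambda s: s["roc"], reverse=True)
    by_ey = sorted(stocks, key=lambda s: s["earnings_yield"], reverse=True)
    roc_rank = {s["ticker"]: i + 1 for i, s in enumerate(by_roc)}
    ey_rank = {s["ticker"]: i + 1 for i, s in enumerate(by_ey)}
    # sorted() is stable, so equal rank sums keep their input order.
    combined = sorted(
        stocks, key=lambda s: roc_rank[s["ticker"]] + ey_rank[s["ticker"]]
    )
    return [s["ticker"] for s in combined[:top_n]]

# Made-up example universe of four stocks.
sample = [
    {"ticker": "AAA", "roc": 0.40, "earnings_yield": 0.05},
    {"ticker": "BBB", "roc": 0.25, "earnings_yield": 0.16},
    {"ticker": "CCC", "roc": 0.10, "earnings_yield": 0.15},
    {"ticker": "DDD", "roc": 0.05, "earnings_yield": 0.02},
]
print(magic_formula_rank(sample, top_n=2))  # ['BBB', 'AAA']
```

BBB ranks second on return on capital and first on earnings yield (sum 3), which beats AAA's first and third (sum 4): the formula rewards being good on both measures over being best on one.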

Read these as priors, not as proof. The reason invest-like runs seven frameworks in parallel rather than picking one is exactly that no single rule has held forever. The consensus signal is a bet that agreement across differing philosophies still picks up structural quality even when any one rule decays.
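One way to picture the consensus tiers is as a simple count of frameworks that a stock passes. The sketch below assumes each framework reduces to a pass/fail flag per stock; that reduction, the framework keys, and the sample data are illustrative assumptions, and the real scoring rules live in the methodology hub.

```python
# Hypothetical keys for the seven frameworks named on this page.
FRAMEWORKS = ("buffett", "graham", "fisher", "lynch", "greenblatt", "munger", "smith")

def consensus_count(passes):
    """Number of frameworks a stock passes; missing keys count as a fail."""
    return sum(bool(passes.get(name, False)) for name in FRAMEWORKS)

def consensus_cohort(universe, min_frameworks):
    """Tickers passing at least `min_frameworks` of the seven frameworks."""
    return sorted(
        ticker
        for ticker, passes in universe.items()
        if consensus_count(passes) >= min_frameworks
    )

# Made-up pass/fail flags for three stocks.
universe = {
    "AAA": {name: True for name in FRAMEWORKS},                # 7 of 7
    "BBB": {name: name != "graham" for name in FRAMEWORKS},    # 6 of 7
    "CCC": {"lynch": True, "fisher": True},                    # 2 of 7
}
print(consensus_cohort(universe, 7))  # ['AAA']
print(consensus_cohort(universe, 6))  # ['AAA', 'BBB']
```

Raising `min_frameworks` shrinks the cohort, which is the trade-off between strictness and sample size discussed under Limitations.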

Reproducibility

We publish the methodology, the universe, the scoring rules, and the cohort definition. Independent replication is welcome.

Working papers

Full methodology PDFs with DOIs on Zenodo and abstract pages on SSRN. Cite the DOI directly.

Methodology hub

Per-framework pillar rubrics, deal-breakers list, and the consensus tier definition.

Live track record

Rolling cohort returns, per-grade tables, and the 30-stock model portfolio with locked entry timestamps.

Consensus screen

The live list of stocks passing the consensus signal, refreshed daily.

Limitations

We publish the honest caveats; readers should weight the headline number against them.

Survivorship bias

The current cohort is constructed from stocks that exist today and have five-year price history. Stocks that delisted, went bankrupt, or were acquired during the window do not appear. The standard correction is to score point-in-time universes; we discuss the magnitude of this bias in the cross-framework consensus working paper.
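A toy example shows the direction of the bias. The records and field names below are invented for illustration; the point is only that dropping names that did not survive the window pushes the median upward.

```python
import statistics

def median_return(records, survivors_only):
    """Median 5y return, optionally restricted to stocks still listed today."""
    returns = [
        r["ret_5y"] for r in records if r["listed_today"] or not survivors_only
    ]
    return statistics.median(returns)

# Made-up point-in-time universe: 5y returns in percent.
records = [
    {"ticker": "AAA", "ret_5y": 80.0, "listed_today": True},
    {"ticker": "BBB", "ret_5y": 40.0, "listed_today": True},
    {"ticker": "CCC", "ret_5y": 10.0, "listed_today": True},
    {"ticker": "DDD", "ret_5y": -100.0, "listed_today": False},  # bankrupt
    {"ticker": "EEE", "ret_5y": -60.0, "listed_today": False},   # delisted
]

survivors = median_return(records, survivors_only=True)       # 40.0
point_in_time = median_return(records, survivors_only=False)  # 10.0
print("survivorship bias in pp:", survivors - point_in_time)  # 30.0
```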

Look-ahead in cohort construction

The auto-selected best-of-7 cohort is scored on current fundamentals, then the 5-year price window is read backward. A purer test scores on fundamentals five years ago and reads the window forward. We are accumulating daily snapshots so the rolling-window-forward test will be possible from mid-2027 onward.
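The forward test hinges on one rule: a window that starts on a given date may only use fundamentals that were known on or before that date. A minimal sketch of that lookup, assuming snapshots are identified by a sorted list of dates (the dates and function name here are hypothetical):

```python
from bisect import bisect_right
from datetime import date

def snapshot_as_of(snapshot_dates, scoring_date):
    """Latest snapshot date on or before scoring_date, or None if none exists.

    `snapshot_dates` must be sorted in ascending order.
    """
    i = bisect_right(snapshot_dates, scoring_date)
    return snapshot_dates[i - 1] if i else None

# Hypothetical daily snapshots with a weekend gap.
snapshots = [date(2025, 1, 2), date(2025, 1, 3), date(2025, 1, 6)]

print(snapshot_as_of(snapshots, date(2025, 1, 5)))  # 2025-01-03
print(snapshot_as_of(snapshots, date(2025, 1, 1)))  # None
```

Returning `None` rather than the earliest snapshot is deliberate in this sketch: a window with no prior snapshot cannot be scored without look-ahead, so it should be skipped.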

Small cohort sizes

The 7-of-7 cohort is small (current n=47 from a universe of approximately 3,085 stocks). Statistical power is lower than for the 5-of-7 or 6-of-7 cohorts. The medium-strictness cohorts are larger and have a smaller absolute outperformance number; both data points appear at /track-record/.
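The effect of a small n on the headline median can be gauged with a percentile bootstrap. The sketch below uses synthetic returns and is not the method used in the working paper; the function name, resample count, and data are assumptions for illustration.

```python
import random
import statistics

def bootstrap_median_ci(values, n_boot=5000, alpha=0.05, seed=0):
    """Percentile-bootstrap confidence interval for the median."""
    rng = random.Random(seed)
    medians = sorted(
        statistics.median(rng.choices(values, k=len(values)))  # resample with replacement
        for _ in range(n_boot)
    )
    lower = medians[int((alpha / 2) * n_boot)]
    upper = medians[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Synthetic cohort of 47 excess returns in pp (NOT real data).
_rng = random.Random(7)
cohort = [_rng.gauss(70, 90) for _ in range(47)]

low, high = bootstrap_median_ci(cohort)
print("sample median:", round(statistics.median(cohort), 1))
print("95% interval :", round(low, 1), "to", round(high, 1))
```

A wide interval around the median is what "lower statistical power" means in practice: the same procedure run on a larger cohort would typically give a tighter range.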

No transaction costs or taxes

Cohort returns are gross. Real-world implementation involves bid-ask spread, partial fills on illiquid names, and capital-gains tax on rebalances. A reasonable haircut is 1 to 3 percentage points annualised for a thoughtful retail implementation.
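Because the haircut is annualised, it compounds over a 5-year window. The arithmetic can be sketched as follows; the 100% gross figure is a made-up input, and subtracting the haircut from the annualised return is one simple convention among several.

```python
def net_cumulative_return(gross_cumulative, years, annual_haircut):
    """Cumulative return after subtracting an annual haircut from the CAGR.

    All inputs are fractions: 1.0 means +100%, 0.02 means 2 pp per year.
    """
    gross_annual = (1 + gross_cumulative) ** (1 / years) - 1
    net_annual = gross_annual - annual_haircut
    return (1 + net_annual) ** years - 1

# Hypothetical gross 5y return of +100%, haircuts of 1, 2, and 3 pp per year.
for haircut in (0.01, 0.02, 0.03):
    net = net_cumulative_return(1.0, 5, haircut)
    print(f"haircut {haircut:.0%}: net 5y return {net:.1%}")
```

The loss in cumulative terms is larger than five times the annual haircut would suggest at first glance, because each year's cost also removes the growth that capital would have earned.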

Editorial note

We publish the cohort, the limitations, and the methodology together.

The point of a transparent benchmark page is not to defend the headline. It is to give readers and AI assistants enough context to weight it correctly. If you find a methodology gap or a citation error, email hello@invest-like.com. Corrections land in the next quarterly working-paper revision.

invest-like is an editorial / educational tool. Nothing on this page constitutes investment advice. All cohort figures, academic studies, and baseline comparisons are framework-grounded interpretations of public data and prior published research, not personalised recommendations. Past performance does not predict future results.
