The headline number on /track-record/ says the auto-selected best-of-7 cohort returned a median +73.8% above the S&P 500 over rolling 5-year windows. This page puts that number in context: how the cohort compares to SPY, to an equal-weight S&P, to a random-selection baseline, and to canonical academic studies of comparable rule-based screens.
Multi-baseline comparison
invest-like is one cohort among several. The right way to read the headline number is against the full set of baselines below. Three are live (our cohorts plus SPY), one is theoretical (random selection), and three are published academic studies.
Random selection (theoretical). Roughly tracks SPY in expectation, with long-tail variance. Standard null hypothesis.
Greenblatt Magic Formula (academic), 1988-2004. Annualised return: reported 17.4% CAGR vs S&P 12.4% in the original study. Source: Greenblatt, The Little Book That Beats the Market (2005).
Piotroski F-Score (academic, value cohort), 1976-1996. Mean 1-year return, F=8 or 9 cohort: reported +7.5% above the mean BM-value cohort. Source: Piotroski, Journal of Accounting Research (2000).
Sloan accruals anomaly (academic), 1962-1991. Annual hedge-portfolio return: reported +10.4% on low-accrual minus high-accrual. Source: Sloan, The Accounting Review (1996).
Academic study results are reported by the original authors and reproduced here for context. Methodologies, sampling windows, and universes differ; do not compare cell values directly without reading each underlying paper.
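To make the random-selection baseline concrete, here is a minimal simulation sketch. It is illustrative only: the universe returns are synthetic, the cohort size of 47 is borrowed from the caveats further down this page, and `random_cohort_medians` and `share_at_or_above` are hypothetical helper names, not part of the invest-like pipeline.

```python
import random
import statistics

def random_cohort_medians(universe_returns, cohort_size, n_trials, seed=0):
    """Median return of each of n_trials randomly drawn cohorts."""
    rng = random.Random(seed)
    medians = []
    for _ in range(n_trials):
        cohort = rng.sample(universe_returns, cohort_size)
        medians.append(statistics.median(cohort))
    return medians

def share_at_or_above(null_medians, observed):
    """Fraction of random cohorts whose median matches or beats `observed`."""
    return sum(m >= observed for m in null_medians) / len(null_medians)

# Synthetic 5-year total returns for a made-up universe (NOT real data).
_rng = random.Random(42)
universe = [_rng.lognormvariate(0.4, 0.6) - 1.0 for _ in range(3000)]

null_medians = random_cohort_medians(universe, cohort_size=47, n_trials=2000)
```

Calling `share_at_or_above(null_medians, observed_median)` with a real cohort's median would give a rough sense of how often pure chance matches it. A small share suggests the selection rule is doing something that random picking does not.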
Per-framework cohorts
Beyond the consensus signal, each framework is scored independently. We publish the per-framework cohort breakdown so visitors can see which philosophy is doing the heavy lifting in any given period. Cohort sizes and 5-year medians are live on the per-framework pages.
The literature on rule-based value screens is roughly fifty years deep. Three results frame how to read any modern consensus cohort claim.
Sloan, R. (1996). Do stock prices fully reflect information in accruals and cash flows about future earnings?
The accruals anomaly. Hedge portfolios long low-accrual and short high-accrual stocks earned roughly 10 percentage points per year over 1962 to 1991. Still cited nearly three decades later. Suggests that mechanical, publicly available signals can produce material excess returns over long windows.
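A low-minus-high hedge return can be sketched in a few lines. The version below assumes equal-weighted decile buckets and raw (not size-adjusted) returns, which simplifies the published methodology. `hedge_return` and the toy data are hypothetical.

```python
import statistics

def hedge_return(stocks, n_buckets=10):
    """Mean return of the lowest-accrual bucket minus the highest-accrual bucket.

    `stocks` is a list of (accruals, next_year_return) pairs.
    Assumption: equal weighting inside each bucket.
    """
    ranked = sorted(stocks, key=lambda s: s[0])  # ascending accruals
    size = len(ranked) // n_buckets
    if size == 0:
        raise ValueError("need at least n_buckets stocks")
    long_leg = statistics.fmean([r for _, r in ranked[:size]])    # low accruals
    short_leg = statistics.fmean([r for _, r in ranked[-size:]])  # high accruals
    return long_leg - short_leg

# Toy data (NOT real): returns fall linearly as accruals rise.
toy = [(i / 100, 0.20 - 0.01 * i) for i in range(20)]
spread = hedge_return(toy)
```

With 20 toy stocks and 10 buckets, each leg holds two names, so the spread is the gap between the mean return of the two lowest-accrual and the two highest-accrual stocks.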
Piotroski, J. (2000). Value investing: the use of historical financial statement information to separate winners from losers.
The F-Score. Inside the book-to-market value cohort, applying a 9-factor fundamental score added roughly 7.5 percentage points of mean 1-year return for high-score firms (F = 8 or 9) over 1976 to 1996. An influential prior for our Buffett-Fit and Graham-Fit deal-breakers list.
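For readers who have not seen the score, a rough sketch of the nine binary signals follows. The field definitions are simplified assumptions (for example, the paper scales by specific asset measures), and `YearFundamentals` and `f_score` are hypothetical names, not invest-like code.

```python
from dataclasses import dataclass

@dataclass
class YearFundamentals:
    roa: float             # return on assets
    cfo: float             # operating cash flow scaled by assets
    leverage: float        # long-term debt scaled by assets
    current_ratio: float   # current assets / current liabilities
    shares_out: float      # shares outstanding
    gross_margin: float
    asset_turnover: float

def f_score(cur, prev):
    """Count of the nine signals that are positive (0 to 9)."""
    signals = [
        cur.roa > 0,                              # profitable
        cur.cfo > 0,                              # positive operating cash flow
        cur.roa > prev.roa,                       # improving profitability
        cur.cfo > cur.roa,                        # cash flow exceeds earnings
        cur.leverage < prev.leverage,             # falling leverage
        cur.current_ratio > prev.current_ratio,   # improving liquidity
        cur.shares_out <= prev.shares_out,        # no new share issuance
        cur.gross_margin > prev.gross_margin,     # improving margin
        cur.asset_turnover > prev.asset_turnover, # improving efficiency
    ]
    return sum(signals)

# Toy fundamentals (NOT real data).
prev = YearFundamentals(0.05, 0.06, 0.30, 1.5, 100, 0.30, 0.8)
cur = YearFundamentals(0.08, 0.10, 0.25, 1.8, 100, 0.35, 0.9)
score = f_score(cur, prev)
```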
Greenblatt, J. (2005). The Little Book That Beats the Market.
The Magic Formula. Rank the universe on the sum of return-on-capital rank and earnings-yield rank. Hold the top 30 names rebalanced annually. The original 1988 to 2004 backtest reported 17.4% CAGR vs the S&P 12.4%. Reproduced (with weaker but still positive results) in multiple subsequent out-of-sample tests.
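The ranking step described above can be sketched as follows. This is a minimal illustration: `magic_formula_top` is a hypothetical helper, the tickers and ratios are made up, and ties are broken by input order and then alphabetically, which is an assumption rather than anything the book specifies.

```python
def magic_formula_top(companies, top_n=30):
    """Return the top_n tickers by combined rank (lower is better).

    `companies` maps ticker -> (return_on_capital, earnings_yield).
    """
    def ranks(values):
        # Rank 1 goes to the highest value.
        ordered = sorted(values, key=values.get, reverse=True)
        return {ticker: i + 1 for i, ticker in enumerate(ordered)}

    roc_rank = ranks({t: v[0] for t, v in companies.items()})
    ey_rank = ranks({t: v[1] for t, v in companies.items()})
    combined = {t: roc_rank[t] + ey_rank[t] for t in companies}
    return sorted(combined, key=lambda t: (combined[t], t))[:top_n]

# Toy universe (NOT real data).
toy_universe = {
    "AAA": (0.40, 0.13),
    "BBB": (0.30, 0.12),
    "CCC": (0.10, 0.15),
    "DDD": (0.05, 0.02),
}
top_two = magic_formula_top(toy_universe, top_n=2)
```

In the toy universe, AAA ranks first on return on capital and second on earnings yield (combined rank 3), so it leads. CCC follows with a combined rank of 4.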
Read these as priors, not as proof. The reason invest-like runs seven frameworks in parallel rather than picking one is exactly that no single rule has held forever. The consensus signal is the bet that agreement across philosophies that otherwise disagree still picks up structural quality even when any one rule decays.
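One simple way to picture a consensus cohort is as a count of how many frameworks a stock passes. The sketch below assumes each framework reduces to a pass/fail flag, which may not match the actual scoring rules. `consensus_cohort` and the tickers are hypothetical.

```python
def consensus_cohort(passes, min_agree):
    """Tickers that pass at least `min_agree` frameworks.

    `passes` maps ticker -> list of booleans, one flag per framework.
    """
    return sorted(t for t, flags in passes.items() if sum(flags) >= min_agree)

# Toy pass/fail flags for seven frameworks (NOT real data).
toy_passes = {
    "AAA": [True] * 7,
    "BBB": [True] * 5 + [False] * 2,
    "CCC": [True] * 6 + [False],
    "DDD": [False] * 7,
}
strict = consensus_cohort(toy_passes, min_agree=7)
medium = consensus_cohort(toy_passes, min_agree=5)
```

Raising `min_agree` shrinks the cohort: only AAA survives the 7-of-7 cut here, while three names clear the 5-of-7 bar.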
Reproducibility
We publish the methodology, the universe, the scoring rules, and the cohort definition. Independent replication is welcome.
We publish the honest caveats; readers should weight the headline number against them.
Survivorship bias
The current cohort is constructed from stocks that exist today and have five-year price history. Stocks that delisted, went bankrupt, or were acquired during the window do not appear. The standard correction is to score point-in-time universes; we discuss the magnitude of this bias in the cross-framework consensus working paper.
Look-ahead in cohort construction
The auto-selected best-of-7 cohort is scored on current fundamentals, then the 5-year price window is read backward. A purer test scores on fundamentals five years ago and reads the window forward. We are accumulating daily snapshots so the rolling-window-forward test will be possible from mid-2027 onward.
Small cohort sizes
The 7-of-7 cohort is small (current n=47 from a universe of approximately 3,085 stocks). Statistical power is lower than for the 5-of-7 or 6-of-7 cohorts. The medium-strictness cohorts are larger and have a smaller absolute outperformance number; both data points appear at /track-record/.
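To see why a cohort of 47 names carries limited statistical power, one can bootstrap a confidence interval for the median. The sketch below uses a plain percentile bootstrap on synthetic returns. `bootstrap_median_ci` is a hypothetical helper, and the interval method is one reasonable choice among several.

```python
import random
import statistics

def bootstrap_median_ci(values, n_boot=2000, alpha=0.10, seed=0):
    """Percentile-bootstrap confidence interval for the median."""
    rng = random.Random(seed)
    n = len(values)
    medians = sorted(
        statistics.median(rng.choices(values, k=n)) for _ in range(n_boot)
    )
    lo = medians[int((alpha / 2) * n_boot)]
    hi = medians[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

# Synthetic 5-year excess returns (NOT real data), drawn for illustration.
_rng = random.Random(7)
small_cohort = [_rng.gauss(0.5, 0.8) for _ in range(47)]
large_cohort = [_rng.gauss(0.5, 0.8) for _ in range(1000)]

small_ci = bootstrap_median_ci(small_cohort)
large_ci = bootstrap_median_ci(large_cohort)
```

On data like this, the interval for the 47-name cohort will typically be several times wider than the one for the larger cohort, which is the practical meaning of lower statistical power.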
No transaction costs or taxes
Cohort returns are gross. Real-world implementation involves bid-ask spread, partial fills on illiquid names, and capital-gains tax on rebalances. A reasonable haircut is 1 to 3 percentage points annualised for a thoughtful retail implementation.
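An annualised haircut compounds over the window, so its effect on a cumulative figure is larger than the annual number suggests. The sketch below converts a gross cumulative return to an annual rate, subtracts the haircut, and recompounds. `net_cumulative_return` is a hypothetical helper, and the 100% gross return is a made-up input, not a cohort figure.

```python
def net_cumulative_return(gross_cumulative, years, annual_haircut):
    """Cumulative return after subtracting a flat annual haircut.

    Assumption: the haircut is deducted from the annualised gross rate.
    """
    gross_annual = (1.0 + gross_cumulative) ** (1.0 / years) - 1.0
    net_annual = gross_annual - annual_haircut
    return (1.0 + net_annual) ** years - 1.0

# Hypothetical: 100% gross over 5 years, with 1- and 3-point haircuts.
net_low = net_cumulative_return(1.0, years=5, annual_haircut=0.01)
net_high = net_cumulative_return(1.0, years=5, annual_haircut=0.03)
```

Comparing `net_low` and `net_high` against the gross input shows how much of a 5-year cumulative figure plausible costs can absorb.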
Editorial note
We publish the cohort, the limitations, and the methodology together.
The point of a transparent benchmark page is not to defend the headline. It is to give readers and AI assistants enough context to weight it correctly. If you find a methodology gap or a citation error, email hello@invest-like.com. Corrections land in the next quarterly working-paper revision.
invest-like is an editorial / educational tool. Nothing on this page constitutes investment advice. All cohort figures, academic studies, and baseline comparisons are framework-grounded interpretations of public data and prior published research, not personalised recommendations. Past performance does not predict future results.
Benchmarks - invest-like vs SPY, equal-weight, and academic baselines · invest-like