Our research demonstrates that decomposing equity flows by investor type — a classification no other dataset provides — generates durable, market-neutral alpha from intraday to multi-month horizons.

The identification gap, solved by flow analytics
Every trade on the U.S. equity consolidated tape is recorded — price, size, timestamp. But no trade is identified. The tape is anonymous. A portfolio manager watching the flow of orders has no way to distinguish whether a surge of activity comes from institutional rotation, retail momentum-chasing, or market-maker inventory management.
This anonymity isn't a minor inconvenience. It means that the forces driving price formation — the heterogeneous flows from participants with fundamentally different objectives, constraints, and information sets — remain invisible. Practitioners are left inferring causation from price alone, or relying on stale proxies: 13F filings (quarterly, delayed by 45 days), broker tagging (partial, inconsistent), or volume-weighted heuristics (aggregate, undifferentiated).
Our quantitative research team — led by Morgan Slade and Seb Yan — set out to address this gap directly. Using our equity flow analytics dataset, which classifies individual trades by investor type at 1-minute granularity across 17+ years of U.S. equity market history, we demonstrate that participant-classified flows generate statistically significant and economically meaningful alpha across multiple investment horizons.
Sortino 2.69 From Flow Decomposition Alone
The headline result is striking. A dollar-neutral long-short strategy with a 1-day holding period, built entirely on participant-classified flow features (no fundamental data, no macro inputs, and no technical indicators), achieved a Sortino ratio of 2.69 and a Sharpe ratio of 1.66 over an 11-year out-of-sample period from 2014 to 2025.
Annual returns averaged 13.0% at 7.55% annualized volatility. Maximum drawdown was contained at −9.9%. And critically, market beta was 0.06 with a downside beta of −0.01 — confirming that the strategy's returns derive from flow-driven idiosyncratic positioning, not hidden exposure to broad equity markets.
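For readers who want to reproduce statistics of this kind on their own return series, here is a minimal sketch of how Sharpe, Sortino, maximum drawdown, beta, and downside beta are commonly computed. It assumes daily simple returns, a zero risk-free rate and target return, and 252 trading days per year. The function names are illustrative and the exact conventions behind the figures above may differ.

```python
import numpy as np

TRADING_DAYS = 252  # assumed annualization factor


def sharpe(r):
    """Annualized Sharpe ratio of daily returns (risk-free rate assumed zero)."""
    return np.mean(r) / np.std(r, ddof=1) * np.sqrt(TRADING_DAYS)


def sortino(r, target=0.0):
    """Annualized Sortino ratio: mean excess return over downside deviation."""
    downside = np.minimum(r - target, 0.0)
    downside_dev = np.sqrt(np.mean(downside ** 2))
    return (np.mean(r) - target) / downside_dev * np.sqrt(TRADING_DAYS)


def max_drawdown(r):
    """Largest peak-to-trough loss of the compounded equity curve (negative number)."""
    equity = np.concatenate(([1.0], np.cumprod(1.0 + r)))
    peak = np.maximum.accumulate(equity)
    return np.min(equity / peak - 1.0)


def beta(r, m):
    """Slope of strategy returns r on market returns m."""
    return np.cov(r, m, ddof=1)[0, 1] / np.var(m, ddof=1)


def downside_beta(r, m):
    """Beta estimated only on days when the market return is negative."""
    mask = m < 0
    return beta(r[mask], m[mask])
```

A beta near zero together with a downside beta near zero, as reported above, means the slope is flat both on average and on down-market days.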
These numbers come from a walk-forward validation framework with strict separation between training (3 years), validation (6 months), and testing (6 months) windows. All reported performance is genuinely out-of-sample.
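The window scheme can be sketched as follows. This is an illustration under stated assumptions, not the production pipeline: time is indexed in whole months, and each fold is assumed to roll forward by the length of the test window so that test periods tile the out-of-sample span without overlap. `Fold` and `walk_forward_folds` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fold:
    train: range  # month indices used to fit the model
    val: range    # month indices used for model selection
    test: range   # month indices reported as out-of-sample


def walk_forward_folds(n_months, train=36, val=6, test=6):
    """Generate train/validation/test month ranges, rolling forward by `test` months."""
    folds = []
    start = 0
    while start + train + val + test <= n_months:
        t0 = start + train
        v0 = t0 + val
        folds.append(Fold(range(start, t0), range(t0, v0), range(v0, v0 + test)))
        start += test  # assumed step: one test window per roll
    return folds


# Example: five years of monthly indices yield three folds.
for fold in walk_forward_folds(60):
    print(fold.train, fold.val, fold.test)
```

Within each fold the three ranges never overlap, so nothing from a test window can influence fitting or model selection for that fold.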
But the short-horizon result, while compelling, isn't the full story. The same dataset generates profitable market-neutral strategies across two additional holding periods — a finding that directly challenges the assumption that flow signals are useful only for high-frequency trading.
At a 10-day horizon, the strategy delivered 5.27% annualized return at 6.40% volatility (Sharpe 0.83, Sortino 1.22), with daily turnover dropping to 87% — meaningfully more implementable than the 1-day variant. At a 60-day horizon, performance remained robust: 6.8% annual return at 7.04% volatility (Sharpe 0.97, Sortino 1.45), with daily turnover falling to just 5%.
Performance stability exceeded 0.94 across all three horizons. All three strategies maintained near-zero beta exposure. The alpha at each frequency comes from different flow features — short-horizon strategies capture mean-reversion in flow shocks, while long-horizon strategies capture cumulative footprint effects — but the underlying data source is the same.

Dollar-neutral long-short portfolios constructed with risk-parity sizing. Short-horizon results assume zero transaction costs (0 bps/side); mid- and long-horizon results are net of 5 bps/side transaction costs. Walk-forward validation with non-overlapping 3-year training, 6-month validation, and 6-month testing windows.
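To make the construction concrete, here is a minimal sketch of a dollar-neutral book with a per-side cost charge. Several pieces are assumptions: inverse-volatility weighting stands in for the risk-parity sizing (the actual scheme is not specified here), a higher signal score is taken to mean expected outperformance, the top and bottom 20% of names are traded, and turnover is measured as the sum of absolute weight changes on a book with gross exposure of 1. All function names are illustrative.

```python
import numpy as np


def dollar_neutral_weights(signal, vol, quantile=0.2):
    """Long the top quantile, short the bottom quantile, inverse-vol sized.

    Long weights sum to +0.5 and short weights to -0.5, so net exposure is 0
    and gross exposure is 1.
    """
    signal = np.asarray(signal, dtype=float)
    inv_vol = 1.0 / np.asarray(vol, dtype=float)
    n = len(signal)
    k = max(1, int(n * quantile))
    order = np.argsort(signal)
    shorts, longs = order[:k], order[-k:]
    w = np.zeros(n)
    w[longs] = 0.5 * inv_vol[longs] / inv_vol[longs].sum()
    w[shorts] = -0.5 * inv_vol[shorts] / inv_vol[shorts].sum()
    return w


def turnover(w_prev, w_new):
    """Traded notional as a fraction of the (unit) gross book."""
    return float(np.abs(np.asarray(w_new) - np.asarray(w_prev)).sum())


def net_return(gross_return, w_prev, w_new, cost_bps_per_side=5.0):
    """Gross period return minus a linear cost on traded notional."""
    return gross_return - turnover(w_prev, w_new) * cost_bps_per_side * 1e-4
```

Under these definitions, setting `cost_bps_per_side=0.0` leaves the gross return unchanged, while at 5 bps/side a full rotation of half the book (turnover of 1.0) costs 5 bps.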

The strategy compounds to nearly 300% over the 11-year out-of-sample period. Drawdown periods cluster around the 2018 volatility spike, March 2020 COVID dislocation, and the 2022 rate-hike cycle, consistent with brief execution-cost sensitivity under extreme market conditions.
Two Independent Signals, Opposite Regimes
Our analysis reveals a structural finding that explains why flow decomposition generates alpha across such different frequencies: institutional and retail flows are effectively orthogonal.
Across the S&P 500, correlations between institutional and retail net flows range from −0.01 to −0.46. They carry independent information about future returns. And their predictive power concentrates in opposite volatility regimes.

Correlations are overwhelmingly near zero and skewed negative (range: −0.01 to −0.46), confirming that retail activity constitutes an independent information channel not captured by institutional positioning.
Institutional flow signals are most informative during periods of elevated market stress. When the baseline institutional flow signal is conditioned on elevated market volatility, its annualized information coefficient information ratio (ICIR, the mean information coefficient divided by its standard deviation) nearly doubled, from 2.26 to 4.11, with all results statistically significant at p<0.001. The mechanism is intuitive: large institutional outflows during volatile markets create temporary dislocations and overbought/oversold conditions that revert predictably.
Retail flow signals follow the opposite pattern. In calm markets, retail flow signals achieve an ICIR of 1.85. Their predictive content concentrates in the overnight return component during tranquil periods, consistent with retail activity being more momentum-driven and informationally relevant when markets are stable.
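The regime comparison can be illustrated with a short sketch of how an ICIR and its volatility-conditioned version are typically computed. The assumptions: the information coefficient (IC) is the daily cross-sectional rank correlation between the signal and the next-period return, annualization uses the square root of 252, and a regime is defined by a simple threshold on a market volatility series. The threshold, the volatility measure, and all function names are hypothetical and do not reflect the actual signal specification.

```python
import numpy as np


def _rank(x):
    """Ordinal ranks (ties broken by position; adequate for continuous data)."""
    return np.argsort(np.argsort(x)).astype(float)


def daily_ic(signal, fwd_ret):
    """Rank IC per day. Inputs are arrays of shape (n_days, n_stocks)."""
    return np.array([
        np.corrcoef(_rank(s), _rank(r))[0, 1]
        for s, r in zip(signal, fwd_ret)
    ])


def icir(ic, periods_per_year=252):
    """Annualized ICIR: mean IC over its standard deviation, scaled by sqrt(periods)."""
    return ic.mean() / ic.std(ddof=1) * np.sqrt(periods_per_year)


def conditional_icir(ic, market_vol, threshold, high_vol=True):
    """ICIR restricted to days above (or at or below) a volatility threshold."""
    mask = market_vol > threshold if high_vol else market_vol <= threshold
    return icir(ic[mask])
```

Calling `conditional_icir(..., high_vol=True)` on an institutional signal and `high_vol=False` on a retail signal mirrors the stress-versus-calm comparison described above.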

This regime-dependent orthogonality has a portfolio construction implication that extends beyond any single strategy. Our data provides access to two independent alpha channels covering the full volatility spectrum. During stress, institutional signals are at peak informativeness. During calm, retail signals take over. There is no dead zone.
The Factor-Stripping Test
Sophisticated allocators evaluating any new signal source will immediately ask: is this just beta in disguise? We addressed this head-on.
Using our Endogenous Factor Model to residualize returns — removing all systematic risk components before evaluating the flow signal — the ICIR actually improved. For the baseline signal, it rose from 2.13 (raw returns) to 2.26 (residual returns). For the enhanced signal, from 3.86 to 4.11.

Predictive power is concentrated in the overnight leg and decays rapidly over subsequent days. Rolling IC is generally stable across regimes, with temporary attenuation during the COVID-19 dislocation period (2020). The signal displays consistent mean-reversion rather than momentum behavior.
This is unusual. Most signals weaken when you strip out factor exposure, because part of their apparent predictive power was capturing systematic tilts. The fact that our signal gets stronger confirms it captures genuine idiosyncratic information — stock-specific content that common risk factors don't explain.
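The idea behind the factor-stripping test can be sketched with a plain cross-sectional regression. This is a stand-in, not the Endogenous Factor Model itself, whose construction is not described here: it assumes a matrix of known factor exposures per stock and removes their linear contribution (plus an intercept) by ordinary least squares. The signal would then be evaluated against the residuals instead of raw returns.

```python
import numpy as np


def residualize(returns, exposures):
    """Remove the part of one day's stock returns explained by factor exposures.

    returns:   array of shape (n_stocks,)
    exposures: array of shape (n_stocks, n_factors), hypothetical factor loadings
    """
    X = np.column_stack([np.ones(len(returns)), exposures])
    coef, *_ = np.linalg.lstsq(X, returns, rcond=None)
    return returns - X @ coef


# Example with synthetic data: residuals are orthogonal to the exposures.
rng = np.random.default_rng(0)
F = rng.normal(size=(50, 2))
r = 0.01 + F @ np.array([0.5, -0.2]) + rng.normal(scale=0.01, size=50)
resid = residualize(r, F)
```

By construction the residuals have zero cross-sectional mean and zero correlation with each exposure column, so any predictive power that survives cannot come from those factors.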
What This Means for Practitioners
The multi-horizon result creates distinct implementation pathways depending on mandate and operational capacity.
For systematic and statistical arbitrage desks, the 1-day strategy offers the strongest risk-adjusted returns but demands sophisticated execution. Daily turnover exceeds 150%, making the strategy dependent on efficient market access and careful impact management. The positive skew of 0.26 in the return distribution suggests the signal is relatively undiscovered — not yet a crowded trade.
For medium-frequency mandates — swing trading, sector-rotation overlays, or tactical allocation — the 10-day horizon offers a practical sweet spot. Alpha strength compresses relative to the 1-day variant, but turnover falls substantially, execution costs diminish, and the equity curve smooths. This is where flow data becomes accessible to a broader set of institutional users.

The strategy compounds to 73% over the backtest period with a smoother equity curve than the short-horizon variant. Maximum drawdown of −9.5% and near-symmetric return distribution (skew −0.04) reflect lower sensitivity to single-day execution shocks.
For large allocators prioritizing scalability, the 60-day horizon is the most implementable. At 5% daily turnover and a Sharpe near 1, the strategy can absorb meaningful capital. Cumulative flow features capture positioning build-ups that unfold over weeks, providing a structural signal that complements higher-frequency strategies in a multi-strategy portfolio.

Returns exceed 100% cumulatively with a maximum drawdown of −9.1% and performance stability of 0.96. Daily turnover of ~5% makes this configuration operationally feasible for large institutional mandates. Minimal market beta (0.04) confirms that the strategy diversifies rather than duplicates broad equity exposure.
Across all three horizons, the near-zero beta exposure means flow-based strategies diversify rather than duplicate existing market exposure. For a CIO constructing a portfolio of alpha sources, this represents an independent return stream.
Robustness Across Market Regimes
Our dataset spans January 2007 to July 2025 — covering four distinct market regimes: the Global Financial Crisis, the post-crisis recovery, the COVID-19 dislocation, and the 2022–23 rate-hike cycle. We validated strategies out-of-sample across all of these periods. Performance stability above 0.94 indicates consistent behavior regardless of the macro environment.
The study covers approximately 210,000 stock-day observations in the S&P 500 alone, with complementary analysis on the Russell 1000 and Russell 2000 universes. Coverage averages 95% of each index. The use of time-varying index constituents and pre-delisting filters controls for survivorship bias.
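A point-in-time universe filter of the kind described can be sketched as below. The record layout (ticker, date added, date removed) and the tickers are hypothetical; the point is only that membership is evaluated as of each trading day, so names that later left the index still appear in the history for the period when they were members.

```python
from datetime import date


def universe_on(memberships, day):
    """Return tickers that were index members on `day`.

    memberships: iterable of (ticker, added, removed) with removed=None if still a member.
    A stock is treated as a member from `added` up to, but not including, `removed`.
    """
    return sorted(
        ticker
        for ticker, added, removed in memberships
        if added <= day and (removed is None or day < removed)
    )


# Hypothetical membership history
memberships = [
    ("AAA", date(2007, 1, 1), None),
    ("BBB", date(2007, 1, 1), date(2015, 6, 1)),
    ("CCC", date(2016, 3, 1), None),
]
print(universe_on(memberships, date(2010, 1, 4)))
print(universe_on(memberships, date(2020, 1, 2)))
```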
Our research builds on established academic work. Gabaix and Koijen's inelastic markets hypothesis provides the theoretical foundation: even modest net flows can exert persistent price impact. Barber, Odean, and Zhu's work on retail trading biases informs the retail signal design. We extend this literature by demonstrating that flow decomposition by investor type — at a granularity and history unavailable in academic datasets — translates theory into implementable alpha.
Other Research on the Value of Investor Flow Analytics by Investor Type
For decades, asset pricing has been dominated by fundamentals. Our research proves this view is incomplete.
The primary mechanical driver of stock prices is not fundamentals but the force of institutional order flows. Access our research: "Decoding Realtime Order Book Dynamics To Measure Market Inelasticity".
By isolating institutional from retail flow, our model successfully predicted net 13F direction for S&P 500 stocks. Access our research: "Decoding Realtime Order Book Dynamics To Predict SEC Form 13F Filings".
What Comes Next
We're actively exploring several extensions of this research. Finer participant classification — beyond the current institutional, retail, and HFT categories — could reveal additional independent signal sources. Cross-asset applications (fixed income, futures, FX) and international equity markets would test whether the mechanisms we've identified here generalize beyond U.S. equities.
The deeper question is whether these flow patterns reflect a persistent feature of market structure — the enduring information gap between trade execution and trade identification — or a temporary inefficiency that competition will erode. Our 17-year track record, spanning multiple regimes and exhibiting positive return skew, suggests the former.
Every trade is recorded. Until now, none was identified.


