
How Real-Time Flow Data Revealed the Palantir Sell-Off Before It Happened

Updated: Sep 19

Here is the transcript of the podcast:

 Have you ever felt like you're just drowning in market data? You know, charts, headlines, everything coming at you, and you're just trying to figure out what really makes US stocks tick.

Oh, absolutely. It's a constant flood, and you want to be informed, right? But cutting through all that noise to find what's genuinely insightful. That feels like the real challenge.

It really does. It's more than just a feeling. It's uh it's a fundamental issue today.

Exactly. The key isn't just more data. It's knowing which signals actually matter and, you know, how to interpret them quickly and effectively.

And that is precisely our mission today. We're going on a deep dive. We want to understand what truly drives US stock returns and how some, uh, really cutting-edge analytics are starting to uncover these hidden signals. Signals most people just miss.

Yeah, we've got a really fascinating stack of research here. Everything from, like, groundbreaking academic papers that totally challenge old beliefs about market efficiency, right? Beliefs people have held for a long time. To some proprietary insights from Exponential Technology. They're a company that's really at the forefront of turning raw data into, well, actionable intelligence.

The Zoo of Characteristics and Evolving Market Efficiency

Okay, so let's kick things off. Imagine, just for a second, you're faced with this mountain of market data, hundreds of signals, maybe, all claiming they can predict the future. How do you find the real clues in all that?

Well, that's the core of the challenge John Cochrane famously posed back in, uh, 2011. He talked about a veritable zoo of firm characteristics.

A zoo. I like that analogy.

Yeah. And his question was sharp. Which of these things actually give us independent information about say average US monthly stock returns?

Okay. So, who took up that challenge?

That's where researchers Green, Hand, and Zhang (GHZ for short) stepped in. And they did it with incredible rigor.

How so?

Well, they looked at 94 characteristics all at the same time, using decades of data. You know, from CRSP, Compustat, IBES, the big databases, from 1980 to 2014.

That's a huge data set.

It is. And crucially, they were really careful to avoid the common traps.

What kind of traps?

Things like, uh, data snooping bias. You know, finding patterns that look real in past data but just won't repeat.

Ah, okay. Seeing ghosts in the data.

Exactly. And also making sure not to give too much weight to those really tiny micro-cap stocks, which can sometimes skew results.

Right, gotcha. So what did they find after all that careful work?

Well, this is where it gets really surprising. For the non-micro-cap stocks, so most of the market, only 12 characteristics actually provided independent predictive information.

Wait, just 12 out of 94?

Just 12? Think about that. Nearly a hundred supposed signals, and only a dozen offered unique, reliable insight. That's stunning. Like finding a needle in a haystack. What were some of these 12? Anything familiar?

Yeah, some familiar names like book-to-market, uh, one month momentum, but also things like cash holdings, R&D to market cap, return volatility. Interesting. And what's really wild is whether they looked at them one by one or altogether, that same number kept popping up, 12. It suggests there's this like intrinsic scarcity of independent predictors.

 Okay, so scarcity is one key takeaway, but I sense there is more to the story.

Oh, there is. A really dramatic twist, actually. The GHZ study pinpointed this profound shift happening around 2003.

2003.

What happened then?

Before 2003, those 12 characteristics in the non-micro caps gave a mean monthly hedge return of like 1.9%. Pretty significant.

Yeah, that's not trivial at all.

But after 2003, only two characteristics were still independently predictive for those stocks.

Two.

And the hedge return it dropped so much it was basically statistically insignificant. Zero effectively.

 Wow, that's a massive change. Almost like a switch flipped overnight. What could cause something like that?

Well, the researchers suggested it was a perfect storm, really. A confluence of changes in how information flows and how trading happens.

Okay, like what?

Think about Sarbanes-Oxley coming in, the SEC demanding faster filings, and, crucially, the explosion of automated trading systems, high-frequency trading algorithms.

Ah, so the machines got faster at arbitraging away those predictable patterns.

 Precisely. It seems like the market just became much much more efficient very quickly, eroding the power of those traditional factors.

Did that happen across the board? What about the micro caps?

Even for micro caps, the number of predictors dropped significantly. Still, interestingly, some predictability did hang on in that smaller segment.

So, what does this huge shift mean for you, the listener, trying to navigate the markets today?

It means quite simply that if your models aren't constantly learning and adapting, they're basically flying blind. Static models based on yesterday's signals, they're well, they're dead in the water.

Which brings us to a follow-up study, right? By Han, He, Rapach, and Zhou (HHRZ).

Exactly. HHRZ confirmed that, yeah, those 94 characteristics can still be useful for forecasting returns. But, and this is a big but, you absolutely need advanced forecast combination methods.

Combination methods like what?

Things like lasso and elastic net. Think of them as um intelligent filters. They sift through all those potential signals and pick out the ones that are truly relevant right now. And they help avoid overfitting that data snooping problem again.

Precisely. They guard against finding those phantom patterns. And what these methods showed is that, yeah, on average maybe around 30 characteristics are consistently relevant for cross-sectional forecasting.

Okay, so more than the original 12, but still not all 94?

Right. But there's significant churn. The specific characteristics that matter, they change over time. The key point is, even after 2003, these smarter combined forecasts still provided useful information.
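As a quick aside for readers who want to see the mechanics: here is a minimal sketch of the kind of penalized selection these combination methods rely on. Everything is simulated for illustration, 94 invented characteristic columns of which only a handful truly matter, and a cross-validated lasso (scikit-learn's `LassoCV`) shrinking the irrelevant coefficients to exactly zero. This is not the authors' actual code or data.

```python
import numpy as np
from sklearn.linear_model import LassoCV

# Simulated panel: rows = stocks, columns = 94 candidate characteristics.
rng = np.random.default_rng(0)
n_stocks, n_chars = 500, 94
X = rng.standard_normal((n_stocks, n_chars))

# Assume only the first five characteristics carry real information.
true_beta = np.zeros(n_chars)
true_beta[:5] = [0.8, -0.6, 0.5, 0.4, -0.3]
y = X @ true_beta + rng.standard_normal(n_stocks) * 0.5  # next-period returns

# Cross-validated lasso picks the penalty strength, then the L1 penalty
# acts as the "intelligent filter": most coefficients go to exactly zero.
model = LassoCV(cv=5, random_state=0).fit(X, y)
selected = np.flatnonzero(model.coef_)
print(f"characteristics kept: {len(selected)} of {n_chars}")
```

Elastic net (`ElasticNetCV`) works the same way but blends the L1 penalty with an L2 term, which helps when the characteristics are highly correlated with each other.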

Unlike the conventional forecasts that just looked at factors individually.

Exactly. Those became much less effective after the shift.

So the takeaway seems to be: while the old simple rules don't work anymore, the right analytical tools, the adaptive tools, can still find value.

That's it. You need models that aren't just smart, but adaptive. Constantly learning.

The Hidden Signals in Options Data

Okay. We've talked about how the classic fundamental characteristics, their predictive power, has really changed. But what about other parts of the market, maybe more dynamic areas? You mentioned options earlier.

Right. The options market, often overlooked by many, but it turns out it holds some, uh, surprisingly potent insights.

So what's the first clue? Like the overall volume of options trading compared to stock trading? The O/S ratio?

Yeah, the option-to-stock volume ratio. What's fascinating, and maybe a bit counterintuitive, is that the overall O/S ratio actually negatively predicts stock returns in the following week.

Negatively. So more options trading predicts lower returns. How does that work?

Well, one interpretation is that really high overall options activity might just signal, you know, too much noise, or maybe a crowded trade that's about to unwind or reverse.

 Interesting. So, the total volume isn't the whole story.

Not at all. The real juice comes when you break it down into what are called signed OS measures.

Signed measures.

Yeah. Basically distinguishing between opening trades, people taking new positions and closing trades, people getting out of positions and also whether those trades involve calls or puts.

Ah, so it's not just how much options trading but the nature of that trading. Can you give an example?

Sure. So, when traders are opening new long call positions, that's OBC, open-buy calls over stock volume. That activity tends to positively predict future returns.

 Makes sense. Buying calls usually means you expect the price to go up.

Exactly. Conversely, opening new short call positions, OSC, tends to predict lower returns.

Okay. And opening trades are generally more predictive than closing trades?

Generally, yes. Opening volumes seem to carry more fresh information. There is one key exception, though. Unwinding bought call positions. Closing out those long call trades still seems to carry some predictive weight.
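To make the bookkeeping concrete, here is a toy calculation of the total and signed O/S measures just described. All the volume figures and the dictionary keys are invented for illustration; real studies build these measures from option-exchange data that tags each trade as open or close, buy or sell, call or put.

```python
# Hypothetical one-day figures for a single stock (all numbers invented).
# Option volumes are split by direction (buy/sell) and by whether the
# trade opens a new position, as in signed O/S measures.
option_volume = {
    "open_buy_calls": 12_000,   # OBC: traders opening new long call positions
    "open_sell_calls": 7_000,   # OSC: traders opening new short call positions
    "open_buy_puts": 4_000,
    "open_sell_puts": 3_000,
}
stock_volume = 2_500_000  # shares traded in the underlying that day

# Each signed O/S measure scales one slice of option activity by stock volume.
signed_os = {k: v / stock_volume for k, v in option_volume.items()}

# The total O/S ratio lumps all option volume together, losing the sign detail.
total_os = sum(option_volume.values()) / stock_volume
print(signed_os["open_buy_calls"], total_os)
```

The point of the decomposition is visible even in this toy: the same `total_os` can hide very different mixes of opening-buy versus opening-sell activity, and it is the mix that the research found predictive.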

And does it matter if the options are, like, way out of the money versus closer to the current price? The moneyness?

Absolutely, it plays a significant role. Options that are out of the money, the OTM ones.

Right. The ones that need a bigger price move to pay off. Higher leverage.

Exactly. Those tend to show stronger predictive power for some of these signed O/S measures. The thinking is that informed traders might use these higher-leverage options when they have strong conviction, to maximize potential gains.

Now, I can hear some people thinking, "Okay, but isn't this just reflecting investor disagreement? People have different opinions, they trade options."

That's a really important point and a common alternative theory. And yes, these O/S measures are highly correlated with metrics of investor disagreement. But there's a but.

There's a but?

The research shows pretty clearly that the return predictability isn't just because of disagreement. There seems to be a genuine information component beyond simply differing views.

Okay. So, if it's not just disagreement and it's not the overall volume, but these specific types of trades, then who are the informed traders leaving these signals? Is it the big institutions?

 This is where it gets really intriguing. The studies found that signed volumes coming from firm proprietary traders, you know, the prop desks of the big banks and institutions. Yeah. Their trades are generally not informative about future stock returns in this context.

Really? Why not?

Well, the hypothesis is maybe they have other ways to leverage their information. They don't necessarily need the synthetic leverage of options for those specific trades. Maybe they use equities directly or other derivatives.

 Okay. So, if it's not the big prop desks, then whose options trades are predictive?

Customer trades, small, medium, and large customer trades, those are informative.

Wow. So, the information flow is maybe broader than we think. It's not just the institutional insiders moving the needle here.

It definitely does. It implies that valuable predictive information is flowing through a much wider base of market participants. That really makes you think, doesn't it? How these different players with their unique behaviors, the pros, the retail customers, how they all create these distinct signals, kind of hidden in plain sight if you know how to look at the data.

Exactly. It shows just how nuanced market analysis can be once you start digging beneath the surface.

Cutting-Edge Analytics with Exponential Technology

All right, so we've uncovered these academic findings: the scarcity of predictive characteristics, the shift in 2003, the hidden signals in options trades. Fascinating stuff. But the big question for our deep dive is, can you actually use this? Does it translate into a real-world edge? And that's where a company like Exponential Technology, xTech, comes in, bridging that gap between academic discovery and practical application, taking these insights and building tools to generate actionable intelligence.

So tell us about their approach. What's something like xTech Equity Flow?

xTech Equity Flow is a really good example. It's basically a data set, a very robust one, built specifically on a deep understanding of these option market behaviors we just talked about.

And what do people use it for?

All sorts of things. Developing strategies, feeding machine learning models, identifying potential sources of edge. It's super versatile. You can use it for dynamic analysis, cluster analysis, back testing, building sentiment indicators,

Even for things like zero-DTE options, the zero-day ones?

Yeah, absolutely. It includes analysis for those too. And it uses specific metrics derived from the flow data, like the volume-weighted average strike-to-spot price ratio (VWKS), to actually predict returns on the underlying stock.

Okay, that's focused on options flow. But you mentioned xTech has a broader approach too. It's been called EFM, right?

Right. Their adaptive cross-asset endogenous factor model, or EFM. This is, well, pretty state-of-the-art analytical firepower.

Sounds like it. Break it down for us. Endogenous factor model?

Okay, so traditional factor models often rely on predefined economic variables, right? Like inflation, interest rates, etc. The EFM is different. It extracts the latent risk factors directly from the market data itself.

Ah, so the data tells it what the factors are rather than humans guessing.

Exactly. It uses advanced stats like principal component analysis (PCA) to do this. The big advantage is it reduces human bias and you know the risk of specifying the model incorrectly.
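Here's a small illustrative sketch of that idea, not xTech's actual model: extracting latent factors from returns with PCA (via scikit-learn) on simulated data where two hidden common factors drive a cross-section of assets.

```python
import numpy as np
from sklearn.decomposition import PCA

# Simulated returns: two unobserved common factors drive 50 assets.
rng = np.random.default_rng(1)
n_periods, n_assets = 1000, 50
factors = rng.standard_normal((n_periods, 2))       # the hidden drivers
loadings = rng.standard_normal((2, n_assets))       # each asset's exposures
returns = factors @ loadings + 0.3 * rng.standard_normal((n_periods, n_assets))

# PCA recovers the dominant common components directly from the data,
# with no economic variables specified up front.
pca = PCA(n_components=5).fit(returns)
explained = pca.explained_variance_ratio_
print(explained.round(3))
```

The recovered components are the "endogenous" factors in miniature: the data, not the modeler, decides what they are, which is exactly the bias-reduction argument made above. In this toy setup the first two components soak up almost all of the common variance.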

 Makes sense. What else is unique about it?

Well, it's endogenous, as we said. It's also cross-asset: it works consistently across stocks, futures, FX, even crypto. It uses high-frequency data, down to the minute, with history going back over 17 years.

Wow. One minute granularity for 17 years. That's massive.

It is. Plus, it incorporates advanced covariance estimation techniques for really stable and accurate risk matrices.

And how does this help with finding alpha or managing risk?

It calculates what are called EFM residual returns. Basically, you take the raw return of an asset and subtract the part that's explained by exposure to these EFM factors.

And what's left over, the residual, is potentially pure alpha, the part not explained by known risks.

Exactly. It's super useful for alpha generation and also for really precise risk attribution.

Okay, so they have these powerful tools. How do they actually measure whether a signal like the equity flow is strong, and what patterns have they found using these tools?

Good question. They measure signal strength rigorously, using stats like the cross-sectional Spearman rank information coefficient, or IC, and the Pearson IC. Think of it as a strength meter for the signal. And they test this against those EFM residual returns we just talked about, over long periods, like 2008 right through to 2025, across the S&P 500 universe.
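As a rough illustration of both ideas, residual returns and the rank IC, here is a simulated one-period cross-section. The factor structure, the signal, and every number are invented; xTech's EFM derives its factors from the data itself rather than fixing them as done here.

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(2)
n_stocks = 300
exposures = rng.standard_normal((n_stocks, 3))   # betas to three factors
factor_returns = np.array([0.8, -0.5, 0.3])      # this period's factor moves
signal = rng.standard_normal(n_stocks)           # hypothetical flow signal

# Raw returns = factor-driven part + a signal-related part + noise.
raw = exposures @ factor_returns + 0.2 * signal + 0.5 * rng.standard_normal(n_stocks)

# Residual return: subtract the fitted factor component (here via OLS).
beta_hat, *_ = np.linalg.lstsq(exposures, raw, rcond=None)
residual = raw - exposures @ beta_hat

# Cross-sectional Spearman rank IC: rank correlation of signal vs. residuals.
ic, _ = spearmanr(signal, residual)
print(f"Spearman rank IC: {ic:.2f}")
```

Because the factor part has been stripped out, any positive IC here reflects the signal's relationship with the unexplained, potentially-alpha component, which is the "strength meter" logic described above.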

And what are some of the interesting patterns they've uncovered? You mentioned institutional versus retail flow earlier.

Right, that's a really fascinating one. They found a stark difference. Institutional net flow, the big players' activity, tends to show short-term mean reversion for T+0 close-to-close residual returns.

Meaning their big trades often see prices move back a bit shortly after?

Exactly. Their impact seems to cause a temporary push that then fades slightly, on average within the same day, after accounting for market moves. But retail flow, that shows significant momentum over holding periods for those same T+0 residual returns.

So individual investors' flows tend to predict continued movement in the same direction that day. Completely different patterns.

Completely different. It highlights really distinct behaviors between these groups.

And this points toward, well, market inefficiencies, right? If these flows are predictable to some extent.

Absolutely. Quantifiable inefficiencies. They ran statistical tests, autocorrelation and Hurst exponent analysis, on institutional cumulative net flow, and they confirmed inherent inefficiencies and momentum. A significant chunk of symbols, like 71%, showed autocorrelation, meaning the flow wasn't random. It had predictable patterns.
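The two diagnostics just mentioned can be sketched in a few lines. This toy version estimates the Hurst exponent from how the spread of lagged differences scales with the lag, and compares a plain random walk against a series with positively autocorrelated (persistent) increments. It is a simplified estimator on simulated data, not the firm's actual test suite.

```python
import numpy as np

def hurst_exponent(series, max_lag=50):
    """Estimate the Hurst exponent from how lagged differences scale.

    std(X[t+lag] - X[t]) grows roughly like lag**H: H near 0.5 means a
    random walk, H > 0.5 persistent trending, H < 0.5 mean reversion.
    """
    lags = np.arange(2, max_lag)
    tau = np.array([np.std(series[lag:] - series[:-lag]) for lag in lags])
    slope, _ = np.polyfit(np.log(lags), np.log(tau), 1)  # log-log slope = H
    return slope

rng = np.random.default_rng(3)
n = 5000

# A plain random walk: H should come out near 0.5.
walk = np.cumsum(rng.standard_normal(n))

# A persistent series: AR(1) increments with positive autocorrelation,
# so moves tend to continue, pushing H well above 0.5.
shocks = rng.standard_normal(n)
increments = np.empty(n)
increments[0] = shocks[0]
for t in range(1, n):
    increments[t] = 0.9 * increments[t - 1] + shocks[t]
persistent = np.cumsum(increments)

print(round(hurst_exponent(walk), 2), round(hurst_exponent(persistent), 2))
```

A cumulative flow series scoring around 0.65 on this kind of estimator would sit in the persistent, trending regime rather than the random-walk regime, which is the claim being made here.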

Not a random walk.

Not a random walk. The average Hurst exponent was around 0.65, which strongly suggests persistent trending behavior in the equity flow, not randomness.

Okay, this is getting concrete now. Can you give us a real-world example, something tangible, where these tools spotted something significant?

Yeah, there's a really compelling case study involving Berkshire Hathaway's huge sale of Apple stock.

Ah, I remember hearing about that sale.

Right. Well, xTech's flow indicator actually identified this massive selling activity more than 135 days before it became public knowledge through the 13F filing in August 2024.

135 days early? Seriously?

Seriously. And not only did they spot it early, their system estimated the trade value, over $51 billion, to within about 3% accuracy.

That is... that's incredible. Spotting a $50 billion trade that far in advance, with that accuracy, really shows the power.

It does. They have these cumulative normalized net flow dashboards for various stocks, Meta, Microsoft, Nvidia, TSMC, Tesla, others, and you can visually track these larger flow trends and see how they relate to the stock's residual returns. It makes these underlying currents visible.

 Wow. Okay. So, the detection capability is amazing. What about actually trading on this? What do the back testing results look like? Can you actually quantify the edge?

Yeah, the back tests are pretty striking. They ran one strategy, sort of a quant challenge setup, on Apple from 2019 to 2024. It showed annualized returns over 64%.

64% annually.

Yeah. With a Sharpe ratio of 1.54, which indicates strong risk-adjusted returns.

That's impressive for a single-stock strategy. What about broader, longer-term tests?

They did those too: extended back tests over 15 years, from 2010 to 2025. They used different risk setups, long-short, long, neutral, short, and the strategies consistently generated significant cumulative returns.

So, robust profitability over a long period?

Seems so. Especially on the long side, actually. They have specific examples too, like flow momentum reversal strategies for stocks like TSMC and MicroStrategy, showing really strong cumulative gains over those long back-test periods.
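For readers who want the mechanics behind figures like "64% annualized" and "a Sharpe ratio of 1.54": here is how those two statistics are conventionally computed from a daily return series. The returns below are simulated, not the actual backtest output.

```python
import numpy as np

# Simulated daily strategy returns: five years of trading days with a
# positive mean (numbers are invented for illustration only).
rng = np.random.default_rng(4)
daily = rng.normal(loc=0.002, scale=0.02, size=252 * 5)

# Annualized return: compound the daily series, then rescale to one year.
ann_return = (1 + daily).prod() ** (252 / len(daily)) - 1

# Sharpe ratio (ignoring the risk-free rate for simplicity): mean over
# standard deviation of daily returns, scaled by sqrt(252) to annualize.
sharpe = daily.mean() / daily.std(ddof=1) * np.sqrt(252)
print(f"annualized return: {ann_return:.1%}, Sharpe: {sharpe:.2f}")
```

The Sharpe ratio is the headline risk-adjustment: two strategies with the same annualized return can have very different Sharpe ratios if one took a much bumpier path to get there.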

So this is where all that theory, all that complex data analysis really hits the road. It translates into tangible historical performance figures.

Exactly. And for you, the listener, it shows how these sophisticated tools can take this incredibly complex, noisy market data and distill it into something potentially actionable. Those market inefficiencies we talked about, they become quantifiable, potentially exploitable. It feels like a fundamental shift in how you can even approach understanding the market, let alone participate in it.

Outro

Well, we have certainly covered a lot of ground today. It's been quite the deep dive.

It really has. We started with that zoo of characteristics,

 right? And finding that only a handful really had independent predictive power, especially after that big market efficiency shift around 2003. It really pruned the zoo, didn't it?

It did. Then we moved into the uh the hidden world of options data.

Yeah. Uncovering how signed volumes, the type of trade, the moneyness, how those details reveal subtle but quite powerful signals about where stocks might be heading. Particularly from customer trades, interestingly enough.

And finally, we saw how a company like exponential technology is taking all these concepts and running with them.

Absolutely. Using tools like xTech Equity Flow and that really sophisticated endogenous factor model, EFM. Their research vividly shows how institutional flows and retail flows leave these distinct, predictable footprints. We saw the backtested results, the potential for impressive performance, and that incredible Berkshire Hathaway example, predicting a massive trade months in advance. Just amazing. It really hammers home the power of deep, data-driven analysis in finance today. The more precisely you can dissect and interpret this torrent of information,

 the clearer the opportunities become where maybe others just see random noise.

Exactly.

Which leaves us, and you, with a final, maybe provocative, thought to chew on. If even these huge institutions, trying their best to mask their activity, still leave behind subtle footprints that can predict returns, what else is out there? What other market movements look chaotic or random on the surface? What other patterns might be lurking underneath, just waiting to be revealed if you look at them through the right analytical lens, the kind of cutting-edge lens we've been discussing today?

 It certainly makes you wonder what else is hiding in the data. It does. So, keep exploring, keep questioning, and keep digging into how information and especially the smart analysis of that information continues to shape the world of investment opportunities.


Learn how to subscribe to macro data sources and forecasts: https://www.exponential-tech.ai/
