machine learning – QuantPedia

The Memorization Problem: Can We Trust LLMs’ Forecasts?

July 17, 2025

Everyone is excited about the potential of large language models (LLMs) to assist with forecasting, research, and countless day-to-day tasks. However, as their use expands into sensitive areas like financial prediction, serious concerns are emerging, particularly around memorization of training data. In the recent paper “The Memorization Problem: Can We Trust LLMs’ Economic Forecasts?”, the authors highlight a key issue: when LLMs are tested on historical data within their training window, their high accuracy may reflect memorization of past outcomes rather than real forecasting ability. This undermines the reliability of backtests and creates a false sense of predictive power.
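One simple way to probe this concern is to report forecast accuracy separately for dates inside and outside the model's training window. The sketch below is a minimal illustration, not the paper's methodology: the `accuracy_by_cutoff` helper, the cutoff date, and the sample records are all hypothetical.

```python
from datetime import date

def accuracy_by_cutoff(records, cutoff):
    """Split forecast hit rates into in-window and out-of-window samples.

    records: iterable of (as_of_date, predicted, actual) tuples.
    cutoff:  assumed training-data cutoff of the model (hypothetical here).
    """
    buckets = {"in_window": [], "out_of_window": []}
    for as_of, predicted, actual in records:
        key = "in_window" if as_of <= cutoff else "out_of_window"
        buckets[key].append(predicted == actual)
    # Hit rate per bucket; None when a bucket has no observations.
    return {k: (sum(v) / len(v) if v else None) for k, v in buckets.items()}

# Made-up records purely for illustration.
records = [
    (date(2022, 3, 1), "up", "up"),
    (date(2022, 9, 1), "down", "down"),
    (date(2024, 3, 1), "up", "down"),
    (date(2024, 9, 1), "up", "up"),
]
result = accuracy_by_cutoff(records, cutoff=date(2023, 12, 31))
print(result)
```

A large gap between the two hit rates would be consistent with memorization, although it does not prove it. Only the out-of-window figure says anything about genuine forecasting ability.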

Can We Profit from Disagreements Between Machine Learning and Trend-Following Models?

June 26, 2025

When using machine learning to forecast global equity returns, it’s tempting to focus on the raw prediction: whether a given stock market is expected to go up or down. But our research shows that the real value lies elsewhere. What matters most isn’t the level or direction of the machine learning model’s forecast but how much it differs from a simple, price-based benchmark, such as a naive moving average signal. When that gap is wide, it often reveals hidden mispricings. In other words, it’s not about whether the ML model predicts positive or negative returns but whether its view disagrees sharply with what a basic trend-following model would suggest. Those moments of disagreement offer the most compelling opportunities for tactical country allocation.
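The idea of ranking markets by disagreement can be sketched as follows. This is only an illustration under stated assumptions: the benchmark is taken to be a trailing average of past returns, the gap is the plain difference between the two forecasts, and the function names, country labels, and numbers are hypothetical. The teaser does not specify the exact construction or how the gap is traded.

```python
def moving_average_forecast(returns, window):
    """Naive benchmark: the mean of the last `window` returns."""
    recent = returns[-window:]
    return sum(recent) / len(recent)

def rank_by_disagreement(ml_forecasts, return_histories, window=3):
    """Rank countries by the absolute gap between ML and benchmark forecasts."""
    gaps = {}
    for country, ml_forecast in ml_forecasts.items():
        benchmark = moving_average_forecast(return_histories[country], window)
        gaps[country] = ml_forecast - benchmark
    # Largest absolute disagreement first.
    return sorted(gaps.items(), key=lambda kv: abs(kv[1]), reverse=True)

# Made-up inputs purely for illustration.
ml_forecasts = {"A": 0.02, "B": -0.01, "C": 0.005}
return_histories = {
    "A": [0.01, 0.02, 0.03],
    "B": [0.02, 0.03, 0.04],
    "C": [0.00, -0.01, -0.02],
}
ranking = rank_by_disagreement(ml_forecasts, return_histories)
for country, gap in ranking:
    print(country, round(gap, 4))
```

In this toy example, country B ranks first because the ML forecast sits far below what recent price action implies, while country A ranks last because the two views agree.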

Can We Finally Use ChatGPT as a Quantitative Analyst?

May 30, 2025

In two of our previous articles, we explored the idea of using artificial intelligence to backtest trading strategies. Since then, AI has continued to develop, with tools like ChatGPT evolving from simple Q&A assistants into more complex tools that may aid in developing and testing investment strategies—at least, according to some of the more optimistic voices in the field. Over a year has passed since our first experiments, and with all the current hype around the usefulness of large language models (LLMs), we believe it’s the right time to critically revisit this topic. Therefore, our goal is to evaluate how well today’s AI models can perform as quasi-junior quantitative analysts—highlighting not only the promising use cases but also the limitations that still remain.

Is Machine Learning Better in Prediction of Direction or Value?

May 21, 2025

Building machine learning models for trading is full of nuances, and one important but often overlooked question is: what exactly should we try to predict, the direction of the next market move or the actual value of the asset’s return? A recent paper by Cheng, Shang, and Zhao, titled “Direction is More Important than Speed”, offers a clear and practical answer. Their research shows that focusing on direction (simply whether returns will be positive or negative) leads to better model accuracy and, more importantly, stronger real-world investment performance. This is especially true when using machine learning methods, where predicting the direction allows models to better capture downside risks and build more effective trading strategies. For anyone using ML in finance, this paper makes a strong case that predicting where the market is headed is often more valuable than predicting how far it will go.
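The distinction between the two targets can be made concrete with a small sketch. It scores the same forecasts in two ways: a hit rate for direction and a mean squared error for value. The helper names and the numbers are hypothetical, and this is not the evaluation setup used in the paper.

```python
def direction_labels(returns):
    """Binary target: 1 for a positive return, 0 otherwise."""
    return [1 if r > 0 else 0 for r in returns]

def hit_rate(predicted_labels, realized_returns):
    """Share of periods in which the predicted direction was correct."""
    actual = direction_labels(realized_returns)
    hits = sum(p == a for p, a in zip(predicted_labels, actual))
    return hits / len(actual)

def mean_squared_error(predicted_returns, realized_returns):
    """Average squared gap between predicted and realized return values."""
    errors = [(p - r) ** 2 for p, r in zip(predicted_returns, realized_returns)]
    return sum(errors) / len(errors)

# Made-up data purely for illustration.
realized = [0.02, -0.01, 0.03, -0.04]
value_forecasts = [0.01, 0.005, 0.02, -0.01]

direction_score = hit_rate(direction_labels(value_forecasts), realized)
value_score = mean_squared_error(value_forecasts, realized)
print(direction_score, value_score)
```

A model trained on the binary target is optimized for the first score directly, while a regression model is optimized for the second and only yields a direction as a by-product.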

Are Sector-Specific Machine Learning Models Better Than Generalists?

May 14, 2025

Can machine learning models better predict stock returns if they are tailored to specific industries, or is a one-size-fits-all (generalist) approach sufficient? This question lies at the heart of a recent research paper by Matthias Hanauer, Amar Soebhag, Marc Stam, and Tobias Hoogteijling. Their findings suggest that the optimal solution lies somewhere in between: a “Hybrid” machine learning model that is aware of industry structures but still trained on the full cross-section of stocks offers the best performance.

Does the Image-Based Industry Classification Outperform?

February 18, 2025

For decades, investors and analysts have relied on traditional industry classifications like GICS, NAICS, or SIC to group companies into sectors and peer groups. However, these rigid categorizations often fail to capture the evolving nature of businesses, especially in an era of technological convergence and rapid industry shifts. Machine learning (ML) offers a more dynamic and data-driven alternative by analyzing company visuals, such as logos, product images, and branding elements, to identify similarities that go beyond predefined classifications. A recent study applies this approach to construct new industry groupings and tests them in industry momentum and reversal strategies. The results show that ML-generated groups lead to superior performance, once again highlighting the potential of image-based classification in financial analysis.

Design Choices in ML and the Cross-Section of Stock Returns

December 17, 2024

For those who have not yet had the chance to read it, we recommend the latest empirical study by Minghui Chen, Matthias X. Hanauer, and Tobias Kalsbach, which shows that design choices in machine learning models, such as feature selection and hyperparameter tuning, are crucial to improving portfolio performance. Non-standard errors in machine learning predictions can lead to substantial variation in portfolio returns, and the authors highlight the importance of robust model evaluation techniques.

The Impact of Methodological Choices on Machine Learning Portfolios

November 4, 2024

Studies using machine learning techniques for return forecasting have shown considerable promise. However, as in empirical asset pricing, researchers face numerous decisions around sampling methods and model estimation. This raises an important question: how do these methodological choices impact the performance of ML-driven trading strategies? Recent research by Vaibhav, Vedprakash, and Varun demonstrates that even small decisions can significantly affect overall performance. It appears that in machine learning, the old adage also holds true: the devil is in the details.

The Expected Returns of Machine-Learning Strategies

July 29, 2024

Does the investment in sophisticated machine learning algorithm research and development pay off? It is an important question, especially in light of the increasing costs related to the R&D of such algorithms and the possibility of decreasing returns for some methods developed in the more distant past. A recent paper by Azevedo, Hoegner, and Velikov (2023) evaluates the expected returns of machine learning-based trading strategies by considering transaction costs, post-publication decay, and the current high-liquidity environment. The hurdles are substantial, but the research suggests that despite high turnover rates, some machine learning strategies continue to yield positive net returns.
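The interplay between turnover and trading costs can be illustrated with simple arithmetic. The sketch below assumes a proportional cost model, in which net return equals gross return minus turnover times cost per unit traded. That model, the function names, and the numbers are assumptions for illustration only and are not taken from the paper.

```python
def net_return(gross_return, turnover, cost_per_unit_traded):
    """Net return under a simple proportional transaction-cost assumption."""
    return gross_return - turnover * cost_per_unit_traded

def break_even_cost(gross_return, turnover):
    """Cost per unit traded at which the net return falls to zero."""
    if turnover <= 0:
        raise ValueError("turnover must be positive")
    return gross_return / turnover

# Made-up monthly figures purely for illustration.
gross = 0.012     # 1.2% gross return per month
turnover = 1.5    # 150% of the portfolio traded per month
cost = 0.0025     # 25 basis points per unit traded

net = net_return(gross, turnover, cost)
limit = break_even_cost(gross, turnover)
print(round(net, 5), round(limit, 5))
```

With these hypothetical inputs the strategy keeps a positive net return, but the margin shrinks quickly as turnover rises, because the break-even cost is inversely proportional to turnover.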

Impact of Business Cycles on Machine Learning Predictions

April 15, 2024

As an old investing adage goes, “Everybody’s a genius in a bull market.” It is easy to fall victim to the Dunning-Kruger effect, where attribution bias makes us mistake luck for ability. When the business cycle turns, predicting stock prices precisely becomes much harder. And this is not only a problem for humans, who are misled by many mental heuristics. Machine learning algorithms run into similar problems, too. What is happening, and why? A new paper by Wang, Fu, and Fan gives an explanation and proposes some remedies …