Don’t Trust AI for Important Things Such As Investment Decisions

Until AI algorithms understand what words mean, they won’t be reliable for important decisions—especially those with money on the line

Photo illustration of a pixelated, torn and disintegrating one-hundred-dollar bill, with a line graph weaving up and down behind and in front of the bill through the torn edge

Dem10/Getty Images

When ChatGPT debuted on November 30, 2022, followed soon after by other AI chatbots, the reaction was unbridled astonishment followed by barely restrained hype.

Entrepreneur and software engineer Marc Andreessen described ChatGPT in a post on X (formerly Twitter) as “pure, absolute, indescribable magic.” Bill Gates told Forbes that ChatGPT was “every bit as important as the PC, as the internet.” If that hyperbole was not enough, Sundar Pichai, CEO of Alphabet and Google, proclaimed in a 60 Minutes interview that artificial intelligence “is the most profound technology that humanity is working on—more profound than fire.” Turing Award winner Geoffrey Hinton told CBS News, with no apparent sense of irony, “I think it’s comparable in scale with the Industrial Revolution or electricity—or maybe the wheel.”

Alas, for nearly 70 years, AI cheerleaders have overpromised and underdelivered. It is now increasingly clear that GPT and other LLMs are not intelligent in any meaningful sense and cannot be relied on for important decisions, such as hiring choices, prison sentencing, loan approval, insurance rates—and investing.



AI-powered investing is particularly interesting because it provides a quantifiable way to assess the abilities of the technology. The first AI-powered exchange-traded fund (ETF) was launched on October 18, 2017, by the investment platform EquBot, with the memorable ticker symbol AIEQ (“AI” for AI and “EQ” for equity). EquBot claimed that AIEQ was “the ground-breaking application of three forms of AI”: genetic algorithms, fuzzy logic and adaptive tuning. Wow! Chida Khatua, CEO and co-founder of EquBot, boasted in a news release that AIEQ “has the ability to mimic an army of equity research analysts working around the clock, 365 days a year, while removing human error and bias from the process.”

Sign me up!

Two weeks later, ETF provider Horizons (now Global X) launched the Active AI Global Equity Fund (MIND), which it described in a news release:

MIND is sub-advised by Mirae Asset Global Investments..., which uses an investment strategy entirely run by a proprietary and adaptive artificial intelligence system that analyzes data and extracts patterns.... The machine learning process underpinning MIND’s investment strategy is known as Deep Neural Network Learning—which is a construct of artificial neural networks that enable the A.I. system to recognize patterns and make its own decisions, much like how the human brain works, but at hyper-fast speeds.

Steve Hawkins, then president and CEO of Horizons, added, “Unlike today’s portfolio managers who may be susceptible to investor biases such as overconfidence or cognitive dissonance, MIND is devoid of all emotion.”

That’s the hype. The reality is that both funds have trailed the S&P 500 badly. Through December 31, 2023 (the most recent data we have), AIEQ had a cumulative total return of 63 percent, compared with the S&P’s 108 percent. MIND, before it was shut down in 2022, had a cumulative total return of –12 percent compared with 65 percent for the S&P.

Have more recent AI-powered funds done better, maybe? Nope.

In an analysis that has not yet been peer-reviewed, we looked at all publicly available AI-driven ETFs and mutual funds that have been launched since October 18, 2017. We found 11 funds that are fully AI, such as AIEQ and MIND, in that the investment decisions are made without human intervention. We also found 43 partly AI funds that use AI but allow human involvement. For example, the Qraft AI-Enhanced U.S. Large Cap Momentum ETF (AMOM) uses an AI system to inform “stock selection” while having human advisers retain “full discretion over investment decisions,” according to Qraft’s descriptions of the fund.

We found that only 10 of the 43 partly AI funds have done better than the S&P 500 during their lifetimes. The average annual return for all 43 funds was about five percentage points per year worse than the S&P 500 (7.11 percent versus the S&P’s 12.43 percent). It was even more calamitous for the fully AI funds. Every single one did worse than the S&P 500. Six of 11 funds actually lost money. Overall, the 11 fully AI funds lost 1.8 percent per year on average, while the S&P 500 gave investors an average annual return of 7.6 percent. As well, in the short time that they have been in existence, six of the 11 fully AI funds and 25 of the 43 partly AI funds have been shuttered.
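The gap between those average annual returns compounds dramatically over time. A minimal sketch of that arithmetic, using the article's figures (the five-year horizon is an illustrative assumption, not a number from the study):

```python
# Compound the article's average annual returns to show how the
# performance gap widens over time. The 5-year horizon is illustrative.
def grow(initial, annual_return, years):
    """Compound an initial investment at a fixed annual return."""
    return initial * (1 + annual_return) ** years

years = 5
fully_ai = grow(1000, -0.018, years)  # fully AI funds: -1.8% per year
sp500 = grow(1000, 0.076, years)      # S&P 500: +7.6% per year

print(f"$1,000 in fully AI funds after {years} years: ${fully_ai:,.2f}")
print(f"$1,000 in the S&P 500 after {years} years: ${sp500:,.2f}")
```

At those rates, the fully AI portfolio shrinks while the index portfolio grows by nearly half, and the divergence only widens with each additional year of compounding.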

The Achilles’ heel of AI systems is that while they are unparalleled at finding statistical patterns, they have no way of judging whether the patterns they find are plausible or pointless. If there is a correlation for one year between daily stock prices and the low temperatures in Antelope, Mont. (which there was), these algorithms might well use that statistical correlation to make investment decisions because they do not know what temperatures are or what stock prices are, let alone whether the two might be logically related.
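The mechanism behind such spurious matches is easy to demonstrate. The sketch below (hypothetical data, not the actual Antelope, Mont., series) screens 1,000 purely random "price" series against a random "temperature" series; with that many candidates, a pattern-finder reliably stumbles on a strong correlation by chance alone:

```python
# Demonstrate spurious correlation: among enough unrelated random
# series, some will correlate strongly with any target by chance.
import random
import statistics

def correlation(xs, ys):
    """Pearson correlation between two equal-length series."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

random.seed(1)
temps = [random.gauss(0, 1) for _ in range(20)]  # stand-in "temperatures"

# Screen 1,000 random "price" series and keep the best match.
best = max(
    correlation(temps, [random.gauss(0, 1) for _ in range(20)])
    for _ in range(1000)
)
print(f"Best correlation found among 1,000 random series: {best:.2f}")
```

The "winning" series has no relationship whatsoever to the temperatures; it simply won a lottery among 1,000 entrants. An algorithm that cannot ask whether a relationship is plausible has no way to tell this winner from a genuine signal.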

The disappointing returns from even “groundbreaking” algorithms, along with signs on Wall Street over the past month that the AI hype train is faltering more widely, point to deep shortcomings in this overcelebrated technology.

Until AI algorithms understand what words mean and how they relate to the real world, they will continue to be unreliable for important decisions, including but not limited to investing.

This is an opinion and analysis article, and the views expressed by the author or authors are not necessarily those of Scientific American.