Google’s TurboQuant work lands with a dramatic claim: by compressing the key-value (KV) cache of large language models, it can cut memory needs for inference by roughly a factor of six, which traders immediately read as bad news for standard DRAM demand. That was enough to send Samsung, SK Hynix and Micron sharply lower as screens started to price in the end of the AI-driven memory boom.

Looked at in context, though, this is more like a classic overreaction to a complex research update than the start of a new downcycle. TurboQuant is an early-stage algorithm aimed mainly at making inference more efficient on conventional DRAM, while the structural story in high-bandwidth memory (HBM), the stacked chips bolted next to GPUs and critical for training large AI models, is still defined by tight supply, rising demand and full order books at the key suppliers.
Breakthrough technology sparks market panic
Google Research $GOOG has unveiled a new memory compression algorithm called TurboQuant, which researchers say can compress the key-value cache used in large language models by at least six times and deliver up to eight times faster inference, without sacrificing accuracy.
The market reaction was immediate and dramatic. On Thursday, shares of the world's two largest memory chip makers fell sharply in South Korean trading: Samsung Electronics $SSNLF closed down 4.71%, while SK Hynix dropped 6.23%, pulling the benchmark KOSPI index down 3.22%.
A similar trend played out in US markets, where Micron Technology $MU fell 7% and SanDisk $SNDK dropped 6.8%. These moves followed declines in both stocks on Wednesday.
How TurboQuant works and why it scares investors
TurboQuant represents a revolutionary approach to one of AI's biggest bottlenecks: the enormous memory requirements of inference. It is a compression method that achieves a large reduction in memory footprint with essentially no loss of precision, making it suitable both for key-value (KV) cache compression and for vector search.
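To put the memory pressure in perspective, here is a rough, back-of-the-envelope sketch of how large an fp16 KV cache can get and what a roughly six-fold compression of it would mean. The model dimensions below are hypothetical, chosen purely for illustration, and are not taken from Google's paper.

```python
# Illustrative KV cache sizing under assumed, hypothetical model parameters.
num_layers = 32          # transformer layers
num_kv_heads = 8         # key/value heads (e.g. with grouped-query attention)
head_dim = 128           # dimension per head
context_len = 32_768     # tokens held in the cache
batch_size = 8           # concurrent sequences served
bytes_fp16 = 2           # bytes per value at 16-bit precision

# Keys and values are both cached, hence the leading factor of 2.
kv_bytes = 2 * num_layers * num_kv_heads * head_dim * context_len * batch_size * bytes_fp16
print(f"fp16 KV cache:    {kv_bytes / 1e9:.1f} GB")      # ~34.4 GB
# A ~6x compression of the cache, as the article describes, shrinks that footprint:
print(f"compressed (~6x): {kv_bytes / 6 / 1e9:.1f} GB")  # ~5.7 GB
```

Under these assumed settings, the cache alone would crowd out the memory needed for weights and activations, which is why KV cache compression attracts so much attention.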
The technology works in two phases. The first phase uses PolarQuant, which rethinks how high-dimensional vectors are represented. Instead of standard Cartesian coordinates (X, Y, Z), PolarQuant converts each vector into polar coordinates: a radius plus a set of angles. The breakthrough lies in the geometry: after a random rotation, the distribution of these angles becomes highly predictable and concentrated, which makes them cheap to quantize.
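The published summary does not spell out implementation details, but the basic idea of a random rotation followed by a radius-plus-angles re-parameterisation can be sketched as below. Function names, dimensions and the coordinate convention are assumptions made for illustration, not TurboQuant's actual code.

```python
import numpy as np

def random_rotation(dim: int, seed: int = 0) -> np.ndarray:
    """Sample a random orthogonal matrix via QR decomposition of a Gaussian matrix."""
    rng = np.random.default_rng(seed)
    q, _ = np.linalg.qr(rng.standard_normal((dim, dim)))
    return q

def to_polar(v: np.ndarray):
    """Convert a vector to hyperspherical coordinates: one radius and (dim - 1) angles."""
    radius = np.linalg.norm(v)
    angles = []
    for i in range(len(v) - 2):
        tail = np.linalg.norm(v[i:])
        angles.append(np.arccos(np.clip(v[i] / tail, -1.0, 1.0)) if tail else 0.0)
    # The final angle uses arctan2 so the sign of the last coordinate is preserved.
    angles.append(np.arctan2(v[-1], v[-2]))
    return radius, np.asarray(angles)

dim = 64
rng = np.random.default_rng(1)
key_vector = rng.standard_normal(dim)        # stand-in for one cached key vector
rotated = random_rotation(dim) @ key_vector  # rotation concentrates the angle distribution
radius, angles = to_polar(rotated)
print(radius, angles[:5])                    # concentrated angles can be coded with few bits
```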
The second phase acts as a mathematical error corrector. Even with the efficiency of PolarQuant, a residual amount of error remains. TurboQuant applies a 1-bit quantized Johnson-Lindenstrauss (QJL) transformation to this residual data.
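One-bit JL sketches of this kind are a known family of techniques: project the residual with a random Gaussian matrix, keep only the sign of each projection (one bit per dimension), and store the residual's norm separately; that is enough to estimate inner products against a full-precision query, which is exactly what attention scores need. The sketch below illustrates that general recipe and is an assumption made for illustration, not the exact corrector used in TurboQuant.

```python
import numpy as np

def qjl_encode(residual: np.ndarray, proj: np.ndarray):
    """Keep one sign bit per projected dimension, plus the residual's norm as a scalar."""
    signs = np.sign(proj @ residual).astype(np.int8)
    return signs, float(np.linalg.norm(residual))

def qjl_inner_product(query: np.ndarray, signs: np.ndarray, norm: float, proj: np.ndarray) -> float:
    """Estimate <query, residual> from the 1-bit sketch.

    For Gaussian rows s, E[sign(<s, r>) * <s, q>] = sqrt(2/pi) * <q, r> / ||r||,
    so rescaling by ||r|| * sqrt(pi/2) / m recovers an inner-product estimate."""
    m = proj.shape[0]
    return norm * np.sqrt(np.pi / 2) / m * float(signs @ (proj @ query))

dim, m = 64, 512
rng = np.random.default_rng(2)
proj = rng.standard_normal((m, dim))        # shared random projection matrix
residual = 0.1 * rng.standard_normal(dim)   # small error left over after the first phase
query = rng.standard_normal(dim)

signs, norm = qjl_encode(residual, proj)
print("true:", float(query @ residual),
      "estimate:", qjl_inner_product(query, signs, norm, proj))
```

Because the leftover residual is small to begin with, even a noisy one-bit estimate of it is typically enough to push the combined representation back toward full accuracy.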
The actual market impact remains a question
Despite the immediate market reaction, analysts caution against exaggerated concerns. Ray Wang, a memory analyst at SemiAnalysis, said Google's research won't necessarily mean fewer chips are needed. The KV cache is "a key bottleneck that needs to be addressed for better models and hardware performance," he said, adding that it will be "hard to avoid higher memory consumption" as model performance keeps improving.
It is also important to distinguish between different types of memory. Compared with standard DRAM chips, the technology should have less impact on HBM (High Bandwidth Memory). TurboQuant mainly targets the inference phase of AI models, which largely runs on ordinary DRAM, while HBM remains a necessity in the training phase.
According to a CNBC report, despite Thursday's stock drop, a perfect storm of factors continues to support the memory market over the long term. Strong demand coupled with supply shortages has pushed memory prices to unprecedented levels and supported gains for Samsung, SK Hynix and Micron.
Structural fundamentals remain solid
It is also key to remember that TurboQuant is still only a research project: it has not yet been deployed at scale and, for now, remains a laboratory breakthrough. That makes comparisons to something like DeepSeek, or even the fictional company Pied Piper, harder to sustain.
Data suggests the HBM market will grow 58% to $54.6 billion in 2026, accounting for nearly 40% of the DRAM market. The sudden surge in demand has created an imbalance between supply and demand: even with Samsung, SK Hynix and Micron allocating about 70% of their additional capacity to HBM, a 50-60% capacity gap remains.
According to Wells Fargo analysts, the Google TurboQuant update could actually be a positive for memory companies. Although this kind of breakthrough might look negative for them, the Jevons paradox suggests the opposite can happen: making AI more efficient reduces costs, which can encourage much wider use and, ultimately, more demand.
Structural drivers tied to AI infrastructure, supply constraints and tight HBM markets support a resilient long-term outlook. Investors should distinguish between short-term noise and fundamental trends anchored in persistent memory shortages and AI workload expansion.