Penguin Solutions (PENG): The Only Production Fix for the AI Inference Memory Wall

What if the next leg of the AI trade is not faster GPUs, but the memory feeding them?

$PENG, Penguin Solutions. US-listed, $2.8 billion market cap. A 38-year-old memory and infrastructure company that, after decades of being a quiet HPC integrator, just became the only player shipping a production fix to one of the most discussed bottlenecks in AI infrastructure: the inference memory wall.

Idea credit: hat tip to Nicolas at snmart.substack.com for the board composition and moat framing.

The Bottleneck Has a Name: The Memory Wall

Training is compute-bound. Inference is memory-bound. Those two sentences are the entire investment thesis on Penguin Solutions in compressed form.

A 70-billion-parameter model running at a 128K-token context window can require roughly 150 GB just for the KV (key-value) cache, with the exact figure depending on the attention layout, the numeric precision, and how many sequences are served at once. An H100 GPU holds 80 GB of HBM. The model has nowhere to put the state it needs to keep generating coherent output. The naive industry response has been to throw more GPUs at the problem, which fixes the memory shortage only by paying for compute capacity that then sits mostly idle. Buying more GPUs to solve a memory problem is bad math, and the hyperscalers know it.
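
To make the KV cache arithmetic concrete, here is a minimal sketch of the standard sizing formula: one key vector and one value vector per layer, per KV head, per token. The model shape used below (80 layers, head dimension 128, 8 or 64 KV heads, 2 bytes per value) is an assumption for a generic 70B-class transformer, not a figure from Penguin or from any specific deployment, and `kv_cache_bytes` is a hypothetical helper name.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, context_tokens,
                   batch_size=1, bytes_per_value=2):
    """Bytes of KV cache: a key and a value vector per layer, per KV head, per token."""
    per_token = 2 * num_layers * num_kv_heads * head_dim * bytes_per_value
    return per_token * context_tokens * batch_size


GB = 10**9
CONTEXT = 128 * 1024  # 128K tokens

# Assumed 70B-class shape (hypothetical): 80 layers, head_dim 128, 16-bit values.
gqa = kv_cache_bytes(80, 8, 128, CONTEXT)    # grouped-query attention, 8 KV heads
mha = kv_cache_bytes(80, 64, 128, CONTEXT)   # full multi-head attention, 64 KV heads
gqa_x4 = kv_cache_bytes(80, 8, 128, CONTEXT, batch_size=4)

print(f"GQA, 1 sequence:  {gqa / GB:.1f} GB")
print(f"MHA, 1 sequence:  {mha / GB:.1f} GB")
print(f"GQA, 4 sequences: {gqa_x4 / GB:.1f} GB")
```

Under these assumed shapes a single 128K-token sequence needs roughly 43 GB with grouped-query attention and roughly 344 GB with full multi-head attention, and four concurrent grouped-query sequences need roughly 172 GB. The point is the scaling: cache size grows linearly with context length and with concurrent sequences, so it outruns an 80 GB accelerator quickly.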

The Production Fix: MemoryAI KV Cache Server

Penguin's MemoryAI KV Cache Server is, as of mid-2026, the only production-ready CXL appliance solving the inference memory wall at scale. CXL (Compute Express Link) is the open interconnect standard that lets a pool of DRAM sit outside a GPU and be addressed by the GPU with near-memory latency. In practice this means an appliance that can hold orders of magnitude more KV cache than any single accelerator and feed it back to the GPU faster than NVMe or networked storage could.

The numbers: 11 TB of pooled memory per appliance. 10x faster than NVMe. NVIDIA Dynamo compatible (so it slots into the existing inference orchestration stack rather than asking the operator to rebuild it). A Tier-1 US bank has already deployed it. The competitive set (XConn, Astera Labs, SK Hynix) is still in demo and qualification stages.
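
As a rough sense of scale, the capacity figures quoted in this article can be put side by side. This is back-of-envelope arithmetic only: it assumes decimal units, assumes the full 11 TB is usable for KV cache, and ignores fragmentation and metadata overhead, none of which the article specifies.

```python
TB = 10**12
GB = 10**9

appliance_bytes = 11 * TB            # pooled memory per appliance (article figure)
h100_hbm_bytes = 80 * GB             # HBM on one H100 (article figure)
kv_cache_per_model_bytes = 150 * GB  # article's rough 70B / 128K-context figure

# How many H100s' worth of HBM one appliance represents.
hbm_multiple = appliance_bytes / h100_hbm_bytes

# How many 150 GB KV caches fit in one appliance, rounded down.
caches_per_appliance = appliance_bytes // kv_cache_per_model_bytes

print(f"One appliance is about {hbm_multiple:.1f}x the HBM of a single H100")
print(f"Room for about {caches_per_appliance} KV caches of 150 GB each")
```

On these assumptions one appliance holds about 137.5 times the HBM of a single H100, or about 73 caches of the 150 GB size quoted earlier, which is where the "orders of magnitude" framing comes from.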

Being first to production in this category is the moat. Hyperscaler and enterprise customers do not buy demos. They buy what has been deployed and validated by another customer they trust. Now that Penguin has the Tier-1 US bank reference, the next dozen deployments should come faster than the competitive set can catch up.

The Moat Shows Up in the Margin

The cleanest way to see Penguin's competitive position is to compare gross margins to the rest of the AI infrastructure category. Penguin runs at a 31.2 percent non-GAAP gross margin. Supermicro, the commodity AI integrator everyone knows, runs at roughly 10 percent. That is roughly a 3x gap in gross margin for what looks from the outside like a similar product.

The gap is not random. It comes from four structural advantages compounded over a long operating history:

  • Patented conformal coating IP. Memory modules ruggedised for extreme environments (defence, space, industrial). Premium pricing in premium use cases.
  • ICE ClusterWare software. Proprietary management software for HPC clusters. The customer doesn't just buy hardware; they buy a cluster that manages itself.
  • NVIDIA Elite Partner status. Early access to new platforms, joint engineering, and qualification slots competitors don't get.
  • 3.3 billion GPU runtime hours of telemetry. A decade-plus of operating data on how clusters actually behave at scale. The kind of dataset a competitor cannot conjure with capital alone.

We covered this kind of "moat shows up in the margin" pattern in our writeup on Nokia's AI infrastructure cornered resource. Penguin is running a smaller-scale version of the same playbook, focused on the memory layer rather than the network layer.

The Celestial AI Option Value

Penguin was an early investor in Celestial AI, a photonic memory startup. In 2025, Marvell acquired Celestial for $3.25 billion. Penguin booked a $27.5 million gain on its stake. That is the explicit financial event. The much more interesting part is what was retained.

As part of the acquisition, Penguin held on to its co-development partnership with Celestial for the Photonic Memory Appliance. The thesis here lines up naturally with the CXL story already in market:

"CXL solves memory-per-server. Photonics solves memory-per-rack. They are sequential bottleneck fixes, not competing technologies."

If the CXL appliance becomes the standard solution for inference memory at the server level over the next 24 months, the photonic appliance becomes the next-generation solution at the rack level over the following 24. Penguin is positioned in both timelines without paying twice for the R&D, because the heavy lifting on photonics was done by Celestial and is now folded into Marvell's stack with Penguin as a co-development partner.

The Board Is the Signal

For a $2.8 billion company, Penguin's board is unusual. The seats, plus one recent executive hire:

  • Chair: Penny Herscher. Currently Chair of Lumentum, the global photonics giant.
  • Director: Mark Papermaster. Currently CTO of AMD. Not retired, not ex-AMD: the sitting CTO, while AMD itself is in the middle of the biggest AI architecture transition in its history.
  • Director: David Heard. Currently President of Nokia Networks. Formerly CEO of Infinera (now part of Nokia after the acquisition announced in 2024). Connects directly into the optical-networking AI infrastructure thesis we covered in our Nokia (NOKIA.HE) writeup.
  • New CPO: Ian Colle. Ex-AWS HPC leadership. Brings the hyperscaler product playbook directly into Penguin's go-to-market organisation.

None of these directors needs a board seat for the cash. They take a seat at Penguin because they think Penguin is going to matter to the next architecture of the AI data center, and they want to be part of it. The signal value of a board composition like this is independent of anything in the financial statements.

Capital Return: Modest, Anchored, Improving

Capital allocation is not the headline story here, but the trajectory is sound:

  • No dividend. Right call for a company in this growth and capex phase.
  • $38 million repurchased in FY25, fresh $75 million authorisation. Modest but real. Float discipline matters for a name this small.
  • $200 million debt paydown took the company to a $46 million net cash position. Balance sheet is no longer the constraint.
  • $200 million SK Telecom strategic investment in December 2024 anchors the cash position and brings a Korean hyperscaler customer relationship into the corporate development conversation.

The MoatMap Scorecard: Q51 V37 M83, StockRank 87

Here is the Penguin Solutions MoatMap StockRank:

  • Quality: 51/100. Middling on trailing metrics. The Quality factor is pricing the 38-year operating history, not the post-pivot AI infrastructure positioning. Forward Quality almost certainly inflects upward as MemoryAI revenue ramps.
  • Value: 37/100. P/E of 66x trailing collapses to 17.7x on forward guidance, with management raising its revenue guidance by 12 percent on the latest call. You are not buying a cheap stock; you are buying a re-rating story.
  • Momentum: 83/100. Stock up 90 percent in 30 days, sitting roughly 1 percent off the 52-week high. The market is in the process of figuring out the story.
  • Composite StockRank: 87/100. Strong Buy. Momentum carries most of the load.
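
The gap between the trailing and forward multiples in the Value line implies a specific earnings step-up, which is worth making explicit. A minimal sketch, assuming both multiples are quoted against the same share price and the same share count (the article does not state either), with `implied_eps_multiple` as a hypothetical helper name:

```python
def implied_eps_multiple(trailing_pe, forward_pe):
    """At a fixed price, P/E = price / EPS, so forward EPS / trailing EPS
    equals trailing P/E divided by forward P/E."""
    return trailing_pe / forward_pe


multiple = implied_eps_multiple(66.0, 17.7)
growth_pct = (multiple - 1) * 100

print(f"Forward EPS is about {multiple:.2f}x trailing EPS")
print(f"Implied EPS growth: about {growth_pct:.0f}%")
```

On those assumptions the guide implies forward earnings of about 3.73 times trailing earnings, or growth of about 273 percent. That is the size of the delivery the forward multiple is leaning on.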

A Q51 V37 M83 profile in a name up 90 percent in 30 days is the textbook shape of a momentum-led re-rating. Historically, the worst time to buy these names is the week after they appear on every "AI winner" screen; the best time is 6 to 12 months later when the fundamental story has caught up to the chart and the Quality factor has had time to inflect. We covered the dynamics of momentum-led factor profiles in our guide to factor investing. Worth weighing that base rate against the specific catalyst stack here.

The Question Worth Sitting With

Here is the question that keeps coming back. When the sitting CTO of AMD, the Chair of Lumentum, and the President of Nokia Networks all attach their public reputations to a $2.8 billion company, what do they know about the next architecture of the AI data center that the market has not yet priced?

The bull read says they know that inference memory is the next leg of the AI capex cycle, that CXL appliances are the production answer for the next 24 months, that photonic appliances are the production answer for the 24 months after that, and that Penguin is the only company with credible exposure to both. The 90 percent rally and the Tier-1-bank reference customer are early validation. The board is a leading indicator, not a lagging one.

The bear read says boards sometimes get this wrong, that CXL appliances may commoditise faster than expected, that hyperscalers historically prefer to build their own memory tier rather than buy from a smid-cap, and that a 66x trailing P/E in an AI-tagged name at the high of its 52-week range is the place momentum-driven drawdowns usually begin.

Both reads can be partially true. The factor framework says momentum carries this far; the bull-case fundamentals say there is a real productisation moat underneath that momentum. Reasonable investors will disagree on the size of the position. The board-composition signal is the kind of qualitative input that should at least push the burden of proof onto the bear side.

Companion Reading

Penguin lives in a category we have written about from several angles:

  • SK Hynix (000660.KS) for the HBM upstream layer of the same memory-wall story.
  • Nokia (NOKIA.HE) for the optical-networking layer and the Lumentum photonics partnership that David Heard is connected to from both sides.
  • Plover Bay (1523.HK) for the edge-compute layer beyond the data center.
  • CoreWeave (CRWV) as the cautionary counterpoint on momentum-led AI infrastructure names where the balance sheet is borrowing against the future.

The Bottom Line

Penguin Solutions is one of the cleanest examples we have seen of a quiet, 38-year-old HPC integrator pivoting into the exact technical category that the next AI capex cycle has to flow through. The MemoryAI KV Cache Server is in production. The Tier-1 reference customer is in place. The board says they expect this to matter. The gross margin gap to commodity integrators is doing exactly what a structural moat should do.

The factor framework rates it 87/100 today on the back of momentum, with Quality and Value as the catch-up story. The 66x trailing P/E is uncomfortable; the 17.7x forward P/E is much less so, if you trust the guide. The board composition is the single qualitative data point that makes this a name worth doing the work on rather than dismissing as another AI-tagged momentum trade.

For investors using Penguin as a single-name idea, our guide to reviewing your portfolio for weak spots is the right framework for sizing positions in momentum-led names where the catalyst is structural but the entry multiple is unforgiving.

For the full breakdown including segment economics, MemoryAI unit-margin math, the SK Telecom relationship detail, photonic appliance roadmap, and the management quality assessment, the Penguin Solutions Deep Dive is the place to go.

Disclosure: this article is for informational purposes only and is not investment advice.