Intermediate AI

The AI Stack

The layered architecture of modern AI systems - from compute and foundation models to applications - and where startups should focus.

Published March 17, 2026

The Four Layers

Understanding the AI stack helps founders identify where to compete, what to build, and what to buy.

Layer 1: Compute Infrastructure

What it is: The physical hardware (GPUs, TPUs) and cloud services (AWS, GCP, Azure, CoreWeave) that power AI training and inference.

Who controls it: Nvidia (dominant hardware), hyperscalers (cloud compute), and specialized AI cloud providers.

Startup opportunity: Minimal. This is a capital-intensive commodity market. Startups compete here only in niche optimization plays (inference efficiency, edge deployment).

Layer 2: Foundation Models

What it is: Pre-trained large models (GPT-4o, Claude, Gemini, Llama, Mistral, Qwen) that serve as the base for downstream applications.

Who controls it: OpenAI, Anthropic, Google, Meta (open-source), Mistral, DeepSeek.

Startup opportunity: Extremely limited for proprietary frontier models. More viable for: fine-tuned domain-specific models on top of open-source bases, specialized inference optimization, and model distillation.

Layer 3: Data and Retrieval Infrastructure

What it is: The tools that make foundation models work reliably in products - vector databases, embedding models, RAG frameworks, data pipelines, fine-tuning infrastructure, evaluation tooling.

Who controls it: A mix of startups (Pinecone, Weaviate, Qdrant) and open-source projects (LangChain, LlamaIndex, pgvector).

Startup opportunity: Moderate. This layer is consolidating fast, but niche tools (domain-specific evaluation, specialized retrieval for regulated industries) still have room.

Layer 4: Applications

What it is: Products built for end users on top of the other layers. Legal AI tools, coding assistants, customer support bots, document intelligence, AI copilots.

Who controls it: Still largely open - this is where the majority of AI startup investment is going.

Startup opportunity: Highest. The application layer is where vertical expertise, proprietary data, and distribution create durable advantages.

The Strategic Implications

The AI stack is currently in an investment phase where infrastructure companies (Nvidia, OpenAI, Anthropic) are capturing most of the economic value. This mirrors the early internet (Cisco, Sun Microsystems captured early value; Google and Amazon captured long-term value).

For founders, the strategic implication is clear: build in the application layer, buy from the infrastructure layer. Use OpenAI’s API; don’t compete with it. Use Pinecone’s database; don’t rebuild it. Concentrate proprietary investment on the thing only you can build: your domain expertise, your data relationships, and your customer distribution.

Key Takeaway

The AI stack is not a level playing field - different layers have radically different competitive dynamics and startup viability. The winning move for most founders is to build deep in a vertical application, use the best available infrastructure as a commodity, and focus on the proprietary data and distribution advantages that make their specific product defensible.

Frequently Asked Questions

What is the AI stack?
The AI stack refers to the layered system of technologies that make up modern AI applications: compute infrastructure (GPUs, cloud), foundation models (GPT, Claude, Llama), data and retrieval infrastructure (vector databases, RAG pipelines), and the application layer (products built on top). Each layer has different economics, competitive dynamics, and startup opportunities.
Which layer of the AI stack should a startup build in?
Most startups should build at the application layer - on top of existing foundation models. The compute and model layers require billions of dollars and world-class AI research talent to compete in. The application layer is where startup differentiation happens: vertical expertise, proprietary data, and distribution. A few infrastructure tools (vector databases, orchestration frameworks) also represent startup opportunities.
Why is the foundation model layer closed to new startups?
Training frontier foundation models requires $50M–$500M+ in compute, access to massive proprietary datasets, and hundreds of top-tier ML researchers. OpenAI, Anthropic, Google, and Meta have multi-year head starts and ongoing capital advantages. The consolidation is already happening - even well-funded startups like Inflection AI pivoted away from frontier model competition.
Where is value accruing in the AI stack?
Infrastructure and model providers are capturing enormous value in the current cycle (Nvidia, OpenAI, Anthropic). Long-term, value will migrate toward application-layer companies that build proprietary data moats and deep workflow integrations. The historic pattern from software platform shifts (mobile, cloud) is that application companies ultimately capture more cumulative value than infrastructure companies.

Share with your team

Create an account to track your progress across all lessons.

Comments

Log in to join the conversation.

Loading comments...