Foundation Model
A foundation model is a large AI model trained on broad data at scale, designed to be adapted to many downstream tasks rather than one specific use case.
What Is a Foundation Model?
A foundation model is a large AI model trained on broad, diverse data at massive scale, designed not for a single narrow task but as a versatile base that can be adapted to many downstream applications. The term was coined by researchers at Stanford’s Center for Research on Foundation Models (CRFM) in a 2021 paper that described the emerging paradigm: instead of training a separate model for each task (translation, summarization, classification, code generation), one large model trained broadly can do all of them, often with just a prompt or light fine-tuning. GPT-4, Claude 3, Gemini 1.5, and Meta’s Llama 3 are among the most prominent examples. Frontier models typically contain hundreds of billions to over a trillion parameters and cost tens to hundreds of millions of dollars to train.
Why “Foundation”?
The Stanford researchers chose the word deliberately. Foundation models are:
- Broad: Trained on internet-scale text, code, images, audio, or combinations - not a narrow domain
- Transferable: Useful for tasks not explicitly present in training data
- Emergent: Capabilities appear that weren’t directly trained for (translation, arithmetic, analogy reasoning) as scale increases
- Adaptable: Can be steered toward specific behaviors via prompting, fine-tuning, or additional training
The analogy to a building foundation is apt: the same foundation supports many different structures built on top of it, and its quality determines the ceiling for everything above.
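The adaptability point above can be sketched in code. The snippet below shows the simplest form of adaptation, prompting: one base model serves several tasks through different prompt templates, with no new weights and no retraining. `call_model` is a hypothetical placeholder for whatever completion API or local model you use; the templates are illustrative.

```python
def call_model(prompt: str) -> str:
    """Placeholder for any provider's completion API (OpenAI, Anthropic,
    a self-hosted Llama, etc.). Hypothetical, for illustration only."""
    return f"<model output for: {prompt}>"

# One foundation model, many tasks -- adaptation is just the prompt.
TEMPLATES = {
    "translate": "Translate to French: {text}",
    "summarize": "Summarize in one sentence: {text}",
    "classify": "Label the sentiment (positive/negative): {text}",
}

def adapt(task: str, text: str) -> str:
    # No task-specific training: the template steers the base model.
    return call_model(TEMPLATES[task].format(text=text))
```

Fine-tuning and further training sit on the same spectrum; they change the weights instead of the prompt, but the base model is the same.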
Foundation Models vs. LLMs
| Type | Modality | Examples |
|---|---|---|
| LLM | Text only | GPT-4, Claude 3, Llama 3, Mistral |
| Multimodal | Text + image (+ audio) | GPT-4o, Gemini 1.5 Pro, Claude 3 Opus |
| Image generation | Text → image | DALL-E 3, Stable Diffusion XL, Midjourney |
| Audio | Speech → text, text → speech | Whisper, ElevenLabs, Voicebox |
| Code | Text + code | GitHub Copilot (OpenAI models), StarCoder2, DeepSeek-Coder |
All LLMs are foundation models, but foundation models extend beyond text.
Proprietary vs. Open-Source: Startup Tradeoffs
| Dimension | Proprietary (GPT-4, Claude) | Open-Source (Llama 3, Mistral) |
|---|---|---|
| Capability | Frontier quality | Approaching frontier at 70B+ |
| Cost | Per-token API pricing | Infrastructure + engineering |
| Data privacy | Data sent to provider | Runs on your infrastructure |
| Customizability | Fine-tuning via API | Full weight access, unconstrained |
| Time to first call | Minutes | Hours–days (infrastructure setup) |
| Break-even volume | ~1M–10M requests/day | Varies by model size and GPU cost |
For most early-stage startups, proprietary APIs are the right starting point. The engineering time to self-host, monitor, and scale an open-source model typically costs more than the API fees until you reach significant scale. The exception is data sensitivity: if your product handles highly confidential data (health records, legal documents, financial data) that legally or contractually cannot leave your infrastructure, open-source self-hosting is often necessary from day one.
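The break-even argument above is easy to sanity-check with back-of-envelope arithmetic. All numbers below are illustrative assumptions, not current price quotes, and the calculation ignores the (often dominant) engineering cost of running your own cluster:

```python
# Back-of-envelope: API per-token pricing vs. self-hosted GPU cost.
# Every constant here is an illustrative assumption, not a real quote.

def api_cost_per_day(requests_per_day: int, tokens_per_request: int,
                     usd_per_1k_tokens: float) -> float:
    return requests_per_day * tokens_per_request / 1000 * usd_per_1k_tokens

def selfhost_cost_per_day(num_gpus: int, usd_per_gpu_hour: float) -> float:
    return num_gpus * usd_per_gpu_hour * 24

# 1M requests/day at ~1k tokens each, at an assumed $0.01 per 1k tokens:
api = api_cost_per_day(1_000_000, 1_000, 0.01)      # -> $10,000/day
# An assumed 8-GPU inference cluster at $4/GPU-hour:
hosted = selfhost_cost_per_day(8, 4.0)              # -> $768/day
```

At low volume the inequality flips: at 10,000 requests/day the same API spend is $100/day, well under the cluster cost, which is why the table puts break-even in the millions of requests per day.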
The Scaling Laws Foundation
Foundation models work because of scaling laws - empirical relationships first characterized by OpenAI researchers in 2020, and refined by DeepMind’s 2022 “Chinchilla” work on compute-optimal training - showing that model performance improves predictably as you increase model parameters, training data, and compute. This predictability is what justified the massive investments in training frontier models: GPT-4’s training run is estimated to have cost over $100 million. The existence of these laws means the organizations with the most compute and data can build the most capable foundations - creating significant structural advantages for a small number of labs (OpenAI, Anthropic, Google DeepMind, Meta AI).
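A Chinchilla-style scaling law can be written down in a few lines. The functional form below is the one reported by DeepMind in 2022; the constants are roughly their published fit and should be treated as illustrative, not as predictions for any particular model:

```python
# Chinchilla-style scaling law: predicted loss falls as a power law in
# parameter count N and training tokens D. Constants approximate the
# Hoffmann et al. (2022) fit -- illustrative only.
def predicted_loss(N: float, D: float) -> float:
    E, A, B = 1.69, 406.4, 410.7
    alpha, beta = 0.34, 0.28
    return E + A / N**alpha + B / D**beta

# Scaling both model and data lowers the predicted loss monotonically:
small = predicted_loss(1e9, 20e9)     # 1B params, 20B tokens
big = predicted_loss(70e9, 1.4e12)    # 70B params, 1.4T tokens
```

The key property is the predictability: before spending $100M on a training run, a lab can fit this curve on small runs and extrapolate what the large run will achieve.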
Key Takeaway
Foundation models are the infrastructure layer of modern AI - the equivalent of cloud computing platforms in 2010. Startups don’t build their own AWS; they build on top of it. Similarly, almost no startup should build a foundation model from scratch. The strategic question is which foundation model to build on, how to adapt it to your use case (prompting, RAG, fine-tuning), and whether to use a proprietary API or self-host open-source weights. The choice matters for cost, data privacy, and long-term vendor dependency - not for whether you can ship a competitive AI product.