Getting Started with AI: A PM's Perspective
Every PM is being asked: “What’s our AI strategy?” After spending the last year building production AI systems that analyze 10,000+ signals per month, here’s what I’ve learned about integrating AI into products.
Start with the Problem, Not the Technology
The biggest mistake I see: starting with “we should use GPT-4” instead of “what problem are we solving?”
Bad starting point: “Let’s add AI chat to our product.” Good starting point: “Our users spend 3 hours/week on [task]. Can AI reduce that?”
AI is a tool, not a strategy. The best AI features solve specific, high-value problems.
My Mental Model for AI Products
I think about AI integration in three tiers:
Tier 1: AI-Assisted Features
AI helps users do existing tasks faster. Examples:
- Autocomplete in emails (Gmail)
- Writing assistance (Grammarly)
- Code completion (GitHub Copilot)
PM considerations:
- Lowest risk, fastest to ship
- Doesn’t change core workflow
- Easy to measure impact (time saved, adoption rate)
Tier 2: AI-Augmented Workflows
AI enables new workflows or capabilities. Examples:
- Automated summaries (Slack, Notion)
- Smart recommendations (Spotify, Netflix)
- Conversational interfaces to data
PM considerations:
- Medium risk, changes how users work
- Requires user education
- Measure workflow adoption and satisfaction
Tier 3: AI-First Products
AI is the core product differentiator. Examples:
- ChatGPT, Claude (conversational AI)
- Midjourney, DALL-E (generative images)
- Perplexity (AI search)
PM considerations:
- Highest risk, biggest potential impact
- Requires rethinking entire UX
- Success metrics are product-specific
Most established products should start at Tier 1 and work up.
The AI Product Stack: What You Actually Need
Building production AI isn’t just calling an API. Here’s the stack:
1. The Model (LLM)
Options: OpenAI (GPT-4), Anthropic (Claude), Google (Gemini), or open-source models (Llama)
PM decision factors:
- Cost per request (varies 10-100x between models)
- Latency requirements (real-time vs. batch)
- Privacy/security constraints (cloud vs. on-prem)
- Model capabilities (reasoning, coding, multilingual)
I’ve shipped features using both GPT-4 and Claude. Both are excellent; the choice depends on your specific use case and cost constraints.
2. Prompt Engineering
Your prompts are your product. Bad prompts = bad user experience.
Key techniques:
- Few-shot learning: Show examples of desired output
- Chain-of-thought: Ask the model to explain its reasoning
- Structured outputs: Use JSON schema to enforce format
- Prompt templates: Parameterize prompts for consistency (sketched below)
PM tip: Version control your prompts like code. A/B test prompt variations. Monitor quality constantly.
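To make templates and structured outputs concrete, here's a minimal sketch in Python. The `SUMMARY_PROMPT_V2` template, its JSON keys, and the helper functions are illustrative assumptions, not any particular library's API.

```python
import json

# Hypothetical, version-controlled prompt template with a parameter slot and
# a few-shot example that also pins down the output format.
SUMMARY_PROMPT_V2 = """You are a product analyst. Summarize the customer feedback below.

Respond with JSON only, using exactly these keys:
  "summary": a one-sentence summary
  "sentiment": "positive", "neutral", or "negative"
  "feature_requests": a list of strings (may be empty)

Example feedback: "Love the app, but exporting to CSV is painfully slow."
Example output: {{"summary": "Likes the app but finds CSV export slow.", "sentiment": "positive", "feature_requests": ["faster CSV export"]}}

Feedback:
{feedback}
"""

def build_summary_prompt(feedback: str) -> str:
    """Fill in the template; one named, versioned constant makes A/B testing easy."""
    return SUMMARY_PROMPT_V2.format(feedback=feedback)

def parse_summary(raw_response: str) -> dict:
    """Fail loudly if the model drifts from the requested JSON structure."""
    data = json.loads(raw_response)
    missing = {"summary", "sentiment", "feature_requests"} - data.keys()
    if missing:
        raise ValueError(f"Model output missing keys: {missing}")
    return data
```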
3. RAG (Retrieval-Augmented Generation)
LLMs don’t know your proprietary data. RAG lets you inject relevant context.
Architecture (sketched in code below):
- User asks question
- Search your data for relevant context
- Pass context + question to LLM
- LLM generates answer based on your data
PM considerations:
- Requires a vector database (Pinecone, Weaviate, or pgvector)
- Chunking strategy affects quality
- Search relevance is critical (garbage in = garbage out)
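Here's a minimal sketch of that flow, assuming a `vector_search` function backed by whichever vector store you pick and a `call_llm` wrapper around your model API; both are stand-ins, not real library calls.

```python
def answer_with_rag(question: str, vector_search, call_llm, top_k: int = 5) -> str:
    """Retrieve relevant chunks from your data, then answer from them only."""
    # 1. Search your data for relevant context. vector_search is a stand-in for
    #    Pinecone, Weaviate, pgvector, etc., returning a list of text chunks.
    chunks = vector_search(query=question, top_k=top_k)
    context = "\n\n".join(chunks)

    # 2. Pass context + question to the LLM. The grounding instruction reduces
    #    (but does not eliminate) hallucinations.
    prompt = (
        "Answer the question using ONLY the context below. "
        'If the context is insufficient, say "I don\'t know."\n\n'
        f"Context:\n{context}\n\nQuestion: {question}"
    )

    # 3. The LLM generates an answer grounded in your data.
    return call_llm(prompt)
```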
4. Evaluation & Monitoring
AI outputs are non-deterministic. You need continuous quality monitoring.
Metrics I track:
- Response latency (p50, p95, p99)
- Cost per request
- User satisfaction (thumbs up/down)
- Task completion rate
- Error rate and types
Pro tip: Build a “golden test set” of known inputs/outputs. Run it on every prompt change to catch regressions.
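A golden test set can be as simple as a list of cases plus string checks. The sketch below assumes a `call_llm` function wrapping your prompt/model pipeline; the cases and checks are illustrative.

```python
# A golden test set: known inputs plus checks every output must pass.
GOLDEN_CASES = [
    {
        "input": "Summarize: 'Checkout crashes on Safari when the cart has 20+ items.'",
        "must_contain": ["checkout", "safari"],
        "must_not_contain": ["as an ai", "i'm sorry"],
    },
    # Add a case for every failure mode you've seen in production.
]

def run_golden_tests(call_llm) -> float:
    """Run every golden case through the current prompt/model; return the pass rate."""
    passed = 0
    for case in GOLDEN_CASES:
        output = call_llm(case["input"]).lower()
        ok = all(term in output for term in case["must_contain"]) and not any(
            term in output for term in case["must_not_contain"]
        )
        passed += ok
    return passed / len(GOLDEN_CASES)

# Run this in CI on every prompt change; fail the build if the pass rate drops.
```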
Lessons from Production
Here are hard-won lessons from shipping AI features:
1. Users Don’t Trust “Magic”
Initially, we hid the AI complexity. Mistake. Users need to understand:
- What the AI is doing
- Where data comes from
- Why it made a recommendation
Solution: Show your work. Cite sources. Explain reasoning. Let users verify.
2. Hallucinations Are Real
LLMs confidently generate plausible-sounding nonsense. In production, this is unacceptable.
Mitigations:
- Use RAG to ground responses in facts
- Add confidence scores (see the sketch after this list)
- Enable human review for critical paths
- Make it easy to report issues
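One way to combine the confidence-score and human-review ideas, as a hedged sketch: the `call_llm` wrapper, the JSON keys, and the 0.7 threshold are all assumptions, and self-reported confidence is only a rough proxy you'd want to calibrate against human-labeled samples.

```python
import json

REVIEW_THRESHOLD = 0.7  # Illustrative; tune against your own labeled data.

def answer_or_escalate(question: str, context: str, call_llm, review_queue: list) -> str:
    """Ask for an answer plus self-reported confidence and sources; send
    low-confidence or unsourced answers to human review instead of users."""
    prompt = (
        "Using only the context, answer the question. Respond as JSON with keys "
        '"answer", "confidence" (a number from 0 to 1), and "sources" '
        "(a list of quoted snippets from the context).\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    result = json.loads(call_llm(prompt))

    if result["confidence"] < REVIEW_THRESHOLD or not result["sources"]:
        review_queue.append({"question": question, "draft": result})
        return "A teammate is reviewing this answer; we'll follow up shortly."
    return result["answer"]
```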
3. Prompt Engineering Is Product Work
Your engineering team can integrate APIs. But crafting prompts that deliver consistent, high-quality outputs? That’s product work.
I’ve spent hours iterating on prompts, testing edge cases, and refining outputs. This isn’t a “set it and forget it” task.
4. Cost Management Is Critical
At scale, AI costs add up fast. GPT-4 API calls can cost $0.01-0.10 per request. If you have 10,000 users making 10 requests/day, that’s 100,000 requests a day, or roughly $30K-300K/month.
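The back-of-envelope math, using the rough per-request range above (not current price-list numbers):

```python
def monthly_cost(users: int, requests_per_user_per_day: float,
                 cost_per_request: float, days: int = 30) -> float:
    """Back-of-envelope monthly API spend."""
    return users * requests_per_user_per_day * cost_per_request * days

low = monthly_cost(10_000, 10, 0.01)   # $30,000/month
high = monthly_cost(10_000, 10, 0.10)  # $300,000/month
```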
Cost optimizations:
- Use cheaper models for simple tasks (GPT-3.5 vs. GPT-4)
- Batch requests when possible
- Cache common responses (sketched after this list)
- Set rate limits
- Monitor cost per feature
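Here's a sketch of two of those ideas together, caching repeated prompts and routing simple tasks to a cheaper model. The model names and the `is_simple_task` heuristic are placeholders, not recommendations.

```python
import hashlib

_cache = {}  # prompt hash -> cached response

def is_simple_task(prompt: str) -> bool:
    """Placeholder heuristic; in practice, classify by feature or task type."""
    return len(prompt) < 500

def cached_completion(prompt: str, call_llm) -> str:
    """Serve repeated prompts from cache and route by task complexity."""
    key = hashlib.sha256(prompt.encode()).hexdigest()
    if key not in _cache:
        # Cheaper model for simple tasks, flagship model for the rest.
        model = "cheap-model" if is_simple_task(prompt) else "flagship-model"
        _cache[key] = call_llm(prompt, model=model)
    return _cache[key]
```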
Getting Started: A Practical Roadmap
If you’re exploring AI features, here’s what I recommend:
Week 1: Explore & Experiment
- Play with ChatGPT, Claude, and Gemini directly
- Identify 3-5 high-value use cases in your product
- Prototype prompts manually (no code yet)
Week 2: Build a Proof of Concept
- Pick the highest-value use case
- Build a simple prototype (even a Slack bot works; a minimal sketch follows this list)
- Test with internal users
- Measure: Does this actually save time/add value?
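A proof of concept really can be this small. The sketch below assumes the OpenAI Python SDK (v1+) and an `OPENAI_API_KEY` in your environment; the support-reply use case is just an example, so swap in whichever provider and task you're evaluating.

```python
# pip install openai
from openai import OpenAI

client = OpenAI()  # Reads OPENAI_API_KEY from the environment.

def draft_reply(customer_message: str) -> str:
    """Example PoC use case: draft a support reply for a human to review."""
    response = client.chat.completions.create(
        model="gpt-4",  # Part of what the PoC should test: does a cheaper model suffice?
        messages=[
            {"role": "system", "content": "You draft concise, friendly support replies."},
            {"role": "user", "content": customer_message},
        ],
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    print(draft_reply("I was charged twice this month. Can you help?"))
```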
Week 3: Evaluate Feasibility
- Estimate cost at scale (users × requests/day × cost/request)
- Assess accuracy/quality (manual review of 100 responses)
- Identify edge cases and failure modes
- Decide: Build, refine, or pivot?
Week 4+: Iterate & Ship
- Start with a small beta cohort
- Collect feedback obsessively
- Iterate on prompts and UX
- Monitor quality and cost
- Expand gradually
The PM Role in AI Products
Your job isn’t to become an ML engineer. It’s to:
- Identify high-value problems AI can solve
- Define success metrics and quality thresholds
- Manage tradeoffs between cost, latency, and quality
- Design UX that builds trust and handles errors gracefully
- Monitor quality and iterate based on user feedback
The best AI features feel simple to users but are sophisticated under the hood. That sophistication comes from product thinking, not just technical implementation.
Final Thoughts
AI is a tool, not magic. The same PM fundamentals apply:
- Understand your users deeply
- Solve real problems
- Measure impact
- Iterate relentlessly
The difference is the solution space is expanding rapidly. Features that were impossible two years ago are now trivial. Stay curious, experiment often, and ship incrementally.
Building AI products? I’d love to hear about your experience. What’s working? What’s not? Reach out on X or LinkedIn.
License
This post is licensed under CC BY-NC 4.0. You may quote or translate with attribution. For commercial republishing, please contact me via LinkedIn.
Hero images on this site are AI-generated using Stable Diffusion.