01 What is Generative AI?
Generative AI is a class of AI systems that can produce high-quality content — specifically text, images, and audio. Examples include ChatGPT, Bard, and BingChat.
AI has been mentioned on 38% of S&P 500 earnings calls as of early 2024 — the most mentions in tech, followed by energy. Generative AI as a developer tool will be even more impactful in the long term. AI is already pervasive: Google and Bing for web search, fraud detection in credit card payments, recommender systems on Amazon and Netflix.
02 How Generative AI Works
Where it fits in the AI landscape
Think of AI as a toolbox. Supervised learning is the most important existing tool. Generative AI is the newest addition. The toolbox also contains unsupervised learning and reinforcement learning.
Supervised Learning
Given an input (A), the model learns to predict an output (B). A decade of large-scale supervised learning laid the foundation for generative AI. Early models were small; researchers discovered that larger models with more data continue to improve, which unlocked LLMs.
| Input (A) | Output (B) | Application |
|---|---|---|
| Email | Spam? (0/1) | Spam filtering |
| Ad, user info | Click? (0/1) | Online advertising |
| X-ray image | Diagnosis | Healthcare |
| Audio recording | Text transcript | Speech recognition |
| Restaurant reviews | Sentiment (pos/neg) | Reputation monitoring |
Large Language Models (LLMs)
LLMs are built by using supervised learning to repeatedly predict the next word. Train on hundreds of billions of words and you get ChatGPT.
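To make the "repeatedly predict the next word" idea concrete, here is a minimal sketch of how one sentence could be turned into several input (A) and output (B) training pairs. The function name `next_word_pairs` and the example sentence are illustrative assumptions, and real LLMs operate on tokens rather than whole words.

```python
def next_word_pairs(sentence: str) -> list[tuple[str, str]]:
    """Split a sentence into (input A, output B) pairs for next-word prediction."""
    words = sentence.split()
    pairs = []
    for i in range(1, len(words)):
        context = " ".join(words[:i])  # everything seen so far (input A)
        target = words[i]              # the word to predict (output B)
        pairs.append((context, target))
    return pairs


# Each pair is one supervised learning example drawn from the same sentence.
for context, target in next_word_pairs("My favourite food is a bagel"):
    print(f"{context!r} -> {target!r}")
```

A single six-word sentence yields five training examples, which is part of why a large text corpus provides so much training signal.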
LLM as a Thought Partner
LLMs give us new ways to find information — like having a knowledgeable collaborator always on hand. They can brainstorm, rewrite, summarise, and reason. But they sometimes hallucinate — making up plausible-sounding but incorrect facts. Use trustworthy websites for high-stakes topics (health, legal); LLMs are better where less structured web information exists.
03 Generative AI Applications
GenAI tasks fall into three broad modes: writing, reading, and chatting. Across all of them, the more context you give, the better the output.
Writing
- Brainstorm product names or campaign ideas
- Write press releases
- Translation
- Generate specific output with context
Reading
- Proofreading
- Summarising long articles or call centre conversations
- Customer email analysis and routing
- Reputation monitoring over time
Chatting
- Specialised chatbots (travel, cooking, IT support)
- Human-in-the-loop triage
- Deploy internal-facing first, monitor before going public
04 What LLMs Can and Cannot Do
A useful mental model: think of an LLM as a fresh college graduate with lots of general knowledge but no company-specific context, no internet access, and no memory of past sessions. You get a different fresh graduate every conversation.
The model's knowledge is frozen at training time — it won't have details on recent events.
Ask it for Shakespeare quotes about Beyoncé and it will confidently invent them. Always verify factual claims.
LLMs accept input + output of a limited combined length (the context window). Very long documents may not fit.
LLMs work best with unstructured data (text, images, audio). For tabular data, supervised learning is usually better.
LLMs are trained on internet data, which carries societal biases. The model learns and can reproduce those biases.
05 Tips for Prompting
1. Be detailed and specific
Give sufficient context. Describe the task in detail. Vague prompt → vague output.
Initial prompt
With context + detail
2. Guide the model to think through its answer
Add explicit steps to the prompt so the model follows a structured reasoning path.
Final prompt
3. Experiment and iterate
- There are no perfect prompts — adjust and improve until results are satisfactory
- Don't overthink the initial prompt; just start
- Be careful with confidential information
- Always double-check before fully trusting the output
06 Image Generation — Diffusion Models
Image generation is mostly done via Diffusion Models — a supervised learning approach at heart. Training: take a clean image + caption, progressively add noise across steps. The model learns to predict a less noisy image from a noisier one. Generation works in reverse: start from pure noise + a text prompt and iteratively denoise until a clean image appears (typically ~100 steps).
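The training setup described above can be sketched in a few lines. This is a toy illustration only: the "image" is a short list of pixel values, the noise is plain Gaussian noise with a fixed scale, and the names `add_noise` and `make_training_pairs` are assumptions made for this example. Real diffusion models use a carefully designed noise schedule and a neural network that is also conditioned on the caption.

```python
import random


def add_noise(image: list[float], noise_scale: float, rng: random.Random) -> list[float]:
    """Return a copy of the image with Gaussian noise added to every pixel."""
    return [pixel + rng.gauss(0.0, noise_scale) for pixel in image]


def make_training_pairs(image, num_steps=5, noise_scale=0.3, seed=0):
    """Build (noisier input, less noisy target) pairs from one clean image."""
    rng = random.Random(seed)
    versions = [list(image)]  # step 0 is the clean image
    for _ in range(num_steps):
        versions.append(add_noise(versions[-1], noise_scale, rng))
    # A model would be trained to map versions[t + 1] back to versions[t].
    return [(versions[t + 1], versions[t]) for t in range(num_steps)]


pairs = make_training_pairs([0.0, 0.5, 1.0, 0.5])
print(f"{len(pairs)} supervised examples from one image")
```

Generation would then run the learned denoising step repeatedly, starting from pure noise, which is not shown here.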
07 Using GenAI in Software Applications
Before GenAI, building a reading, writing, or chatbot app required months of ML engineering. Now a working prototype can be built in hours or days. Example — a reputation monitoring system:
| Approach | Steps | Time |
|---|---|---|
| Traditional ML | Label data → train model → cloud deployment | Months |
| Prompt-based | Write a prompt classifying review sentiment → call LLM API | Hours / days |
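As a rough sketch of the prompt-based row, the classifier below is just a prompt template plus a call to a model. The `call_llm` parameter stands in for whatever LLM API is used and is a hypothetical placeholder, as are the prompt wording and the `fake_llm` stub that lets the example run offline.

```python
from typing import Callable


def build_sentiment_prompt(review: str) -> str:
    """Wrap a review in instructions asking for a one-word sentiment label."""
    return (
        "Classify the sentiment of the following restaurant review "
        "as positive or negative. Answer with one word.\n\n"
        f"Review: {review}\nSentiment:"
    )


def classify_review(review: str, call_llm: Callable[[str], str]) -> str:
    """Send the prompt to the model and normalise its answer."""
    answer = call_llm(build_sentiment_prompt(review)).strip().lower()
    if answer.startswith("positive"):
        return "positive"
    if answer.startswith("negative"):
        return "negative"
    return "unknown"  # unexpected answers are flagged rather than guessed


def fake_llm(prompt: str) -> str:
    """Stand-in for a real API call so the sketch runs without network access."""
    return "Positive" if "best" in prompt.lower() else "Negative"


print(classify_review("The best soup dumplings I have had!", fake_llm))
```

Swapping `fake_llm` for a real API client is the only change needed to try this against an actual model, which is why the prototype can come together so quickly.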
08 Lifecycle of a Generative AI Project
Building GenAI applications is a highly empirical, iterative process — not a straight line from idea to deployment.
09 Retrieval Augmented Generation (RAG)
RAG lets an LLM answer questions using specific documents you provide at query time, rather than relying purely on its training knowledge. Think of the LLM as a reasoning engine, not a knowledge store.
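The flow can be sketched as two steps: retrieve the most relevant document, then place it in the prompt. The example below is a simplified illustration under stated assumptions: retrieval is done by counting shared words (production systems typically use embedding-based search), and the function names, prompt wording, and sample documents are invented for this sketch.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, documents: list[str], top_k: int = 1) -> list[str]:
    """Rank documents by how many words they share with the question."""
    question_words = tokenize(question)
    ranked = sorted(
        documents,
        key=lambda doc: len(question_words & tokenize(doc)),
        reverse=True,
    )
    return ranked[:top_k]


def build_rag_prompt(question: str, documents: list[str]) -> str:
    """Put the retrieved text into the prompt so the LLM can reason over it."""
    context = "\n".join(f"- {doc}" for doc in retrieve(question, documents))
    return (
        "Use only the context below to answer the question.\n"
        f"Context:\n{context}\n"
        f"Question: {question}\nAnswer:"
    )


docs = [
    "Employees may park in the garage on level 2.",
    "The cafeteria is open from 8am to 3pm.",
    "Expense reports are due on the last Friday of each month.",
]
print(build_rag_prompt("Where can employees park?", docs))
```

The resulting prompt would then be sent to the LLM, which answers from the supplied context instead of from memory.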
RAG Applications
- Chat with PDFs — PDF.ai, ChatPDF, AskYourPDF, docAnalyzer.ai
- Article Q&A — CourseraCoach, Snapchat AI, Hubspot chatbot
- New web search forms — BingChat, You.com
10 Fine-tuning, Pretraining & Model Choices
Fine-tuning
Useful when the relevant context is larger than the model's input context length, or when the LLM needs domain-specific language (medical notes, legal, financial documents). It can also let a smaller model match a larger one on a specific task, with lower cost, lower latency, and the option to run on-device.
Pretraining
- Expensive — requires $10M+, many months, and vast data
- When in doubt: don't do it. Should be a last resort.
- BloombergGPT is a 50B-parameter model pretrained on a corpus that includes proprietary Bloomberg financial data; it was reported to outperform comparably sized open models on financial text
Model Size
Closed vs Open Source
| | Closed Source | Open Source |
|---|---|---|
| Access | Via cloud API — easy to integrate | Full control, run on own hardware |
| Power | Generally more powerful/large | Rapidly catching up |
| Cost | Relatively inexpensive (contributed by large cos.) | Infra costs shift to you |
| Risk | Vendor lock-in — switching means re-running all prompts | Full data and privacy control |
Instruction Fine-tuning & RLHF
A base LLM predicts the next word, so if you ask "What is the capital of France?" it might respond with another question (a likely continuation in its training data). Instruction fine-tuning trains it on prompt-and-response pairs so that it follows instructions instead. RLHF (Reinforcement Learning from Human Feedback) goes further: train a reward model on human preference scores, then tune the LLM to generate responses the reward model rates highly. Companies use RLHF to make LLMs Helpful, Honest, and Harmless (the 3 Hs).
Tool Use & Agents
- Tools — LLMs can call external functions (place an order, run a calculation). LLMs are bad at precise math; a calculator tool solves this by letting the model offload computation and receive the result
- Agents — use LLMs to carry out complex sequences of actions autonomously. Cutting-edge research area, evolving rapidly
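The calculator case can be sketched as follows. This is an illustration under assumptions, not how any particular API works: the `CALCULATOR[...]` marker format and the function names are invented for this example, whereas real LLM APIs usually expose tool use as structured function calls. The application, not the model, performs the arithmetic and substitutes the result.

```python
import ast
import operator
import re

# Only these arithmetic operators are allowed in a calculator request.
_OPERATORS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def calculate(expression: str) -> float:
    """Safely evaluate +, -, *, / arithmetic without using eval()."""
    def evaluate(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPERATORS:
            return _OPERATORS[type(node.op)](evaluate(node.left), evaluate(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -evaluate(node.operand)
        raise ValueError(f"Unsupported expression: {expression!r}")

    return evaluate(ast.parse(expression, mode="eval").body)


def run_tools(model_output: str) -> str:
    """Replace every CALCULATOR[...] marker with the computed result."""
    return re.sub(
        r"CALCULATOR\[([^\]]+)\]",
        lambda match: f"{calculate(match.group(1)):.2f}",
        model_output,
    )


print(run_tools("After one year at 5% interest you would have $CALCULATOR[100 * 1.05]."))
```

Parsing the expression with `ast` rather than calling `eval()` keeps the tool limited to arithmetic, which matters because the expression text comes from model output.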
11 Cost Intuition
Pricing is based on input + output tokens. Output tokens are generally more expensive. A token is roughly ¾ of a word — 300 words ≈ 400 tokens. Common short words ("the", "a") count as 1 token; longer words split ("program" + "ming").
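A back-of-the-envelope estimate follows directly from the ¾-word rule of thumb. In the sketch below the function names are assumptions and the per-1,000-token prices are hypothetical placeholders, not real rates; check the provider's current pricing before relying on any number.

```python
WORDS_PER_TOKEN = 0.75  # rule of thumb from the text: a token is about 3/4 of a word


def estimate_tokens(word_count: int) -> int:
    """Convert a word count into an approximate token count."""
    return round(word_count / WORDS_PER_TOKEN)


def estimate_cost(
    input_words: int,
    output_words: int,
    input_price_per_1k: float,
    output_price_per_1k: float,
) -> float:
    """Estimate the cost of one request from word counts and per-1,000-token prices."""
    input_tokens = estimate_tokens(input_words)
    output_tokens = estimate_tokens(output_words)
    return (input_tokens / 1000) * input_price_per_1k + (
        output_tokens / 1000
    ) * output_price_per_1k


# Hypothetical prices: $0.001 per 1,000 input tokens, $0.002 per 1,000 output tokens.
cost = estimate_cost(300, 300, input_price_per_1k=0.001, output_price_per_1k=0.002)
print(f"{estimate_tokens(300)} tokens each way, about ${cost:.4f} per request")
```

Multiplying the per-request figure by expected request volume gives a first estimate of running cost.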
12 Generative AI & Business
Day-to-day uses
- Writing assistant & thought partner — drafting, editing, ideating
- Marketing — brainstorming email campaigns, ad copy
- Recruiting — summarising candidate reviews in 50 words or less
- Programming — writing Python, debugging, documentation
Task Analysis Framework
AI doesn't automate jobs — it automates tasks. Most jobs are a collection of many tasks. The framework: list the tasks in a role → assess each for GenAI potential along two dimensions:
| Technical Feasibility | Business Value |
|---|---|
| Can AI do it? At what cost? | How much time is spent on this task? |
| Think: fresh grad following prompt instructions | Does doing it faster/cheaper create real value? |
| If unsure, try prompting an LLM to test it | Does doing this more consistently unlock new revenue? |
| An ML engineer can assess RAG/fine-tuning feasibility | Look beyond cost savings — workflow expansion matters |
Augmentation vs Automation
- Augmentation — AI helps a human do the task faster (human-in-the-loop)
- Automation — AI performs the task autonomously once trust is established
Job impacts across roles
- Programmer — writing code and documentation: high potential. Reviewing others' code, gathering requirements: lower
- Lawyer — drafting and reviewing documents, interpreting regulations: high potential. Negotiating, reviewing evidence: lower
- Landscaper — most tasks: low potential. Physical roles remain less impacted than knowledge roles
13 Teams to Build GenAI Software
| Role | Responsibility |
|---|---|
| Software Engineer | Writes the app; ideally knows LLM basics |
| ML Engineer | Implements the AI system; familiar with prompting, RAG, fine-tuning |
| Product Manager | Identifies and scopes the project |
| Prompt Engineer | Usually not a standalone role — typically an ML engineer with extra skills |
| Data Engineer | Organises data, ensures data quality (larger teams) |
| Data Scientist | Analyses data and guides project/business decisions (larger teams) |
A one-person team can be a software engineer with prompting know-how, or an ML engineer. A two-person team: ML engineer + software engineer. Many configurations work — start lean and scale the team as the product complexity grows.
14 Generative AI & Society
Concerns about AI
1. Will AI amplify human biases? Yes, it can reproduce biases present in its training data. Mitigations: fine-tuning on curated data, and RLHF with a reward model that scores biased outputs poorly.
2. Who loses their job? Geoff Hinton predicted in 2016 that radiologists would be replaced within 5 years. Years later, radiologists are still in demand, because they perform 30+ tasks and X-ray interpretation is just one of them. The better framing: "AI won't replace radiologists, but radiologists who use AI will replace radiologists who don't." (Curtis Langlotz, Stanford)
3. Human extinction? The arguments are not concrete. Perfect control isn't needed for a technology to be valuable and safe: we can't control turbulence on aeroplanes, but we still fly. AI will also be a key tool in addressing real risks such as climate change and pandemics.
Artificial General Intelligence (AGI)
General-purpose AI ≠ Artificial General Intelligence. AGI by definition can do any intellectual task a human can — drive a car after 20 hours of practice, complete a PhD thesis in 5 years. Though AI is powerful and better than humans at specific tasks, expecting AGI-level performance is an extremely high bar and not where we are today.
Responsible AI
15 Building a More Intelligent World
Intelligence is the power to apply knowledge and skills to make good decisions. Human intelligence is expensive — education, training, time. Artificial intelligence is cheap. AI has the potential to give every individual the ability to access intelligence at low cost.
AI is the new electricity — with the potential to revolutionize all industries and all corners of human life. The fear of AI today is similar to the fear of electricity when it was new. Today, nobody would give up light, heat, and refrigeration for fear of electrocution.
(Andrew Ng)
Looking beyond AI, the world faces climate change, pandemics, and war. Solving them will require all the intelligence we can muster, including artificial intelligence.