LLMs in practice: capabilities, limits, hallucinations
Classify which tasks an AI assistant is reliably strong at (summarizing, rewriting, translating, reformatting, first drafts) so you hand it work it can actually do well.
- Time: 20–25 min
- Type: exercise
- Bloom: Apply → Create
- XP: 100

[Figure: where an AI assistant is reliable versus where it fails. 'Trust it' lane: summarize, rewrite, translate, draft. 'Verify it' lane: hallucination (confident but invented), stale knowledge (does not know what happened after training), long-document drift (loses the thread on very long material). Takeaway: a confident tone looks the same on both sides, so you sort the work by task type, not by how sure the answer sounds.]
You'll be able to
- Classify which tasks an AI assistant is reliably strong at (summarizing, rewriting, translating, reformatting, first drafts) so you hand it work it can actually do well.
- Diagnose the predictable ways it fails so you catch them before they cost you: hallucination (confident, well-written, and wrong), stale knowledge (it does not know what happened after it was trained), and losing track of long documents.
- Apply those limits to how you use the tool: trust it on fluency, verify it on facts, and never let a confident tone stand in for being correct.
Hook
You paste an AI answer into a client document and send it. Two days later the client writes back: the regulation it cited does not exist. The model said it in the same confident voice as everything else. Why does it sound so sure when it is wrong, and what does that change about how you use it?
Your task: Write a prompt that asks Claude to recommend the right AI setup for a real task you're facing, then weigh its answer against this lesson, "LLMs in practice: capabilities, limits, hallucinations."
A strong prompt: role · context · task · format · example

Exercise · scenario
A regional hospital system is deploying an LLM-powered chatbot to answer patient questions about medication side effects and drug interactions. During testing, the bot occasionally provides confident-sounding responses about rare drug combinations that contradict the hospital's pharmaceutical database. When developers check the training data, they find no specific information about these combinations. The bot generates plausible-sounding medical terminology and citation formats that don't correspond to real studies.
Deliverable
Add a page to your AI Fluency Playbook called **Where I Trust It, Where I Check It**. Catalog three tasks from your own work where you use, or could use, an AI assistant. For each one, write down: (1) the task in plain words (summarize, draft, translate, look up a fact, calculate), (2) whether it sits in the tool's strength or its blind spot, (3) the failure you would watch for if it is in the blind spot (hallucination, stale knowledge, or losing the thread on long material), and (4) the one verification step you will run before that output goes anywhere with your name on it.
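The four fields of a playbook entry can also be written down as a small data structure, which makes it easy to spot an entry that is missing its failure mode or its verification step. This is only a sketch under assumptions: the names `PlaybookEntry`, `STRENGTHS`, and `FAILURE_MODES` are hypothetical, and the strength list simply mirrors the task types this lesson names.

```python
from __future__ import annotations

from dataclasses import dataclass

# Task types this lesson places in the tool's strength (assumed labels).
STRENGTHS = {"summarize", "rewrite", "translate", "reformat", "draft"}

# The three failure modes named in this lesson.
FAILURE_MODES = {"hallucination", "stale knowledge", "long-document drift"}


@dataclass
class PlaybookEntry:
    task: str               # (1) the task in plain words
    task_type: str          # e.g. "summarize", "look up a fact", "calculate"
    failure_to_watch: str   # (3) one of FAILURE_MODES, or "" in the strength lane
    verification_step: str  # (4) the one check you run before the output ships

    @property
    def lane(self) -> str:
        # (2) Sort by task type, never by how confident the answer sounds.
        return "strength" if self.task_type in STRENGTHS else "blind spot"

    def problems(self) -> list[str]:
        """Return what is still missing from this entry."""
        issues = []
        if self.lane == "blind spot" and self.failure_to_watch not in FAILURE_MODES:
            issues.append("name the failure mode to watch for")
        if not self.verification_step.strip():
            issues.append("add a verification step")
        return issues


# Hypothetical example entry.
entry = PlaybookEntry(
    task="Look up the current filing deadline for a client memo",
    task_type="look up a fact",
    failure_to_watch="stale knowledge",
    verification_step="Check the regulator's own website",
)
print(entry.lane, entry.problems())
```

The design choice worth noticing is that `lane` is derived from `task_type` alone. Nothing about the wording or tone of the AI's answer enters the decision.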
Model answer
Hallucination: the model completes a plausible-looking pattern without any source to ground it.
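The model answer points at missing grounding. One way to picture the fix for the hospital scenario is a lookup that answers only from the reference database and refuses when no record exists. This is a minimal sketch, not the hospital's system: `INTERACTIONS`, `FALLBACK`, and `grounded_answer` are hypothetical names, and the single entry is a placeholder, not medical information.

```python
# Hypothetical stand-in for the hospital's pharmaceutical database.
# Keys are unordered drug pairs; the entry is a placeholder, not a medical fact.
INTERACTIONS = {
    frozenset({"drug_a", "drug_b"}): "Placeholder: documented interaction text.",
}

FALLBACK = "No verified record for this combination. Please ask a pharmacist."


def grounded_answer(drug_1: str, drug_2: str) -> str:
    """Answer only from the database; never let the model fill a gap."""
    pair = frozenset({drug_1.strip().lower(), drug_2.strip().lower()})
    record = INTERACTIONS.get(pair)
    if record is None:
        # No source to ground on: refuse instead of generating plausible text.
        return FALLBACK
    return record


print(grounded_answer("drug_a", "drug_c"))
```

In this arrangement the language model would at most rephrase a record that already exists. It is never asked to supply the fact itself.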
Practice · Scenarios
Scenario 1 of 8
An e-commerce platform uses an LLM to generate product descriptions from manufacturer specifications. The system excels at creating engaging copy for electronics and apparel, transforming technical specs into customer-friendly language. However, when asked to build original product comparison charts or to optimize shipping costs across multiple warehouses, the model produces inconsistent numerical results and logical errors in multi-step reasoning, even when the calculations are explicitly shown in the prompt.
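The numerical half of this scenario is the kind of work that ordinary code does the same way every time, which leaves the model free to write the copy. The sketch below assumes a flat handling fee plus a per-kilogram rate. The warehouse names, fees, and rates are invented for illustration and do not come from the scenario.

```python
from __future__ import annotations

# Hypothetical rates (currency units per kg) and handling fees; illustrative only.
RATES_PER_KG = {"north": 2.50, "south": 3.10, "west": 2.75}
HANDLING_FEE = {"north": 4.00, "south": 1.00, "west": 2.00}


def shipping_cost(warehouse: str, weight_kg: float) -> float:
    """Flat handling fee plus a per-kilogram rate, rounded to 2 decimals."""
    return round(HANDLING_FEE[warehouse] + RATES_PER_KG[warehouse] * weight_kg, 2)


def cheapest_warehouse(weight_kg: float) -> tuple[str, float]:
    """Return the (warehouse, cost) pair with the lowest cost for this weight."""
    costs = {w: shipping_cost(w, weight_kg) for w in RATES_PER_KG}
    best = min(costs, key=costs.get)
    return best, costs[best]


print(cheapest_warehouse(2.0))
print(cheapest_warehouse(10.0))
```

With these made-up numbers the cheapest warehouse changes as the parcel gets heavier, which is exactly the sort of multi-step comparison the scenario says the model gets wrong.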
Sources
- [1] Frontiers of Computer Science · A Survey of Large Language Models (2026) · Research
- [2] arXiv · Mitigating Hallucination in Large Language Models: An Application-Oriented Survey on RAG, Reasoning, and Agentic AI (2025) · Research