AI Fluency Foundations
Lesson 3 of 14

LLMs in practice: capabilities, limits, hallucinations

Time: 20–25 min
Type: exercise
Bloom: Apply → Create
XP: 100
Diagram (Lesson 2.3, concept architecture): a 'Trust it' lane with the language tasks the tool is strong at (summarize, rewrite, translate, draft) beside a 'Verify it' lane with the three failure modes (hallucination: confident but invented; stale knowledge: does not know what happened after training; long-document drift: loses the thread on very long material). Takeaway: a confident tone looks the same on both sides, so you sort the work by task type, not by how sure the answer sounds.

You'll be able to

  • Classify which tasks an AI assistant is reliably strong at (summarizing, rewriting, translating, reformatting, first drafts) so you hand it work it can actually do well.
  • Diagnose the predictable ways it fails so you catch them before they cost you: hallucination (confident, well-written, and wrong), stale knowledge (it does not know what happened after it was trained), and losing track of long documents.
  • Apply those limits to how you use the tool: trust it on fluency, verify it on facts, and never let a confident tone stand in for being correct.

Hook

You paste an AI answer into a client document and send it. Two days later the client writes back: the regulation it cited does not exist. The model said it in the same confident voice as everything else. Why does it sound so sure when it is wrong, and what does that change about how you use it?

Prompt Lab

Your task: Write a prompt that asks Claude to recommend the right AI setup for a real task you're facing, then weigh its answer against this lesson, "LLMs in practice: capabilities, limits, hallucinations."

A strong prompt: role · context · task · format · example

Exercise · scenario

A regional hospital system is deploying an LLM-powered chatbot to answer patient questions about medication side effects and drug interactions. During testing, the bot occasionally gives confident-sounding responses about rare drug combinations that contradict the hospital's pharmaceutical database. When developers check the training data, they find no specific information about these combinations. The bot generates plausible-sounding medical terminology and well-formatted citations that don't correspond to real studies.

Deliverable

Add a page to your AI Fluency Playbook called "Where I Trust It, Where I Check It". Catalog three tasks from your own work where you use, or could use, an AI assistant. For each one, write down:

  1. The task in plain words (summarize, draft, translate, look up a fact, calculate).
  2. Whether it sits in the tool's strength or its blind spot.
  3. The failure you would watch for if it is in the blind spot (hallucination, stale knowledge, or losing the thread on long material).
  4. The one verification step you will run before that output goes anywhere with your name on it.
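
If you keep your Playbook in a structured form, each entry on this page can be treated as a small record with the four fields described above. The Python sketch below is only an illustration. The names `PlaybookEntry`, `STRENGTHS`, `FAILURE_MODES`, and `problems` are hypothetical, and treating every task type outside the lesson's list of strengths as a blind spot is a simplifying assumption, not a rule from the lesson.

```python
from dataclasses import dataclass
from typing import List, Optional

# Task types the lesson names as strengths. Assumption: anything else is
# treated as a blind spot for the purposes of this sketch.
STRENGTHS = {"summarize", "rewrite", "translate", "reformat", "draft"}

# The three failure modes named in the lesson.
FAILURE_MODES = {"hallucination", "stale knowledge", "long-document drift"}


@dataclass
class PlaybookEntry:
    task: str                        # (1) the task in plain words
    task_type: str                   # used to decide (2) strength or blind spot
    failure_to_watch: Optional[str]  # (3) only needed for blind-spot tasks
    verification_step: str           # (4) the check you run before sending

    @property
    def lane(self) -> str:
        """Sort by task type, not by how confident the answer sounds."""
        return "trust" if self.task_type in STRENGTHS else "verify"

    def problems(self) -> List[str]:
        """Return what is still missing from this entry."""
        issues = []
        if self.lane == "verify" and self.failure_to_watch not in FAILURE_MODES:
            issues.append("name the failure mode to watch for")
        if not self.verification_step.strip():
            issues.append("add one verification step")
        return issues


entry = PlaybookEntry(
    task="Look up the filing deadline for a client memo",
    task_type="look up a fact",
    failure_to_watch="stale knowledge",
    verification_step="Check the date against the regulator's own website",
)
print(entry.lane, entry.problems())  # prints: verify []
```

The point of the sketch is that the lane comes from the task type alone. Nothing in the record depends on how sure the assistant's answer sounded.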

Model answer (scenario): hallucination due to pattern completion without grounding.

Practice · Scenarios

Scenario 1 of 8

An e-commerce platform uses an LLM to generate product descriptions from manufacturer specifications. The system excels at creating engaging copy for electronics and apparel, transforming technical specs into customer-friendly language. However, when tasked with creating original product comparison charts or calculating shipping cost optimization across multiple warehouses, the model produces inconsistent numerical results and logical errors in multi-step reasoning, even when the calculations are explicitly shown in the prompt.

Step 1 · Classify

Capstone artifact · auto-graded

Submit your work for review

Paste your capstone artifact below. You'll get back a 4-level rubric grade, per-criterion feedback, and three concrete edits to strengthen it.
