AWS Certified Generative AI Developer - Professional: Multi-step tool calls + tracing
FULL TRANSCRIPT
Day 20, multi-step tool calls and
tracing. Day 20 is where AWS stops
testing whether you can design agents
and starts testing whether you can debug
them in the real world because a system
you can't see inside is a system you
can't trust, fix, or ship.
Imagine this. A country runs an AI power
grid operator agent. Its job is to
detect outages, identify affected
regions, check safety rules, dispatch
repair crews, and notify authorities.
This is not one API call. It's a chain
of decisions and actions. One night,
something goes wrong. The agent
dispatches the wrong crew. It repeats
the same API call. It takes 45 seconds
to respond. Management asks one
question. Why did the agent do that? And
suddenly, you realize something
terrifying. Without tracing, you have no
idea.
Let's make this clear. Multi-step tool
calls mean the agent is calling multiple
tools in sequence where each result
changes what happens next. This is not a
static workflow. This is not a batch
job. This is not parallel execution. It
is dynamic, stateful, and
decision-driven. And that's exactly why
it breaks in surprising ways. Here's a
simple example. A user says, "Restore
power to the affected suburbs." The
agent does not solve this in one step.
It calls a tool to detect the outage,
then a tool to find affected regions,
then a tool to check safety rules, then
a tool to dispatch a repair crew, then a
tool to notify authorities. Each step
depends on the previous one. If step two
is wrong, everything after it is wrong.
This is why multi-step agents are
powerful and dangerous at the same time.
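Here's a sketch of that chain in code. The tool names and return shapes are illustrative stand-ins, not real AWS APIs; the point is that each tool consumes the previous tool's output, so a bad step two poisons steps three through five.

```python
# Hypothetical outage-restoration chain: each tool's output feeds
# the next tool's input. Tool names and payloads are made up for
# illustration — they are not real power-grid or AWS APIs.

def detect_outage(report: str) -> dict:
    return {"outage_id": "OUT-7", "severity": "high"}

def find_affected_regions(outage: dict) -> list:
    return ["suburb-a", "suburb-b"]

def check_safety_rules(regions: list) -> dict:
    return {"cleared": regions, "blocked": []}

def dispatch_crew(clearance: dict) -> dict:
    return {"crew_id": "CREW-3", "regions": clearance["cleared"]}

def notify_authorities(dispatch: dict) -> str:
    return f"Crew {dispatch['crew_id']} dispatched to {len(dispatch['regions'])} regions"

def restore_power(report: str) -> str:
    outage = detect_outage(report)             # step 1
    regions = find_affected_regions(outage)    # step 2 — if this is wrong,
    clearance = check_safety_rules(regions)    # steps 3-5 act on bad data
    dispatch = dispatch_crew(clearance)
    return notify_authorities(dispatch)
```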
Now, lock this loop into your head. A
goal comes in. The planner decides the
next step. A tool is called. The tool
returns structured output. The agent
observes the result. Memory is updated.
Then the planner decides again. Agents
loop. They don't run once and stop. If
you understand that loop, you understand
why tracing is mandatory. Here's the
brutal truth AWS wants you to
internalize.
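That loop can be sketched in a few lines. This is a minimal stub, assuming a hard-coded planner; in a real agent the planner is an LLM call, but the shape — decide, call a tool, update memory, decide again, stop at a step budget — is the same.

```python
# Minimal agent loop sketch: goal in, planner picks the next step,
# a tool runs, memory is updated, the planner decides again.
# The planner here is a deterministic stub standing in for an LLM.

def planner(goal: str, memory: list):
    steps = ["detect", "dispatch", "notify"]   # illustrative plan
    done = {m["tool"] for m in memory}
    for step in steps:
        if step not in done:
            return step
    return None  # goal satisfied — stop looping

def run_agent(goal: str, tools: dict, max_steps: int = 10) -> list:
    memory = []
    for _ in range(max_steps):                 # step budget: agents loop,
        step = planner(goal, memory)           # so they need a ceiling
        if step is None:
            break
        result = tools[step]()                 # call the chosen tool
        memory.append({"tool": step, "result": result})  # observe + remember
    return memory
```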
If something goes wrong, you must be
able to answer which tool was called,
with what input in what order, how long
did each step take, which step failed,
did the agent retry, did it loop? If you
can't answer those questions, the system
is unownable. AWS exams quietly reward
engineers who think this way. So, what
does tracing mean in AWS terms? Tracing
means end-to-end visibility across the
entire agent flow. You can see LLM
decisions, tool calls, Lambda
executions, retries, timeouts, and
failures. Tracing lets you reconstruct
exactly what the agent did and why it
did it. In AWS, this is built from three
pillars. CloudWatch Logs capture tool
inputs, outputs, and errors. AWS X-Ray
gives you distributed traces showing
latency and dependencies across steps,
and structured logging ties everything
together using trace IDs and correlation
IDs. These are not optional in
production and they are absolutely
exam-friendly answers. Here's what AWS
expects you to log for each tool call.
You log a trace ID so steps can be
linked. You log the step number. You log
the tool name. You log the input
parameters. You log the output status.
You log latency in milliseconds. And you
log error codes if anything fails. With
that, you can debug, replay, audit, and
even analyze cost. Without it, you are
guessing. Let's talk about the failures
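Those fields map directly to a structured log wrapper. Here's a sketch using only the Python standard library; the field names follow the list above, and the wrapper itself is an assumption, not an AWS SDK call.

```python
import json
import logging
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("agent")

def log_tool_call(trace_id: str, step: int, tool_name: str, params: dict, fn):
    """Run one tool and emit a single structured JSON log line with
    the fields named above: trace ID, step number, tool name, input
    parameters, output status, latency in ms, and error code."""
    start = time.perf_counter()
    record = {"trace_id": trace_id, "step": step,
              "tool": tool_name, "input": params}
    try:
        result = fn(**params)
        record.update(status="ok", error_code=None)
    except Exception as exc:
        result = None
        record.update(status="error", error_code=type(exc).__name__)
    record["latency_ms"] = round((time.perf_counter() - start) * 1000, 2)
    log.info(json.dumps(record))   # one JSON object per line: queryable later
    return record, result
```

Because every line is a JSON object keyed by `trace_id`, you can later filter one run end to end with CloudWatch Logs Insights.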
AWS loves to test. First, infinite
loops. The symptom is simple. The same
tool keeps getting called. The cause is
almost always the same. Memory isn't
updated or the tool result is ignored.
The fix is not a better prompt. The fix
is updating memory after each step and
enforcing a maximum number of steps.
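Both fixes fit in a small guard object. This is a sketch of the idea, not a framework API: cap total steps, and flag a repeated tool-plus-input call, which is the classic signature of memory never being updated.

```python
# Two guardrails against infinite loops: a hard step budget, and a
# duplicate-call check — the same tool called with the same input
# almost always means the tool result was never written to memory.

class LoopGuard:
    def __init__(self, max_steps: int = 8):
        self.max_steps = max_steps
        self.seen = set()
        self.steps = 0

    def check(self, tool_name: str, params: tuple) -> None:
        self.steps += 1
        if self.steps > self.max_steps:
            raise RuntimeError("step budget exhausted — aborting run")
        call = (tool_name, params)
        if call in self.seen:
            raise RuntimeError(f"repeated call {call!r} — memory not updated?")
        self.seen.add(call)
```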
Second, slow responses. The agent takes
too long. This usually comes from too
many steps, slow tools, or no streaming.
The fix is to stream partial responses,
cache tool results, and reduce
unnecessary steps. AWS wants you to
optimize architecture, not just models.
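Caching tool results can be as simple as memoizing the tool function. The sketch below assumes the tool is idempotent and its answer is stable for the life of the run (a region lookup, say) — never cache tools with side effects like dispatching a crew.

```python
# Memoize a slow, idempotent tool so a repeated step doesn't pay
# the latency twice. CALL_COUNT just proves the underlying "API"
# ran once; the tool itself is a made-up stand-in.

from functools import lru_cache

CALL_COUNT = {"lookups": 0}

@lru_cache(maxsize=128)
def find_affected_regions(outage_id: str) -> tuple:
    CALL_COUNT["lookups"] += 1   # stands in for a slow network call
    return ("suburb-a", "suburb-b")
```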
Third, wrong actions. The agent executes
the wrong tool. This happens when tool
schemas are unclear, tool names are
ambiguous, or the planner lacks facts.
The fix is improving tool descriptions,
adding RAG before planning, and
validating tool outputs. Again, not a
bigger model. Here's something subtle
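Here's what an unambiguous tool definition plus output validation might look like. The JSON-schema style is similar to what agent frameworks use for tool specs, but this exact shape is illustrative, not a real AWS API.

```python
# A tool spec with a precise description and strict input schema,
# plus a validator that fails fast on unusable tool output. The
# structure is illustrative, not a literal Bedrock API payload.

DISPATCH_TOOL = {
    "name": "dispatch_repair_crew",
    "description": (
        "Dispatch ONE repair crew to a list of safety-cleared regions. "
        "Use only after check_safety_rules has returned the cleared list."
    ),
    "input_schema": {
        "type": "object",
        "properties": {
            "regions": {"type": "array", "items": {"type": "string"}},
        },
        "required": ["regions"],
    },
}

def validate_dispatch_output(output: dict) -> dict:
    # Reject output the planner can't act on, instead of letting a
    # malformed result silently steer the next step.
    if not isinstance(output.get("crew_id"), str):
        raise ValueError("dispatch output missing crew_id")
    return output
```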
AWS loves. Tracing is not just for
debugging. Tracing also shows you which
steps cost the most, which tools
dominate latency, and where token usage
spikes. If the exam asks, "How do you
identify bottlenecks in an agent?" The
answer is not guessing. The answer is
tracing and logs. Now, watch for the
traps. Do not use a bigger model to
debug behavior. Do not add more prompts
to understand failures. Do not rely on
chat history as tracing. None of those
give you observability. Only structured
logs and distributed tracing do. Here is
the one sentence to memorize. Multi-step
agents must be traceable or they are
unfixable. Say that once and day 20 is yours.
Final self test. An agent calls several
tools dynamically and sometimes fails in
production. How do you determine where
it failed? Add structured logging and
distributed tracing using CloudWatch and
X-Ray. That's day 20 mastered.