AWS Certified Generative AI Developer - Professional: Add RAG to the agent

5m 7s, 779 words, 128 segments, English

FULL TRANSCRIPT

0:06

Add RAG to the agent. Day 19 is where

0:09

everything finally clicks together. This

0:11

is the moment AWS checks whether you

0:13

understand the difference between an

0:15

agent that acts blindly and an agent

0:18

that acts based on verified knowledge.

0:20

Because once agents can take actions,

0:23

being wrong is no longer just

0:24

embarrassing. It's dangerous.

0:27

Imagine this. A national railway deploys

0:30

an AI signal controller agent. Its job

0:32

is to respond to signal faults, check

0:35

maintenance rules, decide whether trains

0:37

should stop or continue, and notify

0:39

engineers. If this agent invents a rule,

0:41

acts on outdated procedures, or guesses

0:43

safety steps, trains stop incorrectly,

0:46

or worse, trains collide. So, the

0:48

railway makes one rule absolutely clear.

0:51

This agent may only act based on

0:52

official manuals. That is why RAG is

0:55

added inside the agent. Here is the core

0:58

idea you must lock in. An agent plans

1:00

actions. If its plan is based on general

1:03

knowledge, assumptions, or

1:04

hallucinations, it will take wrong

1:06

actions, not just give wrong answers.

1:09

This is the key exam rule. Agents

1:11

without RAG hallucinate actions, not

1:13

just text. And AWS considers that

1:15

unacceptable.

1:17

Now, let's place RAG in the agent loop.

1:19

This is exam gold. A user gives a goal.

1:22

The planner, the LLM, starts reasoning.

1:24

Before it acts, it retrieves facts,

1:26

rules, and policies using RAG. The

1:28

planner updates its plan using that

1:30

retrieved context. Only then does the

1:32

executor (the Lambda tools) act. The results

1:35

are observed, memory is updated, and the

1:37

loop continues. The critical point is

1:39

this. RAG happens before actions, not

1:42

just before the final answer. Let's see

1:45

what changes when RAG is added. Without

1:48

RAG, the planner might think signal

1:50

faults usually mean stopping trains.

1:52

That's a dangerous assumption. With RAG,

1:54

the planner retrieves the real rule.

1:57

Section 8.3: if fault code X12 occurs

2:00

during peak hours, trains must continue

2:02

at reduced speed. Now the plan changes,

2:04

the decision changes, the action

2:06

changes. RAG doesn't just change

2:09

wording, it changes behavior. Let's walk

2:11

through the railway example step by

2:13

step. A controller says, "There's a

2:15

signal fault near central station. What

2:17

should we do?" The planner does not act

2:20

immediately. It pauses and says, "I need

2:22

the official procedure." The agent

2:24

triggers RAG. The query may be rewritten

2:27

for clarity. Titan Embeddings converts it

2:29

into vectors. The vector store retrieves

2:32

signal fault rules, safety thresholds,

2:35

escalation procedures. Now the planner

2:38

updates the plan. Identify the fault

2:39

code. Check rule section 8.3. Decide

2:43

reduced speed versus full stop. Notify

2:45

maintenance. Only then does the

2:47

executor act. Lambda calls the

2:49

monitoring system. Lambda notifies

2:51

engineers. Finally, memory stores the

2:54

decision so the agent doesn't repeat

2:56

itself. This is a knowledge-driven

2:58

agent. AWS expects you to know two RAG

3:01

patterns for agents. The first is plan

3:03

then retrieve. The planner decides what

3:05

information it needs, then RAG fetches

3:07

it. This is best for complex,

3:10

safety-critical decisions and is what the exam

3:12

prefers. The second is retrieve then

3:14

plan. RAG runs first, then the planner

3:17

reasons. This is faster but less

3:19

flexible and less safe. If the scenario

3:21

is safety-critical, plan then retrieve

3:23

is the correct answer.
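
The difference between the two patterns can be sketched in a few lines of Python. This is a toy illustration, not an AWS implementation: the `MANUAL` dictionary stands in for the vector store, word overlap stands in for embedding similarity, and `information_needs` is a hypothetical stub for the step where the LLM planner decides what to look up. Sections 8.4 and 2.1 are invented filler so the retriever has something to choose between.

```python
import re

# Toy stand-in for the official manual. Section 8.3 follows the transcript;
# sections 8.4 and 2.1 are invented for illustration only.
MANUAL = {
    "8.3": "If fault code X12 occurs during peak hours, trains must continue at reduced speed.",
    "8.4": "Escalation procedure: notify the maintenance engineers after any signal fault.",
    "2.1": "Ticket refunds are processed within five working days.",
}


def tokens(text: str) -> set[str]:
    """Lowercase word set; a crude substitute for an embedding."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, top_k: int = 1) -> list[str]:
    """Return the ids of the top_k sections sharing the most words with the query."""
    q = tokens(query)
    ranked = sorted(MANUAL, key=lambda sec: len(q & tokens(MANUAL[sec])), reverse=True)
    return [sec for sec in ranked[:top_k] if q & tokens(MANUAL[sec])]


def information_needs(goal: str) -> list[str]:
    """Hypothetical planner stub: turn a goal into targeted queries.

    A real agent would ask the LLM to produce these.
    """
    if "signal fault" in goal.lower():
        return ["fault code reduced speed rule", "escalation procedure notify engineers"]
    return [goal]


def plan_then_retrieve(goal: str) -> list[str]:
    """The planner decides what it needs first, then each need is retrieved."""
    sections: list[str] = []
    for query in information_needs(goal):
        for sec in retrieve(query):
            if sec not in sections:
                sections.append(sec)
    return sections


def retrieve_then_plan(goal: str) -> list[str]:
    """A single retrieval on the raw goal; the planner reasons over whatever comes back."""
    return retrieve(goal)


goal = "There's a signal fault near central station. What should we do?"
print("plan then retrieve:", plan_then_retrieve(goal))
print("retrieve then plan:", retrieve_then_plan(goal))
```

In this contrived setup, the single retrieval on the raw goal surfaces only the escalation section, while the targeted queries also reach section 8.3. Real retrievers behave differently, so treat this only as one way the extra planning step can pay off.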

3:25

Now let's anchor this to AWS services.

3:28

The LLM planner runs in Amazon Bedrock.

3:31

Embeddings are created using Titan

3:32

Embeddings V2. Retrieval comes from

3:35

OpenSearch Serverless or a Bedrock knowledge

3:37

base. Actions are executed by Lambda

3:39

tools. Memory lives in a vector store or

3:42

database. Tracing and debugging use

3:44

CloudWatch and X-Ray. If AWS asks, how

3:47

does an agent access enterprise

3:49

knowledge before acting? The answer is

3:51

RAG inside the agent planning loop. One

3:54

more important distinction: guardrails

3:57

and RAG solve different problems.

3:58

Guardrails block unsafe outputs. RAG

4:01

provides correct facts. If an agent

4:03

makes a wrong decision, guardrails won't

4:05

fix it. RAG will. This is a classic exam

4:08

trap. Memory and RAG together are

4:11

extremely powerful. The agent can

4:13

remember past retrievals, previous

4:15

decisions, and outcomes. That means

4:17

fewer repeated searches, lower cost, and

4:19

faster decisions over time. Memory can

4:22

be short-term or long-term, but it

4:24

always works alongside RAG. Now, watch

4:26

for the traps. Do not add RAG only to

4:29

the final answer. Do not expect the LLM

4:32

to recall policies from training. Do not

4:34

use fine-tuning instead of RAG. Do not

4:36

let tools decide logic. The correct flow

4:39

is always: RAG informs the planner. The

4:41

planner decides. Tools execute. Here is

4:44

the one sentence to memorize. RAG gives

4:46

agents facts before they act. If you

4:48

remember that, Day 19 is solved. Final

4:51

self test. An agent must decide

4:54

operational actions using internal

4:55

manuals. How should it be designed? Add

4:58

RAG inside the agent planning loop

5:00

before tool execution. That's Day 19

5:03

mastered.
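
As a recap of the flow described in this lesson (RAG informs the planner, the planner decides, tools execute, memory stores the result), here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `Agent` class, the step names such as `set_reduced_speed`, and the substring lookup that stands in for a knowledge base query. The keyword check in `plan` is a placeholder for LLM reasoning, and `execute` only logs steps where a real agent would invoke Lambda tools. Escalating to a human when no rule is found is one possible reading of "this agent may only act based on official manuals", not something the lesson specifies.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Agent:
    manual: dict[str, str]                                # stand-in for the knowledge base
    memory: dict[str, str] = field(default_factory=dict)  # past retrievals, keyed by fault code
    actions: list[str] = field(default_factory=list)      # log of executed tool steps
    retrieval_calls: int = 0                              # counts real lookups (cache misses)

    def retrieve(self, fault_code: str) -> Optional[str]:
        """RAG step: check memory first, then fall back to searching the manual."""
        if fault_code in self.memory:
            return self.memory[fault_code]
        self.retrieval_calls += 1
        rule = next((text for text in self.manual.values() if fault_code in text), None)
        if rule is not None:
            self.memory[fault_code] = rule
        return rule

    def plan(self, fault_code: str) -> list[str]:
        """Planner step: decide using retrieved text only, never a built-in guess."""
        rule = self.retrieve(fault_code)
        if rule is None:
            # Assumption: with no official rule available, hand off instead of acting.
            return ["escalate_to_human"]
        decision = "set_reduced_speed" if "reduced speed" in rule else "stop_trains"
        return [decision, "notify_engineers"]

    def execute(self, steps: list[str]) -> None:
        """Executor step: tools carry out the plan and make no decisions themselves."""
        self.actions.extend(steps)

    def handle(self, fault_code: str) -> list[str]:
        steps = self.plan(fault_code)  # retrieval happens inside planning...
        self.execute(steps)            # ...and always before any tool runs
        return steps


agent = Agent(manual={
    "8.3": "If fault code X12 occurs during peak hours, trains must continue at reduced speed.",
})
print(agent.handle("X12"))
print(agent.handle("X12"), "retrieval calls:", agent.retrieval_calls)
print(agent.handle("Z99"))
```

The second call for the same fault code is answered from memory, which is the "fewer repeated searches" effect mentioned earlier. The unknown code `Z99` produces no operational action at all, only an escalation.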
