AWS Certified Generative AI Developer - Professional: Dynamic model selection + “provider switching”
FULL TRANSCRIPT
Imagine you run a control room, not a
single machine. People walk in with
questions all day long. Some are simple,
some are complex. Some arrive during
peak traffic. Some arrive when a road is
closed. You never send everyone down the
same road. At the front door is API
gateway. It doesn't think. It just lets
people in. Behind it sits Lambda. Lambda
is not an AI. Lambda is the traffic
controller. Its job is not to answer
questions. Its job is to decide who
should answer. Lambda looks at three
things. First, the rules. These rules
never change quickly. Safety, policy,
tone. Second, the user input. What is
the user asking? How complex is it?
Third, the model choice. Which engine
should handle this request right now?
That third decision is what makes this
system powerful? Lambda does not
hard-code that decision. It reads the
rules from a control board. That control
board is app config. App config holds
feature flags. Feature flags are
switches. Flip a switch and behavior
changes instantly. No redeploy, no
downtime, no panic. One switch might say
simple requests go to a cheaper model.
Another switch might say complex
reasoning goes to a stronger model.
Another switch might say if this
provider fails, use the fallback. Some
requests go to bedrock, some go to
SageMaker. Bedrock is the managed
highway. SageMaker is the custom road
you built yourself. Lambda doesn't care.
It just routes traffic. If one model
slows down, errors out, or gets too
expensive, the controller reacts. It
doesn't crash the system. It reroutes.
That's called graceful degradation.
Users still get answers. The system
stays alive. This design means you can
change models without code changes, test
new models safely, control costs,
survive outages, avoid vendor lockin,
all without touching your application
code. This is called static plus two.
Static rules stay fixed. One dynamic
input comes from the user. The second
dynamic choice is which model runs.
Static rules plus input plus model
selection. That's enterprise design.
Here's the image to remember. A traffic
control center. API gateway is the city
gate. Lambda is the controller. Models
are highways. The controller watches
traffic, reads the rule board, and
redirects cars. If a highway closes,
cars reroute instantly. No one rebuilds
the city. And here's the exam rule that
ends most questions instantly. If the
question says switch models without
redeploy, handle outages gracefully,
control cost dynamically, test models
safely, the answer includes Lambda
routing plus app config feature flags,
not hard-coded logic, not one fixed
model, a control room. Let's make this
real. Below are practical production
style examples of dynamic model
selection, exactly how it's done on AWS
for the exam and real systems. No
theory. You'll see one, architecture.
Two, config. Three, router logic. Four,
what happens at runtime, five, why AWS
loves it. Real example one, cost aware
model routing. Most common exam
scenario. Goal: Cheap model for simple
requests. Powerful model for complex
requests. Switch without redeploying.
Architecture.
App config feature flags. App config
configuration stored in AWS app config.
not in code. Lambda router logic
conceptual exam safe runtime example
user input reset my password complexity
score equals 0.2 Two, routed to Titan
light. Low cost, fast. Another user
input. Analyze this contract and explain
the legal risks. Complexity score equals
0.9. Cloud to cluton at higher
reasoning, higher cost. Exam signal. If
you see reduce cost, simple versus
complex requests, no redeploy, Lambda
router plus app config. Real example
two, provider fallback during outage
resilience. Goal. If one model fails,
automatic fallback. Architecture.
App config flags.
Lambda routing behavior.
Runtime reality. Cloud throttles or
times out. Lambda does not crash. It
switches instantly to Mistral. User
still gets an answer. Exam signal.
Keywords. High availability. Provider
outage. Graceful degradation. Fallback
routing.
Real example three. Canary testing a new
model. AWS loves this goal. Test a new
model safely. Send only some traffic.
Roll back instantly. App config flags.
Lambda logic.
Runtime 90% sonnet 10%. Opus test. If
opus misbehaves, change app config to
0%. Instantly stopped. No redeploy.
Exam signal canary gradual rollout. AB
testing. Feature flags router real
example four regulated versus
non-regulated routing goal sensitive
data safer model nonsensitive faster
model app config flags
lambda routing rule
runtime medical question claw general
question titan exam signal regulated
industry PII data sensitivity dynamic
routing real example five bedrock versus
sagemaker Routing advanced exam case
goal use managed models normally route
specific cases to custom model
architecture
app config flags
routing logic
exam signal custom model fine-tuned full
MLOps sage maker
one memory story locks all examples the
AI traffic control room API gateway city
gate lambda equals traffic controller
app config equals rule board models
equals Equals highways.
The controller reads the board, watches
traffic, reroutes instantly, never
rebuilds the city. Final exam
compression rule. If the question says
change models without redeploy, reduce
costs dynamically, handle outages, test
models safely, your answer includes
Lambda router plus app config feature
flags, hard-coded model equals one model
for everything. Runtime routing equals
UNLOCK MORE
Sign up free to access premium features
INTERACTIVE VIEWER
Watch the video with synced subtitles, adjustable overlay, and full playback control.
AI SUMMARY
Get an instant AI-generated summary of the video content, key points, and takeaways.
TRANSLATE
Translate the transcript to 100+ languages with one click. Download in any format.
MIND MAP
Visualize the transcript as an interactive mind map. Understand structure at a glance.
CHAT WITH TRANSCRIPT
Ask questions about the video content. Get answers powered by AI directly from the transcript.
GET MORE FROM YOUR TRANSCRIPTS
Sign up for free and unlock interactive viewer, AI summaries, translations, mind maps, and more. No credit card required.