Seed IQ Solves ARC AGI 3 Games with Human-Level Performance - Denis O & Denise Holt Discuss How
FULL TRANSCRIPT
Should we start with the LS20 R?
Yeah, why not?
All right, let's start with
LS20. I'm going to let it go at maybe
2x speed just so that we can observe
the effects while we talk about it.
So, I'm going to turn off the
perception. Basically, what's happening,
like I said, like there are multiple
agents involved at every level.
Different agents are responsible for
different parts of the gameplay.
Some are responsible for the long-term
planning per level. Others are
responsible for learning across levels.
And then there are agents responsible
for tracking, perception, and
identifying different objects. So, for
example, if I turn it on again with
perception, you can see that different
targets, different sprites, as they call
them, or IRIs, are being tracked in the
game. And so, the engine computes
the best possible path and the best
possible action-perception coupling
as it goes. And I'll slow it down a
little bit more cuz it's just a little
too fast. But basically, it is relying
on something called topological
perception. With topological perception,
the advantage is you're not pattern
matching against this window. If this
window were to change, just like with the
ARC-AGI 1 and 2 challenges, it would be
able to adapt and would still be able to
establish the causality. So, with
deep learning, right, with LLMs, with
other approaches in RL, if you change
the structure, if we were to suddenly
increase the size of this window, this
world, they would get lost cuz they
don't know how to readjust to it if it
hasn't been in a data set. With
topological perception feeding into the
manifolds per agent, and multiple agents
working through the adaptive
multi-engine autonomous control,
it allows the structure to restructure
on the fly and understand exactly what's
happening. If there's any shift in the
topology of the map, it can adjust
and readjust its own strategy.
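
As a rough illustration of why a topology-based description survives a change in window size, here is a minimal Python sketch. It is not the system shown in the demo: the `components` and `upscale` helpers and the toy grids are assumptions made up for this example. It counts 4-connected regions per cell value and shows that the count is unchanged when the same scene is rendered on a larger grid, whereas a pixel-for-pixel comparison would fail.

```python
from collections import deque


def components(grid):
    """Count 4-connected regions per cell value in a 2D grid (hypothetical helper)."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    counts = {}
    for r in range(rows):
        for c in range(cols):
            if seen[r][c]:
                continue
            value = grid[r][c]
            counts[value] = counts.get(value, 0) + 1
            # Flood-fill the whole region so it is counted only once.
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and grid[ny][nx] == value):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return counts


def upscale(grid, k):
    """Render the same scene on a grid k times larger in each dimension."""
    return [[v for v in row for _ in range(k)] for row in grid for _ in range(k)]


# Toy scene: background 0, one object of value 1, one object of value 2.
small = [
    [0, 0, 0, 0],
    [0, 1, 0, 2],
    [0, 1, 0, 2],
    [0, 0, 0, 0],
]
large = upscale(small, 3)

same_pixels = small == large                             # raw grids differ
same_topology = components(small) == components(large)   # region structure matches
```

Here `same_pixels` is false while `same_topology` is true, which is the property the speaker attributes to topological perception: the structure of the scene, not its raw layout, is what gets compared.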
And basically, you see it encounter
a pusher. So, with the pusher, it tries
a strategy to go around three times and
realizes that it can't go
through this route. So, at this point,
it's going to try again and then reroute
to a different strategy. So, it will
find a new solution, go around, take the
sprite. And a few things that are
being tracked in the game are
health and lives. The bars at the
bottom signify how many lives you have
left. Right now, we have three lives.
We haven't lost any lives. It finds
strategies to go around the pushers. It
learns as it goes and then it navigates
to the exit. The priors here are that you
have shapes, you have IRIs, you have
pushers, and you have a target selection
that you have to get to. And you have to
be able to precompute a
strategy. That precomputation
happens at every step. All of the
multiple agents are pretty much
projecting their own internal belief
states into the player. And the player
becomes the actuator. And so, by level
six, it's already aware and it's trying to
catch. So, this is the level where it's
like Harry Potter trying to
catch multiple things at once, multiple
snitches. So, you have
objects that are moving, oscillating
together. You have to come up with a
strategy of effectively using the
sprites or the IRIs
to connect to them, intercept these,
change to the proper shape, then figure out
which is the next target. Is it
the color? Is it another sprite? So, you
have multiple constraints at once cuz
at any moment, at any step, you may run
out of good steps, right? So, you
have to readjust your strategy on the
fly. You have to also rotate. So,
these are different things: color,
shape, and rotation. The key
thing here is it decided that the route
through the color, where it could
accidentally hit it,
is the best route. And it figures out
just the exact moment to go through that
target. But by level six, it doesn't
even matter that you have hidden things
because at this point it has
accumulated knowledge from previous
games. So, it's not even a challenge.
Even with a partial view, a restricted
camera view, as they call it,
it's well aware, okay, IRIs are here.
Like level seven, the last level, is the
hardest level because,
for an LLM or DL, you don't have any
perception left. You only
get partial matches on whatever
you're observing. But we are
constructing a world model on the fly
of whatever it is that we're dealing
with. And so, as it navigates to
different corners, it already has
accumulated knowledge about
what it can do, what the
constraints are, how to best navigate around
them, and how to solve it. So, this one is
different. Here, topological
perception is key. You need to fill in
different shapes based on the central shapes
inside. And also, you can see
topological perception again at play.
Change the structure, change the object.
Once it understands the causality
and what the reasoning is behind the
specific problem, it just reapplies it
everywhere.
But by level six or seven, it doesn't
even matter anymore. You can clearly see
exactly
what is where and what needs to update.
So, it's just filling all the circles
cuz it can see exactly where everything
is. And so, by level six,
it has accumulated enough knowledge to
continue to solve these
challenges one at a time. They
grow in complexity, but it doesn't
matter anymore. The manifold is
structured in such a way that it just
zooms through it. It doesn't even
matter. And then we can look at the
other game we solved. Mhm. Yeah, what's
interesting is that, you know, with
the ARC 1 and 2 challenges, when you
were doing those and playing
with those, it didn't matter how we
scaled the complexity because it still
solved it the same. Same thing.
Topological perception. Yeah. So, it's
interesting to see that play out against
the dynamic window. Correct.
What we're doing is we're doing the
same thing, but now we're feeding it
into the manifold constantly. So, it's not
just one frame, it's multiple frames
it's seeing, pretty much. Wow. It can
detect objects. It can understand where
it needs to perform an action. And I can
probably make it... But it tries
things. If it doesn't work out, it
resets, finds a new strategy, starts
adopting that strategy. It's
adaptive in real time.
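
The try, reset, and re-strategize behaviour described here can be sketched as a plain replanning loop. This is only an illustration under stated assumptions, not the engine from the demo: the grid world, the `bfs_path` and `solve_with_resets` functions, and the idea of modelling pushers as initially unknown blocked cells are all hypothetical. The agent plans a shortest path with what it knows, and when an attempt fails it records the obstacle, resets to the start, and plans again.

```python
from collections import deque


def bfs_path(size, start, goal, blocked):
    """Shortest 4-connected path on a size x size grid avoiding known blocked cells."""
    prev = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            # Walk the predecessor links back to the start.
            path = []
            while cell is not None:
                path.append(cell)
                cell = prev[cell]
            return path[::-1]
        y, x = cell
        for nxt in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
            if (0 <= nxt[0] < size and 0 <= nxt[1] < size
                    and nxt not in blocked and nxt not in prev):
                prev[nxt] = cell
                queue.append(nxt)
    return None  # goal unreachable with current knowledge


def solve_with_resets(size, start, goal, hidden_obstacles, max_attempts=10):
    """Plan, try, and replan; returns (path or None, number of attempts used)."""
    known = set()  # obstacles discovered so far (assumed to persist across resets)
    for attempt in range(1, max_attempts + 1):
        path = bfs_path(size, start, goal, known)
        if path is None:
            return None, attempt
        # The first cell on the planned path that turns out to be blocked, if any.
        hit = next((cell for cell in path if cell in hidden_obstacles), None)
        if hit is None:
            return path, attempt
        known.add(hit)  # learn from the failure, then reset and replan
    return None, max_attempts


# One hidden obstacle sits on the direct route from (0, 0) to (0, 2).
path, attempts = solve_with_resets(3, (0, 0), (0, 2), hidden_obstacles={(0, 1)})
```

In this toy run the first attempt takes the direct route and fails, and the second attempt routes around the discovered obstacle. Once every obstacle on the way is known, the plan stops changing between runs, which matches the remark that later play can look like a replay.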
But it might sometimes look like a
replay, but the reason why is because
it's looking at the topology. Topology
is what it is, right? Between levels,
it's it's set. You have oscillating
objects, but overall, the dynamics are
figured out already. So, there's a
deterministic path that's the best path
to follow. And, like here, it
figures it out little by little. It tries
something, there's a reset. A reset means
that the strategy didn't work after a
few clicks. So, it reroutes, recomputes,
re-solves,
finds a new higher-level horizon
planning strategy, and then starts
planning at the low level. All that planning
is almost instantaneous. And all of this
is tracked by perception. So, you know
exactly where you are, what to click on,
how to transfer these, and how to
achieve what it's looking to achieve.