• Google announced Project Genie’s World Sketching and the video game industry is aflutter.

• I recommend a favorite book from my childhood.

• The dinguanadon is our monster of the week.

The Reason Dinosaurs Are Extinct

A Tyrannosaurus Rex, a Triceratops, and a Velociraptor stumble across a magic lamp in a cave. A genie appears when they rub it, offering them each a wish.
The T-Rex goes first. "I’m tired of hunting. I wish for a massive, juicy, ten-ton steak!"
The genie nods and a gigantic steak appears.
The triceratops goes next. "I’m tired of eating ferns. I want a shower of fresh green lettuce!"
The genie nods and lettuce falls from the sky.
The velociraptor, wanting to be more clever than the other two, thinks for a moment, and declares "I want a meatier shower!"

World Sketching Versus Video Games

Do we still need humans to create games for other humans to play? Last week, Google announced Project Genie 3 and its World Sketching tool. The project uses AI to create explorable interactive worlds, much like a video game. The overreactions to it are notable, but World Sketching cannot replace real games because games are systems of rules, not just responsive visuals.

World Sketching goes beyond just being a cool tech demo. Many major video game companies saw their stock prices dip on the announcement, including Electronic Arts, Roblox, Take-Two Interactive, and Unity (since I'm mentioning stocks, I feel a duty to say that nothing I say should be taken as investment advice). The thesis from investors seems to be that companies in the business of creating interactive experiences will not be as profitable in the future if AI can make customized video games on the fly.

I think this thesis is wrong. World Sketching is interesting from a technical standpoint, but I don't think the technique can replace video games, or have any real impact on the entertainment industry. Understanding how Large Language Models (LLMs) and World Sketching work will help us understand why this is the case.

What World Sketching Does

In Google's demonstration, you create a world and a character by providing a text prompt or a starting image. The AI tool then generates a real-time video in which you use your computer's arrow keys to move the character around, just like you would in a third-person video game. Google calls it World Sketching.

If you're not sure how LLMs work, the easiest way to see something similar in action is to start typing a search into Google. The suggestions that come up are reasonable, even if they aren't related to what you're actually trying to search for. There's always some kind of suggestion no matter what you start typing, so we know the suggestions haven't been set down as rules. Behind the scenes, the search box is using some math and all the searches everyone else in the world is doing to choose some words that are likely to come next. When LLMs are creating text and images, they do something similar. They aren't relying on rules, but using a lot of math and all the text and images they have been trained on to predict a reasonable next thing to come after whatever came before.
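To make the "predict a likely next thing" idea concrete, here is a toy sketch in Python. It is not how Google's search box or any real LLM is implemented; it only counts which word tends to follow which in a tiny made-up list of searches, then suggests the most common followers. The function names and the sample searches are my own inventions for illustration.

```python
from collections import Counter, defaultdict

def train_bigrams(corpus):
    """Count, for every word, which words follow it and how often."""
    follows = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            follows[prev][nxt] += 1
    return follows

def suggest(follows, word, k=3):
    """Return up to k of the most frequent words seen after `word`."""
    counts = follows.get(word.lower(), Counter())
    return [w for w, _ in counts.most_common(k)]

# Made-up "search history" standing in for training data.
corpus = [
    "how to train a dragon",
    "how to train a puppy",
    "how to train a puppy",
    "how to draw a star",
]
model = train_bigrams(corpus)
print(suggest(model, "a"))      # "puppy" comes first: it followed "a" most often
print(suggest(model, "zebra"))  # never seen, so no suggestions
```

Notice there is no rule anywhere that says "puppy comes after a." The suggestion falls out of the counts, and different training data would give a different answer.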

Rough edges in the World Sketching tech are to be expected because it's new. It sometimes forgets to add the road in a driving simulation, or isn't able to correctly draw a Hollywood star on the ground. These quirks are funny, but it's probably safe to say that they'll be ironed out as the tech improves. After all, AI usually gets the number of fingers on someone's hand correct nowadays.

A screenshot from the World Sketching tool. Notice how the stars are not quite… stars.

In the case of video, like with Project Genie, it's less impressive that they can make the user's input affect the video output. Doing that just requires making whatever buttons the player is pressing part of the prompt that's being continually fed into the model. The real technical achievement is that they can do it so quickly that it appears like a game that's reacting to the input.
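Here is a rough sketch of what "making the buttons part of the prompt" could look like as a loop. Everything in it is hypothetical: `toy_next_frame` stands in for a real generative model, and a "frame" is reduced to a single number for the character's position. It says nothing about how Project Genie is actually built. The point is only that the player's input rides along in the context, and the model samples something plausible rather than applying a rule.

```python
import random

STEP = {"left": -1, "right": 1, "none": 0}

def toy_next_frame(context, rng):
    """Hypothetical stand-in for a generative model.

    Looks at the latest (position, action) pair in the context and
    samples a plausible next position.
    """
    last_x, action = context[-1]
    # Usually follows the input, but sampling occasionally drifts by one.
    noise = rng.choice([0, 0, 0, 1, -1])
    return last_x + STEP[action] + noise

def play(actions, seed=0):
    """Feed each button press into the context and collect the frames."""
    rng = random.Random(seed)
    x = 0
    context = []
    frames = []
    for action in actions:
        context.append((x, action))  # the input becomes part of the prompt
        x = toy_next_frame(context, rng)
        frames.append(x)
    return frames

print(play(["right"] * 5))
```

Pressing right usually moves the character right, but nothing guarantees by how much, which is the gap the rest of this piece is about.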

But good game design is a lot more than just responding to the input of a player. A lot of thought and balance goes into making a game that is not just fun, but that also has interesting mechanics and supports players as they learn how to interact with the game world to accomplish the game's objectives.

What Games Are

Games do quite a bit more than just figure out what the next picture to display on the screen will be. They're a codification of rules, laid down by designers, to achieve a particular experience. The rules cover edge cases (e.g. don't walk through walls). They cover nuanced extensions to the core set of rules (e.g. the player has access to eight different weapons, which do slightly different things). While being able to generate the next part of a video based on player input is cool, the predictive nature of LLMs means they can't produce consistent results that players can rely on as game mechanics.
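For contrast, here is what a codified rule looks like, using the "don't walk through walls" example. This is a minimal sketch with a made-up grid and made-up names, not code from any real game. The same position and the same input give the same result every time, which is what lets a player rely on it.

```python
# A tiny hypothetical level: '#' is a wall, '.' is open floor.
GRID = [
    "#####",
    "#...#",
    "#.#.#",
    "#####",
]

MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}

def move(pos, direction, grid=GRID):
    """Return the new (row, col), or the old one if a wall is in the way.

    Assumes the level is fully enclosed by walls, so a step from a floor
    tile can never leave the grid.
    """
    row, col = pos
    d_row, d_col = MOVES[direction]
    new_row, new_col = row + d_row, col + d_col
    if grid[new_row][new_col] == "#":
        return pos  # the rule: walls always block movement
    return (new_row, new_col)

print(move((1, 1), "right"))  # open floor, so the player moves
print(move((1, 1), "up"))     # wall, so the player stays put
```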

In Project Genie’s World Sketching you can be a shiba inu frolicking in a field. But an animated video is not a game, even if you can control where the dog goes.

Consistency of game rules is very important in video games. For any game of moderate complexity, players need to be taught game mechanics gradually and then challenged to mix and match what they've learned to overcome the game's obstacles. As they gain mastery, new rules are introduced, putting the players into novel situations that they must figure out how to overcome. Games also require continuity: they have to remember what players have done in the past in order to modify the available options and change the world based on the player's choices.

If this is sounding too abstract, let's use an example. The first world of Super Mario Bros. gets you familiar with the mechanics of the game. It starts out with nothing on the screen except Mario. The player can discover they can move left, move right, or jump. Moving to the right, they find out that the evil-looking mushroom people end the game if touched, that the pipes can be jumped over, and that hitting blocks with Mario's head will break them, occasionally giving a reward. Halfway through the level, the first of three pits is introduced, and the player learns they have to jump over them. The rest of the level is variations on these mechanics. The second level remixes these mechanics, but requires the player to discover that Mario can also go into some pipes after jumping on top of them. Over the course of the game, more mechanics are introduced and more challenges are overcome. The designers are intentionally teaching the rules of the game, adding new mechanics when the player is ready for them.

New mechanics are introduced slowly, going from simply jumping over obstacles to having to precisely time your movements to complete a level.

You could prompt the AI to handle some of this teaching. I imagine you could draw the level by hand and have the World Sketching AI use the image as its starting point for generating its interactive video. But by their nature, LLMs have a hard time following strict rules. Their power is in their ability to operate on problems that have no one right answer. Because of their randomness, they shine when generating pictures, code, and text. Tasks requiring precise answers are where humans have to step in to provide guardrails or create tools for the AI to use to get the correct answer. You still need humans involved if you are going to enforce rules and provide consistency.

A recent project at UC Davis shows the trouble that LLMs have with continuity and game rules. The project uses a group of AI agents to play Dungeons & Dragons with each other. They were evaluated on their ability to keep track of the situation and how well they adhered to the game's rules. The results were quirky, with the AI players often ignoring the rules of the game. They would use equipment they didn't have, forget which characters were already dead, and skip aspects of their own turn. This isn't surprising. LLMs, whether they are generating text or video, are only trying to generate something that could be construed as reasonable. In a commercial video game, this variation would be disastrous. Actions the player took in the past would be erased, story elements the player discovered would be forgotten, and enemies would act inconsistently or not at all. A player would never be able to demonstrate mastery because the foundational mechanics they're trying to master would be constantly changing and only tenuously enforced.
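The kinds of mistakes described above (using missing equipment, forgetting who is dead) are exactly what explicit game state and a rule check are for. The sketch below is my own illustration, not the UC Davis project's code. The `Character` class, the action format, and `validate_action` are all hypothetical. It shows the kind of human-written guardrail that would have to sit between an AI's proposed move and the game.

```python
from dataclasses import dataclass, field

@dataclass
class Character:
    name: str
    alive: bool = True
    inventory: set = field(default_factory=set)

def validate_action(actor, action, party):
    """Return a reason the action breaks the rules, or None if it is legal.

    `action` is a dict with optional "item" and "target" keys (a made-up
    format for this sketch). `party` maps names to Character objects.
    """
    if not actor.alive:
        return f"{actor.name} is dead and cannot act"
    item = action.get("item")
    if item is not None and item not in actor.inventory:
        return f"{actor.name} does not have {item}"
    target = action.get("target")
    if target is not None and not party[target].alive:
        return f"{target} is already dead"
    return None

party = {
    "Aria": Character("Aria", inventory={"sword"}),
    "Borin": Character("Borin", alive=False),
}

# An AI player proposes using a bow it was never given.
print(validate_action(party["Aria"], {"item": "bow"}, party))
# A legal move passes the check and returns None.
print(validate_action(party["Aria"], {"item": "sword"}, party))
```

The state lives in ordinary data and the check is ordinary code, so the answer doesn't drift from one turn to the next.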

Sloppy

Games are more than just a series of moving pictures. The World Sketching tech of Project Genie is interesting and there's likely an application for storyboarding or rapid prototyping. But storyboards and prototypes are not the most expensive parts of game development. Video game players want good games they can get good at. The way LLMs work today makes it structurally impossible for them to enforce a game's rules. An AI generated experience will always be the slop that people who hate AI in video games think it is. What keeps AI from making a video game is not the quality of its graphics, but its inability to enforce rules and maintain continuity.

For more information related to Project Genie, check out these links.

What I’m Hyping Right Now

Growing up, one of my favorite books was The Enormous Egg. It’s the story of a family that unwittingly adopts a triceratops, Uncle Beazley, and then needs to find a home for him once he grows too big for them to keep.

A stuffed toy triceratops came with me on many of my adventures. I wasn’t the only one who loved the book. It was even made into a movie.

I was reminded of the book when I came across this Wikipedia article about the statue of Uncle Beazley at the Smithsonian. I vividly remember climbing up that statue and sliding down its tail. If you have a dinosaur-loving kid in your life, I hope you’ll consider sharing this book with them.

Dinguanadon

The dinguanadon was created by the alchemist duo of Halverin Thorne and Mirabel Thorne, whose shared laboratory was as much a home as a place of spellcraft. Halverin was driven by academic obsession, determined to reverse extinction and prove that magic could repair the losses of history. Mirabel loved the idea of resurrection but equally delighted in whimsy. When Halverin was stymied by a lack of the necessary ancient material, Mirabel suggested filling in the gaps by drawing on the packs of dogs that lived nearby. The result wagged its tail, barked when pleased, and leaned its weight against its creators like a pet.

Some links on this site are affiliate links.
