GenAI will change how games are played
In about 7.5 years you will be able to play any game experience at will.
Thirteen days ago, in my post The future of Programming will be AI first, I postulated:
You could imagine a future where a game is simply a bundled deep learning model with a render pipeline.
Since then, OpenAI has been busy proving just how right this prediction was by releasing their new video generation model, Sora.
Video generation with GenAI is in itself not a major step towards this prediction, but the specific way OpenAI built Sora is! By OpenAI's own account, Sora is not just a video generator, it's a World Simulator.
OpenAI handily backs up this claim by showing the model simulating Minecraft. Here is what OpenAI has to say themselves:
Simulating digital worlds. Sora is also able to simulate artificial processes–one example is video games. Sora can simultaneously control the player in Minecraft with a basic policy while also rendering the world and its dynamics in high fidelity. These capabilities can be elicited zero-shot by prompting Sora with captions mentioning “Minecraft.”
These capabilities suggest that continued scaling of video models is a promising path towards the development of highly-capable simulators of the physical and digital world, and the objects, animals and people that live within them.
The wording here is important. Notice how they don’t talk about generating plausible game video. Instead they make the following two statements:
Sora can simultaneously control the player in Minecraft with a basic policy
while also rendering the world and its dynamics in high fidelity
In short, by OpenAI’s own account the control of the player and the rendering of the game are separate processes in the model!
If we take OpenAI at their word, this is already a game engine, because it meets the following criteria (sketched as a toy loop below):
It takes user input (prompting)
It simulates an internal world
It outputs a graphical representation of the internal world
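To make that concrete, here is a deliberately toy sketch of what a "model as the game engine" loop could look like. The WorldModel class and everything inside it are made up for illustration; OpenAI has not published Sora's actual interface.

```python
# Toy sketch: a single "world model" acting as the whole game engine.
# WorldModel and its methods are hypothetical stand-ins, not Sora's real API.

class WorldModel:
    """Stands in for a generative model that keeps an internal world state."""

    def __init__(self, prompt: str):
        self.state = {"prompt": prompt, "tick": 0, "last_input": None}

    def step(self, player_input: str) -> None:
        # Criteria 1 + 2: take user input and advance the simulated world.
        self.state["tick"] += 1
        self.state["last_input"] = player_input

    def render(self) -> str:
        # Criterion 3: output a (here textual) representation of the world.
        return f"frame {self.state['tick']}: world after '{self.state['last_input']}'"


engine = WorldModel("a Minecraft-like voxel world")
for player_input in ["walk forward", "mine block", "jump"]:
    engine.step(player_input)
    print(engine.render())
```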
So why can’t you play games with this type of model already? In short:
It’s too slow for realtime gaming.
The limitation on using this as a game engine is clearly not its ability to output high-quality content, but how fast the model can generate it.
This will change quickly, however, as GenAI models are seeing what is probably the fastest improvement in speed and quality of any technology ever.
When can you expect to play?
Predicting the progress of AI is almost impossible, and any prediction is bound to be wrong, but let's try anyway.
When Sora was released, Sam Altman started taking requests on X and generating videos live. Looking at the response times for those videos, Sora takes somewhere between 15 and 20 minutes to output one minute of 30fps 2K video.
Let’s first make a few simple assumptions:
It takes 20 minutes to generate one minute of video
Sam Altman has 300x the processing power of a high-end gaming PC available to him.
Generating frames live, rather than as a pre-rendered video file, requires 2x the processing power.
With that we can estimate that we need a model that is 12000x faster than Sora (20 × 300 × 2) to allow for playing games live at home. That might sound like a lot, but taking Nvidia at their word, their performance doubling rate for AI tasks is between 7.26 and 11.5 months when comparing the A100 and H100 releases. Assuming the pessimistic doubling rate of 11.5 months, that's still only ~13 years.
However, AI speed also improves when algorithms improve. According to OpenAI themselves, every 16 months you need half the compute to achieve the exact same result.
Considering both factors of improvement, we arrive at the needed 12000x in about 90.5 months, or just a bit more than 7.5 years, before you can run something like Sora live on a PC at home.
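For the curious, here is the whole estimate as a small script, using nothing but the assumptions listed above (the 20x slowdown, the 300x compute gap, the 2x live overhead, and the two improvement rates):

```python
# Back-of-the-envelope version of the estimate above.
# Every input is an assumption from this post, not a measured value.
import math

generation_slowdown = 20   # 20 minutes of compute per minute of video
datacenter_vs_pc = 300     # assumed compute advantage over a high-end gaming PC
live_frame_overhead = 2    # assumed extra cost of producing frames live

required_speedup = generation_slowdown * datacenter_vs_pc * live_frame_overhead
print(required_speedup)    # 12000

doublings_needed = math.log2(required_speedup)   # ~13.6 doublings

hardware_doubling_months = 11.5   # pessimistic A100 -> H100 doubling rate
algorithm_halving_months = 16.0   # OpenAI's algorithmic-efficiency estimate

print(doublings_needed * hardware_doubling_months)   # ~156 months (~13 years) on hardware alone

# Hardware and algorithms improve in parallel, so their rates add up.
combined_doubling_months = 1 / (1 / hardware_doubling_months + 1 / algorithm_halving_months)

months = doublings_needed * combined_doubling_months
print(round(months, 1), "months,", round(months / 12, 1), "years")   # ~90.7 months, just over 7.5 years
```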
In 7.5 years you will be able to run a model with the same capabilities as Sora at home.
And that's based on a pessimistic view! Having followed AI closely for the past 3 years, I personally take an optimistic view of how quickly this technology will improve. I believe we are likely to have this level of capability in a 4-5 year timeframe, but I am looking forward to being proven wrong!
How this will likely play out
Of course, it won't take 7.5 years before we see GenAI change how games are made and played.
1. Right now
We will see an explosion in user-generated content as platforms, such as our FRVR.ai[1], make it possible for anyone to create games via code + asset generation.
Deep learning models are already being used to significantly improve the performance of games by rendering fewer pixels and imagining the rest. Nvidia calls this DLSS, and AMD calls it FSR.
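A rough mental model of that trick (shade fewer pixels, let a model fill in the rest) could look like the sketch below. A naive nearest-neighbour enlargement stands in for the trained upscaler that DLSS actually ships; nothing here is Nvidia's or AMD's real pipeline.

```python
# Conceptual sketch of "render fewer pixels, imagine the rest".
# A trivial nearest-neighbour enlargement stands in for the learned upscaler.
import numpy as np

def render_low_res(height: int, width: int) -> np.ndarray:
    # Stand-in for the game shading a frame at reduced resolution.
    return np.random.rand(height, width, 3).astype(np.float32)

def upscale(frame: np.ndarray, factor: int) -> np.ndarray:
    # Stand-in for the AI upscaler that reconstructs the full-resolution frame.
    return frame.repeat(factor, axis=0).repeat(factor, axis=1)

low = render_low_res(540, 960)      # the GPU only shades a quarter of the pixels
high = upscale(low, 2)              # the model "imagines" the rest (here: trivially)
print(low.shape, "->", high.shape)  # (540, 960, 3) -> (1080, 1920, 3)
```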
2. Now to ~3 Years
Users will be able to dynamically modify their visual gaming experience based on simple prompts. OpenAI has already demonstrated this capability in Sora as video-to-video editing.
We have already seen that Nvidia can remaster games live, and they also use AI models to upscale the output of their graphics cards via DLSS. Style remapping is a natural extension of these technologies.
Complex physics in games, such as water simulation, will be replaced with AI models, as they can create realistic output cheaper and faster than running the actual simulation.
Not only is such AI model simulation orders of magnitude faster, in some cases it is also more precise. As an example, Google DeepMind recently released an open-source weather model, GraphCast, that beats the best supercomputers using only 36.7 million parameters. (This means you can run it at home!)
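The general pattern looks roughly like the sketch below: run the expensive solver offline to produce training data, fit a small network to predict the next state, then call the cheap network at runtime instead of the solver. The network size, the fake solver, and the state layout are illustrative assumptions, not how any shipping game does it.

```python
# Sketch: replacing a hand-written physics step with a small learned surrogate.
import torch
import torch.nn as nn

STATE_DIM = 64  # imagine a flattened patch of water: heights + velocities

surrogate = nn.Sequential(
    nn.Linear(STATE_DIM, 256), nn.ReLU(),
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, STATE_DIM),
)

def expensive_simulation_step(state: torch.Tensor) -> torch.Tensor:
    # Stand-in for the real solver we want to avoid running every frame.
    return state * 0.99 + 0.01 * torch.roll(state, 1, dims=-1)

# Train on (state, next_state) pairs produced offline by the real solver.
optimizer = torch.optim.Adam(surrogate.parameters(), lr=1e-3)
for _ in range(200):
    state = torch.randn(128, STATE_DIM)
    target = expensive_simulation_step(state)
    loss = nn.functional.mse_loss(surrogate(state), target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# At runtime the game calls the cheap surrogate instead of the solver.
with torch.no_grad():
    next_state = surrogate(torch.randn(1, STATE_DIM))
    print(next_state.shape)  # torch.Size([1, 64])
```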
In-game virtual characters will be able to dynamically respond to player input, making them feel almost alive.
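A minimal sketch of how that could be wired up even today, assuming an LLM chat API behind the scenes (here OpenAI's Python client; the model name and the NPC persona are placeholders, and a real game would hide the latency):

```python
# Toy NPC dialogue loop driven by an LLM. Persona, memory handling and model
# name are illustrative; a real game would add caching, latency hiding, etc.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

npc_persona = (
    "You are Mira, a blacksmith in a small fantasy village. "
    "Answer in one or two short sentences, in character."
)

def npc_reply(player_line: str, history: list[dict]) -> str:
    messages = [{"role": "system", "content": npc_persona}] + history
    messages.append({"role": "user", "content": player_line})
    response = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
    return response.choices[0].message.content

history: list[dict] = []
for line in ["Good morning!", "Can you repair my sword?"]:
    reply = npc_reply(line, history)
    history += [{"role": "user", "content": line}, {"role": "assistant", "content": reply}]
    print("Mira:", reply)
```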
In addition, many other aspects of games will be generated dynamically via GenAI. For more insights,

writes much better on this subject than me.

3. About 3-5 years
You will now be able to play games dynamically generated by GenAI via cloud gaming solutions such as GeForce NOW. The models will still be too big and resource-intensive to run on your local hardware, but for a small monthly fee you will be able to rent access to a supercomputer in the cloud.
At the same time it’s likely we will see GenAI models being able to output instructions directly to the GPU, allowing flexible but still limited dynamic gaming experiences at home.
4. Beyond
Improvements to both hardware and algorithms now make it possible to run these models locally, first on gaming computers and later directly on mobile phones.
Games will be fully modifiable via simple iterative prompts that update the experience.
In my next article, I will dive much deeper into this subject and try to answer the question: will such games be fun to make and play?
Previous articles in this series
[1] Shameless self promotion!