Why Games Won't Be Much Better On The New Generation Of Consoles

categories: video games | originally posted: 2006-01-16 23:29:59

Every couple of years or so, the home video game console makers conspire to make a new "generation" of consoles. As I post this, a new "generation" is upon us; the Microsoft XBox 360 is already out, and the Sony PlayStation 3 and Nintendo Revolution are due to be released this year.

While I am keen to see them (and will, in all likelihood, buy all three), I can't help but think that this will be a less exciting step forward than previous generations. Why? Two reasons: the new consoles are harder to get top performance out of, and they're generally just not "better enough".

Not "Better Enough"

Let's examine the latter claim first. If you consider the evolution of the video game console generations (scroll down to the bottom of that page), you can characterize each generation with its technological advances:
  • 8-bit era: Crude graphics, few colors, crude sound
  • 16-bit era: More refined graphics, much more colorful, much better sound
  • 32/64-bit era: Ubiquitous 3D graphics, sampled sound, CD-quality soundtracks
  • Sixth generation era: Refined 3D graphics, basic network connectivity
  • Seventh generation era: HD quality 3D graphics, refined network connectivity

Each generation from "8-bit" through "Sixth" was a leap in the capabilities of the machine—you can point to definite areas where the newer machines were a definite improvement. There are games that 16-bit machines could play that 8-bit machines simply couldn't, games 32-bit machines could play that 16-bit ones couldn't, and so on.

At the point that we reach the transition from the Sixth to the Seventh, we find diminishing returns. There are few games that the Sixth generation machines can play that the 32-bit generation ones arguably could not. Sure, the 32-bit generation machines still didn't have enough RAM to load giant levels, or enough rendering power to render incredibly complicated scenes. But there are precious few Sixth generation games of which you can definitely say "A 32-bit era machine simply could not have played this game."

  • Grand Theft Auto 3? It would have been loading continually, but sure.
  • Katamari Damacy? The renderer would have had to cheat a lot more, but absolutely. After all, Katamari Damacy is being ported to the Nintendo DS, which is roughly equivalent in power to a Nintendo 64—itself a 32-bit era machine.
  • Halo? The AI wouldn't have been as smart, but certainly all the play mechanics (an FPS with vehicles) could have been realized on a 32-bit machine. Whether dumber AI means it's a different game is in the eye of the beholder.

Now compare the advancement from the Sixth to the Seventh. It's not like there are any new technologies offered by Seventh generation machines; we've had 3D rendering and sampled sound for three generations. Networking is nothing new. The graphics will be much prettier, and networking will be more ubiquitous, but that's about it.

(The only area I know of where the new consoles have it all over the old is in calculating physics. Most games to date use clever, cheap-to-compute, but unrealistic physics; while not particularly accurate, it has been sufficient to make fun games with a patina of realism. Some newer games like Half-Life 2 and FlatOut incorporate more realistic physics engines in their gameplay; many more games tout "ragdoll physics" but only use the system for death animations. It remains to be seen if the Seventh generation is fast enough to compute realistic physics for gameplay, and if so, whether it will be an improvement.)

Squeezing Out Top Performance

So what about my other claim? Is it really that hard to get top performance out of the new consoles? Yes, and the reason why has to do with the new-fangled multi-core CPUs they all have.

Let's compare the Sixth and Seventh generation consoles based on their CPU speeds and number of concurrent threads of execution:

Console                   CPU clock (architecture)                          Threads of execution
Microsoft XBox 360        3,200 MHz (Power, RISC)                           Three cores, each is hyper-threaded = 6 threads
Microsoft XBox (Classic)  733 MHz (Intel, CISC)                             Single core = 1 thread
Sony PlayStation 3        3,200 MHz (Power, RISC)                           One full-featured Power core, seven "SPE" cores = 8 threads
Sony PlayStation 2        300 MHz (MIPS, RISC)                              Single core = 1 thread
Nintendo Revolution       Not known, rumored to be 1,800 MHz (Power, RISC)  Not known, rumored to be dual-core = 2 threads
Nintendo GameCube         485 MHz (Power, RISC)                             Single core = 1 thread

The jump in CPU speed isn't as large an advance as you might think—particularly from the XBox. The Intel CISC chip gets more done per clock than the Power RISC chip, so it's probably only on the order of 2-3x as fast. And there's more to consider than just raw CPU speed; there's memory bandwidth, and cache size, and out-of-order execution, and pipeline depth, and handwave handwave handwave... I've heard it said that, when all is said and done, the new consoles really are only a couple times faster than the old ones. Certainly not the gen-u-ine 10x speed jump we had between the 32-bit generation and the Sixth generation.
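
To put rough numbers on that, here is a small Python sketch that does nothing more than arithmetic on the clock speeds quoted in the table above (the Revolution figure is a rumor). The helper names are made up for this example, and `implied_per_clock_advantage` is only a back-of-the-envelope way to ask how much more work per clock the old chip would need to do for a raw clock ratio to shrink to the "2-3x" guess. It is not a benchmark of anything.

```python
# Clock speeds in MHz as quoted in this post: (old console, new console).
# The Nintendo Revolution figure is a rumor, not a confirmed spec.
CLOCKS_MHZ = {
    "Microsoft": (733, 3200),
    "Sony": (300, 3200),
    "Nintendo": (485, 1800),
}


def raw_clock_ratio(old_mhz, new_mhz):
    """How many times higher the new clock is. Says nothing about real speed."""
    return new_mhz / old_mhz


def implied_per_clock_advantage(raw_ratio, effective_speedup):
    """Per-clock advantage the old chip would need for raw_ratio to shrink
    to effective_speedup. A rough illustrative figure, not a measurement."""
    return raw_ratio / effective_speedup


if __name__ == "__main__":
    for maker, (old, new) in CLOCKS_MHZ.items():
        print(f"{maker}: {raw_clock_ratio(old, new):.1f}x raw clock")
    xbox = raw_clock_ratio(*CLOCKS_MHZ["Microsoft"])
    for guess in (2, 3):
        factor = implied_per_clock_advantage(xbox, guess)
        print(f"{guess}x effective implies {factor:.1f}x more work per clock")
```

On these figures the XBox-to-360 clock ratio comes out at roughly 4.4x, so the "2-3x" estimate amounts to assuming the older Intel chip does somewhere around one and a half to two times as much per clock.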

So what about this multi-core business? It's like having more than one CPU, right? Sadly, six cores doesn't simply translate into six times as fast. After all, if you have six cars, you won't get to your destination any faster than if you have one car, right? You'd be better off spending the money getting one really fast car.

A closer analogy is chefs in a kitchen. If you're only making one meal, how many chefs do you really need? Two can't make a dish much faster than one. Meanwhile, the chefs are getting in each other's way. And there's only one stove, and only one sink, and only one food processor, so if both chefs need to use something at the same time one of them simply has to wait for the other to finish.

That analogy is pretty close. Chefs represent CPU cores, and the "one meal" is the big purty game, the only program running on your console (or at least the only one you care about). Just as one chef has to wait for the food processor while the other chef is using it, CPU cores have to wait when one is using the GPU or the sound device. When CPUs have to wait like this, computer scientists call it contention—and it's a multithreaded performance killer.
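
The kitchen can be sketched in a few lines of Python. This is a toy under stated assumptions: the "stove" is a `threading.Lock`, each "chef" is a thread that needs the stove for a fixed time, and `chef` and `cook` are names invented for the illustration. Python threads are used here only to show the waiting; this says nothing about how a console actually arbitrates access to its GPU or sound hardware.

```python
import threading
import time

stove = threading.Lock()  # the one shared resource in our toy kitchen


def chef(name, seconds, log):
    """One chef (thread) needs the stove (lock) for `seconds`."""
    with stove:  # blocks here while another chef holds the stove
        start = time.perf_counter()
        time.sleep(seconds)  # pretend to cook
        end = time.perf_counter()
    log.append((name, start, end))


def cook(n_chefs, seconds=0.05):
    """Start n_chefs at once; return (total elapsed, stove usage by start time)."""
    log = []
    threads = [
        threading.Thread(target=chef, args=(f"chef {i}", seconds, log))
        for i in range(n_chefs)
    ]
    began = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    elapsed = time.perf_counter() - began
    return elapsed, sorted(log, key=lambda entry: entry[1])


if __name__ == "__main__":
    elapsed, log = cook(2)
    print(f"two chefs, one stove: {elapsed:.2f}s")
```

Because the second chef cannot start until the first lets go of the stove, two chefs take about as long as one chef cooking twice. The extra thread bought nothing.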

So what can you use multiple cores for in a video game? Turns out, not much. Video games, like most other programming pursuits, are an inherently serial process. Somewhere, in basically every video game, is what we call The Big Game Loop:

While the game is running,
  1. Scan the inputs for changes (joysticks, keyboards, mice),
     then compute the new game state (did the monster move?
     did the player shoot?).
  2. Render the graphics to the screen.
  3. Go back to 1.

Let's assume for illustrative purposes that steps 1 and 2 take approximately the same amount of time. That might look like this:

         CPU
Time 1   Frame 1: Scan input & compute state
Time 2   Frame 1: Render graphics
Time 3   Frame 2: Scan input & compute state
Time 4   Frame 2: Render graphics
Time 5   Frame 3: Scan input & compute state
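
As a rough illustration, The Big Game Loop and the single-CPU timeline could be sketched in Python like this. The function names (`scan_inputs`, `compute_state`, `render`, `run_serial`) and the little dictionary used for game state are placeholders invented for the sketch, not anything from a real engine. Each step fills exactly one time slot, which is the same simplifying assumption the table makes.

```python
def scan_inputs(frame):
    """Hypothetical stand-in for polling joysticks, keyboards, and mice."""
    return {"frame": frame, "fire_pressed": frame % 2 == 0}


def compute_state(state, inputs):
    """Hypothetical stand-in for game logic: count shots fired so far."""
    shots = state["shots"] + (1 if inputs["fire_pressed"] else 0)
    return {"frame": inputs["frame"], "shots": shots}


def render(state):
    """Hypothetical stand-in for drawing; returns a description of the work."""
    return f"Frame {state['frame']}: Render graphics"


def run_serial(time_slots):
    """Run the loop on one core, one step per time slot."""
    timeline = []
    state = {"frame": 0, "shots": 0}
    frame = 1
    while len(timeline) < time_slots:
        # Step 1: scan inputs, then compute the new game state.
        state = compute_state(state, scan_inputs(frame))
        timeline.append(f"Frame {frame}: Scan input & compute state")
        if len(timeline) == time_slots:
            break
        # Step 2: render the state we just computed.
        timeline.append(render(state))
        # Step 3: go back to step 1 for the next frame.
        frame += 1
    return timeline


if __name__ == "__main__":
    for slot, work in enumerate(run_serial(5), start=1):
        print(f"Time {slot}   {work}")
```

Nothing in `run_serial` can start before the step ahead of it has finished, which is what "inherently serial" means here.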

This doesn't afford a lot of opportunities for multiple "chefs". Scanning inputs doesn't take much time, so there's no point in running that on multiple cores. Recomputing game state is hard to do multithreaded, as game state is generally highly interconnected, which means there would be lots of contention. Rendering is very hard to run on multiple cores as there's only one GPU. (Though I understand there are some architectural approaches on the XBox 360 that may make multithreaded access to the GPU more productive.)

There's really only one big opportunity for multiprocessing here: run steps 1 and 2 on their own cores. The first CPU goes on to re-read inputs and calculate new game state while the second CPU renders the state the first CPU just finished up. That staggered approach looks like this:

         CPU Core A                             CPU Core B
Time 1   Frame 1: Scan input & compute state    idle
Time 2   Frame 2: Scan input & compute state    Frame 1: Render graphics
Time 3   Frame 3: Scan input & compute state    Frame 2: Render graphics
Time 4   Frame 4: Scan input & compute state    Frame 3: Render graphics
Time 5   Frame 5: Scan input & compute state    Frame 4: Render graphics

Compare the two tables. In our first theoretical example, we've only rendered two screens by time 5. But with two CPUs, we can render four screens in the same time. We've made rendering twice as fast!
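
The counting behind that comparison can be written down directly. The sketch below assumes, as the tables do, that computing state and rendering each take exactly one time slot and that handing a finished frame from one core to the other costs nothing. `frames_rendered` and `pipelined_timeline` are names invented for this example.

```python
def frames_rendered(time_slots, cores):
    """Count frames fully rendered after `time_slots` equal-length slots."""
    if cores == 1:
        # One core alternates: state, render, state, render, ...
        return time_slots // 2
    if cores == 2:
        # Core B renders in every slot except the first, when it is
        # still waiting for core A to finish frame 1.
        return max(time_slots - 1, 0)
    raise ValueError("this toy model only covers one or two cores")


def pipelined_timeline(time_slots):
    """Build the (core A, core B) work for each slot of the staggered table."""
    rows = []
    for slot in range(1, time_slots + 1):
        core_a = f"Frame {slot}: Scan input & compute state"
        core_b = "idle" if slot == 1 else f"Frame {slot - 1}: Render graphics"
        rows.append((core_a, core_b))
    return rows


if __name__ == "__main__":
    print("one core: ", frames_rendered(5, 1), "frames by time 5")
    print("two cores:", frames_rendered(5, 2), "frames by time 5")
    for slot, (a, b) in enumerate(pipelined_timeline(5), start=1):
        print(f"Time {slot}   {a:40}{b}")
```

Note that the second core never makes any single frame arrive sooner. It only lets a new frame finish every slot instead of every other slot.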

Keep in mind that this is only a theoretical example. It's unlikely that these two steps would take exactly the same amount of time. John Carmack of id Software added SMP support to Quake 3, allowing it to take advantage of two CPUs, splitting up the work between game state and rendering as above. He found that it was between 35% and 90% faster, depending on how complicated the rendering was. The rendering was almost always the slower part.
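
A rough way to see why the gain falls short of 2x when the two steps are unequal: in the staggered scheme a new frame can only finish as often as the slower step allows. The sketch below is an idealized model with no synchronization overhead at all, and `pipeline_speedup` is a name invented for it. It is not a description of how Quake 3 was measured.

```python
def pipeline_speedup(state_time, render_time):
    """Ideal speedup from running state and rendering on separate cores.

    Serial:    each frame costs state_time + render_time.
    Pipelined: a new frame finishes every max(state_time, render_time).
    Ignores all synchronization overhead, so real results will be lower.
    """
    serial_frame_time = state_time + render_time
    pipelined_frame_time = max(state_time, render_time)
    return serial_frame_time / pipelined_frame_time


if __name__ == "__main__":
    # State update time expressed as a fraction of render time.
    for fraction in (1.0, 0.9, 0.35):
        gain = (pipeline_speedup(fraction, 1.0) - 1.0) * 100
        print(f"state = {fraction:.2f} x render: {gain:.0f}% faster")
```

Read through this model, and only as a loose interpretation, a 35% to 90% gain is what you would expect if updating game state took somewhere between about a third of and nearly all of the time that rendering took.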

It's also worth noting that this was Mr. Carmack's second attempt at SMP support in Quake 3. His first attempt only resulted in a 3%-15% speedup at best, and was actually slower at worst. Multi-threaded programming is awfully hard to get right, and it adds its own overhead. Even when it goes well, it's almost impossible to take full advantage of even two simultaneous threads, and you almost always get diminishing returns with each additional core.
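
The diminishing returns have a standard textbook formulation, Amdahl's law, which this post does not cite but which fits the argument: if only a fraction p of the work can be spread across n cores, the best possible speedup is 1 / ((1 - p) + p / n). The sketch below is that formula in Python. The 50% figure in the example is a made-up illustration, not a measurement of any game.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Best-case speedup under Amdahl's law.

    parallel_fraction: share of the work (0.0 to 1.0) that can run in parallel.
    cores: number of cores available to that parallel share.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # Hypothetical game where half the per-frame work can be parallelized.
    for cores in (1, 2, 3, 6, 8):
        print(f"{cores} cores: {amdahl_speedup(0.5, cores):.2f}x")
```

With half the work stuck on one core, no number of additional cores can ever reach 2x, and most of what is available is already captured by the second core.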

So the analogy "you're better off getting one really fast car" still holds. One reason the original PlayStation was so successful was that it was so straightforward to program. It had one fast main CPU with a vector coprocessor, one graphics processor, and one sound processor. Compare that with the Sega Saturn, which had quite a collection: two Hitachi Super-H CPUs, an SH-1 RISC CPU, two "video display processors", a "System Control Unit", two sound processors, and an input controller. Saturn games that pushed the hardware were few indeed, largely because it was so hard to juggle all of those dissimilar CPUs. Getting good performance out of the PlayStation's straightforward design was much easier.

By the same token, the Seventh generation of consoles would have been better if we had gotten faster cores instead of more cores. A 2 GHz dual-core Athlon would have been both faster and much easier to program, and as a result many more games would have gotten top performance out of it. Two cores would have been plenty; more than that would (and will) largely go to waste.

The Track Record So Far

Let me close by listing the games I have so far for my XBox 360:
  • Call Of Duty 2
  • Geometry Wars: Retro Evolved
  • Hexic HD
  • Kameo: Elements Of Power
  • Perfect Dark Zero
  • Project Gotham Racing 3
Every single one of those games would have played just fine on the original XBox. About all I can say is that the crates look much better in Perfect Dark Zero.

So, then, why am I going to buy all those big, expensive, overheating consoles? For the same reason I own an XBox and a PS2 and a GameCube: the system exclusives. You could only play Katamari Damacy on a PS2. You could only play Super Smash Brothers Melee on a GameCube. You could only play Halo on an XBox. And now you can only play Project Gotham Racing 3 on an XBox 360.

Momentary Fascinations is Copyright 2005-2017 Larry Hastings.