Interview with Mike Ambinder of Valve Software
Valve Software has designed top-selling games including Left 4 Dead, Half-Life, and Team Fortress. I recently spoke with Mike Ambinder, PhD, the company’s full-time experimental psychologist, to discuss the professional practices that
ensure high-quality game experiences.
Q: What’s your role at Valve?
A: My job is to apply knowledge and methodologies from psychology to game
design. That means performing statistical analyses, developing
playtesting methodologies, conducting design experiments, a little bit of interface
design, and investigating alternative hardware among other things.
Q: How can psychology guide game design?
A: Well, for example, in the Left 4 Dead series there are several predetermined
locations in the game called “drop points” where health items or
weapons will spontaneously appear. To decide what’s dropped, where, and
when, we considered reward and reinforcement schedules, which are elements of
behavioral psychology. You can put things on a fixed schedule so that
they’ll appear at regular intervals. This makes the gameplay experience
more predictable, and there can be real value in that. Or you can use a
variable schedule so that you don’t know what’s going to show up or when it’ll
pop in. Variable schedules can create a higher rate of engagement in the
game and make the experience more enjoyable as uncertainty of occurrence can
increase arousal. A large component of the gameplay in the Left 4 Dead series
is the use of these variable reinforcement schedules.
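To make the difference between the two schedules concrete, here is a minimal Python sketch. It is an illustration only, not Valve's implementation: the function names, the 60-second interval, and the choice of a uniform random gap are all assumptions made for the example.

```python
import random


def fixed_schedule(interval, total_time):
    """Drop times spaced exactly `interval` seconds apart (hypothetical helper)."""
    return list(range(interval, total_time + 1, interval))


def variable_schedule(mean_interval, total_time, rng):
    """Drop times with random gaps that average `mean_interval` seconds.

    Assumption: each gap is drawn uniformly from 0.5x to 1.5x the mean,
    so the player cannot predict exactly when the next item appears.
    """
    times = []
    t = 0.0
    while True:
        t += rng.uniform(0.5 * mean_interval, 1.5 * mean_interval)
        if t > total_time:
            break
        times.append(t)
    return times


if __name__ == "__main__":
    rng = random.Random(42)  # seeded only so repeated runs match
    print("fixed:   ", fixed_schedule(60, 300))
    print("variable:", [round(t, 1) for t in variable_schedule(60, 300, rng)])
```

Both schedules deliver roughly the same number of items over a session; only the predictability of each individual drop changes.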
Q: How is testing integrated into the design process?
A: We’re constantly playtesting. Our philosophy is to playtest as much as
possible, and to start it as soon as we have a playable prototype. Of
course our designers are experienced and generally make good decisions about
gameplay, but we don’t want to just assume we’ve got it right. Game
designs are hypotheses, and every instance of play is an experiment.
Q: What’s your standard testing method?
A: We use a variety of methods, but the most favored is direct observation of
real players working their way through the game. I’m not a fan of the
think-aloud protocol, in part because the constant prompting detracts from the
gameplay experience and can introduce inadvertent bias, and in part because
people can be really bad at explaining why they do what they do. Better
to just sit back, watch, say nothing, and try to understand the player’s
actions. So quiet, direct observation is our preferred method, but we
combine that with player Q&As, surveys, quantitative metrics, eyetracking, and
design experiments, and we’re investigating methods of measuring the player’s
emotional experience during gameplay.
Q: How can eyetracking help to inform game design?
A: Generally, you want to eliminate frequent long eye movements because they
lead to fatigue. For example, if the area map is in the bottom right
corner of the display and your progress through that map is shown in the upper
left, you’ll see the player’s eyes transiting the screen a lot. The
proximity compatibility principle tells us that things which are mentally
proximal should also be physically proximal, and eyetracking can tell us which
things are mentally proximal. By arranging related information together,
you can reduce fatigue and make the interface more efficient to use.
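One way to quantify "frequent long eye movements" is sketched below in Python. This is a hypothetical illustration, not a description of Valve's tooling: it assumes the eyetracker yields an ordered list of fixation points in screen pixels, and the 100-pixel threshold for a "long" movement is an arbitrary value chosen for the example.

```python
import math


def movement_lengths(fixations):
    """Distance in pixels between each pair of consecutive fixations."""
    return [math.dist(a, b) for a, b in zip(fixations, fixations[1:])]


def long_movement_share(fixations, threshold_px):
    """Fraction of eye movements longer than `threshold_px` (assumed cutoff)."""
    lengths = movement_lengths(fixations)
    if not lengths:
        return 0.0
    return sum(1 for d in lengths if d > threshold_px) / len(lengths)


if __name__ == "__main__":
    # Made-up fixation data: one short hop, one pause, one jump across the screen.
    fixations = [(0, 0), (3, 4), (3, 4), (303, 404)]
    print(movement_lengths(fixations))
    print(long_movement_share(fixations, threshold_px=100))
```

Comparing this share before and after an interface change gives a rough check on whether related information has actually been moved closer together.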
Q: And how can you measure the emotional experience of gameplay?
A: This is still early on, but we’re looking at biometric methods like EEGs,
which measure brainwaves, and EMGs, which measure the electrical activity of
muscles. But there are questions of their cost and efficacy.
They’re also both very intrusive methods, requiring either a cap that’s wired
into a machine or electrodes attached to the face. In testing you want to
mimic the home experience as much as possible, and EEGs and EMGs both make it
feel more like a lab environment. But new technologies are emerging that
could change that. Remote detection of facial expression seems promising;
these systems produce data along the lines of an EMG but only use a camera to
measure muscle activity in the face.
Emotion can be viewed as a vector and measured along two scales: magnitude and
valence. Magnitude describes the intensity of the emotion, while valence
describes its quality (either positive or negative). You can measure the
magnitude pretty reliably using something like heart rate, but understanding
the valence is the tricky part. How do you know if that intense emotional
response is good or bad? Of course you could just ask, but again that’s not
a preferred method because people don’t describe their own experiences reliably
and you’re introducing bias into the response. Context is a better
basis. If someone is getting killed repeatedly, you can assume that
they’re experiencing a negative emotion.
However, to validate that assumption we’d love to have a system which quantifies valence
in real time.
Once we can measure these qualities reliably, we can start asking what the
ideal emotional experience should look like over the course of the player’s
interaction with the game. Maybe that would be something like a pattern
of peaks and valleys that steadily rises over time, as opposed to a prolonged
burst of emotion that’s experienced all at once. That seems like a
plausible theory, but we won’t know until we’ve measured it.
Q: What are some of the design elements that you’ve found make better player
experiences?
A: I can suggest a few things. First, the player needs to be able to
understand the experience. If you die, you need to understand why you
died. If you reach a decision point, you need to understand what the
implications are of taking path A or path B. The designer needs to
provide a sensible environment.
Variety is also really important. Don’t give people the same monsters
again and again, or force them to traverse the same levels over and over. There
are obvious counterpoints to this, and the constructs of the game may dictate a
lack of variety, so it’s not a hard and fast rule (none of these are), but it
is something we try and emphasize. The Left 4 Dead series is a great
example, because you’re always interacting with a new set of players with
different skill levels and different tactics, and that will completely change
the dynamic of the game. It doesn’t play the same way twice.
Third, you want to provide people with a feeling of continuous
advancement. People prize rewards if they increase in perceived
value. They want to feel that the required level of skill builds
gradually as the game progresses.
Finally, have the player make interesting choices. Which weapon should I
choose? Which armor should I take? If these decisions don’t involve
meaningful tradeoffs, then you’re probably not creating an enjoyable
experience.
Q: How do you foster collaboration in multiplayer games?
A: Left 4 Dead is really designed to force players to cooperate. If you
go out on your own, for example, you’ll get incapacitated very quickly.
The game doesn’t prevent you from doing that — it’s a choice you can exercise,
but it’s inevitably a losing strategy. If you have other players near you
then you can collectively put up a stronger fight, and when you fall then they
can easily revive you.
Testing helped us improve collaboration in Left 4 Dead as well. In the
original design, the thinking was that players would build awareness of each
other’s locations just through verbal cues, speaking to one another through a
headset. But it turned out that in the midst of gameplay that doesn’t
work well. When a teammate fell and needed to be revived, the other
players had a difficult time finding him or her. They needed another cue,
so we introduced glowing outlines that appear around your teammates’ bodies, and
which are visible through walls. We found that really increased the
players’ situational awareness, facilitated cooperation, and created a better
experience.
Q: What kinds of quantitative metrics do you use to inform design?
A: We work with tons of data. We can track any variable available in
the game. We’ll take information about where people die in each level,
then overlay it on an image of the level to show whether people are dying in
the right places, and in the right numbers. We can examine the growth in
players’ skill levels over time by any of various measures, depending upon the
needs of the game’s design. That may be a fairly coarse metric such as
the ratio of kills to deaths, who gets the most kills, who stays alive the
longest, and so on. Or you can apply several measures in combination to
satisfy a very precise definition of the ideal skill level, such as players who
have a moderately high rate of kills but who win a lot and stay alive for a
very long time.
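The metrics described above can be sketched in a few lines of Python. Everything here is an assumption made for illustration, including the record fields (`kills`, `deaths`, `wins`, `matches`, `seconds_alive`), the grid cell size, and the threshold values; none of it comes from Valve's actual telemetry.

```python
from collections import Counter


def death_grid(death_positions, cell_size):
    """Count deaths per grid cell, ready to overlay on an image of the level."""
    return Counter(
        (int(x // cell_size), int(y // cell_size)) for x, y in death_positions
    )


def kill_death_ratio(kills, deaths):
    """Coarse skill metric; a player with zero deaths is treated as having one."""
    return kills / max(deaths, 1)


def matches_profile(player, min_kd, max_kd, min_win_rate, min_avg_survival):
    """Combine several measures into one stricter skill definition.

    `player` is a hypothetical dict with kills, deaths, wins, matches,
    and seconds_alive keys.
    """
    if player["matches"] == 0:
        return False
    kd = kill_death_ratio(player["kills"], player["deaths"])
    win_rate = player["wins"] / player["matches"]
    avg_survival = player["seconds_alive"] / player["matches"]
    return (
        min_kd <= kd <= max_kd
        and win_rate >= min_win_rate
        and avg_survival >= min_avg_survival
    )


if __name__ == "__main__":
    print(death_grid([(10, 10), (12, 18), (130, 40)], cell_size=64))
    player = {"kills": 30, "deaths": 20, "wins": 7, "matches": 10,
              "seconds_alive": 6000}
    print(matches_profile(player, 1.0, 2.0, 0.6, 500))
```

The grid counts correspond to the death overlay, while `matches_profile` mirrors the combined definition: a moderately high kill rate, frequent wins, and long survival.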
I really appreciate your time. I’d wish you luck, but with these kinds
of practices it really doesn’t sound like you need it.