Excerpt From "Amazon Unbound" by Brad Stone
There was nothing particularly distinctive about the dozen
or so low-rise buildings in Seattle’s burgeoning South Lake
Union district that Amazon moved into over the course of
2010. They were architecturally ordinary and, on the insistence
of its CEO, bore no obvious signage indicating the presence of an
iconic internet company with almost $35 billion in annual sales. Jeff
Bezos had instructed colleagues that nothing good could come from
that kind of obvious self-aggrandizement, noting that people who
had business with the company would already know where it was
located.
While the offices clustered around the intersection of Terry Avenue
North and Harrison Street were largely anonymous, inside they
bore all the distinguishing marks of a unique and idiosyncratic corporate
culture. Employees wore color-coded badges around their necks
signifying their seniority at the company (blue for those with up to
five years of tenure, yellow for up to ten, red for up to fifteen), and the
offices and elevators were decorated with posters delineating Bezos’s
fourteen sacrosanct leadership principles.
Within these walls ranged Bezos himself, forty-six years old at
the time, carrying himself in such a way as to always exemplify Ama-
zon’s unique operating ideology. The CEO, for example, went to great
lengths to illustrate Amazon’s principle #10, “frugality”: Accomplish
CEO. Among the TA’s duties were to take notes in meetings, write the
first draft of the annual shareholder letter, and learn by interacting
with the master closely for more than a year. In the role from 2009
to 2011 was Amazon executive Greg Hart, a veteran of the company’s
earliest retail categories, like books, music, DVDs, and video games.
Originally from Seattle, Hart had attended Williams College in West-
ern Massachusetts and, after a stint in the ad world, returned home
at the twilight of the city’s grunge era, sporting a goatee and a pen-
chant for flannel shirts. By the time he was following Bezos around,
the facial hair was gone and Hart was a rising corporate star. “You
sort of feel like you’re an assistant coach watching John Wooden, you
know, perhaps the greatest basketball coach ever,” Hart said of his
time as the TA.
Hart remembered talking to Bezos about speech recognition one
day in late 2010 at Seattle’s Blue Moon Burgers. Over lunch, Hart
demonstrated his enthusiasm for Google’s voice search on his An-
droid phone by saying, “pizza near me,” and then showing Bezos the
list of links to nearby pizza joints that popped up on-screen. “Jeff was
a little skeptical about the use of it on phones, because he thought
it might be socially awkward,” Hart remembered. But they discussed
how the technology was finally getting good at dictation and search.
At the time, Bezos was also excited about Amazon’s growing cloud
business, asking all of his executives, “What are you doing to help
AWS?” Inspired by the conversations with Hart and others about
voice computing, he emailed Hart, device vice president Ian Freed,
and senior vice president Steve Kessel on January 4, 2011, linking the
two topics: “We should build a $20 device with its brains in the cloud
that’s completely controlled by your voice.” It was another idea from
the boss who seemed to have a limitless wellspring of them.
Bezos and his employees riffed on the idea over email for a few
days, but no further action was taken, and it could have ended there.
Then a few weeks later, Hart met with Bezos in a sixth-floor conference
room in Amazon’s headquarters, Day 1 North, to discuss his career op-
A few months after the Yap purchase, Greg Hart and his colleagues
acquired another piece of the Doppler puzzle. It was the technolog-
ical antonym of Yap, which converted speech into text. Instead, the
Polish startup Ivona generated computer-synthesized speech that re-
sembled a human voice.
Ivona was founded in 2001 by Lukasz Osowski, a computer science
student at the Gdańsk University of Technology. Osowski had
the notion that so-called “text to speech,” or TTS, could read digital
texts aloud in a natural voice and help the visually impaired in
Poland appreciate the written word. With a younger classmate, Michal
Kaszczuk, he took recordings of an actor’s voice and selected
fragments of words, called diphones, and then blended or “concatenated”
them together in different combinations to approximate
natural-sounding words and sentences that the actor might never
have uttered.
The Ivona founders got an early glimpse of how powerful their
technology could be. While students, they paid a popular Polish actor
named Jacek Labijak to record hours of speech to create a database
of sounds. The result was their first product, Spiker, which quickly
became the top-selling computer voice in Poland. Over the next few
years, it was used widely in subways, elevators, and for robocall cam-
paigns. Labijak subsequently began to hear himself everywhere and
regularly received phone calls in his own voice urging him, for ex-
dio, GM Voices, the same outfit that had helped turn recordings from
a voice actress named Susan Bennett into Apple’s agent, Siri. To cre-
ate synthetic personalities, GM Voices gave female voice actors hun-
dreds of hours of text to read, from entire books to random articles, a
mind-numbing process that could stretch on for months.
Greg Hart and colleagues spent months reviewing the recordings
produced by GM Voices and presented the top candidates to Bezos.
They ranked the best ones, asked for additional samples, and finally
made a choice. Bezos signed off on it.
Characteristically secretive, Amazon has never revealed the name
of the voice artist behind Alexa. I learned her identity after canvassing
the professional voice-over community: Boulder-based singer and
voice actress Nina Rolle. Her professional website contained links to
old radio ads for products such as Mott’s Apple Juice and the
Volkswagen Passat, and the warm timbre of Alexa’s voice was
unmistakable. Rolle said she wasn’t allowed to talk to me when I reached
her on the phone in February of 2021. And when I asked Amazon to
speak with her, they declined.
While the Doppler team hired engineers and acquired startups, nearly
every other aspect of the product was hotly debated in Amazon’s of-
fices in Seattle and in Lab126 in Silicon Valley. In one of the earliest
Doppler meetings, Greg Hart identified the ability to play music with
a voice command as the device’s marquee feature. Bezos “agreed with
that framework but he stressed that music may be like 51 percent, but
the other 49 percent are going to be really important,” Hart said.
Over the ensuing months, that amicable consensus devolved into
a long-standing tug-of-war between Hart and his engineers, who
saw playing music as a practical and marketable feature, and Bezos,
who was thinking more grandly. Bezos started to talk about the “Star
Trek computer,” an artificial intelligence that could handle any ques-
tion and serve as a personal assistant. The fifty-cent word “plenipo-
having Alexa respond to such social cues, recalling that “People were
uncomfortable with the idea of programming a machine to respond
to ‘hello.’ ”
Integrating Evi’s technology helped Alexa respond to factual que-
ries, such as requests to name the planets in the solar system, and it
gave the impression that Alexa was smart. But was it? Proponents
of another method of natural language understanding, called deep
learning, believed that Evi’s knowledge graphs wouldn’t give Alexa
the kind of authentic intelligence that would satisfy Bezos’s dream of
a versatile assistant that could talk to users and answer any question.
In the deep learning method, machines were fed large amounts
of data about how people converse and what responses proved
satisfying, and then were programmed to train themselves to
predict the best answers. The chief proponent of this approach was an
Indian-born engineer named Rohit Prasad. “He was a critical hire,”
said engineering director John Thimsen. “Much of the success of the
project is due to the team he assembled and the research they did on
far-field speech recognition.”
Prasad was raised in Ranchi, the capital of the eastern Indian state
of Jharkhand. He grew up in a family of engineers and got hooked
on Star Trek at a young age. Personal computers weren’t common in
India at the time, but at an early age he learned to code on a PC at the
metallurgical and engineering consulting company where his father
worked. Since communication in India was hampered by poor tele-
communications infrastructure and high long-distance rates, Prasad
decided to study how to compress speech over wireless networks
when he moved to the U.S. to attend graduate school.
After graduating in the late 1990s, Prasad passed on the dot-com
boom and worked for the Cambridge, Massachusetts–based defense
contractor BBN Technologies (later acquired by Raytheon) on some
of the first speech recognition and natural language systems. At BBN,
he worked on one of the first in-car speech recognition systems and
automated directory assistance services for telephone companies. In
these facts, proposed to double the size of the speech science team
and postpone a planned launch from the summer into the fall. Held
in Bezos’s conference room, the meeting did not go well.
“You are going about this the wrong way,” Bezos said after reading
about the delay. “First tell me what would be a magical product, then
tell me how to get there.”
Bezos’s technical advisor at the time, Dilip Kumar, then asked
if the company had enough data. Prasad, who was calling into the
meeting from Cambridge, replied that they would need thousands
more hours of complex, far-field voice commands. According to an
executive who was in the room, Bezos apparently factored in the re-
quest to increase the number of speech scientists and did the calcu-
lation in his head in a few seconds. “Let me get this straight. You are
telling me that for your big request to make this product successful,
instead of it taking forty years, it will only take us twenty?”
Prasad tried to dance around it. “Jeff, that is not how we think
about it.”
“Show me where my math is wrong!” Bezos said.
Hart jumped in. “Hang on, Jeff, we hear you, we got it.”
Prasad and other Amazon executives would remember that meet-
ing, and the other tough interactions with Bezos during the develop-
ment of Alexa, differently. But according to the executive who was
there, the CEO stood up, said, “You guys aren’t serious about making
this product,” and abruptly ended the meeting.
do use the calendar, yes,” someone who did not have several personal
assistants replied.
As in the Doppler project, the deadlines Bezos gave were unre-
alistic, and to try to make them, the team hired more engineers. But
throwing more engineers at failing technology projects only makes
them fail more spectacularly. Kindle was strategically important to
Amazon at the time, so instead of poaching employees internally, the
Tyto group had to look outside, to hardware engineers at other com-
panies like Motorola, Apple, and Sony. Naturally, they didn’t tell any-
one what they’d be working on until their first day. “If you had a good
reputation in the tech industry, they found you,” said a Fire Phone
manager.
The launch was perpetually six months away. The project
dragged on as they tried to get the 3D display to work. The original
top-of-the-line components quickly became outdated, so they de-
cided to reboot the project with an upgraded processor and cameras.
It was given a new owl-themed code name, Duke. The group started
and then canceled another phone project, a basic low-cost handset
to be made by HTC and code-named Otus, which would also use the
Amazon flavor of the Android operating system that ran on Amazon’s
new Fire tablets, which were showing promise as an economical al-
ternative to Apple’s iPad.
Employees on the project were disappointed when Otus was
scrapped because they quietly believed Amazon’s opportunity was
not in a fancy 3D display but in disrupting the market with a free
or low-priced smartphone. Morale on the team started to sour. One
group was so dubious about the entire project that they covertly
bought a set of military dogtags that read “disagree and commit,”
after Amazon leadership principle #13, which says employees who
disagree with a decision must put aside their doubts and work to
support it.
In his annual letter to shareholders released in April 2014, Bezos
wrote, “Inventing is messy, and over time, it’s certain that we’ll fail
at some big bets too.” The observation was curiously prophetic. The
team was preparing to launch the phone at a big event that summer.
Bezos’s wife, MacKenzie, showed up for rehearsals to offer support
and advice.
On June 18, 2014, Bezos unveiled the Fire Phone at a Seattle event
space called Fremont Studios, where he attempted to summon some
of the charismatic magic of the late Steve Jobs and waxed enthusi-
astic about the device’s 3D display and gesture tracking. “I actually
think he was a believer,” said Craig Berman, an Amazon PR vice pres-
ident at the time. “I really do. If he wasn’t, he certainly wasn’t going to
signal it to the team.”
Reviews of the phone were scathing. The smartphone market had
shifted and matured during the painful four years that the Fire Phone
was in development, and what had started as an attempt at creating
a novel product now seemed strangely out of touch with customer
expectations. Because it did not run Google’s authorized version of
Android, it did not have popular apps such as Gmail and YouTube.
While it was cheaper than the forthcoming iPhone 6, it was more ex-
pensive than the multitude of low-cost, no-frills handsets made by
Asian manufacturers, which were heavily subsidized at the time by
the wireless carriers in exchange for two-year contracts.
“There was a lot of differentiation, but in the end, customers
didn’t care about it,” said Ian Freed, the vice president in charge of the
project. “I made a mistake and Jeff made a mistake. We didn’t align
the Fire Phone’s value proposition with the Amazon brand, which is
great value.” Freed said that Bezos told him afterward, “You can’t, for
one minute, feel bad about the Fire Phone. Promise me you won’t lose
a minute of sleep.”
Later that summer, workers in one of Amazon’s fulfillment cen-
ters in Phoenix noticed thousands of unsold Fire Phones sitting un-
touched on massive wooden pallets. In October, the company wrote
down $170 million in inventory and canceled the project, acknowl-
edging one of its most expensive failures. “It failed for all the reasons
we all said it was going to fail—that’s the crazy thing about it,” said
Isaac Noble, one of the early software engineers who had been dubi-
ous from the start.
Ironically, the Fire Phone fiasco augured well for Doppler. With-
out a smartphone market share to protect, Amazon could pioneer the
new category of smart speakers with unencumbered ambition. Many
of the displaced engineers who weren’t immediately snapped up by
Google and Apple were given a few weeks to find new jobs at Amazon;
some went to Doppler, or to a new hit product, Fire TV. Most impor-
tantly, Bezos didn’t penalize Ian Freed and other Fire Phone manag-
ers, sending a strong message inside Amazon that taking risks was
rewarded—especially if the entire debacle was primarily his own fault.
On the other hand, it revealed a worrisome fact about life inside
Amazon. Many employees who worked on the Fire Phone had seri-
ous doubts about it, but no one, it seemed, had been brave or clever
enough to take a stand and win an argument with their obstinate
leader.
After Jeff Bezos walked out on them, the Doppler executives working
on the Alexa prototype retreated with their wounded pride to a nearby
conference room and reconsidered their solution to the data paradox.
Their boss was right. Internal testing with Amazon employees was
too limited and they would need to massively expand the Alexa beta
while somehow still keeping it a secret from the outside world.
The resulting program, conceived by Rohit Prasad and speech sci-
entist Janet Slifka over a few days in the spring of 2013, and quickly
approved by Greg Hart, would put the Doppler program on steroids
and answer a question that later vexed speech experts—how did Am-
azon come out of nowhere to leapfrog Google and Apple in the race to
build a speech-enabled virtual assistant?
Internally the program was called AMPED. Amazon contracted
with an Australian data collection firm, Appen, and went on the road
with Alexa, in disguise. Appen rented homes and apartments, initially
in Boston, and then Amazon littered several rooms with all kinds of
“decoy” devices: pedestal microphones, Xbox gaming consoles, televi-
sions, and tablets. There were also some twenty Alexa devices planted
around the rooms at different heights, each shrouded in an acoustic
fabric that hid them from view but allowed sound to pass through.
Appen then contracted with a temp agency, and a stream of contract
workers filtered through the properties, eight hours a day, six days a
week, reading scripts from an iPad with canned lines and open-ended
requests like “ask to play your favorite tune” and “ask anything you’d
like an assistant to do.”
The speakers were turned off, so the Alexas didn’t make a peep,
but the seven microphones on each device captured everything and
streamed the audio to Amazon’s servers. Then another army of work-
ers manually reviewed the recordings and annotated the transcripts,
classifying queries that might stump a machine, like “turn on Hunger
Games,” as a request to play the Jennifer Lawrence film, so that the
next time, Alexa would know.
The Boston test showed promise, so Amazon expanded the pro-
gram, renting more homes and apartments in Seattle and ten other
cities over the next six months to capture the voices and speech pat-
terns of thousands more paid volunteers. It was a mushroom-cloud
explosion of data about device placement, acoustic environments,
background noise, regional accents, and all the gloriously random
ways a human being might phrase a simple request to hear the
weather, for example, or play a Justin Timberlake hit.
The daylong flood of random people into homes and apart-
ments repeatedly provoked suspicious neighbors to call the police.
In one instance, a resident of a Boston condo complex suspected a
drug-dealing or prostitution ring was next door and called the cops,
who asked to enter the apartment. The nervous staff gave them an
evasive explanation and a tour and afterward hastily shut down the
site. Occasionally, temp workers would show up, consider the bizarre
script and vagueness of the entire affair, and simply refuse to partic-
ipate. One Amazon employee who was annotating transcripts later
recalled hearing a temp worker interrupt a session and whisper to
whoever he suspected was listening: “This is so dumb. The company
behind this should be embarrassed!”
But Amazon was anything but embarrassed. By 2014, it had
increased its store of speech data by a factor of ten thousand and
largely closed the data gap with rivals like Apple and Google. Bezos
was giddy. Hart hadn’t asked for his approval of the AMPED project,
but a few weeks before the program began, he updated Bezos with a
six-page document that described it and its multimillion-dollar cost.
A huge grin spread over Bezos’s face as he read, and all signs of past
peevishness were gone. “Now I know you are serious about it! What
are we going to do next?”
What came next was Doppler’s long-awaited launch. Working
eighty to ninety hours a week, employees were missing whole chunks
of their family’s lives, and Bezos wasn’t letting up. He wanted to see
everything and made impetuous new demands. On an unusually
clear Seattle day, with the setting sun streaming through his confer-
ence room window, for example, Bezos noticed that the ring’s light
was not popping brightly enough, so he ordered a complete redo. Al-
most alone, he argued for a feature called Voice Cast, which linked
an Alexa device to a nearby Fire tablet, so that queries showed up as
placards on the tablet’s screen. When engineers tried to quietly drop
the feature, he noticed and told the team they were not launching
without it. (Few customers ended up using it.)
But he was also right about many things. As launch neared, one
faction of employees worried that the device wasn’t good enough at
hearing commands amid loud music or cross talk and lobbied to in-
clude a remote control, like the one the company made for the Fire
TV. Bezos was opposed to it but agreed to ship remotes with the first
Over the next few weeks, a hundred and nine thousand customers
registered for the waiting list to receive an Echo. Along with some
natural skepticism, positive reviews rolled in, with quotes like “I just
spoke to the future and it listened,” and “it’s the most innovative de-
vice Amazon’s made in years.” Employees emailed Alexa executives
Toni Reid and Greg Hart, pleading for devices for family members
and friends.
After the Echo shipped, the team could see when the devices were
turned on and that people were actually using them. Bezos’s intu-
ition had been right: there was something vaguely magical in sum-
moning a computer in your home without touching the glass of a
smartphone, something valuable in having a responsive speaker that
could play music, respond to practical requests (“how many cups are
there in a quart?”), and even banter with playful ones (“Alexa, are you
married?”).
Many Doppler employees had expected they could now catch
their breath and enjoy all their accrued vacation time. But that is not
what happened. Instead of stumbling ashore from choppy seas to
rest, another giant wave crashed over their heads. Bezos deployed his
playbook for experiments that produced promising sparks: he poured
gasoline on them. “We had a running success on our hands and that’s
where my life changed,” said Rohit Prasad, who would be promoted
to vice president and eventually join the vaunted Amazon leadership
committee, the S-team. “I knew the playbook to the launch of Alexa
and Echo. The playbook for the next five years, I didn’t have.”
Over the next few months, Amazon would roll out the Alexa Skills
Kit, which allowed other companies to build voice-enabled apps for
the Echo, and Alexa Voice Service, which let the makers of products
like lightbulbs and alarm clocks integrate Alexa into their own de-
vices. Bezos also told Greg Hart that the team needed to release new
features with a weekly cadence, and that since there was no way to
signal the updates, Amazon should email customers every week to
alert them to the new features their devices offered.
Bezos’s wish list became the product plan—he wanted Alexa to be
everywhere, doing everything, all at once. Services that had originally
been pushed to the wayside in the scramble to launch, like shopping
on Alexa, now became urgent priorities. Bezos ordered up a smaller,
cheaper version of the Echo, the hockey puck–sized Echo Dot, as well
as a portable version with batteries, the Amazon Tap. Commenting
on the race to build a virtual assistant and smart speaker, Bezos said,
“Amazon’s going to be fine if someone comes along and overtakes us,”
as part of the annual late-summer OP1 series of planning meetings,
the year after Alexa’s introduction. “But wouldn’t it be incredibly an-
noying if we can’t be the leader in creating this?”
Life inside the Prime building, and in the gradually increasing
But there were drawbacks to the frenetic speed and growth. For years
the Alexa smartphone app looked like something a design student
had come up with during a late-night bender. Setting up an Echo, or
networking Echos throughout the home, was more complicated than
it needed to be. It was also difficult and confusing for users to phrase
commands in the right way to trigger third-party skills and special-
ized features.
The decentralized and chaotic approach of countless two-pizza
teams run by single-threaded leaders was manifested in aspects of
the product that had become overly complex. Basic tasks, like setting
up a device and connecting it to smart home appliances, had become
“painful, very painful to the customer,” agreed Tom Taylor, a sardonic
and even-keeled Amazon executive who took over from Mike George
as leader of the Alexa unit in 2017. He set out to “find all the places
that customers are suffering from our organizational structure.”
There was plenty of turmoil that Taylor and his colleagues couldn’t
quell. In March 2018, a bug caused Alexas around the world to ran-
domly emit crazed, unprompted laughter. A few months later, an
Echo inadvertently recorded the private conversation of a couple in
Portland, Oregon, and inexplicably sent the recording to one of the
husband’s employees in Seattle, whose phone number was in his ad-
dress book. Amazon said the device thought it had heard its wake
word and then a series of commands to record and forward the conversation.
It was an “extremely rare occurrence,” the company said, adding that,
“as unlikely as this string of events is, we are evaluating options
to make this case even less likely.” After those incidents, em-
Alexa and will it into existence believed the CEO could practically see
the future. But in at least one respect, he did not.
In 2016, he was reviewing the Echo Show, the first Alexa device
with a video screen. Executives who worked on the project recalled
that on several occasions when Bezos demoed the prototype, he spent
the first few minutes asking Alexa to play videos that ridiculed a cer-
tain GOP presidential candidate.
“Alexa, show me the video, ‘Donald Trump says “China,” ’ ” he
asked, or “Alexa, play Stephen Colbert’s monologue from last night.”
Then “he would laugh like there’s no tomorrow,” said a vice president
who was in the demos.
Bezos had no idea what was coming.