This is part of the Operations topic cluster, which belongs to the Business Expertise Triad.

The Disaffected PhD Skunkworks: A Story About Process Improvement




    In our last essay we discussed a famous paper titled Nobody Ever Gets Credit For Fixing Problems That Never Happened, which describes how many companies eventually just push their workers to work harder instead of doing process improvement. We discussed how this happens, and why; we also took a look at a handful of common business experiences to illustrate that this is a phenomenon you’ve likely already experienced.

    What I didn’t do was to give you a concrete story about good process improvement. Concrete stories matter: they ground our understanding in real events. So here’s a story about process improvement I really like.

    Customer Service at Early Amazon

    In 1996, Jane Slade and Colleen Byrum joined Amazon, then a fledgling internet startup located in downtown Seattle. Slade joined as a generalist; Byrum joined as its first Customer Service lead. Back then, the internet was still in its infancy, and Amazon tended to hire generalists more often than not; these people usually started out in the Customer Service department before moving on to other parts of the company. Each CS person was given an X-terminal with a command line prompt, and was expected to do their job with a memorised list of commands that the software engineers would bang out specifically for them. A lot of the first customer requests in 1996 were from early adopters, who would ask questions about what it was like working at Amazon (when they weren’t asking questions about their orders, that is).

    Amazon’s first major bump came with the Wall Street Journal profile of Jeff Bezos in mid-May, 1996. This was the company’s first taste of national press, and therefore its first wave of visitors who weren’t early adopters. The questions that customer service had to deal with were things like “Help me, the page just ends!” (at which point Amazon’s CS reps had to explain scrolling to them) and “please tell Hotmail that such and such feature is broken” (at which point they had to explain that the entire Internet wasn’t in one large office building somewhere, and that Amazon dot com had nothing to do with Hotmail dot com).

    Byrum was hired to lead Customer Service, but she arrived slightly after Slade. She spent a couple of weeks doing customer service herself before realising that there was an incoming tsunami of growth (the WSJ profile being the first harbinger; well, that, and the ever-increasing workload that poured on top of everyone). She quickly shifted focus to building a pipeline of candidates for the department. She called every temp agency in town, told them that Amazon would be hiring 500 candidates over the next couple of months, and asked them if they wanted a cut. The offer was contingent on Byrum giving them feedback on which candidates worked and why others didn’t, and any agency that couldn’t meet her standards would be dropped. Enough temp agencies agreed to her plan; Byrum’s recruiting program (she called it “hoovering up great clumps of people”) was off to a good start.

    The temps she hired would be offered full-time roles a couple of weeks into their jobs. But the going was tough. Byrum describes the period in her own words:

    Byrum: When I got to Amazon, the customer service department was five people sitting in a room at x-terms with command line prompts, one of them was a Latinist, one of them was a Shakespearean expert, there was a strange man in the corner completely geeking out with credit cards, he had something called CC motel where credit cards would go in and wouldn’t come out; there was some kind of horrible nightmare going on with him, there was a Rhodes Scholar, and there was Jane. These were the people in the room, and then Jeff said “Customer Service is over there” and then that was the last I talked to the man for three months.

    (...) Bear in mind, the people we were hiring, there were these disaffected PhDs and Masters program wannabes or dropouts, there were these very very bright people — “come work for us for $10 an hour in stock options”, and (they'd ask) “what are stock options?” It was not a good pitch. It was really hard to get them in the door, but we did. And then there was this laborious training process for working in Unix at a command line prompt, and oh by the way don’t leave any of the tools open or you’d bring the website down ... it was mad.

    Slade: We needed people who could write! We needed people who could write and communicate, because most customer service was on a telephone, but ours was mostly email. So people with good judgment, who could write, who could handle technology, that was a pretty tall order for a temp.

    At some point ‘hoovering up great clumps of people’ hit a natural limit. The entire company was in over their heads — none of them had experienced growth like this before. Brad Stone’s book The Everything Store gives us a taste of the period: “... revenues were growing 30 to 40 percent a month, a frenzied rate that undermined attempts at planning and required such a dizzying pace that employees later found gaps in their memory when they tried to recall this formative time. No one had any idea how to deal with that kind of growth, so they all made it up as they went along.”

    The next step, of course, was to create better tools to improve customer service. Byrum’s recruiting pipeline was running as well as it could, spitting out reps by the dozens, but they were still drowning in requests. On Dave Schappell’s Invent Like An Owner podcast, Schappell asks Byrum the obvious question:

    Schappell: So I asked about technology a few times. So now we’re moving along a little bit (after talking about all the hiring you did). And at some point you did get resources, or people assigned, or I’m sure you yelled, begged, borrowed, steal ... and got resources. Tell me about the brute force vs the technology. Tell me how CS scaled, eventually?

    [hysterical laughter from both Byrum and Slade]

    Byrum: First of all, there was no end of begging, bribery, bitching, standing on top of tables and trying to get it ... but (the answer was always) there were no resources coming to Customer Service, get over it, go away, answer the emails and stop bothering us. It is not going to happen. We have other things that are on more fire, hotter fire, greater risk, than you guys, so suck it up, buttercup.

    So I started ... what I thought of in my own head as the ‘disaffected PhD Skunkworks’. I started recruiting directly into the University of Washington, looking for people who were just that — they weren’t happy with their graduate programs in math, or computer science, or physics in one case ... I’ll never forget, there was this one incredible hire we made, a guy named Clark Grubb, god love him, I think at the time he was a night manager at a hotel, near Seattle airport.

    (...) Anyway, (we got) a handful of guys — I think it was mostly guys — and we stuck them in a corner and gave them books on Perl and SQL, and told them “I want you to do half the time answering customer service emails, and the other half time I want you to read these books and start writing Perl scripts.” And that was how we started to get some automation in customer service. And meanwhile, these guys were writing code and checking it in into the main pile of untested and ungovernable code that was the Amazon website. Anyway, that worked great. Because suddenly we had tools, suddenly we had ways of being more efficient, of meeting our customer needs at a much much faster more efficient way, both at the mail handling level and the tool, operating on orders, level.

    Schappell: But you had people building the tools who had experience answering the questions. You said they did a couple of hours doing it.

    Byrum: Well you had to do customer service first, and that was what made it so hard. You can imagine trying to recruit these people for $10 an hour — and for stock options. They didn’t even know what those were!

    You would have thought that they would have looked at some external customer service software. Except that they did, and:

    Schappell: When you were there was it all home-grown, home-built? Did you try licensing any third party customer service software? Or was it all built internally?

    Slade: I met with many vendors, and in fact some of them were people I socialised with in Seattle. And they would come in and — little missy, explain what they could do for us. And we would explain the volume where we were already, and they would have their smiles plastered on their face ... they could not handle our volume. The stuff that we had already built internally was so far beyond what they could do, they absolutely could not handle it. And we needed to move forward fast. So we made the decision that no one could help us, at that time.

    Needless to say, Colleen Byrum and Jane Slade’s early process improvement hacks carried the company for a good long while.

    Wrapping Up

    I like this story a lot. I thought Byrum was remarkable as a customer service lead, but if you hang around early-stage startups enough, you’ll know that stories like this abound. There’s always one or more Colleen Byrums hidden in the background of relatively well-run companies; the absence of a good process improvement person is felt more keenly than the presence of one.

    In the original paper on process improvement we discussed in our previous essay, authors Nelson Repenning and John Sterman described a case where a team at Du Pont successfully reversed the org dynamics preventing process improvement at the company’s Washington Works complex. They did this by designing a two-day interactive role-playing simulation they called the Manufacturing Game, which illustrated the relationship between the work harder loop and the process improvement loop, albeit in a matter of hours (yay simulations!) rather than months.

    The common thing between both positive stories seems to be air cover: with the Du Pont story, the team had to win management over, and therefore buy themselves enough time to get the process improvement loop going — horrible short term maintenance losses be damned. With the Amazon story, though, the cover was provided by Colleen Byrum herself, who hired programmers on the sly.

    You’d think that this is a minor detail, but it does seem to be common amongst all the process improvement stories that I know. I remember talking to a manager who told me that, every couple of months, he kept an intern in reserve, hidden from upper management, to whom he would assign tasks like improving the CI/CD system or building better test tooling for the rest of the software org. We had a good laugh over this (I’ve done something similar; all startups are internal disasters, etc) but now I think this is critical for process improvement to take hold.

    You need some free energy to invest in process improvement, after all. So either you buy cover for the inevitable short term losses, or you hide your improvement activities. I don’t currently see any way around it. It just seems like something that’s necessary to do.


