On Teams & Problem Spaces

Teams are most effective when they are organized around problem spaces and are explicitly named after the problem space that they’re solving. Unfortunately, if my experience is representative, this isn’t the norm. If I think back on the organizations that I’ve been a part of as an individual contributor or inherited as a manager at Microsoft, Amazon, and Riot, I’ve frequently seen teams organized around products, solution spaces, and code names. To explain each option, consider a fictitious team with the following properties:

  • The team is part of a larger organization that builds tools for feature teams to deploy, operate, and scale backend services on the company’s private cloud.
  • The team previously decided that its mission is “to make it easier for feature teams to operate deployed services” and its vision is that “services are highly available and easily scalable with virtually no operator intervention”.
  • The team’s current flagship product is an alerting and monitoring system called Brometheus.
  • The team happens to be obsessed with Star Wars (who isn’t?).

Now suppose you’re tasked with naming the team. You could name the team the Brometheus Team after its most important product. Alternatively, you could call it the Monitoring Team, since the solutions that it currently owns are in the monitoring space. You could accept the suggestion of a couple of team members who believe that names don’t matter and have pushed to be called the Jedi Order. Or, you could name the team after the broader problem space and call it the Operability Team.

The final option is superior to the others for several reasons:

  • Unlike the product and solution space options, it doesn’t constrain thinking and imply a particular solution. If you’re the Brometheus Team, your work will always revolve around improving that product (you have a hammer and everything will look like nails). Similarly, the Monitoring Team will always focus on building monitoring products. The Operability Team is free to decide that they can make it easier to operate services by working on auto-scaling, service call-tracing, or debuggability instead.
  • Unlike the code name option, it’s immediately obvious what your team does and what sandbox you play in. The Jedi Order may sound cute, and the team may rally around that identity in the short term, but at scale it becomes difficult for customers to keep a mental map of who is working on what. Further, the team’s new identity is rooted in being a team rather than in the problem they’re solving; the former tends to fizzle much more quickly.

Reorganizing or renaming teams can be hard work, particularly if the team’s current identity is tightly wound around its current name or organizational structure. But in the long run, doing the work to identify problem spaces, organize teams around them, and explicitly name teams after the problem space they’re tackling is worth it.

On Cryptocurrency, Blockchain, & Cloud Computing

With Bitcoin’s price soaring, I’ve found myself spending a lot of cycles explaining why I’m still bullish on cryptocurrency and blockchain. I’ve also found myself in a number of conversations where I’m trying to convince friends that most of the talking heads from the financial world fundamentally don’t understand what blockchain is about or how it will change the game. This post is my attempt to channel those discussions and dispel several popular myths that are currently making the rounds on the Twittersphere. Along the way, I’m also going to try to convince you that the blockchain not only has the potential to revolutionize the financial world, but is also poised to have a massive impact on cloud computing. Without further ado, let’s dive into some background on both computing and blockchain.

Blockchain as a Cloud Computer

Every computer can be broken down into two fundamental components: compute and storage. When you use your computer to multiply two numbers together, it loads (compute) values from memory or disk (storage) into registers (storage), adds (compute) the values together, and writes (compute) the result to another register (storage), from which it can then be copied (compute) to memory or disk (storage). From web surfing to spreadsheet-crunching to gaming, all computer applications boil down to computation that manipulates stored values in interesting ways.

Cloud computing is no different. Amazon Web Services, the leader in hosted public cloud, offers tens (if not hundreds) of unique services that can all be broken down into compute and storage that runs on Amazon’s massive infrastructure footprint. For example, AWS CodeBuild allows software developers to build and test their code in the cloud and then store the built artifacts in a data store like Amazon’s Simple Storage Service (S3). Like most of its services, Amazon bases the pricing for CodeBuild and S3 on the number of minutes of virtual machine time used (compute) and the number of terabytes stored per month (storage), because the company understands that most of its value-add can be decomposed into compute and storage. AWS made $4.6B in revenue in Q3 of 2017 alone, which represents massive year-over-year revenue growth of 42%, so the global market for cloud computing products is clearly vibrant and growing quickly.

The simplest way to think about a blockchain is as a big, distributed cloud computer that no single person, company, or government controls. Individuals called miners connect their computers to the blockchain so that they can be used to process transactions (compute) and write the results of those transactions to a digital ledger (storage). The mechanics of executing transactions depend on the blockchain implementation and are typically heavily rooted in cryptography, but in a proof-of-work system like Bitcoin, each mining node takes a block of transactions and hashes them (compute) together with the hash of the previous block and a value called a nonce to create a unique hash value that fits a set of constraints. Once a valid hash is mined, the new block is broadcast (compute) to other nodes, and the transactions in that block are executed (compute) and their results written to both lightweight and full nodes (storage) across the blockchain network.
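That hashing loop can be sketched in a few lines of Python. This is a deliberate simplification for illustration: real Bitcoin double-SHA-256 hashes a binary block header against a full 256-bit difficulty target, not a zero-prefix check on hex digits, and the function and argument names here are my own.

```python
import hashlib

def mine_block(transactions, prev_hash, difficulty=4):
    """Toy proof-of-work: find a nonce such that the block's hash
    starts with `difficulty` zero hex digits."""
    nonce = 0
    while True:
        payload = f"{prev_hash}{transactions}{nonce}".encode()
        digest = hashlib.sha256(payload).hexdigest()
        if digest.startswith("0" * difficulty):
            return nonce, digest  # a valid block, ready to broadcast
        nonce += 1

nonce, block_hash = mine_block("alice->bob:1.5", "0000abc", difficulty=4)
```

Each additional zero of difficulty multiplies the expected number of hash attempts by 16, which is why mining at real-world difficulty consumes so much compute.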

Mining is computationally expensive, so the blockchain is orders of magnitude less efficient than comparable distributed compute and storage solutions, but it’s crucial to note that efficiency is intentionally exchanged for a different property: no one entity owns the system, yet the system can still facilitate computational transactions that involve multiple untrusted parties without introducing a trusted intermediary. If you think about the number of transactions that we participate in on a daily basis where we incur some cost to engage a trusted intermediary, this is a big deal. Facebook can monetize your data in undesired ways, PayPal takes a healthy cut to move your money around, and the government can always compel Amazon to delete or hand over data that it’s storing on your behalf.

In order to incent miners to give their compute and storage to the blockchain, the creators of Bitcoin developed the concept of a digital coin, or cryptocurrency, that can be exchanged to run transactions on the blockchain. Anyone who wants to write to the blockchain ledger has to offer a small number of coins for their transaction to be processed. The cryptocurrency cost of executing a transaction on the blockchain is linked to the demand for running compute on the network and inversely linked to the computing power connected to the network. The cost in terms of a fiat currency like the US Dollar is also obviously linked to the going exchange rate between the fiat currency and the cryptocurrency; thus the cost in USD to run a transaction on the Bitcoin blockchain has increased steadily of late: in late Q3 of 2017, the cost of writing 200 bytes to the Bitcoin blockchain ledger within 30 minutes was roughly $3-4 USD worth of BTC.
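The arithmetic behind that dollar figure is straightforward. Bitcoin fees are bid in satoshis per byte, so the fiat cost of a transaction is just size times fee rate, converted through the exchange rate. The function name and the specific fee rate and price below are illustrative assumptions, not quoted market data:

```python
SATOSHIS_PER_BTC = 100_000_000  # 1 BTC = 10^8 satoshis

def tx_cost_usd(tx_size_bytes, fee_rate_sat_per_byte, btc_price_usd):
    """Fiat cost of a transaction: the BTC-denominated fee times the
    going BTC/USD exchange rate."""
    fee_btc = tx_size_bytes * fee_rate_sat_per_byte / SATOSHIS_PER_BTC
    return fee_btc * btc_price_usd

# A 200-byte transaction at ~150 sat/byte with BTC near $11,000
# lands squarely in the $3-4 range cited above.
cost = tx_cost_usd(200, 150, 11_000)  # 3.30
```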

Because of this need to move currency from one party to another, the original application baked into Bitcoin’s blockchain was the exchange of the Bitcoin cryptocurrency between parties. Newer blockchains have built on the Bitcoin foundation and embraced the idea of more generic Smart Contracts that allow arbitrary code to be executed directly on the blockchain. For example, the creators of the Ethereum blockchain have implemented a runtime environment on the blockchain for Smart Contracts written in a language called Solidity that is Turing complete, which means that in principle it can be used to solve any computational problem. The result is that blockchains like Ethereum look similar to a cloud computing service in that they allow for arbitrary distributed compute and storage, yet they also display the interesting property of being able to facilitate computation that involves multiple untrusted actors without a single trusted third party controlling the service. Efforts are underway to bolt this kind of behavior onto the Bitcoin blockchain via sidechains like Rootstock.

That was a lot to grok, but it’s really impossible to critique the current commentary on blockchain and cryptocurrency without at least a high-level understanding of how the pieces fit together. So with that all in mind, let’s dive into a few recent criticisms from high-profile individuals in the financial sector about both Bitcoin specifically and blockchain and cryptocurrencies in general.

“Bitcoin doesn’t have any intrinsic value…”

One extremely common recent narrative from people in the financial world is that coins like Bitcoin don’t have any intrinsic value. Just a few days ago, Nobel Prize-winning economist Joseph Stiglitz said that “Bitcoin is successful only because of its potential for circumvention, lack of oversight. It doesn’t serve any socially useful function.” JPMorgan Chase CEO Jamie Dimon has claimed that “the only value of Bitcoin is what the other guy will pay for it.”

If you made it through my quick primer above, then you already understand why Stiglitz and Dimon are incorrect. Coins like Bitcoin and Ether can be exchanged for compute and storage on a massive supercomputer with some very compelling properties that allow you to do things like executing transactions between untrusted parties without a trusted intermediary. If you believe that an increasing number of applications will be written and deployed on blockchain to disintermediate our lives (and if those currencies are architected in a way that limits velocity), then the value of these currencies will inherently go up as the demand for compute and storage on blockchain increases.

It’s worth pausing here for a moment to mention that there are a few very different kinds of cryptocurrencies floating around today. The first kind, typically called a utility token, has the intrinsic property of being exchangeable for some kind of service. Bitcoin and Ether are both utility tokens because they have the intrinsic property (coded directly into the token and blockchain) of being exchangeable for compute and storage. The second kind of token is often called a tokenized security because it functions more like a traditional security that just happens to be exchangeable on a blockchain. A tokenized security has no intrinsic value but may have value ascribed to it by extrinsic means. For example, a legal contract may promise a share of the future revenue streams of a corporation pro rata to the holders of a specific kind of token. I suspect that people like Stiglitz and Dimon are completely missing the power of blockchain as a cloud computer, so they’re mistaking utility tokens for tokenized securities that are linked to very little value.

“I’m excited about blockchain, but not Bitcoin…”

A second popular thread is that Bitcoin and similar technologies are interesting, but not in their current implementation. One flavor of this attack is that blockchain technology is compelling but cryptocurrencies are not. Another related flavor is that the existing decentralized blockchain implementations will be replaced by blockchain implementations that are controlled by governments and corporations. World Bank President Jim Yong Kim noted that “blockchain technology is something that everyone is excited about, but we have to remember that Bitcoin is one of the very few instances.” He went on to emphasize that the importance of blockchain is the speed with which it can facilitate transactions, drawing parallels to Alibaba’s infrastructure that can facilitate large transactions in seconds. Former Federal Reserve chair Ben Bernanke espoused a similar view when he talked about how Bitcoin would fail but blockchain was interesting and would help federal banks improve their existing payment systems.

Again, this line of thinking is flawed. As noted above, blockchain is an intentionally inefficient system because it trades efficiency for decentralization. It’s hard to see why a bank or government would want to implement an inefficient technology if decentralization isn’t a desired property. Banks already run on digital systems that can facilitate transactions between people, so what exactly is blockchain bringing to the table? Further, that decentralization can only be maintained if coins are woven directly into the fabric of the blockchain to compensate miners, so the idea of implementing a blockchain without a linked cryptocurrency doesn’t make a lot of sense.

“Bitcoin is an unreliable store of value…”

Another part of the conversation about the value of Bitcoin is centered on whether it will prove to be an effective store of value. A store of value is a mechanism that allows people to facilitate the exchange of wealth and to preserve wealth across both physical space and time. To accomplish this, the medium of storage must exhibit a few properties: it must be liquid, it must be scarce, it must possess longevity, and people must be willing to assign a value to it. Historically, stores of value have included things like precious metals, gemstones, livestock, real estate, and fiat currencies.

Some recent challenges to the validity of cryptocurrencies like Bitcoin as a store of value have focused on whether the currencies will continue to exhibit those required properties of a store of value. In the article that was linked above, Jamie Dimon states that “governments are going to crush Bitcoin one day. Governments like to know where the money is, who has it and what you’re doing with it.” In essence, his comments are an attack on the longevity of Bitcoin as well as its liquidity in markets as they become more regulated. Economist and author Raoul Pal claimed that Bitcoin was an unreliable store of value because the group of developers that controls the underlying codebase can change the code: “Even if they don’t change the formula, the fact that they could? That’s enough to say it’s not a long-term store of value.” Pal’s statements cast doubt on Bitcoin’s scarcity (since software engineers could “print more money”) and on whether people should trust the people at the helm enough to assign value to the currency.

The reality is that any government that bans cryptocurrency will miss out on the next great wave of innovation. Where would the US be today if the government had banned all HTTPS traffic because it disrupted the way that intelligence was previously collected? The risk of rogue updates to the codebase is slightly more real, but it’s important to note that there are three groups of actors in each blockchain ecosystem that act as a set of checks and balances against each other: developers, miners, and cryptocurrency holders. The split between Ethereum and Ethereum Classic is a real-life case study in what happens when those groups move in different directions, and it will forever be a warning to the development communities for other blockchains.

Where To From Here?

None of this means that Bitcoin and other cryptocurrencies are destined to continue their meteoric rise. Blockchains have real challenges as they try to scale; when an app like CryptoKitties pushes your network to its limits, you have work to do. Cryptocurrency exchanges are still a major vulnerability of the system and market manipulation is possible at current volumes. For example, it’s widely speculated that the current price of BTC is being propped up by the fraudulent issue of Tether and that if USDT and Bitfinex implode, they will bring all cryptocurrencies along for the ride. All of these risks are real.

But as both a Software Engineer and a VC, I can tell you that I see a lot of companies making big bets on blockchain and using it as the Operating System for applications that were previously impossible to build and will change our lives. Those apps aren’t in production or operating at scale yet, so the analogies between the current environment and the dotcom bubble are reasonable: there may be a crash that is followed by a long period where apps are deployed, adoption grows, and the ecosystem justifies the valuation. Or, maybe, the current lofty valuation on cryptocurrencies is correct for a technology that has the potential to disrupt both the financial sector and cloud computing and near-term growth will continue.

When Clay Christensen introduced the concept of “disruptive innovation” in The Innovator’s Dilemma, he explained that incumbents can’t pursue disruptive innovation when it first arises because the opportunities aren’t profitable enough and the development of disruptive innovation would take scarce resources away from sustaining innovation that is required to keep up with the competition. As the disruptive innovation matures, it begins to capture share up-market, and the incumbent can’t react quickly enough.

Jamie Dimon claims that blockchain isn’t worth his attention because JPMorgan Chase moves $6T in money around the world every day while the daily trading volume of all cryptocurrencies is around $10B. Ironically, with a total market cap of roughly $370B, the basket of all of the cryptocurrencies in the world is now more valuable than JPMorgan Chase. Are major industries going to be disrupted in the next decade? Time will tell, but I’m betting on crypto.

On Productivity

Like you, I’m busy! I have a family that includes a 3- and 4-year-old, a fun (but demanding) job at Riot, and a seat in the Berkeley-Haas MBA for Executives program. I volunteer as both a mentor at Techstars and a director at a non-profit organization called Potential Energy, and I have a bunch of other hobbies that range from hacking on side projects to learning German to training for a marathon. I list those commitments purely in an attempt to convince you that my survival hinges on my ability to squeeze every minute out of every day. I get asked about time management and my personal productivity system fairly often, so I’m taking the time to describe the way that I operate here in a blog post for three reasons: in case others can learn from my system, in hopes that I can get feedback and improve my system based on what’s working for others, and to give myself a place to point people in the future instead of offering a slightly half-assed verbal brain dump on the spot.

If you just want a tl;dr of the high-level best practices that I’ve based my own system on, they are:

  • Get your life’s backlog into a tool. Groom regularly, and set up short (daily?) sprints.
  • Keep yourself off of the endless email treadmill.
  • Track and update your personal growth in a lightweight artifact that you look at on a regular basis.
  • Be extremely tactical about the setup of your workstation when you go heads down.
  • Exercise and eat healthy, so your body isn’t working against you.

With that said, let’s dig into the deets.

I keep my whole life in Evernote, following a sort of Scrum-like system that was inspired by The Secret Weapon and has been customized over the years. Keeping everything in one place gives me the freedom to focus on particular tasks and sleep well at night while knowing that I’m not dropping anything on the floor. I have three primary notebooks in Evernote: Action, Done, and Reference. I have email forwarding set up so that I can forward email to Evernote from both my personal and work email, and I have Evernote’s web clipper add-on installed in Chrome for capturing web pages. I use a tagging system that includes tags for the place that the item pertains to (@home, @riot, @haas, @techstars, @potentialenergy, etc.), the priority of the item (0-Critical, 1-Urgent, 2-Important, 3-Average, 4-Trivial, 5-Irrelevant), and a special tag for when I will tackle the item (Today, Soon, Daily, Recurring). Everything that first comes into Evernote goes into my full backlog, which I can view with a saved search for everything in Action. Once per day, I pull items from my full backlog into my daily backlog by adding the Today tag; I can also view this backlog with a saved search.
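To make the moving pieces concrete, here’s a minimal sketch of that tag-driven flow in Python. The item titles and helper names are hypothetical; the real system is nothing more than Evernote tags plus saved searches:

```python
from dataclasses import dataclass, field

@dataclass
class Item:
    title: str
    tags: set = field(default_factory=set)

def saved_search(items, *required_tags):
    """Mimics an Evernote saved search: return items carrying every tag."""
    return [i for i in items if set(required_tags) <= i.tags]

action = [
    Item("Prep Techstars mentoring session", {"@techstars", "2-Important"}),
    Item("Plan weekend trip", {"@home", "3-Average"}),
    Item("Draft team roadmap", {"@riot", "1-Urgent"}),
]

# Daily grooming: promote an item into the daily sprint by tagging it.
action[2].tags.add("Today")

full_backlog = saved_search(action)            # everything in Action
daily_backlog = saved_search(action, "Today")  # today's sprint
```

The design point is that there is exactly one capture point (the Action notebook) and all of the views are derived from tags, so nothing can fall on the floor between lists.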

I only check my work email once per day, at the end of the workday. I assume that if anything high priority comes up I can be tracked down in real-time via either IM or in-person communication (and thus far, this has only bitten me once or twice). Closing the email window during the day has been a massive win for me both in terms of time savings and a reduction in context switching, although as a brief aside I’m concerned that the migration to Slack for a lot of communication may make this kind of firewalling more challenging moving forward. When I clear email, I make a bottom-to-top sweep of everything in my inbox, and for each mail I take one of a few actions: I mark it read if it requires no action, I reply to it or take the necessary action if it will take me less than a few minutes, or I forward it to Evernote so that it shows up in my full backlog.

At the end of each evening before I read and call it a night, I go through my full backlog in Evernote and add tags (including a priority, and in some cases the Soon or Today tag) to all the new items that have come in. I then take a few minutes to pull the necessary tasks into my daily backlog for the following day (starting with tasks that had the Soon label), and I look at my calendar to be sure that I can accomplish everything in my daily backlog and that I’ve identified the right set of priorities.

I also keep a Personal Development Plan (PDP) in Google Docs, using a lightweight format called the Agile PDP format that was created by a fellow Rioter named Andre Ben-Hamou. The format probably warrants a post of its own, but it essentially involves identifying three to five high-level focus areas along with a list of backlog items that relate to those focus areas. The difference between the backlog items on my PDP and those that make it into my full backlog from other sources is that these items are intentionally growth focused. Once a week I do a deeper sweep of my entire full backlog to be sure that everything is still relevant and prioritized correctly. After that I review my PDP, add new items as appropriate, and pull items into my full backlog in Evernote as appropriate.

During the day I take great pains to be sure that my work environment is set up correctly. I start by making sure that my desk is clutter-free, and that I have both a snack and some water. I close everything except for Google Calendar (so I know what meetings are coming up) and Evernote, and I disable most notifications on my phone. My desk is in an open team environment, so I usually have headphones on playing music from Digitally Imported along with some very light white noise from A Soft Murmur to help drown out background conversations in the area.

To keep my energy level up throughout the day and help me focus, I make sure that my mind and body are both healthy and in a good place. I spend 10–15 minutes practicing mindfulness through meditation in the afternoon, typically following a guided meditation from Calm. I’ve created a personalized DIY Soylent recipe that I use to augment/replace my meals and ensure that I’m getting the nutrients that I need. I also exercise three or four times per week, typically either cycling to work or going for a jog in the morning or the evening.

There are a lot of other nuances to the way that I work, but the above is a fairly comprehensive bird’s-eye view of the artifacts and rituals that I’ve developed to keep me sane and productive. I think effective personal work systems are highly specific to people and environments, so I’m certainly not advocating that you should adopt my system out of the box, but I hope that you will find pieces of my routine helpful. Please take advantage of the comments to let me know what’s working for you in your own routine so that I can consider adopting it as well. Thanks!

Death by a Thousand Chickens

Almost every candidate that I have interviewed at Riot Games asks something along the lines of, “Everything about Riot sounds awesome, but it can’t all be roses – what’s the most challenging thing about working here?” I generally start by assuring them that Riot is, in fact, a particularly amazing place to work for a million different reasons. A few examples from my personal experience here: we invest deeply in growing people, we all care a ton about the products that we’re building and the focus on the people who play our games is palpable, and we provide unlimited opportunity to get plugged in and drive real change on everything from product decisions to the way that teams and organizations operate. But the people asking the question are correct – it’s not all roses, and we also want to be transparent about our areas for improvement. That’s why I always describe an area where I think we currently have a lot of room for improvement: we’re often the victim of “Death by a Thousand Chickens.”

Before I define Death by a Thousand Chickens (which I will hereafter abbreviate as DB1kC, because if there’s one thing engineers enjoy, it’s creating acronyms!) I want to emphasize that it isn’t specific to Riot. It’s a phenomenon that can occur in any organization but is more likely to occur within flat organizations where individuals at all levels are empowered and appeals to hierarchy don’t carry weight. I’m using a specific example from my experience at Riot to help explain the concept, but I think the ideas here are broadly applicable, and I would be thrilled if this post can ignite a discussion on how to prevent DB1kC without reverting to a top-down chain of command.

DB1kC occurs when someone is trying to make a decision at an organization and a large number of well-intentioned people want to be consulted and provide input on the decision, but very few want to be responsible for helping to execute on it. To steal a couple of terms from the scrum world (where, as in the old fable about a bacon-and-egg breakfast, the pig is committed while the chicken is merely involved), it’s trying to operate with a multitude of chickens and precious few pigs, while the chickens keep running in and tripping up the pigs.

To ground this in a real-world example, let’s travel back in time a couple years to when I first joined Riot. My experience interviewing for a position at the company was decent, but having seen the way that we were interviewing engineers from both the inside and out I was convinced that we could do better. The process took a long time, the content and structure of interviews differed (sometimes drastically) between different organizations and interviewers, and it was difficult to mine the data necessary to make data-informed improvements to the process. I took up the mantle and issued a “call to arms” email to all engineering managers and recruiters and asked people to join a virtual team that would meet on a relatively infrequent basis and serve as a central coordination point for work to improve our interview process. I got a lot of responses to the email from managers who wanted to give input into the work that the group was doing, but most people said that they didn’t have the bandwidth to join in the work themselves.

About 10 people did opt into the group and commit to doing work, and over the course of the next 6–12 months we were able to make a bunch of targeted improvements to the interview process. We rolled out a fresh set of interview “kits” to push us towards more standardized and structured situational and behavioral interview questions, we partnered with central recruiting to help launch a survey to collect metrics on the candidate experience, and we made other changes to the process that reduced candidate turnaround time significantly.

The problem is that as we attempted to deploy each of those improvements, we had to do battle with DB1kC. Some folks who didn’t want to commit their time to work with our v-team told us that we had moved the needle forward, but not in the exact way that they wanted things to go. Others spun up parallel efforts to implement targeted improvements in a particular way for their specific team without engaging with us and attempting to generalize what they were doing so that it could be applied to other teams at Riot. All of these chickens were trying to help, but they weren’t on the sidelines *or* on the playing field. They were essentially standing with one foot on either side of the boundary line, and in the process, they were slowing forward progress (and in some cases bringing it to a screeching halt).

Contrast this with the way that I could have attempted to push changes to the hiring process through for my organization at either Microsoft or Amazon. Once I had documented the proposed changes to the interview process, I could have pitched them up my management chain and gotten buy-in from my VP, who would have turned around and told his entire organization “this is how we’re interviewing from now on.” Note that I’m not saying that escalation is the only tool at those companies, but I am saying that it’s a viable tool that exists and gets used to preempt the possibility of DB1kC. The advantage of driving change socially at a flat organization is that at the end of the day, the idea is more refined (by virtue of more input/iterations) and everyone is aligned; you get 100 percent buy-in. The advantage of leveraging hierarchy, on the other hand, is speed. DB1kC is the extreme example where trying to roll out a change socially results in deadlock and speed essentially goes to zero.

Now despite the risk of DB1kC, I firmly believe that the pros of flat organizations that don’t allow for appeals to hierarchy far outweigh the cons. I am, however, passionately interested in mechanisms that can preserve the benefits of a flat organization without sacrificing the ability to move quickly. I suspect that any successful strategy probably hinges on being crystal clear on who is in the game and who is on the sidelines: if you’re a chicken, assume that the pigs are doing good work and don’t get in the game unless you’re asked to do so. From the outside, companies like Apple and Asana appear to be making the pig/chicken distinction explicit by nominating a single Designated Responsible Individual (DRI) to be accountable for a decision, letting the DRI choose who should be looped in, and then trusting that they probably got it “right enough.” I’m waiting for a good opportunity to pilot the DRI concept at Riot, but I haven’t done so yet.

What say you? Are there other ways to solve the dreaded Death by a Thousand Chickens problem? If so, I would love to hear suggestions in the comments! And before I go, a quick shout out to my colleague Michael Chang who first coined the phrase “Death by a Thousand Chickens” – had he copyrighted it, I would owe him at least a dollar by now…

How Video Games Made Me A Better Software Engineer (& Dad!)

About six months ago I left an amazing job at Amazon for a very different, yet equally amazing job at Riot Games. I won’t bore you with the laundry list of factors that went into my decision, but I will confess that one of the many factors was my life-long love of video games. I’m a bit quirky in the way that I play video games because I can’t play a game casually. When I pick up a game, I play purely to master the game and to challenge myself (and possibly my team, depending on the game) to see how good I/we can be. As crazy as it sounds, that constant quest for mastery has taught me a valuable lesson that not only has made me better at my job as an engineering manager, but has helped me to grow in other areas of my life.

Before I hit you with the punch line, let me give you some quick background to help set the stage. As you may or may not know, Riot Games produces a very popular game called League of Legends that pits two teams of five players against each other in a ~20–60 minute battle to destroy the other team’s Nexus before they destroy yours. League of Legends is one of those games that is relatively quick to learn but takes a lifetime to master because of the complexity of its gameplay. Any player can try to work his or her way up the game’s Elo rating system, which is broken down into several divisions: bronze, silver, gold, platinum, diamond, master, and challenger.

When I first started trying to climb the Elo ladder, I was able to work my way from bronze to silver by just grinding out a bunch of games. As I played, I was building my “mechanics” and learning basic concepts that allowed me to improve fairly quickly. As I kept playing, however, my progress stalled out before I was able to hit gold. That’s when it struck me that if I wanted to keep improving, I would have to actively start doing things to get better. I wasn’t going to get there by just putting my time in, playing game after game, and making the same mistakes over and over again.

That same concept of mastery applies to almost every area of our lives. When I landed my first job as a software developer, I had so much to learn that I could build my development chops by simply doing my job. At some point that ceased to be true and I had to start doing very intentional things to continue to improve. Sometimes that meant seeking out seasoned veterans for some pair programming, and other times it meant changing teams to work in a new domain or with a new set of tools.

IMHO, the hardest part of improving at something is 1) identifying when we’ve hit our natural plateau and we’re just grinding it out without getting better, 2) deciding that we actually want to invest the immense time and energy needed to get better, and then 3) taking some time where we are very intentionally in the “stretch zone” and practicing for mastery. This season I’ve advanced to platinum in League of Legends, and the only way I was able to accomplish that goal was by setting aside a chunk of play time every week where I wrote down a specific goal (which could be something like “die 3 or less times”, or “kill 85 minions by the 10-minute mark”), focused on achieving that goal while I played, and sometimes watched replays of my games to find mistakes and figure out what goals I should set in the future. As an engineering manager, I put myself in the stretch zone by keeping my personal development plan (PDP) relevant and up-to-date, spending quality time learning from mentors each week, reading books and blogs that are written by other managers that I respect, and continually collecting feedback from the folks that I’m managing on how I can be more effective and using that feedback to drive new goals into my PDP. As a husband and a dad, I get in the stretch zone by sitting down with my wife every Sunday evening and talking through how things are going at home and using those discussions to pick a few things to focus on for the week.

There are a lot of other areas in my life where I’m intentionally not putting in the effort to get in the stretch zone and improve, and I’m fine with that because I only have a finite amount of time and focus. I love playing golf and would like to be a better golfer, but right now I’m just hitting the course occasionally and playing for fun. I suspect very few people have the discipline and the mental focus to context switch and really improve at more than about three things at a time.

I leave you with this challenge: Identify one thing that you want to get better at, and come up with a plan to get into the stretch zone at least a few times a week for the next month. Then leave me a comment below and let me know how your experience went. And the next time someone tells you to quit playing video games and do something productive, tell them that you’re learning valuable lessons that apply to the rest of your life.

Delight Customers By Not Giving Them What They Want

It sounds odd, doesn’t it? I’ve spent the last 2 years at Amazon, where we preach customer obsession day in and day out. It’s one of the things that makes me love my job. Yet I’m firmly convinced that in some very few cases, the best thing to do for a customer is to not give them exactly what they are asking for.

Consider a recent discussion that happened at work during a meeting. To set the stage, you have to understand that Amazon has made a massive bet on continuous deployment over the last several years. John Jenkins, a long-time Amazonian before joining Pinterest, gave some impressive stats on how frequently Amazon deploys code to production during his talk at Velocity in 2011. In the two years since that talk the company has doubled down on its investment in shipping code to production continuously, and while I can’t disclose specific details, the numbers today would blow the 2011 numbers out of the water. We have an internal saying that “bake time is for cookies”, and the normal mode of operation for most teams is that changes flow to production within minutes or hours of when they are checked in. Of course, deploying continuously comes with a cost. First, you have to have tooling in place that can drive the correct integration and deployment workflows and respond to failures gracefully. Second, applications have to be designed and implemented a certain way: interfaces must be hardened to allow independent deployment of components, adequate automated tests must be written, etc. In practice this state can be difficult to achieve when you’re starting from a messy, legacy code base.

In this particular instance a team was trying to move to continuous deployment, but they wanted to take baby steps to get there. They were asking us to build legacy features into our shiny new continuous deployment tooling that would allow them to use some of the tools while still running a bunch of manual tests, batching up changes, and deploying infrequently. My boss has an amazing knack for analogies, and he described the situation this way:

“We’re trying to sell modern medicine. We’ve built a beautiful new hospital that people are excited about. But some people can’t fully justify the cost of buying in, and now they’re starting to come to us and ask us to put rooms in the new hospital for bloodletting.”

There were several problems with building the features that this team was requesting. First, it would let the team avoid the massive leap that was necessary to change the way that they develop and deploy software. In the world of software engineering, states that are viewed as stepping stones often become much more permanent. Worse yet, it would allow other teams to leverage the features and avoid the leap as well. And finally, the work required to build the features would keep our team, which built the continuous deployment tooling, from building cool new features for the people who were actually doing continuous deployment.

Don’t get me wrong, this kind of circumstance is the exception and not the rule. If you think that your users are asking for the wrong thing you should first doubt and question yourself, then question yourself again. But occasionally the impulse is correct, and the way to delight a customer is to not give them what they want. They won’t appreciate it in the short term, but if you’re confident that you’re making the correct call they may thank you down the road.

The Final Nail In The Windows Coffin

I generally boot over to Windows for one of two reasons: to play games, or to use Office. The rest of my time is happily spent in Ubuntu. I’ve been under the impression that people generally use Windows because it’s more “polished”. My mother is never going to be able to hack away at the command line or understand the dark magics of device drivers, so she needs the neat and tidy packaging that Microsoft offers. Tonight I decided to upgrade from Windows 7 to 8, and it was the worst experience possible. My motivation was that my Windows 7 installation had developed a weird tendency to BSOD (for seemingly random reasons, after some debugging) with the dreaded “Page Fault in Nonpaged Area” message, so I figured I would try a clean OS install and thought I would upgrade in the process to see what Windows 8 is all about.

I started by downloading the Microsoft Windows Update utility, as recommended. I went through the steps and was told that I had two purchase options: Windows 8, or Windows 8 Pro. The former was $120, so I spent a while poking around looking for a way to select that I wanted the less expensive “Upgrade” version. I couldn’t figure it out, so I eventually caved and bought the full meal deal. I’m a firm believer in clean installation for Operating Systems based on some anecdotal past experiences, so I downloaded the ISO and burned a DVD. A few minutes later I was booted into the installation utility and was ready to install.

That’s when I hit my first speed bump. When I selected the appropriate disk, Windows told me that it couldn’t create a new partition or locate an existing one, and that I should check the setup log files for more info. I had everything important backed up to Dropbox, so I tried deleting the partition, formatting it, and every other option available to me. I rebooted and went through the process again with the same result. Before hunting down where the “setup log files” were, I hit Google on my cell phone, stumbled on this article, and tried the command line partitioning utilities that were suggested. I rebooted again, and still no dice. After a lot of tinkering, I ended up having to unplug my other drives, including the one where Linux was installed, and reboot the computer, and then things magically worked.

I hadn’t ever messed with Windows 8, so I was surprised to be greeted by no Start button and no immediately obvious way to launch applications. I was told that I needed to activate Windows, and was asked to re-enter the product key that I had already entered a million times while trying to get the installation working (fortunately, by this point I had it practically memorized). When I tried to activate I got an error message telling me that my product key could only be used to upgrade Windows, despite the fact that I had been using Windows 7 just an hour prior, was under the impression that I had bought a full non-upgrade version of Windows 8, and didn’t see any clear warnings to this effect during the purchase process. I went back to Google, poked around for a while, and found this suggestion on hacking the registry to make activation work anyway, which seemed to do the trick.

Next I tried to change the wallpaper from the ugly default flower, and that silently failed with no obvious error message. I was able to click on other images, but all I saw was weird flashing behavior at the edges of the window, and the background didn’t change. Again per Google, it sounds like I may need to wait until a while after activation to change my wallpaper, which is just bizarre.

I started downloading apps, and when I hit the Skype site they sent me over to the Windows App Store to download it. Inconveniently, there was no clear visual indication of how to get back to my desktop from the Metro-style UI. I started poking around with Metro and was annoyed at how poorly its visual metaphor seemed to map to the mouse and keyboard, so I searched around for a way to permanently disable Metro for desktop users. Unfortunately that seems to require downloading (or in most cases purchasing) a separate application, which seems absurd.

The icing on the cake was that on my next reboot, I again hit the new and slightly less ugly BSOD with the same error that I was getting before. Both the Windows and Linux memory and disk analysis tools seem to suggest that all is fine on the hardware front, and I have yet to have any issues with Ubuntu which is running on the same machine down to the disk. I guess I’m back to trying to troubleshoot that issue later tonight.

After multiple hours of just trying to get things up and running, I’m trying to picture my mom buying the latest version of Windows because of “ease of use” and having to run disk partitioning utilities from the command line and edit registry keys. Clearly that ain’t happening. I’m also flashing back to how seamless and straightforward installing Ubuntu was last time around. If my experience isn’t atypical, then I think the final nail has been driven into the Windows coffin. That may sound like a sensational claim, but Windows has already lost the battle for mobile to Android (and to a lesser extent these days, iOS), and more and more of computing is moving away from the desktop. At some point the individuals and companies that still use desktops for niche activities aren’t going to be willing to pay $120 for a product that is inferior to something they can get for free, particularly if they have to retrain their habits anyway because the familiar UI conventions are broken in every option.

I’m excited that Steam is out for Linux because it feels like that may start a movement for PC games to ship on non-Windows operating systems. Now if only I can get Office working with Wine, I will never have to boot over to Windows again…

String Alignment, Dynamic Programming, & DNA

Most modern runtime environments provide an implementation of some basic functionality for working with strings. We’re used to leaning on it for simple tasks like splicing strings together or checking whether one string contains another, but what if we wanted to do something more complex and determine whether two strings have a shared etymology or are likely to have a similar meaning? The problem is interesting for a few reasons. First, there are some very interesting applications for predicting how closely related strings are. In terms of spoken and written language, being able to glean information about the evolutionary origin of words can serve as one building block for more complex problems like natural language processing and translation. There are also very important uses for this kind of algorithm in related fields like DNA sequencing, and even in comparing the source code of programs. Second, the algorithms at the core of the solution provide a neat example of Dynamic Programming, which is a handy tool for any developer to have in their toolbox.

To get started, let’s consider a simple real world example. Suppose that we need to programmatically decide whether the English word “dictionary” and the Spanish word “diccionario” are related in order to draw some conclusions about the meaning of each word in the absence of a… well, dictionary. We begin by noting that in practice, words tend to evolve over time through the insertion, deletion, and mutation of characters. To capitalize on this observation, we need to find the most likely alignment between the two words by inserting gaps to make similar characters match up. The first step in finding that alignment is to establish a scoring system between characters in each word. The scoring system should be based on data, and should reward similarity while penalizing differences between the strings. For our example, let’s start with a very simple scoring system where letters that match exactly are worth +2, vowels that don’t match are still worth +1 (and we’ll include the character ‘y’ with the vowels), and letters that are mismatched (or matched with gaps) are worth -3. This scoring system takes advantage of the somewhat anecdotal observation that vowels tend to turn into other vowels more frequently than other mutations occur.
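
In code, the scoring system above might be sketched like this (a toy helper with names of my own choosing; the -3 gap penalty is applied during alignment rather than here):

```python
VOWELS = set("aeiouy")  # we count 'y' as a vowel, per the rule above

def score(a, b):
    """Score a pair of aligned characters: +2 exact match, +1 vowel/vowel, -3 otherwise."""
    if a == b:
        return 2
    if a in VOWELS and b in VOWELS:
        return 1
    return -3
```

So score('c', 'c') yields +2, score('y', 'o') yields +1, and score('t', 'c') yields -3.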

Given these words and the scoring system, our brains can actually scan the two words rather intuitively and posit a valid alignment. We can represent the alignment by showing the two words with dashes inserted for gaps, and a third row where letters indicate a direct match and plus signs indicate a non-match that was still assigned a positive score:

DICTIONAR-Y
DICCIONARIO
DIC IONAR +
Our alignment has 8 letters that match perfectly and a score of 11, and at a glance it seems like a very strong match. It also happens to be a global alignment, which I’ll touch on later. The gap in “dictionary” is also somewhat arbitrary in that the score would be identical if the gap followed the ‘y’ instead of preceding it, which means that in this case there are multiple alignments that are equally valid.

So we’ve produced an alignment, but how could we automate the alignment-finding process? To take things a step further, how can we calculate the probability that the alignment is meaningful and not just coincidence? Let’s deal with the former problem first. At first blush, it looks like we could just implement a recursive top-down algorithm that finds the optimal alignment by drilling down to individual characters and then building back up by adding together the maximum-scoring alignments of each possible combination of substrings and gaps. If we want to align “dictionary”, we could first find the possible alignments and scores of ‘d’ and ‘i’, then find the alignments and scores for “di” by picking from the individual character alignments and gaps, and so on. The problem with this approach is that we would end up considering each character in the first string with each character in the second string, and then we would also consider each combination of characters in the first string with each combination in the second string. The result is a less than desirable runtime complexity of O(n^2m) for strings of length n and m, respectively. Unfortunately that won’t scale well to bigger strings (like DNA, which we’ll touch on later) when we have to run a large number of alignments to try to calculate the probability that an alignment is significant (which we’ll also touch on later).

An alternative is to use Dynamic Programming (DP): breaking the problem up into smaller sub-problems and solving from the bottom up. The beauty of DP is that we can optimize so that we don’t have to redo some of the simple calculations that get repeated with recursion, and we can also take advantage of nuances of the problem in the way that we setup the algorithm. For string alignment the particular nuance that we can capitalize on is the fact that subsequent characters of the string must occur in order, so at each character we can only either insert the next character or insert a gap in one string or the other.

I’ll pause briefly before explaining the DP algorithm to clarify that there are two flavors of string alignment. Global alignment requires that we use each string in its entirety. Local alignment requires that we find only the best-aligned substrings of the two strings. The most widely used global alignment algorithm is called Needleman-Wunsch, while the local equivalent is an algorithm called Smith-Waterman. Both algorithms are examples of DP, and both begin by building up a two dimensional matrix of integers with dimensions of the size of each respective string plus one (because we can start each aligned string with either a character or a gap). I’m going to use the local alignment algorithm in my example, but global alignment is very similar to implement in practice.

The first step in finding a local alignment is building up the matrix of integers, where each location in the matrix represents a score as we traverse the alignment. We start by initializing the values along the top and left of the matrix to zero, since a local alignment can start at any point in the string. If we were finding a global alignment we would have initialized the edges so that each subsequent location imposed the mismatch or gap penalty, because we would need to include the entire string in the result. Next we start with the top left corner that remains and work either right or down (it doesn’t matter, as long as we stay consistent), and at each location we enter the maximum of four values: the value to the left plus the score of the top character and a gap, the value to the top plus the score of the left character and a gap, the value up and left plus the score of the characters to the top and left, or zero. If that explanation isn’t clear or the intuition isn’t there, consider what each decision implies with respect to the alignment that we’re scoring. The first two options imply inserting a gap in one of the strings, the third implies aligning two characters, and the fourth implies ending a local alignment. For our contrived example and scoring system, the matrix looks like the following:

      –   D   I   C   C   I   O   N   A   R   I   O
  –   0   0   0   0   0   0   0   0   0   0   0   0
  D   0   2   0   0   0   0   0   0   0   0   0   0
  I   0   0   4   1   0   2   0   0   0   0   2   0
  C   0   0   1   6   3   0   0   0   0   0   0   0
  T   0   0   0   3   3   0   0   0   0   0   0   0
  I   0   0   2   0   0   5   2   0   0   0   2   0
  O   0   0   0   0   0   2   7   4   1   0   0   4
  N   0   0   0   0   0   0   4   9   6   3   0   1
  A   0   0   0   0   0   0   1   6  11   8   5   2
  R   0   0   0   0   0   0   0   3   8  13  10   7
  Y   0   0   0   0   0   0   0   0   5  10  10   7

Apologies if my somewhat crude tables don’t format well on your device, I was lazy and copied/pasted them from program output without applying a lot of formatting. You can see that building up the matrix takes O(nm) time for strings with length n and m, so we’ve reduced our runtime complexity by an order of magnitude.
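
To make the recurrence concrete, here is a minimal Python sketch of the matrix-building step (function and variable names are my own; it assumes the +2/+1/-3 scoring described earlier, and applying the vowel bonus uniformly can nudge a few cell values relative to a hand-computed table):

```python
VOWELS = set("aeiouy")

def score(a, b):
    """+2 for an exact match, +1 for two differing vowels, -3 otherwise."""
    if a == b:
        return 2
    if a in VOWELS and b in VOWELS:
        return 1
    return -3

def build_matrix(s, t, gap=-3):
    """Build the local alignment (Smith-Waterman) score matrix for s and t.

    H has len(s)+1 rows and len(t)+1 columns; the extra row and column of
    zeros let a local alignment start anywhere.
    """
    H = [[0] * (len(t) + 1) for _ in range(len(s) + 1)]
    for i in range(1, len(s) + 1):
        for j in range(1, len(t) + 1):
            H[i][j] = max(
                0,                                            # end a local alignment
                H[i - 1][j - 1] + score(s[i - 1], t[j - 1]),  # align two characters
                H[i - 1][j] + gap,                            # gap in t
                H[i][j - 1] + gap,                            # gap in s
            )
    return H
```

Each inner max mirrors the four choices described above, and the maximum cell of the finished matrix is the local alignment score.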

The next step is retrieving the alignment from the matrix. We start with the location with the maximum value, which we can store as we build up the matrix. From that location we trace backwards and figure out which action resulted in that value by considering the locations to the top, left, and top left of the current location and calculating which value was a valid part of the calculation for the current value. In some cases there will be multiple valid alignments, as indicated by either multiple starting locations with identical scores or multiple valid locations that could have contributed to the calculation for the current location. In that case it’s important to consider the context and decide whether it makes sense to return multiple alignments, pick a single alignment arbitrarily, or create a criterion to pick between alignments (for example, penalizing longer alignments). Since we’re finding a local alignment, we trace until we reach a location where the score is equal to or less than zero. If we were doing a global alignment we would use the same logic, but we would start in the bottom right location and go until we reach the top left location. The matrix below shows our traceback with the path that we’ve selected in bold:

      –   D   I   C   C   I   O   N   A   R   I   O
  –   0   0   0   0   0   0   0   0   0   0   0   0
  D   0  *2*  0   0   0   0   0   0   0   0   0   0
  I   0   0  *4*  1   0   2   0   0   0   0   2   0
  C   0   0   1  *6*  3   0   0   0   0   0   0   0
  T   0   0   0   3  *3*  0   0   0   0   0   0   0
  I   0   0   2   0   0  *5*  2   0   0   0   2   0
  O   0   0   0   0   0   2  *7*  4   1   0   0   4
  N   0   0   0   0   0   0   4  *9*  6   3   0   1
  A   0   0   0   0   0   0   1   6 *11*  8   5   2
  R   0   0   0   0   0   0   0   3   8 *13* 10   7
  Y   0   0   0   0   0   0   0   0   5  10  10   7
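
A rough Python sketch of that traceback might look like the following (bundled with the matrix construction so it runs on its own; the names and the diagonal-first tie-breaking are my choices):

```python
VOWELS = set("aeiouy")

def score(a, b):
    """+2 for an exact match, +1 for two differing vowels, -3 otherwise."""
    if a == b:
        return 2
    if a in VOWELS and b in VOWELS:
        return 1
    return -3

def build_matrix(s, t, gap=-3):
    """Smith-Waterman score matrix with an extra row and column of zeros."""
    H = [[0] * (len(t) + 1) for _ in range(len(s) + 1)]
    for i in range(1, len(s) + 1):
        for j in range(1, len(t) + 1):
            H[i][j] = max(0,
                          H[i - 1][j - 1] + score(s[i - 1], t[j - 1]),
                          H[i - 1][j] + gap,
                          H[i][j - 1] + gap)
    return H

def traceback(H, s, t, gap=-3):
    """Walk back from the highest-scoring cell to recover one local alignment."""
    # Start at the maximum value (in practice, remember it while building H).
    i, j = max(((r, c) for r in range(len(H)) for c in range(len(H[0]))),
               key=lambda rc: H[rc[0]][rc[1]])
    top, bottom = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        if H[i][j] == H[i - 1][j - 1] + score(s[i - 1], t[j - 1]):
            top.append(s[i - 1]); bottom.append(t[j - 1])  # aligned characters
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            top.append(s[i - 1]); bottom.append("-")       # gap in t
            i -= 1
        else:
            top.append("-"); bottom.append(t[j - 1])       # gap in s
            j -= 1
    return "".join(reversed(top)), "".join(reversed(bottom))
```

Ties are broken arbitrarily here (the diagonal move wins); as noted above, a real implementation might instead collect every valid alignment or apply an explicit tie-breaking criterion.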

At this point we’ve got an alignment and a score of 13, which is pretty rad. But how can we tell whether the score is statistically significant? Clearly we can’t make hard and fast rules, because the number that the score produces is relative to a lot of factors including the size of the matrix and the alphabet that was used. One option is to create a bunch (in practice a bunch usually means a minimum of 10^4) of random permutations of one of the strings that reuses all the characters and is the same length, align them with the other string, and then see how many alignments score equal to or better than the initial match. The resulting “P-Value” can be calculated by taking the number of better matches plus one divided by the number of attempts plus one (because we include the initial match in our set). Interpretation of the value depends on the context, but in our case we may decide that a probability value of .5 may indicate that the relationship is random while a value of 10^-4 may suggest a more compelling relationship between the strings. For the given example I tried running 10^5 different alignments while randomizing a string and didn’t see a single alignment with a score that was equal or better, which isn’t a surprise because the alignment that we produced is almost a direct match. The obvious conclusion is that the words almost certainly did evolve from the same origin at some point in human history, and thus likely share meaning.
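
The shuffling experiment described above can be sketched like so (a toy version with invented names; sw_score is just the maximum cell of the Smith-Waterman matrix, and real runs would use far more trials):

```python
import random

VOWELS = set("aeiouy")

def pair_score(a, b):
    """+2 exact match, +1 for two differing vowels, -3 otherwise."""
    return 2 if a == b else (1 if a in VOWELS and b in VOWELS else -3)

def sw_score(s, t, gap=-3):
    """Best local alignment score: the maximum cell of the Smith-Waterman matrix."""
    prev = [0] * (len(t) + 1)
    best = 0
    for i in range(1, len(s) + 1):
        cur = [0] * (len(t) + 1)
        for j in range(1, len(t) + 1):
            cur[j] = max(0,
                         prev[j - 1] + pair_score(s[i - 1], t[j - 1]),
                         prev[j] + gap,
                         cur[j - 1] + gap)
            best = max(best, cur[j])
        prev = cur
    return best

def p_value(s, t, trials=10000, seed=None):
    """Align s against random shuffles of t; return (better + 1) / (trials + 1)."""
    rng = random.Random(seed)
    observed = sw_score(s, t)
    chars = list(t)
    better = 0
    for _ in range(trials):
        rng.shuffle(chars)
        if sw_score(s, "".join(chars)) >= observed:
            better += 1
    return (better + 1) / (trials + 1)
```

For “dictionary” and “diccionario” virtually no shuffle scores as well as the real alignment, so the estimate lands near the floor of 1/(trials + 1).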

Now that we’re armed with our nifty algorithm, where else can we put it to use? Some of the more interesting applications for string alignment occur when trying to sequence and compare DNA. At the end of the day, DNA is just a long string that consists of a 4 letter alphabet of nucleotides. DNA produces RNA, which in turn also produces proteins that are made up of a chain of amino acids. Each protein is also just a string that consists of a 20 letter alphabet of amino acids. DNA (and as a result, the proteins that it produces) has evolved through a series of insertions, deletions, and mutations just like written or spoken language. DNA evolution is actually much more pure than that of written and spoken language, because it’s much more difficult for an entire section of DNA to instantly mutate than it is for a brand new word to become adopted in a culture. A gene is the DNA equivalent of a word: it’s a section of DNA that is ultimately responsible for the production of a protein that may contribute to a physical trait. By aligning either DNA or protein sequences, we can make very educated guesses about whether genes are evolutionarily linked, and whether the proteins that they produce fold and function similarly. By looking at highly conserved regions of related proteins over time and making statistical observations about the likelihood of certain amino acids and common mutations, researchers have produced scoring systems like BLOSUM that are very effective when comparing proteins.

Another interesting application for our algorithm is in examining iterations of the source code of a program, or comparing multiple implementations of a program. A program is really just a string whose alphabet is generally defined by the context-free grammar of a programming language. Suppose that we want to figure out whether a student in a computer science class copied his assignment from a peer or did the work on his own. In this case we can simply align the two programs, look at the probability that they are related, and apply some manual investigation and common sense to any results that appear closely related.

That’s a lot of information to pack into a single post, but I hope you find it helpful and are able to think of other creative applications for the algorithms and concepts. As always, I’ll look forward to reading your comments!

Getting Rock Stars Excited About Working For You

The most important thing that the manager of a software development team can do is to staff their team up with rock star developers. The “10x Software Development” principle that Steve McConnell and others brought to the mainstream is the biggest reason why: the best developers aren’t just twice as effective as average ones, they’re an order of magnitude better. This is old news, and we’ve all heard it before by now (and on a side note, it’s also why I’ve previously harped on the importance of building a network of talented folks). What isn’t old news is that there are a few straightforward steps that managers can take to maximize their chances of landing those amazingly talented developers.

I’ve managed teams at both Microsoft and Amazon. I’ve recruited the cream of the software development crop against virtually every other big tech company, exciting start-ups, and even against other teams within my company. I’ve won more of those battles than I’ve lost, and I’ve learned my share in the process. I’m going to share a few very tangible suggestions that you can put into practice today that I humbly submit will make you more effective at winning your own recruiting battles. Most of the tips are equally relevant whether you’re managing a team within a huge company, or hiring employee #1 at a start-up operating out of your garage.

To set the stage, let’s start by focusing on a scenario that you’re likely to come across in the recruiting process. Suppose that you’ve been fortunate enough to find a rock star developer who’s interested in your team. She breezed through your interview process, and your HR folks (who incidentally may also be you at a small company) have reached out to her and started to negotiate compensation. The word comes back from HR that the developer is interested in joining your team, but she has offers out from two other companies with comparable compensation and she’s having a hard time making a decision. The good HR peeps inform you that you’re at the absolute top of the compensation range, so there isn’t any way to sweeten the offer and blow her away with dollar bills. They’ve lined up a sell call for you to chat with the developer, answer her last few questions, and get her jazzed about your team. This is a position that I’ve personally been in at least 10 times over the past year.

In this situation there are two sets of inputs that will determine whether the candidate accepts your offer. The first is a set of things that are mostly out of your control at this point, for example: the brand image of the company that you work for, the location of the team offices, differences in the compensation packages being offered, and the appeal of the projects that each team is working on based on that developer’s particular experiences and interests. The second set is in your control, for example: the way that you come across in the phone conversation and any other communication with the candidate, the way that you choose to paint a picture of your team, and how well you and the person connect. The following are tips that I’ve found effective at maximizing that second set of inputs that are in your control and increasing your odds of landing a rock star:

  • Prepare a one pager that sells your team, today. Leave the stuff that isn’t sexy out. Don’t talk about your ops rotation or that legacy product that your team has to keep on life support. You should field questions about those areas honestly when asked, but the one pager isn’t the place to do so. Sell the long term team vision, and make it something that good developers want to sink their teeth into. Emphasize hard technical challenges. Be sure to call out who your customers are and what the impact of the technology is for them, and consider including a powerful quote from at least one satisfied customer. Email the one pager to the candidate as soon as possible once you’ve decided to hire them, and before the sell call, to keep them thinking about your team and help them formulate follow up questions for the phone call.
  • Come to the phone call prepared. Know the candidate’s resume and background inside and out. You studied this extensively prior to interviewing the candidate, but be sure to refresh your memory immediately before the call. Understand that when you chat, you shouldn’t feel the need to cover absolutely everything that your team is doing; in most cases that will be more than a person can digest in a single phone call. Create a list of no more than 3 high level bullet points that you want to be absolutely sure to cover. Tailor those bullet points to the kind of technologies that the candidate is likely to be excited about.
  • Spend the first 5-10 minutes of the phone call learning more about the candidate. Be sure that you’ve accurately identified their interests. Be ready to adjust your message on the fly to emphasize the points that jibe with their interests. Ask probing questions to show them that you’re really interested in what they care about, and to double-check that you’ve read their interests correctly.
  • Walk around while you’re talking. Put the phone on speaker, or get a headset. Use your arms and other body language, even though the candidate can’t see you. It’s easier to speak naturally and get excited about what you’re talking about if you’re not chained to a desk and phone. Your enthusiasm about the team and products is absolutely key to getting the candidate excited, and it’s harder for most people to convey over the phone than in person. It’s also something that is very difficult to fake if you’re working on a product that you aren’t a fan of.
  • Try to steer the phone call into a technical discussion between you and the candidate. If you can get to a point where they’re excited about the technology, making suggestions, and engaging in a dialogue about where they would take it, you’re in great shape. Intentionally nurture that kind of discussion.
  • End the call by answering questions and offering to connect them with additional resources. I personally always provide my contact info and encourage the candidate to get in touch with me if any future questions come up. I also offer them the chance to schedule a follow-up phone call with a senior engineer from the team (one whom I trust to represent the team well) to dive further into the technical weeds if the candidate is interested. They will only rarely take you up on that offer, but making it still goes a long way toward establishing the technical credibility of the team.
  • If the opportunity presents itself, follow up with something memorable. When I was contemplating Amazon my baby was just a few months old, and the hiring manager sent me an Amazon-branded onesie. We sometimes send candidates Kindles to keep Amazon front and center in their minds and to show them how much we want them to join us. I had a candidate a few months ago who wanted to join the team but was hesitant about the weather in Seattle, so the next day (which happened to be beautiful and sunny) I took a picture out the office window and emailed it to him. Anything that keeps your team and company front and center in the candidate’s mind is awesome. Don’t force it if the opportunity doesn’t present itself, though, because a forced attempt will likely come across as desperate and produce the opposite effect.

I hope you find those ideas helpful and are able to implement them effectively, unless I happen to be managing the team that you’re competing against for that next rock star developer. 🙂 I’ll certainly welcome feedback and/or additional suggestions from others in the comments below. I’ll add the usual disclaimer that these ideas are all my own, and don’t reflect any of the opinions of my current or former employers.

Debunking the Myths of RPC & REST

The internet is chock-full of articles, blog posts, and discussions about RPC and REST. Most are targeted at answering a question about using RPC or REST for a particular application, which in itself is a false dichotomy. The answers that are provided generally leave something to be desired and give me the impression that there are a slew of developers plugging RESTful architectures because they’re told that REST is cool, but without understanding why. Ironically, Roy Fielding took issue with this type of “design by fad” in the dissertation in which he introduced and defined REST:

“Consider how often we see software projects begin with adoption of the latest fad in architectural design, and only later discover whether or not the system requirements call for such an architecture.”

If you want to deeply understand REST and can read only one document, don’t read this post. Stop here and read Fielding’s dissertation. If you don’t have time to read his dissertation or you’re just looking for a high level overview on RPC and REST, read on. To get started, let’s take a look at RPC, REST, and HTTP each in some detail.

Remote Procedure Call (RPC) is a way to describe a mechanism that lets you call a procedure in another process and exchange data by message passing. It typically involves generating method stubs in the client process that make the call appear local; behind the stub is logic to marshal the request and send it to the server process. The server process then unmarshals the request and invokes the desired method before repeating the process in reverse to get whatever the method returns back to the client. HTTP is sometimes used as the underlying protocol for message passing, but nothing about RPC is inherently bound to HTTP. Remote Method Invocation (RMI) is closely related to RPC, but it takes remote invocation a step further by making it object oriented and providing the capability to keep references to remote objects and invoke their methods.
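To make the stub-and-marshalling flow concrete, here’s a minimal sketch using Python’s standard-library XML-RPC modules (one concrete RPC implementation among many; the `add` procedure is just an illustrative example):

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer

# "Server process": register a plain function as a remotely callable procedure.
def add(a, b):
    return a + b

server = SimpleXMLRPCServer(("localhost", 0), logRequests=False)
server.register_function(add, "add")
port = server.server_address[1]  # port 0 above means "pick a free port"
threading.Thread(target=server.serve_forever, daemon=True).start()

# "Client process": the proxy plays the role of the stub. Calling proxy.add()
# looks like a local call, but the arguments are marshalled into an XML
# message, sent over HTTP, and the result is unmarshalled from the response.
proxy = ServerProxy(f"http://localhost:{port}")
result = proxy.add(2, 3)
print(result)  # 5
```

Note that HTTP shows up here only as a transport for the messages; the same procedure call could ride on raw sockets or any other channel.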

Representational State Transfer (REST) is an architectural style that imposes a particular set of constraints on an interface to achieve a set of goals. REST enforces a client/server model, where the client is interested in gaining information about, and acting on, a set of resources that are managed by the server. The server tells the client about resources by providing a representation of one or more resources at a time, along with actions that can either get a new representation of resources or manipulate resources. All communication between the client and the server must be stateless, and responses must be labeled as cacheable or non-cacheable. Implementations of a REST architecture are said to be RESTful.

Hypertext Transfer Protocol (HTTP) is a RESTful protocol for exposing resources across distributed systems. In HTTP the server tells clients about resources by providing a representation of those resources in the body of an HTTP response. If the body is HTML, the legal subsequent actions are given in anchor tags and forms that let the client either get new representations via additional GET requests, or act on resources via POST/PUT or DELETE requests.

Given those definitions, there are a few important observations to make:

  • It doesn’t make sense to talk about RPC vs REST. In fact, you can implement a RESTful service on top of any RPC implementation by creating methods that conform to the constraints of REST. You can even create an HTTP-style REST implementation on top of an RPC implementation by creating methods for GET, POST, PUT, and DELETE that take in some metadata mirroring HTTP headers and return a string mirroring the body of an HTTP response.
  • HTTP doesn’t map 1:1 to REST; it’s one implementation of REST. REST is a set of constraints, and it doesn’t include HTTP-specific implementation details. For example, your service could implement a RESTful interface that exposes methods other than the ones HTTP exposes and still be RESTful.
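The first observation can be sketched in a few lines: the class below is a hypothetical illustration (the names `RestOverRpc` and the `/products/...` paths are mine, not from any real API) of exposing HTTP-style verbs as plain RPC procedures that take a path plus header-like metadata and return a string body:

```python
# Hypothetical sketch: HTTP-style REST verbs exposed as ordinary RPC
# procedures. Each method takes a path and optional header-like metadata
# and returns a string body, mirroring an HTTP exchange.
class RestOverRpc:
    def __init__(self):
        self._resources = {"/products/1": '{"name": "widget"}'}

    def GET(self, path, headers=None):
        # Mirrors the body of an HTTP response (empty string = not found).
        return self._resources.get(path, "")

    def PUT(self, path, body, headers=None):
        # Create or replace the resource at this path.
        self._resources[path] = body
        return body

    def DELETE(self, path, headers=None):
        return self._resources.pop(path, "")

# Any RPC framework could expose these four procedures to remote clients.
service = RestOverRpc()
service.PUT("/products/2", '{"name": "gadget"}')
print(service.GET("/products/2"))  # {"name": "gadget"}
```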

What people are really asking when they ask whether they should use RPC or REST is: “Should I make my service RESTful by exposing my resources via vanilla HTTP, or should I build on a higher level abstraction like SOAP or XML-RPC to expose resources in a more customized way?”. To answer that question, let’s start by exploring some of the benefits of REST and HTTP. Note that there are separate arguments for why you would want to make a service RESTful and why you would want to use vanilla HTTP, although all the arguments for the former apply to the latter (but not vice versa).

The beauty of REST is that the set of legal actions from any state is always controlled by the server. The contract with the client is very minimal; in the case of HTTP’s implementation of REST, the contract is a single top-level URI to which I can send a GET request. From that point on, the set of legal actions is given to the client by the server at run time. This is in direct contrast to typical RPC implementations, where the set of legal actions is much more rigid: it is defined by the procedures the client consumes and is fixed at build time. To illustrate the point, consider the difference between finding a particular product by navigating to a webpage and following a set of links (for things like product category) until you find what you’re looking for, versus calling a procedure to get a product by name. In the first case, changes to the API can be made on the server only; in the second case, a coordinated deployment to both client and server is required. The obvious issue with this example is that computers aren’t naturally good at consuming APIs dynamically, so it’s difficult (but possible in most cases) to get the benefit of server-controlled actions if you’re building a service to be consumed by other services.
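A toy sketch of that navigation, with the server’s documents faked as an in-memory dict (the link names and URI layout here are invented for illustration): the client hard-codes only the root URI and the link names it understands, never the URI structure, so the server can reorganize its URIs without breaking the client.

```python
# Stand-in for the server's resources: each document carries the links the
# server chooses to offer from that state. (Invented data, for illustration.)
catalog = {
    "/": {"links": {"categories": "/categories"}},
    "/categories": {"links": {"tools": "/categories/tools"}},
    "/categories/tools": {"items": {"hammer": "/products/42"}},
    "/products/42": {"name": "hammer", "price": 9.99},
}

def get(uri):
    # Stand-in for an HTTP GET; the server alone decides which links exist.
    return catalog[uri]

# The client starts at the root and follows server-provided links by name.
doc = get("/")
doc = get(doc["links"]["categories"])
doc = get(doc["links"]["tools"])
product = get(doc["items"]["hammer"])
print(product["name"])  # hammer
```

Contrast this with an RPC client compiled against `get_product_by_name("hammer")`: any change to that procedure’s shape forces a coordinated client/server deployment.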

One main advantage of using vanilla HTTP to expose resources is that you can take advantage of a whole bunch of technologies that are built for HTTP without any additional effort. To consider one specific example, let’s look at caching. Because the inputs of an HTTP response are well defined (query string, HTTP headers, POST data), I can stand up a Squid server in front of my service and it will work out of the box. In RPC, even if I’m using HTTP for message passing under the hood, there isn’t any guarantee that messages map 1:1 with procedure calls, since a single call may pass multiple messages depending on the implementation. Also, the inputs to RPC procedures may or may not be passed in a standard way that makes it possible to map requests to cached responses, and requests may or may not include the appropriate metadata for intermediaries to know how to handle caching (as HTTP does). It’s nice not to have to reinvent the wheel, and caching is just one example of an area where HTTP gives us the ability to stand on the shoulders of others.
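The reason an intermediary like Squid works out of the box can be boiled down to one property: the cache key is fully determined by well-known request inputs. A minimal sketch of that idea (not Squid’s actual implementation; `cached_get` and `fetch` are illustrative names):

```python
# Sketch: an HTTP cache key is derivable from standard request inputs alone,
# so a generic intermediary can cache responses without knowing the service.
cache = {}

def cached_get(method, url, headers, fetch):
    # Method + URL (incl. query string) + relevant headers identify the response.
    key = (method, url, tuple(sorted(headers.items())))
    if key not in cache:
        cache[key] = fetch()  # only hit the origin server on a cache miss
    return cache[key]

origin_calls = []
def fetch():
    origin_calls.append(1)  # count trips to the origin server
    return '{"name": "widget"}'

headers = {"Accept": "application/json"}
cached_get("GET", "/products/1?lang=en", headers, fetch)
cached_get("GET", "/products/1?lang=en", headers, fetch)
print(len(origin_calls))  # 1 -- the second request was served from cache
```

An RPC framework with opaque, implementation-specific message formats gives a generic intermediary no such key to compute, which is why caching there usually means rolling your own.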

Another advantage of HTTP that I’ll just touch on briefly is that it exposes resources in a very generic way via GET, POST/PUT, and DELETE operations, which have stood the test of time. In practice, when you roll your own ways to expose resources, you’re likely to expose things in a way that is too narrow for future use cases, which results in an explosion in the number of methods, interfaces, or services that end up becoming part of your SOA. This point really warrants a blog post of its own, so I’ll avoid going deeper in this particular post.

There is a lot of additional depth that I could go into, but in the interest of keeping this post reasonably brief I’ll stop here. My hope is that at least some folks find the material here to be helpful when deciding how to best expose resources via a service! If I’ve missed anything or if you feel that any of the details are inaccurate, please feel free to comment and I’ll do my best to update the material accordingly.