On Cryptocurrency, Blockchain, & Cloud Computing

With Bitcoin’s price soaring, I’ve found myself spending a lot of cycles explaining why I’m still bullish on cryptocurrency and blockchain. I’ve also found myself in a number of conversations where I’m trying to convince friends that most of the talking heads from the financial world fundamentally don’t understand what blockchain is about or how it will change the game. This post is my attempt to channel those discussions and dispel several popular myths that are currently making the rounds on the Twittersphere. Along the way, I’m also going to try to convince you that the blockchain not only has the potential to revolutionize the financial world, but is also poised to have a massive impact on cloud computing. Without further ado, let’s dive into some background on both computing and blockchain.

Blockchain as a Cloud Computer

Every computer can be broken down into two fundamental components: compute and storage. When you use your computer to multiply two numbers together, your computer loads (compute) values from memory or disk (storage) into registers (storage), multiplies (compute) the values together, and stores (compute) the result in another register (storage), from which it can then be written (compute) to memory or disk (storage). From web surfing to spreadsheet-crunching to gaming, all computer applications boil down to computation that manipulates stored values in interesting ways.
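
To make that decomposition concrete, here's the multiply example as a few lines of Python, with each step labeled the way I labeled them above (the "registers" here are just variables standing in for the real thing):

```python
# A toy rendering of the multiply example: every step is either an
# operation (compute) or a read/write of state (storage).
memory = {"a": 6, "b": 7}        # storage: values at rest in memory
reg_a = memory["a"]              # compute: load a value into a register (storage)
reg_b = memory["b"]              # compute: load the second value
reg_c = reg_a * reg_b            # compute: multiply; the result sits in a register
memory["result"] = reg_c         # compute: write the result back to memory (storage)
```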

Cloud computing is no different. Amazon Web Services, the leader in hosted public cloud, offers tens (if not hundreds) of unique services that can all be broken down into compute and storage running on Amazon's massive infrastructure footprint. For example, AWS CodeBuild allows software developers to build and test their code in the cloud and then store the built artifacts in a data store like Amazon's Simple Storage Service (S3). As with most of its services, Amazon prices CodeBuild by the number of minutes the underlying virtual machine runs (compute) and S3 by the number of gigabytes stored per month (storage), because the company understands that most of its value-add can be decomposed into compute and storage. AWS made $4.6B in revenue in Q3 of 2017 alone, representing massive year-over-year growth of 42%, so the global market for cloud computing products is clearly vibrant and growing quickly.
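
To see how directly that pricing maps to compute and storage, here's a back-of-envelope bill in Python. The rates are illustrative placeholders, not Amazon's actual rate card:

```python
# Back-of-envelope AWS bill for the CodeBuild + S3 example. The prices
# here are hypothetical placeholders, not AWS's real published rates.
build_minutes = 1_200             # compute: build time consumed this month
storage_gb_months = 500           # storage: artifacts parked in S3

PRICE_PER_BUILD_MINUTE = 0.005    # USD, assumed
PRICE_PER_GB_MONTH = 0.023        # USD, assumed

bill = (build_minutes * PRICE_PER_BUILD_MINUTE
        + storage_gb_months * PRICE_PER_GB_MONTH)
print(f"compute + storage = ${bill:.2f}")   # -> compute + storage = $17.50
```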

The simplest way to think about a blockchain is as a big, distributed cloud computer that no single person, company, or government controls. Individuals called miners connect their computers to the blockchain so that they can be used to process transactions (compute) and write the results of those transactions to a digital ledger (storage). The mechanics of executing transactions depend on the blockchain implementation and are typically heavily rooted in cryptography, but in a proof-of-work system like Bitcoin, each mining node takes a block of transactions and hashes it (compute) together with the hash of the previous block and a value called a nonce to create a unique hash value that fits a set of constraints. Once a valid hash is mined, the new block is broadcast (compute) to other nodes, and the transactions in that block are executed (compute) and written to both lightweight and full nodes (storage) across the blockchain network.
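
Here's a drastically simplified proof-of-work loop in Python, just to make the "hash until the constraints are satisfied" idea concrete. It hand-waves away the real block header format, the difficulty target encoding, and all of the networking, so treat it as a sketch of the concept rather than of Bitcoin's actual wire format:

```python
import hashlib
import json

def mine_block(transactions, prev_hash, difficulty=4):
    """Toy proof-of-work: find a nonce so the block hash starts with
    `difficulty` zero hex digits. Real Bitcoin double-SHA-256s a binary
    block header and compares against a 256-bit target, but the loop is
    conceptually the same."""
    nonce = 0
    while True:
        payload = json.dumps(
            {"txns": transactions, "prev": prev_hash, "nonce": nonce}
        ).encode()
        digest = hashlib.sha256(payload).hexdigest()
        if digest.startswith("0" * difficulty):
            return nonce, digest   # found: broadcast the block to other nodes
        nonce += 1

nonce, block_hash = mine_block(["alice -> bob: 0.5 BTC"], prev_hash="00c3...")
print(nonce, block_hash)
```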

Mining is computationally expensive, so the blockchain is orders of magnitude less efficient than comparable distributed compute and storage solutions. It's crucial to note, though, that this efficiency is intentionally exchanged for a different property: no one entity owns the system, yet the system can still facilitate computational transactions between multiple untrusted parties without introducing a trusted intermediary. If you think about the number of transactions we participate in on a daily basis where we incur some cost to engage a trusted intermediary, this is a big deal. Facebook can monetize your data in undesired ways, PayPal takes a healthy cut to move your money around, and the government can always compel Amazon to delete or hand over data that it's storing on your behalf.

To incent miners to contribute their compute and storage to the blockchain, the creators of Bitcoin developed the concept of a digital coin, or cryptocurrency, that can be exchanged to run transactions on the blockchain. Anyone who wants to write to the blockchain ledger has to offer a small number of coins to have their transaction processed. The cryptocurrency cost of executing a transaction is linked to the demand for running compute on the network and inversely linked to the computing power connected to the network. The cost in terms of a fiat currency like the US dollar is also, obviously, linked to the going exchange rate between the fiat currency and the cryptocurrency, so the USD cost of running a transaction on the Bitcoin blockchain has increased steadily of late: in late Q3 of 2017, writing 200 bytes to the Bitcoin ledger within 30 minutes cost roughly $3-4 worth of BTC.
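
The rough math behind that $3-4 figure looks something like the following. The fee rate and exchange rate below are assumptions I've picked to be in the ballpark of late-2017 conditions, not recorded market data:

```python
# Rough fee math for a 200-byte write. Both rates are assumptions.
fee_rate_sat_per_byte = 450           # assumed going rate for ~30-min confirmation
tx_size_bytes = 200
btc_usd = 4_000                       # assumed BTC/USD exchange rate

# 100,000,000 satoshis per BTC
fee_btc = fee_rate_sat_per_byte * tx_size_bytes / 100_000_000
print(f"{fee_btc} BTC ≈ ${fee_btc * btc_usd:.2f}")   # -> 0.0009 BTC ≈ $3.60
```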

Because of this need to move currency from one party to another, the original application baked into Bitcoin's blockchain was the exchange of the Bitcoin cryptocurrency between parties. Newer blockchains have built on the Bitcoin foundation and embraced the idea of more generic Smart Contracts that allow arbitrary code to be executed directly on the blockchain. For example, the creators of the Ethereum blockchain have implemented an on-chain runtime environment for Smart Contracts written in a language called Solidity that is Turing complete, which means that in principle it can be used to solve any computational problem. The result is that blockchains like Ethereum look similar to cloud computing services in that they allow for arbitrary distributed compute and storage, yet they also display the interesting property of being able to facilitate computation between multiple untrusted actors without a single trusted third party controlling the service. Efforts are underway to bolt this kind of behavior onto the Bitcoin blockchain via sidechains like Rootstock.
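
To give a feel for what "arbitrary code on the blockchain" means, here's the shape of a trivial escrow contract sketched in Python. A real one would be written in Solidity and run on-chain; the `transfer` hook here is a hypothetical stand-in for the ledger primitive that would actually move funds:

```python
# The shape of a smart contract, sketched in Python rather than Solidity.
# On a real chain the state lives in the ledger (storage) and every node
# runs the methods (compute); no single party can alter the rules.
class Escrow:
    def __init__(self, buyer, seller, amount, transfer):
        self.buyer, self.seller, self.amount = buyer, seller, amount
        self.transfer = transfer     # injected stand-in for the ledger
        self.released = False

    def release(self, caller):
        # The network, not a bank, enforces that only the buyer can release.
        if caller != self.buyer or self.released:
            raise PermissionError("not authorized or already released")
        self.released = True
        self.transfer(self.seller, self.amount)

contract = Escrow("alice", "bob", 10,
                  transfer=lambda to, amt: print(f"{amt} -> {to}"))
contract.release("alice")   # -> 10 -> bob
```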

That was a lot to grok, but it’s really impossible to critique the current commentary on blockchain and cryptocurrency without at least a high-level understanding of how the pieces fit together. So with that all in mind, let’s dive into a few recent criticisms from high-profile individuals in the financial sector about both Bitcoin specifically and blockchain and cryptocurrencies in general.

“Bitcoin doesn’t have any intrinsic value…”

One extremely common recent narrative from people in the financial world is that coins like Bitcoin don't have any intrinsic value. Just a few days ago, Nobel Prize-winning economist Joseph Stiglitz said that “Bitcoin is successful only because of its potential for circumvention, lack of oversight. It doesn’t serve any socially useful function.” JPMorgan Chase CEO Jamie Dimon has claimed that “the only value of Bitcoin is what the other guy will pay for it.”

If you made it through my quick primer above, then you already understand why Stiglitz and Dimon are incorrect. Coins like Bitcoin and Ether can be exchanged for compute and storage on a massive supercomputer with some very compelling properties, like the ability to execute transactions between untrusted parties without a trusted intermediary. If you believe that an increasing number of applications will be written and deployed on blockchain to disintermediate our lives (and if those currencies are architected in a way that limits velocity), then the value of these currencies will inherently go up as the demand for compute and storage on blockchain increases.

It's worth pausing here for a moment to mention that there are a few very different kinds of cryptocurrencies floating around today. The first kind, typically called a utility token, has the intrinsic property of being exchangeable for some kind of service. Bitcoin and Ether are both utility tokens because they have the intrinsic property (coded directly into the token and blockchain) of being exchangeable for compute and storage. The second kind of token is often called a tokenized security because it functions more like a traditional security that just happens to be exchangeable on a blockchain. A tokenized security has no intrinsic value, but it may have value ascribed to it by extrinsic means. For example, a legal contract may promise a share of the future revenue streams of a corporation pro rata to the holders of a specific kind of token. I suspect that people like Stiglitz and Dimon are completely missing the power of blockchain as a cloud computer, so they're mistaking utility tokens for tokenized securities linked to very little value.

“I’m excited about blockchain, but not Bitcoin…”

A second popular thread is that Bitcoin and similar technologies are interesting, but not in their current implementation. One flavor of this attack is that blockchain technology is compelling but cryptocurrencies are not. Another related flavor is that the existing decentralized blockchain implementations will be replaced by blockchain implementations that are controlled by governments and corporations. World Bank President Jim Yong Kim noted that “blockchain technology is something that everyone is excited about, but we have to remember that Bitcoin is one of the very few instances.” He went on to emphasize that the importance of blockchain is the speed with which it can facilitate transactions, drawing parallels to Alibaba’s infrastructure that can facilitate large transactions in seconds. Former Federal Reserve chair Ben Bernanke espoused a similar view when he talked about how Bitcoin would fail but blockchain was interesting and would help federal banks improve their existing payment systems.

Again, this line of thinking is flawed. As noted above, blockchain is an intentionally inefficient system because it trades efficiency for decentralization. It's hard to see why a bank or government would want to implement an inefficient technology if decentralization isn't a desired property. Banks already run on digital systems that can facilitate transactions between people, so what exactly is blockchain bringing to the table? Further, that decentralization can only be maintained if coins are woven directly into the fabric of the blockchain to compensate miners, so the idea of implementing a blockchain without a linked cryptocurrency doesn't make a lot of sense.

“Bitcoin is an unreliable store of value…”

Another part of the conversation about the value of Bitcoin is centered on whether it will prove to be an effective store of value. A store of value is a mechanism that allows people to exchange wealth and to preserve it across both physical space and time. To accomplish this, the medium of storage must exhibit a few properties: it must be liquid, it must be scarce, it must possess longevity, and people must be willing to assign a value to it. Historically, stores of value have included things like precious metals, gemstones, livestock, real estate, and fiat currencies.

Some recent challenges to the validity of cryptocurrencies like Bitcoin as a store of value have focused on whether the currencies will continue to exhibit those required properties. In the article linked above, Jamie Dimon states that “governments are going to crush Bitcoin one day. Governments like to know where the money is, who has it and what you’re doing with it.” In essence, his comments are an attack on the longevity of Bitcoin as well as its liquidity in markets as they become more regulated. Economist and author Raoul Pal claimed that Bitcoin is an unreliable store of value because the group of developers that controls the underlying codebase can change the code: “Even if they don’t change the formula, the fact that they could? That’s enough to say it’s not a long-term store of value.” Pal's statements cast doubt on Bitcoin's scarcity (since software engineers could “print more money”) and on whether people should trust the people at the helm enough to assign value to the currency.

The reality is that any government that bans cryptocurrency will miss out on the next great wave of innovation. Where would the US be today if the government had banned all HTTPS traffic because it disrupted the way intelligence was previously collected? The risk of rogue updates to the codebase is slightly more real, but it's important to note that there are three groups of actors in each blockchain ecosystem that work as a set of checks and balances against each other: developers, miners, and cryptocurrency holders. The split between Ethereum and Ethereum Classic is a real-life case study in what happens when those groups move in different directions, and it will forever be a warning to the development communities of other blockchains.

Where To From Here?

None of this means that Bitcoin and other cryptocurrencies are destined to continue their meteoric rise. Blockchains have real challenges as they try to scale; when an app like CryptoKitties pushes your network to its limits, you have work to do. Cryptocurrency exchanges are still a major vulnerability of the system, and market manipulation is possible at current volumes. For example, it's widely speculated that the current price of BTC is being propped up by the fraudulent issuance of Tether, and that if USDT and Bitfinex implode, they will bring all cryptocurrencies along for the ride. All of these risks are real.

But as both a Software Engineer and a VC, I can tell you that I see a lot of companies making big bets on blockchain and using it as the Operating System for applications that were previously impossible to build and will change our lives. Those apps aren’t in production or operating at scale yet, so the analogies between the current environment and the dotcom bubble are reasonable: there may be a crash that is followed by a long period where apps are deployed, adoption grows, and the ecosystem justifies the valuation. Or, maybe, the current lofty valuation on cryptocurrencies is correct for a technology that has the potential to disrupt both the financial sector and cloud computing and near-term growth will continue.

When Clay Christensen introduced the concept of “disruptive innovation” in The Innovator’s Dilemma, he explained that incumbents can’t pursue disruptive innovation when it first arises because the opportunities aren’t profitable enough and the development of disruptive innovation would take scarce resources away from sustaining innovation that is required to keep up with the competition. As the disruptive innovation matures, it begins to capture share up-market, and the incumbent can’t react quickly enough.

Jamie Dimon claims that blockchain isn’t worth his attention because JPMorgan Chase moves $6T in money around the world every day while the daily trading volume of all cryptocurrencies is around $10B. Ironically, with a total market cap of roughly $370B, the basket of all of the cryptocurrencies in the world is now more valuable than JPMorgan Chase. Are major industries going to be disrupted in the next decade? Time will tell, but I’m betting on crypto.

How Video Games Made Me A Better Software Engineer (& Dad!)

About six months ago I left an amazing job at Amazon for a very different, yet equally amazing job at Riot Games. I won’t bore you with the laundry list of factors that went into my decision, but I will confess that one of the many factors was my life-long love of video games. I’m a bit quirky in the way that I play video games because I can’t play a game casually. When I pick up a game, I play purely to master the game and to challenge myself (and possibly my team, depending on the game) to see how good I/we can be. As crazy as it sounds, that constant quest for mastery has taught me a valuable lesson that not only has made me better at my job as an engineering manager, but has helped me to grow in other areas of my life.

Before I hit you with the punch line, let me give you some quick background to help set the stage. As you may or may not know, Riot Games produces a very popular game called League of Legends that pits two teams of five players against each other in a ~20–60 minute battle to destroy the other team's Nexus before they destroy yours. League of Legends is one of those games that is relatively quick to learn but takes a lifetime to master because of the complexity of its gameplay. Any player can try to work his or her way up the game's Elo rating system, which is broken down into several ranked tiers: bronze, silver, gold, platinum, diamond, master, and challenger.

When I first started trying to climb the Elo ladder, I was able to work my way from bronze to silver just by grinding out a bunch of games. As I played, I was building my “mechanics” and learning basic concepts that allowed me to improve fairly quickly. As I kept playing, however, my progress stalled out before I was able to hit gold. That's when it struck me that if I wanted to keep improving, I would have to start doing deliberate things to get better. I wasn't going to improve by just putting my time in, playing game after game, and making the same mistakes over and over again.

That same concept of mastery applies to almost every area of our lives. When I landed my first job as a software developer, I had so much to learn that I could build my development chops by simply doing my job. At some point that ceased to be true and I had to start doing very intentional things to continue to improve. Sometimes that meant seeking out seasoned veterans for some pair programming, and other times it meant changing teams to work in a new domain or with a new set of tools.

IMHO, the hardest part of improving at something is 1) identifying when we've hit our natural plateau and we're just grinding it out without getting better, 2) deciding that we actually want to invest the immense time and energy needed to get better, and then 3) taking some time where we are very intentionally in the “stretch zone” and practicing for mastery. This season I've advanced to platinum in League of Legends, and the only way I was able to accomplish that goal was by setting aside a chunk of play time every week where I wrote down a specific goal (something like “die 3 or fewer times” or “kill 85 minions by the 10-minute mark”), focused on achieving that goal while I played, and sometimes watched replays of my games to find mistakes and figure out what goals I should set in the future.

As an engineering manager, I put myself in the stretch zone by keeping my personal development plan (PDP) relevant and up-to-date, spending quality time learning from mentors each week, reading books and blogs written by other managers I respect, and continually collecting feedback from the folks I'm managing on how I can be more effective, then using that feedback to drive new goals into my PDP. As a husband and a dad, I get in the stretch zone by sitting down with my wife every Sunday evening, talking through how things are going at home, and using those discussions to pick a few things to focus on for the week.

There are a lot of other areas in my life where I’m intentionally not putting in the effort to get in the stretch zone and improve, and I’m fine with that because I only have a finite amount of time and focus. I love playing golf and would like to be a better golfer, but right now I’m just hitting the course occasionally and playing for fun. I suspect very few people have the discipline and the mental focus to context switch and really improve at more than about three things at a time.

I leave you with this challenge: Identify one thing that you want to get better at, and come up with a plan to get into the stretch zone at least a few times a week for the next month. Then leave me a comment below and let me know how your experience went. And the next time someone tells you to quit playing video games and do something productive, tell them that you’re learning valuable lessons that apply to the rest of your life.

The Final Nail In The Windows Coffin

I generally boot over to Windows for one of two reasons: to play games, or to use Office. The rest of my time is happily spent in Ubuntu. I've been under the impression that people generally use Windows because it's more “polished”. My mother is never going to be able to hack away at the command line or understand the dark magics of device drivers, so she needs the neat and tidy packaging that Microsoft offers. Tonight I decided to upgrade from Windows 7 to 8, and it was the worst experience possible. My motivation was that my Windows 7 installation had developed a weird tendency to BSOD (for seemingly random reasons, after some debugging) with the dreaded “Page Fault in Nonpaged Area” message, so I figured I would try a clean OS install and thought I would upgrade in the process to see what Windows 8 is all about.

I started by downloading the Microsoft Windows Update utility, as recommended. I went through the steps and was told that I had two purchase options: Windows 8 or Windows 8 Pro. The former was $120, so I spent a while poking around looking for a way to indicate that I wanted the less expensive “Upgrade” version. I couldn't figure it out, so I eventually caved and bought the full meal deal. I'm a firm believer in clean installations for operating systems, based on some anecdotal past experiences, so I downloaded the ISO and burned a DVD. A few minutes later I was booted into the installation utility and was ready to install.

That's when I hit my first speed bump. When I selected the appropriate disk, Windows told me that it couldn't create a new partition or locate an existing one, and that I should check the setup log files for more info. I had everything important backed up to Dropbox, so I tried deleting the partition, formatting, and every other option available to me. I rebooted and went through the process again with the same result. Before hunting down where the “setup log files” were, I hit Google on my cell phone, stumbled on this article, and tried the command line partitioning utilities that were suggested. I rebooted again, and still no dice. After a lot of tinkering, I ended up having to unplug my other drives, including the one where Linux was installed, and reboot the computer, and then things magically worked.

I hadn't ever messed with Windows 8, so I was surprised to be greeted by no start button and no immediately obvious way to launch applications. I was told that I needed to activate Windows and was asked to re-enter the Product Key that I had already entered a million times while trying to get the installation working (fortunately, by this point I had it practically memorized). When I tried to activate, I got an error message telling me that my product key could only be used to upgrade Windows, despite the fact that I had been using Windows 7 just an hour prior, was under the impression that I had bought a full non-upgrade version of Windows 8, and didn't see any clear warnings to this effect during the purchase process. I went back to Google, poked around for a while, and found this suggestion on hacking the registry to make activation work anyway, which seemed to do the trick.

Next I tried to change the wallpaper from the ugly flower, and that silently failed without any obvious error message. I was able to click on other images, but all I saw was weird flashing behavior at the edges of the window, and the background didn't change. Again per Google, it sounds like I may need to wait a while after activation to change my wallpaper, which is just bizarre.

I started downloading apps, and when I hit the Skype site they sent me over to the Windows App Store to download it. Inconveniently, there was no clear visual indication of how to get back to my desktop from the Metro-style UI. I started trying to poke around with Metro and was annoyed at how poorly its visual metaphor seemed to map to the mouse and keyboard, so I searched around for a way to permanently disable Metro for desktop users. Unfortunately, that seems to require downloading (or in most cases purchasing) a separate application, which seems absurd.

The icing on the cake was that on my next reboot, I again hit the new and slightly less ugly BSOD with the same error I was getting before. Both the Windows and Linux memory and disk analysis tools seem to suggest that all is fine on the hardware front, and I have yet to have any issues with Ubuntu, which is running on the same machine, down to the same disk. I guess I'm back to trying to troubleshoot that issue later tonight.

After multiple hours of just trying to get things up and running, I'm trying to picture my mom buying the latest version of Windows because of “ease of use” and having to run disk partitioning utilities from the command line and edit registry keys. Clearly that ain't happening. I'm also flashing back to how seamless and straightforward installing Ubuntu was last time around. If my experience isn't atypical, then I think the final nail has been driven into the Windows coffin. That may sound like a sensational claim, but Windows has already lost the battle for mobile to Android (and to a lesser extent these days, iOS), and more and more of computing is moving away from the desktop. At some point, individuals and companies that do use desktops for niche activities aren't going to be willing to pay $120 for a product that is inferior to something they can get for free, particularly when existing UI conventions are broken in every option and habits have to be retrained anyway.

I’m excited that Steam is out for Linux because it feels like that may start a movement for PC games to ship on non-Windows operating systems. Now if only I can get Office working with Wine, I will never have to boot over to Windows again…

The New Platform War

There's a new battle raging for customer eyeballs, application developers, and ultimately… dollar signs. To set the stage, flash back to the first platform war: the OS. Windows sat entrenched as the unassailable heavyweight, with Linux and Mac OS barely on the scene as fringe contenders. But the ultimate demise of Windows' platform dominance didn't come from another OS at all; it came from the move to the browser. Microsoft saw the threat early and countered it by packaging IE with Windows, then tried to prolong the inevitable by locking the IE team away in a dark basement and stifling browser innovation by favoring closed solutions for browser development like Silverlight over open standards like HTML5. That strategy clearly wouldn't work forever, and the net result was a big boost in the market share of competing browsers like Firefox and ultimately Chrome. Suddenly people weren't writing native Windows apps anymore; they were writing applications that ran in the browser and could run on any OS.

The pattern of trumping a dominant platform by building at a higher level has repeated itself many times since. In some sense, Google subverted platform power from the browser by becoming the dominant discovery mechanism for browser apps. When social burst onto the scene, Facebook and Twitter became kings of the hill by changing the game again. The move to mobile devices has created a bit of a flashback to the days of OS platform dominance, but it's inevitably a temporary shift. At some point history will repeat itself: devices will continue to become more powerful, standards will prevail, and developers will insist on a way to avoid writing the same app for multiple platforms.

Which brings us to today, as the platforms du jour are again threatened. In this iteration, the challenger to the dominance of Facebook and Twitter is the domain-specific social apps that are built on top of them. When social network users share their status with friends, text + images + location isn't enough anymore. Different kinds of activities call for customized mechanisms of data entry and ways to share the data that are tailored to the experience. For instance, when I play 18 holes of golf I enter and share my data with Golfshot GPS, which makes data entry a joy by providing yardages and information about the course and gives my friends the ability to see very granular details on my round when I share. When I drink a beer I share with Untappd, when I eat at a restaurant I share a Yelp review, and if I want to share a panoramic view I use Panorama 360. Even basic functions like sharing photos and location work better with Instagram and Foursquare than with Facebook's built-in mechanisms.

The social networks will never be able to provide this kind of rich interaction for every experience, and they shouldn't attempt to. At the same time, they run the risk of the higher-level apps becoming the social network and stealing eyeballs, a position which some apps like Foursquare clearly already have their eyes on. For power users, these apps have already made themselves the place to go to enter domain-specific data. That trend will continue to expand into the mainstream as people continue to dream up rich ways to capture real-life experiences through customized apps. To use the OS analogy: there's no way that Microsoft could dream up everything that people want to build on top of Windows and bake it into the OS, nor would it be a good thing for consumers if they could.

It will be interesting to see how Facebook and Twitter respond to the trend. I suspect that users will continue to move towards domain-specific apps for sharing, but that the social networks will remain the place to browse aggregated status for friends across specific domains. Unless, of course, the owners of the highest-profile apps somehow manage to get together and develop an open standard for sharing/storing data and create an alternative browse experience across apps to avoid being limited by the whims of Facebook and Twitter and the limitations of their APIs.

The Physical Versus The Digital

I don't want to buy things twice. I'm even more hesitant to pay again for intellectual property, which costs little or nothing to clone. I don't want to buy Angry Birds for my iPhone, Kindle Fire, PC, and Xbox 360. I'm even crankier about buying digital goods when I've already bought the IP via physical media. I want the convenience of reading my old college textbooks on my Kindle without buying them again, and I shouldn't have to. I hate the dilemma of trying to figure out whether to order my grad school textbooks digitally (because it's lightweight, convenient, and portable) or not (because the pictures render properly, it's handier to browse, and it looks cooler on the shelf). Maybe I'm in the minority here, but I'm also too lazy to consider buying and setting up a DIY Book Scanner.

Anyone who reads, plays games, or listens to music has shelves or boxes of books, NES cartridges, or CDs that they probably don't use often and don't know what to do with. I would love the option to fire up RBI Baseball or reread A Storm of Swords on modern devices with the push of a button, but it's not worth storing the physical media and/or keeping obsolete devices around.

My frustration has caused me to conclude the relatively obvious: some company needs to offer a way to send back physical media, along with a nominal fee, in trade for the digital version. The physical media could be resold secondhand or donated to charitable causes, and the folks ditching their physical media could access the things they have already paid for in a more convenient format. Amazon is the one company that seems poised to make this happen, given that it deals in both physical and digital goods and has efficient content delivery mechanisms in place for both. Is there a financial model that makes swapping physical for digital work for all parties involved, and is it something that will ever happen?

Why You’re Missing the Boat on Facebook Stock

I was about two hours into a five-hour drive en route to an annual weekend golf trip when Facebook went public. That made me a captive audience for the 70-something-year-old family friend (who admittedly is a sharp cookie at his age and a damn good golfer) in my back seat as he lectured the rest of us on why the stock would be worthless in five years. In the weeks since, I've heard a million flavors of the same message from people whose tech savvy ranges from expert hacker to completely clueless. I respectfully disagree, and I think that there is a compelling technical argument for why Facebook has tremendous upside as a company. So let's consider the question: should we all be buying Facebook stock at post-IPO prices?

The Completely Tangential Bit

The first answer that I get from most folks is no, because Facebook adds no real value to people's lives. In fact, in some ways the result of the company's existence is a net negative because it causes people to waste massive amounts of time and/or productivity. The company doesn't produce goods or real services, and some would argue that it's just glorified LOLcats. I actually kind of agree, but I don't think that it matters. What Facebook does produce, as a sort of byproduct, is an absolutely massive repository of personal data. More on that later.

The Red Herring

The next objection that people raise is based on an assumption that the primary way to monetize the website is ads. The company has certainly toyed with all kinds of ways of putting paid content in front of users, and the early returns seem to indicate that Facebook's ads don't work (at least not compared to Google's paid search advertising). It doesn't take a rocket scientist to realize that social pages are a whole different beast than search results pages. When people visit Google, their intent is to navigate to another page about a topic. They don't particularly care whether the link that takes them there is an algorithmic search result or a paid ad; they're just looking for the most promising place to click. When people visit their BFF's Facebook page, they aren't looking to leave the site; they're planning on killing some time by checking what their friends are up to. So again, on this point I agree; I'm skeptical that Facebook will ever see the kind of crazy revenue growth from ads or any sort of paid content that would justify even the current stock price. But advertising is just one way to skin a cat…

The Glimmer of Hope

But slightly off the topic of ads, in the related space of online sales and marketing, is where the first signs of promise can be found. Let's get back to that data thing: Facebook has an absolute gold mine of knowledge that other companies would pay cold hard cash to access. Consider Amazon, for example. Amazon spends plenty of money mining user data to make more educated recommendations based on past purchase history. What would it be worth to them to find out that I have an 8-month-old daughter, so I need to buy diapers on a regular basis? That I love Muse, so I may be interested in purchasing and downloading their new album? That I checked in at CenturyLink Field for a Sounders match last week, so maybe they can tempt me with a new jersey? Those are some of the more obvious suggestions, but there are more elaborate scenarios that could be interesting. What if you could combine Amazon purchase data with Facebook social graphs, figure out that three of my friends recently bought a book on a topic that I'm also interested in, and then offer those friends and me a discount on a future purchase if I buy the book as well?

Facebook's current market cap as I'm writing this is sitting at $57 billion. To get to a more reasonable price-to-earnings multiple of 20, which seems roughly in line with other growth companies in the industry, they need to add around $2 billion in annual earnings. Based on the numbers that I could dig up, that's less than 1% of online sales in the US alone. Is that possible? Consider the margins of the biggest online retailer. Amazon is legendary for operating on razor-thin margins, but its US margins last year were around 3.5%. How much of that margin would it part with for ultra-meaningful personalization data that could have a huge positive impact on sales volume? Also, keep in mind that these numbers are for the US only, and they don't include the astronomical projected growth in online sales moving forward. Regardless of exactly what the model looks like, I think there is a path for Facebook to leverage its data to grab some small piece of that growing pie.
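
For the skeptics, here's the back-of-envelope math behind that paragraph. The market cap figure comes from above; the current-earnings and US online sales numbers are my rough circa-2012 assumptions:

```python
# Back-of-envelope valuation math. Only market_cap and target_pe come
# from the post; the other two figures are rough assumptions.
market_cap = 57e9
target_pe = 20
needed_earnings = market_cap / target_pe     # ≈ $2.85B/yr to justify the cap
current_earnings = 1e9                       # assumed, roughly FB's 2011 net income
gap = needed_earnings - current_earnings     # ≈ $1.85B: "around 2 billion"

us_online_sales = 225e9                      # assumed annual US online sales
print(f"gap ≈ {gap / us_online_sales:.1%} of US online sales")   # -> ≈ 0.8%
```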

The privacy hawks out there are already sounding alarms; I can hear them from where I'm sitting. But who says that there isn't a model of sharing data that Facebook users would be happy with? I would venture that there are arrangements where users would be happy to share certain kinds of information to get a more relevant shopping experience. Taking things one step further, there are certainly users who would expose personal information in exchange for deals or rebates that online retailers like Amazon could kick back as an incentive to get the ball rolling, and Amazon isn't one to pass on a loss leader that drives business with a long-term promise of return on investment.

The Real Diamond In The Rough

And that gets us to the crux of the matter. Online sales are just one example of a market that Facebook can get into and leverage its data to make a buck. The evolution of computer hardware, the maturity of software that makes it trivial to perform distributed computation in the cloud, and continued advances in machine learning have ushered in the age of big data. Computer scientists who specialize in machine learning and data mining are being recruited to solve problems in every field from pharmaceuticals to agriculture. And the currency that these scientists deal in is huge amounts of data. Facebook has data in spades, and it has a very valuable kind of data that nobody else has.

The model for monetizing that data isn't clear yet, but I can think of possibilities that make me optimistic that good models exist. For example, think about the kind of money that Microsoft continues to pour into improving Bing in hopes of leapfrogging Google's relevance and becoming the leader in online search. Facebook's data could be an absolutely massive advantage in trying to disambiguate results and tailor content to a particular user. Google's SPYW bet and Bing's Facebook integration are different approaches to integrating bits of social data into search, but they fall way short of the kind of gain that could be had via direct access to Facebook's massive trove of social data.

Or suppose that a company or government body is trying to gain information about the spread of a particular disease. Maybe they have medical records that include the identities of people who are carriers, but not much more than that. With access to Facebook's data, they could suddenly know about the ethnicity, social network (who's hanging out with whom), and habits (through check-ins) of people in both classes: carriers and non-carriers. Applying machine learning to that training set may yield some interesting information on what traits correlate with becoming a carrier of the disease.
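
A toy version of that study might look like the sketch below: a logistic regression over a handful of made-up features from a hypothetical merged Facebook/medical dataset. The data and features are pure fiction; the point is just that the social graph gives you signals a medical record alone never could:

```python
# Toy carrier study. Every number here is fabricated for illustration.
from sklearn.linear_model import LogisticRegression

# columns: ethnicity code, avg check-ins/week, # of carrier friends
X = [[0, 2, 1], [1, 5, 4], [0, 1, 0], [1, 6, 5], [0, 3, 2], [1, 4, 3]]
y = [0, 1, 0, 1, 0, 1]   # 1 = carrier, from the (hypothetical) medical records

model = LogisticRegression().fit(X, y)
print(model.coef_)   # which traits correlate with carrier status
```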

The One Armed Bandit

Of course, there's a risk involved. As a friend of mine aptly pointed out, my case for Facebook's value looks something like: 1) have a lot of important data, 2) mystery step, 3) profit. I would argue that if the mystery step were clear today, the valuation of Facebook stock would be much higher than even where it's currently trading. I've given a few fictional examples to make the case that the mystery step probably exists. If you buy that argument, then you too should be buying Facebook stock. And this bar may be serving some expensive drinks in the future.

How the Cloud Saved Me from Hacker News

If you're reading this post, we probably have one thing in common: we both spend at least some of our free cycles perusing Hacker News. I know this because it has driven most of my blog traffic over the past week. I have a habit of submitting my recent blog posts, and the other day I was surprised to see one particular post climb to number three on the Hacker News homepage. My excitement quickly gave way to panic, however, as I realized that the sudden rush of traffic had taken my blog down in the middle of its shining hour.

Back up a couple of months. I started blogging back in 2009 on Blogspot. At some point I was tempted by the offer of a free AWS EC2 Micro Instance; I had been thinking about setting up a private Git server and running a few other servers in the cloud, and I decided that, like all of these guys, I would migrate my blog to self-hosted WordPress on EC2. The whole migration was rather painless; I'll spare you the monotonous details because there are quite a few blog posts out there on getting the setup up and running and on how to move content. I will say that the one issue I ran into was with the existing BitNami AMIs preinstalled with WordPress, so I ended up picking a vanilla Ubuntu AMI and installing LAMP + WordPress myself. Suffice it to say that I'm still relatively new to the Linux world, and I pulled it off without much trouble.

But now, my blog was down. Fortunately, I was able to cruise over to the AWS Management Console, stop my EC2 instance, upgrade temporarily to a Large instance, restart it, and then update my Elastic IP. Just like that I was back in business, and my blog that previously got a couple hundred hits on busy days suddenly fielded over 20k hits in a day and another 6k over the next few days.
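
For the curious, those console clicks boil down to something like the following, expressed with today's boto3 library (which postdates this post); the instance ID and target instance type are placeholders:

```python
# Roughly the same resize, scripted. Instance ID and type are hypothetical.
import boto3

ec2 = boto3.client("ec2")
instance_id = "i-0123456789abcdef0"   # placeholder

ec2.stop_instances(InstanceIds=[instance_id])
ec2.get_waiter("instance_stopped").wait(InstanceIds=[instance_id])

# The instance type can only be changed while the instance is stopped.
ec2.modify_instance_attribute(
    InstanceId=instance_id, InstanceType={"Value": "m5.large"}
)

ec2.start_instances(InstanceIds=[instance_id])
# Then re-associate the Elastic IP if it doesn't follow the instance.
```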

I figured I would throw together a quick post on my experience for a few reasons. First, because some folks who post to Hacker News may not know exactly what to expect if they make the homepage. Read: if you're EC2 hosted, upgrade your instance size ahead of time. And second, I just wanted to marvel at the power of the cloud. A decade and a half ago, I remember ordering a physical Dell rack server and hauling it over to a local ISP, where I colocated it for a couple hundred bucks a month and used it to host a few websites and custom applications. The fact that I can now spin up a virtual machine in the cloud in minutes, have my software stack up and running in less than an hour, and instantly scale to accommodate huge traffic variance (and all for cheap) is a testament to the infrastructure underneath modern cloud offerings.

The Software Developer’s Guide to Fitness & Morning Productivity

If you're a software developer (or frankly, if you spend a large portion of your day sitting in a chair in front of a computer), you will be more productive if you find a way to incorporate a workout into your daily routine. I genuinely believe that if you're working 8-hour days today, you will get more done working 7 hours and squeezing in 30-40 minutes of physical exercise. I believe this because a couple months ago my family and I moved into the city a few blocks from where I work, and I traded long commutes sitting in traffic for some relaxing morning time with the family and a quick workout at the fitness center down the hall.

The value of living close to work and having a bit of relaxing time in the morning is probably fairly self-explanatory, so for now I want to focus on why I've found exercising to be so valuable. I also want to call out a few things I've learned in the process that I hope will make your life easier if you aren't exercising regularly and decide at some point to incorporate a workout into your day. I don't claim to be a personal trainer or any kind of fitness expert (although I've consulted a few while putting together a program that's effective and gets me in and out of the gym quickly). Don't treat this post as a replacement for good advice from qualified health and fitness professionals; think of it as one computer geek sharing some practical tips with his fellow geeks about a particular way to get in shape and increase productivity.

Benefits of Exercise

From a pure productivity perspective, the biggest benefit of exercising for me is specific to working out in the morning. Rather than getting to the office feeling like I need another two hours of sleep and only four cups of coffee will get me through the day, I show up feeling awake and ready to start knocking off tasks in my queue. Because many of the folks on my teams tend to show up at 10 or 11 and work late, my schedule is generally meeting-free in the morning, which also makes it the most valuable time to be productive.

I don't have evidence to support this, but anecdotally I have observed a link between fitness and career success. That's not to say you can't have one without the other, but I believe you have a better shot at being successful in your career if you work out on a regular basis. Working out makes you feel good, boosts your energy levels, helps strengthen your core muscles so that you're comfortable sitting in a chair all day, gives you confidence, and perhaps most importantly gets you in the habit of setting goals and achieving them over long periods of time. When you're jumping between jobs, there's also evidence to suggest that interviewers make a hire/no-hire decision in the first 15 seconds of the interview process that is extremely tough to overturn, and whether you like it or not, that first impression includes what you look like.

When to Exercise

Some people believe that working out in the morning boosts your metabolism throughout the rest of the day, but the limited research I've seen suggests that regardless of when you work out, you get a short metabolism boost that goes away after a set amount of time. I've touched on why I find working out in the morning to be especially beneficial, but I would recommend working out at a time when you know you can be consistent; if you try to vary your workout daily according to your schedule, you're going to be way more likely to skip it. If the only way you can be consistent is to take a quick jog on a treadmill in a three-piece suit at lunch, do that… and do it consistently.

How to Exercise

Map out a routine that’s short and sweet, and ideally one that you enjoy. Get your heart rate up to your target zone and try to keep it up for 20-30 minutes. Pick a few exercises and do them in circuits with little or no rest between exercises (and a short rest between sets), at high intensity. Lean towards workouts that work large groups of muscles, for example doing push ups (or better yet, burpees) instead of bench press.

Personally, I run 1-2 miles and then pick three different exercises and do them in a circuit. I split the exercises into upper body, lower body, and core, and I try to make sure that I hit each big muscle group at least once per week. It gets me in and out of the apartment gym in around half an hour, and I've found it to be effective. If you're having trouble figuring out what exercises to incorporate into your workout, chat with a trainer or check out one of the apps (there are several CrossFit WOD-specific ones if you want to go that route) that are available on any phone.

How to Eat

One of the first things I noticed when I started working out was that after my morning burst of energy, I would start getting tired right before lunch. I figured out that eating protein in the morning helped, so I ordered a big tub of whey protein and started making a quick fruit/protein shake with some yogurt/milk every morning. Remember that your body needs protein to rebuild muscles after a workout, and if you're like me, you're probably not in the habit of eating enough protein to start your day. Protein also provides energy over a longer period than carbs, so you'll be getting fuel from your morning snack for longer.

Hope you find this helpful, and if you figure out any workout tips of your own as you get going please do share!

Your Critical Data Isn’t Safe

I'm willing to bet that just about every working moment of your life up to this very instant has been an attempt to make money with the goal of accumulating enough wealth to live comfortably and achieve a set of objectives (retirement, travel, an increased standard of living). That nest egg is probably stored at a bank on a computer system as a set of 1's and 0's on a disk. Current analysis shows that, on average, somewhere between 2-14% of hard disks in data centers fail every year. In other words, every single month x% of my paycheck is removed for retirement savings and stored on a disk that has around a 1 in 20 chance of failing sometime this year, and if that money isn't available in 30 years, I'm hosed.
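
If you compound a disk failure rate over a retirement-sized horizon, the picture gets ugly fast. A quick sketch, assuming a 5% annualized failure rate (the middle of that 2-14% range) and, unrealistically, a single disk with no replication:

```python
# Survival odds for one unreplicated disk. The 5% AFR is an assumption
# taken from the middle of the 2-14% range cited above.
afr = 0.05
years = 30
survival = (1 - afr) ** years
print(f"P(disk survives {years} years) ≈ {survival:.0%}")   # -> ≈ 21%
```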

A dire situation indeed, but I'm obviously omitting a few important little details. Fortunately for me, the bank has a vested interest (both government- and shareholder-driven) in seeing me cash out my retirement dollars someday, so it has hired a small army of programmers to design systems that guarantee my financial data is safe. The question is, how safe? Is my blind faith that my digitally stored assets will never be lost justified?

Let's start by considering a few basic scenarios around data storage and persistence on a single machine. Suppose that I'm typing up a document in a word processing application. What assumptions can I make about whether my data is safe from being lost? Most modern hardware splits storage between fast volatile storage, where contents are lost without power (memory), and slower non-volatile storage, where contents persist without power (disk). It's possible that future advances in non-volatile memory will break down these barriers and completely revolutionize the way we approach programming computers, but that's a lengthy discussion for another time. For now, it's safe to assume that my word processor is almost certainly keeping the contents of my document in memory to keep the application's user interface speedy, so something as trivial as a quick power blip can cause me to lose my data.

One way to solve this problem is by adding hardware, so let's say I head to the store and buy a nice, beefy UPS. I've covered myself against the short power outage scenario, but what about when I spill my morning coffee on my computer case and short out the power supply? My critical document still exists only in memory on a single physical machine, and if that machine dies for any reason, I'm in a world of hurt.

Suppose I decide to solve this by pushing CTRL+S to save my document to disk every five minutes. Can I even assume that my data is being stored on disk when I tell my application to save it? Technically no; it depends on the behavior of both my word processor and the operating system. When I push save, the word processor is likely making a system call to get a file descriptor (if it doesn't already have one) and making another system call to write some data using that file descriptor. At this point the operating system probably still hasn't written the data to disk; instead, it has probably written it to a disk buffer in memory that won't be flushed to disk until the buffer fills up or someone tells the operating system to flush it.
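
In POSIX terms, a save that actually wants the bytes on the platter has to do more than just write(). Here's a minimal sketch in Python, with the file name and contents obviously made up:

```python
import os

# What "save" has to do if it really wants the bytes on the device:
# write() alone usually lands in buffers, not on disk.
draft = "the great American novel, draft 37"   # made-up document contents

with open("document.txt", "w") as f:
    f.write(draft)         # lands in the application/OS buffers
    f.flush()              # push Python's userspace buffer down to the OS
    os.fsync(f.fileno())   # ask the OS to flush its page cache to the device
# Even fsync can be undermined by a drive's own volatile write cache
# acknowledging too early, so "on disk" still isn't an absolute guarantee.
```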

Let's assume that I've actually examined the code of my word processor and I see that when I press save, it both writes the data and flushes the disk buffer. Can I guarantee that my data is on disk when I press save? Probably, but it's still possible that I will lose power before the operating system has the chance to write all of my data from the buffer to disk. People who implement file systems have to carefully consider these kinds of edge cases and define a single atomic event that constitutes crossing the Rubicon, the point of no return. In many current file systems, that event is the writing of a particular disk segment in a journal with enough data to repeat the operation: if the write to the journal completes, then the entire write is considered complete; if it doesn't, then any portion of the write that has completed should be invalidated.

What if I can somehow guarantee that the disk write transaction has completed and my document has been written to the disk? Now how safe is my data? I've already touched briefly on hard disk failure rates. My disk could die for a variety of electronic or mechanical reasons, or because of non-physical corruption to either the firmware or something like the file allocation table.

Again I turn to hardware, and I decide to set my computer up to use RAID 1 so that my data is saved to multiple redundant disks in the same physical machine. I've drastically reduced the chance of losing my data to the most common disk failure issues, but my data remains at risk of being lost in a local fire or any other event that could cause physical damage to my machine. I may be able to recover the contents of one of the disks despite the machine taking a licking, but there aren't any guarantees, and even if I can recover the data, it's likely to take significant effort and a lot of time.

I've pretty much run out of local options, so I turn to the promise of the cloud. I script a backup of my file system to some arbitrary cloud data storage every N minutes. I decide that I'm alright with losing a few updates between backups, and the data store tells me that it will mirror my data on disks in separate machines in at least N geographically distinct locales across the globe. So what are the odds that I lose it? Obviously a world-class catastrophe like a meteor striking earth could still obliterate my data, but in that scenario I probably wouldn't be too stressed about losing my document. So what credible threats remain?

One of the biggest dangers for data stored in the cloud is the software that powers the cloud. A while ago I worked on a project (that I won't name) that involved a very large scale distributed data store with geographic redundancy. We had fairly sophisticated environment management software that handled deploying our application plus data, monitoring the health of the system, and in some cases taking corrective action when anomalies were detected (for things like hardware failure, for example reimaging a machine when it first came online after getting a new disk drive). At one point, a bug in the management software caused it to simultaneously start reimaging machines in every data center around the world. The next few days ended up being pretty wild ones as we worked to mitigate the damage, brought machines back up, and worked through various system edge cases that we had never previously considered. We lost a significant amount of data, but we were fortunate because the kind of data our system cared about could be rebuilt from various primary data stores. If that hadn't been the case, we would have lost critical data with significant business impact.

Another risk to data in any cloud is people with the power to bring that cloud down: a disgruntled organization member or employee, an external hacker, or even a government. When arbitrary control of a system can be obtained via any attack vector, or even by physical force, one of the potential outcomes is intentional deletion of data. I've focused this thread on data safety (by which I mean prevention of data loss) rather than data security (which I take to mean both safety and the guarantee of keeping data private), and malicious access to data has tended to target the latter, since stolen data is lucrative. But it's perfectly plausible that future attacks could focus on trying to delete or alter data and destroy the means of recovering from the loss, regardless of the degree of replication. Think digital Tyler Durden. People who stored data on MegaUpload probably never envisioned that they would lose it.

My main point is that whether data is held in local memory, on disk, replicated on a few redundant local disks, or distributed across continents and data centers, there is always some degree of risk of losing it. Based on my anecdotal experience, most people don't associate the correct level of risk with data loss, regardless of where the data lives. Those kinds of considerations will become increasingly important as more and more data moves to both public and private clouds with varying infrastructures. There is no such thing as data that can't be lost, only ways to make data less likely to be lost.

Shifting Gears A Bit

In the past I've tended to blog (rather infrequently) about technical solutions to problems that I've stubbed my toes on, in hopes that I would spread the love and save others from getting stumped by the same problems. But a few things have happened recently that have impacted the kind of stuff I will probably bother to blog about in the future. First, Stack Overflow has essentially become the single source for answers to technical coding questions. Joel Spolsky may claim that the primary UI for Stack Overflow is Google, but to be honest, the content on the site is generally so good these days that I head straight there to unlock the deepest, darkest coding mysteries, bypassing blogs and other sources of wisdom in the process. I'm sure others do the same, so the value of answering technical questions in a blog is probably diminished.

Another change-inspiring factor is that about three months ago I quit my job at Microsoft after over six years working on Bing and took a position on the WAP team at Amazon. I won't bother with the details of what inspired the change, but I'll briefly say that I really enjoyed my time at Microsoft and I've also loved working at Amazon thus far. I'll also point out that I find it pretty remarkable how differently the two companies function, and specifically how different the “Manager” job at Amazon is from the “Lead” job at Microsoft (which are essentially equivalent roles). In a nutshell, as a Software Development Manager at Amazon you run your team as if you're running a small startup within a big company, so you're on the hook for everything from team strategy and internal marketing to sourcing and hiring to product design and implementation. One of the downsides to this approach is that it doesn't leave room for the 20-30% coding time that Microsoft typically encourages Software Development Leads to partake in. As a result, I'll probably start focusing the blog a bit more on effectively running a software development team, and a bit less on nitty-gritty coding and technical problems.

A third factor is that I've started taking classes in the UW PMP CS program, which has me daydreaming about things like compilers and operating systems, so material from my classes and questions related to it may seep into my blog posts from time to time.

That’s all for now, just a brief explanation to the few readers who trickle by my blog that the scenery may change just a bit.