
Ethereal Bits

Tyson Trautmann's musings on software, continuous delivery, management, & life.



Why You’re Missing the Boat on Facebook Stock

June 9, 2012 by Tyson Trautmann

I was about 2 hours into a 5-hour drive en route to an annual weekend golf trip when Facebook went public. That made me a captive audience for the 70-something-year-old family friend (who admittedly is a sharp cookie at his age and a damn good golfer) in my back seat as he lectured the rest of us on why the stock would be worthless in five years. In the weeks since, I've heard a million flavors of the same message from people whose tech savvy ranges from expert hacker to completely clueless. I respectfully disagree, and I think there is a compelling technical argument for why Facebook has tremendous upside as a company. So let's consider the question: should we all be buying Facebook stock at post-IPO prices?

The Completely Tangential Bit

The first answer that I get from most folks is no, because Facebook adds no real value to people's lives. In fact, in some ways the company's existence is a net negative because it causes people to waste massive amounts of time and productivity. The company doesn't produce goods or real services, and some would argue that it's just a glorified LOLcats. I actually kind of agree, but I don't think it matters. What Facebook does produce, as a sort of byproduct, is an absolutely massive repository of personal data. More on that later.

The Red Herring

The next objection that people raise is based on the assumption that the primary way to monetize the website is ads. The company has certainly toyed with all kinds of ways of putting paid content in front of users, and the early returns seem to indicate that Facebook's ads don't work (at least not compared to Google's paid search advertising). It doesn't take a rocket scientist to realize that social pages are a whole different beast than search results pages. When people visit Google, their intent is to navigate to another page about a topic. They don't particularly care whether the link that takes them there is an algorithmic search result or a paid ad; they're just looking for the most promising place to click. When people visit their BFF's Facebook page they aren't looking to leave the site; they're planning on killing some time by checking what their friends are up to. So again, on this point I agree: I'm skeptical that Facebook will ever see the kind of crazy revenue growth from ads or any other paid content on their site that would justify even the current stock price. But advertising is just one way to skin a cat…

The Glimmer of Hope

But slightly off the topic of ads, in the related space of online sales and marketing, is where the first signs of promise can be found. Let's get back to that data thing: Facebook has an absolute gold mine of knowledge that other companies would pay cold hard cash to access. Consider Amazon, for example. Amazon spends plenty of money mining user data to make more educated recommendations based on past purchase history. What would it be worth to them to know that I have an 8-month-old daughter, so I need to buy diapers on a regular basis? That I love Muse, so I may be interested in purchasing and downloading their new album? That I checked in at CenturyLink Field for a Sounders match last week, so maybe they can tempt me with a new jersey? Those are some of the more obvious suggestions, but there are more elaborate scenarios that could be interesting. What if you could combine Amazon purchase data with Facebook social graphs, figure out that three of my friends recently bought a book on a topic that I'm also interested in, and then offer those friends and me a discount on a future purchase if I buy the book as well?
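To make that last scenario concrete, here's a toy sketch of the underlying query. Everything in it is fabricated for illustration; neither company exposes data structures like these:

```python
# Toy example: find items bought by several of a user's friends recently,
# then propose a group discount. All data here is made up.

friends = {
    "tyson": {"alice", "bob", "carol", "dave"},
}

recent_purchases = {
    "alice": {"Distributed Systems 101"},
    "bob": {"Distributed Systems 101", "Golf for Geeks"},
    "carol": {"Distributed Systems 101"},
    "dave": {"Golf for Geeks"},
}

def group_discount_candidates(user, min_friends=3):
    """Return items bought by at least min_friends of the user's friends."""
    counts = {}
    for friend in friends.get(user, set()):
        for item in recent_purchases.get(friend, set()):
            counts[item] = counts.get(item, 0) + 1
    return [item for item, n in counts.items() if n >= min_friends]

print(group_discount_candidates("tyson"))  # ['Distributed Systems 101']
```

The social graph is the one input to that join that Amazon can't build on its own; that's the asset Facebook would be selling.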

Facebook's current market cap as I'm writing this is sitting at $57 billion. To get to a more reasonable price-to-earnings multiple of 20, which seems relatively in line with other growth companies in the industry, they need to add around $2 billion in annual earnings. Based on the numbers that I could dig up, that's less than 1% of online sales in the US alone. Is that possible? Consider the margins of the biggest online retailer. Amazon is legendary for operating on razor-thin margins, but their US margins last year were around 3.5%. How much of that margin would they part with for ultra-meaningful personalization data that could have a huge positive impact on sales volume? Also, keep in mind that these numbers are for the US only, and they don't include the astronomical projected growth in online sales moving forward. Regardless of exactly what the model looks like, I think there is a path for Facebook to leverage their data to grab some small piece of that growing pie.
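The back-of-the-envelope math is easy to sanity-check. The current-earnings figure (roughly $1 billion, in the ballpark of what Facebook reported for 2011) and the ~$200 billion figure for annual US online retail are my own rough inputs, not gospel:

```python
# Back-of-the-envelope check on the valuation argument above.
market_cap = 57e9        # Facebook market cap as of this writing
target_pe = 20           # a "reasonable" growth-company multiple
current_earnings = 1e9   # rough assumption, near FB's reported 2011 net income
us_online_sales = 200e9  # rough assumption for annual US online retail

target_earnings = market_cap / target_pe                   # ~2.85 billion
additional_earnings = target_earnings - current_earnings   # ~1.85 billion

print(f"earnings needed at 20x:   ${target_earnings / 1e9:.2f}B")
print(f"additional earnings:      ${additional_earnings / 1e9:.2f}B")
print(f"share of US online sales: {additional_earnings / us_online_sales:.1%}")
# ~0.9%, i.e. "less than 1% of online sales in the US alone"
```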

The privacy hawks out there are already sounding alarms; I can hear them from where I'm sitting. But who says there isn't a model of sharing data that Facebook users would be happy with? I would venture that there are arrangements where users would gladly share certain kinds of information to get a more relevant shopping experience. Taking things one step further, there are certainly users who would expose personal information in exchange for deals or rebates that online retailers like Amazon could kick back as an incentive to get the ball rolling, and Amazon isn't one to pass on a loss leader that carries a long-term promise of return on investment.

The Real Diamond In The Rough

And that gets us to the crux of the matter. Online sales are just one example of a market that Facebook can get into and leverage its data to make a buck. The evolution of computer hardware, the maturity of software that makes it trivial to perform distributed computation in the cloud, and continued advances in machine learning have ushered in the age of big data. Computer scientists who specialize in machine learning and data mining are being recruited to solve problems in every field from pharmaceuticals to agriculture. And the currency that these scientists deal in is huge amounts of data. Facebook has data in spades, and it has a very valuable kind of data that nobody else has.

The model for monetizing that data isn't clear yet, but I can think of possibilities that make me optimistic that good models exist. For example, think about the kind of money that Microsoft continues to pour into improving Bing and leapfrogging Google's relevance to become the leader in online search. Facebook's data could be an absolutely massive advantage in trying to disambiguate results and tailor content to a particular user. Google's SPYW bet and Bing's Facebook integration are different approaches to integrating bits of social data into search, but they fall way short of the kind of gain that could be had via direct access to Facebook's massive amount of social data.

Or suppose that a company or government body is trying to gain information about the spread of a particular disease. Maybe they have medical records that include the identities of people who are carriers, but not much more than that. If they had access to Facebook's data they could suddenly know about the ethnicity, social network (who's hanging out with whom), and habits (through check-ins) of people in both classes: carriers and non-carriers. Applying machine learning to that training set may yield some interesting information on which traits correlate with becoming a carrier of the disease.
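As a sketch of what that modeling step might look like (the features, data, and labels below are entirely invented; this is a framing exercise, not anyone's actual pipeline), you could treat it as a plain supervised classification problem:

```python
# Hypothetical sketch: predict disease-carrier status from social features.
# Features and data are fabricated purely for illustration.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 1000

# Made-up feature matrix: [weekly check-ins, # of friends who are carriers,
# demographic bucket A (0/1), demographic bucket B (0/1)]
X = np.column_stack([
    rng.poisson(3, n),
    rng.poisson(1, n),
    rng.integers(0, 2, n),
    rng.integers(0, 2, n),
])
# Fabricated labels loosely tied to friend-carrier count, just so the toy runs.
y = (X[:, 1] + rng.normal(0, 1, n) > 1.5).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = LogisticRegression().fit(X_train, y_train)

print("held-out accuracy:", model.score(X_test, y_test))
print("feature weights:", model.coef_)  # which traits correlate with carrier status
```

The interesting part isn't the classifier, which is commodity; it's that only Facebook can supply columns like "number of friends who are carriers."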

The One Armed Bandit

Of course, there's risk involved. As a friend of mine aptly pointed out, my case for Facebook's value looks something like: 1) have a lot of important data, 2) mystery step, 3) profit. I would argue that if the mystery step were clear today, the valuation of Facebook stock would be much higher than even where it's currently trading. I've given a few fictional examples to make the case that the mystery step probably exists. If you buy that argument, then you too should be buying Facebook stock. And this bar may be serving some expensive drinks in the future.

How the Cloud Saved Me from Hacker News

June 1, 2012 by Tyson Trautmann

If you're reading this post, we probably have one thing in common: we both spend at least some of our free cycles perusing Hacker News. I know this because it has driven most of my blog traffic over the past week. I have a habit of submitting my recent blog posts, and the other day I was surprised to see one particular post climb to number three on the Hacker News homepage. My excitement quickly gave way to panic, however, as I realized that the sudden rush of traffic had taken my blog down in the middle of its shining hour.

Back up a couple of months. I started blogging back in 2009 on Blogspot. At some point I was tempted by the offer of a free AWS EC2 Micro Instance; I had been thinking about setting up a private Git server and running a few other servers in the cloud, and I decided that, like all of these guys, I would migrate my blog to self-hosted WordPress on EC2. The whole migration was rather painless; I'll spare you the monotonous details because there are quite a few blog posts out there on getting the setup running and moving content over. The one issue I ran into was with the existing BitNami AMIs preinstalled with WordPress, so I ended up picking a vanilla Ubuntu AMI and installing LAMP + WordPress myself. Suffice it to say that I'm still relatively new to the Linux world, and I pulled it off without much trouble.

But now my blog was down. Fortunately I was able to cruise over to the AWS Management Console and stop my EC2 instance, temporarily upgrade it to a Large instance, restart it, and then update my Elastic IP. Just like that I was back in business, and my blog, which previously got a couple hundred hits on busy days, suddenly fielded over 20k hits in a day and another 6k over the next few days.
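For what it's worth, that whole console dance can be scripted. A rough sketch using boto3 (which postdates this post; the instance ID, instance type, and allocation ID below are placeholders):

```python
# Sketch: resize a running EC2 instance and re-attach its Elastic IP.
# All IDs and the target instance type are placeholders.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")
instance_id = "i-0123456789abcdef0"

# An instance must be stopped before its type can be changed.
ec2.stop_instances(InstanceIds=[instance_id])
ec2.get_waiter("instance_stopped").wait(InstanceIds=[instance_id])

# Bump to a bigger instance type ahead of the traffic spike.
ec2.modify_instance_attribute(
    InstanceId=instance_id,
    InstanceType={"Value": "m1.large"},
)

ec2.start_instances(InstanceIds=[instance_id])
ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])

# Re-associate the Elastic IP so DNS keeps pointing at the same address.
ec2.associate_address(
    InstanceId=instance_id,
    AllocationId="eipalloc-0123456789abcdef0",
)
```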

I figured I would throw together a quick post on my experience for a few reasons. First, because some folks posting to Hacker News may not know exactly what to expect if they make the homepage. Read: if you're EC2 hosted, upgrade your instance size ahead of time. And second, I just wanted to marvel at the power of the cloud. A decade and a half ago I remember ordering a physical Dell rack server and hauling it over to a local ISP, where I colocated it for a couple hundred bucks a month and used it to host a few websites and custom applications. The fact that I can now spin up a virtual machine in the cloud in minutes, have my software stack up and running in less than an hour, and instantly scale to accommodate huge traffic variance (and all for cheap) is a testament to the infrastructure underneath modern cloud offerings.

The Software Developer’s Guide to Fitness & Morning Productivity

May 27, 2012 by Tyson Trautmann

If you're a software developer (or frankly, if you spend a large portion of your day sitting in a chair in front of a computer), you will be more productive if you find a way to incorporate a workout into your daily routine. I genuinely believe that if you're working 8-hour days today, you will get more done working 7 hours and squeezing in 30-40 minutes of physical exercise. I believe this because a couple months ago my family and I moved into the city a few blocks from where I work, and I traded long commutes sitting in traffic for some relaxing morning time with the family and a quick workout at the fitness center down the hall.

The value of living close to work and having a bit of relaxing time in the morning is probably fairly self-explanatory, but for now I want to focus on why I've found exercising to be so valuable. I also want to call out a few things that I've learned in the process that I hope may make your life easier if you aren't exercising regularly and decide at some point that you want to incorporate a workout into your day. I don't claim to be a personal trainer or any kind of fitness expert (although I've consulted a few while putting together a program that's effective and gets me in and out of the gym quickly). Don't treat this post as a replacement for good advice from qualified health and fitness professionals; think of it as one computer geek sharing some practical tips with his fellow geeks about a particular way to get in shape and increase productivity.

Benefits of Exercise

From a pure productivity perspective, the biggest benefit of exercising for me is specific to working out in the morning. Rather than getting to the office feeling like I need another two hours of sleep and only four cups of coffee will get me through the day, I show up feeling awake and ready to start knocking off tasks in my queue. Because many of the folks on my teams tend to show up at 10 or 11 and work late, my schedule is generally meeting-free in the morning, which also makes it the most valuable time to be productive.

I don't have evidence to support this, but anecdotally I have observed a link between fitness and career success. That's not to say that you can't have one without the other, but I believe that you have a better shot at being successful in your career if you work out on a regular basis. Working out makes you feel good, boosts your energy levels, helps strengthen your core muscles so that you're comfortable sitting in a chair all day, gives you confidence, and perhaps most importantly gets you in the habit of setting goals and achieving them over long periods of time. When you're jumping between jobs, there's also evidence to suggest that interviewers make a hire/no-hire decision that is extremely tough to overturn in the first 15 seconds of the interview process, and whether you like it or not, that first impression includes what you look like.

When to Exercise

Some people believe that working out in the morning boosts your metabolism throughout the rest of the day, but the limited research that I've seen suggests that regardless of when you work out, you get a short metabolism boost that goes away after a set amount of time. I've touched on why I find working out in the morning to be especially beneficial, but I would recommend working out at a time when you know you can be consistent; if you try to vary your workout daily according to your schedule, you're going to be way more likely to skip it. If the only way that you can be consistent is to take a quick jog on a treadmill in a three-piece suit at lunch, do that… and do it consistently.

How to Exercise

Map out a routine that's short and sweet, and ideally one that you enjoy. Get your heart rate up into your target zone and try to keep it there for 20-30 minutes. Pick a few exercises and do them in circuits with little or no rest between exercises (and a short rest between sets), at high intensity. Lean towards movements that work large groups of muscles, for example doing push-ups (or better yet, burpees) instead of bench press.

Personally, I run 1-2 miles and then pick three different exercises and do them in a circuit. I split the exercises into upper body, lower body, and core, and I try to make sure that I hit each big muscle group at least once per week. It gets me in and out of the apartment gym in around half an hour, and I've found it to be effective. If you're having trouble figuring out which exercises to incorporate into your workout, chat with a trainer or check out one of the many fitness apps (there are several Crossfit WOD-specific ones if you want to go that route) available on any phone.

How to Eat

One of the first things I noticed when I started working out was that after my morning burst of energy, I would start getting tired right before lunch. I figured out that eating protein in the morning helped, so I ordered a big tub of whey protein and started making a quick fruit/protein shake with some yogurt or milk every morning. Remember that your body needs protein to rebuild muscles after a workout, and if you're like me, you're probably not in the habit of eating enough protein to start your day. Protein also digests more slowly than simple carbs, so you'll be getting fuel from your morning snack for longer.

Hope you find this helpful, and if you figure out any workout tips of your own as you get going please do share!

Your Critical Data Isn’t Safe

February 13, 2012 by Tyson Trautmann

I'm willing to bet that just about every working moment of your life up to this very instant has been an attempt to make money, with the goal of accumulating enough wealth to live comfortably and achieve a set of objectives (retirement, travel, an increased standard of living). That nest egg is probably stored at a bank, on a computer system, as a set of 1's and 0's on a disk. Current analysis shows that in data centers, somewhere between 2-14% of hard disks fail every year on average. In other words, every single month x% of my paycheck is removed for retirement savings and stored on a disk that has around a 1 in 20 chance of failing sometime this year, and if that money isn't available in 30 years, I'm hosed.
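To put that "1 in 20" number in perspective, here's a quick back-of-the-envelope calculation. It naively assumes a single disk with a constant 5% annual failure rate, independent years, and no replacement or redundancy, which is exactly the strawman the rest of this post picks apart:

```python
# Naive odds that a single, never-replaced disk survives 30 years,
# assuming a constant 5% annual failure rate and independent years.
annual_failure_rate = 0.05
years = 30

survival = (1 - annual_failure_rate) ** years
print(f"P(disk survives {years} years) = {survival:.2f}")    # ~0.21
print(f"P(disk fails at some point)   = {1 - survival:.2f}") # ~0.79
```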

A dire situation indeed, but I'm obviously omitting a few important little details. Fortunately for me, the bank has a government- and shareholder-vested interest in seeing me cash out my retirement dollars someday, so they've hired a small army of programmers to design systems that guarantee that my financial data is safe. The question is, how safe? Is my blind faith that my digitally stored assets will never be lost justified?


Let's start by considering a few basic scenarios around data storage and persistence on a single machine. Suppose that I'm typing up a document in a word processing application. What assumptions can I make about whether my data is safe from being lost? Most modern hardware splits storage between fast volatile storage, where contents are lost without power (memory), and slower non-volatile storage, where contents persist without power (disk). It's possible that future advances in non-volatile memory will break down these barriers and completely revolutionize the way we approach programming computers, but that's a lengthy discussion for another time. For now it's safe to assume that my word processor is storing the contents of my document in memory to keep the application's user interface speedy, so something as trivial as a quick power blip can cause me to lose my data.

One way to solve this problem is by adding hardware, so let's say I head to the store and buy a nice, beefy UPS. I've covered the short-power-outage scenario, but what about when I spill my morning coffee on my computer case and short out the power supply? My critical document still exists only in memory on a single physical machine, and if that machine dies for any reason, I'm in a world of hurt.

Suppose I decide to solve this by pressing Ctrl+S to save my document to disk every 5 minutes. Can I even assume that my data is being stored on disk when I tell my application to save it? Technically no; it depends on the behavior of both my word processor and the operating system. When I press save, the word processor is likely making a system call to get a file descriptor (if it doesn't already have one) and making another system call to write some data using that file descriptor. At this point the operating system still probably hasn't written the data to disk; instead it has probably written it to a disk buffer in memory that won't get flushed to disk until the buffer fills up or someone tells the operating system to flush it.
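You can see these layers of buffering directly from code. A minimal Python sketch (the file name is made up) of the difference between writing, flushing the application's buffer, and asking the OS to push data all the way to the physical disk:

```python
# Three distinct "levels of saved" when writing a file.
import os

with open("document.txt", "w") as f:
    f.write("my critical document")  # 1) sits in the application's buffer

    f.flush()                        # 2) handed to the OS page cache;
                                     #    still lost if power dies right now

    os.fsync(f.fileno())             # 3) OS asked to push it to the disk
                                     #    itself (modulo the drive's own cache)
```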

Let's assume that I've actually examined the code of my word processor and I see that when I press save, it both writes the data and flushes the disk buffer. Can I guarantee that my data is on disk when I press save? Probably, but it's still possible that I will lose power before the operating system has a chance to write all of my data from the buffer to disk. People who implement file systems have to carefully consider these kinds of edge cases and define a single atomic event that constitutes crossing the Rubicon, the point of no return. In many current file systems that event is the writing of a particular disk segment in a journal with enough data to repeat the operation: if the write to the journal completes, then the entire write is considered complete; if it doesn't, then any portion of the write that has completed should be invalidated.

What if I can somehow guarantee that the disk write transaction has completed and my document has been written to the disk? Now how safe is my data? I've already touched briefly on hard disk failure rates. My disk could die for a variety of electronic or mechanical reasons, or because of non-physical corruption to either the firmware or something like the file allocation table.

Again I turn to hardware, and I decide to set my computer up with RAID 1 so that my data is saved to multiple redundant disks in the same physical machine. I've drastically reduced the chance of losing my data to the most common disk failure issues, but my data remains at risk of being lost in a local fire or any other event that could cause physical damage to my machine. I may be able to recover the contents of one of the disks despite the machine taking a licking, but there are no guarantees, and even if I can recover the data it's likely to take significant effort and a lot of time.

I've pretty much run out of local options, so I run to the promise of the cloud. I script a backup of my file system to some arbitrary cloud data store every N minutes. I decide that I'm alright with losing a few updates between backups, and the data store tells me that it will mirror my data on disks in separate machines in at least N geographically distinct locales across the globe. So what are the odds that I lose it? Obviously a world-class catastrophe like a meteor striking Earth could still obliterate my data, but in that scenario I probably wouldn't be too stressed about losing my document. So what credible threats remain?
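As an aside, the backup script itself can be genuinely tiny these days. A sketch of the idea in Python with boto3 against S3 (the bucket name and paths are placeholders; the object store handles the cross-machine replication for you):

```python
# Sketch: periodically tar a directory and push it to cloud storage.
# Bucket name and paths are placeholders.
import tarfile
import time
import boto3

s3 = boto3.client("s3")
BUCKET = "my-backup-bucket"

def backup(src_dir="/home/me/documents", interval_secs=600):
    while True:
        archive = f"/tmp/backup-{int(time.time())}.tar.gz"
        with tarfile.open(archive, "w:gz") as tar:
            tar.add(src_dir, arcname="documents")
        # The object store replicates this across machines and facilities.
        s3.upload_file(archive, BUCKET, archive.split("/")[-1])
        time.sleep(interval_secs)  # accept losing up to N minutes of updates

if __name__ == "__main__":
    backup()
```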

One of the biggest dangers for data stored in the cloud is the software that powers the cloud. A while ago I worked on a project (that I won't name) that involved a very large-scale distributed data store with geographic redundancy. We had fairly sophisticated environment management software that handled deploying our application plus data, monitoring the health of the system, and in some cases taking corrective action when anomalies were detected (for example, reimaging a machine when it first came online after getting a new disk drive). At one point a bug in the management software caused it to simultaneously start reimaging machines in every data center around the world. The next few days ended up being pretty wild as we worked to mitigate the damage, brought machines back up, and worked through various system edge cases that we had never previously considered. We lost a significant amount of data, but we were fortunate because the kind of data that our system cared about could be rebuilt from various primary data stores. If that hadn't been the case, we would have lost critical data with significant business impact.

Another risk to data in any cloud is people with the power to bring that cloud down: a disgruntled organization member or employee, an external hacker, or even a government. When arbitrary control of a system can be obtained via any attack vector, or even by physical force, one of the potential outcomes is intentional deletion of data. I've focused this post on data safety (by which I mean prevention of data loss) rather than data security (which I take to mean both safety and the guarantee of keeping data private), and malicious access to data tends to aim at the latter, since stolen data is lucrative. But it's perfectly plausible that future attacks could focus on trying to delete or alter data and destroy the means of recovering from the loss, regardless of the degree of replication. Think digital Tyler Durden. People who stored data on MegaUpload probably never envisioned that they would lose it.

My main point is that whether data is held in local memory, on disk, replicated on a few redundant local disks, or distributed across continents and data centers, there is always some degree of risk of losing it. Based on my anecdotal experience, most people don't associate the correct level of risk with data loss, regardless of where the data lives. I think those kinds of considerations will become increasingly important as more and more data moves to both public and private clouds with varying infrastructures. There is no such thing as data that can't be lost, only ways to make data less likely to be lost.

Shifting Gears A Bit

February 8, 2012 by Tyson Trautmann

In the past I've tended to blog (rather infrequently) about technical solutions to problems that I've stubbed my toes on, in hopes that I would spread the love and save others from getting stumped by the same problems. But a few things have happened recently that have changed the kind of stuff I will probably bother to blog about in the future. First, Stack Overflow has essentially become the single source for answers to technical coding questions. Joel Spolsky may claim that the primary UI for Stack Overflow is Google, but to be honest, the content on the site is generally so good these days that I head straight there to unlock the deepest, darkest coding mysteries, bypassing blogs and other sources of wisdom in the process. I'm sure others do the same, so the value of answering technical questions in a blog is probably diminished.

Another factor inspiring the change is that about 3 months ago I quit my job at Microsoft, after over 6 years working on Bing, and took a position on the WAP team at Amazon. I won't bother with the details of what inspired the move; I'll just briefly comment that I really enjoyed my time at Microsoft and I've also loved working at Amazon thus far. I'll also point out that I find it pretty remarkable how differently the two companies function, and specifically how different the "Manager" job at Amazon is from the "Lead" job at Microsoft (which are essentially equivalent roles). In a nutshell, as a Software Development Manager at Amazon you run your team as if you're running a small startup within a big company, so you're on the hook for everything from team strategy and internal marketing to sourcing and hiring to product design and implementation. One of the downsides to this approach is that it doesn't leave room for the 20-30% coding time that Microsoft typically encourages Software Development Leads to partake in. As a result, I'll probably start focusing the blog a bit more on effectively running a software development team, and a bit less on nitty-gritty coding and technical problems.

A third factor is that I've started taking classes in the UW PMP CS program, which has me daydreaming about things like compilers and operating systems, so stuff that I'm learning in classes, or questions related to the material, may seep into my blog posts from time to time.

That’s all for now, just a brief explanation to the few readers who trickle by my blog that the scenery may change just a bit.

Why I Bit the Bullet (& IE’s Box Model CSS Bug)

July 27, 2009 by Tyson Trautmann

As a computer programmer (by trade and for fun) I'm obviously ever reliant on the ability to search the web for info… whether it's parsing documentation, combing bulletin boards to uncover the solution to the daily problems that someone else has doubtless encountered, or the occasional Hail Mary when you have no clue what's going on.

Lately I've been burning a bit of the midnight oil on a pet project, and tonight, while trying to dust the cobwebs off of my HTML/CSS skills, I realized that IE and Firefox were rendering my page differently. I remembered having bumped into the problem before, but being rusty, I had to go fishing, and I found the solution on someone else's blog. The issue is that IE isn't standards compliant in how it sizes elements when padding and/or borders are specified: Firefox correctly adds the padding and border to the specified width of the div (or whatever you're applying the style to), while IE doesn't. There is hope, however. Rather than going to wild extremes to hack the code to behave differently for different browsers, you can just include something like the following at the top of your HTML document to trigger standards mode with a proper document type definition:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">

or

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

The bug is known as the IE box model bug; if you're really curious, you can find more info here. Not the most interesting problem in the world, but it made me realize that rather than just being a leecher of the fount of Internet crap (and wisdom), I too can be a seeder, and thus I begin my own blog about my ventures in coding and all things related to software engineering.
