Tom Preston-Werner — GeistHaus

Oct 22, 2025 Updated Oct 22, 2025

Announcing PWV Fund I

22 Oct 2025 - Bay Area, CA

New achievement! Preston-Werner Ventures (through which I did my angel investing) has become PWV™ and I’m raising a venture fund targeting $100M to lead pre-seed and seed rounds in today’s most exciting software- and AI-driven companies.

Over the past thirteen years, I’ve invested in over 175 amazing startups, including some of the earliest checks in Stripe, Cursor, Netlify, Snyk, Supabase, PlanetScale, Retool, and so many more. I feel tremendously lucky to count myself as a part of the journey of such talented and driven founders and teams.

But finding and funding the best startups is hard to do alone. So three years ago, I joined forces with my good friends David Price and David Thyresson (DT), and together we took my angel investing practices and leveled them up across every metric. More deal flow, deeper diligence, and purposeful community engagement for our founders.

DT and David are both entrepreneurs at heart, having shipped, scaled, and exited on repeat. We know what it’s like to create something from nothing and envision a better future for the users of our creations. And we know how hard that is and how much the right financial partner can increase the probability of success, and how much the wrong partner can make you wish you’d never tried in the first place.

At PWV we’re building the venture firm we wish we’d had early in our startup journeys:

Founder Focused. We succeed in our mission by helping founders succeed in theirs. We love coming in at the seed stage because it’s where our real-world experience and advice can best accelerate timelines, increase founder confidence, and help avoid costly mistakes. We believe that a company is only as good as its team, and a team is only as good as its founders. That’s why we do everything in our power to keep founders happy and focused.

Community Oriented. Our goal is to be the smartest money a founder ever takes. Part of those smarts is realizing that a community can go farther than an individual, which is why we connect founders with each other through our events, programs, and platforms. With over 150 active companies of various maturities in our extended (13-year) portfolio, the pool of knowledge available to our founders is vast. And when startups learn from each other, every stakeholder wins.

Networked. Seed stage is just the beginning of the journey. When it’s time to raise that Series A and go for growth, founders need connections. Through my past at GitHub and many years of deliberate networking, David, DT, and I have fostered amazing relationships with the world’s very best growth-stage investors. There’s nothing better than a warm intro, and nothing that makes us happier than to make them for our portfolio companies when the time is right.

My goal, with every PWV investment, is to back ideas that truly create a differentiated future. Ideas that change the fundamental rules of what’s possible. Ideas that both embrace technology and celebrate people as the most important factor in the better future we want to build.

Ideas start with founders. Founders start with PWV.

If you’re a visionary technology founder looking to raise your pre-seed or seed, we want to hear from you. Please apply at pwv.com/apply.

If you’d like to learn more about investing in PWV Fund I, please email investors@pwv.com.

http://tom.preston-werner.com/2025/10/22/announcing-pwv-fund-1

RedwoodJS’ Next Epoch: All In on React Server Components

May 30, 2023 Updated May 30, 2023

30 May 2023 - Bay Area, CA

RedwoodJS is the full-stack JS/TS web framework designed to help you grow from side project to startup.

For the last year, the RedwoodJS team has been prototyping solutions to the framework’s lack of a proper server-side rendering (SSR) feature. Today, I’m happy to announce that we have chosen to implement a modern SSR solution with a front-end server, leveraging React’s streaming capabilities. This will also allow us to add React Server Components (RSC) to Redwood as our solution to the many downsides of pure single page applications (SPAs). This will require a significant evolution of Redwood’s core architecture, and to celebrate this milestone, the first Redwood release with server rendering capabilities will also be the first release of the next epoch: RedwoodJS Bighorn.

In addition, I’m excited to announce the first ever RedwoodJS Conference, to be held in-person in Grants Pass, Oregon, USA on September 26–29. Join us for an intimate, single-track, two-day gathering (plus a workshop day and a local activity day) nestled in the gorgeous mountain forests of southern Oregon.

If you’ve followed Redwood, you may be wondering how we’ve decided to leave our Jamstack-optimized SPA roots behind in favor of a serverful-first, full-stack React framework future. Here’s a laundry list of reasons why:

Many web app developers need strong SEO performance. This means statically rendered HTML delivered to the browser. Redwood’s existing SPA architecture makes that very difficult, but with server rendering (SSR) it’s baked in.
OG tags (the information in a page’s header that allows it to be intelligently unfurled in Twitter, Facebook, Slack, etc.) are extremely valuable in many contexts and require either statically delivered HTML or some clever workarounds with edge functions. Again, SSR solves this.
RedwoodJS currently uses GraphQL for the React frontend to talk to the GraphQL API backend. While we love GraphQL for many use cases, implementing a secure and performant GraphQL API can be tricky and there is a definite cost to requiring it during early prototyping of your app. With RSC you’ll have transparent API options that let you prototype at the speed of thought while still upgrading some components to a first-class GraphQL experience when that makes sense (perhaps after you’ve added a mobile client and have an experienced GraphQL backend team).
It’s challenging to get top-notch performance out of Redwood in a Jamstack environment. AWS Lambda’s cold start times, code payload limits, and execution timeouts are all hurdles that need to be considered. Most Redwood users today already choose a serverful deployment strategy for exactly these reasons. SSR and RSC are all about performance, both server- and client-side. We intend to make it possible for most of Redwood’s features to continue to work in serverless environments, but from now on we will be optimizing for serverful RSC and all the advantages that will bring.
If you want to pull data from a 3rd party (like Contentful or Shopify) you currently need to run that through your GraphQL API, but with RSC you’ll be able to fetch data from anywhere in a more direct and transparent way.
You can read a full account of RSC’s advantages elsewhere, but more of my favorites are: smaller bundle sizes shipped to the browser, the ability to run large libraries server-side only (more bundle savings), quicker hydration, and easy server-side secrets.
RSC is the future of React. The React team has made this very clear and we are lucky to be in touch with their amazing team members to help us along this path.

When it comes to React Server Components, it’s still the first inning. Implementing RSC in a framework is a big task, especially because there is no official RSC spec or reference implementation yet. We see this as an opportunity to join the ball game and help guide the future of this awesome technology. Our goal is to make Redwood the premier, open source, fully independent, full-stack React framework that you will choose for your next web app project.

Beyond RSC, we have a ton more features and enhancements planned for the Bighorn epoch. You can follow along with the RedwoodJS Roadmap.

If all this gets you as excited as it gets me, please drop by our Community Forums or come meet me and the Redwood core team in person at the RedwoodJS Conference in September!

http://tom.preston-werner.com/2023/05/30/redwoods-next-epoch-all-in-on-rsc

Major Version Numbers are Not Sacred

May 23, 2022 Updated May 23, 2022

23 May 2022 - Positano, Italy

Ten years ago, I sat down and wrote the first version of what became the Semantic Versioning (SemVer) spec. I was tired of everyone using version numbers in whatever way they wanted and knew we could do better if we all agreed on what each part of a version number meant and how those numbers should change as the project’s code changed.

It worked.

Today, SemVer is one of the world’s most well-known and heavily adopted versioning schemes. Several prominent package ecosystems (like Node’s npm and Rust’s Cargo) built in SemVer’s concepts from day one. npm alone has more than 1.8 million packages, nearly all of which are versioned with SemVer.

While it’s unrealistic to expect every package to perfectly adhere to SemVer all the time (mistakes do happen), I’ve seen in practice that the vast majority of package developers are quite careful about version number changes and try to follow SemVer in good faith. This can be hard, though, especially when it comes to breaking changes.

In the time before Semantic Versioning, major version number changes often signified large shifts in the architecture, approach, or broad capabilities of the underlying software. In that world, the minor versions would contain breaking changes, especially for products with large API surface areas. Bumping the major version often corresponded to a marketing push to communicate those improvements to the world. This produced a natural outcome: increasing the major version number was a big deal.

Today, in SemVer ecosystems, it’s still a big deal. And I think that’s a problem.

Our collective hesitance to bump the major version of a package is so strong that we sometimes concoct elaborate justifications as to why a breaking change can be included in a minor version release.

“Not many people are using that feature yet; it’ll be fine.”

Or,

“It was clearly intended to be an experimental feature, so we should be able to retroactively mark it as such and then break it in minor, right? Right? RIGHT????”.

Or plainly,

“There isn’t enough to warrant a major version bump; we’re not going to release a major version for some tiny breaking change.”

Either SemVer means something, or it doesn’t. I choose to believe that it does. I’ve seen the benefits to huge package ecosystems and upgrade processes. If we want to live in a world where SemVer improves our collective experience as much as possible, it means we need to believe something new:

Major version numbers are not sacred.

In practice, this means two things:

When you need to release a breaking change—any breaking change—you do it in a major version. Period. No excuses.
You need to find a different way to communicate more significant project-wide changes and improvements. Major version numbers can’t do double duty as both breaking-change indicator and marketing hook.

The first one should be easy. You just have to internalize that major version numbers are not sacred, you’re not going to run out of them, and it’s your duty as a responsible release manager to always indicate that a breaking change is included within. Of course, just because numbers are free doesn’t mean it’s wise to break your API on every release. Each breaking change means manual code modifications for some users of your package. Break the API too frequently, and you’ll risk fatiguing your users by forcing them to dig through your release notes (you do have release notes, right?) and work through the breaking changes to see if any of them are relevant and need action.
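To make that first rule concrete, here is a minimal sketch of a version-bump helper. The names `Change`, `parseVersion`, and `nextVersion` are made up for illustration and do not come from any SemVer tooling. The sketch also assumes a plain MAJOR.MINOR.PATCH version at 1.0.0 or later, with no pre-release or build metadata.

```typescript
// Kinds of change a release can contain (hypothetical labels for this sketch).
type Change = "fix" | "feature" | "breaking";

interface Version {
  major: number;
  minor: number;
  patch: number;
}

// Parse a plain "MAJOR.MINOR.PATCH" string (no pre-release or build metadata).
function parseVersion(text: string): Version {
  const match = /^(\d+)\.(\d+)\.(\d+)$/.exec(text);
  if (match === null) {
    throw new Error(`not a MAJOR.MINOR.PATCH version: ${text}`);
  }
  return {
    major: Number(match[1]),
    minor: Number(match[2]),
    patch: Number(match[3]),
  };
}

// Pick the next version from the most disruptive change in the release.
// Any breaking change, however small, forces a major bump.
function nextVersion(current: string, changes: Change[]): string {
  const v = parseVersion(current);
  if (changes.some((c) => c === "breaking")) {
    return `${v.major + 1}.0.0`;
  }
  if (changes.some((c) => c === "feature")) {
    return `${v.major}.${v.minor + 1}.0`;
  }
  // Only fixes (or nothing noteworthy): bump the patch number.
  return `${v.major}.${v.minor}.${v.patch + 1}`;
}

console.log(nextVersion("1.0.3", ["fix", "breaking"])); // 2.0.0
```

The point of the sketch is that there is no branch for "a breaking change that is too small to count": the size of the break never enters the decision.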

The second one is harder. There have been various proposals over the years, a common one being that SemVer should add a 4th part, perhaps called “epoch,” that is the property of the marketing department. So you’d see version 2.3.1.0 instead of just version 3.1.0, and the 2 indicates that this is the 2nd epochal version corresponding to some marketing event or a larger shift in the library. It’s not a horrible idea, but it does increase the visual complexity of the version number quite a bit and would be challenging to get adopted throughout the various SemVer package ecosystems.
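As a sketch of how such a four-part scheme might be read, the hypothetical `parseEpochVersion` below splits an EPOCH.MAJOR.MINOR.PATCH string, and `toSemVer` drops the epoch to recover the ordinary three-part version. Neither function is part of SemVer or of any package manager; they only illustrate the proposal described above.

```typescript
interface EpochVersion {
  epoch: number; // marketing-owned part (hypothetical fourth component)
  major: number;
  minor: number;
  patch: number;
}

// Parse "EPOCH.MAJOR.MINOR.PATCH", for example "2.3.1.0".
function parseEpochVersion(text: string): EpochVersion {
  const match = /^(\d+)\.(\d+)\.(\d+)\.(\d+)$/.exec(text);
  if (match === null) {
    throw new Error(`not an EPOCH.MAJOR.MINOR.PATCH version: ${text}`);
  }
  return {
    epoch: Number(match[1]),
    major: Number(match[2]),
    minor: Number(match[3]),
    patch: Number(match[4]),
  };
}

// Drop the epoch to get the three-part version existing tooling understands.
function toSemVer(v: EpochVersion): string {
  return `${v.major}.${v.minor}.${v.patch}`;
}

const parsed = parseEpochVersion("2.3.1.0");
console.log(parsed.epoch); // 2
console.log(toSemVer(parsed)); // 3.1.0
```

The extra parsing step hints at the adoption problem: every tool that currently expects three numbers would need to learn about the fourth.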

Another idea (and one that doesn’t require SemVer itself to change) is to use a code name or marketing name to associate with a range of major versions. When the name changes, that is the marketing event. For instance, Ubuntu is famous for its wacky animal code names like Trusty Tahr and Bionic Beaver. With a bit of extra discipline, it’s not hard to imagine starting with the letter A and naming each “epoch” with a name starting with the next letter in the alphabet, giving users some sense of how many big iterations the software has been through.
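The alphabetical naming idea can be sketched in a few lines. The function name `epochOrdinal` and the example names "Aspen" and "Cedar" are invented for illustration, and the sketch assumes names start at the letter A and use only the basic Latin alphabet.

```typescript
// Map an epoch name to its position in the sequence using its first letter:
// names starting with "A" are epoch 1, "B" epoch 2, and so on.
function epochOrdinal(name: string): number {
  const first = name.trim().toUpperCase().charCodeAt(0);
  const ordinal = first - "A".charCodeAt(0) + 1;
  if (isNaN(ordinal) || ordinal < 1 || ordinal > 26) {
    throw new Error(`epoch name must start with a letter A-Z: "${name}"`);
  }
  return ordinal;
}

console.log(epochOrdinal("Aspen")); // 1
console.log(epochOrdinal("Cedar")); // 3
```

Because the ordering lives in the name rather than in the version string, the SemVer numbers stay free to do their one job.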

These points are not simply an academic exercise for me. In fact, this post is in part an explanation for why we’re about to release RedwoodJS 2.0 only a bit over a month after releasing 1.0.

Weird, right? See how ingrained it has become to think that major versions should only happen once a year?

So why do we need a 2.0 so soon?

Being a full-stack web framework, RedwoodJS has an extensive API surface area. There is simply a lot of functionality being worked on.
We need to move quickly and iterate to stay relevant. The web framework space is a constantly moving target, and success will require rapidly improving capabilities and developer experience.
Not all changes can maintain backward compatibility without incurring unacceptable tech debt or complexity increases in the codebase. There are real development, maintenance, and velocity costs associated with supporting both the old and new ways of doing something. We need to be able to make the call to move forward with new ideas without having to incur these costs or wait many months until a marketing-approved major release can be rolled. The specific breaking change that is prompting 2.0 is this set of changes to the Baremetal deploy strategy.
We are unwilling to compromise the trust of our users by finding some excuse for shipping breaking changes in a minor release. If you know we’re serious about marking breaking changes with a major release, you can spend less time verifying that our minor releases are safe upgrades.

Ok, even if all that’s true, how are we going to reduce the impact of releasing breaking changes so often (perhaps monthly)?

Every minor or major release gets a serious write-up of everything that changed and how to adapt your code to any breaking changes. Even during the pre-1.0 days of RedwoodJS, our team was obsessive about top-notch, comprehensive release notes, and we’ve gotten very good at making them as useful as possible. See the RedwoodJS v1.0.0 release notes for an example of what I mean.
On top of the release notes, whenever possible, we provide executable code mods to automatically upgrade your code to work with the new release! This is a non-trivial undertaking, but by taking the pain out of upgrades, we believe we can do more breaking changes with less impact, leading to a better framework and improved DX more quickly. The Go programming language famously provided code mods to reduce the impact of their frequent breaking changes during the initial development of the language, giving them free license to make sweeping changes without having to support every syntactical mistake they ever made.
I actually prefer more frequent major versions with smaller sets of breaking changes. There is a real danger in saving up all your significant changes and releasing them in one whopping major version once a year. In the early days of GitHub, when we would need to upgrade Ruby on Rails to the latest major version, it would often take weeks or months to accomplish, even with a dedicated developer on the task. Once, because of how extensive the changes needed to be, the upgrade branch started to deviate so much from the main branch that it became too unwieldy to merge, and we had to throw it away and start over with a different approach. Spreading upgrade work out throughout the year would have saved us a lot of time and anguish.

In an effort to drink my own philosophical champagne, I am excited to very soon deliver to you RedwoodJS 2.0, part of the Arapaho Forest epoch. Yep, you guessed it, each epochal version of Redwood will be named after a US national forest, and when we announce a new epochal version (that starts with a B), you can be sure you’ll know about it. In the meantime, you’ll see a number of major versions of Redwood (all with the Arapaho Forest epoch name) without a ton of fanfare surrounding the releases.

I want to live in a world where every breaking change comes gift-wrapped in a major release. I want the promise of SemVer to be fully realized. I think we can get there by rejecting the tyranny of sacred major version numbers. If you feel the same, I hope you’ll join me in embracing this philosophy.

Discuss this post on Hacker News

http://tom.preston-werner.com/2022/05/23/major-version-numbers-are-not-sacred

Introducing the Redwood Startup Fund

Apr 7, 2022 Updated Apr 7, 2022

7 Apr 2022 - Bay Area, CA

Today I’m pleased to announce the Redwood Startup Fund: a $1M fund that will invest $25k–$50k in startups that use RedwoodJS as a primary component in their stack.

I have several goals in mind with this fund:

Help more startups explore more territory, more quickly. This is our mission at RedwoodJS. Until now we’ve provided software, documentation, community, connections, advice, and support, but we’ve had to stop short of providing money. Now, for the most promising people and ideas, we can go all the way.
Increase the diversity of founders in the Redwood community. Over the last six months, Redwood-based startups have raised more than $19M, but the demographics of those founders skew heavily white and male. I’d like to create more opportunities for Black devs, women devs, and other minoritized devs within our broader community to travel down the startup path.
Encourage climate-focused software. Climate change is the existential crisis of our time. We need more ideas and progress in every sector, including software-based approaches to preventing emissions and sequestering carbon.

The fund will be part of my angel investing practice at Preston-Werner Ventures, where I’ve made early investments in companies like Stripe, Netlify, Snyk, Prisma, Supabase, PlanetScale, Gitpod, Railway, Snaplet, StackBlitz, FaunaDB, GraphCDN, Retool, and many other dev tools. I also invest heavily in climate technology companies like Beta (electric airplanes), Mote Hydrogen (green hydrogen & permanent carbon sequestration), and Prometheus Fuels (air to carbon-neutral fuel). Though this is a limited set of sectors, the Redwood Startup Fund is intended to invest broadly across industries and I will be engaging others in helping me evaluate potential startups.

I anticipate deploying the fund’s capital over the next 12 months. This will mean a fairly rapid pace of investment (2–4 investments per month) and I’m hoping that some of these checks will go to individuals that would otherwise not have any means to even try their hand at doing a startup. If I am first money in, even before incorporation, I am totally fine with that, and hope to help folks reach the stage where they can raise a pre-seed. Is there such a thing as pre-pre-seed? Maybe we can call it the “existential round”.

I don’t have all the mechanics figured out yet, such as how to take applications or what the terms of an existential round might be (though, as an entrepreneur myself, I intend them to be founder friendly), but I hope to finalize that within a month. If you’re interested in knowing more, sign up for the Redwood Startup Fund mailing list.

For now, dream big and try out the RedwoodJS Tutorial. I hope to talk with you about your Redwood-based startup soon!

–

Discuss this post on Hacker News

http://tom.preston-werner.com/2022/04/07/the-redwood-startup-fund

Announcing RedwoodJS 1.0 and $1M Funding

Apr 4, 2022 Updated Apr 4, 2022

4 Apr 2022 - Bay Area, CA

Three years ago, I had an idea for a new web app framework. Two years ago, we released RedwoodJS v0.1 to the world. Today, more than 300 talented individuals have lent their ideas and time in crafting our documentation, design, community, marketing, and code. During the journey, we found our mission: to help more startups explore more territory, more quickly.

Over the last six months alone, startups using RedwoodJS have raised over $19M USD in funding. They are the brave, early adopters that helped shape what Redwood has become. Now it’s time to serve everyone. As of this moment, we fully endorse Redwood for production use across a wide range of applications, from simple solo side projects to funded startups.

To celebrate this milestone, I have two big announcements for you.

#1 - Today, we release RedwoodJS 1.0.

As with all things that start with the number one, this is the first major version of many to come. There is so much more that we can do to help startups bring their multi-client apps to fruition. That is our guiding star. To reach our goals, we need to accelerate. Which brings me to the second announcement.

#2 - Today, I am committing $1,000,000 to RedwoodJS development over the next year.

In the past, we’ve talked about whether we should turn RedwoodJS into a company and raise money for it. We certainly could; other players in the space have done so. But we always ask ourselves if that’s what would best serve the Redwood community. When you take venture funding, you sign up for a rocket ship ride that will either take you to the moon or to crash-land painfully back on earth. Those are the only two choices. And both rides tend to require heavy extraction of value from the customer.

Right now, we prefer a third choice: build a sustainable open source project that evolves and grows naturally, without artificial growth hormones. For a difficult-to-directly-monetize framework project like Redwood, it will take us time to find the right model of sustainability that aligns with the community. That’s hard to do without some money, though.

Fortunately, I find myself in a position to solve this problem in an unusual way. I love open source. I love working on RedwoodJS. I love bootstrapping. So my million dollar spend on Redwood development comes with no strings attached, except that we continue to focus on building the best app framework for startups.

I hope you enjoy the Redwood 1.0 release as much as we do. We will be celebrating all week with events centered on developers, founders, partners, and more, all culminating in another big announcement on Thursday, April 7th. RSVP now to get reminders!

If you want to learn more, skim through the newly updated Redwood Website and spend some time doing the comprehensive Redwood Tutorial.

I hope to see you in the community!

Continue the conversation on the RedwoodJS Discourse Forum.

http://tom.preston-werner.com/2022/04/04/redwood-v1-and-funding

Committing $250k this Year to Racial Justice Efforts

Jun 15, 2020 Updated Jun 15, 2020

15 June 2020 - San Francisco

On February 23rd, Ahmaud Arbery was murdered by armed white men while jogging near Brunswick, Georgia. On March 13th, Breonna Taylor was murdered by Louisville Metro Police in her own home. On May 25th, George Floyd was murdered by Minneapolis Police officers. These are just three of the names on the list of thousands of Black men and women who lost their lives to the systemic racism that is woven into the fabric of this country.

It will take a monumental anti-racist movement to change the underlying conditions that allow racism in the United States to flourish. As part of that movement, Theresa and I are committing $250,000 this year to projects that fight systemic racism. Here’s how we intend the funds to be disbursed:

$50,000 toward efforts to put an end to police brutality and the criminalization system.

$50,000 to support Black political candidates and candidates who are most likely to push forward anti-racist policy change.

$50,000 for Black education and business initiatives.

$100,000 to long-term strategic efforts to create jobs and invest in Black communities.

We will issue this financial support over the next few months. When we finalize commitments to specific projects, we will announce them on the Preston-Werner Ventures Blog.

This is the first of many steps we will be taking in our formalized approach to funding racial justice efforts. We look forward to working with Black leaders and community members to dismantle the racist structures that pervade our society. If you find yourself in a position of privilege, I hope you too will join the movement with your time, money, and effort.

http://tom.preston-werner.com/2020/06/15/committing-250k-to-racial-justice-efforts

We Are Giving $1M Toward San Francisco COVID-19 Response

Mar 23, 2020 Updated Mar 23, 2020

23 Mar 2020 - San Francisco

COVID-19 represents an unprecedented threat to the health and livelihood of people everywhere, and as we wait for the US government to act on this emergency, my wife, Theresa, and I have decided to put one million dollars toward our local community’s efforts to prepare for and mitigate the effects of the disease.

We have split the funds equally among four organizations, focusing on the well-being of frontline medical workers and food security for vulnerable populations here in the city. The programs we chose are:

UCSF Health - We asked the leadership of the UCSF hospital system what they needed most to help their frontline workers and they told us their most urgent current need is in providing hotel rooms for their medical staff that are worried about going home and infecting their families. We hope this money can provide that form of safety and comfort during these difficult times. Give to UCSF.

Zuckerberg SF General Hospital - We talked with the CEO of SF General and will be supporting their efforts in acquiring personal protective equipment (PPE) for their medical staff. We’re also in touch with Ryan Peterson, CEO of Flexport, who is helping coordinate logistics around the purchase and shipping of PPE to SF hospitals in order to help make this happen as quickly as possible. Give to SF General.

YMCA - As the parents of three young children, we greatly empathize with the challenges that parents face during this crisis. This is especially true for those critical healthcare workers who we rely on to care for us. With that in mind, we have partnered with the YMCA to provide free childcare (including for children 0-5 years of age) to frontline workers here in SF, regardless of where they live. Give to YMCA.

Give2SF Fund - In addition to frontline medical workers, we wanted to support food security for some of the most vulnerable populations in SF. Mayor London Breed and the Office of the Mayor have set up the Give2SF Fund to streamline donations for several programs, one of which will provide grocery store gift cards to at-risk residents of SF, including undocumented and mixed status households, low-income households with a pregnant woman or infant, and older adults and persons who have underlying health conditions who are quarantined and/or who have lost in-home support. Give2SF.

These efforts are all local to our home town, but the coronavirus is global. Through Preston-Werner Ventures we are digging into the most effective ways to help abroad as well. Most of our philanthropic dollars go to international causes, especially around community health workers in Africa and Asia, so we hope to leverage that expertise in this crisis.

If you’ve found that the US economy has worked in your favor over the last several years or decades and are able to contribute funds of any amount to these organizations or others, here in SF or anywhere else in the world, I implore you to do so now. Be decisive. Preparation will make a difference in saving lives and protecting those that are risking theirs in our hospitals and other essential workplaces.

Together, we are strong.

http://tom.preston-werner.com/2020/03/23/giving-1-million-toward-sf-covid-19-response

Joining the Netlify Board to Help Shape the Future of the JAMstack

Mar 4, 2020 Updated Mar 4, 2020

4 Mar 2020 - Cordon, France

In late September 2015, I got an email out of the blue from Mathias Biilmann. He told me about his brand new 2-person startup called Netlify and asked if I’d fancy getting together to chat about it. You see, they were making it really easy to deploy static sites like Jekyll and thought I might be interested in learning more. It sounded cool, and it’s hard for me to turn down anything Jekyll-related, so I agreed to meet.

On October 1st we met up at a bar in the Dogpatch. I sat down with Matt and his cofounder Christian Bach for several hours over some craft beer and they showed me how they could already build and deploy Jekyll sites (and a few other static site generators) to a global CDN with the push of a button. It was like GitHub Pages, but souped up and turbocharged with global scale production deployments in mind from day one.

That existing technology (and the smooth developer experience it offered) was already very impressive, but what really got my attention was their vision around a new term they wanted to coin: the JAMstack (the JAM is for JavaScript, APIs, & Markup).

At the time, we all called Jekyll and friends “static site generators”, even though many of them were starting to do really interesting things with JavaScript and 3rd party APIs to handle the dynamic bits like comments, text search, eCommerce, and more. The problem was not that this description of the generators was inaccurate, but that it failed to capture the capabilities and imaginative possibilities of this new architecture, namely, pre-rendering content into CDN-deliverable static pages and then layering interactivity on top with JavaScript and 3rd party APIs.

In the same way that the term “AJAX” allowed us to take discussions of what was possible on the web to new conceptual heights, Chris and Matt argued that developers and businesses could communicate more effectively about the potential of “JavaScript and API enhanced static markup sites you can deploy with a git push” if we had a catchy name to capture that development methodology and deployment flow.

I thought they were right, loved the name, and advised they start pushing it. While it was impossible to predict whether it would catch on or not (only the developer community gets to decide that), I had a strong inclination that the market was ready for it, and there was a high probability that it would be embraced.

In February of 2016 I participated in Netlify’s seed round, and we kept the conversations and craft beer flowing through the next four years and two rounds of funding.

In that time, Netlify has become the darling of the JAMstack universe by focusing on developer experience and relentlessly pushing the envelope of what you can do with JAMstack, and how easily you can do it. Early on they changed the game with Deploy Previews so you can see a preview version of your entire site for every pull request you make. They integrated Let’s Encrypt so everyone can get SSL for free with zero hassle. They built Netlify Forms so you can submit simple web forms to their servers and see all the results. They launched Netlify Identity to handle authentication and user administration. And then they launched Netlify Functions—allowing you to write AWS Lambda functions in your git repo and have them deployed to AWS with that same single git push—and this time it not only revolutionized the ecosystem, it changed something inside of me.

For many years I’ve been looking for something that makes it possible to author a complex web application and git push deploy it to a scalable “delivery/compute/storage” layer, but with minimal lock-in and awesome global performance. AWS and other cloud providers have tons of building blocks for this, but I’ve never come across an end-to-end solution that felt right to me.

When I first heard about Netlify Functions, something clicked. What if you could author your web app in two parts: a JavaScript frontend client written in something like React that could be delivered statically via Netlify’s CDN, and the business logic layer that could run on AWS Lambda deployed via Netlify Functions? Both can be run on the edge and auto-scale. Choice of storage layer still depends quite a bit on the read/write characteristics of your app, but a variety of auto-scaling solutions are already on the market, with many more to come. All of this could be deployed with a simple git push to GitHub, with Netlify grabbing the repo, building it, and distributing it to CDN and AWS Lambda. HOLY SMOKES this is getting exciting!
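To make the "two parts" idea concrete, here is a minimal sketch of what the business logic half might look like as a single function that lives in the same git repo as the frontend. The event and response shapes are simplified, locally defined stand-ins for a Lambda-style interface, and `orderTotal` and `handleRequest` are hypothetical names invented for this illustration; none of it is taken from Netlify's or AWS's actual type definitions.

```typescript
// Simplified, locally defined stand-ins for a Lambda-style event and response.
// These are assumptions for illustration, not official type definitions.
interface FunctionEvent {
  httpMethod: string;
  body: string | null;
}

interface FunctionResponse {
  statusCode: number;
  headers: { [name: string]: string };
  body: string;
}

// Hypothetical business logic: compute an order total on the server so the
// statically delivered frontend does not need to own the pricing rules.
function orderTotal(items: { priceCents: number; quantity: number }[]): number {
  return items.reduce((sum, item) => sum + item.priceCents * item.quantity, 0);
}

// Request handler kept as a plain synchronous function so it is easy to test.
// A function platform's entry point would wrap a function like this one.
function handleRequest(event: FunctionEvent): FunctionResponse {
  const headers = { "Content-Type": "application/json" };
  if (event.httpMethod !== "POST") {
    return { statusCode: 405, headers, body: JSON.stringify({ error: "Method not allowed" }) };
  }
  try {
    const parsed = JSON.parse(event.body ?? "{}");
    const items = Array.isArray(parsed.items) ? parsed.items : [];
    return { statusCode: 200, headers, body: JSON.stringify({ totalCents: orderTotal(items) }) };
  } catch {
    // Reached when the body is not valid JSON (or parses to null).
    return { statusCode: 400, headers, body: JSON.stringify({ error: "Invalid JSON" }) };
  }
}
```

The React client would then call this endpoint over HTTP like any other API, while the static assets ship separately through the CDN.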

Suffice to say, I started badgering Matt and Chris with my ideas to take the JAMstack to the next conceptual level. Beyond pre-rendered pages with layered interactivity, the JAMstack could just as easily encompass this broader scope of full-stack web app development and deployment.

I’ve been so into this idea (and all the rest of the potential of the JAMstack) that I started talking with the Netlify founders about joining the board to help in a more official capacity. I’m pleased to say that as of the closing of their just announced $53 million Series C funding (in which I also participated via Preston-Werner Ventures), I have joined the Netlify board and will be doing my best to keep pushing the JAMstack envelope in ways that will surprise and delight you.

If you get as excited about this stuff as I do, come help us build Netlify into a developer happiness juggernaut. There is so much territory to explore, and we’re going to need a creative and diverse group of individuals to make it happen. Check out what we’re hiring for!

Oh, and speaking of that end-to-end web app development and deployment flow I mentioned…I never found a startup or open source project doing it the way I thought it should be done, so I’m building it. It’s a full-stack edge-ready web application framework designed to deploy to Netlify, and I’ll be announcing it on March 10th. Follow me on Twitter @mojombo to make sure you don’t miss it!

http://tom.preston-werner.com/2020/03/04/joining-the-netlify-board

Snyk - Automatically Scan and Fix Ruby and Nodejs Vulnerabilities

Nov 10, 2016 Updated Nov 10, 2016

10 Nov 2016 - San Francisco

This is a story about a company called Snyk (pronounced “sneak”), their founder Guy Podjarny, my decision to become one of their advisors, and how they are going to help save you from malevolent agents trying to steal your digital stuff.

If you’re anything like me, you’re simultaneously terrified and in awe of the increasing commonality of large corporate security breaches. Even big names like eBay, Home Depot, Anthem, JPMorgan Chase, Target, LinkedIn, Dropbox, and Yahoo are falling victim to sophisticated attacks. If you spend even a few minutes looking into it, you’ll be shocked at how frequently these breaches are happening now. The fine folks at Information is Beautiful have an excellent interactive visualization of the World’s Biggest Data Breaches over the last twelve years, in case you want to read all the gory details and never get a restful night of sleep ever again.

I’ve used a fair number of emotionally charged words above that might be triggering your FUD detectors right about now. But be advised: it’s not paranoia when they really are out to get you. If recent, extremely high profile (and subsequently weaponized) breaches like those of the Clinton Campaign and the DNC aren’t enough to make you want to air gap your entire life, then I envy your steely-eyed mettle and implore you to teach me your meditation techniques.

The fact is, security is hard. And it’s getting harder every day. To win, you have to get it right every single time. To lose (and lose big), you only have to screw it up once.

During my years at GitHub, I spent a lot of time assembling a dedicated security team, managing security audits and penetration tests, and working to establish a culture of security awareness amongst our development team. All of this is challenging and expensive, especially for a young company. Even worse, it’s the kind of investment that’s totally invisible when it’s working, making it hard to sustain until that crucial and terrible moment you end up on the front page of Hacker News as the latest victim.

A year ago I was contemplating this, especially the difficult proposition of having developers, furious at work on new features, constantly maintain awareness of security vulnerabilities they might be inadvertently weaving into the product. Web application developers are generally not security experts, and though I would love to live in a world where that wasn’t true, it’s just not a realistic expectation. Meanwhile, modern development means an increasing reliance on 3rd party code. Even a small Rails app will probably have 300 or more gem dependencies after a few months of development. It’s even more in the nodejs world. This level of modularization and code reuse, driven by the explosion of high quality open source over the last decade, is amazing and I absolutely love it, but it comes at a security expense.

Open source projects are not known for their excellent security records. Vulnerabilities like Heartbleed and Shellshock painfully demonstrate that the idea that “given enough eyeballs, all bugs are shallow” is completely false. In fact, due to a flaw in YAML, Rails had a pretty extreme remote code execution vulnerability for years. If you were running any version of Rails prior to the fix, you were vulnerable. This stuff is real, and as responsible developers, we need to be more proactive about it.

Luckily, at the time I was pondering these matters, I ran into Guy Podjarny. As a former cofounder of Blaze.io and then CTO of Web Experience at Akamai (which acquired Blaze.io), Guy intimately understands the impact of security on today’s web developers. He was working on an automated tool to scan and fix security vulnerabilities in 3rd party dependencies. I was intrigued. They already had a way to scan nodejs projects and look for known security vulnerabilities in the dependency tree and automatically upgrade or patch affected libraries. I thought this was pretty cool, but it was his vision for what automated security tooling could be that sold me on him and his company. I can’t talk much about that now, but just know that what Snyk is today is just the tip of what will become an intelligent and proactive bodyguard for your entire codebase.
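For a rough sense of what "scanning the dependency tree for known vulnerabilities" involves, here is a small sketch. It is not Snyk's implementation: the `Dependency`, `Advisory`, and `Finding` shapes, the `scan` function, and the exact-version matching are all assumptions made to keep the example short. Real tools work with version ranges and much richer advisory data.

```typescript
// Hypothetical data shapes, invented for this illustration.
interface Dependency {
  name: string;
  version: string;
  dependencies: Dependency[];
}

interface Advisory {
  name: string;
  vulnerableVersions: string[]; // exact versions only, to keep the sketch small
  fixedIn: string;
}

interface Finding {
  path: string[]; // package names from the root down to the vulnerable package
  version: string;
  fixedIn: string;
}

// Walk the resolved dependency tree depth-first and report every package
// whose name and version match a known advisory. Assumes the tree is acyclic.
function scan(root: Dependency, advisories: Advisory[]): Finding[] {
  const findings: Finding[] = [];
  const visit = (dep: Dependency, path: string[]): void => {
    const here = path.concat(dep.name);
    for (const advisory of advisories) {
      const versionMatches = advisory.vulnerableVersions.indexOf(dep.version) !== -1;
      if (advisory.name === dep.name && versionMatches) {
        findings.push({ path: here, version: dep.version, fixedIn: advisory.fixedIn });
      }
    }
    for (const child of dep.dependencies) {
      visit(child, here);
    }
  };
  visit(root, []);
  return findings;
}
```

Recording the full path matters because most vulnerable packages are pulled in transitively, so the suggested upgrade often has to be applied to a dependency several levels above the flawed one.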

A few months ago, Snyk released GitHub integration that makes it fantastically simple to hook up your repos to Snyk and adds my favorite feature: the ability to monitor your repo for future vulnerabilities and then automatically submit a pull request with the suggested package upgrade or hotfix patch (nodejs only for now).

Today, Snyk announced support for Ruby. Take a look at that blog post; it does an awesome job of explaining how simple it is to set up and what the generated pull requests look like. It’s totally free for open source projects, and extremely cheap insurance for your important projects.

Make no mistake, 3rd party code is a clear and present danger to your business. If you don’t know if you’re vulnerable, then you must assume that you are and take steps to protect yourself. Snyk makes it easy.

http://tom.preston-werner.com/2016/11/10/snyk

Replicated - An Easier Path from SaaS to Enterprise

Jun 19, 2015 Updated Jun 19, 2015

19 Jun 2015 - San Francisco

Over the last year I’ve had a chance to learn a lot more about early stage funding and made angel investments in a handful of startups. So far I’ve restricted my involvement to companies with ideas in which I have significant domain knowledge. I also insist on founders with relentless product focus, a fierce desire to help their customers be more awesome, and excellent communication skills. I recently met just such a company, with just such founders, tackling a problem that has personally caused me much pain.

The company is Replicated, and founders Grant Miller and Marc Campbell are making it easier to roll out an on-prem Enterprise offering based on an existing cloud-based SaaS product.

At GitHub, we burned through a lot of developer cycles building our own installer (several times), securing the installation environment, coding an automated licensing management system, integrating single sign-on services (LDAP, Active Directory, CAS, etc.), building out a searchable audit system, supporting customer-reviewable support bundles (logs and other diagnostic output), allowing numerous backup strategies, and countless other Enterprise-specific features whose absence was killing our Enterprise deals. All of this on top of hiring and building out the necessary sales, support, and accounting teams to create a smooth Enterprise experience for our customers.

Replicated provides common Enterprise functionality (much of what I mentioned above, and all of it eventually) that you can wrap around your SaaS product, resulting in a first-class on-prem product in a fraction of the time. Beyond just technology, Replicated will help you understand your Enterprise customers through documentation on best practices and insight into the requirements and reasons that large companies desire the features they do. Until you can empathise with your customer (which is very hard to do as a fast-moving SaaS startup), you’ll never build the best product possible.

Getting into the Enterprise market will always be hard. But by reducing the technology burden, Replicated plans to erase much of the pain so you can focus on the other human-centric tasks. Not only am I an investor in Replicated, I believe in their mission and their founders so much that I’ve joined as an advisor. I understand what the uphill slog of the SaaS to Enterprise climb feels like, and I’m going to do my best to ensure you don’t have to suffer it as much as I did.

I’m also pleased to announce that Travis CI is now shipping their Enterprise product using Replicated. To see what the installation process is like, watch Grant install Travis CI Enterprise on a fresh server in about seven minutes. For a deeper dive, Travis CI has also published a blog post covering some of their process in getting their Enterprise installer ready using Replicated.

In the coming weeks, you’ll start to see other well-known startups launching (or re-launching) Enterprise versions of their SaaS software on top of Replicated. If you’re looking to do the same, and want to save yourself a lot of heartache, email contact@replicated.com, and start focusing on what matters the most: your unique and kickass product.

http://tom.preston-werner.com/2015/06/19/replicated

Farewell GitHub, Hello Immersive Computing

Apr 21, 2014 Updated Apr 21, 2014

21 Apr 2014 - New York City

Today is my last day at GitHub. Recent events have given me a lot of time to reflect on what’s important to me, and I’ve decided to switch gears and focus on building something from scratch again. Since visiting the Oculus VR team at their office three months ago, I’ve come to believe that immersive computing (aka virtual reality) is poised to rival the personal computer, the web, social networking, and mobile devices in its impact. While the timing is more abrupt than I had intended, with everything that’s happened, I think now is the right time to do this, and I’d like to explain why.

First, I want to address the serious accusations that were made against me and my family over the past month. With every decision I made at GitHub and in every interaction I had with employees, I tried to treat people better than they expected and to resolve conflict with empathy. Despite that, I’ve made mistakes, and I am deeply sorry to anyone who was hurt by those mistakes. It devastates me to know that I missed the mark, and I will strive to do better, every day.

That said, I want to be very clear about one thing: neither my wife, Theresa, nor I have ever engaged in gender-based harassment or discrimination. The results of GitHub’s independent investigation unequivocally confirm this and we are prepared to fight any further false claims on this matter to the full extent of the law. I believe in diversity and equality for all people in all professions, especially the tech sector. It’s immensely important to me and I will continue to do my very best to further that belief.

Unfortunately, the investigation and all the attention surrounding it have me concerned that remaining at GitHub would be a distraction for both me and the company. I’m incredibly proud of what I’ve helped build at GitHub and I don’t want the events of the past month to jeopardize that. I care too much about the company and the people here to let that happen. The GitHub team is incredibly strong, with fierce vision, and I have no doubt they will continue to revolutionize software development for decades to come. Founding and building GitHub has been the greatest adventure of my life. I’ve been so lucky to be on this journey with such amazing, helpful, talented, and real people. I’m going to miss working with such a great team, but I’m also insanely excited about the future.

Since the early days of GitHub, I’ve wanted to create a different kind of business. One that was Optimized for Happiness and built atop a Framework of Happiness. One where great people could work on hard problems together to create unbelievably good products. I believe I was able to achieve a great deal of success with that model at GitHub, even if things didn’t always go perfectly according to plan. All of this has been a tremendous learning experience for me.

Last January I stepped down as CEO and handed that role over to cofounder Chris Wanstrath so I could focus on future-facing R&D projects with small teams. This kind of rapid, team-based innovation is what I live for. During my time away from GitHub I started experimenting with Go, OpenGL, and Unity with an eye towards the software side of immersive computing. It felt really good to get back into a code editor and challenge the deeply logical and analytical part of my brain. I’ve enjoyed the challenges of learning how to lead a company with hundreds of people, but it’s very hard for me to deny the allure of coding a system that could once again change the course of history.

I’m telling you this because I think stealth mode is bullshit and if you feel the same way I do about immersive computing then I want to talk with you about it. For the next few months I’m going to be living in Manhattan. My wife, Theresa, is currently participating in Techstars NYC as their very first nonprofit. Her startup, The Omakase Charity, helps donors learn about and support nonprofits that are changing the world with technology. She’s one of the strongest and most thoughtful women I know, and I’m hoping to help her succeed with her mission while I’m here.

Thank you to everyone that reached out to me over the last month, including the generous team at Andreessen Horowitz. Your support has made a huge difference and I’m truly excited for what’s next.

http://tom.preston-werner.com/2014/04/21/farewell-github-hello-immersive-computing

Open Source (Almost) Everything

Nov 22, 2011 Updated Nov 22, 2011

22 Nov 2011 - San Francisco

When Chris and I first started working on GitHub in late 2007, we split the work into two parts. Chris worked on the Rails app and I worked on Grit, the first ever Git bindings for Ruby. After six months of development, Grit had become complete enough to power GitHub during our public launch of the site and we were faced with an interesting question:

Should we open source Grit or keep it proprietary?

Keeping it private would provide a higher hurdle for competing Ruby-based Git hosting sites, giving us an advantage. Open sourcing it would mean thousands of people worldwide could use it to build interesting Git tools, creating an even more vibrant Git ecosystem.

After a small amount of debate we decided to open source Grit. I don’t recall the specifics of the conversation but that decision nearly four years ago has led to what I think is one of our most important core values: open source (almost) everything.

Why is it awesome to open source (almost) everything?

If you do it right, open sourcing code is great advertising for you and your company. At GitHub we like to talk publicly about libraries and systems we’ve written that are still closed but destined to become open source. This technique has several advantages. It helps determine what to open source and how much care we should put into a launch. We recently open sourced Hubot, our chat bot, to widespread delight. Within two days it had 500 watchers on GitHub and 409 upvotes on Hacker News. This translates into goodwill for GitHub and more superfans than ever before.

If your code is popular enough to attract outside contributions, you will have created a force multiplier that helps you get more work done faster and cheaper. More users means more use cases being explored which means more robust code. Our very own resque has been improved by 115 different individuals outside the company, with hundreds more providing 3rd-party plugins that extend resque’s functionality. Every bug fix and feature that you merge is time saved and customer frustration avoided.

Smart people like to hang out with other smart people. Smart developers like to hang out with smart code. When you open source useful code, you attract talent. Every time a talented developer cracks open the code to one of your projects, you win. I’ve had many great conversations at tech conferences about my open source code. Some of these encounters have led to ideas that directly resulted in better solutions to problems I was having with my projects. In an industry with such a huge range of creativity and productivity between developers, the right eyeballs on your code can make a big difference.

If you’re hiring, the best technical interview possible is the one you don’t have to do because the candidate is already kicking ass on one of your open source projects. Once technical excellence has been established in this way, all that remains is to verify cultural fit and convince that person to come work for you. If they’re passionate about the open source code they’ve been writing, and you’re the kind of company that cares about well-crafted code (which clearly you are), that should be simple! We hired Vicent Martí after we saw him doing stellar work on libgit2, a project we’re spearheading at GitHub to extract core Git functionality into a standalone C library. No technical interview was necessary; Vicent had already proven his skills via open source.

Once you’ve hired all those great people through their contributions, dedication to open source code is an amazingly effective way to retain that talent. Let’s face it, great developers can take their pick of jobs right now. These same developers know the value of coding in the open and will want to build up a portfolio of projects they can show off to their friends and potential future employers. That’s right, a paradox! In order to keep a killer developer happy, you have to help them become more attractive to other employers. But that’s ok, because that’s exactly the kind of developer you want to have working for you. So relax and let them work on open source or they’ll go somewhere else where they can.

When I start a new project, I assume it will eventually be open sourced (even if it’s unlikely). This mindset leads to effortless modularization. If you think about how other people outside your company might use your code, you become much less likely to bake in proprietary configuration details or tightly coupled interfaces. This, in turn, leads to cleaner, more maintainable code. Even internal code should pretend to be open source code.

Have you ever written an amazing library or tool at one job and then left to join another company only to rewrite that code or remain miserable in its absence? I have, and it sucks. By getting code out in the public we can drastically reduce duplication of effort. Less duplication means more work towards things that matter.

Lastly, it’s the right thing to do. It’s almost impossible to do anything these days without directly or indirectly executing huge amounts of open source code. If you use the internet, you’re using open source. That code represents millions of man-hours of time that has been spent and then given away so that everyone may benefit. We all enjoy the benefits of open source software, and I believe we are all morally obligated to give back to that community. If software is an ocean, then open source is the rising tide that raises all ships.

Ok, then what shouldn’t I open source?

That’s easy. Don’t open source anything that represents core business value.

Here are some examples of what we don’t open source and why:

Core GitHub Rails app (easier to sell when closed)
The Jobs Sinatra app (specially crafted integration with github.com)

Here are some examples of things we do open source and why:

Grit (general purpose Git bindings, useful for building many tools)
Ernie (general purpose BERT-RPC server)
Resque (general purpose job processing)
Jekyll (general purpose static site generator)
Gollum (general purpose wiki app)
Hubot (general purpose chat bot)
Charlock_Holmes (general purpose character encoding detection)
Albino (general purpose syntax highlighting)
Linguist (general purpose filetype detection)

Notice that everything we keep closed has specific business value that could be compromised by giving it away to our competitors. Everything we open is a general purpose tool that can be used by all kinds of people and companies to build all kinds of things.

What is the One True License?

I prefer the MIT license and almost everything we open source at GitHub carries this license. I love this license for several reasons:

It’s short. Anyone can read this license and understand exactly what it means without wasting a bunch of money consulting high-octane lawyers.
Enough protection is offered to be relatively sure you won’t sue me if something goes wrong when you use my code.
Everyone understands the legal implications of the MIT license. Weird licenses like the WTFPL and the Beer license pretend to be the “ultimate in free licenses” but utterly fail at this goal. These fringe licenses are too vague and unenforceable to be acceptable for use in some companies. On the other side, the GPL is too restrictive and dogmatic to be usable in many cases. I want everyone to benefit from my code. Everyone. That’s what Open should mean, and that’s what Free should mean.

Rad, how do I get started?

Easy, just flip that switch on your GitHub repository from private to public and tell the world about your software via your blog, Twitter, Hacker News, and over beers at your local pub. Then sit back, relax, and enjoy being part of something big.

http://tom.preston-werner.com/2011/11/22/open-source-everything

Rejected Bio from The Setup

May 3, 2011 Updated May 3, 2011

03 May 2011 - San Francisco

Yesterday, the autobiographical post I wrote for The Setup went live. I wrote that post over a year ago and then entered into an epic battle with @waferbaby about the length of my “Who are you, and what do you do?” section. He said it was too long. I said it could not be shortened. And so the post sat for a year, collecting dust, neither of us prepared to back down.

About a month ago I decided that it was foolish to let the words I had written rot on my hard drive and so I did the only thing I knew how to do: overreact. So I cut the original nine-hundred words of my bio down to fourteen words and resubmitted it to Daniel. Those are the words you see in the post now.

For your pleasure, here is the original bio in its full, unabridged glory.

My name is Tom Preston-Werner. I find that the hyphenated last name makes me sound distinguished and worth listening to. I grew up three decades ago in a small city in Iowa along the Mississippi, which means I shucked a lot of corn and know exactly how many mosquitos will land on your arm should you hold it still for ten minutes at dusk on the muggiest day of the summer. As an aspiring theoretical particle physicist, I worked my way through entire shelves of scientific literature from the public library, desperately wanting to understand the bewildering mathematics that littered the pages like so many leaves on the bottom of that morning’s cup of green tea. I searched in vain for instructors or classmates that could provide me with the insight necessary to comprehend the true meaning of Heisenberg’s Uncertainty Principle, but all I found were underpaid math teachers and disillusioned “students” in search of their next smoke break. After obsessing over US News’ Best Colleges reports for months I finally chose and was accepted to Harvey Mudd, a tiny engineering school in California famous for assigning the greatest number of hours of homework per night. This sounded just perfect to me. Finally a place I could bring up the EPR Paradox and not be immediately stigmatized as “that science weirdo with the hilariously thick glasses and unfortunate hairdo.”

Mudd did not disappoint. But now I had the opposite problem. In order to properly understand particle physics, you must have a deep and profound love of math. You have to be so comfortable with abstract concepts that even Picasso would be jealous. Ironically, in order to grasp the fundamental reality of our universe, you must forget about the “reality” of everyday life and start living in a world comprised of eigenvectors, Hilbert spaces, and Planck’s constant. This was a leap I could not make. I like math, but I’m too easily distracted by macroscopic reality to make it my profession.

Once I accepted that I would never spend late nights poring over bubble chamber printouts at Fermilab, it became obvious that I was destined to enter computer science. I started programming in BASIC on a TRS-80 that my dad bought from Radio Shack when I was 8 years old. Since then, I’d learned to love the discipline and creativity involved in making a machine do my bidding. It was like having a super-obedient but annoyingly logical little brother. He’ll do anything you want as long as you tell him in precise and unambiguous language. The best thing is, the feedback is immediate. In physics, it can take twenty years to prove that a single esoteric particle even exists. When you’re writing a program that displays the number of electrons in each of the shells around the nucleus of every element, the feedback is immediate and intoxicating. With just a few keystrokes, the world is changed forever. Try to get that kind of rush even once in a lifetime as a theoretical particle physicist. I dare you.

In 1999, after two years of college, I dropped out of Harvey Mudd to join a startup with some friends that were graduating. It was the end of the first dot-com bubble and I thought I could strike it rich, right then and there. Sadly, like so many startups of the day, we never accomplished what we envisioned and I ended up bouncing between jobs and consulting gigs for six years until I found myself in San Francisco. If Harvey Mudd was my mecca for physicists, then San Francisco was my mecca for programmers. Where else can you be grabbing lunch at a taqueria and overhear a group at the next table discussing the finer points of optimizing C code to run on an embedded processor?

I moved to San Francisco to take a job as a Ruby developer with a Wikipedia search engine called Powerset. I also began attending Ruby meetups and drinking with local software developers. There are a lot of talented people in the Bay Area and I wanted to meet them all. Within the Ruby community, a distributed version control system called Git was starting to get some attention. It was a really cool way of working with other people on code, but there wasn’t an easy way to get up and running with a group of developers. Along with cofounders Chris Wanstrath and PJ Hyett (who I met at the Ruby meetups) I started a company called GitHub that would address this problem and make it dead simple to share Git repositories and collaborate on code with other developers.

At first, we worked on GitHub on the side, putting in time on evenings and weekends. After six months we launched the site to the public and started charging. Not long after that, Powerset was acquired by Microsoft and I was faced with a choice: stay on as a Microsoft employee with a big retention bonus and give up GitHub or turn down the Microsoft money and quit Powerset to work on GitHub full-time. You can read more about this saga in my blog post entitled How I Turned Down $300,000 from Microsoft to go Full-Time on GitHub. I think I made the right decision.

Today GitHub has twenty-nine employees and more than 730,000 users with over 2,000,000 repositories. We’re growing fast, and I’m having the time of my life!

http://tom.preston-werner.com/2011/05/03/rejected-bio-from-the-setup

Ten Lessons from GitHub's First Year

Mar 29, 2011 Updated Mar 29, 2011

29 Mar 2011 / 29 Dec 2008 - San Francisco

This post was written in late December of 2008, more than two years ago. It has stayed in my drafts folder since then, waiting for the last 2% to be written. Why I never published it is beyond my reckoning, but it serves as a great reminder of how I perceived the world back then. In the time since I wrote this we've grown from four people to twenty-six, settled into an office, installed a kegerator, and still never taken outside funding. In some ways, things have changed a great deal, but in the most important ways, things are still exactly the same. Realizing this puts a big smile on my face.

The end of the year is a great time to sit down with a glass of your favorite beverage, dim the lights, snuggle up next to the fire and think about what you’ve learned over the past twelve months.

For me, 2008 was the year that I helped design, develop, and launch GitHub. Creating a new startup is an intense learning experience. Through screwups and triumphs, I have learned some valuable lessons this year. Here are a few of them.

Start Early

When Chris and I started working on GitHub in late 2007, Git was largely unknown as a version control system. Sure, Linux kernel hackers had been using it since day one, but outside of that small microcosm, it was rare to come across a developer that was using it on a day-to-day basis. I was first introduced to Git by Dave Fayram, a good friend and former coworker during my days at Powerset. Dave is the quintessential early adopter and, as far as I can calculate, patient zero for the spread of Git adoption in the Ruby community and beyond.

Back then, the Git landscape was pretty barren. Git had only recently become usable by normal people with the 1.5 release. As for Git hosting, there was really only repo.or.cz, which felt to me very limited, clumsy, and poorly designed. There were no commercial Git hosting options whatsoever. Despite this, people were starting to talk about Git at the Ruby meetups. About how awesome it was. But something was amiss. Git was supposed to be this amazing way to work on code in a distributed way, but what was the mechanism to securely share private code? Your only option was to set up user accounts on Unix machines and use that as an ad-hoc solution. Not ideal.

And so GitHub was born. But it was born into a world where there was no existing market for paid Git hosting. We would be creating the market. I vividly remember telling people, “I don’t expect GitHub to succeed right away. Git adoption will take a while, but we’ll be ready when it happens.” Neither Chris nor I were in any particular hurry for this to happen. I was working full time at Powerset, and he was making good money as a Rails consultant. By choosing to build early on top of a nascent technology, we were able to construct a startup with basically no overhead, no competition, and in our free time.

Adapt to Your Customers

Here’s a seemingly paradoxical piece of advice for you: Listen to your customers, but don’t let them tell you what to do. Let me explain. Consider a feature request such as “GitHub should let me FTP up a documentation site for my project.” What this customer is really trying to say is “I want a simple way to publish content related to my project,” but they’re used to what’s already out there, and so they pose the request in terms that are familiar to them. We could have implemented some horrible FTP based solution as requested, but we looked deeper into the underlying question and now we allow you to publish content by simply pushing a Git repository to your account. This meets requirements of both functionality and elegance.

Another company that understands this concept at a fundamental level is Apple. I’m sure plenty of people asked Apple to make a phone but Steve Jobs and his posse looked beneath the request and figured out what people really wanted: a nice looking, simple to use, and easy to sync mobile device that kicked some serious ass. And that’s the secret. Don’t give your customers what they ask for; give them what they want.

Have Fun

I went to college at a little school in California called Harvey Mudd. Yeah, I know you haven’t heard of it, but if you remember those US News & World Report “Best Colleges” books that you obsessed over in high school (ok, maybe you didn’t, but I did), Harvey Mudd was generally ranked as the engineering school with the greatest number of hours of homework per night. Yes, more than MIT, and yes, more than Caltech. It turned out to be true, as far as I can tell. I have fond memories of freaking out about ridiculously complex spring/mass/damper systems and figuring the magnetic flux of a wire wrapped around a cylinder in a double helix. We studied hard–very hard. But we played hard too. It was the only thing that could possibly keep us sane.

Working on a startup is like that. It feels a bit like college. You’re working on insanely hard projects, but you’re doing it with your best friends in the world and you’re having a great time (usually). In both environments, you have to goof off a lot in order to balance things out. Burnout is a real and dangerous phenomenon. Fostering a playful and creative environment is critical to maintaining both your personal health, and the health (and idea output) of the company.

Pay attention to Twitter

I’ve found Twitter to be an extremely valuable resource for instant feedback. If the site is slow for some reason, Twitter will tell me so. If the site is unreachable for people in a certain country (I’m looking at you China), I’ll find out via Twitter. If that new feature we just released is really awesome, I’ll get a nice ego boost by scanning the Twitter search.

People have a tendency to turn to Twitter to bitch about all the little bugs they see on your website, usually appended with the very tiresome “FAIL”. These are irksome to read, but added together are worth noticing. Oftentimes these innocent tweets will inform a decision about whether an esoteric bug is worth adding to the short list. We also created a GitHub account on Twitter that our support guy uses to respond to negative tweets. Offering this level of customer service almost always turns a disgruntled customer into a happy one.

If you have an iPhone, I heartily recommend the Summizer app from Fanzter, Inc. It makes searching, viewing, and responding to tweets a cinch.

Deploy at Will!

At the first RailsConf I had the pleasure of hearing Martin Fowler deliver an amazing keynote. He made some apt metaphors regarding agile development that I will now paraphrase and mangle.

Imagine you’re tasked with building a computer controlled gun that can accurately hit a target about 50 meters distant. That is the only requirement. One way to do this is to build a complex machine that measures every possible variable (wind, elevation, temperature, etc.) before the shot and then takes aim and shoots. Another approach is to build a simple machine that fires rapidly and can detect where each shot hits. It then uses this information to adjust the aim of the next shot, quickly homing in on the target a little at a time.

The difference between these two approaches comes down to realizing that bullets are cheap. By the time the former group has perfected their wind detection instrument, you’ll have finished your simple weapon and already hit the target.

In the world of web development, the target is your ideal offering, the bullets are your site deploys, and your customers provide the feedback mechanism. The first year of a web offering is a magical one. Your customers are most likely early adopters and love to see new features roll out every few weeks. If this results in a little bit of downtime, they’ll easily forgive you, as long as those features are sweet. In the early days of GitHub, we’d deploy up to ten times in one afternoon, always inching closer to that target.

Make good use of that first year, because once the big important customers start rolling in, you have to be a lot more careful about hitting one of them with a stray bullet. Later in the game, downtime and botched deploys are money lost and you have to rely more on building instruments to predict where you should aim.

You Don’t Need an Office

All four full-time GitHub employees work in the San Francisco area, and yet we have no office. But we’re not totally virtual either. In fact, a couple times a week you’ll find us at a cafe in North Beach, huddled around a square table that was made by nailing 2x4s to an ancient fold-out bulletin board. It’s no Google campus, but the rent is a hell of a lot cheaper and the drinks are just as good!

This is not to say that we haven’t looked at a few places to call home. Hell, we almost leased an old bar. But in the end there’s no hurry to settle down. We’re going to wait until we find the perfect office. Until then, we can invest the savings back into the company, or into our pockets. For now, I like my couch and the cafe just fine.

Of course, none of this would be possible without 37signals’ Campfire web-based chat and the very-difficult-to-find-but-totally-amazing Propane OSX desktop app container that doubles the awesome. Both highly recommended.

Hire Through Open Source

Beyond the three cofounders of GitHub, we’ve hired one full time developer (Scott Chacon) and one part time support specialist (Tekkub).

We hired Tekkub because he was one of the earliest GitHub users, actively maintains more than 75 projects on GitHub (mostly WoW addons), and was very active in sending us feedback in the early days. He would even help people out in the IRC channel, simply because he enjoyed doing so.

I met Scott at one of the San Francisco Ruby meetups where he was presenting on one of his myriad Git-centric projects. Scott had been working with Git long before anyone else in the room. He was also working on a pure Ruby implementation of Git at the same time I was working on my fork/exec based Git bindings. It was clear to me then that depending on how things went down, he could become either a powerful ally or a dangerous foe. Luckily, we all went drinking afterwards and we became friends. Not long after, Scott started consulting for us and wrote the entire backend for what you now know as Gist. We knew then that we would do whatever it took to hire Scott full time. There would be no need for an interview or references. We already knew everything we needed to know in order to make him an offer without the slightest reservation.

The lesson here is that it’s far easier and less risky to hire based on relevant past performance than it is to hire based on projected future performance. There’s a corollary that also comes into play: if you’re looking to work for a startup (or anyone for that matter), contribute to the community that surrounds it. Use your time and your code to prove that you’re the best one for the job.

Trust your Team

There’s nothing I hate more than micromanagers. When I was doing graphic design consulting 5 years ago I had a client that was very near the Platonic form of a micromanager. He insisted that I travel to his office where I would sit in the back room at an old Mac and design labels and catalogs and touch up photographs of swimwear models (that part was not so bad!). While I did these tasks he would hover over me and bark instructions. “Too red! Can you make that text smaller? Get rid of those blemishes right there!” It drove me absolutely batty.

This client could have just as easily given me the task at the beginning of the day, gone and run the business, and come back in 6 hours whereupon I would have created better designs twice as fast as if he were treating me like a robot that converted his speech into Photoshop manipulations. By treating me this way, he was marginalizing my design skills and wasting both money and talent.

Micromanagement is symptomatic of a lack of trust. The remedy for this ailment is to hire experts and then trust their judgment. In a startup, you can drastically reduce momentum by applying micromanagement, or you can boost momentum by giving trust. It’s pretty amazing what can happen when a group of talented people who trust each other get together and decide to make something awesome.

You Don’t Need Venture Capital

A lot has been written recently about how the venture capital world is changing. I don’t pretend to be an expert on the subject, but I’ve learned enough to say that a web startup like ours doesn’t need any outside money to succeed. I know this because we haven’t taken a single dime from investors. We bootstrapped the company on a few thousand dollars and became profitable the day we opened to the public and started charging for subscriptions.

In the end, every startup is different, and the only person that can decide if outside money makes sense is you. There are a million things that could drive you to seek and accept investment, but you should make sure that doing so is in your best interest, because it’s quite possible that you don’t need to do so. One of the reasons I left my last job was so that I could say “the buck stops here.” If we’d taken money, I would no longer be able to say that.

Open Source Whatever You Can

In order for GitHub to talk to Git repositories, I created the first ever Ruby Git bindings. Eventually, this library became quite complete and we were faced with a choice: Do we open source it or keep it to ourselves? Both approaches have benefits and drawbacks. Keeping it private means that the hurdle for competing Ruby-based Git hosting sites would be higher, giving us an advantage. But open sourcing it would mean that

NOTE: This is where the post ended and remained frozen in carbonite until today. I intend to write a follow up post on our open source philosophy at GitHub in the near future. I’m sure the suspense is killing you!

Discuss this post on Hacker News

http://tom.preston-werner.com/2011/03/29/ten-lessons-from-githubs-first-year

Designer, Architect, Developer

Dec 11, 2010 Updated Dec 11, 2010

11 Dec 2010 - San Francisco

Over the last six years I’ve bootstrapped three successful enterprises (Cube6 Media, Gravatar, and GitHub) and failed to gain traction with a handful of others. After a lot of thought and reflection on these experiences, I’ve identified three major roles that should be present in order to best build a successful web application. These roles can be loosely defined as the Designer, the Architect, and the Developer.

In college I spent a lot of time in the campus dark room dipping rolls of film and sheets of paper into various chemical baths beneath a dim red light. The most interesting part, though, was mounting the negative into the projector and exposing the photo paper. Every time I turned on the bright light of the projector I was reminded of a saying that has stuck with me ever since: “A photograph is nothing more than an image created by light.” Think about that for a second. The only way the photograph, and hence, the viewer, interact with the original subject is via the light that was captured. None of the fancy flashes, soft boxes, bounces, umbrellas, or backdrops mean a thing if the light they produce or redirect is in the wrong place. If the light is bad, the photograph is bad.

I think the same concept holds true for web applications. Adapting the saying for our own situation, I would say: “A web application is nothing more than an experience created by design.” Users can’t see what technology you use or whether you follow an agile development process or not. All they experience is what’s on the screen. It can’t be confusing, it can’t look amateur, and it can’t have spelling errors. If the UX is bad, the web application is bad. It’s that simple.

The way you get good UX is by having a good designer. Someone on the team must be skilled not only in making things pretty, but in making them usable as well. Without a good UX/visual design, you may as well not even bother. It’s impossible to overstate how important this is.

Design comes first. It defines what you will build. Once you have an idea of what you’re creating, you need to figure out how to make it happen. That’s where the Architect comes in.

With the recent explosion of open source solutions to common problems like databases, web frameworks, job processors, messaging systems, etc., you need a team member that has a broad understanding of the technology landscape. The choices you make early on will impact your company for many years, and the wrong choices can spell disaster. The role of the Architect is to choose the best tools for the job, and to decide when new tools need to be created.

The Architect must also be ready to scale any piece of the site when you start attracting users. There’s a fine line between premature optimization and crumbling under the wave of thousands of new signups. A good architect will always be one step ahead of the curve, laying the groundwork for future scaling needs just before they are needed.

Design and architecture dictate what you build and how you build it, but without someone to do the construction, you’re dead in the water. The role of the Developer is to turn the wishes of the Designer into reality while staying within the constraints that the Architect has put forth. In addition, the Developer has to ensure that the codebase remains healthy and protect against technical debt. Sloppy development up front means a huge amount of wasted effort later on.

The three roles of the Designer, the Architect, and the Developer may reside in a single person, but it’s much more common to see groups of two or three people satisfy all these skills. In fact, the best founding teams are those where everyone fills some combination of roles. This fosters an environment of friendly argument that leads to better decisions.

But whatever you do, make sure your team fills all of these roles. Once you do, executing on your idea should come easily!

http://tom.preston-werner.com/2010/12/11/designer-architect-developer

Optimize for Happiness

Oct 18, 2010 Updated Oct 18, 2010

18 October 2010 - San Francisco

Two days ago I had the pleasure of speaking at Startup School, a yearly conference on entrepreneurism put on by the great folks at Y Combinator. Never before have I seen such a high concentration of smart, ambitious people in one place.

You can watch the recording of my thirty-minute slot on YouTube.

Since I only had about 25 minutes for the talk and 5 minutes for questions, I wanted to expand upon and clarify some of the ideas I introduced during the talk and then make myself available for additional questions. So today (Monday, 18 October 2010) I’ll be answering any questions you have via Hacker News:

Ask me a question on HN!

The very first commit to GitHub was made exactly three years ago tomorrow. In that time our team of thirteen has signed up over 420,000 developers and now hosts 1.3 million Git repositories, making us the largest code host on the planet. And we’ve done all of this without ever taking a dime of funding from outside the company. In fact, even within the company we only invested a few thousand dollars out of our own pockets during the first months to cover legal fees.

During the presentation I talk about a choice between optimizing for happiness and optimizing for money. When I say “optimizing for money” I mean following the traditional venture capital route of raising a ton of money to stash in your bank account and going for a huge exit. The unfortunate reality of this approach is that for aspiring entrepreneurs that are not well connected to the VC world, it can take an extraordinary amount of time and effort to raise that money. Even if you are able to raise capital, you are suddenly responsible to your investors and will need to align your interests with theirs.

In a world dominated by news about Facebook, Apple, Google, YouTube, Zappos, and other companies heavily funded by venture capital, it’s easy to forget that you can still build a highly profitable business with significant impact on a global market without having to first spend three months on Sand Hill Road asking for permission to build your product.

The infrastructure components necessary to run an internet business are finally cheap enough that you can get started without a huge up-front investment. In the months that you would traditionally spend in glass-walled conference rooms you can now build a sophisticated prototype of your product and start getting users signed up and engaging you with useful feedback.

This is what I mean by optimizing for happiness: I’m a hacker; I’m happy when I’m building things of value, not when I’m writing a business plan filled with make believe numbers.

When Chris and I started GitHub, I was working full time at Powerset and Chris was doing consulting work and plugging away on a product of his own. GitHub became the leisure activity that I worked on when I got home from the office. I could craft it however I pleased, and there was nobody telling me what to do. This feeling of control over something you own is intoxicating.

Within three months we had a simple product and moved into private beta. In six months we launched to the public and started charging for private plans. We’ve been profitable every month since public launch except for one (in which we hired two new employees at once). We did this by making a paycheck via other means until GitHub was generating enough revenue to support us full time at about 2/3 of what we were accustomed to making. We then raised our salaries over the next months when we hit specific revenue goals that allowed us to remain profitable. This happened about one year after inception.

A side effect of bootstrapping a sustainable company is what I like to call infinite runway. This is another element of optimizing for happiness. With venture backed endeavors you generally find that during the first several years the numbers in your bank account are perpetually decreasing, giving your company an expiration date. Your VCs have encouraged you to grow fast and spend hard, which makes perfect sense for them, but not necessarily for you. Not if you’re trying to optimize for happiness.

VCs want to see quick success or quick failure. They are optimizing for money. There’s nothing wrong with that as long as you want the same things they do. But if you’re like me, then you care more about building a kickass product than you do about having a ten figure exit. If that’s true, then maybe you should be optimizing for happiness. One way to do this is by bootstrapping a sustainable business with infinite runway. When there are fewer potentially catastrophic events on the horizon, you’ll find yourself smiling a lot more often.

The ironic thing about bootstrapping and venture capital is that once you demonstrate some success, investors will come to YOU. When this happens you will be in a much better place to make a more reasoned choice about taking on additional capital and all the complexities that come with it. Talking to VCs with some leverage in your back pocket is an entirely different game from throwing yourself in front of a conference table full of general partners and trying to persuade them that you’re worth their time and money. Power is happiness.

There are other really great things you can do when you optimize for happiness. You can throw away things like financial projections, hard deadlines, ineffective executives that make investors feel safe, and everything that hinders your employees from building amazing products.

At GitHub we don’t have meetings. We don’t have set work hours or even work days. We don’t keep track of vacation or sick days. We don’t have managers or an org chart. We don’t have a dress code. We don’t have expense account audits or an HR department.

We pay our employees well and give them the tools they need to do their jobs as efficiently as possible. We let them decide what they want to work on and what features are best for the customers. We pay for them to attend any conference at which they’ve gotten a speaking slot. If it’s in a foreign country, we pay for another employee to accompany them because traveling alone sucks. We show them the profit and loss statements every month. We expect them to be responsible.

We make decisions based on the merits of the arguments, not on who is making them. We strive every day to be better than we were the day before.

We hold our board meetings in bars.

We do all this because we’re optimizing for happiness, and because there’s nobody to tell us that we can’t.

Ask me a question on HN!

http://tom.preston-werner.com/2010/10/18/optimize-for-happiness

Readme Driven Development

Aug 23, 2010 Updated Aug 23, 2010

23 August 2010 - San Francisco

I hear a lot of talk these days about TDD and BDD and Extreme Programming and SCRUM and stand up meetings and all kinds of methodologies and techniques for developing better software, but it’s all irrelevant unless the software we’re building meets the needs of those that are using it. Let me put that another way. A perfect implementation of the wrong specification is worthless. By the same principle a beautifully crafted library with no documentation is also damn near worthless. If your software solves the wrong problem or nobody can figure out how to use it, there’s something very bad going on.

Fine. So how do we solve this problem? It’s easier than you think, and it’s important enough to warrant its very own paragraph.

Write your Readme first.

First. As in, before you write any code or tests or behaviors or stories or ANYTHING. I know, I know, we’re programmers, dammit, not tech writers! But that’s where you’re wrong. Writing a Readme is absolutely essential to writing good software. Until you’ve written about your software, you have no idea what you’ll be coding. Between The Great Backlash Against Waterfall Design and The Supreme Acceptance of Agile Development, something was lost. Don’t get me wrong, waterfall design takes things way too far. Huge systems specified in minute detail end up being the WRONG systems specified in minute detail. We were right to strike it down. But what took its place is too far in the other direction. Now we have projects with short, badly written, or entirely missing documentation. Some projects don’t even have a Readme!

This is not acceptable. There must be some middle ground between reams of technical specifications and no specifications at all. And in fact there is. That middle ground is the humble Readme.

It’s important to distinguish Readme Driven Development from Documentation Driven Development. RDD could be considered a subset or limited version of DDD. By restricting your design documentation to a single file that is intended to be read as an introduction to your software, RDD keeps you safe from DDD-turned-waterfall syndrome by punishing you for lengthy or overprecise specification. At the same time, it rewards you for keeping libraries small and modularized. These simple reinforcements go a long way towards driving your project in the right direction without a lot of process to ensure you do the right thing.

By writing your Readme first you give yourself some pretty significant advantages:

Most importantly, you’re giving yourself a chance to think through the project without the overhead of having to change code every time you change your mind about how something should be organized or what should be included in the Public API. Remember that feeling when you first started writing automated code tests and realized that you caught all kinds of errors that would have otherwise snuck into your codebase? That’s the exact same feeling you’ll have if you write the Readme for your project before you write the actual code.
As a byproduct of writing a Readme in order to know what you need to implement, you’ll have a very nice piece of documentation sitting in front of you. You’ll also find that it’s much easier to write this document at the beginning of the project when your excitement and motivation are at their highest. Retroactively writing a Readme is an absolute drag, and you’re sure to miss all kinds of important details when you do so.
If you’re working with a team of developers you get even more mileage out of your Readme. If everyone else on the team has access to this information before you’ve completed the project, then they can confidently start work on other projects that will interface with your code. Without any sort of defined interface, you have to code in serial or face reimplementing large portions of code.
It’s a lot simpler to have a discussion based on something written down. It’s easy to talk endlessly and in circles about a problem if nothing is ever put to text. The simple act of writing down a proposed solution means everyone has a concrete idea that can be argued about and iterated upon.

Consider the process of writing the Readme for your project as the true act of creation. This is where all your brilliant ideas should be expressed. This document should stand on its own as a testament to your creativity and expressiveness. The Readme should be the single most important document in your codebase; writing it first is the proper thing to do.

–

Discuss this post on Hacker News

http://tom.preston-werner.com/2010/08/23/readme-driven-development

TomDoc - Reasonable Ruby Documentation

May 11, 2010 Updated May 11, 2010

11 May 2010 - San Francisco

RDoc is an abomination. It’s ugly to read in plain text, requires the use of the inane :nodoc: tag to prevent private method documentation from showing up in final rendering, and does nothing to encourage complete or unambiguous documentation of classes, methods, or parameters. YARD is much better but goes too far in the other direction (and still doesn’t look good in plain text). Providing an explicit way to specify parameters and types is great, but having to remember a bunch of strict tag names in order to be compliant is not a good way to encourage coders to write documentation. And again we see a @private tag that’s necessary to hide docs from the final render.

Three years ago, after suffering with these existing documentation formats for far too long, I started using my own documentation format. It looked a bit like RDoc but had a set of conventions for specifying parameters, return values, and the expected types. It used plain language and full sentences so that a human could read and understand it without having to parse machine-oriented tags or crufty markup. I called this format TomDoc, because if Linus can name stuff after himself, then why can’t I?

After years in the making, TomDoc is finally a well specified documentation format. You can find the full spec at http://tomdoc.org.

But enough talk. Here’s a sample of what a TomDoc’d method might look like:

# Public: Duplicate some text an arbitrary number of times.
#
# text  - The String to be duplicated.
# count - The Integer number of times to duplicate the text.
#
# Examples
#
#   multiplex('Tom', 4)
#   # => 'TomTomTomTom'
#
# Returns the duplicated String.
def multiplex(text, count)
  text * count
end

At first glance you’ll notice a few things. First, and most important, is that the documentation looks nice in plain text. When I’m working on a project, I need to be able to scan and read method documentation quickly. Littering the docs with tags and markup (especially HTML markup) is not acceptable. Code documentation should be optimized for human consumption. Second, all parameters and return values, and their expected types are specified. Types are generally denoted by class name. Because Ruby is so flexible, you are not constrained by a rigid type declaration syntax and are free to explain precisely how the expected types may vary under different circumstances. Finally, the basic layout is designed to be easy to remember. Once you commit a few simple conventions to memory, writing documentation becomes second nature, with all of the tricky decision making already done for you.
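To illustrate the point about flexible type descriptions, here is a second small sketch in the same style. The method name `join_words` and its parameters are made up for this example, and the "(default: ...)" wording is my assumption about phrasing rather than a quote from the spec.

```ruby
# Public: Join one or more words into a single line of text.
#
# words     - The String word or Array of String words to join.
# separator - The String placed between words (default: ' ').
#
# Examples
#
#   join_words(['Tom', 'Doc'])
#   # => 'Tom Doc'
#
#   join_words('Tom')
#   # => 'Tom'
#
# Returns the joined String.
def join_words(words, separator = ' ')
  # Kernel#Array wraps a lone String in an Array and leaves Arrays untouched.
  Array(words).join(separator)
end
```

Because the types are written in plain language, "String word or Array of String words" needs no special syntax.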

Today’s Ruby libraries suffer deeply from haphazard versioning schemes. Even RubyGems itself does not follow a sane or predictable versioning pattern. This lack of discipline stems from the absence of well defined Public APIs. TomDoc attempts to solve this problem by making it simple to define an unambiguous Public API for your library. Instead of assuming that all classes and methods are intended for public consumption, TomDoc makes the Public API opt-in. To denote that something is public, all you have to do is preface the main description with “Public:”. By forcing you to explicitly state that a class or method is intended for public consumption, a deliberate and thoughtful Public API is automatically constructed that can inform disciplined version changes according to the tenets of Semantic Versioning. In addition, the prominent display of “Public” in a method description ensures that developers are made aware of the sensitive nature of the method and do not carelessly change the signature of something in the Public API.
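As a rough sketch of what "opt-in" could mean in practice, the snippet below lists the methods in a piece of Ruby source whose comment block starts with "Public:". The helper `public_methods_in` is hypothetical and is not part of any TomDoc tooling; it uses a naive line scan rather than a real parser, so treat it as an illustration only.

```ruby
# Hypothetical helper, not part of any TomDoc tooling: list the methods in a
# Ruby source string whose comment block starts with "Public:".
def public_methods_in(source)
  names = []
  comment = [] # comment lines directly above the current line
  source.each_line do |line|
    stripped = line.strip
    if stripped.start_with?('#')
      comment << stripped.sub(/\A#\s?/, '')
    else
      if (match = stripped.match(/\Adef\s+(?:self\.)?([A-Za-z_]\w*[?!=]?)/))
        names << match[1] if comment.first.to_s.start_with?('Public:')
      end
      comment = [] # any non-comment line ends the current comment block
    end
  end
  names
end

source = <<~RUBY
  # Public: Duplicate some text an arbitrary number of times.
  #
  # Returns the duplicated String.
  def multiplex(text, count)
    text * count
  end

  # Normalize a count before use.
  def normalize(count)
    count.to_i
  end
RUBY

public_api = public_methods_in(source)
# public_api => ['multiplex']
```

Anything without the "Public:" prefix is simply left out of the list, which is the whole idea: the Public API is whatever you explicitly marked.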

Once a Public API has been established, some very exciting things become possible. We’re currently working on a processing tool that will render TomDoc into various forms (terminal, HTML, etc). If you run this tool on a library, you’ll get a printout of the Public API documentation. You can publish this online so that others have easy access to it. When you roll a new version of the library, you can run the tool again, giving it a prior version as a base, and have it automatically display only the methods that have changed. This diff will be extremely useful for users while they upgrade to the new version (or so they can evaluate whether an upgrade is warranted)!
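The tool described above does not exist yet, so the following is only a sketch of the diffing idea under my own assumptions. Each version of the Public API is represented as a Hash of method name to signature String, and the names `public_api_diff` and `suggested_bump` are hypothetical. The bump rule treats any signature change as breaking, which is a conservative reading of Semantic Versioning.

```ruby
# Hypothetical sketch: compare two versions of a Public API, each given as a
# Hash of method name => signature String.
def public_api_diff(old_api, new_api)
  shared = old_api.keys & new_api.keys
  {
    added:   (new_api.keys - old_api.keys).sort,
    removed: (old_api.keys - new_api.keys).sort,
    changed: shared.select { |name| old_api[name] != new_api[name] }.sort
  }
end

# Conservative rule of thumb: removals and signature changes are treated as
# breaking (major), additions as minor, and everything else as a patch.
def suggested_bump(diff)
  return :major unless diff[:removed].empty? && diff[:changed].empty?
  return :minor unless diff[:added].empty?
  :patch
end

v1 = { 'multiplex' => 'multiplex(text, count)', 'shout' => 'shout(text)' }
v2 = { 'multiplex' => 'multiplex(text, count, separator)', 'whisper' => 'whisper(text)' }

diff = public_api_diff(v1, v2)
# diff[:added]         => ['whisper']
# diff[:removed]       => ['shout']
# diff[:changed]       => ['multiplex']
# suggested_bump(diff) => :major
```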

While I’ve been using various nascent forms of TomDoc for several years, we’re just now starting to adopt it for everything we do at GitHub. Now that I’ve formalized the spec it will be easy for the entire team to write compliant TomDoc. The goal is to have every class, method, and accessor of every GitHub library documented. In the future, once we have proper tooling, we’d even like to create a unit test that will fail if anything is missing documentation.
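A documentation check along those lines might look something like the sketch below. It is not official tooling: `undocumented_methods` is a hypothetical helper that only looks at whether the line directly above each `def` is a comment, which is far cruder than validating real TomDoc.

```ruby
# Hypothetical check, not official TomDoc tooling: return the names of methods
# in a Ruby source string that have no comment on the line directly above.
def undocumented_methods(source)
  missing = []
  previous = ''
  source.each_line do |line|
    stripped = line.strip
    if (match = stripped.match(/\Adef\s+(?:self\.)?([A-Za-z_]\w*[?!=]?)/))
      missing << match[1] unless previous.start_with?('#')
    end
    previous = stripped
  end
  missing
end

sample = <<~RUBY
  # Public: Duplicate some text an arbitrary number of times.
  def multiplex(text, count)
    text * count
  end

  def helper
  end
RUBY

missing = undocumented_methods(sample)
# missing => ['helper']
```

A unit test could then read each library file and assert that the returned list is empty, failing the build whenever someone adds a method without documenting it.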

TomDoc is still a rough specification so I’m initially releasing it as 0.9.0. Over the coming months I’ll make any necessary changes to address user concerns and release a 1.0.0 version once things have stabilized. If you’d like to suggest changes, please open an issue on the TomDoc GitHub repository.

http://tom.preston-werner.com/2010/05/11/tomdoc-reasonable-ruby-documentation

The Git Parable

May 19, 2009 Updated May 19, 2009

19 May 2009 - San Francisco

Git is a simple, but extremely powerful system. Most people try to teach Git by demonstrating a few dozen commands and then yelling “tadaaaaa.” I believe this method is flawed. Such a treatment may leave you with the ability to use Git to perform simple tasks, but the Git commands will still feel like magical incantations. Doing anything out of the ordinary will be terrifying. Until you understand the concepts upon which Git is built, you’ll feel like a stranger in a foreign land.

The following parable will take you on a journey through the creation of a Git-like system from the ground up. Understanding the concepts presented here will be the most valuable thing you can do to prepare yourself to harness the full power of Git. The concepts themselves are quite simple, but allow for an amazing wealth of functionality to spring into existence. Read this parable all the way through and you should have very little trouble mastering the various Git commands and wielding the awesome power that Git makes available to you.

The Parable

Imagine that you have a computer that has nothing on it but a text editor and a few file system commands. Now imagine that you have decided to write a large software program on this system. Because you’re a responsible software developer, you decide that you need to invent some sort of method for keeping track of versions of your software so that you can retrieve code that you previously changed or deleted. What follows is a story about how you might design one such version control system (VCS) and the reasoning behind those design choices.

Snapshots

Alfred is a friend of yours that works down at the mall as a photographer in one of those “Special Moments” photo boutiques. All day long he takes photos of little kids posing awkwardly in front of jungle or ocean backdrops. During one of your frequent lunches at the pretzel stand, Alfred tells you a story about a woman named Hazel who brings her daughter in for a portrait every year on the same day. “She brings the photos from all the past years with her,” Alfred tells you. “She likes to remember what her daughter was like at each different stage, as if the snapshots really let her move back and forth in time to those saved memories.”

Like some sort of formulaic plot device, Alfred’s innocent statement acts as a catalyst for you to see the ideal solution to your version control dilemma. Snapshots, like save points in a video game, are really what you care about when you need to interact with a VCS. What if you could take snapshots of your codebase at any time and resurrect that code on demand? Alfred reads the dawning realization spreading across your face and knows you’re about to leave him without another word to go back and implement whatever genius idea he just caused you to have. You do not disappoint him.

You start your project in a directory named working. As you code, you try to write one feature at a time. When you complete a self-contained portion of a feature, you make sure that all your files are saved and then make a copy of the entire working directory, giving it the name snapshot-0. After you perform this copy operation, you make sure to never again change the code files in the new directory. After the next chunk of work, you perform another copy, only this time the new directory gets the name snapshot-1, and so on.

To make it easy to remember what changes you made in each snapshot, you add a special file named message to each snapshot directory that contains a summary of the work that you did and the date of completion. By printing the contents of each message, it becomes easy to find a specific change that you made in the past, in case you need to resurrect some old code.
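If you wanted to automate that routine, it might look something like the following Python sketch. The function names (take_snapshot, log) and the two-line layout of the message file are assumptions made for illustration; the parable only asks for a summary and a date.

```python
import shutil
from datetime import date
from pathlib import Path


def take_snapshot(root: Path, summary: str) -> Path:
    """Copy root/working to the next free root/snapshot-N and add a message file."""
    n = 0
    while (root / f"snapshot-{n}").exists():
        n += 1
    snapshot = root / f"snapshot-{n}"
    shutil.copytree(root / "working", snapshot)
    # Assumed layout: summary on the first line, completion date on the second.
    (snapshot / "message").write_text(f"{summary}\n{date.today().isoformat()}\n")
    return snapshot


def log(root: Path) -> list:
    """Return the summary line of every snapshot message, oldest first."""
    snapshots = sorted(
        root.glob("snapshot-*"), key=lambda p: int(p.name.split("-")[1])
    )
    return [(s / "message").read_text().splitlines()[0] for s in snapshots]
```

Calling take_snapshot after each chunk of work gives you snapshot-0, snapshot-1, and so on, and log prints the trail of summaries you would scan when hunting for old code.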

Branches

After a bit of time on the project, a candidate for release begins to emerge. Late nights at the keyboard finally yield snapshot-99, the nascent form of what will become Release Version 1.0. It comes to pass that this snapshot is packaged and distributed to the eagerly awaiting masses. Stoked by excellent response to your software, you push forward, determined to make the next version an even bigger success.

Your VCS has so far been a faithful companion. Old versions of your code are there when you need them and can be accessed with ease. But not long after the release, bug reports start to come in. Nobody’s perfect, you reassure yourself, and snapshot-99 is readily retrievable, glad to be brought back to life for the purposes of applying bug fixes.

Since the release, you’ve created 10 new snapshots. This new work must not be included in the 1.0.1 bug fix version you now need to create. To solve this, you copy snapshot-99 to working so that your working directory is at exactly the point where Version 1.0 was released. A few swift lines of code and the bug is fixed in the working directory.

It is here that a problem becomes apparent. The VCS deals very well with linear development, but for the first time ever, you need to create a new snapshot that is not a direct descendant of the preceding snapshot. If you create a snapshot-110 (remember that you created 10 snapshots since the release), then you’ll be interrupting the linear flow and will have no way of determining the ancestry of any given snapshot. Clearly, you need something more powerful than a linear system.

Studies show that even short exposures to nature can help recharge the mind’s creative potential. You’ve been sitting behind the artificially polarized light of your monitor for days on end. A walk through the woods in the brisk Autumn air will do you some good and with any luck, will help you arrive at an ideal solution to your problem.

The great oaks that line the trail have always appealed to you. They seem to stand stark and proud against the perfectly blue sky. Half the ruddy leaves have departed from their branches, leaving an intricate pattern of branches in their wake. Fixating on one of the thousands of branch tips you idly try to follow it back to the solitary trunk. This organically produced structure allows for such great complexity, but the rules for finding your way back to the trunk are so simple, and perfect for keeping track of multiple lines of development! It turns out that what they say about nature and creativity is true.

By looking at your code history as a tree, solving the problem of ancestry becomes trivial. All you need to do is include the name of the parent snapshot in the message file you write for each snapshot. Adding just a single upstream pointer will enable you to easily and accurately trace the history of any given snapshot all the way back to the root.
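Tracing that ancestry is a short loop. Here is a rough Python sketch, assuming the parent names have already been read out of the message files into a dictionary (that in-memory shape is a simplification of mine, not part of the story):

```python
from typing import Optional


def ancestry(parents: dict, snapshot: str) -> list:
    """Follow parent pointers from a snapshot all the way back to the root."""
    chain = []
    current: Optional[str] = snapshot
    while current is not None:
        chain.append(current)
        current = parents[current]  # the root snapshot maps to None
    return chain


# Linear history snapshot-0 .. snapshot-109, plus a bug fix that branches off
# the Version 1.0 release (snapshot-99).
parents = {f"snapshot-{n}": f"snapshot-{n - 1}" for n in range(1, 110)}
parents["snapshot-0"] = None
parents["snapshot-110"] = "snapshot-99"

bugfix_history = ancestry(parents, "snapshot-110")
```

The bug fix's history goes straight from snapshot-110 to snapshot-99 and then down to the root, skipping the ten snapshots of new work entirely.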

Branch Names

Your code history is now a tree. Instead of having a single latest snapshot, you have two: one for each branch. With a linear system, your sequential numbering system let you easily identify the latest snapshot. Now, that ability is lost.

Creating new development branches has become so simple that you’ll want to take advantage of it all the time. You’ll be creating branches for fixes to old releases, for experiments that may not pan out; indeed it becomes possible to create a new branch for every feature you begin!

But like everything good in life, there is a price to be paid. Each time you create a new snapshot, you must remember that the new snapshot becomes the latest on its branch. Without this information, switching to a new branch would become a laborious process indeed.

Every time you create a new branch you probably give it a name in your head. “This will be the Version 1.0 Maintenance Branch,” you might say. Perhaps you refer to the former linear branch as the “master” branch.

Think about this a little further, though. From the perspective of a tree, what does it mean to name a branch? Naming every snapshot that appears in the history of a branch would do the trick, but requires the storage of a potentially large amount of data. Additionally, it still wouldn’t help you efficiently locate the latest snapshot on a branch.

The least amount of information necessary to identify a branch is the location of the latest snapshot on that branch. If you need to know the list of snapshots that are part of the branch you can easily trace the parentage.

Storing the branch names is trivial. In a file named branches, stored outside of any specific snapshot, you simply list the name/snapshot pairs that represent the tips of branches. To switch to a named branch you need only look up the snapshot for the corresponding name from this file.

Because you’re only storing the latest snapshot on each branch, creating a new snapshot now contains an additional step. If the new snapshot is being created as part of a branch, the branches file must be updated so that the name of the branch becomes associated with the new snapshot. A small price to pay for the benefit.
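Here is one way the branches file could be handled, sketched in Python. The on-disk format (one "name snapshot" pair per line, split on the last space so that branch names may contain spaces) is my assumption; the parable only says the file lists name/snapshot pairs.

```python
from pathlib import Path


def read_branches(path: Path) -> dict:
    """Load the name -> latest snapshot pairs from the branches file."""
    branches = {}
    if path.exists():
        for line in path.read_text().splitlines():
            # Split on the last space so names like "Version 1.0 Maintenance" work.
            name, snapshot = line.rsplit(" ", 1)
            branches[name] = snapshot
    return branches


def update_branch(path: Path, name: str, snapshot: str) -> None:
    """Point a branch name at a new tip (the extra step after every snapshot)."""
    branches = read_branches(path)
    branches[name] = snapshot
    lines = [f"{n} {s}\n" for n, s in sorted(branches.items())]
    path.write_text("".join(lines))
```

Switching branches is then a lookup: read_branches(path)["master"] hands you the snapshot to copy into working.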

Tags

After using branches for a while you notice that they can serve two purposes. First, they can act as movable pointers to snapshots so that you can keep track of the branch tips. Second, they can be pointed at a single snapshot and never move.

The first use case allows you to keep track of ongoing development, things like “Release Maintenance”. The second case is useful for labeling points of interest, like “Version 1.0” and “Version 1.0.1”.

Mixing both of these uses into a single file feels messy. Both types are pointers to snapshots, but one moves and one doesn’t. For the sake of clarity and elegance, you decide to create another file called tags to contain pointers of the second type.

Keeping these two inherently different pointers in separate files will keep you from accidentally treating a branch as a tag or vice versa.

Distributed

Working on your own gets pretty lonely. Wouldn’t it be nice if you could invite a friend to work on your project with you? Well, you’re in luck. Your friend Zoe has a computer set up just like yours and wants to help with the project. Because you’ve created such a great version control system, you tell her all about it and send her a copy of all your snapshots, branches, and tags so she can enjoy the same benefits of the code history.

It’s great to have Zoe on the team but she has a habit of taking long trips to far away places without internet access. As soon as she has the source code, she catches a flight to Patagonia and you don’t hear from her for a week. In the meantime you both code up a storm. When she finally gets back, you discover a critical flaw in your VCS. Because you’ve both been using the same numbering system, you each have directories named ‘snapshot-114’, ‘snapshot-115’, and so on, but with different contents!

To make matters worse, you don’t even know who authored the changes in those new snapshots. Together, you devise a plan for dealing with these problems. First, snapshot messages will henceforth contain author name and email. Second, snapshots will no longer be named with simple numbers. Instead, you’ll use the contents of the message file to produce a hash. This hash will be guaranteed to be unique to the snapshot since no two messages will ever have the same date, message, parent, and author. To make sure everything goes smoothly, you both agree to use the SHA1 hash algorithm that takes the contents of a file and produces a 40 character hexadecimal string. You both update your histories with the new technique and instead of clashing ‘snapshot-114’ directories, you now have distinct directories named ‘8ba3441b6b89cad23387ee875f2ae55069291f4b’ and ‘db9ecb5b5a6294a8733503ab57577db96ff2249e’.
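The naming step itself is tiny. A minimal Python sketch, using hashlib from the standard library; the field layout of the message below is hypothetical, since the story only says it holds the date, message, parent, and author:

```python
import hashlib


def snapshot_name(message: str) -> str:
    """Name a snapshot after the SHA1 of its message file contents."""
    return hashlib.sha1(message.encode("utf-8")).hexdigest()


# Hypothetical message layout, for illustration only.
yours = snapshot_name(
    "date 2009-05-19\n"
    "author You <you@example.com>\n"
    "parent snapshot-113\n"
    "\n"
    "Fix rounding bug\n"
)
zoes = snapshot_name(
    "date 2009-05-19\n"
    "author Zoe <zoe@example.com>\n"
    "parent snapshot-113\n"
    "\n"
    "Fix rounding bug\n"
)
```

Even with the same date, parent, and summary, the differing author line gives the two snapshots different 40 character names, so the directories can no longer clash.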

With the updated naming scheme, it becomes trivial for you to fetch all the new snapshots from Zoe’s computer and place them next to your existing snapshots. Because every snapshot specifies its parent, and identical messages (and therefore identical snapshots) have identical names no matter where they are created, the history of the codebase can still be drawn as a tree. Only now, the tree is composed of snapshots authored by both Zoe and you.

This point is important enough to warrant repeating. A snapshot is identified by a SHA1 that uniquely identifies it (and its parent). These snapshots can be created and moved around between computers without losing their identity or where they belong in the history tree of a project. What’s more, snapshots can be shared or kept private as you see fit. If you have some experimental snapshots that you want to keep to yourself, you can do so quite easily. Just don’t make them available to Zoe!

Offline

Zoe’s travel habits cause her to spend countless hours on airplanes and boats. Most of the places she visits have no readily available internet access. At the end of the day, she spends more time offline than online.

It’s no surprise, then, that Zoe raves about your VCS. All of the day to day operations that she needs to do can be done locally. The only time she needs a network connection is when she’s ready to share her snapshots with you.

Merges

Before Zoe left on her trip, you had asked her to start working off of the branch named ‘math’ and to implement a function that generated prime numbers. Meanwhile, you were also developing off of the ‘math’ branch, only you were writing a function to generate magic numbers. Now that Zoe has returned, you are faced with the task of merging these two separate branches of development into a single snapshot. Since you both worked on separate tasks, the merge is simple. While constructing the snapshot message for the merge, you realize that this snapshot is special. Instead of just a single parent, this merge snapshot has two parents! The first parent is your latest on the ‘math’ branch and the second parent is Zoe’s latest on her ‘math’ branch. The merge snapshot doesn’t contain any changes beyond those necessary to merge the two disparate parents into a single codebase.

Once you complete the merge, Zoe fetches all the snapshots that you have that she does not, which include your development on the ‘math’ branch and your merge snapshot. Once she does this, both of your histories match exactly!
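The whole exchange can be modeled in a few lines of Python. This is only a sketch under simplifying assumptions: each computer's snapshots live in a dictionary keyed by SHA1 rather than in directories, the message layout is invented, and the helper names (make_snapshot, fetch, history) are hypothetical.

```python
import hashlib


def make_snapshot(store: dict, summary: str, author: str, parents: list) -> str:
    """Add a snapshot to a store and return its SHA1 name."""
    lines = [f"author {author}"] + [f"parent {p}" for p in parents] + ["", summary]
    message = "\n".join(lines)
    name = hashlib.sha1(message.encode("utf-8")).hexdigest()
    store[name] = {"message": message, "parents": list(parents)}
    return name


def fetch(dest: dict, source: dict) -> None:
    """Copy every snapshot that dest is missing from source."""
    for name, snapshot in source.items():
        dest.setdefault(name, snapshot)


def history(store: dict, tip: str) -> set:
    """Return the names of tip and all its ancestors, following every parent."""
    seen, pending = set(), [tip]
    while pending:
        name = pending.pop()
        if name not in seen:
            seen.add(name)
            pending.extend(store[name]["parents"])
    return seen
```

A merge is just make_snapshot with two entries in parents. Once each side has fetched from the other, the two dictionaries compare equal, which is the "both of your histories match exactly" moment.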

Rewriting History

Like many software developers you have a compulsion to keep your code clean and very well organized. This carries over into a desire to keep your code history well groomed. Last night you came home after having a few too many pints of Guinness at the local brewpub and started coding, producing a handful of snapshots along the way. This morning, a review of the code you wrote last night makes you cringe a little bit. The code is good overall, but you made a lot of mistakes early on that you corrected in later snapshots.

Let’s say the branch on which you did your drunken development is called ‘drunk’ and you made three snapshots after you got home from the bar. If the name ‘drunk’ points at the latest snapshot on that branch, then you can use a useful notation to refer to the parent of that snapshot. The notation ‘drunk^’ means the parent of the snapshot pointed to by the branch name ‘drunk’. Similarly ‘drunk^^’ means the grandparent of the ‘drunk’ snapshot. So the three snapshots in chronological order are ‘drunk^^’, ‘drunk^’, and ‘drunk’.
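Resolving that notation is a matter of counting carets. A small Python sketch, assuming the branch tips and parent pointers are available as dictionaries and that every snapshot involved has a single parent; the short snapshot names stand in for real SHA1s:

```python
def resolve(ref: str, branches: dict, parents: dict) -> str:
    """Turn a reference such as 'drunk^^' into a snapshot name."""
    name = ref.rstrip("^")
    hops = len(ref) - len(name)  # one step toward the root per caret
    snapshot = branches[name]
    for _ in range(hops):
        snapshot = parents[snapshot]
    return snapshot


# Three snapshots made after the bar, on top of an earlier snapshot "base".
branches = {"drunk": "third"}
parents = {"third": "second", "second": "first", "first": "base", "base": None}

oldest_drunk_snapshot = resolve("drunk^^", branches, parents)  # "first"
```

One more caret, drunk^^^, lands on the snapshot you started from before the evening's work began.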

You’d really like those three lousy snapshots to be two clean snapshots. One that changes an existing function, and one that adds a new file. To accomplish this revision of history you copy ‘drunk’ to ‘working’ and delete the file that is new in the series. Now ‘working’ represents the correct modifications to the existing function. You create a new snapshot from ‘working’ and write the message to be appropriate to the changes. For the parent you specify the SHA1 of the ‘drunk^^^’ snapshot, essentially creating a new branch off of the same snapshot as last night. Now you can copy ‘drunk’ to ‘working’ and roll a snapshot with the new file addition. As the parent you specify that snapshot you created just before this one.

As the last step, you change the branch name ‘drunk’ to point to the last snapshot you just made.

The history of the ‘drunk’ branch now represents a nicer version of what you did last night. The other snapshots that you’ve replaced are no longer needed so you can delete them or just leave them around for posterity. No branch names are currently pointing at them so it will be hard to find them later on, but if you don’t delete them, they’ll stick around.

Staging Area

As much as you try to keep your new modifications related to a single feature or logical chunk, you sometimes get sidetracked and start hacking on something totally unrelated. Only half-way into this do you realize that your working directory now contains what should really be separated as two discrete snapshots.

To help you with this annoying situation, the concept of a staging directory is useful. This area acts as an intermediate step between your working directory and a final snapshot. Each time you finish a snapshot, you also copy it to the staging directory. Now, every time you finish an edit to an existing file, create a new file, or remove a file, you can decide whether that change should be part of your next snapshot. If it belongs, you mimic the change inside staging. If it doesn’t, you can leave it in working and make it part of a later snapshot. From now on, snapshots are created directly from the staging directory.

This separation of coding and preparing the stage makes it easy to specify what is and is not included in the next snapshot. You no longer have to worry too much about making an accidental, unrelated change in your working directory.

You have to be a bit careful, though. Consider a file named README. You make an edit to this file and then mimic that in staging. You go on about your business, editing other files. After a bit, you make another change to README. Now you have made two changes to that file, but only one is in the staging area! Were you to create a snapshot now, your second change would be absent.

The lesson is this: every new edit must be added to the staging area if it is to be part of the next snapshot.
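The README trap is easy to reproduce. Here is a rough Python sketch of "mimicking a change inside staging", assuming working and staging sit side by side under one root directory; the stage helper is hypothetical:

```python
import shutil
from pathlib import Path


def stage(root: Path, relative: str) -> None:
    """Mimic the current state of one working file inside staging."""
    src = root / "working" / relative
    dst = root / "staging" / relative
    if src.exists():
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # new or edited file
    elif dst.exists():
        dst.unlink()  # the file was removed from working
```

Edit README, call stage(root, "README"), then edit README again: staging still holds the first edit, and only another call to stage picks up the second.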

Diffs

With a working directory, a staging area, and loads of snapshots lying around, it starts to get confusing as to what the specific code changes are between these directories. A snapshot message only gives you a summary of what changed, not exactly what lines were changed between two files.

Using a diffing algorithm, you can implement a small program that shows you the differences in two codebases. As you develop and copy things from your working directory to the staging area, you’ll want to easily see what is different between the two, so that you can determine what else needs to be staged. It’s also important to see how the staging area is different from the last snapshot, since these changes are what will become part of the next snapshot you produce.

There are many other diffs you might want to see. The differences between a specific snapshot and its parent would show you the “changeset” that was introduced by that snapshot. The diff between two branches would be helpful for making sure your development doesn’t wander too far away from the mainline.
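You don't have to invent the diffing algorithm yourself. As a sketch, Python's standard difflib module can compare two versions of a file; the diff_text wrapper and its default labels are my own naming, and a full tool would walk both directories and call this for each file:

```python
import difflib


def diff_text(old: str, new: str, old_label: str = "staging",
              new_label: str = "working") -> str:
    """Return a unified diff between two versions of a file's text."""
    lines = difflib.unified_diff(
        old.splitlines(keepends=True),
        new.splitlines(keepends=True),
        fromfile=old_label,
        tofile=new_label,
    )
    return "".join(lines)


staged = "def area(r):\n    return 3.14 * r * r\n"
edited = "import math\n\ndef area(r):\n    return math.pi * r * r\n"
print(diff_text(staged, edited))
```

Lines prefixed with a minus sign exist only in the staged copy and lines prefixed with a plus sign exist only in the working copy; identical inputs produce an empty string.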

Eliminating Duplication

After a few more trips to Namibia, Istanbul, and Galapagos, Zoe starts to complain that her hard drive is filling up with hundreds of nearly identical copies of the software. You too have been feeling like all the file duplication is wasteful. After a bit of thinking, you come up with something very clever.

You remember that the SHA1 hash produces a short string that is unique for a given file contents. Starting with the very first snapshot in the project history, you start a conversion process. First, you create a directory named objects outside of the code history. Next, you find the most deeply nested directory in the snapshot. Additionally, you open up a temporary file for writing. For each file in this directory you perform three steps. Step 1: Calculate the SHA1 of the contents. Step 2: Add an entry into the temp file that contains the word ‘blob’ (binary large object), the SHA1 from the first step, and the filename. Step 3: Copy the file to the objects directory and rename it to the SHA1 from step 1. Once finished with all the files, find the SHA1 of the temp file contents and use that to name the temp file, also placing it in the objects directory.

If at any time the objects directory already contains a file with a given name, then you have already stored that file’s contents and there is no need to do so again.

Now, move up one directory and start over. Only this time, when you get to the entry for the directory that you just processed, enter the word ‘tree’, the SHA1 of the temp file from last time, and the directory’s name into the new temp file. In this fashion you can build up a tree of directory object files that contain the SHA1s and names of the files and directory objects that they contain.

Once this has been accomplished for every directory and file in the snapshot, you have a single root directory object file and its corresponding SHA1. Since nothing contains the root directory, you must record the root tree’s SHA1 somewhere. An ideal place to store it is in the snapshot message file. This way, the uniqueness of the SHA1 of the message also depends on the entire contents of the snapshot, and you can guarantee with absolute certainty that two identical snapshot message SHA1s contain the same files!
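The conversion can be sketched in Python as a pair of functions. Recursion visits the most deeply nested directories first, which has the same effect as the bottom-up walk described above. The entry format ("blob" or "tree", then the SHA1, then the name, one per line) follows the text, but the exact separators and the function names are assumptions; real Git's object encoding differs.

```python
import hashlib
from pathlib import Path


def store_object(objects: Path, data: bytes) -> str:
    """Save data under its SHA1 in the objects directory and return the SHA1."""
    sha = hashlib.sha1(data).hexdigest()
    target = objects / sha
    if not target.exists():  # already stored, so there is nothing to do
        target.write_bytes(data)
    return sha


def write_tree(directory: Path, objects: Path) -> str:
    """Store a directory as blob and tree objects; return the tree's SHA1."""
    entries = []
    for child in sorted(directory.iterdir(), key=lambda p: p.name):
        if child.is_dir():
            entries.append(f"tree {write_tree(child, objects)} {child.name}")
        else:
            sha = store_object(objects, child.read_bytes())
            entries.append(f"blob {sha} {child.name}")
    listing = "\n".join(entries) + "\n"
    return store_object(objects, listing.encode("utf-8"))
```

Calling write_tree on a snapshot directory returns the root tree's SHA1, which is the value you would then record in the snapshot message.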

It’s also convenient to create an object from the snapshot message in the same way that you do for blobs and trees. Since you’re maintaining a list of branch and tag names that point to message SHA1s you don’t have to worry about losing track of which snapshots are important to you.

With all of this information stored in the objects directory, you can safely delete the snapshot directory that you used as the source of this operation. If you want to reconstitute the snapshot at a later date it’s simply a matter of following the SHA1 of the root tree stored in the message file and extracting each tree and blob into their corresponding directory and file.

For a single snapshot, this transformation process doesn’t get you much. You’ve basically just converted one filesystem into another and created a lot of work in the process. The real benefits of this system arise from reuse of trees and blobs across snapshots. Imagine two sequential snapshots in which only a single file in the root directory has changed. If the snapshots both contain 10 directories and 100 files, the transformation process will create 10 trees and 100 blobs from the first snapshot but only one new blob and one new tree from the second snapshot!

By converting every snapshot directory in the old system to object files in the new system, you can drastically reduce the number of files that are stored on disk. Now, instead of storing perhaps 50 identical copies of a rarely changed file, you only need to keep one.

Compressing Blobs

Eliminating blob and tree duplication significantly reduces the total storage size of your project history, but that’s not the only thing you can do to save space. Source code is just text. Text can be very efficiently compressed using something like the LZW or DEFLATE compression algorithms. If you compress every blob before computing its SHA1 and saving it to disk you can reduce the total storage size of the project history by another very admirable quantity.
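Python's zlib module implements DEFLATE, so a sketch of a compressed object store is short. One deliberate difference from the description above: this sketch names each blob after the SHA1 of its uncompressed contents, which is the order Git itself uses, so the name does not depend on compressor settings. The dictionary-based store and the function names are assumptions for illustration.

```python
import hashlib
import zlib


def store_compressed(objects: dict, data: bytes) -> str:
    """Store data compressed, keyed by the SHA1 of the uncompressed bytes."""
    sha = hashlib.sha1(data).hexdigest()
    objects.setdefault(sha, zlib.compress(data))
    return sha


def load(objects: dict, sha: str) -> bytes:
    """Return the original, uncompressed contents of a stored object."""
    return zlib.decompress(objects[sha])


objects = {}
source = b"def add(a, b):\n    return a + b\n" * 50
sha = store_compressed(objects, source)
saved = len(source) - len(objects[sha])  # bytes saved for this blob
```

Repetitive text like source code compresses well, and load hands back exactly the bytes that went in.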

The True Git

The VCS you have constructed is now a reasonable facsimile of Git. The main difference is that Git gives you very nice command line tools to handle such things as creating new snapshots and switching to old ones (Git uses the term “commit” instead of “snapshot”), tracing history, keeping branch tips up-to-date, fetching changes from other people, merging and diffing branches, and hundreds of other common (and not-so-common) tasks.

As you continue to learn Git, keep this parable in mind. Git is really very simple underneath, and it is this simplicity that makes it so flexible and powerful. One last thing before you run off to learn all the Git commands: remember that it is almost impossible to lose work that has been committed. Even when you delete a branch, all that’s really happened is that the pointer to that commit has been removed. All of the snapshots are still in the objects directory; you just need to dig up the commit SHA. In these cases, look up git reflog. It contains a history of what each branch pointed to and in times of crisis, it will save the day.

Here are some resources that you should follow as your next step. Now, go, and become a Git master!

http://tom.preston-werner.com/2009/05/19/the-git-parable

Blogging Like a Hacker

Nov 17, 2008 Updated Nov 17, 2008

17 Nov 2008 - San Francisco

Back in 2000, when I thought I was going to be a professional writer, I spent hours a day on LiveJournal doing writing practice with other aspiring poets and authors. Since then I’ve blogged at three different domains about web standards, print design, photography, Flash, illustration, information architecture, ColdFusion, package management, PHP, CSS, advertising, Ruby, Rails, and Erlang.

I love writing. I get a kick out of sharing my thoughts with others. The act of transforming ideas into words is an amazingly efficient way to solidify and refine your thoughts about a given topic. But as much as I enjoy blogging, I seem to be stuck in a cycle of quitting and starting over. Before starting the current iteration, I resolved to do some introspection to determine the factors that were leading to this destructive pattern.

I already knew a lot about what I didn’t want. I was tired of complicated blogging engines like WordPress and Mephisto. I wanted to write great posts, not style a zillion template pages, moderate comments all day long, and constantly lag behind the latest software release. Something like Posterous looked attractive, but I wanted to style my blog, and it needed to be hosted at the domain of my choosing. For the same reason, other hosted sites (wordpress.com, blogger.com) were disqualified. There are a few people directly using GitHub as a blog (which is very cool), but that’s a bit too much of an impedance mismatch for my tastes.

On Sunday, October 19th, I sat down in my San Francisco apartment with a glass of apple cider and a clear mind. After a period of reflection, I had an idea. While I’m not specifically trained as an author of prose, I am trained as an author of code. What would happen if I approached blogging from a software development perspective? What would that look like?

First, all my writing would be stored in a Git repository. This would ensure that I could try out different ideas and explore a variety of posts all from the comfort of my preferred editor and the command line. I’d be able to publish a post via a simple deploy script or post-commit hook. Complexity would be kept to an absolute minimum, so a static site would be preferable to a dynamic site that required ongoing maintenance. My blog would need to be easily customizable; coming from a graphic design background means I’ll always be tweaking the site’s appearance and layout.

Over the last month I’ve brought these concepts to fruition and I’m pleased to announce Jekyll. Jekyll is a simple, blog-aware, static site generator. It takes a template directory (representing the raw form of a website), runs it through Textile and Liquid converters, and spits out a complete, static website suitable for serving with Apache or your favorite web server. If you’re reading this on the website (http://tom.preston-werner.com), you’re seeing a Jekyll-generated blog!

To understand how this all works, open up my TPW repo in a new browser window. I’ll be referencing the code there.

Take a look at index.html. This file represents the homepage of the site. At the top of the file is a chunk of YAML that contains metadata about the file. This data tells Jekyll what layout to give the file, what the page’s title should be, etc. In this case, I specify that the “default” template should be used. You can find the layout files in the _layouts directory. If you open default.html you can see that the homepage is constructed by wrapping index.html with this layout.

You’ll also notice Liquid templating code in these files. Liquid is a simple, extensible templating language that makes it easy to embed data in your templates. For my homepage I wanted to have a list of all my blog posts. Jekyll hands me a Hash containing various data about my site. A reverse chronological list of all my blog posts can be found in site.posts. Each post, in turn, contains various fields such as title and date.

Jekyll gets the list of blog posts by parsing the files in the _posts directory. Each post’s filename contains the publishing date and slug (what shows up in the URL) that the final HTML file should have. Open up the file corresponding to this blog post: 2008-11-17-blogging-like-a-hacker.textile. GitHub renders textile files by default, so to better understand the file, click on the raw view to see the original file. Here I’ve specified the post layout. If you look at that file you’ll see an example of a nested layout. Layouts can contain other layouts allowing you a great deal of flexibility in how pages are assembled. In my case I use a nested layout in order to show related posts for each blog entry. The YAML also specifies the post’s title which is then embedded in the post’s body via Liquid.

Posts are handled in a special way by Jekyll. The date you specify in the filename is used to construct the URL in the generated site. This post, for instance, ends up at http://tom.preston-werner.com/2008/11/17/blogging-like-a-hacker.html.
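To make the mapping concrete, here is a small Python sketch of how a post filename could be turned into its URL path. This is not Jekyll's actual code (Jekyll is written in Ruby); the regular expression and the post_url function are assumptions that simply reproduce the example above.

```python
import re

# Assumed filename pattern: YYYY-MM-DD-slug.extension
POST_NAME = re.compile(r"^(\d{4})-(\d{2})-(\d{2})-(.+)\.[^.]+$")


def post_url(filename: str) -> str:
    """Build the URL path of a post from the date and slug in its filename."""
    match = POST_NAME.match(filename)
    if match is None:
        raise ValueError(f"not a post filename: {filename!r}")
    year, month, day, slug = match.groups()
    return f"/{year}/{month}/{day}/{slug}.html"


path = post_url("2008-11-17-blogging-like-a-hacker.textile")
# path == "/2008/11/17/blogging-like-a-hacker.html"
```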

Files that do not reside in directories prefixed with an underscore are mirrored into a corresponding directory structure in the generated site. If a file does not have a YAML preface, it is not run through the Liquid interpreter. Binary files are copied over unmodified.

In order to convert your raw site into the finished version, you simply run:

$ jekyll /path/to/raw/site /path/to/place/generated/site

Jekyll is still a very young project. I’ve only developed the exact functionality that I’ve needed. As time goes on I’d like to see the project mature and support additional features. If you end up using Jekyll for your own blog, drop me a line and let me know what you’d like to see in future versions. Better yet, fork the project over at GitHub and hack in the features yourself!

I’ve been living with Jekyll for just over a month now. I love it. Driving the development of Jekyll based on the needs of my blog has been very rewarding. I can edit my posts in TextMate, giving me automatic and competent spell checking. I have immediate and first class access to the CSS and page templates. Everything is backed up on GitHub. I feel a lightness now when I’m writing a post. The system is simple enough that I can keep the entire conversion process in my head. The distance from my brain to my blog has shrunk, and, in the end, I think that will make me a better author.

http://tom.preston-werner.com/2008/11/17/blogging-like-a-hacker

How to Meet Your Next Cofounder

Nov 3, 2008 Updated Nov 3, 2008

3 Nov 2008 - San Francisco

Over the last few months I’ve seen a number of people looking for cofounders on Hacker News or via their own personal blogs. I think this is, at best, a highly inefficient way to find a cofounder and, at worst, a way to fool yourself into finding the wrong cofounder. In any case, it’s a naive approach to finding the person that will need to stand by your side in the coming storm that we call “running a startup.”

Don’t get me wrong, the internet is an amazing tool for meeting people. The wider the net you cast, the more likely you are to find the perfect match. But the internet has its limitations. I’ve had internet friends that were engaging, witty, and brilliant online, but in person felt awkward and boring. Conversely, I know people that are volatile and inflammatory online, but present an attitude of friendliness and caring in person. This phenomenon makes it difficult to gauge an individual’s personality from online interaction alone.

A far better use of the internet is to find groups of people that share your interests. Track down the local users group for your language or technology of choice. The simple fact that members of these groups take time out of their day to show up means that they’re more motivated and driven than the average person. Even if it’s a bit of a commute to get to the meetings, start showing up regularly. Prepare a few presentations on topics that you’re passionate about. Bonus points if you present on ideas related to your potential startup. Don’t worry about revealing your game-changing secrets; stealth mode is bullshit. Talk to everyone. Steer the conversation toward your interests and if someone there is excited about the same things, it will be clear.

It may take weeks or months, but in a good group you’ll find a handful of people that you really like. If at all possible, go out drinking with these people after the meetups. This is one of the easiest ways to go from “acquaintance” to “friend” and gives you free license to bring up your craziest of ideas without sounding like too much of a nutjob.

Of the people that you like, several may make excellent candidates for cofounders. Do a little research on these individuals. What does their code look like? Have they done much open source? Do they demonstrate an entrepreneurial spirit? Can they stick with a single project for a long time? Have they been loyal to their friends and companies in the past? A good cofounder should be someone with whom you feel privileged to work. And they should feel privileged to work with you. The two of you should be on very solid ground before you begin your startup adventure, because once you do, the impact of every argument is going to feel like it’s been multiplied by a thousand.

This all sounds like a lot of hard work. Maybe you’re wondering if it would be better to just go solo. I did that with Gravatar, and, in retrospect, it’s painfully obvious that I made a lot of stupid mistakes. When it’s just you and your thoughts it becomes too easy to pick the first thing that pops into your head. We’re programmed to think all of our ideas are good, but reality tells a different story. Truly good decisions are forged from the furnace of argument, not plucked like daisies from the pasture of a peaceful mind. A good cofounder tells you when your ideas are half-baked and ensures that your good ideas actually get implemented.

The second biggest danger with going solo is the loss of motivation. Solipsism might make you feel important at first, but the constant lack of feedback and the absence of support during tough times can easily lead to a premature end to your adventure. Cofounders are like workout buddies. Just when you think there’s no possible way you can do another rep, there they are, rooting you on toward an achievement that wouldn’t be possible without them.

Your choice of cofounder will affect everything you do in your startup. They’ll share every defeat with you and celebrate every success. They’ll help you understand your own ideas better by offering a different perspective. They’ll be the single most important decision you make during the tenure of your startup, so choose wisely and with extreme care.

http://tom.preston-werner.com/2008/11/03/how-to-meet-your-next-cofounder

Looking back on Selling Gravatar to Automattic

Oct 27, 2008 Updated Oct 27, 2008

23 Oct 2008 - San Francisco

For an entrepreneur, the line between horrible mistake and runaway success can be so thin that even Kate Moss would be envious. I lived with Gravatar for nearly four years before that line even became thick enough to measure.

As it’s become one of my favorite parables, I’ll save the details of how I came up with the idea for Gravatar for a future post. What’s important to know is that the idea was spawned not from a business perspective, but from a desperate desire to create something new in the world of blogging.

Spin the clock back four years and you’ll find me sitting at my Windows desktop machine in my underwear with a box of Life cereal to my left and a day old Coke to my right. Since I’d been laid off from my job as a Java developer some months earlier, I’d decided to take the entrepreneurial plunge doing what I knew best: web design. Working in a cone of isolation, I’d become accustomed to waking up late, swinging my legs over the right side of the bed, and in one fluid movement sliding over to the ratty chair I stole from my old college dorm room. I’d spend most of the day working on client projects in ColdFusion or PHP. It was hard work and could become a bit tiresome. I needed an outlet, something that didn’t have a suit on the other end of a telephone telling me how blue was the wrong color and things would be so much better if only the photo had a slightly bigger border. Gravatar would become that outlet for me.

I was really big into web standards at the time, having recently read Zeldman’s seminal work, and became a true believer. Eric Meyer, Dan Cederholm, and Jon Hicks became like gods to me. I worked very hard at making relevant and witty comments around the right kinds of blogs. Being a part of that movement became a significant goal for me. My Movable Type weblog rarely went more than two days without a post on design or standards.

Two weeks after I had the idea for Gravatar, the first version was written and deployed. Every request hit the database and dynamically generated a properly sized gravatar via PHP’s GD2 API. Premature optimization and all that, right? The first thing I did after getting the system to a workable state was email all the bloggers I looked up to (and that had no idea who I was). Blog comments at the time were a pretty dreary affair and I guess Cederholm was intrigued enough by my idea that he linked to it in a sidebar micro-post on simplebits.com.

That single mention kicked off a slow but steady trickle of interest in the system. A few blogs here and there installed the plugin and the world started seeing avatars that mysteriously followed them around. At the same time, just as people must have thought Cheez Whiz was a stupid idea when it first came out, some bloggers started railing against Gravatar, calling it frivolous, inefficient, and “an abomination.” This was my first nibble at the smorgasbord of what was to become the “horrible mistake” aspect of Gravatar.

Due to the inherently self-advertising nature of gravatars (the “what the hell is that and how do I get one?” brand of advertising), Gravatar adoption increased at a rapid rate. Having crafted the idea for Gravatar without any semblance of business model or growth projection or build-out strategy, things took a rather dramatic dive away from “runaway success” as my server (yes, singular) buckled under the pressure of tens of requests per second! As it turns out, regenerating a gravatar on every request is not very CPU efficient. Gravatars worldwide suddenly turned into little red Xs. Then, in what has become known as the Twitter Effect, a barrage of emails hit me complaining about how the free service on which they had come to depend was down, and how this would adversely impact my well-being.

I fixed the code. Gravatar came back online with caching. All the while I’d had the bright idea that gravatars would be rated for content, MPAA-style. Because users clearly were not fit to rate their own images, I was manually rating 400 or more avatars each day. If I missed a day, I’d have damn near a thousand waiting for me the next day. In addition to the angry mob, I was very fortunate to have an amazingly supportive group of users that volunteered to help me rate images. I owe them my sanity, and it freed up enough time for me to work on the next iteration of the site.
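The post doesn't say how the caching fix worked, so take this as a minimal sketch of the general idea only: render each (avatar, size) pair once and reuse the result on every later request. `AvatarCache` and `fake_render` are hypothetical names, and the in-memory dictionary stands in for whatever store the real site used.

```python
from typing import Callable, Dict, Tuple


class AvatarCache:
    """Memoize rendered avatars by (user_hash, size) so each is generated once."""

    def __init__(self, render: Callable[[str, int], bytes]) -> None:
        self._render = render  # the expensive resize step (hypothetical)
        self._store: Dict[Tuple[str, int], bytes] = {}
        self.render_calls = 0  # counts how often the expensive path runs

    def get(self, user_hash: str, size: int) -> bytes:
        key = (user_hash, size)
        if key not in self._store:
            # Cache miss: pay the rendering cost once and remember the result.
            self.render_calls += 1
            self._store[key] = self._render(user_hash, size)
        return self._store[key]


def fake_render(user_hash: str, size: int) -> bytes:
    # Stand-in for real image resizing; returns a placeholder payload.
    return f"{user_hash}@{size}x{size}".encode()
```

A thousand requests for the same avatar at the same size then cost one render instead of a thousand, which is the difference between a server that buckles and one that shrugs.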

G2, as I called it, would be written in Rails and use lighttpd plus a convoluted directory structure of symlinks to enable me to pre-render every gravatar (1x1 up to 80x80) and serve only static images. I did this to avoid having to rent or buy the kind of hardware necessary to hook up a properly scaled system. Up until the end I ran Gravatar on a maximum of two rented commodity servers that set me back a mere $300/month, a pittance for the kind of traffic I was serving. I say it was a pittance, but that’s not really true. Donations didn’t even come close to covering that cost.
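The pre-rendering idea behind G2 can be sketched like this, with the caveat that the real symlink layout isn't described anywhere in this post. The `avatars/` prefix, the two-character sharding, and the `.png` extension are all assumptions made for illustration. Only the 1x1 to 80x80 range comes from the text.

```python
from typing import List

MIN_SIZE, MAX_SIZE = 1, 80  # the post says 1x1 up to 80x80


def static_path(user_hash: str, size: int) -> str:
    """Map an avatar request to the static file that would serve it (hypothetical layout)."""
    if not MIN_SIZE <= size <= MAX_SIZE:
        raise ValueError(f"size must be between {MIN_SIZE} and {MAX_SIZE}")
    # Shard by the first two hash characters so no single directory grows too large.
    return f"avatars/{user_hash[:2]}/{user_hash}/{size}.png"


def prerender_targets(user_hash: str) -> List[str]:
    """List every file that would be generated up front for one avatar."""
    return [static_path(user_hash, size) for size in range(MIN_SIZE, MAX_SIZE + 1)]
```

The trade is disk space for CPU: eighty small files per avatar, written once, so that the web server only ever has to hand back a static image.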

At some point early in the development of G2, Toni Schneider, the CEO of Automattic (the company behind WordPress.com and Akismet), contacted me after hearing my interview about the future of Gravatar on the WordPress podcast. This was exciting news! How perfect a fit would it be for Gravatar to be bought by Automattic? I was already planning a trip up to San Francisco to meet with the Powerset guys, so the timing worked out perfectly for me to meet in person with Toni. I ended up having lunch with Toni and Matt Mullenweg at 21st Amendment on 2nd Street. It was a bit intimidating to come to the mecca of tech startups to meet with such huge players in the blogging community. Turned out that both Matt and Toni are great guys and so we had drinks for about two hours, talking about my ideas for Gravatar and how we might be able to work together. Everything seemed great (I was jazzed and they seemed excited), but a few weeks later Toni let me know that the timing was wrong and they couldn’t make a play at that time. He suggested I proceed with G2 and they’d proceed with their own avatar system. I was pretty bummed about the outcome, but I took their advice and kept going.

A few weeks before G2 was finished, the site imploded in a big way. One machine. Hundreds of requests per second. That poor CPU must have thought it was the End Times. Instead of wasting time on getting the existing system back up, I put on my headphones, turned it up to eleven, and got back to work on G2.

I’m not sure if you’ve ever had to work on a project while your users are publicly skewering you on your blog for allowing the service to go down, but it’s as close to depression as I’ve ever come. If I’d had less pride, I would have popped everyone a huge middle finger and let the service die, but instead I waded through the hundreds of comments and deleted the ones with threats, hatred, or my favorite, the words “fuck you” repeated 600 times. It wasn’t fair, I told myself, that I should be sitting here with high blood pressure trying to raise Gravatar from the dead while the unappreciative masses do what they do best on the internet. The only thing that kept me going was never being able to tell which side of the mistake/success boundary I was sitting on. It was hard to think of the situation as anything but a huge failure, but the shitstorm the downtime was causing indicated that people found the service valuable. I want to say that Twitter went through the same thing, but they suffered their downtime with millions of dollars in the bank. The only thing I had was a full time job unrelated to Gravatar and a credit card that reminded me every month of my bad judgment.

Finally G2 was done, deployed, and fully operational. I have no idea how many users I lost due to the several weeks of downtime, but I don’t think it was very many. There seems to be a corollary to the Twitter Effect that I’d call the Forgiveness Effect. It dictates that if a user enjoys a free service and that service is currently up, all past atrocities will be easily and quickly forgiven. With the site running again, things looked to be shifting back towards a success.

Things were not all rainbows and unicorns though. My Rube Goldbergian architecture had a few quirks that needed to be dealt with. The site still had some elusive bugs from the overly-rapid development cycle. And just like new lanes on freeways always fill up immediately, the two new servers I was running started causing expensive bandwidth overages. I had taken a job at Powerset at this point and the combined pressures of these two commitments started to weigh me down. Once again I started feeling like all the effort I put into Gravatar was for nothing. Like I would never benefit from any of it.

In a last-ditch effort to save Gravatar from final doom, I emailed Toni and pitched him again on Gravatar. I figured it was a long shot, but what the hell, couldn’t hurt. Things must have changed in the prior 6 months because Toni was very receptive to the idea. We met again, at 21st Amendment, and hashed out a tentative deal over drinks. I’d never sold anything like this before, so my technique was probably very amateurish. I’m almost certain I could have gotten a better deal out of it, but I had the smell of desperation about me and I really did want to see Gravatar end up in Automattic’s hands.

Four days later, Automattic made their official offer. On September 21st, 2007 we inked the deal and Gravatar became both the first company that I ever sold and the first company that Automattic ever acquired.

I am quite satisfied with the sale to Automattic. Some will say that I should have pursued VC funding. Indeed, I was contacted by several firms but never travelled very far down that road. I always felt like Gravatar was a feature, and I wasn’t comfortable building a company on such a tiny foundation. Reinforcing this decision, no viable business model ever coalesced during the time I was building the site. It was also made clear by Toni that Automattic would maintain Gravatar as a separate brand and continue its evolution (instead of just absorbing it into WordPress). This appealed to my ego. Most companies kill or maim everything they acquire, but here was a chance for Gravatar to carry forward with all of Automattic’s resources behind it (instead of two measly servers). Part of me just wanted to see what Gravatar could become with time, money, and man-power moving it forward.

Things always seem so clear in retrospect. But it was pride and persistence that kept me in the game long enough to have anything to look back on at all. While the line between horrible mistake and runaway success may be difficult to see, you can still find it if you look hard enough.

http://tom.preston-werner.com/2008/10/27/looking-back-on-selling-gravatar-to-automattic

How I Turned Down $300,000 from Microsoft to go Full-Time on GitHub

Oct 18, 2008 Updated Oct 18, 2008

18 Oct 2008 - San Francisco

2008 is a leap year. That means that three hundred and sixty-six days ago, almost to the minute, I was sitting alone in a booth at Zeke’s Sports Bar and Grill on 3rd Street in San Francisco. I wouldn’t normally hang out at a sports bar, let alone a sports bar in SOMA, but back then Thursday was “I Can Has Ruby” night. I guess back then “I can has ___” was also a reasonable moniker to attach to pretty much anything. ICHR was a semi-private meeting of like-minded Ruby hackers that generally and willingly devolved into late night drinking sessions. Normally these nights would fade away like my hangover the next morning, but this night was different. This was the night that GitHub was born.

I think I was sitting at the booth alone because I’d just ordered a fresh Fat Tire and needed a short break from the socializing that was happening over at the long tables in the dimly lit aft portion of the bar. On the fifth or sixth sip, Chris Wanstrath walked in. I have trouble remembering now if I’d even classify Chris and me as “friends” at the time. We knew each other through Ruby meetups and conferences, but only casually. Like a mutual “hey, I think your code is awesome” kind of thing. I’m not sure what made me do it, but I gestured him over to the booth and said “dude, check this out.” About a week earlier I’d started work on a project called Grit that allowed me to access Git repositories in an object-oriented manner via Ruby code. Chris was one of only a handful of Rubyists at the time who were starting to become serious about Git. He sat down and I started showing him what I had. It wasn’t much, but it was enough to see that it had sparked something in Chris. Sensing this, I launched into my half-baked idea for some sort of website that acted as a hub for coders to share their Git repositories. I even had a name: GitHub. I may be paraphrasing, but his response was along the lines of a very emphatic “I’m in. Let’s do it!”

The next night, Friday, October 19, 2007 at 10:24pm Chris made the first commit to the GitHub repository and sealed in digital stone the beginning of our joint venture. There were, so far, no agreements of any kind regarding how things would proceed. Just two guys that decided to hack together on something that sounded cool.

Remember those amazing few minutes in Karate Kid where Daniel is training to become a martial arts expert? Remember the music? Well, you should probably go buy and listen to You’re The Best by Joe Esposito in iTunes because I’m about to hit you with a montage.

For the next three months Chris and I spent ridiculous hours planning and coding GitHub. I kept going with Grit and designed the UI. Chris built out the Rails app. We met in person every Saturday to make design decisions and try to figure out what the hell our pricing plan would look like. I remember one very rainy day we talked for a good two hours about various pricing strategies over some of the best Vietnamese egg rolls in the city. All of this we did while holding other engagements. I, for one, was employed full time at Powerset as a tools developer for the Ranking and Relevance team.

In mid January, after three months of nights and weekends, we launched into private beta mode, sending invites to our friends. In mid February PJ Hyett joined in and made us three-strong. We publicly launched the site on April 10th. TechCrunch was not invited. At this point it was still just three 20-somethings without a single penny of outside investment.

I was still working full time at Powerset on July 1, 2008 when we learned that Powerset had just been acquired by Microsoft for around $100 million. This was interesting timing. With the acquisition, I was going to be faced with a choice sooner than I had anticipated. I could either sign on as a Microsoft employee or quit and go GitHub full time. At 29 years old, I was the oldest of the three GitHubbers, and had accumulated a proportionally larger amount of debt and monthly expenditure. I was used to my six-digit lifestyle. Further confounding the issue was the imminent return of my wife, Theresa, from her PhD fieldwork in Costa Rica. I would soon be transitioning from make-believe bachelor back to married man.

To muddy the waters of decision even more, the Microsoft employment offer was juicy. Salary + $300k over three years juicy. That’s enough money to make anybody think twice about anything. So I was faced with this: a safe job with lots of guaranteed money as a Microsoft man –or– a risky job with unknown amounts of money as an entrepreneur. I knew things with the other GitHub guys would become extremely strained if I stayed on at Powerset much longer. Having saved up some money and become freelancers some time ago, they had both started dedicating full time effort to GitHub. It was do or die time. Either pick GitHub and go for it, or make the safe choice and quit GitHub to make wheelbarrows full of cash at Microsoft.

If you want a recipe for restless sleep, I can give you one. Add one part “what will my wife think” with 3,000 parts Benjamin Franklin; stir in a “beer anytime you damn well please” and top with a chance at financial independence.

I’ve become pretty good at giving my employers the bad news that I’m leaving the company to go do something cooler. I broke the news to my boss at Powerset on the day the employment offer was due. I told him I was quitting to go work full time on GitHub. Like any great boss, he was bummed, but understanding. He didn’t try to tempt me with a bigger bonus or anything. I think deep down he knew I was going to leave. I may have even received a larger incentive to stay than others, on account of my being a flight risk. Those Microsoft managers are crafty, I tell you. They’ve got retention bonuses down to a science. Well, except when you throw an entrepreneur, the singularity of the business world, into the mix. Everything goes wacky when you’ve got one of those around.

In the end, just as Indiana Jones could never turn down the opportunity to search for the Holy Grail, I could no less turn down the chance to work for myself on something I truly love, no matter how safe the alternative might be. When I’m old and dying, I plan to look back on my life and say “wow, that was an adventure,” not “wow, I sure felt safe.”

http://tom.preston-werner.com/2008/10/18/how-i-turned-down-300k

https://feeds.feedburner.com/tom-preston-werner

Posts