In 2008, three developers--Chris Wanstrath, PJ Hyett and Tom Preston-Werner--launched an online project to simplify code sharing on the Web. GitHub took its name from git, the distributed version control system created by Linus Torvalds. By the time it got going, GitHub already had plenty of competition, including SourceForge, Google Code and CodePlex. But the upstart quickly caught up. The site now has almost a million participants storing over two million code repositories, making it the world’s largest code host.
A Ruby programmer by trade, Wanstrath worked at CNet Networks, Gamespot.
GitHub has become known online for its fluid development environment, in which people can post not just large projects, but casual ideas, and see where the community takes them. The company now employs about 35 people, 25 at its new home in the South of Market (
- Where did the spark of the idea for GitHub come from?
In early 2007, I saw a Google Tech Talk video where Torvalds discussed distributed version control from a very high level. That was the “ah-ha!” moment for me. I was aware of git and related solutions, but I had never seen the big picture about how distributed version control was meant to work--with frequent forks and people adding nodes to a repository. I suddenly realized that the heart of online collaboration is all about workflows--which got me excited about git. The GitHub idea came from there.
There were a few git hosting sites at the time, notably repo.or.cz. But they were all in the SourceForge vein where there’s a single namespace for all projects. If you want to fork a project, it’s a very traditional fork, where you spawn a different project with its own name. Our idea for GitHub was different. We just wanted a place where we could post code and send patches around using the mechanisms that git has built in. We wanted a place where we could publish git repositories, and pull from other git repositories and have it all automated and working really simply--so we didn’t have to do any sysadmining or mess with our VPS [virtual private server].
- You’ve made a point of redefining “forking.” Rather than a big change in direction, you say that the same project can now have hundreds of forks, some of which may get merged back into the main project.
With the old model, you click on a fork button, and you’re asked “what do you want to name your new project?” We said: let’s not make this such a big deal. When you create a fork, GitHub will just create a copy of the repository underneath your own namespace. We do want people to know whether this is a good fork or a bad fork, and whether its intent is to diverge or eventually get merged back into the upstream. But we’ve really embraced this more flexible concept of a fork: we played the idea up on our early t-shirts, which said “fork you.” That phrase at first looks offensive, but turns out to be a good thing, which is how we view forking in general. The fork predates distributed version control, yet is still core to it. By applying the term to a DVCS, we give people a better understanding of it.
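[Editor's sketch] The namespace idea described in this answer can be illustrated with a small toy model. This is only a sketch under stated assumptions, not GitHub's implementation: the names `Repo`, `NamespacedHost`, `fork` and `merge_upstream` are hypothetical, commits are modeled as plain strings, and merging is reduced to appending the commits the parent lacks (real git merges work on a commit graph).

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Repo:
    """Hypothetical stand-in for a hosted repository."""
    owner: str
    name: str
    commits: List[str] = field(default_factory=list)
    # (owner, name) of the repository this one was forked from, if any.
    upstream: Optional[Tuple[str, str]] = None


class NamespacedHost:
    """Toy host that keys repositories by (owner, name), not by name alone."""

    def __init__(self) -> None:
        self.repos: Dict[Tuple[str, str], Repo] = {}

    def create(self, owner: str, name: str,
               commits: Optional[List[str]] = None) -> Repo:
        key = (owner, name)
        if key in self.repos:
            raise ValueError(f"{owner}/{name} already exists")
        repo = Repo(owner, name, list(commits or []))
        self.repos[key] = repo
        return repo

    def fork(self, source_owner: str, name: str, new_owner: str) -> Repo:
        # The fork keeps the project name; only the owner namespace changes.
        source = self.repos[(source_owner, name)]
        forked = self.create(new_owner, name, source.commits)
        forked.upstream = (source.owner, source.name)
        return forked

    def merge_upstream(self, forked: Repo) -> None:
        # Simplified merge: append commits the parent does not have yet.
        if forked.upstream is None:
            raise ValueError("repository is not a fork")
        parent = self.repos[forked.upstream]
        for commit in forked.commits:
            if commit not in parent.commits:
                parent.commits.append(commit)


host = NamespacedHost()
host.create("alice", "widget", ["initial commit"])
fork = host.fork("alice", "widget", "bob")   # no new project name needed
fork.commits.append("fix typo")
host.merge_upstream(fork)                    # the change flows back upstream
```

Because the key is the pair (owner, name), "alice/widget" and "bob/widget" coexist, so a fork never has to be renamed; in a single-namespace design the second "widget" would collide and the fork would need its own project name.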
- Given your competition, I think a lot of people wouldn’t have bet on your success. What went right?
I think GitHub was the first hosting site to focus more on sharing code than publishing projects. At the time we came around, SourceForge was great for publishing projects--that’s what it was about. You would apply, and after a waiting period, your project would be published and you could share it with the world. But with GitHub, it just takes a few seconds to push up something. It became less about questions like: what am I going to name this project? What are the rules? What license am I going to pick? Instead, I just push this piece of code up there and see if anyone else is interested in it. I put a link to it out there and see if I get any contributions.
The underlying principle for me is that not every project you work on is a project you necessarily commit your life to. Sometimes, it’s just about sharing a piece of code with people. So rather than forcing you to go through a formal approval process for your project, we allow you to just publish it, just as you might publish video on YouTube, and other people can immediately see your code and do something with it.
That’s the gateway drug: you get addicted to sharing. From there, you get involved in open source and working on other people’s projects. The GitHub mindset encourages people to jump into someone else’s project and fix a bug. The key is to understand all the tools at your disposal and know how the contribution process works. It’s also useful to spend time learning how a particular project wants its patches accepted. For a lot of smaller projects you can just jump on, you can fork it, you can send a pull request that lets others know about your changes--and suddenly you find you’ve gotten your changes merged. In other words, we’ve made it easier for people to contribute by removing many of the barriers.
- This sounds like what’s going on in the Rails community.
We started as a Rails development shop, and we knew a lot of people in that community. The first major projects on GitHub came from there. But I now like to think of the site as a collection of communities--Rails, JavaScript, Python, Ruby, C#. There are places on GitHub where they meet--like the Changelog blog which chronicles all open source projects on the site. But overall, the communities each still have their own quirks and personalities, and they’re all represented fairly well on GitHub. We try to make GitHub work for all of them.
- Do you see any trends happening within those communities?
I think we are starting to see more blurring of the lines between them. For example, Fabric, a Python tool, was inspired by Capistrano, a Rails tool. We’re seeing more of that: where one language community has an idea, then another language ports it over and adds their own style to it. Another example is the Mustache template engine, which is something I wrote. There are now ports of Mustache to maybe a dozen different programming languages--Python, C#, and Erlang among them. Another good example is Sinatra, the Ruby micro framework, which is now one of the most imitated projects. It has inspired frameworks in several languages, including a few in Ruby itself. For a thin little DSL [domain-specific language] for writing RESTful web services, Sinatra has really caught on.
So I think people are less isolated in their own programming silos and more willing to look at other languages, even ones they don’t necessarily like, for ideas--and bring those ideas back into their communities.
- GitHub is an online community. But you’ve put a lot of resources into encouraging members to “meet up” in person.
I think face-to-face gatherings are huge, both as a way for us to meet our developers in person and for people from different communities to mix: to see what problems they have in common and what solutions they’ve found. Alcohol is a good social lubricant, so we’ve been having GitHub Drinkups, where we buy you a drink at a conference. We’ve done almost a hundred of these events by now. We try to do them once a month in San Francisco, which is our home town. We held a giant Drinkup in an Irish pub in Paris during a PHP Symfony conference. They’re important to us. After all, our tagline is “social coding.”
- What’s GitHub’s presence in Japan and beyond North America?
Most of our traffic is from overseas, including Germany, the United Kingdom, Paris, Brazil, Russia, and of course Japan. We definitely have Japanese users, and we try to attend Ruby-no-Kai every year. So it’s an international community, and one of our big challenges is to make sure that the site is fast and responsive all over the world. That said, I think these days the national boundaries matter a little bit less. It’s more your programming ideology and the communities you hang out with online.
In fact, I think the biggest signifier of whether someone is going to try out GitHub and other new social coding ideas is the language they associate with. Developers using newer languages like Python, JavaScript, and Ruby are more apt to try out a new DVCS because they’re already in a mindset of trying out new things. For example, I know a lot of Ruby developers who weren’t super happy with Subversion. They were using it, but they were open to trying something new. Whereas a Java programmer is more likely to have found solutions that work, and so is less likely to invest the time looking elsewhere.
- What advice would you give to Software Design readers interested in entering the GitHub community?
It of course helps to speak English--that’s true in general for programming because of the wealth of documentation and the discussion lists available in that language. But even if you don’t, you can still find a really giant project like Symfony PHP, which has documentation translated into many languages. Beyond that, my best advice is to search GitHub, find a project that you want to work on, figure out how the tools work, find some people to collaborate with and dive right in. Make a code change--whether a bug fix or a feature idea. The magic moment for many people is when you have a discussion about a feature with a complete stranger and are happy with the outcome. You meet a like-minded person, they totally understand what you’re saying, they tell you to tweak your patch a bit and they’ll accept it, and then they do accept it. It’s a magical feeling to have worked on something with a complete stranger and made something more awesome.