Entrepreneur Greg Arnette has launched two successful tech companies. Now he is working on a third―only this time, he and his development team are building it “in the cloud.” Arnette is founder and chief technology officer of Sonian, a Boston-based online service that archives email and other electronic communications between employees for fast retrieval. Sonian is paying Amazon Web Services to provide and manage its IT infrastructure on a usage basis, much as it is paying the electric company to keep the lights on.
Amazon has offered Web services since 2002, well before the term “cloud computing” was in general use, and over the last two years, has played up its appeal to entrepreneurs like Arnette by sponsoring a contest: the AWS Start-up Challenge, whose winners were announced in late 2008. “The range represented by these startups is in sharp contrast to last year, when the challenge was dominated by early adopters of AWS,” wrote Brigid Gaffikin on the GigaOM blog. “And it’s a good sign for Amazon, offering proof that its vision of cloud computing has started to spread beyond the cozy and sometimes narcissistic confines of Silicon Valley.”
I spoke with three of the finalists and the winner of the 2008 contest to see what other Web developers could learn. Not surprisingly, these companies are all enthusiastic about Amazon Web Services, although not without some reservations. As a group, these entrepreneurs demonstrate how you can use cloud computing to build and deploy an entirely new online business.
The term “cloud computing,” often used interchangeably with SaaS (software as a service), has become blurry as its definitions have multiplied. What we are talking about here might be described as the “entrepreneurial cloud,” an IT infrastructure-as-a-service that is priced for companies that are just getting off the ground and looking for a pay-as-you-go model. Arnette says that for entrepreneurs, cloud computing can be a “highly reliable, super-scalable compute infrastructure for an application to tap into.”
Sonian was Greg Arnette’s second company to offer a SaaS component. But the first time, “we did it by having to go out and raise millions of dollars of investor money to build two co-located data centers,” Arnette recalls. “That’s the old model. With Amazon Web Services, I don’t have to spend millions of dollars creating the infrastructure that still wouldn’t be as reliable. Amazon makes our business possible.”
The number of cloud-based IT infrastructures is growing. The classic model is Salesforce.
Google began its cloud computing venture with a set of online services that provides such basic functions as word processing, spreadsheet and calendar: the kinds of things you’d expect in a desktop application like Microsoft Office. Google has also opened up its platform, the Google App Engine, but on a more experimental basis. In typical Google style, the Google App Engine has become a long-term beta project. So far, accounts are free, but with a 500MB cap on storage and enough CPU bandwidth for about 5 million page views a month. Microsoft is also reported to be working on a cloud computing entry, though at this writing, the details are scarce. Other SaaS platforms are coming from Web hosting companies.
At Sonian, Arnette’s eight-person development team got started with prototype development on AWS in late 2006. The company encountered some investor reluctance―at the time, Amazon was much better known as an e-commerce company. But the bet has paid off. Sonian’s development and beta testing proceeded through 2007. The team encountered no major problems, and launched the first version of the service in March 2008. Ruby on Rails is used for the customer-facing interface, and a combination of Java, Ruby and Erlang for the back-end. The team developed its own security system to augment AWS’s using the U.
“Cloud computing gives you access to an infrastructure that was previously available only to the very largest companies with huge amounts of capital,” Arnette says. “You can build solutions that can scale up to serve a large audience without having to worry about buying the infrastructure to support it. That used to be the barrier to growth. If you wanted to build a service that could support millions of users on a daily basis, you could never build the infrastructure out enough. That’s what’s different now: a small team working in their garage has the ability to do that.”
Scalability was of special concern to Sonian given the volume of email and other communications generated by even middle-size companies. “We had to design a system that could scale up incrementally almost every day, accommodating more customers and customer data. It moves up into the cloud, and then we manage, secure, and encrypt it and make it immediately searchable. So for us, the cloud is the perfect technical solution,” Arnette says. The cloud is also a good economic model for a startup business because a young company can pay as it goes―so that expenses keep pace with revenues, rather than requiring a lot of up-front costs. “You pay for the service after the fact, and you can get down to very granular units of number of hours of CPU that were used and the amount of gigabytes of storage and the bandwidth you consumed.”
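The metered billing Arnette describes comes down to simple arithmetic, sketched below in Python. This is an illustration only: the `Usage` record, the `monthly_bill` function, and all three unit rates are hypothetical placeholders, not Amazon's actual prices or billing interface.

```python
from dataclasses import dataclass

# Hypothetical unit rates in dollars, chosen for illustration only.
# They are NOT Amazon's published prices.
RATE_PER_CPU_HOUR = 0.10
RATE_PER_GB_STORED = 0.15
RATE_PER_GB_TRANSFERRED = 0.17


@dataclass
class Usage:
    """One month of metered consumption (hypothetical record layout)."""
    cpu_hours: float
    gb_stored: float
    gb_transferred: float


def monthly_bill(usage: Usage) -> float:
    """Return the charge for one month of usage, rounded to cents."""
    total = (
        usage.cpu_hours * RATE_PER_CPU_HOUR
        + usage.gb_stored * RATE_PER_GB_STORED
        + usage.gb_transferred * RATE_PER_GB_TRANSFERRED
    )
    return round(total, 2)
```

Because every term is a quantity multiplied by a rate, a month with no usage costs nothing and the bill grows in step with the customer data being handled, which is the property that lets expenses track revenues.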
- Zephyr: Selling customers on SaaS
Another AWS Start-up Challenge finalist is Zephyr, a Sunnyvale, California company whose software test management system was released in March 2008 after beta testing the previous spring. Zephyr’s first product release was on-premise software, but Samir Shah, Zephyr’s founder and CEO, said that the development team architected the application from the beginning with cloud computing in mind.
“When we first built the product, people weren’t ready for SaaS. There were a few takers, but not many,” he recalls. The concerns were largely about security. The test cases and documents tracked by Zephyr often embody a company’s entire competitive edge, and customers had concerns about storing them on third-party hardware. “But security is easily solved, and we could point out how successfully salesforce.com has secured sensitive data,” Shah says.
It also helped that Amazon itself seemed to get more serious about its offerings. Last October, after two years in beta, Amazon Elastic Compute Cloud, or EC2, graduated to “general availability,” or GA, status with a service level agreement committing the company to offer 99.95 percent uptime.
“But there’s another factor that also helped us: the current state of the economy,” Shah says. “SaaS brings in cost savings that could easily be understood by people who are already spending a lot of money on licenses, hardware, and IT support. A SaaS subscription-based model eliminates a lot of these costs, and gets you up and running much faster.” Shah says that’s true across the board for his software developer customers, whether they are a start-up or a Fortune 500 company. SaaS also better accommodates the global nature of software development, in which programmers on the same project may be on different continents.
Zephyr’s transition to the cloud took about four weeks, with the company “pulling the plug” on its on-premise version. Shah advises that “if you are early in the development stage, still working out the architecture, it’s very important to consider how it could be put in the cloud. You should modularize the elements of the stack as much as possible: your databases, application servers, and security components. The way these virtual environments work, your architecture should be flexible enough to be split up and put on separate virtual machines.” You should also be able to track transactions closely in order to bill your own customers accurately and completely. “You should be able to monetize at the transaction level because that’s how you yourself are being billed by your platform provider. That’s significantly different than with on-premise software, where they sell you 20 perpetual licenses and tell you to do what you want with them.”
Shah says that building the code in a SaaS environment is becoming easier. “Most technologies are making the leap into the cloud. Open source technologies are doing it faster than their commercial counterparts, with Linux, PHP, and MySQL―the LAMP stack―being readily available in these virtual environments.” Open source tools are a good fit for Amazon AWS, where companies want to keep the usage fee low. Shah compares AWS with a visit to a library, where you check out the resources you need from a selection of pre-packaged templates, called Amazon Machine Images. “Let’s say you decide to build a photo-sharing service. Amazon Web Services offers quite a few combinations of operating systems, including various flavors of Linux, different flavors of LAMP stacks.” You select from a preconfigured Image or customize your own―and that becomes your development environment. “Or perhaps you’ve already built your photo-sharing service in-house on your favorite combination of technologies and now you want to put it up on Amazon’s cloud. Chances are you’ll find an Image that is already configured closely to your on-premise environment. You check out the stack, transfer your code, and you’re ready to go in the cloud.”
Shah sees some of the other cloud computing services as more locked into a proprietary approach. With AWS, “we effectively own the cloud and the applications on top of that, and we control how people can integrate with us,” Shah says. “Amazon gives you any kind of operating system and application, along with storage and network services.” This is true utility computing, he says. Like buying kilowatt hours from the electric company, you buy processor hours, terabytes of storage, and network bandwidth, and build your application on top of them.
- Pixily: Lowering up-front costs
Like Sonian, Pixily is also in the archiving business with a service that stores scanned hardcopy documents―with the added computational burden of optical character recognition software. “For us, a SaaS model was front and center from day one,” says Anand Rajaram, co-founder and chief product officer. “We wanted to bring to consumers and small businesses technology that has so far been available only to large businesses, without having to invest a few million dollars in a solution.” That was all the more important because Pixily began as a self-funded company and so needed to keep operating costs to a bare minimum. And as with other finalists, none of the founders had direct SaaS experience. “We were all used to the grunt work that goes with running a data center.”
“Cloud computing is democratizing that for start-ups like ours―it is doing for hardware what open source did to software. Before open source, you had to invest millions of dollars in licenses before you could start writing one line of code, and that was a big barrier for entry. Open source completely changed all that: with a LAMP stack, anybody could offer a service. Cloud computing runs along those similar lines: you can just buy storage by the gigabyte and CPU by the hour and keep growing, without committing all your capital early on.” Rajaram said that at the time Pixily was looking, Amazon AWS was the obvious choice―and an attractive one because Amazon’s cloud came from its own e-commerce experience. “It wasn’t like they dreamed of a solution and were looking for a problem to solve. The solution had organically grown out of the problem that they encountered running their own business. They had learned all those lessons and there was a huge community around it. When we first signed on, there wasn’t official support, but you could see that the evangelists were on the forums actively talking about it―and that gave us a good sense.
“That said, we were also prudent in defining our own architecture to make sure we are not tied to Amazon, because maybe five years later there might be a better choice, so we wanted to make sure we could potentially switch clouds later on.” To help ensure the company is not locked in, the Pixily development team added a layer of abstraction at the interface level to AWS. In an era when SaaS platforms are only now getting established, the move seems prudent. Pixily now uses the Amazon EC2 computing infrastructure for pretty much everything except the production-quality page scanners, which are attached to machines running Windows Server 2003. The setup can scan up to 100,000 pages a day. The data is then securely uploaded to the Amazon S3 data repository, thereby entering the Amazon cloud. “If we had to buy the kind of infrastructure that we run now from a more traditional non-cloud provider, I think it would be about $100,000 to $200,000 in infrastructure investment,” Rajaram says.
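One way to picture the kind of abstraction layer Rajaram describes is an interface that the application codes against, with the provider-specific part isolated behind it. The Python sketch below shows the general pattern under stated assumptions and is not Pixily's actual code: `DocumentStore`, `InMemoryStore`, and `archive_scan` are hypothetical names, and the in-memory class merely stands in for one that would wrap Amazon S3 or another provider's storage.

```python
from abc import ABC, abstractmethod


class DocumentStore(ABC):
    """Hypothetical storage interface that application code depends on."""

    @abstractmethod
    def put(self, key: str, data: bytes) -> None:
        """Store data under the given key."""

    @abstractmethod
    def get(self, key: str) -> bytes:
        """Return the data stored under the given key."""


class InMemoryStore(DocumentStore):
    """Stand-in backend; a real one would wrap S3 or another cloud's storage."""

    def __init__(self) -> None:
        self._blobs = {}

    def put(self, key: str, data: bytes) -> None:
        self._blobs[key] = data

    def get(self, key: str) -> bytes:
        return self._blobs[key]


def archive_scan(store: DocumentStore, doc_id: str, page: bytes) -> str:
    """Application logic: sees only the interface, never the provider."""
    key = f"scans/{doc_id}"
    store.put(key, page)
    return key
```

Under this arrangement, switching clouds would mean writing one new `DocumentStore` subclass rather than touching every caller of `archive_scan`.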
“Our vision is to support what we call ‘ubiquitous capture,’ the ability to capture however you want and whenever you want,” Rajaram says. “You can send envelopes of documents in the mail. You can upload information or email information to us. You can click a picture with your iPhone and send it to us. In each case, the services are presented to the user in the same way, and that requires a series of steps within our infrastructure. The choreography happens through SQS, Amazon’s queuing service.” The user interface was built with Ruby on Rails. To help deal with any customer security concerns, Pixily has taken advantage of Amazon’s trusted brand status. “Most of our target customers, we think, would have made at least one purchase with Amazon, and their credit card information is stored with Amazon.” Pixily’s home page carries both “Powered by Amazon Services” and “TrustE Certified Privacy” logos.
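The queue-based choreography Rajaram mentions can be illustrated with a small sketch. It uses Python's standard `queue.Queue` as a local stand-in for a hosted queue such as SQS. The step functions (`ocr`, `index`) and the `run_pipeline` helper are hypothetical: the article does not describe Pixily's actual processing steps beyond optical character recognition.

```python
import queue


# Hypothetical processing steps; each takes a job dict and returns it updated.
def ocr(job):
    job["text"] = f"text extracted from {job['source']}"
    return job


def index(job):
    job["indexed"] = True
    return job


STEPS = [ocr, index]


def run_pipeline(submissions):
    """Feed every submission, whatever its source, through the same steps."""
    pending = queue.Queue()  # local stand-in for a hosted message queue
    for source in submissions:
        pending.put({"source": source, "step": 0})

    done = []
    while not pending.empty():
        job = pending.get()
        job = STEPS[job["step"]](job)  # run the step this job is waiting on
        job["step"] += 1
        if job["step"] < len(STEPS):
            pending.put(job)  # hand the job to the next step via the queue
        else:
            done.append(job)
    return done
```

Calling `run_pipeline(["mail", "email", "iphone"])` pushes all three kinds of submission through identical steps, which is the sense in which the service looks the same to the user regardless of how a document arrived.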
Rajaram cautions against one potential pitfall of the entrepreneurial cloud: the temptation to compensate for poor software design by scaling the deployment environment, a practice comparable to raising the water pressure rather than fixing the hole in the pipe. “If your application has memory leaks or is over-using the CPU, a traditional data center will force you to fix the problem,” he says. In a cloud computing environment, by contrast, you can simply buy more resources, purchasing a short-term fix at a long-term expense. On the other hand, a cloud computing environment lets you test hardware scalability before actual customers sign on. “Two weeks before a big production release, ramp up the resources, then test it to see if you have achieved what you wanted when your application scales.” Then bring it back down again until your customer roster is large enough to justify the expense. For a startup, pay as you go means paying only for what you need, until you need more.
- And the winner: Yieldex
Good things happened to the winner of the second annual AWS Start-up Challenge. Yieldex, which built a cloud-based management and forecasting service for online advertisers, won $100,000 in cash and AWS service credits. Soon after, the company announced $8.5 million in Series B funding led by Madrona Venture Group, with a new partner: Amazon.com itself. Around the same time, Yieldex also announced its first customer: Martha Stewart Living Omnimedia.
“As a startup you don’t want to be spending lots of money on capital in advance of actually having a product and customer,” says Yieldex president Larry Allen. “So the Amazon pay-as-you-go model made a lot of sense for us. We don’t have to have dedicated boxes that we’re paying for, whether we are using them or not. We used AWS out of the gate as a cost effective way to host and manage the business.”
Most of the initial development work was done by a 10-person development team based in Boulder, Colorado, first on an underlying technology, called Dynamic IQ, which processes the data for use by other applications. Yieldex’s first app, Business IQ, was built on top of that, with more in the works. Development tools include Google’s MapReduce for managing and distilling data, with Apache Hadoop, Adobe Flash and the Ajax toolset on the front end. Development was fast and overhead costs were minimal. “Amazon has gone above and beyond in making AWS very developer-friendly.”
The winner of the AWS Start-up Challenge is now bringing some of its production resources in-house. “From a startup perspective, we were very satisfied with leveraging just the Amazon environment,” Allen says. “But now that we’re rolling out some large enterprise customers, we’re using Amazon largely for the processing and data preparation, then hosting our own data center servers for the front end application.” He says that some components of the front-end system are very resource intensive, “so having a dedicated environment at the front end makes a lot of sense for us.”
This hybrid approach points to the possible limits of cloud computing, at least for some companies, as well as the cloud’s flexibility in accommodating both hosted and in-house resources. “If you are selling an enterprise piece of technology, you want to be able to provide a service level agreement,” Allen says. “You don’t get an SLA from Amazon that is robust enough to pass along to a customer, where you do get that in a data center. We can’t have our user interface go down. I’m not suggesting that Amazon isn’t robust and reliable. I’m just saying that we need a piece of paper that says we’re going to deliver ‘four nines’ [99.99% availability]. We get that piece of paper from a data center; you don’t get that from Amazon, because they are very much still focused on consumer applications. That’s fine if you are using it for off-line experiences, but if you are trying to drive a larger business, they still have a ways to go before they are truly supporting that.”
Like many other observers, Allen thinks that the cloud computing picture will continue to shift, and the lead is still up for grabs. “Most hosting companies are beginning to adopt a cloud architecture, and I would certainly argue that hosting is becoming the cloud, because it’s a more logical way of managing the infrastructure for a business. I think a lot of large infrastructure companies are going to adopt the strategy. Amazon has got a big advantage because they have built a suite of Web services on top of that cloud infrastructure that makes it very appealing, and it’s going to be a while before anybody else can do that.”
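For a sense of scale, the two availability figures quoted in this article, the 99.95 percent in the EC2 service level agreement and the “four nines” Allen wants on paper, can be converted into permitted downtime. The short Python sketch below does the arithmetic. The function name is illustrative, and the calculation assumes a 365-day year with downtime measured over the full year, which may not match how any particular SLA defines its measurement period.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours; leap years ignored


def max_downtime_hours(availability_percent: float) -> float:
    """Hours per year a service may be down while still meeting the target."""
    return HOURS_PER_YEAR * (1 - availability_percent / 100)


for target in (99.95, 99.99):
    print(f"{target}% availability allows about "
          f"{max_downtime_hours(target):.2f} hours of downtime per year")
```

By this measure, 99.95 percent leaves room for roughly 4.4 hours of downtime a year, while four nines allows about 53 minutes, a fivefold difference hidden in the second decimal place.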