Tuesday, November 3, 2009

Beyond Development's Purview


Toyota has been dominant in applying lean (or agile) manufacturing techniques to improve product quality, lower costs, and increase responsiveness to customer needs. Other manufacturers have applied the same principles and gotten similar results. More recently, development groups have applied lean software techniques (Scrum, Kanban, XP) to create high-quality, fast-time-to-market products focused on their customers' needs.

Now, VCs and management teams are starting to use lean techniques. I was fortunate to be invited to "Overturning Conventional Wisdom To Disrupt Industries and Markets – A Discussion On Disruptive Business Models and Innovative Approaches In Professional Sports, Software and Venture Capital" held on Thursday, October 29, 2009. Ben Nye, Managing Director of Bain Capital, and Scott Maxwell, Founder and Senior Managing Director of OpenView Venture Partners, shared their experiences using Scrum to run their respective VC firms and described its use at their partner companies.

Scott met with Jeff Sutherland, who started the first Scrum at Easel Corporation in 1993. Scott told Jeff that one of their partner companies was getting a 30% productivity improvement using Scrum. Jeff's response was that they must be doing something wrong if they were only getting 30%. This caught Scott's attention, and Jeff is now an advisor to Scott's firm. Scott likes Scrum because it creates focus. Scott's VC firm uses a basic form of Scrum in which his staff members are product owners for various Scrum teams, for example, Sales, Marketing, Finance, and Investments. Each team holds daily Scrum meetings and does weekly sprints. Scott starts with yearly company goals. The product owners break them down into quarterly goals, similar to product release planning. Each Scrum team does weekly sprints to plan and do work toward the quarterly goals.

Ben added that he sees benefits with agile practices: improvements to time-to-market, more at-bats, and organizational learning. Both said that software companies often bring complex products to market, use expensive direct sales and services to channel their products, and do not focus on customers' pain points. Their guidance is to stay focused on customers' pain points and straightforward products by using Scrum and other lean software development techniques.

Other VC advice from Ben and Scott:

Analysis at CalPERS showed that private equity investments made following a clear strategy provide the best ROI. Scott's strategy is to target a specific type of company, identify candidates, select companies with good, highly qualified people, provide them access to experts, and repeat the process. His firm has marketing, sales, process, and finance experts on staff to fill partner companies' gaps.

On customers: pick a spot on the customer continuum (consumer, SOHO, SMB, ..., enterprise) and work backwards to determine the go-to-market plan. Don't be slow to identify your customers. Size the market opportunity with top-down, bottom-up, and growth-driver analyses. Get to know the customers; channels like Google AdWords get customers without letting the company know them. Keep a sharp focus on customers' pain points and products designed for the target customer. Focus the go-to-market strategy on the buyer. Use web analytics to know what your customers are doing. Being early to market is the same as being wrong.

On people: challenge the status quo. Risk taking means mistakes will be made. Get minds around innovation. Build great relationships with people. The 'power of question' and the 'power of suggestion' are useful when advising companies. As an investor, if you fire someone, you take on their role, so act carefully.

On taking investment: choose partners well. Partnering with VCs is like a marriage; you live with them for a long time. Spend time together. Chemistry is important. Check references as much as possible. Take as little capital as possible. Capital-constrained companies quickly figure out who the customer is and how to channel to the customer. A high investment offer is likely from a poor VC. Customer revenue is the 'best' kind of capital investment.

Friday, October 30, 2009

Agility

I have spent the past month digging deeper into lean (or agile) software development principles and practices. I have led many successful software projects and programs that have yielded high-quality, well-received software. For various reasons, during the past four years I have led software projects using agile development. Originally, I was trained in the classic waterfall model of software development engineering and management. I have also worked with many successful senior architects who are deeply entrenched in waterfall practices by their successes.

Distinctions help me learn by having opposing views in front of me and asking why people favor one idea over the other. For this investigation, I mentally placed a miniature version of a waterfall architect on one shoulder and a miniature version of a lean software development proponent on the other, much as one might imagine conscience as an angel and a devil on each shoulder. I won't assign good or evil to either, because such assignments make no sense and because I have witnessed techno-religious wars that are never resolved, for example, the still-ongoing emacs versus vi war. I would rather master both methods and apply the best one for the job at hand.

In my mind, the waterfall architect tells me that successful projects need long, deep research cycles; in-depth, insightful discussions with multiple customers; careful, precise crafting of specifications; thorough hiring of skilled and capable engineers; well-engineered components; well-planned integration cycles; construction of comprehensive testing and release engineering functions; beta testing for customer feedback and verification; and production release of the final product. I appreciate the reasoning behind this statement because of our many successes with waterfall. His challenge is: fundamentally, why does a lean, iterative, agile development process produce results as good as or better than what we know works?

Metaphorically, I turn to the lean software proponent and raise my eyebrows, looking for a response. She answers my silent inquiry with "we don't know what we don't know." The architect smiles. I know he's thinking that's why we spend time carefully thinking through every step before proceeding to the next one. She continues: therefore, we need to factor that reality into a different approach by eliminating waste, amplifying learning, deciding as late as possible, delivering as fast as possible, empowering the decision makers, and building in integrity. That sounds intriguing to me. I respond, "Please continue."

Waste in software development takes many forms, such as waiting for approvals and resources to become available, working on features that will never be used by customers, and developer context switching. Learning can be amplified by multiple iterations (think practice makes perfect), feedback, and transparency (anyone can inspect project status at any time). Deciding as late as possible sounds counter to making decisions, but it isn't. Decide on those items that must be decided first. For those items that you don't need to decide initially, allow those decisions to go unmade until you have learned more. Work on items in a depth-first approach to create more information for better decisions as you go. Deliver as fast as possible to create working versions that customers and sponsors can verify are (or aren't) meeting the need. Empower the rank-and-file decision makers to make the hundreds of daily decisions consistent with the current plan and customer feedback. Allow them to organize the work to deliver potentially shippable units of work with each iteration. Finally, build both conceptual and perceived integrity into the project so that incremental changes and refactoring can be done while maintaining a consistent customer experience. This can be achieved by frequent inspections and test-driven design, with automated test development done either prior to or concurrent with implementation. While these techniques are promising, they are relatively new, not yet applied in a consistent, repeatable fashion, and are usually engineering-centric instead of being embraced by the whole organization.
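
To make the test-driven design point concrete, here is a minimal sketch (the function and its tests are invented for illustration, not drawn from any particular project): the test is written first to describe the intended behavior, and the implementation exists only to make it pass.

    import unittest

    # Test-first sketch: the tests below describe the intended behavior; the
    # implementation is just enough to make them pass.

    def normalize_email(address):
        """Trim whitespace and lower-case an email address."""
        return address.strip().lower()

    class NormalizeEmailTest(unittest.TestCase):
        def test_strips_whitespace_and_lowercases(self):
            self.assertEqual(normalize_email("  Bob@Example.COM "), "bob@example.com")

        def test_leaves_clean_address_unchanged(self):
            self.assertEqual(normalize_email("alice@example.com"), "alice@example.com")

    if __name__ == "__main__":
        unittest.main()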

I turn to the architect and ask if these ideas seem incongruent with his perspective. His response is that in theory there are big differences; however, in practice, many of these techniques are used in pragmatic, successful applications of waterfall. For example, waterfall development projects are often slowed down by waiting for permission and approval, so he often takes risks by proceeding on work before approval or permission is granted. His motto is, "It's better to ask for forgiveness than to ask for permission." He is often frustrated by the lack of clarity of early customer requirements, even those formally specified in contracts. He's equally frustrated by changes in requirements while the project is under development, and he initiates a formal change request approval process. He keeps track of all of the changes to justify the additional time needed to redesign or re-implement previously completed work. He plans on multiple implementation and integration cycles to resolve dependencies among components and enable testers' early access to functionality. He finds ways to unlink dependencies so groups can work independently. He is frustrated by the constant requests for project status and the lack of insight into what's being done by the various groups. He is particularly frustrated by the 'schedule chicken' that occurs between groups. Critical bugs found during testing are problematic, especially those that uncover issues with the specifications. In response, he asks for design and code reviews to inspect work before it is integrated. He meets with testing organizations to develop test plans in advance so he knows what they will be doing and ensures the implementations can hold up during testing. He encourages testing to make automated test environments available early so engineers can run them locally prior to putting back changes.

I, too, have used these techniques to deliver waterfall projects on time. On the other hand (or, er, shoulder), these techniques are inherent in lean software development. My next question to the architect is: so why don't we just use lean software development techniques directly instead of remaining entrenched in waterfall and applying techniques to address its shortcomings?

He nods and smiles.

At least, he didn't glare at me like he does when someone tells him that emacs is better than his beloved vi.

Monday, September 21, 2009

Virtue

Isn't everyone looking for a virtuous cycle? Virtuous cycles seem to be close relatives of the mythical perpetual motion machine. Once a virtuous cycle starts, it creates new value every day. For example, the virtuous cycle of Moore's Law, single-chip CPUs, a 'standard' hardware architecture, a common OS, memory improvements, Gilder's Law, and shrink-wrap application software created the PC industry. Their continuous improvements collectively created the momentum that gave rise to the high-powered, low-priced laptops that we use today.

Will there be a virtuous cycle for cloud computing? It is too early to know for sure. My impression is that a virtuous cycle may be forming out of the following, seemingly-unrelated parts: application usage, efficient development, geographically dispersed teams, horizontally scaled systems, Internet bandwidth, operations commoditization, social connections, and global reach. Let's take a look at each of these trends.

Application usage. Consumers are growing accustomed to using applications over the Internet, or software-as-a-service. The application is available by just typing a URL into any browser. The notion that the consumer has to run the application on her own system or device may be ebbing away. More powerfully, consumers may be finding that constant installation, updates, reboots, backups, and reconfiguration are hassles to be avoided. SaaS's sidestepping of these hassles is becoming a 'customer delight'.

Efficient development. Software development teams are discovering the power of lean and agile software development methodologies. Modeled after lean manufacturing techniques and high-performance teams, these relatively new software development cycles place consumer needs directly in front of development teams and allow the rapid production of services that satisfy consumers' highest-priority needs.

Geographically dispersed teams. Work has increasingly been done by geographically distributed teams over the past two decades. Tools and methodologies have been developed to increase distributed-team productivity, such as distributed source code control systems, Scrum, and efficient team communication. However, the next big boost for dispersed teams may be virtual data centers provided by Cloud Computing. SaaS companies have reported productivity gains from moving their development and test environments to the cloud. The move increased communication efficiency and lowered bug rates because everyone was using the same resources in the same fashion.

Horizontally scaled systems. Horizontally scaled systems, or distributed computing, have been the holy grail of computing for a long time. A key problem has been that programming, deployment, and management of horizontally scaled systems were extremely difficult and required highly skilled experts. This will likely remain true for some time. However, IaaS, in particular Amazon Web Services and RightScale, has shown that programming, deployment, and management tools are starting to emerge that can ease these difficulties. PaaS offerings, such as Microsoft's Azure and Google's App Engine, are exposing interfaces that make writing horizontally scaled applications accessible to mere-human developers.

Internet bandwidth and prices. While the cheap days of lighting up the dark fiber left over from the original Internet boom may be limited, improvements in wavelength-division multiplexing (WDM), higher quality fiber, and new capital investments will likely continue to increase bandwidth, and over the long term, drive down prices.

Operations commoditization. IT operations are being automated, with resources being provisioned via programmatic interfaces. As automation increases, prices will likely fall, availability will likely increase, and infrastructure variations (such as speeds, capacity, connectivity) will likely increase as service providers seek ways to compete.

Social connections. Metcalfe's Law states that the value of the network is proportional to the square of the number of connected users. The value of the Internet increases as people, devices and applications (or services) become connected. Thus, more people, devices and services get connected to the Internet to benefit from the increased value. As connections increase, on-line social networking and reputation become more important.
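
To make the square relationship concrete, here is a quick, illustrative calculation (the constant and user counts are arbitrary):

    # Metcalfe's Law: network value is proportional to the square of the number
    # of connected users, value = k * n**2 (k is an arbitrary constant).
    def network_value(n, k=1.0):
        return k * n ** 2

    # Doubling the connected users quadruples the value, so each new person,
    # device, or service that connects makes connecting more attractive for the next.
    print(network_value(1000000))    # 1e12, in arbitrary units
    print(network_value(2000000))    # 4e12 -- four times the value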

Global reach. A couple of decades ago, few would ever think of starting a global business due to the time, complexity, and expense involved. With the advent of global credit systems, such as VISA, global auction systems, such as eBay, shipping/logistics companies, such as FedEx, and large e-tailers, such as Amazon, it's possible to conduct limited business transactions across the globe. Many countries' corporate, tax, and customs regulations likely lag behind this capability, and there will be improvements over the long term as countries find ways to accommodate more global e-commerce transactions.

Thursday, September 3, 2009

Emergency

My son was returning to college a few nights ago. He is an organized person. He meticulously planned every detail of his return. He even reinstalled everything on his laptop to ensure that his system would be in prime condition for his senior year. Just before we headed out the door, his laptop crashed so hard that we could not get the system out of 'safe' mode. We only had time to reinstall the OS and pack up the rest of the application installation CDs before we left.

His data backup was located on an external hard drive that is used to back up all of our systems. I imagined the condition of the external hard drive after it made the round trip to school and back in his duffel bag. I could not ensure that the drive would be kept secure in an apartment with three roommates and their many friends. I could not let the hard drive out of my control. We made plans for his return trip this weekend to restore his data. We rushed out the door.

While returning home that night, I remembered discussions that I was having with the Cloudberry Lab folks about their AWS S3 tools, a browser (S3 Explorer) and an online backup utility (Backup). They had asked me to kick the tires and let them know what I thought about their products. I had my trusty S3 account. "Just maybe", I thought, "I could use Cloudberry Backup to transfer my son's backup files to S3 and then have my son recover his files from S3".

I downloaded Cloudberry Backup the next morning. The setup was easy. I followed the straightforward backup wizard to create the backup script. I selected all of his files, all 21 GB of them. I fired up Cloudberry Backup and off it went, copying his files to S3. I noticed that the progress indicator wasn't exactly zipping up to 100% complete. I clicked on details to see that my upload capacity was sitting right at 250 KB per second. At this rate, the transfer would take a day. I easily halted the transfer and edited the backup to select 1 GB of his most important files. I started the backup again. Within an hour, his files were on S3. I sent a note to him to download Cloudberry Backup and configure it for my S3 account. He was successful and had access to his files. Not bad for a pre-med student. Not bad for Cloudberry and the cloud.
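
For the curious, here is the back-of-the-envelope arithmetic behind that "would take a day" estimate, assuming the upload held steady at 250 KB per second:

    # Rough transfer-time estimate at a sustained 250 KB/s upload rate.
    upload_rate_kb_per_s = 250
    full_backup_kb = 21 * 1024 * 1024          # 21 GB expressed in KB
    seconds = full_backup_kb / upload_rate_kb_per_s
    print(seconds / 3600)                      # ~24.5 hours -- roughly a day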

The flexing capability of the cloud enables consumers to use what they need when they need it. While this use case was a desktop application tethered to a cloud storage service, access to what I needed when I needed it solved the problem. Even with the best of planning, Murphy strikes. The ability to request an extra CPU instance, web server, replica, IP address, or volume when capacity planning fails will continue to propel interest in the cloud. There's no (or little) price to be paid for this latent capacity.

This situation has made me realize that I need an online backup service. I'll be doing some more shopping around. I hope that the Cloudberry tools plus AWS S3 hold up to the comparison because I liked the Backup usage model and its coupling with my AWS account.

In case you're wondering what field of medicine my son is pursuing, it's Emergency Medicine.


For the record, I have not received any compensation from Cloudberry beyond using the free Cloudberry Backup (BETA) software.

Friday, August 21, 2009

Vision Versus Hype

A while ago, Alvin Toffler's Future Shock painted a vision of radical societal change due to technology advances and attitude shifts. Toffler projected changes that would transform the world thirty years in the future. I recall how foreign and frightening his book was to the average reader, who could not fathom the depth of change heading their way. Vision is seldom easy to comprehend and often challenging to embrace. Experience has taught us to be wary of visions based upon over-hyped technologies. The dot-com boom-and-bust cycle was an example where reality took longer to catch up to the over-hyped technology. The hype associated with Cloud Computing seems to swell larger with each passing day.

How can one use the vague information contained within visions to help with today's decisions? In The Art of the Long View, Peter Schwartz outlines how one can use information gathering, scenarios, driving forces, and plots to help make better decisions today. All one needs to do is agree that the basis of a scenario could happen. One need not agree that the scenario will happen. Agreeing that a scenario could happen is a much lower bar and easier for most people to accept. Scenarios have a probability of occurring. They make sense given the data, driving forces, and plots. They are seldom accurate. Collectively, they give views into possible futures. Given these views, one can evaluate how current decisions might play out against selected backdrops. As the probability of a scenario increases, one can develop contingency plans and/or change plans.

My experience in building complex distributed systems and Cloud Computing platforms provides me a detailed view of how these types of systems are assembled, used by developers, and will likely evolve. However, I felt that I was missing the cloud (forest) for the technical details (trees). I was lacking a larger vision of how companies, markets and consumption could change as Cloud Computing evolves. I turned to Peter Fingar's Dot.Cloud book to see if he could provide a vision for these areas. Just as in Future Shock, Dot.Cloud paints a challenging vision of what may change due to Cloud Computing. Using Schwartz' notion of accepting what could happen, I found Fingar's book eye-opening. I have started to add some of Fingar's information and trends to my own Cloud Computing scenarios.

In particular, I was intrigued by Fingar's chapter on the 'End of Management' and bioteams. Fingar linked together a number of trends (Cloud Computing, Web 2.0, the Work Processor, Innovation, and Work 2.0) to describe how and why historic management practices may not be effective in the future. Fingar described the rise of bioteams, or high-performance teams with collective leadership. Needless to say, Ken Thompson's Bioteams book is high on my reading list. Just as I finished Dot.Cloud, I picked up the Monday, August 17, 2009 Wall Street Journal and found The New, Faster Face of Innovation, an article written by Erik Brynjolfsson and Michael Schrage, which echoes Fingar's 'End of Management' chapter.

While the Cloud Computing hype continues to grow, I'm encouraged that there is considerable, well-considered material to inform what the Cloud Computing future may hold for us. The challenge is to keep the visions separated from the hype, to embrace the could happen, and to be mindful of the will happen. Scenarios may be useful mechanisms for having constructive discussions about the future.

A disclosure: I may have given Fingar the benefit of the doubt beyond reason since he drew upon Dr. John Vincent Atanasoff as an example. I received my Computer Science degree at Iowa State University. I have a fondness for Dr. Atanasoff's work at Iowa State College and his place in computer history.


Thursday, August 13, 2009

Commoditization of IT Operations


My response to 'What will be commoditized by the Cloud?', asked during Cloud Camp Boston, was "The commoditization we're going to see in the cloud is operations". The new entrants to Cloud Computing asking the question seemed to imply that storage, networks, and computers were already commodities, and that unless something else was being commoditized, Cloud Computing would not create sufficient value to succeed. There are other benefits to Cloud Computing, but let's stay on the IT operations aspect for now.

I'm not ready to go as far as Nicholas Carr has gone in declaring IT Doesn't Matter, nor as far as what's been attributed to Thomas Siebel. Current IT practices emerged in response to the application creation/selection process. The application creation/selection process determines what IT operations must deploy in data centers to support the business. As an over-simplified example, a company determines that it needs a new CRM application. It evaluates CRM applications that were written and tested months or years earlier by an ISV for the market at large. The application specifies the supported hardware platform(s), software stack(s), patch level(s), network configuration(s), data access methodology(ies), and monitoring/management interface(s). Once the specific CRM application is selected and purchased, IT operations has to develop a runbook specific to the application for deployment in a specific data center(s). The various IT operations departments work together to fit the application's requirements into their existing infrastructure and procure new, unique, or additional components, or 'one-offs'. They have to side-step other applications' conflicting requirements and 'one-offs' to complete the deployment. IT operations' job complexity has to be staggering as they deal with hundreds or thousands of deployed applications. Maybe that's why IT operations likes to leave a newly deployed application untouched for months or years while tending to the other applications.

Essential characteristics of Cloud Computing include on-demand self-service, ubiquitous network access, location-independent resource pooling, rapid elasticity, and measured service. Cloud Computing will provide these essential characteristics affordably through virtualization, automation, programmatic control, and standards[1]. Applications and automated runbooks will be developed specifically for the Cloud. Programming will automate the manual work done on infrastructure (network, storage, database, and system) and application configuration, management, monitoring, and maintenance. The automation will drive the cost of operations lower. Virtualization, automation, programmatic control, and standards will commoditize IT operations. Evidence of this progress is appearing now as virtualization companies and cloud suppliers automate the allocation, configuration, monitoring, and management of infrastructure and applications.
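
As a sketch of what "runbook as code" might look like (the provisioning calls below are stand-ins, not any real provider's API), every manual operations step becomes a repeatable, reviewable program:

    # Hypothetical runbook-as-code sketch. The FakeCloud class is a stand-in
    # that just records intent; the point is that provisioning, configuration,
    # and monitoring become a program instead of a manual procedure.

    class FakeCloud:
        """Stand-in for a programmatic infrastructure API."""
        def provision_server(self, size, image):
            print(f"provision server size={size} image={image}")
            return {"type": "server", "size": size, "image": image}

        def provision_database(self, engine, storage_gb):
            print(f"provision database engine={engine} storage={storage_gb}GB")
            return {"type": "database", "engine": engine, "endpoint": "db.internal:5432"}

        def configure(self, resource, **settings):
            print(f"configure {resource['type']} with {settings}")

        def monitor(self, resource, alert_email):
            print(f"monitor {resource['type']}, alert {alert_email}")

    def deploy_crm(cloud, web_count=4):
        """Automated runbook: provision, configure, and monitor a CRM deployment."""
        db = cloud.provision_database(engine="postgres", storage_gb=100)
        servers = [cloud.provision_server(size="small", image="crm-base") for _ in range(web_count)]
        for server in servers:
            cloud.configure(server, app="crm", db_endpoint=db["endpoint"])
        for resource in servers + [db]:
            cloud.monitor(resource, alert_email="ops@example.com")

    deploy_crm(FakeCloud())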

IT operations will take back control from the applications to deliver better efficiency and lower cost through public and private Clouds. The pressure to increase IT operations' efficiency and lower costs will be great enough that applications will be re-written and automated runbooks will be developed. The historic pressure to create 'one-offs' in IT operations for unique application requirements will fade. How many 'one-off' requests do you think Amazon has gotten since it started up AWS? I'd bet a lot. How many 'one-off' requests has Amazon implemented? I'd bet none. AWS cannot afford to make custom changes for any application. The same is likely true for Google App Engine and Microsoft Azure. In the future, it may even be true for IT operations.

What should we call this new world of commoditized IT operations? What about OPS 2.0? OPS 2.0 fits alongside Web 2.0, Enterprise 2.0, and Everything 2.0, doesn't it?

[1] No standards exist yet, and they are likely years away. The industry will likely generate many competing ways of doing Cloud Computing before a de jure or de facto standard comes about. Industry-wide efficiency and cost savings won't be possible until then.

Thursday, July 30, 2009

Conspiracy at Cloud Camp Boston


I attended CloudCamp Boston last night. If you did not attend, you missed out on an excellent unconference. Many thanks to the sponsors for picking up the food and bar tab and providing the meeting space. Special thanks to Matthew Rudnick and Wayne Pauley, who lead Silver Lining, the Boston Cloud Computing User Group, for jump-starting the event.

The 350 registered attendees heard Judith Hurwitz and John Treadway give a good overview of Cloud Computing with some of the latest definitions and terms. Most of the action came in the sessions where attendees could ask their questions in a group setting, hear various opinions, and have a discussion to seek understanding. The hallway discussions were happening everywhere. There was an urgency to the give and take of information so the attendees could get to the next person on their list.

Let's face it, the concept of Cloud Computing is vague, unfamiliar, emerging, and complex. I applaud those who are overcoming the inclination to wait until the dust settles before they learn about it. They are sorting through the hype, 'cloudwashing' (a play on whitewashing), pre-announcements, and unproven pundit claims to uncover what they need to learn and, most importantly, unlearn. The common answer to their questions was, 'it depends'. Still they persisted in refining their questions and seeking why it depends.

Apparently, there is a controversy surrounding the concept of 'private cloud'. Some maintain that a private cloud is nothing more than a move by existing IT types to keep their jobs and hardware vendors to keep up their hardware sales to enterprises. Has Oliver Stone been seen lurking around Armonk lately?

Putting conspiracy theories aside for a moment, my brief description of a private cloud is cloud computing done internally. Our NIST friends would agree in principle with this definition. For example, if one could package up all of AWS's tools, software, hardware, and operational knowledge, and actually operate one's own resources with the same capability and efficiency as AWS does, that would be an example of a private cloud. A private cloud duplicates the same level of automation, process control, programmatic control, scale, multi-tenancy, security, isolation, and cost-efficiency as a public cloud. There may be some internal data centers that are operating as efficiently as AWS's public cloud today and could claim that they are operating a private cloud. However, a person who points to a hodgepodge of machines maintained by an army of administrators and claims that he has a private cloud would have difficulty proving his case.

If hardware vendors and IT types are promoting private clouds to save themselves, they may have grabbed an anchor instead of a life-preserver.

Tuesday, July 28, 2009

Safe Bet

Microsoft's Azure pricing was announced earlier this month. There have been a few blog posts publishing the numbers and comparing prices. The bottom line is that Microsoft pretty much[1] priced its offerings at parity with Amazon Web Services. The question that kept coming to mind was 'Why parity?'.

Microsoft has market dominance, a relatively captive developer audience, large data center experience, and cash. Azure is designed to remotely run customer code on Microsoft's software stacks, under Microsoft's control. The Azure developer experience is similar in style to the desktop development experience. Azure should be efficient since it leverages Microsoft's massive data centers and operational expertise. Microsoft has the capital for a prolonged battle.

Meanwhile, AWS prices have been relatively fixed for some time. AWS storage and small compute instance prices have remained the same for years. While Amazon has offered new services such as reserved instances at lower prices, and tiered outgoing bandwidth prices, the US pricing has remained unchanged. This is an amazing feat given how technology prices fall over time. Sounds like a pricing target to me.

Why not get banner headlines by undercutting AWS? Governments would not blink if Microsoft took on the world's largest online retailer on price. Would they? Azure is late and behind. Wouldn't lower prices demonstrate that Microsoft is serious about Azure and Cloud Computing? Azure has the benefit of using modern hardware in a market with two-year-old pricing. Microsoft has its own large data centers in low-cost locations. Couldn't Azure use these to its advantage? If anyone could take on AWS on price, Azure could.

Why wasn't Azure's pricing set lower? I don't know the answer. I suspect that, years ago, AWS set aggressive, forward-looking prices based on future efficiencies that they felt they would achieve. They have pretty much executed on plan. If so, there isn't much pricing room for a newcomer to undercut them. Given the large capital investments, automation complexities, low price-per-unit, high unit volumes, and thin margins, any small pricing mistake will compound and drastically affect the bottom line. Azure went with the safe bet, pricing parity.

AWS's pricing may be one of the greatest barriers to entry for the next generation of computing. If so, talk about beginner's luck.

[1] There may be a few exceptions with Microsoft's Web Role and SQL Azure. The Web Role and AWS CloudWatch price difference may be based on different technical approaches and the resources used, with Web Role potentially being the higher of the two. The SQL Azure price per month has processing and storage bundled, whereas SimpleDB prices storage and compute components based upon actual usage per month.

Friday, July 10, 2009

Vapor Versus Vendor

I'm interested in cloud versus owned-IT cost comparisons. Understanding how organizations break out costs and set up the comparisons offers insight into their views and thinking. Some comparisons don't include the multitude of overheads, for example, management, facilities, procurement, engineering, and taxes, because these costs are hidden or difficult to estimate.

A friend sent me a pointer to the up.time blog post. The author compares his in-house test environment against running a similar test infrastructure on AWS. AWS is commonly used for building test environments. The post does a good job of breaking out in-house costs and tracking AWS expenses. The author factors some overheads into the analysis. The result? AWS costs $491,253.01 per year ($40,937.75 per month) more than his in-house environment. Wow!

There must be some mistake. The author must have left out something.

I can argue some minor points. For example, an average month has 30.5 days instead of 31, trimming about $1,000.00 per month off the $64,951.20 in instance charges. Another is that the analysis should include the additional overheads mentioned above. These minor points may add up to $5,000.00 per month, a far cry from explaining the $40,937.75 per month delta.

Looking a bit deeper, there is a big cost difference between 150 Linux and 100 Windows instances. Break out a baseline of Linux costs (to cover AWS machine, power, and other costs) versus the additional cost for Windows. The baseline price for 302 Linux small and large instances is $22,915.20 per month. The Windows premium is $0.025 per CPU hour, which works out to $2,827.00 per month for 152 Windows instances. The cost for Windows and SQL Server running on 152 instances is $42,036 per month. Hence, the SQL Server premium is approximately $39,171.60 per month. The author pays $4,166.00 per month for his in-house database licenses. The premium paid for SQL Server on AWS is approximately $35,005.60 per month.
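
Using the figures published in the post, and assuming a 744-hour (31-day) month for the hourly Windows premium, the rough arithmetic looks like this; small differences are rounding:

    # Reproduce the rough breakdown from the post's published figures.
    hours_per_month = 31 * 24                          # 744 hours, matching the post's 31-day month
    windows_premium = 0.025 * 152 * hours_per_month
    print(windows_premium)                             # ~2,827 per month for 152 Windows instances

    sql_premium_on_aws = 39171.60                      # SQL Server premium on AWS, as published
    in_house_db_licenses = 4166.00                     # author's in-house database license cost
    print(sql_premium_on_aws - in_house_db_licenses)   # ~35,005.60 per month extra for SQL Server
    print(sql_premium_on_aws / in_house_db_licenses)   # ~9.4x -- the "almost 10x" discussed below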

Most of the $40,937.75 per month cost disadvantage for the cloud can be explained by the AWS pricing for Microsoft Windows at $2,827.00 per month and SQL Server at $35,005.60 per month. If the author would allow me, I could haggle over the overheads and other minor issues to close the remainder of the gap. But that's not the real issue here.

The pricing for Windows and SQL Server on AWS is not competitive with Microsoft's purchase pricing. Paying almost 10x more is not reasonable. The author points out that ISVs normally have agreements with other ISVs to use their software in test environments for low or no fees. If the test environment needs Windows or SQL Server, you'll have to pay a lot for it at AWS.

One last point: the author wondered if anyone does resource flexing in the cloud. As I pointed out at the beginning of my post, AWS is commonly used for testing because people can scale up their resource usage prior to release deadlines and when testing at scale. They reduce their resource usage when the need passes. Hence, resource utilization and the speed of acquiring incremental resources are additional factors to consider.

Wednesday, July 8, 2009

Will Google's Chrome OS be successful?

Google announced its Chrome OS project last night. Google is developing a secure, simple, and fast PC OS focused on web applications. This is a forward-looking move given cloud computing and Software-as-a-Service's projected growth. Will Chrome OS succeed? I see a few trouble spots in Google's blog post that they will need to overcome internally to have a chance at success.

Trouble spot #1: "Because we're already talking to partners about the project, and we'll soon be working with the open source community, we wanted to share our vision now so everyone understands what we are trying to achieve. .... we are going back to the basics and completely redesigning the underlying security architecture of the OS so that users don't have to deal with viruses, malware and security updates. It should just work. .... Google Chrome running within a new windowing system on top of a Linux kernel."

Google claims to know how to do a secure Linux OS (Linus must be thrilled), simple distribution (yet another Linux distribution) and fast windowing system (yep, Linux is weak here), and they're sharing their vision (via a blog post) with the open source communities. Hubris usually doesn't go far with open source communities.

Trouble spot #2: "Google Chrome OS is a new project, separate from Android. Android was designed from the beginning to work across a variety of devices from phones to set-top boxes to netbooks. Google Chrome OS is being created for people who spend most of their time on the web, and is being designed to power computers ranging from small netbooks to full-size desktop systems."

Google has two emerging and competing OS projects. Each is up against strong, entrenched competitors. Netbooks are an emerging market, with a few OEMs (Freescale, Acer) planning to use Android as their OS. While Google has the extra cash to fund competing projects, OEMs and retailers don't have the resources to support both. They need to invest to make one succeed.

Trouble spot #3 "We have a lot of work to do, and we're definitely going to need a lot of help"

Given today's economy, everyone is willing to spend their abundant resources and time to help Google become more powerful. Right? Yeah, I thought so too.

Tuesday, July 7, 2009

Does open source have a future in the clouds?

I have just read Stephen O'Grady's well-written post on open source and cloud computing. Everyone should read his post. He makes excellent arguments outlining a pessimistic future of open source in the cloud.

The cloud market is new and has a long way to go before a few dominant players control a mature, stable market. Many customers want cloud market maturity to be reached quickly, the dominant market leaders to be obvious, standards to be well understood, prices to fall year over year, and the technology path forward to be relatively risk-free. Someone at the Boston All-Things-Cloud meet-up mentioned that innovation will happen quickly in the cloud and that the market will rapidly mature. Don't get me wrong. I want the same things. However, high-tech markets don't develop as quickly as desired, nor as projected.

While the speed of technology development has been increasing, the pace at which humans can comprehend it, assess the risks, and plan its usage has remained a relatively slow constant. Customers will take a while to comprehend this new market's potential. Even if customers understand everything about it, market maturity is a long way down the road.

What does this point have to do with predicting open source's sad fate? Cloud computing will take a lot of time to develop and refine itself, giving open source projects time to adapt. Open source projects are built to adapt by their very nature. Advances in cloud computing will benefit how open source projects are done in ways that we cannot comprehend today. So, maybe the future isn't so bleak.

Let's look at the nature of open source projects. They are a mixture of leading-edge efforts (see the first lesson of Eric Raymond's The Cathedral and the Bazaar) and re-engineering efforts (Eric's second lesson). Open source developers commonly start with a clear need and usage pattern. They publish the code under a license to encourage other developers to extend the project for additional uses and contribute the extensions back to the original project. Successful projects change and adapt because different developers are 'scratching various itches'. All it takes is one motivated project member to adapt (or initiate) a project for the cloud.

A common open source project problem has been finding secure, affordable, and usable Internet-connected equipment. In the past, well-meaning hardware providers would contribute equipment to open source projects, only to find that the projects did not have the financial means to operate and maintain the equipment at a hosting provider. Cloud computing provides by-the-hour resources for testing and development that individual project members can easily afford. Previously, open source projects that required large amounts of resources for short periods of time were impractical. Now, they are affordable.

AWS's rise to early dominance in the market was due to its early appeal and familiarity to Linux developers. Open source projects can make their binaries available via Amazon Machine Images (AMIs). Other project members can instantiate the AMIs as their own AWS instances and quickly have the project code running. This has helped boost both the AWS developer ecosystem and open source projects. Here are two examples: the cloudmaster project and its AMI, and the mapnik + tilecache projects and their AMI. While I don't have any specific examples, I would not be surprised if a few open source projects have put out a 'donation' AMI to create a cash stream to offset their development costs.
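
As an illustration, launching a project's published AMI takes only a few lines with the boto library; the AMI ID, key pair, and credentials below are placeholders:

    # Sketch of launching a project's public AMI with boto, the Python AWS
    # library of the era. The AMI ID, key pair, and credentials are placeholders.
    import boto

    conn = boto.connect_ec2(aws_access_key_id="YOUR_KEY",
                            aws_secret_access_key="YOUR_SECRET")

    # Start one small instance from the project's published machine image.
    reservation = conn.run_instances("ami-00000000",
                                     instance_type="m1.small",
                                     key_name="my-keypair")
    instance = reservation.instances[0]
    print(instance.id, instance.state)   # the project code is running minutes later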

Stephen correctly pointed out that there is currently no turn-key IAAS open source project. I expect that it will take time for a collective open source effort to piece the right open source projects together to address this need. There are reasons to believe that it will be done. For example, Amazon Web Services (AWS) used many open source projects to build its offerings. It has contributed back to open source projects. I was part of an effort to build an IAAS offering. We were surprised to find how much of the networking, security, and virtualization infrastructure needed for an IAAS offering already existed in open source projects.

Open source innovators are not idle. New open source projects are providing needed Internet-scale technology that today is proprietary. Take a look at Chuck Wegrzyn's Twisted Storage project (to be renamed FreeSTORE) as an example of an open source project contributing Internet-scale storage technology. I'm guessing that others are out there too.

To be fair, one cannot build an equivalent to AWS from open source projects today. Thus far, AWS has not contributed its wholly owned and developed source code to the open source community. If AWS determines that contributing the code to the open source community would be in its best interest, it's a play that would assure open source's future in the clouds. It may even accelerate standards and cloud maturity. Hmmmm, maybe the guy at All-Things-Cloud was right.

Wednesday, July 1, 2009

What does it take to profitably compete against Amazon Web Services?

I've been concerned about this question for the past decade. I did not realize it until after Amazon Web Services became popular. How is that possible? Allow me to explain.

I joined Sun Microsystems over a decade ago, intrigued by Jim Waldo, Bill Joy, and others' vision of how computing would change as low-cost CPUs, storage, and networks emerged, and as distributed computing became accessible to large numbers of developers. A key question was 'how would businesses make money?' The particular technology and business that we imagined did not pan out as we had hoped. As it turns out, we were too early.

Fast forward to a few years ago, as many explored how utility and grid computing capabilities could be monetized. Sun Grid and its 'dollar per CPU hour' pricing were advanced as a possible Infrastructure-As-A-Service (IAAS) model. The Sun CTO organization began a research effort, dubbed Project Caroline, to investigate technology for the not-yet-named Platform-As-A-Service (PAAS) space. As part of the Project Caroline effort, we built a model data center to evaluate the technology's potential. Still, the question, 'how to make money?', loomed.

Shortly afterward, Amazon Web Services began rolling out its offerings. They instantly appealed to developers. Amazon captured the early IAAS market with fast-paced growth to the dominant position. Its offerings were straightforward and aggressively priced.

It clicked. I needed to understand how an IAAS provider could compete against Amazon Web Services. There are good strategies for competing against an early leader in an emerging market. Going head-to-head with the leader in a low-margin, capital-intensive market is an unlikely strategy. However, knowing what it would take to directly compete with Amazon Web Services would be informative and instructive.

As I talked with people, the complexity became overwhelming. I decided to construct a model to allow evaluation of many potential scenarios. For example, what if I wanted to get to market quickly by using a 24x7 operations center staffed by people who handle the configuration and management of the customer resources? What if I purchased software to automate their tasks? What if I built the software? Another example: what if I used 'pizza box' systems for customer nodes? What about mid-range systems? What about blade systems? On the operations side, what if power rates doubled? What if I doubled capacity? How much of a factor is utilization? Rich Zippel joined the modeling effort.

As we refined the model, a few findings stood out. For example, utilization of the customer resources, equipment selection, software license costs paid by the IAAS provider, and the level of automation of customer resource configuration/management are major factors in determining profitability. Other factors, such as power costs and automation of infrastructure systems' management, have less of an impact than we had expected.
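
To illustrate why those factors dominate, here is a toy version of such a model. Every number below is invented for illustration; it is not the model presented, and real inputs would differ:

    # Toy IAAS profitability model -- all inputs are made up for illustration.
    def monthly_profit(nodes, vms_per_node, utilization, price_per_vm_hour,
                       hw_cost_per_node, license_per_node, power_per_node,
                       ops_staff, ops_salary):
        hours = 30 * 24
        revenue = nodes * vms_per_node * utilization * hours * price_per_vm_hour
        costs = (nodes * (hw_cost_per_node + license_per_node + power_per_node)
                 + ops_staff * ops_salary)
        return revenue - costs

    base = dict(nodes=1000, vms_per_node=8, price_per_vm_hour=0.10,
                hw_cost_per_node=150.0, license_per_node=50.0,
                power_per_node=40.0, ops_salary=10000.0)

    # Little automation: big operations staff, customers leave resources idle.
    print(monthly_profit(utilization=0.30, ops_staff=20, **base))   # a large loss
    # Heavy automation: small staff, higher utilization of customer resources.
    print(monthly_profit(utilization=0.60, ops_staff=4, **base))    # turns a profit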

I presented some of the scenarios and findings last night at the Boston Cloud Services meet-up. The audience had great questions and suggestions. For example, is there any way to increase utilization beyond 100% (overbooking as airlines do), knowing full well that you'll deny someone's request when they access their resource? This could be modeled; however, the answer would have to account for customer satisfaction and SLA requirements. Would more powerful systems be more cost-effective than the low-end HP systems modeled? The model allows various system types and their infrastructure impacts to be compared. I modeled the lower-end systems because they, in general, ended up being more cost-effective than other types of systems. However, more system types should be modeled.

If you have other questions or suggestions, feel free to let me know.

Tuesday, June 30, 2009

Building out large scale data centers

Multiple friends have sent me pointers to James Hamilton's work. I agree with his perspective on designing and deploying Internet-scale data centers and services, based on my Sun experiences with distributed computing and cloud research. I wonder how many executives, managers, and engineers understand his proposals and how much they will need to change to succeed in this emerging world.

Friday, June 26, 2009

Just beginning..

My blog is under construction. Please pardon the dust.
I will be presenting 'Competing Against AWS' at the Boston Cloud Services Meeting on June 30, 2009.