Monday, September 21, 2009

Virtue

Isn't everyone looking for a virtuous cycle? Virtuous cycles seem to be close relatives of the mythical perpetual motion machine: once a virtuous cycle starts, it creates new value every day. For example, the virtuous cycle of Moore's Law, single-chip CPUs, 'standard' hardware architectures, common operating systems, memory improvements, Gilder's Law, and shrink-wrap application software created the PC industry. Their continuous improvements collectively created the momentum that gave rise to the high-powered, low-priced laptops that we use today.

Will there be a virtuous cycle for cloud computing? It is too early to know for sure. My impression is that a virtuous cycle may be forming out of the following, seemingly unrelated parts: application usage, efficient development, geographically dispersed teams, horizontally scaled systems, Internet bandwidth, operations commoditization, social connections, and global reach. Let's take a look at each of these trends.

Application usage. Consumers are growing accustomed to using applications over the Internet, or software-as-a-service. The application is available by simply typing a URL into any browser. The notion that the consumer has to run the application on her own system or device may be ebbing away. More powerfully, consumers may be finding that the constant installation, updates, reboots, backups, and reconfiguration are hassles to be avoided. By sidestepping these hassles, SaaS is becoming a 'customer delight'.

Efficient development. Software development teams are discovering the power of lean and agile software development methodologies. Modeled after lean manufacturing techniques and high-performance teams, these relatively new software development cycles place consumer needs directly in front of development teams and allow the rapid production of services that satisfy consumers' highest-priority needs.

Geographically dispersed teams. Work has increasingly been done by geographically distributed teams over the past two decades. Tools and practices have been developed to increase distributed-team productivity, such as distributed source code control systems, scrum, and efficient team communication. However, the next big boost for dispersed teams may be the virtual data centers provided by Cloud Computing. SaaS companies have reported productivity gains from moving their development and test environments to the cloud. The move increased communication efficiency and lowered bug rates because everyone was using the same resources in the same fashion.

Horizontally scaled systems. Horizontally scaled systems, or distributed computing, have been the holy grail of computing for a long time. A key problem has been that the programming, deployment, and management of horizontally scaled systems were extremely difficult and required highly skilled experts. This will likely remain true for some time. However, IaaS providers, in particular Amazon Web Services and RightScale, have shown that programming, deployment, and management tools are starting to emerge that can ease these difficulties. PaaS offerings, such as Microsoft's Azure and Google's App Engine, are exposing interfaces that make writing horizontally scaled applications accessible to mere human developers.

Internet bandwidth and prices. While the cheap days of lighting up the dark fiber left over from the original Internet boom may be limited, improvements in wavelength-division multiplexing (WDM), higher quality fiber, and new capital investments will likely continue to increase bandwidth, and over the long term, drive down prices.

Operations commoditization. IT operations are being automated, with resources provisioned via programmatic interfaces. As automation increases, prices will likely fall, availability will likely increase, and infrastructure variety (in speeds, capacity, and connectivity) will likely increase as service providers seek ways to compete.

Social connections. Metcalfe's Law states that the value of the network is proportional to the square of the number of connected users. The value of the Internet increases as people, devices and applications (or services) become connected. Thus, more people, devices and services get connected to the Internet to benefit from the increased value. As connections increase, on-line social networking and reputation become more important.
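The quadratic growth behind Metcalfe's Law is easy to see with a quick calculation (a sketch; the pairwise-connection count is the usual n*(n-1)/2 approximation of the "square of the number of users," and the figures are illustrative, not dollar values):

```python
# Metcalfe's Law: the value of a network grows roughly as n^2,
# because n users can form n*(n-1)/2 distinct pairwise connections.

def potential_connections(n: int) -> int:
    """Number of distinct pairwise links among n connected users."""
    return n * (n - 1) // 2

for n in (10, 100, 1000):
    print(f"{n:>5} users -> {potential_connections(n):>7} potential connections")
```

A tenfold increase in users yields roughly a hundredfold increase in potential connections, which is why each new person, device, or service added to the Internet raises the value of joining for everyone else.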

Global reach. A couple of decades ago, few would ever think of starting up a global business due to the time, complexity and expense involved. With the advent of global credit systems, such as VISA, global auction systems, such as eBay, shipping/logistics companies, such as FedEx, and large e-tailers, such as Amazon, it's possible to conduct limited business transactions across the globe. It's likely that many countries' corporate, tax and customs regulations lag behind this capability and that there will be improvements over the long term as countries find ways to accommodate more global e-commerce transactions.

Thursday, September 3, 2009

Emergency

My son was returning to college a few nights ago. He is an organized person. He meticulously planned every detail for his return. He even reinstalled everything on his laptop to ensure that his system would be in prime condition for his senior year. Just before we headed out the door, his laptop crashed so hard that we could not get the system out of 'safe' mode. We only had time to reinstall the OS and pack up the rest of the application installation CDs before we left.

His data backup was located on an external hard drive that is used to back up all of our systems. I imagined the condition of the external hard drive after it made the round trip to school and back in his duffel bag. I could not ensure that the drive would be kept secure in an apartment with three roommates and their many friends. I could not let the hard drive out of my control. We made plans for his return trip this weekend to restore his data. We rushed out the door.

While returning home that night, I remembered discussions that I was having with the Cloudberry Lab folks about their AWS S3 tools, a browser (S3 Explorer) and an online backup utility (Backup). They had asked me to kick the tires and let them know what I thought about their products. I had my trusty S3 account. "Just maybe", I thought, "I could use Cloudberry Backup to transfer my son's backup files to S3 and then have my son recover his files from S3".

I downloaded Cloudberry Backup the next morning. The setup was easy. I followed the straightforward backup wizard to create the backup script. I selected all of his files, all 21 GB of them. I fired up Cloudberry Backup and off it went copying his files to S3. I noticed that the progress indicator wasn't exactly zipping up to 100% complete. I clicked on details to see that my upload capacity was sitting right at 250KB per second. At this rate, the transfer would take a day. I easily halted the transfer and edited the backup to select 1GB of his most important files. I started the backup again. Within an hour, his files were on S3. I sent a note to him to download Cloudberry Backup and configure it for my S3 account. He was successful and had access to his files. Not bad for a pre-med student. Not bad for Cloudberry and the cloud.
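The transfer-time estimate above is easy to check (a quick sketch assuming a sustained 250 KB/s upload and binary units, 1 GB = 2^20 KB; the later 1 GB transfer evidently ran faster than this observed rate):

```python
# Quick check of the upload-time estimate: 21 GB at a sustained 250 KB/s.

GB_IN_KB = 1024 * 1024      # 1 GB = 2^20 KB (binary units assumed)
upload_rate_kbps = 250      # observed upload rate, KB per second

seconds = 21 * GB_IN_KB / upload_rate_kbps
hours = seconds / 3600
print(f"Estimated upload time for 21 GB: {hours:.1f} hours")  # about a full day
```

At roughly 24.5 hours for the full 21 GB, halting the job and trimming the selection down to the most important gigabyte was clearly the right call.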

The flexing capability of the cloud enables consumers to use what they need when they need it. While this use case was a desktop application tethered to a cloud storage service, the access to what I needed when I needed it solved the problem. Even with the best of planning, Murphy strikes. The ability to request an extra CPU instance, web server, replica, IP address or volume when capacity planning fails will continue to propel interest in the cloud. There's no (or little) price to be paid for this latent capacity.

This situation has caused me to realize that I need an on-line backup service. I'll be doing some more shopping around. I hope that the Cloudberry tools plus AWS S3 hold up to the comparison, because I liked the Backup usage model and its coupling with my AWS account.

In case you're wondering what field of medicine my son is pursuing, it's Emergency Medicine.


For the record, I have not received any compensation from Cloudberry beyond using the free Cloudberry Backup (BETA) software.

Friday, August 21, 2009

Vision Versus Hype

A while ago, Alvin Toffler's Future Shock painted a vision of radical societal change due to technology advances and attitude shifts. Toffler projected changes that would transform the world thirty years into the future. I recall how foreign and frightening his book was to the average reader, who could not fathom the depth of change heading their way. Vision is seldom easy to comprehend and often challenging to embrace. Experience has taught us to be wary of visions based upon over-hyped technologies. The dot-com boom-and-bust cycle was an example where reality took a long time to catch up to the over-hyped technology. The hype associated with Cloud Computing seems to swell larger with each passing day.

How can one use the vague information contained within visions to help with today's decisions? In The Art of the Long View, Peter Schwartz outlines how one can use information gathering, scenarios, driving forces, and plots to make better decisions today. All one needs to do is agree that the basis of a scenario could happen; one need not agree that the scenario will happen. Agreeing that a scenario could happen is a much lower bar and easier for most people to accept. Scenarios have a probability of occurring. They make sense given the data, driving forces, and plots. They are seldom accurate. Collectively, they give views into possible futures. Given these views, one can evaluate how current decisions might play out against selected backdrops. As the probability of a scenario increases, one can develop contingency plans and/or change course.

My experience in building complex distributed systems and Cloud Computing platforms provides me a detailed view of how these types of systems are assembled, used by developers, and will likely evolve. However, I felt that I was missing the cloud (forest) for the technical details (trees). I was lacking a larger vision of how companies, markets and consumption could change as Cloud Computing evolves. I turned to Peter Fingar's Dot.Cloud book to see if he could provide a vision for these areas. Just as in Future Shock, Dot.Cloud paints a challenging vision of what may change due to Cloud Computing. Using Schwartz' notion of accepting what could happen, I found Fingar's book eye-opening. I have started to add some of Fingar's information and trends to my own Cloud Computing scenarios.

In particular, I was intrigued by Fingar's chapter on the 'End of Management' and bioteams. Fingar linked together a number of trends (Cloud Computing, Web 2.0, Work Processor, Innovation, and Work 2.0) to describe how and why historic management practices may not be effective in the future. Fingar described the rise of bioteams, or high performance teams with collective leadership. Needless to say, Ken Thompson's Bioteams book is high on my reading list. Just as I finished up Dot.Cloud, I picked up the Monday, August 17, 2009 Wall Street Journal and found The New, Faster Face of Innovation article written by Erik Brynjolfsson and Michael Schrage, which echoes Fingar's 'End of Management' chapter.

While the Cloud Computing hype continues to grow, I'm encouraged that there is considerable, well-reasoned material exploring what the Cloud Computing future may hold for us. The challenge is to keep the visions separated from the hype, to embrace the 'could happen', and to be mindful of the 'will happen'. Scenarios may be useful mechanisms for having constructive discussions about the future.

A disclosure: I may have given Fingar the benefit of the doubt beyond reason since he drew upon Dr. John Vincent Atanasoff as an example. I received my Computer Science degree at Iowa State University. I have a fondness for Dr. Atanasoff's work at Iowa State College and his place in computer history.


Thursday, August 13, 2009

Commoditization of IT Operations


My response to 'What will be commoditized by the Cloud?', asked during Cloud Camp Boston, was "The commoditization we're going to see in the cloud is operations". The questioners, new entrants to Cloud Computing, seemed to imply that storage, networks and computers were already commodities, and that unless something else was being commoditized, Cloud Computing would not create sufficient value to succeed. There are other benefits to Cloud Computing, but let's stay on the IT operations aspect for now.

I'm not ready to go as far as Nicholas Carr has gone in declaring IT Doesn't Matter, nor what's been attributed to Thomas Siebel. Current IT practices emerged in response to the application creation/selection process. The application creation/selection process determines what IT operations must deploy in data centers to support the business. As an over-simplistic example, a company determines that they need a new CRM application. They evaluate CRM applications that were written and tested months or years earlier by an ISV for the market-at-large. The application specifies the supported hardware platform(s), software stack(s), patch level(s), network configuration(s), data access methodology(ies) and monitoring/management interface(s). Once the specific CRM application is selected and purchased, IT operations has to develop a runbook specific to the application for deployment in specific data center(s). The various IT operations departments work together to fit the application's requirements into their existing infrastructure, and procure new, unique, or additional components, or 'one-offs'. They have to side-step other applications' conflicting requirements and 'one-offs' to complete the deployment. IT operations' job complexity must be staggering as they deal with hundreds or thousands of deployed applications. Maybe that's why IT operations like to leave a newly deployed application untouched for months or years while they tend to the other applications.

Essential characteristics for Cloud Computing include on-demand self-service, ubiquitous network access, location-independent resource pooling, rapid elasticity, and measured service. Cloud Computing will provide these essential characteristics affordably through virtualization, automation, programmatic control and standards[1]. Applications and automated runbooks will be developed specifically for the Cloud. Programming will automate the manual work done on infrastructure (network, storage, database, and system) and application configuration, management, monitoring, and maintenance. The automation will drive the cost of operations lower. The virtualization, automation, programmatic control and standards will commoditize IT operations. Evidence of this progress is appearing now as virtualization companies and cloud suppliers automate the allocation, configuration, monitoring, and management of infrastructure and applications.
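The shift from manual runbooks to programmatic control can be sketched in miniature. The sketch below is purely illustrative: the Provisioner class, its allocate method, and the resource names are hypothetical stand-ins for a cloud provider's programmatic interface, not any real API.

```python
# Illustrative only: a miniature 'automated runbook' that provisions
# resources through a programmatic interface instead of manual steps.
# The Provisioner class and resource names are hypothetical.

class Provisioner:
    """Stub standing in for a cloud provider's programmatic control interface."""

    def __init__(self):
        self.inventory = []

    def allocate(self, resource_type: str, count: int) -> list:
        """Hand back `count` new resource identifiers of the given type."""
        ids = [f"{resource_type}-{len(self.inventory) + i}" for i in range(count)]
        self.inventory.extend(ids)
        return ids

def deploy_app(p: Provisioner) -> dict:
    """An automated runbook: the same steps, in the same order, every time."""
    return {
        "web": p.allocate("vm", 2),      # two web-tier instances
        "db": p.allocate("volume", 1),   # one storage volume
        "lb": p.allocate("ip", 1),       # one public IP address
    }

deployment = deploy_app(Provisioner())
print(deployment)
```

The point of the sketch is that the runbook is code: it can be versioned, tested, and re-run identically across data centers, which is exactly what hand-maintained 'one-off' deployments cannot offer.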

IT operations will take back control from the applications to deliver better efficiency and lower cost through using public and private Clouds. The pressure to increase IT operations' efficiency and lower costs will be great enough that applications will be re-written and automated runbooks will be developed. The historic pressure of creating 'one-offs' in IT operations for unique application requirements will fade. How many 'one-off' requests do you think Amazon has gotten since they started up AWS? I'd bet a lot. How many 'one-off' requests has Amazon implemented? I'd bet none. AWS cannot afford to make custom changes for any application. The same is likely for Google AppEngine and Microsoft Azure. In the future, it may even be likely for IT operations.

What should we call this new world of commoditization of IT operations? What about OPS 2.0? OPS 2.0 fits alongside Web 2.0, Enterprise 2.0, and Everything 2.0, doesn't it?

[1] No standards exist yet, and they are likely years away. The industry will likely generate many competing ways of doing Cloud Computing before a de jure or de facto standard comes about. Industry-wide efficiency and cost savings won't be possible until then.

Thursday, July 30, 2009

Conspiracy at Cloud Camp Boston


I attended CloudCamp Boston last night. If you did not attend, you missed out on an excellent unconference. Many thanks to the sponsors for picking up the food and bar tab, and providing the meeting space. Special thanks to Matthew Rudnick and Wayne Pauley who lead Silver Lining, The Boston Cloud Computing User Group for jump-starting the event.

The 350 registered attendees heard Judith Hurwitz and John Treadway give a good overview of Cloud Computing with some of the latest definitions and terms. Most of the action came in the sessions where attendees could ask their questions in a group setting, hear various opinions, and have a discussion to seek understanding. The hallway discussions were happening everywhere. There was an urgency to the give and take of information so the attendees could get to the next person on their list.

Let's face it, the concept of Cloud Computing is vague, unfamiliar, emerging, and complex. I applaud those who are overcoming the inclination to wait until the dust settles before they learn about it. They are sorting through the hype, 'cloudwashing' (a play on whitewashing), pre-announcements, and unproven pundit claims to uncover what they need to learn and, most importantly, unlearn. The common answer to their questions was, 'it depends'. Still they persisted in refining their questions and seeking why it depends.

Apparently, there is a controversy surrounding the concept of 'private cloud'. Some maintain that a private cloud is nothing more than a move by existing IT types to keep their jobs and hardware vendors to keep up their hardware sales to enterprises. Has Oliver Stone been seen lurking around Armonk lately?

Putting conspiracy theories aside for a moment, my brief description of a private cloud is cloud computing done internally. Our NIST friends would agree in principle with this definition. For example, if one could package up all of AWS's tools, software, hardware, and operational knowledge, and actually operate their own resources with the same capability and efficiency as AWS does, that would be an example of a private cloud. A private cloud duplicates the same level of automation, process control, programmatic control, scale, multi-tenancy, security, isolation, and cost-efficiency as a public cloud. There may be some internal data centers that are today operating as efficiently as AWS's public cloud and could claim that they are operating a private cloud. However, a person who points to a hodgepodge of machines maintained by an army of administrators claiming that he has a private cloud would have difficulty proving his case.

If hardware vendors and IT types are promoting private clouds to save themselves, they may have grabbed an anchor instead of a life-preserver.

Tuesday, July 28, 2009

Safe Bet

Microsoft's Azure pricing was announced earlier this month. There have been a few blog posts publishing the numbers and comparing prices. The bottom line is that Microsoft priced their offerings pretty much[1] at price parity with Amazon Web Services. The question that kept coming to mind was 'Why parity?'.

Microsoft has market dominance, a relatively captive developer audience, large data center experience, and cash. Azure is designed to remotely run customer code under their control on Microsoft's software stacks. The Azure developer experience is similar in style to the desktop development experience. Azure should be efficient since they are leveraging Microsoft's massive data centers and operational expertise. They have the capital for a prolonged battle.

Meanwhile, AWS prices have been relatively fixed for some time. AWS storage and small-compute instance prices have remained the same for years. While Amazon has offered new services like reserved instances at lower prices, and tiered outgoing bandwidth prices, the US pricing has remained unchanged. This is an amazing feat given how technology prices fall over time. Sounds like a pricing target to me.

Why not get banner headlines by undercutting AWS? Governments would not blink if Microsoft took on the world's largest on-line retailer on price. Would they? Azure is late and behind. Wouldn't lower prices demonstrate that Microsoft is serious about Azure and Cloud Computing? Azure has the benefit of using modern hardware in a market with two year old pricing. Microsoft has their own large data centers in low cost locations. Couldn't Azure use these for their advantage? If anyone could take on AWS on price, Azure could do it.

Why wasn't Azure's pricing set lower? I don't know the answer. I suspect that, years ago, AWS set aggressive, forward-looking prices based on future efficiencies that they felt they would achieve. They have pretty much executed on plan. If so, there isn't much pricing room for a newcomer to undercut them. Given the large capital investments, automation complexities, low price-per-unit, high unit volumes, and thin margins, any small pricing mistake will compound and drastically affect the bottom line. Azure went with the safe bet, pricing parity.

AWS's pricing may be one of the greatest barriers to entry for the next generation of computing. If so, talk about beginner's luck.

[1] There may be a few exceptions with Microsoft's Web Role and SQL Azure. The Web Role and AWS CloudWatch price difference may be based on different technical approaches and the resources used, with Web Role potentially being the higher of the two. The SQL Azure price per month has processing and storage bundled, whereas SimpleDB prices storage and compute components based upon actual usage per month.

Friday, July 10, 2009

Vapor Versus Vendor

I'm interested in cloud versus owned IT cost comparisons. Understanding how organizations break out costs and set up the comparisons is insightful to their views and thinking. Some comparisons don't include the multitude of overheads, for example, management, facilities, procurement, engineering, and taxes because these costs are hidden or difficult to estimate.

A friend sent me a pointer to the up.time blog post. The author is comparing his in-house test environment against running a similar test infrastructure on AWS. AWS is commonly used for building test environments. The post does a good job of breaking out in-house costs and tracking AWS expenses. The author factors some overheads into the analysis. The result? AWS costs $491,253.01 per year ($40,937.75 per month) more than his in-house environment. Wow!

There must be some mistake. The author must have left out something.

I can argue some minor points. For example, an average month has 30.5 days instead of 31, trimming about $1,000.00 per month off the $64,951.20 in instance charges. Another is that the analysis should include the additional overheads mentioned above. These minor points may add up to $5,000.00 per month, a far cry from explaining the $40,937.75 per month delta.

Looking a bit deeper, there is a big cost difference between the 150 Linux and 100 Windows instances. Break out a baseline of Linux costs (to cover AWS machine, power, and other costs) versus the additional cost for Windows: the baseline price for 302 Linux small and large instances is $22,915.20 per month. The Windows premium is $0.025 per CPU hour, which works out to $2,827.00 per month for 152 Windows instances. The cost for Windows and SQL Server running on 152 instances is $42,036 per month. Hence, the SQL Server premium is approximately $39,171.60 per month. The author pays $4,166.00 per month for his in-house database licenses. The premium paid for SQL Server on AWS is therefore approximately $35,005.06 per month.
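The breakdown above can be reproduced with a quick back-of-the-envelope calculation (a sketch using the figures quoted in the post; a 744-hour, 31-day month is assumed, and small rounding differences from the author's published figures are expected):

```python
# Back-of-the-envelope reproduction of the cost breakdown, using the
# figures quoted in the post (2009 AWS list prices).
HOURS_PER_MONTH = 744            # 31-day month

linux_baseline = 22915.20        # 302 Linux small/large instances, $/month
windows_premium_rate = 0.025     # Windows premium, $ per CPU-hour
windows_instances = 152

windows_premium = windows_instances * HOURS_PER_MONTH * windows_premium_rate
# ~ $2,827.20/month, matching the post's ~$2,827.00

windows_sql_cost = 42036.00      # Windows + SQL Server on 152 instances, $/month
sql_premium = windows_sql_cost - windows_premium
# ~ $39,208.80/month, close to the post's ~$39,171.60 (rounding differences)

inhouse_db_licenses = 4166.00    # author's in-house database licenses, $/month
aws_sql_extra = sql_premium - inhouse_db_licenses

print(f"Windows premium:            ${windows_premium:,.2f}/month")
print(f"SQL Server premium:         ${sql_premium:,.2f}/month")
print(f"Extra vs. in-house licenses: ${aws_sql_extra:,.2f}/month")
```

However the rounding falls, the conclusion is the same: the SQL Server licensing premium dwarfs every other line item in the comparison.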

Most of the $40,937.75 per month cost disadvantage for the cloud can be explained by the AWS pricing for Microsoft Windows at $2,827.00 per month and SQL Server at $35,005.06 per month. If the author would allow me, I could haggle over the overheads and other minor issues to close the remainder of the gap. But that's not the real issue here.

The pricing for Windows and SQL Server on AWS is not competitive with Microsoft's purchase pricing. Paying almost 10x more is not reasonable. The author points out that ISVs normally have agreements with other ISVs to use their software in test environments for low or no fees. If the test environment needs Windows or SQL Server, you'll have to pay a lot for it at AWS.

One last point: the author wondered if anyone does resource flexing in the cloud. As I pointed out at the beginning of this post, AWS is commonly used for testing because people can scale up their resource usage prior to release deadlines and when testing at scale, then reduce their usage when the need passes. Hence, resource utilization and the speed of acquiring incremental resources are additional factors to consider.