Utility


23
Nov 09

Backups in the Hosting World

Something we’ve been finding pretty frustrating lately is the whole issue of backups. On my desktop I run Dropbox, but there still isn’t a service of Dropbox quality (or ease of use) for hosting companies. If you run a website, service, or business you need, at the very least, a disaster recovery plan, and that plan involves backups. There are several ways hosting companies deal with this.

1) They don’t do them

Yeah, you read that right. You don’t actually get backups. These policies are buried deep in their terms of service or usage policies. It’s totally up to you to back up your server content. If you don’t and your server crashes, it’s the end user’s problem. High-profile data losses can destroy any business, especially startups.

2) Highly Available Storage

This strategy is usually combined with #1 above. Instead of backing up your data, they just replicate it across multiple drives. This means that the chances of you losing your data go down and, depending on the technology used to safeguard against drive failure, you can get really high availability (iSCSI and ZFS come to mind). It’s important to remember that RAID is NOT a backup solution, only a way to mitigate potential failures: replication faithfully copies an accidental deletion or a corrupted file to every drive.

3) OS Level Backups

In this strategy end users are still required to worry about their own data, and choose which sets of data they back up. A hosting company will provide an end-point for you to send your backups to. If you’ve ever done managed or dedicated hosting, this is often the product that is sold. Tivoli or some other backup client is provided, but it still relies on a consultant, sysadmin, or service provider to configure it correctly.

4) VM Level Backups

There’s another solution that works, and it gets around a lot of the issues with OS level backups, such as backing up a database while it is running. Snapshot the entire virtual machine and replicate the VM to an off-site storage system. For better performance, use data de-duplication technology to reduce the amount of data, and therefore the time, each backup takes. This system seems to work well; however, few providers are offering it.
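To make the de-duplication idea a bit more concrete, here is a rough sketch that cuts an image file into fixed-size chunks and counts how many of them are actually unique. The helper name `chunk_stats` is made up for illustration, and it assumes GNU coreutils (`split`, `sha256sum`). Real backup products use persistent indexes and smarter chunking, so treat this as an illustration of the principle only.

```shell
#!/usr/bin/env bash
# Sketch only: estimate how much fixed-size block de-duplication could save.
# chunk_stats is a hypothetical helper; assumes GNU coreutils is installed.

# usage: chunk_stats FILE CHUNK_BYTES  -> prints "TOTAL_CHUNKS UNIQUE_CHUNKS"
chunk_stats() {
  local file=$1 chunk_bytes=$2 tmp total unique
  tmp=$(mktemp -d)
  # Cut the image into fixed-size pieces: chunk.aaaaaa, chunk.aaaaab, ...
  split -a 6 -b "$chunk_bytes" "$file" "$tmp/chunk."
  # Hash every piece; identical pieces produce identical hashes.
  total=$(sha256sum "$tmp"/chunk.* | wc -l)
  unique=$(sha256sum "$tmp"/chunk.* | awk '{print $1}' | sort -u | wc -l)
  rm -rf "$tmp"
  # Arithmetic expansion strips any padding that wc may add.
  echo "$((total)) $((unique))"
}
```

Running something like `chunk_stats disk.img 4096` prints the total and unique chunk counts. Only the unique chunks would need to travel to the off-site system, which is where the time savings come from.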

What do you think? What’s your favourite backup strategy as a hosting company?


27
Oct 09

Competing in a Commodity Hosting Market

We knew it was going to happen, but perhaps not so soon. Today Amazon announced that it would be reducing its pricing on EC2 Linux instances by 15%. That’s a pretty significant cost reduction, but we also have to factor in a whole bunch of other costs to figure out what their strategy seems to be.

Unlike with most bundled VPS services where you get a certain amount of disk space, bandwidth, memory and CPU resources, the Amazon model breaks things down into separate categories. You pay per use on everything: instances per hour, bandwidth and storage per gigabyte, and so on. Under this model it makes sense to shift your revenue to things that are higher margin. What that means is that with enough scale, you could almost afford to break even on the server instance and make money on other things – like bandwidth.
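Here is a quick back-of-the-envelope sketch of how a pay-per-use bill splits into those categories. The function name `monthly_bill` and every rate used with it are invented for illustration; they are not Amazon’s actual prices.

```shell
#!/usr/bin/env bash
# Sketch only: itemized monthly bill under a pay-per-use model.
# monthly_bill is a hypothetical helper; all rates passed to it are assumptions.

# usage: monthly_bill HOURS RATE_PER_HOUR TRANSFER_GB RATE_PER_GB STORAGE_GB RATE_PER_GB_MONTH
monthly_bill() {
  awk -v h="$1" -v hr="$2" -v bw="$3" -v bwr="$4" -v st="$5" -v str="$6" 'BEGIN {
    instance  = h * hr      # compute, billed per instance-hour
    bandwidth = bw * bwr    # data transfer, billed per GB
    storage   = st * str    # disk, billed per GB-month
    printf "instance=%.2f bandwidth=%.2f storage=%.2f total=%.2f\n",
           instance, bandwidth, storage, instance + bandwidth + storage
  }'
}
```

For example, `monthly_bill 720 0.10 100 0.17 50 0.10` itemizes a month of one always-on instance with 100 GB of transfer and 50 GB of storage at those made-up rates. Cut the hourly rate by 15% and only the instance line moves; the bandwidth and storage lines, where the margin may live, stay exactly where they were.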

This is similar to the concept of “Freemium” in the Web Apps world. You get to use the basic version at a heavy discount (in some cases free), but the add-ons and extra functionality cost money. The difference is that in the harsh reality of hosting, it costs real money to run a server.


9
Oct 09

Government Brief on Canadian Cloud Computing

Today the Canadian Government released a brief on the opportunities for Canada in Cloud Computing. It’s a great paper that highlights some of the benefits and strategic advantages of building large cloud computing centers in Canada. I’ll jump straight to the conclusion in the article: Canada is one of the BEST places to build out data centers and cloud computing infrastructure. The article mentions a bunch of reasons – I’ll expand on a few.

Geography & Climate

Most of the cost of running thousands of servers comes directly from the price of electricity and the cost of cooling. Canada has cheap, renewable electricity, and it’s colder. That means you can offer competitive services at better margins than someone running a cloud in the hot Nevada desert. Michael Geist wrote more about it at Clean Cloud Computing.

Legal Reasons

Not only are many Canadian companies required to keep their data on native soil, but the Personal Information Protection and Electronic Documents Act (PIPEDA) also means keeping information here is a really good idea.

Reliable, low cost, renewable energy

The BC, PEI, and Quebec governments actually offer the cleanest and lowest-cost electricity per kWh in all of North America. That’s possible through the use of hydro-electric dams, which also have an extremely low carbon footprint. As stated previously, the cost of running your servers is mostly the cost of electricity.
Cheaper electricity = Competitive Cloud

We’re right next to the American market

One of the fastest computer networks in the world, funded in part by the Canadian government, already runs through most of Canada. We’re also right next to the American market. That means North Americans can’t really tell if their servers are in Nevada or Nunavut. From a consumer’s perspective, there would be no reason not to use a Canadian Cloud that’s cheap, secure, and efficient, and we would be able to export a utility that is higher margin than, say, electricity.

All in all I’m really excited by this report, and I’m sure that more people will be thinking about the potential Canada has to become the world leader in cloud computing services. You can get a little more background information, and learn more about the suggested ways forward, by reading the brief here: “Cloud Computing and the Canadian Government”.


14
May 09

If A Tweet Killed a Tuna – Energy Cost Transparency in IT

One of the keys to improving anything is having enough information. This has been widely discussed in environmental circles, and recent innovations such as the Kill-A-Watt and the awesome hack the Tweet-A-Watt have led to a more widespread appreciation for just *knowing* the amount of energy your appliances, computers, and home entertainment systems are consuming. The numbers are often surprising, and all too often assumptions are made about where to focus effort to fix a particular problem – or worse, you don’t even know a problem exists. But what to do with this information? At home it’s as easy as putting your devices – such as your home theater – on a power bar and turning it off when you’re not using it. Having the data enables you to make a decision – the decision to save money, because all of a sudden it’s tangible.

These kinds of details can be applied at really big and really small scales too. What if you could measure the amount of power that went into making your car? The amount of energy each Google search takes? The amount of energy for every tweet? Would knowing a tweet kills a tuna make you think twice? Would it enable you to make better decisions about the products you consume? Would it allow your customers to make better decisions about their energy efficiency?

This can apply to the hosting world too. Computers currently use more energy than the entire airline industry, and that’s expected to double within the next 5 years. Data centers consume a whopping 2-3% of the power in the United States alone. Hosting companies charge flat rates for colocation, virtual servers, shared hosting, etc. Bundled into that are the charges for electricity, and the electricity required to power the cooling. Unless you’re really close to the physical infrastructure, there’s no way to measure how efficient the servers are, or how much power your server is consuming. If we could measure the amount of power a server uses, then we could incorporate that into the pricing of the server and display the information separately. As a hosting company you would be able to make better decisions about which hardware, software, etc. to use. As a hosting customer, you would be able to choose locations that are more power efficient. A slew of other possibilities exist. Due to power deregulation and trading markets in many locations, what costs a dollar during the day might cost 10 cents in the middle of the night.
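As a rough sketch of what an itemized power line could look like, the snippet below turns a server’s measured draw into a dollar figure. The helper name `energy_cost`, the wattage, the price per kWh, and the cooling overhead factor are all assumptions picked for illustration, not measurements.

```shell
#!/usr/bin/env bash
# Sketch only: what a server's electricity costs over a period of time.
# energy_cost is a hypothetical helper; every number passed to it is an assumption.

# usage: energy_cost WATTS HOURS PRICE_PER_KWH [OVERHEAD]
# OVERHEAD multiplies the server's own draw to account for cooling (defaults to 1).
energy_cost() {
  awk -v w="$1" -v h="$2" -v p="$3" -v o="${4:-1}" 'BEGIN {
    kwh = w * h / 1000 * o    # energy used, including cooling overhead
    printf "%.2f\n", kwh * p  # cost in the same currency as the price
  }'
}
```

With made-up numbers, `energy_cost 300 24 0.10 2` prices one day of a 300 W server at 10 cents per kWh, assuming every watt of compute needs another watt of cooling. Calling it once with a daytime price and once with a night-time price is enough to see why time-of-use billing could change customer behaviour.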

[Figure: hourly electricity demand in Ontario]

Data centers are built for peak capacity, but there should be an incentive for customers to adopt more energy efficient solutions. Being able to measure (in)efficiencies also means that making decisions about moving to a container might be easier to justify.


16
Oct 08

Hosting Apocalypse

Behold Sinners! The Apocalypse Approacheth. No, in all seriousness, if you run a managed hosting company then your time is officially ‘up’. You won’t survive the coming hosting Apocalypse. Here’s why.

There are a few companies you may have heard of building large compute grids for consumption by the general public. They’re calling them their Cloud Computing products. IBM is building BlueCloud, Microsoft is building the Mesh, Amazon already has EC2, and Google has AppEngine. AppEngine is in a slightly different category than the others and the BlueCloud details are sparse, but they’re still worth mentioning. Of more immediate interest are Amazon and Microsoft’s solutions.

Microsoft is currently building their famous 300,000 server Data Center in Chicago. That’s roughly 3 times the number of servers that Google has. Microsoft has also announced several other Data center projects – each worth about $500 Million. It’s fair to say that that’s a lot of computing power, and it’s not all for MSN – Microsoft is planning on providing their platform in the cloud.

The real question is what Amazon will do when the Windows Cloud comes online. Microsoft has enough money in the bank to provide their 300,000 servers to customers for *years* without earning a single cent. That implies they can offer services at super low rates, low enough to at least compete with Amazon’s EC2, which will support the Windows Server OS in fall 2008.

What happens when two huge cloud hosting companies get into a price war?

In the interest of self preservation they won’t make their services commodities – at least not right away. But it won’t even matter. When you’re as big as Amazon, Microsoft, Google or IBM, you can afford to buy servers in such massive quantities that you could make money selling compute time for $10 a month. The hosting space will change forever, because Amazon will eventually drop their prices by an *order of magnitude* and that has dire implications for the rest of the Mom ’n’ Pop hosting companies.

If thousands of companies can’t compete with Microsoft or Amazon on price, and they can’t compete in terms of convenience, then why would anyone use them? If you have to buy individual servers, or even servers by the rack, then you’re not going to get the price you need to be able to compete. You also don’t have access to the handful of specialized individuals and hardware required to make things work on such a grand scale.

The only answer is for all the smaller players to band together – to create a Federated Hosting environment, where together they can provide services that begin approaching levels of service and power that the Big 4 will offer.

Either way, we’re in an interesting period in the industry. Computing and the infrastructure of technology have become such a requirement for the economy that they will eventually become a general utility. The real question is who will be around.

Do you think it’s the end? We’re working on the answer, and your opinion is important.


3
Oct 08

How to move Servers Between Xen and Amazon

I’ve been working on a project that lets you quickly move systems between your private Xen implementation and Amazon’s EC2 service. There are a lot of hurdles to get this to work, and most of them stem from the fact that Amazon doesn’t let you download a kernel or ramdisk image out of S3 unless you’re the owner. You can download someone else’s image if you’ve saved it as your own, but you still can’t download the kernel and ramdisk. Also, EC2 has specific requirements for how the image is built. Here’s how you can get your image out of Amazon and run it locally on your own Xen hypervisor. I will assume you are already using Amazon Web Services and have created an account. If you haven’t, then sign up.

Amazon calls their instance images Amazon Machine Images, or AMIs. If you want to be able to grab one of the many images from Amazon, download the Amazon AMI tools and AWS tools from Amazon, then do the following.

Find and Download the AMI

$: ec2-describe-images
IMAGE   ami-cc6386a5    ubuntu-hardy-ruby/image.manifest.xml    848278689040    available       private         i386    machine
IMAGE   ami-386c8951    ubuntu-ruby-lapack/image.manifest.xml   848278689040    available       private         i386    machine
$:

Fields 2 and 3 contain important information (counting the leading IMAGE tag as field 1). For this example I’m listing the images that I own. Optionally, you can list all Amazon images by adding the ‘-a’ switch to the end of the ec2-describe-images command.

Field 2 is the unique identifier for the AMI, and field 3 is the bucket and AMI “manifest”, a file that describes the AMI. Because users can specify the name of the manifest, you should pay attention to this value when trying to run the next set of commands.
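If you have more than a couple of images, a small filter can pull out the values you need for the download step. In the sample output above, counting the leading IMAGE tag as column 1, the AMI ID sits in column 2 and the bucket/manifest path in column 3. The function name `ami_manifests` is made up for this sketch, and it assumes the output layout matches the sample shown here.

```shell
#!/usr/bin/env bash
# Sketch only: split ec2-describe-images output into AMI ID, bucket, and manifest.
# ami_manifests is a hypothetical helper; it assumes the column layout shown above.

# Reads the listing on stdin; prints "AMI_ID BUCKET MANIFEST" for each IMAGE row.
ami_manifests() {
  awk '$1 == "IMAGE" {
    slash = index($3, "/")    # the bucket is everything before the first slash
    print $2, substr($3, 1, slash - 1), substr($3, slash + 1)
  }'
}
```

You would pipe the listing through it, for example `ec2-describe-images | ami_manifests`, and then hand the bucket and manifest names to the download command.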

The AMI tools from Amazon include a utility called ‘ec2-download-bundle’. This will download the manifest file from the bucket, parse it to see what other files it needs to download, then it will reassemble the AMI image locally, and check its signature. The AMIs are encrypted in small (usually 10 MB) chunks. The signatures for those chunks are also included in the manifest.

To download the second AMI listed above, run the following commands:

mkdir 'image-to-download'
cd 'image-to-download'
ec2-download-bundle --bucket ubuntu-ruby-lapack -m image.manifest.xml --access-key $AWS_ACCESS_KEY --secret-key $AWS_SECRET --privatekey $EC2_PRIVATE_KEY

That will start downloading the bundle to your local system.

Rebuild the AMI

Now we have to unbundle the files:

ec2-unbundle -m image.manifest.xml -k $EC2_PRIVATE_KEY

This will decrypt and reassemble the image from all the individual components in the list.

Now you have an image named ‘image’ in your directory. You can take a look at this file by mounting it:

mkdir /mnt/image
mount -t ext3 -o loop image /mnt/image
cd /mnt/image

If you’re lucky there will be copies of the kernel and perhaps the ramdisk in the image’s /boot directory (/mnt/image/boot). Otherwise you’ve got to do something really tricky: you have to guess which kernel will work best. Thankfully we have a good understanding of what’s required to boot one of these images.
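A quick way to check whether you got lucky is to list anything under the mounted image’s boot directory that looks like a kernel or ramdisk. The helper name `boot_candidates` is invented for this sketch, and the file name patterns are assumptions, since naming varies between distributions.

```shell
#!/usr/bin/env bash
# Sketch only: look for kernel and ramdisk files inside a mounted image.
# boot_candidates is a hypothetical helper; the name patterns are assumptions.

# usage: boot_candidates MOUNT_POINT  -> prints one candidate path per line
boot_candidates() {
  local root=$1
  find "$root/boot" -maxdepth 1 -type f \
    \( -name 'vmlinuz*' -o -name 'initrd*' -o -name 'initramfs*' \) \
    2>/dev/null | sort
}
```

With the image mounted as above, `boot_candidates /mnt/image` prints whatever it finds. No output means you are in the guessing situation described here.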

If you’ve created an image for Xen already then chances are your kernel will work just fine, but your ramdisk might need some adjusting. A trick you can use is to chroot to the /mnt/image folder, specify which modules you want loaded and rebuild the ramdisk – then exit the chroot, copy the kernel and ramdisk out of /mnt/image and you’ll have all the components you’ll need.

I know what you’re thinking: That’s a lot of work / guessing

You’re in luck. While there are a couple of sites for sharing pre-built Xen images, the community is nowhere near as large as the Parallels or VMware ‘appliance’ sites. Jailtime.org has a handful of images but they don’t follow any sort of standard, and the disk layouts / configurations aren’t compatible with Amazon’s EC2.

LayerBoom has a Xen image that is completely compatible with Amazon’s AMI format, and it can run in your own environment. This means you can copy a system into Amazon from your test environment without any hassle. It also works with the Eucalyptus project, and can be booted in xVM server as well (instructions are coming).

Download the Xen package

url: http://layerboom.com/files/xen/images/centos52-20080930.tar.gz
md5: d54a83fc22f1ec052db6ebe3c258ee45

login (user/password): root/password
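Before unpacking the package, it’s worth checking the download against the md5 listed above. This is a minimal sketch: the helper name `verify_md5` is made up, and it assumes the GNU `md5sum` tool is available (other systems ship a differently named command).

```shell
#!/usr/bin/env bash
# Sketch only: compare a file's md5 checksum against a published value.
# verify_md5 is a hypothetical helper; assumes GNU md5sum is installed.

# usage: verify_md5 FILE EXPECTED_MD5  -> exit status 0 on match, 1 on mismatch
verify_md5() {
  local actual
  actual=$(md5sum "$1" | awk '{print $1}')
  [ "$actual" = "$2" ]
}
```

Something like `verify_md5 centos52-20080930.tar.gz d54a83fc22f1ec052db6ebe3c258ee45 && tar xzf centos52-20080930.tar.gz` would then only unpack the archive if the checksum matches.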


15
Sep 08

Citrix Announces C3

Hot on the heels of the VMware vCloud announcement, Citrix has announced their own “Cloud Enabling” product titled Citrix Cloud Center (C3). Similar to the vCloud offering, C3 will enable data centers to build their own cloud platform. You can get a copy of XenServer 5 today, but it’s unclear how the suite of tools that make up C3 will become available to the public.

C3 appears to be made up of different products that Citrix owns, such as WANScaler, NetScaler, XenServer, and Workflow Studio. Of course, this means you have to use each of these components to build your own cloud, and they all cost money. Not to mention that there’s no way to swap the different components in and out, at least none that I can find.

I’ll update more on costs and other information as Citrix gets back to me.