satis egitimisatis


Discussion on the state of cloud computing and open source software that helps build, manage, and deliver everything-as-a-service.

  • Home
    Home This is where you can find all the blog posts throughout the site.
  • Categories
    Categories Displays a list of categories from this blog.
  • Tags
    Tags Displays a list of tags that has been used in the blog.
  • Bloggers
    Bloggers Search for your favorite blogger from this site.
  • Login

I had a recent discussion with some folks wondering why there was now an option for 32 or 64-bit System VMs with CloudStack 4.3. I provided an answer, and linked back to some mailing list discussions. I figured this might be of general interest, so I’d document in the short term with a blog post.

For background, system VMs provide services like dealing with snapshots and image templates, providing network services like load balancing, or proxying console access to virtual machines. They’ve historically been 32-bit. The reason for this is that the 32-bit arch has been very efficient with memory usage, and since these are horizontally scalable it’s easy to just spin up another.

But you can have either – which do you pick?

Depending on the workload you might have a different answer. Some hypervisors work better with one arch over the other; and that might be a factor; but ignoring hypervisors lets examine the reason you’d want to use either. 32-bit: 32-bit operating systems are pretty efficient with their use of memory compared to 64-bit. (e.g. the same information typically occupies less space in memory). However there are limits on memory. (Yes, you could use PAE with a 32-bit kernel to get more addressable memory, but there is considerable CPU overhead to do so – which makes it inefficient given that all of this is virtualized) The 32-bit kernels also have a limit on how much memory is used by the kernel. This is really where the use case of 64-bit System VMs evolved from. Because one of the system VM functions is providing load balancing, the conntrack kernel module had a practical limit of ~2.5M connections – and that left precious little room for the kernel to do other things. CloudStack orchestrates HAProxy as the default virtual LB, which in turn uses conntrack. Having a heavily trafficked web property behind CloudStack’s 32-bit virtual load balancer might run into that limitation.

64-bit: Not nearly as efficient with memory usage; however it can address more of it. You’ll actually tend to need more memory for the same level of functionality; but if you need to push the envelope further than a 32-bit machine, then at least you have an option to do so.

Hits: 14740
Rate this blog entry:
Continue reading Comments

I have been working with Clouds since before the coining of the term itself (back then, the startup I was working for called it "Agile Infrastructure"; now it's known as "IaaS"). From the very beginning, a frequent blocker to adoption has been the question of security. "We can't go to the Cloud because it is simply not secure," goes the complaint.

Well, I'm here to say it's bunk -- pure bunk. There is NO new security problem in the Cloud.

There is, in fact, a security problem in external Clouds -- but it is already in your data center right now.

If you take a truly secure system and place it in an external or hybrid cloud, it will remain secure. Simply exposing a secure system to a larger number of potentially hostile assailants is not enough to make it vulnerable. No, a truly secure system is designed to remain that way even during escalating pressure.

The problem is that very few of our current systems are truly secure. They rely heavily on the notion that threats are few behind the corporate firewall, so they don't need to have air-tight security. That concept is -- and always was -- a mistake. And now that conditions are changing in the Cloud, the inappropriate assumption is causing major headaches. The leaks in the boat are becoming apparent now that it is finally in the water.

Hits: 15930
Rate this blog entry:
Continue reading Comments

This post is a little more formal than usual as I wrote this for a tutorial on how to run hadoop in the clouds, but I thought this was very useful so I am posting it here for everyone's benefit (hopefully).

When CloudStack graduated from the Apache Incubator in March 2013 it joined Hadoop as a Top-Level Project (TLP) within the Apache Software Foundation (ASF). This made the ASF the only Open Source Foundation which contains a cloud platform and a big data solution. Moreover a closer look at the projects making the entire ASF shows that approximately 30% of the Apache Incubator and 10% of the TLPs is "Big Data" related. Projects such as Hbase, Hive, Pig and Mahout are sub-projects of the Hadoop TLP. Ambari, Kafka, Falcon and Mesos are part of the incubator and all based on Hadoop.

To Complement CloudStack, API wrappers such as Libcloud, deltacloud and jclouds are also part of the ASF. To connect CloudStack and Hadoop two interesting projects are also in the ASF: Apache Whirr a TLP, and Provisionr currently in incubation. Both Whirr and Provisionr aimed at providing an abstraction layer to define big data infrastructure based on Hadoop and instantiate those infrastructure on Clouds, including Apache CloudStack based clouds. This co-existence of CloudStack and the entire Hadoop ecosystem under the same Open Source Foundation means that the same governance, processes and development principles apply to both project bringing great synergy that promises an even better complementarity.

In this tutorial we introduce Apache Whirr, an application that can be used to define, provision and configure big data solutions on CloudStack based clouds. Whirr automatically starts instances in the cloud and boostrapps hadoop on them. It can also add packages such as Hive, Hbase and Yarn for map-reduce jobs.

Whirr [1] is a "set of libraries for running cloud services" and specifically big data services. Whirr is based on jclouds [2]. Jclouds is a java based abstraction layer that provides a common interface to a large set of Cloud Services and providers such as Amazon EC2, Rackspace servers and CloudStack. As such all Cloud providers supported in Jclouds are supported in Whirr. The core contributors of Whirr include four developers from Cloudera the well-known Hadoop distribution. Whirr can also be used as a command line tool, making it straightforward for users to define and provision Hadoop clusters in the Cloud.

Hits: 27551
Rate this blog entry:
Continue reading Comments

Doing it Twice? Write it Down!

Posted by on in Cloud Best Practices

There’s a great meme going around about geeks and repetitive tasks. Because geeks will often get annoyed at the effort of doing something manually, they often decide to find a way to automate it – which usually involves a lot more effort than doing it the one time but “geeks win, eventually” because they save time in the long run.

But in the long run we’re all dead. Then what? Who knows how to run your script? What happens when it needs to be maintained? As Jon Udell points out, it’s really not a contest, it’s a process, and non-geeks can play too. Which is why you should also write it down if you’re going to do it more than two times.

OK, “doing it more than two times” is a huge generalization. What I mean more specifically is:

  • If you’re in a team environment or doing work that will keep cropping up.
  • If you’re doing a task that is non-obvious and/or has a complicated series of steps that is non-obvious to people who are not you.
  • If you’re in any kind of critical path that would block shipping or operations if you aren’t there to do the magical things you do.
  • If you want to reduce your project or organization’s Bus Factor (help other people become proficient).
  • If you want to better understand what you do and how you can improve it.

Then you need to take a step back and document the things that you do on a regular basis, because it will help your teammates and (most likely) even you when you come back to a task that you haven’t done for a long time.

Naturally, I’m thinking of this in terms of a project like CloudStack where documentation is vitally important. The success of a distributed team depends a great deal on good documentation.

Hits: 27100
Rate this blog entry:
Continue reading Comments

Cloudstack Upgrade

Posted by on in Cloud Best Practices

We are researching the upgrade procedures from version 2.2.13 to 3.0.6 and would lke to know if anyone has performed this upgrade and can share a document of procedures to define a road-map for success. We are using the following:

(1.) Cloudstack 2.2.13

(2.) KVM

(3.) CentOS 6.1

(4.) Advanced Networking Setup

Hits: 9052
Rate this blog entry:
Continue reading Comments


Citrix supports the open source community via developer support and evangelism. We have a number of developers and evangelists that participate actively in the open source community in Apache Cloudstack, OpenStack, OpenDaylight, Xen Project and XenServer. We also conduct educational activities via the Build A Cloud events held all over the world.