

Discussion on the state of cloud computing and open source software that helps build, manage, and deliver everything-as-a-service.


Since my last post on the analysis of the CloudStack community, we have graduated and become a top-level project. It's about time to give an update on what can be seen as a metric of the health of our community.

All the data presented here comes from an analysis of the mailing lists, whose archives are publicly accessible. I have used this data before: when we graduated on March 22nd, in January, and back in November when I did some social network analysis. This study was inspired by John Jiang, now working at Eucalyptus. You can read his analysis, but note that he has moved it to the Eucalyptus website.

Methodology: As explained in previous posts, a Contributor is anyone who has sent an email to one of the CloudStack mailing lists. This is not to be confused with a Committer, which at the ASF means someone with write access to the code. Not all code contributors have write access. I identify Companies by the email domain used by the Contributors, because Contributors take part in the ASF as individuals, without a stated company affiliation. Obviously this has limitations, since some email domains can represent different companies. All emails are loaded into a MongoDB database, and queries are run against it to produce the plots that you will see below. We currently have seven mailing lists of varying traffic: announce, users, users-cn, dev, marketing, commits, issues. Note that all JIRA emails are now sent to the issues list. Subscriptions to these lists and the number of messages last month are as follows:

* dev@ 609 subs / ~2600 msgs in Apr
* users@ 782 subs / ~800 msgs in Apr
* issues@ 109 subs / ~2400 msgs in Apr
* commits@ 166 subs / ~3300 msgs in Apr
* marketing@ 85 subs / ~260 msgs in Apr
* users-cn@ ~300 subs / ~260 msgs in Apr
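
The methodology above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the scripts used for this post: it works on an in-memory list of messages instead of the MongoDB database, and the field names `from` and `date`, the function names, and the sample addresses are all hypothetical.

```python
from collections import Counter, defaultdict
from email.utils import parseaddr, parsedate_to_datetime


def sender_address(from_header):
    """Return the lowercased email address found in a From header."""
    _, addr = parseaddr(from_header)
    return addr.lower()


def monthly_contributors(messages):
    """Map 'YYYY-MM' to the set of distinct senders seen in that month."""
    per_month = defaultdict(set)
    for msg in messages:
        when = parsedate_to_datetime(msg["date"])
        per_month[when.strftime("%Y-%m")].add(sender_address(msg["from"]))
    return dict(per_month)


def cumulative_contributors(per_month):
    """Map 'YYYY-MM' to the number of distinct senders seen up to that month."""
    seen = set()
    totals = {}
    for month in sorted(per_month):
        seen |= per_month[month]
        totals[month] = len(seen)
    return totals


def contributors_per_domain(addresses):
    """Count distinct contributors per email domain (a rough proxy for company)."""
    return Counter(addr.rsplit("@", 1)[1] for addr in addresses if "@" in addr)


# Hypothetical sample data; a real run would read messages from the list archives.
messages = [
    {"from": "Alice <alice@example.org>", "date": "Mon, 01 Apr 2013 10:00:00 +0000"},
    {"from": "bob@example.com", "date": "Mon, 15 Apr 2013 12:30:00 +0000"},
    {"from": "Alice <ALICE@example.org>", "date": "Thu, 02 May 2013 09:00:00 +0000"},
    {"from": "Carol <carol@example.net>", "date": "Thu, 02 May 2013 11:00:00 +0000"},
]

per_month = monthly_contributors(messages)
print({month: len(senders) for month, senders in per_month.items()})
print(cumulative_contributors(per_month))
print(contributors_per_domain(set().union(*per_month.values())))
```

Lowercasing the address before counting keeps one person from being counted twice when their mail client changes the capitalization, but it does not merge people who post from several addresses.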

Contributors: The plots below show the number of contributors per month since we became an ASF project, as well as the cumulative total to date. A comparison with traffic prior to joining the ASF can be seen in the previous posts. The number of monthly contributors on dev is reaching 225, while the number of monthly contributors on users is reaching 175. Most notable is that the number of contributors on the users list seems to be closing in on the number of contributors on dev. This may indicate a stabilization of the number of developers and an increase in the user base. The cumulative number of contributors on each list is now over 500. Comparing the two contributor sets to account for people who post on both gives an estimate of 806 for the entire CloudStack community. Of course this does not include people who only participate in the marketing or announce lists, but those are much lower-traffic lists. It also does not include participants in the Chinese users list, which will hopefully be covered in the next post. From the subscription data listed above you can also see that we have roughly a 30% activity ratio, meaning that about one in three subscribers actually sends email to the lists. It is difficult to know whether this is a good or a bad number; one would need to compare with other ASF projects.
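
As a rough check on these figures, here is a minimal Python sketch of the two calculations involved: the size of the combined community as the union of two contributor sets, and the activity ratio as monthly contributors divided by subscribers. The function names are hypothetical, and the combined ratio simply adds the dev and users numbers, which is an assumption that ignores people who are active on both lists.

```python
def community_size(dev_senders, users_senders):
    """Count people who posted on either list, counting each person once."""
    return len(set(dev_senders) | set(users_senders))


def activity_ratio(monthly_contributors, subscribers):
    """Fraction of subscribers who sent at least one email in the month."""
    return monthly_contributors / subscribers


# Subscriber counts and approximate monthly contributors quoted in this post.
dev_ratio = activity_ratio(225, 609)
users_ratio = activity_ratio(175, 782)
# Assumption: the two lists are treated as disjoint for the combined figure.
combined_ratio = activity_ratio(225 + 175, 609 + 782)

print(f"dev: {dev_ratio:.0%}, users: {users_ratio:.0%}, combined: {combined_ratio:.0%}")
```

Under that assumption the combined figure comes out just under 30%, with dev above it and users below it.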



Citrix supports the open source community via developer support and evangelism. We have a number of developers and evangelists who participate actively in the open source community in Apache CloudStack, OpenDaylight, Xen Project and XenServer. We also conduct educational activities via the Build A Cloud events held all over the world.