Friday, August 6, 2010

NoSQL ... No Kidding

Cloud computing places heavy demands on databases. While traditional databases are very good at what they do, they may not be the best fit for cloud computing.

Consider this scenario. Your cloud offering gains popularity and picks up hundreds or thousands of new users overnight. You want to make sure that it can support those users and the dramatic increase in demand. You might have started with a single in-house server. Now you must scale out across multiple nodes. Adding new hardware and distributing the load is not a trivial process, and it has cost implications. Research in this area suggests that it comes down to fulfilling three important characteristics:

* Consistency
This means that every client always has the same view of the data.

* Availability
This means that all clients must always be able to read and write.

* Partition tolerance
This means that the system must keep working even when the network is split into partitions that cannot reach each other.

While it is desirable for a DBMS to address all three requirements, it is not possible to guarantee all of them at once: a distributed system can provide at most two of the three at the same time. The underlying constraints and issues are captured in what is popularly referred to as the CAP theorem.
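To make the trade-off concrete, here is a minimal sketch of a toy two-replica store. It is not how any real database works; the names `Replica`, `TinyCluster`, and the "CP"/"AP" mode strings are made up for illustration. When the simulated network is partitioned, the store must either refuse the write (staying consistent but giving up availability) or accept it on one replica only (staying available but letting the copies disagree).

```python
class Replica:
    """One node holding a full copy of the data (hypothetical toy model)."""

    def __init__(self, name):
        self.name = name
        self.data = {}


class TinyCluster:
    """Two replicas plus a flag that simulates a network partition."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.a = Replica("a")
        self.b = Replica("b")
        self.partitioned = False

    def write(self, replica, key, value):
        """Return True if the write was accepted, False if it was refused."""
        other = self.b if replica is self.a else self.a
        if not self.partitioned:
            # Healthy network: both copies are updated together.
            replica.data[key] = value
            other.data[key] = value
            return True
        if self.mode == "CP":
            # Stay consistent by refusing the write; availability is lost.
            return False
        # AP: accept the write locally; the replicas now disagree.
        replica.data[key] = value
        return True

    def read(self, replica, key):
        return replica.data.get(key)


if __name__ == "__main__":
    for mode in ("CP", "AP"):
        cluster = TinyCluster(mode)
        cluster.write(cluster.a, "user", "v1")
        cluster.partitioned = True
        accepted = cluster.write(cluster.a, "user", "v2")
        print(mode, accepted,
              cluster.read(cluster.a, "user"),
              cluster.read(cluster.b, "user"))
```

In "CP" mode the second write is refused and both replicas still return "v1". In "AP" mode the write succeeds, but replica a returns "v2" while replica b still returns "v1". Real systems add reconciliation, quorums, and much more; the sketch only shows why all three properties cannot hold during a partition.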

So, depending on which characteristics matter most to you, different solutions are a better fit. Nathan Hurst has done a great job of classifying them.



These solutions are now collectively referred to as NoSQL databases (sometimes distributed databases). NoSQL databases are becoming so popular that anybody starting out with a new cloud offering should consider the value they bring from the very beginning. I will try to discuss some of the popular distributed databases in a subsequent post.