Relational Vs Non-Relational databases – Part 2


Published May 6, 2014

In my previous post, we saw some fundamental differences between relational and non-relational databases. In this post, let’s talk about the scalability of the two.

Scalability

Scalability is the ability of a system to accommodate rapidly growing incoming data without significant performance problems. There are two scaling methods, known as vertical and horizontal scaling.

Vertical scaling

Relational database tools generally support vertical scaling. This is the method of increasing the power of a system by adding CPU, memory and disk space to it. To keep up with rapidly incoming data, the single production server is upgraded, that is, scaled up. With this technique there is always a single production server to which all applications and users connect. A cluster can also be built around it by adding nodes and replicating the data across them.

Because of the ACID properties, all nodes must hold the same set of data, and data synchronization becomes complicated when there are several nodes in the cluster. Such a replicated setup is well suited to read scaling. Vertical scaling is also known as scale-up.

The benefit of this scaling methodology is the tight integration of data and its consistency across the nodes in a cluster. All nodes have the same set of data, and if there is a problem with the production server, applications are automatically connected to another node. Such a cluster is known as a failover cluster.
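
The failover behaviour described above can be illustrated with a small Python sketch. It is only a toy model: the Node and FailoverCluster classes are hypothetical names invented for this example, every node simply holds a full in-memory copy of the data, and a real failover cluster also has to handle replication, health checks and promotion of a standby.

```python
class Node:
    """A hypothetical database node holding a full copy of the data."""

    def __init__(self, name, data):
        self.name = name
        self.data = dict(data)  # every node gets the same rows
        self.online = True

    def read(self, key):
        if not self.online:
            raise ConnectionError(f"{self.name} is offline")
        return self.data[key]


class FailoverCluster:
    """Tries the production node first, then each standby in order."""

    def __init__(self, nodes):
        self.nodes = list(nodes)

    def read(self, key):
        for node in self.nodes:
            try:
                return node.name, node.read(key)
            except ConnectionError:
                continue  # this node is down, try the next copy
        raise ConnectionError("no node in the cluster is reachable")


rows = {"order:1": "shipped"}
primary = Node("primary", rows)
standby = Node("standby", rows)
cluster = FailoverCluster([primary, standby])

served_by_before, _ = cluster.read("order:1")
primary.online = False  # simulate a failure of the production server
served_by_after, value_after = cluster.read("order:1")
```

The application only ever calls cluster.read, so it does not need to know which node answered. This works because both nodes hold the same data.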

Horizontal scaling

Most non-relational database tools support horizontal scaling. This is the method of adding more computers to the network to handle rapidly incoming data. It is easy to add more nodes to the cluster as the data grows. Data are split automatically and processed across the nodes in the cluster. This is a distributed data environment. Hadoop Distributed File System (HDFS) is a classic example of this. Horizontal scaling is also known as scale-out.
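
One common way to split data across nodes is to hash each key and use the result to pick a node. The sketch below assumes this hash-based approach purely for illustration. The function names shard_for and distribute are made up for this example, and this is not how HDFS places its blocks.

```python
import hashlib


def shard_for(key, num_nodes):
    """Map a key to a node index using a stable hash (illustrative only)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_nodes


def distribute(keys, num_nodes):
    """Group keys by the node that would store them."""
    shards = {i: [] for i in range(num_nodes)}
    for key in keys:
        shards[shard_for(key, num_nodes)].append(key)
    return shards


keys = [f"user:{i}" for i in range(1000)]
three_nodes = distribute(keys, 3)
four_nodes = distribute(keys, 4)  # the same data after adding a fourth node
```

With simple modulo placement like this, adding a node changes the target node for many existing keys. Real systems often use techniques such as consistent hashing to limit how much data has to move.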

The benefit of this scaling technique is that, because data are split and replicated across nodes, the application can still get the data from other nodes if any one node goes offline, which keeps the data highly available. This method is very useful in cases where no JOINs are required between data held on different nodes. It also helps in separating data and keeping it in different geographical locations.
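
To see why splitting plus replication keeps data available, here is a minimal Python sketch. The ShardedStore class and its replicas parameter are assumptions made for this example and are not taken from any particular product. Each key is written to two neighbouring nodes, and a read falls back to the second copy when the first node is offline.

```python
import hashlib


class ShardedStore:
    """Toy store: each key lives on `replicas` consecutive nodes."""

    def __init__(self, num_nodes, replicas=2):
        if replicas > num_nodes:
            raise ValueError("replicas cannot exceed the number of nodes")
        self.nodes = [dict() for _ in range(num_nodes)]
        self.online = [True] * num_nodes
        self.replicas = replicas

    def _owners(self, key):
        # Hash the key to pick the first node, then take its neighbours.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        first = int.from_bytes(digest[:8], "big") % len(self.nodes)
        return [(first + i) % len(self.nodes) for i in range(self.replicas)]

    def put(self, key, value):
        for idx in self._owners(key):
            self.nodes[idx][key] = value

    def get(self, key):
        for idx in self._owners(key):
            if self.online[idx]:
                return self.nodes[idx][key]
        raise ConnectionError("every replica of this key is offline")


store = ShardedStore(num_nodes=4, replicas=2)
store.put("user:7", "Alice")
first_owner = store._owners("user:7")[0]
store.online[first_owner] = False   # one node goes offline
value_after_failure = store.get("user:7")  # served by the second copy
```

If every node holding a copy of a key goes offline at the same time, the read fails. The level of availability therefore depends on how many copies are kept.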

While both scaling techniques have advantages and disadvantages, a good environment can mix the two to get the best of scale-up and scale-out. For example, we can keep a read-and-write database that requires ACID properties on a single scaled-up server, and distribute historical data across several scaled-out nodes for data mining purposes.
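
As a rough sketch of such a mixed setup, the Python snippet below separates rows into a current set, which would stay on the single ACID server, and a historical set, which would be moved to the scale-out cluster. Splitting by a date cutoff, the split_by_age function and the "day" field are all assumptions made for this illustration.

```python
from datetime import date


def split_by_age(rows, cutoff):
    """Return (current, historical) rows based on a cutoff date."""
    current = [r for r in rows if r["day"] >= cutoff]
    historical = [r for r in rows if r["day"] < cutoff]
    return current, historical


rows = [
    {"id": 1, "day": date(2012, 3, 1)},
    {"id": 2, "day": date(2013, 11, 20)},
    {"id": 3, "day": date(2014, 4, 15)},
]

# Rows from 2014 onwards stay on the transactional server;
# older rows are candidates for the distributed historical store.
current, historical = split_by_age(rows, cutoff=date(2014, 1, 1))
```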