NoSQL

Recent Articles
5 common mistakes to avoid when de-duping your data

Data is power, and with that power comes great responsibility. One of the biggest obstacles in working with data is identifying duplicates and de-duping. The aim of data deduplication is to eliminate any redundant data in your business. Duplicates are created in all areas of your...

Top 9 database management systems for Joomla’s templates

Have you heard of Joomla Content Management Systems? I’m guessing you have since you clicked the link to this article. But just in case you don’t know, here’s a quick history about it. Joomla! was a part of Mambo until 2005. It then became its own separate thing. And...

DNA vs modern backup methods: The future of data storage

It’s difficult to wrap one’s mind around this, but it’s now possible to store vast amounts of data on a DNA strand. Isn’t it ironic that so much of the world’s data is stored via computerized data centers that are the result of many years of information technology and...

Bridging RDBMS and NoSQL: Introduction to 2DX UI cluster

2DX is a Web UI framework that consists of an in-memory browser NoSQL database and an RDBMS hosting a JavaScript interpreter built from stored procedures. Both the NoSQL and RDBMS database components of 2DX follow the same Entity-Attribute-Value database schema designed to accommodate serialized...

Constraining Data flexibility in a NoSQL database

Having no constraints in a NoSQL database means complete data flexibility. This sounds appealing at first because it saves development time: you do not have to define any columns or data types when you create your collections. But this much flexibility...

Framework of an Apache Spark Job Run!

The big data analytics community has now started to use Apache Spark in full swing for big data processing. The processing could be for ad hoc queries, prebuilt queries, graph processing, machine learning, and even data streaming. Hence the understanding of...

Get ignited by Apache Spark – Part 2

Thanks for your time; I definitely try to value yours. In Part 1, we discussed the Apache Spark libraries and Spark components such as the Driver, DAG Scheduler, Task Scheduler, and Worker. Now, in Part 2, we will discuss the basics of Spark concepts like Resilient...

Get ignited with Apache Spark – Part 1

Spark began life in 2009 as a project within the AMPLab at the University of California, Berkeley. More specifically, it was born out of the necessity to prove out the concept of Mesos, which was also created in the AMPLab. Spark was first discussed in the Mesos white...

Relational vs Non-Relational data bases – Part 3

In the first and second parts of this blog series, we saw some basic differences between the scalability of relational and non-relational databases. In this post, I will show you how to use these databases correctly, and also tell you about some well-known companies...

Deep dive into NoSQL: A complete list of NoSQL databases

NoSQL is an alternative to traditional relational databases, in which data is placed in tables, and is useful for working with large sets of distributed data. It is designed to accommodate a wide variety of data models, including key-value, document, columnar and graph...

Top five advantages and challenges of NoSQL

For a quarter of a century, the relational database (RDBMS) has been the dominant model for database management. But, today, non-relational, "cloud," or "NoSQL" databases are gaining mindshare as an alternative model for database management. In this article, we'll...

5 fun facts you may not know about HBase!

Here are some facts you may not have known about HBase: 1. Many companies have had HBase in production for 3+ years: Large technology companies including Trend Micro, eBay, Yahoo! and Facebook, and analytics companies RocketFuel and Flurry depend on HBase for many...

Relational Vs Non-Relational databases – Part 2

In my previous post, we saw some fundamental differences between relational and non-relational databases. In this post, let's talk about the scalability of the two. Scalability: the ability of a system to easily accommodate rapidly incoming data...

Relational vs. non-relational databases – Part 1

For the past few years, NoSQL or non-relational database tools have gained much popularity for storing huge amounts of data and scaling easily. There are debates on whether non-relational databases will replace relational databases in the future. With the...

HBase: 5 tips for running on low memory EC2

When running on EC2, you often can't win when it comes to instance types. One of the more cost-effective types available is the c1.xlarge. It has enough CPU to handle compactions, a decent amount of disk, and high network I/O. However, we've found that the relatively...

Big data showdown: Cassandra vs. HBase

In this brave new world of big data, a database technology called "Bigtable" would seem to be worth considering -- particularly if that technology is the creation of engineers at Google, a company that should know a thing or two about managing large quantities of...

The dirty truth about big data and NoSQL

If I asked you for the defining characteristic of a big data customer, you'd probably say they're sitting on large amounts of data. If I asked for the defining characteristic of a NoSQL customer, you might answer they require high levels of concurrency. Well, if...

Data Modeling In The Age Of NoSQL And Big Data

Hadoop HBase. MongoDB. Cassandra. Couchbase. Neo4J. Riak. Those are just a few of the sprawling community of NoSQL databases, a category that originally sprang up in response to the internal needs of companies such as Google, Amazon, Facebook, LinkedIn, Yahoo and more...

Is there a sequel to NoSQL?

The recent news that MongoDB, Inc. secured $150 million of investment capital underscored the fact that open source data is no longer in its infancy. In fact, judging from the marquee names of the investors—including T. Rowe Price and Fidelity Investments—it’s fair to...

Data Storage: REST vs. POSIX for Archives and HSM

Here is my working hypothesis: POSIX (Portable Operating System Interface) can’t scale to meet the demand of clouds and big data, while REST (Representational State Transfer) can’t manage and tier data the way POSIX can, although it will likely get those features in the next...

Installation of HBase in the cluster

HBase is an open-source, distributed, versioned, column-oriented store modeled after Google's Bigtable. This tutorial describes how to set up and run an HBase cluster, without much explanation of HBase itself. There are a number of articles where HBase is...