Next-Generation Databases Take On Big Data Management Challenges

This article is more than 8 years old.

Relational databases such as Oracle, IBM’s DB2 and Microsoft’s Access form the backbone for data storage and management in most organizations today. While relational databases provide good structure and accessibility for most data, they also have limitations that have given rise to a new class of databases built to address specific needs in dealing with extremely large or complex data resources.

These new databases don’t use the tables, fields and rows found in relational databases, and they don’t require establishing a schema (a highly ordered database plan) to set them up. Called “NoSQL” (or “not-only SQL”), they are designed to overcome specific data management challenges such as providing rapid data access to power real-time applications, bringing order to data in non-traditional formats, or avoiding the costs and turnaround time required to develop a conventional database schema.

The rise of NoSQL databases presents challenges for established database providers, and new options for data owners.

Do you need a NoSQL database? Today, probably not. Tomorrow, that may change. How will you know? You don’t need to learn the ins and outs of all the new databases now available, but you should get familiar with key types and the situations best suited to each.

Five major classes of NoSQL databases have emerged: column families (also known as “wide-column stores” or “columnar databases”), document, graph, key-value and XML (also known as “native XML”). Here are the basics on each type, with an eye to the kinds of data analysis that each fits best.

  • Column Families: These are the NoSQL databases that most resemble conventional relational databases. They store structured data in individual columns and, in place of tables, use groups of columns. They are good for machine-generated data, structured data sources too big to fit on a single computer, and for rapid data queries. Look at these if you’re thinking about rapid precision analytics on machine data. Big names: Apache Cassandra and Apache HBase (part of the Hadoop family)
  • Document: These are built around storing documents, rather than structured data. They are good for data that is unstructured, like open text in a letter or email, or semi-structured, like academic papers. Look at these if you’re thinking about text analytics on documents too long for conventional databases. Big names: MongoDB and Apache CouchDB
  • Graph: These use a graph structure, essentially a diagram of the relationships within the data, in place of tables. They are good database engines for powering web applications that must provide information very quickly (think online shopping and social networking platforms). Look at these if your primary interest is a fast application and you can live with some approximations in analytics. Big names: Neo Technology’s Neo4j and Microsoft’s Horton
  • Key-value: These are designed for simple and easy development of applications. They are good for situations where you need a working application developed fast, and all other considerations come second. Big names: Basho Technologies’ Riak, and Redis
  • XML: These use XML, a markup language that underlies many web and other information sharing systems, to define data structure. They are good for managing data that you can’t get into any other kind of database, and a good match when you have a lot of data in nontraditional formats like video and audio. Look at these when you’re going deep on analysis of unstructured data like speech or video analytics. Big names: MarkLogic and Sedna
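
To make the differences concrete, here is a minimal sketch in Python of how similar customer data might be shaped under each of the five models. It uses only plain Python structures and the standard library, not any vendor’s actual API, and every name and value in it (the customer “Ada”, the key cust:42, the neighbors and total_quantity helpers) is a made-up assumption for illustration.

```python
# Illustrative only: plain Python structures standing in for each data model.
# All names and values below are hypothetical, not taken from any product.
import json
import xml.etree.ElementTree as ET

# Key-value: an opaque value looked up by a single key.
key_value = {"cust:42": '{"name": "Ada", "city": "London"}'}

# Document: a self-describing, nested record with no fixed schema.
document = {
    "_id": 42,
    "name": "Ada",
    "orders": [{"sku": "A1", "qty": 2}, {"sku": "B7", "qty": 1}],
}

# Column family: each row key maps to named groups of columns,
# and different rows may carry different columns.
column_family = {
    "42": {
        "profile": {"name": "Ada", "city": "London"},
        "activity": {"last_login": "2014-01-05"},
    },
    "43": {"profile": {"name": "Lin"}},
}

# Graph: nodes plus labeled edges that describe relationships.
graph = {
    "nodes": {"ada", "lin", "sku_A1"},
    "edges": [
        ("ada", "FOLLOWS", "lin"),
        ("ada", "BOUGHT", "sku_A1"),
        ("lin", "BOUGHT", "sku_A1"),
    ],
}

# XML: the structure is carried by the markup itself.
xml_doc = ET.fromstring(
    "<customer id='42'><name>Ada</name><order sku='A1' qty='2'/></customer>"
)


def neighbors(g, node, label):
    """Return the nodes reachable from `node` over edges with `label`."""
    return sorted(dst for src, lbl, dst in g["edges"] if src == node and lbl == label)


def total_quantity(doc):
    """Sum the quantities across all orders nested in a document."""
    return sum(order["qty"] for order in doc["orders"])


if __name__ == "__main__":
    print(json.loads(key_value["cust:42"])["city"])  # key-value lookup, then decode
    print(total_quantity(document))                  # walk the nested document
    print(column_family["43"]["profile"])            # a sparse row: no "city" column
    print(neighbors(graph, "ada", "BOUGHT"))         # follow relationships
    print(xml_doc.find("name").text)                 # navigate by markup
```

The sketch glosses over everything that matters in production, such as distribution, indexing and query languages. Its only purpose is to show that each class organizes the same facts around a different question: a single lookup, a whole record, a slice of columns, a relationship, or a marked-up structure.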

Don’t adopt any new database without consulting everyone affected by the change and researching the options in depth. While, for example, a graph database can make a web application run like lightning, it can also make serious data analysis unacceptably slow.

Will using a NoSQL database help your business now? Maybe, maybe not. But knowing what these databases do and where they fit will help you know the moment when the answer is “Yes.”