If you want to build successful applications, choose the right database type!
According to a survey by the Standish Group, about 88% of IT projects in the world exceed their schedule, their budget, or both, and 31% are canceled before completion. Do you want to make great applications? Then you need to choose the right database type!
This article is the second in a series of four, in which I want to alert you to three mistakes beginner developers make that prevent the creation of great applications.
Two large groups of database types
Do not underestimate the power of data! Data is a company's most valuable asset. It is valuable enough that hackers try to steal it, and some researchers argue that in the near future data will be accounted for as part of corporate equity.
But what is the proper chest to hold a treasure? In other words … Which database type is suitable for storing your data?
We have two large groups of databases: relational databases and NoSQL databases.
They are very different and should be used in different situations. That’s why I claim that one of the biggest mistakes in building great applications is not choosing the right database type for your needs.
Relational Database (RD)
In 1970, Edgar Frank Codd, a brilliant mathematician at IBM, published an article in which he formally defined the relational model.
That is, relational databases have been in use for more than four decades, and in many cases they are used successfully.
In relational databases, the data is stored in tables, that is, in rows and columns. They have a fixed, well-defined schema and support transactions. One of the cool features of RDs is that they support SQL.
So when should you use relational databases?
• When transaction support is required;
• When the team has no experience with NoSQL databases and the project is critical for the company;
• When the data has a tabular format.
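To make "transaction support" concrete, here is a minimal sketch using Python's built-in sqlite3 module. The accounts table, its rows, and the transfer helper are assumptions made up for illustration. The point it tries to show is that a transfer either applies both updates or neither of them.

```python
import sqlite3

# In-memory database; the "accounts" table and its rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    " id INTEGER PRIMARY KEY,"
    " owner TEXT NOT NULL,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany(
    "INSERT INTO accounts (id, owner, balance) VALUES (?, ?, ?)",
    [(1, "alice", 100), (2, "bob", 50)],
)
conn.commit()


def transfer(conn, source_id, target_id, amount):
    """Move money between two accounts; both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, target_id),
            )
            # Violates the CHECK constraint when the source lacks funds.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, source_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    """Return {owner: balance} for every account."""
    return dict(conn.execute("SELECT owner, balance FROM accounts ORDER BY id"))


transfer(conn, 1, 2, 30)   # succeeds: both rows change
transfer(conn, 1, 2, 500)  # fails: the credit to bob is rolled back too
print(balances(conn))
```

In the failed transfer the credit is executed first and the debit then breaks the constraint, so the rollback is what keeps bob from receiving money that alice never paid.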
NoSQL Databases
NoSQL is the name given to a meeting held in San Francisco, California, in 2009 to discuss open source, non-relational, schema-free, and distributed database projects. These databases usually have their own query languages, although some are modeled on SQL, such as Hive’s HQL and Cassandra’s CQL.
NoSQL databases are categorized by their data model, which divides them into key-value, document, column, and graph databases.
Key-value databases
As the name says, this type of database stores a key (used in queries) and a value (which in most databases cannot be used in queries). It is widely used as an application cache because it keeps critical pieces of data in memory for low-latency access.
Examples of databases in this category are:
• Redis
• Voldemort
• Memcached
• Riak
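The caching use case described above can be sketched in a few lines of Python. This is a toy in-memory store, not the client of any of the databases listed. KeyValueCache, load_user_profile, and the 60-second expiry are all hypothetical names and values chosen for illustration.

```python
import time


class KeyValueCache:
    """Toy in-memory key-value store with optional expiry (illustration only)."""

    def __init__(self, clock=time.monotonic):
        self._data = {}
        self._clock = clock

    def set(self, key, value, ttl_seconds=None):
        expires_at = None if ttl_seconds is None else self._clock() + ttl_seconds
        self._data[key] = (value, expires_at)

    def get(self, key, default=None):
        entry = self._data.get(key)
        if entry is None:
            return default
        value, expires_at = entry
        if expires_at is not None and self._clock() >= expires_at:
            del self._data[key]  # expired entries are dropped on access
            return default
        return value


def load_user_profile(user_id):
    # Stand-in for a slow lookup in the primary database.
    return {"id": user_id, "name": f"user-{user_id}"}


def get_user_profile(cache, user_id):
    """Look in the cache first and fall back to the slow lookup on a miss."""
    key = f"user:{user_id}"
    profile = cache.get(key)
    if profile is None:
        profile = load_user_profile(user_id)
        cache.set(key, profile, ttl_seconds=60)
    return profile


cache = KeyValueCache()
print(get_user_profile(cache, 7))  # miss: loaded, then cached
print(get_user_profile(cache, 7))  # hit: served from memory
```

Note that every access goes through the key. There is no way to ask the store for "all users named X", which is the limitation the text mentions.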
Document database
This database type is similar to key-value databases: it has a key (which uniquely identifies the record and can be used in queries) and values, which store the data and can also be used in queries.
In this category the values are usually stored in JSON format, so these databases should be used when the data is semi-structured. Examples in this category include:
• CouchDB
• OrientDB
• RavenDB
• TerraStore
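As a rough illustration of the document model, the sketch below keeps JSON-like documents under unique keys and queries them by value. The customers collection, its fields, and the find helper are assumptions invented for this example and do not correspond to the API of any database listed above.

```python
import json

# Hypothetical collection: each document lives under a unique key, and
# documents do not need to share the same fields (semi-structured data).
customers = {
    "c1": {"name": "Ana", "city": "Recife", "emails": ["ana@example.com"]},
    "c2": {"name": "Bruno", "city": "Lisbon"},
    "c3": {"name": "Carla", "city": "Recife", "loyalty": {"tier": "gold"}},
}


def find(collection, **criteria):
    """Return the keys of documents whose top-level fields match all criteria."""
    return [
        key
        for key, doc in collection.items()
        if all(doc.get(field) == value for field, value in criteria.items())
    ]


# Lookup by key, as in a key-value database.
print(json.dumps(customers["c3"], indent=2))

# Query by value, which a plain key-value database usually cannot do.
print(find(customers, city="Recife"))
```

A real document database would use indexes instead of scanning every document, but the shape of the data and of the query is the same idea.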
Column Database
Think of this category as a two-tier aggregate structure, where there is a row identifier and a map with more detailed values. Column-oriented databases should be used when the data is denormalized, queried heavily, and those queries need high performance.
The figure below compares storage in rows with storage in columns.
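The same comparison can be sketched in Python. The sales records and the to_columns helper are made up for illustration, and the sketch assumes every row has the same fields. Real column-oriented databases organize data on disk in more elaborate ways, so treat this only as a picture of the two layouts.

```python
# Row-oriented layout: each record keeps all of its fields together.
rows = [
    {"id": 1, "product": "book", "price": 20, "quantity": 3},
    {"id": 2, "product": "pen", "price": 2, "quantity": 10},
    {"id": 3, "product": "lamp", "price": 35, "quantity": 1},
]


def to_columns(rows):
    """Regroup records so that all values of one field are stored together."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


# Column-oriented layout: one list per field.
columns = to_columns(rows)

# Summing prices over rows touches every field of every record...
total_from_rows = sum(row["price"] for row in rows)

# ...while the column layout only needs to read the "price" values.
total_from_columns = sum(columns["price"])

print(columns["price"], total_from_rows, total_from_columns)
```

This is why queries that aggregate a few fields over many records tend to favor the column layout, while reading or writing one complete record favors the row layout.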
Column-oriented database examples are:
• Cassandra
• HBase
• Hypertable
• Amazon SimpleDB
Graph Database
Graph databases are meant for situations where small records have complex relationships with each other. In this database type queries have excellent performance, but inserting data does not perform as well. Figure 3 shows an example of a graph-oriented database.
Examples of graph-oriented databases:
• FlockDB
• Neo4j
• OrientDB
• InfiniteGraph
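To give an idea of what "small records with complex relationships" looks like, here is a minimal sketch of a graph kept as an adjacency map, with a breadth-first search that finds the shortest chain of relationships between two records. The follows data and the shortest_path function are hypothetical, and graph databases expose this kind of traversal through their own query languages rather than through code like this.

```python
from collections import deque

# Hypothetical "follows" relationships: each key is a small record (a user)
# and each list holds the users that record points to.
follows = {
    "ana": ["bruno", "carla"],
    "bruno": ["dani"],
    "carla": ["dani", "edu"],
    "dani": [],
    "edu": ["ana"],
}


def shortest_path(graph, start, goal):
    """Breadth-first search; returns the shortest path as a list, or None."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbor in graph.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None


print(shortest_path(follows, "ana", "dani"))
print(shortest_path(follows, "dani", "ana"))
```

Answering the same question in a relational database would typically take one self-join per hop, which is where the relationship-heavy case starts to favor a graph model.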
Conclusion
Enough of using relational databases just because “everyone” uses them!
If your data is formatted as rows and columns and transaction support is required, use a relational database! If you have questions about what a transaction is, see the post about transactions.
If your data should remain in memory for quick access by applications, use a key-value database.
If your data is semi-structured, use a document-oriented database.
If your data is denormalized and needs high availability and high query performance, use a column-oriented database.
If your data consists of small records with complex relationships, use a graph-oriented database.
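The rules of thumb above can be written down as a small function. This is only a sketch of the checklist in this article: the function name, the parameter names, and the order in which the rules are checked are my own assumptions, and a real decision will weigh more factors than these.

```python
def suggest_database_type(
    *,
    tabular=False,
    needs_transactions=False,
    kept_in_memory=False,
    semi_structured=False,
    denormalized_read_heavy=False,
    complex_relationships=False,
):
    """Map the article's rules of thumb to a database type (illustration only)."""
    if tabular and needs_transactions:
        return "relational"
    if kept_in_memory:
        return "key-value"
    if semi_structured:
        return "document"
    if denormalized_read_heavy:
        return "column-oriented"
    if complex_relationships:
        return "graph"
    return "undecided"


print(suggest_database_type(tabular=True, needs_transactions=True))
print(suggest_database_type(semi_structured=True))
```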
Yes, I am sure you are capable and your applications will be very successful!
If you liked this post, share it! And if you have any questions, talk to me!
This article is available in Portuguese.