Big Data - Not Just a Size Issue

Today, many companies face an ever-growing pool of data they need to store: user activity logs, audit feeds, marketing data, user analysis data and so forth. Whatever the case, large amounts of data are no longer the exclusive realm of the Googles and Microsofts of the world.

More and more companies realize they need to tackle the challenges raised by Big Data, whether for the present or for the future.

The Big Data Problem

Fortunately, there are plenty of tools capable of not only storing data, but also making sure that data is retrievable in a timely manner. Classic databases are still the choice of many when it comes to storing data, and they're better suited to the task than ever before. But the amount of data is not the whole issue. It is a big part of it, but the problem is a bit more complex than that.

The main aspects of the Big Data problem are:

  • Size
  • Speed
  • Diversity

Size

Anyone can relate to the size issue: the sheer amount of data is still something that requires attentive management. Just 'winging it' is simply not an option, and chances are that trying to deal with large amounts of data once problems have already arisen will be far too costly. Instead, it is something you need to plan for. That does not mean you should immediately invest in expensive tools if you don't need them, but it certainly doesn't hurt to think ahead: how can I implement these tools later on?
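To make "planning for size" a little more concrete, here is a minimal back-of-the-envelope sketch in Python. The function name projected_storage_gb and all of the figures (events per day, bytes per event, growth rate) are hypothetical and chosen purely for illustration; plug in your own measurements.

```python
def projected_storage_gb(events_per_day, bytes_per_event, daily_growth_rate, days):
    """Estimate total storage in GB after `days`, with daily volume compounding."""
    total_bytes = 0.0
    volume = float(events_per_day)
    for _ in range(days):
        total_bytes += volume * bytes_per_event
        # Tomorrow's volume grows by the given rate (0.0 means no growth).
        volume *= 1.0 + daily_growth_rate
    return total_bytes / 1e9


# Hypothetical figures: 1 million events per day, 500 bytes each.
flat = projected_storage_gb(1_000_000, 500, 0.0, 365)
growing = projected_storage_gb(1_000_000, 500, 0.01, 365)

print(f"no growth:       {flat:.1f} GB after one year")
print(f"1% daily growth: {growing:.1f} GB after one year")
```

Under these assumed numbers, steady traffic comes to 182.5 GB in a year, while a seemingly modest 1% daily growth ends up at roughly ten times that. The point is not the exact figures but that a few lines of arithmetic can tell you early on whether you need to start thinking about other tools.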

Speed

How many people would use Spotify if it took 5 minutes of buffering before you could listen to a track? Products and services today face the challenge of retrieving their data in a timely manner (and usually, the quicker the better!). But speed doesn't only matter to the users of a product. Companies use their data to make decisions, and if gathering that data takes a long time, making those decisions becomes a tedious task. Therefore, most types of data should be (made) available as fast as possible.

Diversity

Remember the time when anything you could possibly want to store could be easily modeled in a hierarchical manner and stored in rows with foreign key columns describing their interdependencies? No? Neither do I...

More and more companies and products embrace the fact that there are very different kinds of data out there, and that their different characteristics may require different kinds of storage and retrieval methods. It's important to take this into consideration.
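As a small illustration of why diverse data resists a single fixed layout, the Python sketch below takes three records of the kinds mentioned in the introduction and counts how many cells would stay empty if they were all forced into one shared table. The records and their field names are invented for this example.

```python
# Hypothetical records; the field names are made up for illustration.
records = [
    {"type": "user_activity", "user_id": 42, "action": "login",
     "timestamp": "2024-01-01T09:00:00Z"},
    {"type": "audit", "actor": "admin", "resource": "invoice/17",
     "change": {"status": ["open", "paid"]}},
    {"type": "marketing", "campaign": "spring-sale", "channel": "email",
     "clicks": 310},
]

# Columns a single shared table would need: the union of all fields.
all_columns = sorted({key for record in records for key in record})

# Cells that would have to stay empty (NULL) in such a table.
empty_cells = sum(
    1 for record in records for col in all_columns if col not in record
)
total_cells = len(records) * len(all_columns)

print(f"{len(all_columns)} columns, {empty_cells} of {total_cells} cells empty")
```

With just three kinds of record, more than half of the cells are already empty, and the nested "change" field does not fit a flat column at all. That is the sort of mismatch that may make a different storage or retrieval method worth considering for some of your data.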

Big Data Solutions

With more products using the SaaS model (software as a service), software today is more available, to a larger audience, than ever before. That means that if your new software product becomes successful, you could very well end up having to cope with very large amounts of data, very fast.

Thankfully, there are already quite a few tools available to tackle Big Data problems. They range from free, open source products like Hadoop, MongoDB and Elasticsearch to closed source commercial products like GigaSpaces XAP, Coherence and JBoss Data Grid, just to name a few.

In short, Big Data is everything the name suggests and more. It is about very different types of data that should be accessible as fast as possible, as well as the sheer quantities of it. The fact that more and more companies are being challenged by Big Data issues means there is a wealth of tools that can provide solutions for your own particular situation.

So think ahead, determine your requirements and see how you can avoid the Big Data problems!