Think about it for a second. Every minute of every day, massive amounts of data are being created around the world. So much, in fact, that 90% of the data that exists today has been created in the last two years alone. It’s fascinating, isn’t it?
According to the latest reports, 2.5 quintillion bytes of data are being produced daily, a number that’s only going to increase in the coming years. Just a few short years ago, a gigabyte (one billion bytes) was considered to be a lot of data. Today, it’s a tiny fraction of what we are capable of generating. It is estimated that the current global storage capacity for digital information totals around 295 exabytes (one exabyte is a billion gigabytes!).
In recent years, the topic of big data has generated much hype in the IT industry, so much so that some are calling it the hot IT buzzword of 2012. And the buzz is quickly spreading to other industries as well. Just type “big data” into Google search and you’ll get somewhere close to 1,280,000,000 hits.
So what exactly is big data? And why should your organisation care about it?
One simple definition comes from Webopedia: “Big data is a phrase used to describe a massive volume of both structured and unstructured data that is so large and complex that it’s difficult to process with traditional database and software techniques.”
One of the main catalysts for this unprecedented, rapid growth of data is the current explosion of digital technology, such as mobile and web-enabled devices. Today, organisations – most likely including yours – are generating tremendous amounts of business data. This digital information is coming from everywhere – sensors used to gather climate information, posts to social media sites like Facebook and Twitter, digital photos on Flickr and Instagram, videos on YouTube, cell phone GPS signals, purchase transaction records and a multitude of other sources.
One of the characteristics that differentiates big data from traditional data is its sheer volume. And it keeps growing because it’s continuously being generated from more sources and devices than ever before. Organisations today are easily producing terabytes and even petabytes of information. The challenge is having the capacity to store, process and analyse these massive amounts of unstructured data.
Velocity refers to the extremely high speed and frequency at which big data is being created, collected and shared. Data from certain sources arrives so fast that there is no time to store it before it must be analysed. This too creates a challenge and the need to come up with new ways and tools to capture these large sets of fast-moving data.
Variety refers to the many forms big data takes: social media posts, numbers, audio, video, text, weblogs, photos, log files, reviews on websites and so on. Almost all organisations have access to a stream of unstructured data coming from a variety of channels. Most of the time the incoming data is not neatly structured and doesn’t fit the structures of traditional databases, making it difficult to analyse, interpret and use in a meaningful way.
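To see why such data resists a traditional relational table, consider a minimal Python sketch with made-up records – a social media post, a sensor reading and a photo’s metadata. The field names and values here are purely illustrative assumptions, not taken from any real system; the point is simply that records from different channels share almost no common structure.

```python
# Hypothetical records an organisation might collect from three channels.
# Each has a different shape, so no single fixed set of columns fits them all.
records = [
    {"source": "twitter", "text": "Loving the new product!", "user": "@jane"},
    {"source": "sensor", "temperature_c": 21.4, "station": "LHR-03"},
    {"source": "photo", "url": "http://example.com/img.jpg", "tags": ["launch"]},
]

# A relational table needs one fixed schema: the union of every field seen.
all_fields = set()
for record in records:
    all_fields |= record.keys()

# But the fields actually shared by every record are far fewer.
common_fields = set(records[0])
for record in records[1:]:
    common_fields &= record.keys()

print("Fields needed to hold all records:", sorted(all_fields))
print("Fields shared by every record:", sorted(common_fields))
```

Running this shows seven distinct fields across just three records, with only `source` common to all of them – a table covering every field would be mostly empty, which is why flexible, schema-less storage is often preferred for this kind of data.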
Clearly, big data is a BIG deal. It is opening up BIG opportunities for enterprises, while at the same time creating BIG challenges. One of these is the issue of security. With so much data generated within your organisation, how do you ensure that it is kept safe? How do you store it? And how do you analyse it in order to extract valuable information that will allow your organisation to make better, more accurate decisions and keep ahead of the competition?
To keep up with the rapid growth of business data, the IT industry will need to develop new and more effective ways of managing the ever-increasing amounts of information flowing into organisations. According to InfoWorld:
Big data will require major changes in both server and storage infrastructure and information management architecture at most companies. IT managers need to be prepared to expand the IT platform to deal with the ever-expanding stores of both structured and unstructured data. That requires figuring out the best approach to making the platform both extensible and scalable and developing a roadmap for integrating all of the disparate systems that will be the feeders for the big data analysis.