For the past ten years, CIOs and IT departments have been fighting to slay the ‘Shadow IT’ dragon. You do know what I am talking about, right? All those pesky servers carefully hidden under employee desks because “IT was not responsive enough to handle their needs” or “IT doesn’t understand what the business process needed.” The battle against these often insecure and unmaintainable pieces of infrastructure is still being waged as the shadow army gains new allies in the form of SaaS, cloud and big data offerings.

You may remember from my piece on The Big Data Black Hole that the core problem of IT agility has not been solved. Lines of business (LOBs) are turning to software-as-a-service offerings, both cloud and non-cloud, to address their needs in a way that cuts IT out of the picture.

Shadow Data v IT

Being an ‘information guy’, I want to talk about the rapid arrival of ‘Shadow Data’ in the enterprise and how things are likely to get worse before they get better. Most people would agree that big data is defined as data sets that exceed the boundaries and sizes of current infrastructure capabilities, forcing technologists to take a non-traditional approach.

Unfortunately, this means that even if an IT department had data science or large-scale systems awareness, the infrastructure of the average enterprise would not have the scale of processing, storage or networking to deal with the influx of big data. Add to this the fact that big data science is still in its infancy and that IT budgets continue to be strained, and LOBs looking to benefit from big data really won’t be able to rely on IT at all.

LOBs are turning to the newest, sexiest and most disruptive technologies to build their own big data infrastructures. What is going to play out is a ‘born again’ Shadow IT problem, with Shadow Data silos to boot. Before anyone interjects with, “Those old Shadow IT problems also had Shadow Data”, let me just say that in the big data era, size matters!

We can argue every which way to define the exact characteristics of big data, but I think we can all agree it lies in storage volume, number of pieces of data, varied formats (i.e. poly-structured) and varied consumption models. This means that in the era of big data, these silos are going to dwarf those of yesteryear. This increase in size will amplify risks on all fronts: economic, security, privacy, compliance and governance.

  • Economic: More hardware and software will be purchased which cannot be consolidated with existing infrastructures, thereby requiring more skilled people to maintain this big data infrastructure.
  • Security: Physical security is required for these large amounts of data, as they need to be protected (especially since backup struggles to work at this level). Logical security is required too, as the bigger the honey pot, the more bears will want to get their heads in.
  • Privacy: The data itself will more than likely hold a greater collection of personally identifiable information and rarely do LOBs know what that means as it relates to privacy.
  • Governance and Compliance: The data itself is so new and varied that very little exists in terms of governance frameworks and technologies to deal with it.

Does this mean businesses should shy away from any big data project? In my view: Hell no! The democratization of big data is upon us and in the end will better the enterprise, allowing for better, faster and data-driven decisions.

Big data projects should be evaluated with eyes wide open, along with a documented remediation plan for the challenges they can’t address.

A few things to consider are:

  • Is your little data problem solved? Are you sure you can’t get the answers you need from it (i.e. better value extraction from more traditional data)? Remember that only 1% of existing data is being analyzed.
  • Do you have quantifiable benefits from your potential big data project that will empower your CEO to earmark enough $ to fund it?
  • Did you speak to your CIO and get his blessing on the project? Are you taking into account the minimum set of infrastructure requirements, so that this new infrastructure doesn’t wreak havoc?
  • Don’t forget that your CIO may not have the budget, talent or infrastructure to execute the project.
  • Have you spoken to your legal counsel or, if you have one, your chief compliance officer, to make sure you are not putting the enterprise at unnecessary risk?
  • Did you carefully choose your hardware and software suppliers to ensure that they have demonstrated experience in existing big data implementations, and did you test for scale, scale, and more scale? Generally, this will be the greatest challenge.

So the war is not over and many more battles are still to come. But with a little homework and due process put in place early, much of the risk can be avoided at the outset.

  • DDN Storage
  • Date: July 3, 2014