Challenges with Managing the Exploding Data Firehose

April 23, 2020

As data volumes continue to grow, at a rate of 63% per month according to a recent survey, many organizations have more data than they know what to do with. How do organizations keep up? In this interview, John Pocknell, Senior Market Strategist at Quest Software, covers the challenges and bottlenecks that keep data teams from succeeding and explores what data-driven businesses need to keep in mind as data becomes more critical. He also discusses the issue of “dark data” and how companies should address it in the era of data regulations.

JAXenter: What does it mean for a business to be data-driven? 

John Pocknell: The phrase “data-driven” has become commonplace in the tech industry, but it’s more than just a buzzword. Data is one of the most useful assets a business can hold, and a key differentiator when it comes to making strategic decisions. It’s a common sentiment that the organizations with the most data will lead the pack, but the truth is much more nuanced than that. Yes, access to data is important, but what’s more important is what organizations do with that data. It’s not just the organizations with the most data that will cross the finish line first; it’s the organizations that identify, analyze, and act on their data that will see the biggest rewards.

JAXenter: How can businesses ensure data is properly prepared for analysis?

John Pocknell: Most companies have so much data that they don’t know what to do with it or how to extract actionable insights from it. Before analyzing any data set, organizations need to understand what it contains and what it is worth. Without first taking stock of the data and figuring out exactly what it’s saying, they won’t be able to extract value from it.

A good way to do this is to practice data modeling. The data model is what’s used to design the optimum data structure (tables, relationships, indexes, etc.) in the first place, so looking at it is a good way to understand how the data was intended to be organized and which types of data were defined. Once you understand the framework of the data you’re examining, you can better understand the data itself.
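As an editorial illustration of reading a data model back out of an existing database, here is a minimal sketch using Python's built-in sqlite3 module. The schema, the table and index names, and the describe_schema helper are all hypothetical and are not taken from the interview or from any Quest Software tool; other database engines expose the same information through their own catalog views.

```python
import sqlite3

# Hypothetical demo schema; table, column, and index names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (
        id    INTEGER PRIMARY KEY,
        email TEXT NOT NULL
    );
    CREATE TABLE orders (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        total       REAL
    );
    CREATE INDEX idx_orders_customer ON orders(customer_id);
""")


def describe_schema(connection):
    """Return columns, foreign keys, and indexes for each user table."""
    model = {}
    tables = connection.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # table_info rows: cid, name, type, notnull, dflt_value, pk
        columns = [
            (row[1], row[2])
            for row in connection.execute(f"PRAGMA table_info('{table}')")
        ]
        # foreign_key_list rows: id, seq, table, from, to, ...
        foreign_keys = [
            (row[3], row[2], row[4])
            for row in connection.execute(f"PRAGMA foreign_key_list('{table}')")
        ]
        # index_list rows: seq, name, unique, origin, partial
        indexes = [
            row[1]
            for row in connection.execute(f"PRAGMA index_list('{table}')")
        ]
        model[table] = {
            "columns": columns,
            "foreign_keys": foreign_keys,
            "indexes": indexes,
        }
    return model


model = describe_schema(conn)
for table, details in model.items():
    print(table, details)
```

Listing tables, relationships, and indexes this way gives a quick picture of how the data was meant to be organized before any analysis starts.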

JAXenter: What limitations do businesses face with data analysis and overall data capabilities as data volumes and sources continue to grow?

John Pocknell: One big limitation for businesses is that their data analytics platforms are often siloed. When departments use different data sources, it can be challenging to bring the data together and review it holistically. Organizations must use tools that allow them to connect to multiple data sources, including relational and NoSQL databases, data warehouses, applications, and data files. Data can have an early expiration date, so it’s crucial that these databases are dynamic and constantly updated.

Another limitation occurs when the data can only be accessed and analyzed by technical members of the team. To prevent the bottlenecks created when only one team in the organization can access the data, tools should be easy enough to use that the data is accessible to business users as well as technical users.

JAXenter: What is “dark data” and how can it be addressed in the era of data regulations (GDPR, CCPA, etc.)?

John Pocknell: For those who don’t know what “dark data” is, and you should, Gartner defines it as the information assets organizations collect, process and store during regular business activities, but generally fail to use for other purposes. Basically, it’s the data that organizations collect but don’t use and don’t plan to throw out, which can be detrimental to business. How? Dark data often contains information that organizations aren’t aware is there. For example, it could contain personally identifiable information (PII), and if, say, someone wanted their PII scrubbed and you didn’t even know it was there, you could find yourself hit with a regulatory compliance fine. On top of that, dark data tends to incur storage and security costs that exceed its value.

JAXenter: How can businesses best leverage, understand and protect the data they don’t know they have, aka “dark data”? 

John Pocknell: Many businesses pull in data that they don’t know they have. The best tool to combat this issue is classification and search. A classification and search tool allows users to identify data according to its purpose, providing a more holistic view. On top of this, organizations should be utilizing the processes they already have in place for moving and managing data. This enables a better understanding of the data, what’s necessary and what’s not, reducing the likelihood of dark data.
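To make the classification idea concrete, the following is a rough editorial sketch of how a script might flag columns that appear to hold PII. It is not how any particular classification and search product works; the PATTERNS table, the classify_columns helper, and the sample rows are assumptions made for illustration, and the regular expressions are deliberately simple and would miss many real-world formats.

```python
import re

# Deliberately simple, illustrative patterns; real PII detection needs far more.
PATTERNS = {
    "email": re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+"),
    "phone": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}


def classify_columns(rows):
    """Map each column name to the set of PII labels its values match."""
    findings = {}
    for row in rows:
        for column, value in row.items():
            if not isinstance(value, str):
                continue  # only text values are checked in this sketch
            for label, pattern in PATTERNS.items():
                if pattern.fullmatch(value.strip()):
                    findings.setdefault(column, set()).add(label)
    return findings


# Hypothetical sample records, standing in for a forgotten export file.
sample_rows = [
    {"note": "call back Monday", "contact": "ana@example.com",
     "alt_contact": "+1 555-010-0199", "order_total": 42.5},
    {"note": "renewal due", "contact": "li@example.org",
     "alt_contact": "555 010 0142", "order_total": 17.0},
]

findings = classify_columns(sample_rows)
print(findings)
```

Columns that come back with a label are candidates for closer review, which is the kind of visibility that helps when someone asks for their PII to be removed.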

Companies should also determine what data needs to be tracked, and why, which is important to data strategy and governance policy. Just as customers are frequently unaware of what personal data is being tracked and stored, companies are often unaware too, gathering a large data set without stopping to identify why they are pulling it. As mentioned earlier, this becomes an issue around PII and regulatory compliance.

The post Challenges with Managing the Exploding Data Firehose appeared first on JAXenter.
