

Figuring Out the Questions to Be Answered Simplifies the Search for Information Within Unstructured Data

As unstructured data continues to grow in volume and use, organizations are striving to manage it in ways that benefit both the business and its customers.

Unlike structured data (e.g., a database or table), which can be clearly defined in terms of its meaning, unstructured data has no firm reference points. Its irregularity and ambiguity make it difficult to define and understand.

Textual data (e.g., texts, emails, collaboration software and social media), non-textual data (e.g., photos, videos and audio files) and machine/sensor data (e.g., location information and footage from security cameras) offer a wealth of information that can help organizations conduct business more efficiently. The challenge is finding a way to use all of this data to produce results.

According to IBM estimates, 90 percent of the world’s data has been generated in the past two years, and 80 percent of new data is unstructured, growing at twice the rate of structured data. An expected 40 zettabytes (ZB) of data will be created by 2020, 300 times the amount in 2005.


With the pace of information accelerating, it’s worth a company’s time to learn how to extract and leverage this information. Companies that lack a current and accurate picture of their data risk compromising money, time, quality, security or growth.

Better Ways to Search

Part of the challenge with unstructured data is figuring out what is being presented in the data, says Scott Spangler, principal data scientist at IBM Almaden Research Center and author of “Mining the Talk: Unlocking the Business Value in Unstructured Information.”

Finding the information is not as simple as performing a search and retrieving a document. “Oftentimes, you do not know what to search for,” he says. “You need a process that will help you understand the right questions to ask, and that will be able to tell, in aggregate, here is what the data is telling you is the most likely choice you should make.”

Spangler notes that innovation is becoming more difficult: although research proceeds at the same pace as it did 20 to 30 years ago, it is not keeping up with a changing world. The solution is better research and new ways of mining the data. “It becomes clear that problems we used to think of as being just too hard to solve, now if we get enough data and we organize it the right way, it is possible,” he says.

Valerie Dennis Craven is a Minneapolis-based writer and editor.


