Unstructured Data Is Information
Unstructured data is information that doesn't conform to
conventional data models and thus typically isn't a good fit for a mainstream
relational database. Thanks to the emergence of alternative platforms for
storing and managing such data, it is increasingly prevalent in IT systems and
is used by organizations in a variety of business intelligence
and analytics applications.
One of the most common types of unstructured data is text. Unstructured text is
generated and collected in a wide range of forms, including Word documents, email
messages, PowerPoint presentations, survey responses, transcripts of call center
interactions, and posts from blogs and social media sites.
Other types of unstructured data include images, audio and video files. Machine data is
another category, one that's growing quickly in many organizations. For example, log
files from websites, servers, networks and applications -- particularly mobile ones --
yield a trove of activity and performance data. In addition, companies increasingly
capture and analyze data from sensors on manufacturing equipment and other devices
connected to the internet of things (IoT).
Predictive maintenance is an emerging analytics use case for unstructured data. For
example, manufacturers can analyze sensor data to try to detect equipment failures
before they occur in plant-floor systems or finished products in the field. Energy
pipelines can also be monitored and checked for potential problems using
unstructured data collected from IoT sensors.
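As a rough illustration of the idea, the sketch below flags sensor readings that deviate sharply from the readings just before them, using a rolling z-score. This is only one simple approach, not a description of how any particular manufacturer does it; the function name `flag_anomalies`, the window size, the threshold and the vibration values are all assumptions made for the example.

```python
from statistics import mean, stdev

def flag_anomalies(readings, window=5, threshold=3.0):
    """Return indices of readings that deviate sharply from the preceding window."""
    flagged = []
    for i in range(window, len(readings)):
        history = readings[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        # Skip perfectly flat windows to avoid dividing by zero.
        if sigma == 0:
            continue
        # Flag the reading if it sits more than `threshold` standard deviations from the mean.
        if abs(readings[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged

# Hypothetical vibration readings; the spike at index 7 is invented for illustration.
vibration = [0.51, 0.49, 0.50, 0.52, 0.48, 0.50, 0.51, 0.95, 0.50, 0.49]
print(flag_anomalies(vibration))
```

In practice, a flagged reading would more likely prompt an inspection than be treated as proof of an impending failure, and production systems generally combine many signals rather than a single threshold.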
Analyzing log data from IT systems highlights usage trends, identifies capacity
limitations and pinpoints the cause of application errors, system crashes, performance
bottlenecks and other issues. Unstructured data analytics also aids
regulatory compliance efforts, particularly in helping organizations understand what
corporate documents and records contain.
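To make the log analysis use case more concrete, here is a minimal sketch that counts error entries per component in a handful of log lines. The line layout (`<date> <time> <LEVEL> <component>: <message>`), the component names and the sample entries are assumptions invented for the example; real log formats vary widely and usually need their own parsing rules.

```python
import re
from collections import Counter

# Assumed log layout: "<date> <time> <LEVEL> <component>: <message>".
LINE_PATTERN = re.compile(
    r"^(\S+) (\S+) (?P<level>[A-Z]+) (?P<component>[\w.-]+): (?P<message>.*)$"
)

def summarize_errors(lines):
    """Count ERROR entries per component, skipping lines that don't match the layout."""
    counts = Counter()
    for line in lines:
        match = LINE_PATTERN.match(line)
        if match and match.group("level") == "ERROR":
            counts[match.group("component")] += 1
    return counts

# Hypothetical log entries, invented for illustration.
sample_log = [
    "2024-01-15 10:02:11 INFO auth: user login succeeded",
    "2024-01-15 10:02:14 ERROR payments: timeout contacting gateway",
    "2024-01-15 10:02:19 ERROR payments: timeout contacting gateway",
    "2024-01-15 10:03:02 WARN cache: eviction rate high",
    "2024-01-15 10:03:40 ERROR auth: token validation failed",
    "malformed line without the expected fields",
]

print(summarize_errors(sample_log).most_common())
```

Grouping errors by component is one simple way to see where problems cluster; the same pattern could be extended to group by hour or by message text to surface usage trends.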
Analyst firms report that the vast majority of new data being generated is
unstructured. In the past, that type of information often was locked away in siloed
document management systems, individual manufacturing devices and the like --
making it what's known as dark data, unavailable for analysis.
A variety of analytics techniques and tools are used to analyze unstructured data in big
data environments. Text analytics tools look for patterns, keywords and sentiment in
textual data. At a more advanced level, natural language processing, a form
of artificial intelligence, seeks to understand meaning and context in text and
human speech, increasingly with the aid of deep learning algorithms that use neural
networks to analyze data. Other techniques that play roles in unstructured data
analytics include data mining, machine learning and predictive analytics.
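The keyword and sentiment side of text analytics can be sketched with nothing more than word counts and small word lists. The example below is a deliberately simplified illustration, not how commercial text analytics or natural language processing tools work internally; the word lists, function names and sample comments are all assumptions made for the example.

```python
import re
from collections import Counter

# Tiny illustrative word lists; real tools rely on far larger lexicons or trained models.
POSITIVE = {"great", "helpful", "fast", "love"}
NEGATIVE = {"slow", "broken", "rude", "crash"}
STOPWORDS = {"the", "a", "is", "was", "and", "but", "it", "i", "to"}

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def top_keywords(texts, n=3):
    """Return the n most frequent non-stopword tokens across all texts."""
    counts = Counter(w for t in texts for w in tokenize(t) if w not in STOPWORDS)
    return counts.most_common(n)

def sentiment_score(text):
    """Positive-word count minus negative-word count; above zero leans positive."""
    words = tokenize(text)
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

# Hypothetical survey responses, invented for illustration.
comments = [
    "The support agent was helpful and fast",
    "The app is slow and the checkout is broken",
    "I love it but checkout was slow",
]

print(top_keywords(comments, 2))
for comment in comments:
    print(sentiment_score(comment), comment)
```

A lexicon-based score like this ignores negation, sarcasm and context, which is part of why more advanced approaches turn to natural language processing and machine learning.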