
Structured, Semi-Structured and Unstructured Data

The document explains the three types of data: structured, semi-structured, and unstructured. Structured data is highly organized and originates from relational databases, while semi-structured data includes formats like JSON and XML, and unstructured data lacks a predefined format, encompassing multimedia files and social media data. Each type of data has specific examples and applications within business operations and analytics.

Uploaded by ishaanvnit
Copyright: © All Rights Reserved

Structured Data

Structured data is highly organized and formatted to fit into traditional databases or
spreadsheets. It follows a consistent schema and is typically stored in rows and columns.

Structured data typically originates from relational database management systems (RDBMS)
such as SQL Server, Oracle, MySQL, and PostgreSQL, and it can be ingested into OneLake.
This includes tables, indexes, and views.
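
To make the idea of a fixed schema concrete, here is a minimal sketch using Python's built-in sqlite3 module. SQLite stands in for the RDBMS products named above, and the table, index, view, and column names are hypothetical, chosen only for illustration.

```python
import sqlite3

# In-memory database; all object names below are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (
        order_id   INTEGER PRIMARY KEY,
        customer   TEXT NOT NULL,
        amount     REAL NOT NULL,
        order_date TEXT NOT NULL
    );
    CREATE INDEX idx_orders_customer ON orders (customer);
    CREATE VIEW large_orders AS
        SELECT order_id, customer, amount FROM orders WHERE amount >= 100;
""")

# Every row must fit the declared columns.
conn.executemany(
    "INSERT INTO orders (customer, amount, order_date) VALUES (?, ?, ?)",
    [("Alice", 120.0, "2024-01-05"), ("Bob", 35.5, "2024-01-07")],
)

# A row that violates the schema (missing customer) is rejected.
try:
    conn.execute(
        "INSERT INTO orders (customer, amount, order_date) VALUES (?, ?, ?)",
        (None, 10.0, "2024-01-08"),
    )
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

large = conn.execute("SELECT customer, amount FROM large_orders").fetchall()
print(large, rejected)
```

The rejected insert shows what "consistent schema" means in practice: the database enforces the structure, not the application.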

In terms of content, structured data is often referred to as transaction data or
operational data. Transaction data is generated by business transactions such as sales
records, financial transactions, and order processing. Operational data is produced by
day-to-day business operations and includes inventory levels, human resources records,
and data from customer relationship management (CRM) systems.

Besides raw data, which arrives unchanged from other systems and sources, another common
example of structured data is derived data. Derived data is processed and transformed
from raw data to provide insights and analytics. It includes aggregated data (data
summarized from various sources, such as monthly sales totals, average customer ratings,
and summary statistics) and analytical data (processed data ready for analysis, including
data cubes, dashboards, and reports).
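
As a rough illustration of derived data, the sketch below aggregates a few raw sales records into monthly totals and an average. The records and field names are invented for the example and do not come from any real system.

```python
from collections import defaultdict

# Raw data: one record per sale (hypothetical values).
raw_sales = [
    {"date": "2024-01-05", "amount": 120.0},
    {"date": "2024-01-20", "amount": 80.0},
    {"date": "2024-02-02", "amount": 50.0},
]

# Derived data: totals per month, keyed by the "YYYY-MM" prefix of the date.
totals = defaultdict(float)
for sale in raw_sales:
    month = sale["date"][:7]
    totals[month] += sale["amount"]
monthly_totals = dict(totals)

# Derived data: a simple summary statistic over all sales.
average_sale = sum(s["amount"] for s in raw_sales) / len(raw_sales)

print(monthly_totals)
print(round(average_sale, 2))
```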

Semi-Structured Data

Semi-structured data does not follow a rigid schema but contains tags or markers to
separate data elements, making it more flexible than structured data.

Typical examples include:

JSON and XML Files

Data formatted in JavaScript Object Notation (JSON) and Extensible Markup Language (XML),
often used for web applications and APIs.
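
The sketch below shows, under made-up sample data, how tags and keys separate the data elements while individual records remain free to differ. It uses only Python's standard json and xml.etree.ElementTree modules. The customer fields are hypothetical.

```python
import json
import xml.etree.ElementTree as ET

# Two JSON records with different keys: no rigid schema is enforced.
json_text = (
    '[{"id": 1, "name": "Alice", "email": "alice@example.com"},'
    ' {"id": 2, "name": "Bob", "tags": ["vip"]}]'
)
records = json.loads(json_text)
# .get() returns None where a record lacks the key.
emails = [r.get("email") for r in records]

# The same idea in XML: elements are marked by tags, and some are optional.
xml_text = (
    "<customers>"
    "<customer id='1'><name>Alice</name></customer>"
    "<customer id='2'><name>Bob</name><tag>vip</tag></customer>"
    "</customers>"
)
root = ET.fromstring(xml_text)
names = [c.findtext("name") for c in root.findall("customer")]
# findtext() returns None when the child element is absent.
tags = [c.findtext("tag") for c in root.findall("customer")]

print(emails, names, tags)
```

Code that consumes such data has to tolerate missing fields, which is the practical cost of the added flexibility.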

Log Files

System and application logs generated by servers, applications, and network devices. These
files often contain valuable insights for monitoring and troubleshooting.
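
Log lines usually follow a loose pattern rather than a schema. The following sketch assumes a hypothetical "date time LEVEL message" layout, which is not a standard, and uses a regular expression to pull out the fields, setting aside lines that do not match.

```python
import re

# Assumed layout: "<date> <time> <LEVEL> <message>"; real logs vary widely.
LOG_PATTERN = re.compile(
    r"^(?P<timestamp>\S+ \S+) (?P<level>[A-Z]+) (?P<message>.*)$"
)

log_lines = [
    "2024-01-05 10:15:32 INFO Service started",
    "2024-01-05 10:16:01 ERROR Connection to database lost",
    "free-form line without the expected prefix",
]

parsed, unparsed = [], []
for line in log_lines:
    match = LOG_PATTERN.match(line)
    if match:
        parsed.append(match.groupdict())
    else:
        unparsed.append(line)

# Filter for troubleshooting: keep only error entries.
errors = [entry for entry in parsed if entry["level"] == "ERROR"]
print(errors)
print(unparsed)
```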

Sensor Data

Data from Internet of Things (IoT) devices, including temperature readings, humidity levels,
and other environmental sensors.

Email

Email messages, which combine metadata (sender, recipients, subject, timestamps) with
free-form content in the message body.
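
A short sketch with Python's standard email package shows the split between metadata and content. The message itself is invented for the example.

```python
from email import message_from_string

# A minimal, made-up message: headers, a blank line, then the body.
raw_email = (
    "From: alice@example.com\n"
    "To: bob@example.com\n"
    "Subject: Quarterly report\n"
    "\n"
    "Hi Bob,\n"
    "the report is attached. Let me know what you think.\n"
)

msg = message_from_string(raw_email)

# Headers behave like key-value metadata.
metadata = {"from": msg["From"], "to": msg["To"], "subject": msg["Subject"]}

# The body of a non-multipart message is plain, free-form text.
body = msg.get_payload()

print(metadata)
print(body)
```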


Unstructured Data

Unstructured data lacks a predefined format, making it the most challenging type of data to
store and analyze. OneLake can handle vast amounts of unstructured data efficiently.

Unstructured data typically comes from the following sources.

Multimedia Files

Images, videos, and audio files used in media production, marketing, and communications.

Documents

Text documents, PDFs, presentations, and other file types used in business operations and
communications.

Social Media Data

Data from social media platforms such as posts, comments, likes, and shares.

Web Data

Content scraped from websites, including HTML, CSS, and JavaScript files.
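
To analyze scraped web content, the readable text usually has to be separated from markup, styles, and scripts first. The sketch below is one possible approach using Python's standard html.parser module. The TextExtractor class and the sample page are hypothetical and written only for this example.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep non-empty text that is not inside script or style.
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


html = (
    "<html><head><style>p {color: red}</style></head>"
    "<body><h1>News</h1><p>Sales rose in May.</p>"
    "<script>var x = 1;</script></body></html>"
)

extractor = TextExtractor()
extractor.feed(html)
text = " ".join(extractor.chunks)
print(text)
```

The extracted text is still unstructured. It needs further processing, such as search indexing or text analytics, before it can be queried like a table.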

Source: Fundamentals of Microsoft Fabric, Nikola Ilic, Ben Weissman
