
Chapter 2 Data Collection and Preparation

This document discusses data collection and preparation. It describes various methods of data collection including from automated systems, sensors, external sources, social media, and surveys. It also covers ethical considerations around informed consent, voluntary participation, privacy, and only collecting relevant data. The document then discusses data cleaning and preprocessing in Excel, including fixing empty rows, duplicate data, inconsistent data types, and spelling errors.
Copyright
© All Rights Reserved

Data Collection and Preparation
Jhon Loyd D. Criste
METHODS OF
DATA COLLECTION
■ automated data collection functions built into business applications, websites and mobile apps;
■ sensors that collect operational data from industrial equipment, vehicles and other machinery;
■ collection of data from information service providers and other external data sources;
■ tracking of social media, discussion forums, review sites, blogs and other online channels;
■ surveys, questionnaires and forms, done online, in person or by phone, email or regular mail.
Ethical Considerations in
Data Collection
• Informed consent
• Voluntary participation
• Do no harm
• Confidentiality
• Only assess relevant components
Data Cleaning and
Preprocessing in Excel
Data cleaning is the process of fixing or removing incorrect, corrupted, incorrectly formatted, duplicate, or incomplete data within a dataset.
Types of problems
1. Empty rows

Problem:
• Empty rows break the data into multiple tables instead of one single table.
• A table should not contain any empty rows.

Treatment:
• Select the entire column and apply a filter.
• Filter for empty cells in any column, then delete the rows that are empty across all columns.
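Outside Excel, the same treatment can be scripted. The following is a minimal Python sketch, assuming the table has already been loaded as a list of rows, each row a list of cell values. The function names `is_empty` and `drop_empty_rows` and the sample data are hypothetical and used only for illustration.

```python
def is_empty(cell):
    """Treat None and whitespace-only text as an empty cell."""
    return cell is None or str(cell).strip() == ""


def drop_empty_rows(rows):
    """Return only the rows that contain at least one non-empty cell."""
    return [row for row in rows if not all(is_empty(cell) for cell in row)]


# Hypothetical sample table: a header, two records and two empty rows.
table = [
    ["Name", "City"],
    ["Ana", "Cebu"],
    ["", ""],
    ["Ben", "Davao"],
    [None, "   "],
]

cleaned = drop_empty_rows(table)
print(cleaned)
```

A row is dropped only when every cell in it is empty, so a record with a single missing value is kept for separate review.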
Types of problems
2. Duplicate Data

Problem:
• The entire record is the same as another record.

Treatment:
• Use Remove Duplicates.
• Highlight duplicates using Conditional Formatting.
• Filter the data using Advanced Filter.
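As a rough scripted analogue of these treatments, the sketch below flags exact duplicate records (similar in spirit to highlighting them) and then removes them while keeping the first occurrence. It assumes each record is a list of cell values; the names `find_duplicates` and `remove_duplicates` and the sample records are hypothetical.

```python
def find_duplicates(rows):
    """Return the indexes of rows that exactly repeat an earlier row."""
    seen = set()
    duplicate_indexes = []
    for index, row in enumerate(rows):
        key = tuple(row)  # the whole record must match, not just one field
        if key in seen:
            duplicate_indexes.append(index)
        else:
            seen.add(key)
    return duplicate_indexes


def remove_duplicates(rows):
    """Keep the first occurrence of each record and drop exact repeats."""
    duplicates = set(find_duplicates(rows))
    return [row for index, row in enumerate(rows) if index not in duplicates]


# Hypothetical sample records; the third repeats the first exactly.
records = [
    ["Ana", "Cebu"],
    ["Ben", "Davao"],
    ["Ana", "Cebu"],
    ["Ana", "Manila"],
]

flagged = find_duplicates(records)
unique_records = remove_duplicates(records)
print(flagged)
print(unique_records)
```

Only records whose every field matches are treated as duplicates, so `["Ana", "Manila"]` is kept even though the name repeats.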
Types of problems
3. Data Types and Data Consistency
Problem:
• Some data is spelled incorrectly.
• Some columns may have inconsistent data types.

Treatment:
• Find and Replace
• Text to Columns
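These treatments can also be approximated in a script. The sketch below is illustrative only: `fix_spelling` stands in for Find and Replace, `split_column` for splitting one delimited cell into several, and `to_number` for bringing a mixed column to one numeric type. The corrections map, function names and sample values are all hypothetical.

```python
# Hypothetical map of known misspellings to their corrected form.
CORRECTIONS = {"Cebuu": "Cebu", "Dvao": "Davao"}


def fix_spelling(value, corrections=CORRECTIONS):
    """Trim the value and replace it if it is a known misspelling."""
    cleaned = value.strip()
    return corrections.get(cleaned, cleaned)


def split_column(value, delimiter=","):
    """Split one delimited cell into several cells, trimming spaces."""
    return [part.strip() for part in value.split(delimiter)]


def to_number(value):
    """Convert numeric text to a float; return None if it is not numeric."""
    try:
        # Remove thousands separators such as the comma in "1,200".
        return float(str(value).replace(",", ""))
    except ValueError:
        return None


cities = [fix_spelling(city) for city in ["Cebuu", " Davao ", "Dvao"]]
name_parts = split_column("Criste, Jhon")
amounts = [to_number(cell) for cell in ["1,200", 35, "n/a"]]
print(cities, name_parts, amounts)
```

Returning `None` for values that cannot be converted makes inconsistent cells easy to find and review instead of silently guessing a number.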