Simplified Notes on Data Science for MBA Exams
1. **What is Data?**
Data refers to facts, figures, or observations that can be collected, measured, and analyzed. Raw data is
unprocessed and has little meaning on its own, while processed data has been cleaned and organized so that
it is useful. Data can come from people, machines, or both.
2. **What is Information?**
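Information is data that has been organized or structured to make it meaningful and useful for
decision-making. It gives us insights that help solve problems. For example, a list of individual sales
transactions is data, while the monthly sales total calculated from it is information.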
3. **Importance of Data:**
- Data helps make better decisions.
- It helps to understand customers and market trends.
- It improves business processes and solves problems.
4. **Types of Data:**
- **Structured Data**: Organized in a fixed format of rows and columns, making it easy to search and analyze (e.g., database tables, spreadsheets).
- **Unstructured Data**: Data that doesn't have a fixed format, such as free text or multimedia (images, videos).
- **Semi-structured Data**: Data with some organization, such as tags or labeled fields, but not in a rigid format (e.g., emails, XML files).
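The difference between the three types is easiest to see in how each one is read. The sketch below is only an illustration: the customer rows, the email-like record, and the review text are made-up values, not real data.

```python
import json

# Structured: every record has the same fixed fields, like rows in a table.
structured_rows = [
    {"customer_id": 1, "city": "Pune", "spend": 1200},
    {"customer_id": 2, "city": "Delhi", "spend": 800},
]

# Semi-structured: labeled fields, but records may differ in shape.
semi_structured = json.loads(
    '{"from": "a@example.com", "subject": "Order", "tags": ["urgent"]}'
)

# Unstructured: free text with no predefined fields.
unstructured_text = "The delivery was late but the product quality was great."

# Structured data can be queried directly by column name.
total_spend = sum(row["spend"] for row in structured_rows)

# Semi-structured data is read by key, and a key may be absent.
subject = semi_structured.get("subject")

# Unstructured data needs searching or text analysis to extract meaning.
mentions_late = "late" in unstructured_text.lower()

print(total_spend, subject, mentions_late)
```

Structured data answers a question with a one-line lookup, while unstructured data has to be searched or interpreted first.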
5. **Data Processing Cycle:**
The steps involved in transforming raw data into useful information:
- **Data Acquisition**: Collecting data.
- **Data Preparation**: Cleaning and organizing data.
- **Data Input**: Entering data into systems.
- **Data Processing**: Analyzing the data.
- **Data Output**: Generating meaningful insights.
- **Data Storage**: Storing the processed data for future use.
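The six steps above can be sketched in a few lines of Python. This is a minimal illustration, assuming a small list of hypothetical daily sales figures; a real system would read from files or databases and store results there.

```python
import json
from statistics import mean

# 1. Acquisition: raw sales figures as collected (hypothetical values).
raw = [" 120", "95 ", "", "n/a", "130"]

# 2. Preparation: strip whitespace, drop blank and non-numeric entries.
cleaned = [s.strip() for s in raw if s.strip().isdigit()]

# 3. Input: convert the cleaned text into numbers the program can use.
sales = [int(s) for s in cleaned]

# 4. Processing: compute a summary figure.
average = mean(sales)

# 5. Output: present the result in a readable form.
report = f"Average daily sales: {average}"
print(report)

# 6. Storage: serialize the processed data for future use
#    (kept in a string here; it could be written to a file instead).
stored = json.dumps({"sales": sales, "average": average})
```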
6. **Data Analysis Techniques:**
- **Statistical Analysis**: Summarizing data with averages and percentages, and testing hypotheses.
- **Machine Learning**: Using algorithms to find patterns and make predictions.
- **Data Mining**: Exploring large datasets to find hidden patterns.
- **Data Visualization**: Using charts and graphs to represent data.
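A small example may help connect these techniques. The sketch below uses hypothetical monthly figures and plain Python: it computes an average and a percentage (statistical analysis), fits a straight line by least squares as a very simple stand-in for a predictive model, and prints a text bar chart (visualization). Real projects would normally use dedicated libraries for each step.

```python
from statistics import mean

# Hypothetical data: month number and sales (in thousands).
months = [1, 2, 3, 4]
sales = [3, 5, 7, 9]

# Statistical analysis: average, and percentage of months above average.
avg_sales = mean(sales)
pct_above_avg = 100 * sum(s > avg_sales for s in sales) / len(sales)

# Simple prediction: fit sales = slope * month + intercept by least squares.
mx, my = mean(months), mean(sales)
slope = sum((x - mx) * (y - my) for x, y in zip(months, sales)) / sum(
    (x - mx) ** 2 for x in months
)
intercept = my - slope * mx
predicted_month_5 = slope * 5 + intercept

# Visualization: a text bar chart, one '#' per unit of sales.
for m, s in zip(months, sales):
    print(f"Month {m}: {'#' * s}")

print(avg_sales, pct_above_avg, predicted_month_5)
```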
7. **What is Data Science?**
Data science is the field of extracting useful insights from both structured and unstructured data
using scientific methods, processes, and algorithms. It helps in decision-making and solving
business problems.
8. **Applications of Data Science:**
Data science is used in various industries such as:
- **Healthcare**: Predicting diseases, improving patient care.
- **Finance**: Fraud detection, risk management.
- **Retail**: Personalizing shopping experiences, sales forecasting.
- **Social Media**: Personalized recommendations.
9. **Data Science Lifecycle:**
The steps involved in a typical data science project:
- **Data Collection**: Gathering data from various sources.
- **Data Cleaning**: Removing errors and inconsistencies from the data.
- **Data Analysis**: Applying statistical techniques to explore data.
- **Data Visualization**: Creating graphical representations of data.
- **Data Interpretation**: Drawing conclusions from the analyzed data.
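The lifecycle above can be walked through on a toy dataset. The survey records below are invented for illustration, and each stage is reduced to a line or two; this is a sketch of the flow, not a full workflow.

```python
from collections import defaultdict

# Collection: hypothetical customer ratings gathered from a survey.
records = [
    {"region": "North", "rating": 4},
    {"region": "south", "rating": 5},     # inconsistent capitalization
    {"region": "North", "rating": None},  # missing value
    {"region": "South", "rating": 3},
    {"region": "North", "rating": 2},
]

# Cleaning: drop records with missing ratings and normalize region names.
clean = [
    {"region": r["region"].title(), "rating": r["rating"]}
    for r in records
    if r["rating"] is not None
]

# Analysis: average rating per region.
by_region = defaultdict(list)
for r in clean:
    by_region[r["region"]].append(r["rating"])
averages = {region: sum(v) / len(v) for region, v in by_region.items()}

# Visualization: a text bar chart of the averages.
for region, avg in sorted(averages.items()):
    print(f"{region:<6} {'#' * round(avg)} {avg:.1f}")

# Interpretation: identify the region with the highest average rating.
best_region = max(averages, key=averages.get)
print("Highest-rated region:", best_region)
```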
10. **Career in Data Science:**
To become a data scientist, one needs to:
- Learn programming languages like Python and R.
- Gain expertise in machine learning, statistics, and data visualization.
- Build practical experience through internships or projects.