Introduction To Data Science
DATA SCIENCE
What is Data?
Data is a collection of information.
One purpose of Data Science is to structure data, making it interpretable and easy to work with.
Data can be categorized into two groups:
• Structured data
• Unstructured data
Data Science is a combination of multiple disciplines that uses statistics, data analysis, and machine learning to analyze data and to
extract knowledge and insights from it.
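The difference between structured and unstructured data can be illustrated with a minimal Python sketch. The order records, the feedback sentence, and the `extract_order` helper are hypothetical and invented for illustration only; real unstructured data usually needs far more robust handling than one hand-written pattern.

```python
import re

# Structured data: every record follows the same schema (hypothetical fields).
structured_orders = [
    {"order_id": 1, "customer": "Ada", "amount": 120.0},
    {"order_id": 2, "customer": "Ben", "amount": 75.5},
]

# Unstructured data: free text with no predefined fields (hypothetical example).
unstructured_feedback = "Order 3 from Cleo arrived late, but the 42.25 refund was quick."


def extract_order(text):
    """Pull a structured record out of free text using a hand-written pattern."""
    match = re.search(r"Order (\d+) from (\w+).*?(\d+\.\d+)", text)
    if match is None:
        return None
    order_id, customer, amount = match.groups()
    return {"order_id": int(order_id), "customer": customer, "amount": float(amount)}


# Structured data can be queried directly.
total_amount = sum(order["amount"] for order in structured_orders)

# Unstructured data has to be given structure before it can be queried.
extracted = extract_order(unstructured_feedback)
```

Once the free text has been turned into a record with the same fields as the others, it can be analyzed alongside them.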
Where is Data Science Needed?
• Data Science is used in many industries today, e.g. banking, finance, consultancy, healthcare,
manufacturing, and government.
• Examples of where Data Science is needed:
• For route planning: to discover the best shipping routes
• To foresee delays for flights, ships, trains, etc. (through predictive analysis)
• To create promotional offers
• To find the best time to deliver goods
• To forecast a company's revenue for the next year
• To analyze the health benefits of training
• To predict who will win elections
How Does a Data Scientist Work?
• A Data Scientist must find patterns within the data. Before they can find the patterns, they must organize the data in a standard format.
• Here is how a Data Scientist works:
1. Ask the right questions - To understand the business problem.
2. Explore and collect data - From databases, web logs, customer feedback, etc.
3. Extract the data - Transform the data to a standardized format.
4. Clean the data - Remove erroneous values from the data.
5. Find and replace missing values - Check for missing values and replace them with a suitable value (e.g. an average value).
6. Normalize data - Scale the values to a practical range (e.g. 140 cm is shorter than 1.8 m, yet the number 140 is larger than 1.8, so values must be brought to a common scale before they are compared).
7. Analyze data, find patterns and make future predictions.
8. Represent the result - Present the result with useful insights in a way the "company" can understand.
9. Storytelling through data visualization.
Other skills include critical thinking, problem solving, and communication.
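Steps 3 to 6 above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the height measurements, the plausible range of 0.5 m to 2.5 m, and all function names are hypothetical, and min-max scaling is just one of several common ways to normalize values.

```python
from statistics import mean

# Hypothetical raw heights in mixed units; None marks a missing value.
raw_heights = [("140", "cm"), ("1.8", "m"), (None, "m"), ("165", "cm"), ("-5", "m")]


def to_metres(value, unit):
    """Step 3: transform every measurement to one standard unit (metres)."""
    if value is None:
        return None
    number = float(value)
    return number / 100 if unit == "cm" else number


def clean(values, low=0.5, high=2.5):
    """Step 4: mark values outside an assumed plausible range as missing."""
    return [v if v is not None and low <= v <= high else None for v in values]


def fill_missing(values):
    """Step 5: replace missing values with the average of the known values."""
    known = [v for v in values if v is not None]
    average = mean(known)
    return [average if v is None else v for v in values]


def min_max_scale(values):
    """Step 6: scale values to the range 0..1 (assumes they are not all equal)."""
    lowest, highest = min(values), max(values)
    return [(v - lowest) / (highest - lowest) for v in values]


heights_m = [to_metres(value, unit) for value, unit in raw_heights]
heights_clean = clean(heights_m)
heights_filled = fill_missing(heights_clean)
heights_scaled = min_max_scale(heights_filled)
```

After conversion, 140 cm becomes 1.4 m and correctly compares as smaller than 1.8 m, which is the point of the example in step 6.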
What Are Data Science, Data Analytics, and Data Analysis?
• Data analytics is primarily concerned with putting historical data into perspective,
whereas data science is more concerned with machine learning and predictive
modeling.
• Data science is a multidisciplinary approach to solving analytically complicated
business problems that includes algorithm creation, data inference, and predictive
modeling. Data analytics, on the other hand, encompasses a number of distinct
fields of statistics and analysis.
• Data analysis entails answering specific questions in order to make better business
decisions. It uncovers actionable insights from existing data.
Career Opportunities In Data Science
o Data Analyst
o Data Scientist
o Data Engineer
o DataOps Engineer
o Business Intelligence Developer / Analyst / Engineer
o Data Journalist
o Data Researcher
o Machine Learning Engineer
o BI Engineer
o Database Manager
o Analytics Engineer
o Data Technical Writer
o Data Specialist
o Data Visualizer / Storyteller
o Data Modeler
Types of Data Analysis
• Descriptive Analysis.
• Exploratory Analysis.
• Inferential Analysis.
• Predictive Analysis.
• Causal Analysis.
• Mechanistic Analysis.
To read more about these types of data analysis, use the link below: Types of Data Analysis
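Two of the types listed above, descriptive and predictive analysis, can be contrasted with a short Python sketch. The revenue figures are hypothetical, and a straight-line trend fitted by ordinary least squares is only one simple way to make a prediction; it is used here purely for illustration.

```python
from statistics import mean, median

# Hypothetical yearly revenue figures (in thousands); not real data.
years = [2019, 2020, 2021, 2022, 2023]
revenue = [100.0, 110.0, 120.0, 130.0, 140.0]

# Descriptive analysis: summarize what the existing data shows.
summary = {"mean": mean(revenue), "median": median(revenue), "max": max(revenue)}


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    covariance = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    variance = sum((x - x_bar) ** 2 for x in xs)
    slope = covariance / variance
    return slope, y_bar - slope * x_bar


# Predictive analysis: extend the fitted trend to a year not yet observed.
slope, intercept = fit_line(years, revenue)
forecast_2024 = slope * 2024 + intercept
```

The descriptive summary only restates the past, while the forecast makes a claim about the future that depends on the trend continuing.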
Tools and Skill Set Required
• Soft skills include:
A) Critical thinking and the ability to solve problems
B) Communication skills
• Technical skills include:
A) Programming skills (Python, SQL, Git)
B) Visualization skills (Power BI, Tableau, Power Pivot, Qlik Sense, etc.)
For the purposes of this class, we will use Excel, Power BI, and SQL.