1. Introduction to Data:
Data refers to raw facts and figures that are collected for analysis. It can come in various forms, such as numbers, text, and images.

Artificial Intelligence has three broad domains: Data Science, Computer Vision, and Natural Language Processing.

Data Science: An interdisciplinary field that combines techniques, algorithms, and processes to extract meaningful insights from structured and unstructured data. It uses tools from mathematics, statistics, and computer science, together with domain knowledge, to analyze and interpret data.

Computer Vision: A field of artificial intelligence (AI) and computer science that enables computers and systems to interpret and understand visual information from the world, such as images, videos, or other visual inputs.

Natural Language Processing (NLP): A branch of AI and computational linguistics that focuses on the interaction between computers and human (natural) languages. The goal of NLP is to enable computers to understand, interpret, and generate human language in a way that is both meaningful and useful. This involves making sense of spoken or written language so that machines can process and respond to human communication.

Types of Data:
Structured Data: Organized data in rows and columns (e.g., spreadsheets or databases).
Unstructured Data: Data that does not have a pre-defined format (e.g., social media posts, images, audio files).
Qualitative Data: Descriptive, categorical data. It includes attributes, labels, or categories that describe characteristics, such as color, name, type, or preference.
Quantitative Data: Numerical data that can be measured or counted (e.g., height, weight, age).

2. Data Collection and Sources:
Data can be collected from different sources, such as surveys, sensors, websites, databases, and social media platforms. Students learn how to gather and organize data in a systematic manner.

3. Data Representation:
Forms of data: Text, numbers, images, video, and audio.
Tabular Representation: Organizing data in tables for easier analysis.
Graphical Representation: Visual tools like bar graphs, pie charts, histograms, and line graphs represent data in a way that is easy to understand.
Charts and Graphs:
Bar Graphs: Used for comparing quantities.
Pie Charts: Used for showing proportions.
Histograms: Used for showing the frequency distribution of data.
Line Graphs: Used for showing trends over time.

4. Data Analysis:
Descriptive Statistics: Summarizing data using the mean, median, mode, range, and standard deviation.
Data Cleaning: Removing or correcting errors in the data to improve its quality.
Data Processing: Organizing and transforming raw data into a form suitable for analysis.

5. Basic Probability and Statistics:
Understanding how to calculate probabilities and apply basic statistical techniques to interpret data, using measures of central tendency (mean, median, mode) and dispersion (range, variance, standard deviation).

6. Data Interpretation (Explanation) and Decision Making:
Interpreting the results of the analysis and making data-driven decisions. Understanding patterns and trends in data that can lead to informed conclusions or actions.

7. Applications of Data Science:
In Everyday Life: Examples include recommendations on streaming services (like Netflix), shopping suggestions (Amazon), and weather forecasts.
In Different Fields: Data science is used in industries such as healthcare (predicting diseases), finance (fraud detection), education (personalized learning), and marketing (customer behavior analysis).

8. Introduction to Programming:
Basic programming concepts may be introduced, using tools such as Python, R, or spreadsheets to manipulate and analyze data.

Teaching Methodology: This can include:
Hands-on activities, such as working with spreadsheets or collecting data and analyzing it.
Simple coding exercises for understanding data analysis using tools like Python.
Group discussions on the use of data science in real-world problems.

OTHER APPLICATIONS OF DATA SCIENCE

Data Science has many real-life application areas, most importantly within Artificial Intelligence. It deals with collecting data, analyzing it, and then building a machine learning model to perform tasks in a specific field.

1. Data Analysis in Sports: Data science is used to analyze player performance, predict match outcomes, and improve team strategies.

2. E-commerce and Online Shopping: Online stores use data science to recommend products to customers based on their browsing history and past purchases.
Example: Amazon or Flipkart.

3. Healthcare and Medical Diagnosis: Data science is used to analyze medical data such as X-rays, blood tests, and patient histories to diagnose diseases and recommend treatments.
Example: In detecting diseases like cancer, data science algorithms analyze medical images to find signs of tumors.

4. Social Media: Platforms like Facebook, Instagram, and Twitter use data science to analyze user behavior and personalize the content shown to each user.
Example: Data science helps determine which posts, ads, and videos appear in a user's feed based on their interactions and interests.

5. Weather Forecasting: Data science helps meteorologists predict weather by analyzing patterns in historical data, satellite images, and sensor data.
Example: Predicting the weather for the upcoming days (temperature, rainfall, etc.) based on large datasets of weather observations.

6. Traffic Management and Navigation: Data science is used in traffic management systems and navigation apps like Google Maps to analyze real-time traffic data and optimize travel routes.
Example: Google Maps uses data science to show traffic patterns, suggest faster routes, and predict arrival times.

7. Banking and Finance: Banks use data science to detect fraudulent transactions and assess loan eligibility based on historical financial data.
Example: Banks analyze transaction data to flag unusual or fraudulent activity. They also use credit scoring models to determine whether someone qualifies for a loan.

8. Entertainment and Movies: Streaming services like Netflix and YouTube use data science to recommend movies, TV shows, and videos based on your watching history.
Example: When you watch a series on Netflix, it suggests other shows or movies that you might like based on your previous choices.

9. Education and Personalized Learning: Data science is used in education to analyze students' performance and tailor learning materials to their needs, helping them improve.
Example: Online learning platforms like Khan Academy or Coursera track students' progress and provide personalized recommendations based on their performance.

10. Agriculture and Farming: Data science is applied in agriculture to predict crop yields, detect diseases, and optimize farming practices.
Example: Using data on weather patterns and soil conditions, farmers can predict the best time to plant crops or detect early signs of plant disease.

11. Government and Public Services: Governments use data science to analyze public data in order to improve city infrastructure, traffic management, crime prevention, and public health.
Example: Smart cities use data science for better waste management, water supply monitoring, and traffic regulation.
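
The data cleaning step and the descriptive statistics named under Data Analysis and Basic Probability and Statistics (mean, median, mode, range, variance, standard deviation) could be tried out in Python roughly as follows. This is only a minimal sketch: the function names `clean` and `summarize` and the sample heights are made up for illustration, and the cleaning rule used here (drop anything that cannot be read as a number) is just one simple assumption about what an "error" in the data looks like.

```python
import statistics


def clean(raw_values):
    """Keep only entries that can be read as numbers (a simple cleaning rule)."""
    cleaned = []
    for item in raw_values:
        try:
            cleaned.append(float(item))
        except (TypeError, ValueError):
            # Missing values (None) and non-numeric text are dropped.
            continue
    return cleaned


def summarize(values):
    """Return measures of central tendency and dispersion for a list of numbers."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "range": max(values) - min(values),
        # Population variance and standard deviation (all values are treated
        # as the whole population, not a sample).
        "variance": statistics.pvariance(values),
        "std_dev": statistics.pstdev(values),
    }


# Hypothetical raw data: student heights in cm, with one missing value
# and one invalid entry. These numbers are invented for this example.
raw_heights = [150, "152", 152, None, 155, "n/a", 158, 160, 165]

heights = clean(raw_heights)
summary = summarize(heights)

for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Here the cleaning step comes first because a single missing or mistyped entry would otherwise stop the statistics from being computed at all.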
Revisiting the AI Project Cycle can help us better understand how AI projects are developed, from identifying the problem to deploying and maintaining the AI system. This cycle follows a structured approach to ensure that AI solutions are built effectively and meet the goals of the project.

1. Problem Definition: Understand the business or research problem and define clear goals for the AI project. In problem solving, we study a problem and try to find a solution for it. In this stage we work on the 4Ws: Who, What, Where, and Why.

2. Data Collection: After finalizing the aim of our project, we look at the various data features that affect the problem in some way. An AI-based project requires data for training and testing, so we need to understand what kind of data must be collected to work towards the aim.

3. Data Exploration: After putting the data in a database, we can arrange it in a meaningful manner to extract information from it.

4. Model Selection and Training: Modelling refers to the process of using statistical or computational techniques to make predictions, understand patterns, or draw conclusions based on data. The goal of modelling is usually to apply basic mathematical or statistical models to solve simple problems. These models can predict either numerical values or categorical outcomes based on the data.

5. Model: In this stage, we make our dataset. Once this dataset is ready, we train our model on it. The model development was done at multiple levels to arrive at the most suitable model. At the first level, we developed two sets of models using Multiple Linear Regression (MLR): the first with the actual available variables, and the second with one additional variable, i.e., the previous day's level of that particular pollutant (the dependent variable). At the second level, we developed the model using a Neural Network (NN). Once again, this was further divided into two parts.

6. Evaluation: Evaluation refers to the process of assessing how well a model performs after being trained on data.
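
As one possible illustration of the Evaluation stage, the sketch below compares a model's predictions with the actual values. The notes do not say which measures are used, so the three shown here (mean absolute error and root mean squared error for numerical predictions, accuracy for categorical predictions) are assumptions chosen because they are common and simple. All numbers and labels are invented for the example and do not come from any real model.

```python
import math


def mean_absolute_error(actual, predicted):
    """Average size of the prediction errors, ignoring their sign."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def root_mean_squared_error(actual, predicted):
    """Square root of the average squared error; large errors count more."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def accuracy(actual_labels, predicted_labels):
    """Fraction of categorical predictions that match the actual label."""
    correct = sum(1 for a, p in zip(actual_labels, predicted_labels) if a == p)
    return correct / len(actual_labels)


# Hypothetical numerical case: actual vs. predicted pollutant levels.
actual_levels = [80, 95, 110, 100]
predicted_levels = [85, 90, 110, 110]

mae = mean_absolute_error(actual_levels, predicted_levels)
rmse = root_mean_squared_error(actual_levels, predicted_levels)

# Hypothetical categorical case: actual vs. predicted air quality labels.
actual_labels = ["good", "poor", "good", "moderate"]
predicted_labels = ["good", "poor", "moderate", "moderate"]

acc = accuracy(actual_labels, predicted_labels)

print(f"MAE: {mae:.2f}")
print(f"RMSE: {rmse:.2f}")
print(f"Accuracy: {acc:.2f}")
```

Lower error values and higher accuracy indicate a better model. In practice these measures are computed on test data that the model did not see during training.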