Intro To AI - Course Notes
Ned Krastev
Intro to AI
Table of Contents
1. Getting started
2. Data is essential for building AI
3. Key AI techniques
4. Important AI branches
5. Understanding Generative AI
Abstract
These notes explain the difference between AI, data science, and machine learning, and discuss the concepts of weak versus strong AI. The data section covers structured and unstructured data, as well as labeled and unlabeled data. Subsequent sections cover key AI techniques, important AI branches, and generative AI models, and conclude with discussions on AI job roles, ethics in AI, and the future of AI.
1. Getting started
Period of Stagnation
• 1960s and 70s - AI Winter: Challenges due to limited technology and data
availability led to reduced funding and interest, slowing AI progress.
Technological Resurgence
• 1997 - IBM’s Deep Blue: Deep Blue defeats world chess champion Garry
Kasparov, reigniting interest in AI.
• Late 1990s and Early 2000s: A surge in computer power and the rapid
expansion of the Internet provide the necessary resources for advanced AI
research.
1.3 Demystifying AI, Data science, Machine learning, and Deep learning
Data Science
• Relationship with AI and ML: While data science includes AI and machine
learning, it also encompasses a broader set of statistical methods.
• Tools and Methods: Beyond machine learning, data scientists employ
traditional statistical methods like data visualization and statistical inference
to extract insights from data.
• Applications:
o A data scientist might use ML algorithms to predict future client
orders based on historical data.
o Alternatively, they could perform an analysis correlating client orders
with store visits to derive actionable business insights.
• Scope: Data science is not only about creating predictive models but also
about understanding and visualizing data patterns to support decision-
making.
Narrow AI
• Definition: Narrow AI refers to artificial intelligence systems that are
designed to handle specific tasks.
• Examples and Applications: An example we discussed is a machine-
learning algorithm that predicts movie recommendations based on a user's
viewing history. This type of AI is pervasive in our daily lives and beneficial
for businesses, handling defined and narrow tasks efficiently.
Semi-Strong AI
• Introduction of ChatGPT and GPT-3.5: OpenAI's release of ChatGPT, powered by
the GPT-3.5 model, in 2022 marked a significant advancement towards semi-strong
AI.
• Capabilities: Unlike narrow AI, ChatGPT can perform a broad range of tasks:
o Writing jokes
o Proofreading texts
o Recommending actions
o Creating visuals and formulas
o Solving mathematical problems
• Relation to Turing Test: ChatGPT's ability to generate human-like responses
aligns with Alan Turing's imitation game concept, suggesting it can pass the
Turing Test, thus classifying it as semi-strong AI.
Types of Data
• Structured Data:
o Definition: Organized into rows and columns, making it easy to
analyze.
o Example: A sales transactions spreadsheet with predefined fields in
Excel.
• Unstructured Data:
o Definition: Lacks a defined structure and cannot be organized into
rows and columns. Includes formats like text files, images, videos,
and audio.
o Prevalence: Represents 80-90% of the world’s data, making it the
dominant form of data.
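To make the contrast concrete, here is a minimal sketch in Python (assuming the pandas library is available; the sales figures and review text are invented for illustration):

```python
import pandas as pd

# Structured data: predefined fields organized into rows and columns
sales = pd.DataFrame({
    "order_id": [1001, 1002, 1003],
    "product": ["keyboard", "mouse", "monitor"],
    "amount_usd": [49.99, 19.99, 189.00],
})
print(sales["amount_usd"].sum())  # simple aggregation works out of the box

# Unstructured data: free text with no fixed schema
review = "The keyboard arrived quickly and feels great, but the packaging was damaged."
# There are no rows or columns here; extracting meaning requires NLP techniques
# rather than a simple query.
```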
Labeled Data
• Definition: Labeled data involves tagging each item in a dataset with
specific labels that the AI model will learn to recognize and predict. For
example, photos can be classified as 'dog' or 'not a dog', and comments
can be labeled as positive, negative, or neutral.
• Process: This method requires a meticulous review and classification of
each data item, which can be time-consuming and costly.
• Benefits: Labeled data significantly enhances the accuracy and reliability of
AI models, making them more effective in real-world applications.
Unlabeled Data
• Definition: Unlabeled data does not come pre-tagged with labels. The AI
model is tasked with analyzing the data and identifying patterns or
classifications on its own.
• Application: This approach is often applied to large datasets where manual
labeling is impractical due to resource constraints.
• Trade-offs: While less resource-intensive upfront, models trained on
unlabeled data might not achieve the same level of accuracy as those
trained on well-labeled datasets.
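A small illustration of the difference, using invented review snippets (the labels in the first list are exactly what a labeling effort would have to supply by hand):

```python
# Labeled data: each item carries the target the model should learn to predict
labeled_reviews = [
    {"text": "Great product, works perfectly", "label": "positive"},
    {"text": "Stopped working after two days", "label": "negative"},
    {"text": "It does the job", "label": "neutral"},
]

# Unlabeled data: raw items only; the model must discover structure on its own
unlabeled_reviews = [
    "Great product, works perfectly",
    "Stopped working after two days",
    "It does the job",
]
```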
Practical Implications
• Choice of Method: The decision between using labeled or unlabeled data
often depends on the specific requirements of the project, available
resources, and desired model performance.
Impact of Digitalization on AI
• Data Growth: The rapid expansion of online platforms, mobile technology,
cameras, social media, sensors, and Internet of Things devices has resulted
in a massive increase in data generation.
• Quality and Quantity: Not only has the volume of data grown, but the
quality has also improved significantly, as evidenced by the comparison of
old mobile phone photos to modern smartphone images.
Challenges of Managing Data
• Unstructured Data: A large portion of this newly generated data is
unstructured and too vast to manually label or organize effectively.
• Metadata as a Solution: To manage this overwhelming amount of data,
metadata becomes essential. Metadata is data about data, providing
summaries of key details such as asset type, author, creation date, usage,
file size, and more.
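As a rough sketch, metadata for a single unstructured asset might look like the following (the field names and values are illustrative, not a formal standard):

```python
# Metadata: data about data, summarizing key details of an unstructured asset
photo_metadata = {
    "asset_type": "image",
    "author": "marketing_team",
    "created": "2024-03-15",
    "file_size_mb": 4.2,
    "format": "jpeg",
    "tags": ["product_shot", "spring_campaign"],
}
# Searching or filtering millions of files by these fields is far easier than
# inspecting the contents of each file itself.
```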
3. Key AI techniques
Educational Takeaway
• Functionality of ML: This example illustrates how ML uses historical data to
learn patterns and make predictions about new, unseen situations.
• Future Lessons: The next lesson will cover different types of machine
learning models, further expanding on how these technologies are applied.
Supervised Learning
• Definition: Supervised learning uses labeled data to teach models how to
predict outputs based on input data.
• Classification: An example is identifying whether an image contains a dog
or not, using a dataset where each image is labeled as ‘dog’ or ‘not dog.’
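A minimal sketch of supervised classification, assuming scikit-learn is installed; the two numeric features and their values are invented stand-ins for whatever a real image pipeline would extract:

```python
from sklearn.linear_model import LogisticRegression

# Toy labeled dataset: two made-up features per image and a label of
# 1 ("dog") or 0 ("not dog"). Real systems would learn from pixel data.
X = [[0.9, 0.8], [0.8, 0.9], [0.2, 0.1], [0.1, 0.3]]
y = [1, 1, 0, 0]

model = LogisticRegression()
model.fit(X, y)                       # learn the mapping from labeled examples
print(model.predict([[0.85, 0.75]]))  # predict the label of a new, unseen item
```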
Unsupervised Learning
• Definition: Unsupervised learning involves analyzing data without pre-
labeled responses.
• Clustering: The model scans data to identify inherent patterns and group
similar items, such as differentiating between images of dogs and cats
without prior labels.
• Applications: Useful when labeling data is impractical or too costly, or when
the relationships within data are unknown. Examples include identifying
customer segments in a supermarket or determining popular property
types in real estate.
• Key Point: The algorithm autonomously discovers relationships and
patterns without direct input on the desired output.
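A minimal clustering sketch, assuming scikit-learn; the customer numbers are invented, and the cluster labels the algorithm assigns can differ from run to run:

```python
from sklearn.cluster import KMeans

# Customer data: [average basket size, visits per month]; note there are no labels
customers = [[5, 2], [6, 3], [40, 1], [45, 2], [7, 2], [42, 1]]

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
segments = kmeans.fit_predict(customers)  # the algorithm groups similar customers on its own
print(segments)  # e.g. two segments: small frequent baskets vs. large infrequent ones
```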
Reinforcement Learning
• Definition: Reinforcement learning teaches models to make decisions by
rewarding desired behaviors and penalizing undesired ones, optimizing for
a specific goal without labeled data.
• Application and Dynamics: Commonly used in robotics and
recommendation systems like Netflix’s. The model learns from direct
interaction with the environment, improving its recommendations based on
user feedback, such as views, skips, and ratings.
• Key Point: Operates on a trial-and-error basis within defined rules,
constantly adjusting actions based on feedback to achieve the best
outcomes.
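The trial-and-error loop can be sketched with tabular Q-learning on a toy environment. This is only a stand-in to show the reward-driven update, not how a production recommender like Netflix's is built; the learning rate, discount, and exploration rate are illustrative:

```python
import random

# Toy corridor: states 0..4, start at 0, reward only for reaching state 4.
# Actions: 0 = move left, 1 = move right.
N_STATES, ACTIONS = 5, [0, 1]
Q = [[0.0, 0.0] for _ in range(N_STATES)]   # estimated value of each action in each state
alpha, gamma, epsilon = 0.5, 0.9, 0.2       # learning rate, discount, exploration rate

for episode in range(200):
    state = 0
    while state != N_STATES - 1:
        # Explore occasionally, otherwise exploit the best-known action
        if random.random() < epsilon:
            action = random.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: Q[state][a])
        next_state = max(0, state - 1) if action == 0 else min(N_STATES - 1, state + 1)
        reward = 1.0 if next_state == N_STATES - 1 else 0.0
        # Q-learning update: reward now plus discounted best future value
        Q[state][action] += alpha * (reward + gamma * max(Q[next_state]) - Q[state][action])
        state = next_state

# Learned policy for states 0..3: always move right toward the reward
print([max(ACTIONS, key=lambda a: Q[s][a]) for s in range(N_STATES - 1)])
```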
4. Important AI branches
4.1 Robotics
Historical Context
• Ancient Origins: Tales like the myth of Talos and the mechanical inventions
of Al-Jazari show early human fascination with automata and mechanical
beings.
• Renaissance Innovations: Leonardo Da Vinci's designs, such as the
mechanical knight and lion, prefigured modern robotic concepts,
illustrating a longstanding interest in replicating human and animal actions
through machines.
Modern Robotics
• Definition: Robotics involves designing, constructing, and operating
robots—machines capable of performing tasks either autonomously or with
human-like capabilities.
• Interdisciplinary Field: The creation of robots requires a collaborative effort
among mechanical engineers (for physical structure and mobility),
electronics and electrical engineers (for operational control), and AI
specialists (for decision-making and behavioral intelligence).
AI Integration in Robotics
• Role of AI: Advanced AI technologies drive the decision-making and
perception capabilities of robots, equipping them with sensors and
cameras to interact intelligently with their environment.
• Multi-Model Systems: Effective robots often integrate multiple AI models,
including:
o Computer Vision: For object detection and environmental
understanding.
o Simultaneous Localization and Mapping (SLAM): For navigation and
mapping.
o Reinforcement Learning: For adaptive decision-making.
o Natural Language Processing (NLP): For understanding and
generating human language.
• Specialized Networks:
o Examples: U-net for medical image segmentation and EfficientNet for
optimizing neural network performance and resource use.
4.3 Traditional ML
While AI innovations like ChatGPT and autonomous vehicles often capture public
imagination, a substantial portion of AI's value lies in its application within
traditional business operations. These applications might not make headlines as
frequently, but they are fundamental in transforming various industries by
enhancing efficiency and accuracy.
4.4 Generative AI
Techniques in Generative AI
• Large Language Models (LLMs): These are neural networks trained on vast
amounts of text data, predicting word relationships and subsequent words in
sentences. LLMs are foundational for text-based applications like ChatGPT.
• Diffusion Models: Used primarily for image and video generation, these
models start with a noise pattern and refine it into a detailed image,
applying learned patterns to enhance realism.
• Generative Adversarial Networks (GANs): Introduced in 2014, GANs use
two algorithms in tandem: one generates content and the other judges its
realism, with both improving through iterative enhancement (a minimal sketch
follows this list).
• Neural Radiance Fields: Specialized for 3D modeling, these are used to
create highly realistic three-dimensional environments.
• Hybrid Models: Combining techniques like LLMs and GANs, hybrid models
leverage the strengths of multiple approaches to enhance content
generation.
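To illustrate the adversarial idea behind GANs mentioned above, here is a minimal sketch assuming PyTorch is installed. The generator learns to mimic samples from a one-dimensional Gaussian while the discriminator learns to judge realism; real image GANs follow the same loop with far larger networks:

```python
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))                # noise -> fake sample
D = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1), nn.Sigmoid())  # sample -> "real" probability
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCELoss()

for step in range(2000):
    real = torch.randn(64, 1) * 0.5 + 3.0   # "real" data: mean 3, standard deviation 0.5
    fake = G(torch.randn(64, 8))

    # 1) Train the discriminator to separate real from generated samples
    opt_d.zero_grad()
    loss_d = bce(D(real), torch.ones(64, 1)) + bce(D(fake.detach()), torch.zeros(64, 1))
    loss_d.backward()
    opt_d.step()

    # 2) Train the generator to fool the discriminator
    opt_g.zero_grad()
    loss_g = bce(D(fake), torch.ones(64, 1))
    loss_g.backward()
    opt_g.step()

print(G(torch.randn(1000, 8)).mean().item())  # should drift toward roughly 3.0
```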
5. Understanding Generative AI
Early Beginnings
Natural Language Processing, or NLP, is a field within computer science that
focuses on enabling computers to understand, interpret, and generate human
language. Originating in the 1950s, NLP initially relied on rule-based systems.
These systems used explicit language grammar rules to process text. For example,
a rule might dictate that sentences beginning with "Can you," "Will you," or "Is it"
should be treated as questions, helping the system recognize "Can you help me?"
as a question.
Practical Example
In the 90s, statisticians would analyze sentences containing the word "can" to
determine its use as a noun or a verb—important for understanding its meaning in
context. For instance, "can" as a verb might indicate ability, while as a noun, it
could refer to a container. Analyzing how "can" was used with surrounding words
like "you" or "soda" helped predict its grammatical role in sentences.
Generative AI
The term "generative" highlights these models' ability to produce new content,
making them a subset of Generative AI. They can generate a wide range of
outputs based on the training data they have processed.
N-grams
• Basics: N-grams predict the probability of a word based on the preceding
n−1 words. Unigrams (n=1) predict words without any contextual basis, often
leading to nonsensical choices, while bigrams (n=2) and trigrams (n=3)
incorporate one or two preceding words, respectively.
• Limitations: Although n-grams consider immediate predecessors, they lack
an understanding of broader sentence context and fail to capture deeper
semantic relationships.
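A minimal bigram sketch in Python (the tiny corpus is invented, echoing the "can" example above):

```python
from collections import Counter, defaultdict

# Count how often each word follows another in a toy corpus
corpus = "can you help me . can you pass the soda can please .".split()
counts = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    counts[current][nxt] += 1

def predict_next(word):
    """Return the most likely next word given only the single preceding word (a bigram model)."""
    following = counts[word]
    return following.most_common(1)[0][0] if following else None

print(predict_next("can"))  # -> 'you', the word seen most often after 'can'
# The model only ever looks one word back, which is exactly the limitation noted above.
```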
Transformers
• Innovation: Introduced in the seminal paper "Attention Is All You Need" in
2017, transformers revolutionize language modeling with an attention
mechanism that assesses the relevance of different parts of the input data,
allowing the model to focus on the most important segments.
• Efficiency and Scalability: By calculating attention scores and prioritizing
certain words over others, transformers efficiently handle sequences
without the computational overhead of LSTMs, enhancing scalability and
performance in processing large volumes of text.
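A minimal NumPy sketch of the attention computation at the heart of transformers; the token vectors are random stand-ins for learned embeddings, and real models add multiple heads, masking, and learned projections:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Weight each value by how relevant its key is to each query (softmax of scaled dot products)."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # relevance of every token to every other token
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # attention weights sum to 1 per token
    return weights @ V                              # blend of values, focused on the most relevant tokens

rng = np.random.default_rng(0)
tokens = rng.normal(size=(3, 4))                    # three toy token embeddings of dimension 4
output = scaled_dot_product_attention(tokens, tokens, tokens)  # self-attention: Q = K = V
print(output.shape)                                 # (3, 4): one context-aware vector per token
```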
Model Design
• Architecture Selection: Developers choose an appropriate neural network
architecture, such as transformers, CNNs, or RNNs, depending on the
intended application.
• Depth and Parameters: Decisions about the model's depth and the number
of parameters it will contain are crucial as they define the model's
capabilities and limitations.
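The kinds of decisions involved can be pictured as a configuration; the names and numbers below are purely illustrative and do not describe any specific production model:

```python
model_config = {
    "architecture": "transformer",   # versus a CNN or RNN, depending on the application
    "num_layers": 24,                # depth: more layers capture richer patterns but cost more compute
    "hidden_size": 1024,             # width of each layer
    "num_attention_heads": 16,
    "vocab_size": 50_000,
    "context_length": 4096,          # how many tokens the model can attend to at once
}
# For transformer-style models the parameter count grows roughly with depth times
# width squared, which is why these choices define both capability and training cost.
```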
Dataset Engineering
• Data Collection: Involves gathering data from publicly available sources or
proprietary datasets. The amount and quality of data can significantly
influence the model's performance.
• Data Preparation: Cleansing and structuring of data are critical to ensure
that the model trains on high-quality and relevant information.
• Ethical Considerations: Developers must address key issues such as data
diversity and potential biases within the training data.
Pretraining
• Initial Training: The model is trained on a large corpus of raw data, which
helps in developing a basic understanding of language patterns and
structures.
• Handling Bias: Special attention is needed to avoid training the model on
data that could lead it to generate biased or offensive outputs.
Preliminary Evaluation
• Performance Assessment: Early evaluation of the model to understand its
strengths and areas that require improvement, particularly in how it handles
context and subtlety in language.
Post-training
• Supervised Finetuning: Enhances the model's performance using high-
quality, targeted data.
• Incorporating Feedback: Refining the model further through human
feedback and annotations to improve its accuracy and ethical behavior.
Finetuning
• Optimization: Adjusting the model's weights to optimize for specific tasks,
improving speed and efficiency while potentially sacrificing some general
capabilities.
Prompt Engineering
• Definition: Modifying how we interact with the model through specific instructions
or examples without altering the model's underlying architecture or training data.
• Process: Involves crafting and refining verbal prompts to guide the model
towards generating the desired outputs.
• Utility: Allows quick, iterative adjustments to how the model interprets and
responds to queries, ideal for tuning AI behavior with minimal resources.
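A small sketch of what prompt engineering looks like in practice: assembling an instruction plus a few worked examples to steer the model, without touching its weights. The task, examples, and function name are invented for illustration:

```python
def build_prompt(task_instruction, examples, new_input):
    """Assemble a few-shot prompt: an instruction plus worked examples guide the model's output."""
    shots = "\n".join(f"Review: {text}\nSentiment: {label}" for text, label in examples)
    return f"{task_instruction}\n\n{shots}\n\nReview: {new_input}\nSentiment:"

prompt = build_prompt(
    "Classify the sentiment of each review as positive, negative, or neutral.",
    [("Great product, works perfectly", "positive"),
     ("Stopped working after two days", "negative")],
    "It does the job",
)
print(prompt)  # this text is sent to the model as-is; no retraining is involved
```

Because only the prompt text changes, variants can be tested and refined in minutes, which is what makes this approach so iterative and cheap compared with fine-tuning.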
Fine-Tuning
• Definition: Involves retraining the model on new data or adjusting its neural
network weights to improve or specialize its performance.
• Characteristics: This is more resource-intensive and requires additional
data, often leading to substantial improvements in the model's accuracy
and speed for specific tasks.
• Limitations: Unlike prompt engineering or RAG, fine-tuning is not iterative
and can be computationally expensive, necessitating careful planning and
execution.
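A minimal fine-tuning sketch, assuming PyTorch: a stand-in for a pretrained base network is frozen while a small task-specific head is retrained on new labeled data. Full fine-tuning would instead update more (or all) of the pretrained weights, which is what makes it so much more resource-intensive:

```python
import torch
import torch.nn as nn

base = nn.Sequential(nn.Linear(32, 64), nn.ReLU())   # stand-in for a pretrained network
head = nn.Linear(64, 3)                              # new output layer for a 3-class task

for p in base.parameters():
    p.requires_grad = False                          # freeze the pretrained weights

optimizer = torch.optim.Adam(head.parameters(), lr=1e-3)  # only the head is updated
loss_fn = nn.CrossEntropyLoss()

X = torch.randn(128, 32)            # task-specific inputs (random stand-ins here)
y = torch.randint(0, 3, (128,))     # their labels

for epoch in range(10):
    optimizer.zero_grad()
    loss = loss_fn(head(base(X)), y)  # adjust weights to minimize error on the new task
    loss.backward()
    optimizer.step()
```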
Buy vs. Make: Businesses typically decide between outsourcing non-core activities
for efficiency and retaining strategic, value-adding activities internally to maintain
competitive advantage.
Adapting to AI Realities
• Model-as-a-Service: Companies often turn to providers like OpenAI, which
offer access to advanced models such as GPT through model-as-a-service
arrangements. This allows businesses to leverage cutting-edge AI without
the overhead of developing it.
• Core vs. Non-Core: The contradiction arises when AI is a core strategic
asset, but access to the technology depends heavily on external sources.
This shifts the strategic focus from building AI internally to effectively
integrating and customizing external AI solutions.
Competitive Differentiation
• Skill in AI Adaptation: The real competitive advantage lies in a company's
ability to tailor these external AI resources to specific business needs
through techniques like prompt engineering, RAG (Retrieval-Augmented
Generation), and fine-tuning.
• Demand for AI Expertise: As AI continues to be a central element of
business strategy, there is a growing need for skilled AI engineers who can
navigate these tools to enhance business applications and outputs.