Digital Fluency Notes
Q 2 What is the need for Data Science? / What is the importance of Data
Science?
Data science is the ability to process and interpret data. This enables
companies to make informed decisions around growth, optimization, and
performance.
Data Science enables enterprises to measure, track, and record performance metrics
to support better enterprise-wide decision making
Q 3 What is Data Science useful for? / Write the applications of Data Science
2.Data Collection
5.Data Analysis
Data that has been processed, organized, and cleaned is ready for
analysis
Various data analysis techniques are used to understand, interpret, and
derive conclusions based on the requirements
Data Visualization is used to examine the data in graphical format to
obtain additional insight regarding the messages within the data.
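As a small illustration of the analysis step, the sketch below summarizes an already cleaned list of values using Python's standard statistics module. The sample marks and the function name summarize are made up for this example and are not part of the notes.

```python
import statistics

def summarize(values):
    """Return basic descriptive statistics for a list of numbers."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "min": min(values),
        "max": max(values),
    }

# Hypothetical cleaned data: marks scored by five students
marks = [40, 50, 60, 70, 80]
summary = summarize(marks)
print(summary)
```

The same summary values could then be plotted as a chart in the visualization step.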
6.Communication
2.Cloud
Cloud storage accommodates structured and unstructured data and provides
business with real-time information and on-demand insights
The cloud is an efficient and economical big data source
3.Web
The public web constitutes big data that is widespread and easily accessible
Data on the Web or ‘Internet’ is commonly available to individuals and
companies
Web services such as Wikipedia provide free and quick informational insights to
everyone
4.IoT
Data created by IoT (Internet of Things) devices constitutes a valuable source of big data
This data is usually generated by sensors that are connected to electronic
devices
With IoT, data can now be sourced from medical devices, vehicular processes,
etc.
5.Databases
Businesses use databases to acquire relevant big data
Popular databases used as big data sources include MS Access, DB2,
Oracle, SQL, and Amazon Simple, etc.
6.Telematics
GPS in a vehicle helps monitor the movement of the vehicle and
shorten the route to a destination, cutting fuel and time consumption
This system generates large volumes of data on vehicle position and movement
7. Business transactions
Data produced as a result of business activities and recorded in
databases is a big data source
In e-commerce transactions, banking, and the stock market, large numbers of records
are stored, and they are sources of big data
Payments through credit cards and debit cards are a big data source
8. Electronic Files
Documents stored as electronic files, such as internet pages, videos,
audio, and PDF files, are a big data source
9. Social networks
Data produced by human interactions through a network like the internet is
a big data source
The most common is the data produced in social networks
10. Sensors
Sensors placed at various places in a city gather data on
temperature, humidity, etc.
A camera placed beside a road gathers information about traffic conditions and
so creates data
Security cameras placed in sensitive areas like airports, railway stations, and
shopping malls create a lot of data
Q 10 What are the tools and technologies used in big data?
Big data tools and technologies are:
1. Apache Storm
2. MongoDB
3. Cassandra
4. Cloudera
5. OpenRefine
6. Apache Spark
7. Apache Hive
8. Apache Mahout
9. Apache Pig
10. Apache Thrift
11. Apache ZooKeeper
12. NoSQL
13. Flink
14. Kafka
15. Tableau
It keeps track of almost everything: your needs, what you
have searched for, what you will need in the future, and your personal details
It also keeps a check on feedback habits and studies them as well.
(3) Understands the technicalities (habits)
Amazon tries to understand the habits and the time one devotes to each
platform for browsing
(4) Faster process of shipping
Amazon has made the process of shipping a lot easier
With the help of big data analytics insights, it has reached a
position where it can predict who will order what and when. This has
improved the experience of online shopping
Q 13 Database Management for Data Science
Data
Data are a set of values of qualitative or quantitative variables
about one or more persons or objects
Database
A database is an organized collection of structured information, or data,
which is stored in a computer system
Database is defined as a structured set of data held in a computer’s
memory or on the cloud that is accessible in various ways
Example: A student database in a college, a company database
Database Management Systems (DBMS)
A database management system is software that is used to manage the
database
DBMS refers to the technology solution used to optimize and manage
the storage and retrieval of data from databases
DBMS provides an interface to perform various operations like database
creation, storing data in it, updating data, creating a table in the database
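The operations listed above can be tried out with SQLite, a small DBMS that ships with Python's standard sqlite3 module. This is only a minimal sketch: the student table, its columns, and the sample rows are assumptions made up for the example, and the database lives in memory so nothing is written to disk.

```python
import sqlite3

# Database creation: an in-memory SQLite database
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Creating a table in the database (hypothetical student table)
cur.execute(
    "CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, course TEXT)"
)

# Storing data in it
cur.execute("INSERT INTO student VALUES (?, ?, ?)", (1, "Asha", "BCA"))
cur.execute("INSERT INTO student VALUES (?, ?, ?)", (2, "Ravi", "BCom"))

# Updating data
cur.execute("UPDATE student SET course = ? WHERE roll_no = ?", ("BSc", 2))
conn.commit()

# Retrieving data
rows = cur.execute(
    "SELECT roll_no, name, course FROM student ORDER BY roll_no"
).fetchall()
print(rows)
conn.close()
```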
Types of database
1.Relational database
2.Centralized database
3.Distributed database
4. NoSQL database
5. Cloud database
6. Object-oriented database
7. Hierarchical database
8. Network Databases
1.Relational database
Relational database is based on the relational data model, which
stores data in the form of rows(tuple) and columns(attributes), and
together forms a table(relation)
A relational database uses SQL for storing, manipulating, as well
as maintaining the data
Examples of Relational databases are: MySQL, Microsoft SQL
Server, Oracle, DB2, PostgreSQL etc.
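To make the terms concrete, here is a short sketch in which one table (relation) is queried with SQL and its columns (attributes) and rows (tuples) are printed separately. It uses SQLite through Python's sqlite3 module only because it needs no setup; the employee table and its data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The table is the relation; emp_id, name, and dept are its attributes
conn.execute(
    "CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT, dept TEXT)"
)
conn.executemany(
    "INSERT INTO employee (emp_id, name, dept) VALUES (?, ?, ?)",
    [(101, "Meena", "Sales"), (102, "John", "HR"), (103, "Farah", "Sales")],
)

# SQL selects only the tuples that match a condition
cur = conn.execute(
    "SELECT emp_id, name, dept FROM employee WHERE dept = ? ORDER BY emp_id",
    ("Sales",),
)
attributes = [col[0] for col in cur.description]  # column names
tuples = cur.fetchall()                           # matching rows
print(attributes)
print(tuples)
conn.close()
```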
2.Centralized database
It is the type of database that stores data at a centralized database
system
It helps the users to access the stored data from different locations
through several applications
Example: Central database library in a college
3.Distributed database
It is the type of database in which the data is distributed among
different database systems of an organization
These database systems are connected via communication links
Example: Apache Cassandra, HBase, Ignite, etc.
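One common way to spread data across several systems is to pick a system from a hash of each record's key. The sketch below imitates that idea with plain Python dictionaries standing in for the separate database systems. The node names and the put/get helpers are assumptions for illustration only; this is not how any particular product listed above is implemented, and real systems talk to each other over network links.

```python
import hashlib

# Hypothetical node names standing in for separate database systems
NODES = ["node_a", "node_b", "node_c"]

def node_for(key):
    """Pick a node for a key using a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return NODES[int(digest, 16) % len(NODES)]

# Each node holds only its own share of the records
storage = {node: {} for node in NODES}

def put(key, value):
    """Store the value on the node responsible for this key."""
    storage[node_for(key)][key] = value

def get(key):
    """Read the value back from the node responsible for this key."""
    return storage[node_for(key)].get(key)

put("student:1", "Asha")
put("student:2", "Ravi")
print(get("student:1"))
```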
4. NoSQL Database
NoSQL is Non-SQL/Not Only SQL; it is a type of database that is
used for storing a wide range of data sets
It stores data not only in tabular form but in several different ways
Example: MongoDB, JanusGraph
It is divided into four types:
1. Key-value storage
2. Document-oriented database
3. Graph databases
4. Wide-column stores
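The four types differ mainly in the shape of the stored data. The sketch below mimics each shape with ordinary Python dictionaries and lists. It is only an illustration of the data models, with made-up keys and values, and not the API of any real NoSQL product.

```python
# 1. Key-value storage: a value is looked up by a unique key
key_value = {"session:42": "logged_in"}

# 2. Document-oriented: each record is a self-contained, nested document
document = {"_id": 1, "name": "Asha", "courses": ["Maths", "Physics"]}

# 3. Graph: nodes plus the links (relationships) between them
graph = {"Asha": ["Ravi"], "Ravi": ["Asha", "Meena"], "Meena": []}

# 4. Wide-column: rows keyed by an id, and each row may hold different columns
wide_column = {
    "row1": {"name": "Asha", "city": "Mysuru"},
    "row2": {"name": "Ravi"},  # this row has no "city" column
}

print(key_value["session:42"])
print(document["courses"])
print(graph["Ravi"])
print(sorted(wide_column["row1"]))
```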
5.Cloud database
It is a type of database in which the data is stored in a virtual
environment and executes over the cloud computing platform
It provides users with various cloud computing services (SaaS,
PaaS, IaaS, etc.) for accessing the database
Example: phoenixNAP, Google Cloud SQL, Microsoft Azure
6.Object-oriented database
It is a type of database that uses the object-based data model
approach for storing data in the database system
7.Hierarchical database
It is a type of database that stores data in the form of parent-child
relationship nodes
It organizes data in a tree-like structure
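A tree-like structure can be sketched in Python as a mapping from each parent to its children, where every node has exactly one parent. The college departments and the helper path_to below are hypothetical and serve only to show how a record is reached by walking down from the root.

```python
# Each node has exactly one parent; the dict maps a parent to its child nodes
children = {
    "College": ["Science", "Commerce"],
    "Science": ["Physics", "Chemistry"],
    "Commerce": ["Accounting"],
}

def path_to(target, node="College", path=None):
    """Return the list of nodes from the root down to target, or None."""
    path = (path or []) + [node]
    if node == target:
        return path
    for child in children.get(node, []):
        found = path_to(target, child, path)
        if found:
            return found
    return None

print(path_to("Chemistry"))
```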
8.Network database
It is a type of database that follows the network data model
The data is represented in the form of nodes connected via
links between them
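Unlike a tree, the network model lets one record be linked from more than one owner. The sketch below stores the links as simple pairs and looks them up in both directions. The department and staff names, and the helpers owners_of and members_of, are made up for the example.

```python
# Links between records as (owner, member) pairs;
# a member may be linked from several owners
links = [
    ("Physics Dept", "Dr. Rao"),
    ("Research Lab", "Dr. Rao"),
    ("Physics Dept", "Dr. Iyer"),
]

def owners_of(member):
    """Return every owner record linked to the given member."""
    return sorted(owner for owner, m in links if m == member)

def members_of(owner):
    """Return every member record linked from the given owner."""
    return sorted(m for o, m in links if o == owner)

print(owners_of("Dr. Rao"))
print(members_of("Physics Dept"))
```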
Q 14 Why do we use databases? / Advantages of database (DBMS)
The cost of data entry, update, read, and delete operations is reduced
Reduced data redundancy
Data sharing is made easy
Data inconsistency is reduced
Decision making with data is improved
Manages large amounts of data
Accurate
Easy to research the data
Easy to update the data
Improved data security
Better data integration
Greater data independence