Junior Software Developer English Class 12 (1)
Preface
Vocational Education is a dynamic and evolving field, and ensuring that every
student has access to quality learning materials is of paramount importance. The
journey of the PSS Central Institute of Vocational Education (PSSCIVE) toward
producing comprehensive and inclusive study material is rigorous and time-
consuming, requiring thorough research, expert consultation, and publication by
the National Council of Educational Research and Training (NCERT). However, the
absence of finalized study material should not impede the educational progress of
our students. In response to this necessity, we present the draft study material, a
provisional yet comprehensive guide, designed to bridge the gap between teaching
and learning, until the official version of the study material is made available by
the NCERT. The draft study material provides a structured and accessible set of
materials for teachers and students to utilize in the interim period. The content is
aligned with the prescribed curriculum to ensure that students remain on track
with their learning objectives.
The contents of the modules are curated to provide continuity in education and
maintain the momentum of teaching-learning in vocational education. It
encompasses essential concepts and skills aligned with the curriculum and
educational standards. We extend our gratitude to the academicians, vocational
educators, subject matter experts, industry experts, academic consultants, and all
other people who contributed their expertise and insights to the creation of the
draft study material.
Teachers are encouraged to use the draft modules of the study material as a guide
and supplement their teaching with additional resources and activities that cater
to their students' unique learning styles and needs. Collaboration and feedback are
vital; therefore, we welcome suggestions, especially from teachers, for improving the
content of the study material.
This material is copyrighted and should not be printed without the permission of
the NCERT-PSSCIVE.
Deepak Paliwal
(Joint Director)
PSSCIVE, Bhopal
Date: 07 September, 2024
Members
Deepak D. Shudhalwar, Professor (CSE), Head, Department of Engineering and
Technology, PSSCIVE, NCERT, Bhopal, Madhya Pradesh
Ganesh Kumar Dixit, Assistant Professor in IT-ITeS (Contractual), Department of
Engineering and Technology, PSSCIVE, NCERT, Bhopal
Prakash Khanale, Professor and Head, Department of Computer Science, DSM College,
Parbhani, Maharashtra
Rizwan Alam, Assistant Professor in IT-ITeS (Contractual), Department of Engineering and
Technology, PSSCIVE, NCERT, Bhopal
Member Coordinator
Deepak D. Shudhalwar, Professor (CSE), Head, Department of Engineering and
Technology, PSSCIVE, NCERT, Bhopal, Madhya Pradesh
Module Overview
RDBMS stands for Relational Database Management System. It is a type of database
management system (DBMS) that stores data in a row-based table structure which
connects related data elements. It is called relational because the values within each
table are related to each other. This makes it easy to locate and access specific values
within the database.
In this unit, you will learn about the various data models and the concepts
associated with RDBMS. Various database management software packages are available in
the market. Popular examples of RDBMSs include MySQL, Oracle, and SQL Server. The
RDBMS concepts are covered in this unit using MySQL.
SQL, which stands for Structured Query Language, is a special-purpose programming language
designed to manage data in a relational database management system (RDBMS) or for
stream processing in a relational data stream management system (RDSMS). SQL is
used to search, store, and modify records in a database management system. SQL queries are
used to retrieve the data needed for specific job functions. It is a standardized way to
request information from relational databases. In this unit, you will learn to create
database objects, insert data into a database, and use various types of commands to retrieve
the required data from the database.
An SQL function performs a particular task and returns zero or more values as
a result. Functions are useful while writing SQL queries. They can be applied to
single or multiple records (rows) of a table. SQL provides various readily available
functions that can be used in queries. These include single row functions and multiple
row functions, as well as the grouping of records based on some criteria. The use of these
functions is illustrated in this unit.
Learning Outcomes
After completing this module, you will be able to:
• Describe the concepts of Relational Database Management System
• Describe the Structured Query Language (SQL)
• Execute the SQL commands in MySQL
Module Structure
Session 1: RDBMS Concepts
Session 2: Structured Query Language (SQL)
Session 3: Functions In SQL
Controlled Data Sharing – There can be different categories of users, such as teachers, office staff and
parents. Ideally, not every user should be able to access all the data. It means different types of
users should be given different types of access, such as read only. It is very difficult to enforce
this kind of access control in a file system while accessing files through application programs.
1.3 DATABASE MANAGEMENT SYSTEM
Limitations faced in a file system can be overcome by storing the data in a database where data
are logically related. A database management system (DBMS) is used as an interface to manage
databases.
A database is an organized collection of data, generally stored and accessed electronically from a
computer system. It supports the storage and manipulation of data. In other words, databases
are used by an organization as a method of storing, managing and retrieving information. It is
possible to store and organise related data in a database so that it can be managed in an efficient
and easy way.
A DBMS is a collection of software components designed to create and maintain databases and
control all access to them. A DBMS allows users or application programs to create a database and
to store, manage, update/modify and retrieve data from that database. A DBMS provides an
effective method of performing database operations, troubleshooting database issues, and
restricting data access. Relational Database Management System (RDBMS), which is still popular
today, is an advanced version of a DBMS. Dr. E. F. Codd defined the criteria to determine
whether a DBMS is a relational database management system or not. These criteria are known
as Codd’s twelve rules (E. F. Codd, 1985).
Some examples of open source and commercial DBMS include MySQL, Oracle, PostgreSQL, SQL
Server, Microsoft Access, MongoDB as presented in Table 1.1.
Table 1.1 Popular DBMS
DBMS | Primary Database Model | License
Oracle | RDBMS | Commercial (restricted free version is available)
MySQL | RDBMS | Open Source
Microsoft SQL Server | RDBMS | Commercial (restricted free version is available)
PostgreSQL | RDBMS | Open Source
MongoDB | Document store | Open Source
A database system hides certain details about how data are actually stored and maintained.
Thus, it provides users with an abstract view of the data. A database system has a set of
programs through which users or other programs can access, modify and retrieve the stored
data.
The DBMS serves as an interface between the database and end users or application programs.
Retrieving data from a database through special types of commands is called querying the
database. In addition, users can modify the structure of the database itself through a DBMS.
Databases are widely used in various fields. Some applications are given in Table 1.2.
Table 1.2 Use of Database in Real-life Applications
Application | Database to maintain data about
Banking | customer information, account details, loan details, transaction details.
Crop Loan | kisan credit card data, farmer’s personal data, land area and cultivation data, loan history, repayment data.
Inventory Management | product details, customer information, order details, delivery data.
Organisation Resource Management | employee records, salary details, information, branch locations.
Online Shopping | items description, user login details, users preferences details,
1.3.1 Limitations of DBMS
Increased Complexity – Use of DBMS increases the complexity of maintaining functionalities
like security, consistency, sharing and integrity.
Increased data vulnerability – As data are stored centrally, it increases the chances of loss of
data due to any failure of hardware or software. It can bring all operations to a halt for all the
users.
1.3.2 Application of the DBMS system
Here are a few important applications of the DBMS system:
• Student Admission System, School Examination System, Library Management System
• Payroll, HR, Sales & Personnel Management System
• Accounting System, Hotel Reservation System and Airline Reservation System
• Banking systems, for customer information, account activities, payments,
deposits, loans, etc.
• Insurance management system
• University systems, to keep all records
• Finance, for storing information about stock, sales, and purchases of financial instruments
like stocks and bonds.
1.3.3 Advantages of DBMS system
The advantages of DBMS system are:
• DBMS offers a variety of techniques to store & retrieve data
• Uniform administration procedures for data storage and retrieval
• Application programmers are never exposed to details of data representation and storage.
• A DBMS uses various powerful functions to store and retrieve data efficiently.
• Offers data independence, data integrity and data security, and reduces data redundancy.
• The DBMS imposes integrity constraints to provide a high level of protection against prohibited
access to data.
A file system does not offer concurrency, whereas a DBMS provides a concurrency facility.
1.4 Key Concepts in DBMS
It is important to understand the following concepts to efficiently manage data using a DBMS.
1.4.1 Database schema
A database schema is the set of schemas of a database's relations. It consists of the tables with all
their attributes, along with their data types and constraints, if any. It also represents the relationships among
the tables. It is also used to visualize the logical architecture of a database and how the data are
organized in it. The schema of a relation may not change, but the relation, which is a
variable, changes over time. (Figure 1.3)
Fig 1.5 Relation ParentRecord with its attributes (Columns) and tuples (Rows)
Attribute – Characteristic or parameters for which data are to be stored in a relation. Simply
stated, the columns of a relation are the attributes which are also referred as fields. For example,
Par_ID, Par_Name, Par_Phone and Par_Address are attributes of relation ParentRecord.
Tuple – Each row of data in a relation (table) is called a tuple. In a table with n columns, a tuple
is a relationship between the n related values.
Domain – It is a set of values from which an attribute can take a value in each row. Usually, a
data type is used to specify domain for an attribute. For example, in StudentRecord relation,
the attribute Stu_RollNo takes integer values and hence its domain is a set of integer values.
Similarly, the set of character strings constitutes the domain of the attribute Stu_Fname.
Degree – The number of attributes in a relation is called the Degree of the relation. For example,
relation ParentRecord with five attributes is a relation of degree 5.
Cardinality – The number of tuples in a relation is called the Cardinality of the relation. For
example, the cardinality of relation ParentRecord is 10 as there are 10 tuples in the table.
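Degree and cardinality can also be checked with queries once the relation exists as a table. The following is a minimal sketch, assuming MySQL is used, the relation is stored as a table named ParentRecord, and its database is currently selected. The column aliases cardinality and degree are chosen here only for illustration.

```sql
-- Cardinality: the number of tuples (rows) currently in the relation
SELECT COUNT(*) AS cardinality
FROM ParentRecord;

-- Degree: the number of attributes (columns) defined for the relation.
-- DATABASE() returns the name of the currently selected database.
SELECT COUNT(*) AS degree
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'ParentRecord';
```

The cardinality changes whenever rows are inserted or deleted, whereas the degree changes only when the structure of the table is altered.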
1.5.2 Three Important Properties of a Relation
In relational data model, following three properties are observed with respect to a relation which
makes a relation different from a data file or a simple table.
Property 1: imposes following rules on an attribute of the relation.
• Each attribute in a relation has a unique name.
• Sequence of attributes in a relation is immaterial.
Property 2: governs following rules on a tuple of a relation.
• Each tuple in a relation is distinct. For example, data values in no two tuples of relation
AttendanceRecord can be identical for all the attributes. Thus, each tuple of a relation
must be uniquely identified by its contents.
• Sequence of tuples in a relation is immaterial. The tuples are not considered to be ordered,
even though they appear to be in tabular form.
Property 3: imposes following rules on the state of a relation.
• All data values in an attribute must be from the same domain (same data type).
• Each data value associated with an attribute must be atomic (cannot be further divided
into meaningful subparts). For example, Par_Phone of relation ParentRecord holds a ten-digit
number, which is indivisible.
• No attribute can have multiple data values in one tuple. For example, a parent cannot
specify multiple contact numbers under the Par_Phone attribute.
• A special value “NULL” is used to represent values that are unknown or non-applicable to
certain attributes. For example, if a parent does not share his or her contact number with
the school authorities, then Par_Phone is set to NULL (data unknown).
1.6 KEYS IN A RELATIONAL DATABASE
The tuples within a relation must be distinct. It means no two tuples in a table should have the same
value for all attributes. That is, there should be at least one attribute in which data are distinct
(unique) and not NULL. That way, we can uniquely distinguish each tuple of a relation. So, the
relational data model imposes some restrictions or constraints on the values of the attributes
and on how the contents of one relation can be referred to through another relation. These restrictions
are specified at the time of defining the database through different types of keys as given below:
1.6.1 Candidate Key
A relation can have one or more attributes that take distinct values. Any of these attributes can
be used to uniquely identify the tuples in the relation. Such attributes are called candidate keys
as each of them is a candidate for the primary key.
As shown in Figure 1.5, the relation ParentRecord has five attributes, out of which Par_ID and
Par_Phone always take unique values. No two parents will have the same phone number or the same
Par_ID. Hence, these two attributes are the candidate keys as they both are candidates for the
primary key.
1.6.2 Primary Key
Out of one or more candidate keys, the attribute chosen by the database designer to uniquely
identify the tuples in a relation is called the primary key of that relation. The remaining attributes
in the list of candidate keys are called the alternate keys.
In the relation ParentRecord, suppose Par_ID is chosen as primary key, then Par_Phone will be
called the alternate key.
Fig. 1.6 StudentAttendance database with the Primary and Foreign keys
Summary
• A file in a file system is a container to store data in a computer.
• A file system suffers from Data Redundancy, Data Inconsistency, Data Isolation, Data
Dependence and a lack of Controlled Data Sharing.
• Database Management System (DBMS) is a software to create and manage databases. A
database is a collection of tables.
• Database schema is the design of a database
• A database constraint is a restriction on the type of data that can be inserted into the
table.
• Database schema and database constraints are stored in the database catalog, whereas the
snapshot of the database at any given time is the database instance.
• A query is a request to a database for information retrieval and data manipulation (insertion,
deletion or update). It is written in Structured Query Language (SQL).
• Relational DBMS (RDBMS) is used to store data in related tables. Rows and columns of a
table are called tuples and attributes respectively. A table is referred to as a relation.
• Restrictions on data stored in an RDBMS are applied by use of keys such as Candidate Key,
Primary Key, Composite Primary Key, Foreign Key.
• Primary key in a relation is used for unique identification of tuples.
• Foreign key is used to relate two tables or relations.
• Each column in a table represents a feature (attribute) of a record. Table stores the
information for an entity whereas a row represents a record.
• Each row in a table represents a record. A tuple is a collection of attribute values that makes
a record unique.
• A tuple is a unique entity whereas attribute values can be duplicate in the table.
Table: PROJECT_ASSIGNED
Regi_ID Project_No
IP-101-15 101
IP-104-15 102
CS-103-14 103
CS-101-14 104
CS-101-10 105
Table: PROJECT
Proj_No Project_Name Sub_Date
101 Airline Reservation System 12-01-22
102 Library Automation System 12-01-22
103 Employee Management System 15-01-22
104 Student Management System 12-01-22
105 Inventory Management System 15-01-22
106 Railway Reservation System 15-01-22
Answer the following questions:
• Write the name of primary key of each table.
• Write the name of foreign key(s) in table PROJECT_ASSIGNED.
• Is there any alternate key in table STUDENT? Give justification for your answer.
• Can a user assign duplicate value to the field Roll_No of STUDENT table? Justify.
5. Consider the database STUDENT_PROJECT given above and answer the following questions
with justification.
• Can you insert a new student record with a missing roll number?
• Can you insert a new student record with a missing registration id value?
• Can you insert a new project detail without Sub_Date?
• Can you insert a new project detail without Proj_No?
• Can you insert a new record with Regi_ID as IP-101-19 and Project_No 206 in table
PROJECT_ASSIGNED?
Once the result date was declared, Shyam was eager to see his result on the website (Figure 2.1). He
opened the website and entered his Roll number. After entering the Roll number, he
pressed the OK button. Immediately, Shyam's score card was displayed on the screen; he had passed
with first division marks. Shyam was very happy and also surprised at how a computer could search
for a Roll number so fast among approximately 5 lakh student records. Later on, Shyam
understood that it was possible because of the database query language known as
Structured Query Language (SQL). SQL is used to search, store, and modify records in a database
management system. In this chapter, you will learn to create database objects, insert data
in a database, and use various types of commands to retrieve the required data from the database.
CHAR (n) Specifies character type data of length n where n could be any value from 0 to
255. CHAR is of fixed length, means, declaring CHAR (10) implies to reserve
spaces for 10 characters. If data does not have 10 characters (for example, ‘city’
has four characters), MySQL fills the remaining 6 characters with spaces padded
on the right.
VARCHAR (n) Specifies character type data of length ‘n’ where n could be any value from 0 to
65535. But unlike CHAR, VARCHAR is a variable-length data type. That is,
declaring VARCHAR (30) means a maximum of 30 characters can be stored but
the actual allocated bytes will depend on the length of entered string. So ‘city’ in
VARCHAR (30) will occupy the space needed to store 4 characters only.
INT INT specifies an integer value. Each INT value occupies 4 bytes of storage. The
range of values allowed in integer type are -2147483648 to 2147483647. For
values larger than that, we have to use BIGINT, which occupies 8 bytes.
FLOAT Holds numbers with decimal points. Each FLOAT value occupies 4 bytes.
DATE The DATE type is used for dates in 'YYYY-MM-DD' format. YYYY is the 4-digit
year, MM is the 2-digit month and DD is the 2-digit day. The supported range
is '1000-01-01' to '9999-12-31'.
2.2.2 Constraints
Constraints are certain types of restrictions on the data values that an attribute can have.
Table 2.2 lists some of the commonly used constraints in SQL. They are used to ensure
correctness of data. However, it is not mandatory to define constraints for each attribute of a
table.
Table 2.2 Commonly used SQL Constraints
Constraint | Description
NOT NULL | Ensures that a column cannot have NULL values, where NULL means a missing/unknown/not applicable value.
PRIMARY KEY | The column which can uniquely identify each row or record in a table.
FOREIGN KEY | The column which refers to the value of an attribute defined as primary key in another table.
After successful execution of the command a message “Query OK” is displayed on the sql prompt.
It is also possible to see the newly created database by using the “show” command. The show
command displays the newly created database along with some default databases of MySQL as
shown in Figure 2.4.
Note: In any RDBMS, it is possible to manage multiple databases on a single computer. The USE
command is used to select a specific database. After selecting the database, it is possible to
create tables in it or query data from it.
To select the database SchoolRecord, issue the “USE” command followed by the database name.
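The commands described above can be sketched as follows. This is a minimal illustration that assumes a MySQL server on which no database named SchoolRecord exists yet. The list printed by SHOW DATABASES will differ from one installation to another.

```sql
-- Create the database used throughout this session
CREATE DATABASE SchoolRecord;

-- List all databases on the server, including MySQL's default ones
SHOW DATABASES;

-- Select SchoolRecord as the current database for later statements
USE SchoolRecord;
```

Statements issued after USE, such as CREATE TABLE, apply to the selected database.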
Note/Tip: In LINUX OS environment, names for database and tables are case-sensitive whereas
in WINDOWS OS, there is no such differentiation. However, as a good practice, it is suggested to
write database or table name in the same letter cases that were used at the time of their creation.
2.3.2 CREATE Table
After creating the database SchoolRecord, it is required to define relations (create tables) in this
database. For each relation, specify the name of each attribute (column) along with its required
data type. The syntax of the CREATE TABLE statement is as follows.
Syntax:
CREATE TABLE tablename (
Col_name1 datatype constraint,
Col_name2 datatype constraint,
:
Col_nameN datatype constraint );
Let us understand how to choose attribute names and their respective data types. First identify
the data types of the attributes in the table “StudentRecord” along with their constraints, if any. Let us
assume that there are a total of 100 students in a class and the Roll numbers are in a sequence
from 1 to 100. Since the data values of the attribute “Stu_RollNo” are stored as digits, the data type
integer (INT) is appropriate for this attribute. The student First name and Last name can each be
up to 20 characters long. Since the number of characters
can vary for different students, the data type VARCHAR is used for these columns. In the same way,
the data type VARCHAR is used for the student address, which can be up to 50 characters in length. The
data type DATE is used for storing dates, so it is used for the attribute
“Stu_DOB” (date of birth). For the student's parent id, the Aadhaar number is used, which is a 12-digit number.
Since the Aadhaar number is of fixed length and no mathematical operation needs to be
performed on it, the fixed-length character data type CHAR (12) is used for this
attribute.
Table 2.3 Data types and constraints for the attributes of relation StudentRecord
Attribute | Data expected to be stored | Data type | Constraint
Stu_RollNo | Numeric value consisting of maximum 3 digits | Int | Primary Key
Stu_FName | Variable length string of maximum 20 characters | Varchar (20) | Not Null
Stu_LName | Variable length string of maximum 20 characters | Varchar (20) | Not Null
Stu_DOB | Date value | Date | Not Null
Stu_Address | Variable length string of maximum 50 characters | Varchar (50) | Not Null
Par_ID | Fixed length string of 12 digits for Aadhaar Number | Char (12) | Foreign Key
Table 2.4 Data types and constraints for the attributes of relation ParentRecord
Attribute | Data expected to be stored | Data type | Constraint
Par_ID | Fixed length string of 12 digits Aadhaar number | Char (12) | Primary Key
Par_Name | Variable length string of maximum 20 characters | Varchar (20) | Not Null
Par_Phone | Numeric value consisting of 10 digits | Char (10) | Null Unique
Par_Address | Variable length string of size 30 characters | Varchar (30) | Not Null
Par_Email | Variable length string of size 30 characters | Varchar (30) |
Table 2.5 Data types and constraints for the attributes of relation AttendanceRecord
Attribute | Data expected to be stored | Data type | Constraint
Att_Date | Date value | Date | Primary Key*
Stu_RollNo | Numeric value consisting of maximum 3 digits | Int | Primary Key*, Foreign Key
Att_Status | ‘P’ for present and ‘A’ for absent | Char(1) | Not Null
Table 2.3, 2.4 and 2.5 show the chosen data type and constraint for each attribute of the relations
StudentRecord, ParentRecord and AttendanceRecord respectively.
Example 2.2: The following command is used to create the table StudentRecord. To create the
table in the SchoolRecord database, first open the database with the USE SchoolRecord command. Then
create the table in the SchoolRecord database by using the CREATE TABLE command.
Note: “,” is used to separate two attributes and each statement terminates with a semi-colon (;).
The arrow (->) is an interactive continuation prompt. If we enter an unfinished statement, the
SQL shell will wait for us to enter the rest of the statement.
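A statement along the following lines would create the StudentRecord table. It is a sketch based on the data types and constraints listed in Table 2.3, not necessarily the exact command used by the authors. It assumes the SchoolRecord database already exists. The foreign key on Par_ID is deliberately left out here, on the assumption that ParentRecord has not been created yet. It can be added later with ALTER TABLE.

```sql
USE SchoolRecord;

CREATE TABLE StudentRecord (
    Stu_RollNo  INT PRIMARY KEY,        -- uniquely identifies each student
    Stu_FName   VARCHAR(20) NOT NULL,
    Stu_LName   VARCHAR(20) NOT NULL,
    Stu_DOB     DATE NOT NULL,          -- stored as 'YYYY-MM-DD'
    Stu_Address VARCHAR(50) NOT NULL,
    Par_ID      CHAR(12)                -- Aadhaar number of the parent; foreign key not declared here
);

-- Display the structure of the newly created table
DESC StudentRecord;
```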
Example 2.3: The following command is used to Create table ParentRecord.
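One possible form of this command, written from the specification in Table 2.4, is sketched below. It assumes the SchoolRecord database is already selected with USE.

```sql
CREATE TABLE ParentRecord (
    Par_ID      CHAR(12) PRIMARY KEY,   -- 12-digit Aadhaar number
    Par_Name    VARCHAR(20) NOT NULL,
    Par_Phone   CHAR(10) NULL UNIQUE,   -- may be unknown, but never repeated
    Par_Address VARCHAR(30) NOT NULL,
    Par_Email   VARCHAR(30)             -- no constraint specified in Table 2.4
);
```

Par_Phone is declared UNIQUE because it is a candidate key that was not chosen as the primary key.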
The SHOW TABLES statement is used to display all the tables in a database. We have created three
tables in the database SchoolRecord.
Example 2.6: The following SQL command is used to display the tables created in the database
SchoolRecord. It shows all the three tables created so far.
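A minimal sketch of this step, assuming the three tables have already been created in SchoolRecord:

```sql
USE SchoolRecord;

-- List every table in the currently selected database
SHOW TABLES;
```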
A composite primary key is made up of two or more attributes. The primary key of the
“AttendanceRecord” relation is a composite primary key of two attributes, “Att_Date”
and “Stu_RollNo” (Table 2.5).
Example 2.8: The following SQL command is used to add the composite primary key to the
relation “AttendanceRecord”.
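Such a command might look like the sketch below. It assumes that AttendanceRecord was created without a primary key and that its columns are named as in Table 2.5 (Att_Date and Stu_RollNo).

```sql
-- Both columns together form the primary key: a student can have
-- only one attendance entry for a given date.
ALTER TABLE AttendanceRecord
ADD PRIMARY KEY (Att_Date, Stu_RollNo);

-- Check the result; both columns should now be marked as part of the key
DESC AttendanceRecord;
```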
The newly added attribute “income” with data type INT in the table “ParentRecord” can be viewed
using DESC command as follows.
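The step referred to above can be sketched as follows, assuming the ParentRecord table exists and does not yet contain a column named income.

```sql
-- Add a new attribute "income" of type INT to the table
ALTER TABLE ParentRecord ADD income INT;

-- View the modified structure of the table
DESC ParentRecord;
```

Existing rows receive NULL in the new column because no default value is specified.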
Syntax:
ALTER TABLE table_name MODIFY attribute DATATYPE;
Suppose we want to change the size of attribute “Par_Address” from VARCHAR (30) to VARCHAR (40)
in the “ParentRecord” table.
Example 2.12: The following command is used to change the size of attribute “Par_Address”
in the “ParentRecord” table.
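A sketch of such a command is given below. It assumes Par_Address was created as VARCHAR (30) NOT NULL, as specified in Table 2.4. In MySQL, MODIFY replaces the whole column definition, so NOT NULL is repeated here to keep that constraint.

```sql
-- Increase the maximum length of Par_Address from 30 to 40 characters.
-- NOT NULL is restated because MODIFY redefines the entire column.
ALTER TABLE ParentRecord
MODIFY Par_Address VARCHAR(40) NOT NULL;

DESC ParentRecord;
```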
Note: It is required to specify the data type of the attribute along with DEFAULT while using
MODIFY.
(h) Remove an attribute
It is possible to remove attributes from a table using ALTER.
Syntax:
ALTER TABLE table_name DROP attribute;
Example 2.14: The following command is used to remove the attribute income from the table
“ParentRecord”.
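Following the syntax above, the command could be written as in this sketch, which assumes the column income is currently present in ParentRecord.

```sql
-- Remove the attribute "income" together with any data stored in it
ALTER TABLE ParentRecord DROP income;

-- Confirm that the column no longer appears
DESC ParentRecord;
```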
Note: The primary key is dropped from the StudentRecord table, but each table should have a
primary key to maintain uniqueness. Hence, use the ADD command to specify the primary key for
the StudentRecord table again, as shown in earlier examples.
2.3.5 DROP TABLE Command
Sometimes it may be required to remove a table from a database or the database itself. The DROP statement
is used to remove a database or a table permanently from the system. Since this command will
delete the table or database permanently, you have to be cautious while using this statement as
it cannot be undone. Let us assume that you have created a table with the name “ParantRecord”
instead of “ParentRecord”. The DROP command can be used to delete the table created with the wrong
name.
Syntax:
DROP TABLE table_name;
It is also possible to drop the entire database.
Syntax:
DROP DATABASE database_name;
Example 2.16: The following command is used to delete the table name “ParantRecord” from the
current database.
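A minimal sketch of this command, assuming a wrongly named table ParantRecord exists in the currently selected database:

```sql
-- Permanently remove the wrongly named table and all rows in it
DROP TABLE ParantRecord;

-- The dropped table should no longer appear in the list
SHOW TABLES;
```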
Cautions:
• Using the Drop statement to remove a database will ultimately remove all the tables
within it.
• DROP statement will remove the tables or database created by you. Hence you may apply
DROP statement at the end of the chapter.
It will create a new table named “NewStudentRecord” with only 5 attributes and all the records
which were inserted in the existing table earlier.
It is possible to create a new table with all attributes and all records available in the existing
table.
Example 2.19: The following command is used to create a new table “StudentRecord1” with all
attributes and all records available in the existing table “StudentRecord”.
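One way to write this in MySQL is with CREATE TABLE ... AS SELECT, sketched below. It assumes StudentRecord exists and that no table named StudentRecord1 exists yet.

```sql
-- Copy every column and every row of StudentRecord into a new table
CREATE TABLE StudentRecord1 AS
SELECT * FROM StudentRecord;

-- Verify that the records were copied
SELECT * FROM StudentRecord1;
```

A table created this way receives the columns and the data, but MySQL does not copy the primary key or other indexes of the original table. They have to be added separately with ALTER TABLE if needed.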
Now the view named EMP_VIEW will be created with only those employee records that have a
salary of more than 10000. You can use this view like the Employee table to see all records using the
SELECT command. To see all records from EMP_VIEW, use the SELECT command.
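The two statements referred to above might look like the following sketch. It assumes a table named Employee whose salary column is called sal, as in the Employee table of Practical Activity 2.1. The actual column name in your table may differ.

```sql
-- Define a view that exposes only employees earning more than 10000
CREATE VIEW EMP_VIEW AS
SELECT *
FROM Employee
WHERE sal > 10000;

-- Query the view exactly like a table
SELECT * FROM EMP_VIEW;
```

A view stores only the query, not a copy of the data, so it always reflects the current contents of the Employee table.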
Activity 1
Practical Activity 2.1 – Create the table “Employee” and “Department” in MySQL with
the following attributes specification.
Employee Table
Attribute | Data expected to be stored | Data type | Constraint
empno | Numeric value consisting of 4 digits | Int | Primary Key
ename | Variable length string of max 30 characters | Varchar (30) | Not Null
job | Variable length string of max 15 characters | Varchar (15) | Not Null
mgr | Numeric value consisting of 4 digits | Int | Not Null
hiredate | Date of joining the company | Date | Not Null
sal | Numeric value consisting of 6 digits | Int | Not Null
comm | Numeric value consisting of 4 digits | Int | Not Null
Deptno | Dept no which is Numeric type consisting of maximum 2 digits | Int |
Department Table
Attribute | Data expected to be stored | Data type | Constraint
deptno | Numeric value consisting of 4 digits | Int | Primary Key
dname | Variable length string of max 20 characters | Varchar (20) | Not Null
loc | Variable length string of max 25 characters | Varchar (25) | Not Null
Caution: While populating records in a table with foreign key, ensure that records in referenced
tables are already populated.
Let us insert some records in the SchoolRecord database. First insert the records in the
ParentRecord table, as it does not have any foreign key. A set of sample records for the
ParentRecord table is shown in Table 2.6.
Table 2.6 Records to be inserted into the ParentRecord Table
Par_ID | Par_Name | Par_Phone | Par_Address | Par_Email
452695874564 | Manu P Singh | 9834567890 | 203, Khandari, Agra, UP | [email protected]
252154687451 | Ashok K Sharma | 9845678910 | 144 Gr Kailash, New Delhi | [email protected]
686113652987 | Gurmeet Singh | 9635214789 | Shahid Nagar, Amritsar, PB | [email protected]
954891122475 | Michal DeSousa | 8554658958 | Guindy, Chennai, TN | [email protected]
Example 2.23: The following command is used to insert the record in the “ParentRecord” table.
We can use the SQL statement “SELECT * FROM table_name;” after any statement to view the
inserted records and see the current changes in the table.
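As an illustration, the first row of Table 2.6 could be inserted as sketched below. The sketch assumes the ParentRecord table exists with its columns in the order shown in Table 2.4. The e-mail address is obscured in Table 2.6, so the value used here is only a placeholder.

```sql
-- Values are listed in the same order as the columns of the table.
-- 'manu@example.com' is a placeholder, not the value from Table 2.6.
INSERT INTO ParentRecord
VALUES ('452695874564', 'Manu P Singh', '9834567890',
        '203, Khandari, Agra, UP', 'manu@example.com');

-- View the contents of the table after the insertion
SELECT * FROM ParentRecord;
```

Par_ID and Par_Phone are written in quotes because they are stored as character data (CHAR), not as numbers.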
It is also possible to provide values only for some of the attributes in a table by just specifying
the attribute name alongside each data value as per the following syntax.
Syntax:
INSERT INTO tablename (column1, column2, ...)
VALUES (value1, value2, ...);
Suppose we want to insert the sixth record in the “ParentRecord” table (Table 2.6) keeping the value of
“Par_Phone” as NULL. Then it is required to insert the values for the other four fields. In this case,
specify the names of the attributes in which the values are to be inserted. The values must be given
in the same order in which the attributes are written in the INSERT command.
Example 2.24: The following command is used to insert the record in the “ParentRecord” table by specifying the field
names and corresponding values.
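The form of such a command is sketched below. The Par_ID and address are taken from the sixth row of Table 2.7, while the name and e-mail are placeholder values used only for illustration, since the corresponding row of Table 2.6 is not shown here. The sketch assumes ParentRecord exists as specified in Table 2.4.

```sql
-- Par_Phone is left out of the column list, so MySQL stores NULL for it.
-- 'Sample Parent' and 'parent@example.com' are placeholder values.
INSERT INTO ParentRecord (Par_ID, Par_Name, Par_Address, Par_Email)
VALUES ('268953264578', 'Sample Parent',
        'Lajpat Nagar, Mathura, UP', 'parent@example.com');

-- The new row shows NULL under Par_Phone
SELECT * FROM ParentRecord WHERE Par_ID = '268953264578';
```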
Now observe that all four values have been inserted in the table ParentRecord, while
“Par_Phone” is set to NULL, which is allowed because the column was defined to accept NULL at the
time of creating the table.
Activities
Practical Activity 2.2 – Insert the records in the ParentRecord table using INSERT command
and check the records inserted in ParentRecord as below.
Practical Activity 2.3 – Insert the records in StudentRecord table (Table 2.7).
Table 2.7 Records to be inserted into the StudentRecord table
Stu_RollNo Stu_FName Stu_LName Stu_DOB Stu_Address Par_ID
1 Rajvardhan Singh 5/15/2003 203, Khandari, Agra UP 452695874564
2 Trilok Sharma 8/15/2004 144 Gr Kailash, New Delhi 252154687451
3 Aditi Gaur 4/6/2005 JP Greens, Noida, UP 362115264625
4 Anshika Agrawal 5/17/2003 Kanda, Bagheshwar, UK 602125125261
5 Nandini Roy 12/29/2003 Fortune Somya, Bhopal, MP 225423344657
6 Pawani Dixit 11/12/2004 Lajpat Nagar, Mathura, UP 268953264578
7 Hiba Rizwan 12/3/2006 Deep Nagar, Sahrsa, Bihar 485466192343
8 Riddhi Gupta 1/11/2005 T Nagar, Hyderabad, Telangana 521556651761
9 Manpreet Singh 9/8/2005 Shahid Nagar, Amritsar, Punjab 686113652987
10 John DeSousa 8/17/2005 Guindy, Chennai, TN 954891122475
Example 2.25: The following command is used to insert the first record in the table “StudentRecord”.
When column names are not mentioned in the INSERT command, it is necessary to give values
for all the columns. So if there is no “Par_ID” for Trilok, then give the NULL value
for “Par_ID”.
Example 2.26: The following command inserts the second record with “Par_ID” value as NULL.
mysql> INSERT INTO StudentRecord VALUES (2, 'Trilok', 'Sharma', '2004-08-15',
'144 Gr Kailash, New Delhi', NULL);
Note/Tip: Please be careful while entering date in INSERT command. Use the ‘YYYY-MM-DD’
format to write date.
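As a sketch of the date format in use, the first record of Table 2.7 could be inserted as follows. It assumes that the columns of StudentRecord are in the order shown in Table 2.7 and that Stu_DOB is of DATE type.

```sql
-- First record of Table 2.7; the date 5/15/2003 is written as '2003-05-15'.
INSERT INTO StudentRecord
VALUES (1, 'Rajvardhan', 'Singh', '2003-05-15',
        '203, Khandari, Agra UP', 452695874564);
```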
Practical Activity 2.4 – Use INSERT command
Insert the records in the employee table using the INSERT command and display them after inserting all
records using the SELECT statement.
Insert the records in the Department table using the INSERT command and display them after inserting all
records using the SELECT statement.
Suppose the parent with Par_ID 485466192343 has requested to change the address to
'WZ - 68, Azad Avenue, Boriwali, Mumbai' and the phone number to '9988776644'.
Example 2.28: The following SQL statement will update this record.
mysql> UPDATE ParentRecord SET Par_Address = 'WZ - 68, Azad Avenue, Boriwali, Mumbai',
Par_Phone = 9988776644 WHERE Par_ID = 485466192343;
The changes made can be verified by using the SELECT statement as below.
Caution: The WHERE clause should be used in the UPDATE and DELETE statements;
otherwise the statement applies to all the records of the table.
In the above query, observe that the roll number and date of birth of the student
whose roll number is 1 are retrieved using the WHERE clause.
2.5.2 Querying using database OFFICE
Let us consider the EMP table of an employee database with the following fields. The “empno” is the
primary key and “deptno” is a foreign key. Table 2.8 shows the data entered in the EMP table.
Table 2.8 Records available in EMP table
empno ename job mgr hiredate sal comm deptno
7019 Smita Clerk 7552 12/14/1994 8800 NULL 20
7049 Alam Salesman 7348 02/17/1995 9600 1800 30
7171 Wasim Salesman 7348 02/19/1995 9250 2000 30
7216 Jawahar Manager 7489 03/30/1995 10975 NULL 20
7304 Manoj Salesman 7348 09/25/1995 9250 2900 30
7348 Balwinder Manager 7489 04/28/1995 10850 NULL 30
7432 Chetana Manager 7489 06/06/1995 10450 NULL 10
7438 Sachin Analyst 7216 12/05/1996 11000 NULL 20
7489 Kushaal President NULL 11/14/1995 13000 NULL 10
7494 Tarun Salesman 7348 09/05/1995 9500 0 30
7526 Amar Clerk 7438 01/08/1997 9100 NULL 20
7550 Jyoti Clerk 7348 11/30/1995 8950 NULL 30
7552 Farhan Analyst 7216 10/27/1995 11000 NULL 20
7584 Mohan Clerk 7432 01/20/1996 9300 NULL 10
7984 Lalitha Clerk 7432 05/23/1998 10300 NULL 10
Now let us see how to apply the SELECT statement to retrieve the desired data from the table.
(a) Retrieve selected columns – It is possible to retrieve the data of only one column of a table.
Example 2.31: The following SQL query statement is used to retrieve the employee number of all employees in the
table.
Observe that the above query retrieves the empno of all the employees from the Emp table, as only one
column is specified.
Let us see another query that selects two columns, the employee number and the corresponding employee
name. Modify the same query by specifying the two fields “empno” and “ename”, and
observe the output.
Example 2.32: The following SQL query statement will retrieve the data of employee number and name in two
columns.
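A minimal sketch of these two queries, assuming the table is named emp and has the columns of Table 2.8:

```sql
-- Example 2.31 as a sketch: only one column is listed.
SELECT empno FROM emp;

-- Example 2.32 as a sketch: two columns, displayed in the order listed.
SELECT empno, ename FROM emp;
```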
(b) Renaming of columns – Field names in a table follow specific naming conventions. It is possible
to display any column under a different name in the output by using an alias with the keyword AS.
Example 2.33: The following SQL query statement selects employee name as “Name” in the output for
all the employees.
Example 2.34: The following SQL query statement will calculate and display the annual salary of
each employee. Annual salary is calculated as “sal*12”.
The caption “sal*12” is not meaningful in the output. It is possible to display it
with a new caption, “Annual Income”. The revised query and its output are given below.
Observe that “ename” is shown with the caption “Name” and “sal*12” is shown with the
caption “Annual Income”.
Note – Annual Income is just a caption for display. It is not added as a new column in the
database table; it appears only in the output of the query. If an alias contains a
space, as in the case of Annual Income, it should be enclosed in quotes as 'Annual Income'.
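The aliasing described here can be sketched as follows, assuming the emp table of Table 2.8:

```sql
-- Example 2.33 as a sketch: ename is displayed under the caption Name.
SELECT ename AS Name FROM emp;

-- Example 2.34 as a sketch: without an alias the caption is sal*12.
SELECT ename, sal*12 FROM emp;

-- Revised query: an alias containing a space is enclosed in quotes.
SELECT ename AS Name, sal*12 AS 'Annual Income' FROM emp;
```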
(c) DISTINCT Clause – The SELECT statement retrieves all the matching data as output, so there may
be duplicate values, such as 2 persons with the same name working in the
department. The DISTINCT clause retrieves only the unique records by omitting the
duplicate records.
Example 2.35: The following SQL query statement shows the different departments available in the
“emp” table.
Let us understand how to retrieve the different types of jobs available using the DISTINCT clause in the
following example.
Example 2.36: The following SQL query statement will use the DISTINCT clause to retrieve the different types
of jobs available in the “emp” table.
Observe that there are 5 different job titles although a larger number of records exist.
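A minimal sketch of the two DISTINCT queries, assuming the emp table of Table 2.8:

```sql
-- Example 2.35 as a sketch: each department number appears only once.
SELECT DISTINCT deptno FROM emp;

-- Example 2.36 as a sketch: each job title appears only once.
SELECT DISTINCT job FROM emp;
```

With the data of Table 2.8, the first query lists the department numbers 10, 20 and 30, and the second lists the five job titles Clerk, Salesman, Manager, Analyst and President. The order of the rows is not guaranteed unless ORDER BY is used.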
(d) WHERE Clause – It retrieves the data that meet some specified conditions. In our OFFICE database,
more than one employee can have the same salary.
Example 2.37: The following SQL query statement will give the distinct salaries of the employees working
in department number 10.
Observe in the output that the distinct salaries of the employees working in department number 10
are retrieved.
In the above example, the = operator is used in the WHERE clause. Other relational operators
(<, <=, >, >=, !=) can also be used to specify conditions as per your requirement. The logical
operators AND, OR, and NOT are used to combine multiple conditions.
Let us see how to compare the values of columns with specific values to retrieve the required records.
Example 2.38: The following SQL query statement will display all the details of those employees of
department 30 who earn more than 5000.
Note: Observe in the output that two different conditions are tested separately. The first condition
tests whether the salary is greater than 5000 and the second condition tests whether the department
number is 30. The AND operator is used to join both conditions.
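These two WHERE conditions can be sketched as follows, assuming the emp table of Table 2.8:

```sql
-- Example 2.37 as a sketch: distinct salaries in department 10.
SELECT DISTINCT sal FROM emp WHERE deptno = 10;

-- Example 2.38 as a sketch: both conditions must be true for a row to appear.
SELECT * FROM emp WHERE sal > 5000 AND deptno = 30;
```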
Let us now compare salaries to find who earns at least 8000 and at most 11000.
Example 2.39: The following SQL query statement will select the name and department number of all
those employees who are earning a salary between 8000 and 11000, inclusive of both values.
The query in Example 2.39 defines a range of salary between 8000 and 11000. The same range can also be
specified using the comparison operator BETWEEN. The output of this query
will be the same as above.
Note: The BETWEEN operator defines the range of values, inclusive of both boundary values, in which
the column value must fall to make the condition true.
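The two equivalent forms of the range condition can be sketched as follows, assuming the emp table of Table 2.8:

```sql
-- Example 2.39 as a sketch: range written with two comparisons joined by AND.
SELECT ename, deptno FROM emp WHERE sal >= 8000 AND sal <= 11000;

-- The same range written with BETWEEN; both boundary values are included.
SELECT ename, deptno FROM emp WHERE sal BETWEEN 8000 AND 11000;
```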
Example 2.40: The following SQL query statement will select details of all the employees who work in
any of the department numbers 10, 20, or 40.
Note: Here the NOT operator is used in combination with IN to retrieve all records except those with
deptno 10 and 20.
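A minimal sketch of the IN and NOT IN conditions, assuming the emp table of Table 2.8:

```sql
-- Example 2.40 as a sketch: deptno matches any value in the list.
SELECT * FROM emp WHERE deptno IN (10, 20, 40);

-- NOT IN: every record except those with deptno 10 or 20.
SELECT * FROM emp WHERE deptno NOT IN (10, 20);
```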
(f) ORDER BY Clause – It is used to display data in an ordered form with respect to a specified
column. By default, ORDER BY displays records in ascending order of the specified column
values. The DESC keyword is used to display the records in descending order.
Let us arrange the records in ascending order using the ORDER BY clause, as in
Example 2.42.
Example 2.42: The following SQL query statement selects details of all the employees in ascending order
of their salaries.
Observe that the records are displayed in ascending order of the salary of each employee. To arrange
records in descending order, use the DESC keyword with ORDER BY as in Example 2.43.
Example 2.43: The following SQL query statement selects details of all the employees in descending
order of their salaries.
Note: The DESC keyword is used after the name of the column on which the records are to be displayed
in descending order.
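The two orderings can be sketched as follows, assuming the emp table of Table 2.8:

```sql
-- Example 2.42 as a sketch: ascending order is the default.
SELECT * FROM emp ORDER BY sal;

-- Example 2.43 as a sketch: DESC follows the column name.
SELECT * FROM emp ORDER BY sal DESC;
```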
(g) Handling NULL Values – SQL supports a special value called NULL to represent a missing or
unknown value. For example, the “Par_Phone” column in the table “ParentRecord” can have a
missing value for certain records. Hence, NULL is used to represent such unknown values. It is
important to note that NULL is different from the value 0 (zero). Also, any arithmetic operation
performed with a NULL value gives NULL. For example, 5 + NULL = NULL because NULL is
unknown, hence the result is also unknown. In order to check for a NULL value in a column, use
the IS NULL operator in the statement. Example 2.44 illustrates the use of IS NULL.
Example 2.44: The following SQL query statement selects details of all employees who have not been
given a commission. This implies that the “comm” column will be NULL.
Observe the output and see the columns mgr and comm where NULL is present.
It is also possible to combine a NULL test with any other condition. Example 2.45 shows how to
use it in a statement.
Example 2.45: The following SQL query statement selects the employee number, employee name and job
of all those employees who have been given a commission (i.e., comm is not NULL) and work in
department 30.
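A minimal sketch of the two NULL tests, assuming the emp table of Table 2.8:

```sql
-- Example 2.44 as a sketch: NULL cannot be tested with =, so IS NULL is used.
SELECT * FROM emp WHERE comm IS NULL;

-- Example 2.45 as a sketch: IS NOT NULL combined with another condition.
SELECT empno, ename, job FROM emp
WHERE comm IS NOT NULL AND deptno = 30;
```

With the data of Table 2.8, the second query also returns Tarun, whose comm is 0, because 0 is a known value and not NULL.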
(h) HAVING clause – It is used in a SELECT statement with GROUP BY to keep only those groups that
satisfy a certain condition.
Syntax:
SELECT expression1, expression2, ... expression_n,
aggregate_function (expression)
FROM tables
[WHERE conditions]
GROUP BY expression1, expression2, ... expression_n
HAVING condition;
Example 2.45 shows how to use the GROUP BY and HAVING clauses jointly. The HAVING clause must
follow the GROUP BY clause in any SELECT query and must precede the ORDER BY clause,
if used.
Example 2.45: The following SQL query statement selects the job, the number of employees in that job and their
total salary, department number wise, where a minimum of 3 employees of the same type of job
are working.
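One possible reading of this example is sketched below: the records are grouped by job within each department, and only groups of at least 3 employees are kept. The aliases NumEmp and TotalSal are illustrative names.

```sql
-- Group by job and department, then keep only groups with 3 or more employees.
SELECT job, deptno, COUNT(*) AS NumEmp, SUM(sal) AS TotalSal
FROM emp
GROUP BY job, deptno
HAVING COUNT(*) >= 3;
```

With the data of Table 2.8, only the Salesman group of department 30, which has 4 employees, satisfies the condition.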
(i) Substring pattern matching – Many times a query is required to retrieve not an exact
text or value but all values that match a few characters. For
example, finding names starting with “M” or pin codes starting with “11” is called
substring pattern matching. Such patterns cannot be matched using the = operator. SQL provides the LIKE
operator, which can be used with the WHERE clause to search for a specified pattern in a column.
The LIKE operator makes use of the following two wild card characters: percent (%) and underscore (_). The percent
(%) is used to represent zero, one, or multiple characters. The underscore (_) is used to represent
exactly a single character.
There are several situations in which data records are searched for some pattern. A very
common situation is searching for a contact in your smart phone: you start typing the first
few characters of the name, a list of names with these characters appears immediately, and you tap on
the required name to call. Examples 2.46 to 2.51 demonstrate such situations to search some
patterns in text values of records using the LIKE operator.
Example 2.46: The following SQL query statement selects details of all those employees whose name
starts with 'K'.
Example 2.47: The following SQL query statement selects details of all those employees whose name
ends with 'a' and who get a salary of more than 8500.
Example 2.48: The following SQL query statement selects details of all those employees whose name
consists of exactly 5 letters and starts with any letter but has ‘mita’ after that.
You can also match a particular character or string in between the text simply by using wild card
character as shown in example 2.49.
Example 2.49: The following SQL query statement selects all columns of all employees containing 'ma'
as a substring in name.
Example 2.50: The following SQL query statement selects all columns of employees containing 'a' as the
second character in their names.
Example 2.51: The following SQL query statement selects records of all the employees except Alam.
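The patterns of Examples 2.46 to 2.51 can be sketched as follows, assuming the emp table of Table 2.8. For Example 2.51, NOT LIKE is one way to exclude a name; the != operator would also work.

```sql
-- Example 2.46: name starts with K.
SELECT * FROM emp WHERE ename LIKE 'K%';

-- Example 2.47: name ends with a, and salary is more than 8500.
SELECT * FROM emp WHERE ename LIKE '%a' AND sal > 8500;

-- Example 2.48: exactly 5 letters, any first letter followed by mita.
SELECT * FROM emp WHERE ename LIKE '_mita';

-- Example 2.49: ma appears anywhere in the name.
SELECT * FROM emp WHERE ename LIKE '%ma%';

-- Example 2.50: a is the second character of the name.
SELECT * FROM emp WHERE ename LIKE '_a%';

-- Example 2.51: all employees except Alam.
SELECT * FROM emp WHERE ename NOT LIKE 'Alam';
```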
2.6 SQL FOR DATA CONTROL LANGUAGE (DCL)
Data Control Language is the part of SQL that has commands to manage the work permissions
of users. A user is able to work as per the permissions granted by the DBA
(Database Administrator). DCL includes the commands GRANT and REVOKE, which are used to
give and take back the rights and permissions of users.
GRANT statement – The GRANT statement is used to give access privileges to a specific user to
work with a selected database only.
Syntax:
GRANT SELECT, UPDATE ON Test_Table TO NewUser1, NewUser2;
Example:
GRANT SELECT, UPDATE, DELETE ON carshowroom TO 'WebUser';
Here the user 'WebUser' will be able to use only the three SQL statements SELECT, UPDATE and DELETE
when working on the carshowroom database.
REVOKE statement – The REVOKE statement is used to withdraw privileges from a specific user
so that the user can no longer use a specific statement on the selected database. In other words, it is
used to take back given permissions from the user.
Syntax:
REVOKE Privilege_Name ON Object_Name FROM User_Name;
Example:
REVOKE DELETE ON carshowroom FROM WebUser;
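In MySQL, privileges on a whole database are usually written with the form database_name.* and the user is named together with a host. The sketch below assumes a MySQL 8.0 server; the host 'localhost' and the password are placeholders.

```sql
-- Hypothetical account; in MySQL 8.0 the user must exist before GRANT is used.
CREATE USER 'WebUser'@'localhost' IDENTIFIED BY 'ChangeMe#123';

-- Give privileges on every table of the carshowroom database.
GRANT SELECT, UPDATE, DELETE ON carshowroom.* TO 'WebUser'@'localhost';

-- Take back only the DELETE privilege.
REVOKE DELETE ON carshowroom.* FROM 'WebUser'@'localhost';

-- List the privileges that the account still has.
SHOW GRANTS FOR 'WebUser'@'localhost';
```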
2.7 SQL FOR TRANSACTION CONTROL LANGUAGE (TCL)
Transaction Control Language (TCL) is the part of SQL whose commands allow changes to the
database to be made permanent or database transactions to be undone. It is similar to saving the database or
undoing the current changes. The COMMIT, ROLLBACK and SAVEPOINT statements come under
this category.
COMMIT – The COMMIT command is used to save all the transactions to the database. After
completing any operation or SQL statement, you can simply write COMMIT as the next statement
to permanently save data in the database.
Syntax:
Commit;
Example: DELETE FROM ClassStudents WHERE RollNo =25;
Commit;
Here, the COMMIT statement is used after the DELETE statement. It means the record of the student
whose RollNo is 25 is permanently deleted. After the COMMIT statement, it is not possible to
roll back the record of that student.
ROLLBACK – The ROLLBACK command allows transactions that have not already been
saved to the database to be undone. This statement is useful to restore the database to its state at the last
COMMIT statement. The ROLLBACK statement is also used with the SAVEPOINT statement to
jump to a specific savepoint in the database transactions.
Syntax:
ROLLBACK;
SAVEPOINT – This command sets a savepoint within a transaction. Basically, the
SAVEPOINT statement is used to save a transaction temporarily so that the user can roll back to that
point as and when required.
Syntax: SAVEPOINT Savepoint_Name;
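The three TCL statements can be combined as in the following sketch. It assumes MySQL with a transactional storage engine such as InnoDB; the savepoint name AfterFirstDelete and the second roll number are illustrative.

```sql
-- Begin a transaction so that the changes are not saved automatically.
START TRANSACTION;

DELETE FROM ClassStudents WHERE RollNo = 25;

-- Mark a point that can be returned to later.
SAVEPOINT AfterFirstDelete;

DELETE FROM ClassStudents WHERE RollNo = 26;

-- Undo only the work done after the savepoint (the second DELETE).
ROLLBACK TO SAVEPOINT AfterFirstDelete;

-- Make the remaining change (the first DELETE) permanent.
COMMIT;
```

After these statements, only the record with RollNo 25 is deleted; the record with RollNo 26 is retained.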
SUMMARY
• SQL is a domain-specific language that is used to manage relational databases.
• Currently almost all RDBMS such as MySQL, Oracle, Informix, SQL Server, MS Access, and
Sybase use SQL as their standard database language.
• SQL is easy to learn as the statements consist of descriptive English words.
• SQL is an open source, interactive, portable, standardized and universal language with
fast query processing to work with RDBMS.
• SQL is divided into five categories: DDL, DML, DQL, TCL and DCL.
• DDL (Data Definition Language) includes SQL statements such as, Create table, Alter table
and Drop table.
• Create command is used to create database and its further objects like Table, View.
• DML (Data Manipulation Language) includes SQL statements such as, insert, select, update
and delete.
• A table is a collection of rows and columns, where each row is a record and columns describe
the feature of records.
• DESCRIBE statement is used to view the structure of an already existing table.
• ALTER TABLE statement is used to make changes in the structure of a table like adding,
removing column and changing datatype of column(s). It is also used to apply/remove any
constraints like Primary Key, Foreign Key etc.
• DROP statement is used to remove a database or a table permanently from the database
system.
• TRUNCATE statement is used to delete all records from the table but table structure will
exist in database.
• INSERT INTO statement is used to insert new records in any existing table
• UPDATE statement is used to make required changes in records of any table.
• DELETE statement is used to delete/remove one or more records from a table.
• CREATE TABLE statement can also be used to create a new table from existing table/s.
• RENAME statement is used to change the name of existing tables or other database objects.
• A view in any database is a special kind of virtual table that is created from one or more tables
and has no data of its own.
• WHERE clause in SQL query is used to enforce condition(s).
• DISTINCT clause is used to eliminate repetition and display the values only once.
• The BETWEEN operator defines the range of values inclusive of boundary values.
• The IN operator selects values that match any value in the given list of values.
• NULL values can be tested using IS NULL and IS NOT NULL.
• ORDER BY clause is used to display the result of a SQL query in ascending or descending
order with respect to specified attribute values. By default, the order is ascending.
• LIKE operator is used for pattern matching. % and _ are two wild card characters. The per
cent (%) symbol is used to represent zero or more characters. The underscore (_) symbol is
used to represent a single character.
4. The records and structure of a table may be removed or deleted from the database
using which command? (a) Remove (b) Delete (c) Drop (d) Truncate
5. SQL ___ statement can be used to delete or drop existing databases in a SQL schema.
(a) Create Database (b) Rename Database (c) Drop Database (d) Select Database
6. Using the DROP TABLE command in SQL will (a) Drop the table structure (b) Drop the integrity
constraints (c) Drop the relationship (d) All of the above
7. Using the DROP TABLE command in SQL will (a) Drop the table structure (b) Drop the integrity
constraints (c) Drop the relationship (d) None of the above
8. TRUNCATE TABLE requires (a) WHERE clause (b) HAVING clause (c) Both (a) and (b) (d)
None of the above
9. Which of the following clauses is used to add a Primary Key constraint after creating a
table? (a) Update (b) Add (c) Alter (d) Join
10. Which of the following clauses is used to remove a primary key constraint? (a) Delete (b)
Drop (c) Alter (d) Remove
11. Which of the following SQL clauses is used to give the result in sorted order? (a) Sort By
(b) Order (c) Order By (d) Sort
12. Commands under DCL are (a) GRANT (b) REVOKE (c) Both (a) and (b) (d) None of the
above
13. The SQL command to retrieve table records is (a) RETRIEVE (b) SELECT (c) CREATE (d)
ALTER
14. Which of the following operators is used for pattern matching in SQL? (a) BETWEEN
operator (b) LIKE operator (c) EXISTS operator (d) None of these
15. Which operator is used to check the absence of data in any column? (a) EXISTS operator
(b) NOT operator (c) IS NULL operator (d) None of these
16. Which of the following keywords is used to select only unique values from any column? (a)
DISTINCTIVE (b) UNIQUE (c) DISTINCT (d) DIFFERENT
B. Fill in the blanks
1. SQL is divided into ___________ categories.
2. The _______ command is used to see the structure of a table.
3. The _______ command is used to remove all records.
4. The _______ command is used to add an attribute in an existing table.
5. The _______ command is used to remove all records only from a table.
6. The _______ command is used to remove an attribute from a table.
7. A view is a special kind of _____________ table.
8. Views can be created from _________ or more tables.
9. Grant and Revoke are part of _____________ in SQL.
10. Commit and Savepoint are part of ___________ in SQL.
11. To sort the result of a query in descending order, we can use clause ______
12. To extract unique values from a column, user can use __________ clause.
C. State whether True or False
1. INSERT clause is used to add a Foreign key constraint.
2. ALTER clause is used to add a Primary key constraint after table is created.
3. DROP command is used to delete the structure of a table from the database.
4. Updation and deletion of records are part of DDL.
5. Insert into statement is useful to insert a new field in any table.
6. Aggregate functions are used to perform calculations on multiple values and return a
single value.
7. Aggregate functions are mostly used with the SELECT statement.
8. DML is used to create new database objects like tables and views.
9. A new table can be created from existing table(s).
10. The name of a table cannot be changed once it is created and records are inserted.
• add new columns address and mobno to the newly created table NewEmp.
• Suppose the DBMS admin forgot to make empno the primary key and deptno a foreign
key. Write the SQL query to make these changes.
• change emp name with your name for empno=7034 in table NewEmp.
• change emp name with your friend's name for empno=7550 in table NewEmp.
• insert mobno and address in your record and in your friend's record.
• delete the column address from the new table NewEmp.
• delete the newly created table NewEmp.
2. Consider the following table named “Product”, showing details of products being sold in a
grocery shop.
PCode PName UPrice Manufacturer
P01 Washing Powder 130 Surf
P02 Toothpaste 58 Colgate
P03 Soap 29 Lux
P04 Toothpaste 75 Pepsodent
P05 Soap 44 Dove
P06 Shampoo 275 Dove
P08 Toothpaste 44 Patanjali
P09 Soap 48 Hamam
P10 Washing Powder 90 Henko
Write SQL queries for the following.
a) Create the table Product with appropriate data types and constraints.
b) Identify the primary key in Product.
c) List the Product Code, Product name and price in descending order of their product name.
If PName is same, then display the data in ascending order of price.
d) Add a new column Discount to the table Product.
e) Calculate the value of the discount in the table Product as 10 per cent of the UPrice for
all those products where the UPrice is more than 100, otherwise the discount will be 0.
f) Increase the price by 12 per cent for all the products manufactured by Dove.
g) Display the total number of products manufactured by each manufacturer.
3. Consider the following MOVIE table and write the SQL queries based on it.
d) Find the net profit of each movie showing its MID, MovieName and NetProfit. Net Profit is to
be calculated as the difference between BussCost and ProdCost.
e) List MID, MovieName and Cost for all movies with ProdCost greater than 10,000 and less
than 1,00,000.
f) List details of all movies which fall in the category of comedy or action.
g) List details of all movies which have not been released yet.
4. Suppose your school management has decided to conduct cricket matches between students
of Class XI and Class XII. Students of each class are asked to join any one of the four teams –
Team Titan, Team Rockers, Team Magnet and Team Hurricane. During summer vacations,
various matches will be conducted between these teams. Help your sports teacher to do the
following:
1. Create a database “Sports”.
2. Create a table “TEAM” with the following considerations:
◦ It should have a column TeamID for storing an integer value between 1 and 9, which
refers to the unique identification of a team.
◦ Each TeamID should have its associated name (TeamName), which should be a
string of length not less than 10 characters.
3. Using a table level constraint, make TeamID the primary key.
4. Show the structure of the table TEAM using a SQL statement.
5. As per the preferences of the students, four teams were formed as given below. Insert
these four rows in the TEAM table:
Row 1: (1, Team Titan)
Row 2: (2, Team Rockers)
Row 3: (3, Team Magnet)
Row 4: (4, Team Hurricane)
6. Show the contents of the table TEAM using a DML statement.
7. Now create another table MATCH_DETAILS and insert data as shown below. Choose
appropriate data types and constraints for each attribute.
Table: MATCH_DETAILS
MatchID MatchDate FirstTeamID SecondTeamID FirstTeamScore SecondTeamScore
M1 7/17/2022 1 2 90 86
M2 7/18/2022 3 4 45 48
M3 7/19/2022 1 3 78 56
M4 7/19/2022 2 4 56 67
M5 7/18/2022 1 4 32 87
M6 7/17/2022 2 3 67 51
SQL provides various readily available functions that can be used in queries. This unit covers
single row functions, multiple row functions, grouping of records based on some criteria, and working
on multiple tables using SQL.
A function is used to perform some particular task and it returns zero or more values as a
result. Functions are also useful while writing SQL queries. Functions can be applied to work on
single or multiple records (rows) of a table.
3.1 SQL functions
SQL functions are categorized as Single Row functions and Aggregate functions, depending on
their application in one or multiple rows.
Single Row Functions are also known as Scalar functions. Single row functions are applied on a
single value and return a single value. These are used in SELECT, WHERE, and ORDER BY
clause. MATH, STRING and DATE functions are examples of single row functions.
Aggregate functions are also called Multiple Row functions. These functions work on a set of
records as a whole and return a single value for each column of the records on which the function
is applied. These are used with SELECT clause only. MAX ( ), MIN ( ), AVG ( ), SUM ( ), COUNT
( ) and COUNT (*) are examples of multiple row functions.
To demonstrate the use of SQL functions, let us create a database called CARSHOWROOM having
the schema with four relations as shown in Figure 3.1.
Insert the records in tables Inventory, Customer, Sale and Employee using INSERT command.
The records of these four relations can be viewed using the SELECT command.
Execute the following query to view the records of “inventory” table. After successful execution
of the query, the records entered in the “inventory” table will be displayed.
Execute the following query to view the records of “customer” table. After successful execution
of the query, the records entered in the “customer” table will be displayed.
Execute the following query to view the records of “sale” table. After successful execution of the
query, the records entered in the “sale” table will be displayed.
Execute the following query to view the records of “employee” table. After successful execution
of the query, the records entered in the “employee” table will be displayed.
2. ROUND (N, D) – Rounds off number N to D number of decimal places. If D=0, then it rounds
off the number to the nearest integer.
Step 2. Add a new column “FinalPrice” to the table “inventory”. Update the table “inventory” with
“FinalPrice” as the sum of Price and 12 percent of the GST. Apply the ROUND function to round
off the GST to one decimal place. Execute the following query to do this.
Display the values of “FinalPrice” for all the record by using the SELECT command.
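Step 2 can be sketched as follows. The sketch reads the step as FinalPrice = Price + GST, where GST is 12 percent of Price rounded off to one decimal place; the data type NUMERIC(10,1) is an assumption.

```sql
-- Add the new column to the inventory table.
ALTER TABLE inventory ADD FinalPrice NUMERIC(10,1);

-- GST is taken as 12 percent of Price, rounded to one decimal place.
UPDATE inventory SET FinalPrice = Price + ROUND(Price * 12 / 100, 1);

-- Display the records along with the new column.
SELECT * FROM inventory;
```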
Step 3. Calculate and display the amount to be paid each month in multiples of 1000, which is
calculated after dividing the FinalPrice of the car into 10 installments. After dividing the amount
into EMIs, find out the remaining amount to be paid immediately, by performing modular
division. Use SELECT command to display the result. Execute the following query to do this.
Step 4. Execute the following query to display the “InvoiceNo” and “Commission” value rounded
off to zero decimal places.
Step 5. Execute the following query to display the details of “sale” table where payment mode
is credit card.
Step 6. Execute the query to add a new column “Commission” with total length of 7 with 2
decimal places to the “sale” table.
Step 7. Execute the query to calculate commission for sales agents as 12% of “SalePrice”.
Step 8. Execute the following query to insert the values to the newly added column “Commission”
and then display all records of the “sale” table where Commission > 73000.
Step 9. Execute the following query to display InvoiceNo, EmpID, SalePrice and Commission
such that commission value is rounded off to 0.
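Steps 6 to 9 can be sketched as follows, using the column names given in the steps. NUMERIC(7,2) means a total of 7 digits, of which 2 are after the decimal point.

```sql
-- Step 6: add the Commission column to the sale table.
ALTER TABLE sale ADD Commission NUMERIC(7,2);

-- Steps 7 and 8: store the commission as 12% of SalePrice in the new column.
UPDATE sale SET Commission = SalePrice * 12 / 100;

-- Step 8: display the records where the commission is more than 73000.
SELECT * FROM sale WHERE Commission > 73000;

-- Step 9: display the commission rounded off to zero decimal places.
SELECT InvoiceNo, EmpID, SalePrice, ROUND(Commission, 0) FROM sale;
```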
3. MID (string, pos, n) OR SUBSTRING (string, pos, n) OR SUBSTR (string, pos, n) – Returns a
substring of size n starting from the specified position (pos) of the string. If n is not specified, it
returns the substring from the position pos till end of the string.
5. LEFT (string, N) – Returns N number of characters from the left side of the string.
8. LTRIM (string) – Returns the given string after removing leading white space characters.
9. RTRIM (string) – Returns the given string after removing trailing white space characters.
10. TRIM (string) – Returns the given string after removing both leading and trailing white
space characters.
Practical Activity 3.2 – Demonstrate the use of string functions
Let us use the Customer relation to understand the working of various string functions.
Step 1. Execute the following query to display customer name in lower case and customer email
in upper case from “customer” table.
Step 2. Execute the following query to display the length of the email and the part of
the email id before the character '@'.
The function INSTR returns the position of “@” in the email address. So, to print the part of the
email id before “@”, the value (position - 1) is used as the number of characters.
Let us assume that four-digit area code is reflected in the mobile number starting from position
number 3. For example, 1851 is the area code of mobile number 9818511338.
Step 3. Execute the following query to display the area code of the customer living in Rohini.
Step 4. Execute the following query to display emails after removing the domain name extension
“.com” from emails of the customers.
Step 5. Execute the following query to display details of all the customers having yahoo emails
only.
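Steps 1 to 5 can be sketched as follows. The column names CustName, Email, Phone and CustAdd are assumptions about the “customer” table and should be replaced by the actual column names.

```sql
-- Step 1: name in lower case, email in upper case.
SELECT LOWER(CustName), UPPER(Email) FROM customer;

-- Step 2: length of the email and the part before '@'.
SELECT LENGTH(Email), LEFT(Email, INSTR(Email, '@') - 1) FROM customer;

-- Step 3: four characters of the phone number starting from position 3.
SELECT MID(Phone, 3, 4) FROM customer WHERE CustAdd LIKE '%Rohini%';

-- Step 4: remove '.com' from the end of the email.
SELECT TRIM(TRAILING '.com' FROM Email) FROM customer;

-- Step 5: customers whose email contains yahoo.
SELECT * FROM customer WHERE Email LIKE '%yahoo%';
```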
Now let us use the table “inventory” from the CARSHOWROOM database and write SQL queries for the
following:
Step 6. Execute the following query to convert the “CarMake” to uppercase if its value starts
with the letter 'B'.
Step 7. Execute the following query to fetch the substring starting from position 3 till the end
from the attribute Model, if the length of the car model is greater than 4.
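Steps 6 and 7 can be sketched as follows, using the column names CarMake and Model mentioned in the steps:

```sql
-- Step 6: upper case only for the makes that start with B.
SELECT UPPER(CarMake) FROM inventory WHERE CarMake LIKE 'B%';

-- Step 7: substring from position 3 to the end, for models longer than 4 characters.
SELECT SUBSTR(Model, 3) FROM inventory WHERE LENGTH(Model) > 4;
```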
and year), displaying day of the week and so on. Some of the date and time functions with
examples are given below.
1. NOW() – It returns the current system date and time.
2. DATE() – It returns the date part from the given date/time expression.
Step 1. Execute the following query to display the date in the format "Wednesday, 26,
November, 1979", if the date of joining is not Sunday.
Step 2. Execute the following query to list the Employee Name, date of birth and Salary for all
employees whose salary is more than 25000, in “emp” table.
Step 3. Execute the following query to list the invoice number, customer id and date of sale
of those sales whose payment is done using bank finance in the “Sale” table.
Step 4. Execute the following query to list all the employees other than peons whose salary is more
than 30000 in the “emp” table.
Step 5. Execute the following query to list all the records without LXI and VXI models in the
table “inventory”.
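Steps 1, 3 and 5 can be sketched as follows. The column name DOJ (date of joining) in the “employee” table and the stored value 'Bank Finance' are assumptions; the other names are taken from the steps and from the “sale” table.

```sql
-- Step 1: date of joining in the form Wednesday, 26, November, 1979,
-- leaving out the employees who joined on a Sunday.
SELECT DATE_FORMAT(DOJ, '%W, %d, %M, %Y') FROM employee
WHERE DAYNAME(DOJ) != 'Sunday';

-- Step 3: sales paid through bank finance.
SELECT InvoiceNo, CustID, SaleDate FROM sale
WHERE PaymentMode = 'Bank Finance';

-- Step 5: all records except the LXI and VXI models.
SELECT * FROM inventory WHERE Model NOT IN ('LXI', 'VXI');
```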
2. MIN (column) – Returns the smallest value from the specified column.
3. AVG (column) – Returns the average of the values in the specified column.
4. SUM (column) – Returns the sum of the values for the specified column.
5. COUNT (*) – Returns the number of records in a table. COUNT (*) is used with the WHERE clause to
display the number of records that match particular criteria in the table.
Step 2. Execute the following SQL query to display the total number of different types of
Models available from table “inventory”.
Step 3. Execute the following SQL query to display the average price of all the cars with Model
LXI from table “inventory”.
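Steps 2 and 3 can be sketched as follows, using the column names Model and Price of the “inventory” table:

```sql
-- Step 2: number of different models; DISTINCT counts each model once.
SELECT COUNT(DISTINCT Model) FROM inventory;

-- Step 3: average price of the cars whose model is LXI.
SELECT AVG(Price) FROM inventory WHERE Model = 'LXI';
```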
aggregate functions (COUNT, MAX, MIN, AVG and SUM) can be used with GROUP BY clause. The HAVING clause
in SQL is used to specify conditions on the groups formed by the GROUP BY clause.
Practical Activity 3.5 – Demonstrate the use of GROUP BY and HAVING clauses in SQL
Consider the “sale” table from the CARSHOWROOM database. Display the number of records in
the “sale” table using the following SQL statement.
In these records, it is observed that the columns CarID, CustID, SaleDate, PaymentMode,
EmpID and SalePrice can each hold the same value in more than one row. So, the GROUP BY clause can be
used on these columns to find the number of records of a particular type (column value), or to calculate the
sum of the price of each car type.
Step 1. Execute the following SQL query to display the number of Cars purchased by each
Customer from SALE table.
Step 2. Execute the following SQL query to display the Customer Id and number of cars
purchased if the customer purchased more than 1 car from SALE table.
Step 3. Execute the following SQL query to display the number of people in each category of
payment mode from the table SALE.
Step 4. Execute the following SQL query to display each PaymentMode that was used more than once,
along with the number of payments made using that mode.
5 Abhay 8A
Execute the following query to view the records of “music” table. After successful execution of the
query, the records entered in the “music” table will be displayed.
Step 1. Execute the following SQL query to find the list of students participating in either of
events by using UNION operation on relations DANCE and MUSIC. After execution it will display
the union of DANCE and MUSIC relations.
specify conditions on the related attributes of two tables within the FROM clause. Usually, such
attribute is the primary key in one table and foreign key in another table.
Let us create two tables UNIFORM (UCode, UName, UColor) and COST (UCode, Size, Price) in
the SchoolUniform database. “UCode” is the primary key in table UNIFORM. “UCode” and “Size” together
form the composite key in table COST. UCode is therefore a common attribute between the two tables,
which can be used to fetch the related data from both tables. Define UCode as a foreign key in
the “Cost” table while creating this table. Enter the records in these tables as shown in Tables
3.13 and 3.14.
Table 3.13 Uniform table
UCode UName UColor
1 Shirt White
2 Pant Grey
3 Tie Blue
Table 3.14 Cost table
UCode Size Price
1 L 580
1 M 500
2 L 890
2 M 810
Practical Activity 3.6 – Demonstrate how to join two tables in SQL
Let us consider the two tables created above, UNIFORM and COST, to demonstrate the joining of two tables.
The joining of two tables can be done in three different ways – using the WHERE clause, the JOIN clause
and the NATURAL JOIN clause.
Step 1. Execute the following query to join the two tables using WHERE clause.
As the attribute “UCode” appears in both the “uniform” and “cost” tables, a table alias is used to
remove ambiguity. The qualifier U is specified with the attribute UCode in the SELECT and FROM clauses
to indicate its scope.
Step 2. Execute the following query to join the two tables using JOIN clause.
The output of the query is the same as that of step 1. In this query the JOIN clause is used explicitly
along with the condition in the FROM clause. Hence no condition is required in the WHERE clause.
The output of the queries in steps 1 and 2 has a repeated column UCode with exactly the same
values. This redundant column provides no additional information. SQL provides the extension
of the JOIN operation called NATURAL JOIN, which works like the JOIN clause in SQL but removes
the redundant attribute. This operator can be used to join the contents of two tables if there is
one common attribute in both the tables.
Step 3. Execute the following query to join the two tables using NATURAL JOIN clause.
It is clear from the output that the result of this query is same as above in step 1 and 2, except
that the attribute UCode appears only once.
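The three ways of joining described in Steps 1 to 3 can be tried out with the sketch below. It is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module with an in-memory database instead of MySQL, it assumes that the fourth COST row belongs to UCode 2 so that the composite key (UCode, Size) stays unique, and the variable names are chosen for this example.

```python
import sqlite3

# In-memory SQLite database, used here only for illustration (the text uses MySQL).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

cur.execute("CREATE TABLE uniform (UCode INTEGER PRIMARY KEY, UName TEXT, UColor TEXT)")
cur.execute(
    "CREATE TABLE cost ("
    " UCode INTEGER, Size TEXT, Price INTEGER,"
    " PRIMARY KEY (UCode, Size),"
    " FOREIGN KEY (UCode) REFERENCES uniform (UCode))"
)
cur.executemany(
    "INSERT INTO uniform VALUES (?, ?, ?)",
    [(1, "Shirt", "White"), (2, "Pant", "Grey"), (3, "Tie", "Blue")],
)
# Assumption: the last row has UCode 2, so that (UCode, Size) is unique.
cur.executemany(
    "INSERT INTO cost VALUES (?, ?, ?)",
    [(1, "L", 580), (1, "M", 500), (2, "L", 890), (2, "M", 810)],
)

# Step 1: join using the WHERE clause (UCode appears twice in the result).
where_rows = cur.execute(
    "SELECT * FROM uniform U, cost C WHERE U.UCode = C.UCode"
).fetchall()

# Step 2: join using the JOIN clause with an ON condition.
join_rows = cur.execute(
    "SELECT * FROM uniform U JOIN cost C ON U.UCode = C.UCode"
).fetchall()

# Step 3: NATURAL JOIN keeps the common attribute UCode only once.
natural_rows = cur.execute("SELECT * FROM uniform NATURAL JOIN cost").fetchall()
natural_columns = [column[0] for column in cur.description]

print(len(where_rows[0]), "columns with WHERE/JOIN")
print(natural_columns)
conn.close()
```

The Tie row (UCode 3) has no matching row in COST, so it does not appear in any of the three results.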
It is important to note the following points while applying JOIN operations on two or more
relations.
• If two tables are to be joined on equality condition on the common attribute, then one
may use JOIN with ON clause or NATURAL JOIN in FROM clause. If three tables are to
be joined on equality condition, then two JOIN or NATURAL JOIN are required.
• In general, N-1 joins are needed to combine N tables on equality condition.
• Any relational operators can be used with JOIN clause to combine tuples of two tables.
SUMMARY
• A Function is used to perform a particular task and return a value as a result.
• Single Row functions work on a single row of the table and return a single value.
• Multiple Row functions work on a set of records as a whole and return a single value.
Examples include COUNT, MAX, MIN, AVG and SUM.
• GROUP BY clause is used to group rows of a table that contain the same values in a
specified column.
• Join is an operation which is used to combine rows from two or more tables based on one
or more common fields between them.
8. Which of the following SQL operations cannot be performed on relations? (a) Union (b)
Intersection (c) Difference (d) Merge
9. Which of the following is used to join two tables on equality condition on the common
attribute (a) JOIN with ON clause (b) NATURAL JOIN in FROM clause (c) Any of a or b (d)
NATURAL JOIN
10. What will be the degree and cardinality of the Cartesian product of two relations, the first having
4 rows and 3 columns and the second having 3 rows and 4 columns? (a) degree 7 cardinality 12 (b)
degree 6 cardinality 16 (c) degree 7 cardinality 16 (d) degree 9 cardinality 16
B. Fill in the blanks
1. Single row functions are applied on a single ______ and return a single value.
2. Aggregate functions work on a ________ as a whole and return a single value.
3. Math Functions accept numeric value as input and return a ______ value as a result.
4. MONTH (date) returns the month in ________ form from the date.
5. By default, the order by clause lists items in ______ order.
6. INSTR (string, substring) returns the position of the _________ of the substring in the
given string.
7. MID (string, pos, n) returns a substring of size ___ starting from the specified position
______ of the string.
8. LTRIM (string) returns the given string after removing ________ white space characters.
9. TRIM (string) returns the given string after removing both ________ and _________ white
space characters.
10. The _________ operation is used to get common tuples from two tables.
C. State True or False
1. Aggregate functions are also called Scalar functions.
2. A function always returns a single value.
3. Functions can be applied to work on single or multiple records of a table.
4. INSTR (string, substring) returns 0, if the substring is not present in the string.
5. If n is not specified in MID (string, pos, n), it returns the substring from position 1 till the
end of the string.
6. RTRIM (string) returns the given string after removing leading white space characters.
7. NOW() returns the current system date and time.
8. Union operation eliminates the duplicate rows.
9. Cartesian product operation combines tuples from two relations.
10. Join statement is used to combine two tables on a specified condition.
D. Short answer questions
1. Differentiate between single row functions and aggregate functions.
2. List the single row functions with example.
3. Differentiate between TRIM( ), LTRIM( ) and RTRIM( ) functions.
4. Demonstrate the use of LCASE( ) and UCASE( ) function with example.
5. List the date functions with example.
6. What is the difference between NOW( ) and DATE( ) function?
7. Demonstrate the difference between the SUM( ) and AVG( ) functions.
8. A table Student has 4 rows and 2 columns and another table has 3 rows and 4 columns.
How many rows and columns will there be if we obtain the Cartesian product of these
two tables?
9. What will be the output of the following SQL functions?
a) Select pow (3,2);
b) Select round (342.9234, 2);
c) Select length (‘Vocational Education’);
Module Overview
In Grade 11, you have learned the basic data structures – list, set, tuples, and dictionary. Each
of the data structures is unique in its own way. A data structure defines a mechanism to store,
organize and access data along with operations that can be efficiently performed on the data.
An exception is an error that happens during the execution of a program. Exceptions must be handled
to prevent the program from crashing. File handling in Python is a fundamental
skill for developers, enabling them to manage data effectively, perform data processing tasks, and
work with various data sources. In Python, the list data structure serves the purpose of arrays, but
lists are slow to process.
NumPy aims to provide an array object that is much faster than traditional Python lists. NumPy,
which stands for Numerical Python, is a Python library used for working with arrays. A Pandas Series
is similar to a one-dimensional NumPy array and consists of an array of data (values) and an array of
labels (indices). It has additional functionality that allows values in the Series to be indexed using
labels. A NumPy array does not have the flexibility to do this.
Data visualisation in the form of charts, graphs, animations, and maps makes it easy
to understand the trends, outliers, and patterns in data. Data visualisation techniques are
very important for the analysis of big data.
In Python, it is important to understand database connectivity for software development. A Python
program can connect to a database to develop applications, using libraries
that provide the required connectivity functions.
In this unit, the advanced topics of implementing data structures using Stack & Queue, Exception
Handling, File Handling, NumPy Array, Pandas Series, Graphical Representation using Matplotlib,
and Database Connectivity with MySQL are covered.
Learning Outcomes
After completing this module, you will be able to:
• Code and execute programs to implement the Stack and Queue data structures in Python
• Code and execute programs to handle exceptions in Python
• Code and execute programs for file handling in Python
• Code and execute programs to create and use NumPy arrays in Python
• Code and execute programs to use Pandas and Series in Python
• Code and execute programs to demonstrate Graphical Representation using
Matplotlib
• Code and execute programs to establish database connectivity with MySQL
Module Structure
Session 1. Implementing Data Structure using Stack & Queue
Session 2. Exception Handling in Python
Session 3. File Handling in Python
In Python, different data types are used to handle values. String, List and Tuple are
sequence data types, and Set is an unordered collection type. They can be used to represent a
collection of elements either of the same type or of different types. Multiple data elements are
grouped in a particular way for fast access and efficient storage of data. Thus Python uses different
data types for storing data values. Such a grouping is referred to as a data structure. A data structure
defines a mechanism to store, organize and access data along with operations (processing) that can be
efficiently performed on the data.
For example, the string data structure has a sequence of elements where each element is a
character. A list is a sequence data structure in which each element may be of a different type.
It is possible to apply different operations like reversal, slicing and counting of elements on lists and
strings. A data structure organizes multiple elements so that certain operations can be performed
on each element as well as on the collective data unit.
Stack and Queue are two other popular data structures used in programming. They are not directly
available as built-in types in Python, but it is important to learn these concepts as they are
extensively used in programming.
1.1 APPLICATIONS OF STACK
When books are kept one on top of another, they form a stack of books as shown in Figure
1.1. Any new book is always added on top of the stack. In the same
way, the topmost book is the first to be removed. It is not possible to add or remove a book
in the middle or at the bottom. Addition or removal of a book is possible from the top end only,
following the LIFO (Last-In-First-Out) principle. This arrangement of elements in a linear order
is called a stack. The element which was inserted last (the most recent element) will be the first
one to be taken out from the stack.
In text or image editing, functions such as undo/redo have to be programmed. The most
recent operations can be undone or redone using these buttons. A stack is useful for
programming this type of operation.
In web browsing, web pages are accessed sequentially. The history of pages browsed is
maintained as a stack in the browser. When going back to previous
pages, the most recently visited page is shown first.
1.2 OPERATIONS ON STACK
A stack implements the LIFO arrangement, where elements are added to and deleted from the stack at
one end only. The end from which elements are added or deleted is called the TOP of the stack. The two
main operations performed on the stack are PUSH and POP. PUSH is performed to add an
element to the stack and POP is performed to remove an element from the stack. These operations
can be programmed in Python.
1.2.1 PUSH and POP Operations
PUSH – This operation is used to add a new element at the TOP of the stack. It is an insertion
operation. Elements can be added to a stack until it is full. Adding an element when the
stack is full results in a “stack overflow” error.
POP – This operation is used to remove the topmost element of the stack. It is a deletion
operation. Elements can be deleted from a stack until it is empty. Deleting an element from
an empty stack results in a “stack underflow” error.
The following programme illustrates the stack operations in Python.
1.3 IMPLEMENTATION OF STACK IN PYTHON
The various operations such as inserting an element in the stack (PUSH) and deleting an element
from the stack (POP) can be implemented using Python programming. The following programme
is coded to illustrate these operations.
After executing the above programme you can perform the stack operations.
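A minimal sketch of PUSH and POP using a Python list is given below. The function names push and pop and the sample book titles are assumptions made for this illustration. A Python list grows as needed, so this sketch checks only for underflow and not for overflow.

```python
stack = []  # an empty list; the last item of the list is treated as TOP


def push(stack, element):
    """PUSH: add element at the TOP of the stack."""
    stack.append(element)


def pop(stack):
    """POP: remove and return the TOP element, or None on underflow."""
    if len(stack) == 0:
        print("Stack underflow: nothing to pop")
        return None
    return stack.pop()


push(stack, "Maths")
push(stack, "Physics")
push(stack, "English")
print("Stack after three PUSH operations:", stack)

removed = pop(stack)  # "English" was pushed last, so it leaves first (LIFO)
print("POP removed:", removed)
print("Stack now:", stack)
```

Calling pop() on an empty list passed to this function prints the underflow message and returns None instead of stopping the program.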
1.4 QUEUE
As we have seen, the Stack data structure works on the Last-In-First-Out (LIFO) principle. Queue is
another data structure, which works on the First-In-First-Out (FIFO) principle. A queue is an
ordered linear list of elements, having different ends for adding and removing elements.
In our everyday life, we see customers forming a queue at the cash counter in a bank and
vehicles queued at fuel pumps. Whenever there is a queue, the one standing first in the queue
exits first. Thus it follows the principle of First In First Out (FIFO). A queue is an arrangement in
which new elements always get added at one end, usually called the REAR, and elements always
get removed from the other end, usually called the FRONT of the queue. REAR is also known as
TAIL and FRONT as HEAD of a queue.
1.4.1 OPERATIONS ON QUEUE
The data structure queue supports the following operations.
ENQUEUE – Is used to insert a new element into the queue at the rear end. Elements can be
inserted at the rear end as long as there is space in the queue. Trying to insert an element
beyond the capacity of the queue results in the exception
“Overflow”.
DEQUEUE – Is used to remove one element at a time from the front of the queue. Elements
can be deleted from a queue until it is empty. Trying to delete an element from an empty queue
results in the exception “Underflow”.
To perform enqueue and dequeue efficiently on a queue, the following operations are also required.
IS EMPTY – used to check whether the queue has any element or not, so as to avoid an Underflow
exception while performing the dequeue operation.
PEEK – used to view the element at the front of the queue, without removing it from the queue.
IS FULL – used to check whether any more elements can be added to the queue or not, to avoid
an Overflow exception while performing the enqueue operation.
1.5 IMPLEMENTATION OF QUEUE IN PYTHON
Following program illustrates the various operations performed on Queue.
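One possible way to write these operations with a Python list is sketched below. The function names, the sample elements and the CAPACITY limit are assumptions made for this illustration. A Python list has no fixed size, so the limit is introduced only to show IS FULL and Overflow.

```python
CAPACITY = 3  # hypothetical limit, so that IS FULL and Overflow can be shown

queue = []    # index 0 is the FRONT, the last item is the REAR


def is_empty(queue):
    """IS EMPTY: True when the queue has no elements."""
    return len(queue) == 0


def is_full(queue):
    """IS FULL: True when no more elements can be added."""
    return len(queue) >= CAPACITY


def enqueue(queue, element):
    """ENQUEUE: insert element at the REAR; report Overflow if full."""
    if is_full(queue):
        print("Overflow: queue is full")
        return False
    queue.append(element)
    return True


def dequeue(queue):
    """DEQUEUE: remove and return the FRONT element; report Underflow if empty."""
    if is_empty(queue):
        print("Underflow: queue is empty")
        return None
    return queue.pop(0)


def peek(queue):
    """PEEK: return the FRONT element without removing it."""
    if is_empty(queue):
        return None
    return queue[0]


results = [enqueue(queue, name) for name in ["A", "B", "C", "D"]]  # "D" overflows
front = peek(queue)
served = dequeue(queue)  # "A" entered first, so it leaves first (FIFO)
print(results, front, served, queue)
```

Removing from index 0 with pop(0) keeps the sketch close to the FRONT and REAR description, although it shifts the remaining elements each time.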
SUMMARY
• Stack is a data structure in which insertion and deletion is done from one end only,
usually referred to as TOP.
• Stack follows the LIFO principle, by which the element inserted last will be the first
one to be taken out.
• PUSH and POP are two basic operations performed on a stack for insertion and deletion
of elements, respectively.
• Trying to pop an element from an empty stack results in a special condition called
underflow.
• In Python, a list is used for implementing a stack, and its built-in methods append() and
pop() are used for insertion and deletion, respectively. Hence, explicit declaration of TOP is not
needed.
• Stack is commonly used data structure to convert an Infix expression into equivalent
Prefix/Postfix notation.
• While conversion of an Infix notation to its equivalent Prefix/Postfix notation, only
operators are PUSHed onto the Stack.
• Queue is an ordered linear data structure, following FIFO strategy.
• Front and Rear are used to indicate beginning and end of queue.
• In Python, the use of predefined methods takes care of Front and Rear.
• Insertion in a queue happens at the rear end. Deletion happens at the front.
• Insertion operation is known as enqueue and deletion operation is known as dequeue.
• To support enqueue and dequeue operations, the IS EMPTY, IS FULL and PEEK operations are
used.
Sometimes a Python program does not execute, generates unexpected output or behaves
abnormally. This happens due to the presence of syntax errors, runtime errors or logical errors in
the code. In Python, exceptions are errors that get triggered automatically. However, exceptions
can be forcefully triggered and handled through program code. In this chapter, you will learn
about exception handling in Python.
2.1 Syntax Errors
Syntax errors, also known as parsing errors, are detected when the program is not coded
according to the syntax defined in the programming language. When a syntax error is encountered, the
interpreter does not execute the program until the error is rectified. Python displays the
syntax error with a short description of the error in shell mode, as shown in Figure 2.1.
syntax errors of “Missing parentheses” shown in the above examples also come under
exceptions. But these errors can be handled by correcting the syntax, while other exceptions,
which are generated even with correct syntax, need to be tackled logically.
2.3 Built-in Exceptions
Commonly occurring exceptions are usually defined in the compiler or interpreter.
These are called built-in exceptions. Python’s standard library has an extensive collection of
built-in exceptions that deal with commonly occurring errors (exceptions) by providing
standardized solutions for such errors. When such an error occurs, an appropriate exception
handler code is executed, which displays the name of the raised exception and the reason for it.
A programmer has to take appropriate action to handle it. Some of the commonly occurring
built-in exceptions that can be raised in Python are stated in Table 2.1.
Table 2.1 Built-in exceptions in Python
SN Built-in Exception Raised when...
1 SyntaxError there is a syntax error in the code.
2 ValueError a built-in method or operation receives an argument that has the right data type but a mismatched or inappropriate value.
3 IOError the file specified in a program statement cannot be opened.
4 KeyboardInterrupt the user hits the interrupt key (normally Ctrl+C or Delete) while a program is executing, due to which the normal flow of the program is interrupted.
5 ImportError the requested module definition is not found.
6 EOFError the end-of-file condition is reached without reading any data by input().
7 ZeroDivisionError the denominator in a division operation is zero.
8 IndexError the index or subscript in a sequence is out of range.
9 NameError a local or global variable name is not defined.
10 IndentationError there is incorrect indentation in the program code.
11 TypeError an operator is supplied with a value of an incorrect data type.
12 OverflowError the result of a calculation exceeds the maximum limit for a numeric data type.
Some of the built-in exceptions, viz. ZeroDivisionError, NameError and TypeError, raised by
the Python interpreter are shown below.
ZeroDivisionError – It is raised when the denominator in a division operation is zero as shown
in Figure 2.3.
In the code shown in Figure 2.7, since the value of the variable length is greater than the length of
the list numbers, an IndexError exception will be raised. The statement following the raise
statement will not be executed. So the message “NO EXCEPTION” will not be displayed in this
case.
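The behaviour described above can be sketched as follows. The values in the list numbers, the value of length and the error message are assumptions, and the try... except wrapper is added here only so that the sketch runs to completion instead of stopping with a traceback.

```python
numbers = [40, 50, 60, 70]  # sample list (assumed values)
length = 10                 # deliberately larger than len(numbers)

messages = []
try:
    if length > len(numbers):
        raise IndexError("length exceeds the size of the list")
    messages.append("NO EXCEPTION")  # skipped once the raise statement has executed
except IndexError as error:
    messages.append("IndexError raised: " + str(error))

print(messages)
```

If length is changed to a value not greater than 4, the raise statement is skipped and “NO EXCEPTION” is recorded instead.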
There is a need for exception handling in Python just as in other programming languages such as
C++, Java and Ruby. It is a useful technique that helps in capturing runtime errors and handling
them so as to prevent the program from crashing. Following are some of the important points
regarding exception handling.
• Python categorises exceptions into distinct types so that specific exception handlers (code to
handle that particular exception) can be created for each type.
• Exception handlers separate the main logic of the program from the error detection and
correction code. The segment of code where there is any possibility of error or exception, is
placed inside one block. The code to be executed in case the exception has occurred, is
placed inside another block. These statements for detection and reporting the exception do
not affect the main logic of the program.
• The compiler or interpreter keeps track of the exact position where the error has occurred.
• Exception handling can be done for both user-defined and built-in exceptions.
2.2.2 Process of Handling Exception
When an error occurs, Python interpreter creates an object called the exception object. This
object contains information about the error like its type, file name and position in the program
where the error has occurred. The object is handed over to the runtime system so that it can find
an appropriate code to handle this particular exception. This process of creating an exception
object and handing it over to the runtime system is called throwing an exception. It is important
to note that when an exception occurs while executing a particular program statement, the
control jumps to an exception handler, abandoning execution of the remaining program
statements.
A runtime system refers to the execution of the statements given in the program. It is a
complex mechanism consisting of hardware and software that comes into action as soon as
the program, written in any programming language, is put for execution.
The runtime system searches the entire program for a block of code, called the exception handler,
that can handle the raised exception. It first searches the method in which the error has
occurred and the exception has been raised. If no handler is found there, it searches the method from
which this method (in which the exception was raised) was called. This hierarchical search in reverse
order continues till the exception handler is found. This entire list of methods is known as the call stack.
When a suitable handler is found in the call stack, it is executed by the runtime process. This
process of executing a suitable handler is known as catching the exception. If the runtime system
is not able to find an appropriate exception handler after searching all the methods in the call stack,
then the program execution stops. Figure 2.9 describes the exception handling process.
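The search through the call stack can be seen in a small sketch. The three function names below are hypothetical and are chosen only to show an exception being raised in one function and caught two levels above it.

```python
def divide(a, b):
    # The error occurs here; this function has no handler of its own.
    return a / b


def compute_average(total, count):
    # No handler here either, so the search moves on to the caller.
    return divide(total, count)


def report():
    try:
        return compute_average(100, 0)
    except ZeroDivisionError as error:
        # The handler is found here, two levels up the call stack.
        print("Handled in report():", error)
        return None


result = report()
print("Program continues, result =", result)
```

If the try... except block is removed from report(), no handler exists anywhere in the call stack and the program stops with a traceback.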
2.2.3 Catching Exceptions
An exception is said to be caught when code that is designed to handle a particular exception
is executed. Exceptions, if any, are caught in the try block and handled in the except block.
While writing or debugging a program, a programmer might suspect that an exception can occur in a
particular part of the code. Such suspicious lines of code are put inside a try block. Every try block is
followed by an except block. The appropriate code to handle each of the possible exceptions (in
the code inside the try block) is written inside the except clause.
While executing the program, if an exception is encountered, further execution of the code inside
the try block is stopped and the control is transferred to the except block. The syntax of the try...
except clause is as follows.
try:
    [program statements where exceptions might occur]
except [exception-name]:
    [code for exception handling if the exception-name error is encountered]
The Python programme shown in Figure 2.10 illustrates the functioning of the try... except
clause.
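A minimal sketch of a try... except clause is given below. The function name safe_divide, the sample values and the messages are assumptions for illustration and are not taken from the figure.

```python
def safe_divide(numerator, denominator):
    try:
        # Statement where an exception might occur
        quotient = numerator / denominator
        print("Quotient:", quotient)
        return quotient
    except ZeroDivisionError:
        # Runs only when the division above raises ZeroDivisionError
        print("Denominator as zero is not allowed")
        return None


first = safe_divide(50, 5)   # no exception: the except block is skipped
second = safe_divide(50, 0)  # exception: control jumps to the except block
print("Program continues after the handler")
```

In the second call, the print statement inside the try block is not reached, because control leaves the try block as soon as the exception occurs.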
• The process of exception handling involves writing additional code to give proper
messages or instructions to the user. This prevents the program from crashing abruptly.
The additional code is known as an exception handler.
• An exception is said to be caught when a code that is designed to handle a particular
exception is executed.
• An exception is caught in the try block and handled in the except block.
• The statements inside the finally block are always executed regardless of whether an
exception occurred in the try block or not.
Text files contain only the ASCII equivalent of the contents of the file, whereas a .docx file
contains a lot of additional information like the author's name, page settings, font type and size,
date of creation and modification, etc.
Activity 3.1. Create a text file using Notepad, write your name and save it. Now, create
a .docx file using Microsoft Word, write your name and save it as well. Check and compare
the file sizes of both the files. You will find that the size of the .txt file is in bytes whereas that
of the .docx file is in KBs.
image, audio, video, compressed versions of other files, executable files, etc. These files are not
human readable. Thus, trying to open a binary file using a text editor will show some garbage
values. We need specific software to read or write the contents of a binary file.
Binary files are stored in a computer in a sequence of bytes. Even a single bit change can corrupt
the file and make it unreadable to the supporting application. Also, it is difficult to remove any
error which may occur in the binary file as the stored contents are not human readable. We can
read and write both text and binary files through Python programs.
3.3 Opening and Closing a Text File
In real world applications, computer programs deal with data coming from different sources like
databases, CSV files, HTML, XML, JSON, etc. We broadly access files either to write data to them or to
read data from them. But operations on files include creating and opening a file, writing data in a file,
traversing a file, reading data from a file and so on. Python has the io module that contains
different functions for handling files.
3.3.1 Opening a file
To open a file in Python, we use the open() function. The syntax of open() is as follows:
file_object= open(file_name, access_mode)
This function returns a file object, called the file handle, which is stored in the variable file_object. We
can use this variable to transfer data to and from the file (read and write) by calling the functions
defined in Python’s io module. If the file does not exist and it is opened in a write or append mode,
the above statement creates a new empty file and assigns it the name we specify in the statement.
The file_object has certain attributes that tell us basic information about the file, such as:
<file.closed> returns True if the file is closed and False otherwise.
<file.mode> returns the access mode in which the file was opened.
<file.name> returns the name of the file.
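The attributes listed above can be checked with a short sketch. The file name myfile.txt and the use of a temporary folder are assumptions made so that the example does not leave any file behind.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "myfile.txt")

    file_object = open(path, "w")  # "w" creates the file if it does not exist
    name = file_object.name        # the name (path) given to open()
    mode = file_object.mode        # the access mode used while opening
    closed_before = file_object.closed
    file_object.close()
    closed_after = file_object.closed

print("mode:", mode)
print("closed before close():", closed_before)
print("closed after close():", closed_after)
```

The closed attribute changes from False to True once close() has been called on the file object.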
The file_name should be the name of the file that has to be opened. If the file is not in the current
working directory, then we need to specify the complete path of the file along with its name. The
access_mode is an optional argument that represents the mode in which the file has to be
accessed by the program. It is also referred to as processing mode. Here mode means the
operation for which the file has to be opened like <r> for reading, <w> for writing, <+> for both
reading and writing, <a> for appending at the end of an existing file. The default is the read mode.
In addition, we can specify whether the file will be handled as binary (<b>) or text mode. By
default, files are opened in text mode that means strings can be read or written. Files containing
non-textual data are opened in binary mode that means read/write are performed in terms of
bytes. Table 3.1 lists various file access modes that can be used with the open() method. The file
offset position in the table refers to the position of the file object when the file is opened in a
particular mode.
Table 3.1 File Open Modes
File mode | Description | File offset position
<rb> | Opens the file in binary and read-only mode. | Beginning of the file
<r+> or <+r> | Opens the file in both read and write mode. | Beginning of the file
<w> | Opens the file in write mode. If the file already exists, all the contents will be overwritten. If the file doesn’t exist, then a new file will be created. | Beginning of the file
<wb+> or <+wb> | Opens the file in read, write and binary mode. If the file already exists, the contents will be overwritten. If the file doesn’t exist, then a new file will be created. | Beginning of the file
<a> | Opens the file in append mode. If the file doesn’t exist, then a new file will be created. | End of the file
<a+> or <+a> | Opens the file in append and read mode. If the file doesn’t exist, then it will create a new file. | End of the file
Think and Reflect: For a newly created file, is there any difference between the write() and writelines()
methods?
The write() method takes a string as an argument and writes it to the text file. It returns the
number of characters written in a single execution of the write() method. Also, we need
to add a newline character (\n) at the end of every sentence to mark the end of a line. Consider
the following Python code.
Executing this code will create the file with the name myfile.txt with the text as mentioned under
the write () method. Note that ‘\n’ is treated as a single character.
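A short sketch of the write() method and its return value follows. The text written and the temporary folder are assumptions for illustration; only the file name myfile.txt comes from the description above.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "myfile.txt")

    myobject = open(path, "w")
    count = myobject.write("Hello\n")  # '\n' is counted as a single character
    myobject.close()

    reader = open(path, "r")
    content = reader.read()
    reader.close()

print("Characters written:", count)
print(repr(content))
```

The returned count is 6 here: five letters plus the newline character.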
If numeric data are to be written to a text file, the data need to be converted into a string before
writing to the file. The following program code and its output text file show that the numeric
marks 58 are converted to a string using the str() function and then written to the text file.
The write() actually writes data onto a buffer. When the close() method is executed, the contents
from this buffer are moved to the file located on the permanent storage. The flush() method can
also be used to clear the buffer and write contents in buffer to the file. This is how programmers
can forcefully write to the file as and when required.
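The sketch below combines the two points above: a number is converted with str() before writing, and flush() is called so that the text reaches the file before close(). The file name marks.txt, the variable names and the second file object used for checking are assumptions for illustration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "marks.txt")

    writer = open(path, "w")
    marks = 58
    writer.write(str(marks))  # write() accepts only strings, so convert first
    writer.flush()            # move the buffered text to the file now

    # A second file object reads the file while the writer is still open.
    reader = open(path, "r")
    on_disk = reader.read()
    reader.close()

    writer.close()

print("Content found in the file after flush():", on_disk)
```

Passing the integer 58 directly to write() would raise a TypeError, which is why the str() conversion is needed.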
3.4.2 The writelines() method
This method is used to write multiple strings to a file. We need to pass an iterable object like
a list, tuple, etc. containing strings to the writelines() method. Unlike write(), the writelines()
method does not return the number of characters written in the file. The following code and its
output show that the writelines() method moves the text to the next line only where a \n
character is encountered. This illustrates the use of the writelines() method.
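A minimal sketch of writelines() is shown below. The list of strings and the temporary folder are assumptions for illustration.

```python
import os
import tempfile

lines = ["Hello everyone\n", "Writing multiline strings\n", "This is the third line"]

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "myfile.txt")

    myobject = open(path, "w")
    result = myobject.writelines(lines)  # returns None, not a character count
    myobject.close()

    reader = open(path, "r")
    content = reader.read()
    reader.close()

print("Return value:", result)
print(content)
```

writelines() does not add newline characters on its own. The line breaks in the file come only from the \n already present in the strings.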
We can write a program to read the contents of a file. Before reading a file, we must make sure
that the file is opened in “r”, “r+”, “w+” or “a+” mode. There are three ways to read the contents
of a file:
3.5.1 The read () method
This method is used to read a specified number of bytes of data from a data file (in text mode,
the count is in characters). The syntax of the read() method is:
file_object.read(n)
Consider the following set of statements. The value 10 is passed to the read() method, which
reads 10 characters from the line. This illustrates the usage of the read() method:
If no argument or a negative number is specified in read(), the entire file content is read as shown
in the following code.
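A short sketch of both forms of read() is given below. It first creates a small sample file (an assumption for illustration) so that the example can run on its own.

```python
import os
import tempfile

# Hypothetical sample file in the system's temporary folder.
path = os.path.join(tempfile.gettempdir(), "myfile.txt")
myobject = open(path, "w")
myobject.write("Hello everyone\nWriting multiline strings\n")
myobject.close()

myobject = open(path, "r")
first_ten = myobject.read(10)   # reads the first 10 characters
rest = myobject.read()          # no argument: reads everything that remains
myobject.close()

print(first_ten)   # Hello ever
print(rest)
```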
If no argument or a negative number is specified, readline() reads a complete line and returns it as a string.
To read the entire file line by line using readline(), we can use a loop. This process is known
as looping/iterating over a file object. readline() returns an empty string when EOF is reached.
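The loop described above can be sketched as follows, assuming a small three-line sample file that the example creates for itself.

```python
import os
import tempfile

# Hypothetical sample file in the system's temporary folder.
path = os.path.join(tempfile.gettempdir(), "myfile.txt")
myobject = open(path, "w")
myobject.write("Hello everyone\nWriting multiline strings\nThis is the third line\n")
myobject.close()

myobject = open(path, "r")
lines_seen = []
line = myobject.readline()
# readline() returns an empty string at the end of the file, which ends the loop.
while line != "":
    lines_seen.append(line)
    print(line, end="")
    line = myobject.readline()
myobject.close()
```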
3.5.3 The readlines() method
The method reads all the lines and returns them, along with their newline characters, as a list of
strings. The following code illustrates the use of readlines() to read data from the text file myfile.txt.
The above output shows that when a file is read using readlines() function, lines in the file become
members of a list, where each list element ends with a newline character (‘\n’).
To display each word of a line separately as an element of a list, the split() function can be used.
The following code demonstrates the use of split() function.
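A minimal sketch of readlines() followed by split() is shown here. The sample file contents are an assumption for illustration.

```python
import os
import tempfile

# Hypothetical sample file in the system's temporary folder.
path = os.path.join(tempfile.gettempdir(), "myfile.txt")
myobject = open(path, "w")
myobject.write("Hello everyone\nWriting multiline strings\n")
myobject.close()

myobject = open(path, "r")
lines = myobject.readlines()    # each element ends with "\n"
myobject.close()

# split() breaks every line into a list of its words.
words = [line.split() for line in lines]

print(lines)   # ['Hello everyone\n', 'Writing multiline strings\n']
print(words)   # [['Hello', 'everyone'], ['Writing', 'multiline', 'strings']]
```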
Let us now write a program that accepts a string from the user and writes it to a text file.
Thereafter, the same program reads the text file and displays it on the screen. Following program
illustrates how the text written in the text file is read.
In the above syntax, offset is the number of bytes by which the file object is to be moved.
reference_point indicates the starting position of the file object. That is, with reference to which
position, the offset has to be counted. It can have any of the following values:
0 - beginning of the file
1 - current position of the file
2 - end of file
By default, the value of reference_point is 0, i.e. the offset is counted from the beginning of the
file. The following program illustrates the usage of seek() and tell().
On executing the above program, the statement fileObject.seek(6,0) positions the file object at
byte position 6 from the beginning of the file, and the text from that position onwards is printed,
as shown in the following output screen.
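The use of tell() and seek() can be sketched as below. The file name testfile.txt and its single line of text are assumptions for illustration. Note that for files opened in text mode, Python allows a non-zero offset only with reference point 0.

```python
import os
import tempfile

# Hypothetical sample file in the system's temporary folder.
path = os.path.join(tempfile.gettempdir(), "testfile.txt")
fileobject = open(path, "w")
fileobject.write("Learning to move the file object")
fileobject.close()

fileobject = open(path, "r")
start = fileobject.tell()        # position just after opening the file
fileobject.read(8)               # reads "Learning"
after_read = fileobject.tell()   # position after reading 8 characters
fileobject.seek(6, 0)            # move to byte position 6 from the beginning
text = fileobject.read()         # everything from that position onwards
fileobject.close()

print(start)        # 0
print(after_read)   # 8
print(text)         # ng to move the file object
```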
mode (a), then the new data will be written after the existing data. In both cases, if the file does
not exist, then a new empty file will be created. In the following program a file, practice.txt is
opened in write (w) mode and three sentences are stored in it as shown in the output screen that
follows it.
Executing the above program allows the user to enter the data to be stored in the practice.txt file
and thereafter displays the contents entered, as shown below.
Summary
• A file is a named location on a secondary storage media where data are permanently
stored for later access.
• A text file contains only textual information consisting of alphabets, numbers and other
special symbols. Such files are stored with extensions like .txt, .py, .c, .csv, .html, etc.
Each byte of a text file represents a character.
• Each line of a text file is stored as a sequence of ASCII equivalent of the characters and
is terminated by a special character, called the End of Line (EOL).
• A binary file consists of data stored as a stream of bytes.
• The open() method is used to open a file in Python; it returns a file object, also called a file handle.
The file handle is used to transfer data to and from the file by calling the functions defined
in Python’s io module.
• The close() method is used to close the file. While closing a file, the system frees up all the
resources, like processor and memory, allocated to it.
• write() method takes a string as an argument and writes it to the text file.
• writelines() method is used to write multiple strings to a file. We need to pass an iterable
object like lists, tuple etc. containing strings to writelines() method.
• read([n]) method is used to read a specified number of bytes (n) of data from a data file.
• readline([n]) method reads one complete line from a file where lines are ending with a
newline (\n). It can also be used to read a specified number (n) of bytes of data from a file
but maximum up to the newline character (\n).
• readlines() method reads all the lines and returns the lines along with newline character,
as a list of strings.
• tell() method returns an integer that specifies the current position of the file object. The
position so specified is the byte position from the beginning of the file till the current
position of the file object.
• The seek() method is used to position the file object at a particular position in a file.
• Pickling is the process by which a Python object is converted to a byte stream.
• dump() method is used to write the objects in a binary file.
• load() method is used to read data from a binary file.
In Python, lists serve the purpose of arrays, but they are slow to process.
NumPy aims to provide an array object that is much faster than traditional Python lists. NumPy
arrays are stored at one continuous place in memory, unlike lists, so processes can access and
manipulate them very efficiently. The elements of a NumPy array must all be of the same type,
whereas the elements of a Python list can be of completely different types.
NumPy, which stands for Numerical Python, is a Python library used for working with arrays. It provides
functions for fast mathematical computation on arrays and matrices. NumPy objects are
primarily used to create arrays or matrices that can be applied to Deep Learning or Machine
Learning models. Whereas Pandas is used for creating heterogeneous, two-dimensional data objects,
NumPy makes N-dimensional homogeneous objects. Pandas is built on top of NumPy, and many Pandas
operations can return their results in the form of NumPy arrays.
Installation and Importing of NumPy
To use NumPy, it must first be installed and then imported. If Python and PIP are already installed on
a system, NumPy can be installed easily by using the command,
pip install numpy
Once NumPy is installed, it can be imported into a program with the import keyword.
import numpy
NumPy is usually imported under the np alias as follows.
import numpy as np
NumPy is ready to use once it has been imported.
NumPy Creating Arrays
There are 6 general mechanisms for creating arrays.
1. Conversion from other Python structures such as lists and tuples.
2. Intrinsic NumPy array creation functions (e.g. arange, ones, zeros, etc.).
3. Replicating, joining, or mutating existing arrays.
4. Reading arrays from disk, either from standard or custom formats.
5. Creating arrays from raw bytes through the use of strings or buffers.
6. Use of special library functions such as random.
NumPy is used to work with arrays. The array object in NumPy is called ndarray. NumPy ndarray
object can be created by using the array() function.
type() is a built-in Python function that displays the type of the object passed to it. In the above
code, it shows that arr is of type numpy.ndarray.
To create an ndarray, you can pass a list, tuple or any array-like object into the array() function,
and it will be converted into an ndarray, as illustrated in the following example.
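A minimal sketch of creating an ndarray from a list and from a tuple is given below. The sample values are assumptions for illustration.

```python
import numpy as np

# Create an ndarray from a list.
arr = np.array([1, 2, 3, 4, 5])

# A tuple (or any array-like object) is converted in the same way.
from_tuple = np.array((10, 20, 30))

print(arr)               # [1 2 3 4 5]
print(type(arr))         # <class 'numpy.ndarray'>
print(from_tuple)        # [10 20 30]
print(type(from_tuple))  # <class 'numpy.ndarray'>
```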
Dimensions in Arrays
NumPy arrays are commonly described by their number of dimensions: one-dimensional arrays are known as
vectors and two-dimensional arrays are known as matrices, and arrays can also have three or more
dimensions. NumPy also has a submodule dedicated to linear algebra operations on matrices, called numpy.linalg.
The following example illustrates how to create 1D, 2D and 3D arrays.
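One possible way to create arrays of one, two and three dimensions is sketched below. The sample values are assumptions; the ndim and shape attributes report the number of dimensions and the size along each dimension.

```python
import numpy as np

a1 = np.array([1, 2, 3])                                  # 1D array
a2 = np.array([[1, 2, 3], [4, 5, 6]])                     # 2D array: 2 rows, 3 columns
a3 = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])       # 3D array: two 2x2 tables

print(a1.ndim, a1.shape)   # 1 (3,)
print(a2.ndim, a2.shape)   # 2 (2, 3)
print(a3.ndim, a3.shape)   # 3 (2, 2, 2)
```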
In this example, L = [12,23,34,22] is a list. The statement aa=np.array(L) converts this list into an
array and stores it in “aa”.
Example: The following Python code illustrates that a NumPy array can perform vector addition but
a list cannot.
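The difference can be sketched as follows, using the list L and the array aa named in the text. The variable names array_sum and list_sum are introduced here only for illustration.

```python
import numpy as np

L = [12, 23, 34, 22]
aa = np.array(L)

# With arrays, + adds the elements position by position (vector addition).
array_sum = aa + aa

# With lists, + joins the two lists end to end instead of adding the values.
list_sum = L + L

print(array_sum)   # [24 46 68 44]
print(list_sum)    # [12, 23, 34, 22, 12, 23, 34, 22]
```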
A table is a 2-D array with rows and columns, where the dimension represents the row and the
index represents the column. To access elements from 2-D arrays, a comma is used to separate the
integers representing the dimension and the index of the element.
Similarly, to access elements from 3-D arrays we can use comma-separated integers representing
the dimensions and the index of the element.
The following example illustrates how to access the elements of 2-D and 3-D arrays.
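A minimal sketch of element access with comma-separated indexes is given below. The array contents are assumptions for illustration; indexes start from 0.

```python
import numpy as np

arr2 = np.array([[1, 2, 3, 4, 5],
                 [6, 7, 8, 9, 10]])

arr3 = np.array([[[1, 2, 3], [4, 5, 6]],
                 [[7, 8, 9], [10, 11, 12]]])

# 2-D: row index first, then column index.
print(arr2[0, 1])   # 2   (first row, second column)
print(arr2[1, 4])   # 10  (second row, fifth column)

# 3-D: table index, then row index, then column index.
print(arr3[0, 1, 2])   # 6  (first table, second row, third column)
```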
It is possible to create a NumPy array from data elements of the same data type as well as from
data elements of different data types. When the types differ, NumPy converts all the elements to one
common type, because every element of an array must have the same type. The following example
illustrates creating NumPy arrays from integer, float, string and mixed data.
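The effect of mixing data types can be sketched as below. The sample values are assumptions; dtype.kind gives a one-letter code for the element type ('i' integer, 'f' float, 'U' string).

```python
import numpy as np

ints = np.array([1, 2, 3])
floats = np.array([1.5, 2.5, 3.5])
strings = np.array(["a", "b", "c"])

# Mixed input: every element is converted to a common type, here string.
mixed = np.array([1, 2.5, "three"])

print(ints.dtype.kind)     # i
print(floats.dtype.kind)   # f
print(strings.dtype.kind)  # U
print(mixed.dtype.kind)    # U
print(mixed)               # ['1' '2.5' 'three']
```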
Converting Data Type on Existing Arrays
The data type of an existing array can be changed by making a copy of the array with the
astype() method. The astype() method creates a copy of the array and allows the data type to be
specified as a parameter. The data type can be specified using a string, like 'f' for float or 'i' for
integer, or you can use the data type directly, like float for float and int for integer.
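A short sketch of astype() is given below, with sample values assumed for illustration. Converting floats to integers drops the fractional part, and the original array is left unchanged.

```python
import numpy as np

arr = np.array([1.1, 2.9, 3.5])

as_int_code = arr.astype('i')    # data type given as a string code
as_int_type = arr.astype(int)    # data type given directly
as_float = np.array([1, 0, 3]).astype('f')

print(as_int_code)   # [1 2 3]
print(as_int_type)   # [1 2 3]
print(as_float)      # [1. 0. 3.]
print(arr)           # [1.1 2.9 3.5]  (the original is not modified)
```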
Slicing means taking elements from one given index to another given index. It is possible to
slice instead of index like this: [start:end]. It is possible to define the step, like this:
[start:end:step]. If start or end is not passed, by default the start is considered 0, and end is
considered length of array in that dimension. If step is not passed it is considered as 1.
Following example illustrates the array slicing.
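The slicing rules above can be sketched as follows. The array values are assumptions for illustration; the element at the end index is not included in the slice.

```python
import numpy as np

arr = np.array([10, 20, 30, 40, 50, 60, 70])

print(arr[1:5])     # [20 30 40 50]  index 1 up to, but not including, index 5
print(arr[4:])      # [50 60 70]     end omitted: goes to the end of the array
print(arr[:4])      # [10 20 30 40]  start omitted: begins at index 0
print(arr[1:6:2])   # [20 40 60]     every second element from index 1 to 5
```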
Negative Slicing
When an array index is counted from the end by using the minus sign, it is called negative
indexing; for example, index -1 refers to the last element. Slicing can also be done in steps by
specifying a step value.
The following example illustrates negative indexing on an array and slicing in steps.
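A minimal sketch of negative indexing and stepped slicing, with array values assumed for illustration:

```python
import numpy as np

arr = np.array([10, 20, 30, 40, 50, 60, 70])

print(arr[-1])      # 70             last element
print(arr[-3:-1])   # [50 60]        third from the end up to, but not including, the last
print(arr[::2])     # [10 30 50 70]  every second element of the whole array
print(arr[::-1])    # [70 60 50 40 30 20 10]  a step of -1 reverses the array
```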
2. A NumPy array supports ________ operations, which are not supported by a Python list.
3. NumPy is an ___________ module of Python.
4. NumPy supports ___________ arrays.
5. We can convert a list into an array with the help of the _____ function.
C. State whether True or False
D. Answer the following in short
1. How is a one-dimensional array different from a two-dimensional array?
2. How are head() and head(3) different from each other? Justify your answer with a suitable
example.
3. What is the use of the MOD() function? Explain with an example.
4. What is the use of hstack() and vstack()? Explain with an example.
5. What is the use of the concatenate() function? Explain with an example.
Practical Exercise
1. Write a Python program to reverse the rows in a 2D NumPy array.
2. Write a Python program that, given a 1D array, negates all elements which
are between 3 and 8.
3. Write a Python program to create an integer array from a range between 100
and 200 such that the difference between each element is 10.
4. Write a program to store 30 zeros in an array.
A Pandas Series is very similar to a one-dimensional NumPy array: it consists of an array of data
(values) and an array of labels (indices). It has additional functionality that allows values in the
Series to be indexed using labels. A NumPy array does not have the flexibility to do this. As the
most basic element in Pandas, a Pandas Series is also the building block of Pandas DataFrames.
Pandas, or Python Pandas, is a Python library used for data analysis. The term Pandas
is derived from “Panel Data”, an econometric term for multidimensional,
structured datasets. Pandas has become a popular option for data analysis. It provides
various tools for data analysis in a very simple and easy form. Pandas is an open-source, BSD-licensed
library specially built for the Python programming language. It offers high-performance, easy-to-use
data structures and data analysis tools for the real-world needs of an individual or any organisation.
The main author of Pandas is Wes McKinney.
Key Features of Pandas
• Pandas is one of the most popular libraries in the scientific Python ecosystem for data analysis.
• Quick and efficient data manipulation and analysis.
• It has functionality to find and fill missing data.
• It allows you to apply operations to independent groups within the data.
• It supports reshaping of data into different forms.
• It supports advanced time-series functionality (which is the use of a model to predict
future values based on previously observed values).
• It supports visualization by integrating matplotlib.
• Pandas is best for handling huge tabular (like excel, mysql) data sets comprising
different data formats.
• Tools for loading data from different file formats into in-memory data objects.
• Label-based Slicing, Indexing, and Subsetting can be performed on large datasets.
• Merges and joins two datasets easily.
• Pivoting and reshaping data sets
• Easy handling of missing data (represented as NaN) in both floating point and non-
floating point data.
• Represents the data in tabular form.
• Size mutability: DataFrame and higher-dimensional object columns can be added and
deleted.
• It provides time-series functionality.
• Effective grouping by functionality for splitting, applying, and combining data sets.
Installation of Pandas
If Python and PIP are already installed on a system, then Pandas can be installed using this
command in Windows:
C:\Users\Your Name>pip install pandas
If this command fails, then use a python distribution that already has Pandas installed.
In Linux use the following command to install Pandas.
$ sudo apt install python3-pandas
Import Pandas
Once Pandas is installed, import it in your applications by adding the import keyword as
follows:
import pandas
Now Pandas is imported and ready to use. Let us test it using the following code.
Pandas as pd
In Python, an alias is an alternate name for referring to the same thing. Pandas is usually imported
under the pd alias. An alias is created with the as keyword while importing.
import pandas as pd
Now the Pandas package can be referred to as pd instead of pandas. Here pd is simply a shorter name
for the pandas library which you can use in your program.
Pandas is a high-level data manipulation tool used for analyzing data. It is very easy to import
and export data using the Pandas library, which has a very rich set of functions. Pandas has three
important data structures: Series, DataFrame and Panel (Panel has been removed from recent versions of Pandas).
A Pandas data frame can contain different data types (int, float, double and string). A Pandas data
frame has column names, which make it easy to keep track of data. Pandas is used when data is available
in a tabular format, as in a spreadsheet or a database table. In this chapter we will discuss the
Series data structure only, as the others are beyond the scope of this book.
Series
A Pandas Series is like a column in a table. It is a one-dimensional array holding data of any
type, that is, a sequence of values of a data type such as int, float or string. By default the index
is of integer type and starts at zero. The Series index can also be given by the programmer, for
example in the form of dictionary keys. Index labels can be numbers, characters or strings given by the
programmer.
A Series is a one-dimensional array of homogeneous (same type) elements with mutable (changeable)
values but immutable size, i.e. the size of a series cannot be changed once it is created.
Creation of Series Objects
There are many ways to create a series. A series can be created with the Series() function from a list,
a dictionary or a scalar value, or from arrays produced by NumPy functions such as empty() and zeros().
Creation of series using list
The following program illustrates how to create an empty series and a non-empty series using the Series()
function.
Here in this list, 3 values are integers and one is a float, so the final type of the series will be float.
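A minimal sketch of both cases is given below. The list values are assumptions for illustration, and the dtype argument for the empty series is given explicitly so that the example behaves the same across Pandas versions.

```python
import pandas as pd

# An empty series (no data); the dtype is stated explicitly.
empty_series = pd.Series(dtype="float64")

# A series from a list with three integers and one float.
s = pd.Series([10, 20, 30, 45.5])

print(len(empty_series))   # 0
print(s)
print(s.dtype)             # float64: the integers are stored as floats
```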
Creation of series using dictionary
The following program illustrates how to create a series with the help of a dictionary. The dictionary’s
keys act as the series index and the dictionary’s values act as the series values.
In this example, “Jan”, “Feb”, “Mar” form the series index and 31, 28, 31 are the series values, and
the type of the series is int.
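The dictionary example described above can be sketched as follows. The variable name days is an assumption introduced for illustration.

```python
import pandas as pd

# Keys become the index; values become the data.
days = pd.Series({"Jan": 31, "Feb": 28, "Mar": 31})

print(days)
print(list(days.index))   # ['Jan', 'Feb', 'Mar']
print(days["Feb"])        # 28
```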
Creation of series with Scalar value
Let us understand the different ways to create a series with the Pandas library using the following code.
In this code snippet, the range(0, 3) function generates the indexes 0, 1 and 2.
There is only one value, 10, which is stored at every index.
In this code snippet, the range() function is used to create the indexes 1, 3 and 5. Here the index
range is 1 to 6: it starts from 1, increments by 2 and goes up to 5 only. There is only one value,
15, given by the programmer, which is stored at every index.
Indexes can also be given in the form of a list, here ‘Hema’, ‘Rahul’, ‘Anup’. This is given by the
programmer, and there is only one value, “Welcome to CIVE”, which is stored at every index.
arange() is a NumPy library function. Here it stores the numbers 9 to 12 in the object “a”; the
values of “a” are used as the series index, and “data = a * 2” supplies the data values.
In the next snippet, arange() again stores the numbers 9 to 12 in “a”, whose values are used as the
series index, while “data = a ** 3” supplies the data values. The ** operator raises each element
of a to the power 3.
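The snippets described above can be reconstructed roughly as below. The variable names s1 to s5 are assumptions introduced for illustration; the values and indexes follow the text.

```python
import numpy as np
import pandas as pd

# One scalar value repeated for every index.
s1 = pd.Series(10, index=range(0, 3))          # indexes 0, 1, 2
s2 = pd.Series(15, index=range(1, 6, 2))       # indexes 1, 3, 5
s3 = pd.Series("Welcome to CIVE", index=["Hema", "Rahul", "Anup"])

# arange(9, 13) produces 9, 10, 11, 12 (the stop value 13 is excluded).
a = np.arange(9, 13)
s4 = pd.Series(data=a * 2, index=a)            # 18, 20, 22, 24
s5 = pd.Series(data=a ** 3, index=a)           # 729, 1000, 1331, 1728

for s in (s1, s2, s3, s4, s5):
    print(s)
```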
Series Object Attributes
Here in this example, we have updated the value of the series at index ‘c’.
Example: Write a Python code to modify/update the index of a data series.
Here in this example, the series is created with the index [‘a’, ‘b’, ‘c’, ‘d’, ‘e’], generated using a for loop.
Then the index of the series is changed to [‘u’, ‘v’, ‘w’, ‘x’, ‘y’], so the new index labels of the series are
u, v, w, x and y.
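Updating a value and then replacing the index can be sketched as follows. The data values are assumptions for illustration; only the index labels come from the text.

```python
import pandas as pd

# Build the index 'a' to 'e' with a for loop (list comprehension).
labels = [ch for ch in "abcde"]
s = pd.Series([10, 20, 30, 40, 50], index=labels)

# Update the value stored at index 'c'.
s["c"] = 35

# Replace the whole index; the new list must have the same length.
s.index = ["u", "v", "w", "x", "y"]

print(s)
```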
Now perform arithmetic operations (+, -, * and /) on s1 and s2, as the indexes are the same in both,
but not on s3, as it has different indexes.
The following code illustrates arithmetic and vector operations on series.
In the above code the arithmetic and vector operations performed on the series are as follows.
s1 + s2: This operation adds the corresponding elements of series s1 and s2. It succeeds because
both series have the same index.
s1 + 2: This operation adds 2 to each item of the data series. So we get 13, 14, 15, 16,
17, 18 and 19 instead of 11, 12, 13, 14, 15, 16 and 17.
s1 * 2: This operation multiplies each item of the data series by 2. So we get 22, 24, 26,
28, 30, 32 and 34 instead of 11, 12, 13, 14, 15, 16 and 17.
s1 + s3: This operation does not give a useful result because the two series have different indexes.
The index of series s1 is [0,1,2,3,4,5,6] while series s3 has the index [10,20,30,40,50]. Where the
indexes do not match, Pandas gives NaN (Not a Number) in the output.
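These operations can be sketched as below. The values of s1 and the indexes of s1 and s3 follow the text; the values of s2 and s3 are assumptions for illustration.

```python
import pandas as pd

s1 = pd.Series([11, 12, 13, 14, 15, 16, 17])
s2 = pd.Series([1, 2, 3, 4, 5, 6, 7])                        # assumed values, same index as s1
s3 = pd.Series([1, 2, 3, 4, 5], index=[10, 20, 30, 40, 50])  # assumed values, different index

print(s1 + s2)   # element-wise sum: 12, 14, 16, 18, 20, 22, 24
print(s1 + 2)    # 13 to 19
print(s1 * 2)    # 22 to 34

# No index label is common to s1 and s3, so every result is NaN.
mismatch = s1 + s3
print(mismatch)
```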
Relational Operations on series
It is also possible to perform various relational operations (>, <, >=, <=, ==, !=) on series data in
Python to generate Boolean results in the form of True/False. These operations are also known
as filtration in Python.
First create a series, then perform the relational operations and delete data from the data
series as shown in the following code.
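A minimal sketch of a relational operation, filtering and deletion is given below, with series values assumed for illustration. Here drop() is used for deletion; it returns a new series and leaves the original unchanged.

```python
import pandas as pd

s = pd.Series([11, 12, 13, 14, 15])

# A relational operation gives one True/False value per element.
mask = s > 13
print(mask)

# Using the Boolean result as an index keeps only the True rows.
filtered = s[mask]
print(filtered)

# Delete the element with index label 0.
shorter = s.drop(0)
print(shorter)
```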
Following program illustrates that arithmetic addition operation is not possible on two
arrays of different size.
Summary
• A data structure refers to a specialized way of storing data so as to apply a specific type of
functionality to it.
• A Series is a Pandas data structure that represents a 1D array-like object containing an
array of data and an associated array of data labels called its index.
• The shape of a series object tells how big it is, i.e. how many elements it contains,
including missing or empty values (NaN).
• If you use the len() function on a series object, it returns the total number of elements in it,
including NaNs, but Series.count() returns only the count of non-NaN values in a series object.
• When you perform arithmetic operations on two series objects, the data is aligned on
the basis of matching indexes.
• NaN means “Not a Number”.
• Missing values are a hindrance in data analysis and must be handled properly.
Solved Practical
1. Write a Python code to create a series using the lists [3, 4, 5, 6, 7] and [‘a’, ‘b’, ‘c’, ‘d’ ].
2. Write a Python code to create a series using an ndarray that has 19 elements in the range
100 to 201.
3. Write a Python code to create a series using a dictionary that stores the number of days in
each month.
4. Write a Python code to create a series using the lists of player names p = [“Virat”, “Khan”,
“Karan”, “Sawan”] and runs scored r = [12, 34, 56, 78]. Use the player names as the series
index and the runs as the series values.
Data is very important for any organization for any type of acknowledgement, research, analysis,
decision making and future forecasting. From a huge amount of data, picking out
the required data at the right time is a very important and tedious task.
Data visualisation is the graphical representation of data or information. It is used to display
data in a more expressive way to meet the requirements of the audience. Data visualisation in the
form of charts, graphs, animation and maps makes it very easy to understand the trends,
outliers and patterns in data. Data visualization techniques are therefore very important
for the analysis of such big data.
Use of PYPLOT MATPLOTLIB Library
Matplotlib is a Python library that provides many interfaces and functionality for 2D
graphics, similar to MATLAB. Python scripts can be used to create 2D graphs and plots using the
Matplotlib module. It offers a module called pyplot that makes plotting simple, with features to
control line styles, font attributes, axis formatting and other properties. It offers a huge
range of graphs and plots, including error charts, bar charts, power spectra and histograms.
Combined with NumPy, it provides a powerful open-source substitute for the MATLAB environment.
Installing Matplotlib
To install the Matplotlib library, open the command prompt with administrator rights and
make sure internet connectivity is on. The Matplotlib library and all its dependencies can be
downloaded easily from the internet as a pre-compiled binary package.
To install Matplotlib in the Windows operating system, issue the following command on the
command prompt.
pip install matplotlib
Once it is installed, the pyplot module can be imported as follows.
import matplotlib.pyplot as pp
Here pp is a user-defined alias for pyplot. You can use all the functions in the pyplot library using
this alias as per your need.
Basics of Simple Plotting
Graphical representation of compiled data is known as data visualization. With the help of Pyplot
we can create the following types of graphs or charts.
Line chart – Line charts are used to represent the relation between two sets of data, X and Y,
plotted on different axes.
Bar Chart – A bar plot or bar chart is a graph that represents categories of data with
rectangular bars whose lengths or heights are proportional to the values they represent.
The bar plots can be plotted horizontally or vertically.
Pie Chart – A pie chart is a circular statistical plot that can display only one series of data. The
whole area of the chart represents the total (100%) of the given data, and the area of each slice
represents the percentage share of that part of the data. The slices of the pie are called wedges.
Example 2: Write a program to plot a Line Graph of number of runs and over provided in two
different lists.
The python program and output is given below.
Example 3. Let us modify the previous example to change the marker size and edge colour and to
increase the line width using various parameters of the plot() function.
Here in this example, parameters like marker, markersize, markeredgecolor and linewidth are
used to specify the appearance of the line plot.
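A line chart using these parameters might look like the sketch below. The overs and runs values, the labels and the chosen marker are assumptions for illustration.

```python
import matplotlib.pyplot as pp

# Sample data (assumed): runs scored at the end of each over.
overs = [1, 2, 3, 4, 5]
runs = [4, 12, 18, 25, 31]

# plot() returns a list of line objects; here there is exactly one line.
line, = pp.plot(overs, runs,
                marker="D",            # diamond marker at each data point
                markersize=10,         # size of the marker
                markeredgecolor="r",   # red outline around the marker
                linewidth=3)           # thickness of the line

pp.xlabel("Overs")
pp.ylabel("Runs")
pp.title("Runs scored per over")
pp.show()
```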
Character Description
‘p’ pentagon marker
‘*’ star marker
‘h’ hexagon1 marker
‘H’ hexagon2 marker
‘+’ plus marker
‘x’ x marker
‘D’ diamond marker
‘d’ thin_diamond marker
‘|’ vline marker
‘_’ hline marker
Example 4. Write a Python code for creating a line chart with different line colours.
In this example we have used the arange() function. Using this function, the variable z is
initialized with values from 0 to 10 at an interval of 0.1. These values of z are
passed to the sin and cos functions respectively, and the results are stored in the variables a and b.
Now, using the pp object, a and b are plotted against the values of z in a line chart, in blue and
green colour respectively.
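The program described above can be sketched roughly as follows. The variable names z, a and b follow the text; the exact arange() arguments are an assumption. Note that arange() stops just before the end value, so with these arguments the last value of z is 9.9.

```python
import numpy as np
import matplotlib.pyplot as pp

# Values 0.0, 0.1, 0.2, ... up to (but not including) 10.
z = np.arange(0, 10, 0.1)
a = np.sin(z)
b = np.cos(z)

pp.plot(z, a, "b")   # sine curve in blue
pp.plot(z, b, "g")   # cosine curve in green
pp.show()
```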
The colour codes given below can be used while preparing a chart such as the one in the above
program.
Code Colour Name
“b” Blue
“g” Green
“r” Red
“c” Cyan
“m” Magenta
“y” Yellow
“k” Black
“w” White
CREATING BAR CHART
A bar graph/chart is a graphical display of data using bars of different heights. We can use bar()
and barh() for this purpose. We can use the width and color parameters of the bar graph.
Example 5: Let us create a bar graph for the monthly sales of an electronic items shop using lists.
On executing the above program the pie graph is created as shown below.
In this code snippet, the pp.savefig() function is used to save the chart as an image. It
saves the chart at the path/location passed to the function as a parameter.
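A minimal sketch of drawing a bar chart and saving it with savefig() is given below. The months, sales figures and file location (sales.png in the system's temporary folder) are assumptions for illustration.

```python
import os
import tempfile
import matplotlib.pyplot as pp

# Sample data (assumed): monthly sales of a shop.
months = ["Jan", "Feb", "Mar", "Apr"]
sales = [250, 310, 280, 400]

# width and color control the appearance of the bars.
bars = pp.bar(months, sales, width=0.5, color="g")
pp.xlabel("Month")
pp.ylabel("Sales")
pp.title("Monthly sales")

# Hypothetical location for the image file.
image_path = os.path.join(tempfile.gettempdir(), "sales.png")
pp.savefig(image_path)

print(image_path)
```

Replacing bar() with barh() draws the same data as horizontal bars; in that case the height parameter takes the place of width.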
Summary
• Data visualisation is the graphical representation of data or information using visual
elements like, chart, graph, maps and so forth.
• The Matplotlib is a python library that provides many interfaces and functionality for 2D-
graphics similar to MATLAB.
• To install Matplotlib, you need to use “pip install Matplotlib” at command prompt.
• Pyplot is a set of functions in Matplotlib library to utilize various graphical objects like
Line, Bar, Pie chart etc.
• Data points are called markers.
• A line chart or line graph can be created using the plot() function.
• The arange() function is used to generate numerical values at a fixed interval for an array.
• The show() function is used to display the figure/graph on the screen.
• The savefig() function is used to save the figure of the graph at a user-given path.
• A pie chart is a type of graph in which a circle is divided into sectors that each represent
a proportion of the whole.
• The explode property of a pie graph separates a slice from the pie chart to display it separately.
Solved Programs
2. Write a program to plot a line chart to display the marks secured by students in a unit test.
Program Output
3. Write a program to draw a bar chart to display the runs scored by India in the last 4 test matches.
Program Output
Practical Exercise
1. Add titles for the x-axis and y-axis, and a title for the whole chart.
2. In a match, the players and the runs they scored are as follows:
Virat=12, Surya=34, Mohit=45, Zaheer=23
In real life applications we always deal with data. In many applications, data is stored,
manipulated, sorted, searched and retrieved as per the requirement through programs designed
by the software developer. As in other programming languages, in Python it is possible to
connect with a database application to develop such types of applications. This requires
libraries that provide the connectivity functionality. To work with MySQL from Python,
you need to install the MySQL connector. In this chapter, you will learn to install the
MySQL connector for Python, connect to a database using the MySQL connector library, and
understand the front-end and back-end interfaces. After connecting to the database, you will be
able to extract data from the database, insert data into it, search data in it, modify data
in it and delete data from it.
Front-end interface
The front-end interface of any software is the screen where the user interacts with the software.
A front-end interface allows the user to enter data into the computer using a well-designed form, as
shown in Figure 7.1. The information filled in this form is stored in the connected database for
future reference.
administration can work on database like table creation, insert data into table, update data, and
delete data. Figure 7.3 illustrates the data stored in the database is the back-end which is
displayed on the screen through front-end programs.
In Windows, download MySQL Installer and install it on the computer. The installation manager
helps to configure the security settings of the MySQL server. On the Accounts and Roles page,
enter a password for the root (admin) account and, optionally, add other users with varying
privileges.
Here, in the above example, the first line imports the MySQL connector using the
statement “import mysql.connector”. In the next line, we have created an object named mydb to
connect with the MySQL database by passing the host name or IP address, user name and password
of MySQL. The host name or IP address, user name and
password are given as samples. Check these credentials in your school lab before making the
connection. Finally, you can print the mydb object to see the connection status from the MySQL
database server. You can also print your own message, such as “database connected successfully”,
as seen in the output.
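The connection step can be sketched as below, assuming the mysql-connector-python package is installed and a MySQL server is running. The function name connect_to_mysql and the credentials in the commented call are hypothetical placeholders, not values from the original program.

```python
def connect_to_mysql(host, user, password):
    """Return a connection object for the MySQL server at the given host."""
    # Imported inside the function so that the function can be defined
    # even on a computer where the connector is not yet installed.
    import mysql.connector
    mydb = mysql.connector.connect(host=host, user=user, password=password)
    return mydb


# Example call with sample credentials (replace with those of your lab):
# mydb = connect_to_mysql("localhost", "root", "your_password")
# print(mydb)
# if mydb.is_connected():
#     print("database connected successfully")
```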
Example 2 : Create database on MySQL and display all database name available in MySQL.
Let us understand the new things in this code. In Example 2, we have created a new
cursor object named “mycur”. This object is used to refer to the result set returned from the MySQL
database after executing a query. We can perform multiple operations row by row against a result
set, with or without returning to the original table.
After creating the mycur object, mycur.execute(“create database school”) will create a new
database named school in MySQL. mycur.execute(“show databases”) will prepare a result set of
all databases in MySQL. Now, using a for loop, we can print all existing databases along with our
newly created database school.
Example 3: Write a python code to create table “student” in “school” database and show the
Tables name available in database “school”, as we have created database “school” in Example
2.
In Example 3, we need to create a new table “student” in the “school” database, as per the
question. For this purpose we have used mycur.execute(“create table student (……..)”) with the
requested fields. After that we can prepare a result set using mycur.execute(“show tables”).
Now, using a for loop, we can print all the tables available in the result set. Finally,
mycur.execute(“desc student”) retrieves all the table fields and their data types. Here a for loop is
used to fetch the records one by one from the cursor “mycur” and display them on the screen.
Example 4: Write a python code to insert data in table “student” which is in “school” database
under MySQL.
In Examples 1, 2 and 3, we have created the “student” table in the “school” database. It is now
required to insert data into the “student” table. It is illustrated in the following example.
In this example, a new variable r1 is created that contains the SQL statement for inserting data into
the table. In this statement, %s is used as a placeholder for each value which we need to insert into
the table. Another variable, v1, of type tuple is created to hold the values of one student: roll
number, name, age and city.
Similarly, three more SQL statement variables and tuples are created for the other students. Now,
using mycur.execute(r1,v1), execute these SQL queries to insert the records into the database. Finally,
save the changes to the database permanently using the mydb.commit() function.
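The insertion steps can be sketched as a small function that works with any connection object returned by mysql.connector.connect(). The function name insert_students, the column names (rollno, name, age, city) and the sample rows in the commented call are assumptions for illustration.

```python
def insert_students(mydb, students):
    """Insert each (rollno, name, age, city) tuple and save the changes."""
    mycur = mydb.cursor()
    # One %s placeholder for every value to be inserted.
    r1 = "INSERT INTO student (rollno, name, age, city) VALUES (%s, %s, %s, %s)"
    for v1 in students:
        mycur.execute(r1, v1)
    # Without commit() the inserted rows are not saved permanently.
    mydb.commit()
    return len(students)


# Example call, assuming mydb is an open connection to the school database:
# insert_students(mydb, [(101, "ravi", 23, "Agra"), (102, "seema", 21, "Delhi")])
```

Passing the values separately from the SQL text, instead of joining them into the string, lets the connector handle quoting of the values.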
All 4 records inserted successfully into the student table can be displayed using a query, as
shown in the following output screen.
Example 5: Write a python code to display data from table “student” with select command.
In Example 4, we inserted 4 records into the “student” table in the “school” database. Now we need
to display data from the “student” table with different SQL statements. Explore the example below
for this purpose.
Here in this example, a variable q1 is created with an SQL query to select all records from the student
table; the SQL statement q1 is then executed using the mycur.execute(q1) statement.
In the next statement, with the help of the fetchall() function, the mycur result set object transfers all
records to the r1 variable, which is part of the Python environment. Now, using a for loop, we can display
all records stored in r1. Similarly, we have created the r2 and r3 result sets for the other two SQL queries.
Example 6: Write Python code to delete data from the “student” table using the delete command.
In Examples 4 and 5, we inserted and displayed data in the table. Deleting data from the
“student” table is illustrated in the following code.
In the above code, an SQL query is prepared in the variable q1 to delete every student record
whose age is 23. At present, the table has only one student whose age is 23, named ravi.
When the mycur.execute(q1) statement runs, the record(s) of student(s) whose age is 23 are
deleted from the student table. In the second-last line, the mydb.commit() statement saves the
updated data permanently into the database.
In the last line, the mycur.rowcount attribute is used to display the number of records deleted
by the query.
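The delete steps can be sketched as below. This is a minimal sketch, not the original listing: the function name delete_by_age is hypothetical, and the age is passed as a %s parameter rather than written into the query text.

```python
def delete_by_age(mydb, age):
    """Delete every student of the given age and return how many were deleted."""
    mycur = mydb.cursor()
    q1 = "delete from student where age = %s"
    mycur.execute(q1, (age,))   # (age,) is a one-element tuple
    mydb.commit()               # without commit() the deletion is not saved
    return mycur.rowcount       # rowcount is an attribute, not a method


# Usage (mydb is an open connection to the "school" database):
# print(delete_by_age(mydb, 23), "record(s) deleted")
```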
Try some other select statements to get different results.
In Example 5, the first record is for ravi, whose age is 23. After the delete command in this
example, that record has been deleted. See the output at the SQL command prompt carefully.
Example 7: Write Python code to update data in the table “student” using the update command.
Suppose Veena requests that her age be updated to 30, as her age in the table is 40. The Head
Clerk also requests that the city be changed to “Mathura” for all students.
These two requests need two update queries. Observe the code given below.
Here we have prepared two SQL statements as Python strings q1 and q2, which are used to update
the required record(s). The first statement, q1, updates age to 30 where rollno is 103. The second
statement, q2, updates city to “Mathura” for all students.
The mycur.execute() statement executes both SQL queries, and mydb.commit() saves the updated
data permanently in the database.
In the output, observe the difference between the two figures: Veena’s age is updated to 30 and
the city of every student is updated to Mathura.
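The two update requests can be sketched as follows. This is a minimal sketch under the assumptions stated in the text (Veena has rollno 103); the function name apply_updates is hypothetical.

```python
def apply_updates(mydb):
    """Run the two update queries described above and save the changes."""
    mycur = mydb.cursor()
    q1 = "update student set age = 30 where rollno = 103"
    q2 = "update student set city = 'Mathura'"   # no WHERE clause: all rows
    mycur.execute(q1)
    mycur.execute(q2)
    mydb.commit()


# Usage (mydb is an open connection to the "school" database):
# apply_updates(mydb)
```

Because q2 has no WHERE clause, it changes every row of the table, which is what the Head Clerk's request requires.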
Summary
• MySQL Connector is used to connect Python with MySQL. Install it with the command
“pip install mysql-connector-python”.
• The fetchall() method is used to fetch all the rows of a query result.
• rowcount is a read-only attribute that returns the number of rows that were affected by
an execute() method.
• To disconnect from the database while working with Python, use the close() function.
• A database cursor is a special control structure that facilitates the row by row
processing of records in the result set.
• You can use the connect() method for establishing a database connection, cursor() to create a
cursor and execute() to execute an SQL query.
• To fetch records from a result set, you can use the fetchone() method to read one record at
a time and the fetchall() method to read all records at a time.
• For INSERT, UPDATE and DELETE queries, you must run the commit() method with the
connection object.
Solved Programs
Consider database “school” having “student” table.
1. Write a Python-MySQL Connectivity code to retrieve data, one record at a time, for students
whose rollno is less than 200.
Solution:
import mysql.connector
mydb=mysql.connector.connect(host="127.0.0.1", user="root", passwd="Admin@123",
    database="school")
print(mydb)
mycur=mydb.cursor()
q1="select * from student where rollno <200"
mycur.execute(q1)
# fetchone() returns one record at a time and None when no records remain
r1=mycur.fetchone()
while r1 is not None:
    print(r1)
    r1=mycur.fetchone()
2. Write a Python-MySQL Connectivity code to insert one student record into the table.
Solution:
import mysql.connector
mydb=mysql.connector.connect(host="127.0.0.1", user="root", passwd="Admin@123",
    database="school")
print(mydb)
mycur=mydb.cursor()
# the %s placeholders are filled from the tuple v1
r1="insert into student(rollno,name,age,city) values(%s,%s,%s,%s)"
v1=(101,"Geeta Sharma",33,"kolkata")
mycur.execute(r1,v1)
mydb.commit()
3. Write a Python-MySQL Connectivity code to update city with Agra where rollno is 101.
Solution:
import mysql.connector
Module Overview
Software engineering concepts are essential for the development of reliable and high-quality
software products. Software engineers create applications by applying the principles of software
engineering. The Software Development Life Cycle (SDLC) is a structured process that enables
the production of high-quality, low-cost software, in the shortest possible production time. SDLC
is a process followed for software development. It consists of a detailed plan describing how to
develop, maintain, replace, and alter or enhance specific software. The life cycle defines a
methodology for improving the quality of software and the overall development process. There
are more than 50 recognized SDLC models, each of them having its advantages and
disadvantages. Some of the most useful models are discussed in this unit.
There are 7 Phases of SDLC that include planning, analysis, design, development, testing,
implementation, and maintenance. In this unit, these phases are discussed in detail. The
implementation of sample project by using the concepts of software engineering is also
demonstrated in this unit.
Learning Outcomes
After completing this module, you will be able to:
• Describe the concepts of software engineering
• Describe the software development process flow
• Implement a minor software project using Python
Module Structure
Session 1: Software Engineering Concepts
Software is an essential component to make effective use of a computer system. The first and
essential software that every computer must have is its operating system.
Software is a large collection of executable programs together with their associated libraries and
documentation. Every piece of software is written for a specific purpose. Software engineers apply
the principles of software engineering to develop robust software applications and to improve
quality, budget, time and efficiency.
In this chapter, you will understand the concept of software engineering used for software
development. The software development life cycle is explained with various phases of software
development. The most commonly used models of software engineering are also explained.
1.1 SOFTWARE ENGINEERING
A computer program and software are related terms but have different scope and purpose.
Program – A program is a set of instructions written in a programming language that performs a
specific task. Programs are coded for software and are not directly used by end-users. Examples
of programs are a program to find the ASCII value of a character, a program to compute a quotient
and remainder, and a program to find the size of int, float, double and char.
Software – Software is a broader term that refers to a collection of programs, data, and other
supporting elements that work together to perform a specific function. Software can be thought of
as a complete package that provides a solution to a particular problem. Software includes
programs, documentation, and other components required to support the program.
Software is categorised into different types, such as system software, application software, and
utility software.
Software engineering is the technological and managerial discipline concerned with systematic
production and maintenance of software products that are developed and modified on time and
within cost estimates.
Software engineering is defined as a process of analyzing user requirements and then designing,
building, and testing software applications which will satisfy those requirements.
IEEE defines Software Engineering (SE) as a knowledge area of computing that defines
systematic, disciplined and quantifiable approaches for the development, operation and
maintenance of software.
1.1.1 Need and Importance of Software Engineering in Software Development
Software is required in almost every industry. The working of software can dramatically affect
our day-to-day work, so it is important to produce good-quality, workable software. It is
essential to apply the principles of software engineering to produce quality software for the
following reasons.
1. Complexity – It becomes difficult to build big and complex software with a large number of
programs. It is possible to reduce the complexity of the project by applying the principles of
software engineering for the development of software.
2. Cost effectiveness – The cost of any software depends on the man-hours required for its
development. It is possible to break up a long project into small components. This helps to
optimize the code and hence reduces the cost.
3. Time optimization – The software development process is complex and it may require a lot
of time to get error free executable code. It is possible to decrease the development time by
applying the principles of software engineering.
4. Effectiveness – The commercial software needs to comply with the standards available with
the company. These standards can be achieved by applying the scientific method of software
engineering.
5. Reliability – The software developed should be reliable, i.e. it should run across platforms
and should serve its purpose every time it is executed. With the help of software engineering
methods, we can achieve high reliability for software products.
1.1.2 Characteristics of Good Software
Good software is characterised by its functionality, reliability, usability, efficiency,
maintainability, security, and scalability. By adhering to these characteristics, software developers
can create high-quality software that meets the needs of users and businesses. Some of the
characteristics of good software are listed below:
1. Functionality – Good software should fulfil its intended purpose and meet the requirements of
its users. It should perform the functions efficiently and accurately for which it was designed.
2. Reliability – Good software should work consistently and without errors, even under varying
conditions.
3. Usability – A user-friendly interface makes the software easy to use, so that users can
accomplish tasks quickly and efficiently.
4. Efficiency – Good software should be efficient and fast, using minimal system resources while
performing its functions. It should run smoothly without causing slowdowns or crashes.
5. Maintainability – The software with modular design is easy to maintain and update. It allows
you to make changes without affecting the entire system.
6. Security – Good software should be secure, protecting against unauthorised access and
malicious attacks. The security measures such as encryption, authentication, and access control
should be implemented in software.
7. Scalability – Good software should be able to handle increasing demands as the user base
grows. It should be designed to scale up or down depending on the workload.
1.2 SOFTWARE DEVELOPMENT LIFE CYCLE (SDLC)
Software development is a complex process, and hence it is carried out in different phases. This
sequence of phases is also called a software life cycle or software process model. It provides a
systematic management framework with specific deliverables at every stage. It comprises a detailed
plan that describes how to develop, maintain, and replace the software. Specific SDLC models
have different ways of implementing these steps. SDLC comprises seven different stages:
planning, analysis, design, development, testing, implementation, and maintenance. Figure 1.1
shows these phases of SDLC.
6. System Implementation
After testing is finished and errors are fixed, the software is ready for deployment and
implementation. At this stage, the software undergoes final testing in a training or
pre-production environment, after which it is ready for release to the market. The software is
released to the customer after this testing. System performance is compared with the performance
objectives established during the planning phase. Implementation includes user notification, user training,
installation of hardware and software, and integration of the system into daily work processes.
This phase continues until the system is operating in production in accordance with the defined
user requirements. The users are then provided with the training or documentation that will help
them to operate the software.
7. System Maintenance
The system is monitored for continued performance in accordance with user requirements. System
requirements often change over a period of time, so it is necessary to update the developed
software at regular intervals. Maintenance is the process of modifying software in order to keep
it working over time. Bugs may surface or security issues may arise for various reasons. This is
particularly important for large systems, which are usually more difficult to test in the
debugging stage. Such issues are addressed during software maintenance. When modifications are
identified, the system may re-enter the planning phase and the software development life cycle
repeats.
1.3 SOFTWARE DEVELOPMENT METHODOLOGIES (SDLC MODELS)
Software development methodologies are called SDLC models. SDLC is a process followed for
software development. It consists of a detailed plan describing how to develop, maintain, replace,
and alter or enhance specific software. The life cycle defines a methodology for improving the
quality of software and the overall development process. There are more than 50 recognized
SDLC models, each of them having its advantages and disadvantages. Some of the most useful
models are discussed below.
1.3.1 Classical waterfall model
The Waterfall model is the first process model to be used for software development. It illustrates
the process in a linear sequential flow and hence it is also called a linear-sequential model. In
this model, each phase must be completed before the next phase can begin and there is no
overlapping in the phases. A drawback of this model is that even a small detail left incomplete
can hold up the entire process.
Figure 1.2 shows the several consecutive phases of the classical waterfall model in software
engineering. You can see that there are six distinct stages in this model, namely requirement
analysis, system design, implementation, testing, deployment, and maintenance. The activities
of each stage can begin only after completing the previous step and all activities are properly
documented. The output of one phase becomes the input of the next. Thus, the development
process can be viewed as a sequential flow in a waterfall.
to build a prototype. This will help to understand the requirements with minimal design, coding
and testing.
For an online inventory management system, a waterfall model is suggested. This software
is not very costly. The software team can gather all the information from the user, followed by
analysis and development.
For a data entry system for office staff who have never used computers before, it is suggested
to go for an incremental model. In this project the user interface and user friendliness are
extremely important. The basic software would be developed and delivered. And in each
increment some functional capability may be added to the system until the system is
implemented. At each step, extensions and design modifications can be made and the testing
can be done at each increment.
5. Software development cost is higher than software maintenance cost.
6. Testing approach changes based on the life cycle applied for development of a software.
7. In an incremental model, requirements do not need to be prioritized.
8. The selection of the software models for the development of software depends on various
factors.
D. Answer the following questions in short.
1. What is the difference between a computer program and software?
2. What are the different categories of software?
3. What is software engineering?
4. List the characteristics of software.
5. What are the phases of software development life cycle?
6. List the various software development methodologies.
7. What are the main types of Agile methodology?
The software development process involves the activities related to the production of the software
such as design, coding, and testing. The Software Development Life Cycle (SDLC) refers to a
methodology with clearly defined processes for creating high-quality software. In this chapter
you will understand in detail the phases of software development.
2.1. REQUIREMENT ANALYSIS
Requirement analysis is the process of determining user expectations for a new or modified
system. It includes gathering, documenting, and analyzing the needs and constraints of
stakeholders.
The following four steps are involved in this process.
1. Feasibility study,
2. Requirement elicitation and analysis,
3. Requirement specification,
4. Requirement validation
2.1.1 Feasibility study
A feasibility study is conducted to determine the viability of the project. It examines all aspects
of a proposed project, including technical, economic, financial, legal and environmental
considerations. It explores various aspects such as usability, maintainability, productivity and
integration. The report of the feasibility study recommends whether or not the project should be
developed.
There are five types of feasibility study.
1. Technical feasibility – It ensures the availability of technical resources such as hardware and
software. The technical team converts the ideas into a working system.
2. Economic feasibility – It analyses the costs and benefits of the project and identifies the
potential economic benefits to the organization.
3. Operational feasibility – It analyses how the system being developed will meet the operational
needs of the organisation.
4. Legal feasibility – It analyses the legal aspects such as zoning laws, data protection acts or
social media laws.
5. Schedule feasibility – It estimates how much time a team needs to complete the project.
2.1.2 Requirement elicitation and analysis (Requirement Gathering)
It is the process of gathering and defining the requirements for a software system. It is based on
a clear and comprehensive understanding of the customer’s needs and requirements. It involves
the identification, collection, analysis, and refinement of the requirements. It involves the
stakeholders including business owners, technical experts and end-users.
The various activities involved in requirement elicitation are as follows.
a. Requirement discovery – It is the process of interacting with stakeholders in the system to
collect their requirements. Domain requirements from stakeholders and documentation are also
discovered during this activity.
b. Requirement classification and organisation – This activity takes the unstructured
collection of requirements, groups related requirements and organizes them into coherent
clusters.
c. Requirement prioritization and negotiation – This activity is concerned with prioritizing
requirements, finding and resolving requirements conflicts through negotiation.
d. Requirement documentation – The requirements are documented and input into the next
round of the spiral. Formal and informal requirements documents may be produced.
Requirements Elicitation Techniques
There are many techniques to obtain critical information from stakeholders. The most commonly
used techniques are Brainstorming, Interview, Focus Group, Observation, Document
Analysis/Review, Prototyping and Survey/Questionnaire.
a. Brainstorming – The subject matter experts discuss in group and generate new ideas to find
a solution for a specific issue. Each member is given time to share their ideas.
b. Interview – In this technique, the interviewer asks the stakeholders questions to obtain
information. An interview can be structured or unstructured. A structured interview consists of
questions that are answered in Yes or No form. In unstructured interviews, open-ended questions
are used to get detailed information.
c. Document Analysis/Review – This technique is used to gather the information through
available documents. It includes business plans, technical documents, problem reports, and
existing requirement documents. This is useful to update or migrate an existing system.
d. Focus Group – A focus group consists of 6 to 12 subject matter experts. They discuss the
topic in the group and the moderator manages the discussion to analyze the results and
provide findings to the stakeholders.
e. Observation – The necessary information can be obtained through observation. The observer
records all the activities and the time taken to perform the work. Observation can be either active
or passive. In active observation information is obtained by asking the questions while passive
observation is silent and information can be obtained by observing the work.
f. Prototyping – In this technique, a prototype is created and demonstrated to the client to give
an idea of the product. Prototypes can be used to create a mock-up of sites, and describe the
process using diagrams.
Modifiable – SRS should be capable of easily accepting changes to the system. To be modifiable,
requirements documents must have a logical structure. Modifications should be properly indexed
and cross-referenced.
Ranked – The organization and structure of the requirements document establish a ranking of
specification statements based on stability and importance. It becomes difficult to create a
document for large and complex problems.
Testable – A requirement defined in the SRS can be tested and validated. For example, the
requirement “The system is user-friendly” is not testable. It should be written as “The user
interface should be menu driven with a tooltip for all the text boxes”.
Traceable – The SRS is traceable if the origin of each of the requirements is clear and if it
facilitates the referencing of each condition in future development or enhancement
documentation.
Unambiguous – A requirement statement is unambiguous if it can only be interpreted in one
way. The use of weak phrases or poor sentence structure will lead to misunderstanding in the
specification statement.
Valid – To validate requirement specification, all project participants, including managers,
engineers, and customer representatives, should be able to comprehend, analyze, and accept or
reject it.
Verifiable – The SRS is verifiable if it is possible to check that each requirement mentioned in
it is met by the system. The requirements are verified with the help of reviews. For example, a
requirement stating that the system must be user-friendly is not verifiable, so it should not be
mentioned in the SRS.
2.2.2 Structure of SRS document
The type of information included in SRS is determined by a number of factors, including the type
of software being developed and the approach used in its development.
The general structure of SRS as proposed by IEEE standard is given below.
1. Introduction
1.1 Purpose
1.2 Scope
1.3 Definitions, Acronyms and Abbreviations
1.4 Reference
1.5 Overview
2. Overall Description
2.1 Product Perspective
2.2 Product Functions
2.3 User Characteristics
2.4 General Constraints
2.5 Assumptions and Dependencies
3. Specific Requirements
3.1 External Interface Requirements
3.1.1 User Interfaces
3.1.2 Hardware Interfaces
3.1.3 Software Interfaces
3.1.4 Communication Interfaces
3.2 Functional Requirements
3.2.1 Mode 1
3.2.1.1 Functional Requirement 1.1
3.2.1.2 Functional Requirement 1.2
3.2.2 Mode 2
3.2.2.1 Functional Requirement 2.1
Symbol              Meaning
Square              Defines a source or destination of data.
Arrow               Identifies data flow, i.e. data in motion. It is a pipeline through
                    which information flows.
Circle or bubble    Represents a process that transforms incoming data flow into
                    outgoing data flow.
Levels of DFD
Level 0 or Context diagram – Level 0 DFDs are also known as context diagrams. It is the
highest abstraction level, which depicts the entire information system as one diagram. It starts
with mentioning major processes with little details and then goes on giving more details of the
processes with the top-down approach. It establishes the context in which the system operates,
such as who the users are, what data they input to the system, and what data they receive
from the system. The data input to the system and the data output from the system are represented
as incoming and outgoing arrows. Figure 2.1 shows the context diagram of the school
management system.
of information. Figure 2.3 shows Level 1 DFD of the result management system, where high level
processes of Level 0 are further broken down into subprocesses.
provides a way of documentation for the complete database system in one place. Validation of
DFD is carried out using a data dictionary. Data dictionary contains the following items.
Data Elements – It is the smallest unit of data that provides for no further decomposition. For
example, DATE consists of day, month and year
Data Structure – It is a group of data elements handled as a unit. For example, a phone number
is a data structure consisting of four data elements: area code, exchange, number and extension.
Data Flows and Data Stores – Data flows are data structures in motion, whereas data stores
are data structures at rest. A data store is a location where data structures are temporarily
located.
A typical data dictionary for a student record is as follows.
Data Field    Description                 Data Type    Length
STUD_NAME     Name of the student         Character    20
DOB           Date of Birth of student    Date         8
SEX           Sex of the student          Boolean      1
AGE           Age of the student          Number       2
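A data dictionary like the one above can also be used to check records in a program. The following is a minimal sketch in Python; the mapping of the data types Character, Date, Boolean and Number to Python types, the function name validate and the sample record are all assumptions made for illustration.

```python
from datetime import date

# Data dictionary from the table above: field -> (Python type, maximum length).
# The choice of Python types here is an assumption made for this sketch.
DATA_DICTIONARY = {
    "STUD_NAME": (str, 20),
    "DOB": (date, 8),
    "SEX": (bool, 1),
    "AGE": (int, 2),
}


def validate(record):
    """Return a list of problems found in a record; an empty list means valid."""
    problems = []
    for field, (ftype, length) in DATA_DICTIONARY.items():
        if field not in record:
            problems.append(field + ": missing")
            continue
        value = record[field]
        if not isinstance(value, ftype):
            problems.append(field + ": wrong data type")
        elif ftype is str and len(value) > length:
            problems.append(field + ": too long")
        elif ftype is int and len(str(abs(value))) > length:
            problems.append(field + ": too many digits")
    return problems


# Hypothetical sample record
student = {"STUD_NAME": "Veena", "DOB": date(2007, 5, 14), "SEX": False, "AGE": 17}
print(validate(student))                   # an empty list: the record is valid
print(validate({**student, "AGE": 123}))   # reports that AGE has too many digits
```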
3. Decision Trees
A Decision Tree is a graph that uses a branching method to display all the possible outcomes of
any decision. It helps in processing logic involved in decision-making. It is a diagram that shows
conditions and their alternative actions within a horizontal tree framework. Decision trees depict
the relationship of each condition and their permissible actions. A square node indicates an
action and a circle indicates a condition.
For example, bookstores get a trade discount of 25%. For orders from libraries and individuals,
5% is allowed on orders of 6-19 copies per book title, 10% on orders of 20-49 copies per book
title, and 15% on orders of 50 copies or more per book title. A decision tree for this is shown
in Figure 2.5.
4. Decision Tables
             Rule                                1   2   3   4   5   6
If           Customer is Bookstore               Y   Y   N   N   N   N
(Condition)  Order size 6 copies or more?        Y   N   -   -   -   -
             Customer Librarian or Individual    -   -   Y   Y   Y   Y
             Order size 50 copies or more?       -   -   Y   N   N   N
             Order size 20-49 copies?            -   -   -   Y   N   N
             Order size 6-19 copies?             -   -   -   -   Y   N
Then         Allow 25% Discount                  X
(Action)     Allow 15% Discount                          X
             Allow 10% Discount                              X
             Allow 5% Discount                                   X
             No Discount allowed                     X               X
             Action Stub                         Action Entry
(Y = yes, N = no, - = condition not relevant to the rule, X = action taken)
5. Structured English
Structured English is the description of the programming code in simple English. It uses common
verbs such as IF-THEN-ELSE and DO WHILE-ENDDO. Structured English is based on
structured logic, used to express all logic in terms of sequential structures, decision structures,
iterations and case structures. It assists programmers to write error-free code.
Example: The Structured English notation for the problem mentioned in the Decision tree can
be represented as below.
IF order is from Bookstore
    and-IF order is for 6 copies or more per book title
        THEN: Discount is 25%
    ELSE (order is for fewer than 6 copies per book title)
        SO: no discount is allowed
ELSE (order is from libraries or individuals)
    SO-IF order is for 50 copies or more per book title
        Discount is 15%
    ELSE IF order is for 20 to 49 copies per book title
        Discount is 10%
    ELSE IF order is for 6 to 19 copies per book title
        Discount is 5%
    ELSE (order is for fewer than 6 copies per book title)
        SO: no discount is allowed
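The Structured English above translates almost line by line into Python. The following is a minimal sketch; the function name discount_percent and the customer labels "bookstore", "library" and "individual" are hypothetical names chosen for illustration.

```python
def discount_percent(customer, copies):
    """Return the discount (in percent) for an order of one book title."""
    if customer == "bookstore":
        # Bookstores get the trade discount only for 6 copies or more
        return 25 if copies >= 6 else 0
    # Otherwise the order is from a library or an individual
    if copies >= 50:
        return 15
    if copies >= 20:
        return 10
    if copies >= 6:
        return 5
    return 0


print(discount_percent("bookstore", 10))   # 25
print(discount_percent("library", 30))     # 10
print(discount_percent("individual", 3))   # 0
```

Checking the largest order size first means each later condition only needs a lower limit, just as the ELSE IF chain does in the Structured English.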
6. Pseudocode
Pseudocode is normally produced before the target code is written for the software application.
It is written in a form close to the programming language used. It can be thought of as an
augmented programming language, with lots of comments and descriptions.
The pseudocode to check whether a given integer is even or odd can be written as below.
READ x
COMPUTE x%2
IF x%2 == 0
    PRINT "Even Number"
ELSE
    PRINT "Odd Number"
EXIT
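For comparison, the same logic can be written in Python. This is a minimal sketch; the function name parity is hypothetical, and the value is passed as an argument instead of being read from the keyboard.

```python
def parity(x):
    """Return "Even Number" or "Odd Number" for the integer x."""
    if x % 2 == 0:
        return "Even Number"
    else:
        return "Odd Number"


print(parity(10))   # Even Number
print(parity(7))    # Odd Number
```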
User interface design is essential for ensuring that the system is user-friendly and easy to
navigate, which can greatly enhance its overall usability.
2.3.2 Low Level and High Level System Design
System design is conducted at two levels that are Low-Level Design and High-Level Design.
1. Low-Level System Design
Low-Level Design refers to the component-level design process, where the large system is broken
down into smaller, manageable components and the interactions between these components are
defined. This may involve creating detailed system diagrams, flowcharts, and other visual
representations of the system's architecture. It also includes the detailed design of software
algorithms, data structures, interfaces, specific programming languages, software libraries, and
hardware components that are required to implement the system's functionality. Low-level
system design follows the higher-level conceptual and logical design phases, and is focused on
the specific implementation details of the system. The low-level system design includes defining
the database architecture and ER-diagrams, creating tables and defining relationships between
them, deciding a design pattern, classes structure and models.
2. High Level System Design
This is the first phase of system design, and is focused on creating a comprehensive blueprint
for the system that can guide the detailed design and implementation phases that follow. It refers
to the process of designing the overall architecture and components of a system at conceptual
level. It involves creating a broad understanding of the system's goals and objectives, identifying
key components and their interactions, and exploring alternative design options. The goal of
high-level system design is to provide a clear and complete picture of the system's structure
and functionality.
2.3.3 Software Design Strategies
Software design is a process to conceptualize the software requirements into actual design. The
design strategies help to conceptualize a plan into the best possible design for implementing the
intended solution.
Structured Design
The main objective of structured design is to minimize the complexity and increase the
modularity of a program. Structured design is based on the divide and conquer technique, in
which a large problem is divided into several small tasks. The small pieces of the problem are
solved by means of solution modules. Solution modules are used to address the individual
problems. In structured design these modules are arranged in hierarchy to communicate with
each other to produce exact results.
Modularization
Structured design uses the modularization approach to minimize the complexity of the project.
In this approach, the program is divided into small and independent modules in a top-down
manner. There are two types of modular design strategy: the top-down and the bottom-up strategy.
1. Top-Down Strategy
The top-down strategy uses the modular approach to design the system. It starts from the
highest-level or topmost module and moves towards the lowest level or bottom modules. In this
technique, the main module is divided into several small modules or segments based on the task
performed by each module. Then, each module is further subdivided into several submodules of
the next lower level. Figure 2.6 shows the top down design strategy.
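The top-down strategy can be sketched in Python with one main module that calls smaller modules. The result-processing task and all the function names below are hypothetical and serve only to illustrate the idea.

```python
# Lowest-level modules: each one performs a single small task
def read_marks(raw):
    """Convert a comma-separated string of marks into a list of integers."""
    return [int(m) for m in raw.split(",")]


def compute_total(marks):
    return sum(marks)


def compute_percentage(total, subjects):
    # Assumes every subject is marked out of 100
    return total / subjects


# Topmost (main) module: divides the task and calls the lower-level modules
def prepare_result(raw):
    marks = read_marks(raw)
    total = compute_total(marks)
    return {"total": total, "percentage": compute_percentage(total, len(marks))}


print(prepare_result("80,70,90"))
```

The design starts with prepare_result and only then fills in the smaller modules it needs, which is the order in which the top-down strategy works.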
down and modular design method where a complete problem is divided into many modules till
each module becomes manageable. It is a design tool that displays the relationship between
program modules. It consists of a diagram of rectangular boxes that represent the modules,
joined by connecting arrows or lines, as shown in Figure 2.9.
The number of times an entity of an entity set participates in a relationship set is known as
cardinality. There are four types of cardinalities.
1. One to One – One entity from entity set A can be
contained with at most one entity of entity set B and
vice versa. Let us assume that each student has only
one student ID, and each student ID is assigned to
only one person. So, the relationship will be one to
one.
2. One to many – When a single instance of an entity
is associated with more than one instance of another
entity then it is called one to many relationships. For
example, a client can place many orders; an order
cannot be placed by many customers.
3. Many to One – More than one entity from entity
set A can be associated with at most one entity of
entity set B, however an entity from entity set B can
be associated with more than one entity from entity
set A. For example, many students can study in a
single school, but a student cannot study in many
schools at the same time.
4. Many to Many – One entity from A can be
associated with more than one entity from B and vice
versa. For example, a student can be assigned to
many projects, and a project can be assigned to many
students.
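The four cardinalities can be illustrated with a small Python sketch. It is only an illustration
under assumptions: the function classify_cardinality and the sample data are hypothetical, and
the function inspects sample links between two entity sets, whereas in a real ER design the
cardinality is a rule stated by the designer.

```python
from collections import defaultdict


def classify_cardinality(pairs):
    """Classify (a, b) links between entity set A and entity set B."""
    a_to_b = defaultdict(set)
    b_to_a = defaultdict(set)
    for a, b in pairs:
        a_to_b[a].add(b)
        b_to_a[b].add(a)
    # Does any entity of A link to several entities of B, and the other way round?
    a_links_many = any(len(linked) > 1 for linked in a_to_b.values())
    b_links_many = any(len(linked) > 1 for linked in b_to_a.values())
    if a_links_many and b_links_many:
        return "many to many"
    if a_links_many:
        return "one to many"
    if b_links_many:
        return "many to one"
    return "one to one"


student_ids = {("Asha", "ID-1"), ("Ravi", "ID-2")}
orders = {("Customer-1", "Order-1"), ("Customer-1", "Order-2")}
schools = {("Asha", "School-1"), ("Ravi", "School-1")}
projects = {("Asha", "P1"), ("Asha", "P2"), ("Ravi", "P1")}

print(classify_cardinality(student_ids))  # one to one
print(classify_cardinality(orders))       # one to many
print(classify_cardinality(schools))      # many to one
print(classify_cardinality(projects))     # many to many
```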
ER Diagram for Student Management System
A sample ER diagram for the Student management system is shown in Figure 2.14 with entities
and their associated attributes.
Together, these principles ensure that the software developed is not only technically sound, but
also well aligned with user needs and business goals.
2.5 SYSTEM TESTING
Software testing is an important stage in the software development process that ensures the
software works properly. It is the process of evaluating and verifying that the software meets
the expected requirements, specifications, functionality, and performance. It helps to enhance
the quality of the software in terms of accuracy, reliability, scalability, practicability,
usability, portability and reusability.
The process of software testing involves running the software under controlled conditions at
various levels.
Verification and validation testing
Verification and validation are two essential processes in software testing.
Verification is the static testing process that checks, without executing the code, whether the
software is being built according to the specified requirements. The main activities involved in
the verification process are inspection, review, walkthrough, and desk-checking.
Validation is the dynamic testing process that evaluates the software at the end of the
development process to determine whether the software product meets the customer’s
expectations and requirements.
Types of Software Testing
Software testing is a complex process carried out through various types of software testing, each
designed to meet specific objectives and address different aspects of the software. The various
types of testing can be divided into various levels as shown in Figure 2.15.
Continuous testing – Continuous testing integrates testing into every phase of the software
development lifecycle, providing continuous quality assurance and rapid feedback on potential
problems.
Testing approaches
There are three types of testing approaches – black-box, white-box and grey-box testing.
Black Box Testing – In this type of testing the tester does not have access to the source code of
the system that's being tested. It considers the system's external behavior and does not require
programming skills. It can be used for functional tests, as well as for non-functional tests such
as performance testing, usability, and accessibility. It is done by the Quality Assurance (QA)
team at higher levels. It takes less time to perform.
White-Box Testing – It is also known as clear box testing, glass box testing, code-based testing,
or structural testing. In this type of testing the tester is aware of the internal workings of the
system and has access to its source code. It focuses on the logic and the implementation of the
software. It is usually done through automated testing, relies on a good understanding of the
system’s code, and is usually performed by the developers.
Grey Box Testing – It is a combination of white-box testing and black-box testing. Its aim is to
find defects, if any, caused by improper structure or improper usage of applications. The grey
box tester may not have complete knowledge of an application's source code but may have partial
knowledge of it and/or access to design documentation.
Functional Testing and Non-Functional Testing
In the next level, software testing is classified as functional testing and non-functional testing.
A. Functional Testing
Functional testing tests the specific actions and features of the software against the functional
requirements. The various types of functional testing are Unit testing, Integration testing, System
testing and User Acceptance Testing.
1. Unit Testing – Unit testing is a method in which individual units or components of the
software application are tested. A unit is usually a function, procedure, or method. Unit testing
is mainly executed at an early stage of software development, so it helps to identify bugs early
and improves the overall quality of the software. For example, in unit testing, the login button
is tested to ensure that it routes to the correct page.
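The login example can be sketched as a unit test using Python's standard unittest module. The
function route_after_login, the page paths and the sample account are hypothetical; they stand in
for whatever unit the real application would expose.

```python
import unittest


def route_after_login(username, password, accounts):
    """Unit under test: return the page a login attempt should be routed to."""
    if accounts.get(username) == password:
        return "/dashboard"
    return "/login?error=1"


class RouteAfterLoginTest(unittest.TestCase):
    def setUp(self):
        # Hypothetical account store used only by these tests.
        self.accounts = {"asha": "s3cret"}

    def test_valid_credentials_route_to_dashboard(self):
        page = route_after_login("asha", "s3cret", self.accounts)
        self.assertEqual(page, "/dashboard")

    def test_wrong_password_routes_back_to_login(self):
        page = route_after_login("asha", "wrong", self.accounts)
        self.assertEqual(page, "/login?error=1")

    def test_unknown_user_routes_back_to_login(self):
        page = route_after_login("ravi", "s3cret", self.accounts)
        self.assertEqual(page, "/login?error=1")
```

If the code is saved in a file, the tests can be run with the command python -m unittest followed
by the file name. Each test checks one behaviour of a single unit in isolation.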
2. Integration Testing – In this method the different units or modules of the software application
are combined and tested as a group. The advantage of this method is that it helps to identify
errors that appear when the different units of the software work together. In integration testing,
errors related to performance, requirements, and functionality are investigated. In unit testing
individual units are tested, whereas in integration testing the behaviour of those units is
checked when they are integrated.
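The difference from unit testing can be seen in a minimal sketch. The two units below
(parse_amount and Account) and the function that joins them are hypothetical; the test exercises
them together rather than one at a time.

```python
def parse_amount(text):
    """Unit 1: convert user input such as '250' into a positive integer amount."""
    amount = int(text)
    if amount <= 0:
        raise ValueError("amount must be positive")
    return amount


class Account:
    """Unit 2: holds a balance and applies deposits."""

    def __init__(self, balance=0):
        self.balance = balance

    def deposit(self, amount):
        self.balance += amount
        return self.balance


def deposit_from_form(account, text):
    """Integration point: the output of parse_amount is fed into Account.deposit."""
    return account.deposit(parse_amount(text))


def test_deposit_from_form():
    account = Account(balance=100)
    assert deposit_from_form(account, "250") == 350
    try:
        deposit_from_form(account, "-5")
    except ValueError:
        pass
    else:
        raise AssertionError("a negative deposit should be rejected")
    # The rejected deposit must leave the balance unchanged.
    assert account.balance == 350


test_deposit_from_form()
```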
3. System Testing
System testing is mainly executed to investigate the behavior, architecture, and design of the
software. In system testing, all the integrated modules of the complete system are tested to verify
and validate the system requirements. It involves different tests, such as validating the output
for a particular input and evaluating the user’s experience. Here, performance and quality standards
are tested in compliance with the technical and functional specifications.
4. User acceptance testing
In this type of testing the end users test the software to verify that it performs the required tasks
in real-world scenarios according to the specifications. There are two main subtypes: alpha and
beta testing.
Alpha Testing – It is conducted by internal staff in a controlled environment, to identify bugs
and issues before the software is released to external users.
Beta Testing – In this testing the software is distributed to the external users to test its
functionality in real-world conditions. It collects feedback and user experience to make the
necessary changes before the official release of the software.
B. Non-functional Testing
Non-functional testing considers the non-functional aspect of the software like performance,
usability, reliability, portability, efficiency, security, and others.
The various types of non-functional testing are Performance testing, Security testing, Usability
testing and Compatibility testing.
1. Performance Testing
This test evaluates the performance of the system on parameters such as speed, responsiveness,
and stability under various conditions. It is essential to ensure that the software performs well
in terms of speed and response time under the expected workload. Load testing, Stress testing,
Spike testing, Endurance testing and Scalability testing are the subtypes of performance
testing.
2. Security testing
Security testing focuses on identifying vulnerabilities, threats, and risks in a software application
to prevent malicious attacks. It involves two crucial aspects of testing: authentication and
authorization. Security testing makes the application secure and able to store confidential
information when required. It also checks how the software behaves when attacked by hackers
and how data security should be maintained once such attacks are noticed. The different types
of security testing include Penetration Testing, Vulnerability Scanning, Security Auditing,
Security Scanning, Ethical Hacking and Portability Testing.
3. Usability Testing
Usability testing is done to ensure the quality and ease of use of an application. For example,
usability testing of a gaming application checks whether it can be operated with both hands, the
colour of the background, the vertical scroll, and so on. The types of usability testing include
Cross-Browser Testing, Accessibility Testing and Exploratory Testing.
4. Compatibility testing
Compatibility testing focuses on evaluating whether a software application works as intended
across different browsers, databases, hardware, operating systems, mobile devices, and
networks. It ensures the compatibility of the software with different environments and
configurations.
2.6 SYSTEM IMPLEMENTATION
System implementation is the process of installing, configuring, and integrating systems. It
involves various tasks,
such as setting up hardware and software components, migrating data, customizing features,
training users, and providing support. Software implementers use different techniques,
standards, and best practices to ensure the software works properly and meets the needs and
goals of the organization. System implementation allows access to the latest technology by
replacing old applications with new software. New applications increase customer satisfaction
with a more user-friendly experience. System implementation requires technical, operational,
and organizational skills.
Software implementation methodologies
Software implementation methodologies are frameworks or models that provide the principles,
practices, and procedures to achieve the desired outcomes. They include software development
methodologies such as the Waterfall, Agile, Iterative and Incremental approaches. Each
methodology has its own advantages and disadvantages.
The best software implementation methodology depends on various factors, such as the size,
complexity, and nature of the project, and user requirements and preferences. Therefore, it is
important to evaluate and compare the pros and cons of each software implementation
methodology, and choose the one that suits the project's goals, requirements, and context.
Elements of a successful software implementation
The following are the elements that are considered for the successful implementation of software.
Defining the organization's needs
It is necessary to understand the organization's needs to choose the appropriate software in
terms of its deliverables, number of users, platform, functionality and compatibility with the
organization's existing systems and security features.
Choosing the appropriate software
After understanding the organization's needs, it is easier to look for the appropriate software.
Search for the latest system with innovative features that meet the organization's requirements.
Installing the application
Generally, vendors provide installation support at no extra charge, often in collaboration with
the organization's IT department. This collaboration can ensure that all necessary devices get
the application and can facilitate seamless integration with existing systems. If the IT
department installs the new application independently, provide it with the vendor's instructional
manual or contact information to troubleshoot potential issues.
Configuring features
Once the application is installed, configure the most basic features first. Configuring simple
features typically involves using the program's default settings. Keeping the initial configuration
process simple also allows us to troubleshoot underlying issues before adding more complicated
features.
Customizing features
Initially the default settings can help to start the system with basic features. To offer more
flexibility, it is necessary to customize the advanced features of the system. Additional
customization helps employees to understand their progress and encourages them to meet their
goals.
Integrating with existing systems
During the selection process, it's important to choose an application that can integrate with the
organization's systems. Compatibility allows multiple features to work together and prevents
errors. As the team integrates the new application, it should consider how to transfer data
from systems that the organization is no longer using. Automatic data migration can help the
team save time while protecting sensitive information, including customer payment details.
Training employees
A good training program can ensure employees understand how to use the new application. It
may emphasize how the software differs from old systems and how employees can optimize the
various features. As part of the training program, consider providing employees with their
account login information and establishing the appropriate permissions.
Testing the software
Some errors are to be expected while using new software. Application testing allows us to
evaluate the effectiveness of each feature and identify bugs that affect multiple users. Identify
the features that require improvement, make the appropriate adjustments, and then conduct
more testing. Testing also allows us to identify issues that the vendor is responsible for
addressing.
Software Implementation Challenges
There are some challenges faced by the development team while implementing the software.
Some of them are mentioned below.
Code-reuse – The interfaces of modern programming languages are very sophisticated and are
equipped with large libraries of functions. Still, to bring down the cost, the organization's
management prefers to reuse code created earlier for some other software. Programmers face
significant difficulties in checking compatibility and deciding how much code to reuse.
Version Management – Every time a new version of the software is issued to the customer,
developers have to maintain version and configuration related documentation. This
documentation needs to be highly accurate and available on time.
Target-Host – The software program, which is being developed in the organization, needs to be
designed for host machines at the customer's end. But at times, it is impossible to design
software that works on the target machines.
Software Documentation
Software documentation is a comprehensive collection of written materials that describe and
explain a software system. It includes various documents that provide the design, functionality,
architecture, and use of the software. The documentation serves as a critical resource for
developers, stakeholders, and end users to help them understand, use, and maintain the
software effectively.
Types of software documentation
There are five major types of software documentation addressing different aspects of the
development lifecycle.
1. Process documentation
Process documentation focuses on capturing and describing the workflows, procedures, and
software development methodologies involved in a software development life cycle. It serves as a
comprehensive guide for project managers, system administrators, software developers, and
stakeholders, to understand the sequence of tasks, dependencies, and responsibilities
throughout the software lifecycle. There are two types of documents in this category.
a) Process Flowcharts – Visual representations of the step-by-step sequence of activities and
decisions in a software development process, providing a clear overview of the workflow and
potential branching paths.
b) Standard Operating Procedures (SOPs) – Detailed written instructions that outline the
specific tasks, roles, and responsibilities for each phase of the software development
lifecycle, ensuring consistent adherence to established processes.
2. Requirement documentation
It contains all the functional, non-functional and behavioural descriptions of the software. It
includes a detailed outline of the software’s intended features, functionality, and performance
expectations. This documentation serves as the foundation for the entire development process,
providing a clear roadmap for creating the software and guiding the work of developers,
designers, and testers. There are five key types of requirements documentation.
a) Software Requirements Specification (SRS) – It provides a detailed description of the
software’s functional and non-functional requirements. It includes user needs, system
capabilities, constraints, and interfaces, serving as the foundation for the entire development
process.
b) Use Cases – Descriptions of interactions between users and the software, illustrating how
the software responds to various user actions. Use cases help in identifying system
behaviour and understanding user interactions with the application.
c) Functional Requirements – It includes statements that specify the software’s expected
behaviour and the actions it must perform in response to certain inputs. In other words,
functional requirements define the features and functionalities that the software should
deliver.
implementation, invalid or incomplete tests. This type of problem needs immediate attention as
it hampers the day-to-day work of the end user. Proper planning and interaction with the end
user during the system development process can minimize the need for corrective
maintenance.
2. Adaptive Software Maintenance: Keeping pace with change
A software system, once developed, needs changes over time because of changes in the
environment, the organisation's functioning, or its policies. The client may require the system to
offer modified functionality or to run on a new platform or operating environment, as required
by new hardware and software. Maintenance of the software to adapt to these kinds of changes is
called adaptive maintenance. This activity is not as urgent as corrective maintenance, because
these changes are gradual and give the system group sufficient time to modify the software.
Adaptive maintenance is necessary to ensure the functionality of software in a changing
technological landscape.
3. Perfective Software Maintenance: Tuning for peak performance
Perfective maintenance focuses on enhancing and improving the software to meet evolving user
needs. It includes adding new features, improving user interfaces, and optimizing software
performance. It makes the software more efficient, easier to use, and relevant in a competitive
marketplace. It enhances the user experience and extends the useful life of software assets,
leading to improved system performance.
4. Preventive Software Maintenance: Anticipating and preventing issues
Preventive software maintenance helps to protect the system against future vulnerabilities. It
includes tasks such as optimizing code, refining documentation, and updating systems to
improve overall performance and maintainability. It addresses problems, which are not
significant at this moment but may cause serious issues in future. It reduces the need for
corrective maintenance. It is done when the system is least used or not used at all. It does not
add value to the system, but lowers the cost of corrective maintenance.
Cost of Maintenance of Software
The cost of software maintenance is very high, about 60% to 67% of the total cost of the
software development process. The standard lifespan of any software is up to 10-15 years. With
advancement in software technology, the cost of maintaining old software becomes very high.
The most common method engineers use to correct a software problem is trial and error, and
the changes are usually left undocumented, which in turn causes conflicts in future. This
process alters the structure of the original software. Other factors at the software end are the
programming language used for development, the structure of the software, the reliability and
availability of staff, and dependence on external software.
The cost of software maintenance is influenced by several factors, such as the size and
complexity of the software, program lifetime, dependency on the external environment, hardware
stability and the frequency of change in the software system.
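The 60% to 67% share quoted above can be turned into a quick estimate. The sketch below is a
worked example under assumptions: the total cost of 1,000,000 is a made-up figure, and the
function name maintenance_cost_range is hypothetical. Integer arithmetic is used so that the
results are exact.

```python
def maintenance_cost_range(total_cost, low_percent=60, high_percent=67):
    """Return the (lowest, highest) expected maintenance cost for a given total cost."""
    low = total_cost * low_percent // 100
    high = total_cost * high_percent // 100
    return low, high


low, high = maintenance_cost_range(1_000_000)
print(low, high)  # 600000 670000
```

In this example, out of a total cost of 1,000,000, between 600,000 and 670,000 would be spent on
maintenance.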
4. Requirements validation is the process of
(a) gathering the requirements (b) jotting down the requirements (c) checking the requirements (d) noting the requirements
5. Requirement specification is a process of
(a) gathering the requirements (b) jotting down the requirements (c) checking the requirements (d) noting the requirements
6. Which of the following is not a type of attribute in ERD?
(a) Key attribute (b) Derived attribute (c) Strong attribute (d) Multivalued attribute
7. Which of the following is not a type of entity in ERD?
(a) Entity set (b) Strong entity (c) Weak entity (d) Key entity
8. Which of the following is not a type of relationship in ERD?
(a) Unary relationship (b) Binary relationship (c) Ternary relationship (d) Null relationship
9. Which of the following is not a structured analysis tool?
(a) DFD (b) Data dictionary (c) Structure chart (d) Structured English
10. Which of the following is not a structured design tool?
(a) HIPO diagram (b) Structured flowchart (c) Structure chart (d) Structured English
B. Fill in the blanks
1. The number of times an entity of an entity set participates in a relationship set is known
as _______.
2. Observation can be either _________ or ___________ .
3. Closed-ended questions have a __________ set of answers.
4. Technical requirements describe the technical specifications and ___________ .
5. The requirements are verified with the help of ___________ .
6. Level 0 DFD is also called _____________ .
7. Structured analysis helps to understand the system in a __________ .
8. DFD can be _________ and _________ .
9. Validation of DFD is carried out using __________ .
10. Structured English is based on __________ .
11. Structured design follows the rules of __________ and ___________ .
12. Structured design is based on the _________ and __________ technique.
13. Two types of modular design strategy are ____________ and __________ .
14. Preventive maintenance reduces the need for _________ maintenance.
15. Strong entity has a ______ attribute as primary key.
C. State whether True or False
1. Unit testing helps to identify bugs at an early stage.
2. Integration testing identifies errors when the different units of the software work together.
3. A structured interview consists of open-ended questions.
4. Functional requirements describe the behavior of the system.
5. The information in active observation can be obtained just by observing.
6. Data dictionary is the structured repository of data elements.
7. Decision tree depicts the relationship of each condition and their permissible actions.
8. Structured English is the description of the programming code in simple English.
9. Pseudocode is normally produced after the target code is generated.
10. Logical design relates to the actual input and output processes of the system.
11. High Level Design refers to the component-level design process.
12. Low cohesion and high coupling arrangements result in good structured design.
13. Structure chart is a top down and modular design method.
14. HIPO diagram organizes the software modules into a hierarchy.
15. The For loop may or may not execute at all depending on the result of the condition.
16. The ER Model is a graphical approach to database design.
17. A strong relationship exists between two strong entities.
18. Strong entity holds a weak relationship with the weak entity.
19. The degree of relationship depends on the number of different entity sets participating in
a relationship.
20. Validation is the static testing process and Verification is the dynamic testing process.
D. Answer the following questions in short.
1. What are the types of feasibility studies?
2. What are the types of documentation?
3. What are the four types of software maintenance?
4. What are the various activities involved in requirement elicitation?
5. What are the various requirement elicitation techniques?
6. What is the difference between DFD and Flowchart?
7. What is the difference between logical and physical DFD?
8. List the elements of a data dictionary.
9. What is metadata? Give examples of metadata.
10. What are the different types of system design?
11. What is the difference between a strong entity and a weak entity?
12. What is the difference between white box testing and black box testing?
13. What is gray box testing?
Assignment
Project Implementation using Concepts of Software Engineering
Let us consider a prototype software project, details of which are as given below.
Project Title: Banking Management System
Purpose of the Project
The main purpose of the software project is to simplify the tedious task of banking by providing
a user friendly environment. It also aims at increasing the efficiency and reducing the drawbacks
of existing manual banking processes. This makes it a more convenient banking tool for the
customers.
Scope
In this system, only certain banking operations are permitted. These include opening a new
account, updating an account, depositing money, withdrawing money and applying for a loan.
Definitions
The abbreviations used in this project are as below.
BMS – Banking Management System
UI – User Interface
DBMS – Database Management System
Process Model
The first step in the project development is to decide the appropriate software development
model. In this project we have selected the Waterfall model because it is simple and easy to
implement. Also the project requirements are well known. This model provides us with a
structured approach for software development.
Software Requirement Specification
Overall Description
The manual banking system has a lot of paperwork and the data is stored on paper only. It is a
very sluggish and time-consuming process that is inconvenient to the customers. The existing
system cannot cope with the increasing number of customers. Hence an automated banking
system is proposed. The proposed system will decrease the amount of paperwork and automate
certain tasks.
This system will provide user-ID validation, which prevents unauthorized access.
Product functions
The “BMS” software is an independent web-based application. There are various user interfaces
related to this software. These interfaces help the user to interact with the software and provide
the necessary information for the online banking system. In order to achieve the automatic
banking system it is necessary to divide the entire functionality of the system into different
modules. The modular approach is used as a design strategy.
The software is divided into the following modules.
1. Customer Management
2. Loan System
3. Transaction System
Module 1. Customer Management
In this module, the user can open a new account or can update an existing account by providing
the details such as name, father's name, address, phone number and email ID.
Module 2. Loan System
In this module the customer can apply for a loan by providing the required documents and
details, such as the loan period and the loan amount, and can get a detailed description of the
EMIs.
Module 3. Transaction System
This module allows the customers to deposit and withdraw money from their account using
private information such as a signature and an OTP.
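The three modules can be sketched in Python, the language proposed for this project. This is a
minimal sketch under stated assumptions, not the actual BMS design: the class and method names
are hypothetical, only some of the customer details are stored, signature and OTP checks are
left out, and the EMI is computed with the commonly used reducing-balance formula
EMI = P × r × (1 + r)^n / ((1 + r)^n − 1), where P is the loan amount, r is the monthly interest
rate and n is the number of months. The interest rate used is a made-up value.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    name: str
    phone: str
    email: str
    balance: int = 0  # whole rupees, an assumption that avoids rounding issues


class CustomerManagement:
    """Module 1: open and update accounts."""

    EDITABLE_FIELDS = ("name", "phone", "email")

    def __init__(self):
        self._accounts = {}
        self._next_id = 1

    def open_account(self, name, phone, email):
        account_id = self._next_id
        self._next_id += 1
        self._accounts[account_id] = Customer(name, phone, email)
        return account_id

    def update_account(self, account_id, **changes):
        customer = self._accounts[account_id]
        for field_name, value in changes.items():
            if field_name not in self.EDITABLE_FIELDS:
                raise ValueError(f"cannot update {field_name}")
            setattr(customer, field_name, value)

    def get(self, account_id):
        return self._accounts[account_id]


class LoanSystem:
    """Module 2: work out the EMI for a loan request."""

    def monthly_emi(self, principal, annual_rate_percent, months):
        monthly_rate = annual_rate_percent / 12 / 100
        if monthly_rate == 0:
            return principal / months
        growth = (1 + monthly_rate) ** months
        return principal * monthly_rate * growth / (growth - 1)


class TransactionSystem:
    """Module 3: deposit and withdraw money."""

    def __init__(self, customers):
        self._customers = customers

    def deposit(self, account_id, amount):
        if amount <= 0:
            raise ValueError("deposit must be positive")
        customer = self._customers.get(account_id)
        customer.balance += amount
        return customer.balance

    def withdraw(self, account_id, amount):
        customer = self._customers.get(account_id)
        if amount <= 0 or amount > customer.balance:
            raise ValueError("invalid withdrawal amount")
        customer.balance -= amount
        return customer.balance


customers = CustomerManagement()
transactions = TransactionSystem(customers)
loans = LoanSystem()

account = customers.open_account("Asha", "9800000000", "asha@example.com")
transactions.deposit(account, 5000)
transactions.withdraw(account, 1200)
print(customers.get(account).balance)               # 3800
print(round(loans.monthly_emi(100000, 12, 12), 2))  # 8884.88
```

Because each module is a separate class, one module can be changed or tested without touching
the others, which is the main benefit of the modular approach.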
Programming Platform for Software
This Banking Management System (BMS) can be implemented by using the Python programming
language.
User characteristics
The user interface of the software will be in English, and hence the user should be comfortable
with English and with the basic use of a computer and the Internet.
General Constraints
The database of the system should be accessible only by the authorised person. Any other person
should not be able to access it.
Data Flow Diagram (DFD)
The data flow diagram indicates the flow of the data of software systems to the different nodes
as shown below.
DFD for system and system users
The DFD in Figure 2.16 shows that the users of the system can be employees of the bank or
customers who are registered account holders.
may be attempted
Requirements not properly documented and understood | CU | 50% | 1 | Regular interaction with the customer and getting the requirements verified before finalising them
Delivery deadline will be tightened | BU | 40% | 2 | Review the progress from time to time and take appropriate steps to keep up with the schedule
Lack of skill | ST | 40% | 2 | External resources might help
Estimation
Efforts Calculation
MILESTONE: Cost Estimated
4. Design
Development
Formulate System architecture
Generate Code
MILESTONE: System Design developed
5. Testing
Develop test cases
Calculate cyclomatic complexity
Develop flow graph
MILESTONE: Testing Complete
Module Overview
Many new technologies are introduced almost every day. Some of these prosper and persist over
time, gaining attention from users. Emerging trends in information technology are the primary
catalyst for the change needed to survive in a competitive market. Technological innovations
bring progress to the corporate industry. Competition requires organizations to keep up with
technologies and pursue digital transformation, so it becomes important to incorporate new
software or hardware technology quickly. IT professionals are constantly learning, unlearning,
and relearning. Adopting new trends is necessary to deliver quality services, reduce spending,
and improve the user experience. It is a long-term strategy that demands time, effort, and
expertise. It is therefore useful to learn about emerging trends in information technology in
order to understand business needs.
The increasing use of digital technologies has made a significant impact on our lives, making
things more convenient, faster, and easier to handle. Applications of digital technologies have
redefined all spheres of human activity for the common man. These technologies can also be
misused. Best practices can ensure a productive and safe digital environment.
Social impact is the positive change that individuals, organizations, and movements create for
society and the environment. Social impact reporting is the process of measuring and
communicating the social and environmental effects of an organization's activities. As the world
faces complex and urgent challenges, social impact becomes more important than ever. It is a
key tool for accountability, transparency, and learning in the social sector.
In this unit, you will learn about some of the emerging trends and new technologies, and about
the societal impact of technology, so that you can keep up with these trends and follow best
practices.
Learning Outcomes
After completing this module, you will be able to:
• Describe Emerging Trends and New Technologies
• Describe the Societal Impact of Technology
Module Structure
Session 1: Emerging Trends and Technologies
Session 2: Societal Impact
5. Humans cannot work in hazardous environments, whereas AI-based machines can.
Disadvantages of AI
There are certain disadvantages of AI.
1. Machine cost increases with the use of AI technology.
2. AI is still under research. It is not yet possible to replace humans with AI-based machines.
3. AI is not creative like humans. Its ability to create anything original is limited.
1.3 Fields of AI
AI is a vast topic comprising many fields such as knowledge representation, planning, learning,
natural language processing, reasoning and perception. Machine learning, neural networks, deep
learning, cognitive computing, computer vision and natural language processing are the areas
that widely use artificial intelligence. Some of these emerging fields of AI are discussed in this
section.
1.3.1 Machine Learning
Machine Learning is a subset of Artificial Intelligence, wherein computers have the ability to
learn from data using statistical techniques, without being explicitly programmed by a human
being. It comprises algorithms that use data to learn on their own and make predictions. These
algorithms, called models, are first trained and tested using training data and testing data,
respectively. After successive rounds of training, once these models are able to give results to
an acceptable level of accuracy, they are used to make predictions about new and unknown data.
Machine learning is an application area of AI. In machine learning, algorithms are developed
such that the computing machine can learn from experience or data. Machine learning is widely
used in image processing, medical diagnosis, prediction and classification.
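The train, test and predict cycle can be shown with a very small Python sketch. It is an
illustration only: the data values are made up, the function names are hypothetical, and the
"model" is a simple straight line fitted by the least squares method rather than any particular
machine learning library.

```python
def fit_line(xs, ys):
    """Training: learn the slope and intercept of a line from the training data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Prediction: apply the learned line to a new input."""
    slope, intercept = model
    return slope * x + intercept


def mean_absolute_error(model, xs, ys):
    """Testing: average distance between predictions and known answers."""
    return sum(abs(predict(model, x) - y) for x, y in zip(xs, ys)) / len(xs)


# Made-up data: the training set teaches the model, the testing set checks it.
train_x, train_y = [1, 2, 3, 4], [3, 5, 7, 9]
test_x, test_y = [5, 6], [11, 13]

model = fit_line(train_x, train_y)
error = mean_absolute_error(model, test_x, test_y)
print("slope and intercept:", model)
print("error on testing data:", error)
print("prediction for a new input 10:", predict(model, 10))
```

The model is never given the rule behind the data; it works out the slope and intercept from the
training examples and is then checked on examples it has not seen.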
A neural network is a network of artificial neurons that recognizes relationships within a set of
data in a way similar to the human brain. Such a network can be trained, and it generates the
best possible result without any redesign. Neural network algorithms can be used in rules
forecasting, risk management and data validation.
Deep learning is a part of machine learning. Like human beings learn naturally, deep learning
algorithms teaches computers. This technology is used for “driver-less cars”.
1.3.2 Computer Vision
Computer vision is a technology in which computers are enabled to see, identify and process
images in the same way as humans. It is the subset of AI which makes use of statistical models
to aid computer systems in understanding and interpreting visual information in the
environment. This technology is used in self-driving cars, drones, medical diagnosis and
monitoring the health of crops.
1.3.3 Expert Systems
Expert Systems are perhaps the most rigid subset of AI due to their use of rules. This area
involves the use of explicitly stated rules and knowledge bases in an attempt to imitate the
decision-making of an expert in a certain field. In other words, it is the use of explicitly stated
rules and inference techniques to make informed decisions in specific fields, such as medicine.
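A rule-based system of this kind can be sketched as a small knowledge base plus an inference loop. The rules and fact names below are hypothetical and are not real medical advice. The loop uses forward chaining, which is one common inference technique among several.

```python
# Knowledge base: each rule is (set of required facts, conclusion).
# The rules are hypothetical and NOT real medical advice.
RULES = [
    ({"fever", "cough"}, "possible_flu"),
    ({"possible_flu", "body_ache"}, "advise_rest_and_fluids"),
    ({"sneezing", "itchy_eyes"}, "possible_allergy"),
]


def infer(facts):
    """Forward chaining: keep applying rules until no new fact is derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in RULES:
            # A rule fires when all of its conditions are already known.
            if conditions <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known - set(facts)  # return only the newly derived conclusions


print(sorted(infer({"fever", "cough", "body_ache"})))
```

Because every conclusion comes from an explicitly stated rule, the system can only decide what its rules cover, which is why expert systems are described as rigid.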
1.3.4 Robotics
Robotics is essentially the integration of all the above-mentioned concepts. It is the sub-field
responsible for making AI systems perceive, process, and act in the physical world. Robotics
involves using algorithms which can recognize objects in their immediate environment and
interpret how interactions with these objects can alter their current state and that of the
environment plus the people in it. Robots are used in fields such as medicine, manufacturing,
e-commerce (warehouses), and many more.
1.3.5 Natural Language Processing (NLP)
Natural Language Processing is the subset of AI which is responsible for enabling AI systems to
interact using natural human languages such as English and Hindi. The predictive typing feature
of a search engine, which helps us by suggesting the next word in the sentence while typing keywords,
and the spell-checking feature are examples of Natural Language Processing (NLP). NLP involves
using statistical models to understand, interpret, and generate human language in a way that is
meaningful to human beings. Human language includes all languages spoken by humans. It is
the technology behind chatbots and assistants like ChatGPT, Siri, Alexa, and others.
It is now possible to search the web or to operate and control devices using voice. An NLP system
can perform text-to-speech and speech-to-text conversion as depicted in Figure 1.2. Machine
translation is a rapidly emerging field where machines are able to translate texts from one
language to another with a fair amount of correctness. Another emerging application area is
automated customer service, where computer software can interact with customers to handle
their queries or complaints.
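The predictive typing idea mentioned above can be illustrated with a very small statistical model that counts which word most often follows another. This is a sketch only: the corpus is made up, and real search engines use far larger data and more sophisticated models.

```python
from collections import Counter, defaultdict

# A tiny, made-up corpus; a real system would learn from far more text.
corpus = [
    "how to learn python",
    "how to learn english",
    "how to learn python quickly",
    "how to cook rice",
]

# Count how often each word follows another (a bigram model).
next_word_counts = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for current, following in zip(words, words[1:]):
        next_word_counts[current][following] += 1


def suggest(word, k=2):
    """Return up to k words most often seen after `word` in the corpus."""
    return [w for w, _ in next_word_counts[word.lower()].most_common(k)]


print(suggest("learn"))
print(suggest("to"))
```

A word that never appeared in the corpus gets no suggestions, which shows why such models depend heavily on the amount of text they have seen.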
5. It can emulate a Linux system; simulate an entire chat room; play games like tic-tac-toe;
and simulate an ATM
ChatGPT's training data includes man pages and information about internet phenomena and
programming languages, such as bulletin board systems and the Python programming language.
By using ChatGPT we can perform various activities such as topic searching, poem or essay
creation and getting help with programming. The steps to use ChatGPT are:
Step 1. Search for ChatGPT on Google and click on the OpenAI link.
Step 2. Login with your email ID
Step 3. Click on + to start a new chat.
Step 4. Write your question.
The following are some of the activities that can be performed using ChatGPT.
1. Topic Searching on ChatGPT
2. Poem Creation
Limitations of ChatGPT
1. May occasionally generate incorrect information
2. May occasionally produce harmful instructions or biased content
3. Limited knowledge of world and events after 2021
4. Sometimes cannot draw or show a photo as it is a language model
ChatGPT is not guaranteed to give a correct answer every time, so its answers need to be rechecked.
Do not blindly follow answers given by ChatGPT.
Demonstrate the activity based on use of Google Assistant, Siri and Alexa.
1.4 Immersive Experiences
With three-dimensional (3D) videography, the joy of watching movies in theatres has reached
a new level. Video games are also being developed to provide immersive experiences to the
player. Immersive experiences allow us to visualise, feel and react by stimulating our senses. They
enhance our interaction and involvement, making the experience more realistic and engaging. Immersive
experiences have been used in the field of training, such as driving simulators as shown in Figure
1.3, flight simulators and so on. Immersive experiences can be achieved using virtual reality and
augmented reality.
objects and other actions of the user. At present, it is achieved with the help of VR Headsets. In
order to make the experience of VR more realistic, it promotes other sensory information like
sound, smell, motion, and temperature. It is a comparatively new field and has found its
applications in gaming (Figure 1.4), military training, medical procedures, entertainment, social
science and psychology, engineering and other areas where simulation is needed for a better
understanding and learning.
The ‘Internet of Things’ (IoT) is a network of devices that have embedded hardware and software to
communicate (connect and exchange data) with other devices on the same network as shown in
Figure 1.6. In IoT, our mobile phones, computers, cars, TVs, fridges, radios, watches and tablets
can all be connected together. All these devices can talk to each other, i.e. they can exchange
data.
The IoT can enable better safety, efficiency and decision making for businesses as data is
collected and analyzed. It can enable predictive maintenance, speed up medical care, improve
customer service, and offer other benefits.
This technology is still at an early stage. Forecasts suggest that by 2030 around 50 billion
IoT devices will be in use around the world. There will be a massive web of interconnected
devices, everything from smartphones to kitchen appliances. To work in this field, you will need to
learn about information security, AI and machine learning fundamentals, networking, hardware
interfacing, data analytics, automation and embedded systems, and you must have device and
design knowledge.
idea of a smart city as shown in Figure 1.7. It makes use of IoT and WoT to manage and distribute
resources efficiently. The smart building shown here uses sensors to detect an earthquake and then
warns nearby buildings so that they can prepare themselves accordingly. The smart bridge uses
wireless sensors to detect any loose bolt, cable or crack. It alerts concerned authorities through
SMS. The smart tunnel also uses wireless sensors to detect any leakage or congestion in the
tunnel. This information can be sent as wireless signals across the network of sensor nodes to a
centralized computer for further analysis.
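The flow described above, in which sensor nodes send readings to a centralized computer that decides when to raise an alert, can be sketched as follows. The node names, units and threshold values are hypothetical. A real deployment would receive readings over a network and might send the alert as an SMS.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    node_id: str   # which sensor node sent the reading
    kind: str      # what was measured, e.g. "vibration"
    value: float   # measured value in arbitrary illustrative units


# Hypothetical alert thresholds; real limits would come from engineers.
THRESHOLDS = {"vibration": 5.0, "water_level": 0.3}


def analyse(readings):
    """Central computer: return an alert message for each reading over its limit."""
    alerts = []
    for r in readings:
        limit = THRESHOLDS.get(r.kind)
        if limit is not None and r.value > limit:
            alerts.append(f"ALERT {r.node_id}: {r.kind}={r.value} exceeds {limit}")
    return alerts


readings = [
    Reading("bridge-01", "vibration", 2.1),
    Reading("bridge-02", "vibration", 7.4),
    Reading("tunnel-01", "water_level", 0.5),
]
for message in analyse(readings):
    print(message)  # a real system might send this as an SMS instead
```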
6. Connected Cars: All cars in the city will be connected to each other to avoid accidents.
7. Digital Health Service: All health services will be in electronic form.
8. Smart Retail: Retail will be automated as per requirement of the user.
Practical Activity
1. Explore and list a few IoT devices available in the market.
2. Demonstrate controlling an AC through a mobile phone.
3. Demonstrate a lighting system using IoT.
(c) Variety – It asserts that a data set has varied data, such as structured, semi-structured and
unstructured data. Some examples are text, images, videos, web pages and so on.
(d) Veracity – Big data can sometimes be inconsistent, biased or noisy, or there can be
abnormalities in the data or issues with the data collection methods. Veracity refers to the
trustworthiness of the data, because processing incorrect data can give wrong results or
lead to misleading interpretations.
(e) Value – Big data is not just a big pile of data; it also holds hidden patterns
and useful knowledge which can be of high business value. But as processing big data requires an
investment of resources, we should make a preliminary enquiry to see the potential of
the big data in terms of value discovery, or else our efforts could be in vain.
1.6.2 Data Analytics
Data analytics is the process of examining data sets to draw conclusions about the information
they contain, with the aid of specialized systems and software. Data analytics technologies and
techniques are becoming popular day-by-day. They are used in commercial industries to enable
organizations to make more informed business decisions. In the field of science and technology,
it can be useful for researchers to verify or disprove scientific models, theories and hypotheses.
Pandas is a library of the programming language Python that can be used as a tool to make data
analysis much simpler.
There are various data analysis tools such as Microsoft Excel, Python, R, Jupyter Notebook,
Apache Spark, SAS, Microsoft Power BI, Tableau and KNIME.
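As a small illustration of examining a data set to draw a conclusion, the sketch below summarises made-up sales figures by region using only Python's standard library. The column names and numbers are assumptions for this example. The same grouping could be written more compactly with the Pandas library mentioned above.

```python
import csv
import io
import statistics

# Made-up monthly sales figures, embedded as CSV text so the example is self-contained.
CSV_TEXT = """month,region,sales
Jan,North,120
Jan,South,95
Feb,North,135
Feb,South,110
Mar,North,150
Mar,South,100
"""

rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))

# Group the sales values by region.
sales_by_region = {}
for row in rows:
    sales_by_region.setdefault(row["region"], []).append(int(row["sales"]))

# Summarise each region to support a simple conclusion about the data.
summary = {
    region: {"total": sum(values), "mean": statistics.mean(values)}
    for region, values in sales_by_region.items()
}
best_region = max(summary, key=lambda region: summary[region]["total"])

print(summary)
print("Region with the highest total sales:", best_region)
```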
1.7 Cloud Computing
Cloud computing refers to the availability of large computing and storage facilities delivered over
the Internet, or the cloud. It does not require active management by the user. Data centers
accessible over the Internet provide these facilities. Google’s Gmail service, Facebook and
WhatsApp are examples of services that run on the cloud.
The services comprise software, hardware (servers), databases, and storage. These resources are
provided by companies called cloud service providers, which usually charge on a pay-per-use basis. We
already use cloud services when we store our pictures and files as a backup on the Internet, or host a
website on the Internet. Through cloud computing, a user can run a big application or process
a large amount of data without having the required storage or processing power on their personal
computer, as long as they are connected to the Internet. Besides numerous other features, cloud
computing offers cost-effective, on-demand resources. A user can avail need-based resources
from the cloud at a very reasonable cost.
1.7.1 Cloud Services
There are three standard models to categorise different computing services delivered through
cloud as shown in Figure 1.9. These are Infrastructure as a Service (IaaS), Platform as a Service
(PaaS), and Software as a Service (SaaS).
(a) Infrastructure as a Service (IaaS) – IaaS providers offer different kinds of computing
infrastructure, such as servers, virtual machines (VM), storage and backup facilities, network
components, operating systems or any other hardware or software. Using IaaS from the cloud, a
user can use hardware infrastructure located at a remote location to configure, deploy and
execute any software application on that cloud infrastructure. Users can obtain the hardware
and software on demand and pay as per usage. Thus they can save the cost of software,
hardware and other infrastructure as well as the cost of setting up, maintenance and security.
Some popular IaaS platforms are Amazon EC2, Microsoft Azure, Google Cloud Platform, GoGrid,
and DigitalOcean.
(b) Platform as a Service (PaaS) – PaaS provides a platform or environment to develop, test,
and deliver software applications. Suppose we have developed a web application using MySQL
and Python. To run this application online, we can avail a pre-configured Apache server from
the cloud with MySQL and Python pre-installed. In PaaS, the user has complete control over the
deployed application and its configuration. It provides a deployment environment for developers
at a much reduced cost, without buying and managing the underlying hardware and software.
Google App Engine, Microsoft Azure, OpenShift and Oracle Cloud are examples of PaaS.
(c) Software as a Service (SaaS) – SaaS provides on-demand access to application
software, usually requiring a license or subscription by the user. We use SaaS from the cloud
when we edit a document online using Google Docs, Microsoft Office 365 or Dropbox. As in PaaS, a
user is provided access to the required configuration settings of the application software that
they are currently using. In all of the above standard service models, a user can use on-demand
infrastructure, platform or software on a chargeable basis. Microsoft Office 365, Google G Suite,
Zoho and Salesforce are common examples of SaaS. The Government of India has embarked upon an
ambitious initiative, ‘GI Cloud’, which has been named ‘MeghRaj’ (https://cloud.gov.in).
Advantages of cloud computing
Cloud computing is an essential part of modern computing. There are several advantages of
cloud computing as mentioned below.
Cost saving – The capital cost of hardware and software can be saved, along with the maintenance cost.
Time saving – Cloud computing saves the time otherwise spent on installations.
High speed – Cloud computing services are deployed quickly and make all the
resources available immediately.
Backup and restore data – The time-consuming process of backing up and restoring data can be
done easily with cloud services.
Software and hardware integration – In cloud services, software and hardware are integrated
with each other.
Reliability – Cloud services are highly reliable.
Mobility – Employees can access cloud services whether they work on the premises or at remote
locations.
Unlimited storage – Almost unlimited storage is offered under cloud services for a very
nominal fee.
Disadvantages
Technical issues, downtime, security threats, dependence on Internet connectivity and limited
bandwidth are certain disadvantages of cloud services.
Practical Activity
Demonstrate the use of Gmail, WhatsApp and Facebook for sharing photographs and videos
of an event.
1.8 Blockchain
Traditionally, we perform digital transactions by storing data in a centralized database, and the
transactions performed are updated one by one on the database. However, since all the data is
stored at a central location, there are chances of data being hacked or lost. Blockchain
technology works on the concept of a decentralized and shared database where each computer has
a copy of the database. A block can be thought of as a secured chunk of data or a valid transaction.
Each block has some data called its header, which is visible to every other node, while only the
owner has access to the private data of the block. Such blocks form a chain called blockchain as
shown in Figure 1.10. We can define blockchain as a system that allows a group of connected
computers to maintain a single updated and secure ledger. Each computer or node that
participates in the blockchain receives a full copy of the database. It maintains an ‘append only’
open ledger which is updated only after all the nodes within the network authenticate the
transaction. Safety and security of the transactions are ensured because all the members in the
network keep a copy of the blockchain and so it is not possible for a single member of the network
to make changes or alter data.
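The idea of an append-only chain in which altering an old block is detectable can be sketched as below. This is a single-machine illustration only: the block fields and function names are assumptions made for the example, and it leaves out the network of nodes and the step in which they authenticate each transaction.

```python
import hashlib
import json


def block_hash(index, data, previous_hash):
    """SHA-256 digest of a block's contents, including the previous block's hash."""
    payload = json.dumps(
        {"index": index, "data": data, "previous_hash": previous_hash},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_block(chain, data):
    """Append-only update: a new block always points to the current last block."""
    previous_hash = chain[-1]["hash"] if chain else "0" * 64
    index = len(chain)
    chain.append({
        "index": index,
        "data": data,
        "previous_hash": previous_hash,
        "hash": block_hash(index, data, previous_hash),
    })


def is_valid(chain):
    """Check every stored hash and every link to the previous block."""
    for i, block in enumerate(chain):
        expected_previous = chain[i - 1]["hash"] if i > 0 else "0" * 64
        if block["previous_hash"] != expected_previous:
            return False
        if block["hash"] != block_hash(block["index"], block["data"], block["previous_hash"]):
            return False
    return True


ledger = []
append_block(ledger, "A pays B 10")
append_block(ledger, "B pays C 4")
print(is_valid(ledger))              # the untouched ledger passes the check

ledger[0]["data"] = "A pays B 1000"  # tamper with an old transaction
print(is_valid(ledger))              # the stored hash no longer matches
```

Since every node holds its own copy of the chain, a member who alters a block in this way would end up with a copy that fails the check and disagrees with everyone else's.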
As several industries are adopting and implementing blockchain, there is a demand for skilled
professionals. The field requires hands-on experience in programming languages, OOP fundamentals,
flat and relational databases, data structures, web app development, and networking.
3. Remotely triggering an experiment in an actual lab and providing the student the result of the
experiment through the computer interface. This would entail carrying out the actual lab
experiment remotely.
Need of Virtual Lab
Physical distances and the lack of resources make us unable to perform experiments, especially
when they involve sophisticated instruments. Also, good teachers are always a scarce resource.
Web-based and video-based courses address the issue of teaching to some extent. Conducting
joint experiments by two participating institutions and also sharing costly resources has always
been a challenge.
With present-day Internet and computer technologies, the above limitations need no longer
hamper students and researchers in enhancing their skills and knowledge. Also, in our country,
costly instruments and equipment need to be shared with fellow researchers to the extent
possible. Web-enabled experiments can be designed for remote operation and viewing so as to
arouse curiosity and encourage innovation among students. This would help in learning basic and
advanced concepts through remote experimentation.
Internet-based experimentation further permits the use of resources (knowledge, software, and data)
available on the web, and allows experiments to be performed simultaneously
at points separated in space and, possibly, time.
Virtual Labs (VL) provide remote access to labs in various disciplines of Science and Engineering. These
Virtual Labs cater to students at the undergraduate level and postgraduate level as well as
to research scholars. VL encourage students to conduct experiments by arousing their
curiosity. This helps them in learning basic and advanced concepts through remote
experimentation.
VL provide a complete Learning Management System around the Virtual Labs where the students
can avail various tools for learning, including additional web resources, video lectures,
animated demonstrations and self-evaluation. VL share costly equipment and resources, which
are otherwise available to a limited number of users due to constraints of time and geographical
distance.
Virtual Labs will be made more effective and realistic by providing additional inputs to the
students like accompanying audio and video streaming of an actual lab experiment and
equipment. For the ‘touch and feel’ part, the students can possibly visit an actual laboratory for
a short duration. Virtual labs can be designed for subjects of various disciplines such as:
Computer Science and Engineering – ANN, Problem solving, Pattern Recognition.
Electronics and Communication – Digital Design, Network Technology, VLSI Design.
Electrical Engineering – Electrical machines, Industrial Lab.
Mechanical Engineering – Mechanics, Vibrations, Material Response.
Physical Sciences – Laser Optics, Modern Physics, Optics.
Chemical Sciences – Molecular Spectroscopy, Organic and Inorganic chemistry lab.
Practical Activity. 8051 Assembly Language Programme for addition of two numbers using
Virtual Lab
Perform 8051 ALP using VL
ALP1: Addition of two numbers
ALP:
ORG 0000H       ; Set the program origin to address 0000H
MOV A,#50H      ; Load the number 50H into A
ADD A,#51H      ; Add 51H to the contents of A
MOV 52H,A       ; Save the least significant byte of the result in location 52H
MOV A,#00H      ; Load 00H into A
ADDC A,#00H     ; Add the immediate data and the contents of the carry flag to A
MOV 53H,A       ; Save the most significant byte of the result in location 53H
END
After execution we get [52H]=A1H and [53H]=00H
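The result can be cross-checked with a short Python sketch that mimics the 8-bit addition and the carry flag. The helper add_8bit and the memory dictionary are illustrative stand-ins invented for this check, not part of any 8051 tool or of the Virtual Lab.

```python
def add_8bit(a, b, carry_in=0):
    """Add two bytes like the 8051 ADD/ADDC instructions; return (result, carry)."""
    total = a + b + carry_in
    return total & 0xFF, total >> 8  # low 8 bits, and the carry-out bit


memory = {}

# MOV A,#50H / ADD A,#51H / MOV 52H,A
accumulator, carry = add_8bit(0x50, 0x51)
memory[0x52] = accumulator

# MOV A,#00 / ADDC A,#00 / MOV 53H,A  (the carry from the first addition is added in)
accumulator, carry = add_8bit(0x00, 0x00, carry)
memory[0x53] = accumulator

print(f"[52H]={memory[0x52]:02X}H  [53H]={memory[0x53]:02X}H")
```

Here 50H + 51H = A1H fits in one byte, so no carry is produced and location 53H holds 00H. With larger operands such as F0H and 20H, the low byte would be 10H and the carry would make the high byte 01H.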
SUMMARY
• Artificial Intelligence (AI) endeavours to simulate the natural intelligence of human beings
in machines, thus making them intelligent.
• Machine learning comprises algorithms that use data to learn on their own and make
predictions.
• Natural Language Processing (NLP) facilitates communicating with intelligent systems
using a natural language.
• Virtual Reality (VR) allows a user to look at, explore and interact with the virtual
surroundings, just like one can do in the real world.
• The superimposition of computer-generated perceptual information over the existing
physical surroundings is called Augmented Reality.
• Big data holds rich information and knowledge which can be of high business value. Five
characteristics of big data are: Volume, Velocity, Variety, Veracity and Value.
• Data analytics is the process of examining data sets in order to draw conclusions about
the information they contain.
• The Internet of Things (IoT) is a network of devices that have embedded hardware and
software to communicate (connect and exchange data) with other devices on the same
network.
• A sensor is a device that takes input from the physical environment and uses built-in
computing resources to perform predefined functions upon detection of specific input and
then processes data before passing it on.
• Cloud computing allows resources located at remote locations to be made available to
anyone anywhere. Cloud services can be Infrastructure as a Service (IaaS), Platform as a
Service (PaaS) and Software as a Service (SaaS).
• Blockchain is a system that allows a group of connected computers to maintain a single
updated and secure ledger which is updated only after all the nodes in the network
authenticate the transaction.
• A virtual lab allows physical lab experiments to be performed virtually with the help of
simulations.
In recent years, the increasing use of ‘Digital Technologies’ has made a significant impact on
our lives, making things more convenient, faster, and easier to handle. In the past, a
letter sent to someone took days to reach them. Today, one can send and receive emails to and from
more than one person at a time. The instantaneous nature of electronic communication has made us more
efficient and productive. Applications of digital technologies have redefined all
spheres of human activity for the common man. While we reap the benefits of digital
technologies, these technologies can also be misused. Let’s look at the impact of these
technologies on our society and the best practices that can ensure a productive and safe digital
environment for us.
2.1 DIGITAL FOOTPRINTS
While surfing the Internet, we leave behind a trail of data that reflects the online activities we
have performed. This is the digital footprint. A digital footprint can be created and used with or without
our knowledge. It includes websites visited, emails, and any information submitted online, along
with the computer’s IP address, location, and other device-specific details. Such data can be
misused or exploited. This awareness makes us cautious about what we write, upload, download
or even browse online.
There are two kinds of digital footprints – active and passive. Active digital footprints include
data intentionally submitted online. This may include emails, or responses or posts on different
websites or mobile apps. The digital data trail left online unintentionally is called the passive digital
footprint. This includes the data generated when we visit a website, use a mobile app or browse the
Internet. (Figure 2.1)
Everyone who is connected to the Internet may have a digital footprint. With more usage, the
trail grows. On examining the browser settings, we can see how the browser stores browsing history,
cookies, passwords, autofill entries, and many other types of data. Besides the browser, most of our digital
footprints are stored on the servers where the applications are hosted. We may not have access to
remove or erase that data. So, once a data trail is generated, even if we later try to erase
data about our online activities, the digital footprints still remain. There is no guarantee that
digital footprints will be fully eliminated from the Internet. Therefore, be cautious while
being online. All our online activities leave a data trace on the Internet as well as on the
computing device that we use. This can be used to trace the user, his/her location, device and
other usage details.
In this era of digital society, our daily activities like communication, social networking, banking,
shopping, entertainment, education and transportation are increasingly being driven by online
transactions. Anyone who uses digital technology along with the Internet is a digital citizen or a
netizen. A responsible netizen must abide by net etiquette, communication etiquette and social
media etiquette.
2.2.1 Net Etiquette
It is necessary to exhibit proper manners and etiquette while being online as shown in Figure
2.2. One should be ethical, respectful and responsible while surfing the Internet.
(A) Be Precise
Respect time – Do not waste precious time in responding to unnecessary emails or comments
unless they have some relevance. Also, do not expect an instant response as the recipient may
have other priorities.
Respect data limits – To save data and bandwidth, avoid sending very large attachments.
Instead, send compressed files or a link to the files through cloud shared storage like
Google Drive, Microsoft OneDrive or Dropbox.
(B) Be Polite
Whether the communication is synchronous (happening in real time like chat, audio/video calls)
or asynchronous (like email, forum post or comments), be polite and non-aggressive. Avoid being
abusive even if you don’t agree with others’ point of view.
(C) Be Credible
Always be cautious while making a comment, replying or writing an email or forum post as such
acts decide your credibility over a period of time. On various discussion forums, we usually try
to go through the previous comments of a person and judge their credibility before relying on
that person’s comments.
2.2.3 Social Media Etiquette
In the current digital era, we are familiar with different kinds of social media and we may have an
account on Facebook, Google+, Twitter, Instagram, Pinterest, or a YouTube channel. These
platforms encourage users to share their thoughts and experiences through posts or pictures.
On social media too, certain etiquette needs to be followed, as given below.
Choose password wisely – It is essential to choose a strong password and to change it
frequently to safeguard social media accounts. Never share personal credentials like username
and password with others.
Know who you befriend – Social networks usually encourage connecting with users (making
friends). But be careful while befriending unknown people as their intentions could possibly be
malicious and unsafe.
Beware of fake information – A user should be aware that fake news, messages and posts are
common on social networks. The user should apply their knowledge and experience to validate
such news, messages or posts.
Think before uploading – It is possible to upload almost anything on a social network. However,
remember that once uploaded, it may remain on the remote server even after the files are deleted.
Hence, be cautious while uploading sensitive or confidential files.
2.3 DATA PROTECTION
Data or information protection is mainly about the privacy of data stored digitally. Sensitive
data such as biometric information, health information, financial information, or other personal
documents, images, audio or video should be accessible only to the authorised user. Privacy of
sensitive data can be implemented by encryption, authentication, and other secure methods.
Data protection policies (laws) provide guidelines to the user on the processing, storage and
transmission of sensitive information. The motive behind implementing these policies is to
ensure that sensitive information is appropriately protected from modification or disclosure.
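As one illustration of authentication, the sketch below stores a salted hash of a password instead of the password itself and later checks a login attempt against it. The function names, the iteration count and the sample password are assumptions made for this example. It uses the standard-library functions hashlib.pbkdf2_hmac and hmac.compare_digest, and it is a teaching sketch, not a complete security design.

```python
import hashlib
import hmac
import os


def hash_password(password, salt=None, iterations=100_000):
    """Derive a salted hash so the plain password never needs to be stored."""
    if salt is None:
        salt = os.urandom(16)  # a fresh random salt for each user
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, digest


def verify_password(password, salt, stored_digest, iterations=100_000):
    """Authenticate by recomputing the hash and comparing in constant time."""
    _, digest = hash_password(password, salt, iterations)
    return hmac.compare_digest(digest, stored_digest)


salt, stored = hash_password("S3cure-Example!")  # illustrative password only
print(verify_password("S3cure-Example!", salt, stored))
print(verify_password("wrong-guess", salt, stored))
```

Even if the stored salt and digest are disclosed, the original password is not directly revealed, which is one way of protecting sensitive information from disclosure.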
2.3.1 Intellectual Property Right (IPR)
Intellectual Property refers to the inventions, literary and artistic expressions, designs and
symbols, names and logos. The ownership of such concepts lies with the creator, or the holder
of the intellectual property. This enables the creator or copyright owner to earn recognition or
financial benefit by using their creation or invention. Intellectual Property is legally protected
through copyrights, patents, trademarks.
(A) Copyright
Copyright grants legal rights to creators for their original works like writing, photographs, audio
recordings, video, computer software, and other creative works like literary and artistic work.
Copyrights are automatically granted to creators and authors. Copyright law gives the copyright
holder a set of rights that they alone can avail legally. The rights include the right to copy (reproduce)
a work, the right to create derivative works based upon it, the right to distribute copies of the work to
the public, and the right to publicly display or perform the work. It prevents others from copying,
using or selling the work.
Executing IPR: say for a software
√ Code of the software will be protected by a copyright
√ Functional expression of the idea will be protected by a patent
√ The name and logo of the software will come under a registered trademark
(B) Patent
A patent is usually granted for inventions. Unlike copyright, the inventor needs to apply (file) for
patenting the invention. When a patent is granted, the owner gets an exclusive right to prevent
others from using, selling, or distributing the protected invention. Patent gives full control to the
patentee to decide whether or how the invention can be used by others. Thus it encourages
inventors to share their scientific or technological findings with others. A patent protects an
invention for 20 years, after which it can be freely used.
(C) Trademark
Trademark includes any visual symbol, word, name, design, slogan, or label that distinguishes
the brand or commercial enterprise, from other brands or commercial enterprises. It also
prevents others from using a confusingly similar mark, including words or phrases.
2.3.2 Violation of IPR
Violation of intellectual property right may happen in one of the following ways:
(A) Plagiarism
It is easy to copy or share text, pictures and videos from the Internet. If we copy some content from
the Internet but do not mention the source or the original creator, then it is considered an act of
plagiarism. It is a serious ethical offense and is sometimes considered an act of fraud. Even for
content that is open for public use, we should cite the author or source to avoid plagiarism.
(B) Copyright Infringement
As per the Copyright Act, 1957, the use of a copyrighted work without the permission of the
owner results in copyright infringement. Infringement occurs when a third person
unintentionally or intentionally uses/copies the work of another without giving credit. It is
usually classified into two categories, i.e. primary and secondary infringement. Primary
infringement occurs when there is an actual act of copying, while secondary infringement occurs
when unauthorised dealings take place, such as selling or importing pirated books.
(C) Trademark Infringement
Trademark infringement means unauthorised use of another’s trademark on products and services.
The owner of a trademark may commence legal proceedings against someone who infringes its
registered trademark.
2.3.4 Public Access and Open Source Software
Copyright sometimes puts restrictions on the usage of copyrighted works by anyone else. If
others are allowed to use and build upon existing work, it encourages collaboration and
can result in new innovations in the same direction. Licenses provide rules and guidelines for
others to use the existing work. When authors share their copyrighted works with others under a
public license, it allows others to use and even modify the content. Open source licenses help others to
contribute to an existing work or project without seeking special individual permission to do so.
The GNU General Public License (GPL) and the Creative Commons (CC) licenses are two popular categories
of public licenses. CC is used for all kinds of creative works like websites, music, film, and
literature. CC enables the free distribution of an otherwise copyrighted work. It is used when an
author wants to give people the right to share, use and build upon a work that they have created.
The GNU GPL is primarily designed for licensing software. It is a free
software license which gives end users the freedom to run, study, share and modify the
software, besides getting regular updates.
Users or companies who distribute GPL-licensed works may charge a fee for copies or give them
free of charge. This distinguishes the GPL from the licenses of freeware such as Skype and
Adobe Acrobat Reader, which allow copying for personal use but prohibit commercial distribution,
and from proprietary licenses, where copying is prohibited by copyright law.
Many of the proprietary software products that we use are sold commercially and their program code
(source code) is not shared or distributed. However, certain software is available freely
for anyone, and its source code is also open for anyone to access, modify, correct and improve.
Free and open source software (FOSS) has a large community of users and developers who are
contributing continuously towards adding new features or improving the existing features. For
example, Linux kernel-based operating systems like Ubuntu and Fedora come under FOSS.
Some popular FOSS tools are office packages like LibreOffice and browsers like Mozilla Firefox.
Software piracy is the unauthorised use or distribution of software. Those who purchase a license
for a copy of the software do not have the rights to make additional copies without the permission
of the copyright owner. It amounts to copyright infringement regardless of whether it is done for
sale, for free distribution or for the copier’s own use. One should avoid software piracy. Using
pirated software can degrade the performance of a computer system and also harms the
software industry, which in turn affects the economy of a country.
2.4 CYBER SECURITY
Cyber security is the application of technologies, processes, and controls to protect systems,
networks, programs, devices and data from cyber-attacks. Cyber-attacks can be:
1. Cyber fraud – Including phishing, spear phishing, vishing, smishing and whaling. These are techniques carried
out over email (phishing), the phone (vishing), text messages (smishing) or even social media, with the goal
of tricking you into providing information or clicking a link that installs malware on your device.
Spear phishing is targeted phishing. It is even more effective because, instead of choosing targets
at random, the attacker takes time to learn a bit about the target to make the wording more
specific and relevant.
Whaling is phishing aimed at senior executives such as company presidents. The attackers hope for a bigger
return on their phishing investment and take time to craft specific messages in this case as well.
2. Malware attacks – Including viruses, worms, Trojans, spyware and rootkits. A malware attack is
a common cyber-attack in which malware, i.e. malicious software, executes unauthorized actions
on the victim’s system.
3. Ransomware attacks – Ransomware is a type of malware attack that encrypts a victim’s
data and prevents access until a ransom payment is made. Ransomware attackers often use
social engineering techniques, such as phishing, to gain access to a victim’s environment.
The most common types include:
Crypto Ransomware or Encryptors – Encryptors are one of the most well-known and damaging
variants. This type encrypts the files and data within a system, making the content inaccessible
without a decryption key.
Lockers – Lockers completely lock you out of your system, so your files and applications are
inaccessible. A lock screen displays the ransom demand, possibly with a countdown clock to
increase urgency and drive victims to act.
Scareware – Scareware is fake software that claims to have detected a virus or other issue on
your computer and directs you to pay to resolve the problem. Some types of scareware lock the
computer, while others simply flood the screen with pop-up alerts without actually damaging
files.
4. Drive-by downloads – A drive-by download is the unintentional download of malicious
code onto a computer or mobile device, which exposes users to different types of threats.
5. Hacking – Including distributed denial-of-service (DDoS) attacks and key logging. DDoS
is a category of malicious cyber-attack that hackers or cyber
criminals employ to make an online service, network resource or host machine
unavailable to its intended users on the Internet. Key logging is the covert recording of the
keys a user presses, typically to capture passwords.
6. Password decryption – Hackers recover passwords from stolen or intercepted data, for
example by breaking weak encryption or by repeatedly guessing until a match is found.
7. Out-of-date, unpatched software – Unpatched software refers to applications or systems that
contain known vulnerabilities that have not yet been addressed through the implementation of
updates or patches.
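The password decryption item above can be made concrete with a small sketch. Well-designed systems store a salted hash of each password rather than the password itself, so an attacker who steals the stored data usually has to guess candidate passwords and compare hashes. The example below is a minimal illustration using Python's standard `hashlib`, `hmac` and `os` modules. The function names, the guess list and the iteration count are assumptions chosen for this example, not recommendations taken from this text.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor for this sketch


def hash_password(password, salt=None):
    """Return (salt, digest) for a password using PBKDF2-HMAC-SHA256."""
    if salt is None:
        salt = os.urandom(16)  # a fresh random salt for each password
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Check a login attempt against the stored salt and digest."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)


def guess_password(salt, stored_digest, candidates):
    """Simulate an attacker trying a list of common passwords."""
    for candidate in candidates:
        if verify_password(candidate, salt, stored_digest):
            return candidate
    return None


if __name__ == "__main__":
    common = ["123456", "password", "qwerty"]  # hypothetical guess list
    weak_salt, weak_digest = hash_password("qwerty")
    strong_salt, strong_digest = hash_password("T7#kv!pQ2z-Lm9")
    # The weak password appears in the guess list, so it is recovered.
    print(guess_password(weak_salt, weak_digest, common))
    # The longer, unusual password is not in the guess list.
    print(guess_password(strong_salt, strong_digest, common))
```

The sketch suggests why long, unusual passwords matter: a password that appears in a list of common choices is recovered as soon as the attacker tries it, however carefully it was stored.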
Various tools such as Hunchly, Censys, VirtualBox, Kali Linux, 7-Zip, OSINT, Bitwarden,
Password and Socint can be used to deal with cyber security issues.
Practical Activity
Demonstrate how to install and use Bitwarden on a mobile phone to protect your passwords.
Demonstrate how to generate a Virtual ID (VID) for your Aadhaar card.
Practical Activity
Read your local newspaper for the last one month and note the various cybercrime news items
reported in it.
Demonstrate messages containing fraud links, fraudulent emails, requests to share OTPs and fraudulent KYC requests.
(Give some screenshots of fraud links, emails and messages along with an explanation.)
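To complement the activity above, the following sketch flags a few warning signs that often appear in fraudulent links. It is a simple heuristic written for illustration only. The function name `suspicious_signs`, the rules it applies and the sample URLs are assumptions, and a link that passes these checks is not necessarily safe.

```python
import ipaddress
from urllib.parse import urlparse


def suspicious_signs(url):
    """Return a list of warning signs found in a URL (heuristic only)."""
    signs = []
    parsed = urlparse(url)
    host = parsed.hostname or ""

    if parsed.scheme != "https":
        signs.append("does not use HTTPS")

    if "@" in parsed.netloc:
        signs.append("contains '@', which can hide the real destination")

    try:
        ipaddress.ip_address(host)
        signs.append("uses a raw IP address instead of a domain name")
    except ValueError:
        pass  # the host is a name, not an IP address

    if host.startswith("xn--") or ".xn--" in host:
        signs.append("uses punycode, which can imitate familiar names")

    if host.count(".") >= 4:
        signs.append("has an unusually long chain of subdomains")

    return signs


if __name__ == "__main__":
    # Hypothetical sample links for demonstration.
    samples = [
        "https://example.com/login",
        "http://192.168.1.10/kyc-update",
        "https://bank.example.com@evil.example.net/otp",
    ]
    for link in samples:
        print(link, "->", suspicious_signs(link) or "no warning signs found")
```

Checks like these can support, but not replace, careful reading of a message: the sender, the urgency of the request and any demand for an OTP or KYC details remain important clues.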
offenses. This section is related to Sec 43. For example, hacking: any person who dishonestly or
fraudulently does any act referred to in Sec 43 is punishable under this section. The terms
"dishonestly" and "fraudulently" are as defined in the IPC.
Section 66A dealt with sending offensive messages from any communication device. The Supreme Court
of India struck this section down in 2015 (Shreya Singhal v. Union of India).
Section 66B deals with dishonestly receiving or retaining any stolen computer resource or
communication device. It provides for imprisonment up to 3 years or a fine up to Rs. 1 lakh or
both. For example, purchasing a stolen computer or cell phone is covered under this section.
Section 66C deals with identity theft. It provides for imprisonment up to 3 years and a fine up to
Rs. 1 lakh. For example, cloning of ATM cards is covered under this section. Most financial
frauds are covered under this Act.
Section 66D deals with cheating by personation. It provides for imprisonment up to 3 years and
a fine up to Rs. 1 lakh.
Section 66E deals with violation of privacy, i.e. intentionally or knowingly capturing, publishing or
transmitting the image of a private area of any person without his or her consent, under
circumstances violating the privacy of that person. It provides for imprisonment up to 3
years or a fine up to Rs. 2 lakh or both.
Section 66F deals with cyber terrorism. It provides for imprisonment which may extend to life
imprisonment. Acts that cause death or injuries with intent to threaten the unity, integrity,
security or sovereignty of India, or to strike terror, are covered under this section.
SUMMARY
• Digital footprint is the trail of data we leave behind when we visit any website (or use any
online application or portal) to fill-in data or perform any transaction.
• A user of digital technology needs to follow certain etiquette like net-etiquette,
communication-etiquette and social media-etiquette.
• Net-etiquette includes avoiding copyright violations, respecting privacy and diversity of
users, and avoiding Cyber bullies and Cyber trolls, besides sharing of expertise.
• Communication-etiquette requires us to be precise and polite in our conversation so that
we remain credible through our remarks and comments.
• While using social media, one needs to take care of security through strong passwords, be aware
of fake information and be careful while befriending strangers. Care must be taken while
sharing anything on social media, particularly personal, sensitive information, as it may
create havoc if mishandled.
• Intellectual Property Rights (IPR) help in data protection through copyrights, patents and
trademarks. There are both ethical and legal aspects of violating IPR. A good digital citizen
should avoid plagiarism, copyright infringement and trademark infringement.
• Certain software is made available for free public access. Free and Open Source Software
(FOSS) allows users not only to access it but also to modify (or improve) it.
• Cyber crimes include various criminal activities carried out to steal data or to break down
important services. These include hacking, spreading viruses or malware, sending
phishing or fraudulent emails, ransomware, etc.
8. Additional crimes like pornography and Cyber terrorism are also described in the IT Act 2000.
9. Passwords are encrypted by hackers.
10. Phishing and hacking are the same forms of Cybercrime.
(D) Short answer questions
1. What is a digital footprint? How is it stored?
2. What is net etiquette?
3. What is communication etiquette?
4. What is social media etiquette?
5. Write the full forms of the acronyms (a) GNU (b) GPL (c) FOSS (d) CC
6. Explain the terms (a) vishing (b) phishing (c) smishing
7. List the different forms of Cybercrime.
8. What is Cyber security?
9. List the impacts of Cybercrime on society.
10. What are the advantages of FOSS?
(E) Match the following
Column A             Column B
Plagiarism           Fakers who offer special rewards or prize money and ask for personal
                     information, such as bank account details
Hacking              Copying and pasting information from the Internet into your report and
                     then organising it
Credit card fraud    The trail that is created when a person uses the Internet
Digital footprint    Breaking into computers to read private emails and other files
Answer