RProgramming1UnitQ&A

Uploaded by

pankaj

Confidential - Oracle Restricted

2-Mark Questions (Short Answer)

1. What is a vector in R? Provide an example.

o A vector is a basic data structure in R that holds elements of the same type
(numeric, character, logical). You can create a vector using the c() function.
For example:

x <- c(1, 2, 3, 4)

This creates a numeric vector containing the values 1, 2, 3, and 4.

2. How do you create a matrix in R? Give an example.

o A matrix is a two-dimensional array where all elements are of the same type.
It can be created using the matrix() function. For example:

m <- matrix(1:6, nrow=2, ncol=3)

This creates a matrix with 2 rows and 3 columns, containing values from 1 to 6.

3. What is an array in R? How does it differ from a matrix?

o An array is a multi-dimensional data structure that can hold data in more
than two dimensions. A matrix is the two-dimensional case, whereas an
array can have three or more dimensions. For example:

arr <- array(1:8, dim=c(2,2,2))

This creates a 3D array with dimensions 2x2x2.
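
The relationship between the two structures can be checked directly. The following is a minimal sketch that reuses the objects from the answers above; it is an illustration, not part of the original answer key:

```r
m <- matrix(1:6, nrow = 2, ncol = 3)
arr <- array(1:8, dim = c(2, 2, 2))

dim(m)          # 2 3
dim(arr)        # 2 2 2
is.matrix(m)    # TRUE
is.matrix(arr)  # FALSE, because arr has three dimensions
is.array(m)     # TRUE, a matrix is a two-dimensional array
```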

4. Explain the concept of non-numeric values in R. Give an example.

o Non-numeric values in R refer to data that are not numbers, such as
characters or logical values. They can be created using character strings or
logical expressions. For example:

non_numeric <- c("apple", "banana", "cherry")

This creates a character vector of fruit names.

5. What is a list in R? Provide an example of its creation.

o A list in R is an ordered collection of objects, which can be of different types
(e.g., numbers, strings, vectors). Lists are more flexible than vectors because
they can hold different types of data. You can create a list using the list()
function:

my_list <- list(a = 1, b = "hello", c = TRUE)

This creates a list with a number, a string, and a logical value.

6. Define a data frame in R and show how to create one.

o A data frame is a table-like structure that can hold columns of different data
types (e.g., numeric, character, logical). It is similar to a spreadsheet or SQL
table. Data frames are created using the data.frame() function. For example:

df <- data.frame(Name = c("John", "Jane"), Age = c(22, 24))

This creates a data frame with two columns, Name and Age.

7. What are NA values in R? Provide an example.

o NA is a special constant in R representing a missing or undefined value. It is
used to handle missing data in vectors, matrices, and data frames. For
example:

x <- c(1, 2, NA, 4)

Here, the third element in the vector is NA, indicating that the value is missing.

8. What is the function is.na() used for in R?

o The is.na() function checks if a value is NA (missing). It returns a logical
vector indicating which elements are NA. For example:

is.na(c(1, 2, NA, 4))

This would return FALSE, FALSE, TRUE, FALSE, indicating the position of NA in the vector.
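
Because is.na() returns a logical vector, it combines naturally with other base functions. A short sketch, assuming the same vector as above:

```r
x <- c(1, 2, NA, 4)

n_missing <- sum(is.na(x))     # number of missing values: 1
where     <- which(is.na(x))   # position of the missing value: 3
has_na    <- anyNA(x)          # TRUE if at least one value is missing
```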

9. Describe the term "coercion" in R with an example.

o Coercion in R refers to the automatic or explicit conversion of data from one
type to another. R will attempt to coerce values to a common type when
necessary. For example:

as.numeric("3")

Here, the string "3" is coerced into the numeric value 3.

10. What is the significance of NULL in R?

o NULL represents the absence of any value or object in R. It is different from
NA because NULL is used when there is no data at all. For example:

x <- NULL

Here, x holds NULL, an empty object of length zero.

11. How do you convert a list into a data frame in R?

o You can convert a list to a data frame using the as.data.frame() function.
However, the list should have named components or similar structures to
form a valid data frame. For example:

my_list <- list(Name = c("John", "Jane"), Age = c(22, 24))

df <- as.data.frame(my_list)

This converts the list into a data frame with two columns: Name and Age.

12. Explain the difference between a matrix and a data frame.

o A matrix is a two-dimensional structure where all elements are of the same
type (numeric, character, etc.), while a data frame can hold columns of
different data types. For example, a matrix might contain only numeric
values, whereas a data frame can have both numeric and character
columns.

13. What is a factor in R and how is it created?

o A factor in R is used to represent categorical data. Factors store both the
values of a variable and the corresponding levels (unique categories). They
are useful for representing qualitative data. You can create a factor using the
factor() function. For example:

gender <- factor(c("Male", "Female", "Male", "Female"))

This creates a factor variable representing gender with two levels, "Female" and "Male"
(levels are sorted alphabetically by default).

14. What is the difference between NULL and NA in R?

o NULL represents the absence of a value or object, while NA represents a
missing value or undefined data within an object. For example, x <- NULL
makes x an empty object of length zero, while x <- NA gives x a single
element whose value is missing.
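
The difference shows up when the two values are placed inside a vector. A minimal sketch (the variable names are illustrative only):

```r
length(NULL)                # 0: NULL contains nothing
length(NA)                  # 1: NA is a single missing element

with_null <- c(1, NULL, 3)  # NULL is dropped, leaving 1 3
with_na   <- c(1, NA, 3)    # NA keeps its position: 1 NA 3

is.null(NULL)               # TRUE
is.na(NA)                   # TRUE
```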

15. How can you extract a specific element from a list in R?

o To extract an element from a list, you can use the double bracket notation [[ ]]
or the $ operator for named elements. For example:

my_list <- list(a = 1, b = "hello", c = TRUE)

my_list[[2]] # Extracts the second element

Or for named elements:

my_list$b # Extracts the value associated with the name 'b'
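
Single brackets behave differently from double brackets, which is a common source of confusion. A brief sketch using the same list:

```r
my_list <- list(a = 1, b = "hello", c = TRUE)

sub <- my_list["b"]     # single brackets return a list of length 1
val <- my_list[["b"]]   # double brackets return the element itself

class(sub)  # "list"
class(val)  # "character"
```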

5-Mark Questions (Detailed Answer)


1. Explain the creation and manipulation of vectors in R with examples.

A vector in R is a fundamental data structure that holds an ordered collection of elements,
all of which must be of the same type, such as numeric, character, or logical. Vectors can
be created using the c() function. For example:

v <- c(10, 20, 30, 40)

This creates a numeric vector with four elements: 10, 20, 30, and 40. Vectors are also
important because R supports various operations on them, such as mathematical
operations, logical tests, and indexing. For example:

o Mathematical Operations: You can add, subtract, multiply, or divide each
element of a vector. If you have a vector v, you can perform element-wise
operations like:

v * 2 # Multiplies each element of v by 2

o Logical Operations: You can perform logical operations such as comparison
or filtering. For example:

v > 20 # Element-wise comparison; returns FALSE, FALSE, TRUE, TRUE

o Indexing: Vectors are indexed using square brackets. You can extract
specific elements like this:

v[2] # Extracts the second element, which is 20

v[1:3] # Extracts elements from the first to the third, which returns 10, 20, 30
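
The three kinds of operation can be combined. The sketch below, which reuses the vector v from above, shows filtering with a logical condition, dropping an element with a negative index, and recycling of a shorter vector:

```r
v <- c(10, 20, 30, 40)

big      <- v[v > 20]     # keeps elements where the condition is TRUE: 30 40
no_first <- v[-1]         # a negative index drops that element: 20 30 40
recycled <- v + c(1, 2)   # the shorter vector is recycled: 11 22 31 42
```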

2. Discuss the creation and characteristics of matrices in R.

A matrix in R is a two-dimensional, rectangular data structure that stores elements of the
same type (e.g., numeric, logical). It can be created using the matrix() function, which
requires the data to be provided and the dimensions of the matrix (rows and columns) to be
specified. For example:

m <- matrix(1:6, nrow=2, ncol=3)

This creates a 2x3 matrix with the numbers 1 through 6. The matrix will be filled by columns
by default:

     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6

Characteristics of matrices:

o Data Type: All elements of a matrix must be of the same type.

o Dimensionality: A matrix is two-dimensional. Each element is indexed by its
row and column number, unlike a vector, which has only one dimension.

o Manipulation: You can perform various operations on matrices, such as
element-wise arithmetic, matrix multiplication, and transposition. For
example:

t(m) # Transposes the matrix

m[1,2] # Accesses the element in the first row and second column, which is 3

Matrices are also compatible with functions like rowSums() and colSums(), which
summarize rows and columns respectively. For purely numerical data, matrix operations are
typically faster and more memory-efficient than the equivalent operations on lists or data frames.
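
As a small illustration of the fill order and the summary functions mentioned above, the following sketch contrasts the default column-wise fill with byrow = TRUE (the object names are illustrative):

```r
m_col <- matrix(1:6, nrow = 2, ncol = 3)                 # filled column by column
m_row <- matrix(1:6, nrow = 2, ncol = 3, byrow = TRUE)   # filled row by row

m_row[1, ]       # first row: 1 2 3
rowSums(m_col)   # 9 12
colSums(m_col)   # 3 7 11
```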

3. Explain arrays in R and provide an example of a 3-dimensional array.

An array in R is a multi-dimensional data structure that can hold elements of the same
type, similar to matrices, but with more than two dimensions. Arrays are particularly useful
when working with data in higher dimensions (e.g., images, time series data). To create an
array, you use the array() function and specify the data and its dimensions. For example, to
create a 3-dimensional array:

arr <- array(1:12, dim=c(3, 2, 2))

This creates a 3x2x2 array. The dim argument specifies the dimensions: 3 rows, 2 columns,
and 2 layers. The array will look like:

, , 1

     [,1] [,2]
[1,]    1    4
[2,]    2    5
[3,]    3    6

, , 2

     [,1] [,2]
[1,]    7   10
[2,]    8   11
[3,]    9   12

Characteristics of Arrays:

o Dimensionality: Arrays can have more than two dimensions, unlike
matrices, which are strictly two-dimensional. An array can have many
dimensions, though practical use cases are usually limited to 3
or 4 dimensions.

o Data Type: Like matrices, all elements of an array must be of the same data
type.

o Indexing: Arrays are indexed by multiple dimensions. You can access
elements with a specific row, column, and layer index. For example:

arr[1, 2, 1] # Accesses the element in row 1, column 2, layer 1, which is 4

Arrays are a powerful tool when dealing with data that requires more than two dimensions,
such as when working with multi-dimensional datasets like 3D images.
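
Whole slices can be extracted by leaving an index empty, and apply() can summarize along a chosen dimension. A minimal sketch using the same 3x2x2 array:

```r
arr <- array(1:12, dim = c(3, 2, 2))

layer1 <- arr[, , 1]               # first layer as a 3x2 matrix
dim(layer1)                        # 3 2

layer_sums <- apply(arr, 3, sum)   # sum within each layer: 21 57
```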

4. Describe how to handle non-numeric data in R, with examples.

Non-numeric data in R can include characters, logical values, factors, and other data
types. These non-numeric types are essential when working with categorical data, text, and
binary values. Here's how they can be handled:

o Character Data: A character vector is created using the c() function and can
store strings of text. For example:

fruits <- c("apple", "banana", "cherry")

Character vectors are indexed the same way as numeric vectors. For instance:

fruits[2] # Extracts the second element, "banana"

o Logical Data: Logical vectors hold boolean values (TRUE/FALSE). These are
useful in conditional operations. For example:

logic_vec <- c(TRUE, FALSE, TRUE)

logic_vec[logic_vec == TRUE] # Extracts TRUE values

o Factors: A factor is used to represent categorical data, such as gender or
species. Factors have a fixed set of levels. You can create a factor using the
factor() function. For example:

gender <- factor(c("Male", "Female", "Male", "Female"))

levels(gender) # Returns "Female", "Male" (sorted alphabetically)

Factors are often used in statistical modeling because they allow R to efficiently handle
categorical variables.

Manipulating Non-Numeric Data:

o You can convert character data into factors using factor().

o Logical data can be converted to numeric using as.numeric(), where TRUE
becomes 1 and FALSE becomes 0.

Non-numeric data are essential for data analysis when dealing with categorical variables or
conditions, and they often require special handling in statistical models.
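
The two conversions listed above can be sketched as follows. The vectors are made up for illustration:

```r
fruits <- c("apple", "banana", "apple")
fruit_factor <- factor(fruits)     # character to factor
levels(fruit_factor)               # "apple" "banana"

passed <- c(TRUE, FALSE, TRUE)
as_num <- as.numeric(passed)       # logical to numeric: 1 0 1
n_true <- sum(passed)              # TRUE counts as 1, so this counts TRUE values: 2
```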

5. Explain the difference between lists and data frames in R with examples.

Lists and data frames are both used to store multiple pieces of data, but they differ in
structure and purpose.

o Lists: A list is an ordered collection of objects that can be of different types.
Lists are highly flexible, and elements within a list can be numbers, strings,
vectors, or even other lists. A list is created using the list() function. For
example:

my_list <- list(a = 1, b = "hello", c = TRUE)

This creates a list with three elements: a number (1), a string ("hello"), and a logical value
(TRUE). You can access elements in a list using [[ ]] or $ if the elements have names:

my_list$a # Accesses the element named 'a', which is 1

o Data Frames: A data frame is a tabular structure in which each column can
hold data of a different type (e.g., numeric, character, logical), but all
elements in a column must be of the same type. Data frames are created
using the data.frame() function. For example:

df <- data.frame(Name = c("John", "Jane"), Age = c(22, 24))

This creates a data frame with two columns: Name (character data) and Age (numeric
data). You can access columns in a data frame using the $ operator:

df$Name # Accesses the 'Name' column

Key Differences:

o Homogeneity: All elements in a data frame column must be of the same
type, while elements in a list can be of different types.

o Usage: Data frames are most commonly used to store tabular data (like
spreadsheets or databases), while lists are useful when you need to store
heterogeneous data (e.g., different types of data, nested structures).

o Indexing: Because a data frame is itself a list of columns, [[ ]] and $ work on
both. Data frames additionally support row-and-column indexing with [row, column].

Data frames are ideal for structured data, while lists are more flexible for holding mixed
data types.
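
Since a data frame is built on a list of equal-length columns, list-style access works on it too. A short sketch with the data frame from above:

```r
df <- data.frame(Name = c("John", "Jane"), Age = c(22, 24))

is.list(df)     # TRUE: a data frame is a list of columns
df[["Age"]]     # 22 24, the same result as df$Age
length(df)      # 2, the number of columns
nrow(df)        # 2, the number of rows
```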

6. What are special values in R, and how do you handle them?

Special values in R are used to represent exceptional or undefined conditions in data and
calculations. They include:

1. NA (Not Available): Indicates missing values in a dataset.
Example:

x <- c(1, 2, NA, 4)

is.na(x) # Identifies NA values as TRUE or FALSE

o You can handle NA values using na.omit() to remove them or by replacing
them with other values using ifelse().

2. NULL: Represents the absence of an object or value.
Example:

y <- NULL

is.null(y) # Returns TRUE if the object is NULL

3. Inf and -Inf (Infinity): Occurs when a result exceeds the machine's representable
limit or when a nonzero number is divided by zero.
Example:

z <- c(1, Inf, -Inf)

is.infinite(z) # Identifies infinite values

These special values can disrupt analyses, so identifying and addressing them is crucial for
data cleaning and preprocessing.
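
One way to address several special values at once during cleaning is is.finite(), which is FALSE for NA, NaN, Inf, and -Inf. A minimal sketch with made-up values:

```r
vals <- c(1, NA, Inf, -Inf, NaN, 5)

is.finite(vals)                  # TRUE only for ordinary numbers
clean <- vals[is.finite(vals)]   # 1 5
mean(clean)                      # 3
```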

7. Explain coercion in R with examples.

Coercion is the automatic or explicit conversion of data types in R to ensure consistency.
This occurs because R operates on homogeneous data types in vectors, matrices, and
arrays.

Types of Coercion:

1. Automatic Coercion: When combining different data types, R coerces all elements
to the most flexible type (character > numeric > logical).
Example:

x <- c(1, "2", TRUE) # Numeric and logical are coerced to character

print(x) # Outputs: "1" "2" "TRUE"

2. Explicit Coercion: Use functions like as.numeric(), as.character(), or as.logical() to
manually change data types.
Example:

x <- "5"

y <- as.numeric(x) # Converts the string "5" to numeric value 5

z <- as.logical(1) # Converts 1 to TRUE

Implications:

• Coercion can simplify data manipulation but may also introduce errors if
incompatible types are coerced (e.g., trying to convert a character that isn't numeric
to a number).
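
The failure case described above can be sketched as follows. When a string cannot be parsed, as.numeric() returns NA and issues a warning rather than stopping; suppressWarnings() is used here only to keep the output quiet:

```r
raw <- c("10", "20", "abc")

nums <- suppressWarnings(as.numeric(raw))  # "abc" cannot be parsed, so it becomes NA
nums                 # 10 20 NA
is.na(nums)          # FALSE FALSE TRUE

two <- TRUE + TRUE   # logicals are coerced to numbers in arithmetic: 2
```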

8. Describe the process of basic plotting in R with examples.

R provides a range of functions for data visualization, allowing users to explore and present
data graphically.

Example 1: Scatter Plot

x <- c(1, 2, 3, 4)

y <- c(2, 3, 4, 5)

plot(x, y, type="b", col="blue", pch=19, main="Scatter Plot Example", xlab="X Values",
     ylab="Y Values")

• type="b": Plots both points and lines.

• col="blue": Sets the plot color.

• pch=19: Specifies the shape of points.

Example 2: Histogram

data <- c(1, 2, 2, 3, 3, 3, 4, 4, 5)

hist(data, col="lightblue", main="Histogram Example", xlab="Values", breaks=5)

Example 3: Boxplot

boxplot(data, main="Boxplot Example", col="orange")
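
Plots drawn with these functions can also be written to a file instead of the screen. The sketch below assumes a temporary PDF file as the destination (the path is a placeholder, not something specified in the original notes) and uses the base pdf() and dev.off() functions:

```r
data <- c(1, 2, 2, 3, 3, 3, 4, 4, 5)

out_file <- tempfile(fileext = ".pdf")   # placeholder output location
pdf(out_file)                            # open a PDF graphics device
boxplot(data, main = "Boxplot Example", col = "orange")
dev.off()                                # close the device so the file is written

file.exists(out_file)
```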

9. Discuss how to handle missing values (NA) in R.

Missing values (NA) are common in datasets and need to be addressed to ensure accurate
analysis.

Identifying Missing Values:

• Use is.na() to locate missing values.
Example:

x <- c(1, 2, NA, 4)

is.na(x) # Outputs TRUE for the third element

Handling Missing Values (each option is an alternative, applied to the original x):

1. Remove Missing Values:

clean_x <- na.omit(x) # Removes NA values

2. Replace Missing Values:

x[is.na(x)] <- 0 # Replaces NA with 0

3. Impute Missing Values: Replace NA with mean, median, or another statistic.
Example:

x[is.na(x)] <- mean(x, na.rm=TRUE) # Replaces NA with the mean
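
A fuller sketch of mean imputation, starting from a fresh vector, and of dropping incomplete rows from a data frame with complete.cases(). The data frame here is invented for illustration:

```r
x <- c(1, 2, NA, 4)
mean(x)                            # NA, because one value is missing
x_mean <- mean(x, na.rm = TRUE)    # mean of 1, 2 and 4

x_imputed <- x
x_imputed[is.na(x_imputed)] <- x_mean   # fill the gap with the mean

df <- data.frame(id = 1:3, score = c(90, NA, 75))
complete_rows <- df[complete.cases(df), ]   # keeps only rows without any NA
```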

10. Explain the concept of factors in R with examples.

Factors are data structures used to store categorical data in R. Each category, or level, is
assigned a unique integer.

Creation of Factors:

gender <- factor(c("Male", "Female", "Male", "Female"))

print(gender) # Prints the values, followed by "Levels: Female Male"

Properties:

1. Levels: Unique categories in the factor.

levels(gender) # Returns: "Female" "Male"

2. Conversion to Numeric: Factors are stored as integers internally.

as.numeric(gender) # Converts to integers: 2, 1, 2, 1

Factors are essential for statistical analysis, especially in regression models, where they
represent categorical predictors.
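
Two practical points follow from the integer storage described above: the level order can be set explicitly, and converting a factor of number-like strings needs as.character() first. A minimal sketch with made-up data:

```r
size <- factor(c("small", "large", "medium", "small"),
               levels = c("small", "medium", "large"))
levels(size)        # "small" "medium" "large"
as.integer(size)    # 1 3 2 1
table(size)         # counts per level

f <- factor(c("10", "5", "10"))
codes  <- as.numeric(f)                 # level codes, not the numbers: 1 2 1
values <- as.numeric(as.character(f))   # the original numbers: 10 5 10
```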

11. Discuss data frame operations in R with examples.

A data frame is a table-like structure where each column can have a different data type. It
is ideal for storing tabular data.

Creating a Data Frame:

df <- data.frame(Name = c("John", "Jane"), Age = c(22, 24))

Operations:

1. Extract Columns:

df$Age # Extracts the Age column

2. Add Columns:

df$Gender <- c("Male", "Female") # Adds a new column

3. Subset Rows:

subset_df <- df[df$Age > 22, ] # Extracts rows where Age > 22

4. Summarize Data:

summary(df) # Provides a statistical summary of the data
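
Sorting and combined row and column selection are two further common operations. The sketch below adds a third, hypothetical row ("Ravi") so that the sort order is visible:

```r
df <- data.frame(Name = c("John", "Jane", "Ravi"), Age = c(22, 24, 21))

sorted_df <- df[order(df$Age), ]             # sort rows by Age, ascending
older <- df[df$Age > 21, c("Name", "Age")]   # filter rows and select columns together

dim(df)        # 3 2
mean(df$Age)   # average age
```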

12. How do you create and customize plots in R using the ggplot2 package?

ggplot2 is a powerful R package for creating layered, customizable visualizations using the
grammar of graphics.

Example:

library(ggplot2)

# Basic Scatter Plot

df <- data.frame(Age=c(22, 24, 26), Height=c(160, 170, 175))

ggplot(df, aes(x=Age, y=Height)) +

geom_point(color="blue", size=3) +

theme_minimal() +

labs(title="Age vs Height", x="Age", y="Height")

Customizations:

• Add layers using +.

• Change themes (theme_minimal, theme_classic).

• Customize aesthetics (e.g., color, size, shape).

12. Explain the importance of lists in R and provide examples of their creation and
usage.

Lists in R are versatile data structures that can store elements of different types (e.g.,
numbers, characters, vectors, matrices, or even other lists). This flexibility makes lists
essential for handling complex data structures, such as the output of statistical models or
custom objects.

Key Features of Lists:

1. They can hold heterogeneous data.

2. Each element in a list can be accessed by its index or name.

3. Lists are commonly used in functions where multiple results need to be returned
together.

Creating a List:

my_list <- list(
  Name = "John",
  Age = 25,
  Scores = c(90, 85, 88),
  Passed = TRUE
)

print(my_list)

• my_list contains a string, numeric, vector, and logical value.

Accessing Elements in a List:

my_list$Scores # Access using the name

my_list[[2]] # Access using index

Modifying Lists:

my_list$Department <- "Science" # Add a new element

my_list$Age <- 26 # Modify an element

Use Case Example:

Lists are useful for storing regression outputs:

model <- lm(mpg ~ wt, data = mtcars)

summary_list <- summary(model)

summary_list$coefficients # Access the regression coefficients
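
Returning several results from one function, the third feature listed above, can be sketched like this. The function describe() is a hypothetical helper written for this example, not a built-in:

```r
# Hypothetical helper: bundles several summary values into one list
describe <- function(x) {
  list(mean = mean(x), range = range(x), n = length(x))
}

res <- describe(c(90, 85, 88))
res$mean    # average of the three scores
res$range   # 85 90
res$n       # 3
```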

13. Describe the process of creating and manipulating matrices in R. Provide
examples.

Matrices in R are two-dimensional data structures where all elements are of the same type
(e.g., numeric or character). They are commonly used in mathematical computations, such
as linear algebra.

Creating a Matrix:

# Create a 3x3 matrix

mat <- matrix(1:9, nrow = 3, ncol = 3)

print(mat)

This creates a matrix with numbers from 1 to 9, arranged in 3 rows and 3 columns.

Manipulating Matrices:

1. Accessing Elements:

mat[1, 2] # Access element in the 1st row, 2nd column

mat[, 2] # Access the entire 2nd column

2. Adding Rows/Columns:

new_row <- c(10, 11, 12)

mat <- rbind(mat, new_row) # Add a new row

3. Matrix Arithmetic:

mat <- mat * 2 # Multiply each element by 2

Matrix Operations:

• Transpose:

t(mat) # Transpose of the matrix

• Matrix Multiplication:

mat2 <- matrix(1:9, nrow = 3)

result <- mat %*% mat2 # Perform matrix multiplication

Use Case Example:

Matrices are often used for representing datasets or performing operations like solving
systems of equations.
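
As an illustration of solving a system of equations, the sketch below uses base R's solve() on a small system chosen for this example (2x + y = 5 and x + 3y = 10):

```r
# Coefficient matrix and right-hand side for:
#   2x +  y = 5
#    x + 3y = 10
A <- matrix(c(2, 1,
              1, 3), nrow = 2, byrow = TRUE)
b <- c(5, 10)

sol <- solve(A, b)   # x = 1, y = 3
A %*% sol            # multiplying back reproduces b
```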

14. What is the role of classes in R, and how do they affect object behavior? Provide
examples.

Classes in R define the type or behavior of an object. They allow R to determine how
functions should interact with objects, enabling object-oriented programming.

Role of Classes:

1. Define object properties and associated methods.

2. Enable specific functions to behave differently based on the object's class
(polymorphism).

3. Provide structure and clarity to complex data.

Checking and Setting Classes:

x <- 5

class(x) # Returns "numeric"

y <- factor(c("Male", "Female", "Male"))

class(y) # Returns "factor"

Creating Custom Classes:

You can define your own classes using class():

custom_obj <- list(a = 1, b = 2)

class(custom_obj) <- "custom_class"

print(custom_obj)

Methods and Classes:

Functions like summary() or print() behave differently based on the object's class:

model <- lm(mpg ~ wt, data = mtcars)

class(model) # Returns "lm"

summary(model) # Summarizes the linear model

Use Case Example:

Classes are vital for packages that define custom data types, like ggplot objects or
regression models.
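
The class-based dispatch described above can be sketched with a custom print method. The method name follows the S3 convention generic.class; the message text is an arbitrary choice for this example:

```r
custom_obj <- list(a = 1, b = 2)
class(custom_obj) <- "custom_class"

# print() dispatches to print.custom_class() for objects of this class
print.custom_class <- function(x, ...) {
  cat("custom_class object with a =", x$a, "and b =", x$b, "\n")
  invisible(x)
}

print(custom_obj)
inherits(custom_obj, "custom_class")   # TRUE
```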

15. Discuss the use of non-numeric values in R, such as characters and logicals, with
examples.

Non-numeric values, such as characters and logicals, are important data types in R that
expand beyond numerical computations.

Character Values:

1. Represent textual data.

2. Used for labels, names, or strings in datasets.

Example:

names <- c("Alice", "Bob", "Charlie")

paste(names, "is learning R") # Concatenates strings

Logical Values:

1. Represent binary TRUE/FALSE values.

2. Useful for conditional statements and subsetting data.

Example:

x <- c(5, 10, 15)

x > 8 # Returns: FALSE, TRUE, TRUE

Combining Non-Numeric Values:

df <- data.frame(Name = c("Alice", "Bob"), Passed = c(TRUE, FALSE))

print(df)

Use Case Example:

Logical values are essential in filtering datasets, since a comparison produces a logical vector that selects the matching rows:

subset(mtcars, gear == 4) # Subset rows where gear is 4
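
The same filter can be written with an explicit logical vector, which also makes it easy to combine conditions. A minimal sketch using the built-in mtcars data; the mpg threshold of 25 is an arbitrary value chosen for illustration:

```r
four_gear <- mtcars$gear == 4     # logical vector, one value per row
n_four <- sum(four_gear)          # number of cars with 4 gears

# Combine two conditions with & and use the result to select rows
efficient <- mtcars[four_gear & mtcars$mpg > 25, ]
rownames(efficient)
```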

16. How are special values like NA, NaN, and Inf different from one another in R?
Provide examples.

R uses special values to represent undefined or exceptional conditions.

NA (Not Available):

Indicates missing data.

x <- c(1, 2, NA, 4)

is.na(x) # Identifies missing values

• Used to represent incomplete or unavailable data.

NaN (Not a Number):

Represents undefined numerical results.

y <- 0/0 # Results in NaN

is.nan(y) # Checks if a value is NaN

• is.na() returns TRUE for NaN, but NaN specifically indicates an undefined numeric result; is.nan() returns FALSE for NA.

Inf and -Inf (Infinity):

Occurs when a result exceeds the machine's limit or when a nonzero number is divided by zero.

z <- 1/0 # Results in Inf

z <- -1/0 # Results in -Inf

is.infinite(z) # Checks for infinite values

Key Differences:

• NA: General missing value in any data type.

• NaN: Specific missing value resulting from invalid numeric computations.

• Inf/-Inf: Represents extremely large positive or negative values.

Handling Special Values:

x <- c(1, 2, NA, NaN, Inf)

is.na(x) # Identifies NA and NaN

is.nan(x) # Identifies only NaN

is.infinite(x) # Identifies Inf

These special values must be carefully addressed during data preprocessing to avoid
errors.
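
How each special value arises and propagates can be seen in a few one-line computations. A minimal sketch (the variable names are illustrative):

```r
undefined <- Inf - Inf    # NaN: the result is mathematically undefined
pos_inf   <- 1 / 0        # Inf
neg_inf   <- log(0)       # -Inf
missing   <- NA + 1       # NA: missingness propagates through arithmetic

is.nan(NA)    # FALSE: NA is not NaN
is.na(NaN)    # TRUE: is.na() also flags NaN
```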
