Data Science With R - Comcast Telecom Consumer Complaints
Comcast is an American global telecommunications company. The firm has been providing
terrible customer service and continues to fall short despite repeated promises to improve.
Only last month (October 2016) the authority fined the company $2.3 million after receiving
over 1000 consumer complaints.
The existing database will serve as a repository of public customer complaints filed against
Comcast.
It will help to pin down what is wrong with Comcast's customer service.
Data Dictionary
Ticket #: Ticket number assigned to each complaint
Customer Complaint: Description of complaint
Date: Date of complaint
Time: Time of complaint
Received Via: Mode of communication of the complaint
City: Customer city
State: Customer state
Zipcode: Customer zip
Status: Status of complaint
Filing on behalf of someone
Analysis Task
- Provide insights on :
Code
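# Install packages (only needed once per machine)
install.packages(c("stringi", "lubridate", "dplyr", "ggplot2", "ggpubr"))
# Load packages
library(lubridate)  # date helpers such as month()
library(stringi)    # string processing
library(dplyr)      # group_by(), summarise(), n()
library(ggplot2)    # plotting
library(ggpubr)     # ggarrange() for side-by-side plots
# Point R at the folder that holds the dataset
setwd("C:/Users/Dima/Desktop")
getwd()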
# Importing Dataset
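# Count complaints per calendar month (assumes comcast$Date has already been parsed as a Date)
monthly_tickets <- summarise(group_by(comcast, month = as.integer(month(Date))),
                             count = n())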
library(ggplot2)
# Label each complaint by keyword match; where matches overlap, later assignments overwrite earlier ones
comcast$ComplaintType[internet_issues] <- "Internet"
comcast$ComplaintType[network_issues] <- "Network"
comcast$ComplaintType[billing_issues] <- "Billing"
comcast$ComplaintType[charges_issues] <- "Charges"
comcast$ComplaintType[email_issues] <- "Email"
# Every complaint not matched by any keyword falls into "Others"
comcast$ComplaintType[-c(internet_issues, network_issues, billing_issues,
                         charges_issues, email_issues)] <- "Others"
table(comcast$ComplaintType)
# Let us create a new categorical variable with value as Open and Closed
# Provide the percentage of complaints resolved till date, which were received through
# the Internet and customer care calls
par(mfrow = c(1,2))
total<-ggplot(total_resolved,
aes(x= "",y =percentage,fill = ComplaintStatus))+
geom_bar(stat = "identity",width = 1)+
coord_polar("y",start = 0)+
geom_text(aes(label = paste0(round(percentage*100),"%")),
position = position_stack(vjust = 0.5))+
labs(x = NULL,y = NULL,fill = NULL)+
theme_classic()+theme(axis.line = element_blank(),
axis.text = element_blank(),
axis.ticks = element_blank())
category<-ggplot(Category_resloved,
aes(x= "",y =percentage,fill = ComplaintStatus))+
geom_bar(stat = "identity",width = 1)+
coord_polar("y",start = 0)+
geom_text(aes(label = paste0(Received.Via,"-",round(percentage*100),"%")),
position = position_stack(vjust = 0.5))+
labs(x = NULL,y = NULL,fill = NULL)+
theme_classic()+theme(axis.line = element_blank(),
axis.text = element_blank(),
axis.ticks = element_blank())
ggarrange(total,category,nrow = 1, ncol = 2)
Analysis with screenshots
The data was loaded into the R environment, and no missing values were found, as shown in
the screenshot below:
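The load-and-check step can be sketched roughly as follows. This is a minimal base-R sketch, not the exact commands used for the report: the real file name is not shown here, so a small inline CSV (hypothetical rows) stands in for it, and the column names follow the data dictionary.

```r
# Toy stand-in for the complaints file; the real file name and rows are not shown in the report.
csv_text <- "Ticket #,Customer Complaint,Date,Received Via,State,Status
1,Slow internet,2015-06-24,Internet,Georgia,Open
2,Billing error,2015-06-25,Customer Care Call,Florida,Closed
3,Email outage,2015-04-02,Internet,Georgia,Solved"

# read.csv() replaces spaces and '#' in headers with dots (e.g. Received.Via)
comcast <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Count missing values per column; all zeros means nothing is missing
na_per_column <- colSums(is.na(comcast))
print(na_per_column)

# Single TRUE/FALSE answer for the whole data frame
any_missing <- anyNA(comcast)
```

With the real file, the `text = csv_text` argument would be replaced by the path to the CSV.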
Provide the trend chart for the number of complaints at monthly and daily granularity
levels.
As shown in the table below, the daily and monthly ticket counts were extracted, and two
graphs were plotted to compare them.
As the first graph below shows, the number of tickets starts to increase in April and May
and then rises sharply in June. We can therefore assume that there is a significant reason
behind this turning point.
The second graph shows that the number of tickets starts to increase sharply during the
second half of June.
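The two granularities can be sketched as below. This is a base-R sketch of the same kind of aggregation that `monthly_tickets` performs with dplyr in the Code section; the dates are toy values, not the real data, and `daily_tickets` is an illustrative name.

```r
# Toy dates standing in for comcast$Date (the real values are not shown here)
dates <- as.Date(c("2015-04-02", "2015-05-10", "2015-06-20",
                   "2015-06-24", "2015-06-24", "2015-06-25"))

# Monthly granularity: one count per calendar month number
month_number <- as.integer(format(dates, "%m"))
monthly_tickets <- as.data.frame(table(month = month_number),
                                 responseName = "count")

# Daily granularity: one count per calendar date
daily_tickets <- as.data.frame(table(date = dates), responseName = "count")
# table() stores the dates as factor labels, so convert them back to Date
daily_tickets$date <- as.Date(as.character(daily_tickets$date))

print(monthly_tickets)
print(daily_tickets)
```

Either data frame can then be passed to a line chart, with the month or date on the x-axis and `count` on the y-axis.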
Let us look in more detail at which category of complaints the company receives most
often.
Provide a table with the frequency of complaint types.
As shown in the table below, most of the complaints are related to Internet issues. Many
other kinds of complaints were grouped under the “Others” category.
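The keyword-based grouping behind this table can be sketched as follows. The complaint texts, the keywords, and the helper `find_rows` are assumptions for illustration; the report does not show how `internet_issues` and the other index vectors were built.

```r
# Toy complaint descriptions (hypothetical, not taken from the dataset)
complaints <- c("Comcast internet speed is slow",
                "Network outage for two days",
                "Billing dispute over my account",
                "Unexpected charges on statement",
                "Cannot log in to email",
                "Rude technician")

# Hypothetical helper: integer positions of complaints that mention a keyword
find_rows <- function(keyword) {
  which(grepl(keyword, complaints, ignore.case = TRUE))
}

internet_issues <- find_rows("internet")
network_issues  <- find_rows("network")
billing_issues  <- find_rows("billing")
charges_issues  <- find_rows("charges")
email_issues    <- find_rows("email")

# Start with "Others" everywhere, then overwrite the matched positions.
# Where a complaint matches several keywords, the last assignment wins.
complaint_type <- rep("Others", length(complaints))
complaint_type[internet_issues] <- "Internet"
complaint_type[network_issues]  <- "Network"
complaint_type[billing_issues]  <- "Billing"
complaint_type[charges_issues]  <- "Charges"
complaint_type[email_issues]    <- "Email"

# Frequency of each complaint type
print(table(complaint_type))
```

Defaulting to "Others" first avoids negative indexing, which selects nothing at all when no keyword matches any complaint.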
Create a stacked bar chart for complaints based on state and status
As we can observe in the chart, the states with the highest number of tickets are Georgia
and Florida.
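The counts behind such a chart can be sketched as below, in base R with toy rows. The mapping of raw `Status` values to Open and Closed is an assumption (here "Open" and "Pending" count as Open, everything else as Closed); the report does not show the actual rule.

```r
# Toy tickets (hypothetical rows); column names follow the data dictionary
tickets <- data.frame(
  State  = c("Georgia", "Georgia", "Georgia", "Florida", "Florida", "Texas"),
  Status = c("Open", "Solved", "Pending", "Closed", "Open", "Solved"),
  stringsAsFactors = FALSE
)

# Assumed rule: "Open" and "Pending" are unresolved, anything else is resolved
tickets$ComplaintStatus <- ifelse(tickets$Status %in% c("Open", "Pending"),
                                  "Open", "Closed")

# Rows are states, columns are Open/Closed counts
status_by_state <- table(tickets$State, tickets$ComplaintStatus)
print(status_by_state)

# State with the most unresolved complaints
unresolved <- status_by_state[, "Open"]
top_unresolved_state <- names(unresolved)[which.max(unresolved)]
```

Passing `t(status_by_state)` to base R's `barplot()` draws one stacked bar per state, with the Open and Closed counts as segments.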
As depicted in the pie charts below, 77% of the complaints are resolved: 38% were received
via the Internet and 39% via customer care calls. The remaining 23% are still unresolved:
12% were received via the Internet and 11% via customer care calls.
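Shares of this kind can be computed roughly as follows. This is a base-R sketch on toy rows, so its percentages differ from the reported ones; the last lines only check that the reported figures add up.

```r
# Toy tickets (hypothetical rows), five per channel
tickets <- data.frame(
  Received.Via    = rep(c("Internet", "Customer Care Call"), each = 5),
  ComplaintStatus = rep(c("Closed", "Closed", "Closed", "Closed", "Open"), times = 2),
  stringsAsFactors = FALSE
)

# Share of ALL complaints in each channel/status cell (cells sum to 1)
share <- prop.table(table(tickets$Received.Via, tickets$ComplaintStatus))
print(round(share * 100))

# Overall resolved and unresolved share, summed over channels
total_by_status <- colSums(share)

# Consistency check on the figures reported in the text (in percent)
reported_resolved   <- c(Internet = 38, Calls = 39)
reported_unresolved <- c(Internet = 12, Calls = 11)
reported_total <- sum(reported_resolved) + sum(reported_unresolved)
```

Because `prop.table()` is called without a margin, each cell is a share of all complaints, which matches how the 38% and 39% add up to the overall 77%.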
Conclusion:
The analysis above shows that Comcast received a high volume of complaints in the second
half of June, that most of the complaints relate to Internet service issues, and that the
highest number of complaints came from the state of Georgia.
Georgia also has the highest number of unresolved complaints. Overall, 77% of the
complaints are resolved: 38% were received via the Internet and 39% via customer care
calls.