Data Analysis and Business Modeling With Excel 2013 - Sample Chapter
David Rojas
If you want to start using Excel 2013 for data analysis and
business modeling and enhance your skills in the data
analysis life cycle, then this book is for you, whether you're
new to Excel or an experienced user.
his time as a consultant in the data world. He lives in the Silicon Valley and is active
within the data community. After receiving a degree from the University of Florida
as an industrial and systems engineer and obtaining a minor in sales engineering,
he received his state license as an engineer in training. Soon thereafter, he pursued a
career change to the IT world as a data analyst and discovered his passion for data
using various tools in order to manage and analyze data in a better way. After many
years of working in a wide range of odd data roles, such as reporting, gathering
requirements, writing documentation, working with databases, and working with
flat files, he decided to make his love for data a reality and started his own business
(www.hedaro.com). You will often find his work being cited by various professors
and other data enthusiasts around the Bay Area.
Preface
If you ever wondered how other data professionals manage, analyze, and visualize
data with Excel, then this book will be a wealth of knowledge for you. This book is
filled with step-by-step instructions and progresses through the same natural stages
a data analyst goes through in practice. The examples are deliberately small so that
you can understand the problems being solved and solutions are shown in detail
without skipping any steps along the way. In addition, my extensive experience in
the industry will help you explore practical real-world examples that go beyond
theories and provide you with a strong foundation that can be used in a wide range
of data-intensive roles that you may encounter throughout your career. After reading
the entire book, you will have the confidence to work with data and tell a compelling
story about its findings using Excel.
Chapter 4, Using Formulas to Prepare Your Data for Analysis, covers the use of Excel's
formulas to create custom columns, identify key metrics, and make decisions based
on business rules. Formulas are one of the key features that showcase the power of
the tool, and this chapter provides you with plenty of practical examples to help you
gain valuable experience.
Chapter 5, Analyzing Your Data Using Descriptive Statistics and Charts, uses Excel to
explore data, identify bad data, and spot outliers and trends. After data has been
cleaned and prepared, it is now time to dig a little deeper. Are there any issues with
your data? Do you have bad data? Do you understand what kind of data is in each
column and how it relates to the rest of your dataset? Using Excel's built-in tools and
charting capabilities, you will learn more about the data you are working with.
Chapter 6, Link Your Data Using Data Models, covers how to combine and link data
using database concepts by taking advantage of the new features of Excel 2013.
Excel's data model allows us to combine tables in a similar way to how the LOOKUP
functions accomplished this previously. This new functionality will allow the analyst
to merge datasets faster and with ease. Organizing data is the key concept in this
chapter that will propel you to answer questions about the data.
Chapter 7, A Primer on Using the Excel Solver, teaches you the basics of the Excel
Solver, which is one of the most underrated tools that comes with Excel. You will
learn everything from activating the add-in to solving business problems
that are relevant to today's workplace. The information in these few pages will
elevate you above other Excel developers.
Chapter 8, Learning VBA Excel's Scripting Language, introduces you to Excel's very
own scripting language. After performing the same data transformations over and
over again, a smart data analyst will try to find ways to automate repetitive tasks.
Excel's solution to this problem is VBA (Visual Basic for Applications), which
you will use to create macros that automate certain tasks. This chapter will
empower you with knowledge that will take you from being a casual Excel
user to being a powerful, skilled, and advanced Excel developer.
Chapter 9, How to Build and Style Your Charts, discusses how to use Excel's built-in
charting tools to quickly create visually appealing charts. Visualizing data is not
only a great way to understand it but also a great way to tell a story to an audience.
This chapter also covers how to customize properties, such as titles, legends, colors,
and so on. This chapter focuses on the keys to generate creative, simple, and concise
charts that will deliver insights from your findings.
Chapter 10, Creating Interactive Spreadsheets Using Tables and Slicers, helps you leverage
Excel's interactive slicers. This is one of the most exciting chapters in this book
and it will simply impress you. Here, you will gain the ability to slice and dice data
interactively, create custom filters that automatically update the data on the fly,
and watch the audience engage with the data. You can filter by dates, strings,
and numbers; the possibilities are endless!
Appendix, Tips, Tricks, and Shortcuts, provides you with useful shortcuts and tips that
have been used throughout this book for reference purposes.
Gathering data
Gathering data is exactly what it sounds like; in this step, you will be gathering all of
the data you need for analysis. This might include data that you get from your client,
boss, coworker, the Internet, or a database. There are other data sources, such as
CSV files, but remember that it is your job to find the data. I once had a client asking
me "Can you take a look at my code as it is not working?" He was trying to map
some data into Google Maps and he was having trouble doing this. He sent me code
snippets and asked me if I could figure out what the problem was. I took a look at
his work, but I just did not have enough information to debug the issue. Guess what
my next question to my client was? "Send me your code and the data you are trying
to plot." What kind of data my client was working with and what the code
was doing with that data were the two key things that I needed to know.
I eventually figured out the issue for my client, but the point here is to show you
that getting the data in your hands is the first step. Chapter 1, Getting Data into Excel,
and Chapter 2, Connecting to Databases, will focus on providing you with all the skills
needed to bring data from various sources into Excel.
Preparing data
You will soon realize that after you gather your data, it does not always come in a
neat package for you. For example, you may be given a PDF document with 1,000
entries and asked to transfer that data into an Excel spreadsheet. You might get lucky
and be able to copy/paste the records into Excel, or you might be forced to
enter each record by hand. I used to work for a wholesaler of college textbooks and
faced a similar situation. I needed to copy a very large PDF document and transfer
its content to Excel. I remember refusing to do so and asking a coworker to put this
data in a different format. I was trying everything under my control to avoid that
PDF file. Unfortunately, in the end I had no choice but to roll up my sleeves and get
the job done. As a data analyst, you will probably spend most of your time in the
data analysis life cycle cleaning the data. In other words, you will gather data and
organize it in a format you can work with. Munging and data wrangling are other
terms you may hear that refer to this step of the process. Other common issues are
numbers formatted as strings, missing values, extra spaces, and so on. We will go
through various examples of the ones mentioned and their solutions in Chapter 3,
How to Clean Texts, Numbers, and Dates, and Chapter 4, Using Formulas to Prepare Your
Data for Analysis.
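To make these common issues concrete before we tackle them in Excel, here is a minimal sketch in Python. It is only an illustration: the clean_cell helper and the sample values are hypothetical and are not taken from any dataset in this book. It trims extra spaces, treats blank cells as missing values, and converts numbers stored as strings into real numbers.

```python
def clean_cell(raw):
    """Trim extra spaces and convert numeric-looking text to a number."""
    text = raw.strip()
    if text == "":
        return None  # treat blank cells as missing values
    try:
        # Remove thousands separators such as "1,000" before converting.
        return float(text.replace(",", ""))
    except ValueError:
        return text  # not a number, so keep the trimmed string


# Hypothetical raw values, as they might arrive from a copy/paste.
raw_cells = [" 321", "45 ", "", "1,000", "  Bob  "]
cleaned = [clean_cell(cell) for cell in raw_cells]
print(cleaned)
```

The same three checks (spaces, blanks, and text that should be numeric) are the ones you will usually run first on any freshly gathered dataset.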
Analyzing data
After you gather and prepare your data, you are now ready to analyze it. Your main
goal up until now was to get your data into Excel; this is our comfort zone where
we know we can work with data. What do I mean when I say analyze your data?
Well, this means that it is time to get your inquisitive and curious hats on. If you
don't have any of these, then it is time to act like a detective, Inspector Gadget style
(if you're old enough to remember who he is). In this step, we begin with inspecting
every column one by one. For example, let's say that the first column was called
Revenue and the second column was called Product Name. We would expect the
Revenue column to have numbers in each of the values and the Product Name
column to have strings as the values associated with this column. We will then look
for any missing values, the largest number, and the smallest value. We might also
take a look at the distinct values in the Product Name column and look for any
misspelled words.
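The column-by-column inspection described above can be sketched in a few lines of Python. This is a rough illustration under stated assumptions: the rows below are made up, None stands in for an empty cell, and one product name is misspelled on purpose so that it shows up in the list of distinct values.

```python
# Hypothetical rows: (Revenue, Product Name). None marks a missing value.
rows = [
    (321, "Widget"),
    (45, "Gadget"),
    (None, "Widget"),
    (23, "Wigdet"),  # deliberately misspelled
]

revenue = [r for r, _ in rows]
present = [r for r in revenue if r is not None]

summary = {
    # Does every non-missing Revenue value hold a number?
    "all_numeric": all(isinstance(r, (int, float)) for r in present),
    "missing": len(revenue) - len(present),
    "largest": max(present),
    "smallest": min(present),
    # Distinct names make misspellings easy to spot by eye.
    "distinct_names": sorted({name for _, name in rows}),
}
print(summary)
```

Scanning the distinct names is often the quickest way to catch a typo such as Wigdet sitting next to Widget.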
Chapter 1
Are you trying to solve a problem? Are you trying to predict the next year's revenue?
Did you ask for some background of the task you were assigned to do? Remember
to ask all these questions to whoever is going to receive your analysis for feedback
along the way. The last thing you want is to complete the
analysis and then be told that you were analyzing or solving the wrong problem. You
may also spend a lot of time figuring out what certain columns mean and whether you
actually have the data to complete the task. Chapter 5, Analyzing Your Data Using Descriptive
Statistics and Charts, Chapter 6, Link Your Data Using Data Models, and Chapter 7, A
Primer on Using the Excel Solver, will give you enough exposure to analyzing and
squeezing out insights from your data.
Presenting data
This is where the fun begins; you are now at a point where you can tell your story.
At this point, you should know everything about your data, such as where it came
from and how it was prepared or organized, and you should have completed the
task you were assigned, at least in theory. For example, if you were asked to simply
create a line chart with the monthly sales for the year, then this is where you should
be at this stage. The data should be in Excel, the sales data should be aggregated on a
monthly basis, and you should already have an idea of how to create and place your
line chart. Before you spend an hour or so making your final spreadsheet look good,
create a simple mockup and get feedback from your end user. I know that this is not
always applicable to every situation, but getting feedback along the way will save
you a lot of time redoing the work at a later stage. Another little-known fact
is that people change their minds, and sometimes their requirements, so
always build your spreadsheets to be as flexible as possible. In our example, you may be
asked at the last minute to switch the data from monthly to quarterly for an analysis.
They may want the data over the past 5 years and a bar chart instead of a line chart.
My advice to you is very simple; expect changes every single time. Luckily, Excel
has many wonderful tools to help you spin up interactive and visually impressive
workbooks. In Chapter 9, How to Build and Style Your Charts, and Chapter 10, Creating
Interactive Spreadsheets Using Tables and Slicers, we will go through all these neat
features that will equip you with the necessary knowledge to further enhance
your skills.
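One way to picture what building flexibly means is to keep the grouping rule separate from the summing. The following Python sketch is an assumption-laden illustration, not a recipe from this book: the sales records and the aggregate, by_month, and by_quarter names are all hypothetical. Switching from monthly to quarterly totals then means swapping one function rather than redoing the work.

```python
from collections import defaultdict
from datetime import date

# Hypothetical daily sales records: (date, amount).
sales = [
    (date(2013, 1, 15), 100),
    (date(2013, 1, 28), 50),
    (date(2013, 2, 3), 75),
    (date(2013, 4, 9), 200),
]


def aggregate(records, key):
    """Sum amounts into buckets chosen by the key function."""
    totals = defaultdict(int)
    for day, amount in records:
        totals[key(day)] += amount
    return dict(totals)


def by_month(day):
    return (day.year, day.month)


def by_quarter(day):
    # Months 1-3 map to quarter 1, months 4-6 to quarter 2, and so on.
    return (day.year, (day.month - 1) // 3 + 1)


monthly = aggregate(sales, by_month)
quarterly = aggregate(sales, by_quarter)
print(monthly)
print(quarterly)
```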
2. Type Revenue in cell A1 and Name in cell B1, as shown in the following
screenshot. These are going to be the column headings of our dataset.
3. We are now going to apply styles to the column headings so that they
stand out. Highlight cells A1 and B1 and press Ctrl + B. This action
will make the two strings that we selected bold. Another option is to
highlight the cells and click on the Bold button in the toolbar, as shown
in the following screenshot:
4. Now, type 321, 45, 7, and 23 in the Revenue column. Then, type David, Bob,
Bill, and Mike in the Name column. Your spreadsheet should look like the
example in the following screenshot:
5. For our finishing touches, we can apply styles to our data by adding borders
around the cells. We can accomplish this by highlighting the cells A1 through
B5 and clicking on the Borders button. This will bring up a new menu. Select
All Borders. Remember that you need to first highlight the cells you want to
add the borders to. Refer to the following screenshot:
Congratulations! You have just learned how to enter numbers, strings, and column
headers in an Excel spreadsheet. You have also learned how to apply styles to the
text and cells using Excel's built-in formatting features. You can think of this as your
first Hello World program in Excel 2013. Your final output should look like this:
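In place of the screenshot, a rough plain-text rendering of the same five rows can be sketched in Python. This is only an illustration of the data entered in steps 2 and 4, not part of the Excel steps, and it does not reproduce the bold headings or borders.

```python
# The dataset entered in steps 2 and 4: a header row plus four records.
rows = [
    ("Revenue", "Name"),
    (321, "David"),
    (45, "Bob"),
    (7, "Bill"),
    (23, "Mike"),
]

# Print each row in two left-aligned columns, roughly as it sits in A1:B5.
lines = [f"{str(revenue):<8}{name}" for revenue, name in rows]
print("\n".join(lines))
```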
2. Now, open Excel and create a new workbook. Go to the DATA tab and click
on the From Text button, as shown in the following screenshot:
Navigate to your data.txt file and click on the Import button, as shown in
the following screenshot:
3. You will now see a dialog box, as shown in the following screenshot. This
dialog box will ask you how your data has been formatted. By default,
you will have the Delimited option selected. This means that your values are
separated by a character such as a space, a comma, or a semicolon.
In our example, the values in the data.txt file are separated by commas.
There are other options, but 99 percent of the time, you can just click
on the Next button.
4. Step two of the import wizard will now ask you to select the delimiter or the
character that separates each of the values you are trying to import. Make
sure that you click on the Comma delimiter and remove any other options
that may have been checked automatically.
Now, let's take a look at the Data preview area in the following screenshot.
This area will show you a few records of how Excel plans to parse the data.
As shown in the following screenshot, we can see that by choosing the
Comma delimiter, Excel correctly splits the data into two columns.
We can now click on the Finish button.
5. The last dialog box will ask you to select where you want to paste the data.
The default value is A1 and this is usually the cell you would like to insert
the data into. At this point, you also have the option to paste your data into
a new worksheet by choosing this option in the dialog box.
After you click on the OK button, you will see your data in columns A and B.
You can also drag the actual data.txt file into Excel and this
will activate the Text Import Wizard.
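If it helps to see what the wizard is doing behind the scenes, the delimiter choice can be sketched with Python's standard csv module. Treat this as an approximation under stated assumptions: the sample text below is hypothetical and only stands in for data.txt, and Excel's own parser is not guaranteed to behave identically in every edge case.

```python
import csv
import io

# Hypothetical contents standing in for data.txt; the real file may differ.
sample_text = "1,apple\n2,banana\n3,cherry\n"

# delimiter="," plays the role of the Comma checkbox in step two of the wizard.
reader = csv.reader(io.StringIO(sample_text), delimiter=",")
table = [row for row in reader]

for row in table:
    print(row)

# With the wrong delimiter, each line stays in a single column.
unsplit = list(csv.reader(io.StringIO(sample_text), delimiter=";"))
print(unsplit[0])
```

This mirrors what the Data preview area shows: the right delimiter yields two columns, while the wrong one leaves every record crammed into one.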
Importing a CSV file
The acronym CSV means comma-separated values. What this means to us is that
when we use the Text Import Wizard, we need to select Comma as the delimiter. To
import a CSV file, the steps are exactly the same as those in the Manually creating data
section; however, the data.txt file is not a CSV file. A CSV file can be identified by
its filename ending in .csv.
File two should be called two.xlsx and will look similar to the
following screenshot:
Now, let's pause and see what we are trying to do here with the two files.
The goal is to combine them into one file. There are two methods that
we can use, so let's start with the easiest one.
In the first method, open the one.xlsx and two.xlsx files. Using the
two.xlsx file, highlight cells A1 through A5. Press Ctrl + C to copy
the selected cells. Now, switch to the one.xlsx file and select cell B1.
Press Ctrl + V to paste the data. Your spreadsheet should now look like
the following screenshot:
Congratulations! At this point, you can save the file as final.xlsx and you
are all done. You have combined two different Excel workbooks into one.
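The first method amounts to placing one column next to another, row by row. Here is a minimal sketch of that idea in Python. The cell values are hypothetical, since the actual contents of one.xlsx and two.xlsx are only shown in the screenshots, and the length check is an extra safeguard that the copy and paste itself does not perform.

```python
# Hypothetical contents of column A in each workbook (five cells each).
one_col_a = ["ID", 1, 2, 3, 4]
two_col_a = ["City", "Miami", "Tampa", "Orlando", "Ocala"]

# Pasting two.xlsx A1:A5 into one.xlsx at B1 pairs the cells row by row.
if len(one_col_a) != len(two_col_a):
    raise ValueError("columns must have the same number of rows")

combined = [list(pair) for pair in zip(one_col_a, two_col_a)]
for row in combined:
    print(row)
```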
The second method involves using an Excel feature that you will often use
in different situations. Let's go through the following steps, and then, I will
explain the benefits of using this technique:
1. Open the one.xlsx and two.xlsx files. Using the one.xlsx file,
right-click on the tab named Sheet 1, and select the Move or Copy...
option, as shown here:
The Move or Copy dialog box will appear. Select the workbook
that you plan to move the data to. In this case, it is going to be
two.xlsx. Make sure that you have the second
workbook open, or you will not be able to see this option in the
drop-down menu. In the next section named Before sheet, select the
option called (move to end), check the Create a copy checkbox, and
click on the OK button, as shown here:
You will now have your second spreadsheet with two tabs: one
named Sheet 1 that holds your original data and another one
named Sheet 1 (2) that holds the data we just imported from the
first spreadsheet. From here on, we can just employ the first
technique and combine both the datasets. Good job!
What was so different about the second method? This method gives
us options and that is the key. We currently have a spreadsheet that
contains the raw data from each of the two workbooks. We can then
create a third spreadsheet or a third tab that holds the data from the
two datasets. If we make any mistakes, we can simply remake the
third tab/spreadsheet, as our original data is still intact. We can also
filter the data of our two original datasets before we combine any
data. In practice, you will notice that you will be performing a unique
combination of these two methods, depending on your dataset and
the problem you are trying to solve.
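The benefit of the second method, keeping the raw data untouched and deriving a third tab from it, can be sketched as follows. This is a loose analogy in Python rather than anything Excel does internally: the workbook dictionary, its values, and the build_combined helper are hypothetical, and stacking the rows is just one of several ways the two datasets could be combined.

```python
import copy

# Hypothetical workbook: tab name -> rows. The two raw tabs are never edited.
workbook = {
    "Sheet 1": [["a", 1], ["b", 2]],
    "Sheet 1 (2)": [["c", 3], ["d", 4]],
}


def build_combined(book, keep=lambda row: True):
    """Return a new tab made from filtered deep copies of both raw tabs."""
    combined = []
    for tab in ("Sheet 1", "Sheet 1 (2)"):
        combined.extend(copy.deepcopy(row) for row in book[tab] if keep(row))
    return combined


workbook["Combined"] = build_combined(workbook)

# Made a mistake? Rebuild the third tab with a filter; the raw tabs are intact.
workbook["Combined"] = build_combined(workbook, keep=lambda row: row[1] > 1)
print(workbook["Combined"])
```

Because the third tab is always rebuilt from the originals, a bad filter or a wrong paste costs you nothing more than running the step again.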
You will notice that in several places, on the website, you can see a yellow
square with a single back arrow, as shown in the following screenshot. This
button tells you that Excel has found a table on the website. This button also
tells you that you can grab the contents of the table and import them into
your spreadsheet. What do we mean by saying that the website has a table
on it? This is outside the scope of this book, as the answer requires you to
know HTML. But for your reference, Excel looks for HTML <table> tags to
identify tables on a website.
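For the curious, the idea of scanning a page for table tags can be sketched with Python's standard html.parser module. This is a simplified illustration and not Excel's actual implementation: the TableCounter class and the HTML fragment are hypothetical, and nested tables are not handled.

```python
from html.parser import HTMLParser


class TableCounter(HTMLParser):
    """Collect the cell text of every <table> element found on a page."""

    def __init__(self):
        super().__init__()
        self.tables = []  # one list of rows per table
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self.tables[-1].append([])
        elif tag in ("td", "th") and self.tables and self.tables[-1]:
            self._in_cell = True
            self.tables[-1][-1].append("")

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self.tables[-1][-1][-1] += data.strip()


# Hypothetical page fragment; a real page would be far larger.
html = """
<p>Intro text</p>
<table><tr><th>Name</th><th>Revenue</th></tr>
<tr><td>David</td><td>321</td></tr></table>
<table><tr><td>Other</td></tr></table>
"""

parser = TableCounter()
parser.feed(html)
parser.close()
print(len(parser.tables), "tables found")
```

Each entry in parser.tables corresponds to one place where a yellow button could appear on the page.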
3. Scroll down to the end of the web page, click on the last yellow button, and then
click on the Import button, as shown in the following screenshot. Why did we choose to
select the last button and not the first one? In this example, there were two
buttons to choose from. Sometimes, the button will be right next to the table
that you are interested in, and at other times, you will have to complete the
task by trial and error.
Notice that the yellow button will change to a green checkbox. After you
click on the Import button, you will get a new dialog box that will ask you
where you want to paste the data, as shown in the following screenshot.
The cell A1 is usually the default location selected, but you may change
the location if you wish.
At this point, Excel will grab the data from the website, and you will
have a worksheet that looks similar to the following screenshot. You
should have 50 records, but they do not have to match the ones in
the following screenshot:
Good job! You have just imported data from a website effortlessly, thanks
to Excel's robust tools that helped you get the job done as easily and quickly
as possible. The advantage of grabbing the data using the previous steps is
that if there are any changes in the data on the website, we can easily update
our spreadsheet to match any new changes. If we simply copy and paste the
data from the website into Excel, we would have to perform these same steps
every time the data changes. The Pandas Bootcamp website actually changes
data every time you refresh the web browser. Try it!
This means that if we ask Excel to refresh or to check whether the website has
any new data, it will update our spreadsheet with the new data. Let's give it
a try.
4. Right-click on cell A1 or any other cell with data. Go to the menu bar, and
click on the Refresh button, as shown in the following screenshot. Your data
should have changed! This feature will allow your data to always be in sync
with just a few clicks.
Summary
The lessons in this chapter were designed to teach you how to gather data from
various data sources. You should now be able to pull data from text files, CSV files,
other Excel files, and web pages. Getting data in your hands is the first step in the
data analysis life cycle, and you now have the skills needed for this process. In the
next chapter, we will take a look at the last set of data gathering skills that all data
analysts should be equipped with. Chapter 2, Connecting to Databases, will guide
you through detailed step-by-step instructions on how to connect to Microsoft SQL
Server databases using Excel's data connection tools.
Get more information Data Analysis and Business Modeling with Excel 2013
www.PacktPub.com