Getting Started with R
An Introduction for Biologists
ANDREW P. BECKERMAN
& OWEN L. PETCHEY
Great Clarendon Street, Oxford, OX2 6DP,
United Kingdom
Oxford University Press is a department of the University of Oxford.
It furthers the University’s objective of excellence in research, scholarship,
and education by publishing worldwide. Oxford is a registered trade mark of
Oxford University Press in the UK and in certain other countries
© Andrew P. Beckerman and Owen L. Petchey 2012
The moral rights of the authors have been asserted
First Edition published in 2012
Impression: 1
All rights reserved. No part of this publication may be reproduced, stored in
a retrieval system, or transmitted, in any form or by any means, without the
prior permission in writing of Oxford University Press, or as expressly permitted
by law, by licence or under terms agreed with the appropriate reprographics
rights organization. Enquiries concerning reproduction outside the scope of the
above should be sent to the Rights Department, Oxford University Press, at the
address above
You must not circulate this work in any other form
and you must impose this same condition on any acquirer
British Library Cataloguing in Publication Data
Data available
Library of Congress Cataloging in Publication Data
Library of Congress Control Number: 2011945448
ISBN 978–0–19–960161–5 (hbk.)
978–0–19–960162–2 (pbk.)
Printed in China by
CC Offset Printing Co. Ltd
Links to third party websites are provided by Oxford in good faith and
for information only. Oxford disclaims any responsibility for the materials
contained in any third party website referenced in this work.
Table of Contents
Preface
What this book is about
What you need to know to make this book work for you
How the book is organized
Chapter 1: Why R?
What you need to know to make this book work for you
There are a few things that you need to know to make this book, and our ideas,
work for you. Many of you already know how to do most of these things,
having been in the Internet age for long enough now, but just to be sure:
In this book, we will show you how to use R in the context of everyday
research in Biology. Our philosophy assumes that you have some data and
would like to derive some understanding from it. Typically you need to
manage your data, explore your data (e.g. by plotting it), and then analyse
your data. Before any attempt at analysis, we suggest that you always plot
your data. As always, analysing (modelling) your data involves first developing
a model and testing critical assumptions associated with the statistical
method (model). Only after this do you attempt interpretation. Our
focus is on developing a rigorous and efficient routine (workflow) and a
template for using R for data exploration, visualization, and analysis. We
believe that this will give you a functional approach to using R, in which
you always have the goal (understanding) in mind.
We start by providing in-depth instruction for how to get data into R,
manipulate and summarize your data, and make a variety of informative,
publication-quality figures common to our field. We then provide an
overview of how to do some common analyses. In contrast to other
books, we spend most of our time helping you to develop a workflow for
analysis and an understanding of how to tell R what to do. We also help
you identify core pieces of R output that are reported regularly in our
field. This is important because the output differs from one statistics package
to another.
Chapter 1 is titled Why R? It is an (our) overview of why you might
spend a few days (and more!) of your valuable time converting your data
management, graphics, and analysis to R. There are many reasons, though
we advise all readers to make a careful decision about whether the investment
of time and effort will give sufficient return.
Chapters 2–4 are based on our tried and tested Import, Explore, Graph.
We walk you through one of the most difficult stages in using R—getting
your data into R and producing your first figure in R. Then we show you
how to explore your data, summarize it in various forms, and plot it in
various formats. Visualizing your data before you do statistics is vital.
These chapters also introduce you to the script—a permanent, repeatable,
annotated, shareable, cross-platform record of your analysis.
Chapter 5 introduces you to our workflow for implementing and interpreting
t-tests, chi-square tests, and general linear models. General linear
models are a flexible set of methods that include the better-known
concepts of regression, ANOVA, and ANCOVA. In the spirit of functionality
and making R work for you, our objective is to help you develop a
repeatable and reliable workflow in R. We focus on helping you produce
interesting, appealing, and appropriate figures, interpret the output of R,
and craft sensible descriptions of the methods and results for publication.
Our focus is the workflow.
Throughout the book, a symbol in the left margin highlights where you can
work along with us, on your own computer, using R.
Throughout the book, we use syntax (code) colouring as found in the
OS X R script editor. This should help you visualize the instructions you need to
give R. We believe this book can be used as a self-guided tutorial. All of the
datasets we use are available online at http://www.r4all.org. There are also
13 boxes embedded in the book with extra detail about certain topics. Take
your time and learn the magic of R.
1
Why R?
Some of you will have established research careers based around using a
variety of statistical and graphing packages. Some of you will be starting
with your research career and wondering whether you should use some
of the packages and applications that your supervisor/research group uses,
or jump ship to R. Perhaps your group already uses R and you are just looking
for that “getting started” book that answers what you think are embarrassing
questions. Regardless of your stage or background, we think a
formal introduction to an approach to, and routine for using R, will help.
We begin by reviewing a core set of features and characteristics of R that we
think make it worth using and worth making a transition to from other
applications.
First, we think you should invest the effort because it is freely available
and cross-platform (e.g. it works on Windows, Macs [OS X], and Linux).
This means that no matter where you are and with whom you work, you
can share data, figures, analyses, and most importantly the instructions
(also known as scripts and code) used to generate the figures and analyses.
Anyone, anywhere in the world, with any kind of Windows, Macintosh, or
Linux operating system, can use R, without a license. If you, or your
Finally, and quite importantly, R makes it very easy to write down and
save the instructions you want R to execute—this is called a script in R. In
fact, the script becomes a permanent, repeatable, annotated, cross-platform,
shareable record of your analysis. Your entire analysis, from transferring
your data from field or lab notebook, to making figures and performing
analyses, is all in one, secure, repeatable, annotated place.
So, if you have not already done so, go get R. Follow this link and locate
the server closest to you holding R:
http://cran.r-project.org/mirrors.html
Then click on the operating system (Mac OS X, Windows, and Linux) you
use. This will take you to a page specific to your operating system. For
Mac OS X users, look down until you see a link to something like R-2.13.pkg.
Click on this, and then install the downloaded file. For Windows
users, you get taken to a page where you click on the “base” link (you want
the “base” version of R); from here the on-screen instructions make it clear
what to do next. If you get stuck here, please read the instructions about
installation in the FAQs, linked at the bottom of every R homepage. Linux
users, and those of other operating systems, probably know how to get and
download R. If any of you are stuck at this stage, take a look at Box 2.1 and
Box 2.2 for more information on setting up your computer and getting
and installing R. If you’re still stuck, email one of us. Really.
2
Import, Explore, Graph I
Getting Started
One of the most frequent stumbling blocks for new students and seasoned
researchers using R is actually just getting the data into R. This
is really unfortunate, since it is also most people’s first experience with R!
Let’s make this first step easy.
Assume you have some datasheets on your clipboard or in your lab
book. Perhaps these data are measures of the abundances of various
species in various places, or morphological measurements of individual
organisms. Obviously, you need to put this data into your computer in
order that you can then get it into R. And though we’ve implied that R is
good for everything, it is not so good for entering large amounts of
data.
Instead, use a spreadsheet application such as Microsoft Excel, Open
Office, or Numbers to enter your data. Make the first row of your spreadsheet
the names of your variables (column names). Keep these variable
names informative, brief and simple. Try not to use spaces or special symbols,
which R can deal with but may not do so particularly nicely. Also, if
you have categorical variables, such as sex (male, female) use the category
names rather than codes for them (e.g. don’t use 1 = male, 2 = female).
R can easily deal with variables that contain words, and this will make
subsequent tasks in R much more efficient.
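
To illustrate why category names are more convenient than codes, here is a minimal sketch. The data frame `dat` and its columns `sex` and `body_mass` are made-up names and values used only for illustration; they are not from the book's datasets.

```r
# Hypothetical data: categories entered as words, not numeric codes
dat <- data.frame(
  sex = c("male", "female", "female", "male", "female"),
  body_mass = c(21.3, 19.8, 20.4, 22.1, 18.9)
)

# Tell R explicitly that the words form a categorical variable (a factor)
dat$sex <- factor(dat$sex)

levels(dat$sex)                       # category names, sorted alphabetically
table(dat$sex)                        # number of observations per category
tapply(dat$body_mass, dat$sex, mean)  # mean body mass for each category
```

Because the categories carry their own names, summaries and figure labels read "female" and "male" directly, with no need to remember which code meant which.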
Finally, we highly recommend that you enter data so there is one observation
per row. That is, make a dataset with column names like treatment1,
treatment2, replicate, response_variable. This will make your R-life much,
much easier, since many of the functions in R prefer data this way. If
you, for example, made a dataset with columns treatment1, treatment2,
response of replicate 1, response of replicate 2, and response of replicate 3,
you would have to do some data manipulation to get the data into the
correct shape for many R functions.
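
As a rough sketch of the kind of manipulation that would be needed, the following uses base R's `reshape()` to turn a dataset with one column per replicate into one observation per row. The column names `response_rep1` to `response_rep3` and all values are invented for illustration.

```r
# Hypothetical "wide" dataset: one column per replicate
wide <- data.frame(
  treatment1 = c("low", "low", "high", "high"),
  treatment2 = c("wet", "dry", "wet", "dry"),
  response_rep1 = c(1.2, 2.3, 3.1, 4.0),
  response_rep2 = c(1.4, 2.1, 3.3, 4.2),
  response_rep3 = c(1.1, 2.5, 2.9, 3.8)
)

# Stack the three response columns into a single column,
# recording which replicate each value came from
long <- reshape(
  wide,
  direction = "long",
  varying = c("response_rep1", "response_rep2", "response_rep3"),
  v.names = "response_variable",
  timevar = "replicate",
  times = 1:3,
  idvar = c("treatment1", "treatment2")
)
rownames(long) <- NULL

long  # 12 rows: one observation per row
```

Entering the data in the long shape from the start avoids this step altogether.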
Next, print your entered data onto paper, and check that this copy of
your data matches the data on your original datasheets. Correct any mistakes
you made when you typed your data in.
Now, don’t save your file as an .xls, .xlsx, .oo, or .numbers file. Instead,
save it as a “comma separated values” file (.csv file). In Excel, Open Office
or Numbers, after you click Save As . . . you can change the format of the file
to “comma separated values”, then press Save. Excel might then, or when
you close the Excel file, ask if you’re sure you’d like to save your data in this
format. Yes, you are sure! At this point in our workflow, you have your
original datasheets, a digital copy of the data in a .csv file, and a printed
copy of the data from the .csv file.
One of the remarkable things about R is that once a copy of your “raw
data” is established, the use of R for data visualization and analysis will
never require you to alter the original file in any way (unless you collect more
data!). Therefore keep it very safe! In many statistical and graphing programs,
your data spreadsheet has columns added to it, or you manipulate
columns or rows of the spreadsheet before doing some analysis or graphic.
With R, your original can always remain unmanipulated. If any adjustments
do occur, they occur in an R copy only and only via your explicit
instructions.
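
A small sketch of this idea follows. It assumes a throwaway file created with `tempfile()` as a stand-in for your raw data; the column names `plot` and `height` are hypothetical.

```r
# Create a stand-in raw data file (hypothetical contents)
raw_file <- tempfile(fileext = ".csv")
writeLines(c("plot,height", "A,10.5", "B,12.0", "C,9.8"), raw_file)

# Work on a copy held in R
working <- read.csv(raw_file)
working$height_mm <- working$height * 10   # add a column to the copy
working <- working[working$height > 10, ]  # drop rows from the copy

# Read the file again: it still holds the original three rows and two columns
untouched <- read.csv(raw_file)
dim(working)
dim(untouched)
```

The file on disk changes only if you explicitly write to it, for example with `write.csv()`.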
But how do you actually get the data in this .csv file into R? How do
you go from making a dataset in a .csv file to having a working copy in R’s
brain to use for your analysis?
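
One common way to do this is base R's `read.csv()`. The sketch below is only an illustration: it writes a small made-up file into a temporary folder so that it runs anywhere, and the file name `my_first_data.csv`, the object name `my_data`, and the values are all assumptions.

```r
# Stand-in for your raw data file (hypothetical name and contents)
csv_path <- file.path(tempdir(), "my_first_data.csv")
writeLines(
  c("treatment,replicate,response_variable",
    "control,1,4.2",
    "control,2,3.9",
    "fertilized,1,6.1",
    "fertilized,2,5.8"),
  csv_path
)

# Read the file into a working copy; the first row supplies the column names
my_data <- read.csv(csv_path, header = TRUE)

# Check that R has what you expect
str(my_data)   # variable names and types
head(my_data)  # first few rows
dim(my_data)   # number of rows and columns
```

With your own data you would replace `csv_path` with the PATH to your .csv file.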
Part of keeping your data safe is putting it somewhere safe. But where? We
like to be organized and we want you to be organized too.
Make a new folder in the main location/folder in which you store your
research information—perhaps you have a Projects folder in MyDocuments
(PC) or in the Documents folder (Mac and e.g. Ubuntu); create a
folder inside this location. Give the folder an informative name—perhaps
“MyFirstAnalysis”.
Understanding where your data is stored is very important for learning
how to use R. The location, or address, for information on a computer is
known as the PATH. How do you find out the PATH to a folder or dataset on
a computer? First, note that there are some really easy ways to get the PATH
to your raw data file without having to type it in. This is important, since
typing in PATHs incorrectly is one of the most common early stumbling
blocks, and a very frustrating one. Essentially, the trick for getting paths is to
copy them from the Finder (OS X) or the Explorer (Windows) (see Box 2.1
for details). You shouldn’t ever have to type a PATH in, though sometimes
you might choose to. In case you do, here’s what you need to know.
In Windows, the location of things historically starts with a drive letter
followed by a colon and some slashes, for example
“C:\\MyDocuments\\project1\\raw data\\file.csv”. The conventions for the
direction of the slashes in Windows vary, depending on how old your version
of Windows is—they are presented as either single or double backslashes. On a
Macintosh or Linux/Unix machine, the location will usually be in some
folder in the path from your home directory, such as
“Users/username/Documents/project1/raw data/file.csv” or
“~/Documents/project1/raw data/file.csv”.
In R, the convention for writing a PATH is always a single forward slash.
So, whether you are on a PC or a Mac or a Linux/Unix box, the way you
describe where files are within your folders is consistent: one forward slash.
However, if you copy PATH information from Windows, you will probably
need to change backward slashes to forward slashes.
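
The slash conversion can be sketched in R itself. The path below is the hypothetical Windows example from the text; `gsub()` and `file.path()` are base R functions.

```r
# A path as it might be copied from Windows (hypothetical location).
# Inside an R string, each backslash has to be typed as "\\".
windows_path <- "C:\\MyDocuments\\project1\\raw data\\file.csv"

# Replace every backslash with a forward slash
r_path <- gsub("\\", "/", windows_path, fixed = TRUE)
r_path

# file.path() builds a path from its pieces, joined by forward slashes
home_path <- file.path("~", "Documents", "project1", "raw data", "file.csv")
home_path
```

Copying the path and fixing the slashes this way avoids most typing mistakes.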
Linux/Unix
Our guess is that if you are using Linux/Unix to run R, you understand the PATH.
Macintosh
To make OS X show the PATH in the title of the Finder, you need to invoke a
command in the Terminal.
Go to Applications/Utilities/Terminal.app and double click.
At the prompt, type the following:
Now, the top of your Finder window should show the location where you are—
the PATH.
Note that Linux/Unix systems and Macintosh systems use the same convention
for describing the PATH—a forward slash separates all folders and documents in the
final folder.
One feature of OS X is that you can actually drag this information to another
document. If you click and hold the folder symbol, you can drag this information to
any text editor, including an R script. In fact, dragging ANY folder or document to
the script editor for the Macintosh will paste the path.
Windows
To make Windows show the path in the explorer, you need to set the preferences in
the Tools menu for the explorer. Tools->Folder Options->View Tab will take you to a
list of tick boxes. There should be a tick box labelled “Display full path . . .” which will
show the path to files in the address bar and the title bar.
If you’re confused by any of that, don’t forget you can copy and paste
paths, and Box 2.1 provides a bit more detail about how to do this. We
assume that Linux/Unix users are familiar with PATHs.
Now, inside that folder you just made, make another folder called analyses
(i.e. “MyFirstAnalysis/analyses”). And perhaps another folder called
manuscript. Perhaps another one called Important PDFs. We think you
might be getting the point: use folders to organize separate parts of your
project. Your file of instructions for R—your script file—will go in the
analyses folder. We’ll build one soon. Be patient.☺
What is the script file? As we noted in the Preface, R allows you to write
down and save the instructions that R uses to complete your analysis. Trust
us, this is a desirable feature of R. As a result of making a script, you, as a
researcher, end up with a permanent, repeatable, annotated, shareable,
cross-platform archive of your analysis. Your entire analysis, from transferring
your data from notebook, to making figures and performing analyses, is all
in one, secure, repeatable, annotated place. We think you can see the value.
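
To give a feel for what such a script might look like, here is a very small sketch. The file name in the first comment, the object names, and the data values are all hypothetical; lines beginning with `#` are annotations that R ignores.

```r
# MyFirstAnalysis/analyses/first_script.R  (hypothetical file name)
# Purpose: summarize plant height by treatment

# 1. Get the data (made-up values standing in for an imported .csv file)
growth <- data.frame(
  treatment = c("control", "control", "fertilized", "fertilized"),
  height_cm = c(10.2, 11.0, 14.1, 13.5)
)

# 2. Explore the data: mean height for each treatment
treatment_means <- aggregate(height_cm ~ treatment, data = growth, FUN = mean)
treatment_means
```

Running the same script again, on any machine with R, repeats exactly the same steps.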
2.3 How to get your data into R and where it is stored in R’s brain
Now comes the fun part. You’ve organized your life a bit. You know where
your data is, and you know where you will store things associated with a
particular project/experiment/analysis. Let’s use R.
Wait! R? Where is it? If you have not installed it yet, Box 2.2 provides
detailed instructions on how to install R on your machine—be it a PC,
Macintosh, or Linux. It’s not hard, but there are some helpful hints in
there.
Now, assuming the installation has been successful, start R. Click the
icon on your desktop, navigate to the start menu, find the applications
folder, click on the icon in the dock . . . you know what to do. You are clever.
You know how to start an application on your computer!
You can bookmark your favourite mirror in your web browser. The mirror provides
all the files and instructions necessary for installing R on your computer. Here we
show the mirror at Bristol University, UK. Simply click on the link for your operating
system, READ THE INSTRUCTIONS, and proceed to enjoy R. There is a section at the
bottom of the boxes called “Questions About R”. If you think you might ignore these,
you are exactly the people they are written for.
Macintosh
The Macintosh version of R comes with a feature-rich script editor. It provides line
numbering, syntax colouring, and a keyboard combination for sending information
from the script to the console.
Line numbering
Syntax colouring:
Text is red (within quotes)
Numbers are green
Boolean operators are yellow
Function components are blue
To “send” this information from the script to the console, you may use the
keyboard combination:
cmd + return
or the leftmost icon in the script, with the red arrow (send this to the console).
If you simply press cmd + return, the line on which the cursor sits will be sent to
the console. Alternatively, you can select larger portions of the script, and then
press cmd + return. This will send the entire selection to R.
Because the Mac is now based on BSD Unix, it also has access to a variety of Unix/
Linux script editors including vi, emacs, and so on. The Mac version of emacs can be
fully functional and integrated with R (using Emacs Speaks Statistics).
Windows
The built-in Windows script editor is currently as basic as it gets. It is simply a text
editor. It contains no line numbering or syntax highlighting. However, it does possess
a keystroke combination for sending lines or selections from the script to the console:
control + R
Despite the lack of features in the built-in editor, there are numerous Windows editors
that interface with R very well. The two most popular with our students seem to
be Tinn-R and RStudio (which is actually available for Windows, Macintosh and Linux):
http://www.sciviews.org/Tinn-R/
http://rstudio.org/
General
There are innumerable scripting applications available for many platforms. A small
repository can be found here: http://www.sciviews.org/_rgui/index.html.