Sources of Data Sets

The document discusses different types of data sets and how they are collected. It defines main types of data sets as census (descriptive), observational studies (inferential), convenience samples (may be biased), and randomized trials (causal). Other types include prediction studies, and longitudinal, cross-sectional, or retrospective studies over time. Specific examples are given to illustrate census data, observational studies, convenience samples, randomized trials, and prediction studies.

Uploaded by

tnylson
Copyright
© Attribution Non-Commercial (BY-NC)

Sources of data sets

1/22/13 7:41 PM

Sources of data sets


Jeffrey Leek, Assistant Professor of Biostatistics, Johns Hopkins Bloomberg School of Public Health

file:///Users/jtleek/Dropbox/Public/008sourcesOfDataSets/index.html#1


Data are defined by how they are collected


Main types
  Census (descriptive)
  Observational study (inferential)
  Convenience sample (all types - may be biased)
  Randomized trial (causal)

Other types
  Prediction study (prediction)
  Studies over time
    - Cross-sectional (inferential)
    - Longitudinal (inferential, predictive)
    - Retrospective (inferential)


A population


Pick a person and measure


Census


Observational study
set.seed(5)
sample(1:8,size=4,replace=FALSE)

[1] 2 5 6 8


Convenience sample
probs = c(5,5,5,5,1,1,1,1)/24
sample(1:8,size=4,replace=FALSE,prob=probs)

[1] 4 1 2 5
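
The "may be biased" label can be made concrete by repeating the draw many times. The sketch below reuses the slide's `probs`; the seed, the repetition count `n_rep`, and the names `draws` and `inclusion` are assumptions added for illustration. Under simple random sampling of 4 out of 8, every person would appear in half of the samples; with these weights, people 1-4 should appear far more often than people 5-8.

```r
# Selection weights from the slide: units 1-4 are five times as likely
# to be picked on a single draw as units 5-8.
probs <- c(5, 5, 5, 5, 1, 1, 1, 1) / 24

set.seed(5)      # seed chosen only to make the sketch reproducible
n_rep <- 2000    # number of repeated samples (arbitrary)

# Each column of `draws` is one convenience sample of four units.
draws <- replicate(n_rep, sample(1:8, size = 4, replace = FALSE, prob = probs))

# Share of the repeated samples in which each unit appears.
inclusion <- sapply(1:8, function(u) mean(colSums(draws == u) > 0))
names(inclusion) <- 1:8
print(round(inclusion, 2))
```

The printed shares are simulation estimates, so their exact values depend on the seed and on `n_rep`.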


Randomized trial
treat1 = sample(1:8,size=2,replace=FALSE)
# draw the second group only from people not already in treat1
treat2 = sample(setdiff(1:8,treat1),size=2,replace=FALSE)
c(treat1,treat2)

[1] 8 1 3 4
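
An equivalent way to think about randomization, sketched here as an alternative formulation rather than the slide's own code, is to enroll the study units first and then shuffle a balanced vector of treatment labels. The seed and the names `units`, `arm`, and `assignment` are hypothetical.

```r
set.seed(5)   # hypothetical seed, for reproducibility only

# People enrolled in the trial: 4 of the 8 in the population.
units <- sample(1:8, size = 4, replace = FALSE)

# Shuffle a balanced vector of labels so chance alone decides the arm.
arm <- sample(rep(c("treat1", "treat2"), each = 2))

assignment <- data.frame(unit = units, arm = arm)
print(assignment)
```

Because the labels are permuted rather than drawn independently, each arm always receives exactly two people and nobody can land in both.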


Prediction study: train


set.seed(5)
sample(1:8,size=4,replace=FALSE)

[1] 2 5 6 8


Prediction study: test


sample(c(1,3,4,7),size=2,replace=FALSE)

[1] 1 4
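
The hard-coded `c(1,3,4,7)` is simply the set of people left over after the training draw `2 5 6 8`. A minimal sketch that computes the held-out pool instead of typing it in is shown below; the names `population`, `train`, `held_out`, and `test` are assumptions for illustration, and the specific people drawn depend on the R version's sampling algorithm.

```r
set.seed(5)   # same seed as the training slide

population <- 1:8
train <- sample(population, size = 4, replace = FALSE)

# People not used for training are the only candidates for the test set.
held_out <- setdiff(population, train)
test <- sample(held_out, size = 2, replace = FALSE)

length(intersect(train, test))   # 0: no person is in both sets
```

Deriving `held_out` with `setdiff` keeps the training and test sets disjoint even if the seed or the population changes.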


Study over time: cross-sectional


Study over time: longitudinal


Study over time: retrospective
