R quickstart

Basic data structures and operations in R

R is a great calculator:

1 + 2

Assign a value and print an object:

(x = 2) # parentheses around an assignment also print the value
print(x)
x == 2 # logical operator, not assignment
y <- x + 0.5
y # another way to print

Logical operators:

x == y # equal
x != y # not equal
x < y # less than
x >= y # greater than or equal

Vectors and sequences:

x <- c(1, 2, 3)
x
1:3
seq(1, 3, by = 1)

rep(1, 5)
rep(1:2, 5)
rep(1:2, each = 5)

Vector operations, recycling:

x + 0.5
x * c(10, 11, 12, 13) # lengths 3 and 4: x is recycled, with a warning
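The recycling rule used above can be written out by hand: the shorter vector is repeated until it is as long as the longer one. The following is a minimal sketch; the vectors a and b are hypothetical and are not part of the session above.

```r
a <- c(1, 2, 3)
b <- c(10, 11, 12, 13, 14, 15)

# length(b) is a multiple of length(a), so a is recycled twice, no warning
recycled <- a * b

# the same computation with the recycling spelled out
by_hand <- rep(a, length.out = length(b)) * b
identical(recycled, by_hand)

# lengths 3 and 4 are not multiples: R still recycles but warns,
# so the warning is suppressed here to keep the output clean
partial <- suppressWarnings(a * c(10, 11, 12, 13))
partial
```

The last result uses a[1] again for the fourth element, which is rarely what one wants, so the warning is worth taking seriously.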

Indexing vectors, ordering:

x[1]
x[c(1, 1, 1)] # a way of repeating values
x[1:2]
x[x != 2]
x[x == 2]
x[x > 1 & x < 3]
order(x, decreasing=TRUE)
x[order(x, decreasing=TRUE)]
rev(x) # reverse

Character vectors, NA values, and sorting:

z <- c("b", "a", "c", NA)
z[z == "a"] # also returns an NA, because NA == "a" is NA
z[!is.na(z) & z == "a"] # exclude the NA explicitly
z[is.na(z) | z == "a"] # keep the NA on purpose
is.na(z)
which(is.na(z))
sort(z) # NA values are dropped by default
sort(z, na.last=TRUE) # keep NA values and put them last
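A comparison involving NA returns NA rather than TRUE or FALSE, and an NA inside a logical index produces an NA element in the result. The sketch below illustrates this with a hypothetical vector v that has the same values as z; which() and %in% are shown as two possible ways to avoid the stray NA.

```r
v <- c("b", "a", "c", NA)

cmp <- v == "a"              # FALSE TRUE FALSE NA
with_na <- v[cmp]            # the NA in the index yields an NA element
no_na_which <- v[which(cmp)] # which() keeps only the TRUE positions
no_na_in <- v[v %in% "a"]    # %in% returns FALSE, not NA, for a missing value

with_na
no_na_which
no_na_in
```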

Matrices and arrays:

(m <- matrix(1:12, 4, 3))
matrix(1:12, 4, 3, byrow=TRUE)

array(1:12, c(2, 2, 3))

Attributes:

dim(m)
dim(m) <- NULL
m
dim(m) <- c(4, 3)
m
dimnames(m) <- list(letters[1:4], LETTERS[1:3])
m
attributes(m)

Matrix indices:

m[1:2,]
m[1,2]
m[,2]
m[,2,drop=FALSE]
m[2] # a single index treats the matrix as a vector (column by column)

m[rownames(m) == "c",]
m[rownames(m) != "c",]
m[rownames(m) %in% c("a", "c", "e"),]
m[!(rownames(m) %in% c("a", "c", "e")),]
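The different indexing forms above are related to each other. Here is a small sketch that makes the relations explicit; the matrix mm is a hypothetical copy of m, rebuilt here so the example runs on its own.

```r
mm <- matrix(1:12, 4, 3, dimnames = list(letters[1:4], LETTERS[1:3]))

# a single index counts down the columns (column-major order),
# so position 6 is row 2 of column 2 because each column has 4 rows
mm[6] == mm[2, 2]

# selecting one column drops the dim attribute unless drop = FALSE is used
is.null(dim(mm[, 2]))
dim(mm[, 2, drop = FALSE])

# dimnames can be used directly instead of comparing rownames()
mm["c", "B"]
```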

Lists and indexing:

l <- list(m = m, x = x, z = z)
l
l$ddd <- sqrt(l$x)
l[2:3]
l[["ddd"]]

Data frames:

d <- data.frame(x = x, sqrt_x = sqrt(x))
d

Structure:

str(x)
str(z)
str(m)
str(l)
str(d)
str(as.data.frame(m))
str(as.list(d))

Summary:

summary(x)
summary(z)
summary(m)
summary(l)
summary(d)

Key concepts

  • a matrix is a vector with a dim attribute; all of its elements share the same mode,
  • a data frame is a list whose elements all have the same length; the elements can be of different modes.
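Both statements can be checked directly. This is a minimal sketch with hypothetical objects vec and df, not taken from the session above:

```r
vec <- 1:6
dim(vec) <- c(2, 3)   # adding a dim attribute turns the vector into a matrix
is.matrix(vec)        # TRUE
mode(vec)             # "numeric"

# mixing modes in a matrix coerces every element to a common mode
mixed <- cbind(1:2, c("a", "b"))
mode(mixed)           # "character"

# stringsAsFactors = FALSE is set explicitly so that older R versions
# do not convert the character column to a factor
df <- data.frame(num = 1:3, chr = c("a", "b", "c"),
    stringsAsFactors = FALSE)
is.list(df)           # TRUE
sapply(df, mode)      # each column keeps its own mode
lengths(df)           # every column has the same length
```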

Random numbers

Random numbers can be drawn from a distribution using the r* functions, where the wildcard (*) stands for the abbreviated name of the distribution, for example rnorm for the Normal distribution.

n <- 1000
## draw n random numbers from Normal(0, 1)
hist(rnorm(n, mean = 0, sd = 1))
## Uniform (read the name as 'r-unif', not 'run-if')
hist(runif(n, min = 0, max = 1))
## Poisson
plot(table(rpois(n, lambda = 5)))
## Binomial
plot(table(rbinom(n, size = 10, prob = 0.25)))
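The draws above change on every run. If reproducible results are needed, one option is to fix the seed of the random number generator with set.seed() before drawing. The sketch below assumes an arbitrary seed value and hypothetical object names; it also compares the sample mean and variance of Poisson draws with the value of lambda.

```r
set.seed(1234)   # arbitrary seed, chosen only for illustration
draws1 <- rnorm(5)
set.seed(1234)
draws2 <- rnorm(5)
identical(draws1, draws2)   # same seed, same draws

# with many draws, sample statistics approach the theoretical values
big <- rpois(10000, lambda = 5)
mean(big)   # should be close to lambda
var(big)    # for a Poisson distribution the variance also equals lambda
```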

MCMC list objects

The coda R package defines MCMC list objects as:

  • a list whose elements are matrices of identical dimensions,
  • each list element stores the posterior sample from one MCMC chain, thus the length of the mcmc.list object equals the number of parallel chains,
  • each matrix has two dimensions: the number of samples (determined by the number of iterations and the thinning value) and the number of variables monitored,
  • the matrices have attributes attached to them that store the MCMC parameters (the start iteration, the end iteration, and the thinning interval of the chain).

For example, if we are monitoring 2 variables, one normally and one uniformly distributed, the structure might look like this:

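## 3 chains, each a matrix with n rows and 2 columns (a and b)
mcmc <- replicate(3,
    structure(cbind(a = rnorm(n), b = runif(n)),
        ## mcpar stores the start, the end, and the thinning interval
        mcpar = c(1, n, 1), class = "mcmc"),
    simplify = FALSE)
class(mcmc) <- "mcmc.list"
str(mcmc)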
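The pieces of this structure can be inspected with base R alone. The sketch below rebuilds a smaller object under the same assumptions; the names n_iter and chains are hypothetical. It checks that the number of rows follows from the start, end, and thinning values stored in the mcpar attribute.

```r
n_iter <- 100   # hypothetical chain length, kept small for illustration
chains <- replicate(3,
    structure(cbind(a = rnorm(n_iter), b = runif(n_iter)),
        mcpar = c(1, n_iter, 1), class = "mcmc"),
    simplify = FALSE)
class(chains) <- "mcmc.list"

n_chains <- length(chains)                        # number of parallel chains
n_rows <- vapply(unclass(chains), nrow, integer(1))   # samples per chain
n_vars <- vapply(unclass(chains), ncol, integer(1))   # variables per chain

# mcpar holds the start, the end, and the thinning interval
par1 <- attr(chains[[1]], "mcpar")
n_samples <- (par1[2] - par1[1]) / par1[3] + 1
n_samples == n_rows[1]
```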

Some basic methods for such mcmc.list objects are defined in the coda and dclone packages:

library(dclone)
summary(mcmc)
str(as.matrix(mcmc))
varnames(mcmc)
start(mcmc)
end(mcmc)
thin(mcmc)
plot(mcmc)
traceplot(mcmc)
densplot(mcmc)
pairs(mcmc)
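When the chains are to be summarized together, one simple approach is to stack their samples row-wise and summarize the columns. The following is a rough base R sketch under that assumption, not a description of how the coda or dclone methods are implemented; the objects sims and pooled are hypothetical.

```r
# three hypothetical chains with 50 samples of 2 variables each
sims <- replicate(3, cbind(a = rnorm(50), b = runif(50)), simplify = FALSE)

# stack the chains row-wise into a single matrix
pooled <- do.call(rbind, sims)
dim(pooled)

# column-wise summaries of the pooled sample
post_mean <- colMeans(pooled)
post_quant <- apply(pooled, 2, quantile, probs = c(0.025, 0.5, 0.975))
post_mean
post_quant
```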