R's Basic Data Types
A brief introduction to R's basic data types.
Atomic Classes

character

numeric (real numbers)
# Inf represents infinity
> 1/0
[1] Inf
> 1 / Inf
[1] 0

integer

complex (complex numbers)

logical (TRUE/FALSE)
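The class of a value can be checked with class(). The snippet below is a minimal sketch of the five atomic classes listed above; the variable names are illustrative and do not come from the original text.

```r
# Illustrative values, one per atomic class (names are hypothetical)
ch  <- "hello"    # character
nu  <- 3.14       # numeric
int <- 3L         # integer (the L suffix requests an integer)
cx  <- 1 + 2i     # complex
lg  <- TRUE       # logical

# Collect the class of each value into one character vector
classes <- c(class(ch), class(nu), class(int), class(cx), class(lg))
print(classes)
```

Note that a plain number such as 3 is numeric; the L suffix is needed to get an integer.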
Vector and List
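A vector contains elements of a single type, while a list can contain elements of different types. Explicit coercion functions such as as.character() and as.logical() convert a vector to another type: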
> x = 0:3
> x
[1] 0 1 2 3
> as.character(x)
[1] "0" "1" "2" "3"
> as.logical(x)
[1] FALSE  TRUE  TRUE  TRUE
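Because a vector holds a single type, mixing types inside c() triggers implicit coercion, whereas a list keeps each element's own type. A small sketch of the difference, with hypothetical variable names:

```r
# Mixing types in c() coerces every element to the most general type
v <- c(1, "a", TRUE)
class(v)                  # "character"

# A list keeps each element's own type
l <- list(1, "a", TRUE)
element_classes <- sapply(l, class)
print(element_classes)    # "numeric" "character" "logical"
```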
Matrices
Matrices are similar to two-dimensional arrays in Python (for example, NumPy arrays).
Every element of a matrix must be of the same class.
# matrix method
> m <- matrix(1:6, nrow = 2, ncol = 3)
> m
     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6
# dim() method
> m = 1:10
> m
 [1]  1  2  3  4  5  6  7  8  9 10
> dim(m) = c(2,5) # assign to dim() to set the dimensions
> m
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    3    5    7    9
[2,]    2    4    6    8   10
# cbind & rbind
> x = 1:3
> y = 10:12
> cbind(x, y)
     x  y
[1,] 1 10
[2,] 2 11
[3,] 3 12
> rbind(x,y)
  [,1] [,2] [,3]
x    1    2    3
y   10   11   12
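The outputs above show that matrix() and dim() fill values column by column. As a hedged sketch, the byrow argument of matrix() can be used when row-wise filling is wanted; the variable names are illustrative.

```r
# Default: values fill the matrix column by column
by_col <- matrix(1:6, nrow = 2, ncol = 3)

# byrow = TRUE fills the matrix row by row instead
by_row <- matrix(1:6, nrow = 2, ncol = 3, byrow = TRUE)

by_col[1, ]   # first row: 1 3 5
by_row[1, ]   # first row: 1 2 3
dim(by_row)   # 2 3
```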
Factors
Factors are used to represent categorical data. Factors can be unordered or ordered. One can think of a factor as an integer vector where each integer has a label.
Factors are self-describing; having a variable with the values “Male” and “Female” is better than one with the values 1 and 2.
# unordered factors
> x <- factor(c("yes", "yes", "no", "yes", "no"))
> x
[1] yes yes no yes no
Levels: no yes # levels ordered alphabetically by default
> table(x)
x
 no yes
  2   3
> unclass(x)
[1] 2 2 1 2 1
attr(,"levels")
[1] "no" "yes"
The order of the levels can be set using the levels argument to factor(). This can be important in linear modeling because the first level is used as the baseline level.
> x <- factor(c("yes", "yes", "no", "yes", "no"), levels = c("yes", "no"))
> x
[1] yes yes no yes no
Levels: yes no # ordered as set
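To see how the levels argument changes both the baseline level and the underlying integer codes, the two factors can be compared side by side. This is a sketch with hypothetical variable names, not part of the original session.

```r
answers <- c("yes", "yes", "no", "yes", "no")

default_order <- factor(answers)                           # levels: no, yes
custom_order  <- factor(answers, levels = c("yes", "no"))  # levels: yes, no

# The first level is the baseline
levels(default_order)[1]    # "no"
levels(custom_order)[1]     # "yes"

# The underlying integer codes follow the level order
as.integer(default_order)   # 2 2 1 2 1
as.integer(custom_order)    # 1 1 2 1 2
```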
Missing Values
Missing values are denoted by NA or NaN:

NaN: not a number. Used for undefined mathematical operations.

NA: not available. Used for every other kind of missing value.
NA values also have a class, so there are integer NA, character NA, and so on.
# NaN ⊂ NA: a NaN value is also NA, but an NA value is not NaN
> x <- c(1, 2, NaN, NA, 4)
> is.na(x)
[1] FALSE FALSE  TRUE  TRUE FALSE
> is.nan(x)
[1] FALSE FALSE  TRUE FALSE FALSE
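The typed NA values mentioned above are available as built-in constants. The sketch below checks their classes and also shows one common way to skip missing values, the na.rm argument of sum(), which the original text does not cover; variable names are illustrative.

```r
# Typed NA constants built into R
class(NA)              # "logical" (the default NA)
class(NA_integer_)     # "integer"
class(NA_real_)        # "numeric"
class(NA_character_)   # "character"

# NaN arises from undefined arithmetic
undefined <- 0 / 0
is.nan(undefined)      # TRUE
is.na(undefined)       # TRUE: NaN also counts as NA

# na.rm = TRUE drops both NA and NaN before summing
x <- c(1, 2, NaN, NA, 4)
total <- sum(x, na.rm = TRUE)   # 7
```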
Data Frames
Data frames are used to store tabular data.
Unlike matrices, data frames can store a different class of object in each column (just like lists).
> x <- data.frame(foo = 1:4, bar = c(T, T, F, F))
> x
  foo   bar
1   1  TRUE
2   2  TRUE
3   3 FALSE
4   4 FALSE
> nrow(x)
[1] 4
> ncol(x)
[1] 2
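A quick way to confirm that each column keeps its own class is to apply class() to every column. The following is a minimal sketch; df and col_classes are hypothetical names.

```r
df <- data.frame(foo = 1:4, bar = c(TRUE, TRUE, FALSE, FALSE))

# Each column keeps its own class
col_classes <- sapply(df, class)
print(col_classes)    # foo is "integer", bar is "logical"

# A data frame is a list of equal-length columns
is.list(df)           # TRUE
dim(df)               # 4 2 (rows, columns)
```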
Names Attribute
R objects can also have names, which is very useful for writing readable code and self-describing objects.
# vector
> x <- 1:3
> x
[1] 1 2 3
> names(x) <- c('name_a', 'name_b', 'name_c')
> x
name_a name_b name_c
     1      2      3
> names(x)
[1] "name_a" "name_b" "name_c"
# list
> x <- list(a = 1, b = 2, c = 3)
> x
$a
[1] 1
$b
[1] 2
$c
[1] 3
# matrices
> m <- matrix(1:4, nrow = 2, ncol = 2)
> dimnames(m) <- list(c("a", "b"), c("c", "d"))
> m
  c d
a 1 3
b 2 4
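Once names are set, elements can be looked up by name rather than by position, which is where the readability benefit comes from. A short sketch that reuses the objects defined above:

```r
# Named vector: look up an element by name
x <- 1:3
names(x) <- c("name_a", "name_b", "name_c")
x[["name_b"]]     # 2

# Named list: the $ operator accesses an element by name
l <- list(a = 1, b = 2, c = 3)
l$b               # 2

# Matrix with dimnames: index by row name and column name
m <- matrix(1:4, nrow = 2, ncol = 2)
dimnames(m) <- list(c("a", "b"), c("c", "d"))
m["b", "d"]       # 4
```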