# Intro to R Tutorial

Open RStudio for a brief intro to R.

#### Variables

Variables can be of type character, integer, numeric, logical (TRUE, FALSE), factor, etc.

#### Objects

Save a new object with

`newObject <- 1`

Return the new object:

`newObject`

Determine the type of the variables in a vector, or the type of an object:

`class(newObject)`
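To see how `class()` reports the variable types listed above, here is a minimal sketch. The object names and values are made up for illustration.

```r
# Hypothetical example values, one per basic type
aCharacter <- "guts1"                            # character (text)
anInteger  <- 3L                                 # integer (the L suffix makes it an integer)
aNumeric   <- 3.5                                # numeric (decimal number)
aLogical   <- TRUE                               # logical
aFactor    <- factor(c("gut", "whole", "gut"))   # factor (categorical data)

class(aCharacter)   # "character"
class(anInteger)    # "integer"
class(aNumeric)     # "numeric"
class(aLogical)     # "logical"
class(aFactor)      # "factor"
levels(aFactor)     # "gut" "whole"
```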

Types of basic objects:

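| Dimensions | Homogeneous variables | Heterogeneous variables |
|------------|-----------------------|-------------------------|
| 1          | vector                | list                    |
| 2          | matrix (numeric)      | data frame              |

n dimensions: complex custom objects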
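As a rough illustration of the one- and two-dimensional object types in the overview above, the sketch below builds one of each. All object names and values are hypothetical.

```r
# Hypothetical objects, one for each basic object type
aVector <- c(1, 2, 3)                                   # 1 dimension, all values the same type
aList   <- list(1, "a", TRUE)                           # 1 dimension, mixed types allowed
aMatrix <- matrix(1:6, nrow = 2, ncol = 3)              # 2 dimensions, all values the same type
aDF     <- data.frame(id = c("a", "b"), n = c(1, 2))    # 2 dimensions, columns can differ in type

class(aVector)   # "numeric"
class(aList)     # "list"
class(aMatrix)   # "matrix" "array" in R 4.0 and later; "matrix" in older versions
class(aDF)       # "data.frame"
dim(aMatrix)     # 2 3
```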

#### Vectors

```
newVector <- c(1,2,3)      # to create a vector
newVector                  # to return the vector values
newVector[1]               # to access the first element of the vector
```

Certain functions can be called on vectors:

```
length(newVector)          # report the length of the vector
help(length)               # look up the instructions for the length function
max(newVector)             # largest value in the vector
mean(newVector)            # average of the values in the vector

newVector + 2              # Mathematical operations can also be performed on all elements of a vector
rm(newVector)              # remove the vector from the workspace
```

Typical syntax for executing R functions is:

`functionName(object, option = <value>, option2 = <value>, option3 = <value>)`

Documentation pages can show you the types of objects that can be taken as input, options available, and default settings.

To save the output of an R function, capture it in an object:

`newObjectName <- functionName(object, option = <value>, option2 = <value>, option3 = <value>)`
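A concrete case of this pattern, using base R's `mean()` and `round()` with their documented `na.rm` and `digits` options. The `values` vector and the `avg` object are made up for illustration.

```r
values <- c(4.25, 8.5, NA, 2)        # hypothetical data with one missing value

mean(values)                         # NA, because the default setting is na.rm = FALSE
mean(values, na.rm = TRUE)           # 4.916667, the missing value is ignored

# Capture the output in an object, then reuse it in another function
avg <- mean(values, na.rm = TRUE)
round(avg, digits = 1)               # 4.9
```

Running `help(mean)` shows the same options and their defaults.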

#### Data frames

```
x <- c('a', 'b', 'c')
y <- c(1,2,3)
newDF <- data.frame(x, y)     # to create a data frame
newDF                         # to report data frame values
newDF$x          # To subset columns as vectors (by name)
newDF$y
newDF[,1]        # To subset columns as vectors (by position)
newDF[,2]
newDF[1,]        # To subset rows (each row is returned as a one-row data frame)
newDF[2,]
```
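One detail worth checking for yourself is what type of object each kind of subset returns. The sketch below uses a separate, hypothetical `exampleDF` so it does not interfere with `newDF`.

```r
exampleDF <- data.frame(x = c("a", "b", "c"), y = c(1, 2, 3))

class(exampleDF$y)       # "numeric": a column comes back as a vector
class(exampleDF[, 2])    # "numeric": same column, selected by position
class(exampleDF[2, ])    # "data.frame": a row keeps its mixed column types
exampleDF[2, "y"]        # 2: a single element, selected by row number and column name
```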

Functions that work on data frames:

```
dim(newDF)                                       # Number of rows and columns
colnames(newDF)                                  # Column names
rownames(newDF)                                  # Row names
rownames(newDF) <- c("gene1", "gene2", "gene3")  # Overwrite rownames
newDF
newDF[1,2]                                       # Element in row 1, column 2
newDF[1,2] <- 5                                  # Overwrite a single element
newDF
```

#### Getting data into R

First, make sure you're in the right directory:

```
getwd()                           # Get the working directory
setwd('/Users/Erin/mydirectory')  # Set a new working directory
```

Quick tip: You can also set the working directory under the Files tab (lower right panel). Under More there is the option to `Set As Working Directory`. You can save the resulting command in your code.

Now you can read a file into R:

```
df <- read.table("file.txt", header = TRUE, sep = "\t")  # Read a tab-separated file named file.txt; set header = FALSE if the file has no header row
```

Common pitfall: headers can't have spaces. Get rid of 'special characters'. Make sure there are no extra trailing rows or columns.

Exercise: Make a file in a directory on the server containing the following information and name it `RNAseq_stats.txt`.

```
Sample	Input	Mapped
guts1	36636027	35201820
guts2	24131701	23305661
guts3	18635372	17951602
N21	21315252	20365046
N22	23326031	22573853
N23	31648497	30711043
```

Exercise: Import the data into R using `read.table` and save the information as an object called `RNAstats`.

#### Getting data out of R

```
write.table(newDF, "file.txt")  # Saves a data frame or matrix in a .txt file.
write(newVector, "file.txt")    # Saves a vector in a .txt file.
```
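By default, `write.table` adds a column of row names and puts quotes around text, which can be surprising when the file is read back in. A minimal round-trip sketch with tab-separated output follows. The `exampleDF`, `outFile`, and `checkDF` names are hypothetical, and `tempfile()` is used so no real file gets overwritten.

```r
exampleDF <- data.frame(gene = c("gene1", "gene2"), count = c(10, 25))
outFile   <- tempfile(fileext = ".txt")   # temporary path, so no real file is overwritten

# Tab-separated, no quotes around text, no row-name column
write.table(exampleDF, outFile, sep = "\t", quote = FALSE, row.names = FALSE)

# Read it back with matching settings
checkDF <- read.table(outFile, header = TRUE, sep = "\t")
checkDF
dim(checkDF)        # 2 2
colnames(checkDF)   # "gene" "count"
```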

#### Plotting

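```
x <- c(1,2,3,4,5)
y <- c(10,22,15,2,20)

plot(x, y, col = "red")   # Scatter plot of y against x with red points
help(plot)                # Look up plotting options
help(par)                 # Look up graphical parameters
```

Exercise: Use the plot function to try to plot either the input or the mapped read counts in your `RNAstats` object.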

#### Getting plots out of R

```
pdf("filename.pdf")      # Open a PDF file as the output destination
plot(x, y, col = "red")  # Draw the plot into the PDF
dev.off()                # Close the PDF device so the file is written

help(pdf)                # Look up more options like setting the dimensions, resolution, etc.
```

Quick tip: You can also save plots by locating the Export pull down menu in the Plots panel (lower right).
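As an example of the options `help(pdf)` describes, the sketch below sets the page size with the `width` and `height` arguments (in inches). The `plotFile` path and the axis labels are placeholders for illustration.

```r
x <- c(1, 2, 3, 4, 5)
y <- c(10, 22, 15, 2, 20)

plotFile <- tempfile(fileext = ".pdf")    # hypothetical output path
pdf(plotFile, width = 5, height = 4)      # page size in inches
plot(x, y, col = "red", pch = 19,
     xlab = "x", ylab = "y", main = "Example plot")
dev.off()                                 # close the device to finish writing the file

file.exists(plotFile)                     # TRUE
```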

Here's the code from the tutorial we did in class:

```
newObject <- 1

newObject

class(newObject)

newVector <- c(1,2,3,4)
newVector
newVector[1]

# Functions
length(newVector)
help(length)
max(newVector)
mean(newVector)

newVector + 2
rm(newVector)
newVector                  # gives an error: the object was removed by rm() above
help(mean)

# Data frames
x <- c('a', 'b', 'c')
y <- c(1,2,3)
newDF <- data.frame(x,y)
newDF

newDF$x
newDF[1,2]
newDF[,1]
newDF[1,]

dim(newDF)
colnames(newDF)
rownames(newDF)
newDF

rownames(newDF) <- c("gene1", "gene2", "gene3")
newDF
newDF[1,2] <- 5

# import export
getwd()
setwd("~/Documents/RNA-seq_part2")
getwd()

x <- c(1,2,3,4,5)

y <- c(10,20,22,15,30)
plot(x,y, col = "red")

help(plot)
help(par)

pdf("plot.pdf")
plot(x,y, col = "red")
dev.off()
```