
Day 4 - R Basics

LunaprimRose 2020.03.16

Introduction to R

R is a language and environment for statistical computing and graphics. It is a GNU project which is similar to the S language and environment which was developed at Bell Laboratories (formerly AT&T, now Lucent Technologies) by John Chambers and colleagues. R can be considered as a different implementation of S. There are some important differences, but much code written for S runs unaltered under R.

R provides a wide variety of statistical (linear and nonlinear modelling, classical statistical tests, time-series analysis, classification, clustering, …) and graphical techniques, and is highly extensible. The S language is often the vehicle of choice for research in statistical methodology, and R provides an Open Source route to participation in that activity.

One of R’s strengths is the ease with which well-designed publication-quality plots can be produced, including mathematical symbols and formulae where needed. Great care has been taken over the defaults for the minor design choices in graphics, but the user retains full control.

R is available as Free Software under the terms of the Free Software Foundation’s GNU General Public License in source code form. It compiles and runs on a wide variety of UNIX platforms and similar systems (including FreeBSD and Linux), Windows and macOS.

The R environment

R is an integrated suite of software facilities for data manipulation, calculation and graphical display.

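It includes

  • an effective data handling and storage facility,
  • a suite of operators for calculations on arrays, in particular matrices,
  • a large, coherent, integrated collection of intermediate tools for data analysis,
  • graphical facilities for data analysis and display either on-screen or on hardcopy, and
  • a well-developed, simple and effective programming language which includes conditionals, loops, user-defined recursive functions and input and output facilities.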

The term “environment” is intended to characterize it as a fully planned and coherent system, rather than an incremental accretion of very specific and inflexible tools, as is frequently the case with other data analysis software.

R, like S, is designed around a true computer language, and it allows users to add additional functionality by defining new functions. Much of the system is itself written in the R dialect of S, which makes it easy for users to follow the algorithmic choices made. For computationally-intensive tasks, C, C++ and Fortran code can be linked and called at run time. Advanced users can write C code to manipulate R objects directly.
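As a minimal sketch of what "defining new functions" looks like in practice, the example below adds a small helper that base R does not provide under this name. The function name `std_error` and the sample data are illustrative assumptions, not part of the original notes.

```r
# A user-defined function: standard error of the mean.
# The name std_error is hypothetical; it is not a base R function.
std_error <- function(x, na.rm = FALSE) {
  if (na.rm) {
    x <- x[!is.na(x)]   # drop missing values on request
  }
  sd(x) / sqrt(length(x))
}

# Once defined, it is called like any built-in function
values <- c(2, 4, 4, 4, 5, 5, 7, 9)
std_error(values)
std_error(c(values, NA), na.rm = TRUE)
```

Typing the name of a function without parentheses (for example `std_error` or `sd`) prints its source, which is one way to follow the algorithmic choices mentioned above.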

Many users think of R as a statistics system. We prefer to think of it as an environment within which statistical techniques are implemented. R can be extended (easily) via packages. There are about eight packages supplied with the R distribution and many more are available through the CRAN family of Internet sites covering a very wide range of modern statistics.
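The sketch below shows one way to see which packages ship with R itself and how an additional package would typically be installed from CRAN. The package name `ggplot2` is only an example chosen for illustration; the install step is commented out because it needs network access.

```r
# Packages that are part of the R distribution itself have priority "base"
base_pkgs <- rownames(installed.packages(priority = "base"))
print(base_pkgs)

# Installing an extra package from CRAN (example package, needs network access):
# install.packages("ggplot2")

# Attaching an installed package so its functions can be used directly;
# stats is already attached by default, so this call is harmless
library(stats)
```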

R has its own LaTeX-like documentation format, which is used to supply comprehensive documentation, both on-line in a number of formats and in hardcopy.

Download R

The Comprehensive R Archive Network (CRAN) is available at the following URLs; please choose a location close to you.

The Comprehensive R Archive Network

CRAN Mirrors


Introduction to RStudio

RStudio is an integrated development environment (IDE) for R, a programming language for statistical computing and graphics. It is available in two formats: RStudio Desktop is a regular desktop application, while RStudio Server runs on a remote server and allows accessing RStudio using a web browser.

RStudio IDE Features

RStudio is the premier integrated development environment for R. It is available in open source and commercial editions on the desktop (Windows, Mac, and Linux) and from a web browser to a Linux server running RStudio Server or RStudio Server Pro.

Download RStudio


  1. Set the font size
    • RStudio
    • Tools
    • Global Options
    • Appearance
  2. Set the CRAN mirror
    • RStudio
    • Tools
    • Global Options
    • Packages
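The mirror can also be set from the R console instead of the RStudio menus. This is a minimal sketch, assuming `https://cloud.r-project.org` as the example mirror; any URL from the CRAN Mirrors list could be used in its place.

```r
# Point install.packages() at a specific CRAN mirror for this session.
# The URL below is an example choice, not a requirement.
options(repos = c(CRAN = "https://cloud.r-project.org"))

# Check which repository is currently configured
getOption("repos")
```

A setting made with `options()` lasts only for the current session; placing the same line in an `.Rprofile` file is the usual way to apply it at every startup.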

Basic operation

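R projects: managing multiple R working directories

When doing data analysis with R, keep each analysis problem in its own folder.

R stores every variable defined at the command line in the workspace.

When quitting R, you can choose whether to save the workspace.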
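The working directory and workspace can also be inspected from the console. The sketch below is illustrative only: the variable names and the temporary file name `example_workspace.RData` are assumptions made for the example.

```r
# Show the current working directory (the folder of the active project)
getwd()

# Variables defined at the prompt are stored in the workspace
x <- 42
y <- c(1, 2, 3)
ls()   # names of the objects currently in the workspace

# Save the workspace to a file; a temporary location is used here
# so the example does not touch any real project folder
ws_file <- file.path(tempdir(), "example_workspace.RData")
save.image(file = ws_file)

# Load the saved objects into a separate environment to inspect them
restored <- new.env()
load(ws_file, envir = restored)
ls(restored)

# Quitting without saving the workspace would look like this:
# q(save = "no")
```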





> 1+1
[1] 2
> 1-1
[1] 0
> 1*1
[1] 1
> 1/1
[1] 1
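Beyond the four operations above, a few more operators are commonly used at the prompt. The following is a short sketch; the variable name `x` is just an example.

```r
2^3        # exponentiation: 8
7 %% 3     # remainder after division: 1
7 %/% 3    # integer division: 2
sqrt(16)   # square root: 4

x <- 10    # assign a value to a name with <-
x / 4      # use the stored value: 2.5
```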





Box plot
# Box plot of sepal length by species, using the built-in iris data set
boxplot(Sepal.Length ~ Species, data = iris, col = c('lightblue', 'lightyellow', 'lightpink'))
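A scatter plot of the same iris data can be drawn with `plot()`. This is a sketch under assumptions: the choice of columns (sepal length against petal length), the colors and the legend position are illustrative and do not come from the original notes.

```r
# One color per species; the palette is an arbitrary example
species_cols <- c('lightblue', 'orange', 'lightpink')

# Scatter plot of petal length against sepal length, colored by species
plot(iris$Sepal.Length, iris$Petal.Length,
     col = species_cols[as.integer(iris$Species)],
     pch = 19,
     xlab = 'Sepal length (cm)',
     ylab = 'Petal length (cm)')

# Legend mapping each color to its species name
legend('topleft', legend = levels(iris$Species),
       col = species_cols, pch = 19)
```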

