2. S-PLUS
• S-PLUS is a commercial implementation of the S programming language sold
by TIBCO Software Inc.
• It features object-oriented programming capabilities and advanced analytical
algorithms.
• S-PLUS is an object-oriented programming and statistical analysis language
built on S, which was developed primarily at AT&T research labs (Bell Laboratories).
Data in S-PLUS is aggregated into "objects", which are of different types.
3. S (programming language)
• S is a statistical programming language developed primarily by John Chambers and (in earlier versions) Rick
Becker and Allan Wilks of Bell Laboratories. The aim of the language, as expressed by John Chambers, is "to
turn ideas into software, quickly and faithfully."
• The two modern implementations of S are R, part of the GNU free software project, and S-PLUS.
• History
• "Old S": S is one of several statistical computing languages that were designed at Bell Laboratories, and first
took form between 1975–1976.[1] Up to that time, much of the statistical computing was done by directly
calling Fortran subroutines; however, S was designed to offer an alternate and more interactive approach.
Early design decisions that hold even today include interactive graphics devices (printers and character
terminals at the time), and providing easily accessible documentation for the functions.
• "New S": The New S Language[4] (1988 Blue Book) was published to introduce the new features, such as the
transition from macros to functions and how functions can be passed to other functions. Many other changes
to the S language were to extend the concept of "objects", and to make the syntax more consistent (and
strict). However, many users found the transition to New S difficult, since their macros needed to be
rewritten. Many other changes to S took hold, such as the use of X11 and PostScript graphics devices,
rewriting many internal functions from Fortran to C, and the use of double precision (only) arithmetic. The
New S language is very similar to that used in modern versions of S-PLUS and R.
• S4: Version 4 of S, often abbreviated S4, provides advanced object-oriented features. S4 classes differ
markedly from S3 classes.
4. Historical timeline
• 1988: S-PLUS is first produced by a Seattle-based start-up company called Statistical Sciences, Inc. The
founder and sole owner is R. Douglas Martin, professor of statistics at the University of Washington, Seattle.
• 1993: Statistical Sciences acquires the exclusive license to distribute S and merges with MathSoft, becoming
the firm's Data Analysis Products Division (DAPD).
• 1995: S-PLUS 3.3 for Windows 95/NT. Matrix library, command history, Trellis graphics.
• 1996: S-PLUS 3.4 for UNIX. Trellis graphics, nlme library, hexagonal binning, cluster methods.
• 1997: S-PLUS 4 for Windows. New GUI, integration with Excel, editable graphics.
• 1998: S-PLUS 4.5 for Windows. Scatterplot brushing, create S-PLUS graphs from within Excel & SPSS.
• 1998: S-PLUS is available for Linux & Solaris.
• 1999: S-PLUS 5 for Solaris, Linux, HP-UX, IBM AIX, SGI Irix, and DEC Alpha. S-PLUS 2000 for Windows. nlme
3.3, quality control charting, new commands for data manipulation.
• 2000: S-PLUS 6 for Linux/UNIX. Java-based GUI, Graphlets, survival5, missing data library, robust library.
• 2001: MathSoft sells its Cambridge-based Engineering and Education Products Division (EEPD), changes
name to Insightful Corporation, and moves headquarters to Seattle. This move is basically an "Undo" of the
previous merger between MathSoft and Statistical Sciences, Inc.
5. Historical timeline
• 2001: S-PLUS Analytic Server 2.0. S-PLUS 6 for Windows (Excel integration, C++ classes/libraries for
connectivity, Graphlets, S version 4, missing data library, robust library).
• 2002: StatServer 6. Student edition of S-PLUS now free.
• 2003: S-PLUS 6.2 New reporting, database integration, improved Graphlets, ported to AIX, libraries for
correlated data, Bayesian methods, multivariate regressions.
• 2004: Insightful purchases the S language from Lucent Technologies for $2 million.
• 2004: S+ArrayAnalyzer 2.0 released.
• 2005: S-PLUS 7.0 released. BigData library for working with larger-than-memory data sets, S-PLUS
Workbench (Eclipse development tool). Insightful Miner 7.0 released.
• 2007: S-PLUS 8 released. New package system, language extensions for R package compatibility, Workbench
debugger.
• 2008: TIBCO acquires Insightful Corporation [1
6. Data objects
• Data in S-PLUS is aggregated into "objects", which are of different types. The most common
object types are:
• vectors: ordered strings of data values (having no row or column orientation). All data
must be of the same mode (e.g., all numeric, all character, all logical, etc.).
• matrices: rectangular arrays, with rows and columns, of like-mode data.
• data frames: rectangular arrays similar to SAS data sets. Columns and rows have
identifying names. Columns (variables) may be of different modes.
• lists: ordered collections of objects of possibly different types and modes. For
example, a data frame, several matrices, and a vector or two can be joined to form a
list.
• There are many other kinds of objects, including time series objects, array objects, and
factor objects.
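The four most common object types listed above can be sketched as follows. This is a minimal illustration in R-compatible S syntax; the object names (v, m, df, lst) and the sample values are assumptions made for the example, not part of the original slides.

```r
# A vector: every element has the same mode.
v <- c(3, 4, 1, 6)

# A matrix: like-mode data arranged in rows and columns.
m <- matrix(1:6, nrow = 2)

# A data frame: named columns that may have different modes.
df <- data.frame(name = c("a", "b"), score = c(1.5, 2.5))

# A list: an ordered collection of objects of possibly different types.
lst <- list(v = v, m = m, df = df)

mode(v)      # "numeric"
dim(m)       # 2 3
names(df)    # "name" "score"
length(lst)  # 3
```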
7. Installation
• System requirements
• A PC with Pentium processor
• 32 MB internal memory (64 MB recommended)
• 75 MB of free hard disk space
• CD-ROM drive
• Microsoft Windows 95/98/Me, Windows NT (4.0 or higher) or Windows 2000
8. INTRODUCTION
• Simple arithmetic: don't forget a RETURN after each statement. The first > on each line is the S-PLUS prompt.
• 3+5
• 4*5
• 8*(3+6)
• The colon operator is used to obtain sequences
• 1:7
• Vectors can be made with the function c() (c for 'combine')
• c(3, 4, 1, 6)
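• MISTAKES
• 32 1+1 (syntax error: an operator is missing between 32 and 1)
• 0.5(2+4) (error: multiplication must be written explicitly, 0.5*(2+4))
• SPACES AND RETURN
• 31+
4 (ok: the statement continues on the next line and gives 35)
• 31 + 4 ok
• 3 1 +4 Error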
• UPPER AND LOWER CASE: case sensitive
• sex <- 'female' is different from SEX <- 'female'
9. INTRODUCTION
• COMMENTS: start with # (or ##)
• CONTINUATION: the continuation prompt is +
• 3 *
+ 6
• c(3, 4, 1, 6
+ )
•?log
•sqrt(x)
•exp(x)
• ONLINE HELP: on-line help is available on all functions, operators and data sets
• help('c')
• help(seq)
• help()
• Types of data objects include vectors, matrices, lists, data frames, arrays, categories, time series and functions.
• Vectors have the attributes length and mode
• mode(c(-2.0, 3.1, 4.7, 6.9))
• length(c(-2.0, 3.1, 4.7, 6.9))
• mode(c(T, T, F, T, T, F, F, F, F))
• mode(c('orc', 'troll', 'gnome', 'elf', 'hobbit'))
• mode(c('hello', 1, T))
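The mode() and length() calls above can be collected into one short sketch. It assumes R-compatible syntax and uses TRUE/FALSE instead of T/F; the variable names (nums, flags, mixed) are illustrative. The last case shows that mixing modes in c() coerces everything to a single mode, here character.

```r
nums  <- c(-2.0, 3.1, 4.7, 6.9)
flags <- c(TRUE, TRUE, FALSE)
mixed <- c("hello", 1, TRUE)   # mixed input is coerced to one mode

mode(nums)    # "numeric"
length(nums)  # 4
mode(flags)   # "logical"
mode(mixed)   # "character"
```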
10.
11. ASSIGNMENT
• the assignment operator <-
• x <- c(4, 2, 8, 7), then x+10; with a second vector y of the same length: x+y, x*y, x/y
• print(x) or x displays x; length(x) gives its length
• We can overwrite with x<-199
• To remove x, write: rm(x)
• Be careful with minus signs when assigning objects
• x <-- 4 or x <- -4
• y <- 1:10
• Y1<-seq(1,5,1)
• Everything gets stored; list the stored objects with:
• objects()
• OPERATORS:+ - * / ^ (exponentiation)
• 2^3*4
• 2^(3*4)
• x <- 5
• 1:(x-1)
• 1:x-1
• 1:3^2
• Arithmetic works element by element: with x <- c(2, 1, 4, 5), try x+1, 2*x, x^2, sin(x)
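The precedence examples above are easy to misread, so here is a small sketch of what each expression evaluates to. It assumes R-compatible syntax; the result names (a, b, s, p1, p2) are made up for the illustration.

```r
x <- 5

a  <- 1:(x - 1)   # 1 2 3 4
b  <- 1:x - 1     # 0 1 2 3 4   (the colon is evaluated before the minus)
s  <- 1:3^2       # 1 2 3 4 5 6 7 8 9   (^ is evaluated before the colon)
p1 <- 2^3 * 4     # 32
p2 <- 2^(3 * 4)   # 4096
```

When in doubt, add parentheses: they cost nothing and make the intent explicit.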
12. ASSIGNMENT
• EX: y <- c(1, 2, 1, 2)
• x+y, x*y , x/y , z <- c(2, 4, 6), x + z
• y <- 6:10, x <- 2:9, z <- c(x, y)
• MATRIX DATA OBJECTS
• A matrix in S-PLUS is a two-way array
• x <- matrix(1:12, nrow=3, ncol=4)
• dim(x) shows the number of rows and columns
• length(x) shows the total number of elements
• mode(x) shows the data type
• S-PLUS is clever enough not to need both the number of rows and the number of columns
• matrix(1:12, nrow=3)
• Notice that the matrix has been filled down the columns first. This is the DEFAULT.
• matrix(1:12, nrow=3, byrow=T)
• DATA FRAME OBJECTS: A data frame consists of rows and columns of data, just like a matrix object, except that the
COLUMNS can be of different modes.
• firstnames <- c('Kate', 'Linda', 'Edgar', 'Chelsea')
• lastnames <- c('Beatty', 'Colpoys', 'Gonzales', 'Miller')
• height <- c(165, 170, 180, 168)
• class.height <- data.frame(firstnames,lastnames,height)
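The difference between the default column-wise fill and byrow, and the mixed-mode columns of a data frame, can be checked with a sketch like the one below. It assumes R-compatible syntax; by_col and by_row are illustrative names, while the class.height data comes from the slide.

```r
# Default: the matrix is filled down the columns.
by_col <- matrix(1:12, nrow = 3)
# byrow = TRUE: the matrix is filled across the rows instead.
by_row <- matrix(1:12, nrow = 3, byrow = TRUE)

by_col[1, ]   # 1 4 7 10
by_row[1, ]   # 1 2 3 4

firstnames <- c("Kate", "Linda", "Edgar", "Chelsea")
lastnames  <- c("Beatty", "Colpoys", "Gonzales", "Miller")
height     <- c(165, 170, 180, 168)
class.height <- data.frame(firstnames, lastnames, height)

dim(class.height)           # 4 3
mode(class.height$height)   # "numeric"
```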
13. ASSIGNMENT
• seq1 <- seq(1:6)
mat1 <- matrix(seq1, 2)
mat1
• mat2 <- matrix(seq1, 2, byrow = T)
• mat5 <- matrix(rnorm(20), 4)
• #appending v1 to mat5
v1 <- c(1, 1, 2, 2)
mat6 <- cbind(mat5, v1)
• v2 <- c(1:6)
mat7 <- rbind(mat6, v2)
• #determining the dimensions of mat7
dim(mat7)
• #indexing syntax: matrix_name[row, col]
mat7[1, 6]
• #to access an entire row leave the column number blank
mat7[1, ]
• #to access an entire column leave the row number blank
mat7[, 6]
• #creating mat8 and mat9
mat8 <- matrix(1:6, 2)
mat9 <- matrix(c(rep(1, 3), rep(2, 3)), 2, byrow = T)
• #addition
mat9 + mat8
mat9 + 3
• #subtraction
mat8 - mat9
• #inverse (the matrix must be square)
solve(mat8[, 2:3])
• #transpose
t(mat9)
•cbind(B, C)
•D = matrix(
+ c(6, 2),
+ nrow=1,
+ ncol=2)
•Deconstruction: We can deconstruct a matrix by
applying the c function, which combines all column
vectors into one.
•c(B)
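The matrix operations on this slide can be tried end to end with the sketch below. It assumes R-compatible syntax and reuses mat8 and mat9 from the slide; the result names (total, sq, inv, check, wide, tall, flat) are illustrative assumptions.

```r
mat8 <- matrix(1:6, 2)
mat9 <- matrix(c(rep(1, 3), rep(2, 3)), 2, byrow = TRUE)

total <- mat8 + mat9      # element-wise addition of same-sized matrices

sq    <- mat8[, 2:3]      # a square 2 x 2 block, needed for inversion
inv   <- solve(sq)        # matrix inverse
check <- sq %*% inv       # should be (numerically) the identity matrix

wide <- cbind(mat8, mat9) # bind side by side: 2 x 6
tall <- rbind(mat8, mat9) # stack on top of each other: 4 x 3

flat <- c(mat8)           # deconstruct into one vector, column by column
```

Note that solve() only works on square, non-singular matrices, which is why a 2 x 2 block is extracted first.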
15. • use of the function data.frame()
• class.height
• LIST OBJECTS: They are even more general than data frames
• junklist <- list(x=firstnames,y=height,z=x)
• junklist
16. • ADDING NAMES: it is useful to add names to vectors
• x <- (1:5)^2
• x
• days <- c('Mon', 'Tue', 'Wed', 'Thu', 'Fri')
• names(x) <- days
• x
17. • REP AND SEQ
• The repeat function rep() and the sequence function seq() are useful for making
vectors and matrices.
• seq(from=0.3, to=1, by=0.1)
• Note that S-PLUS doesn't care about the order of the arguments of a function when they are given names,
as above.
• seq(to=1, by=0.1, from=0.3)
• But if you leave out the names, you must keep to the standard order: from, to, by
• seq(0.3, 1, 0.1)
• Argument names can also be abbreviated: seq(f=0.3, b=0.1, t=1)
• rep() makes a repeat of an input either a certain number of times, or to a certain length.
• rep(1, length=4)
• rep(1:4, l=12)
• rep(1:4, l=9)
• With length, rep() just fills in until it runs out of length. The times argument can also be a vector, giving a repeat count for each element.
• rep(c(4, 6, 5), t=c(1, 2, 3))
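A short sketch of the seq() and rep() behaviour described above, assuming R-compatible syntax. Full argument names (length.out, times) are used instead of the abbreviations on the slide, and the result names (s1, s2, r1, r2) are illustrative.

```r
# Named arguments can be given in any order.
s1 <- seq(from = 0.3, to = 1, by = 0.1)
s2 <- seq(to = 1, by = 0.1, from = 0.3)
identical(s1, s2)   # TRUE

# Fill to a given length: the input is recycled and cut off.
r1 <- rep(1:4, length.out = 9)             # 1 2 3 4 1 2 3 4 1

# A vector of times: one repeat count per element.
r2 <- rep(c(4, 6, 5), times = c(1, 2, 3))  # 4 6 6 5 5 5
```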
18. • SUBSETS OF DATA
• x <- c(5, 14, 8, 9, 10)
• To display a single element we use square brackets x[1]
• To display more than one element at a time, use the c() function within the [] characters.
• E.g. to get the 3rd and 5th elements use x[c(3, 5)]
• Negation is useful to display all elements except those that are specified. Try
• x[-4]
• x[-c(1, 3)]
• Logical expressions are a more sophisticated and VERY USEFUL tool.
• x[x>8]
• we can assign this to another vector of length 3 with
• biguns <- x[x>8]
• it is convenient to define a vector with logical entries as in
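The subsetting forms on this slide can be summarised in one sketch, assuming R-compatible syntax. The name big for the stored logical vector is an assumption made for the example; the slide does not name it.

```r
x <- c(5, 14, 8, 9, 10)

x[1]          # 5          (single element)
x[c(3, 5)]    # 8 10       (several elements)
x[-4]         # 5 14 8 10  (everything except the 4th)
x[-c(1, 3)]   # 14 9 10    (everything except the 1st and 3rd)

big    <- x > 8    # a logical vector: FALSE TRUE FALSE TRUE TRUE
biguns <- x[big]   # 14 9 10  (elements where big is TRUE)
```

Storing the logical vector under its own name lets the same condition be reused to subset other vectors of the same length.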
19. • Data Frame Object
• Rows and columns of data, like a matrix, but the columns can be of different modes.
• Syntax: mydata <- data.frame(col1, col2, col3, ...)
• ls(): show all variables
• class(varName): show the class of varName
• remove(varName): remove a variable
• names(varObject): show the variables in varObject
• village$age = age: add (or replace) the column age in the data frame village
EX1:
patientID<-c(1,2,3,4)
age<-c(25,34,38,52)
diabetes<-c("Type1","Type2","Type1","Type1")
status<-c("Poor","Improved","Excellent","Poor")
patientData<-data.frame(patientID,age,diabetes,status)
print(patientData)
EX2:
patientData[1:3] # the first three columns
patientData[c("status","age")] # columns selected by name
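The column selections in EX2 can be extended to row selection with a logical condition. The sketch below assumes R-compatible syntax and rebuilds patientData from EX1; the result names (first_three, by_name, older) and the age threshold of 30 are illustrative assumptions.

```r
patientData <- data.frame(
  patientID = c(1, 2, 3, 4),
  age       = c(25, 34, 38, 52),
  diabetes  = c("Type1", "Type2", "Type1", "Type1"),
  status    = c("Poor", "Improved", "Excellent", "Poor")
)

first_three <- patientData[1:3]                 # columns by position
by_name     <- patientData[c("status", "age")]  # columns by name
ages        <- patientData$age                  # one column as a vector

# Rows are selected before the comma, columns after it.
older <- patientData[patientData$age > 30, ]

names(first_three)  # "patientID" "age" "diabetes"
nrow(older)         # 3
```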
20. • Data Frame Object
• attach(): makes the variables of a data frame available by name in the workspace
• plot(age, height): draws a scatterplot of height against age
• List
• x<-list(name='Chan chav',sex='M',salary=200,age=25)
• y<-list(name='Tav minsour',sex='F',salary=100,age=20)
• z<-c(x,y)
• z
• #x[[2]]
• #x["name"]
• x[["name"]]
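The three ways of accessing list components differ in what they return. The sketch below assumes R-compatible syntax and reuses the lists x and y from the slide; the name sub is an illustrative assumption.

```r
x <- list(name = "Chan chav", sex = "M", salary = 200, age = 25)
y <- list(name = "Tav minsour", sex = "F", salary = 100, age = 20)

x[[2]]        # "M"          (the component itself, by position)
x[["name"]]   # "Chan chav"  (the component itself, by name)
x$salary      # 200          (shorthand for x[["salary"]])

sub <- x["name"]   # single brackets return a list of length 1
is.list(sub)       # TRUE

z <- c(x, y)       # concatenating two lists gives one longer list
length(z)          # 8
```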
21. Control Structures
• x == y "x is equal to y"
• x != y "x is not equal to y"
• x > y "x is greater than y"
• x < y "x is less than y"
• x <= y "x is less than or equal to y"
• x >= y "x is greater than or equal to y"
• if(1==0){
print(1)
} else {
print(2)}
• ifelse statements
• Syntax:
• ifelse(test,true_value,false_value)
• Ex:
x<-1:10
ifelse(x<5 | x>8,x,0)
•EX:
score<-10
if(score<5)
{
print("FAIL");
} else
{
print("PASS");
}
•EX2:
score<-1;
ifelse(score<5,"FAIL","PASS");
•EX3:
x <- 1:10
ifelse(x<5 | x>8, x, 0)
•EX4:
x <- 1
while(x < 5) {x <- x+1; print(x);}
•next can skip one step of the loop.
•break will end the loop abruptly.
•which() function gives the TRUE indices
of a logical object, allowing for array
indices.
•which(letters=="a")
•EX5:
x <- 1
while(x < 5) {x <- x+1; if (x == 3) break;
print(x); }
•EX6:
x <- 1
while(x < 5) {x <- x+1; if (x == 3) next;
print(x);}
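The difference between break and next, and the vectorised behaviour of ifelse(), can be made visible by collecting values instead of printing them. This is a sketch in R-compatible syntax; the collector names (kept, seen_break, seen_next) are assumptions made for the example.

```r
# ifelse() tests every element and returns a vector of the same length.
x <- 1:10
kept <- ifelse(x < 5 | x > 8, x, 0)   # 1 2 3 4 0 0 0 0 9 10

# break leaves the loop entirely when i reaches 3.
seen_break <- c()
i <- 1
while (i < 5) {
  i <- i + 1
  if (i == 3) break
  seen_break <- c(seen_break, i)
}
seen_break   # 2

# next skips only the current step, then the loop carries on.
seen_next <- c()
i <- 1
while (i < 5) {
  i <- i + 1
  if (i == 3) next
  seen_next <- c(seen_next, i)
}
seen_next    # 2 4 5

# which() returns the positions where a logical vector is TRUE.
which(letters == "a")   # 1
```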
22. Control Structures
• Loops
• for (k in 1:5){
print(k)
}
• g <- 0
while (g < 1)
{
g <- rnorm(1)
cat(g, "\n")
}
• repeat is similar to the while and for loops:
it executes a block of commands
repeatedly until break.
• sum <- 1
repeat
{ sum <- sum + 2;
print(sum);
if (sum > 11)
break; }
• The break statement can be used to terminate the loop abruptly
•EX:
samples <- c(rep(1:10))
samples
•EX1:
x<-0;
while (x < 10) {
x<- x+4;
print (x);
}
EX2:
x<-0;
while (x < 10)
{
x <- x + 4;
print (x);
if (x == 8)
{
break;
}
}
•EX3:
samples<-c(1:10)
for (thissample in samples)
{
str <- paste(thissample,"is current sample",sep=" ")
print(str)
}
EX4:
end <- length(samples)
begin <- end - 2
for (thissample in begin:end)
{
str <- paste(thissample,"is current sample",sep=" ")
print(str)
}
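The three loop forms on this slide can be compared in one sketch, assuming R-compatible syntax. The names total, last_three and msgs are illustrative, and the results are stored rather than printed so they can be inspected afterwards.

```r
# repeat: runs until break is reached.
total <- 1
repeat {
  total <- total + 2
  if (total > 11) break
}
total   # 13

# while with an early exit: == compares, a single = would not.
x <- 0
while (x < 10) {
  x <- x + 4
  if (x == 8) break
}
x       # 8

# for: loop over the last three sample numbers.
samples <- 1:10
last_three <- (length(samples) - 2):length(samples)   # 8 9 10
msgs <- character(0)
for (thissample in last_three) {
  msgs <- c(msgs, paste(thissample, "is current sample", sep = " "))
}
msgs[1]   # "8 is current sample"
```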
23. array
• R Array:
• Syntax:
• array(data,dim)
• data: the vector used to fill the array
• dim: the size of each dimension (e.g. number of rows and columns)
• EX:
• x <- array(1:6,c(2,3))
• x
• Ex:
• x <- 1:64 # declare a vector (1 to 64)
• dim(x) <- c(2,4,8) # convert vector x to an array with 2 rows,
• # 4 columns and 8 layers
• is.array(x) # test whether x is an array
• x # show the elements of the array
EX: show all elements in row 1 of the array
x[1,,]
EX: x[2,2,] # show the elements in row 2, column 2 of every layer
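The three-dimensional indexing above can be verified with a short sketch, assuming R-compatible syntax. Arrays are filled with the first index varying fastest, which explains the values noted in the comments; the names row1 and cell are illustrative.

```r
x <- 1:64
dim(x) <- c(2, 4, 8)   # 2 rows, 4 columns, 8 layers

is.array(x)            # TRUE

row1 <- x[1, , ]       # everything in row 1: a 4 x 8 matrix
dim(row1)              # 4 8

cell <- x[2, 2, ]      # row 2, column 2 of every layer: 8 values
length(cell)           # 8

x[2, 2, 1]             # 4  (rows fill first, then columns)
x[1, 1, 2]             # 9  (the second layer starts after 2 * 4 = 8 values)
```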