R part II

Vectors with regularly spaced numbers
> 1:10
 [1]  1  2  3  4  5  6  7  8  9 10
> seq(1,10)
 [1]  1  2  3  4  5  6  7  8  9 10
> seq(1,10,2)
[1] 1 3 5 7 9

• We have used both the “:” operator and the seq command.
• Note the last command, where the third value “2” is the
step; it is the “by” argument of the seq command.
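
A minimal sketch of the same calls with the arguments named. The names from, to and by are seq's own argument names; the variables a, b and s are only illustrative.

```r
# The ":" operator and seq() with the default step give the same values.
a <- 1:10
b <- seq(from = 1, to = 10)
print(all(a == b))

# Naming the step makes the third positional argument explicit.
s <- seq(from = 1, to = 10, by = 2)
print(s)
```

Naming the arguments is optional here, but it avoids confusing the step with the length when a call has three values.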

Try some seq (sequence) commands ….
> seq(0,1, length=11)
 [1] 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
> seq(4,10,by=0.5)
 [1]  4.0  4.5  5.0  5.5  6.0  6.5  7.0  7.5  8.0  8.5  9.0  9.5 10.0
> seq(4,10,0.5)
 [1]  4.0  4.5  5.0  5.5  6.0  6.5  7.0  7.5  8.0  8.5  9.0  9.5 10.0

> seq(2,8,0.3)
 [1] 2.0 2.3 2.6 2.9 3.2 3.5 3.8 4.1 4.4 4.7 5.0 5.3 5.6 5.9 6.2 6.5 6.8 7.1 7.4 7.7
[21] 8.0
> seq.int(2,8,0.3)
 [1] 2.0 2.3 2.6 2.9 3.2 3.5 3.8 4.1 4.4 4.7 5.0 5.3 5.6 5.9 6.2 6.5 6.8 7.1 7.4 7.7
[21] 8.0
> seq(2,8,length.out=10)
 [1] 2.000000 2.666667 3.333333 4.000000 4.666667 5.333333 6.000000 6.666667 7.333333
[10] 8.000000
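
The calls above use either a step (by) or a number of values (length.out, which can be abbreviated to length). A small sketch of how the two relate, assuming the step implied by length.out is (to - from) / (length.out - 1); the variable names are illustrative.

```r
from <- 2
to <- 8
n <- 10

# Step implied by asking for n equally spaced values.
step <- (to - from) / (n - 1)
print(step)

by_length <- seq(from, to, length.out = n)
by_hand   <- from + (0:(n - 1)) * step   # the same values built by hand
print(all.equal(by_length, by_hand))
```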

Try more seq commands ….
> seq(1,5,0.3)
[1] 1.0 1.3 1.6 1.9 2.2 2.5 2.8 3.1 3.4 3.7 4.0 4.3 4.6 4.9
> pi:6
[1] 3.141593 4.141593 5.141593
> 6:pi
[1] 6 5 4
> 10:-2
[1] 10 9 8 7 6 5 4 3 2 1 0 -1 -2
> -7:8
[1] -7 -6 -5 -4 -3 -2 -1 0 1 2 3 4 5 6 7 8

• You can generate a decreasing sequence.
• Try generating a sequence of negative numbers.
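
A short sketch of both points. It assumes only base R; down, mixed and neg are illustrative names.

```r
# A decreasing sequence with seq() needs a negative step.
down <- seq(10, -2, by = -1)
print(down)

# Unary minus binds tighter than ":", so -7:8 means (-7):8.
mixed <- -7:8
print(length(mixed))

# Parentheses are needed to negate a whole sequence.
neg <- -(1:5)
print(neg)
```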

Think and try …...
Generate a sequence of the following numbers:
0.0 0.2 0.4 0.6 0.8 1.0 1.0 2.0 3.0 4.0 5.0
6.0 7.0 8.0 9.0 10.0 100.0

Hints
• You have to use more than one sequence.
• But how will you include “100”?

Think and try ….. Possible Solution
Generate a sequence of the following numbers:
0.0 0.2 0.4 0.6 0.8 1.0 1.0 2.0 3.0 4.0 5.0
6.0 7.0 8.0 9.0 10.0 100.0

> seq(0, 1, length=6)
[1] 0.0 0.2 0.4 0.6 0.8 1.0
> seq.1<-seq(0, 1, length=6)
> c(seq.1,1:10,100)
 [1]   0.0   0.2   0.4   0.6   0.8   1.0   1.0   2.0   3.0   4.0   5.0   6.0   7.0
[14]   8.0   9.0  10.0 100.0
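
Another possible way to build the same vector, sketched with a step of 0.2 instead of a length of 6. The names part1, part2 and answer are illustrative.

```r
# First piece: 0 to 1 in steps of 0.2.
part1 <- seq(0, 1, by = 0.2)
# Second piece: the whole numbers 1 to 10.
part2 <- 1:10
# c() joins the pieces and appends the single value 100.
answer <- c(part1, part2, 100)
print(answer)
print(length(answer))
```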

Try the rep command (replicate elements)
> rep(1:5,2)
[1] 1 2 3 4 5 1 2 3 4 5
> rep(1:5, length=12)
[1] 1 2 3 4 5 1 2 3 4 5 1 2

> rep(c('one', 'two'), c(6, 3))
[1] "one" "one" "one" "one" "one" "one" "two" "two" "two"

Now enter the help(rep) command and try the examples

Try the rep command (replicate elements)
> rep(1:4, each = 2)
[1] 1 1 2 2 3 3 4 4
> rep(1:4, c(2,2,2,2))
[1] 1 1 2 2 3 3 4 4
> rep(5:8, c(2,1,2,1))
[1] 5 5 6 7 7 8
> rep(1:4, each = 2, len = 4)
[1] 1 1 2 2

Hope you are enjoying this as we go. Have you noted the
arguments “each” and “len” (short for “length.out”)? Now note
the “times” argument:
> rep(1:4, each = 2, times = 3)
[1] 1 1 2 2 3 3 4 4 1 1 2 2 3 3 4 4 1 1 2 2 3 3 4 4
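
A sketch of how “each” and “times” combine in the call above: each is applied first, then the whole result is repeated times times. The step1, step2, combined and short names are illustrative.

```r
# each repeats every element in place.
step1 <- rep(1:4, each = 2)
# times then repeats the whole block.
step2 <- rep(step1, times = 3)

combined <- rep(1:4, each = 2, times = 3)
print(identical(step2, combined))

# length.out cuts the result to the requested length.
short <- rep(1:4, each = 2, length.out = 4)
print(short)
```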

Try Histogram….
Suppose the top 25 ranked movies made the following gross receipts
for a Week:
29.6 28.2 19.6 13.7 13.0 7.8 3.4 2.0 1.9 1.0 0.7 0.4 0.4 0.3
0.3 0.3 0.3 0.3 0.2 0.2 0.2 0.1 0.1 0.1 0.1 0.1
Scan the data and then draw some histograms.
> x
 [1] 29.6 28.2 19.6 13.7 13.0  7.8  3.4  2.0  1.9  1.0  0.7  0.4  0.4  0.3  0.3  0.3
[17]  0.3  0.3  0.2  0.2  0.2  0.1  0.1  0.1  0.1  0.1
> receipts<-x
> hist(receipts)
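
The data-entry step is not shown above. A minimal sketch, assuming the values are typed in with c() rather than read interactively with scan(); h is an illustrative name, and plot = FALSE is used only so that the cells can be inspected as numbers instead of a picture.

```r
# Enter the receipts directly; scan() would read them from the console instead.
x <- c(29.6, 28.2, 19.6, 13.7, 13.0, 7.8, 3.4, 2.0, 1.9, 1.0, 0.7, 0.4, 0.4,
       0.3, 0.3, 0.3, 0.3, 0.3, 0.2, 0.2, 0.2, 0.1, 0.1, 0.1, 0.1, 0.1)
receipts <- x

# plot = FALSE returns the histogram object without drawing it.
h <- hist(receipts, plot = FALSE)
print(h$breaks)   # cell boundaries chosen by R
print(h$counts)   # number of values in each cell
```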

Now try better histograms ….
Add colour, change the colour, add a title for the histogram, then add
titles for the x-axis and the y-axis
> hist(receipts, col="red2")
> hist(receipts, col="red4")
> hist(receipts, col="red2", main="Gross Receipts for first 25 ranked movies")
> hist(receipts, col="red2", main="Gross Receipts for first 25 ranked movies",
+      xlab="receipts in a week")
> hist(receipts, col="red2", main="Gross Receipts for first 25 ranked movies",
+      xlab="receipts in a week", ylab="count of movies")

Your new histogram should look like this

Now set the range of the x-axis and the y-axis
> hist(receipts, col="red2", main="Gross Receipts for first 25 ranked movies",
+      xlab="receipts in a week", ylab="count of movies",
+      xlim=c(0.1,35), ylim=c(0,25))

Now more about histograms ….
Now try breaks=….
What is “breaks”?
> hist(receipts, breaks=3, col="red2", main="Gross Receipts for first 25 ranked movies",
+      xlab="receipts in a week", ylab="count of movies")

Remember:
A single number given as “breaks” is just a suggestion to R;
the break points are set to “pretty” values
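
A sketch that shows what R does with the suggestion. It assumes the receipts data given earlier and uses plot = FALSE to inspect the cells instead of drawing them; h3 and h10 are illustrative names.

```r
receipts <- c(29.6, 28.2, 19.6, 13.7, 13.0, 7.8, 3.4, 2.0, 1.9, 1.0, 0.7, 0.4,
              0.4, 0.3, 0.3, 0.3, 0.3, 0.3, 0.2, 0.2, 0.2, 0.1, 0.1, 0.1, 0.1,
              0.1)

h3  <- hist(receipts, breaks = 3,  plot = FALSE)
h10 <- hist(receipts, breaks = 10, plot = FALSE)

# Break points R actually chose, and how many cells that makes.
print(h3$breaks)
print(length(h3$counts))
print(h10$breaks)
print(length(h10$counts))
```

The number of cells printed need not equal the number requested, because R rounds the break points to convenient values.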

Now more about breaks ….
“breaks” can also specify the actual break points
in a histogram
> hist(receipts,breaks=c(0,1,2,3,4,5,10,20,max(x)),col="violetred")

Note the break points
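
A sketch that lists how many receipts fall between each pair of break points. It assumes the receipts data given earlier; cuts and h are illustrative names.

```r
receipts <- c(29.6, 28.2, 19.6, 13.7, 13.0, 7.8, 3.4, 2.0, 1.9, 1.0, 0.7, 0.4,
              0.4, 0.3, 0.3, 0.3, 0.3, 0.3, 0.2, 0.2, 0.2, 0.1, 0.1, 0.1, 0.1,
              0.1)

# A vector given as breaks is used as the exact cell boundaries.
cuts <- c(0, 1, 2, 3, 4, 5, 10, 20, max(receipts))
h <- hist(receipts, breaks = cuts, plot = FALSE)

# Cells are closed on the right by default, so 1.0 is counted in (0, 1].
print(data.frame(lower = head(cuts, -1), upper = cuts[-1], count = h$counts))
```

Because these cells have unequal widths, hist draws densities rather than counts on the y-axis by default when it plots.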

Summary and Fivenum
Suppose, CEO yearly compensations are sampled and the
following are found (in millions).
12 0.4 5 2 50 8 3 1 4 0.25
> sals
[1] 12.00 0.40 5.00 2.00 50.00 8.00 3.00 1.00 4.00 0.25
> mean(sals) # the average
[1] 8.565
> var(sals) # the variance
[1] 225.5145
> sd(sals) # the standard deviation
[1] 15.01714
> median(sals) # the median
[1] 3.5
> summary(sals)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  0.250   1.250   3.500   8.565   7.250  50.000
> fivenum(sals) # min, lower hinge, median, upper hinge, max
[1] 0.25 1.00 3.50 8.00 50.00
> quantile(sals)
   0%   25%   50%   75%  100%
 0.25  1.25  3.50  7.25 50.00
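
A sketch of how the variance and standard deviation above relate, assuming the usual sample formulas (division by n - 1). The variable names n and v are illustrative.

```r
sals <- c(12, 0.4, 5, 2, 50, 8, 3, 1, 4, 0.25)
n <- length(sals)

# Sample variance: squared deviations from the mean, divided by n - 1.
v <- sum((sals - mean(sals))^2) / (n - 1)
print(all.equal(v, var(sals)))

# The standard deviation is the square root of the variance.
print(all.equal(sqrt(v), sd(sals)))
```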

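Important: Difference between fivenum and quantile

Difference between fivenum and quantile:
Lower and Upper Hinge
The sorted data:
0.25 0.4 1 2 3 | 4 5 8 12 50
Median = 3.5 (the mean of the two middle values, 3 and 4)

• The lower hinge is the median of the lower half of the sorted
data, here 0.25 0.4 1 2 3, which gives 1. With an odd number of
values, fivenum counts the median itself in both halves.
• The upper hinge is similarly defined on the upper half, here
4 5 8 12 50, which gives 8.
• quantile instead interpolates between data points by default,
which gives 1.25 and 7.25 for the 25% and 75% quantiles.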
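
A sketch that computes the hinges by hand for this data and compares them with fivenum and quantile. It assumes an even number of values, so the data splits cleanly into two halves; s, half, lower_hinge and upper_hinge are illustrative names.

```r
sals <- c(12, 0.4, 5, 2, 50, 8, 3, 1, 4, 0.25)
s <- sort(sals)
half <- length(s) / 2   # 10 values, so each half has 5

# Hinges: medians of the lower and upper halves of the sorted data.
lower_hinge <- median(s[1:half])
upper_hinge <- median(s[(half + 1):length(s)])
print(c(lower_hinge, upper_hinge))

# fivenum reports the hinges; quantile interpolates instead.
print(fivenum(sals)[c(2, 4)])
print(quantile(sals, c(0.25, 0.75)))
```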
