The document describes analyzing nucleotide sequences of the rhodopsin gene from human, chimpanzee and macaque. Key steps include:
1) Obtaining rhodopsin coding sequences from NCBI and writing them to a FASTA file
2) Performing a multiple sequence alignment using ClustalW
3) Calculating the transition/transversion ratio and genetic distance between species based on the alignment
5. (Figure) Allele frequencies in a population and in samples drawn from it
Whole population: A: 61%, G: 32%, C: 7%
Sample of 5 (01 A, 02 A, 03 A, 04 G, 05 A): A: 80%, G: 20%
Sample of 10 (01 A, 02 A, 03 A, 04 G, 05 A, 06 G, 07 A, 08 G, 09 A, 10 G): A: 60%, G: 40%
25. transition/transversion
Rhodopsin CDS (coding sequence): 1047 bp
transition: A ↔ G, C ↔ T
transversion: A, G ↔ C, T
NCBI IDs: human (NM_000539), chimpanzee (XM_516740), macaque (XM_001094250)
For the pairs human-chimpanzee and human-macaque, count the transitions and transversions and compute the p distance.
p distance = (number of differing sites) / (number of sites compared)

                    transition   transversion   p distance
human-chimpanzee
human-macaque
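The counting step in this exercise can be sketched in base R. This is a minimal illustration, not the GeneR workflow used on these slides: the function name `count_ts_tv` and the two 10-base toy sequences are hypothetical, and the input is assumed to be two already aligned sequences of equal length.

```r
# Classify each difference between two aligned, equal-length sequences
# as a transition (A<->G, C<->T) or a transversion (purine <-> pyrimidine).
count_ts_tv <- function(seq1, seq2) {
  a <- strsplit(toupper(seq1), "")[[1]]
  b <- strsplit(toupper(seq2), "")[[1]]
  stopifnot(length(a) == length(b))
  # Keep only sites where both sequences carry A, C, G or T (gaps are skipped)
  bases <- c("A", "C", "G", "T")
  ok <- a %in% bases & b %in% bases
  a <- a[ok]
  b <- b[ok]
  purine <- c("A", "G")
  differs <- a != b
  # Both bases in the same chemical class => transition
  same_class <- (a %in% purine) == (b %in% purine)
  ts <- sum(differs & same_class)
  tv <- sum(differs & !same_class)
  list(sites = length(a), transition = ts, transversion = tv,
       p_distance = (ts + tv) / length(a))
}

# Hypothetical toy alignment: A/G and C/T are transitions, A/T is a transversion
toy <- count_ts_tv("ACGTACGTAC", "GCGTTCGTAT")
print(toy)
```

Applied to two rows of the real alignment, the same function would give the numbers needed for the table above.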
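26. Bioconductor/GeneR 1
Working directory in R: Users/tg03/bin
> library(GeneR)  # load the GeneR package into R
1) Retrieve the human Rhodopsin sequence from NCBI (REST)
and store it in the R variable x
> x <- " "  # the sequence string goes between the quotes
> s <- placeString(x)
2) Write the CDS region out of R
# write to rhodopsin.txt in FASTA format
> writeFasta(file="rhodopsin.txt", from=96, to=1142, comment="human")
[1] 1  # 1: success, -1: failure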
27. Bioconductor/GeneR 2
1') Retrieve the chimpanzee Rhodopsin sequence from NCBI (REST)
> x <- " "
> s <- placeString(x)
2') Write the CDS region out of R
> writeFasta(file="rhodopsin.txt", from= , to= , comment="chimpanzee", append=TRUE)
# append=TRUE adds the record to the existing file
1'') Retrieve the macaque Rhodopsin sequence from NCBI (REST)
> x <- " "
> s <- placeString(x)
2'') Write the CDS region out of R
> writeFasta(file="rhodopsin.txt", from= , to= , comment="macaque", append=TRUE)
rhodopsin.txt is created in Users/tg03/bin
Seq_R_96_1142
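For readers without GeneR installed, the "cut out a region and append it to a FASTA file" step can be approximated in base R. This is only a sketch under stated assumptions: `write_fasta_region`, the record names and the 160-base toy sequence are hypothetical, and the function is not a reimplementation of `writeFasta`.

```r
# Write positions from..to (1-based, inclusive) of a sequence as one FASTA record.
write_fasta_region <- function(dna, from, to, path, name, append = FALSE) {
  region <- substr(dna, from, to)
  stopifnot(nchar(region) > 0)
  # Wrap the sequence at 60 characters per line
  starts <- seq(1, nchar(region), by = 60)
  body <- substring(region, starts, pmin(starts + 59, nchar(region)))
  con <- file(path, open = if (append) "a" else "w")
  on.exit(close(con))
  writeLines(c(paste0(">", name), body), con)
  invisible(nchar(region))  # number of bases written
}

toy <- paste(rep("ACGT", 40), collapse = "")  # hypothetical 160-base sequence
out <- tempfile(fileext = ".txt")
n1 <- write_fasta_region(toy, 11, 150, out, "toy_human")
n2 <- write_fasta_region(toy, 1, 30, out, "toy_chimpanzee", append = TRUE)
fasta_lines <- readLines(out)
print(fasta_lines)
```

The length check `to - from + 1` is worth doing on the real data as well: for the human record above, 1142 - 96 + 1 = 1047, which matches the stated CDS length.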
29.
Rhodopsin CDS (coding sequence): 1047 bp
transition: A ↔ G, C ↔ T
transversion: A, G ↔ C, T
human-chimpanzee: (NM_000539)-(XM_516740); human-macaque: (NM_000539)-(XM_001094250)

                    transition   transversion   p distance
human-chimpanzee    5            1              6/1047 = 0.006
human-macaque       32           8              40/1047 = 0.038
600
3000
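The arithmetic behind the p distance column, together with the transition/transversion ratio, can be reproduced in a few lines of R. The counts and the CDS length are the ones reported above; the data frame layout and column names are only an illustrative choice.

```r
cds_length <- 1047  # Rhodopsin CDS length in bp, as stated above

counts <- data.frame(
  pair         = c("human-chimpanzee", "human-macaque"),
  transition   = c(5, 32),
  transversion = c(1, 8)
)

# p distance: all observed differences divided by the number of sites compared
counts$p_distance <- (counts$transition + counts$transversion) / cds_length
# transition/transversion ratio
counts$ts_tv_ratio <- counts$transition / counts$transversion

print(counts, digits = 3)
```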
30. (Figure) Multiple substitutions along two lineages
Sequences shown at the start: TCTGAGACCT, TCTGTGACCT
Substitutions marked on the branches: 5th A→T, 6th G→C, 3rd T→A, 9th C→G, 6th C→G
Resulting sequences: TCAGTGACCT and TCTGTCACGT
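As a small check on this figure, the two final sequences can be compared site by site in R. The sketch below only counts what is observable today. It cannot see the two successive changes marked at the 6th site, which is why the observed proportion of differences tends to underestimate the number of substitutions that actually occurred.

```r
a <- strsplit("TCAGTGACCT", "")[[1]]
b <- strsplit("TCTGTCACGT", "")[[1]]

diff_sites <- which(a != b)                  # positions where the sequences differ
observed_p <- length(diff_sites) / length(a) # observed proportion of differences

print(diff_sites)
print(observed_p)
```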
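31.
1) p distance: the proportion of sites at which the two sequences differ.

2) Jukes and Cantor model (1969): every substitution occurs at the same rate $\alpha$.

$$
\begin{array}{c|cccc}
  & A & T & C & G \\ \hline
A & - & \alpha & \alpha & \alpha \\
T & \alpha & - & \alpha & \alpha \\
C & \alpha & \alpha & - & \alpha \\
G & \alpha & \alpha & \alpha & -
\end{array}
\qquad
d = -\frac{3}{4}\ln\left(1 - \frac{4}{3}p\right)
$$

3) Kimura's two parameter model (1980): transitions occur at rate $\alpha$ and transversions at rate $\beta$.

$$
\begin{array}{c|cccc}
  & A & T & C & G \\ \hline
A & - & \beta & \beta & \alpha \\
T & \beta & - & \alpha & \beta \\
C & \beta & \alpha & - & \beta \\
G & \alpha & \beta & \beta & -
\end{array}
$$

With $P$ the proportion of sites showing a transitional difference and $Q$ the proportion showing a transversional difference:

$$
P = \frac{1}{4}\left(1 - 2e^{-4(\alpha+\beta)t} + e^{-8\beta t}\right), \qquad
Q = \frac{1}{2}\left(1 - e^{-8\beta t}\right)
$$

$$
d \equiv 2rt = 2\alpha t + 4\beta t = -\frac{1}{2}\ln(1 - 2P - Q) - \frac{1}{4}\ln(1 - 2Q)
$$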
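The Kimura two parameter distance is straightforward to evaluate once the transitions and transversions have been counted. The following is a minimal sketch: the function name `k2p_distance` is hypothetical, and it is applied here to the Rhodopsin counts reported earlier in this document (5 and 1 for human-chimpanzee, 32 and 8 for human-macaque, over 1047 sites).

```r
# Kimura two parameter distance from counts of transitional and
# transversional differences over n_sites compared sites.
k2p_distance <- function(n_ts, n_tv, n_sites) {
  P <- n_ts / n_sites  # proportion of transitional differences
  Q <- n_tv / n_sites  # proportion of transversional differences
  -0.5 * log(1 - 2 * P - Q) - 0.25 * log(1 - 2 * Q)
}

d_hc <- k2p_distance(5, 1, 1047)   # human-chimpanzee
d_hm <- k2p_distance(32, 8, 1047)  # human-macaque
print(round(c(d_hc, d_hm), 4))
```

For distances this small the correction is slight, so the result stays close to the uncorrected p distance.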
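32. Jukes and Cantor model: derivation, part 1
(Figure) An ancestral sequence a gives rise to two descendant sequences a' and a''; one site is followed from time t to time t+1.
r: substitution rate per site per unit time (on the order of $10^{-8}$ to $10^{-9}$), $r = 3\alpha$
Site identical at time t (a': T, a'': T):
  each lineage stays unchanged with probability $(1-r)$, so the site is still identical at t+1 with probability $(1-r)^2 \approx 1-2r$ (the $r^2$ term is neglected)
Site different at time t (a': A, a'': T):
  a lineage changes with probability $r$, and to one specific nucleotide with probability 1/3, giving $r/3$; the other lineage stays unchanged with probability $(1-r)$
  the site becomes identical at t+1 with probability $2r(1-r)/3 \approx 2r/3$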
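33. Jukes and Cantor model (1969): derivation, part 2
$q_t$: proportion of sites that are identical between the two sequences at time $t$. From time $t$ to $t+1$:

$$
q_{t+1} = (1-2r)\,q_t + \frac{2r}{3}\,(1-q_t)
$$

Treating $t$ as continuous, $q_{t+1} - q_t$ becomes $dq/dt$:

$$
\frac{dq}{dt} = \frac{2r}{3} - \frac{8r}{3}\,q, \qquad q_0 = 1 \;\Rightarrow\; q_t = 1 - \frac{3}{4}\left(1 - e^{-8rt/3}\right)
$$

Evolutionary distance (d): the expected number of substitutions per site between the two sequences, $d = 2rt$. Writing $p_t = 1 - q_t$ and solving for $2rt$:

$$
d = -\frac{3}{4}\ln\left(1 - \frac{4}{3}p\right)
$$

p: proportion of sites that differ (p distance)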
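The derivation can be checked numerically by iterating the per-step recurrence for the proportion of identical sites, $q_{t+1} = (1-2r)\,q_t + \frac{2r}{3}(1-q_t)$, and then applying the Jukes and Cantor formula to the resulting p. In this sketch the rate `r` and the number of steps are hypothetical values, chosen much larger than a realistic rate so that the loop stays short.

```r
r <- 1e-4      # hypothetical substitution rate per site per time step
steps <- 1000  # hypothetical number of time steps
q <- 1         # at t = 0 the two sequences are identical

for (t in seq_len(steps)) {
  # identical sites stay identical with prob. ~(1 - 2r);
  # differing sites become identical with prob. ~2r/3
  q <- (1 - 2 * r) * q + (2 * r / 3) * (1 - q)
}

p <- 1 - q                               # observed proportion of differing sites
d_jc <- -(3 / 4) * log(1 - (4 / 3) * p)  # Jukes and Cantor corrected distance
d_true <- 2 * r * steps                  # expected substitutions per site, 2rt

print(c(p = p, d_jc = d_jc, d_true = d_true))
```

The uncorrected p falls short of 2rt, while the corrected distance recovers it closely. The small remaining gap comes from using discrete time steps instead of the continuous approximation.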
34. human, chimpanzee, macaque Rhodopsin
Jukes and Cantor model

                    transition   transversion   p distance        JC distance
human-chimpanzee    5            1              6/1047 = 0.006
human-macaque       32           8              40/1047 = 0.038

$$
d = -\frac{3}{4}\ln\left(1 - \frac{4}{3}p\right)
$$

> -(3/4)*log(1-(4/3)*0.006)
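The same calculation can be wrapped in a small function and applied to both pairs at once. This is a sketch: `jc_distance` is a hypothetical helper name, and it uses the unrounded p distances 6/1047 and 40/1047 from the table, so the first value differs slightly from what the rounded 0.006 in the command above gives.

```r
# Jukes and Cantor distance from a p distance (defined only for p < 3/4)
jc_distance <- function(p) {
  stopifnot(all(p >= 0), all(p < 0.75))
  -(3 / 4) * log(1 - (4 / 3) * p)
}

p_obs <- c("human-chimpanzee" = 6, "human-macaque" = 40) / 1047
d_jc <- jc_distance(p_obs)
print(round(d_jc, 4))
```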