A holistic view of how the web works, with an overview of the HTTP protocol.
Presented by me at the null security group (http://null.co.in), Mumbai chapter meet, on Aug 27th.
12. Agenda
Intro: What & Why???
OSI model: Back to the basics
10000 feet view: How the web works
RFC 2616: Anatomy
RFC 2965: Handling Statelessness
14. Bit of History
Mar’89 – Tim Berners-Lee presents “Information Management: A Proposal”
Aug’91 – Announces WWW
Mar’93 – Mosaic announced
Mar’94 – Netscape founded
Oct’94 – W3C founded by Tim Berners-Lee
18. HTTP: What is it?
Part of the Application Layer of the TCP/IP protocol suite
A set of grammatical rules for a client and server to communicate
HTTP is what powers the WWW
http://www.flickr.com/photos/joshfassbind/4584323789/
24. Agenda
Intro: What & Why???
OSI model: Back to the basics
10000 feet view: How the web works
RFC 2616: Anatomy
RFC 2965: Handling Statelessness
http://www.flickr.com/photos/stephenpoff/2312981944/
25. OSI & TCP/IP protocol suite
OSI is a reference model
http://blog.uad.ac.id/imam_riadi/files/2009/01/osi-layer.jpg
26. OSI & TCP/IP protocol suite…
The TCP/IP protocol suite is what the Internet actually runs; its layers map only loosely onto OSI's seven
http://www.hill2dot0.com/wiki/index.php?title=Image:G0209_TCPIP_vs_OSI.jpg
28. Agenda
Intro: What & Why???
OSI model: Back to the basics
10000 feet view: How the web works
RFC 2616: Anatomy
RFC 2965: Handling Statelessness
30. The Communication
My favorite interview question:
What all happens between the time when we click on a hyperlink and the page is completely rendered in a browser?
http://www.flickr.com/photos/terryhart/2890904949/
31. Browser -> Proxy -> Internetz -> LB -> Web Server -> DB Server
34. Client and Server (null.co.in): name lookup
null.co.in resolves to 74.53.228.212
Looked up via browser cache / hosts file / DNS server
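The lookup step above can be tried from Python. This is a minimal sketch, assuming the operating system's resolver is used (it typically consults the hosts file and then the configured DNS servers); the browser's own cache is not modelled, and the helper name `resolve` is made up for illustration. The address shown on the slide may no longer be what null.co.in resolves to.

```python
import socket

def resolve(hostname):
    """Return the distinct IP addresses the OS resolver gives for hostname."""
    # getaddrinfo asks the system resolver, which typically checks the hosts
    # file before going out to a DNS server.
    infos = socket.getaddrinfo(hostname, 80, proto=socket.IPPROTO_TCP)
    addresses = []
    for family, socktype, proto, canonname, sockaddr in infos:
        ip = sockaddr[0]
        if ip not in addresses:
            addresses.append(ip)
    return addresses

if __name__ == "__main__":
    # "localhost" resolves without network access; try "null.co.in" when online.
    print(resolve("localhost"))
```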
37. Client and Server (null.co.in): TCP connection
Client -> Server: SYN (There, bro?)
Server -> Client: SYN-ACK (Yo!)
Client -> Server: ACK (Cool!)
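Application code never sends SYN or ACK itself; the kernel performs the three-way handshake when a program calls connect. A minimal sketch, assuming a throwaway listener on the loopback interface stands in for the real server (the function name is made up for illustration):

```python
import socket

def open_connection_on_loopback():
    """Open a TCP connection to a throwaway local listener; return both ends."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    # create_connection() returns once the kernel has completed
    # SYN / SYN-ACK / ACK with the listener.
    client = socket.create_connection(("127.0.0.1", port), timeout=5)
    server_side, peer = listener.accept()
    listener.close()
    return client, server_side, peer
```

The address the server sees as its peer is the client's own local address, which is one way to confirm both ends belong to the same connection.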
41. Client and Server (null.co.in): HTTP
Client -> Server: GET / (Got this file?)
Server -> Client: 200 OK, index.html (Yup! Here ’tis.)
Client -> Server: GET /js.js, GET /pic.jpg (Can I have these as well?)
Server -> Client: 200 OK, more content… (Sure!)
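The GET / 200 OK exchange is plain text over the TCP connection, so it can be reproduced with a bare socket. The sketch below assumes a tiny local server (the hypothetical `HelloHandler`, built on Python's `http.server`) in place of null.co.in, so that it runs without network access; `fetch` and `demo` are likewise illustrative names.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for the web server: serves one tiny page."""

    def do_GET(self):
        body = b"<html><body>index.html</body></html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet

def fetch(host, port, path="/"):
    """Send a bare GET over a TCP socket and return the raw response bytes."""
    request = f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n".encode("ascii")
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall(request)
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:          # server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)

def demo():
    """Start the local server, fetch "/" once, then shut the server down."""
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return fetch("127.0.0.1", server.server_address[1])
    finally:
        server.shutdown()
        server.server_close()

if __name__ == "__main__":
    print(demo().decode("ascii"))
```

A real browser would parse the returned HTML and then issue further GETs for the script and image it references, as on the slide.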
43. Client and Server (null.co.in): TCP teardown
Client -> Server: FIN (Arigato, am done.)
Server -> Client: FIN-ACK (Sayonara!)
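The FIN exchange is also handled by the kernel: calling close on one end sends the FIN, and the other end sees it as an empty read. The slide shows a simplified picture; a full close is FIN, ACK, FIN, ACK, with each side closing its own direction. A minimal sketch on the loopback interface (the function name is made up for illustration):

```python
import socket

def close_and_observe():
    """Close one end of a loopback connection; report what the other end reads."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    client = socket.create_connection(listener.getsockname(), timeout=5)
    server_side, _ = listener.accept()
    listener.close()

    client.close()                      # kernel sends FIN to the peer
    server_side.settimeout(5)
    leftover = server_side.recv(1024)   # b"" means the peer has finished sending
    server_side.close()                 # our own FIN completes the teardown
    return leftover
```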
45. The Communication
Web 2.0 has blurred the distinction between client and server
Conventionally, client sends an HTTP request
Server responds with an HTTP response
46. The Communication: HTTP Request
Request Line
Request Method
Requested Resource
HTTP Version used
Headers
General Headers
Request Headers
Entity Headers
Content (Optional)
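The three parts of a request (request line, headers, optional content) are separated by CRLF line breaks, with an empty line before the content. A minimal sketch of pulling them apart, assuming a well-formed message with no folded or repeated headers; the `/login` path and form body in the sample are invented for illustration.

```python
def parse_request(raw):
    """Split a raw HTTP request into (method, target, version, headers, body)."""
    head, _, body = raw.partition("\r\n\r\n")     # blank line ends the headers
    request_line, *header_lines = head.split("\r\n")
    method, target, version = request_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them lower-cased.
        headers[name.strip().lower()] = value.strip()
    return method, target, version, headers, body

# Hypothetical request, for illustration only.
SAMPLE_REQUEST = (
    "POST /login HTTP/1.1\r\n"
    "Host: null.co.in\r\n"
    "Content-Type: application/x-www-form-urlencoded\r\n"
    "Content-Length: 9\r\n"
    "\r\n"
    "user=null"
)

if __name__ == "__main__":
    print(parse_request(SAMPLE_REQUEST))
```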
47. The Communication: HTTP Response
Status Line
HTTP version(s) understood by server
Status code (3-digit numeric value)
Status description
Headers
General Headers
Response Headers
Entity Headers
Content (Optional)
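The status line splits on its first two spaces; the description may itself contain spaces. A minimal sketch, assuming a well-formed line (the function name is made up for illustration):

```python
def parse_status_line(line):
    """Split 'HTTP/1.1 404 Not Found' into (version, code, description)."""
    version, code, reason = line.rstrip("\r\n").split(" ", 2)
    if not (len(code) == 3 and code.isdigit()):
        raise ValueError(f"bad status code: {code!r}")
    return version, int(code), reason

if __name__ == "__main__":
    print(parse_status_line("HTTP/1.1 404 Not Found\r\n"))
```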
48. Agenda
Intro: What & Why???
OSI model: Back to the basics
10000 feet view: How the web works
RFC 2616: Anatomy
RFC 2965: Handling Statelessness
http://www.saynotocrack.com/wp-content/uploads/2007/06/flinstones-anatomy.jpg
49. Anatomy
HTTP requests and responses are made up of various components:
Request Methods
Response Status Codes
Request Headers
Response Headers
General Headers
Entity Headers
Content (MIME Media Types)
50. Anatomy: Request Methods
Humans can convey emotions in several ways
Why should HTTP clients lag!!!
HTTP methods describe the type of communication
GET POST HEAD OPTIONS
TRACE PUT DELETE CONNECT
51. Anatomy: Response Status Codes
Indicate the server’s mood corresponding to a request
Combination of a numerical code and a short description
Can be grouped into 5 categories:
1xx -- Informational
2xx -- Successful
3xx -- Redirection
4xx -- Client Error
5xx -- Server Error
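Because the category is just the first digit, a client can react sensibly even to a code it has never seen. A minimal sketch using Python's standard `http.HTTPStatus` registry for the descriptions; the helper name `describe` is made up for illustration.

```python
from http import HTTPStatus

STATUS_CLASSES = {
    1: "Informational",
    2: "Successful",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}

def describe(code):
    """Return (category, standard description) for a numeric status code."""
    category = STATUS_CLASSES[code // 100]   # first digit picks the category
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:                       # code not in Python's registry
        phrase = ""
    return category, phrase

if __name__ == "__main__":
    for code in (200, 301, 404, 503):
        print(code, describe(code))
```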
52. Anatomy: Request Headers
Specific to an HTTP Request
Carry information about the client and the type of request
Facilitate better understanding between client and server
Host             Accept-Language      If-Modified-Since    Referer
User-Agent       Authorization        If-None-Match        Expect
Accept           Proxy-Authorization  If-Range             From
Accept-Charset   Max-Forwards         If-Unmodified-Since  TE
Accept-Encoding  If-Match             Range
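As one example of a request header changing the outcome, If-None-Match lets a client say "only send this if it differs from the copy I already hold". A minimal sketch of the server-side decision, assuming strong ETag comparison only (weak `W/` validators are ignored) and a plain dict with exact-case header names; the function name is made up for illustration.

```python
def conditional_get_status(request_headers, current_etag):
    """Pick 304 or 200 for a GET, based on the If-None-Match request header."""
    candidate = request_headers.get("If-None-Match")
    if candidate is None:
        return 200                      # unconditional request: send the content
    tags = [tag.strip() for tag in candidate.split(",")]
    if "*" in tags or current_etag in tags:
        return 304                      # client's cached copy is still current
    return 200

if __name__ == "__main__":
    print(conditional_get_status({"If-None-Match": '"abc"'}, '"abc"'))
```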
53. Anatomy: Response Headers
Specific to an HTTP Response
Carry information about the server and the type of response
Accept-Ranges  ETag      Retry-After  WWW-Authenticate
Age            Location  Server       Proxy-Authenticate
Vary
54. Anatomy: General Headers
Carry information about the HTTP transaction
Can be a part of request, as well as response
Cache-Control      Keep-Alive  Pragma   Via
Connection         Upgrade     Trailer  Warning
Transfer-Encoding  Date
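The Connection header is a handy example of a general header, since either side may send it. A minimal sketch of the usual rule: HTTP/1.1 connections stay open unless someone says `close`, while HTTP/1.0 connections close unless `keep-alive` was requested. The function name and the plain-dict header lookup are assumptions for illustration.

```python
def keeps_connection_open(http_version, headers):
    """Decide whether the connection stays open after this message."""
    raw = headers.get("Connection", "")
    tokens = [token.strip().lower() for token in raw.split(",")]
    if "close" in tokens:
        return False
    if http_version == "HTTP/1.1":
        return True                 # persistent by default in HTTP/1.1
    return "keep-alive" in tokens   # HTTP/1.0 needs an explicit opt-in

if __name__ == "__main__":
    print(keeps_connection_open("HTTP/1.1", {}))
```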
55. Anatomy: Entity Headers
Carry information about the content
Mainly a part of HTTP response
Allow             Content-Language  Content-Location  Content-Range
Content-Encoding  Content-Length    Content-MD5       Content-Type
Expires           Last-Modified
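Several entity headers can be derived directly from the content they describe. A minimal sketch computing three of them for a body held in memory; Content-MD5 is the base64 form of the MD5 digest and serves only as an integrity check. The function name is made up for illustration.

```python
import base64
import hashlib

def entity_headers(body, content_type):
    """Build entity headers describing a body given as bytes."""
    digest = hashlib.md5(body).digest()   # integrity check, not a security feature
    return {
        "Content-Type": content_type,
        "Content-Length": str(len(body)),  # length in bytes, not characters
        "Content-MD5": base64.b64encode(digest).decode("ascii"),
    }

if __name__ == "__main__":
    print(entity_headers(b"hello", "text/plain"))
```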
56. Anatomy: Content
IANA maintains a list of valid content types
The type of the content is specified by the Content-Type entity header
Grouped into 9 top-level MIME media types:
application  audio  example    image
message      model  multipart  text
video
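A Content-Type value has the form type/subtype, optionally followed by parameters such as charset. A minimal sketch that splits one apart using the standard library's `email.message.Message`, which already implements the MIME parsing rules; the function name is made up for illustration.

```python
from email.message import Message

def parse_content_type(value):
    """Split a Content-Type value into (top-level type, subtype, charset)."""
    msg = Message()
    msg["Content-Type"] = value
    return (
        msg.get_content_maintype(),
        msg.get_content_subtype(),
        msg.get_content_charset(),   # None when no charset parameter is given
    )

if __name__ == "__main__":
    print(parse_content_type("text/html; charset=UTF-8"))
```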
57. Agenda
Intro: What & Why???
OSI model: Back to the basics
10000 feet view: How the web works
RFC 2616: Anatomy
RFC 2965: Handling Statelessness