Introduction to Internet Engineering Course Slides
1. 1-Introduction 2/2
Internet Engineering Course
University of Tehran
Abbas Nayebi
These slides are based on the presentation files provided by Lennart Herlaar, Utrecht University, the
Netherlands
(http://www.cs.uu.nl/people/lennart)
2. Internet prog. Environment characteristics
Client/server architecture
Protocols
Addressing under IP
Names versus numbers
IP Packet, TCP
WWW: Document Formats, Markup,
HTML, Browsers
Summary of the last session
3. Web servers
The http protocol
URLs, URIs, and URNs
MIME
Scripting languages
◦ Client side
◦ Server side: SSI, CGI, servlets
Summary for this session
4. Specialized program for handling document requests and executing scripts
and programs (written in PHP, Perl, Java (JSP), ASP, ...).
JavaScript, HTML and CSS are handled by the browser (the client).
Several servers exist: Apache (52%), Microsoft’s IIS (33%), and IIS’s
smaller brother PWS.
Number of servers used to double about every half year, but...
The browser sends its request to a port on the server (usually port 80).
Other numbers are possible: http://knor.glob.nl:810/index.html
Apache’s Tomcat supports servlets: Java programs which extend the
capabilities of a server. Tomcat listens on port 8080 by default, so it can
run alongside another server.
Internet applications can be programmed without using Web servers: set
up your own protocols and use sockets.
Webservers can be clients of a database server: three-tier organization.
Web servers
5. Possible requests to a web server:
◦ GET – retrieve document
◦ HEAD – retrieve only the headers of the document
◦ POST – send the enclosed data to the resource for processing
◦ PUT – replace document with enclosed data
◦ DELETE – delete the document
Most used are GET and POST.
Other information passed along with request
◦ a list of acceptable MIME types,
◦ e.g. Accept: text/* and Accept: image/gif
◦ For caching: If-Modified-Since: date
◦ parameters for server side scripts
GET, DELETE and HEAD requests normally carry no message body.
The http protocol
6. A status line, e.g. 200 OK or 404 Not Found
Retrieval date and various header fields
A blank line
The requested body, if any
The return of the http protocol
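The four parts listed above can be picked apart with a few lines of Python. This is only a minimal sketch: the function name `parse_response` and the sample response are made up for illustration, and real responses have more variations (missing reason phrase, repeated headers) than this handles.

```python
def parse_response(raw: bytes):
    """Split a raw HTTP/1.x response into status code, reason, headers and body."""
    # The first blank line separates the header section from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    # Status line, e.g. "HTTP/1.0 200 OK"
    version, code, reason = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return int(code), reason, headers, body


# Hypothetical response, written by hand for this example.
sample = (b"HTTP/1.0 200 OK\r\n"
          b"Date: Mon, 01 Nov 2004 12:00:00 GMT\r\n"
          b"Content-Type: text/html\r\n"
          b"\r\n"
          b"<html>hello</html>")

code, reason, headers, body = parse_response(sample)
```

Header names are lowercased here because HTTP header names are case-insensitive.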
7. URI = Uniform Resource Identifier, which generalizes URL
A grammar for Uniform Resource Locators (URLs)
◦ url ::= scheme address
◦ scheme ::= http:// | ftp:// | mailto: | file:// |...
◦ address ::= full-domain-name/path-to-document[#anchor]
◦ full-domain-name ::= host.domainname[:portnr]
◦ domainname ::= domain . domainname | domain
◦ domain ::= string-without-whitespace-and-{; : &}
◦ path-to-document ::= /-separated-strings-without-whitespace-and-{; : &}
Spaces in filenames should be encoded as %20 (the ASCII code of a space, in hexadecimal).
If no html filename is given for the http protocol, the file index.html,
index.php or index.htm may be used (this depends on the server, though).
URN: Uniform Resource Name
Example:
◦ URL http://www.pierobon.org/iis/review1.htm
◦ URN www.pierobon.org/iis/review1.htm#one
◦ URI http://www.pierobon.org/iis/review1.htm.html#one
URLs, URIs, and URNs
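The pieces named in the grammar (scheme, host, port, path, anchor) can be inspected with Python's standard `urllib.parse` module. The sketch below reuses the URL with port 810 from the web servers slide; the `#top` anchor and the file name `my file.html` are assumptions added for illustration.

```python
from urllib.parse import urlsplit, quote

# "#top" is a made-up anchor, added only to show the fragment part.
parts = urlsplit("http://knor.glob.nl:810/index.html#top")

scheme = parts.scheme      # "http"
host = parts.hostname      # "knor.glob.nl"
port = parts.port          # 810
path = parts.path          # "/index.html"
anchor = parts.fragment    # "top"

# A space in a file name is encoded as %20.
encoded = quote("my file.html")   # "my%20file.html"
```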
8. How does a browser know what kind of file it is retrieving?
A GIF file is displayed differently from an HTML or PDF file.
For some types, like PDF, a separate helper application must be started.
Sometimes a plug-in or filter is available, e.g. Macromedia Flash or some
to-HTML converter.
It would be nice to start the right one automatically.
The same problem occurred with attachments to e-mail.
For that purpose, Multipurpose Internet Mail Extensions (MIME) were developed.
A MIME specification is type/subtype.
Common examples are text/html, text/plain, image/jpeg,
image/gif, audio/x-realaudio and application/x-shockwave-flash.
Experimental MIME types are of the form type/x-subtype.
When a MIME type is not present, the file extension is usually the deciding
factor.
MIME
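Falling back on the file extension can be tried out with Python's standard `mimetypes` module. This is a small sketch; the helper name `guess` and the file names are made up, and the exact table may differ slightly between systems.

```python
import mimetypes


def guess(filename: str):
    """Return the MIME type guessed from the file extension, or None."""
    mime_type, _encoding = mimetypes.guess_type(filename)
    return mime_type


html_type = guess("index.html")   # text/html
gif_type = guess("banner.gif")    # image/gif
```

A web server does something similar when it fills in the Content-Type header for a static file.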
9. WWW for dummies: use elaborate tools to construct large
websites
Little programming involved, just plug and play.
We don’t aim to make fancy/professional websites. We try to
capture the concepts and build dependable knowledge of web
programming.
You may use media tools like Gimp, Photoshop, CoolEdit and
the like for making sounds, pictures, banners, icons and movies.
You may not use web authoring tools like Macromedia
Dreamweaver, Microsoft Frontpage, Adobe PageMill, ...
What is allowed depends on the assignment: Notepad++, PHP Ed,
Eclipse, …
Documentation about the tools you use: include it.
Of course, in real life you should use tools wherever applicable,
but whatever you do: make sure the result is maintainable.
Authoring tools
10. In this course, we program the Internet using the
Web.
Question: why develop for the Web?
Possible answers:
◦ Browser handles displaying
◦ “Everybody” has a browser
◦ Servers handle large parts of the transmission details
◦ Database servers do much of the rest
Software maintenance/upgrade problems in a
large organization.
Security issues: client program keeps a password
to connect to the database server.
A lot of work has been done. Why not use it?
Why use the Web?
11. Invent your own protocol
Write your own server
Write client programs for as many platforms as you
can
Try to get client and server software at the right
places
Examples: (s)ftp, MSN, BitTorrent (peer-to-peer),
MMORPG (Massively multiplayer online role-playing game)
The main problem is to get the proper software in the
proper places.
People have to download your client software and
install it on their local (home) computers
Your server software must be installed e.g. at
Internet Service Provider computers.
Making your own Internet
applications
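To make "invent your own protocol, write your own server, write a client" concrete, here is a minimal sketch using plain TCP sockets on the local machine. The one-command protocol (`SHOUT <text>` answered by `OK <TEXT>`) and all function names are invented for this example; a real application would need error handling, timeouts and support for many clients.

```python
import socket
import threading


def serve_once(server_sock):
    """Accept one connection and answer one line of the toy protocol."""
    conn, _addr = server_sock.accept()
    with conn, conn.makefile("r", encoding="utf-8") as reader:
        request = reader.readline().strip()
        if request.startswith("SHOUT "):
            reply = "OK " + request[len("SHOUT "):].upper()
        else:
            reply = "ERR unknown command"
        conn.sendall((reply + "\n").encode("utf-8"))


def ask(port, line):
    """Client side: send one line and return the one-line reply."""
    with socket.create_connection(("127.0.0.1", port)) as sock, \
            sock.makefile("r", encoding="utf-8") as reader:
        sock.sendall((line + "\n").encode("utf-8"))
        return reader.readline().strip()


def demo(line):
    """Start a one-shot server in a thread and send it a single request."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]
    worker = threading.Thread(target=serve_once, args=(server,))
    worker.start()
    try:
        return ask(port, line)
    finally:
        worker.join()
        server.close()
```

Calling `demo("SHOUT hello")` returns `OK HELLO`. Everything a web server and browser would normally provide (request format, error replies, display) has to be written by hand here, which is the point the slide makes.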
12. Used to make web sites dynamic.
A fully dynamic website could be called a web
application.
Scripting languages exist on both the client side and
the server side.
Client side scripts are executed by the browser.
◦ These are used mainly to overcome shortcomings in the
protocol / presentation (!).
◦ Dynamic HTML
◦ Examples are: JavaScript (and Java Applets to a lesser
extent).
Server side scripts are executed by the webserver.
Several means of integration exist.
Examples of languages are: PHP, Perl, Python, C++,
ASP.
On scripting languages
13. Some executable code is transferred to
the browser for execution.
Nowadays this is generally discouraged.
Some technologies:
◦ Java Applets (executed on a VM, secured by a
sandbox, multi-platform)
◦ Microsoft ActiveX (executed natively, dangerous!,
single-OS, single-browser)
Both of them can be digitally signed.
◦ Normally, looser security rules are applied to
signed applets.
Client side executables
14. Interpreted languages
No (insistence on) declarations
Run-time typing
Exceptions exist
Libraries for accessing various databases
Regular expressions
Easy reporting/printing
Many libraries available
General aspects of scripting
languages
15. Most server side scripting languages use the one-page-at-a-time
philosophy: they process one scripted page per request.
On request, a webserver can execute a program that
generates the whole page (CGI),
or replace inline code by its output (PHP).
The result is usually an HTML file which is returned to the
client.
Input to server side scripts is usually a form.
“Running a web application” usually consists of a long
string of script invocations.
Data ping-pongs between client and server one page at a time:
form, submit, form, submit, form, submit, etc.
Page based requests generate lots of overhead.
Lack of state generates lots of overhead (and security
issues!).
Server side scripting
16. Communication between client and server
(and vice versa)
◦ HTTP, HTMLx
◦ Forms, parameters
Communication between server and script
(and vice versa)
◦ Inline, CGI, servlets
◦ Direct, standard input/output, environment
Communication between script and database
(and vice versa)
◦ SQL, resultset
Interesting parts of server side
scripting
17. Snippets of program scattered through HTML.
Executed from top to bottom.
Results embedded into the returned document by web
server.
Examples
◦ Server Side Includes (SSI)
◦ PHP mainly uses inline, i.e., within HTML between <?php
and ?>.
◦ Java Server Pages.
◦ Active Server Pages (used with VBScript or JScript)
Often used when only small parts of the document
are computations.
Little separation between code and presentation:
code quickly becomes unreadable.
Solutions to this problem exist (templates).
Method 1: Inlining code into a
document
18. HTML files may include preprocessing directives.
Such files are recognized by extension (e.g., .shtml) or by
having the exec flag set.
Server parses HTML files and executes directives.
General form:
◦ <!--#element attribute=value attribute=value ... -->
Variables can be used as well, and #if directives.
Server Side Includes
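What "the server parses the file and executes directives" means can be sketched with a toy processor for the general form shown above. It only understands the `echo` element with a `var` attribute; the function name `expand_ssi`, the variable table and the `(none)` placeholder for unknown variables are choices made for this sketch, not a description of how any particular server implements SSI.

```python
import re

# <!--#element attribute="value" ... -->
DIRECTIVE = re.compile(r'<!--#(\w+)\s+(.*?)\s*-->')
ATTRIBUTE = re.compile(r'(\w+)="([^"]*)"')


def expand_ssi(page: str, variables: dict) -> str:
    """Replace each supported directive in the page by its output."""
    def run(match):
        element = match.group(1)
        attrs = dict(ATTRIBUTE.findall(match.group(2)))
        if element == "echo":
            return variables.get(attrs.get("var", ""), "(none)")
        return match.group(0)   # leave directives we do not understand untouched
    return DIRECTIVE.sub(run, page)


page = '<p>Today is <!--#echo var="DATE_LOCAL" --></p>'
result = expand_ssi(page, {"DATE_LOCAL": "1 November 2004"})
```

Here `result` is `<p>Today is 1 November 2004</p>`: the directive is gone and only its output reaches the client.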
19. A protocol for web browsers to interface with applications
via a webserver.
Mainly used for running programs on the server (with
certain permissions)
Identifiable by a reference to a file in a special directory (.../cgi-bin/xxxx)
or by file extension.
Parameters are passed to it.
Result is usually an HTML document which is sent to the
client.
Perl, Python or shell scripts are often used in this fashion.
But all languages are possible, even C++.
There are slight differences between CGI executing and
ordinary execution of programs.
CGI is largely considered obsolete for new applications.
Method 2: Common Gateway
Interface
20. Client (browser) sends a request to the server by TCP/IP
(socket)
We focus here on the commands GET, HEAD, POST, PUT
GET = give a document
HEAD = give only the headers of the document (compare e-mail/news
headers)
POST = send the contents of a form (upload form data)
PUT = upload of a file
For CGI mostly GET or POST
The request line is followed by zero or more headers and
an empty line;
with POST/PUT, a body follows (contents of the form
and/or file).
HTTP requests
21. The URL used for this request was:
http://sunshine.cs.uu.nl:8000/docs/vakken/inp/dwarf.html?M=November&Y=2004
The body was empty
Example request
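A request of this kind can be rebuilt from the URL. The sketch below is a reconstruction under assumptions, not the request from the original slide: the function name `build_get_request`, the choice of HTTP/1.0 and the `Accept` header are made up for illustration.

```python
from urllib.parse import urlsplit


def build_get_request(url: str) -> bytes:
    """Build the raw bytes of a simple GET request for the given URL."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    lines = [
        f"GET {target} HTTP/1.0",   # request line
        f"Host: {parts.netloc}",    # headers
        "Accept: text/*",
        "",                         # empty line ends the headers
        "",                         # empty body
    ]
    return "\r\n".join(lines).encode("ascii")


request = build_get_request(
    "http://sunshine.cs.uu.nl:8000/docs/vakken/inp/dwarf.html?M=November&Y=2004"
)
```

Note that the host and port do not appear in the request line; they are used to open the TCP connection and are repeated in the `Host` header.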
22. With GET, the contents of the form (or a parameter) are given as
part of the URL (URL=cgi-bin/xxx?parameters).
Special characters are percent-encoded (% followed by the character code in hex).
With POST, the contents of the form are given as the body of the
request.
The server puts all relevant information in environment
variables
The body (POST) is given to the standard input of the script.
Hence, scripts reacting to a POST should read from standard
input.
The server does not signal end-of-file; the script must use the length field from the
headers (CONTENT_LENGTH).
In most cases, a special CGI module makes access to this
information transparent.
GET versus POST
23. Important environment variables:
◦ QUERY_STRING: The query (the part after ? in the URL). Usually a
series of the form field=value, separated by &.
◦ CONTENT_LENGTH length of the data (POST)
◦ CONTENT_TYPE type of the data (MIME)
Not all of the normal environment is sent along: the server
filters it.
Test your program by faking the parameters (environment
variables) and running it from the prompt.
The CGI program should give a correct document (with
header and data) on standard output (say, using printf()).
First thing to do: print the Content-type.
The server sends this document to the client (browser).
CGI Environment
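The points above (parameters in environment variables, POST body on standard input, Content-type first) fit in one small handler. This is a sketch only: `run_cgi` is a made-up name, `REQUEST_METHOD` is a standard CGI variable that the slide does not list, and text streams are used for simplicity. Passing the environment and streams as arguments is what makes it possible to fake the parameters and test from the prompt.

```python
import io
from urllib.parse import parse_qs


def run_cgi(environ, stdin, stdout):
    """Handle one CGI request: read the parameters, write header and data."""
    if environ.get("REQUEST_METHOD", "GET") == "POST":
        # No end-of-file is signalled: read exactly CONTENT_LENGTH characters.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        raw = stdin.read(length)
    else:
        raw = environ.get("QUERY_STRING", "")
    fields = parse_qs(raw)   # decodes field=value pairs separated by &

    # First thing to do: print the Content-type, then a blank line.
    stdout.write("Content-Type: text/plain\r\n\r\n")
    for name in sorted(fields):
        stdout.write(f"{name}={fields[name][0]}\n")


# Faked GET request, as one would do when testing from the prompt.
out = io.StringIO()
run_cgi({"REQUEST_METHOD": "GET", "QUERY_STRING": "M=November&Y=2004"},
        io.StringIO(""), out)
get_output = out.getvalue()
```

In a real script the call would be `run_cgi(os.environ, sys.stdin, sys.stdout)`.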
24. Many languages offer libraries for CGI
programming.
These libraries take care of tedious tasks,
such as accessing the environment
variables and decoding QUERY_STRING.
Always look for, and use, libraries for CGI.
Libraries for CGI
25. Independent of a programming language
Isolation of processes. CGI cannot damage the server
CGI programs can get limited permissions (only for part of the
Web site)
Performance: for each request a CGI script must be started as a
separate process
Starting a new process is very expensive, especially on Windows.
No possibility to pass other information to the server (e.g. for logging:
the program must write logs to its own repository).
One time Request/Response: stateless (not just CGI, also for
inline).
Ex: consider an electronic shopping cart (e-commerce): where do
I maintain the shopping list?
CGI advantages/disadvantages
26. For a stateless server the client must keep the state.
In a browser: this could be done with Javascript, Java, etc.
In Forms: “hidden” fields can be filled.
When the form is sent these fields are included.
Cumulative information (like a shopping cart) could be stored in this
way.
Problem: when the user “returns” (with the BACK key) or enters the
URL directly in the browser, the fields get lost. This is not necessarily bad when a
user wants to keep multiple workflows going simultaneously.
If GET is used, it is also possible to put extra information in the URL if
no form is present.
Exe: Find a website that gets parameters from the URL. Try to
change the parameters and retrieve the information without
navigating through the pages, e.g., zzzzz.com?picId=100&newsId=56
Keeping state in the client or URL is prone to security issues.
Other (better) solutions exist to the problem of statelessness.
Client side state
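Carrying state in hidden form fields can be sketched as follows. The function name `render_step`, the action URL and the field names are hypothetical; the point is only that everything collected so far is sent back with the next submit.

```python
from html import escape


def render_step(action, state, prompt_field):
    """Render a form that carries earlier answers along in hidden fields."""
    hidden = "\n".join(
        f'  <input type="hidden" name="{escape(k)}" value="{escape(v)}">'
        for k, v in state.items()
    )
    return (f'<form method="post" action="{escape(action)}">\n'
            f'{hidden}\n'
            f'  <input type="text" name="{escape(prompt_field)}">\n'
            f'  <input type="submit">\n'
            f'</form>')


# State gathered in an earlier step, plus one new question for the user.
form = render_step("/cgi-bin/order", {"article": "dwarf"}, "name")
```

Because the client holds the state, the user can also edit it before submitting, which is one of the security issues mentioned above; the server must validate every field again.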
27. Four steps: article selection, personal
data, credit card details, confirmation.
Four scripts or one script?
Maintaining state.
Validating input.
Security.
Example: order process of an
online shop
29. A more recent development.
Servlet = applet run on a server.
Servlets in general define extensions to servers (WWW or
other).
Many people know Java and its many libraries.
Java is typed.
No need to learn a new language.
Basic packages are javax.servlet and javax.servlet.http
JSP is used when only little code is included in HTML
documents.
Use javax.servlet.jsp and javax.servlet.jsp.tagext.
Together JSP and servlets constitute the Web side of J2EE.
Database connections use Java DataBase Connectivity
(JDBC).
Method 3: Java Servlets
30. Exe: Answer the following question:
◦ What is a keep-alive connection?
Exe: List three enhancements in HTTP 1.1 over
HTTP 1.0.
Keep-Alive Connections
31. Web servers
The http protocol
URLs, URIs, and URNs
MIME
Scripting languages
◦ Client side
◦ Server side: SSI, CGI, servlets
Summary for this session
Editor's notes
Ex5: Connect to a web server using the telnet command:
C:\>telnet www.google.com 80
GET / HTTP/1.0
Host: www.google.com
<blank line>
The sandbox is a set of rules that are used when creating an applet that prevents certain functions when the applet is sent as part of a Web page
CGI=Common Gateway Interface is a standard protocol that defines how webserver software can delegate the generation of webpages to a console application
Text in a pre element is displayed in a fixed-width font, and it preserves both spaces and line breaks. Useful for code samples.