This document provides an overview of distributed web-based systems, including the key components and technologies that enable them. It discusses the World Wide Web and how documents are accessed via URLs. It also describes HTTP and how connections and requests/responses work. Other topics covered include caching, content distribution networks, web services, traditional and multi-tiered web architectures, web server clusters, and web security protocols like SSL.
2. Outline: WWW, URL, Web Documents, HTTP, Connections, Methods, Messages, Caching, Content Distribution Network, Web Service, Terminology, Architecture, Traditional Web Based Systems, Multi-tiered Web Based Systems, Web Server Clusters, Web Security, SSL, References
3.
4. Servers maintain collections of documents while clients provide users an easy-to-use interface for presenting and accessing those documents.
5. A document is fetched from a server, transferred to a client, and presented on the screen.
6. To a user, there is conceptually no difference between a document stored locally and one stored in another part of the world.
7. The Web has since become more than just a simple document-based system.
8.
9. A URL specifies the DNS name of the document's associated server along with a file name.
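As a minimal sketch of how a URL breaks down into a server name and a file name, the following uses Python's standard `urllib.parse`. The function name `split_url` and the example URL are illustrative assumptions, not part of the original slides.

```python
from urllib.parse import urlsplit


def split_url(url):
    """Return the scheme, the server's DNS name, and the document path of a URL."""
    parts = urlsplit(url)
    return parts.scheme, parts.hostname, parts.path


# Hypothetical example URL, used only for illustration.
scheme, host, path = split_url("http://www.example.org/docs/index.html")
print(scheme, host, path)
```

The DNS name tells the client which server to contact; the path tells that server which document to return.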
20. HTTP is a simple protocol; a client sends a request to a server and waits for a response.
21. HTTP is based on TCP; whenever a client issues a request to a server, it first sets up a TCP connection and sends the message on that connection. The same connection is used for receiving the response.
22. One of the problems with the first versions of HTTP was its inefficient use of TCP connections.
23.
24. In HTTP version 1.0 and older, each request to a server required setting up a separate connection. Once the server had responded, the connection was torn down. These connections are referred to as non-persistent.
25. In HTTP version 1.1, several requests and their responses can be exchanged over the same connection, without setting up a separate connection for each. These connections are referred to as persistent.
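The cost difference between the two connection types can be estimated with a rough model. The sketch below assumes one round trip to set up a TCP connection and one round trip per request/response pair, and it ignores pipelining, parallel connections, and connection teardown. The function name `round_trips` is an assumption made for illustration.

```python
def round_trips(n_requests, persistent):
    """Estimate round trips needed to fetch n_requests objects from one server.

    Assumed model: 1 round trip for TCP setup, 1 per request/response.
    """
    if n_requests == 0:
        return 0
    if persistent:
        # One connection setup, then every request reuses the connection.
        return 1 + n_requests
    # Non-persistent: every request pays for its own connection setup.
    return 2 * n_requests


for n in (1, 10, 30):
    print(n, round_trips(n, persistent=False), round_trips(n, persistent=True))
```

Under these assumptions, a page with many embedded objects needs roughly half as many round trips over a persistent connection.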
29. HTTP Messages (Response): the status line carries a status code and phrase, for example 200 (OK), 400 (Bad Request), 403 (Forbidden), and 404 (Not Found).
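To make the status line concrete, here is a small sketch that splits a response status line into its parts and looks up the standard phrase for a code using Python's `http.HTTPStatus`. The helper names `parse_status_line` and `standard_phrase` are illustrative assumptions.

```python
from http import HTTPStatus


def parse_status_line(line):
    """Split a status line such as 'HTTP/1.1 404 Not Found' into its parts."""
    version, code, phrase = line.split(" ", 2)
    return version, int(code), phrase


def standard_phrase(code):
    """Return the standard reason phrase registered for a status code."""
    return HTTPStatus(code).phrase


print(parse_status_line("HTTP/1.1 404 Not Found"))
for code in (200, 400, 403, 404):
    print(code, standard_phrase(code))
```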
30.
31. HTTP Caching: Clients often cache documents. Challenge: keeping cached documents up to date. If-Modified-Since requests check whether a cached copy is still current; HTTP/1.0 used only the date, while HTTP/1.1 also has an opaque "entity tag" (could be a file signature, etc.). When and how often should the original be checked for changes? Every time? Each session? Each day? Use the "Expires" header; if there is no Expires header, Last-Modified is often used as an estimate.
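One way a cache might decide how long a copy stays fresh is sketched below. It prefers the Expires header and otherwise estimates from Last-Modified. The 10% fraction is an assumed heuristic setting (a commonly cited value), and the function name `freshness_lifetime` is hypothetical; real caches also honor Cache-Control, which this sketch omits.

```python
from datetime import timedelta
from email.utils import parsedate_to_datetime

# Assumed heuristic: treat a document as fresh for 10% of its age at fetch time.
HEURISTIC_FRACTION = 0.1


def freshness_lifetime(date, expires=None, last_modified=None):
    """Return how long a response may be reused, given HTTP-date header strings."""
    served = parsedate_to_datetime(date)
    if expires is not None:
        # Explicit expiry from the server takes priority.
        return max(parsedate_to_datetime(expires) - served, timedelta(0))
    if last_modified is not None:
        # No Expires: estimate from how long the document had been unchanged.
        age = served - parsedate_to_datetime(last_modified)
        return max(age * HEURISTIC_FRACTION, timedelta(0))
    # Nothing to go on: revalidate on every use.
    return timedelta(0)


print(freshness_lifetime("Mon, 29 Jan 2001 18:00:00 GMT",
                         last_modified="Mon, 29 Jan 2001 08:00:00 GMT"))
```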
32. Example Cache Check Request:
    GET / HTTP/1.1
    Accept: */*
    Accept-Language: en-us
    Accept-Encoding: gzip, deflate
    If-Modified-Since: Mon, 29 Jan 2001 17:54:18 GMT
    If-None-Match: "7a11f-10ed-3a75ae4a"
    User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)
    Host: www.intel-iris.net
    Connection: Keep-Alive
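A server receiving such a cache check has to decide between 304 (Not Modified) and a full 200 response. The following is a simplified sketch of that decision: the entity tag, when present, takes precedence over the date. The function name `cache_check_status` and its parameters are assumptions, and weak validators (`W/` prefixes) are not handled.

```python
from email.utils import parsedate_to_datetime


def cache_check_status(request_headers, current_etag, last_modified):
    """Return 304 if the client's cached copy is still valid, else 200."""
    if_none_match = request_headers.get("If-None-Match")
    if if_none_match is not None:
        # The entity tag comparison takes precedence over the date check.
        tags = [tag.strip() for tag in if_none_match.split(",")]
        return 304 if current_etag in tags or "*" in tags else 200
    if_modified_since = request_headers.get("If-Modified-Since")
    if if_modified_since is not None:
        since = parsedate_to_datetime(if_modified_since)
        changed = parsedate_to_datetime(last_modified) > since
        return 200 if changed else 304
    # Unconditional request: always send the document.
    return 200


headers = {
    "If-Modified-Since": "Mon, 29 Jan 2001 17:54:18 GMT",
    "If-None-Match": '"7a11f-10ed-3a75ae4a"',
}
print(cache_check_status(headers, '"7a11f-10ed-3a75ae4a"',
                         "Mon, 29 Jan 2001 17:54:18 GMT"))
```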
34. Problems: Over 50% of all HTTP objects are uncacheable, and this is not easily solvable. Dynamic data: stock prices, scores, web cams. CGI scripts: results based on passed parameters. SSL: encrypted data is not cacheable; most web clients don't handle mixed pages well, so many generic objects are transferred with SSL. Cookies: results may be based on passed data. Hit metering: the owner wants to measure the number of hits for revenue, etc.
35. Server Selection. Which server? Lowest load: to balance load on servers. Best performance: to improve client performance. Any alive node: to provide fault tolerance. How to direct clients to a specific server? Cluster load balancing: TCP hand-off. As part of the application: HTTP redirect. As part of naming: DNS.
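The three selection goals can be expressed as interchangeable policies. This is a sketch under assumed data: the `Server` record, its `load` and `latency_ms` fields, and the policy names are all hypothetical, and how load or latency would actually be measured is left open.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    load: float        # assumed metric, e.g. fraction of capacity in use
    latency_ms: float  # assumed measured latency to the client
    alive: bool = True


def select_server(servers, policy):
    """Pick a server according to one of the three selection goals."""
    candidates = [s for s in servers if s.alive]
    if not candidates:
        raise LookupError("no alive server available")
    if policy == "lowest_load":
        return min(candidates, key=lambda s: s.load)
    if policy == "best_performance":
        return min(candidates, key=lambda s: s.latency_ms)
    if policy == "any_alive":
        return candidates[0]
    raise ValueError(f"unknown policy: {policy}")


servers = [
    Server("a", load=0.9, latency_ms=20),
    Server("b", load=0.2, latency_ms=80),
    Server("c", load=0.1, latency_ms=5, alive=False),
]
print(select_server(servers, "lowest_load").name)
print(select_server(servers, "best_performance").name)
```

Dead servers are filtered out before any policy is applied, so every policy also gives basic fault tolerance.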
36. Application-Based Redirection: HTTP supports a simple way to indicate that a Web page has moved (30X responses). The server receives a GET request from the client, decides which server is best suited for the particular client and object, and returns an HTTP redirect to that server. This may introduce additional overhead: multiple connection setups, name lookups, etc.
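A redirect of this kind is just a short response with a Location header. The sketch below builds one as raw text; the choice of 302 (one of the 30X codes), the function name `redirect_response`, and the host names are assumptions made for illustration.

```python
def redirect_response(path, best_server):
    """Build a raw HTTP redirect that sends the client to the chosen server."""
    location = f"http://{best_server}{path}"
    return (
        "HTTP/1.1 302 Found\r\n"
        f"Location: {location}\r\n"
        "Content-Length: 0\r\n"
        "\r\n"
    )


# Hypothetical replica name chosen by some selection policy.
print(redirect_response("/index.html", "replica2.example.org"))
```

The client then has to open a new connection to the named server, which is the extra overhead the slide mentions.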
37. Naming Based: The client does a name lookup for the service, and the name server chooses an appropriate server address; the A record returned is the "best" one for the client. The name server could base its decision on: server load/location (which must be collected); information in the name lookup request; the name service client (typically the local name server for the client).
38. Web Proxy Caches: The user configures the browser to access the Web via a cache. The browser sends all HTTP requests to the cache. If the object is in the cache, the cache returns the object; otherwise the cache requests the object from the origin server, then returns the object to the client. (Figure: clients exchange HTTP requests and responses with a proxy server, which in turn exchanges HTTP requests and responses with origin servers.)
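The hit/miss behavior of a proxy cache can be sketched in a few lines. This is a minimal in-memory model, assuming the origin fetch is supplied as a callable; the class name `ProxyCache` and the fake origin are hypothetical, and expiry and validation are deliberately left out.

```python
class ProxyCache:
    """Minimal in-memory proxy cache: serve from the cache, else ask the origin."""

    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin  # callable taking a URL, returning the object
        self._store = {}
        self.hits = 0
        self.misses = 0

    def get(self, url):
        if url in self._store:
            # Object in cache: return it without contacting the origin server.
            self.hits += 1
            return self._store[url]
        # Otherwise request it from the origin, keep a copy, and return it.
        self.misses += 1
        body = self._fetch(url)
        self._store[url] = body
        return body


def fake_origin(url):
    """Stand-in for a real origin server, used only for this example."""
    return f"<html>content of {url}</html>"


cache = ProxyCache(fake_origin)
cache.get("http://www.example.org/")
cache.get("http://www.example.org/")
print(cache.hits, cache.misses)
```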
39. Content Distribution Networks (CDNs): The content providers are the CDN customers. Content replication: the CDN company installs hundreds of CDN servers throughout the Internet, close to users. The CDN replicates its customers' content in CDN servers. When a provider updates content, the CDN updates the servers. (Figure labels: origin server in North America; CDN distribution node; CDN servers in the U.S.A., Asia, and Europe.)
40. Content Distribution Networks: Replicate content on many servers. (Figure: the general organization of a CDN as a feedback-control system.)
41.
42. Web Services Terminology. SOAP (Simple Object Access Protocol): exchanging XML messages on a network. WSDL (Web Services Description Language): describing interfaces of Web services. UDDI (Universal Description, Discovery and Integration): managing registries of Web services.
44. Why a New Framework? CORBA, DCOM, Java/RMI, ... already exist. XML+HTTP: platform/language neutral, widely accepted and utilized. Web service interoperability.
45. Servlets/CGI vs. Web Services. (Figure labels: Browser, GUI Client, Web Server, HTTP GET/POST, WSDL, SOAP, JDBC, DB.)
46.
47. The core of a Web site: a process that has access to a local file system storing documents.
48. A client interacts with Web servers through a special application known as a browser.
58. The request is forwarded to an application system where the resulting reply is generated dynamically. (server-side program execution)
59. Although the Web started as a simple two-tiered client-server architecture for static Web documents, this architecture has been extended to support more advanced types of documents.
73. The front end first inspects the HTTP request and decides which server it should forward that request to.
74. For example, if the front end always forwards requests for the same document to the same server, the server may cache the document, resulting in better response times.
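One simple way for a front end to send requests for the same document to the same server is to hash the document path. This is only a sketch of that idea, not necessarily how any particular front end works; the function name `pick_server` and the server names are assumptions.

```python
import hashlib


def pick_server(path, servers):
    """Map a document path to a server so the same path always hits the same server."""
    digest = hashlib.sha256(path.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as a stable integer.
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


servers = ["server-1", "server-2", "server-3"]  # hypothetical cluster members
for path in ("/index.html", "/news.html", "/index.html"):
    print(path, "->", pick_server(path, servers))
```

A cryptographic hash is used here only because it is stable across runs, unlike Python's built-in `hash` for strings. Note that this scheme remaps most documents when the number of servers changes.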
77. A single domain name is associated with multiple IP addresses.
78. When resolving a host name, a browser would receive a list of multiple addresses, each address corresponding to a server.
79. Normally, browsers choose the first address on the list, but most DNS servers rotate the entries between lookups.
80. As a result, a simple distribution of requests over the servers in the cluster is achieved.
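The rotation of address entries can be sketched as follows. The class name `RoundRobinZone` is hypothetical and the addresses come from a documentation-only range; a real DNS server also deals with TTLs and resolver caching, which weaken the even spread.

```python
from collections import deque


class RoundRobinZone:
    """Name server sketch: return all addresses, rotating the list per lookup."""

    def __init__(self, addresses):
        self._addresses = deque(addresses)

    def resolve(self):
        answer = list(self._addresses)
        # Rotate so that the next lookup starts with the next address.
        self._addresses.rotate(-1)
        return answer


zone = RoundRobinZone(["192.0.2.1", "192.0.2.2", "192.0.2.3"])
for _ in range(4):
    # A browser that always takes the first entry visits each server in turn.
    print(zone.resolve()[0])
```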
81.
82. Many corporations now use the Web for advertising, marketing and sales
83. Web servers might be easy to use, but
84. they are complicated to configure correctly and difficult to build without security flaws.
85.
86. Securing the TCP/IP Stack. At the network level: HTTP, FTP, and SMTP run over TCP, secured by IP/IPSec. At the transport level: HTTP, FTP, and SMTP run over SSL/TLS, which sits on TCP and IP. At the application level: S/MIME, PGP, SET, and Kerberos secure specific applications above SMTP, HTTP, TCP, UDP, and IP.
87. Secure Sockets Layer (SSL): Originally developed (1994) by Netscape in order to secure HTTP communications. A slight variation became Transport Layer Security (TLS), designed to be backward compatible with SSL. TCP provides the reliable end-to-end service that SSL runs on. SSL consists of two sublayers: the SSL Record Protocol (where all the action takes place) and SSL Management (the Handshake, Change Cipher Spec, and Alert Protocols).
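From an application's point of view, the handshake and record protocols are hidden behind a wrapped socket. The sketch below uses Python's standard `ssl` module, which implements TLS, the successor named above; the function names are assumptions, and `open_tls_connection` needs network access to a real server, so it is defined but not called here.

```python
import socket
import ssl


def make_client_context():
    """Create a client-side context that verifies the server's certificate."""
    return ssl.create_default_context()


def open_tls_connection(host, port=443):
    """Open a TCP connection, then run the handshake on top of it."""
    context = make_client_context()
    raw = socket.create_connection((host, port))
    # wrap_socket performs the handshake; later reads and writes on the
    # returned socket are carried as protected records over TCP.
    return context.wrap_socket(raw, server_hostname=host)


context = make_client_context()
print(context.verify_mode == ssl.CERT_REQUIRED, context.check_hostname)
```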
90. Web Service Composition - Current Solutions and Open Problems, by Biplav Srivastava (IBM India Research Laboratory) and Jana Koehler (IBM Zurich Research Laboratory)
91. A Reference Architecture for Web Servers, by Ahmed E. Hassan and Richard C. Holt, Software Architecture Group (SWAG), University of Waterloo
92. An Introduction to Web-based Support Systems, by JingTao Yao, University of Regina
93. Semantic Annotation for Web Services and their Relevance to Environmental Models, by Dumitru Roman, University of Innsbruck / STI Innsbruck
Editor's notes
This is done by ‘wrapping’ some computational capability with a Web Service interface, and allowing other organizations to locate it (via UDDI) and interact with it (via WSDL). Hence, Web Service technology allows the description of an interface in a standard way,