A browser allows users to view and interact with resources on the World Wide Web. It displays HTML pages and other web content by making HTTP requests and rendering the responses. Key components of a browser include a user interface, layout engine, rendering engine, JavaScript interpreter, and networking components. When a user enters a URL, the browser looks up the IP address and sends HTTP requests to retrieve and display the requested content, including linked resources. Common browser features include back/forward buttons, an address bar, and the ability to view page source. Browsers support privacy/security functions and web standards.
2. What is a browser all about?
• An information resource is identified by a Uniform Resource Identifier (URI/URL) and may be
a web page, an image, a video, or another piece of content. Hyperlinks present in resources enable users
to navigate their browsers easily to related resources.
• Browsers are able to display Web pages largely in part to an underlying Web protocol called
HyperText Transfer Protocol (HTTP).
• The primary purpose of a web browser is to bring information resources to the user ("retrieval" or
"fetching"), allowing them to view the information ("display", "rendering"), and then access other
information ("navigation", "following links").
3. Process
• The user inputs a Uniform Resource Locator (URL), for example http://en.wikipedia.org/, into the
browser.
• The prefix of the URL, known as the scheme, determines how the URL will be interpreted.
• The most commonly used kind of URI starts with http: and identifies a resource to be retrieved over
the Hypertext Transfer Protocol (HTTP).
• Browsers also support a variety of other prefixes, such as https: for HTTPS, ftp: for the File
Transfer Protocol, and file: for local files.
• HTML and associated content (image files, formatting information such as CSS, etc.) is passed to
the browser's layout engine to be transformed from markup to an interactive document, a
process known as "rendering".
• Information resources may contain hyperlinks to other information resources. Each link contains
the URI of a resource to go to. When a link is clicked, the browser navigates to the resource
indicated by the link's target URI, and the process of bringing content to the user begins again.
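The scheme-dispatch step of this process can be sketched in Python with the standard library; the handler descriptions below are illustrative placeholders, not a real browser API:

```python
from urllib.parse import urlsplit

# Illustrative mapping from URL scheme to how a browser would fetch the
# resource; the descriptions are placeholders, not a real browser API.
HANDLERS = {
    "http": "retrieve over HTTP",
    "https": "retrieve over HTTP with TLS",
    "ftp": "retrieve over the File Transfer Protocol",
    "file": "read from the local filesystem",
}

def interpret(url):
    """Split a URL the way a browser does before deciding how to fetch it."""
    parts = urlsplit(url)
    return parts.scheme, parts.netloc, HANDLERS.get(parts.scheme, "unknown scheme")

print(interpret("http://en.wikipedia.org/"))
```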
4. What is the first Web browser?
• The first web browser was invented in 1990 by Sir Tim Berners-Lee.
• His browser was called WorldWideWeb and later renamed Nexus.
• The first Internet domain name, "symbolics.com", was registered by Symbolics.
5. Web browser’s user interface elements.
• Back and forward buttons to return to the previous resource or move ahead to the next one.
• A refresh or reload button to reload the current resource.
• A stop button to cancel loading the resource. In some browsers, the stop button is merged with
the reload button.
• A home button to return to the user's home page.
• An address bar to input the Uniform Resource Identifier (URI) of the desired resource and display
it.
• A search bar to input terms into a web search engine. In some browsers, the search bar is merged
with the address bar.
• A status bar to display progress in loading the resource and the URI of links when the cursor
hovers over them; some browsers also place page-zoom controls here.
• The viewport, the visible area of the webpage within the browser window.
• The ability to view the HTML source for a page.
6. Privacy and security of Web browser.
• Most browsers support HTTP Secure and offer quick and easy ways to
delete the web cache, download history, form and search
history, cookies, and browsing history.
7. Standards support.
• Early web browsers supported only a very simple version of HTML.
• Modern web browsers support a combination of standards-based
and de facto HTML and XHTML, which should be rendered in the
same way by all browsers.
8. Components
Web browsers consist of:
• A user interface
• A layout engine
• A rendering engine
• A JavaScript interpreter
• A UI backend
• A networking component
• A data persistence component
9. What is HTTP?
• The Hypertext Transfer Protocol (HTTP) is designed to enable communications between clients
and servers.
• HTTP works as a request-response protocol between a client and server.
• A web browser may be the client, and an application on a computer that hosts a web site may be
the server.
• Example: A client (browser) submits an HTTP request to the server; then the server returns a
response to the client. The response contains status information about the request and may also
contain the requested content.
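This request-response cycle can be demonstrated end to end with Python's standard library; the tiny in-process server below stands in for a real web server, and the "hello" body is just a placeholder:

```python
import http.client
import http.server
import threading

class Hello(http.server.BaseHTTPRequestHandler):
    """A minimal server: answers every GET with a short text body."""
    def do_GET(self):
        body = b"hello"
        self.send_response(200)                      # status information
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)                       # the requested content
    def log_message(self, *args):                    # silence request logging
        pass

server = http.server.HTTPServer(("127.0.0.1", 0), Hello)
threading.Thread(target=server.serve_forever, daemon=True).start()

# The client (http.client standing in for a browser) submits an HTTP request...
conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
conn.request("GET", "/")
# ...and the server returns a response with status information and content.
resp = conn.getresponse()
status, body = resp.status, resp.read()
server.shutdown()
print(status, body)
```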
10. Two HTTP Request Methods: GET and POST
• Two commonly used methods for a request-response between a client and server are: GET and
POST.
• GET - Requests data from a specified resource
• POST - Submits data to be processed to a specified resource
11. The GET Method
• The query string (name/value pairs) is sent in the URL of a GET request:
/test/demo_form.asp?name1=value1&name2=value2
• GET requests can be cached
• GET requests remain in the browser history
• GET requests can be bookmarked
• GET requests should never be used when dealing with sensitive data
• GET requests have length restrictions
• GET requests should be used only to retrieve data
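Building such a query string by hand is error-prone; Python's urllib handles the encoding (the path is the same illustrative one used in the example above):

```python
from urllib.parse import urlencode

# Encode name/value pairs into a query string, as a browser form submission does.
params = {"name1": "value1", "name2": "value2"}
query = urlencode(params)
url = "/test/demo_form.asp?" + query
print(url)  # /test/demo_form.asp?name1=value1&name2=value2
```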
12. The POST Method
• The query string (name/value pairs) is sent in the HTTP message body of a POST request:
POST /test/demo_form.asp HTTP/1.1
Host: w3schools.com
name1=value1&name2=value2
• POST requests are never cached
• POST requests do not remain in the browser history
• POST requests cannot be bookmarked
• POST requests have no restrictions on data length
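The POST request shown above can be assembled on the wire like this. This is only a sketch of the raw HTTP/1.1 message; a Content-Type header is added because form data normally carries one:

```python
from urllib.parse import urlencode

# The name/value pairs travel in the message body, not the URL.
body = urlencode({"name1": "value1", "name2": "value2"})
request = (
    "POST /test/demo_form.asp HTTP/1.1\r\n"
    "Host: w3schools.com\r\n"
    "Content-Type: application/x-www-form-urlencoded\r\n"
    f"Content-Length: {len(body)}\r\n"
    "\r\n"
    + body
)
print(request)
```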
13. HTTP Request Methods
• HEAD - Same as GET, but returns only the HTTP headers and no document body
• PUT - Uploads a representation of the specified URI
• DELETE - Deletes the specified resource
• OPTIONS - Returns the HTTP methods that the server supports
• CONNECT - Converts the request connection to a transparent TCP/IP tunnel
14. HTTP Status Messages
• When a browser requests a service from a web server, an error might
occur.
• This is a list of HTTP status messages that might be returned:
• 1xx: Information
• 2xx: Successful
• 3xx: Redirection
• 4xx: Client Error
• 5xx: Server Error
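Since the class of a status code is just its leading digit, a classifier is a one-line lookup (a minimal sketch):

```python
def status_class(code):
    """Map an HTTP status code to its class, per the 1xx-5xx scheme above."""
    classes = {1: "Information", 2: "Successful", 3: "Redirection",
               4: "Client Error", 5: "Server Error"}
    return classes.get(code // 100, "Unknown")

print(status_class(200), status_class(301), status_class(404), status_class(503))
```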
15. URL (Uniform Resource Locator)
• "http" stands for HyperText Transfer Protocol, knows what protocol it is going to use to access the
information specified in the domain.
• colon ( : ) and two forward slashes ( // ) that separate the protocol from the remainder of the URL.
• www. stands for World Wide Web and is used to distinguish the content.
• .com is the domain name for the website, known as the "domain suffix", or TLD, and is used to identify
the type or location of the website. For example, .com is short for commercial, .org is short for an
organization, and .co.uk is the United Kingdom.
• Some URL tell about directories, where on the server the web page is located.
• The protocol, domain, directories, and files are all separated by forward slashes ( / ).
• The trailing .htm is the file extension of the web page that indicates the file is an HTML file.
• When a URL points to a script that performs additional functions, such as a search engine pointing to a
search results page, additional information (parameters) is added to the end of the URL.
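Python's urllib can pull these pieces apart; the URL below is a hypothetical example:

```python
from urllib.parse import urlsplit

u = urlsplit("http://www.example.com/dir/page.htm?q=term")
print(u.scheme)   # protocol: http
print(u.netloc)   # domain: www.example.com
print(u.path)     # directories and file: /dir/page.htm
print(u.query)    # parameters passed to a script: q=term
```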
16. What really happens when you navigate to a URL.
• You enter a URL into the browser
• The browser looks up the IP address for the domain name
• The browser sends an HTTP request to the web server
• The server (in this example, facebook.com) responds with a permanent redirect
• The browser follows the redirect
• The server ‘handles’ the request
• The server sends back an HTML response
• The browser begins rendering the HTML
• The browser sends requests for objects embedded in HTML
• The browser sends further asynchronous (AJAX) requests
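The redirect-following step in this sequence can be isolated into a small loop. The `fetch` callable is a hypothetical stand-in for a real network fetch, which keeps the logic self-contained:

```python
def follow_redirects(fetch, url, max_hops=5):
    """Follow 3xx redirects until a final resource is reached.

    fetch(url) must return (status, headers, body), where headers is a dict
    with lowercase keys -- a hypothetical stand-in for a real network fetch.
    """
    for _ in range(max_hops):
        status, headers, body = fetch(url)
        if 300 <= status < 400 and "location" in headers:
            url = headers["location"]   # redirect: retry at the new location
            continue
        return url, status, body
    raise RuntimeError("too many redirects")

# Simulate the facebook.com example: a permanent redirect, then a page.
pages = {
    "http://facebook.com/": (301, {"location": "https://www.facebook.com/"}, b""),
    "https://www.facebook.com/": (200, {}, b"<html>...</html>"),
}
final_url, status, body = follow_redirects(lambda u: pages[u], "http://facebook.com/")
print(final_url, status)
```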
17. Library http
• Common HTTP requests are made with get, post, and head; if more
control is required, generic_request can be used.
• The get_url helper function can be used to parse and retrieve a full
URL.
• HTTPS support is transparent. The library uses comm.tryssl to
determine whether SSL is required for a request.
18. The functions return a table of values, including:
• status-line - A string representing the status, such as "HTTP/1.1 200 OK". In case of an error, a
description will be provided in this line.
• status - The HTTP status value; for example, "200". If an error occurs during a request, this value
will be nil.
• header - An associative array representing the header. Keys are all lowercase, and standard
headers, such as 'date', 'content-length', etc., will typically be present.
• rawheader - A numbered array of the headers, exactly as the server sent them. While
header['content-type'] might be 'text/html', rawheader[3] might be 'Content-type: text/html'.
• cookies - A numbered array of the cookies the server sent. Each cookie is a table with the
following keys: name, value, path, domain, and expires.
• body - The full body, as returned by the server.
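The table above can be approximated in Python by parsing a raw response into a dict with the same keys. This is only a sketch: cookie parsing is omitted, and arrays here are 0-indexed rather than 1-indexed:

```python
def parse_response(raw):
    """Parse a raw HTTP response into a dict mirroring the table above
    (cookie parsing omitted for brevity)."""
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    status_line = lines[0]                    # e.g. "HTTP/1.1 200 OK"
    status = int(status_line.split()[1])      # e.g. 200
    rawheader = lines[1:]                     # headers exactly as the server sent them
    header = {}                               # lowercase-keyed view of the same headers
    for h in rawheader:
        name, _, value = h.partition(":")
        header[name.strip().lower()] = value.strip()
    return {"status-line": status_line, "status": status,
            "header": header, "rawheader": rawheader, "body": body}

resp = parse_response(
    "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\nContent-Length: 5\r\n\r\nhello"
)
```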
19. Proxy
• A proxy is an intermediary between you and a remote location on the Internet, relaying requests and
responses over a point-to-point connection.
• A proxy server that passes requests and responses unmodified is usually called a gateway or
sometimes a tunneling proxy.
• A forward proxy is an Internet-facing proxy used to retrieve from a wide range of sources (in most
cases anywhere on the Internet).
• A reverse proxy is usually an internal-facing proxy used as a front-end to control and protect
access to a server on a private network. A reverse proxy commonly also performs tasks such as
load-balancing, authentication, decryption or caching.
20. Open proxies
• An open proxy is a forwarding proxy server that is accessible by any
Internet user.
• Open proxy allows users to conceal their IP address while browsing
the Web or using other Internet services.
• Typically, a proxy server only allows users within a network group (i.e. a closed
proxy) to store and forward Internet services such as DNS or web
pages to reduce and control the bandwidth used by the group. With
an open proxy, however, any user on the Internet is able to use this
forwarding service.
21. Reverse proxy
• A reverse proxy (or surrogate) is a proxy server that appears to clients
to be an ordinary server.
• Requests are forwarded to one or more proxy servers which handle
the request.
• The response from the proxy server is returned as if it came directly
from the original server, leaving the client with no knowledge of the origin
servers.
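The forwarding behavior described above can be sketched with Python's standard library. The in-process origin server and the "from origin" body are placeholders; a real deployment would use a dedicated proxy such as nginx:

```python
import http.client
import http.server
import threading

class Origin(http.server.BaseHTTPRequestHandler):
    """Stand-in for a private origin server the client never sees directly."""
    def do_GET(self):
        body = b"from origin"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)
    def log_message(self, *args):  # silence per-request logging
        pass

class ReverseProxy(http.server.BaseHTTPRequestHandler):
    upstream = None  # (host, port) of the origin; set before serving

    def do_GET(self):
        # Forward the request to the origin and relay its response, so the
        # client only ever talks to (and sees) the proxy.
        conn = http.client.HTTPConnection(*self.upstream)
        conn.request("GET", self.path)
        resp = conn.getresponse()
        body = resp.read()
        self.send_response(resp.status)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)
    def log_message(self, *args):
        pass

origin = http.server.HTTPServer(("127.0.0.1", 0), Origin)
threading.Thread(target=origin.serve_forever, daemon=True).start()

ReverseProxy.upstream = ("127.0.0.1", origin.server_port)
proxy = http.server.HTTPServer(("127.0.0.1", 0), ReverseProxy)
threading.Thread(target=proxy.serve_forever, daemon=True).start()

# The client talks only to the proxy, yet receives the origin's content.
client = http.client.HTTPConnection("127.0.0.1", proxy.server_port)
client.request("GET", "/")
reply = client.getresponse().read()
print(reply)
```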
22. Reasons for installing reverse proxy servers:
• Encryption / SSL acceleration: when secure web sites are created, the Secure Sockets Layer (SSL) encryption is often not done by
the web server itself, but by a reverse proxy that is equipped with SSL acceleration hardware. Furthermore, a host can provide a
single "SSL proxy" to provide SSL encryption for an arbitrary number of hosts; removing the need for a separate SSL Server
Certificate for each host, with the downside that all hosts behind the SSL proxy have to share a common DNS name or IP address
for SSL connections. This problem can partly be overcome by using the SubjectAltName feature of X.509 certificates.
• Load balancing: the reverse proxy can distribute the load to several web servers, each web server serving its own application area.
In such a case, the reverse proxy may need to rewrite the URLs in each web page (translation from externally known URLs to the
internal locations).
• Serve/cache static content: A reverse proxy can offload the web servers by caching static content like pictures and other static
graphical content.
• Compression: the proxy server can optimize and compress the content to speed up the load time.
• Spoon feeding: reduces resource usage caused by slow clients on the web servers by caching the content the web server sent and
slowly "spoon feeding" it to the client. This especially benefits dynamically generated pages.
• Security: the proxy server is an additional layer of defense and can protect against some OS and Web Server specific attacks.
However, it does not provide any protection from attacks against the web application or service itself, which is generally
considered the larger threat.
• Extranet Publishing: a reverse proxy server facing the Internet can be used to communicate to a firewall server internal to an
organization, providing extranet access to some functions while keeping the servers behind the firewalls. If used in this way,
security measures should be considered to protect the rest of your infrastructure in case this server is compromised, as its web
application is exposed to attack from the Internet.
23. Proxies still require trust
• Remember that while a proxy server will provide you with security and anonymity,
the proxy itself has to decode your traffic to send it through.
• This means it can see everything you're doing, unless you use SSL connections.
• So you need to trust it.
• A lot of people use Tor, which is a free anonymity network run by volunteers, or
some go to underground channels to get so-called "private" proxies, but the
problem is you never know if you can trust those servers.
• It may end up being worse than not using a proxy at all.
• The proxy server is the one party that knows what your real IP address is.
• Also, using proxies will typically slow your connection down, since you're
basically transferring all your data to another location around the world before it
goes out to the Internet.
24. Load balancing
• Load balancing improves the distribution of workloads across multiple computing resources, such as
computers, a computer cluster, network links, central processing units, or disk drives.
• Load balancing aims to optimize resource use, maximize throughput, minimize response time, and avoid
overload of any single resource.
• Using multiple components with load balancing instead of a single component may increase reliability and
availability through redundancy.
• Load balancing usually involves dedicated software or hardware, such as a multilayer switch or a Domain
Name System server process.
• User requests to the Wikimedia Elasticsearch server cluster are routed via load balancing.
• Elasticsearch can be used to search all kinds of documents. It provides scalable search, has near real-time
search, and supports multitenancy. "Elasticsearch is distributed, which means that indices can be divided
into shards and each shard can have zero or more replicas. Each node hosts one or more shards, and acts
as a coordinator to delegate operations to the correct shard(s). Rebalancing and routing are done
automatically [...]".
• Elasticsearch uses Lucene and tries to make all its features available through the JSON and Java APIs. It
supports faceting and percolating, which can be useful for notifying clients when new documents match
registered queries.
• Another feature is called "gateway" and handles the long-term persistence of the index; for example, an
index can be recovered from the gateway in the event of a server crash. Elasticsearch supports real-time
GET requests, which makes it suitable as a NoSQL datastore, but it lacks distributed transactions.
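Round-robin is the simplest way to distribute requests across several servers, and can be sketched as follows (the backend names are hypothetical; a real balancer would also track backend health):

```python
import itertools

class RoundRobinBalancer:
    """The simplest load-balancing strategy: hand out backends in turn."""
    def __init__(self, backends):
        self._cycle = itertools.cycle(backends)

    def pick(self):
        """Return the backend that should serve the next request."""
        return next(self._cycle)

# Hypothetical backend names standing in for real web servers.
lb = RoundRobinBalancer(["srv-a", "srv-b", "srv-c"])
order = [lb.pick() for _ in range(6)]
print(order)  # ['srv-a', 'srv-b', 'srv-c', 'srv-a', 'srv-b', 'srv-c']
```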
25. Gateway
• A gateway is a network node that routes traffic between two different networks, such as a home LAN and
the Internet; in a home network, the router acts as the default gateway.
• The gateway address is the LAN IP address of the router. For example, if the router's LAN IP address is
192.168.1.1, hosts on the network use an IP address of 192.168.1.x (x is from 2 to 253), subnet mask
255.255.255.0, and default gateway 192.168.1.1. Note: the DNS server should be provided by your ISP.
26. Application program interface (API)
• An application program interface (API) is code that allows two
software programs to communicate with each other.
27. VPN
• If you work remotely, or have to handle corporate files on the road, then chances are you've used a specific type of proxy and may not even
be aware of it.
• In fact, proxies are used by workers all over the world in the form of a VPN.
• A virtual private network is one specific type of proxy which provides you with the ability to work remotely and securely.
• Opening a VPN to your corporate office means your computer will create a permanent connection between your own system and a
dedicated device at the corporate office called the VPN server.
• This connection provides you with a tunnel through which all further communication will pass.
• This is the first and most well known quality of a VPN.
• All of your traffic, whatever it is, will be encrypted inside that tunnel, going from your current location to the VPN server, and then be
resent on your behalf to the wider Internet.
• What this means is that anyone listening nearby, or trying to see the packets going from your own system, will see nothing but static.
• In fact, they won't even know which websites you visit, because everything is encrypted.
• This is an even stronger security mechanism than SSL, since with SSL people can still see the headers and know which sites you surf to.
• A lot of people use them simply for safety.
• If you have a slow Internet connection, you could use a proxy server with a lot of bandwidth. Malware threats
roaming the net trying to find unpatched systems, or launching potential denial-of-service attacks, would find
only the proxy.