1. Server Management Bell College
Configuring the Apache Web Server
Introduction to web servers
Web server processing steps
Running Apache
Configuring Apache
Configuring by editing httpd.conf
Using .htaccess to password protect a directory
Lab: Configuring an Apache server
Introduction to web servers
At a very basic level, Web servers serve simple, static content: HTML documents and
images. When a user requests a file through a browser, the Web server receives the
request and looks up the file in the host file system. The file is read from disk, and the
Web server sends it back across the network to the Web client (browser).
When we talk about web servers we can actually mean two different things:
• the computer which contains the web site files, or
• the software which runs on the computer
In these notes we are mostly referring to the software.
HTTP
The browser and the Web server talk to each other using Hypertext Transfer Protocol
(HTTP). The browser opens a TCP connection to the server and requests the HTML
document, followed by the images which the document requires; with persistent
connections (HTTP/1.1), these requests can share a single TCP connection. HTTP is
used to transmit resources, not just files. A resource is some chunk of information that
can be identified by a URL. The most common kind of resource is a file, but a resource
may also be the output of a server-side script.
A browser is an HTTP client because it sends requests to an HTTP server (Web
server), which then sends responses back to the client. The standard (and default) port
for HTTP servers to listen on is 80, though they can use any port.
Like most network protocols, HTTP uses the client-server model: An HTTP client opens
a connection and sends a message called a request to an HTTP server; the server
then returns a response, usually containing the resource that was requested.
Configuring Web Servers page 1
In the simplest case, the server closes the connection after delivering the response.
HTTP is a stateless protocol: no information is kept between one transaction and the
next. HTTP communication is like asking a single question and getting an answer, rather
than having a conversation.
The formats of the request and response messages are similar. Both kinds of message
consist of:
• an initial line,
• zero or more lines known as headers,
• a blank line,
• an optional message body (e.g. a file, or query data, or query output).
Header lines provide information about the request or response.
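The message layout described above can be sketched in a few lines of Python. This is only an illustration: the function names build_request and parse_response and the sample response text are made up for these notes, and no network connection is opened.

```python
# Illustrative sketch: HTTP messages are plain text, with lines ending in CRLF.

def build_request(host, path="/"):
    """Return a minimal HTTP/1.1 GET request as a string."""
    lines = [
        f"GET {path} HTTP/1.1",  # initial line: method, resource, protocol version
        f"Host: {host}",         # header line (required by HTTP/1.1)
        "Connection: close",     # header line: ask the server to close afterwards
        "",                      # blank line marks the end of the headers
        "",                      # a simple GET has no message body
    ]
    return "\r\n".join(lines)


def parse_response(raw):
    """Split a raw HTTP response into (initial line, headers dict, body)."""
    head, _, body = raw.partition("\r\n\r\n")
    status_line, *header_lines = head.split("\r\n")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status_line, headers, body


if __name__ == "__main__":
    # Hypothetical response text, used only to exercise the parser.
    sample = (
        "HTTP/1.1 200 OK\r\n"
        "Content-Type: text/html\r\n"
        "Content-Length: 13\r\n"
        "\r\n"
        "<p>Hello</p>\n"
    )
    print(build_request("hamilton.bell.ac.uk", "/index.html"))
    print(parse_response(sample))
```

Note how the blank line is what separates the headers from the body in both directions.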
Dynamic content
Many web sites nowadays feature dynamic content and user interaction, for example for
online shopping. This requires accessing databases or running other program code. The
Web server delivers to the browser a Web page that is created in response to user input
(direct or indirect). The traditional mechanism for this is CGI.
Common Gateway Interface (CGI)
CGI is a Web server extension protocol which defines how a Web server passes
information from a client request to an external program. CGI is not language specific;
it is a protocol that allows the Web server to communicate with a program. The CGI
standard defines how the Web server should run programs locally and transmit their
output to the Web browser. For example, as a result of a client request, a Web server
launches a CGI program (e.g. processform.cgi) and passes it the parameters sent by the
client browser. It then collects the output of processform.cgi and passes it back to the
browser. This is how CGI programs dynamically serve HTML data based on user input.
CGI's main disadvantage is that it is slow, since each request for dynamic content
requires a new program to be launched. CGI scripts can be written in many languages,
including Perl, C and Python.
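As a rough illustration, a CGI program in Python might look like the sketch below. The QUERY_STRING environment variable is part of the CGI standard; the function name cgi_response and the form parameter "name" are assumptions made for this example.

```python
import html
import os
import sys
from urllib.parse import parse_qs


def cgi_response(environ):
    """Build the text a CGI program would write to standard output."""
    # The Web server passes the URL's query string in QUERY_STRING.
    params = parse_qs(environ.get("QUERY_STRING", ""))
    name = params.get("name", ["stranger"])[0]  # "name" is a hypothetical form field
    # Escape user input before placing it in HTML.
    body = f"<html><body><p>Hello, {html.escape(name)}!</p></body></html>"
    # CGI output: header line(s), a blank line, then the document itself.
    return "Content-Type: text/html\r\n\r\n" + body


if __name__ == "__main__":
    # When launched by the Web server, the real environment is used.
    sys.stdout.write(cgi_response(os.environ))
```

The server starts a fresh process like this for every request, which is the cost described above.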
Hypertext Transfer Protocol Secure (HTTPS)
HTTPS is a secure version of HTTP. With HTTPS, sensitive data can be exchanged safely
between the user and the server across an insecure network. URLs that begin with 'https'
are handled using the SSL protocol (now succeeded by TLS), which sets up a secure,
encrypted link between a Web browser and a Web server.
Multipurpose Internet Mail Extensions (MIME)
The MIME type header is the primary mechanism a browser uses to decide how to display
downloaded content. It tells the browser what type of content is being delivered. MIME types are
identified using a type/subtype syntax, and each type is associated with one or more file
extensions. Here are some examples:
MIME type          File extensions
text/xml           xml
video/mpeg         mpeg mpg
video/quicktime    qt mov
HTTP carries the MIME type of any transported resource in the Content-Type header, so
the receiver knows what kind of content it is.
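The lookup from file extension to MIME type can be sketched as a simple table. The table below is a small illustrative subset (the example extensions from these notes plus text/html), and the function name content_type_for is made up; a real server reads a much longer list from its configuration.

```python
import os.path

# Illustrative extension-to-type table, not a complete list.
MIME_TYPES = {
    ".html": "text/html",
    ".xml": "text/xml",
    ".mpeg": "video/mpeg",
    ".mpg": "video/mpeg",
    ".qt": "video/quicktime",
    ".mov": "video/quicktime",
}


def content_type_for(filename, default="application/octet-stream"):
    """Return the MIME type for a filename based on its extension."""
    extension = os.path.splitext(filename)[1].lower()
    # Unknown extensions fall back to a generic binary type.
    return MIME_TYPES.get(extension, default)


if __name__ == "__main__":
    for name in ("index.html", "clip.MOV", "notes.xyz"):
        print(name, "->", content_type_for(name))
```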
Serving Web Pages
Today's Web servers are able to process multiple requests simultaneously in order to
serve more users at a time. This is done using multi-threading or multi-processing. Most
Web sites now run Web servers that support one of these techniques and thus can handle
a much higher load.
Commonly used web server software
The most widely used web server software is Apache. According to www.netcraft.com,
more than 60% of web sites use it. Apache is an open source project and the server is
available for Unix, Linux, Windows and other systems. If you have web space with an
ISP account, it is likely to be hosted on a Unix or Linux system using the Apache server.
In this course you will look at the Apache server on a Linux system.
The next most popular is Microsoft’s Internet Information Services, which runs only
on Windows systems. Other servers include SunONE and Zeus.
Application servers
Dynamic content is now often created by application servers. An application server sits
in the middle of other programs and serves to process data for those other programs. In
a web site, the application server usually sits between the web server and a database. It
is sometimes referred to as middleware. The web server, application server and
database can be on the same computer, or all on different computers. Common
application server technologies are ColdFusion, ASP.NET, PHP and JSP/Java
Servlets. A web server can often be configured to use an application server for
processing – for example, Apache and the ColdFusion server can be used together.
Web server processing steps
Web servers are designed around a certain set of basic goals:
• Accept network connections from browsers.
• Retrieve content from disk.
• Run local CGI programs or application server programs.
• Transmit data back to clients.
• Keep a log of user activity.
• Be as fast as possible.
Apache processes a request in the following steps:
1. Translate URL to filename
2. Parse request headers
3. Access control
4. Check user
5. Check MIME type
6. Invoke handler (sends response)
7. Log the request
Translate URL to filename
For example the URL of a document may look like:
http://hamilton.bell.ac.uk/index.html
The internal path in the filesystem is
/var/www/html/index.html
Thus this step converts the URL into the internal path where the document can be found
on the server.
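The translation step can be sketched as follows. This is a simplified illustration, not Apache's actual algorithm: the function name url_to_filename is made up, and the document root is the /var/www/html value from the example above.

```python
import posixpath
from urllib.parse import unquote, urlsplit

DOCUMENT_ROOT = "/var/www/html"  # document root from the example above


def url_to_filename(url, document_root=DOCUMENT_ROOT):
    """Map the path part of a URL onto a file under the document root."""
    path = unquote(urlsplit(url).path) or "/"
    # normpath collapses "." and ".." so the result stays inside the root.
    normalised = posixpath.normpath("/" + path.lstrip("/"))
    return posixpath.join(document_root, normalised.lstrip("/"))


if __name__ == "__main__":
    print(url_to_filename("http://hamilton.bell.ac.uk/index.html"))
```

Collapsing ".." before joining matters: without it, a URL such as /../../etc/passwd could point outside the document root.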
Parse request headers
The server analyses the HTTP headers of the request.
Access control
Access restrictions can be defined on the resources of the server, according to certain
characteristics of the client (IP address, or hostname).
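A check based on the client's IP address can be sketched like this. The allowed networks and the function name is_allowed are assumptions chosen for illustration; in Apache itself the rules are written as directives in httpd.conf.

```python
import ipaddress

# Hypothetical policy: allow the local LAN and the loopback address only.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("192.168.1.0/24"),
    ipaddress.ip_network("127.0.0.1/32"),
]


def is_allowed(client_ip, allowed=ALLOWED_NETWORKS):
    """Return True if the client address falls inside an allowed network."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in allowed)


if __name__ == "__main__":
    for ip in ("192.168.1.4", "10.0.0.1"):
        print(ip, "allowed" if is_allowed(ip) else "denied")
```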
Check user
If a resource is password protected, Apache checks whether the login and password
provided by the client exist and are valid.
Check MIME type of the object requested
Apache determines the MIME type of the requested document so that it can choose the
right action (for example, if it is a CGI file, the program is run).
Invoke handler (sends response)
The HTTP response is built and sent to the client. The response can be a static
document, or can be generated dynamically, depending on the request.
Log the request
Apache records a trace of the transaction in one or more logfiles. The logfiles can be
analysed to obtain information about site visitors.
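To give an idea of what such a trace looks like, the sketch below formats one entry in the widely used Common Log Format (client host, identity, user, timestamp, request line, status code, response size). The function name common_log_line and the sample values are made up for illustration.

```python
from datetime import datetime, timezone


def common_log_line(host, user, when, request_line, status, size):
    """Format one access log entry in Common Log Format."""
    timestamp = when.strftime("%d/%b/%Y:%H:%M:%S %z")
    user_field = user or "-"  # "-" means no authenticated user
    # The second field (client identity) is almost always "-".
    return f'{host} - {user_field} [{timestamp}] "{request_line}" {status} {size}'


if __name__ == "__main__":
    # Hypothetical request, used only to show the layout.
    when = datetime(2004, 3, 1, 12, 0, 0, tzinfo=timezone.utc)
    print(common_log_line("127.0.0.1", None, when, "GET /index.html HTTP/1.1", 200, 1043))
```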
Running Apache
You can download and install the appropriate version of Apache for your system from
www.apache.org. Alternatively, many Linux distributions include Apache, and it can be
installed at the same time as the operating system. In these notes we will look at running
Apache 2.0 as installed on Fedora Linux. Different versions of Linux or Unix may be
configured slightly differently.
The Apache server on Linux runs as the httpd daemon (a daemon is a program that
runs continuously and exists for the purpose of handling requests that a computer
system expects to receive). Usually the system initialisation files are set up to start httpd
when the system starts. It is also possible to start and stop httpd manually. You need to
stop and restart Apache when you change anything in its configuration.
GUI:
Use the Services tool under the Red Hat->System Settings->Server Settings in the
menu on the Fedora desktop. This is similar to Services in Windows 2000.
You can select the httpd service and click Start, Stop or Restart.
To check whether httpd is running, open a browser and enter http://localhost in the
address bar. If it is running, you should see the Apache test page.
Command line:
To start Apache, type the following command in a terminal (you need to be logged in as
the root user):
/usr/sbin/apachectl start
To stop Apache, type
/usr/sbin/apachectl stop
Note that the location of the apachectl file may be different on other Linux systems.
It can be very useful to know how to work with Apache using the command line. Many
web servers are located in data centres with fast internet access. Web server
administrators often need to access their servers remotely using a simple interface such
as ssh (or telnet, which sends everything, including passwords, unencrypted).
Configuring Apache
Apache’s configuration information is kept in a configuration file called httpd.conf. Like
most Linux configuration information, this is a text file which can be edited with a text
editor such as Pico or Vi. Some versions of Linux, including Fedora, provide GUI tools
for configuring httpd.conf. In Red Hat Linux, you can use the HTTP Configuration Tool.
Some common web server configuration tasks include:
Basic configuration
• Server name
• IP address
• TCP port
• Webmaster email address
Site configuration
• Default filename(s)
• Root directory
Access control and authentication
• Allowing or denying access from specific hosts
• Password protecting pages
Virtual Hosting
• Hosting multiple web sites on the same server
Log files
• Defining what information is logged
Apache directories
An Apache web site typically has the following directories:
• Document root: all web pages go in here
• CGI bin: all CGI scripts go in here
• Log directory: all server logs are created in here
• Manual: all Apache documentation is in here
These are usually all located within one directory, such as /var/www.
What is Virtual Hosting?
Virtual hosts are Web sites with different names that all run on the same server
hardware. The idea is that Apache knows which site the user is trying to get, even
though they are all on the same server, and serves content from the right one.
This trick lets you run several Web sites on a single machine, from a variety of different
domain names, and several names within one domain, so that one machine looks like a
room full of servers.
More commonly, web hosting companies and ISPs use virtual hosting to host many web
sites on a single server computer. Many web sites run for smaller businesses or
individuals are hosted this way. Some companies, usually larger ones, have their own
server computer on their own premises, or a dedicated server computer located at a
hosting company’s data centre.
Types of virtual hosting
Virtual hosts can be specified in one of two ways. These are configuration differences on
the server, and are not visible to the client: the user cannot tell what sort of virtual host
they are using, or even that they are using a virtual host at all.
The two types are IP-based virtual hosting and name-based virtual hosting. In a nutshell,
the difference is that IP-based virtual hosts have a different IP address for each virtual
host, while name-based virtual hosts have the same IP address, but use different names
for each one.
IP-based Virtual Hosting
In IP-based virtual hosting, you are running more than one web site on the same server
machine, but each web site has its own IP address. In order to do this, you have to first
tell your operating system about the multiple IP addresses.
Once you have given your machine multiple IP addresses, you will need to make sure
that each IP address and host name is added to your DNS server.
Name-Based Virtual Hosts
Sometimes, you don't have the luxury of giving your machine multiple IP addresses.
Public IP addresses are in short supply, and frequently, for example if you have a DSL
connection, you may only have one. In this case, you need to use name-based virtual
hosting.
You don't have to give your machine multiple IP addresses, but you still need to set up
more than one DNS record on your DNS server for your machine. These extra records
are called CNAME records, or CNAMEs. (The main record pointing at a machine is called
the A record.) You can have as many CNAMEs as you like pointing to a particular
machine.
Configuring by editing httpd.conf
The httpd.conf file is a text file which contains directives which define the configuration.
Apache can be configured by opening the file /etc/httpd/conf/httpd.conf in a text editor, for
example KEdit, Pico or Vi. You can then change or add directives as required.
Some Linux distributions include GUI tools to configure Apache. These simply provide a
convenient way of editing parts of the file – when you change a setting on the GUI tool,
the file is altered accordingly. In this course we will configure by editing httpd.conf, and
the techniques learned here should work on any Linux or Unix server.
The file contains several sections, for different types of configuration. You can usually
search the file for the item you want to change, then edit the relevant lines.
Basic settings are configured with Global directives in the section headed:
### Section 1: Global Environment
For example, to configure the server to listen for requests on 192.168.1.4 on port 80,
httpd.conf must include the line:
Listen 192.168.1.4:80
The following example configures the server to listen on any assigned IP address on port
80:
Listen 80
Site configuration settings are configured in the section headed
### Section 2: 'Main' server configuration
Examples of settings (DirectoryIndex lists the default filenames):
ServerAdmin root@localhost
DocumentRoot "/var/www/html"
ServerName mywebserver
DirectoryIndex index.htm index.html
Access control and authentication for specific directories are also configured here, for
example:
<Directory "/var/www/html">
Order allow,deny
Allow from all
AllowOverride all
</Directory>
Note that directives which define subsections are in bracketed, XML-like elements.
Virtual Hosting configuration settings are configured in the section headed
### Section 3: Virtual Hosts
To use name-based virtual hosting you need a NameVirtualHost directive. The following
directive allows name-based hosting for all IP addresses which listen on port 80.
NameVirtualHost *:80
To create a virtual server you enclose the directives for that server in a <VirtualHost>
directive. The following directive sets up a virtual server which uses the directory
/another/directory as its root directory, and responds to requests for the host myotherhost.
<VirtualHost *:80>
ServerAdmin other@localhost
DocumentRoot /another/directory
ServerName myotherhost
</VirtualHost>
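The way a name-based virtual host is chosen can be sketched in Python: the server compares the Host header sent by the browser with each ServerName. This is a simplified illustration, not Apache's code. The function name document_root_for is made up, and the list assumes (hypothetically) that the main site mywebserver is also defined as the first virtual host; an unmatched name falls back to the first entry, which is how Apache treats requests that match no ServerName.

```python
# (ServerName, DocumentRoot) pairs in the order they appear in httpd.conf.
VIRTUAL_HOSTS = [
    ("mywebserver", "/var/www/html"),       # assumed first (default) virtual host
    ("myotherhost", "/another/directory"),  # from the example above
]


def document_root_for(host_header, virtual_hosts=VIRTUAL_HOSTS):
    """Pick a document root by matching the Host header against ServerName."""
    # The Host header may include a port, e.g. "myotherhost:80".
    hostname = host_header.split(":")[0].strip().lower()
    for server_name, document_root in virtual_hosts:
        if server_name == hostname:
            return document_root
    # No match: use the first virtual host listed.
    return virtual_hosts[0][1]


if __name__ == "__main__":
    for host in ("myotherhost", "mywebserver:80", "unknown.example"):
        print(host, "->", document_root_for(host))
```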
Logging for the default site is configured in Section 2, and for a virtual host is configured
inside the <VirtualHost> directive. For example:
ErrorLog logs/error_log
LogLevel warn
TransferLog logs/access_log
User Directories is an example of an additional configuration option in httpd.conf. If it is
enabled, then each user on the system can store web pages in a directory called
public_html inside their own home directory. If a user called myuser has a page called
mypage.html in this directory on the server myhost, then it can be viewed using the URL
http://myhost/~myuser/mypage.html
User Directories is enabled with these directives:
UserDir enabled <list of usernames>
UserDir public_html
Note that Apache runs with its own user id, so the permissions on the user directory
need to be set to allow users other than the owner to read the files. Permissions of 755
on the user’s home directory, public_html directory and web pages will allow this.
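To see what permissions of 755 mean, the sketch below uses Python's standard stat module to test the "other users" bits. The helper names others_can_read and others_can_enter are made up for illustration.

```python
import stat


def others_can_read(mode):
    """True if users other than the owner and group may read."""
    return bool(mode & stat.S_IROTH)


def others_can_enter(mode):
    """True if other users may enter (search) a directory with this mode."""
    return bool(mode & stat.S_IXOTH)


if __name__ == "__main__":
    for mode in (0o755, 0o700):
        # filemode() shows the familiar "ls -l" style string for a directory.
        print(oct(mode), stat.filemode(stat.S_IFDIR | mode),
              others_can_read(mode), others_can_enter(mode))
```

With 700 on the home directory, the Apache user cannot enter it, so pages in public_html cannot be served even if the pages themselves are readable.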
Using .htaccess to password protect a directory
An .htaccess file is simply a text file containing Apache directives. Those directives apply
to the documents in the directory where the .htaccess file is located, and to all
subdirectories under it as well. Other .htaccess files in subdirectories may change or
nullify the effects of those in parent directories.
What you can put in these files is determined by the AllowOverride directive for a
directory in httpd.conf. This directive specifies, in categories, what directives will be
used if they are found in a .htaccess file. The following directive only allows the
.htaccess file to control user authorisation.
AllowOverride AuthConfig
To restrict access to a directory /var/www/html/private to users who log in with a
username of privateuser and a password of privatepassword, you need to do the
following:
1. Edit httpd.conf to include the following <Directory> directive:
<Directory "/var/www/html/private">
AllowOverride all
</Directory>
2. Create a file called .htaccess inside the directory /var/www/html/private,
containing the following:
AuthName "Private Directory"
AuthType Basic
AuthUserFile /var/www/html/private/.htpasswd
Require user privateuser
3. You need to create a file called .htpasswd in /var/www/html/private which
contains the allowed username and password. You can do this using the
following command:
htpasswd -c /var/www/html/private/.htpasswd privateuser
You are prompted for a password. After you have entered and confirmed the
password, there will be a file called .htpasswd, containing the username and an
encrypted version of the password, something like this:
privateuser:e9Ad7d7qpbvAw
Now, when a user attempts to access a page stored in that directory, the user’s browser
will show a pop-up box and ask for a username and password to access the page.
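Behind the pop-up box, the browser sends the username and password in an Authorization header, as used by AuthType Basic. The sketch below shows how that header is built; the function names are made up for illustration, and the credentials are the example ones from above.

```python
import base64


def basic_auth_header(username, password):
    """Build the Authorization header a browser sends for Basic authentication."""
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return f"Authorization: Basic {token}"


def decode_basic_auth(header):
    """Recover (username, password) from a Basic Authorization header."""
    token = header.split()[-1]
    username, _, password = base64.b64decode(token).decode("utf-8").partition(":")
    return username, password


if __name__ == "__main__":
    header = basic_auth_header("privateuser", "privatepassword")
    print(header)
    print(decode_basic_auth(header))
```

Base64 is an encoding, not encryption: anyone who can see the request can decode the password, so Basic authentication is best combined with HTTPS.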
Lab: Configuring an Apache server
In this lab you will configure an Apache server on a Red Hat Linux system as an intranet
server on a private LAN.
You will need the following set up for you before you start:
• Linux system with Apache installed
• Server name set to myserver
• A test web page in the directory /var/www/html
• A test web page in the directory /var/www/html/secretstuff
• A test web page in the directory /var/otherwww/html
• A test web page in the directory /var/yetanotherwww/html
• A user on the system called labuser with a password of password
• A test web page in the directory /home/labuser/public_html
• Permissions set to 755 for all these directories and files and for labuser’s home
directory
• A root password of password
• A /etc/hosts file set up to resolve myserver, myotherserver, and yetanotherserver
to the local loopback address 127.0.0.1
Start up the system and log in as root with a password of password (in practice, Linux
administrators avoid logging on directly as root, but we will do so here for simplicity).
Start Apache. You can use the GUI Services dialog or the command line.
Access the URL http://myserver in your browser. You can use any web browser which
is configured on the system, for example Konqueror, accessed from the
Red Hat->Internet->More Internet Tools desktop menu. You should see the test page.
Open httpd.conf for editing. For example, you can open a terminal window and type:
kedit /etc/httpd/conf/httpd.conf &
Make the configuration changes listed on the following pages. Take a note of what
you did to achieve each change. Test each change as you make it – do not try to make
all the changes at once! Tick the appropriate box when you have tested a change
successfully.
You need to restart Apache after you make changes.
CONFIGURATION CHANGE 1: DEFAULT FILENAME
Add default.html to the list of default file names
______________________________________________________________________
______________________________________________________________________
______________________________________________________________________
TEST:
Open a browser.
[ ] OK  Enter the URL http://myserver. You should see the test web page default.html.
CONFIGURATION CHANGE 2: PORT
Configure the server to listen on all addresses on port 8084. Restart Apache.
______________________________________________________________________
______________________________________________________________________
______________________________________________________________________
TEST:
Open a browser.
[ ] OK  Enter the URL http://myserver:8084. You should see the test web page
        default.html. Note: this requires that configuration change 1 has been done.
When you have done this, reconfigure the server to listen on all addresses on port 80
and restart Apache.
CONFIGURATION CHANGE 3: PASSWORD PROTECTION
Password protect the directory /var/www/html/secretstuff with a username of secret
and a password of stuff. Restart Apache.
______________________________________________________________________
______________________________________________________________________
______________________________________________________________________
______________________________________________________________________
TEST:
Open a browser.
[ ] OK  Enter the URL http://myserver/secretstuff/. You should be prompted for a
        username and password. Enter secret and wrong. You should be denied
        access.
[ ] OK  Enter the URL http://myserver/secretstuff/. You should be prompted for a
        username and password. Enter secret and stuff. You should see the test web
        page default.html. Note: this requires that configuration change 1 has been
        done.
CONFIGURATION CHANGE 4: USER DIRECTORIES
Configure the server to allow access to pages in labuser’s public_html
directory. Restart Apache.
______________________________________________________________________
______________________________________________________________________
______________________________________________________________________
TEST:
Open a browser.
[ ] OK  Enter the URL http://myserver/~labuser/. You should see the test web page
        default.html. Note: this requires that configuration change 1 has been done.
CONFIGURATION CHANGE 5: VIRTUAL HOSTS
Add two name-based virtual hosts as follows, and restart Apache.
hostname: myotherserver
document root: /var/otherwww/html
hostname: yetanotherserver
document root: /var/yetanotherwww/html
______________________________________________________________________
______________________________________________________________________
______________________________________________________________________
TEST:
Open a browser.
[ ] OK  Enter the URL http://myotherserver/. You should see the appropriate
        test web page default.html. Note: this requires that configuration
        change 1 has been done.
AND
[ ] OK  Enter the URL http://yetanotherserver/. You should see a different test
        web page default.html.