© 1999 Elliotte Rusty Harold 04/27/13
URLs, InetAddresses, and
URLConnections
High Level Network Programming
Elliotte Rusty Harold
elharo@metalab.unc.edu
http://metalab.unc.edu/javafaq/slides/
We will learn how Java
handles
• Internet Addresses
• URLs
• CGI
• URLConnection
• Content and Protocol handlers
I assume you
• Understand basic Java syntax and I/O
• Have a user’s view of the Internet
• Have no prior network programming
experience
Applet Network Security
Restrictions
• Applets may:
– send data to the code base
– receive data from the code base
• Applets may not:
– send data to hosts other than the code base
– receive data from hosts other than the code
base
Some Background
• Hosts
• Internet Addresses
• Ports
• Protocols
Hosts
• Devices connected to the Internet are
called hosts
• Most hosts are computers, but hosts also
include routers, printers, fax machines,
soda machines, bat houses, etc.
Internet addresses
• Every host on the Internet is identified by a
unique, four-byte Internet Protocol (IP)
address.
• This is written in dotted quad format like
199.1.32.90 where each byte is an unsigned
integer between 0 and 255.
• There are about four billion unique IP
addresses, but they aren’t very efficiently
allocated
Domain Name System
(DNS)
• Numeric addresses are mapped to names
like "www.blackstar.com" or
"star.blackstar.com" by DNS.
• Each site runs domain name server
software that translates names to IP
addresses and vice versa
• DNS is a distributed system
The InetAddress Class
• The java.net.InetAddress class
represents an IP address.
• It converts numeric addresses to host
names and host names to numeric
addresses.
• It is used by other network classes like
Socket and ServerSocket to identify
hosts
Creating InetAddresses
• There are no public InetAddress()
constructors. Arbitrary addresses may
not be created.
• All addresses that are created must be
checked with DNS
The getByName() factory method
public static InetAddress getByName(String host)
    throws UnknownHostException

InetAddress utopia, duke;
try {
  utopia = InetAddress.getByName("utopia.poly.edu");
  duke = InetAddress.getByName("128.238.2.92");
}
catch (UnknownHostException e) {
  System.err.println(e);
}
Other ways to create InetAddress objects
public static InetAddress[] getAllByName(String host)
    throws UnknownHostException
public static InetAddress getLocalHost()
    throws UnknownHostException
Getter Methods
• public boolean isMulticastAddress()
• public String getHostName()
• public byte[] getAddress()
• public String getHostAddress()
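The getters above can be tried without a name lookup by handing getByName() a numeric address. The following is a minimal sketch; the class name DottedQuad and its format() helper are assumptions made for illustration, not part of java.net. Java bytes are signed, so each byte returned by getAddress() is masked with 0xFF to recover a value between 0 and 255.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class DottedQuad {

  // Hypothetical helper: turns the raw bytes from getAddress() into
  // dotted quad form. Java bytes are signed, so mask each with 0xFF.
  static String format(byte[] address) {
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < address.length; i++) {
      if (i > 0) sb.append('.');
      sb.append(address[i] & 0xFF);
    }
    return sb.toString();
  }

  public static void main(String[] args) throws UnknownHostException {
    // A numeric string is parsed directly; no name lookup is needed.
    InetAddress ia = InetAddress.getByName("199.1.32.90");
    System.out.println(format(ia.getAddress()));
    System.out.println(ia.getHostAddress());
    System.out.println(ia.isMulticastAddress());
  }
}
```

Without the mask, the first byte of 199.1.32.90 would print as a negative number.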
Utility Methods
• public int hashCode()
• public boolean equals(Object o)
• public String toString()
Ports
• In general a host has only one Internet
address
• This address is subdivided into 65,536
ports
• Ports are logical abstractions that allow
one host to communicate simultaneously
with many other hosts
• Many services run on well-known ports.
For example, http tends to run on port 80
Protocols
• A protocol defines how two hosts talk to
each other.
• The daytime protocol, RFC 867, specifies
an ASCII representation for the time
that's legible to humans.
• The time protocol, RFC 868, specifies a
binary representation for the time that's
legible to computers.
• There are thousands of protocols,
standard and non-standard
IETF RFCs
• Requests For Comment
• Document how much of the Internet
works
• Various status levels from obsolete to
required to informational
• TCP/IP, telnet, SMTP, MIME, HTTP,
and more
• http://www.faqs.org/rfc/
W3C Standards
• IETF is based on “rough consensus and
running code”
• W3C tries to run ahead of
implementation
• IETF is an informal organization open to
participation by anyone
• W3C is a vendor consortium open only to
companies
W3C Standards
• HTTP
• HTML
• XML
• RDF
• MathML
• SMIL
• P3P
URLs
• A URL, short for "Uniform Resource
Locator", is a way to unambiguously
identify the location of a resource on the
Internet.
Example URLs
http://java.sun.com/
file:///Macintosh%20HD/Java/Docs/JDK%201.1.1%20docs/api/java.net.InetAddress.html#_top_
http://www.macintouch.com:80/newsrecent.shtml
ftp://ftp.info.apple.com/pub/
mailto:elharo@metalab.unc.edu
telnet://utopia.poly.edu
ftp://mp3:mp3@138.247.121.61:21000/c%3a/stuff/mp3/
http://elharo@java.oreilly.com/
http://metalab.unc.edu/nywc/comps.phtml?category=Choral+Works
The Pieces of a URL
• the protocol, aka scheme
• the authority
– user info
user name
password
– host name or address
– port
• the path, aka file
• the ref, aka section or anchor
• the query string
The java.net.URL class
• A URL object represents a URL.
• The URL class contains methods to
– create new URLs
– parse the different parts of a URL
– get an input stream from a URL so you can
read data from a server
– get content from the server as a Java object
Content and Protocol
Handlers
• Content and protocol handlers separate the data
being downloaded from the protocol used to
download it.
• The protocol handler negotiates with the server
and parses any headers. It gives the content
handler only the actual data of the requested
resource.
• The content handler translates those bytes into a
Java object like an InputStream or
ImageProducer.
Finding Protocol Handlers
• When the virtual machine creates a URL
object, it looks for a protocol handler
that understands the protocol part of the
URL such as "http" or "mailto".
• If no such handler is found, the
constructor throws a
MalformedURLException.
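The handler lookup can be observed directly, since constructing a URL never opens a connection. This is a small sketch under that assumption; the class name ProtocolCheck, the helper isSupported(), and the scheme "nosuchscheme" are made up for illustration.

```java
import java.net.MalformedURLException;
import java.net.URL;

public class ProtocolCheck {

  // Hypothetical helper: reports whether this VM can find a protocol
  // handler for the given scheme. Constructing the URL does not connect.
  static boolean isSupported(String scheme) {
    try {
      new URL(scheme, "localhost", "/");
      return true;
    } catch (MalformedURLException e) {
      // No handler was found for this scheme.
      return false;
    }
  }

  public static void main(String[] args) {
    String[] schemes = {"http", "file", "nosuchscheme"};
    for (String scheme : schemes) {
      System.out.println(scheme + ": " + isSupported(scheme));
    }
  }
}
```

Which schemes beyond http and file report true depends on the virtual machine in use.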
Supported Protocols
• The exact protocols that Java supports
vary from implementation to
implementation though http and file are
supported pretty much everywhere.
Sun's JDK 1.1 understands ten:
– file
– ftp
– gopher
– http
– mailto
– appletresource
– doc
– netdoc
– systemresource
– verbatim
URL Constructors
• There are four (six in 1.2) constructors in the
java.net.URL class.
public URL(String u) throws MalformedURLException
public URL(String protocol, String host, String file)
    throws MalformedURLException
public URL(String protocol, String host, int port,
    String file) throws MalformedURLException
public URL(URL context, String url)
    throws MalformedURLException
public URL(String protocol, String host, int port,
    String file, URLStreamHandler handler)
    throws MalformedURLException
public URL(URL context, String url, URLStreamHandler handler)
    throws MalformedURLException
Constructing URL Objects
• An absolute URL like
http://www.poly.edu/fall97/grad.html#cs
try {
  URL u = new URL("http://www.poly.edu/fall97/grad.html#cs");
}
catch (MalformedURLException e) {}
Constructing URL Objects in
Pieces
• You can also construct the URL by
passing its pieces to the constructor, like
this:
URL u = null;
try {
  u = new URL("http", "www.poly.edu",
      "/schedule/fall97/bgrad.html#cs");
}
catch (MalformedURLException e) {}
Including the Port
URL u = null;
try {
  u = new URL("http", "www.poly.edu", 8000,
      "/fall97/grad.html#cs");
}
catch (MalformedURLException e) {}
Relative URLs
• Many HTML files contain relative URLs.
• Consider the page
http://metalab.unc.edu/javafaq/index.html
• On this page a link to "books.html" refers to
http://metalab.unc.edu/javafaq/books.html.
Constructing Relative URLs
• The fourth constructor creates URLs
relative to a given URL. For example,
try {
  URL u1 = new URL("http://metalab.unc.edu/index.html");
  URL u2 = new URL(u1, "books.html");
}
catch (MalformedURLException e) {}
• This is particularly useful when parsing
HTML.
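The same constructor handles the three kinds of link an HTML parser is likely to meet: a bare file name, a path from the server root, and a full absolute URL. A minimal sketch, in which the class name RelativeUrls and the helper resolve() are assumptions for illustration:

```java
import java.net.MalformedURLException;
import java.net.URL;

public class RelativeUrls {

  // Hypothetical helper: resolves a link found on a page against the
  // URL of that page and returns the absolute form as a string.
  static String resolve(String pageUrl, String link) {
    try {
      URL base = new URL(pageUrl);
      return new URL(base, link).toString();
    } catch (MalformedURLException e) {
      throw new IllegalArgumentException(e);
    }
  }

  public static void main(String[] args) {
    String page = "http://metalab.unc.edu/javafaq/index.html";
    System.out.println(resolve(page, "books.html"));
    System.out.println(resolve(page, "/nywc/comps.phtml"));
    System.out.println(resolve(page, "http://java.sun.com/"));
  }
}
```

A link that is already absolute simply replaces the context URL.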
Parsing URLs
• The java.net.URL class has five
methods to split a URL into its
component parts. These are:
public String getProtocol()
public String getHost()
public int getPort()
public String getFile()
public String getRef()
For example,
try {
  URL u = new URL("http://www.poly.edu/fall97/grad.html#cs");
  System.out.println("The protocol is " + u.getProtocol());
  System.out.println("The host is " + u.getHost());
  System.out.println("The port is " + u.getPort());
  System.out.println("The file is " + u.getFile());
  System.out.println("The anchor is " + u.getRef());
}
catch (MalformedURLException e) { }
Parsing URLs
• JDK 1.3 adds three more:
public String getAuthority()
public String getUserInfo()
public String getQuery()
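Two of the example URLs shown earlier exercise these newer getters. The sketch below assumes a Java 1.3 or later virtual machine; the class name UrlPieces and its parse() helper are made up for illustration.

```java
import java.net.MalformedURLException;
import java.net.URL;

public class UrlPieces {

  // Hypothetical helper: wraps the checked exception so the calls
  // below stay short.
  static URL parse(String spec) {
    try {
      return new URL(spec);
    } catch (MalformedURLException e) {
      throw new IllegalArgumentException(e);
    }
  }

  public static void main(String[] args) {
    URL u1 = parse("http://elharo@java.oreilly.com/");
    System.out.println("authority: " + u1.getAuthority());
    System.out.println("user info: " + u1.getUserInfo());
    System.out.println("host:      " + u1.getHost());

    URL u2 = parse(
        "http://metalab.unc.edu/nywc/comps.phtml?category=Choral+Works");
    System.out.println("query:     " + u2.getQuery());
    System.out.println("file:      " + u2.getFile());
  }
}
```

Note that getFile() still includes the query string, while getQuery() returns it alone.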
Missing Pieces
• If a port is not explicitly specified in the URL
it's set to -1. This means the default port is to
be used.
• If the ref doesn't exist, it's just null, so watch
out for NullPointerExceptions. Better
yet, test to see that it's non-null before using
it.
• If the file is left off completely, e.g.
http://java.sun.com, then it's set to "/".
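One way to guard against the missing port and the null ref is sketched below. The class name MissingPieces and both helpers are assumptions for illustration. The sketch also relies on getDefaultPort(), which was added to java.net.URL in Java 1.4, after these slides were written.

```java
import java.net.MalformedURLException;
import java.net.URL;

public class MissingPieces {

  static URL parse(String spec) {
    try {
      return new URL(spec);
    } catch (MalformedURLException e) {
      throw new IllegalArgumentException(e);
    }
  }

  // Hypothetical helper: the port a connection would actually use.
  // getPort() returns -1 when the URL does not name a port.
  static int effectivePort(String spec) {
    URL u = parse(spec);
    return u.getPort() != -1 ? u.getPort() : u.getDefaultPort();
  }

  // Hypothetical helper: the ref, or an empty string if there is none,
  // so callers never see null.
  static String refOrEmpty(String spec) {
    String ref = parse(spec).getRef();
    return ref == null ? "" : ref;
  }

  public static void main(String[] args) {
    String[] specs = {
      "http://www.poly.edu/fall97/grad.html#cs",
      "http://www.poly.edu:8000/fall97/grad.html"
    };
    for (String spec : specs) {
      System.out.println(spec);
      System.out.println("  port: " + effectivePort(spec));
      System.out.println("  ref:  [" + refOrEmpty(spec) + "]");
    }
  }
}
```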
Reading Data from a URL
• The openStream() method connects to the
server specified in the URL and returns an
InputStream object fed by the data from that
connection.
public final InputStream openStream() throws
IOException
• Any headers that precede the actual data are
stripped off before the stream is opened.
• Network connections are less reliable and slower
than files. Buffer with a BufferedReader or a
BufferedInputStream.
Webcat
import java.net.*;
import java.io.*;

public class Webcat {

  public static void main(String[] args) {
    for (int i = 0; i < args.length; i++) {
      try {
        URL u = new URL(args[i]);
        InputStream in = u.openStream();
        InputStreamReader isr = new InputStreamReader(in);
        BufferedReader br = new BufferedReader(isr);
        String theLine;
        while ((theLine = br.readLine()) != null) {
          System.out.println(theLine);
        }
      } catch (IOException e) { System.err.println(e);}
    }
  }
}
The Bug in readLine()
• What readLine() does:
– Sees a carriage return, waits to see if next
character is a line feed before returning
• What readLine() should do:
– Sees a carriage return, returns, and throws away the
next character if it's a linefeed
Webcat
import java.net.*;
import java.io.*;

public class Webcat {

  public static void main(String[] args) {
    for (int i = 0; i < args.length; i++) {
      try {
        URL u = new URL(args[i]);
        InputStream in = u.openStream();
        InputStreamReader isr = new InputStreamReader(in);
        // read() returns an int so that -1 can signal end of stream
        int c;
        while ((c = isr.read()) != -1) {
          System.out.print((char) c);
        }
      } catch (IOException e) { System.err.println(e);}
    }
  }
}
CGI
• Common Gateway Interface
• A lot is written about writing server side
CGI. I’m going to show you client side
CGI.
• We’ll need to explore HTTP a little
deeper to do this
Normal web surfing uses
these two steps:
– The browser requests a page
– The server sends the page
• Data flows primarily from the server to
the client.
Forms
• There are times when the server needs to
get data from the client rather than the
other way around. The common way to
do this is with a form like this one:
CGI
• The user types the requested data into the
form and hits the submit button.
• The client browser then sends the data to the
server using the Common Gateway Interface,
CGI for short.
• CGI uses the HTTP protocol to transmit the
data, either as part of the query string or as
separate data following the MIME header.
GET and POST
• When the data is sent as a query string
included with the file request, this is
called CGI GET.
• When the data is sent as data attached to
the request following the MIME header,
this is called CGI POST
HTTP
• Web browsers communicate with web servers
through a standard protocol known as HTTP,
an acronym for HyperText Transfer
Protocol.
• This protocol defines
– how a browser requests a file from a web server
– how a browser sends additional data along with
the request (e.g. the data formats it can accept),
– how the server sends data back to the client
– response codes
A Typical HTTP Connection
– Client opens a socket to port 80 on the server.
– Client sends a GET request including the name
and path of the file it wants and the version of
the HTTP protocol it supports.
– The client sends a MIME header.
– The client sends a blank line.
– The server sends a MIME header
– The server sends the data in the file.
– The server closes the connection.
What the client sends to the
server
GET /javafaq/images/cup.gif
Connection: Keep-Alive
User-Agent: Mozilla/3.01 (Macintosh; I; PPC)
Host: www.oreilly.com:80
Accept: image/gif, image/x-xbitmap, image/jpeg, */*
MIME
• MIME is an acronym for "Multipurpose
Internet Mail Extensions".
• an Internet standard defined in RFCs
2045 through 2049
• originally intended for use with email
messages, but has been adopted for
use in HTTP.
Browser Request MIME
Header
• When the browser sends a request to a
web server, it also sends a MIME header.
• MIME headers contain name-value
pairs, essentially a name followed by a
colon and a space, followed by a value.
Connection: Keep-Alive
User-Agent: Mozilla/3.01 (Macintosh; I; PPC)
Host: www.digitalthink.com:80
Accept: image/gif, image/x-xbitmap,
image/jpeg, image/pjpeg, */*
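The name-value layout is simple enough to parse by hand. The following is only a sketch; the class name MimeHeaderParser and its parse() helper are assumptions, and it ignores details such as continuation lines. The one subtlety it does show is that the split must happen at the first colon, because a value like www.digitalthink.com:80 contains a colon of its own.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class MimeHeaderParser {

  // Hypothetical helper: splits header lines into name-value pairs,
  // cutting each line at its FIRST colon only.
  static Map<String, String> parse(String[] lines) {
    Map<String, String> fields = new LinkedHashMap<String, String>();
    for (String line : lines) {
      int colon = line.indexOf(':');
      if (colon < 0) continue; // not a name-value line; skip it
      String name = line.substring(0, colon).trim();
      String value = line.substring(colon + 1).trim();
      fields.put(name, value);
    }
    return fields;
  }

  public static void main(String[] args) {
    String[] header = {
      "Connection: Keep-Alive",
      "User-Agent: Mozilla/3.01 (Macintosh; I; PPC)",
      "Host: www.digitalthink.com:80",
      "Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*"
    };
    for (Map.Entry<String, String> e : parse(header).entrySet()) {
      System.out.println(e.getKey() + " -> " + e.getValue());
    }
  }
}
```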
Server Response MIME
Header
• When a web server responds to a web
browser it sends a MIME header along
with the response that looks something
like this:
Server: Netscape-Enterprise/2.01
Date: Sat, 02 Aug 1997 07:52:46 GMT
Accept-ranges: bytes
Last-modified: Tue, 29 Jul 1997 15:06:46 GMT
Content-length: 2810
Content-type: text/html
Query Strings
• CGI GET data is sent in URL encoded
query strings
• a query string is a set of name=value
pairs separated by ampersands
Author=Sadie, Julie&Title=Women
Composers
• separated from rest of URL by a question
mark
URL Encoding
• Alphanumeric ASCII characters (a-z, A-Z,
and 0-9) and the $-_.!*'(), punctuation
symbols are left unchanged.
• The space character is converted into a plus
sign (+).
• Other characters (e.g. &, =, ^, #, %, {,
and so on) are translated into a percent
sign followed by the two hexadecimal digits
corresponding to their numeric value.
For example,
• The comma is ASCII character 44
(decimal) or 2C (hex). Therefore if the
comma appears as part of a URL it is
encoded as %2C.
• The query string "Author=Sadie,
Julie&Title=Women Composers" is
encoded as:
Author=Sadie%2C+Julie&Title=Women+Composers
The URLEncoder class
• The java.net.URLEncoder class
contains a single static method which
encodes strings in x-www-form-urlencoded
format
URLEncoder.encode(String s)
For example,
String qs = "Author=Sadie, Julie&Title=Women Composers";
String eqs = URLEncoder.encode(qs);
System.out.println(eqs);
• This prints:
Author%3dSadie%2c+Julie%26Title%3dWomen+Composers
String eqs = "Author=" + URLEncoder.encode("Sadie, Julie");
eqs += "&";
eqs += "Title=";
eqs += URLEncoder.encode("Women Composers");
System.out.println(eqs);
• This prints the properly encoded query
string:
Author=Sadie%2c+Julie&Title=Women+Composers
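The piece-by-piece approach extends to any number of fields. The sketch below is one way to do it, not the only one; the class name QueryStrings and the helper build() are assumptions for illustration. It uses the two-argument encode(String, String) method, which was added in Java 1.4, and current virtual machines write the hexadecimal digits in upper case.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

public class QueryStrings {

  // Hypothetical helper: encodes each name and value separately, then
  // joins them with = and & so those separators stay unencoded.
  static String build(Map<String, String> fields) {
    StringBuilder sb = new StringBuilder();
    try {
      for (Map.Entry<String, String> field : fields.entrySet()) {
        if (sb.length() > 0) sb.append('&');
        sb.append(URLEncoder.encode(field.getKey(), "UTF-8"));
        sb.append('=');
        sb.append(URLEncoder.encode(field.getValue(), "UTF-8"));
      }
    } catch (UnsupportedEncodingException e) {
      // UTF-8 support is required of every Java implementation.
      throw new IllegalStateException(e);
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    Map<String, String> fields = new LinkedHashMap<String, String>();
    fields.put("Author", "Sadie, Julie");
    fields.put("Title", "Women Composers");
    System.out.println(build(fields));
  }
}
```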
The URLDecoder class
• In Java 1.2 the
java.net.URLDecoder class contains
a single static method which decodes
strings in x-www-form-urlencoded
format
URLDecoder.decode(String s)
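Decoding a whole query string works best when the string is split before anything is decoded. This is a minimal sketch; the class name QueryDecoder and the helper decode() are assumptions, and it uses the two-argument decode(String, String) method added in Java 1.4 rather than the one-argument form shown above.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class QueryDecoder {

  // Hypothetical helper: decodes one x-www-form-urlencoded token.
  static String decode(String token) {
    try {
      return URLDecoder.decode(token, "UTF-8");
    } catch (UnsupportedEncodingException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    String query = "Author=Sadie%2C+Julie&Title=Women+Composers";
    // Split on & and = BEFORE decoding, so that an encoded %26 or %3D
    // inside a value is not mistaken for a separator.
    for (String pair : query.split("&")) {
      int eq = pair.indexOf('=');
      if (eq < 0) continue; // no value present; skip this piece
      String name = decode(pair.substring(0, eq));
      String value = decode(pair.substring(eq + 1));
      System.out.println(name + " = " + value);
    }
  }
}
```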
GET URLs
String eqs =
    "Author=" + URLEncoder.encode("Sadie, Julie");
eqs += "&";
eqs += "Title=";
eqs += URLEncoder.encode("Women Composers");
try {
  URL u = new URL("http://www.superbooks.com/search.cgi?" + eqs);
  InputStream in = u.openStream();
  //...
}
catch (IOException e) { //...
URLConnections
• The java.net.URLConnection class
is an abstract class that handles
communication with different kinds of
servers like ftp servers and web servers.
• Protocol specific subclasses of
URLConnection handle different kinds
of servers.
• By default, connections to HTTP URLs
use the GET method.
URLConnections vs. URLs
• Can send output as well as read input
• Can post data to CGIs
• Can read headers from a connection
URLConnection five steps:
1. The URL is constructed.
2. The URL’s openConnection() method
creates the URLConnection object.
3. The parameters for the connection and the
request properties that the client sends to the
server are set up.
4. The connect() method makes the connection
to the server. (optional)
5. The response header information is read using
getHeaderField().
I/O Across a
URLConnection
• Data may be read from the connection in
one of two ways
– raw by using the input stream returned by
getInputStream()
– through a content handler with
getContent().
• Data can be sent to the server using the
output stream provided by
getOutputStream().
For example,
try {
  URL u = new URL("http://www.sd99.com/");
  URLConnection uc = u.openConnection();
  uc.connect();
  InputStream in = uc.getInputStream();
  // read the data...
}
catch (IOException e) { //...
Reading Header Data
• The getHeaderField(String name)
method returns the string value of a named
header field.
• Names are case-insensitive.
• If the requested field is not present, null is
returned.
String lm = uc.getHeaderField("Last-modified");
getHeaderFieldKey()
• The keys of the header fields are returned by
the getHeaderFieldKey(int n)
method.
• The first field is 1.
• If a numbered key is not found, null is
returned.
• You can use this in combination with
getHeaderField() to loop through the
complete header
For example
String key = null;
for (int i = 1; (key = uc.getHeaderFieldKey(i)) != null; i++) {
  System.out.println(key + ": " + uc.getHeaderField(key));
}
getHeaderFieldInt() and
getHeaderFieldDate()
• These are utility methods that read a named
header and convert its value into an int and a
long respectively.
public int getHeaderFieldInt(String name, int Default)
public long getHeaderFieldDate(String name, long Default)
• The long returned by
getHeaderFieldDate() can be converted
into a Date object using a Date() constructor
like this:
long s = uc.getHeaderFieldDate("Last-modified", 0);
Date lm = new Date(s);
Six Convenience Methods
• These return the values of six
particularly common header fields:
public int getContentLength()
public String getContentType()
public String getContentEncoding()
public long getExpiration()
public long getDate()
public long getLastModified()
try {
  URL u = new URL("http://www.sdexpo.com/");
  URLConnection uc = u.openConnection();
  uc.connect();
  String key = null;
  for (int n = 1; (key = uc.getHeaderFieldKey(n)) != null; n++) {
    System.out.println(key + ": " + uc.getHeaderField(key));
  }
}
catch (IOException e) {
  System.err.println(e);
}
Writing data to a
URLConnection
• Similar to reading data from a URLConnection.
• First inform the URLConnection that you plan to
use it for output
• Before getting the connection's input stream, get
the connection's output stream and write to it.
• Commonly used to talk to CGIs that use the
POST method
Eight Steps:
1.Construct the URL.
2.Call the URL’s openConnection()
method to create the URLConnection object.
3.Pass true to the URLConnection’s
setDoOutput() method
4.Create the data you want to send, preferably
as a byte array.
5.Call getOutputStream() to get an
output stream object.
6.Write the byte array created in step 4
onto the stream.
7.Close the output stream.
8.Call getInputStream() to get an
input stream object. Read from it as
usual.
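The eight steps can be sketched as follows. This is an illustration under stated assumptions, not a finished client: the class name FormPoster, both helpers, and the field names username and realname are made up, and the target URL must be supplied on the command line because no real server is assumed. Without an argument the program only prints the encoded form data.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UnsupportedEncodingException;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLEncoder;

public class FormPoster {

  // Step 4: build the form data as a byte array.
  // The field names are illustrative.
  static byte[] formData(String username, String realname) {
    try {
      String body = "username=" + URLEncoder.encode(username, "UTF-8")
          + "&realname=" + URLEncoder.encode(realname, "UTF-8");
      return body.getBytes("US-ASCII");
    } catch (UnsupportedEncodingException e) {
      throw new IllegalStateException(e);
    }
  }

  // Steps 1-3 and 5-8. Returns the number of bytes the server sent back.
  static int post(String target, byte[] data) throws IOException {
    URL u = new URL(target);                  // 1. construct the URL
    URLConnection uc = u.openConnection();    // 2. open the connection
    uc.setDoOutput(true);                     // 3. announce output
    OutputStream out = uc.getOutputStream();  // 5. get the output stream
    out.write(data);                          // 6. write the byte array
    out.close();                              // 7. close the stream
    InputStream in = uc.getInputStream();     // 8. read the response
    int count = 0;
    while (in.read() != -1) count++;
    in.close();
    return count;
  }

  public static void main(String[] args) throws IOException {
    byte[] data = formData("Sadie, Julie", "Women Composers");
    System.out.println(new String(data, "US-ASCII"));
    // Only touch the network if a target URL is supplied.
    if (args.length > 0) {
      System.out.println(post(args[0], data) + " bytes returned");
    }
  }
}
```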
POST CGIs
• A typical POST request to a CGI looks like
this:
POST /cgi-bin/booksearch.pl HTTP/1.0
Referer: http://www.macfaq.com/sampleform.html
User-Agent: Mozilla/3.01 (Macintosh; I; PPC)
Content-length: 48
Content-type: application/x-www-form-urlencoded
Host: utopia.poly.edu:56435

username=Sadie%2C+Julie&realname=Women+Composers
A POST request includes
• the POST line
• a MIME header which must include
– content type
– content length
• a blank line that signals the end of the
MIME header
• the actual data of the form, encoded in
x-www-form-urlencoded format.
• A URLConnection for an http URL will
set up the request line and the MIME
header for you as long as you set its
doOutput field to true by invoking
setDoOutput(true).
• If you also want to read from the
connection, you should set doInput to
true with setDoInput(true) too.
For example,
URLConnection uc = u.openConnection();
uc.setDoOutput(true);
uc.setDoInput(true);
• The request line and MIME header are
sent as soon as the URLConnection
connects. Then getOutputStream()
returns an output stream on which you
can write the x-www-form-urlencoded
name-value pairs.
HttpURLConnection
• java.net.HttpURLConnection is an
abstract subclass of URLConnection
that provides some additional methods
specific to the HTTP protocol.
• URL connection objects that are
returned by an http URL will be
instances of
java.net.HttpURLConnection.
Recall
• a typical HTTP response from a web
server begins like this:
HTTP/1.0 200 OK
Server: Netscape-Enterprise/2.01
Date: Sat, 02 Aug 1997 07:52:46 GMT
Accept-ranges: bytes
Last-modified: Tue, 29 Jul 1997 15:06:46 GMT
Content-length: 2810
Content-type: text/html
Response Codes
• The getHeaderField() and
getHeaderFieldKey() methods don't return the
HTTP response code
• After you've connected, you can retrieve the
numeric response code--200 in the above
example--with the getResponseCode()
method and the message associated with it--
OK in the above example--with the
getResponseMessage() method.
HTTP Request Methods
• Java 1.0 only supports GET and POST
requests to HTTP servers
• Java 1.1/1.2 supports GET, POST, HEAD,
OPTIONS, PUT, DELETE, and TRACE.
• The request method is chosen with the
setRequestMethod(String method)
method.
• A java.net.ProtocolException, a
subclass of IOException, is thrown if an
unknown request method is specified.
getRequestMethod()
• The getRequestMethod() method
returns the string form of the request
method currently set for the
URLConnection. GET is the default
method.
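Both methods can be tried without a network, because openConnection() only creates the connection object and nothing is sent until the connection is used. A minimal sketch follows; the class name RequestMethods, the helper isAccepted(), the method name BOGUS, and the host www.example.com are all assumptions for illustration.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.ProtocolException;
import java.net.URL;

public class RequestMethods {

  // Hypothetical helper: opens an unconnected HttpURLConnection.
  static HttpURLConnection open() throws IOException {
    URL u = new URL("http://www.example.com/");
    return (HttpURLConnection) u.openConnection();
  }

  // Hypothetical helper: true if setRequestMethod() accepts the name.
  static boolean isAccepted(String method) {
    try {
      HttpURLConnection huc = open();
      huc.setRequestMethod(method);
      return method.equals(huc.getRequestMethod());
    } catch (ProtocolException e) {
      return false; // unknown request method
    } catch (IOException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) throws IOException {
    System.out.println("default: " + open().getRequestMethod());
    String[] methods = {"HEAD", "PUT", "BOGUS"};
    for (String method : methods) {
      System.out.println(method + ": " + isAccepted(method));
    }
  }
}
```

Method names are case sensitive, so "head" is rejected where "HEAD" is accepted.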
disconnect()
• The disconnect() method of the
HttpURLConnection class closes the
connection to the web server.
• Needed for HTTP/1.1 Keep-alive
For example,
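try {
  URL u = new URL("http://www.amnesty.org/");
  HttpURLConnection huc = (HttpURLConnection) u.openConnection();
  huc.setRequestMethod("PUT");
  huc.setDoOutput(true); // required before getOutputStream()
  huc.connect();
  OutputStream os = huc.getOutputStream();
  // put the data...
  os.close();
  // ask for the response code only after the data has been written
  int code = huc.getResponseCode();
  if (code >= 200 && code < 300) {
    // the server accepted the data
  }
  huc.disconnect();
}
catch (IOException e) { //...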
usingProxy
• The boolean usingProxy() method
returns true if web connections are being
funneled through a proxy server, false if
they're not.
Redirect Instructions
• Most web servers can be configured to
automatically redirect browsers to the
new location of a page that's moved.
• To redirect browsers, a server sends a
300 level response and a Location header
that specifies the new location of the
requested page.
GET /~elharo/macfaq/index.html HTTP/1.0

HTTP/1.1 302 Moved Temporarily
Date: Mon, 04 Aug 1997 14:21:27 GMT
Server: Apache/1.2b7
Location: http://www.macfaq.com/macfaq/index.html
Connection: close
Content-type: text/html

<HTML><HEAD>
<TITLE>302 Moved Temporarily</TITLE>
</HEAD><BODY>
<H1>Moved Temporarily</H1>
The document has moved <A
HREF="http://www.macfaq.com/macfaq/index.html">here</A>.<P>
</BODY></HTML>
• HTML is returned for browsers that
don't understand redirects, but most
modern browsers do not display this and
jump straight to the page specified in the
Location header instead.
• Because redirects can change the site
a user is connecting to without their
knowledge, URLConnections do not follow
redirects arbitrarily.
Following Redirects
• The HttpURLConnection.setFollowRedirects(true)
method says that connections will follow
redirect instructions from the web server.
Untrusted applets are not allowed to set this.
• HttpURLConnection.getFollowRedirects()
returns true if redirect requests are honored,
false if they're not.
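When redirects are not followed automatically, a client can inspect the response code and Location header itself and decide whether to go on. The sketch below shows only that decision; the class name Redirects and the helper nextLocation() are assumptions, and www.example.com stands in for the original host, which the sample exchange does not show.

```java
import java.net.MalformedURLException;
import java.net.URL;

public class Redirects {

  // Hypothetical helper: given the URL that was requested, the response
  // code, and the value of the Location header, returns the URL to try
  // next, or null if the response is not a usable redirect.
  static String nextLocation(String requested, int code, String location) {
    if (code < 300 || code >= 400 || location == null) return null;
    try {
      // Resolve against the original URL in case Location is relative.
      return new URL(new URL(requested), location).toString();
    } catch (MalformedURLException e) {
      return null;
    }
  }

  public static void main(String[] args) {
    // The host of the original request is a placeholder.
    String requested = "http://www.example.com/~elharo/macfaq/index.html";
    String location = "http://www.macfaq.com/macfaq/index.html";
    System.out.println(nextLocation(requested, 302, location));
    System.out.println(nextLocation(requested, 200, null));
  }
}
```

A real client would also want to cap the number of hops and compare the new host with the old one before continuing.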
To Learn More
• Java Network Programming
– O’Reilly & Associates, 1997
– ISBN 1-56592-227-1
• Java I/O
– O’Reilly & Associates, 1999
– ISBN 1-56592-485-1
• Web Client Programming with Java
– http://www.digitalthink.com/catalog/cs/cs308/index.html
Questions?

Más contenido relacionado

Similar a Presentation_1367055457962

Url web design
Url web designUrl web design
Url web designCojo34
 
Building Data Pipelines for Solr with Apache NiFi
Building Data Pipelines for Solr with Apache NiFiBuilding Data Pipelines for Solr with Apache NiFi
Building Data Pipelines for Solr with Apache NiFiBryan Bende
 
Hortonworks Technical Workshop - HDP Search
Hortonworks Technical Workshop - HDP Search Hortonworks Technical Workshop - HDP Search
Hortonworks Technical Workshop - HDP Search Hortonworks
 
W3C Linked Data Platform Overview
W3C Linked Data Platform OverviewW3C Linked Data Platform Overview
W3C Linked Data Platform OverviewSteve Speicher
 
Top 10 HTML5 Features for Oracle Cloud Developers
Top 10 HTML5 Features for Oracle Cloud DevelopersTop 10 HTML5 Features for Oracle Cloud Developers
Top 10 HTML5 Features for Oracle Cloud DevelopersBrian Huff
 
Up and Running with the Typelevel Stack
Up and Running with the Typelevel StackUp and Running with the Typelevel Stack
Up and Running with the Typelevel StackLuka Jacobowitz
 
Vorontsov, golovko ssrf attacks and sockets. smorgasbord of vulnerabilities
Vorontsov, golovko   ssrf attacks and sockets. smorgasbord of vulnerabilitiesVorontsov, golovko   ssrf attacks and sockets. smorgasbord of vulnerabilities
Vorontsov, golovko ssrf attacks and sockets. smorgasbord of vulnerabilitiesDefconRussia
 
2014 database - course 1 - www introduction
2014 database - course 1 - www introduction2014 database - course 1 - www introduction
2014 database - course 1 - www introductionHung-yu Lin
 
DSpace 4.2 Transmission: Import/Export
DSpace 4.2 Transmission: Import/ExportDSpace 4.2 Transmission: Import/Export
DSpace 4.2 Transmission: Import/ExportDuraSpace
 

Similar a Presentation_1367055457962 (20)

HTTP & HTML & Web
HTTP & HTML & WebHTTP & HTML & Web
HTTP & HTML & Web
 
Url web design
Url web designUrl web design
Url web design
 
Building Data Pipelines for Solr with Apache NiFi
Building Data Pipelines for Solr with Apache NiFiBuilding Data Pipelines for Solr with Apache NiFi
Building Data Pipelines for Solr with Apache NiFi
 
Hortonworks Technical Workshop - HDP Search
Hortonworks Technical Workshop - HDP Search Hortonworks Technical Workshop - HDP Search
Hortonworks Technical Workshop - HDP Search
 
W3C Linked Data Platform Overview
W3C Linked Data Platform OverviewW3C Linked Data Platform Overview
W3C Linked Data Platform Overview
 
Codeigniter
CodeigniterCodeigniter
Codeigniter
 
Top 10 HTML5 Features for Oracle Cloud Developers
Top 10 HTML5 Features for Oracle Cloud DevelopersTop 10 HTML5 Features for Oracle Cloud Developers
Top 10 HTML5 Features for Oracle Cloud Developers
 
Up and Running with the Typelevel Stack
Up and Running with the Typelevel StackUp and Running with the Typelevel Stack
Up and Running with the Typelevel Stack
 
Vorontsov, golovko ssrf attacks and sockets. smorgasbord of vulnerabilities
Vorontsov, golovko   ssrf attacks and sockets. smorgasbord of vulnerabilitiesVorontsov, golovko   ssrf attacks and sockets. smorgasbord of vulnerabilities
Vorontsov, golovko ssrf attacks and sockets. smorgasbord of vulnerabilities
 
Networking
NetworkingNetworking
Networking
 
Presentation_1367055457962

  • 1. © 1999 Elliotte Rusty Harold 04/27/13 URLs, InetAddresses, and URLConnections High Level Network Programming Elliotte Rusty Harold elharo@metalab.unc.edu http://metalab.unc.edu/javafaq/slides/
  • 2. We will learn how Java handles • Internet Addresses • URLs • CGI • URLConnection • Content and Protocol handlers
  • 3. I assume you • Understand basic Java syntax and I/O • Have a user’s view of the Internet • No prior network programming experience
  • 4. Applet Network Security Restrictions • Applets may: – send data to the code base – receive data from the code base • Applets may not: – send data to hosts other than the code base – receive data from hosts other than the code base
  • 5. Some Background • Hosts • Internet Addresses • Ports • Protocols
  • 6. Hosts • Devices connected to the Internet are called hosts • Most hosts are computers, but hosts also include routers, printers, fax machines, soda machines, bat houses, etc.
  • 7. Internet addresses • Every host on the Internet is identified by a unique, four-byte Internet Protocol (IP) address. • This is written in dotted quad format like 199.1.32.90 where each byte is an unsigned integer between 0 and 255. • There are about four billion unique IP addresses, but they aren’t very efficiently allocated
  • 8. Domain Name System (DNS) • Numeric addresses are mapped to names like "www.blackstar.com" or "star.blackstar.com" by DNS. • Each site runs domain name server software that translates names to IP addresses and vice versa • DNS is a distributed system
  • 9. The InetAddress Class • The java.net.InetAddress class represents an IP address. • It converts numeric addresses to host names and host names to numeric addresses. • It is used by other network classes like Socket and ServerSocket to identify hosts
  • 10. Creating InetAddresses • There are no public InetAddress() constructors. Arbitrary addresses may not be created. • All addresses that are created must be checked with DNS
  • 11. The getByName() factory method public static InetAddress getByName(String host) throws UnknownHostException InetAddress utopia, duke; try { utopia = InetAddress.getByName("utopia.poly.edu"); duke = InetAddress.getByName("128.238.2.92"); } catch (UnknownHostException e) { System.err.println(e); }
  • 12. Other ways to create InetAddress objects public static InetAddress[] getAllByName(String host) throws UnknownHostException public static InetAddress getLocalHost() throws UnknownHostException
  • 13. Getter Methods • public boolean isMulticastAddress() • public String getHostName() • public byte[] getAddress() • public String getHostAddress()
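The getter methods above can be sketched as follows. This is a minimal illustration, assuming a literal dotted quad address so that no DNS lookup is needed; the class name `AddressDemo` and the helper `toDottedQuad` are made up for this example and are not part of java.net.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

final class AddressDemo {

    // Java bytes are signed, so mask each one with 0xFF to recover
    // the unsigned 0-255 value used in dotted quad notation.
    static String toDottedQuad(byte[] address) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < address.length; i++) {
            if (i > 0) sb.append('.');
            sb.append(address[i] & 0xFF);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws UnknownHostException {
        // A numeric literal is parsed directly; no name lookup is required.
        InetAddress duke = InetAddress.getByName("128.238.2.92");
        System.out.println(toDottedQuad(duke.getAddress()));
        System.out.println(duke.getHostAddress());
        System.out.println(duke.isMulticastAddress());
    }
}
```

getHostName() is left out of the sketch on purpose, since it may trigger a reverse DNS lookup.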
  • 14. Utility Methods • public int hashCode() • public boolean equals(Object o) • public String toString()
  • 15. Ports • In general a host has only one Internet address • This address is subdivided into 65,536 ports • Ports are logical abstractions that allow one host to communicate simultaneously with many other hosts • Many services run on well-known ports. For example, http tends to run on port 80
  • 16. Protocols • A protocol defines how two hosts talk to each other. • The daytime protocol, RFC 867, specifies an ASCII representation for the time that's legible to humans. • The time protocol, RFC 868, specifies a binary representation for the time that's legible to computers. • There are thousands of protocols, standard and non-standard
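To make the "binary representation" concrete, here is a small sketch of decoding an RFC 868 time reply. It assumes the four bytes arrive most significant byte first and count seconds since midnight GMT, 1 January 1900; the class and method names are illustrative only, and no network connection is made.

```java
import java.util.Date;

final class TimeProtocolDemo {

    // Seconds between 1 Jan 1900 (the RFC 868 epoch) and 1 Jan 1970 (the Java epoch).
    static final long SECONDS_1900_TO_1970 = 2208988800L;

    // Interprets four bytes as an unsigned, big-endian count of seconds
    // since 1900 and converts it to milliseconds since 1970.
    static long toJavaMillis(byte[] fourBytes) {
        long seconds = 0;
        for (int i = 0; i < 4; i++) {
            seconds = (seconds << 8) | (fourBytes[i] & 0xFF);
        }
        return (seconds - SECONDS_1900_TO_1970) * 1000L;
    }

    public static void main(String[] args) {
        // 0x83AA7E80 is 2,208,988,800, i.e. midnight GMT on 1 January 1970.
        byte[] reply = {(byte) 0x83, (byte) 0xAA, (byte) 0x7E, (byte) 0x80};
        System.out.println(new Date(toJavaMillis(reply)));
    }
}
```

A daytime reply, by contrast, would simply be read as a line of ASCII text.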
  • 17. IETF RFCs • Requests For Comment • Document how much of the Internet works • Various status levels from obsolete to required to informational • TCP/IP, telnet, SMTP, MIME, HTTP, and more • http://www.faqs.org/rfc/
  • 18. W3C Standards • IETF is based on “rough consensus and running code” • W3C tries to run ahead of implementation • IETF is an informal organization open to participation by anyone • W3C is a vendor consortium open only to companies
  • 19. W3C Standards • HTTP • HTML • XML • RDF • MathML • SMIL • P3P
  • 20. URLs • A URL, short for "Uniform Resource Locator", is a way to unambiguously identify the location of a resource on the Internet.
  • 21. Example URLs http://java.sun.com/ file:///Macintosh%20HD/Java/Docs/JDK%201.1.1%20docs/api/java.net.InetAddress.html#_top_ http://www.macintouch.com:80/newsrecent.shtml ftp://ftp.info.apple.com/pub/ mailto:elharo@metalab.unc.edu telnet://utopia.poly.edu ftp://mp3:mp3@138.247.121.61:21000/c%3a/stuff/mp3/ http://elharo@java.oreilly.com/ http://metalab.unc.edu/nywc/comps.phtml?category=Choral+Works
  • 22. The Pieces of a URL • the protocol, aka scheme • the authority – user info (user name, password) – host name or address – port • the path, aka file • the ref, aka section or anchor • the query string
  • 23. The java.net.URL class • A URL object represents a URL. • The URL class contains methods to – create new URLs – parse the different parts of a URL – get an input stream from a URL so you can read data from a server – get content from the server as a Java object
  • 24. Content and Protocol Handlers • Content and protocol handlers separate the data being downloaded from the protocol used to download it. • The protocol handler negotiates with the server and parses any headers. It gives the content handler only the actual data of the requested resource. • The content handler translates those bytes into a Java object like an InputStream or ImageProducer.
  • 25. Finding Protocol Handlers • When the virtual machine creates a URL object, it looks for a protocol handler that understands the protocol part of the URL such as "http" or "mailto". • If no such handler is found, the constructor throws a MalformedURLException.
  • 26. Supported Protocols • The exact protocols that Java supports vary from implementation to implementation though http and file are supported pretty much everywhere. Sun's JDK 1.1 understands ten: – file – ftp – gopher – http – mailto – appletresource – doc – netdoc – systemresource – verbatim
  • 27. URL Constructors • There are four (six in 1.2) constructors in the java.net.URL class. public URL(String u) throws MalformedURLException public URL(String protocol, String host, String file) throws MalformedURLException public URL(String protocol, String host, int port, String file) throws MalformedURLException public URL(URL context, String url) throws MalformedURLException public URL(String protocol, String host, int port, String file, URLStreamHandler handler) throws MalformedURLException public URL(URL context, String url, URLStreamHandler handler) throws MalformedURLException
  • 28. Constructing URL Objects • An absolute URL like http://www.poly.edu/fall97/grad.html#cs try { URL u = new URL("http://www.poly.edu/fall97/grad.html#cs"); } catch (MalformedURLException e) {}
  • 29. Constructing URL Objects in Pieces • You can also construct the URL by passing its pieces to the constructor, like this: URL u = null; try { u = new URL("http", "www.poly.edu", "/schedule/fall97/bgrad.html#cs"); } catch (MalformedURLException e) {}
  • 30. Including the Port URL u = null; try { u = new URL("http", "www.poly.edu", 8000, "/fall97/grad.html#cs"); } catch (MalformedURLException e) {}
  • 31. Relative URLs • Many HTML files contain relative URLs. • Consider the page http://metalab.unc.edu/javafaq/index.html • On this page a link to "books.html" refers to http://metalab.unc.edu/javafaq/books.html.
  • 32. Constructing Relative URLs • The fourth constructor creates URLs relative to a given URL. For example, try { URL u1 = new URL("http://metalab.unc.edu/index.html"); URL u2 = new URL(u1, "books.html"); } catch (MalformedURLException e) {} • This is particularly useful when parsing HTML.
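As a sketch of how the relative constructor behaves, the following resolves a few links against the javafaq page mentioned earlier. The class `RelativeUrlDemo` and its `resolve` helper are illustrative names, not part of java.net, and recent JDKs mark the URL constructors as deprecated although they still work.

```java
import java.net.MalformedURLException;
import java.net.URL;

final class RelativeUrlDemo {

    // Resolves a link found on a page against that page's own URL.
    // Returns null if either string cannot be parsed as a URL.
    static String resolve(String pageUrl, String link) {
        try {
            URL base = new URL(pageUrl);
            return new URL(base, link).toString();
        } catch (MalformedURLException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        String page = "http://metalab.unc.edu/javafaq/index.html";
        System.out.println(resolve(page, "books.html"));           // same directory
        System.out.println(resolve(page, "/nywc/comps.phtml"));    // same host, new path
        System.out.println(resolve(page, "http://java.sun.com/")); // absolute link wins
    }
}
```

Passing every link through the same helper means the caller does not need to know in advance whether a link is relative or absolute.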
  • 33. Parsing URLs • The java.net.URL class has five methods to split a URL into its component parts. These are: public String getProtocol() public String getHost() public int getPort() public String getFile() public String getRef()
  • 34. For example, try { URL u = new URL("http://www.poly.edu/fall97/grad.html#cs"); System.out.println("The protocol is " + u.getProtocol()); System.out.println("The host is " + u.getHost()); System.out.println("The port is " + u.getPort()); System.out.println("The file is " + u.getFile()); System.out.println("The anchor is " + u.getRef()); } catch (MalformedURLException e) { }
  • 35. Parsing URLs • JDK 1.3 adds three more: public String getAuthority() public String getUserInfo() public String getQuery()
  • 36. Missing Pieces • If a port is not explicitly specified in the URL it's set to -1. This means the default port is to be used. • If the ref doesn't exist, it's just null, so watch out for NullPointerExceptions. Better yet, test to see that it's non-null before using it. • If the file is left off completely, e.g. http://java.sun.com, then it's set to "/".
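A short sketch of defensive handling for the missing pieces described above. The helper names are hypothetical; getDefaultPort() is a later addition to java.net.URL (JDK 1.4), so this assumes a newer runtime than the slides target. The file part is not printed because its value for a URL with no path may differ between JDK versions.

```java
import java.net.MalformedURLException;
import java.net.URL;

final class UrlPartsDemo {

    // Returns the port the connection would actually use: the explicit
    // port if one was given, otherwise the protocol's default.
    static int effectivePort(URL u) {
        int port = u.getPort();
        return (port == -1) ? u.getDefaultPort() : port;
    }

    // Returns the ref, or an empty string when the URL has none,
    // so callers never have to handle null.
    static String refOrEmpty(URL u) {
        String ref = u.getRef();
        return (ref == null) ? "" : ref;
    }

    // Wraps the checked exception so the examples below stay short.
    static URL parse(String s) {
        try {
            return new URL(s);
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        URL plain = parse("http://java.sun.com");
        URL full = parse("http://www.poly.edu:8000/fall97/grad.html#cs");
        System.out.println(plain.getPort() + " -> " + effectivePort(plain));
        System.out.println(full.getPort() + " -> " + effectivePort(full));
        System.out.println("[" + refOrEmpty(plain) + "] [" + refOrEmpty(full) + "]");
    }
}
```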
  • 37. Reading Data from a URL • The openStream() method connects to the server specified in the URL and returns an InputStream object fed by the data from that connection. public final InputStream openStream() throws IOException • Any headers that precede the actual data are stripped off before the stream is opened. • Network connections are less reliable and slower than files. Buffer with a BufferedReader or a BufferedInputStream.
  • 38. Webcat import java.net.*; import java.io.*; public class Webcat { public static void main(String[] args) { for (int i = 0; i < args.length; i++) { try { URL u = new URL(args[i]); InputStream in = u.openStream(); InputStreamReader isr = new InputStreamReader(in); BufferedReader br = new BufferedReader(isr); String theLine; while ((theLine = br.readLine()) != null) { System.out.println(theLine); } } catch (IOException e) { System.err.println(e);} } } }
  • 39. The Bug in readLine() • What readLine() does: – Sees a carriage return, then waits to see if the next character is a line feed before returning • What readLine() should do: – Sees a carriage return, returns immediately, then throws away the next character if it's a line feed
  • 40. Webcat import java.net.*; import java.io.*; public class Webcat { public static void main(String[] args) { for (int i = 0; i < args.length; i++) { try { URL u = new URL(args[i]); InputStream in = u.openStream(); InputStreamReader isr = new InputStreamReader(in); int c; while ((c = isr.read()) != -1) { System.out.print((char) c); } } catch (IOException e) { System.err.println(e);} } } }
  • 41. CGI • Common Gateway Interface • A lot is written about writing server side CGI. I’m going to show you client side CGI. • We’ll need to explore HTTP a little deeper to do this
  • 42. Normal web surfing uses these two steps: – The browser requests a page – The server sends the page • Data flows primarily from the server to the client.
  • 43. Forms • There are times when the server needs to get data from the client rather than the other way around. The common way to do this is with a form like this one:
  • 44. CGI • The user types the requested data into the form and hits the submit button. • The client browser then sends the data to the server using the Common Gateway Interface, CGI for short. • CGI uses the HTTP protocol to transmit the data, either as part of the query string or as separate data following the MIME header.
  • 45. GET and POST • When the data is sent as a query string included with the file request, this is called CGI GET. • When the data is sent as data attached to the request following the MIME header, this is called CGI POST
  • 46. HTTP • Web browsers communicate with web servers through a standard protocol known as HTTP, an acronym for HyperText Transfer Protocol. • This protocol defines – how a browser requests a file from a web server – how a browser sends additional data along with the request (e.g. the data formats it can accept), – how the server sends data back to the client – response codes
  • 47. A Typical HTTP Connection – Client opens a socket to port 80 on the server. – Client sends a GET request including the name and path of the file it wants and the version of the HTTP protocol it supports. – The client sends a MIME header. – The client sends a blank line. – The server sends a MIME header – The server sends the data in the file. – The server closes the connection.
  • 48. What the client sends to the server GET /javafaq/images/cup.gif Connection: Keep-Alive User-Agent: Mozilla/3.01 (Macintosh; I; PPC) Host: www.oreilly.com:80 Accept: image/gif, image/x-xbitmap, image/jpeg, */*
  • 49. MIME • MIME is an acronym for "Multipurpose Internet Mail Extensions". • an Internet standard defined in RFCs 2045 through 2049 • originally intended for use with email messages, but has been adopted for use in HTTP.
  • 50. Browser Request MIME Header • When the browser sends a request to a web server, it also sends a MIME header. • MIME headers contain name-value pairs, essentially a name followed by a colon and a space, followed by a value. Connection: Keep-Alive User-Agent: Mozilla/3.01 (Macintosh; I; PPC) Host: www.digitalthink.com:80 Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*
  • 51. Server Response MIME Header • When a web server responds to a web browser it sends a MIME header along with the response that looks something like this: Server: Netscape-Enterprise/2.01 Date: Sat, 02 Aug 1997 07:52:46 GMT Accept-ranges: bytes Last-modified: Tue, 29 Jul 1997 15:06:46 GMT Content-length: 2810 Content-type: text/html
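The name-value structure of a MIME header can be illustrated with a small parser. This is only a sketch of the format, not how the JDK's protocol handlers are implemented; `HeaderParserDemo` is a made-up name, and the sample header text is copied from the server response above.

```java
import java.util.Map;
import java.util.TreeMap;

final class HeaderParserDemo {

    // Splits header lines of the form "Name: value" into a map whose
    // keys compare case-insensitively. Parsing stops at the blank line
    // that ends the header.
    static Map<String, String> parse(String header) {
        Map<String, String> fields = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        for (String line : header.split("\r?\n")) {
            if (line.isEmpty()) break;
            int colon = line.indexOf(':');
            if (colon < 0) continue; // not a name-value pair; skip it
            fields.put(line.substring(0, colon).trim(),
                       line.substring(colon + 1).trim());
        }
        return fields;
    }

    public static void main(String[] args) {
        String header =
              "Server: Netscape-Enterprise/2.01\r\n"
            + "Date: Sat, 02 Aug 1997 07:52:46 GMT\r\n"
            + "Content-length: 2810\r\n"
            + "Content-type: text/html\r\n"
            + "\r\n";
        Map<String, String> fields = parse(header);
        System.out.println(fields.get("content-type"));
        System.out.println(Integer.parseInt(fields.get("CONTENT-LENGTH")));
    }
}
```

Only the first colon on a line separates name from value, which is why the Date field keeps the colons inside its time.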
  • 52. Query Strings • CGI GET data is sent in URL encoded query strings • a query string is a set of name=value pairs separated by ampersands Author=Sadie, Julie&Title=Women Composers • separated from rest of URL by a question mark
  • 53. URL Encoding • Alphanumeric ASCII characters (a-z, A-Z, and 0-9) and the $-_.!*'(), punctuation symbols are left unchanged. • The space character is converted into a plus sign (+). • Other characters (e.g. &, =, ^, #, %, {, and so on) are translated into a percent sign followed by the two hexadecimal digits corresponding to their numeric value.
  • 54. For example, • The comma is ASCII character 44 (decimal) or 2C (hex). Therefore if the comma appears as part of a URL it is encoded as %2C. • The query string "Author=Sadie, Julie&Title=Women Composers" is encoded as: Author=Sadie%2C+Julie&Title=Women+Composers
  • 55. The URLEncoder class • The java.net.URLEncoder class contains a single static method which encodes strings in x-www-form-urlencoded format URLEncoder.encode(String s)
  • 56. For example, String qs = "Author=Sadie, Julie&Title=Women Composers"; String eqs = URLEncoder.encode(qs); System.out.println(eqs); • This prints: Author%3dSadie%2c+Julie%26Title%3dWomen+Composers
  • 57. String eqs = "Author=" + URLEncoder.encode("Sadie, Julie"); eqs += "&"; eqs += "Title="; eqs += URLEncoder.encode("Women Composers"); • This produces the properly encoded query string: Author=Sadie%2c+Julie&Title=Women+Composers
  • 58. The URLDecoder class • In Java 1.2 the java.net.URLDecoder class contains a single static method which decodes strings in x-www-form-urlencoded format URLDecoder.decode(String s)
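The encoding and decoding steps can be combined into a small helper. This sketch assumes a newer JDK than the slides, where the two-argument encode(String, String) and decode(String, String) overloads exist and the one-argument forms are deprecated; `QueryStringDemo` and its methods are illustrative names. Current JDKs emit uppercase hex digits (%2C rather than %2c); both forms decode to the same character.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

final class QueryStringDemo {

    // Encodes one name or value. UTF-8 is always supported, so the
    // checked exception is rethrown as an unchecked one.
    static String enc(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    // Reverses enc() for a single name or value.
    static String dec(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    // Joins name/value pairs given as alternating arguments. Each piece
    // is encoded separately so the "=" and "&" separators stay literal.
    static String build(String... namesAndValues) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i + 1 < namesAndValues.length; i += 2) {
            if (sb.length() > 0) sb.append('&');
            sb.append(enc(namesAndValues[i])).append('=')
              .append(enc(namesAndValues[i + 1]));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(build("Author", "Sadie, Julie", "Title", "Women Composers"));
        System.out.println(dec("Sadie%2c+Julie"));
    }
}
```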
  • 59. GET URLs String eqs = "Author=" + URLEncoder.encode("Sadie, Julie"); eqs += "&"; eqs += "Title="; eqs += URLEncoder.encode("Women Composers"); try { URL u = new URL("http://www.superbooks.com/search.cgi?" + eqs); InputStream in = u.openStream(); //... } catch (IOException e) { //...
  • 60. URLConnections • The java.net.URLConnection class is an abstract class that handles communication with different kinds of servers like ftp servers and web servers. • Protocol specific subclasses of URLConnection handle different kinds of servers. • By default, connections to HTTP URLs use the GET method.
  • 61. URLConnections vs. URLs • Can send output as well as read input • Can post data to CGIs • Can read headers from a connection
  • 62. URLConnection five steps: 1. The URL is constructed. 2. The URL’s openConnection() method creates the URLConnection object. 3. The parameters for the connection and the request properties that the client sends to the server are set up. 4. The connect() method makes the connection to the server. (optional) 5. The response header information is read using getHeaderField().
  • 63. I/O Across a URLConnection • Data may be read from the connection in one of two ways – raw by using the input stream returned by getInputStream() – through a content handler with getContent(). • Data can be sent to the server using the output stream provided by getOutputStream().
  • 64. For example, try { URL u = new URL("http://www.sd99.com/"); URLConnection uc = u.openConnection(); uc.connect(); InputStream in = uc.getInputStream(); // read the data... } catch (IOException e) { //...
  • 65. Reading Header Data • The getHeaderField(String name) method returns the string value of a named header field. • Names are case-insensitive. • If the requested field is not present, null is returned. String lm = uc.getHeaderField("Last-modified");
  • 66. getHeaderFieldKey() • The keys of the header fields are returned by the getHeaderFieldKey(int n) method. • The first field is 1. • If a numbered key is not found, null is returned. • You can use this in combination with getHeaderField() to loop through the complete header
  • 67. For example String key = null; for (int i=1; (key = uc.getHeaderFieldKey(i))!=null; i++) { System.out.println(key + ": " + uc.getHeaderField(key)); }
  • 68. getHeaderFieldInt() and getHeaderFieldDate() • These are utility methods that read a named header and convert its value into an int and a long respectively. public int getHeaderFieldInt(String name, int Default) public long getHeaderFieldDate(String name, long Default)
  • 69. • The long returned by getHeaderFieldDate() can be converted into a Date object using a Date() constructor like this: long s = uc.getHeaderFieldDate("Last-modified", 0); Date lm = new Date(s);
  • 70. Six Convenience Methods • These return the values of six particularly common header fields: public int getContentLength() public String getContentType() public String getContentEncoding() public long getExpiration() public long getDate() public long getLastModified()
  • 71. try { URL u = new URL("http://www.sdexpo.com/"); URLConnection uc = u.openConnection(); uc.connect(); String key=null; for (int n = 1; (key=uc.getHeaderFieldKey(n)) != null; n++) { System.out.println(key + ": " + uc.getHeaderField(key)); } } catch (IOException e) { System.err.println(e); }
  • 72. Writing data to a URLConnection • Similar to reading data from a URLConnection. • First inform the URLConnection that you plan to use it for output • Before getting the connection's input stream, get the connection's output stream and write to it. • Commonly used to talk to CGIs that use the POST method
  • 73. Eight Steps: 1. Construct the URL. 2. Call the URL’s openConnection() method to create the URLConnection object. 3. Pass true to the URLConnection’s setDoOutput() method 4. Create the data you want to send, preferably as a byte array.
  • 74. 5. Call getOutputStream() to get an output stream object. 6. Write the byte array created in step 4 onto the stream. 7. Close the output stream. 8. Call getInputStream() to get an input stream object. Read from it as usual.
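The eight steps above can be sketched as follows. This is an illustration under stated assumptions rather than a definitive implementation: `PostDemo`, `formData`, and `post` are hypothetical names, the server address must be supplied on the command line, and nothing is sent over the network unless one is given.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UnsupportedEncodingException;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

final class PostDemo {

    // Step 4: build the form data as a byte array.
    static byte[] formData(String name, String value) {
        try {
            String body = URLEncoder.encode(name, "UTF-8") + "="
                        + URLEncoder.encode(value, "UTF-8");
            // Encoded form data contains only ASCII characters.
            return body.getBytes(StandardCharsets.US_ASCII);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    // Steps 1-3 and 5-8. Returns the number of response bytes read.
    static int post(String address, byte[] data) throws IOException {
        URL u = new URL(address);                        // step 1
        URLConnection uc = u.openConnection();           // step 2
        uc.setDoOutput(true);                            // step 3
        try (OutputStream out = uc.getOutputStream()) {  // step 5
            out.write(data);                             // step 6
        }                                                // step 7: closed here
        try (InputStream in = uc.getInputStream()) {     // step 8
            int count = 0;
            while (in.read() != -1) count++;
            return count;
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = formData("Author", "Sadie, Julie");
        System.out.println(new String(data, StandardCharsets.US_ASCII));
        // Only contact a server if one is named on the command line.
        if (args.length > 0) {
            System.out.println(post(args[0], data) + " bytes received");
        }
    }
}
```

Keeping step 4 in its own method makes the encoded body easy to inspect before anything is sent.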
  • 75. POST CGIs • A typical POST request to a CGI looks like this: POST /cgi-bin/booksearch.pl HTTP/1.0 Referer: http://www.macfaq.com/sampleform.html User-Agent: Mozilla/3.01 (Macintosh; I; PPC) Content-length: 60 Content-type: application/x-www-form-urlencoded Host: utopia.poly.edu:56435 username=Sadie%2C+Julie&realname=Women+Composers
  • 76. A POST request includes • the POST line • a MIME header which must include – content type – content length • a blank line that signals the end of the MIME header • the actual data of the form, encoded in x-www-form-urlencoded format.
  • 77. • A URLConnection for an http URL will set up the request line and the MIME header for you as long as you set its doOutput field to true by invoking setDoOutput(true). • If you also want to read from the connection, you should set doInput to true with setDoInput(true) too.
  • 78. For example, URLConnection uc = u.openConnection(); uc.setDoOutput(true); uc.setDoInput(true);
  • 79. • The request line and MIME header are sent as soon as the URLConnection connects. Then getOutputStream() returns an output stream on which you can write the x-www-form-urlencoded name-value pairs.
  • 80. HttpURLConnection • java.net.HttpURLConnection is an abstract subclass of URLConnection that provides some additional methods specific to the HTTP protocol. • URL connection objects that are returned by an http URL will be instances of java.net.HttpURLConnection.
  • 81. Recall • a typical HTTP response from a web server begins like this: HTTP/1.0 200 OK Server: Netscape-Enterprise/2.01 Date: Sat, 02 Aug 1997 07:52:46 GMT Accept-ranges: bytes Last-modified: Tue, 29 Jul 1997 15:06:46 GMT Content-length: 2810 Content-type: text/html
  • 82. Response Codes • The getHeaderField() and getHeaderFieldKey() methods don't return the HTTP response code • After you've connected, you can retrieve the numeric response code--200 in the above example--with the getResponseCode() method and the message associated with it--OK in the above example--with the getResponseMessage() method.
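To show what the response code and message correspond to, here is a sketch that pulls both out of a raw status line. It does not reproduce how HttpURLConnection works internally; `StatusLineDemo` and its methods are illustrative names.

```java
final class StatusLineDemo {

    // Returns the numeric code from a status line such as
    // "HTTP/1.0 200 OK", or -1 if the line is not well formed.
    static int responseCode(String statusLine) {
        String[] parts = statusLine.trim().split(" ", 3);
        if (parts.length < 2 || !parts[0].startsWith("HTTP/")) return -1;
        try {
            return Integer.parseInt(parts[1]);
        } catch (NumberFormatException e) {
            return -1;
        }
    }

    // Returns the message after the code, or "" if there is none.
    static String responseMessage(String statusLine) {
        String[] parts = statusLine.trim().split(" ", 3);
        return (parts.length == 3) ? parts[2] : "";
    }

    public static void main(String[] args) {
        String line = "HTTP/1.0 200 OK";
        System.out.println(responseCode(line));
        System.out.println(responseMessage(line));
    }
}
```

Limiting the split to three parts keeps multi-word messages such as "Moved Temporarily" intact.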
  • 83. HTTP Protocols • Java 1.0 only supports GET and POST requests to HTTP servers • Java 1.1/1.2 supports GET, POST, HEAD, OPTIONS, PUT, DELETE, and TRACE. • The protocol is chosen with the setRequestMethod(String method) method. • A java.net.ProtocolException, a subclass of IOException, is thrown if an unknown protocol is specified.
  • 84. getRequestMethod() • The getRequestMethod() method returns the string form of the request method currently set for the URLConnection. GET is the default method.
  • 85. disconnect() • The disconnect() method of the HttpURLConnection class closes the connection to the web server. • Needed for HTTP/1.1 Keep-alive
  • 86. For example, try { URL u = new URL("http://www.amnesty.org/"); HttpURLConnection huc = (HttpURLConnection) u.openConnection(); huc.setDoOutput(true); huc.setRequestMethod("PUT"); huc.connect(); OutputStream os = huc.getOutputStream(); int code = huc.getResponseCode(); if (code >= 200 && code < 300) { // put the data... } huc.disconnect(); } catch (IOException e) { //...
  • 87. usingProxy • The boolean usingProxy() method returns true if web connections are being funneled through a proxy server, false if they're not.
  • 88. Redirect Instructions • Most web servers can be configured to automatically redirect browsers to the new location of a page that's moved. • To redirect browsers, a server sends a 300 level response and a Location header that specifies the new location of the requested page.
  • 89. GET /~elharo/macfaq/index.html HTTP/1.0 HTTP/1.1 302 Moved Temporarily Date: Mon, 04 Aug 1997 14:21:27 GMT Server: Apache/1.2b7 Location: http://www.macfaq.com/macfaq/index.html Connection: close Content-type: text/html <HTML><HEAD> <TITLE>302 Moved Temporarily</TITLE> </HEAD><BODY> <H1>Moved Temporarily</H1> The document has moved <A HREF="http://www.macfaq.com/macfaq/index.html">here</A>.<P> </BODY></HTML>
  • 90. • HTML is returned for browsers that don't understand redirects, but most modern browsers do not display this and jump straight to the page specified in the Location header instead. • Because redirects can change the site a user is connecting to without their knowledge, URLConnections do not follow redirects arbitrarily.
  • 91. Following Redirects HttpURLConnection.setFollowRedirects(true) says that connections will follow redirect instructions from the web server. Untrusted applets are not allowed to set this. HttpURLConnection.getFollowRedirects() returns true if redirect requests are honored, false if they're not.
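When redirects are not followed automatically, a client can decide for itself what to do with the Location header. The sketch below shows one possible way to do that; it treats any 300-level code that carries a Location as a redirect, which is a simplification, and the requesting host utopia.poly.edu is assumed for illustration since the slides do not say which server returned the 302. All names here are hypothetical.

```java
import java.net.MalformedURLException;
import java.net.URL;

final class RedirectDemo {

    // Wraps the checked exception so the examples below stay short.
    static URL url(String s) {
        try {
            return new URL(s);
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Returns the URL to fetch next, or null when the response is not
    // a redirect or carries no usable Location header.
    static URL nextLocation(URL requested, int responseCode, String location) {
        if (responseCode < 300 || responseCode > 399 || location == null) return null;
        try {
            // Resolving against the requested URL also handles relative locations.
            return new URL(requested, location);
        } catch (MalformedURLException e) {
            return null;
        }
    }

    // A redirect that stays on the same host does not change which
    // site the user is talking to.
    static boolean sameHost(URL a, URL b) {
        return a.getHost().equalsIgnoreCase(b.getHost());
    }

    public static void main(String[] args) {
        URL asked = url("http://utopia.poly.edu/~elharo/macfaq/index.html");
        URL next = nextLocation(asked, 302, "http://www.macfaq.com/macfaq/index.html");
        System.out.println(next + " same host? " + sameHost(asked, next));
    }
}
```

A caller might follow same-host redirects silently and ask the user before following any other.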
  • 92. To Learn More • Java Network Programming – O’Reilly & Associates, 1997 – ISBN 1-56592-227-1 • Java I/O – O’Reilly & Associates, 1999 – ISBN 1-56592-485-1 • Web Client Programming with Java – http://www.digitalthink.com/catalog/cs/cs308/index.html
  • 93. Questions?

Editor's notes

  1. Most URLs can be broken into about five pieces, not all of which are necessarily present in any given URL. These are: There may also be a query string part which is used for CGI data. We’ll talk about that when we discuss CGIs.