Web Techniques: Network Programming in Java
Network Programming in Java
By Bruce Eckel
Historically, network programming has been error-prone, difficult, and complex because the programmer had to know many details about the network and sometimes even the hardware. However, the concept of networking is not so very difficult: You want to move information from one machine to another. It’s quite similar to reading and writing files, except that the file exists on a remote machine that determines what to do about the information you’re requesting or sending.
One of Java’s great strengths is painless networking. As much as possible, the underlying details of networking have been abstracted away and taken care of within the Java Virtual Machine (JVM) and local-machine installation of Java. The programming model is the file; in fact, you wrap the network connection (“socket”) with InputStream and OutputStream objects, so you end up using exactly the same method calls you would with files. In addition, Java’s built-in multithreading is exceptionally handy when dealing with multiple simultaneous connections.
Identifying a Machine
To differentiate between machines and ensure you are connected with the proper one, there must be a way of uniquely identifying machines on a network. Early networks provided unique names for machines within the local network, but because Java works within the Internet, a machine must be uniquely identified from all others in the world. This is accomplished with the Internet Protocol (IP) address, which is a 32-bit number that is unique across the Internet. The IP address can also be represented in symbolic form for human consumption in two ways: First, “dotted-quad” notation — a text string of four numbers separated by dots, as in 123.255.28.120. Because dotted-quad numbers are difficult to remember, you can also define a name via the domain-name service (DNS) that will be mapped to your IP address.
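To make the relationship between the dotted-quad string and the 32-bit number concrete, here is a minimal sketch. The class name DottedQuad and its toNumber() method are hypothetical names chosen for illustration, and the code assumes a four-byte IPv4 address.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper for illustration; not one of the article's listings.
class DottedQuad {
    // Packs the four bytes of an IPv4 address into one unsigned 32-bit value.
    // The result is returned as a long because Java's int is signed.
    static long toNumber(String dottedQuad) throws UnknownHostException {
        // For a literal address, getByName() only checks the format; no DNS lookup is made.
        byte[] octets = InetAddress.getByName(dottedQuad).getAddress();
        long value = 0;
        for (byte octet : octets) {
            // Mask with 0xFF so that bytes above 127 are not sign-extended.
            value = (value << 8) | (octet & 0xFF);
        }
        return value;
    }

    public static void main(String[] args) throws UnknownHostException {
        String address = args.length > 0 ? args[0] : "123.255.28.120";
        System.out.println(address + " -> " + toNumber(address));
    }
}
```

Each of the four numbers occupies exactly one byte of the result, which is why no quad can exceed 255.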
Whether you supply a dotted-quad string or a DNS name, the IP address is represented internally as a 32-bit number (so no quad number can exceed 255). In Java, an object of type InetAddress holds that number. The static InetAddress.getByName() method in java.net returns an object of this type, which you can use to build a socket; more on this later.
Here’s a simple example of using InetAddress.getByName(): Each time you dial up an Internet Service Provider (ISP), you are assigned a temporary IP address (assuming your provider uses dynamically allocated IP addresses, which most do today). While you’re connected, your IP address has the same validity as any other IP address on the Internet. For example, if you’re running a Web server or FTP server, anyone who knows your IP address can connect to it. To do so, they’ll need your IP address, but it’s assigned each time you dial up, so how can you find out what it is?
Listing One uses InetAddress.getByName() to produce your IP address. (Because of space constraints, all listings are available electronically; see “Availability” on page 3.) To use it, you must know the name of your computer. In Windows 95 (no other platforms have been tested), you go to Settings|Control Panel|Network and select the Identification tab to find the computer name, which you then enter on the command line. My machine is called “Colossus,” so once I’ve connected to my ISP, I run the program with java WhoAmI Colossus, which returns a message like Colossus/199.190.87.75 (of course, the address is different each time).
Someone else can log onto my personal Web server by going to http://199.190.87.75 (but only as long as I remain connected). This can be a handy way to distribute information or to test a Web-site configuration before posting it to a “real” server.
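Since the listings are only available electronically, here is a minimal sketch in the spirit of the WhoAmI program described above. It is not Listing One itself: the class name WhoAmISketch and the describe() helper are assumptions made for illustration.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// A sketch in the spirit of WhoAmI; the published listing may differ.
class WhoAmISketch {
    // Resolves a machine name (or a literal address) and returns the
    // text form of the resulting InetAddress, which is name/address.
    static String describe(String machineName) throws UnknownHostException {
        InetAddress address = InetAddress.getByName(machineName);
        return address.toString();
    }

    public static void main(String[] args) throws UnknownHostException {
        if (args.length != 1) {
            System.err.println("Usage: java WhoAmISketch MachineName");
            System.exit(1);
        }
        System.out.println(describe(args[0]));
    }
}
```

You would run it as java WhoAmISketch followed by your computer's name. If the name cannot be resolved, getByName() throws an UnknownHostException.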
Servers and Clients
The whole point of a network is to allow two machines to communicate. Once they’ve connected, communication becomes a two-way process, regardless of which is the server and which is the client. The server’s job is to listen for a connection, and that’s performed by the special server object you create. The client’s job is to connect to a server, and this is performed by the special client object you create. Once the connection is made, it is magically turned into an InputStream and OutputStream object at both server and client ends, and from then on you treat the connection as reading from and writing to a file using the familiar commands from the Java I/O library.
Testing Programs without a Network
The designers of the Internet addressing scheme were aware that not everyone has a client machine, a server machine, and a network available to test programs. Thus, they created a special address, localhost, to be the “local loopback” IP address for testing without a network. The generic way to produce this address in Java is the statement InetAddress addr = InetAddress.getByName(null); (if you hand getByName() a null, it defaults to using the localhost). The InetAddress is used to refer to the particular machine, so you must produce this first, and the only way to do so is through one of that class’s static member methods: getByName() (which you’ll usually use), getAllByName(), or getLocalHost().
You can’t manipulate the contents of an InetAddress (but you can print them; read on). You can also produce the local loopback address by handing getByName() the string localhost, as in InetAddress.getByName("localhost"); or by using its dotted-quad form to name the loopback’s reserved IP number, as in InetAddress.getByName("127.0.0.1"); All three forms produce the same result.
Port: A Unique Place within the Machine
An IP address isn’t enough to identify a unique server, since many servers can exist on one machine. Each machine also contains ports, and when setting up a client or a server, you must choose a port where client and server meet. It’s as if you’re meeting someone: The IP address is the neighborhood, and the port is the bar.
The port is not a physical location; it’s a software abstraction. In fact, it’s the service associated with a particular port number. An example service would be the time of day.
The system services reserve ports 1 through 1024, so you shouldn’t use those or any other port that you know to be in use. My first choice for examples is port 8080.
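The advice to avoid a port that is already in use can be seen directly in a small sketch. The class name PortInUseSketch is hypothetical, and the code asks the system for any free port (by passing 0) rather than assuming 8080 is available on your machine.

```java
import java.io.IOException;
import java.net.BindException;
import java.net.ServerSocket;

// Hypothetical illustration; not one of the article's listings.
class PortInUseSketch {
    // Returns true if a second listener is refused on a port that
    // already has a listener bound to it.
    static boolean secondBindIsRefused() throws IOException {
        // Port 0 means "any free port"; getLocalPort() reports the one chosen.
        try (ServerSocket first = new ServerSocket(0)) {
            int port = first.getLocalPort();
            try (ServerSocket second = new ServerSocket(port)) {
                return false; // the system unexpectedly allowed a second listener
            } catch (BindException e) {
                return true;  // the port is taken, so the bind fails
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("second bind refused: " + secondBindIsRefused());
    }
}
```

Because one port number names one service, the second attempt to claim the same port is expected to fail with a BindException while the first listener is still open.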
Sockets
The socket is the software abstraction that represents the “terminals” of a connection between two machines. For a given connection, there’s a socket on each machine; a hypothetical “cable” runs between the two machines, each end of which is plugged into a socket. Of course, the physical hardware and cabling between machines is completely unknown at this level of abstraction.
In Java, you create a socket to make the connection to the other machine, then you get an InputStream and/or OutputStream from the socket in order to treat the connection as an I/O stream object. At first, there seem to be two kinds of sockets: a ServerSocket that waits for a network connection, and a Socket to represent a client. The Socket connects to a machine by specifying the IP address and port.
This would seem to be another example of the Java libraries’ confusing naming scheme. You might think it better to name the ServerSocket something without the word “socket” in it and that ServerSocket and Socket should both be inherited from a common base class. Indeed, the two classes have several methods in common but not enough for a common base class. Instead, ServerSocket has an accept() method that waits until another machine connects to it, then returns an actual Socket. This creates a true Socket-to-Socket connection in which both ends are treated the same way because they are the same. So, the job of ServerSocket isn’t really to be a socket but to make a Socket object when someone else connects to it.
However, the ServerSocket does create a “server,” or listening socket that listens for incoming connections and returns an “established” socket (with the local and remote endpoints defined) via accept(). The confusing part is that both listening and established sockets are associated with the same server socket. TCP knows that the listening socket can only accept new connection requests, not data packets. So while ServerSocket doesn’t make much sense programmatically, it does “physically.”
A ServerSocket requires only the port number it’s going to use; it doesn’t need an IP address because it’s already on the machine it represents. A Socket, however, needs both the IP address and the port number where you’re trying to connect.
Once you have Socket objects on both ends, you use the getInputStream() and getOutputStream() methods to produce the corresponding InputStream and OutputStream objects. These must be wrapped inside buffers and formatting classes just like any other stream object.
A Simple Server and Client
This example makes the simplest use of servers and clients using sockets. The server simply waits for a connection, then uses the Socket produced by that connection to create an InputStream and OutputStream. After that, it echoes everything it reads from the InputStream to the OutputStream until it receives the line END, and then closes the connection.
The client connects to the server, then creates an OutputStream, through which lines of text are sent. The client also creates an InputStream to hear what the server is saying (in this case, the words echoed back).
Both the server and client use the same port number, and the client uses the local loopback address to connect to the server on the same machine so you don’t have to test it over a network.
Listing Two is the server. The ServerSocket needs a port number, but not an IP address (since it’s running on this machine). When you call accept(), the method blocks until a client tries to connect to it; meanwhile, other processes can run. When a connection is made, accept() returns a Socket object representing that connection.
Both the ServerSocket and the Socket produced by accept() are printed to System.out, so their toString() methods are automatically called, producing:
ServerSocket[addr=0.0.0.0,port=0,localport=8080]
Socket[addr=127.0.0.1,port=1077,localport=8080]
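The accept() step can be tried in isolation with a short sketch. This is not Listing Two: the class name AcceptSketch is hypothetical, the client is created in the same process only so that accept() has something to return, and port 0 is used so the system picks a free port. The exact text printed depends on your JDK version and may not match the output shown above.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical illustration of accept(); not the article's Listing Two.
class AcceptSketch {
    // Opens a listener, connects to it over loopback from the same process,
    // and returns the text forms of the listener and the accepted Socket.
    static String[] listenAndAccept() throws IOException {
        try (ServerSocket listener = new ServerSocket(0); // 0: any free port
             Socket client = new Socket(InetAddress.getLoopbackAddress(),
                                        listener.getLocalPort());
             // The client is already queued, so accept() returns right away.
             Socket accepted = listener.accept()) {
            return new String[] { listener.toString(), accepted.toString() };
        }
    }

    public static void main(String[] args) throws IOException {
        for (String text : listenAndAccept()) {
            System.out.println(text);
        }
    }
}
```

In a real server no client is waiting yet, so accept() blocks at that point until one connects.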
The next part of the program looks just like opening files for reading and writing except that the InputStream and OutputStream objects are created from the Socket object. Both objects are converted to Java 1.1 Reader and Writer objects using the converter classes InputStreamReader and OutputStreamWriter, respectively. You could use the Java 1.0 InputStream and OutputStream classes directly, but Writer offers a distinct advantage: PrintWriter has an overloaded constructor that takes a second argument, a Boolean flag that indicates whether or not to automatically flush the output at the end of each println() statement (but not print()). Every time you write to out, its buffer must be flushed so the information actually goes out over the network. Flushing is important here: the client and server each wait for a line from the other party before proceeding, and if flushing doesn’t occur, the information will not be put onto the network until the buffer is full, causing lots of problems in this example.
Like virtually all files you open, these are buffered to augment speed.
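The wrapping described above might look like the following sketch. The class name SocketStreams and the helper names readerFor(), writerFor(), and sendOneLine() are hypothetical; the loopback connection on a system-chosen port exists only to make the example self-contained.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical helpers for illustration; not one of the article's listings.
class SocketStreams {
    // InputStream -> InputStreamReader -> BufferedReader, so lines can be read.
    static BufferedReader readerFor(Socket socket) throws IOException {
        return new BufferedReader(new InputStreamReader(socket.getInputStream()));
    }

    // OutputStream -> OutputStreamWriter -> BufferedWriter -> PrintWriter.
    // The second PrintWriter argument (true) turns on autoflush for println().
    static PrintWriter writerFor(Socket socket) throws IOException {
        return new PrintWriter(
            new BufferedWriter(new OutputStreamWriter(socket.getOutputStream())),
            true);
    }

    // Sends one line across a loopback connection and returns what arrived.
    static String sendOneLine(String line) throws IOException {
        try (ServerSocket listener = new ServerSocket(0);
             Socket sender = new Socket(InetAddress.getLoopbackAddress(),
                                        listener.getLocalPort());
             Socket receiver = listener.accept()) {
            writerFor(sender).println(line); // autoflush pushes the line out now
            return readerFor(receiver).readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("received: " + sendOneLine("howdy"));
    }
}
```

If the autoflush flag were false and flush() were never called, the line would sit in the BufferedWriter and readLine() on the other end would wait indefinitely.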
The infinite while loop reads lines from the BufferedReader and writes information to System.out and the PrintWriter out. These could be any streams; they just happen to be connected to the network.
When the client sends the line consisting of END, the program breaks out of the loop and closes the Socket. It’s always a good idea to call close() to make sure everything is cleaned up properly.
Listing Three is the client. The main() routine shows how to produce the InetAddress of the local loopback IP address using null, localhost, or the explicit reserved address 127.0.0.1 (for which you would substitute a particular machine’s IP address). When the InetAddress addr is printed (via the automatic call to its toString() method) the result is localhost/127.0.0.1. Handing getByName() a null made it default to finding the localhost, producing the special address 127.0.0.1.
The Socket called “socket” is created with both the InetAddress and the port number. Remember, an Internet connection is determined uniquely by four pieces of data: clientHost, clientPortNumber, serverHost, and serverPortNumber. When the server comes up, it takes up its assigned port (8080) on the localhost (127.0.0.1); when the client comes up, it is allocated the next available port on its machine, in this case 1077, which also happens to be on the same machine (127.0.0.1) as the server. For data to move between the client and server, each side must know where to send it. Therefore, during the process of connecting to the “known” server, the client sends a “return address,” which is what you see in the example output for the server side: Socket[addr=127.0.0.1,port=1077,localport=8080]. This means the server accepted a connection from 127.0.0.1 on port 1077 while listening on its local port (8080). On the client side, the output is Socket[addr=localhost/127.0.0.1,port=8080,localport=1077], which means the client made a connection to 127.0.0.1 on port 8080 using local port 1077.
Each time you start up the client, the local-port number is incremented. It starts at 1025 (one past the reserved block of ports) and increases until you reboot the machine, which returns it to 1025. (On UNIX machines, once the upper limit of the socket range is reached, the numbers will wrap around to the lowest available number again.)
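The three loopback forms and the mirrored port numbers can be checked with a short sketch. The class name LoopbackEndpoints and its methods are hypothetical, and the listener uses a system-chosen port rather than 8080. The client's local port is picked by the operating system, so the actual numbers will vary and need not follow the incrementing pattern described above.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical illustration; not the article's Listing Three.
class LoopbackEndpoints {
    // True if the given name resolves to a loopback address; null is allowed.
    static boolean isLoopback(String host) throws IOException {
        return InetAddress.getByName(host).isLoopbackAddress();
    }

    // Connects a client to a listener over loopback and returns
    // { client local port, client remote port,
    //   server-side remote port, server-side local port }.
    static int[] connectOverLoopback() throws IOException {
        InetAddress addr = InetAddress.getByName(null);
        try (ServerSocket listener = new ServerSocket(0);
             Socket client = new Socket(addr, listener.getLocalPort());
             Socket accepted = listener.accept()) {
            System.out.println("client side: " + client);
            System.out.println("server side: " + accepted);
            return new int[] {
                client.getLocalPort(), client.getPort(),
                accepted.getPort(), accepted.getLocalPort()
            };
        }
    }

    public static void main(String[] args) throws IOException {
        for (String host : new String[] { null, "localhost", "127.0.0.1" }) {
            System.out.println(host + " -> loopback? " + isLoopback(host));
        }
        connectOverLoopback();
    }
}
```

The port the client reports as local is the one the server reports as remote, and vice versa, which is the “return address” at work.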
Once the Socket object has been created, the process of turning it into a BufferedReader and PrintWriter is the same as in the server. Here, the client initiates the conversation by sending the string “howdy,” followed by a number. The buffer is flushed automatically via the second argument to the PrintWriter constructor; otherwise, the whole conversation would hang because the initial “howdy” would never get sent. (The buffer isn’t full enough to make it happen automatically.) Each line sent back from the server is written to System.out to verify that everything is working correctly. To terminate the conversation, the agreed-upon END is sent. If the client simply hangs up, the server throws an exception.
Sockets produce a “dedicated” connection that persists until it is explicitly disconnected. (The dedicated connection can still be broken without an explicit disconnect if one side or an intermediary link crashes.) This means the two parties are locked in communication and the connection is constantly open. This puts an extra load on the network; later I’ll discuss an approach where the connections are temporary.
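Putting the pieces together, the echo conversation described in this section can be sketched in a single program. This is an approximation under stated assumptions, not Listings Two and Three: the class name EchoSketch and its methods are hypothetical, the server runs on a background thread in the same process so the example is self-contained, and the port is chosen by the system instead of being fixed at 8080.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.ArrayList;
import java.util.List;

// Hypothetical single-process version of the echo server and client.
class EchoSketch {
    // Server side: accept one client and echo each line until END arrives.
    static void serveOneClient(ServerSocket listener) throws IOException {
        try (Socket socket = listener.accept();
             BufferedReader in = new BufferedReader(
                 new InputStreamReader(socket.getInputStream()));
             PrintWriter out = new PrintWriter(socket.getOutputStream(), true)) {
            String line;
            while ((line = in.readLine()) != null) {
                if (line.equals("END")) break;
                out.println(line); // autoflush sends the echo immediately
            }
        }
    }

    // Client side: send numbered lines, collect the echoes, then send END.
    static List<String> talk(int port, int lineCount) throws IOException {
        List<String> echoes = new ArrayList<>();
        try (Socket socket = new Socket(InetAddress.getByName(null), port);
             BufferedReader in = new BufferedReader(
                 new InputStreamReader(socket.getInputStream()));
             PrintWriter out = new PrintWriter(socket.getOutputStream(), true)) {
            for (int i = 0; i < lineCount; i++) {
                out.println("howdy " + i);
                echoes.add(in.readLine());
            }
            out.println("END");
        }
        return echoes;
    }

    // Runs the server on a background thread and the client on this one.
    static List<String> runDemo(int lineCount)
            throws IOException, InterruptedException {
        try (ServerSocket listener = new ServerSocket(0)) {
            Thread server = new Thread(() -> {
                try {
                    serveOneClient(listener);
                } catch (IOException e) {
                    System.err.println("server: " + e);
                }
            });
            server.start();
            List<String> echoes = talk(listener.getLocalPort(), lineCount);
            server.join();
            return echoes;
        }
    }

    public static void main(String[] args) throws Exception {
        for (String echo : runDemo(10)) {
            System.out.println(echo);
        }
    }
}
```

The try-with-resources blocks play the role of the explicit close() call recommended earlier: each socket is closed when its block exits, even if an exception is thrown.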
Serving Multiple Clients
The JabberServer of Listing Two can handle only one client at a time, whereas a typical server must deal with many clients at once. The answer is multithreading, which in Java is reasonably straightforward.
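As a rough sketch of the thread-per-client approach that the rest of this section walks through, consider the following. It is not Listing Four: the names ThreadedEchoSketch, ServeOneClient, acceptLoop(), ask(), and twoClientsAtOnce() are hypothetical, the thread is started by the accept loop rather than by its own constructor, and the port is chosen by the system.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical thread-per-client echo server; not the article's Listing Four.
class ThreadedEchoSketch {
    // One of these threads serves exactly one connected client.
    static class ServeOneClient extends Thread {
        private final Socket socket;

        ServeOneClient(Socket socket) { this.socket = socket; }

        @Override
        public void run() {
            try (Socket s = socket;
                 BufferedReader in = new BufferedReader(
                     new InputStreamReader(s.getInputStream()));
                 PrintWriter out = new PrintWriter(s.getOutputStream(), true)) {
                String line;
                while ((line = in.readLine()) != null && !line.equals("END")) {
                    out.println(line);
                }
            } catch (IOException e) {
                System.err.println("client handler: " + e);
            }
        }
    }

    // Hands each accepted Socket to a new thread, then waits for the next one.
    static void acceptLoop(ServerSocket listener) {
        try {
            while (true) {
                Socket socket = listener.accept();
                new ServeOneClient(socket).start();
            }
        } catch (IOException e) {
            // accept() throws once the listener is closed; treat that as shutdown.
        }
    }

    // Sends one line on an open connection and returns the echo.
    static String ask(Socket socket, String message) throws IOException {
        PrintWriter out = new PrintWriter(socket.getOutputStream(), true);
        BufferedReader in = new BufferedReader(
            new InputStreamReader(socket.getInputStream()));
        out.println(message);
        return in.readLine();
    }

    // Opens two connections, then talks on the second while the first is idle.
    static boolean twoClientsAtOnce() throws IOException {
        try (ServerSocket listener = new ServerSocket(0)) {
            new Thread(() -> acceptLoop(listener)).start();
            InetAddress loopback = InetAddress.getByName(null);
            try (Socket first = new Socket(loopback, listener.getLocalPort());
                 Socket second = new Socket(loopback, listener.getLocalPort())) {
                String secondEcho = ask(second, "howdy from second");
                String firstEcho = ask(first, "howdy from first");
                return "howdy from second".equals(secondEcho)
                    && "howdy from first".equals(firstEcho);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("two clients served at once: " + twoClientsAtOnce());
    }
}
```

With a single-threaded server, the second client would get no answer while the first connection sat idle; here each connection has its own thread, so the order in which clients speak does not matter.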
The basic scheme is to make a single ServerSocket in the server, and call accept() to wait for a new connection. When accept() returns, the resulting Socket is used to create a new thread whose job is to serve that particular client; then you call accept() again to wait for a new client.
Listing Four is similar to the JabberServer.java example, except that all the operations to serve a particular client have been moved inside a separate thread class. The ServeOneJabber thread takes the Socket object produced by accept() in main() every time a new client makes a connection. Then, it creates a BufferedReader and autoflushed PrintWriter object using the Socket. Finally, it calls the special Thread method start(), which performs thread initialization and then calls run(). This process, too, reads something from the socket and echoes it until it reaches the END signal.
As before, a ServerSocket is created and accept() is called to allow a new connection. But this time, the return value of accept() is passed to the constructor for ServeOneJabber, which creates a new thread to handle that connection. When the connection is terminated, the thread simply goes away.
To test that the server really handles multiple clients, Listing Five creates many clients (using threads), each of which connects to the same server. Each thread has a limited lifetime, leaving space for new threads. The maximum number of threads allowed is determined by the final int maxthreads. This value is critical: If it’s too high, the threads seem to run out of resources and the program mysteriously fails.
The JabberClientThread constructor uses an InetAddress to open a Socket. Again, start() performs thread initialization and calls run(). Here, messages are sent to the server and information from the server is echoed to the screen. However, the thread has a limited lifetime and eventually completes.
The thread count keeps track of the number of JabberClientThread objects. It is incremented as part of the constructor, and decremented as run() exits (which means the thread is terminating). MultiJabberClient.main() tests the number of threads, and if there are too many, no more are created and the method sleeps. This way, some threads will eventually terminate and more can be created. You can experiment with maxthreads to see where your system begins to have trouble with too many connections.
Conclusion
These examples use the Transmission Control Protocol (TCP) running on top of IP to transport the data. TCP/IP adds a reliable layer of transport which guarantees that the data will get there. Lost data is retransmitted, multiple paths are provided through different routers, and bytes are delivered in the order sent. However, this control and reliability comes at the cost of high overhead.
An alternative transport, User Datagram Protocol (UDP), doesn’t guarantee that the packets will be delivered or that they will arrive in order. This “unreliable protocol” sounds bad, but it’s much faster, and for some applications, such as digitized audio, it isn’t so critical if a few packets are dropped here or there.
(Get the source code for this article here.)
Bruce is the author of Thinking in Java (freely available at www.EckelObjects.com), Thinking in C++ (Prentice-Hall, 1995), and C++ Inside & Out (Osborne/McGraw-Hill, 1993). He is the C++ and Java track chair for the Software Development conference and provides seminars and design consulting in C++ and Java. He can be reached at [email protected].