Web Techniques: Network Programming in Java

By Bruce Eckel

Historically, network programming has been error-prone, difficult, and complex because the programmer had to know many details about the network and sometimes even the hardware. However, the concept of networking is not so very difficult: You want to move information from one machine to another. It’s quite similar to reading and writing files, except that the file exists on a remote machine that determines what to do about the information you’re requesting or sending.

One of Java’s great strengths is painless networking. As much as possible, the underlying details of networking have been abstracted away and taken care of within the Java Virtual Machine (JVM) and local-machine installation of Java. The programming model is the file; in fact, you wrap the network connection (“socket”) with InputStream and OutputStream objects, so you end up using exactly the same method calls you would with files. In addition, Java’s built-in multithreading is exceptionally handy when dealing with multiple simultaneous connections.

Identifying a Machine

To differentiate between machines and ensure you are connected with the proper one, there must be a way of uniquely identifying machines on a network. Early networks provided unique names for machines within the local network, but because Java works within the Internet, a machine must be uniquely identified from all others in the world. This is accomplished with the Internet Protocol (IP) address, which is a 32-bit number that is unique across the Internet. The IP address can also be represented in two symbolic forms for human consumption. The first is “dotted-quad” notation: a text string of four numbers separated by dots, as in 123.255.28.120. The second, because dotted-quad numbers are difficult to remember, is a name defined via the domain-name service (DNS) that is mapped to your IP address.

In both cases, the IP address is represented internally as a 32-bit number (so no quad number can exceed 255). In Java, an object of type InetAddress holds this number, and you obtain one from either form by calling the static InetAddress.getByName() method in java.net. You can then use the object to build a socket; more on this later.

Here’s a simple example of using InetAddress.getByName(): Each time you dial up an Internet Service Provider (ISP), you are assigned a temporary IP address (assuming your provider uses dynamically allocated IP addresses, which most do today). While you’re connected, your IP address has the same validity as any other IP address on the Internet. For example, if you’re running a Web server or FTP server and someone connects to your machine using your IP address, then they can connect to your Web or FTP server. To do so, they’ll need your IP address, but it’s assigned each time you dial up, so how can you find out what it is?

Listing One uses InetAddress.getByName() to produce your IP address. (Because of space constraints, all listings are available electronically; see “Availability” on page 3.) To use it, you must know the name of your computer. In Windows 95 (no other platforms have been tested), go to Settings|Control Panel|Network and select the Identification tab; the computer name shown there is the name to give on the command line. My machine is called “Colossus,” so once I’ve connected to my ISP, I run the program with java WhoAmI Colossus, which returns a message like Colossus/199.190.87.75 (of course, the address is different each time).
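Listing One itself is distributed electronically, so what follows is only a minimal sketch of what a program like WhoAmI might look like, not the listing itself. The lookup() helper is a hypothetical name introduced here for illustration.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class WhoAmI {
    // Resolves a machine name (or dotted-quad string) and returns the
    // InetAddress in its printed form, which is "name/address".
    static String lookup(String host) throws UnknownHostException {
        InetAddress address = InetAddress.getByName(host);
        return address.toString();
    }

    public static void main(String[] args) throws UnknownHostException {
        if (args.length != 1) {
            System.err.println("Usage: java WhoAmI MachineName");
            System.exit(1);
        }
        System.out.println(lookup(args[0]));
    }
}
```

Run it with the machine name as the single argument, for example java WhoAmI Colossus.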

Someone else can log onto my personal Web server by going to http://199.190.87.75 (but only as long as I remain connected). This can be a handy way to distribute information or to test a Web-site configuration before posting it to a “real” server.

Servers and Clients

The whole point of a network is to allow two machines to communicate. Once they’ve connected, communication becomes a two-way process, regardless of which is the server and which is the client. The server’s job is to listen for a connection, and that’s performed by the special server object you create. The client’s job is to connect to a server, and this is performed by the special client object you create. Once the connection is made, it is magically turned into an InputStream and OutputStream object at both server and client ends, and from then on you treat the connection as reading from and writing to a file using the familiar commands from the Java I/O library.

Testing Programs without a Network

The designers of the Internet addressing scheme were aware that not everyone has a client machine, a server machine, and a network available to test programs. Thus, they created a special address, localhost, to be the “local loopback” IP address for testing without a network. The generic way to produce this address in Java is InetAddress addr = InetAddress.getByName(null);. If you hand getByName() a null, it defaults to using the localhost. The InetAddress is used to refer to the particular machine, so you must produce this first, and the only way to do so is through one of that class’s static member methods: getByName() (which you’ll usually use), getAllByName(), or getLocalHost().

You can’t manipulate the contents of an InetAddress (but you can print them; read on). You can also produce the local loopback address by handing getByName() the string localhost, as in InetAddress.getByName("localhost");, or by using its dotted-quad form to name the loopback’s reserved IP number: InetAddress.getByName("127.0.0.1");. All three forms produce the same result.
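The three ways of naming the loopback address can be tried side by side. This is a small sketch written for a current JDK rather than Java 1.1 (isLoopbackAddress() and the for-each loop arrived in later releases); the class and method names are made up for the example.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class LoopbackForms {
    // Produces the loopback address from null, from the name, and from
    // the reserved dotted-quad number.
    static InetAddress[] loopbackAddresses() throws UnknownHostException {
        return new InetAddress[] {
            InetAddress.getByName(null),
            InetAddress.getByName("localhost"),
            InetAddress.getByName("127.0.0.1")
        };
    }

    public static void main(String[] args) throws UnknownHostException {
        for (InetAddress addr : loopbackAddresses()) {
            // Printing an InetAddress calls its toString() method.
            System.out.println(addr + " loopback? " + addr.isLoopbackAddress());
        }
    }
}
```

The printed text can differ slightly between the forms (a literal address has no host name attached), but each object refers to the local machine.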

Port: A Unique Place within the Machine

An IP address isn’t enough to identify a unique server, since many servers can exist on one machine. Each machine also contains ports, and when setting up a client or a server, you must choose a port where client and server meet. It’s as if you’re meeting someone: The IP address is the neighborhood, and the port is the bar.

The port is not a physical location; it’s a software abstraction. What a port number really identifies is a service: a client asks for a particular service on a machine by connecting to the port on which that service listens. An example service would be the time of day.

The system services reserve ports 1 through 1024, so you shouldn’t use those or any other port that you know to be in use. My first choice for examples is port 8080.

Sockets

The socket is the software abstraction that represents the “terminals” of a connection between two machines. For a given connection, there’s a socket on each machine; a hypothetical “cable” runs between the two machines, each end of which is plugged into a socket. Of course, the physical hardware and cabling between machines are completely unknown at this level of abstraction.

In Java, you create a socket to make the connection to the other machine, then you get an InputStream and/or OutputStream from the socket in order to treat the connection as an I/O stream object. At first, there seem to be two kinds of sockets: a ServerSocket that waits for a network connection, and a Socket to represent a client. The Socket connects to a machine by specifying the IP address and port.

This would seem to be another example of the Java libraries’ confusing naming scheme. You might think it better to name the ServerSocket something without the word “socket” in it and that ServerSocket and Socket should both be inherited from a common base class. Indeed, the two classes have several methods in common but not enough for a common base class. Instead, ServerSocket has an accept() method that waits until another machine connects to it, then returns an actual Socket. This creates a true Socket-to-Socket connection in which both ends are treated the same way because they are the same. So, the job of ServerSocket isn’t really to be a socket but to make a Socket object when someone else connects to it.

However, the ServerSocket does create a “server,” or listening socket that listens for incoming connections and returns an “established” socket (with the local and remote endpoints defined) via accept(). The confusing part is that both listening and established sockets are associated with the same server socket. TCP knows that the listening socket can only accept new connection requests, not data packets. So while ServerSocket doesn’t make much sense programmatically, it does “physically.”

A ServerSocket requires only the port number it’s going to use; it doesn’t need an IP address because it’s already on the machine it represents. A Socket, however, needs both the IP address and the port number where you’re trying to connect.

Once you have Socket objects on both ends, you use the getInputStream() and getOutputStream() methods to produce the corresponding InputStream and OutputStream objects. These must be wrapped inside buffers and formatting classes just like any other stream object.
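To make the difference between the two constructors concrete, here is a minimal sketch that creates both ends of a connection inside one program and pushes a single line across it. It assumes a current JDK (try-with-resources), and it passes port 0 so the system picks any free port, which is a convenience for the sketch rather than something the article's listings do. SocketPair and sendOneLine() are hypothetical names.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class SocketPair {
    // Sends one line from the client end to the server end over the
    // loopback address and returns what the server end read.
    static String sendOneLine(String line) throws IOException {
        // A ServerSocket needs only a port (0 = let the system choose).
        try (ServerSocket listener = new ServerSocket(0)) {
            InetAddress loopback = InetAddress.getByName(null);
            // A Socket needs both the address and the port to connect to.
            try (Socket client = new Socket(loopback, listener.getLocalPort());
                 Socket server = listener.accept()) {
                // Wrap the raw streams in buffering and formatting classes.
                PrintWriter out = new PrintWriter(
                    new BufferedWriter(
                        new OutputStreamWriter(client.getOutputStream())), true);
                BufferedReader in = new BufferedReader(
                    new InputStreamReader(server.getInputStream()));
                out.println(line);
                return in.readLine();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("Server end read: " + sendOneLine("hello"));
    }
}
```

The connection request waits in the listener's queue until accept() is called, which is why a single thread can play both roles here.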

A Simple Server and Client

This example makes the simplest use of servers and clients using sockets. The server simply waits for a connection, then uses the Socket produced by that connection to create an InputStream and OutputStream. After that, it echoes everything it reads from the InputStream to the OutputStream until it receives the line END, and then closes the connection.

The client connects to the server, then creates an OutputStream, through which lines of text are sent. The client also creates an InputStream to hear what the server is saying (in this case, the words echoed back).

Both the server and client use the same port number, and the client uses the local loopback address to connect to the server on the same machine so you don’t have to test it over a network.

Listing Two is the server. The ServerSocket needs a port number, but not an IP address (since it’s running on this machine). When you call accept(), the method blocks until a client tries to connect to it; meanwhile, other processes can run. When a connection is made, accept() returns a Socket object representing that connection.

Both the ServerSocket and the Socket produced by accept() are printed to System.out, so their toString() methods are automatically called, producing:


ServerSocket[addr=0.0.0.0,port=0,localport=8080]
Socket[addr=127.0.0.1,port=1077,localport=8080]

The next part of the program looks just like opening files for reading and writing except that the InputStream and OutputStream objects are created from the Socket object. Both objects are converted to Java 1.1 Reader and Writer objects using the converter classes InputStreamReader and OutputStreamWriter, respectively. You could use the Java 1.0 InputStream and OutputStream classes directly, but Writer offers a distinct advantage: PrintWriter has an overloaded constructor that takes a second argument, a Boolean flag that indicates whether or not to automatically flush the output at the end of each println() statement (but not print()). Every time you write to out, its buffer must be flushed so the information actually goes out over the network. Flushing is important here — the client and server each wait for a line from the other party before proceeding, and if flushing doesn’t occur, the information will not be put onto the network until the buffer is full, causing lots of problems in this example.

Like virtually all files you open, these are buffered to augment speed.
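The autoflush behavior can be observed without a network by pointing the same chain of writers at an in-memory byte array. This is an illustrative sketch only; AutoFlushDemo and its method are hypothetical names.

```java
import java.io.BufferedWriter;
import java.io.ByteArrayOutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;

public class AutoFlushDemo {
    // Returns the number of bytes that have reached the underlying stream
    // after a print() call and then after a println() call.
    static int[] bytesAfterPrintAndPrintln() {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        PrintWriter out = new PrintWriter(
            new BufferedWriter(new OutputStreamWriter(sink)),
            true);                      // second argument: autoflush on println()
        out.print("howdy");             // stays in the buffer
        int afterPrint = sink.size();
        out.println();                  // ends the line and flushes
        int afterPrintln = sink.size();
        return new int[] { afterPrint, afterPrintln };
    }

    public static void main(String[] args) {
        int[] sizes = bytesAfterPrintAndPrintln();
        System.out.println("after print(): " + sizes[0] + " bytes");
        System.out.println("after println(): " + sizes[1] + " bytes");
    }
}
```

With a socket in place of the byte array, the text held back after print() is exactly the data the other party would be waiting for.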

The infinite while loop reads lines from the BufferedReader and writes information to System.out and the PrintWriter out. These could be any streams — they just happen to be connected to the network.

When the client sends the line consisting of END, the program breaks out of the loop and closes the Socket. It’s always a good idea to call close() to make sure everything is cleaned up properly.
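Since Listing Two is available only electronically, the following is a rough sketch of an echo server along the lines described, written for a current JDK and not to be taken as the author's listing. The serve() helper is a hypothetical name, and unlike an infinite loop it also stops when readLine() returns null, which happens if the client hangs up.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.net.ServerSocket;
import java.net.Socket;

public class JabberServer {
    static final int PORT = 8080;

    // Echoes every line read from the socket until the line END arrives,
    // then closes the socket.
    static void serve(Socket socket) throws IOException {
        try (Socket s = socket) {
            BufferedReader in = new BufferedReader(
                new InputStreamReader(s.getInputStream()));
            PrintWriter out = new PrintWriter(
                new BufferedWriter(
                    new OutputStreamWriter(s.getOutputStream())), true);
            String line;
            while ((line = in.readLine()) != null) {
                if (line.equals("END")) break;
                System.out.println("Echoing: " + line);
                out.println(line);   // autoflush sends it immediately
            }
        }
    }

    public static void main(String[] args) throws IOException {
        try (ServerSocket listener = new ServerSocket(PORT)) {
            System.out.println("Started: " + listener);
            Socket socket = listener.accept();   // blocks until a client connects
            System.out.println("Connection accepted: " + socket);
            serve(socket);
        }
    }
}
```

The try-with-resources statements take care of calling close() on both the Socket and the ServerSocket, even if an exception is thrown.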

Listing Three is the client. The main() routine shows how to produce the InetAddress of the local loopback IP address using null, localhost, or the explicit reserved address 127.0.0.1 (for which you would substitute a particular machine’s IP address). When the InetAddress addr is printed (via the automatic call to its toString() method) the result is localhost/127.0.0.1. Handing getByName() a null made it default to finding the localhost, producing the special address 127.0.0.1.

The Socket called “socket” is created with both the InetAddress and the port number. Remember, an Internet connection is determined uniquely by four pieces of data: clientHost, clientPortNumber, serverHost, and serverPortNumber. When the server comes up, it takes up its assigned port (8080) on the localhost (127.0.0.1); when the client comes up, it is allocated the next available port on its machine, in this case 1077, which also happens to be on the same machine (127.0.0.1) as the server. For data to move between the client and server, each side must know where to send it. Therefore, during the process of connecting to the “known” server, the client sends a “return address,” which is what you see in the example output for the server side: Socket[addr=127.0.0.1,port=1077,localport=8080]. This means the server accepted a connection from 127.0.0.1 on port 1077 while listening on its local port (8080). On the client side, the output is Socket[addr=localhost/127.0.0.1,port=8080,localport=1077], which means the client made a connection to 127.0.0.1 on port 8080 using local port 1077.
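The mirror-image relationship between the two ends can be checked directly with Socket's getPort() and getLocalPort() methods. The sketch below is an illustration under the same assumptions as before (a current JDK, and port 0 so that the system chooses a free port); ConnectionTuple and ports() are hypothetical names.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class ConnectionTuple {
    // Returns { client local port, client remote port,
    //           server local port, server remote port }.
    static int[] ports() throws IOException {
        try (ServerSocket listener = new ServerSocket(0);
             Socket client = new Socket(InetAddress.getByName(null),
                                        listener.getLocalPort());
             Socket server = listener.accept()) {
            return new int[] {
                client.getLocalPort(), client.getPort(),
                server.getLocalPort(), server.getPort()
            };
        }
    }

    public static void main(String[] args) throws IOException {
        int[] p = ports();
        System.out.println("client: localport=" + p[0] + ", port=" + p[1]);
        System.out.println("server: localport=" + p[2] + ", port=" + p[3]);
    }
}
```

The client's local port is the server's remote port, and the client's remote port is the server's local port.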

Each time you start up the client, the local-port number is incremented. It starts at 1025 (one past the reserved block of ports) and increases until you reboot the machine, which returns it to 1025. (On UNIX machines, once the upper limit of the socket range is reached, the numbers will wrap around to the lowest available number again).

Once the Socket object has been created, the process of turning it into a BufferedReader and PrintWriter is the same as in the server. Here, the client initiates the conversation by sending the string “howdy,” followed by a number. The buffer is flushed automatically via the second argument to the PrintWriter constructor; otherwise, the whole conversation would hang because the initial “howdy” would never get sent. (The buffer isn’t full enough to make it happen automatically.) Each line sent back from the server is written to System.out to verify that everything is working correctly. To terminate the conversation, the agreed-upon END is sent. If the client simply hangs up, the server throws an exception.
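A client of this kind might look roughly like the sketch below. It is not Listing Three, which is available electronically; the converse() helper and the count of ten lines are assumptions made for the example, and the code targets a current JDK.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.net.InetAddress;
import java.net.Socket;
import java.util.ArrayList;
import java.util.List;

public class JabberClient {
    // Sends "howdy 0", "howdy 1", ... to the server, collects each echoed
    // line, and finishes the conversation with END.
    static List<String> converse(InetAddress addr, int port, int count)
            throws IOException {
        List<String> echoes = new ArrayList<>();
        try (Socket socket = new Socket(addr, port)) {
            System.out.println("socket = " + socket);
            BufferedReader in = new BufferedReader(
                new InputStreamReader(socket.getInputStream()));
            PrintWriter out = new PrintWriter(
                new BufferedWriter(
                    new OutputStreamWriter(socket.getOutputStream())), true);
            for (int i = 0; i < count; i++) {
                out.println("howdy " + i);   // flushed by autoflush
                echoes.add(in.readLine());   // wait for the echo
            }
            out.println("END");
        }
        return echoes;
    }

    public static void main(String[] args) throws IOException {
        InetAddress addr = InetAddress.getByName(null);
        System.out.println("addr = " + addr);
        for (String echo : converse(addr, 8080, 10)) {
            System.out.println(echo);
        }
    }
}
```

Start the server first; with nothing listening on port 8080, the Socket constructor throws an exception.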

Sockets produce a “dedicated” connection that persists until it is explicitly disconnected. (The dedicated connection can still be broken unexpectedly if one side or an intermediary link crashes.) This means the two parties are locked in communication and the connection is constantly open. This puts an extra load on the network; later I’ll discuss an approach where the connections are temporary.

Serving Multiple Clients

The JabberServer in Listing Two can handle only one client at a time, whereas a typical server must deal with many clients at once. The answer is multithreading, which in Java is reasonably straightforward.

The basic scheme is to make a single ServerSocket in the server, and call accept() to wait for a new connection. When accept() returns, the resulting Socket is used to create a new thread whose job is to serve that particular client; then you call accept() again to wait for a new client.

Listing Four is similar to the JabberServer.java example, except that all the operations to serve a particular client have been moved inside a separate thread class. The ServeOneJabber thread takes the Socket object produced by accept() in main() every time a new client makes a connection. Then, it creates a BufferedReader and autoflushed PrintWriter object using the Socket. Finally, it calls the special Thread method start(), which performs thread initialization and then calls run(). This process, too, reads something from the socket and echoes it until it reaches the END signal.

As before, a ServerSocket is created and accept() is called to allow a new connection. But this time, the return value of accept() is passed to the constructor for ServeOneJabber, which creates a new thread to handle that connection. When the connection is terminated, the thread simply goes away.
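The accept-then-spawn scheme might be sketched as follows. This is not Listing Four: it is written for a current JDK, it opens the streams inside run() and calls start() from the accept loop rather than from the constructor, and the acceptLoop() helper is a hypothetical name.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.net.ServerSocket;
import java.net.Socket;

public class MultiJabberServer {
    static final int PORT = 8080;

    // One thread per client: echoes lines until END, then closes the socket.
    static class ServeOneJabber extends Thread {
        private final Socket socket;

        ServeOneJabber(Socket socket) {
            this.socket = socket;
        }

        @Override
        public void run() {
            try (Socket s = socket;
                 BufferedReader in = new BufferedReader(
                     new InputStreamReader(s.getInputStream()));
                 PrintWriter out = new PrintWriter(
                     new BufferedWriter(
                         new OutputStreamWriter(s.getOutputStream())), true)) {
                String line;
                while ((line = in.readLine()) != null && !line.equals("END")) {
                    out.println(line);
                }
            } catch (IOException e) {
                System.err.println("Connection failed: " + e);
            }
        }
    }

    // Waits for connections and hands each one to its own thread, then
    // immediately goes back to waiting for the next client.
    static void acceptLoop(ServerSocket listener) throws IOException {
        while (true) {
            Socket socket = listener.accept();
            new ServeOneJabber(socket).start();
        }
    }

    public static void main(String[] args) throws IOException {
        try (ServerSocket listener = new ServerSocket(PORT)) {
            System.out.println("Server started: " + listener);
            acceptLoop(listener);
        }
    }
}
```

Because each conversation runs in its own thread, a slow or idle client delays nobody else.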

To test that the server really handles multiple clients, Listing Five creates many clients (using threads), each of which connects to the same server. Each thread has a limited lifetime, leaving space for new threads. The maximum number of threads allowed is determined by the final int maxthreads. This value is critical: If it’s too high, the threads seem to run out of resources and the program mysteriously fails.

The JabberClientThread constructor uses an InetAddress to open a Socket. Again, start() performs thread initialization and calls run(). Here, messages are sent to the server and information from the server is echoed to the screen. However, the thread has a limited lifetime and eventually completes.

The thread count keeps track of the number of JabberClientThread objects. It is incremented as part of the constructor, and decremented as run() exits (which means the thread is terminating). MultiJabberClient.main() tests the number of threads, and if there are too many, no more are created and the method sleeps. This way, some threads will eventually terminate and more can be created. You can experiment with maxthreads to see where your system begins to have trouble with too many connections.
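The counting scheme can be sketched like this. It is an approximation for a current JDK, not Listing Five: the limit of 40, the five lines per conversation, the total of 100 clients, the launch() helper, and the finishedConversations counter are all assumptions, an AtomicInteger stands in for the plain counter, and the Socket is opened in run() rather than in the constructor.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.net.InetAddress;
import java.net.Socket;
import java.util.concurrent.atomic.AtomicInteger;

public class MultiJabberClient {
    static final int MAX_THREADS = 40;   // assumed value; tune for your system
    static final AtomicInteger threadCount = new AtomicInteger();
    static final AtomicInteger finishedConversations = new AtomicInteger();

    static class JabberClientThread extends Thread {
        private final InetAddress addr;
        private final int port;

        JabberClientThread(InetAddress addr, int port) {
            this.addr = addr;
            this.port = port;
            threadCount.incrementAndGet();       // counted on creation
        }

        @Override
        public void run() {
            try (Socket socket = new Socket(addr, port);
                 BufferedReader in = new BufferedReader(
                     new InputStreamReader(socket.getInputStream()));
                 PrintWriter out = new PrintWriter(
                     new BufferedWriter(
                         new OutputStreamWriter(socket.getOutputStream())), true)) {
                for (int i = 0; i < 5; i++) {
                    out.println("howdy " + i);
                    System.out.println(getName() + ": " + in.readLine());
                }
                out.println("END");
                finishedConversations.incrementAndGet();
            } catch (IOException e) {
                System.err.println(getName() + " failed: " + e);
            } finally {
                threadCount.decrementAndGet();   // run() is exiting
            }
        }
    }

    // Creates `total` client threads, sleeping whenever MAX_THREADS of
    // them are still alive so that some can finish first.
    static void launch(InetAddress addr, int port, int total)
            throws InterruptedException {
        int created = 0;
        while (created < total) {
            if (threadCount.get() < MAX_THREADS) {
                new JabberClientThread(addr, port).start();
                created++;
            } else {
                Thread.sleep(100);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        launch(InetAddress.getByName(null), 8080, 100);
    }
}
```

Raising MAX_THREADS is the experiment suggested above: at some point connections start to fail and the failure messages appear on System.err.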

Conclusion

These examples use the Transmission Control Protocol (TCP) running on top of IP to transport the data. TCP/IP adds a reliable layer of transport which guarantees that the data will get there. Lost data is retransmitted, multiple paths are provided through different routers, and bytes are delivered in the order sent. However, this control and reliability come at the cost of high overhead.

An alternative transport, User Datagram Protocol (UDP), doesn’t guarantee that the packets will be delivered or that they will arrive in order. This “unreliable protocol” sounds bad, but it’s much faster, and for some applications, such as digitized audio, it isn’t so critical if a few packets are dropped here or there.
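For comparison, java.net exposes UDP through the DatagramSocket and DatagramPacket classes. The sketch below sends one datagram across the loopback address; it is only an illustration, with hypothetical class and method names, a 1024-byte buffer, and a two-second timeout chosen arbitrarily so that a lost packet shows up as an exception instead of a hang.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;

public class DatagramSketch {
    // Sends one datagram to a receiver on the loopback address and
    // returns the text that arrived.
    static String sendAndReceive(String message) throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (DatagramSocket receiver = new DatagramSocket(0, loopback);
             DatagramSocket sender = new DatagramSocket()) {
            receiver.setSoTimeout(2000);   // give up after two seconds

            // No connection is set up: each packet carries its destination.
            byte[] payload = message.getBytes(StandardCharsets.UTF_8);
            sender.send(new DatagramPacket(
                payload, payload.length, loopback, receiver.getLocalPort()));

            byte[] buffer = new byte[1024];
            DatagramPacket packet = new DatagramPacket(buffer, buffer.length);
            receiver.receive(packet);      // times out if the packet was lost
            return new String(packet.getData(), 0, packet.getLength(),
                              StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("Received: " + sendAndReceive("howdy"));
    }
}
```

There is no accept() and no stream here: the sender simply addresses each packet, and nothing tells it whether the packet arrived.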

(Get the source code for this article here.)

Bruce is the author of Thinking in Java (freely available at www.EckelObjects.com), Thinking in C++ (Prentice-Hall, 1995), and C++ Inside & Out (Osborne/McGraw-Hill, 1993). He is the C++ and Java track chair for the Software Development conference and provides seminars and design consulting in C++ and Java.