data:image/s3,"s3://crabby-images/de028/de0285773f65d7119ab0dd4b9b656e0f9161475f" alt="Learning Network Programming with Java"
Networking basics
Networking is a broad and complex topic. In particular, a subtopic, such as addressing, is quite involved. We will introduce the terms and concepts that are commonly encountered and useful from a Java perspective.
Most of this discussion will focus on Java support for the Internet. A Uniform Resource Locator (URL) is recognized by most Internet users. However, the terms Uniform Resource Identifier (URI) and Uniform Resource Name (URN) are not recognized or understood as well as URL. We will differentiate between these terms and examine the Java supporting classes.
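As a first look at the distinction, the following minimal sketch uses the standard `java.net.URI` class to sort a few identifiers into URL, URN, or plain URI. The `UriKinds` class and its `classify` method are hypothetical names introduced for illustration, and the rule used here (a `urn` scheme means URN, anything Java can convert to a `java.net.URL` means URL) is a simplification rather than a complete definition.

```java
import java.net.MalformedURLException;
import java.net.URI;

class UriKinds {
    // Classifies an identifier as "URL", "URN", or "URI" (neither of the first two).
    static String classify(String text) {
        URI uri = URI.create(text);
        if ("urn".equalsIgnoreCase(uri.getScheme())) {
            return "URN"; // names a resource without saying where it is located
        }
        try {
            uri.toURL();  // works only for absolute URIs whose scheme Java can handle
            return "URL";
        } catch (MalformedURLException | IllegalArgumentException e) {
            return "URI"; // for example, a relative reference or an unknown scheme
        }
    }

    public static void main(String[] args) {
        System.out.println(classify("https://www.example.com/index.html")); // URL
        System.out.println(classify("urn:isbn:0451450523"));                // URN
        System.out.println(classify("docs/chapter1.html"));                 // URI
    }
}
```

Every URL and URN is also a URI, which is why the sketch parses all three with the same class.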
A browser user normally enters a URL for the site that they would like to visit. The host name in this URL needs to be mapped to an IP address, which is a number identifying the site's host on the network. The mapping is performed by a Domain Name System (DNS) server, so a user does not have to remember a number for each site. Java uses the InetAddress class to represent IP addresses and to resolve host names.
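The lookup described above can be sketched with `InetAddress.getByName`. The `HostLookup` class and `resolve` method are hypothetical helpers, and the default URL is only a placeholder: it contains a literal address so that the sketch runs without a network connection. Passing a URL with a real host name as the first argument would trigger an actual DNS query.

```java
import java.net.InetAddress;
import java.net.URI;
import java.net.UnknownHostException;

class HostLookup {
    // Extracts the host part of a URL and resolves it to a textual IP address.
    static String resolve(String url) {
        String host = URI.create(url).getHost();
        try {
            // A literal address is only validated; a host name is looked up.
            return InetAddress.getByName(host).getHostAddress();
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("Cannot resolve " + host, e);
        }
    }

    public static void main(String[] args) {
        String url = args.length > 0 ? args[0] : "http://127.0.0.1:8080/index.html";
        System.out.println(url + " -> " + resolve(url));
    }
}
```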
UDP and TCP are used by many applications. IP supports both of these protocols. The IP protocol transfers packets of information between nodes on a network. Java supports both the IPv4 and IPv6 protocol versions.
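Java models the two protocol versions with the `Inet4Address` and `Inet6Address` subclasses of `InetAddress`. The following sketch, with a hypothetical `IpVersion` class, reports which version a literal address belongs to. The sample addresses come from ranges reserved for documentation and are assumptions chosen for illustration.

```java
import java.net.Inet4Address;
import java.net.Inet6Address;
import java.net.InetAddress;
import java.net.UnknownHostException;

class IpVersion {
    // Returns "IPv4" or "IPv6" depending on the class Java chose for the address.
    static String versionOf(String literal) {
        try {
            InetAddress address = InetAddress.getByName(literal);
            if (address instanceof Inet4Address) {
                return "IPv4"; // 4 bytes
            }
            if (address instanceof Inet6Address) {
                return "IPv6"; // 16 bytes
            }
            return "unknown";
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("Not a valid address: " + literal, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(versionOf("192.0.2.1"));   // IPv4
        System.out.println(versionOf("2001:db8::1")); // IPv6
    }
}
```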
Both UDP and TCP are layered on top of IP. Several other protocols, such as HTTP, are layered on top of TCP. These relationships are shown in the following figure:
data:image/s3,"s3://crabby-images/35cd5/35cd5e0d67336f460599a7af7d40ecbed0604998" alt=""
When communications occur between different networks using different machines and operating systems, problems can occur due to differences at the hardware or software level. One of these issues is the set of characters that can be used in URLs. The URLEncoder and URLDecoder classes can help address this problem, and they are discussed in Chapter 9, Network Interoperability.
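As a brief preview, the sketch below encodes a value so that it can be placed safely in a URL query string and then decodes it again. The `QueryEncoding` class and the sample text are assumptions made for illustration; the `encode` and `decode` calls are the standard two-argument forms that take a character set name.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

class QueryEncoding {
    // Replaces characters that are unsafe in a query string with escapes.
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 should always be available", e);
        }
    }

    // Reverses encode(), turning escapes back into the original characters.
    static String decode(String value) {
        try {
            return URLDecoder.decode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 should always be available", e);
        }
    }

    public static void main(String[] args) {
        String encoded = encode("Java & networking?");
        System.out.println(encoded);         // Java+%26+networking%3F
        System.out.println(decode(encoded)); // Java & networking?
    }
}
```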
The IP address assigned to a device may be either static or dynamic. If it is static, it will not change each time the device is rebooted. With dynamic addresses, the address may change each time the device is rebooted or when a network connection is reset.
Static addresses are normally assigned manually by an administrator. Dynamic addresses are frequently assigned using the Dynamic Host Configuration Protocol (DHCP) running on a DHCP server. With IPv6, DHCP is not as essential because of the large IPv6 address space. However, DHCP is still useful for tasks such as supporting the generation of random addresses, which provide more privacy when the network is viewed from outside.
The Internet Assigned Numbers Authority (IANA) is responsible for allocating IP address space. Five Regional Internet Registries (RIRs) allocate IP address blocks to local Internet entities, which are commonly referred to as Internet Service Providers (ISPs).
There are several publications that detail the IP protocol:
- RFC 790—assigned numbers: This specification addresses the format of network numbers. For example, the IPv4 A, B, and C classes are defined in this specification (https://tools.ietf.org/html/rfc790).
- RFC 1918—address allocation for private internets: This specification is concerned with how private addresses are assigned. This allows multiple private addresses to be associated with a single public address (https://tools.ietf.org/html/rfc1918).
- RFC 2365—administratively scoped IP multicast: This specification defines the multicast address space and how it can be implemented. The mapping between IPv4 and IPv6 multicast address spaces is defined (https://tools.ietf.org/html/rfc2365).
- RFC 2373—IPv6 addressing architecture: This specification examines the IPv6 protocol, its format, and the various address types that are supported by IPv6 (http://www.ietf.org/rfc/rfc2373.txt).
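Some of the address categories covered by these specifications can be checked directly from Java. The following sketch uses predicate methods of `InetAddress` to label a few IPv4 literals. The `AddressKinds` class and the sample addresses are assumptions for illustration, and the sketch is limited to IPv4: for these addresses, `isSiteLocalAddress` matches the private ranges of RFC 1918.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

class AddressKinds {
    // Labels an IPv4 literal as loopback, multicast, private, or other.
    static String describe(String literal) {
        try {
            InetAddress address = InetAddress.getByName(literal);
            if (address.isLoopbackAddress()) {
                return "loopback";
            }
            if (address.isMulticastAddress()) {
                return "multicast";
            }
            if (address.isSiteLocalAddress()) {
                return "private"; // 10/8, 172.16/12, or 192.168/16
            }
            return "other";
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("Not a valid address: " + literal, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(describe("192.168.1.20")); // private
        System.out.println(describe("239.255.0.1"));  // multicast
        System.out.println(describe("127.0.0.1"));    // loopback
        System.out.println(describe("192.0.2.1"));    // other
    }
}
```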
Many of the concepts introduced here will be illustrated with Java code whenever possible. So let's start with understanding networks.
Understanding network basics
A network consists of nodes and links that are combined to create a network architecture. A device connected to a network is called a node, and a computer node is called a host. Communication between nodes is conducted along these links using protocols such as HTTP or UDP.
Links can either be wired, such as coaxial cable, twisted pairs, and fiber optics, or wireless, such as microwave, cellular, Wi-Fi, or satellite communications. These various links support different bandwidth and throughput to address particular communication needs.
Nodes include devices, such as Network Interface Controllers (NIC), bridges, switches, hubs, and routers. They are all involved with transmitting various forms of data between computers.
The NIC has an IP address and is part of a computer. Bridges connect two network segments, allowing a larger network to be broken down into smaller ones. Repeaters and hubs are used primarily to retransmit a signal, boosting its strength.
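To see the addresses bound to the interfaces of the current machine, a minimal sketch can use the standard `NetworkInterface` class. The `InterfaceLister` class is a hypothetical helper, and its output depends entirely on the machine it runs on, so none is shown here.

```java
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;

class InterfaceLister {
    // Returns one "name -> address" line per address of every interface found.
    static List<String> listAddresses() {
        List<String> lines = new ArrayList<>();
        try {
            Enumeration<NetworkInterface> interfaces = NetworkInterface.getNetworkInterfaces();
            if (interfaces == null) {
                return lines; // no interfaces reported on this machine
            }
            for (NetworkInterface nic : Collections.list(interfaces)) {
                for (InetAddress address : Collections.list(nic.getInetAddresses())) {
                    lines.add(nic.getName() + " -> " + address.getHostAddress());
                }
            }
        } catch (SocketException e) {
            // The interfaces could not be queried; return what was collected so far.
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : listAddresses()) {
            System.out.println(line);
        }
    }
}
```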
Hubs, switches, and routers are similar to each other but differ in their complexity. A hub handles multiple ports and simply forwards the data to all connected ports. A switch learns where to send data based on the traffic that it sees. A router can be programmed to manipulate and route messages. Routers are the most capable of the three, and most home networks use one.
When a message is sent across the Internet from a home computer, several things happen. The computer's address is typically not globally unique. This requires that any messages sent to and from the computer be handled by a Network Address Translation (NAT) device, which changes the address to one that can be used on the Internet. NAT allows a single public IP address to be shared by multiple devices on a network, such as a home LAN.
The computer may also use a proxy server, which acts as a gateway to other networks. Java provides support for proxies using the Proxy and ProxySelector classes. We will examine their use in Chapter 9, Network Interoperability.
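Ahead of that fuller treatment, the following minimal sketch shows what these two classes look like in use. The `ProxySketch` class, the `proxy.example.com` host, and port 8080 are all placeholders, not a real proxy. The address is created unresolved, so no network access takes place.

```java
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.ProxySelector;
import java.net.URI;
import java.util.List;

class ProxySketch {
    // Describes an HTTP proxy at the given host and port without contacting it.
    static Proxy httpProxy(String host, int port) {
        return new Proxy(Proxy.Type.HTTP, InetSocketAddress.createUnresolved(host, port));
    }

    public static void main(String[] args) {
        Proxy proxy = httpProxy("proxy.example.com", 8080); // placeholder values
        System.out.println(proxy.type() + " at " + proxy.address());

        // Ask the system-wide selector which proxies it would use for a URI.
        ProxySelector selector = ProxySelector.getDefault();
        if (selector != null) {
            List<Proxy> choices = selector.select(URI.create("http://www.example.com/"));
            System.out.println("Default selector suggests: " + choices);
        }
    }
}
```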
Messages are often routed through a firewall. The firewall protects the computer from malicious intent.
Network architectures and protocols
Common network architectures include bus, star, and tree-type networks. These physical networks are often used to support an overlay network, which is a virtual network. Such a network abstracts the underlying network to create a network architecture supporting applications, such as peer-to-peer applications.
When two computers communicate, they use a protocol. There are many different protocols used at various layers of a network. We will mainly focus on HTTP, TCP, UDP, and IP.
There are several models depicting how networks can be layered to support different tasks and protocols. One common model is the Open Systems Interconnection (OSI) model, which defines seven layers. Each layer of a network model can support one or more protocols. The relationships of various protocols are depicted in the following table:
data:image/s3,"s3://crabby-images/42b09/42b0994f993f9828131766a582202dccb7396f5a" alt=""
A more complete list of protocols for the OSI layers can be found at https://en.wikipedia.org/wiki/List_of_network_protocols_(OSI_model). We are not able to address all of these protocols and will focus on the more important ones that are supported by the Java SDK.
Consider the transfer of a web page from a server to a client. As it is sent to a client, the data is encapsulated in an HTTP message, which is further encapsulated in TCP, IP, and link-level protocol messages, each frequently containing a header and footer. This nested message is sent across the Internet to the destination client, where each encapsulating layer is removed in turn until the original HTML file is recovered and displayed.
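The nesting can be pictured with a toy model in which each layer simply prefixes its own tag to whatever it receives from the layer above. This is only an illustration of the idea: the `Encapsulation` class and the bracketed tags are invented for this sketch and bear no resemblance to the real binary header formats.

```java
class Encapsulation {
    // Adds one layer's (pretend) header in front of the payload.
    static String wrap(String header, String payload) {
        return "[" + header + "]" + payload;
    }

    // Removes one layer's header, failing if the message does not start with it.
    static String unwrap(String header, String message) {
        String prefix = "[" + header + "]";
        if (!message.startsWith(prefix)) {
            throw new IllegalArgumentException("Expected " + prefix + " header");
        }
        return message.substring(prefix.length());
    }

    // Sender side: the application data passes down through the layers.
    static String send(String html) {
        String message = wrap("HTTP", html);
        message = wrap("TCP", message);
        message = wrap("IP", message);
        return wrap("LINK", message);
    }

    // Receiver side: the layers are peeled off in the opposite order.
    static String receive(String frame) {
        String message = unwrap("LINK", frame);
        message = unwrap("IP", message);
        message = unwrap("TCP", message);
        return unwrap("HTTP", message);
    }

    public static void main(String[] args) {
        String frame = send("<p>hi</p>");
        System.out.println(frame);          // [LINK][IP][TCP][HTTP]<p>hi</p>
        System.out.println(receive(frame)); // <p>hi</p>
    }
}
```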
Fortunately, we do not need to be familiar with the details of this process. Many of the classes hide how this occurs, allowing us to focus on the data.
The protocols of the transport layer that we are interested in are TCP and UDP. TCP provides a more reliable communication protocol than UDP. However, UDP is better suited for short messages when delivery does not need to be robust. Streaming data often uses UDP.
The differences between UDP and TCP are outlined in the following table:
data:image/s3,"s3://crabby-images/74092/74092012a8c43d96010a7dd5ad9f7ddf63f7f3b9" alt=""
TCP is used for a number of protocols, such as HTTP, Simple Mail Transfer Protocol (SMTP), and File Transfer Protocol (FTP). UDP is used by DNS, to stream media such as movies, and for Voice over IP (VoIP).
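The difference in style between the two protocols shows up in the Java classes used for each. The sketch below sends one short message to itself over the loopback interface, first as a single UDP datagram and then over a TCP connection. The `TransportDemo` class, the two-second timeout, and the 1024-byte buffer are assumptions chosen for illustration, and the TCP half assumes a single-line message. It is a single-threaded demonstration, not a pattern for real servers.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

class TransportDemo {
    // UDP: no connection; each message is an independent datagram.
    static String udpRoundTrip(String message) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (DatagramSocket receiver = new DatagramSocket(0, loopback);
             DatagramSocket sender = new DatagramSocket()) {
            receiver.setSoTimeout(2000); // give up if the datagram never arrives
            byte[] out = message.getBytes(StandardCharsets.UTF_8);
            sender.send(new DatagramPacket(out, out.length, loopback, receiver.getLocalPort()));
            DatagramPacket in = new DatagramPacket(new byte[1024], 1024);
            receiver.receive(in);
            return new String(in.getData(), 0, in.getLength(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // TCP: a connection is established first, then data flows as a stream.
    static String tcpRoundTrip(String message) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, loopback);
             Socket client = new Socket(loopback, server.getLocalPort());
             Socket accepted = server.accept()) {
            accepted.setSoTimeout(2000);
            PrintWriter out = new PrintWriter(
                    new OutputStreamWriter(client.getOutputStream(), StandardCharsets.UTF_8), true);
            out.println(message); // auto-flushed, so the line is sent immediately
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(accepted.getInputStream(), StandardCharsets.UTF_8));
            return in.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("UDP received: " + udpRoundTrip("ping"));
        System.out.println("TCP received: " + tcpRoundTrip("hello"));
    }
}
```

Note that the UDP half needs a timeout because nothing guarantees the datagram will arrive, whereas the TCP half relies on the connection to deliver the bytes in order.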