Overview of H.323
H.323 is a communication protocol from the ITU-T. It is a VoIP call control protocol that allows for the establishment, maintenance, and teardown of multimedia sessions across H.323 endpoints. H.323 is a suite of specifications that controls the transmission of voice, video, and data over IP networks. The following are some of the H.323 specifications relevant to the subject matter laid out in this book:
■ H.225: H.225 handles call setup and teardown between H.323 endpoints and is also responsible for peering with H.323 gatekeepers via the Registration Admission Status (RAS) protocol.
■ H.245: H.245 acts as a peer protocol to H.225 and is used to negotiate the characteristics of the media session, such as media format, the method of DTMF relay, the media type (audio, video, fax, and so on), and the IP address/port pair for media.
■ H.450: H.450 controls supplementary services between H.323 entities. These supplementary services include call hold, call transfer, call park, and call pickup.
H.323 Components
The H.323 protocols outlined in the previous section are used in the communications between H.323 components or devices. The following are the most common H.323 devices:
■ H.323 gateways: H.323 gateways are endpoints that are capable of interworking between a packet network and a traditional Plain Old Telephone Service (POTS) network (analog or digital). Since these H.323 endpoints can implement their own call routing logic, they are considered to be “intelligent” and, as such, operate in a peer-to-peer mode. H.323 gateways are capable of registering to a gatekeeper and interworking calls with a gatekeeper by using the RAS protocol.
■ H.323 gatekeepers: H.323 gatekeepers function as devices that provide lookup services. They indicate via signaling to which endpoint or endpoints a particular called number belongs. Gatekeepers also provide functionality such as Call Admission Control and security. Endpoints register to the gatekeeper by using the RAS protocol.
■ H.323 terminals: Any H.323 device that is capable of setting up a two-way, real-time media session is an H.323 terminal. H.323 terminals include voice gateways, H.323 trunks, video conferencing stations, and IP phones. H.323 terminals use H.225 for session setup, progress, and teardown. They also use H.245 to define characteristics of the media session such as the media format, the method of DTMF, and the media type.
■ Multipoint control units: These H.323 devices handle multiparty conferences, and each device is composed of a multipoint controller (MC) and multipoint processor (MP). The MC is responsible for H.245 exchanges, and the MP is responsible for the switching and manipulation of media.
H.323 Call Flow
An H.323 call basically involves the following:
■ A TCP socket must be established on port 1720 to initiate H.225 signaling with another H.323 peer. This assumes that there is no gatekeeper in the call flow. As defined in the previous section, gatekeepers assist in endpoint discovery and call admission.
■ For an H.323 call, the H.225 exchange is responsible for call setup and termination, whereas the H.245 exchange is responsible for establishment of the media channels and their properties. In most cases, the establishment of two independent TCP connections is required: one for the H.225 exchange and the other for the H.245 exchange. To effectively bind the two, the TCP port number on which the answering terminal intends to establish an H.245 exchange is advertised in one of the H.225 messages. The port number can be advertised before the H.225 connect message is sent (for example, in an H.225 progress message) or when the H.225 connect message is sent.
■ H.225 and H.245 exchanges can proceed on the same TCP connection, using a process called H.245 tunneling.
■ Every H.245 message is unidirectional in the sense that it is used to specify the negotiation from the perspective of the sender of that H.245 message. For the successful establishment of a two-way real-time session, both H.323 terminals must exchange H.245 messages.
Figure 2-7 depicts a basic H.323 slow start call between two H.323 terminals. The calling terminal first initiates a TCP connection to the called terminal, using destination port 1720. Once this connection is established, H.225 messages are exchanged between the two terminals to set up the call. In order to negotiate parameters that define call characteristics such as the media types (for example, audio, video, fax), media formats, and DTMF types, an H.245 exchange has to ensue between the terminals.
In most cases, a separate TCP connection is established between the endpoints to negotiate an H.245 exchange; however, in some cases, as an optimization, H.245 messages are tunneled using the same TCP socket as H.225, using a procedure known as H.245 tunneling. When utilizing a separate TCP connection for H.245, the called terminal advertises the TCP port number over which it intends to establish an H.245 exchange. The ports used for the establishment of H.245 are ephemeral and are not dictated by the H.323 specification.
The H.245 exchange results in the establishment of the media channels required to transmit and receive real-time information. You should be aware that while Figure 2-7 highlights a slow start call, a variant to the slow start procedure, known as FastConnect, also exists and is depicted in Figure 2-8. As the name suggests, FastConnect is a quicker and more efficient mechanism to establish an H.323 call. In fact, FastConnect can establish an H.323 call with as few as two messages. This is possible because with FastConnect there is no need to open an H.245 socket, as long as all needed media can be negotiated via FastConnect.
Figure 2-8 shows how an H.323 FastConnect call is set up. When transmitting a Call Setup message, the endpoint populates a field, known as the fastStart element, with H.245 messages. The called endpoint can accept FastConnect by selecting any fastStart element in the Call Setup message, populate the necessary data fields (as specified in H.323), and return a fastStart element in any H.225 message (for example, Call Proceeding, Alerting, Connect) to the caller. The called endpoint can also reject FastConnect and fall back to the traditional slow start procedures shown in Figure 2-7 by either explicitly indicating so (using a flag), initiating any H.245 communications, or providing an H.245 address for the purposes of initiating H.245 communications.