IP Telephony
http://www.tml.tkk.fi/Opinnot/Tik-110.551/1999/papers/04IPTelephony/voip.html[5/30/2011 3:32:35 PM]
IP Telephony
Lasse Huovinen
Department of Computer Science and Engineering
Helsinki University of Technology
[email protected]
Shuanghong Niu
Department of Computer Science and Engineering
Helsinki University of Technology
[email protected]
Abstract
IP telephony, or voice over IP, has become a hot topic during the last three years. IP telephony means placing
telephone calls over IP networks instead of the public switched telephone network (PSTN). Currently IP telephony
offers lower call prices but also lower quality of service than the PSTN. The aim of this paper is to give an
introduction to several aspects of IP telephony: the concept of IP telephony, technical and business issues,
advantages and challenges, and end user applications and their vendors.
Keywords: IP telephony, Voice over IP, H.323, SIP, PSTN, Internet, QoS
Table of Contents
Abbreviations
Glossary
1. Introduction
2. Concept of IP Telephony
2.1. History of IP Telephony
2.2. IP Telephony vs. PSTN
2.3. Advantages of IP Telephony
2.4. Challenges of IP Telephony
3. Business Aspects of IP Telephony
3.1. Business Scenarios of IP Telephony
3.2. Current Market Situation
3.3. Users of IP Telephony
3.4. IP Telephony End User Prices
4. Technical Aspects of IP Telephony
4.1. Network Architecture
4.2. Protocol Architecture
4.3. Message Structure
4.4. Real-time Transport Protocol
4.5. Quality of Service
4.6. IETF's SIP
4.7. Security Issues
4.8. Network Interoperability
5. Applications and Vendors
5.1. Software Solutions
5.2. Quicknet's Specialized Soundcard
5.3. Nokia Solution
5.4. Selsius' Ethernet Phones
5.5. Future Phones
6. Summary
References
Abbreviations
CCB
Customer Care and Billing
ETSI
European Telecommunications Standards Institute
HTML
Hyper Text Markup Language
HTTP
Hyper Text Transfer Protocol
IETF
Internet Engineering Task Force
IP
Internet Protocol
ISDN
Integrated Services Digital Network
ISP
Internet Service Provider
IMTC
International Multimedia Teleconferencing Consortium
ITC
Internet & Telecoms Convergence Consortium
ITU
International Telecommunication Union
LAN
Local Area Network
MCU
Multipoint Control Unit
NMS
Network Management System
PBX
Private Branch Exchange
POTS
Plain Old Telephone Service
PSTN
Public Switched Telephone Network
QoS
Quality of Service
RTSP
Real-time Streaming Protocol
SIP
Session Initiation Protocol
TCP
Transmission Control Protocol
TIPHON
Telecommunications and Internet Protocol Harmonization over Networks
UDP
User Datagram Protocol
VoIP
Voice over IP
Glossary
E.164
ITU-T standard which defines the numbering arrangements for integrated services digital networks (ISDN).
IPSec
Internet Protocol Security is a developing standard for security at the network or packet-processing layer of
network communication. IPSec provides two choices of security service: Authentication Header which
essentially allows authentication of the sender of data, and Encapsulating Security Payload which supports both
authentication of the sender and encryption of data. The specific information associated with each of these
services is inserted into the packet in a header that follows the IP packet header.[24] IPSec does not require
modifications to the application.
ISDN
Integrated Services Digital Network is a set of CCITT/ITU standards for digital transmission over ordinary
telephone copper wire as well as over other media. Home and business users who install ISDN adapters (in
place of their modems) can see highly graphical Web pages arriving at relatively high speed (up to 128 kbit/s).
ISDN is generally available in most urban areas in the United States and Europe.
PBX
Private Branch Exchange is a telephone system within an enterprise that switches calls between enterprise users
on local lines while allowing all users to share a certain number of external phone lines. The main purpose of a
PBX is to save the cost of requiring a line for each user to the telephone company's central office.
PSTN
Public Switched Telephone Network refers to the world's collection of interconnected voice-oriented public
telephone networks both commercial and government owned. It is also referred to as the Plain Old Telephone
Service (POTS).
Q.931
ITU-T standard which specifies the procedures for establishing, maintaining, and clearing network
connections at the ISDN user-network interface. These procedures are defined in terms of messages exchanged
over the D-channel of basic and primary rate interface structures.
S/MIME
Secure Multi-Purpose Internet Mail Extension is a secure method of sending e-mail that uses the RSA encryption
system. MIME (RFC 1521) itself spells out how an electronic message will be organized. S/MIME describes
how encryption information and a digital certificate can be included as part of the message body. RSA has
proposed S/MIME as a standard to the IETF.[24]
SSL
Secure Sockets Layer is a program layer created by Netscape for managing the security of message transmissions
in a network. The "sockets" part of the term refers to the sockets method of passing data back and forth between
the client and a server program in a network or between program layers in the same computer. SSL uses the
public and private key encryption system from RSA which also includes the use of digital certificate.[24]
TLS
Transport Layer Security is a technique for secure communication residing on top of the IP stack. TLS
requires modifications to the applications. TLS is based on Secure Sockets Layer (SSL) and Private
Communication Technology (PCT).
1. Introduction
The current development in the telecommunications industry is changing the use of telecommunications networks
remarkably. People use telephone lines more and more for data transfer instead of ordinary voice calls. At the moment
data traffic is estimated to be equal to voice traffic, but by the year 2001 data traffic is estimated to be
one hundred times greater than voice traffic [3].
Obviously these changes have a strong effect on networks and operators. Public networks are moving from circuit
switched networks to packet switched networks. As a result, telecommunications networks and
datacommunications networks, which are mainly IP based, will converge into the same networks, as illustrated in
Figure 1-1, and voice calls will be transmitted over IP based networks as well.
IP telephony, or voice over IP, means that voice and fax calls are transmitted over an IP network such as the Internet,
rather than over the familiar public switched telephone network (PSTN). During the last three years IP telephony has
become a hot topic. Five years ago IP telephony was a virtually unknown technology, but at the moment it seems that IP
telephony will revolutionize the voice call business and the technology used to transport voice calls. Since access to
the Internet is available at local phone connection rates, an international or other long-distance call will be much
less expensive than through the traditional call arrangement.
On the Internet, three new services are now or will soon be available:
The ability to make a normal voice phone call (whether or not the callee is immediately available; that is, the
phone will ring at the location of the callee).
The ability to send faxes at very low cost (at local call prices) through a gateway point on the Internet in major
cities.
The ability to leave voice mail to a called number.
Figure 1-1. Convergence of telecommunications and datacommunications networks.
The focus of this paper is on technical aspects (network elements, protocols, quality of service, security,
interoperability) and end-user applications (vendors and their products). The authors' own visions of future
equipment are also given.
First, however, an overview of IP telephony is given: the concept of IP telephony, its history, the differences
between IP telephony calls and PSTN calls, and the advantages and challenges of IP telephony. Second, the business
aspects are described briefly: business scenarios, the current market situation, and end user prices. Finally, at the
end of the paper, some conclusions are drawn.
2. Concept of IP Telephony
IP telephony means transmission of voice over IP networks. The basic steps for transmitting voice over an IP network
are listed below and illustrated in Figure 2-1:[1]
1. Audio from a microphone or line input is A/D converted at the audio input device.
2. The samples are copied into a memory buffer in blocks of frame length.
3. The IP telephony application estimates the energy level of the block of samples.
4. A silence detector decides whether the block is to be treated as silence or as part of a talkspurt.
5. If the block is part of a talkspurt, it is coded with the selected algorithm (e.g., GSM 06.10).
6. Some header information is added to the block.
7. The block with headers is written to the socket interface (UDP).
8. The packet is transferred over a physical network and received by the peer.
9. The header information is removed, the block of audio is decoded using the same algorithm with which it was
encoded, and the samples are written into a buffer.
10. The block of samples is copied from the buffer to the audio output device.
11. The audio output device D/A converts the samples and outputs them.
Figure 2-1. Basic idea of IP telephony.
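The sender and receiver steps listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the frame length, the silence threshold, and the two-field header (sequence number and timestamp) are hypothetical values chosen for the example, and uncompressed 16-bit PCM stands in for a real codec such as GSM 06.10, which is not part of the Python standard library.

```python
import socket
import struct

FRAME_LEN = 160                 # samples per block (assumed: 20 ms at 8 kHz)
SILENCE_THRESHOLD = 1000.0      # mean-square energy cutoff (assumed value)
HEADER = struct.Struct("!HI")   # hypothetical header: sequence number, timestamp


def frame_energy(frame):
    """Step 3: mean-square energy of one block of samples."""
    return sum(s * s for s in frame) / len(frame)


def encode(frame):
    """Step 5 stand-in: pack samples as 16-bit big-endian PCM (no compression)."""
    return struct.pack("!%dh" % len(frame), *frame)


def decode(payload):
    """Step 9 stand-in: inverse of encode()."""
    return list(struct.unpack("!%dh" % (len(payload) // 2), payload))


def packetize(samples):
    """Steps 2-6: cut samples into frames, skip silent frames, add headers."""
    packets = []
    seq = 0
    for start in range(0, len(samples) - FRAME_LEN + 1, FRAME_LEN):
        frame = samples[start:start + FRAME_LEN]
        if frame_energy(frame) < SILENCE_THRESHOLD:
            continue  # step 4: treated as silence, nothing is sent
        header = HEADER.pack(seq & 0xFFFF, start & 0xFFFFFFFF)
        packets.append(header + encode(frame))
        seq += 1
    return packets


def send(packets, address):
    """Step 7: write each packet to a UDP socket; address is (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        for packet in packets:
            sock.sendto(packet, address)


def depacketize(packet):
    """Step 9: strip the header and decode the audio block."""
    seq, timestamp = HEADER.unpack_from(packet)
    return seq, timestamp, decode(packet[HEADER.size:])
```

The timestamp here is simply the index of the first sample in the frame, which lets a receiver keep the gaps left by suppressed silence at the right length during playout.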
2.1. History of IP Telephony
Transmission of voice signals over packet networks was already studied in the late 70's and early 80's. In
the late 70's there were discussions and experiments with packetized voice over ARPANET, the predecessor of the
Internet, using IP and specialized coding and packetizing equipment [5]. The real development of IP telephony started
in 1995, when VocalTec pioneered the IP telephony market with PC software which opened a voice connection
between two PCs over an IP-based network. The product was ideally suited for the Internet. After that, several
competing software packages were launched one after another. In 1996 the first interworking trials between IP
networks and the PSTN were made. In 1997 Delta Three launched the first phone-to-phone service for commercial use[2].
The development of IP telephony is summarized in the following[8]:
1995 - The year of the Hobbyist
1996 - The year of the IP Telephony Client
1997 - The year of the Gateway
1998 - The year of the Gatekeeper
1999 - The year of the Application
In the beginning of the next millennium IP telephony will be used over mobile IP networks such as GPRS (General
Packet Radio Service) or UMTS (Universal Mobile Telecommunications System). This is an opportunity for new
mobile telephony operators because they do not have to build separate networks or parts of networks for voice and
data. It also offers opportunities for new mobile applications. This is further discussed in Section 5.5.
2.2. IP Telephony vs. PSTN
The primary technical difference between the Internet and the PSTN is their switching architectures. The Internet uses
dynamic routing (based on non-geographic addressing) versus the PSTN which uses static switching (based on
geographic telephone numbering). Furthermore, the Internet's "intelligence" is very much decentralized, or distributed,
versus the PSTN which bundles transport and applications resulting in the medium's intelligence residing at central
points in the network.
The PSTN is a circuit switched network. It dedicates a fixed amount of bandwidth to each conversation and thus quality is
guaranteed. When the caller places a typical voice call, she picks up the phone and hears the dial tone. Then she dials
the country code, the area code, and the number of the callee. The central office establishes the connection, and then
the caller and the callee can talk to each other.
When the caller places an IP telephony call, she picks up the phone and hears a dial tone from the PBX (private branch
exchange) if one is available. Then she dials a number, which is forwarded to the nearest IP telephony gateway located
between the PBX and a TCP/IP network [7]. The IP telephony gateway finds a route through the Internet that reaches
the called number. Then the call is established. The IP telephony gateway encodes the voice into IP packets and sends
them on their way over the TCP/IP network as if they were typical data packets. Upon receiving the voice packets,
the remote IP telephony gateway converts them back into analog signals, which are delivered to the callee through the
PBX.
2.3. Advantages of IP Telephony
In a relatively short period of time, IP telephony is expected to revolutionize the telecommunications industry.
Advantages of IP telephony include lower cost long distance calls and reduced access charges, more efficient backbones,
and compelling new services. The benefits of shifting traditional voice onto packet networks can be reaped by
businesses, Internet Service Providers (ISP), traditional carriers, etc. Businesses benefit from IP telephony because
it takes advantage of existing data networks, reduces operating costs by requiring only one network to be managed, and
gives them almost toll-quality voice. In a recent survey of 52 Fortune 1000 firms conducted by Forrester Research, more
than 40% of telecom managers planned to move some voice/fax traffic to IP networks by 1999.
For consumers IP telephony started from inexpensive or even free Internet calls. Although IP telephony calls lack
some quality when compared to calls in the PSTN, people are willing to sacrifice it for a much lower cost. Many
households have only one telephone line, which is often used for Web browsing. They would therefore like to use IP
telephony as a way of accepting incoming calls by rerouting them to the PC.
ISPs have also become increasingly focused on IP telephony because it enables them to offer new services beyond
Internet access (voice/fax), improve their network utilization, and offer voice services at significantly lower rates.
It is forecast that the Internet will carry 11% of U.S. and international long distance traffic and 10% of the world's
fax traffic by 2002.[9] In fact, UUNET, PSINet, IDT and America Online have already announced IP telephony and fax
plans for 1999.
2.4. Challenges of IP Telephony
Since IP packets carrying voice are treated just like IP packets carrying any other type of data, they are subject to
delays, loss, and retransmissions. This is especially true when the network is congested. Quality of service therefore
becomes a very important issue. Losing every other word of a phone call can make the call essentially worthless. IP
telephony is facing the following challenges:
Unpredictable service quality which relates to quality of service and reliability. Real-time applications set high
requirements on the reliability and quality of service capabilities of IP networks. Protocols and techniques to
ensure this must be developed. Until these techniques are widely deployed and supported by most networks,
over-provisioning or proprietary methods in private IP networks remain the only way to ensure the required
QoS.
Complex system integration related to datacom and telecom convergence: Network Management Systems (NMS)
integration, Customer Care and Billing (CCB) systems integration, and diversity in the marketplace. IP
telephony equipment consists of new network elements that need to be integrated into the corporate network and
into the teleoperator's or service provider's network. Both physical and logical integration with the other network
elements is required, as well as integration with the vital operation support systems such as maintenance,
provisioning, and billing systems.
Lack of interoperability, because a single comprehensive standard does not exist. There are several competing or
partially overlapping standard proposals. Current IP telephony standards only ensure interoperability within a
single IP telephony subnetwork. The communication between gateways or gatekeepers from different vendors
remains to be standardized.
Regulatory development will have a major impact on IP telephony. In most countries IP telephony is still
unregulated but the regulatory authorities are monitoring the situation closely.
Inertia in the traditional telecom services: large investments are tied up in legacy networks and technologies, and
people are accustomed to the old services.
3. Business Aspects of IP Telephony
The communication industry is going through a period of explosive change. Data is becoming a more significant
proportion of traffic compared to voice. IP is today considered the most promising platform on which to build new
services. IP telephony service shapes the Internet for real-time services.
Telecom operators and ISPs, finding themselves fighting for the same customers, can benefit from IP telephony in
several ways. The ability to offer cost optimized telephony services by bypassing the circuit switched network enables
price discrimination as well as entirely new business opportunities for challenging operators and ISPs.
3.1. Business Scenarios of IP Telephony
There are three basic business scenarios for IP telephony.
IP terminal (such as computer) to IP terminal (see Figure 3-1)
In this scenario, both subscribers A and B use computers attached to an IP network as terminals. Voice is
compressed and decompressed by the PC software. This requires both parties participating in the call to have a PC
with a sound card, microphone, loudspeaker, and compatible IP telephony client software.
Figure 3-1. Business scenario: IP terminal <--> IP terminal.
IP terminal to Phone or Phone to IP terminal (see Figure 3-2)
In this scenario one of the subscribers is using a computer for IP-voice and the other uses a phone on a
PSTN/ISDN/GSM/TDM network. A gateway on the edge of the IP network translates IP-voice to voice and takes care
of the signaling between the two networks.
Figure 3-2. Business scenario: IP terminal <--> Phone.
Phone to phone (see Figure 3-3)
Both subscribers are using conventional phones in this option, and the IP network is used for the long distance
connection. Gateways on both ends take care of traffic and signaling translations between networks.
Figure 3-3. Business scenario: Phone <--> Phone.
3.2. Current Market Situation[9]
It is believed that IP telephony will create exciting new revenue opportunities for service providers and networking
vendors alike. The voice telephony carrier services market is expected to grow to $1 trillion worldwide by 2000. Sales
of IP gateways alone should approach $1.81 billion by the end of 2001, a five-year compound annual growth rate of
229 percent. Furthermore, by 2001 it is projected that 96% of revenues in the IP telephony market will come from the
gateway segment.
Even though IP telephony becomes more popular every day, it will remain a complement to traditional PSTN
services for many years. IP telephony technology is not yet mature. There are still many problems to be solved,
such as the voice quality and reliability issues.
A few statements on the market outlook follow.
Internet will carry 11% of U.S. & international long distance traffic and 10% of the world's fax traffic by
2002.[9]
25% of world's telephone calls will travel over the "Net" within 12 years (InfoTEST International)
60 million PC users will be using the Internet for voice calls by 1999 (International Data Corporation)
IP telephony will account for 12.5 billion minutes of use by 2001 (International Data Corporation)
70% of Fortune 1000 will use Internet for voice and fax by 2002 (CTI Magazine Nov 97)
3.3. Users of IP Telephony
There are two major types of IP telephony end users: residential users and business users.
In the early stage of IP telephony, residential users are people who are familiar with computers and are early
adopters of new technology. They are willing to accept and test new technology. Especially people who are abroad, or
whose family or friends are abroad, such as students and expatriates, may find IP telephony a great opportunity.
They are very sensitive to the price of long distance calls and are willing to accept lower quality in exchange for
lower cost.
Nowadays IP telephony traffic is mainly generated by residential customers, but there are forecasts that the number
of business users of IP telephony will increase remarkably in the future. The reasons for the change are:
The quality of IP telephony is improving all the time.
A company achieves a major cost saving by running only a data network instead of both a data network and a
telephone network.
New corporate solutions for IP telephony are appearing.
Small and home offices will become early adopters.
Table 3-1 presents estimates of the relative shares of IP telephony traffic for different types of users.[11]
Year   Enterprise (%)   Consumer (%)
1997   8                92
1998   11               89
2002   43               57
Table 3-1. Relative shares of IP telephony users.
3.4. IP Telephony End User Prices
The prices for IP telephony services vary greatly depending on the originating and terminating points. However, IP
telephony service providers typically offer a 50-60% discount on international routes.[4]
The average European PSTN international call price per minute was estimated to be approximately 60 US cents in
1998. In 2002 the IP telephony price is expected to be at most 10 US cents per minute.[6]
Some prices of different types of telephony service from Finland are shown in Table 3-2.[4]
Destination   Telia IP         RSL.COM IP       Telia PSTN    Sonera PSTN   Delta Three IP
              Phone-to-phone   Phone-to-phone   telephone     telephone     PC-to-phone
              FIM/min+ppm      FIM/min+ppm      FIM/min+ppm   FIM/min+ppm   US$/min+Internet access
Germany       1.93             1.40             2.55          2.89          0.13
USA           1.85             1.40             2.79          3.29          0.10
New Zealand   7.16             -                8.95          8.99          0.16
UK            1.85             1.40             2.55          3.29          0.12
Table 3-2. Price examples of telephone calls from Finland.
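As a small worked example, the relative saving of an IP telephony price against a PSTN price can be computed directly from the per-minute figures in Table 3-2. The sketch below compares only the Telia IP phone-to-phone and Telia PSTN columns and ignores the "+ppm" component of the tariffs; the function and variable names are chosen for this illustration.

```python
# Prices in FIM per minute, copied from Table 3-2 ("+ppm" component ignored).
PSTN_TELIA = {"Germany": 2.55, "USA": 2.79, "New Zealand": 8.95, "UK": 2.55}
IP_TELIA = {"Germany": 1.93, "USA": 1.85, "New Zealand": 7.16, "UK": 1.85}


def discount_percent(pstn_price, ip_price):
    """Relative saving of the IP price against the PSTN price, in percent."""
    return 100.0 * (pstn_price - ip_price) / pstn_price


for country, pstn_price in PSTN_TELIA.items():
    saving = discount_percent(pstn_price, IP_TELIA[country])
    print(f"{country}: {saving:.1f}% cheaper over IP")
```

The same function can be applied to any pair of columns that share a currency; the Delta Three column is quoted in US dollars and would first need a conversion rate, which the table does not give.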
4. Technical Aspects of IP Telephony
IP telephony requires new technology in networks: logical network elements and protocols. New logical network
elements are needed for call management, routing, storing call information, etc. Signaling protocols are used to
establish calls or multimedia sessions, such as multimedia conferences, voice calls, and distance learning. The IP
signaling protocols are used to create connections between clients over intranets or the Internet. The main functions
of signaling protocols are user location lookup, address translation, connection setup, service feature negotiation,
call termination, and call participant management such as invitation of more participants. Additionally, signaling
protocols are responsible for functions such as billing, security, and directory services.
One of the most vital issues for widespread use of IP telephony is to achieve international standards that enable
equipment from different vendors to work properly together. Currently several organizations are developing
IP telephony standards. A few important organizations are ITU-T, IETF, ETSI, iNOW!, IMTC VoIP forum, and MIT's
Internet Telephony Consortium.
Two notable signaling protocol "standards" exist for IP telephony: ITU's H.32x series and IETF's Session Initiation
Protocol (SIP). H.323 seems to have quite a good position in the current market compared to SIP, which is a new
protocol trying to get into the market.
H.323 is a so-called umbrella standard for multimedia communications over local area networks that do not provide
guaranteed quality of service. H.323 belongs to a series of communications standards called H.32x for multimedia
conferencing over different types of networks, including ISDN and PSTN. The H.323 specification was approved in
1996, but the first drafts of the H.32x series were approved in the beginning of the 90's. The second version,
approved in January 1998, concerns conferencing over wide area networks.[1,10] H.323 is studied in detail in the
following sections.
SIP is developed by the Multiparty Multimedia Session Control (MMUSIC) working group of the IETF. The protocol
is still under development and it is not as well known as H.323. SIP is a text-based protocol modeled on the Hyper
Text Transfer Protocol (HTTP) and is more lightweight than H.323. SIP was originally designed for multimedia
conferencing on the Internet. In addition to SIP, two other protocols are considered parts of the SIP architecture:
Session Description Protocol (SDP) and Session Announcement Protocol (SAP).[1] SIP is studied briefly in Section 4.6,
where some comparisons to H.323 are also made.
Organizations such as iNOW!, IMTC VoIP forum, and MIT's Internet Telephony Consortium are interested in
interoperability of different standards and networks. These organizations and their goals are introduced in Section 4.8.
4.1. Network Architecture
Figure 4-1 illustrates the architecture of an IP telephony network based on H.323. The network architecture consists
of four types of network elements: terminals, gatekeepers, gateways, and multipoint control units (MCU). The basic
configuration consists of a minimum of two terminals connected to a local area network (LAN). However, in practical
applications it is necessary to add some of the other elements in order to create an efficient communication system
with connections to the outside world. The purpose and functionality of each network element are introduced below.
Figure 4-1. H.323 network architecture.
Terminals are clients which are able to receive or initiate calls. They generate and receive bi-directional real-time
information streams. A terminal can be either software running in a computer or dedicated equipment. All terminals
must support voice, while video and data capabilities are optional. Different solutions for IP telephony terminals are
presented in Section 5.
A gatekeeper manages a so-called zone, which is a collection of terminals, gateways, and multipoint control units. The
H.323 network consists of these zones. Calls within a zone are managed by its gatekeeper, but calls between zones may
involve several gatekeepers.
The gatekeeper is responsible for address translation, admissions control, and bandwidth control. Address translation
means translation of alias addresses to transport addresses using a translation table. Admissions control is used to
determine whether an endpoint is allowed to terminate or originate a call. It may be based on authorization,
bandwidth, or some other criteria. Bandwidth control lets a given amount of bandwidth be reserved for H.323 traffic
and distributed between the connections. When the limit is reached no more connections can be opened, leaving
capacity for other traffic.
Terminals within a zone register with their gatekeeper, which adds the corresponding address to the registration table.
The gatekeeper may also provide call control signaling, call authorization, bandwidth management, and call management.
Call control signaling means that the gatekeeper may process the control signals (Q.931) in point to point conferences
instead of passing them directly between terminals. Call authorization allows the gatekeeper to reject a call depending
on its properties. The reasons for rejection can be user defined, e.g., restricted access from/to particular terminals or
gateways, or restricted access during certain periods of time.
Bandwidth management is closely related to bandwidth control, allowing the gatekeeper to reject calls from a terminal
if the available bandwidth is low. In call management the gatekeeper keeps a list of on-going calls to indicate that a
terminal is busy or to provide information for the bandwidth management function.
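The gatekeeper functions described above (registration, address translation, admissions and bandwidth control, call authorization, and call management) can be modeled with a few dictionaries. The following is a toy sketch, not an H.323 implementation: the class, its method names, the per-call bandwidth figures, and the use of a simple blocked-alias set for call authorization are all assumptions made for illustration, and no RAS or Q.931 messages are exchanged.

```python
class Gatekeeper:
    """Toy model of the gatekeeper of one zone; all names are illustrative."""

    def __init__(self, bandwidth_limit_kbps):
        self.bandwidth_limit_kbps = bandwidth_limit_kbps
        self.registrations = {}   # alias -> transport address (host, port)
        self.calls = {}           # call id -> (caller, callee, kbps)
        self.blocked = set()      # aliases rejected by call authorization
        self._next_call_id = 1

    def register(self, alias, transport_address):
        """Registration: add a terminal to the translation table."""
        self.registrations[alias] = transport_address

    def translate(self, alias):
        """Address translation: alias -> transport address, or None."""
        return self.registrations.get(alias)

    def used_bandwidth(self):
        """Bandwidth currently reserved by ongoing calls."""
        return sum(kbps for _, _, kbps in self.calls.values())

    def is_busy(self, alias):
        """Call management: a terminal in an ongoing call is busy."""
        return any(alias in (a, b) for a, b, _ in self.calls.values())

    def admit(self, caller, callee, kbps):
        """Return (call id, callee transport address), or None if rejected."""
        if caller not in self.registrations or callee not in self.registrations:
            return None  # unknown endpoint
        if caller in self.blocked or callee in self.blocked:
            return None  # call authorization
        if self.is_busy(callee):
            return None  # callee already in a call
        if self.used_bandwidth() + kbps > self.bandwidth_limit_kbps:
            return None  # bandwidth control: limit reached
        call_id = self._next_call_id
        self._next_call_id += 1
        self.calls[call_id] = (caller, callee, kbps)
        return call_id, self.registrations[callee]

    def release(self, call_id):
        """Remove a finished call and free its bandwidth."""
        self.calls.pop(call_id, None)
```

A terminal would first call `register`, then ask `admit` before setting up a call; releasing the call frees its share of the zone's bandwidth for other connections.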
An optional feature of the gatekeeper is call routing. This feature allows more effective call control and service
providers can bill for calls in their network. The routing service may also be used to redirect calls to other terminals if
a called terminal is unavailable. Additionally, gatekeepers can balance the load among multiple gatekeepers based on
some routing logic. The gatekeeper acts as an interface to other H.323 networks, which may be owned by
a different service provider. Gatekeepers are optional elements, but if they are present terminals have to use them.
A gateway is responsible for connecting the IP telephony network to other types of networks. For example, the gateway
may connect an H.323 network to a SIP based network, the PSTN, or ISDN. The gateway performs translation between
different transmission formats and communication procedures. It is also responsible for setting up and clearing calls
on both sides. Terminals communicate with gateways using the H.245 and Q.931 protocols.
A Multipoint Control Unit (MCU; not shown in the figure) is needed only if centralized or hybrid multipoint
conferences are used for distribution of media streams. An MCU consists of a Multipoint Controller (MC) and an
optional number of Multipoint Processors (MP). The MC handles control information and the MPs handle the
streams. Terminals send their streams to the MCU, which mixes them and redistributes them back to the terminals.
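The mixing performed by a Multipoint Processor in a centralized conference can be illustrated as follows. This is a minimal sketch under assumptions not stated in the text: streams are taken to be equal-length frames of 16-bit samples, each terminal receives the sum of all other terminals' audio (its own voice is left out), and overflow is handled by simple clipping. The function name is hypothetical.

```python
def mix_for_each(frames):
    """Mix one frame per terminal.

    frames maps a terminal name to a list of 16-bit samples; all lists are
    assumed to have the same length. Returns, for every terminal, the sum of
    the other terminals' samples clipped to the 16-bit range.
    """
    if not frames:
        return {}
    length = len(next(iter(frames.values())))
    total = [0] * length
    for samples in frames.values():
        for i, sample in enumerate(samples):
            total[i] += sample
    mixed = {}
    for name, samples in frames.items():
        # Subtract the terminal's own contribution, then clip to int16.
        mixed[name] = [max(-32768, min(32767, t - s))
                       for t, s in zip(total, samples)]
    return mixed
```

Computing the total once and subtracting each terminal's own samples keeps the work linear in the number of participants instead of summing every other stream separately for each receiver.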
Often it is possible to combine several different network elements into the same physical unit. For example, gatekeeper
functionality might be incorporated into the gateway and MCU, or MCU could be implemented into the terminals in
order to allow multipoint conferences without any separate MCU unit.
4.2. Protocol Architecture
The H.32x series consists of different protocols for different network types:
H.320 for narrowband switched digital ISDN.
H.321 for broadband switched digital ISDN and ATM.
H.322 for guaranteed bandwidth packet switched networks.
H.323 for non-guaranteed bandwidth packet switched networks.
H.324 for analog phone system (POTS).
All of the above protocols support a set of audio and video codecs, depending on the bandwidth of the network
and the approval date of the standard. In addition, different multiplexing, control, and multipoint methods are used
in different standards. In this paper only H.323 is considered.
Figure 4-2. Structure and scope of H.323.
H.323 relies on both reliable and unreliable communication, and both types must be provided by the
network. Reliable transport, provided by TCP, is required by control signaling and data. Unreliable transport, provided
by UDP, is used for audio, video, and the RAS channel. H.323 is independent of the underlying network topology.
H.323 is a series of several ITU standards as illustrated in Figure 4-2. H.323 defines the system, control procedures,
media descriptions and call signaling.
H.245 defines negotiation of channel usage and capabilities exchange. It is used for opening and closing of channels
for audio, video, and data, and for camera mode requests. Q.931 defines call signaling and setup. RAS (registration,
admission, status) is used for communication between terminal and gatekeeper. An optional data channel can be
supported with the T.120 series of ITU standards. Codecs needed for speech and video coding are defined by G.711,
G.722, G.728, G.723.1, and G.729, and by H.261 and H.263, respectively.
H.225.0 defines media packetization, stream synchronization, control message packetization and control message
formats. It is designed to operate over various LANs such as IEEE 802.3 and IEEE 802.5. It acts as a
protocol-independent convergence layer just above the transport layer. The scope of H.225.0 is communication between
H.323 terminals and gateways in the same LAN using the same transport protocol. H.225.0 may be used over
interconnected LANs or the Internet, but the performance is acceptable only if the network load is low. Some
methods for resource reservation are discussed in Section 4.5. In H.323 audio and video packets are formatted as
defined in H.225.0 using the Real-time Transport Protocol (RTP), which is discussed in Section 4.4.
4.3. Message Structure
Most of the control messages in H.323 are encoded in Abstract Syntax Notation One (ASN.1) using the
Packed Encoding Rules (PER). ASN.1 is a complex encoding scheme in which data is put into hierarchical structures.
Structures may be optional, variable length, and nested. ASN.1 extensions are kept backward compatible by upgrading
the central description. However, this requires that the upgrades are coordinated to avoid incompatibility.[1]
Since IP telephony is still under continuous development, more signaling capabilities will most likely be needed in
the future when new applications appear. Vendors may want to add new features and support for their equipment, and
new codecs and video formats are being developed. It is therefore important that extensions can be added to the
existing protocols and that new versions are backward compatible. Because the Internet is an open and evolving
network, it can be expected that additions will develop without coordination and that vendors will develop their own
extensions.
H.323 provides nonstandardParam fields placed in various locations in the ASN.1 structures. These fields consist of a
vendor code and a value that has meaning only to the vendor. A vendor cannot add a new parameter in places where
no nonstandardParam field exists. Currently there is no way for terminals to exchange information about which
extensions they support. Therefore, interoperability between terminals of different vendors is restricted to the H.323
standard capabilities.[1]
4.4. Real-time Transport Protocol
The IETF audio-video transport group started to develop RTP in 1993. The aim of the protocol was to provide the
services required by interactive multimedia conferences, such as play-out synchronization, demultiplexing, media
identification and active party identification. However, multimedia conferencing applications are not the only ones
that can benefit from RTP: storage of continuous data, interactive media distribution, distributed simulation, and
control applications can also utilize it.
RTP consists of the actual Real-time Transport Protocol, which is used to carry data with real-time properties, and
the RTP Control Protocol (RTCP), which is used to monitor QoS and to convey information about the participants in an
ongoing conference.
An RTP implementation will often be integrated into the application rather than being implemented as a separate
protocol layer (see Figure 4-3). In applications RTP is typically run on top of UDP to make use of its port numbers
and checksums. The RTP framework is relatively "loose", allowing modifications and tailoring depending on the
application. Additionally, a complete specification for a particular application will require a payload format and a
profile specification. The payload format defines how a particular payload is to be carried in RTP. The profile
specification defines a set of payload type codes and their mapping to payload formats.
Figure 4-3. Location of RTP in IP stack.
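As a rough illustration of how the services listed at the start of this section map onto the protocol, the sketch below packs and unpacks the 12-byte fixed RTP header: the timestamp supports play-out synchronization, the payload type identifies the media format, and the SSRC identifies the sending party. The field layout follows the RTP specification rather than this paper, the function names are chosen here for illustration, and header extensions, padding, and CSRC lists are left out.

```python
import struct

RTP_VERSION = 2  # version value defined by the RTP specification


def pack_rtp_header(payload_type, seq, timestamp, ssrc, marker=False):
    """Build a 12-byte fixed RTP header (no padding, extension, or CSRC list)."""
    byte0 = RTP_VERSION << 6                         # V=2, P=0, X=0, CC=0
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    return struct.pack("!BBHII", byte0, byte1, seq & 0xFFFF,
                       timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)


def unpack_rtp_header(header):
    """Decode the fixed header fields from the first 12 bytes of a packet."""
    byte0, byte1, seq, timestamp, ssrc = struct.unpack("!BBHII", header[:12])
    return {
        "version": byte0 >> 6,
        "marker": bool(byte1 >> 7),
        "payload_type": byte1 & 0x7F,   # media identification
        "seq": seq,                     # loss detection and reordering
        "timestamp": timestamp,         # play-out synchronization
        "ssrc": ssrc,                   # identifies the sending party
    }


# Payload type 0 is PCMU (G.711 mu-law) in the standard audio/video profile.
header = pack_rtp_header(payload_type=0, seq=1, timestamp=160, ssrc=0x12345678)
fields = unpack_rtp_header(header)
```

An application would append the encoded audio frame to `header` and hand the result to a UDP socket.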
RTP session setup consists of defining a pair of destination transport addresses: one network address plus a pair of
UDP ports, one for RTP and another for RTCP. In the case of a multicast conference the IP address is a class D
multicast address. In a multimedia session each medium is carried in a separate RTP session with its own RTCP packets
reporting only the quality of that session. Usually additional media are allocated additional port pairs and only one
multicast address is used for the conference.
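The port-pair arrangement can be sketched as follows. The convention that RTP takes an even port and RTCP the next higher odd port comes from the RTP specification, not from this paper; the multicast address and base port are hypothetical values chosen for illustration.

```python
def allocate_rtp_sessions(media, group="224.2.0.1", base_port=5004):
    """Give each medium its own RTP session on one shared multicast address."""
    if base_port % 2:
        base_port += 1              # RTP conventionally uses an even port
    sessions = {}
    for index, medium in enumerate(media):
        rtp_port = base_port + 2 * index
        sessions[medium] = {
            "address": group,           # same class D address for all media
            "rtp_port": rtp_port,
            "rtcp_port": rtp_port + 1,  # RTCP reports only on this session
        }
    return sessions


sessions = allocate_rtp_sessions(["audio", "video"])
```

Here the audio session would use ports 5004 and 5005 and the video session ports 5006 and 5007, both on the same group address.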
4.5. Quality of Service
The quality issues of IP telephony (as everything which demands real-time operation) are very complicated and
challenging. In [23] quality of service (QoS) is defined as
"the collective effect of service performance which determine the degree of satisfaction of a user of the service".
In [ETR300 003] ETSI has specified QoS more accurately in order to consider the viewpoints of the different
parties of the communication process:
QoS requirements of the user (customer)
QoS offered by the service provider
QoS achieved by the service provider
QoS perceived by the user (customer)
QoS requirements of the Internet Service Provider
In IP telephony the QoS perceived by the user depends on two things: the quality of the perceived voice and
the delay in two-way conversation. These are closely connected because better voice quality
requires more bits, which in turn causes more delay. Development in this area can be divided into two parts:
Development of end user equipment by developing more efficient codecs for voice digitization with
better quality and fewer bits.
Development of network transmission to achieve end-to-end QoS.
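The relation between bits and delay can be made concrete with a small calculation. The sketch below is an illustration under stated assumptions: each voice packet carries 40 bytes of IPv4, UDP, and RTP headers (20 + 8 + 12) with no header compression and no link-layer overhead, and the codec rates are the nominal 64 kbit/s for G.711 and 8 kbit/s for G.729. The function name is chosen here.

```python
HEADER_BYTES = 20 + 8 + 12  # IPv4 + UDP + RTP headers, no options or compression


def ip_bandwidth_kbps(codec_kbps, packet_ms):
    """IP-layer bit rate for a codec that sends one packet every packet_ms."""
    payload_bytes = codec_kbps * packet_ms / 8    # kbit/s * ms = bits
    packets_per_second = 1000 / packet_ms
    return (payload_bytes + HEADER_BYTES) * 8 * packets_per_second / 1000


for name, rate in (("G.711", 64), ("G.729", 8)):
    for packet_ms in (10, 20, 40):
        total = ip_bandwidth_kbps(rate, packet_ms)
        print(f"{name}, {packet_ms} ms packets: {total:.1f} kbit/s")
```

With 20 ms packets this gives 80 kbit/s for G.711 and 24 kbit/s for G.729. Longer packets reduce the header overhead but add packetization delay, since the sender must collect a whole packet of speech before transmitting it.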
Today IP networks offer only very limited QoS capabilities, and this is especially the case in the Internet.
New techniques can be used to overcome this. Lower network layers with QoS options (e.g., ATM) or
advanced routing protocols may be used for prioritization. These measures will improve QoS mainly in
private networks where end-to-end control can be used. Later the public network, and especially the Internet, will
offer better QoS options by utilizing the Resource Reservation Protocol (RSVP), Multi-protocol over ATM
(MPOA), and IPv6.
RSVP is a protocol that allows channels or paths on the Internet to be reserved for the multicast transmission of
video and other high-bandwidth messages. RSVP is part of the Internet Integrated Services model, which
provides best-effort service, real-time service, and controlled link sharing.[24]
Before a receiver starts to receive a broadcast, an RSVP request must be sent to allocate sufficient bandwidth and
packet priority for it. The request first goes to the nearest gateway with an RSVP server, which determines
whether the user is eligible for such a reservation and, if so, whether sufficient bandwidth remains to be reserved
without affecting earlier reservations. If the reservation is successful, the request is forwarded to the next
gateway towards the source of the multicast. If the reservation cannot be made all the way to the destination, all
reservations are removed.[24]
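The hop-by-hop admission procedure described above can be sketched as follows. This is a deliberately simplified model, not the RSVP message exchange itself: each gateway is a dictionary with a free-bandwidth counter, the eligibility check is omitted, and all names are chosen for illustration.

```python
def reserve_path(gateways, needed_kbps):
    """Reserve needed_kbps on every gateway in order; undo all on failure."""
    reserved = []
    for gateway in gateways:
        if gateway["free_kbps"] < needed_kbps:
            # The reservation cannot be completed: remove the partial ones.
            for earlier in reserved:
                earlier["free_kbps"] += needed_kbps
            return False
        gateway["free_kbps"] -= needed_kbps
        reserved.append(gateway)
    return True


path = [{"name": "gw1", "free_kbps": 128}, {"name": "gw2", "free_kbps": 64}]
first_ok = reserve_path(path, 64)    # fits on both gateways
second_ok = reserve_path(path, 64)   # gw2 is full, so gw1 is rolled back
```

After the second, failed attempt the first gateway still has 64 kbit/s free, because the partial reservation made there was released again.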
When the multicast begins, packets are transmitted through the gateways on a high-priority basis. An RSVP packet is
designed to be flexible; it can vary in size and in the number of data types and objects. If packets need to travel
through gateways that do not support RSVP, they can be tunneled through as ordinary packets. RSVP works with
both IP versions, 4 and 6.[24] RSVP's requirement for bandwidth reservation may restrict its applicability for
widespread use. It is hard to imagine that an ordinary user would be allowed to reserve bandwidth over the Internet.
However, for intranets RSVP can be a good solution.
MPOA is one of the industry's first standards-based solutions that allow routed networks to take advantage of the
benefits of an ATM network. It reduces the cumulative latency in a multi-protocol routed network by reducing
the number of intermediate points where packet processing must be performed (i.e., hop-by-hop processing).
MPOA allows traffic to be forwarded to its destination over an ATM virtual circuit, which incurs the net delay of
a single router hop.[26]
IPv6 will offer much better QoS options than the current standard, IPv4. For IP telephony it offers the following
improvements[24]:
The header of the IP packet has a simpler format, which reduces per-packet processing overhead.
Flow labeling makes real-time prioritization possible.
The larger address space makes it possible to add new IP telephone client equipment and assign each device a
static IP address.
Security options make it possible to protect IP telephony traffic (see Section 4.7).
As seen above, several things can be done to achieve better QoS. However, better QoS will always cost more.
Users do not always need very high QoS level and they are not willing to pay very much. To address this issue
ETSI's TIPHON project has defined a number of quality levels for IP telephony that can be used to define billing
levels, for instance[3]:
best effort
GSM quality
toll quality
CD quality
4.6. IETF's SIP
Network Architecture
SIP consists of three types of network elements: terminals (user agent servers), proxy servers, and redirect
servers. Functionality of the SIP terminal is similar to the H.323 terminal and the minimum configuration
needed for communication in SIP based IP telephony is two terminals.
A SIP server has part of the H.323 gatekeeper functionality, but it is not as important as the
gatekeeper. A call does not have to go through any servers and there is no equivalent to the zone concept.
Servers are mainly used to route and redirect calls. Some simple authentication functions can be implemented in
the servers, but the best-suited locations for security functions are the terminals or the firewalls. A SIP server
can operate in either proxy or redirect mode. A redirect server informs the caller to contact another server
directly. A proxy server contacts one or more next-hop servers itself and passes the call request on. The
proxy server has to maintain call state, whereas a redirect server can forget the call request after it has been
processed. A terminal does not need to know whether it is communicating with a server or another terminal. It is
recommended that servers are able to operate in both modes.
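The difference between the two modes can be sketched as follows. This is a minimal illustration rather than a working SIP server: the location table, the addresses, and the `forward` callback are hypothetical, while the 302 and 404 status codes are taken from the SIP specification.

```python
# Hypothetical location table mapping a public address to a current location.
LOCATIONS = {"sip:alice@example.com": "sip:alice@pc42.example.com"}


def handle_invite(request_uri, mode, forward=None):
    """Return (status, detail) for an INVITE in "redirect" or "proxy" mode."""
    target = LOCATIONS.get(request_uri)
    if target is None:
        return 404, "Not Found"
    if mode == "redirect":
        # Tell the caller where to try next; no call state is kept afterwards.
        return 302, target
    # Proxy mode: contact the next hop ourselves and relay its answer.
    return forward(target)


redirected = handle_invite("sip:alice@example.com", "redirect")
proxied = handle_invite("sip:alice@example.com", "proxy",
                        forward=lambda next_hop: (200, "OK from " + next_hop))
```

In redirect mode the caller receives the new address and repeats the request itself; in proxy mode the caller only ever sees the final answer.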
SIP does not include a separate network element similar to the H.323 gateway. In SIP, gateway functionality is
implemented as a terminal that receives and establishes calls on either side and translates the streams and control
information.
Protocol Architecture
SIP protocol architecture is illustrated in Figure 4-4. SIP is independent from underlying network topology and it
can be used with several transport protocols. Any datagram or stream protocol that delivers complete SIP request
or response can be used. For example, UDP, TCP, X.25, ATM AAL5, CLNP, TP4, IPX, and PPP are all
suitable.[1]
SIP does not require a reliable transport protocol, and thus simple clients can be implemented using only UDP.
For servers the recommendation is that both UDP and TCP should be supported. A TCP connection is opened only
if a UDP connection cannot be opened. Over UDP, reliability is achieved by retransmitting a request every half second
until a response is returned. The functionality is very similar to a three-way handshake. The use of application
layer reliability has the advantage that the timers can be adjusted according to requirements. Standard TCP has
too long a retransmission delay if a packet is lost. The other features of TCP, such as sequence numbers and flow and
congestion control, are not needed. Also, a session using UDP is not tied to any connection, which allows
participants to reboot as long as the call identifiers are maintained. TCP must be used in some cases. For example,
some firewalls and transport layer security protocols require TCP.[1]
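The application-layer retransmission scheme can be sketched as a simple loop. The `send` and `receive` callables stand in for a UDP socket and are assumptions of this sketch, as is the retry limit; only the half-second interval comes from the text above.

```python
def send_reliably(send, receive, request, interval=0.5, max_tries=7):
    """Resend request every interval seconds until receive() yields a response.

    receive(timeout) is expected to wait up to timeout seconds and return
    None if nothing arrived.
    """
    for attempt in range(1, max_tries + 1):
        send(request)
        response = receive(interval)
        if response is not None:
            return response, attempt
    raise TimeoutError(f"no response after {max_tries} attempts")


# Simulated lossy network: the first two requests get no answer.
sent = []
replies = iter([None, None, "SIP/2.0 200 OK"])
response, attempts = send_reliably(sent.append, lambda timeout: next(replies),
                                   "REQUEST")
```

Because the timer lives in the application, `interval` can be tuned per deployment, which is the advantage over TCP retransmission noted above.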
Figure 4-4. Protocol architecture in SIP.
The functionality of SIP is concentrated on signaling, in contrast to H.323, where the protocol includes all
required conferencing functions. The SIP protocol includes all basic signaling, user location, registration and,
as an extension, advanced signaling. Other services, such as directory access, reside in separate protocols.
The SIP architecture has been designed to be modular so that different functions can be easily replaced and even some
components of H.323 can be integrated into a SIP environment.
SIP uses the Session Description Protocol (SDP) to describe the capabilities and media types supported by the
terminals. SDP is developed by the IETF and its messages are text based, as in SIP. SDP messages list the features
that must be implemented in the endpoints. SDP messages are mainly sent within SIP messages, but other ways
can also be used.
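To give an impression of what such a text-based description looks like, the sketch below holds a small SDP body and extracts its media lines. The user, host names, and addresses are hypothetical; the line types (v, o, s, c, t, m) and the payload type numbers follow the SDP and RTP profile specifications rather than this paper.

```python
SDP_EXAMPLE = "\r\n".join([
    "v=0",
    "o=alice 2890844526 2890844526 IN IP4 host.example.com",
    "s=Phone call",
    "c=IN IP4 192.0.2.10",
    "t=0 0",
    "m=audio 49170 RTP/AVP 0 4",   # payload types 0 (PCMU) and 4 (G723)
]) + "\r\n"


def media_lines(sdp):
    """Return (media, port, transport, payload types) for each m= line."""
    result = []
    for line in sdp.splitlines():
        if line.startswith("m="):
            media, port, transport, *formats = line[2:].split()
            result.append((media, int(port), transport,
                           [int(fmt) for fmt in formats]))
    return result


offered = media_lines(SDP_EXAMPLE)
```

The receiving terminal would compare the offered payload types with the codecs it supports before answering.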
Sessions can be announced to a larger group of users using the IETF's Session Announcement Protocol (SAP). SAP is
primarily used for announcing large public conferences and broadcast streams such as Internet television and
radio. However, SIP could also be used for this because of its multicast signaling feature. For real-time
transmission the SIP architecture uses RTP, which has been presented in Section 4.4.
The SIP architecture includes the Real-time Streaming Protocol (RTSP), which has not yet been introduced in this
paper. RTSP is a control protocol for initiating and directing delivery of streaming multimedia from media servers;
it is like an "Internet VCR remote control protocol". RTSP does not deliver data itself but uses RTP for that, as
H.323 does. However, the RTSP connection can be used to tunnel RTP traffic for ease of use with firewalls and other
network devices. H.323 and RTSP are complementary in function. H.323 is useful for setting up audio/video
conferences in moderately sized networks whereas RTSP is useful for large-scale broadcasts and
audio/video-on-demand streaming.[25]
Message Structure
SIP uses a client-server approach in which the client transmits requests and the server returns responses. A
single call may involve several clients and servers.
SIP uses message structures borrowed from HTTP. The messages are in text format using ISO 10646 in UTF-8
encoding. As in HTTP, the client requests invoke methods on the server. A message consists of a start-line
specifying the method and the protocol, a number of header fields specifying call properties and service
information, and an optional message body which can contain a session description. The following methods are
applicable in SIP[1]:
Invite invites a user to a conference
Bye terminates a connection between two users
Options signals information about capabilities
Status informs the server about the progress of signaling
Ack is used as a response in reliable message exchanging
Register conveys location information to a SIP server
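The message layout described above (start-line, header fields, blank line, optional body) can be sketched with a small helper. The addresses and the Call-ID are hypothetical, the helper name is chosen here, and the header set is a plausible minimum based on the SIP specification rather than something prescribed by this paper.

```python
def build_request(method, uri, headers, body=""):
    """Assemble start-line, header fields, blank line, and optional body."""
    lines = [f"{method} {uri} SIP/2.0"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    lines.append(f"Content-Length: {len(body.encode('utf-8'))}")
    return "\r\n".join(lines) + "\r\n\r\n" + body


invite = build_request(
    "INVITE",
    "sip:bob@example.org",
    {
        "Via": "SIP/2.0/UDP pc1.example.com",
        "From": "sip:alice@example.com",
        "To": "sip:bob@example.org",
        "Call-ID": "12345@pc1.example.com",
        "CSeq": "1 INVITE",
    },
)
```

A session description would be passed as `body`, together with a matching Content-Type header.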
The syntax of the response codes is similar to HTTP. The three-digit codes are hierarchically organized, with the
first digit representing the result class and the other two digits providing additional information. The first digit
controls the protocol operation and the other two give useful but non-critical information. A textual description
and even a whole HTML document can be attached to the result message.
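Because the first digit alone controls protocol operation, a client can act on a code it has never seen. A minimal sketch, using the class names from the SIP specification (the function name is chosen here):

```python
RESPONSE_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
    6: "global failure",
}


def response_class(status_code):
    """Classify a three-digit status code by its first digit."""
    return RESPONSE_CLASSES.get(status_code // 100, "unknown")


ringing = response_class(180)
busy = response_class(486)
declined = response_class(603)
```

A client that does not recognize 486 can still treat it as a generic client error and stop retrying that request.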
SIP takes the same approach to extensibility as the Hypertext Transfer Protocol (HTTP) and the Simple
Mail Transfer Protocol (SMTP). New headers can be added to SIP messages. Unknown headers and
values are ignored by default. Using the Require header, the client can require specific extensions to be understood
by the other endpoint. If the endpoint does not support the named features, an error message naming the unknown
feature is returned and the client can fall back to simpler operation. Feature names are based on a hierarchical
namespace and new features can be centrally registered with the Internet Assigned Numbers Authority (IANA).
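The handling of the Require header can be sketched as follows. The feature names are hypothetical; the 420 (Bad Extension) status and the Unsupported header come from the SIP specification, not from this paper.

```python
SUPPORTED_FEATURES = {"com.example.billing"}   # hypothetical feature name


def check_require(headers):
    """Return (status, extra headers) after checking the Require header."""
    required = [name.strip()
                for name in headers.get("Require", "").split(",")
                if name.strip()]
    unsupported = [name for name in required if name not in SUPPORTED_FEATURES]
    if unsupported:
        # 420 Bad Extension: report which features were not understood.
        return 420, {"Unsupported": ", ".join(unsupported)}
    return 200, {}   # other unknown headers are simply ignored


rejected = check_require({"Require": "com.example.billing, org.example.video"})
accepted = check_require({"X-Vendor-Hint": "anything"})
```

On receiving the 420 response the client can retry without the unsupported feature, which is the fallback to simpler operation mentioned above.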
The textual names also make the header fields self-describing, so different developers can understand
and support other vendors' new features easily. This approach has been used in SMTP. In contrast to H.323,
additional fields are not vendor specific but common to all vendors and developers that choose to implement the
extension. Numerical result codes are also hierarchically organized, as in HTTP.
4.7. Security Issues
The Internet is an open network where anyone can receive and transmit packets relatively easily.
Eavesdropping on calls in IP networks is probably easier than in the PSTN. Therefore, some mechanisms are
necessary to prevent eavesdropping. In addition to the voice stream, signaling (call setup, call management,
billing) also requires protection to prevent spoofing of calls, denial of service, spamming (disturbing), etc.
H.323
Version 2 of H.323 supports authentication, integrity, privacy, and non-repudiation. Their usage is specified
in H.235 (earlier H.Secure). In addition to user streams, call signaling (Q.931), management (H.245), and RAS
are protected. Four basic security aspects are addressed in IP telephony:
Authentication: A process to ensure that the participants really are who they claim to be.
Integrity: A process to ensure that the contents of a packet remain unchanged during transmission.
Privacy: Ciphering mechanisms prevent eavesdroppers from monitoring (listening to) the contents of the
transmitted packets.
Non-repudiation: A process to prevent someone from denying that he or she has done something, e.g., a
caller who does not want to pay his or her bills claiming that no calls were made.
There are two ways to protect the privacy of an IP telephony call: the IP telephony software can include security
functions, or external security protocols, such as TLS (Transport Layer Security) and IPSec (IP Security protocol),
can be used.
In addition to the actual speech stream, RAS, Q.931, and H.245 need protection. With RAS,
the integrity of the data within the packets must be preserved and the endpoints must be authenticated. At the moment
the integrity issue is incomplete and no privacy for RAS data is defined. Typically, gatekeepers want to
authenticate the client (one-directional), but clients can also request the gatekeeper to authenticate itself
(bi-directional). H.235 security is applied to Q.931 and H.245 by encrypting the transmitted streams, protecting
packets from tampering, and using user authentication to verify the identity of the endpoints.[1]
SIP
Because SIP is based on HTTP it naturally has security mechanisms similar to HTTP. Authentication of the
caller and the callee can be realized with HTTP mechanisms, including basic (clear-text password) and digest
(challenge-response) authentication. Keys for media encryption are conveyed using SDP.
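The challenge-response idea behind digest authentication can be sketched as follows. This follows the original HTTP digest scheme without the later quality-of-protection extensions; the user name, realm, password, and nonce are hypothetical, and a SIP implementation may differ in detail.

```python
import hashlib


def md5_hex(text):
    """Hex MD5 digest of a string, as used by HTTP digest authentication."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username, realm, password, method, uri, nonce):
    """Compute the response value the client returns for a server challenge."""
    ha1 = md5_hex(f"{username}:{realm}:{password}")   # secret part
    ha2 = md5_hex(f"{method}:{uri}")                  # request part
    return md5_hex(f"{ha1}:{nonce}:{ha2}")


# The server sends a fresh nonce; the password itself never crosses the network.
response = digest_response("alice", "example.com", "secret",
                           "INVITE", "sip:bob@example.org", nonce="abc123")
```

The server performs the same computation with its stored credentials and compares the results, in contrast to basic authentication, where the password is sent in clear text.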
The basic SIP draft does not include any security considerations other than specifying a reliance on lower-layer
security mechanisms such as Secure Sockets Layer (SSL). Hop-by-hop TLS is also supported, but it is not
applicable if UDP is used. According to [27] SIP could use any transport layer or HTTP security mechanism,
such as Secure Shell (SSH) or Secure-HTTP.
Version 2.1 of SIP includes improved security mechanisms. It defines end-to-end authentication and
encryption using either Pretty Good Privacy (PGP, mandatory) or S/MIME (optional). These methods are used
for signing and encrypting messages.
Firewall Interoperability
Nowadays, almost all intranets are protected by firewalls. To make IP telephony possible, H.323 and firewalls
must be able to co-operate. Both the H.323 client and the firewall need changes. An H.323 proxy is needed, and it
operates like most other proxies. It monitors calls and decides which are allowed to pass through the firewall. A
proxy can be considered a special gateway that enforces access control policies in addition to bandwidth control.
H.323 uses dynamic port addresses. When a call is set up, a port number for the H.245 connection is assigned,
and a new TCP connection is set up to that port. Media channels also use dynamic ports. Each channel
requires two UDP connections for the RTP streams and one bi-directional connection for the RTCP stream. A
typical audio-only conference requires two TCP connections and four UDP connections, only one of which is
static. Dynamic addresses and port numbers are exchanged within the data stream.[1]
It is not possible to look for addresses at fixed places in the exchanged packets because ASN.1 encoding has
optional and variable-length fields. Therefore the complete data stream must be searched. The proxy has to
take part in the application protocol, which makes the firewall visible to the applications. The firewall looks like
a server to the internal IP telephony application and like a client to the external application.
This kind of proxy must translate the addresses used between the internal application and the firewall
to the addresses used between the external application and the firewall, and vice versa. It can be classified as
an application proxy.
Other types of firewalls are not suitable for H.323. With a packet filtering router all TCP and UDP ports above
1024 must be open for bi-directional traffic because of the dynamic port numbers. This approach does not provide
much protection. A circuit gateway can disassemble the H.323 packets to examine the port numbers in use and open
those ports. This is difficult to implement.
In SIP firewalls do not cause as many problems. Signaling needs only one UDP or TCP connection, and it is easy
to add to the firewall configuration. The similarity to HTTP also enables reuse of proxies and security mechanisms.
4.8. Network Interoperability
H.323 is today's IP telephony standard, allowing vendors to select among many options when creating a product.
Although Version 2 of H.323 has been published, some current products are based on Version 1, and Version 3
is coming soon. Other competing standards have also been published, and at least SIP will most probably become very
popular. Considering all this, it is a virtual certainty that different products will not work together.
All network elements using the same protocol should work with each other easily, without the vendors' engineers
having to spend a long time on interoperability. This makes competition possible, which makes network
elements cheaper and speeds up technological development. To enable widespread commercial use of IP
telephony, virtually all end user equipment should work without problems in different networks and with
different terminals. To address the above-mentioned problems and requirements, different network architectures,
equipment, and protocols must be compatible. Several organizations are driving compatibility between different
IP telephony standards.
Organizations
The purpose of ETSI's TIPHON (Telecommunications and Internet Protocol Harmonization Over Networks)
project is to combine IP with other telecommunication technologies to enable IP telephony networks to
interwork with switched circuit networks. TIPHON will develop service-oriented solutions that a variety of
operators can use. Where possible, TIPHON will use the available standards H.323 and SIP. Although ETSI is a European
institute, the results of TIPHON are intended to gain world-wide acceptance. For example, the following companies
are supporting TIPHON: AT&T, Cisco, Ericsson, Lucent, Intel, Microsoft, Nokia, Motorola, Nortel, Siemens,
Philips, and Telia.[2] The main work items of TIPHON are:
Requirements for service interoperability.
Global TIPHON architecture, interfaces, and functions.
Call control procedures, information flows and protocols.
End-to-end quality of service parameters.
Address translation between E.164 and IP.
Technical aspects of billing and accounting.
Security profiles and procedures.
iNOW! (interoperability now) is a broad-based, multi-vendor initiative established to quickly provide
interoperability among IP telephony platforms. The purpose of iNOW! is to provide equipment vendors with a
blueprint for achieving real-world, revenue-generating gateway-to-gateway and gatekeeper-to-gatekeeper
interoperability. Vendors that implement their products according to the iNOW! recommendations, called the "Profile",
are promised commercially viable interoperability with other iNOW!-compliant platforms, including those of
VocalTec, Lucent and other industry leaders, before all the necessary standard documents have been
completed. The Profile will be updated periodically to extend its scope and to comply with newly
ratified standards. The goal of publishing the iNOW! Profile is to accelerate true interoperability between
different IP telephony platforms.
The supporters of iNOW! are active participants in standards bodies such as ITU and ETSI. The iNOW! Profile,
written by VocalTec, Lucent and ITXC, is based on the H.323 standard as well as existing and proposed elements of
H.225.0. Wherever possible the iNOW! Profile references published standards as the basis for achieving
interoperability for a given set of functionality. Where standards do not yet exist or are not approved, iNOW!
"fills the gaps" with proposed solutions. These proposals are under consideration at various standards bodies that
address IP telephony.
Version 2 of the iNOW! Profile addresses the following properties:
Gateway-to-gateway interoperability.
Gatekeeper-to-gatekeeper interoperability.
Gatekeeper-to-exchange carrier interoperability.
Phone-to-phone service.
Fax-to-fax service.
The basis for the iNOW! Profile is H.323. H.323 was selected because it is today's IP telephony standard and, for the
time being, provides the quickest route to interoperability. Future versions of the Profile will address new
properties. The following are missing from the current version:
Gatekeeper-to-gateway interoperability.
PC-to-phone services.
Roaming services.
Exchange carrier cascading, where one exchange carrier originates/redirects traffic to another.
SS7 support.
The Internet & Telecoms Convergence Consortium (ITC) of Massachusetts Institute of Technology (MIT)
consists of member firms and selected academics who collaborate on research into technical, economic,
strategic, and policy issues that arise from the convergence of telecommunications and the Internet. Consortium
participants work together to understand and shape future technologies, industry and market structures, and
regulatory policies worldwide. Ultimately, the goal is to enable the growth of new forms of multimedia
communication that span the Internet and telecommunications infrastructures. ITC is not a standards
organization. It merely plays a pre-standards role, identifying gaps and emerging issues that need to be
addressed.[12]
The International Multimedia Teleconferencing Consortium, Inc. (IMTC) is a non-profit corporation comprising more
than 150 organizations around the world. The goal of IMTC is to promote and facilitate the development and
implementation of interoperable multimedia conferencing solutions based on open international standards --
particularly the multimedia conferencing standards adopted by the ITU, as well as other organizations.[13]
Gateway Interoperability
As stated earlier, an IP telephony voice gateway acts like an endpoint that responds to call requests on one side and
establishes a connection on the other side of the gateway. It also translates information streams between the
interworking networks. Gateways usually support four types of connections: analog, T1/E1, ATM, and ISDN. As
an example, Figure 4-5 shows a SIP network interworking with the PSTN.[1] The caller is in the SIP network and the
callee in the PSTN. The caller also ends the call.
Figure 4-5. SIP interworking with PSTN.
H.323 and SIP users cannot communicate directly with each other. The call must be routed through another
network (e.g., the PSTN) or an H.323-SIP gateway is needed. If another network is used, one gateway translates H.323
traffic to PSTN traffic and another gateway translates PSTN traffic to SIP traffic, and vice versa.
If an H.323-SIP gateway is used, signaling must be translated in the gateway. The information streams do not
necessarily have to be translated because RTP is used for transport on both sides. However, this requires
that the same codecs can be used on both sides of the gateway. If not, the information streams must also be
translated from one coding to another.
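The decision the gateway has to make about the media streams can be sketched as a codec comparison. The function name and the codec lists are illustrative assumptions; a real gateway would learn the lists from H.245 capability exchange on one side and from SDP on the other.

```python
def plan_media_path(h323_codecs, sip_codecs):
    """Pick a shared codec if one exists; otherwise the gateway must transcode."""
    common = [codec for codec in h323_codecs if codec in sip_codecs]
    if common:
        # RTP packets can be relayed unchanged with the first shared codec.
        return {"transcode": False, "codec": common[0]}
    return {"transcode": True, "codec": None}


relay = plan_media_path(["G.723.1", "G.711"], ["G.711", "G.729"])
convert = plan_media_path(["G.723.1"], ["G.729"])
```

In the first case only signaling is translated; in the second the gateway also has to decode and re-encode every voice packet, which adds delay.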
SIP offers a few interesting opportunities for co-operation with H.323, reported in [1]. Firstly, a SIP client can be
used to locate any terminal, including an H.323 terminal, and determine its capabilities. The actual calling can be
done with an H.323 client integrated in the SIP software. In this case the hop-by-hop searches of SIP can be used to
locate both H.323 and SIP endpoints, and the call is established with the appropriate protocol.
Secondly, using the redirection features of SIP it is possible to establish calls to H.323 terminals. The SIP user
could indicate a preference to communicate via SIP as first choice and H.323 as an alternative. If neither of
them is available, an ordinary telephone could be used. Both the caller and the callee can indicate such preferences.
The callee can also indicate a preferred time to be called back.
Thirdly, SIP allows protocol mixing. For example, H.323 could be used to establish a connection between end
systems and gateways while SIP might be used for gateway-to-gateway signaling.
5. Applications and Vendors
5.1. Software Solutions
This is the oldest way of making IP telephony calls. It requires a soundcard, microphone and loudspeakers,
a modem or network card, an Internet connection, and IP telephony client software. The rough cost of
upgrading a PC for IP telephony is about 600 FIM. If video functionality is wanted, a video camera and a video
capture card are also needed. This will cost an additional 1500 FIM. [4]
The requirements for the PC depend on whether the user wants only voice conversation or also video
conferencing. Some voice conversation applications might even run on a PC with a 486 processor and 16 MB
RAM. However, as a general recommendation the user should have at least a Pentium 100 with 32 MB RAM.
A network card or modem is needed for the network connection. The modem should support at least 28.8 kbit/s.
Many different IP telephony client programs are available. Further information about client software can
be found, e.g., in references [14, 15, 16].
Some popular clients are:
Microsoft Netmeeting [17]
VocalTec InternetPhone [18]
Netscape Conference [19]
Net2phone [20]
Previously most client programs were not interoperable with each other because they used different
and even proprietary voice compression algorithms/codecs. Nowadays many clients are H.323
compliant, but this still does not necessarily mean that they are interoperable, since H.323 supports several
optional codecs.
Most of the clients are free of charge. Of those mentioned above, all are free except VocalTec's
InternetPhone, which costs 50 US$. It can be expected that good service support is lacking for most of the
products.
This way of making IP telephony calls is like a toy for computer users. Quality of service is not necessarily
very good and the users need a certain amount of experience with computers. It may be used by early technical
adopters for fun or to save some money, but it may not be suitable for professional, business, or
ordinary home users. However, it has been a very important initiator of IP telephony use.
5.2. Quicknet's Specialized Soundcard[21]
The Internet PhoneJACK is an audio card developed by Quicknet. It is designed specifically to carry voice over
the Internet or intranets. The Internet PhoneJACK can be used in combination with the user's favorite IP
telephony software (Microsoft NetMeeting and VocalTec Internet Phone are included in the package)
to make and receive voice calls over the Internet. It can also be used with software like IDT's Net2Phone to
make calls from the user's PC to a normal telephone.
The Internet PhoneJACK was designed for IP telephony calls and is meant to be used like an
ordinary telephone at the office and at home. Factors such as echo, data
compression, and simultaneous two-way conversation have been considered carefully.
The user does not need to replace an existing sound card with the Internet PhoneJACK. The Internet PhoneJACK
serves a different purpose than a sound card and is designed to work alongside existing sound cards in a multimedia PC.
This means that by adding the Internet PhoneJACK the user can have simultaneous Internet phone conversations
and full use of the multimedia PC at the same time. For example, the user can play RealAudio and carry on an
Internet phone conversation simultaneously; play a network game and enjoy the sound effects while talking to
her opponent over the net ("Take that, you alien!"); or give a multimedia presentation with
sound while discussing the slides.
The Internet PhoneJACK costs $149.95 after a $10.00 manufacturer's mail-in rebate. The price includes the
following items:
The Internet PhoneJACK 1/2 size ISA bus PnP card
A Windows 95 Installation CD-ROM
The main advantage of the PhoneJACK is that it works alongside the existing soundcard in a multimedia PC, so
the user can make Internet phone calls and use multimedia at the same time.
This might be useful for small companies, some home business users, or some students. They care about
telephony costs, and they are able to use the phone and the PC at the same time. It may not be suitable for
big companies or ordinary home users. For ordinary home users, using the product requires certain skills and
knowledge.
5.3. Nokia Solution[22]
In December 1998 Nokia acquired Vienna Systems. Nokia offers a range of IP telephony clients and
peripherals for end users that enable different services over IP networks. They are the following:
IPCourier Ethernet Phone
IPShuttle Ethernet analog adapter
My.way IP Telephony Client Application
SerialSet telephone with PC serial interface
Phone.way Serial Telephony Adapter
The IPCourier Ethernet Phone provides PBX functionality without a PBX. With multiple line appearances,
speakerphone capability, programmable buttons for memory dialing, and an LCD display, IPCourier is an IP
telephony client with a familiar interface. In conjunction with the Call Processing Server, the Ethernet telephone
supports advanced call features such as call waiting, Caller ID, forward, transfer, mute, and so on.
IPShuttle Ethernet Analog Adapter is suitable for home use. IPShuttle supports existing analog telephones
allowing up to two lines to be provided via an IP network. This versatile IP telephony client is designed to
support new voice and data services being offered by cable companies.
My.way is a client application. It turns a PC with a microphone and sound card into a multi-line feature phone
with PBX functionality. Combined with standard data and/or video conferencing software, my.way is a powerful
tool for keeping in touch. Users equipped with my.way and their PC can simultaneously access their corporate
voice and data networks and attend virtual meetings through a 28.8 kbit/s or faster connection from anywhere in the
world.
A SerialSet telephone connects directly to a PC serial interface. SerialSet provides increased privacy and voice
quality. For those locations without installed telephones, SerialSet is a secure, familiar alternative to a PC
microphone and sound card for voice across IP networks. SerialSet is a telephone equipped with a PC serial
interface, allowing callers with my.way to connect directly; background noise is reduced and
privacy in conversations is assured. Users have familiar functionality such as flash, re-dial and mute.
For those locations with telephones already installed, Phone.way connects a standard telephone through a PC
serial port to a TCP/IP network. Operating with my.way, it allows callers the choice of dialing and accessing
PBX features from their PC keyboard or from their telephone keypad.
Nokia thus has a wide range of IP telephony products. The IPCourier Ethernet Phone has many
advanced call features and is suitable for business users. The IPShuttle Ethernet Analog Adapter is suitable for
home use; its advantage is that it supports two lines. The SerialSet telephone is a very good choice if the
user has a PC, but the drawback is that she cannot make an IP telephony call and use the PC at the same time. The
Phone.way Serial Telephony Adapter connects the user's ordinary phone to a PC, enabling IP telephony calls. As an
additional feature, the user can dial and access intelligent features from the PC keyboard or the telephone keypad
when the adapter is used together with the my.way software.
5.4. Selsius' Ethernet Phones
Selsius' family of Ethernet phones offers users connectivity to the IP PBX. Each model is a full-featured
PBX-like telephone that can be plugged directly into an Ethernet jack. Selsius' family of phones provides the
same business-quality audio as a traditional PBX phone and does not require a companion PC. The Selsius-Phone
is an IP-based telephone that can be installed anywhere in the corporate LAN/WAN IP network. The
phones support DHCP (Dynamic Host Configuration Protocol) and do not have to be co-located with the
Selsius-CallManager(tm).
DHCP support for Selsius-Phones makes phone setup virtually automatic. Also, DHCP allows the user to move
phones and plug them in anywhere on the IP network (local or remote ports) with no configuration.
The Selsius-Phone comes equipped with a 10Base-T interface and operates on third-pair power. Power can be
supplied near the phone or from a shared 48-volt power source near the hub. The Selsius-Phone comes in two
models, with 12 or 30 programmable buttons, speakerphone, and display. Each model supports G.711 and G.723
audio compression for low bandwidth requirements. Also, each model contains an integrated Ethernet repeater,
enabling the use of a single Ethernet wall jack for the computer (data) and the Selsius-Phone.
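The effect of codec choice on bandwidth can be made concrete with a rough calculation. The sketch below rests on several assumptions that are not stated in the text: that "G.723" refers to G.723.1 at its 6.3 kbit/s rate (24-byte frames every 30 ms), that G.711 is packetized in 20 ms packets (160 bytes), and that each packet carries IPv4, UDP, and RTP headers with no header compression and no link-layer overhead. The function name is hypothetical.

```python
# IPv4 (20) + UDP (8) + RTP (12) header bytes; link-layer overhead is ignored.
IP_UDP_RTP_HEADER_BYTES = 20 + 8 + 12


def ip_bandwidth_kbps(payload_bytes, packet_ms):
    """One-way IP-level bandwidth of a voice stream in kbit/s."""
    packet_bits = (payload_bytes + IP_UDP_RTP_HEADER_BYTES) * 8
    return packet_bits / packet_ms  # bits per millisecond equals kbit/s


# Assumed packetization: (payload bytes per packet, packet interval in ms).
codecs = {
    "G.711": (160, 20),
    "G.723.1": (24, 30),
}

MODEM_KBPS = 28.8  # modem rate mentioned in Section 5.1

for name, (payload, interval) in codecs.items():
    rate = ip_bandwidth_kbps(payload, interval)
    verdict = "fits" if rate < MODEM_KBPS else "does not fit"
    print(f"{name}: {rate:.1f} kbit/s per direction, {verdict} in {MODEM_KBPS} kbit/s")
```

Under these assumptions G.711 needs about 80 kbit/s per direction, which is fine on a 10Base-T LAN but far too much for a 28.8 kbit/s modem, while G.723.1 needs about 17 kbit/s. Note that for the low-rate codec most of the bandwidth is packet headers rather than speech.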
The Selsius-Phones are Microsoft NetMeeting(tm) enabled. Using NetMeeting, features such as application
sharing and video conferencing are available simply by pressing a button on the Selsius-Phone. The phone comes
with distinctive ringing and volume adjustments and can be configured using a Web browser.
Features of Selsius-phone:
10Base-T Interface - just plug it in or move it around.
Integrated Ethernet hub - connect the phone and PC (optional) to the same port to eliminate multiple cable
runs to the desktop.
With Selsius-CallManager, the Selsius-Phone provides a desktop H.323 interface without the need for a PC.
One-button collaboration - with NetMeeting(tm), application sharing and video are available at the touch of
a button.
G.711/G.723 audio compression for low bandwidth requirements.
Choice of models, with 12 or 30 programmable buttons, speakerphone, and display.
Distinctive ringing and ringer volume adjustments. Handset has volume adjustment and is hearing aid
compatible.
Phone can be configured using a Web browser.
Selsius' Ethernet Phones are suitable for business users. They can be used in offices since each is a full-featured
PBX-like telephone. The quality of Selsius' Ethernet Phones is reasonably good since they are meant for business users.
Selsius' Ethernet Phones are usually useless in homes due to the lack of Ethernet.
5.5. Future Equipment
Technology is developing very fast. In the future, many different kinds of IP telephones will appear.
Those phones must not only use the latest technology; people's needs and feelings must be considered as
well.
The following lists a few properties that future IP telephony terminals should have:
They should be at least as convenient to use as the advanced phones are currently.
Quality of service must be very good. No annoying delays or external noise in voice are allowed, and
movements in video must be smooth.
Multi-functionality: they can act like phones, pagers, credit cards, note pads, calculators, video terminals.
Variable uses: they can be used as an office tool or for entertainment.
Mobility: they can be used for access wherever the user is.
Design: they must look and feel good. Since they are not only tools, they must be as good looking as
furniture or they must be suitable for carrying along in every situation.
Price of the advanced equipment must be low.
One of the first services in addition to voice calls will be video telephony, which has been a dream for many
years. IP telephony networks address this issue, and video telephony can come to every home that wants it. Even home
users will be able to make multiuser calls or hold video conferences at a relatively low price. Users could, e.g.,
discuss together in which restaurant they will spend Friday night. For offices, IP multimedia conferences will offer
much better quality of service than current video conferences at a lower price.
Since IP telephony does not reserve all the bandwidth of the user's line, she can easily access data over the
same line during a call or conference. The user has a loudspeaker in the ear and a microphone, e.g., on the desk. At the
same time she can access remote databases in the network. This makes working much easier and more flexible. When
two or more players are playing the same game, they can talk with each other and make plans in real time on
how to destroy the opponent's (computer or other group of humans) military forces, for instance.
As voice recognition develops, users in a conversation can ask a computer to retrieve information from a database
by giving spoken commands. Likewise, players could give commands to their virtual soldiers by
speaking (or yelling).
IP will be used everywhere in the future. Microwave ovens, TV sets, stereos, sauna heaters, etc. will use IP in
one way or another. There could be speakers all around the house, and people could tell, e.g., the sauna heater
to switch on or off. They could also call home from the office or from a mobile phone and give instructions to the
machines.
IP telephony terminals can be entertainment centers in the future. Radio and video services could be used over an IP
network from home or even when the user is traveling, by using a mobile phone. In addition to video/music on
demand there could be interaction with theater plays, for instance. Users sitting at home could give suggestions
and applause to the actors and actresses. When screen technology evolves, books could be transferred over
the Internet to an electronic book. For example, if the user forgot to take a book with her, she could request it to
be transmitted to her phone. Then she could read it and, after reading, delete it from the "phone" and load a new
one.
The future mobile phones could be real mini offices where the user has everything with her. No separate
calculators, note pads, credit cards, pagers, and phones are needed. The user can receive and transmit all types of
data directly to her mobile phone. The phone then converts the received data to the correct output format (text,
images, voice, video).
The combination of electronic commerce and mobile phones offers possibilities to pay bills, order supplies, etc.
very easily. For example, when the user is traveling on the bus from the office to home, she can make payment
orders or order food from a shop to her home. Naturally the requirement for these operations is that
security is excellent.
In addition to enhanced mobile services, the coverage area in which mobile terminals can be used will be
nearly global in the future. With roaming agreements between satellite networks and public land mobile networks,
only a few areas on the Earth will remain uncovered.
Future equipment will be based more and more on software running on top of very efficient processors. This
offers an opportunity for advanced users to extend the equipment functionality as they want by writing their own
personal applications on top of the commercial equipment.
6. Summary
This paper introduced several aspects of IP telephony: the concept of IP telephony, its history, IP telephony business,
IP telephony technology, and end user equipment and their vendors. IP telephony means placing voice calls over
IP networks instead of PSTNs, and it has been a hot topic during the last few years.
IP telephony can be used in three basic situations: IP terminal to IP terminal, IP terminal to external phone (and
vice versa), and phone to phone. The IP telephony market is growing very rapidly at the moment. The prices of IP
telephony calls are substantially lower than in PSTNs. Private users especially are interested in the possibilities of
IP telephony because they are not very concerned about quality of service when the price is low.
Currently two notable standards exist for IP telephony: ITU-T's H.323 and IETF's SIP. H.323 has reached a
stronger position than SIP in current markets. H.323 was introduced in detail in Chapter 4. The H.323
network architecture consists of four types of network elements: terminals, gatekeepers, gateways, and
multipoint control units. Basically, only two terminals are needed for communication, but the other network
elements are needed when more advanced and flexible IP telephony services are wanted. SIP was briefly
introduced in Section 4.6. SIP consists of terminals, proxy servers, and redirect servers, and its messages are
based on HTTP. SIP is probably more suitable for telephony and multimedia conferencing over the Internet,
while H.323 is suitable for intranet use.
One of the most vital issues in IP telephony is quality of service, which in the case of voice calls means the quality
of the perceived voice and the delay in two-way conversation. To fulfill these requirements, the codecs of the end user
equipment and the network transmission must be efficient. Nowadays IP networks offer only very limited QoS
capabilities, and this is especially the case in the Internet. New techniques can be used to overcome this. Later
the public networks, and especially the Internet, will offer better quality of service options by utilizing the Resource
Reservation Protocol, Multi-protocol over ATM, and IPv6.
Another vital issue is interoperability between different IP telephony network architectures, and IP telephony
networks and PSTNs. Currently several organizations are interested in interoperability. In this paper ETSI's
TIPHON, iNOW!, MIT MTC, and IMTC were introduced.
Mechanisms to prevent eavesdropping and spoofing of calls, denial of service, and spamming are also
essential. Version 2 of H.323 offers methods for authentication, integrity, privacy, and non-repudiation. SIP
uses methods similar to those used with HTTP.
End user IP telephony equipment can be based on bare software, specialized sound cards, or a combination of
a computer and specialized telephony equipment. Current end user equipment needs development. Currently it
is not necessarily easy to use and its applicability is restricted; no general IP telephone exists. In the future IP
telephony terminals will become easy-to-use multimedia terminals with the capability to handle voice, video, still
images, text, etc. Future IP telephony equipment will support mobility, so users will not be tied to a
particular location.
References
[1]
Beijar, Nicklas Signaling Protocols for Internet Telephony --- Architectures based on H.323 and SIP.
Helsinki University of Technology.
<http://keskus.hut.fi/tutkimus/ipana/paperit/sip.pdf>
[2]
Koistinen T., Haeggström J. IP Telephony. Nokia Telecommunications, October, 1998.
<http://keskus.hut.fi/opetus/s38130/s98/ip_tel/ip_tel.html>
[3]
Martola J., Pettersson M. Voice over IP Technology. Helsinki University of Technology, 1999.
[4]
Korkea-Aho, Mari Commercial IP Telephony Services. Helsinki University of Technology, 1999.
[5]
Gold, B. Digital Speech Networks, IEEE Proceedings, vol.65, no. 12, Dec. 1977.
[6]
Datamonitor Plc. IP telephony Markets in Europe and the US, June 1998.
[7]
Null, Christopher No more hangups over voice.
<http://www.lantimes.com/testing/98mar/803c047a.html>
[8]
The Pulver Report, October 8, 1998.
[9]
Voice over IP market overview
<http://www.telogy.com/our-products/golden-gateway/voip-market-overview.html>
[10]
A Primer on the H.323 Series Standard
<http://gw.databeam.com/h323/h323primer.html>
[11]
Frost & Sullivan. World Markets for IP Telephony Equipment and Services, 1998.
[12]
Frequently Asked Questions About The MIT Internet & Telecoms Convergence Consortium
<http://itel.mit.edu/itel/ITC.FAQ.html>
[13]
IMTC -- Frequently Asked Questions
<http://www.imtc.org/faq.htm>
[14]
IP telephony client software
<http://www.von.com/teleph.html>
[15]
IP telephony client software
<http://itel.mit.edu/itel/software.html>
[16]
Internet Phones
<http://www.ipxstream.com/GIP/vendors/client-phones/index.html>
[17]
Microsoft Netmeeting
<http://www.microsoft.com/netmeeting/>
[18]
VocalTec products
<http://www.vocaltec.com/products/products.htm>
[19]
Netscape Conference
<http://www.netscape.com/>
[20]
Net2Phone Client
<http://www.net2phone.com/english/>
[21]
<http://www.quicknet.net/>
[22]
<http://www.viennasys.com/>
[23]
ITU-T E.800: Quality of Service and dependability vocabulary. 1988.
[24]
<http://whatis.com>
[25]
<http://www.real.com/devzone/library/fireprot/rtsp/faq.html>
[26]
<http://www.atmforum.com/atmforum/library/mpoa.html>
[27]
Schulzrinne, H. SIP: FAQ
<http://www.cs.columbia.edu/~hgs/sip/faq.html>