DOCTOR OF TECHNOLOGY
Contact information:
Karl-Johan Grinnemo
Telecom & Media
TietoEnator AB
Box 1038
SE-651 15 Karlstad, Sweden
Phone: +46 (0)5429 41 49
Fax: +46 (0)5429 40 01
Email: karl-johan.grinnemo@tietoenator.com
Printed in Sweden
Karlstads Universitetstryckeri
Karlstad, Sweden 2006
To my parents
Abstract
In recent years, Internet and IP technologies have made inroads into almost every communication market ranging from best-effort services such as email and Web, to soft real-time
applications such as VoIP, IPTV, and video. However, providing a transport service over
IP that meets the timeliness and availability requirements of soft real-time applications has
turned out to be a complex task. Although network solutions such as IntServ, DiffServ, MPLS, and VRRP have been suggested, these solutions often fail to provide an end-to-end transport service for soft real-time applications. Moreover, they have so far seen only modest deployment. In light of this, this thesis considers transport protocols for soft real-time applications.
Part I of the thesis focuses on the design and analysis of transport protocols for soft real-time multimedia applications with lax deadlines, such as image-intensive Web applications.
Many of these applications do not need a completely reliable transport service, and to this
end Part I studies so-called partially reliable transport protocols, i.e., transport protocols that
enable applications to explicitly trade reliability for improved timeliness. Specifically, Part
I investigates the feasibility of designing retransmission-based, partially reliable transport
protocols that are congestion aware and fair to competing traffic. Two transport protocols
are presented in Part I, PRTP and PRTP-ECN, which are both extensions to TCP for partial
reliability. Simulations and theoretical analysis suggest that these transport protocols could
give a substantial improvement in throughput and jitter compared to TCP. Additionally, the simulations indicate that PRTP-ECN is TCP friendly and fair to competing congestion-aware traffic such as TCP flows. Part I also presents a taxonomy for retransmission-based, partially reliable transport protocols.
Part II of the thesis considers the Stream Control Transmission Protocol (SCTP), which
was developed by the IETF to transfer telephony signaling traffic over IP. The main focus of
Part II is on evaluating the SCTP failover mechanism. Through extensive experiments, it is
suggested that in order to meet the availability requirements of telephony signaling, SCTP
has to be configured much more aggressively than is currently recommended by the IETF. Furthermore, ways to improve the transport service provided by SCTP, especially with regard to the failover mechanism, are suggested. Part II also studies the effects of Head-of-Line
Blocking (HoLB) on SCTP transmission delays. HoLB occurs when packets in one flow
block packets in another, independent, flow. The study suggests that the short-term effects
of HoLB could be substantial, but that the long-term effects are marginal.
Keywords: transport protocol, congestion control, partial reliability, soft real-time, SCTP,
failover, SIGTRAN, head-of-line blocking
Acknowledgments
This thesis has benefited from the help of many people. First and foremost, I would like
to acknowledge my supervisor, Prof. Anna Brunstrom (Department of Computer Science,
Karlstad University, Sweden), for being an excellent research mentor, and for guiding me
through my doctoral studies. Also, I would like to express my sincere gratitude to my employer and main sponsor, TietoEnator. Without their financial and administrative support,
I would not have been able to pursue any doctoral studies in the first place.
Second, I would like to thank my current co-supervisor, Dr. Reiner Ludwig (Senior Specialist, Ericsson Research, Aachen, Germany), for his assistance in my work on SCTP, and
Comments on My Participation
I am the principal contributor to all papers except Papers I, VII, and X. The taxonomy presented in Paper I builds upon earlier work conducted by Assistant Prof. Johan Garcia and Prof. Anna Brunstrom at the Dept. of Computer Science at Karlstad University. However, I have substantially reworked their taxonomy, and I am the main author of Paper I. In Paper
VII, Torbjörn Andersson, at that time an assistant at the Dept. of Computer Science at Karlstad
University, is responsible for the design and execution of the tests presented in the paper.
My participation in Paper VII includes the analysis of the test results and the writing of the
paper. Paper X is a joint effort between Ericsson Eurolab in Aachen and Karlstad University.
My work on this paper includes participation in the discussions which led to the proposed
retransmission timeout strategy; taking part in the design and analysis of the simulations and
experiments; executing the experiments; and co-authoring the paper.
Other Papers
Apart from the papers included in this thesis, I have authored or co-authored the following
papers:
[1] K. Asplund, A. Brunstrom, J. Garcia, K-J Grinnemo, and S. Schneyer. PRTP - A Partially Reliable Transport Protocol for Multimedia Applications: Background Information and Analysis. Karlstad University Studies 1999:5, Karlstad University, Sweden,
June 1999.
[2] K-J Grinnemo, J. Garcia, and A. Brunstrom. A Taxonomy and Survey of Retransmission-Based Partially Reliable Transport Protocols. Karlstad University Studies
2002:34, Karlstad University, Sweden, October 2002.
[3] K-J Grinnemo and A. Brunstrom. A Survey of TCP-Friendly Congestion Control
Mechanisms for Multimedia Traffic. Karlstad University Studies 2003:1, January
2003.
[4] K-J Grinnemo and A. Brunstrom. Impact of SCTP-controlled Failovers for M3UA
Users in a Dedicated SIGTRAN Network. In Proceedings of the Second Swedish
National Computer Networking Workshop (SNCNW). Stockholm, Sweden, September
2003.
[5] K-J Grinnemo and A. Brunstrom. Some Observations on the Performance of SCTP-controlled Failovers in M3UA-based SIGTRAN Networks. In Proceedings of the Second Swedish National Computer Networking Workshop (SNCNW). Karlstad, Sweden,
November 2004.
[6] S. Lindskog, K-J Grinnemo, and A. Brunstrom. Physical Separation for Data Protection based on SCTP Multihoming. In Proceedings of the Second Swedish National
Computer Networking Workshop (SNCNW). Karlstad, Sweden, November 2004.
[7] S. Lindskog, K-J Grinnemo, and A. Brunstrom. Data Protection Based on Physical
Separation: Concepts and Application Scenarios. In Proceedings of the International Conference on Computational Science and Its Applications (ICCSA). Singapore, May
2005.
[8] K-J Grinnemo, S. Baucke, A. Brunstrom, R. Ludwig, and A. Wolisz. An Easy Way
to Reduce SCTP Failover Times. In Proceedings of the Third Swedish National Computer Networking Workshop (SNCNW). Halmstad, Sweden, November 2005.
Contents

Introductory Summary
1 Introduction
2 Research Objectives
3 Contributions
3.1 Part I: Partially Reliable Transport Protocols for Multimedia Applications
3.2 Part II: Transport Service for Telephony Signaling
4 Thesis Outline
4.1 Part I: Partially Reliable Transport Protocols for Multimedia Applications
4.2 Part II: Transport Service for Telephony Signaling
5 Concluding Remarks
References

Part I: Partially Reliable Transport Protocols for Multimedia Applications

Paper I
1 Introduction
2 Preliminaries
3 The Taxonomy
3.1 Classification with Respect to Reliability Service
3.2 Classification with Respect to Error Control Scheme
5 Concluding Remarks

Paper II
1 Introduction
2 Protocol Design
5 Stationary Analysis
5.1 Simulation Experiment
5.2 Results
6 Transient Analysis
6.1 Simulation Experiment
6.2 Results
Conclusions

Paper III: Evaluation of the QoS Offered by PRTP-ECN - A TCP Compliant Partially Reliable Transport Protocol
1 Introduction
Related Work
Overview of PRTP-ECN
Results

Paper IV: A Simulation Based Performance Analysis of a TCP Extension for Best Effort Multimedia Applications
1 Introduction
Overview of PRTP-ECN
5 Conclusions

Paper V
1 Introduction
2 Related Work

Part II: Transport Service for Telephony Signaling

Paper VI: Towards the Next Generation Network: The Softswitch Solution
1 Introduction
Bearer Signaling
Future Outlook
Summary

Paper VII
Introduction
Methodology
Results
Conclusions

Paper VIII
Introduction
Methodology
Results
Conclusions

Paper IX
Introduction
Failovers in SCTP
Methodology
4 Results
5 Conclusions

Paper X
1 Introduction
Introductory Summary
1 Introduction
Over the course of the last decade, the phenomenal success of the Internet and the universal
adoption of Internet technologies have driven profound changes in the data- and telecommunication industry. Maybe the most remarkable outcome of this evolution is the vision of the
Internet as a ubiquitous service platform for basically every known communication service;
from being the platform of basic data services such as file transfer, email, and Web browsing,
to also becoming a platform for applications such as video broadcasting, IPTV, and, not least, wireline and wireless telecommunication. However, using the Internet as a ubiquitous service platform greatly challenges the overall architectural philosophy of the Internet Protocol (IP), which prescribes an end-to-end architecture with smart hosts and a dumb switching
network [19]. In particular, IP does not lend itself easily to soft real-time applications, such
as video and telephony, with time deadlines that need to be met most of the time, and with
fairly stringent availability requirements.
To address the timeliness and availability requirements of soft real-time applications,
quality-of-service (QoS) architectures such as the Integrated Services (IntServ) [5] and Differentiated Services (DiffServ) [7] architectures have been proposed; traffic engineering solutions such as MultiProtocol Label Switching (MPLS) [18] have been developed; and availability/redundancy solutions such as the Virtual Router Redundancy Protocol (VRRP) [11]
and IP-based Fast Rerouting [21] have been suggested. However, in spite of these network
solutions, IP often still has problems meeting soft real-time requirements. The reasons for this are many and include the fact that none of the proposed network solutions has so far enjoyed widespread deployment. They are also fairly expensive and need to be supported by a complex management architecture. Furthermore, even from a theoretical viewpoint, the proposed network solutions have difficulty providing a real-time service from one end host
to another [12]. Often the solutions fall back to the end-to-end architecture, and it becomes
the transport protocols of the end hosts that try to provide an end-to-end real-time service to
the best of their abilities. To this end, this thesis is concerned with transport protocols for
soft real-time applications. The thesis considers both timeliness and availability issues, and
focuses on two important categories of soft real-time applications in the Internet: multimedia
and telephony signaling.
Part I of the thesis considers the design and analysis of transport protocols for multimedia
applications; in particular, multimedia applications with lax deadlines such as image-intensive Web applications. Many of these applications do not require a completely reliable
transport service, and, in view of this, Part I studies so-called partially reliable transport
protocols. These protocols enable an application to explicitly trade reliability for improved
timeliness.
From an implementation perspective, we may differentiate between two major classes of
partially reliable transport protocols: open- and closed-loop protocols (cf. Paper I). Open-loop protocols comprise those protocols which do not employ feedback from the network or
end nodes when they perform error recovery, while closed-loop protocols do employ feedback. Part I studies a subclass of closed-loop protocols, retransmission-based protocols, i.e.,
partially reliable transport protocols which recover from packet losses by retransmitting lost
packets. Specifically, Part I investigates the feasibility of designing retransmission-based,
partially reliable transport protocols that are congestion aware and fair to contending flows.
Part II of the thesis considers the Stream Control Transmission Protocol (SCTP) [22].
The SCTP transport protocol was developed by the Internet Engineering Task Force
(IETF) [1] for the transfer of Public Switched Telephony Network (PSTN) signaling traffic
over IP. From a broad viewpoint, Part II studies how well SCTP meets the timeliness and
availability requirements of PSTN signaling in the SIGnaling TRANsport (SIGTRAN) architecture [16], i.e., in the interworking architecture between traditional telecom networks
and carrier-grade Voice over IP (VoIP) networks proposed by the IETF (cf. Paper VI).
The majority of Part II considers the performance of the network path recovery mechanism of SCTP, the so-called SCTP failover mechanism. SCTP supports redundant network
paths between two end points: one primary path and one or several backup or alternative
paths. Normally, all traffic goes on the primary path; however, if this path becomes unavailable, traffic is rerouted to one of the alternative paths. The detection of an unavailable
path, and the rerouting of traffic from the primary to the selected alternative path are done
by the SCTP failover mechanism. To be able to interwork with the corresponding recovery
mechanisms in the traditional telecom network, it is essential that SCTP exhibits the same
failover performance as these mechanisms. To this end, Part II evaluates the performance of
the SCTP failover mechanism, and, on the basis of this, suggests improvements to its current
design.
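The failover behaviour described above can be sketched as follows. This is a simplified model for illustration only, not an SCTP implementation: the per-path error counter and the Path.Max.Retrans threshold follow RFC 2960, while the class and method names are invented here.

```python
# Simplified sketch of the SCTP failover mechanism (RFC 2960): each
# path keeps an error counter that is incremented on every
# retransmission timeout and cleared on an acknowledgment; when the
# counter of the primary path exceeds Path.Max.Retrans, traffic is
# rerouted to an alternative path.

class Path:
    def __init__(self, name):
        self.name = name
        self.error_count = 0
        self.active = True

class Association:
    def __init__(self, primary, alternatives, path_max_retrans=5):
        self.primary = primary
        self.alternatives = list(alternatives)
        self.path_max_retrans = path_max_retrans

    def current_path(self):
        """Traffic goes on the primary path while it is active."""
        if self.primary.active:
            return self.primary
        # Failover: pick the first active alternative path.
        for p in self.alternatives:
            if p.active:
                return p
        raise RuntimeError("all paths unavailable")

    def on_timeout(self, path):
        """A retransmission timeout on `path` counts as one error."""
        path.error_count += 1
        if path.error_count > self.path_max_retrans:
            path.active = False  # path is declared unavailable

    def on_ack(self, path):
        """A successful acknowledgment clears the error counter."""
        path.error_count = 0
        path.active = True

# With the RFC 2960 default Path.Max.Retrans = 5, six consecutive
# timeouts are needed before the primary path is abandoned.
assoc = Association(Path("primary"), [Path("backup")])
for _ in range(6):
    assoc.on_timeout(assoc.primary)
print(assoc.current_path().name)  # -> backup
```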
Part II also studies the deteriorating effect of Head-of-Line Blocking (HoLB) on the end-to-end transmission delay. HoLB occurs when packets from one flow block packets from
another separate, independent flow. Since mitigating the impact of HoLB was one of the
main reasons SCTP was developed in the first place, Part II tries to quantify the effects of
HoLB on PSTN signaling traffic under various network conditions.
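The blocking effect can be illustrated with a toy model in which two independent message flows either share one strictly ordered stream (as with a single TCP connection) or use one stream each (as SCTP's multi-streaming allows). The arrival times are invented for illustration.

```python
# Toy illustration of head-of-line blocking: two independent message
# flows share one strictly ordered stream. arrivals[i] is the time at
# which message i (in sequence order) is available at the receiver;
# a retransmitted message arrives late and holds up everything after it.

def ordered_delivery(arrivals):
    """In-order delivery: message i is delivered only once all
    messages 0..i have arrived (single TCP-like stream)."""
    latest = 0.0
    deliver = []
    for t in arrivals:
        latest = max(latest, t)
        deliver.append(latest)
    return deliver

# Interleaved flows A and B; A's second message (index 2) is lost and
# retransmitted, arriving at t = 5.0 instead of t = 2.0.
flows    = ["A", "B", "A", "B", "A", "B"]
arrivals = [1.0, 1.5, 5.0, 2.5, 3.0, 3.5]

single = ordered_delivery(arrivals)

# With one independent stream per flow, each flow only waits for its
# own messages (the SCTP multi-streaming case):
per_flow = {}
multi = []
for f, t in zip(flows, arrivals):
    per_flow[f] = max(per_flow.get(f, 0.0), t)
    multi.append(per_flow[f])

print(single)  # flow B's messages at t=2.5 and 3.5 are held until 5.0
print(multi)   # flow B is unaffected by flow A's retransmission
```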
2 Research Objectives

3 Contributions
The main contributions of this thesis are summarized in this section. The contributions of
Part I are summarized in Subsection 3.1, and the contributions of Part II are summarized in
Subsection 3.2.
4 Thesis Outline
This thesis is arranged in two parts. Part I considers the design and analysis of retransmission-based, partially reliable transport protocols for soft real-time multimedia applications. It comprises five papers: Paper I - Paper V. Paper I presents a taxonomy and survey of
retransmission-based, partially reliable transport protocols. Furthermore, Paper I provides an
introduction to the subject. Paper II introduces PRTP, and Papers III - V discuss PRTP-ECN.
Part II of the thesis concerns the transport service offered by SCTP in the SIGTRAN architecture. It consists of five papers: Paper VI - Paper X. Paper VI provides a background to
our research on SCTP. It gives a comprehensive introduction to the softswitch solution, the
common way of implementing carrier-grade VoIP networks, and how softswitch networks
interwork with traditional telecom networks in the SIGTRAN architecture. Paper VII considers HoLB and its short- and long-term impact on SCTP traffic, and Paper VIII - Paper X
address SIGTRAN availability and the SCTP failover mechanism. In particular, Paper VIII
and Paper IX evaluate the SCTP failover performance in unloaded and loaded SIGTRAN
networks, and Paper X presents our study of using relaxed retransmission backoff schemes
to improve the failover performance.
A more detailed description of the papers included in this thesis is provided in Subsections 4.1 and 4.2, below.
performance of PRTP for long-lived connections in terms of average interarrival jitter, average throughput, and average fairness is considered. The second simulation experiment
also studies whether PRTP is TCP friendly or not. Finally, the third simulation experiment
evaluates the transient performance of PRTP compared to TCP. Specifically, the third simulation experiment studies the throughput performance of PRTP in a typical Web browsing
scenario. Since the throughput obtained in Web browsing is very much dependent on the
type of Internet connection, three types of connections are studied: fixed, modem, and GSM.
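The performance metrics named above can be made concrete with a small sketch. The exact definitions used in the papers are not reproduced here; the measures below (mean deviation of consecutive interarrival gaps for jitter, and Jain's fairness index over per-flow throughputs) are common choices and serve only as illustration.

```python
# Illustrative versions of two of the metrics evaluated in the
# simulations. These are common textbook definitions, assumed here
# for illustration; the papers' exact definitions may differ.

def mean_interarrival_jitter(arrival_times):
    """Mean absolute deviation between consecutive interarrival gaps."""
    gaps = [t2 - t1 for t1, t2 in zip(arrival_times, arrival_times[1:])]
    devs = [abs(g2 - g1) for g1, g2 in zip(gaps, gaps[1:])]
    return sum(devs) / len(devs)

def jain_fairness(throughputs):
    """Jain's fairness index: 1.0 when all flows get equal throughput."""
    n = len(throughputs)
    return sum(throughputs) ** 2 / (n * sum(x * x for x in throughputs))

# Perfectly periodic arrivals give zero jitter ...
print(mean_interarrival_jitter([0.0, 1.0, 2.0, 3.0]))  # -> 0.0
# ... and equal per-flow throughputs give a fairness index of 1.0.
print(jain_fairness([10.0, 10.0, 10.0]))               # -> 1.0
```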
Paper III: Evaluation of the QoS Offered by PRTP-ECN - A TCP Compliant Partially Reliable Transport Protocol
The simulations presented in Paper II found PRTP to be TCP unfriendly and not altogether
fair. To address this, PRTP-ECN was conceived. This paper considers PRTP-ECN: the
principal ideas behind the protocol and its design. The stationary performance of PRTP-ECN is evaluated using the same simulation testbed as was used in the stationary analysis
of PRTP (see Paper II). The paper gives a detailed description of the stationary analysis of
PRTP-ECN. Specifically, it evaluates the stationary performance of PRTP-ECN compared
to TCP in terms of average interarrival jitter, average throughput, average goodput, average
fairness, and TCP friendliness.
Paper IV: A Simulation Based Performance Analysis of a TCP Extension for Best Effort
Multimedia Applications
Like Paper III, this paper considers the stationary performance analysis of
PRTP-ECN. However, here the focus is on the statistical design and analysis of the simulation experiment. The simulation experiment is designed as a series of factorial experiments,
one for each studied performance metric. The paper elaborates on the underlying effects
model. Examples of issues discussed are model fitting, e.g., variance stabilizing transforms,
and the statistical hypotheses tested. In addition to the performance metrics studied in Paper III, the paper considers the link utilization of PRTP-ECN as compared to TCP.
5 Concluding Remarks
This thesis considers IP-based transport protocols for soft real-time applications. The thesis
is concerned with both timeliness and availability issues, and focuses on two particular categories of applications: multimedia and PSTN signaling. Part I of the thesis considers the design and analysis of a subclass of partially reliable transport protocols: retransmission-based,
partially reliable transport protocols. The objective of this work was to study the feasibility of designing retransmission-based, partially reliable transport protocols for soft real-time
applications that are compatible with existing Internet transport protocols; are congestion
aware; and are, if possible, fair and TCP friendly. Our work resulted in two extensions to
TCP for partial reliability, PRTP and PRTP-ECN, and Part I shows through simulations and
theoretical analysis that these protocols could give a substantial improvement in throughput
and jitter compared to TCP. Furthermore, the simulations in Part I suggest that while PRTP
is not altogether TCP friendly, PRTP-ECN is both TCP friendly and reasonably fair against
competing TCP flows. Part I of the thesis also presents a taxonomy for retransmission-based, partially reliable transport protocols which, apart from serving as a classification
framework, provides a uniform terminology for the subject. The work presented in Part I
opens up a number of avenues for future research including effective image and/or video
coding techniques for partially reliable transport protocols, and alternative partially reliable
retransmission schemes.
Part II of the thesis evaluates the transport service provided by SCTP, and studies to what
extent SCTP is able to meet PSTN signaling requirements in the IETF SIGTRAN architecture. The main focus of Part II is on the SCTP failover mechanism and its ability to meet the
availability requirements of PSTN signaling. Through extensive experiments, it is suggested
that in order to meet the availability requirements of PSTN signaling, SCTP has to be configured much more aggressively than is recommended in RFC 2960. Ways to improve the
transport service provided by SCTP are also presented. In particular, a relaxed retransmission scheme is proposed. Simulations and complementary experiments suggest that such a
retransmission scheme could significantly improve the SCTP failover performance. Part II
also studies the effects of HoLB on SCTP transmission delays. The study suggests that the
short-term effects of HoLB could be substantial, but that the long-term effects are marginal.
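As a rough illustration of why the failover configuration matters, the failover detection time can be estimated as the sum of the consecutive retransmission timeouts needed to exceed Path.Max.Retrans. The sketch below assumes the RFC 2960 defaults (RTO.Min = 1 s, RTO.Max = 60 s, Path.Max.Retrans = 5) and an RTO at its minimum when the failure occurs; the "relaxed" variant shown is illustrative only and is not the specific scheme proposed in Paper X.

```python
# Back-of-the-envelope failover-time estimate: the primary path is
# declared unavailable only after Path.Max.Retrans + 1 consecutive
# timeouts, so the detection time is roughly the sum of the successive
# RTO values. Parameter defaults follow RFC 2960; the starting RTO of
# 1 s (RTO.Min) is an assumption for illustration.

def failover_time(backoff, rto=1.0, rto_max=60.0, path_max_retrans=5):
    """Sum the RTOs of the consecutive timeouts needed for failover.
    `backoff` maps the current RTO to the next one."""
    total = 0.0
    for _ in range(path_max_retrans + 1):
        total += rto
        rto = min(backoff(rto), rto_max)
    return total

# Standard exponential backoff: the RTO doubles after every timeout,
# giving 1 + 2 + 4 + 8 + 16 + 32 = 63 seconds.
print(failover_time(lambda rto: 2 * rto))  # -> 63.0

# One conceivable relaxed scheme (illustrative only): no doubling at
# all, giving 6 * 1 = 6 seconds.
print(failover_time(lambda rto: rto))      # -> 6.0
```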
This thesis does not by any means signify the end of our study of timeliness and availability issues in SCTP. Currently, SCTP uses more or less the same congestion control mechanism as TCP, a mechanism which is believed by many, such as Camarillo
et al. [6], to be less than ideal for signaling traffic. In future research, we intend to study and
evaluate alternative congestion control mechanisms that, in better ways, take into account
the properties of signaling traffic, e.g., in terms of burstiness and duration. Also our research
on the SCTP failover mechanism will be continued. Notably, we intend to further our study
of relaxed retransmission timeout schemes and consider alternative solutions. Furthermore,
we intend to more formally analyze the stability of congestion control schemes that utilize a
relaxed retransmission strategy.
References
[1] Internet Engineering Task Force (IETF). http://www.ietf.org.
[2] 3GPP. 3rd generation partnership project; technical specification group services and
system aspects; IP multimedia subsystem (IMS); stage 2 (release 7). Technical Specification TS 23.228 v.7.1.0, 3GPP, September 2005.
[3] P. D. Amer, C. Chassot, T. Connolly, M. Diaz, and P. T. Conrad. Partial order transport
service for multimedia and other applications. IEEE/ACM Transactions on Networking,
2(5), October 1994.
[4] K. Asplund, J. Garcia, A. Brunstrom, and S. Schneyer. Decreasing transfer delay
through partial reliability. In Protocols for Multimedia Systems (PROMS), Cracow,
Poland, October 2000.
[5] R. Braden, D. Clark, and S. Shenker. Integrated services in the Internet architecture: An overview.
RFC 1633, IETF, June 1994.
[6] G. Camarillo and H. Schulzrinne. Signalling transport protocols. Technical report,
Dept. of Computer Science, Columbia University, February 2002.
[7] M. Carlson, W. Weiss, S. Blake, Z. Wang, D. Black, and E. Davies. An architecture for
differentiated services. RFC 2475, IETF, December 1998.
[8] L. Coene and J. Pastor-Balbas. Telephony signalling transport over stream control transmission protocol (SCTP) applicability statement. RFC 4166, IETF, February 2006.
[9] M. Diaz, A. Lopez, C. Chassot, and P. D. Amer. Partial order connections: A new concept for high speed and multimedia services and protocols. Annals of Telecommunications, 49(5-6):270-281, 1994.
[10] M. Enachescu, Y. Ganjali, A. Goel, N. McKeown, and T. Roughgarden. Part III: Routers with very small buffers. ACM Computer Communication Review, 35(3):83-89,
July 2005.
[11] R. Hinden. Virtual router redundancy protocol (VRRP). RFC 3768, IETF, April 2004.
[12] G. Huston. Internet Performance Survival Guide QoS Strategies for Multiservice
Networks. John Wiley & Sons, Inc., 1st edition, 2000.
[13] A. Jungmaier, E. P. Rathgeb, and M. Tuexen. On the use of SCTP in failover scenarios. In 6th World Multiconference on Systemics, Cybernetics and Informatics, Orlando,
Florida, USA, July 2002.
[14] G. De Marco, D. De Vito, M. Longo, and S. Loreto. SCTP as a transport for SIP: a
case study. In 7th World Multiconference on Systemics, Cybernetics and Informatics
(SCI), Orlando, Florida, USA, July 2003.
[15] B. Mukherjee and T. Brecht. Time-lined TCP for the TCP-friendly delivery of streaming media. In IEEE International Conference on Network Protocols (ICNP), pages 165-176, Osaka, Japan, November 2000.
[16] L. Ong, I. Rytina, M. Garcia, H. Schwarzbauer, L. Coene, H. Lin, I. Juhasz, M. Holdrege, and C. Sharp. Framework architecture for signalling transport. RFC 2719, IETF,
October 1999.
[17] G. Raina, D. Towsley, and D. Wischik. Part II: Control theory for buffer sizing. ACM Computer Communication Review, 35(3):79-82, July 2005.
[18] E. Rosen, A. Viswanathan, and R. Callon. Multiprotocol label switching architecture.
RFC 3031, IETF, January 2001.
[19] J. H. Saltzer, D. P. Reed, and D. D. Clark. End-to-end arguments in system design. ACM Transactions on Computer Systems, 2(4):277-288, November 1984.
[20] T. Seth, A. Broscius, C. Huitema, and H-A P. Lin. Performance requirements for
signaling in internet telephony. Internet draft, IETF, November 1998. Work in Progress.
[21] M. Shand and S. Bryant. IP fast reroute framework. Internet draft, IETF, March 2006.
Work in Progress.
[22] R. Stewart, Q. Xie, K. Morneault, C. Sharp, H. Schwarzbauer, T. Taylor, I. Rytina,
M. Kalla, L. Zhang, and V. Paxson. Stream control transmission protocol. RFC 2960,
IETF, October 2000.
[23] D. Wischik and N. McKeown. Part I: Buffer sizes for core routers. ACM Computer Communication Review, 35(3):75-78, July 2005.
Part I
Partially Reliable Transport Protocols
for Multimedia Applications
Paper I
1 Introduction
The two standard transport protocols in the TCP/IP suite, TCP [29] and UDP [28], date
back some thirty years. They were primarily designed to offer a service appropriate for the
prevailing applications in those days. TCP was designed to offer a completely reliable service, a service suitable for such applications as email, file transfer and remote login; UDP,
on the other hand, was designed to offer an unreliable service, a service appropriate for
simple query-response applications such as name servers and network management systems.
However, in recent years, we have witnessed a growing interest in distributed multimedia applications: a category of applications typically requiring a service that in terms of reliability
places itself somewhere between the services offered by TCP and UDP but that often has stringent demands on throughput, delay, and delay jitter. To address the needs of this class
of applications, the Real-time Transport Protocol (RTP) [31] and other application-support
protocols have been proposed. These protocols try to follow the principles of protocol architecture design outlined by Clark and Tennenhouse [9]. Specifically, they try to implement
the two key architectural principles of Clark and Tennenhouse: application level framing
and integrated layer processing. However, apart from giving the application more control
over transport level decisions, these protocols also impose a number of responsibilities on
the application, e.g., flow and congestion control, and the administration of the timing of the
application messages.
The fact that RTP and similar protocols impose a number of duties on the application that
are normally taken care of by the transport protocol, and that are essentially of no interest to
the application, has made many researchers argue that this is not the right way to go [21].
They re-emphasize the requirement that there should be a clean separation between the specification and implementation of a service. In particular, they argue for a transport protocol
that, in the same way as RTP, gives the application greater influence on the transport-level
decisions but that, in contrast to RTP, relieves the application of all transport-level responsibilities. To this end, a class of transport protocols has emerged that offers a service enabling
the application to trade some reliability for improved performance with respect to some or
all of the metrics: throughput, delay and delay jitter; a class of transport protocols that offers
a service more suitable for multimedia and other real-time applications than either TCP or
UDP. This class of protocols is commonly called partially reliable transport protocols.
From an implementation perspective, we may differentiate between two major classes
of partially reliable transport protocols: closed-loop and open-loop protocols. Closed-loop
protocols dynamically adapt their error control scheme on the basis of feedback on the current network conditions. In contrast, open-loop protocols do not use any feedback from the
network, but work instead by adding redundancy to the payload and thereby mitigating the
impact of losses.
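A minimal open-loop sketch, assuming a single XOR parity packet per block. This is one simple form of added redundancy chosen for illustration; actual open-loop protocols typically use more elaborate forward error correction codes.

```python
# Minimal open-loop example: instead of retransmitting, the sender adds
# redundancy -- here a single XOR parity packet over a block of
# equal-sized data packets -- so the receiver can repair any one loss
# in the block without feedback to the sender.

def xor_parity(packets):
    """XOR all packets together, byte by byte."""
    parity = bytes(len(packets[0]))
    for p in packets:
        parity = bytes(a ^ b for a, b in zip(parity, p))
    return parity

block = [b"pkt0", b"pkt1", b"pkt2"]
parity = xor_parity(block)

# Suppose packet 1 is lost in transit; XORing the surviving packets
# with the parity packet reconstructs it.
recovered = xor_parity([block[0], block[2], parity])
print(recovered)  # -> b'pkt1'
```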
While open-loop protocols have been extensively studied and used for multimedia communication during the past twenty years [5, 6], closed-loop techniques were not considered
appropriate for multimedia communication until ten years ago. It was then, at the beginning of the 1990s, that Dempsey [12], Papadopoulos and Parulkar [26], and others demonstrated the feasibility of using retransmission-based, closed-loop protocols for multimedia
communication. Since then, a large number of protocols belonging to this class have been
proposed. However, along with the growing interest in this class of protocols, a terminology
has evolved which in some parts is both incoherent and inconsistent. Consequently, it is currently difficult to analyze the relative merits of different protocols in a meaningful way. It is
also difficult to discern which efforts are likely to be most rewarding and thus should be considered in future research. This paper presents a taxonomy for these protocols, a taxonomy
which we believe not only gives a unified terminology but also clarifies the core principles
and thus could serve as a basis for future work in this area.
The taxonomy presented comprises two classification schemes: one that classifies the
protocols with respect to the reliability service they offer and one that classifies them with
respect to their error control scheme. The taxonomy has been inspired by the earlier work
of, among others, Diaz and Dempsey. In particular, our view of the concept of a partially
reliable service is to a large extent derived from the work on partially reliable and partially
ordered services by Diaz et al. [14]. Furthermore, our classification of retransmission-based,
partially reliable transport protocols with respect to their error control scheme is to some
degree influenced by the discussion of closed-loop, partially reliable transport protocols in
Dempsey's thesis [11]. However, in contrast to these and other works, our taxonomy maintains a clear distinction between the service offered by a protocol and its implementation.
This paper also provides a classification and survey of existing protocols. Specifically,
it shows how a selection of existing protocols are classified with respect to our taxonomy.
Notably, it follows from our classification that the majority of protocols employ a relatively small set of core principles. A subset of the classified protocols is further elaborated
in a survey. The survey complements the classification by illustrating how the majority of
the reliability services and error control schemes in our taxonomy have been implemented
in actual protocols.
The remainder of the paper is organized as follows. Section 2 provides some preparatory
material. In particular, the concept of a retransmission-based, partially reliable transport
protocol is defined. Our taxonomy is presented in Section 3. Section 4 gives a survey of
existing retransmission-based, partially reliable transport protocols and shows how they are
classified with respect to our taxonomy. Finally, Section 5 concludes the paper with a brief
summary of our proposed taxonomy and a discussion of the insight gained in developing the
taxonomy.
2 Preliminaries
The taxonomy of retransmission-based, partially reliable transport protocols presented in
this paper is based on the following definition of a partially reliable service.
Definition 1 Let r denote the reliability level offered by a service. A service is considered
partially reliable provided r ∈ ]0%, 100%[, i.e., 0% < r < 100%.
On the basis of Definition 1, we use the following definition of a retransmission-based,
partially reliable transport protocol.
Definition 2 A retransmission-based, partially reliable transport protocol is a transport
protocol that is explicitly designed to offer a partially reliable service and that uses an error
control scheme in which error recovery is made through retransmissions.
The reason why Definition 2 specifically states that a transport protocol must be explicitly
designed to offer a partially reliable service in order to be a partially reliable transport protocol is that, from a theoretical viewpoint, there is no such thing as a transport protocol offering
a completely reliable service. For example, TCP is designed to offer a completely reliable
service but is only able to do so as long as some conditions hold. In particular, TCP depends
on the ability of the IP protocol to provide end-to-end connectivity and assumes that the
TCP checksum is able to detect all conceivable kinds of bit errors. Consequently, TCP fails
to provide a completely reliable service on the rare occasions when the network becomes
partitioned or the TCP checksum fails to detect a corrupted packet [35].
[Figure 1: Classification with respect to the reliability service. The classification is made along three dimensions: specification of reliability level (implicit or explicit), granularity (message, message group or flow) and adaptiveness (non-adaptive or adaptive, with the reliability level specified per-flow, in-flow or per-message).]
3 The Taxonomy
As mentioned in Section 1, our taxonomy consists of two classification schemes: one that
classifies retransmission-based, partially reliable transport protocols with respect to their
offered reliability service and one that classifies them with respect to their error control
scheme. The former classification scheme is presented in Subsection 3.1 and the latter in
Subsection 3.2.
Granularity. The granularity of a protocol refers to the unit of data considered by the error control scheme when it makes retransmission decisions on lost or corrupted packets (cf. Subsection 3.2). As shown in Figure 1,
we distinguish between three main classes of protocols with respect to granularity:
message, message group and flow.
Message. In this context, the term message denotes messages exchanged between
application entities, i.e., Application Protocol Data Units (APDUs). When a
protocol has a granularity of a message, this means that the error control scheme
of the protocol performs retransmissions based on the status of one message.
This, for example, can be that the reliability service is based on the number of
times a message has been retransmitted, or that the reliability service is based on
a priority level assigned to a message.
Typically, protocols with a granularity of a message target video and audio streaming applications: both video and audio coders usually generate sequences of fixed-sized data blocks that fit conveniently into messages. Furthermore, video coding schemes such as MPEG-1, MPEG-2 and H.263 generate data blocks of different importance, e.g., H.263 generates I-, P- and B-frames, of which I-frames are more important than either the P- or B-frames. By using a protocol
with a granularity of a message, a video application using one of these video
coding schemes can exploit the fact that all data blocks are not equally important
and can assign different reliability levels to each data block.
Message Group. A protocol is said to have a granularity of a message group when
the retransmission decision of the protocol involves not only one message but a
pre-specified number of messages. Typically, a protocol adhering to this class
calculates the reliability level by counting the number of successfully received
messages within a fixed-length message group.
Protocols with a granularity of a message group primarily target the same niche
of applications as those with a granularity of a message, i.e., video and audio
streaming applications. Compared to protocols with a granularity of a message,
the major advantages of message group-based protocols are that they are generally easier to realize and, owing to their coarser granularity, entail less overhead.
However, the coarser granularity of message group-based protocols is not always
beneficial. While it could be argued that a granularity of a message group is
sufficient for many audio streaming applications, it is less than ideal for video
streaming applications: audio coders typically produce packets of more or less
the same importance, but, as said, this is generally not the case with video coders.
Flow. A flow refers to a unidirectional stream of messages from a single media source.
One or several flows constitute a session: a single connection and/or conversation
between one end point and one or several other end points. Common examples of
flows are video streams in a video broadcast application and audio streams in an
Internet radio application. When a protocol belonging to this class of protocols
makes a retransmission decision at a particular time during a flow, it bases its
retransmission decision on all packets sent and/or received in this flow up to this
time.
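The granularity classes above differ mainly in the scope of the data that a retransmission decision looks at. As a minimal sketch of message-group granularity, assuming a hypothetical per-group loss tolerance (the function and parameter names are illustrative, not taken from any particular protocol):

```python
def group_needs_retransmission(received_flags, min_received):
    """Decide whether any message in a fixed-length message group must be
    retransmitted, based only on how many messages of the group arrived.

    received_flags -- one boolean per message in the group (True = received)
    min_received   -- smallest number of received messages the application
                      tolerates within the group (a hypothetical parameter)
    """
    # The decision considers the group as a whole, not individual messages.
    return sum(received_flags) < min_received
```

With this coarse granularity, a group of eight messages of which six arrived needs no retransmission under a tolerance of six, regardless of which two messages were lost.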
[Figure: Sender-based and receiver-based error control schemes. In a sender-based scheme, both error detection and the retransmission decision component are located at the sender side; in a receiver-based scheme, error detection and the retransmission decision component are located at the receiver side, and a feedback component conveys retransmission requests to the retransmission component at the sender side.]
Our classification of protocols with respect to their error control scheme is exclusively based on the most salient
features of the retransmission decision component. In particular, classification is made along
the two dimensions of location and decision base.
[Figure: Classification with respect to the retransmission decision component, made along the two dimensions of location (sender or receiver) and decision base (priority; metrics, i.e., packet loss (a statistic such as a point estimate, or a sliding window), time (an absolute or relative deadline, or a statistic) and number of retransmissions; and reliability classes (PR/UR or R/PR/UR), with a separate decision base for the PR class).]

Location. The retransmission decision component of an error control scheme is located at either the sender side or the receiver side. Depending on the location of the retransmission decision component, we distinguish between sender-based protocols and
receiver-based protocols.
Although an analytical study by Marasli et al. [23] suggests that sender-based protocols in some instances exhibit a slightly better throughput performance than receiver-based protocols, the latter indeed possess some attractive features. Receiver-based
protocols are generally more scalable and versatile. An example is the case in which
a server serves several clients, all with different reliability service requirements. In
the sender-based case, the responsibility of providing the correct service to all clients
is primarily the server's. In the receiver-based case, the responsibility has to a large
degree been distributed to each one of the clients. Furthermore, in the receiver-based
case, the sender is not involved in the retransmission decisions and is consequently
able to simultaneously serve several clients with different retransmission decision
components.
In some protocols, the retransmission decision is normally made by the receiver but may in exceptional cases be ignored by the sender. For example, in the CM protocol proposed
by Papadopoulos and Parulkar [25], the sender has a retransmission buffer whose size
approximately follows the size of the playout buffer at the receiver side. When a
packet has been sent, it is placed in the retransmission buffer and, if necessary, the
oldest packet in the retransmission buffer is discarded. When a retransmission request
arrives at the sender for a packet that has already been discarded from the retransmission buffer, the retransmission request is ignored by the sender. Although it could be
argued that protocols like the CM protocol should be classified as hybrid sender-based
receiver-based protocols, we classify them as receiver-based protocols because only
the retransmission decision at the receiver side is made with regard to the requirements imposed by the data stream, i.e., they use a decision base that is either metrics-based, priority-based or based on reliability classes.
Decision Base. The decision base comprises the metrics, rules and/or heuristics that form the
basis for the retransmission decisions made by a retransmission decision component.
This dimension has three main classes: priority, metrics and reliability classes.
Priority. In priority-based protocols, each packet is assigned a priority based on its
relative importance. Retransmission of lost packets is always made in priority
order, where high priority packets are sent before packets with lower priorities.
Often, protocols are not pure priority-based protocols but are a combination of
priority and deadline based. Apart from having a priority, packets in combined
priority- and deadline-based protocols must also meet a deadline. In these protocols, lost packets are typically retransmitted in priority order until they have been
successfully received or their deadline has expired. This means that high priority
packets have a better chance of being delivered since, in the event of packet loss,
they are more likely to be retransmitted.
Priority-based protocols are very lightweight and have therefore found their use
in the area of audio and video broadcasting. For example, the CUDP [32] transport protocol targets audio and video file servers and SR-RTP [15] targets video
file servers.
Metrics. Protocols that base their retransmission decision on direct or indirect measurements of one or several properties of a flow, e.g., packet loss rate or timeliness, are considered metrics-based protocols. We distinguish between three
classes of metrics-based protocols: protocols that base their retransmission on
some kind of estimate of the packet loss rate; protocols that make retransmissions as long as the timeliness of the data flow is not violated; and protocols that
indirectly consider the timeliness of the data flow by restricting the number of
retransmissions made on individual messages.
Packet Loss. To our knowledge, the existing protocols use one of two approaches to estimate the packet loss rate. The first approach involves using
a statistic, e.g., the arithmetic average or mean packet loss rate. Protocols
belonging to this class usually monitor the packet loss rate by continuously
calculating an estimate or weighted estimate of the mean packet loss rate.
The second approach involves calculating the average packet loss rate over a
fixed-length sequence of packets. Since the packet loss-based protocols using this approach typically keep track of the fixed-length sequence of packets through the use of a sliding window mechanism, this subclass of packet
loss-based protocols is called the sliding window class.
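The two estimation approaches can be sketched as follows; the class names and the alpha and window parameters are illustrative, not taken from any particular protocol:

```python
from collections import deque

class EwmaLossEstimate:
    """Statistic-based estimate: an exponentially weighted moving average of
    the packet loss rate, taken over the whole flow."""
    def __init__(self, alpha=0.1):
        self.alpha = alpha   # weight of the newest observation (assumed value)
        self.rate = 0.0
    def update(self, lost):
        # Blend the newest loss observation (1.0 or 0.0) into the estimate.
        self.rate = (1 - self.alpha) * self.rate + self.alpha * (1.0 if lost else 0.0)

class SlidingWindowLossEstimate:
    """Sliding-window estimate: the average loss rate computed over the last
    `window` packets only."""
    def __init__(self, window=100):
        self.history = deque(maxlen=window)  # drops the oldest entry automatically
    def update(self, lost):
        self.history.append(lost)
    @property
    def rate(self):
        return sum(self.history) / len(self.history) if self.history else 0.0
```

The sliding-window estimate forgets a loss completely once it falls out of the window, whereas the statistic-based estimate lets old losses decay gradually.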
A problem with pure packet loss-based protocols is their obliviousness to time, which makes them less than ideal for audio, video and other time-sensitive applications.
Time. Time-based protocols base their retransmission decision on a metric that is a function of time. The class of time-based protocols can be further divided into the subclasses: deadline-based and statistic-based protocols.
In deadline-based protocols, the retransmission decision component issues
a retransmission request for a lost packet provided the retransmitted packet
is likely to meet the deadline of the lost packet. Typically, deadline-based
protocols are used by streaming media applications, where it is important
that packets arrive at the receiver before their playout time. There are two
major classes of deadline-based protocols: absolute deadline-based and relative deadline-based protocols. Absolute deadline-based protocols refer to
protocols in which deadlines are calculated with respect to the beginning of
a flow while, in relative deadline-based protocols, the deadline of a packet
is calculated relative to preceding packets.
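The distinction between absolute and relative deadlines, together with the typical test of whether a retransmission can still be useful, can be sketched as follows (the function names and the one-RTT retransmission delay estimate are assumptions made for illustration):

```python
def absolute_deadline(flow_start, offset):
    """Absolute deadline: calculated with respect to the beginning of the
    flow, e.g. the playout time of a packet in an evenly paced stream."""
    return flow_start + offset

def relative_deadline(previous_packet_deadline, spacing):
    """Relative deadline: calculated relative to a preceding packet."""
    return previous_packet_deadline + spacing

def worth_retransmitting(now, rtt_estimate, deadline):
    """A retransmitted copy is only useful if it can arrive before the
    deadline; one RTT is used here as a rough retransmission delay."""
    return now + rtt_estimate <= deadline
```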
The class of statistic-based protocols comprises all time-based protocols that
use a metric that is a function of statistical estimates of one or several time
related performance metrics, e.g., mean latency.
As far as we know, there exist no purely time-statistic-based protocols. However, there is at least one protocol, SRP [27], that uses a metric involving
both packet loss and time. In contrast to deadline-based protocols, SRP is
able to make explicit trade-offs between reliability level and timeliness.
Number of Retransmissions. Unlike time-based protocols, protocols belonging to
this class have no notion of time. Instead, they impose timely delivery of
packets by limiting the number of times a packet can be retransmitted. More
specifically, a retransmission counter is associated with each packet. The
counter is decreased every time the packet is retransmitted. When the retransmission counter of a packet reaches zero, the packet is discarded.
There are very few protocols that adhere to this class. In fact, we have not
found a single protocol that has a decision base only involving the number
of times a packet has been retransmitted. On the other hand, there is a
reliability class-based protocol, k-XP [3, 22], in which one reliability class
has a decision base consisting of the number of times a packet has been
retransmitted.
The reason that there are so few protocols that base their retransmission decision on the number of times a packet has been retransmitted is of course
that this is a very imprecise metric; it only indirectly governs the packet loss
rate and the timeliness. However, in combination with reliability classes,
it has proven itself quite useful. In particular, using the number of times a
packet has been retransmitted as a decision base has been proven useful in
implementing a better-than-best-effort reliability class, similar to the controlled-load service class of IntServ [7].
4.1 PECC
To be precise, PECC (Partially Error-Controlled Connection) is an error control scheme, not
a protocol. Developed by Dempsey et al. [11, 12] as an extension to the multi-service transport protocol Xpress Transfer Protocol [36] (XTP), it was primarily intended for continuous
media applications.
The application atop PECC specifies its service requirements explicitly on a per-flow basis through four parameters: fifo_min, window_length, window_density and max_gap. The fifo_min parameter indicates the minimum number of contiguous bytes
that must be queued before the receiver is permitted to issue a retransmission request. In
other words, fifo_min should be an estimate of the number of bytes consumed by the application during one round trip time.

Protocol | Specification of Reliability Level | Granularity | Adaptiveness | Decision Base | Location
Slack ARQ | Implicit | Message | Per-Flow | Absolute Deadline | Receiver
PECC | Explicit | Message Group | Per-Flow | Absolute Deadline, Sliding Window | Receiver
k-XP | Explicit | Message | Per-Message | Reliability Classes, R/PR/UR (Number of Retransmissions) | Sender
POCv2 | Explicit | Message | Per-Message | Reliability Classes, R/PR/UR (Relative Deadline) | Receiver
AOEC | Explicit | Message Group | Per-Flow | Sliding Window | Receiver
CUDP | Implicit | Message Group | Per-Flow | Priority | Sender
VDP | Implicit | Message | Per-Flow | Reliability Classes, PR/UR (Absolute Deadline) | Receiver
Jacobs/Eleftheriadis | Implicit | Message | Per-Flow | Absolute Deadline | Receiver
TLTCP | Implicit | Message | Per-Message | Absolute Deadline | Sender
SRP | Explicit | Flow | Per-Flow | Point Estimate of Packet Loss and Time | Receiver
Papadopoulos/Parulkar | Implicit | Message | Per-Flow | Absolute Deadline | Receiver
SR-RTP | Implicit | Message | Per-Flow | Absolute Deadline, Priority | Receiver
HPF | Explicit | Message | Per-Flow | Reliability Classes, R/PR/UR (Absolute Deadline) | Sender
MSP | Implicit | Message | Per-Flow | Absolute Deadline | Receiver
XUDP | Explicit | Message | Per-Flow | Reliability Classes, R/PR/UR (Absolute Deadline) | Sender
PR-SCTP | Implicit | Message | Per-Message | Absolute Deadline | Sender
PRTP-ECN | Explicit | Flow | In-Flow | Point Estimate of Packet Loss | Receiver

Table 1: Classification of retransmission-based, partially reliable transport protocols in our taxonomy. Columns two to four describe the reliability service; the last two columns describe the retransmission decision. The decision bases of the partially reliable classes in the reliability class schemes are appended in parentheses to the names of the schemes.

The two parameters window_length and
window_density specify the loss tolerance of the application. They specify that no more than window_density bytes are permitted to be lost out of window_length bytes of application data. From a practical viewpoint, this means that a reliability service at a granularity of a message group is offered: setting window_length to a number of bytes equal to or less than the size of a message is not meaningful¹. Finally, the last parameter, max_gap, puts a limit on burst losses, giving an upper bound on how many contiguous bytes are permitted to be lost by the application.
Since PECC is intended mainly for continuous media applications, it assumes that the XTP receiver logically places received data in a FIFO buffer that is emptied at an approximately constant rate (isochronously). Consequently, the fifo_min parameter serves as an absolute deadline for the data received. Together with the parameters window_length and window_density, this makes PECC an absolute deadline and sliding window-based protocol.
In PECC, the retransmission decision takes place at the receiver side. Every time an out-of-sequence packet is received, this is taken as an indication of one or more packets being lost and results in the invocation of the retransmission decision component of PECC. If there are more than fifo_min bytes in the FIFO buffer, a retransmission request is issued for the presumably lost data. Otherwise, PECC tries to skip as much data as is needed to facilitate a retransmission, i.e., to make the depth of the FIFO buffer greater than fifo_min bytes. However, when this is not possible without violating the max_gap parameter or the sliding window determined by the window_length and window_density parameters, PECC reports a failure to the application and skips the data anyway.
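The receiver-side decision just described can be sketched roughly as follows; all names except fifo_min and max_gap are hypothetical, and the window_length/window_density sliding-window check is collapsed into a single skippable_bytes value:

```python
def pecc_on_gap(fifo_depth, fifo_min, skippable_bytes, gap_bytes, max_gap):
    """Rough sketch of the PECC retransmission decision described above.

    fifo_depth      -- bytes currently queued in the receiver FIFO
    fifo_min        -- minimum depth needed to ride out one round trip
    skippable_bytes -- bytes that could be skipped without violating the
                       window_length/window_density loss tolerance
    gap_bytes       -- length of the presumed loss burst
    max_gap         -- upper bound on contiguous lost bytes
    """
    if fifo_depth > fifo_min:
        # Enough buffered data to wait out a retransmission.
        return "retransmit"
    if gap_bytes <= max_gap and skippable_bytes >= gap_bytes:
        # Skip data to deepen the FIFO beyond fifo_min, then retransmit.
        return "skip-then-retransmit"
    # Loss tolerance would be violated: report a failure and skip anyway.
    return "report-failure-and-skip"
```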
4.2 POCv2
POCv2 was proposed by Conrad et al. [10] at the University of Delaware in an attempt to design a transport protocol better suited to distributed multimedia applications. The origins of POCv2 can be traced back to the POC (Partial Order Connection) protocol [1, 2], which was developed as a joint effort between the University of Delaware and
protocol, which was developed as a joint effort between the University of Delaware and
LAAS/ENSICA.
Apart from being a partially reliable transport protocol, POCv2 builds upon the notion
of partial order. It considers a flow as consisting of a partially ordered sequence of messages, where each message corresponds to exactly one media object (e.g., an audio clip or a
component of a video frame).
POCv2 is a reliability class-based protocol: the POCv2 application decomposes a flow into
messages, specifies the partial order of the messages and assigns each message to one of
three reliability classes, reliable, partially reliable or unreliable. Furthermore, during a flow,
POCv2 enables an application to alter the reliability class assigned to a particular message.
¹ Notably, if window_length is assigned a value equal to or less than the size of a message, an unreliable service is obtained when window_density is set to 0, and a completely reliable service is obtained for all values of window_density greater than 0.
The retransmission decisions in POCv2 are made by the receiver. Whether or not a request for retransmission of a message should be issued depends on the reliability class of the
message: for reliable messages, retransmission requests are issued until they are successfully
received; partially reliable messages are retransmitted as long as the receiver has some messages to deliver to the application. The receiver will stop issuing retransmission requests for
a partially reliable message when there are no more messages to deliver to the application.
Declaring the partially reliable message as lost will make some messages succeeding the lost
message (with respect to the partial order among the messages) deliverable. In other words,
the decision base of the partially reliable class of messages is relative deadline-based, where
the deadlines for the partially reliable messages are measured with respect to their preceding
messages in the partial order. Finally, no retransmission requests are issued for unreliable
messages.
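The per-class receiver decision just described can be sketched as follows (the function and parameter names are illustrative, not part of POCv2 itself):

```python
def pocv2_should_request_retransmission(reliability_class, deliverable_messages_left):
    """Sketch of the POCv2 receiver decision described above.

    reliability_class         -- "reliable", "partially-reliable" or "unreliable"
    deliverable_messages_left -- whether the receiver still holds messages it
                                 can deliver to the application while waiting
    """
    if reliability_class == "reliable":
        return True                       # request until successfully received
    if reliability_class == "partially-reliable":
        return deliverable_messages_left  # give up once delivery would stall
    return False                          # unreliable: never request
```

Declaring a partially reliable message lost as soon as nothing else is deliverable is what makes this decision base relative deadline-based: the deadline is set by the preceding messages in the partial order.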
4.3 SRP
Commonly, retransmission-based partially reliable transport protocols that consider both the
timing and the packet loss requirements of a multimedia streaming application, e.g., PECC,
do so by simply giving the timing requirements priority over the packet loss requirements;
lost packets are retransmitted as long as this can be done without violating the timing requirements. In contrast, the SRP (Selective Retransmission Protocol) proposed by
Piecuch et al. [27] not only strives to offer a service that complies with the timing and packet
loss requirements imposed by the streaming application but also strives to offer the service
that gives the optimal trade-off between the two requirements.
During an SRP session, the SRP client is responsible for receiving a multimedia stream
from the server and issuing retransmission requests if necessary. The retransmission decisions of the receiver are governed by the maximum tolerable transmission delay and the
maximum tolerable packet loss rate of the application. These performance requirements are
communicated to SRP by the application at the inception of a flow and are specified on a
per-flow basis.
The SRP receiver assumes that packets arrive at a constant rate. The receiver thus maintains an estimate of the arrival time of the next packet in a flow. A packet is considered lost
if it has not been received before its expected arrival time has elapsed.
The expected arrival time of a packet is calculated as the arrival time of the previous
packet plus the round trip timeout (RTO) value maintained by the SRP receiver. Since it is
vital for the performance of SRP that the RTO is accurate, the receiver updates the RTO by
sending time probe packets to the sender at regular intervals.
The SRP receiver implements two retransmission decision algorithms: Equal Loss Latency (ELL) and Optimum Quality (OQ). ELL and OQ are based on the notions of loss ratio
and delay ratio. The loss ratio, r_loss, is the ratio of the current packet loss rate to the maximum tolerable packet loss rate, and the delay ratio, r_delay, is the ratio of the current transmission delay to the maximum tolerable transmission delay. As an estimate of the current transmission delay, one-half of the RTO is used. When a packet loss is
detected, ELL decides whether the lost packet should be retransmitted on the basis of which of the two ratios, r_loss or r_delay, is the greater.
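The two ratios follow directly from the definitions above. The ELL comparison below is an assumed reading ("retransmit when the loss requirement is under more pressure than the delay requirement"); the exact ELL and OQ rules are given in the SRP paper [27] and are not reproduced here:

```python
def srp_ratios(current_loss_rate, max_loss_rate, rto, max_delay):
    """Loss and delay ratios as defined above; one-half of the RTO serves as
    the estimate of the current transmission delay."""
    r_loss = current_loss_rate / max_loss_rate
    r_delay = (rto / 2.0) / max_delay
    return r_loss, r_delay

def ell_retransmit(r_loss, r_delay):
    """ASSUMPTION: ELL is sketched as retransmitting when the loss ratio
    exceeds the delay ratio; see [27] for the exact rule."""
    return r_loss > r_delay
```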
4.4 HPF
The HPF (Heterogeneous Packet Flows) protocol was designed by Li et al. [21] to effectively
support heterogeneous packet flows: for example, MPEG flows with frames of different priority or multiplexed audio/video streams. The primary motivation for HPF was to demonstrate the feasibility of designing a transport protocol that provides mechanisms for flow and
congestion control on a per-flow basis and mechanisms for reliability, sequencing, framing
and prioritization on a per-message basis.
The application atop HPF partitions the data stream into messages, e.g., MPEG frames,
and specifies the service requirements at the initiation of a flow on a per-message basis. In
particular, HPF enables an application to assign a message to one of the three reliability
classes: reliable, unreliable and unreliable delay-bounded. The application messages are
then treated as single entities by HPF, i.e., all packets belonging to the same application
message are assigned the same reliability class as the message.
HPF is a sender-based protocol, i.e., the retransmission decisions are made by the sender.
The retransmission policy for reliable and unreliable packets is simple: a reliable packet
is retransmitted until it has been successfully received, while an unreliable packet is never
retransmitted. The retransmission policy for unreliable delay-bounded packets is somewhat
more complex. Specifically, all delay-bounded packets are assigned a deadline based on the
transmission rate. A packet is only retransmitted if the estimated round trip time suggests that
the packet can be retransmitted and still meet its deadline. In other words, delay-bounded
packets use an absolute deadline-based decision base.
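A rough sketch of the sender-side policy just described (the function and parameter names are illustrative, not HPF's own):

```python
def hpf_should_retransmit(reliability_class, now, deadline, rtt_estimate):
    """Sketch of the HPF sender-side retransmission policy described above."""
    if reliability_class == "reliable":
        return True          # retransmit until acknowledged
    if reliability_class == "unreliable":
        return False         # never retransmit
    # Unreliable delay-bounded: retransmit only if the copy can still meet
    # the packet's deadline according to the round trip time estimate.
    return now + rtt_estimate <= deadline
```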
4.5 PR-SCTP
To address the shortcomings and limitations of TCP and UDP for the transport of telephony signaling messages, the SIGTRAN (Signaling Transport) working group of the IETF developed SCTP [34].
SCTP provides a message-based, reliable, and ordered transport service to an application. It accomplishes this by fragmenting the application messages into so-called chunks and
assigning a Transmission Sequence Number (TSN) to each chunk. In the same way as in
TCP, SCTP employs a cumulative acknowledgement mechanism: data chunks received are
acknowledged by informing the sender of the TSN of the next expected chunk.
While the development of SCTP was directly motivated by the transfer of the SS7 (Signaling System Number 7) signaling protocol to IP, SIGTRAN ensured that the design of
SCTP was general enough for the protocol to be suitable for applications with similar requirements. As a result of this design decision, a number of extensions to SCTP have been
proposed [33, 37]. One of the most recent is PR-SCTP [33].
PR-SCTP is SCTP extended with a framework for implementing partially reliable transport services. It entails adding two new items to SCTP: a new parameter and a new type of
chunk. The parameter introduced by PR-SCTP is used during the initialization of an SCTP
session by both sides to signal support for PR-SCTP to the other side. A PR-SCTP session
can only be initiated if both sides use PR-SCTP. If either the sending or the receiving side is
not using PR-SCTP, the other side has the option of ending the session or starting an SCTP
session instead. The new type of chunk introduced by PR-SCTP, Forward Cumulative TSN
(FCTSN), is used by the sender to inform the receiver that it should consider all chunks
having a TSN less than a certain value as having been received.
At present, only one service based on PR-SCTP has been proposed: timed reliability. When an application uses this service, it assigns deadlines to its messages. In contrast to
HPF, the deadlines are assigned continuously during the lifetime of a flow, and not only at
the flow inception. These deadlines are translated into chunk lifetimes by PR-SCTP. Before
a packet is transmitted or retransmitted, the PR-SCTP sender evaluates the lifetime of the
packet. When the lifetime of a packet has expired, it is discarded, and the sender informs the
receiver of this by sending it an FCTSN.
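The timed-reliability behaviour at the sender can be sketched as follows; the data layout is hypothetical, but the rule follows the description above: expired chunks are skipped and the Forward Cumulative TSN is advanced past them in TSN order:

```python
def advance_fwd_cum_tsn(unacked_chunks, now, cum_tsn):
    """Sketch of the timed-reliability rule described above: before
    (re)transmission, chunks whose lifetime has expired are dropped and the
    Forward Cumulative TSN is advanced past them.

    unacked_chunks -- dict mapping TSN -> expiry time (hypothetical layout)
    cum_tsn        -- highest TSN currently covered by cumulative acks
    """
    fwd = cum_tsn
    # TSNs can only be advanced contiguously; stop at the first chunk that
    # is out of order or has not yet expired.
    for tsn in sorted(unacked_chunks):
        if tsn == fwd + 1 and unacked_chunks[tsn] < now:
            fwd = tsn
        else:
            break
    return fwd  # if fwd > cum_tsn, an FCTSN chunk announcing fwd is sent
```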
4.6 PRTP-ECN
PRTP-ECN is an extension to TCP suggested by Grinnemo et al. [17, 18]. It aims to make
TCP better suited to applications with soft real-time constraints (e.g., best effort multimedia
applications). In particular, PRTP-ECN is an attempt to make this traditionally congestion-insensitive class of applications aware of congestion. An attractive feature of PRTP-ECN is
that it only entails modifying the retransmission decision component of TCP on the receiver
side; the sender side remains unaffected.
PRTP-ECN offers a flow-based reliability service. As long as no packets are lost in
a flow, PRTP-ECN behaves in the same way as standard TCP. When an out-of-sequence
packet is received, however, this is taken as an indication of packet loss, and the modified
retransmission decision component is invoked. This component decides, on the basis of the
success rate of all previous packets in a flow, whether to acknowledge all packets up to and
including the out-of-sequence packet or to do the same as standard TCP, i.e., acknowledge
the last successfully received in-sequence packet and wait for a retransmission.
The success rate of previous packets, called the current reliability level (crl), is calculated
as an exponentially weighted moving average over all packets up to but not including the out-of-sequence packet. It is calculated as

    crl(n) = ( Σ_{k=1}^{n} af^(n−k) p_k b_k ) / ( Σ_{k=1}^{n} af^(n−k) b_k ),    (3)

where n is the sequence number of the packet preceding the out-of-sequence packet, af is the weight or aging factor, and b_k denotes the number of bytes contained in packet k. The variable p_k is given a binary value: if the kth packet was successfully received, then p_k = 1, otherwise p_k = 0.
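Equation (3) translates directly into code; the list-of-pairs representation of the packet history below is an illustrative choice:

```python
def crl(af, packets):
    """Current reliability level as defined in equation (3): an exponentially
    weighted moving average over all packets up to, but not including, the
    out-of-sequence packet.

    af      -- aging factor (the weight)
    packets -- list of (p_k, b_k) pairs for k = 1..n: reception flag
               (1 or 0) and packet size in bytes
    """
    n = len(packets)
    num = sum(af ** (n - k) * p * b for k, (p, b) in enumerate(packets, start=1))
    den = sum(af ** (n - k) * b for k, (p, b) in enumerate(packets, start=1))
    return num / den
```

With af < 1, a recent loss depresses crl(n) more than an old one, which is what makes newly arrived packets weigh more heavily in the reliability estimate.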
An application communicates its lower bound reliability level through the aging factor
and a second parameter called the required reliability level (rrl). This parameter functions
as a target value. As long as crl(n) ≥ rrl, lost packets are acknowledged. However, if an out-of-sequence packet is received at a time when crl(n) is below rrl, the last in-sequence packet is acknowledged, forcing the sender to retransmit the lost packet. It should be noted
that an application is permitted to alter either af or rrl at any time during the lifetime of a
flow, i.e., PRTP-ECN is an example of an in-flow adaptive protocol.
Although PRTP-ECN is a flow-based protocol, it has some features in common with
message group-based protocols. In particular, unlike most flow-based protocols, PRTP-ECN does not consider all messages equally important. Instead, the aging
factor makes PRTP-ECN consider newly arrived messages to be more important than messages that arrived some time ago. Furthermore, to some degree, PRTP-ECN provides for an
application to control the maximum tolerable message burst loss length: more or less the
same reliability level is given by several different combinations of af and rrl. However,
the combinations differ from each other in that they translate to different upper limits on
message burst loss.
In order to decouple the error control and congestion control schemes of TCP, PRTP-ECN
uses the transport level flags of ECN [30] (Explicit Congestion Notification). In particular,
when lost packets are acknowledged, PRTP-ECN signals congestion to the sender side by
setting the explicit congestion notification flag in the acknowledgement packet. As a result,
when the sender receives an acknowledgement of a lost packet, it will act as though congestion has occurred, e.g., it will reduce its congestion window. However, it will not re-send
the lost packet.
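The receiver-side decision can be summarized in a few lines of Python (a simplified sketch; the Ack type and function names are our own, not from any PRTP-ECN implementation):

```python
from dataclasses import dataclass

@dataclass
class Ack:
    ack_number: int   # highest sequence number being acknowledged
    ece_flag: bool    # ECN-Echo flag: signals congestion to the sender

def on_out_of_sequence(crl_n, rrl, last_in_seq, out_of_seq):
    """PRTP-ECN's choice when an out-of-sequence packet reveals a loss.
    If the current reliability level is still at or above rrl, the hole
    is accepted: everything up to and including the out-of-sequence
    packet is acknowledged, with the ECN-Echo flag set so the sender
    reduces its congestion window without retransmitting. Otherwise the
    last in-sequence packet is acknowledged, forcing a retransmission."""
    if crl_n >= rrl:
        return Ack(ack_number=out_of_seq, ece_flag=True)
    return Ack(ack_number=last_in_seq, ece_flag=False)
```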
Concluding Remarks
This paper presents a taxonomy for retransmission-based, partially reliable transport protocols. The taxonomy comprises two classification schemes. The first classification scheme
classifies protocols with respect to the reliability service they offer. In this scheme, protocols are classified along three dimensions: specification of reliability level, granularity and
adaptiveness. The second classification scheme classifies protocols with respect to their retransmission decision component. It comprises two dimensions: location and decision base.
The paper also shows how existing protocols are classified according to this taxonomy and
gives a survey of a subset of the classified protocols. The surveyed subset of protocols is
selected such that it covers the majority of reliability services and error control schemes in
the taxonomy.
The taxonomy suggests that the majority of retransmission-based, partially reliable transport protocols use error control schemes that are simply variations of a relatively small set
of core principles. Specifically, most protocols make their retransmission decision on the
basis of one or several of the following core principles: an estimate of the average packet
loss rate, an absolute deadline, an upper bound on the number of retransmissions, priorities, or reliability classes. Notably, none of the core principles of the error control schemes explicitly involves delay jitter, a performance parameter that, for many multimedia
applications, is at least as important as the delay itself. Instead, delay jitter is almost always
indirectly controlled through buffers at the receiver side. Furthermore, the taxonomy and
survey suggest that, in a large number of protocols, the service interface is closely coupled
to the implementation of the error control scheme; in fact, many times the service interface
and implementation are intertwined. Taken together, these observations suggest that future
work on this class of protocols should consider including delay jitter as a parameter in the
error control scheme and strive for more generic and application-oriented service interfaces
that are decoupled from the implementation.
In summary, this paper presents a taxonomy for retransmission-based, partially reliable
transport protocols that gives a unified terminology and a framework for comparison and
evaluation of this class of protocols. In addition, we believe that the insight provided by the
taxonomy and survey in this paper can be used to guide future research in this area.
References
[1] P. D. Amer, C. Chassot, T. Connolly, and M. Diaz. Partial order quality of service to
support multimedia connections: Reliable channels. In 2nd High Performance Distributed Computing Conference, Spokane, Washington, USA, July 1993.
[2] P. D. Amer, C. Chassot, T. Connolly, and M. Diaz. Partial order quality of service
to support multimedia connections: Unreliable channels. In International Networking
Conference (INET), San Francisco, California, USA, August 1993.
[3] P. D. Amer, P. T. Conrad, E. Golden, S. Iren, and A. Caro. Partially-ordered, partially-reliable transport service for multimedia applications. In Advanced Telecommunications/Information Distribution Research Program (ATIRP) Conference, pages 215–220, College Park, Maryland, USA, January 1997.
[4] M. J. Andrews. XUDP: A real-time multimedia networking protocol. Bachelor thesis,
Worcester Polytechnic Institute, March 1997.
[5] G. Barberis and D. Pazzaglia. Analysis and design of a packet-voice receiver. IEEE
Transactions on Communications, 28(2):152–156, February 1981.
[6] V. Bhargava. Forward error correction schemes for digital communications. IEEE
Communications Magazine, 21:11–19, January 1983.
[7] R. Braden, D. Clark, and S. Shenker. Integrated services in the internet architecture.
RFC 1633, IETF, June 1994.
[8] Z. Chen, S-M. Tan, R. H. Campbell, and Y. Li. Real time video and audio in the
world wide web. In 4th International World Wide Web Conference (WWW), Boston,
Massachusetts, USA, December 1995.
[9] D. Clark and D. Tennenhouse. Architectural considerations for a new generation of protocols. ACM Computer Communication Review (SIGCOMM), pages 200–208, September 1990.
[10] P. T. Conrad, E. Golden, P. D. Amer, and R. Marasli. A multimedia document retrieval
system using partially-ordered/partially-reliable transport service. In Multimedia Computing and Networking, San Jose, California, USA, January 1996.
[24] B. Mukherjee and T. Brecht. Time-lined TCP for the TCP-friendly delivery of streaming media. In IEEE International Conference on Network Protocols (ICNP), pages 165–176, Osaka, Japan, November 2000.
[25] C. Papadopoulos. Error Control for Continuous Media and Large Scale Multicast
Applications. PhD thesis, Washington University, August 1999.
[26] C. Papadopoulos and G. Parulkar. Retransmission-based error control for continuous
media applications. In 6th International Workshop on Network and Operating System
Support for Digital Audio and Video (NOSSDAV), pages 5–12, Zushi, Japan, April
1996.
[27] M. Piecuch, K. French, G. Oprica, and M. Claypool. A selective retransmission protocol for multimedia on the Internet. In SPIE Multimedia Systems and Applications,
Boston, Massachusetts, USA, November 2000.
[28] J. Postel. User datagram protocol. RFC 768, IETF, August 1980.
[29] J. Postel. Transmission control protocol. RFC 793, IETF, September 1981.
[30] K. Ramakrishnan and S. Floyd. A proposal to add explicit congestion notification
(ECN) to IP. RFC 2481, IETF, January 1999.
[31] H. Schulzrinne. RTP: A transport protocol for real-time applications. RFC 1889, IETF,
January 1996.
[32] B. C. Smith. Cyclic-UDP: A priority-driven best effort protocol. Unpublished, May
1994.
[33] R. Stewart, M. Ramalho, Q. Xie, M. Tuexen, and P. T. Conrad. SCTP partial reliability
extension. Internet draft, IETF, May 2002. Work in Progress.
[34] R. Stewart, Q. Xie, K. Morneault, C. Sharp, H. Schwarzbauer, T. Taylor, I. Rytina,
M. Kalla, L. Zhang, and V. Paxson. Stream control transmission protocol. RFC 2960,
IETF, October 2000.
[35] J. Stone and C. Partridge. When the CRC and TCP checksums disagree. In ACM
Computer Communication Review (SIGCOMM), pages 309–319, Stockholm, Sweden,
August 2000.
[36] W. Strayer, B. Dempsey, and A. Weaver. XTP: The Xpress Transfer Protocol. Addison-Wesley Publishing, July 1992.
[37] Q. Xie, R. Stewart, C. Sharp, and I. Rytina. SCTP unreliable data mode extension.
Internet draft, IETF, April 2001. Work in Progress.
Paper II
1 Introduction
Traditionally, the Internet has been used by two kinds of applications: applications that require a completely reliable service, such as file transfers, remote login, and electronic mail,
and applications for which a best effort service suffices, such as network management and
name services. Over the course of the last decade, however, a large number of new applications have become part of the Internet's development and have started to erode this traditional
picture. In particular, we have experienced a growing interest in distributed applications with
soft real-time requirements, e.g., best effort multimedia applications.
While the two standard transport protocols, TCP [26] and UDP [25], successfully provide
transport services to traditional Internet applications, they fail to provide adequate transport
services to applications with soft real-time requirements. In particular, TCP provides a completely reliable transport service and gives priority to reliability over timeliness. While soft real-time applications commonly require timely delivery, they can do with a less than completely reliable transport service. In contrast, UDP provides a best effort service and makes
no attempt to recover from packet loss. Nevertheless, despite the fact that soft real-time
applications often tolerate packet loss, the amount that is tolerated is limited.
To better meet the needs of soft real-time applications in terms of timely delivery, a
new class of transport protocols has been proposed: retransmission-based, partially reliable
transport protocols. These protocols exploit the fact that many soft real-time applications
do not require a completely reliable transport service and therefore trade some reliability for
improved throughput and jitter.
Early work on retransmission-based, partially reliable transport protocols was done by
Dempsey [9], and Papadopoulos and Parulkar [24]. Independently of each other, they demonstrated the feasibility of using a retransmission-based partially reliable transport protocol for
multimedia communication. Further extensive work on retransmission-based partially reliable transport protocols was done by Diaz et al. [10] at LAAS-CNRS and by Amer et
al. [2] at the University of Delaware. Their work resulted in the proposal of the POC (Partial Order Connection) protocol, which has also been suggested as an extension to TCP [7].
Examples of more recent work on retransmission-based partially reliable transport protocols
are TLTCP [21] (Time-Lined TCP) and PR-SCTP [28] (Partially Reliable extension to the
Stream Control Transmission Protocol).
This paper presents PRTP, an extension for partial reliability to TCP. PRTP differs from
many other proposed extensions to TCP, such as POC, in that it does not involve any elaborate changes to standard TCP. In fact, PRTP only involves a change in the retransmission
decision scheme of standard TCP: lost packets are acknowledged provided the cumulative
packet reception success rate is always kept above the minimum guaranteed reliability level
communicated to PRTP by the application. Another attractive feature of PRTP is that it
only has to be implemented on the receiver side. Neither the sender nor any intermediary
network equipment, e.g., routers, need be aware of PRTP. Thus, PRTP provides for gradual
deployment.
This paper also presents a simulation-based performance evaluation of PRTP. The performance evaluation comprised two subactivities. Our simulation model of PRTP was first validated against a prototype of PRTP implemented in Linux.
Second, we made a performance analysis of PRTP. The performance analysis involved studying the stationary and the transient behavior of PRTP, i.e., the behavior of PRTP in long- as
well as short-lived connections.
The paper is organized as follows. Section 2 elaborates on the design of PRTP. Sections 3
and 4 discuss the implementation and validation of the PRTP simulation model. The performance analysis of PRTP is presented in Sections 5 and 6. Section 5 considers the stationary
performance of PRTP, while Section 6 considers its transient behavior. Section 7 gives a
brief summary and some concluding remarks.
2 Protocol Design
As mentioned above, PRTP is an extension for partial reliability to TCP that aims at making TCP better suited for applications with soft real-time constraints, e.g., best-effort multimedia applications. In particular, the main intent in PRTP is to trade some of the reliability offered by TCP for improved throughput and jitter.

[Figure 1: Timeline illustrating the operation of PRTP. The sender side runs standard TCP; the receiver side runs PRTP. When an out-of-sequence packet reveals a packet loss (1), PRTP either acknowledges all packets up to and including the out-of-sequence packet, provided crl ≥ rrl (2), or acknowledges the last in-sequence packet, forcing a retransmission (3).]
An attractive feature of PRTP is that it only involves changing the retransmission decision
scheme of standard TCP; the rest of the implementation of standard TCP is left as is. This
makes PRTP very easy to implement and compatible with existing TCP implementations. In
addition, PRTP only needs to be implemented on the receiver side. Neither the sender side
nor any intermediate network equipment such as routers are affected by PRTP.
PRTP allows an application to specify a reliability level between 0% and 100%. The
application is then guaranteed that this reliability level will be maintained until a new reliability level is specified or until the session is terminated. More precisely, the retransmission
decision scheme of PRTP is parameterized. The application atop PRTP explicitly specifies
a minimum acceptable reliability level by setting the parameters of the PRTP retransmission decision scheme. Implicitly, the parameters govern the trade-off between reliability,
throughput, and jitter. By relaxing the reliability, the application can receive less jitter and
better throughput.
The timeline in Figure 1 illustrates how PRTP works. As long as no packets are lost,
PRTP works in the same way as standard TCP. When an out-of-sequence packet is received
(1), however, this is taken as an indication of packet loss, and the PRTP retransmission
decision scheme is invoked. This scheme decides whether the lost packet should be retransmitted on the basis of the rate of successfully received packets at the time of the reception
of the out-of-sequence packet. To be more precise, this scheme calculates an exponentially
weighted moving average over all packets, lost and received, up to but not including the
out-of-sequence packet. This weighted moving average is called the current reliability level,
crl(n), and is defined as
crl(n) = \frac{\sum_{k=1}^{n} af^{\,n-k}\, p_k\, b_k}{\sum_{k=1}^{n} af^{\,n-k}\, b_k}.    (1)
In Equation 1, n is the sequence number of the packet preceding the out-of-sequence packet,
af is the weight or aging factor, and b_k is the number of bytes contained in packet k. The variable p_k is an indicator variable that only takes a value of 1 or 0: if the kth packet was successfully received, then p_k = 1; otherwise p_k = 0.
An application communicates its lower bound reliability level through the aging factor
and a second parameter called the required reliability level. The required reliability level,
rrl, acts as a reference value. As long as crl(n) ≥ rrl, lost packets need not be retransmitted and are therefore acknowledged (2). If an out-of-sequence packet is received and crl(n) is below rrl, PRTP acknowledges the last in-sequence packet, and waits for a retransmission
(3). In the remainder of this text, a PRTP protocol that has been assigned fixed values for af
and rrl is called a PRTP configuration.
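As an illustration, the computation of crl(n) and the resulting retransmission decision can be sketched as follows (a simplified model of Equation 1; the function names and the packet representation are our own, not taken from the PRTP implementation):

```python
def crl(af, packets):
    """Current reliability level (Equation 1): an exponentially weighted
    moving average over all packets up to, but not including, the
    out-of-sequence packet. `packets` holds (p_k, b_k) pairs, where p_k
    is 1 if packet k was successfully received and 0 otherwise, and b_k
    is the number of bytes in packet k."""
    n = len(packets)
    num = sum(af ** (n - k) * p * b for k, (p, b) in enumerate(packets, 1))
    den = sum(af ** (n - k) * b for k, (p, b) in enumerate(packets, 1))
    return num / den

def should_retransmit(af, rrl, packets):
    """A PRTP configuration (fixed af and rrl) asks for a retransmission
    only when crl(n) has dropped below the required reliability level."""
    return crl(af, packets) < rrl
```

For example, with af = 1.0 and one loss among four equal-sized packets, crl(n) is 0.75, so a configuration with rrl = 0.9 would force a retransmission while one with rrl = 0.7 would not.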
3 The PRTP Simulation Model

All simulations of PRTP in this paper were made with version 2.1b5 of the ns-2 [23] network
simulator. In ns-2, PRTP is implemented as an agent derived from the FullTcp class. In
particular, the PRTP agent only modifies the retransmission decision scheme of FullTcp
in the way detailed in Section 2. Besides this modification, the PRTP agent fully supports
the TCP Reno [11] congestion control mechanisms. Most notably, fast retransmit and fast
recovery are supported.
The FullTcp class in ns-2 strives to correctly model the 4.4 BSD implementation of
TCP [20]. However, in version 2.1b5 of ns-2, the FullTcp class still lacks support for some
of the congestion control mechanisms of 4.4 BSD, specifically for selective acknowledgements (SACK) [19] and timestamps [17]. A consequence of this is that our PRTP simulation
model also lacks support for these mechanisms.
4 Validation of the Simulation Model

This section considers the validation of our PRTP simulation model against a prototype of
PRTP implemented in Linux. Subsection 4.1 summarizes the validation setup and methodology, and Subsection 4.2 discusses the results of the validation.
[Figure 2: (a) The experiment testbed: nodes 1, 2, and 3 on a 10 Mbps Ethernet LAN with 0 ms propagation delay; node 3 runs the receiver application, and node 2 runs NIST Net with a uniform error model. (b) The corresponding ns-2 model.]
4.1 Methodology
The validation of our ns-2 simulation model was done using the experiment testbed depicted
in Figure 2(a). Nodes 1, 2, and 3 in Figure 2(a) denote three 233 MHz Pentium II PCs. Nodes
1 and 2 ran unmodified versions of Linux 2.2.14, while node 3 ran our PRTP prototype. All
three PCs were connected to a 10 Mbps Ethernet LAN. Traffic was introduced between nodes 1 and
3 by a constant bitrate traffic generator residing at node 1 that sent bulk data to a PRTP sink
application at node 3. The traffic between nodes 1 and 3 was routed through node 2 on which
NIST Net [22], a network emulation tool, was running. NIST Net was introduced to make it
possible for us to vary the propagation delay and packet loss frequency on the link between
nodes 1 and 3.
Our PRTP prototype [3] is derived from Linux 2.2.14, whose TCP implementation supports both the SACK and the timestamp options. Therefore, in order to mitigate the differences between our PRTP simulation model and our prototype implementation, neither of
these two options was turned on during the validation.
Figure 2(b) illustrates how the experiment testbed in Figure 2(a) was modeled in ns-2.
4.2 Results
The validation of the PRTP simulation model was confined to only one metric: the transfer
time. Figures 3, 4, 5, and 6 show the results of the simulations and experiments in which af was equal to 0.9 (Figures 3 and 4) and 1.0 (Figures 5 and 6). In each graph, the time it took to transfer 5 Mbytes of data is plotted against rrl. The 95% confidence interval for each transfer time is shown.
As follows from Figures 3, 4, 5, and 6, there was a close correspondence between the
transfer times observed in the experiments and the simulations. Still, the results of the experiments and the simulations differed in two important ways. In scenarios where the propagation delay was 125 ms and rrl was below 90%, the simulations tended to predict shorter
transfer times than were measured in the experiments. In the scenarios where rrl was higher
than 90%, the situation was the reverse: the transfer times observed in the simulations were
in most cases longer than those obtained in the experiments.
The first difference between the experiments and the simulations was an effect of our
PRTP prototype being hampered by more spurious timeouts than our simulation model.
This, in turn, seemed to be an effect of the retransmission timeout being calculated more
aggressively in Linux 2.2.14 than in ns-2, and consequently more aggressively in our PRTP
prototype than our simulation model. This also explains why the difference in transfer times
was marginal when the propagation delay was short (50 ms) and became apparent only when
the propagation delay was longer (125 ms). When the propagation delay was short, the time
it took to recover from a retransmission timeout was also very short. Together with the fact
that the number of retransmission timeouts occurring at low values of rrl was very small,
this made the fraction of time spent on recovering from timeouts almost negligible compared
with the total transfer time. However, when the propagation delay was long, each timeout
recovery took a substantial amount of time. Therefore, even though the number of timeouts
was small, the duration of each timeout recovery was long enough to make the total fraction
of the transfer time spent on recovering from timeouts non-negligible.
Figure 3: Transfer time against rrl for an aging factor of 0.9 and a propagation delay of 50 ms.

Figure 4: Transfer time against rrl for an aging factor of 0.9 and a propagation delay of 125 ms.

Figure 5: Transfer time against rrl for an aging factor of 1.0 and a propagation delay of 50 ms.

Figure 6: Transfer time against rrl for an aging factor of 1.0 and a propagation delay of 125 ms.

As mentioned earlier, the implementation of TCP in our PRTP simulation model and prototype differed in some important ways. Despite the fact that we had tried to make them behave as similarly as possible during the validation, some differences remained. In particular,
the TCP implementation of Linux 2.2.14 implemented the TCP NewReno [16] modification
to TCP Reno's fast recovery, which was found to be the reason for the second difference
between the simulations and the experiments.
TCP NewReno made our PRTP prototype more robust toward multiple packet losses
within a single sender window. This difference was of no importance at low rrl values
since PRTP then was quite tolerant to packet loss anyway. However, at rrl values greater
than 90%, PRTP more or less required that every lost packet was retransmitted, i.e., PRTP
worked almost in the same way as TCP. Thus, in those scenarios, our PRTP prototype reacted
to multiple packet losses in a TCP NewReno fashion, while our PRTP simulation model
reacted in the same way as TCP Reno. Consequently, when the packet loss rate was low
(1%), and therefore the chances of experiencing multiple packet losses were marginal, there
was no significant difference between our PRTP prototype and simulation model. At higher
packet loss rates (3% and 5%), however, the number of multiple packet losses was large
enough to make the transfer times obtained with the PRTP prototype significantly shorter
than in the simulations.
5 Stationary Analysis
The stationary analysis of PRTP studied the performance of PRTP in long-lived connections.
Three performance metrics were studied: average interarrival jitter, average throughput, and
average fairness. Average fairness was measured using Jain's fairness index [18]. In particular, the average fairness for n flows, each one acquiring an average bandwidth, b_i, on a given link, was calculated as
\text{Fairness index} \overset{\text{def}}{=} \frac{\left(\sum_{i=1}^{n} b_i\right)^2}{n \sum_{i=1}^{n} b_i^2}.    (2)
A problem with Jain's fairness index is that it essentially considers all protocols with better link utilization than TCP more or less unfair [4]. To address this problem, Jain's fairness
index was complemented with the TCP-friendliness test proposed by Floyd et al. [12]. According to this test, a flow is TCP-friendly provided its arrival rate does not exceed the arrival
rate of a conformant TCP connection under the same circumstances. Specifically, it means
that the following inequality must hold between the arrival rate of a flow, λ, the packet size, s, the minimum round-trip time, RTT, and the experienced packet loss rate, p_loss:

\lambda \le \frac{1.5\,\sqrt{2/3}\; s}{RTT\,\sqrt{p_{loss}}}.    (3)
Compared to Jain's fairness index, the main advantage of the TCP-friendliness test is that it accepts a certain skewness in the bandwidth allocation between competing flows. In particular, a flow is permitted to use more bandwidth than dictated by Jain's fairness index
provided it does not use more bandwidth than the theoretically most aggressive TCP flow
would have in the same situation.
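For concreteness, the two fairness measures can be computed as follows (a straightforward transcription of Equations 2 and 3; the function names are our own):

```python
from math import sqrt

def jain_fairness(bandwidths):
    """Jain's fairness index (Equation 2): equals 1.0 for a perfectly
    even allocation and tends toward 1/n as a single flow grabs all the
    bandwidth."""
    n = len(bandwidths)
    return sum(bandwidths) ** 2 / (n * sum(b * b for b in bandwidths))

def tcp_friendly_limit(packet_size, rtt, p_loss):
    """Upper bound on a TCP-friendly arrival rate (Equation 3):
    1.5 * sqrt(2/3) * s / (RTT * sqrt(p_loss))."""
    return 1.5 * sqrt(2 / 3) * packet_size / (rtt * sqrt(p_loss))
```

Note how the bound in Equation 3 falls with the square root of the packet loss rate: quadrupling p_loss halves the arrival rate a flow may use while still being considered TCP-friendly.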
[Figure 7: The network topology used in the stationary analysis: nodes 1-8 connected through two routers; the links have a bandwidth of 10 Mbps and a propagation delay of 0 ms, and the router buffer size is 25 segments.]
5.2 Results
The graphs in Figures 8, 9, and 10 show how the three performance metrics studied, average
interarrival jitter, average throughput, and average fairness, varied with the protocol used at
node 4 and the router link traffic load, i.e., the two primary factors. In each graph, the sample
mean of the observations obtained for a performance metric in the 40 runs comprising a
simulation was taken as a point estimate for the performance metric in that simulation.
Figure 8 illustrates how the average interarrival jitter of the seven PRTP configurations
varied with the traffic load. The traffic load is on the horizontal axis, and the average interarrival jitter relative to the average interarrival jitter obtained with TCP is on the vertical
axis.
As is evident from Figure 8, the largest reductions in average interarrival jitter with PRTP
compared to TCP were obtained at low traffic loads and with the PRTP configurations that
had the largest fs values. Specifically, both PRTP-14 and PRTP-20 gave a reduction in
average interarrival jitter of more than 150% when the traffic load was 20% (ca. 1% packet loss
rate). This was of course not surprising: when the traffic load was low, i.e., when the packet
loss rate was low, those PRTP configurations tolerating large packet loss rates almost never
had to make retransmissions of lost packets. Consequently, the interarrival jitter experienced
with these PRTP configurations at low traffic loads was primarily caused by variations in the
queueing delays at the two routers, and was therefore very low.
However, even though the largest reductions in average interarrival jitter at low traffic
loads were obtained with the PRTP configurations that had largest packet loss tolerances,
significant reductions were also obtained with PRTP configurations with relatively low fs
values. In particular, Figure 8 shows that, at a traffic load of 20%, PRTP-5 gave a reduction
in average interarrival jitter of 106%, and PRTP-8 gave a reduction in average interarrival
jitter of 137%. In fact, even PRTP-2 gave a non-negligible reduction in average interarrival
jitter when the traffic load was low: at a traffic load of 20%, PRTP-2 gave a reduction in
average interarrival jitter of 59%, and, at a traffic load of 60% (ca. 2% packet loss rate), the
reduction was 29%.
At high traffic loads, the largest reductions in average interarrival jitter were also obtained
with those PRTP configurations tolerating the largest packet loss rates: at a traffic load of
87% (ca. 8% packet loss rate), PRTP-14 gave a reduction in average interarrival jitter of
70%, and PRTP-20 gave a reduction of 94%. However, PRTP-8 and PRTP-11 also gave
substantial reductions in average interarrival jitter at high traffic loads. In fact, as shown in
Figure 8, these two PRTP configurations gave a substantial reduction in average interarrival
jitter for all seven traffic loads studied, from the lowest to the highest traffic load. Most
notably, they both reduced the average interarrival jitter compared to TCP at the two highest
traffic loads, 93% (ca. 14% packet loss rate) and 97% (ca. 20% packet loss rate), by more
than 30%.
Another noteworthy observation is the increase in the reduction of the average interarrival
jitter obtained with PRTP configurations PRTP-8, PRTP-11, PRTP-14, and PRTP-20, as
compared to TCP, when the traffic load increased from 93% to 97%. However, this was
not because these PRTP configurations actually reduced their average interarrival jitter when
the traffic load was increased. Instead, it was caused by a large increase in the average
interarrival jitter for TCP, PRTP-2, PRTP-3, and PRTP-5 when the traffic load increased
from 93% to 97%, i.e., when the packet loss rate reached approximately 20%. This in turn
seemed to be an effect of loss of retransmitted packets at times when the sending window
was only two packets, i.e., at times when the loss of packets unconditionally led to timeouts.
Specifically what happened at those times, was that one of the two packets in the sender
window was dropped, which, apart from leading to a timeout and a retransmission of the
dropped packet, also led to a doubling of the RTO (retransmission timeout). Then, when the
retransmitted packet was also dropped, it took twice the time of the first retransmission until
a second attempt to retransmit the dropped packet was made. PRTP configurations PRTP-8,
PRTP-11, PRTP-14, and PRTP-20 never found themselves in the same predicaments. This
was primarily because they managed to keep their sender windows large enough to enable
packet loss recovery through fast retransmit and fast recovery.
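The penalty described above can be illustrated with a small sketch of TCP's exponential timer backoff (our own simplification; a real TCP implementation also clamps the RTO between minimum and maximum bounds):

```python
def timeout_recovery_time(rto, consecutive_losses):
    """Total time spent waiting when a packet and each of its
    retransmissions are lost `consecutive_losses` times in a row: the
    RTO doubles after every expiry (TCP's exponential backoff)."""
    return sum(rto * 2 ** i for i in range(consecutive_losses))

# With an initial RTO of 1 s, losing both the original packet and its
# first retransmission costs 1 + 2 = 3 s before the second retry.
```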
Figure 9 shows the result of the throughput evaluation. As before, the traffic load is on
the horizontal axis, and, this time, the average throughput relative to the average throughput
of TCP is on the vertical axis.
A comparison between the results of the throughput evaluation and the results of the evaluation of the average interarrival jitter indicates that the improvements in average throughput
obtained with PRTP as compared to TCP were not of the same magnitude as the reductions
in average interarrival jitter. In addition, we observe that, in the same way as for average
interarrival jitter, the largest gains in average throughput compared to TCP, at both low and
high traffic loads, were obtained with the PRTP configurations that had the largest fs values.
For example, at a traffic load of 20%, PRTP-20 gave an improvement in average throughput
of almost 50%, while PRTP-2 gave an improvement in average throughput of only 26%; at
a traffic load of 97%, the difference was even larger: while PRTP-20 gave an improvement
in average throughput of as much as 75%, PRTP-2 gave an improvement of only 11%. The
reason the largest gains in average throughput were obtained with the PRTP configurations
with the largest packet loss tolerances was the same as for average interarrival jitter: the
PRTP configurations that tolerated the largest packet loss rates almost never had to make
retransmissions of lost packets, which not only resulted in reductions in average interarrival jitter but also in improvements in average throughput. Specifically, the retransmissions
performed by the TCP sender always entailed reductions of the TCP sender window. This
meant that at high packet loss rates, the TCP sender window was frequently less than four
packets. As a consequence, the TCP sender was often unable to detect packet losses through
fast retransmit at high packet loss rates. Instead, packet losses at those times were detected
through timeouts. In other words, at high packet loss rates, the retransmissions made by the
TCP sender were often preceded by a timeout period that drastically reduced the throughput
performance.
While the largest improvements in average throughput were indeed obtained with the
PRTP configurations with the largest packet loss tolerances, we observe that all seven PRTP
configurations in fact gave noticeable improvements in average throughput over TCP at low
traffic loads. At a traffic load of 20%, PRTP-2 (as already mentioned) gave an improvement
in average throughput of 26%; PRTP-3 gave an improvement of 31%; PRTP-5 gave an improvement of 39%; PRTP-8 gave an improvement of 44%; PRTP-11 and PRTP-14 gave an
improvement of 47%; and PRTP-20 gave an improvement of 48%.
As follows from Figure 9, when the traffic load increased from 20%, the relative throughput performance of the four PRTP configurations with the lowest packet loss tolerances,
PRTP-2, PRTP-3, PRTP-5, and PRTP-8, immediately started to decrease, while the relative
throughput performance of the three PRTP configurations with the highest packet loss tolerances, PRTP-11, PRTP-14, and PRTP-20, continued to increase. In particular, the relative
throughput performance of PRTP-11 and PRTP-14 continued to increase up to a traffic load
of 67% (ca. 3% packet loss rate), while the relative throughput performance of PRTP-20
peaked at the traffic load of 60%.
The reason for this behavior in the relative throughput performance of the seven PRTP
configurations was of course an effect of their different packet loss tolerances. The four
PRTP configurations with the lowest packet loss tolerances had to begin issuing retransmission requests for lost packets at lower packet loss rates, i.e., lower traffic loads, than the three
with the highest packet loss tolerances. Consequently, the relative throughput performance
of the four PRTP configurations with the lowest packet loss tolerances started to decrease at
much lower traffic loads than the three with the highest packet-loss tolerances.
One might have expected that the maximum relative throughput performance of the seven
PRTP configurations would have occurred closer to their respective fs values, i.e., closer
to their allowable steady-state packet loss frequencies: The number of retransmissions increased with increased traffic load, i.e., increased packet loss rate, for TCP, but should have
been kept low for a particular PRTP configuration as long as the packet loss rate was below fs . For example, one might have expected that the relative throughput performance of
PRTP-3 and PRTP-5 would have peaked at traffic loads of 67% (ca. 3% packet loss rate)
and 80% (ca. 5% packet loss rate) and that the relative throughput performance of PRTP-20
would have increased monotonically all the way up to the traffic load of 97% (ca. 20% packet
loss rate). However, as mentioned in Section 5.1, fs is defined on the basis of a certain ideal
packet loss scenario. Depending on how much the actual packet losses in a simulation differed from this packet loss scenario, the actual packet loss tolerance of a PRTP configuration
differed from fs .
It follows from Figure 9 that the relative throughput performance of all seven PRTP configurations increased when the traffic load increased from 93% to 97%. This was not because
the throughput performance of these PRTP configurations actually suddenly increased but
was instead, in the same way as for average interarrival jitter, an effect of the performance
of TCP deteriorating rapidly when the packet-loss rate approached 20%. In fact, in the same
way as for average interarrival jitter, this was due to TCP experiencing loss of retransmitted packets at times when the sender window was only two packets: the retransmissions
and the doubling of the RTO not only impeded the jitter performance but also impeded the
throughput performance.
Figure 10 shows the result of the evaluation of the average fairness index. Again, traffic
load is displayed on the horizontal axis, and the fairness index is displayed on the vertical
axis.
Since the fairness index, by the way it is defined (see Equation 2), is inversely proportional to the gain in average throughput obtained with PRTP, it is not surprising that the plots
of the fairness indexes for the seven PRTP configurations have shapes roughly opposite to
those of the corresponding plots for average throughput.
As follows from the plots of the fairness indexes for the PRTP configurations in Figure 10, PRTP was not altogether fair: While the reference TCP flow between nodes 1 and 4, except for the simulations with a traffic load of 97%, never acquired more than a 20% larger bandwidth than the competing TCP flow between nodes 2 and 5 (i.e., never had a fairness index less than 0.99), the PRTP configurations with large packet loss tolerances had a fairness index less than 0.9, i.e., acquired more than a 50% larger bandwidth than the TCP flow
between nodes 2 and 5, at a wide range of traffic loads. Most notably, PRTP-20 had an
average fairness index of 0.74 at a traffic load of 60% (ca. 2% packet loss rate), i.e., had a
bandwidth allocation that was four times that of the TCP flow between nodes 2 and 5.
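These values are consistent with Jain's fairness index [18] computed over the two TCP flows. A minimal sketch, assuming that Equation 2 is Jain's index taken over the n = 2 flows (the function name is ours):

```python
def jain_fairness(throughputs):
    """Jain's fairness index: (sum of x_i)^2 / (n * sum of x_i^2).
    Equals 1.0 for a perfectly fair allocation and decreases as the
    allocation becomes more skewed."""
    n = len(throughputs)
    return sum(throughputs) ** 2 / (n * sum(x * x for x in throughputs))

# One flow acquiring 20% more bandwidth than the other:
print(round(jain_fairness([1.2, 1.0]), 2))  # 0.99
# One flow acquiring four times the bandwidth of the other:
print(round(jain_fairness([4.0, 1.0]), 2))  # 0.74
```

The two sample ratios reproduce the 0.99 and 0.74 index values quoted above.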
The fact that PRTP was not as fair as TCP was expected and was a direct consequence of the way PRTP works: PRTP acknowledges not only successfully received packets but also lost packets, provided crl(n) ≥ rrl (see Section 2). In TCP, however, error control and congestion control are intertwined. Thus, in the cases in which PRTP acknowledges lost packets, it temporarily disables the congestion control mechanism of the sender-side TCP.
Although PRTP was not as fair as TCP, it could still, as mentioned in Section 5, have
exhibited a TCP-friendly behavior. However, as follows from Table 1, this was not the case.
Even at low packet loss tolerances, PRTP was TCP-unfriendly, and this became worse as the packet loss tolerance increased.
Protocol    Pass Freq.
TCP         96.6%
PRTP-2      41.6%
PRTP-3      36.6%
PRTP-5      26.8%
PRTP-8      14.9%
PRTP-11      5.8%
PRTP-14      1.0%
PRTP-20      0.0%

Table 1: TCP-friendliness test pass frequencies.
6 Transient Analysis
The previous section evaluated the performance of PRTP for long-lived connections. However, at present, short-lived connections in the form of Web traffic constitute by far the
largest fraction of the total Internet traffic [6]. To this end, we decided to complement our
study of the stationary behavior of PRTP with an analysis of its transient performance. Subsection 6.1 describes the simulation setup and methodology of the transient analysis, and
Subsection 6.2 discusses the results of the simulation experiment.
[Figure: Simulation setup. (a) Web browsing scenario: a Web client exchanges HTTP requests and responses (HTML pages with JPEG images) with a Web server. (b) Network model: nodes 1 and 2 connected by a link with bandwidth b kbps and propagation delay d ms.]
The Web browsing scenario was approximated by a file transfer scenario between an FTP client at node 1 and an FTP server at node 2; the
FTP server was modeled by an FTP application (Application/FTP) running atop a TCP
agent (FullTcp); and the FTP client was modeled by a TCP (FullTcp) or PRTP agent
running in LISTEN mode (i.e., worked as a sink and discarded all received packets). More
precisely, the FTP client was modeled by a TCP agent in the reference simulations and by a
PRTP agent in the comparative simulations.
In all simulations, the TCP and PRTP agents used a maximum segment size of 1460
bytes. Furthermore, the PRTP agent was configured in all simulations with an aging factor,
af , of 1.0 and a required reliability level, rrl, of 85%, which approximately translated to an
allowable steady-state packet loss frequency of 15%.
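The translation from rrl to an allowable packet loss frequency can be seen directly from the definition of the current reliability level (a sketch assuming the byte-weighted crl(n) of Section 2 and equally sized packets): with af = 1.0 all weights equal one, so

```latex
crl(n) = \frac{\sum_{k=1}^{n} p_k b_k}{\sum_{k=1}^{n} b_k}
       = \frac{\text{bytes received}}{\text{bytes sent}},
\qquad
crl(n) \ge rrl = 0.85
\;\Longleftrightarrow\;
\frac{\text{bytes lost}}{\text{bytes sent}} \le 0.15 .
```

Hence the quoted 15% allowable steady-state packet loss frequency.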
Simulations were run for seven different file sizes: 5 kB, 8 kB, 12 kB, 20 kB, 35 kB,
60 kB, and 1 MB; the 1 MB file was included to enable straightforward comparisons between
the performance of PRTP for long- and short-lived connections. Furthermore, simulations
were performed for three types of Internet connections: fixed, modem, and GSM.
The three types of Internet connections simulated were modeled using different bandwidths and propagation delays on the link between nodes 1 and 2. The link configurations shown in Table 2 were used. The link configurations for the fixed Internet connection were intended to model three typical LAN connections, the link configurations for the modem connection were intended to model a 56 kbps modem in two common Internet access scenarios, and the link configuration for the GSM connection was intended to model a non-transparent GSM data connection.
Internet connection   Bandwidth, b (kbps)
Fixed                 400, 150, 60
Modem                 33.6, 48
GSM                   7.68

Table 2: Link configurations.
6.2 Results
As mentioned, the primary objective of the transient analysis was to evaluate the performance
of PRTP compared to TCP in a typical Web browsing scenario for three types of Internet
connections: fixed, modem, and GSM. In contrast to the stationary analysis, the transient
analysis only evaluated the throughput performance of PRTP.
Figures 12 through 16 show the result of the transient analysis. The graphs in Figures 12,
14, and 16(a) show the actual throughputs obtained with TCP and PRTP and Figures 13,
15, and 16(b) show the throughputs of PRTP relative to TCP expressed in percent of the
throughputs of TCP. To better appreciate how the relative throughput of PRTP varied with
increased packet loss rate (i.e., increasing loss profile number) and increased file size for the
three studied types of Internet connections, the markers in the relative throughput graphs are
connected with lines.
Let us first consider the results of the simulations of the fixed Internet connection. The
graphs in Figures 12 and 13 show that the trend was that the relative throughput performance
of PRTP increased with increased packet loss rate and increased file size. The largest gains
in throughput obtained with PRTP compared to TCP were obtained when the packet loss rate
was 20% (i.e., loss profile #6) and the file was at least 60 kB.
On the other hand, Figures 12 and 13 also show that significant improvements in throughput were already obtained when the packet loss rate was as moderate as 10% and the file was
no larger than 35 kB. Specifically, when the packet loss rate was 10% (i.e., loss profile #5)
and the file was 35 kB, the relative throughput of PRTP was 245% in the simulation of a
400 kbps fixed Internet connection; 184% in the simulation of a 150 kbps fixed Internet connection; and 229% in the simulation of a 60 kbps fixed Internet connection.
Similar results were obtained for the simulations of the modem (see Figures 14 and 15)
and GSM connections (see Figure 16). The trend in the simulations of these two types of
Internet connections was also that the relative throughput performance of PRTP increased
with increased packet loss rate and increased file size. However, for these two types of
Internet connections, significant improvements in throughput were also obtained at moderate
packet loss rates and with relatively small files. When the packet loss rate was 10% and the
file was 20 kB, the relative throughput of PRTP was 216% in the simulation of a 33.6 kbps
modem connection; 197% in the simulation of a 48 kbps modem connection; and 227% in
the simulation of a GSM connection.
As follows from the relative throughput graphs in Figures 13, 15, and 16(b), the relative
throughput performance of PRTP exhibited large fluctuations between different loss profiles
and file sizes. The trend of the throughput performance of PRTP increasing with increased packet loss rate and increased file size was comparatively weak.
The large fluctuations in the relative throughput performance of PRTP were primarily an effect of our use of a deterministic error model instead of a stochastic one: in some simulations, a certain loss profile turned out to represent a particularly favorable packet loss pattern
for PRTP, and sometimes an unfavorable one. For example, the loss profile #2 represented a
particularly favorable packet loss pattern for PRTP when the file size was 12 kB and a fixed
Internet connection was simulated (see Figure 13). In the simulations of the 400 kbps and
150 kbps fixed Internet connections the relative throughput of PRTP increased from 100% to
175% when the file size increased from 8 kB to 12 kB, and in the simulation of the 60 kbps fixed Internet connection a similar increase was observed.
One cause of degraded throughput in some simulations was that the ACK of a retransmitted packet was lost at a time when the sender window was only one packet: the first packet in a sender window of three packets was lost. This led to a
timeout and a retransmission of the lost packet. Unfortunately, the ACK of the retransmitted
packet was also lost and yet another timeout and retransmission occurred. This time, the
timeout was twice as long and significantly impeded the throughput performance of PRTP.
To sum up, the transient analysis suggests that PRTP can give substantial improvements
in throughput for short-lived Internet connections as diverse as fixed, modem, and GSM.
However, the results are very preliminary and further simulations are needed to more firmly
establish them.
[Figure 16: GSM connection with bandwidth of 7.68 kbps and propagation delay of 310 ms. (a) Throughput; (b) relative throughput.]
7 Conclusions
This paper presents PRTP, an extension to TCP for partial reliability that aims at making TCP more suitable for multimedia applications. PRTP enables an application to prescribe a minimum guaranteed reliability level between 0% (i.e., a best-effort transport service such as UDP) and 100% (i.e., a completely reliable transport service such as TCP).
The major advantage of PRTP is that it only entails modifying the retransmission decision
scheme of TCP. Specifically, PRTP alters the retransmission decision scheme of TCP in such
a way that retransmissions of lost packets are made only when it is necessary in order to
uphold the minimum required reliability level.
The paper also gives a detailed description of a simulation-based performance evaluation of PRTP. The performance evaluation was made using the ns-2
network simulator and comprised two subactivities. First, our simulation model of PRTP was
validated against our implementation of PRTP in Linux: in short, this validation suggested
that our simulation model quite accurately modeled PRTP. Second, a performance analysis
of PRTP was conducted. This analysis entailed evaluating the stationary and the transient
performance of PRTP compared to TCP.
The stationary analysis indicated that significant reductions in average interarrival jitter
and improvements in average throughput could be obtained with PRTP at both low and high
traffic loads and with PRTP configured to tolerate low as well as high packet loss rates.
However, the stationary analysis also found PRTP to be less fair than TCP and not TCP-friendly.
The transient analysis entailed evaluating the throughput performance of PRTP in a typical Web browsing scenario. Three types of Internet connections were considered: fixed,
modem, and GSM. The results suggested that at packet loss rates as low as 10%, and for
files as small as 35 kB, throughput gains larger than 140% could be obtained with PRTP
irrespective of the Internet connection.
Taken together, the stationary and transient analyses clearly indicate that PRTP can give
significant performance improvements compared to TCP for a wide range of loss tolerant
applications. However, the stationary analysis also indicated that PRTP would most likely
exhibit a TCP-unfriendly behavior and therefore would not share bandwidth in a fair manner
with competing TCP flows. Consequently, we do not recommend using PRTP for wireline
applications. We believe that PRTP can indeed be a good choice for loss tolerant applications
in error prone wireless environments, e.g., GSM with a transparent link layer: error rates on
wireless links are much higher compared to the error rates in fiber and copper links used in
the fixed Internet. Thus, packet loss that is not related to congestion is much more common
and cannot always be compensated for by layer two retransmissions. Trying to retransmit on
layer two could, for example, trigger a TCP retransmission if it takes too much time.
Although the current version of PRTP is neither TCP-friendly nor altogether fair, and therefore not suitable for wireline applications, we intend to change this in future versions of the
protocol. In particular, we believe that PRTP could be made more TCP-friendly and fair in at
least two ways. First, PRTP could be modified so that when a lost packet is acknowledged,
ECN [27] (Explicit Congestion Notification) is used to signal congestion to the sender side
TCP. Second, PRTP could, at the same time it acknowledges a lost packet, also advertise a
smaller receiver window. Furthermore, it would be interesting to see whether the adverse effects that PRTP has on TCP could be reduced by active queueing techniques such as RED [5]
(Random Early Detection) and WFQ [8] (Weighted Fair Queueing).
References
[1] M. Allman, V. Paxson, and W. Stevens. TCP congestion control. RFC 2581, IETF,
April 1999.
[2] P. D. Amer, C. Chassot, T. Connolly, M. Diaz, and P. T. Conrad. Partial order transport
service for multimedia and other applications. ACM/IEEE Transactions on Networking,
2(5), October 1994.
[16] T. Henderson and S. Floyd. The NewReno modification to TCP's fast recovery algorithm. RFC 2582, IETF, April 1999.
[17] V. Jacobson, R. Braden, and D. Borman. TCP extensions for high performance. RFC
1323, IETF, May 1992.
[18] R. Jain, D. Chiu, and W. Hawe. A quantitative measure of fairness and discrimination
for resource allocation in shared computer systems. Technical Report DEC-TR-301,
Digital Equipment Corporation, September 1984.
[19] M. Mathis, J. Mahdavi, S. Floyd, and A. Romanow. TCP selective acknowledgement
options. RFC 2018, IETF, October 1996.
[20] M. K. McKusick, K. Bostic, M. J. Karels, and J. S. Quarterman. The Design and
Implementation of the 4.4 BSD Operating System. Addison-Wesley Publishing, 1996.
[21] B. Mukherjee and T. Brecht. Time-lined TCP for the TCP-friendly delivery of streaming media. In IEEE International Conference on Network Protocols (ICNP), pages 165-176, Osaka, Japan, November 2000.
[22] NIST Net. http://snad.ncsl.nist.gov/itg/nistnet.
[23] The network simulator ns-2. http://www.isi.edu/nsnam/ns.
[24] C. Papadopoulos and G. Parulkar. Retransmission-based error control for continuous media applications. In 6th International Workshop on Network and Operating System Support for Digital Audio and Video (NOSSDAV), pages 5-12, Zushi, Japan, April 1996.
[25] J. Postel. User datagram protocol. RFC 768, IETF, August 1980.
[26] J. Postel. Transmission control protocol. RFC 793, IETF, September 1981.
[27] K. Ramakrishnan and S. Floyd. A proposal to add explicit congestion notification
(ECN) to IP. RFC 2481, IETF, January 1999.
[28] R. Stewart, M. Ramalho, Q. Xie, M. Tuexen, and P. T. Conrad. SCTP partial reliability
extension. Internet draft, IETF, May 2002. Work in Progress.
Paper III
1 Introduction
Distribution of multimedia traffic such as streaming media over the Internet poses a major
challenge to existing transport protocols. Apart from having demands on throughput, many
multimedia applications are sensitive to delays and variations in those delays [27]. In addition, they often have an inherent tolerance for limited data loss [29].
The two prevailing transport protocols in the Internet today, TCP [21] and UDP [20], fail
to meet the QoS requirements of streaming media and other applications with soft real-time
constraints. TCP offers a fully reliable transport service at the cost of increased delay and
reduced throughput. UDP on the other hand introduces virtually no increase in delay or
reduction in throughput but provides no reliability enhancement over IP. In addition, UDP
leaves congestion control to the discretion of the application. If misused, this could impair
the stability of the Internet.
In this paper, we present a novel transport protocol, Partially Reliable Transport Protocol using ECN (PRTP-ECN), which offers a transport service that better complies with the
QoS requirements of applications with soft real-time requirements. PRTP-ECN is a receiver-based, partially reliable transport protocol that is implemented as an extension to TCP and
is able to work within the existing Internet infrastructure. It employs a congestion control
mechanism that largely corresponds to the one used in TCP. A simulation evaluation suggests
that, by trading reliability for latency, PRTP-ECN is able to offer a service with significantly
reduced interarrival jitter and increased throughput and goodput as compared to TCP. In addition, the evaluation implies that PRTP-ECN is TCP-friendly, which may not be the case in
some RTP/UDP solutions.
The paper is organized as follows. Section 2 discusses related work. Section 3 gives
a brief overview of the design principles behind PRTP-ECN and Section 4 describes the
design of the simulation experiment. The results of the simulation experiment are discussed
in Section 5. Finally, in Section 6, we summarize the major findings and indicate further
areas of study.
2 Related Work
3 Overview of PRTP-ECN
As mentioned above, PRTP-ECN is a partially reliable transport protocol. It is implemented as an extension to TCP, and differs from TCP only in the way it handles packet losses. PRTP-ECN need only be employed at the receiver side. An ECN-capable TCP is used at the sender side.
PRTP-ECN lets the QoS requirements imposed by the application govern the retransmission scheme. This is done by allowing the application to specify the parameters in a
retransmission decision algorithm. The parameters let the application directly prescribe an
acceptable packet loss rate and indirectly affect the interarrival jitter, throughput, and goodput. By relaxing the reliability requirement, the application obtains lower interarrival jitter and better throughput and goodput.
PRTP-ECN works in a way identical to TCP as long as no packets are lost. When an
out-of-sequence packet is received, this is taken as an indication of packet loss. PRTP-ECN
must then decide whether the lost data are needed to ensure the required reliability level
imposed by the application. This decision is based on the success rate of previous packets.
In PRTP-ECN, the success rate is measured as an exponentially weighted moving average
over all packets, lost and received, up to but not including the out-of-sequence packet. This
weighted moving average is called the current reliability level, crl(n), and is defined as

    crl(n) = \frac{\sum_{k=1}^{n} af^{n-k} p_k b_k}{\sum_{k=1}^{n} af^{n-k} b_k},    (1)
where n is the sequence number of the packet preceding the out-of-sequence packet, af is
the weight or aging factor, and bk denotes the number of bytes contained in packet k. The
variable pk is an indicator variable: if the kth packet was successfully received, then pk = 1; otherwise pk = 0.
The QoS requirements imposed on PRTP-ECN by the application translate into two parameters in the retransmission scheme: af and rrl. The required reliability level, rrl, is the reference in the feedback control system made up of the data flow between the sender and the receiver, and the flow of acknowledgements in the reverse direction. As long as crl(n) ≥ rrl, dropped packets need not be retransmitted and are therefore acknowledged. If an out-of-sequence packet is received and crl(n) is below rrl, PRTP-ECN acknowledges the last in-sequence packet and waits for a retransmission. In other words, PRTP-ECN does the same thing as TCP in this situation.
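The retransmission decision can be sketched as follows (our own simplified illustration of Equation 1 and the rrl test, not the actual PRTP-ECN agent code):

```python
def crl(received, sizes, af):
    """Current reliability level (Equation 1): a size-weighted,
    exponentially aged average over packets 1..n, where received[k-1]
    tells whether packet k arrived and sizes[k-1] is its size in bytes."""
    n = len(received)
    num = sum(af ** (n - k) * (1 if ok else 0) * b
              for k, (ok, b) in enumerate(zip(received, sizes), start=1))
    den = sum(af ** (n - k) * b
              for k, (ok, b) in enumerate(zip(received, sizes), start=1))
    return num / den

def accept_loss(received, sizes, af, rrl):
    """On an out-of-sequence packet: acknowledge the lost data (no
    retransmission) iff crl(n) >= rrl; otherwise fall back to TCP
    behavior and wait for a retransmission."""
    return crl(received, sizes, af) >= rrl

# With 17 of 20 equally sized packets received, af = 1.0, rrl = 0.85:
# crl = 17/20 = 0.85, so the loss is accepted and acknowledged.
```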
There is, however, a problem in acknowledging lost packets. In TCP, the retransmission scheme and the congestion control scheme are intertwined. An acknowledgement not
only signals the successful reception of one or several packets; it also indicates that there is
no noticeable congestion in the network between the sender and the receiver. PRTP-ECN
decouples these two schemes by using the TCP portions of ECN (Explicit Congestion Notification) [22].
The only requirement imposed on the network by PRTP-ECN is that the TCP implementation on the sender side must be ECN capable. It does not engage intermediary routers. In
the normal case, ECN enables direct notification of congestion instead of indirect notification
via missing packets. It engages both the IP and TCP layers. Upon incipient congestion, a
router sets a flag, the Congestion Experienced bit (CE), in the IP header of arriving packets.
When the receiver of a packet finds that the CE bit has been set, it sets a flag, the ECN-Echo flag, in the TCP header of the subsequent acknowledgement. Upon reception of an
acknowledgement having the ECN-Echo flag set, the sender halves its congestion window
and performs fast recovery. PRTP-ECN does not involve intermediate routers, however, and
correspondingly does not need the IP parts of ECN. It employs the ECN-Echo flag only to
signal congestion. When an out-of-sequence packet is acknowledged, the ECN-Echo flag is
set in the acknowledgement. When the acknowledgement is received, the sender will throttle
its flow but refrain from re-sending any packet.
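The sender-side half of this decoupling can be sketched as follows (a deliberately simplified illustration in our own notation; the real sender is an unmodified ECN-capable TCP):

```python
def on_ack(cwnd, ecn_echo):
    """Simplified reaction of the sender to an acknowledgement from a
    PRTP-ECN receiver. An ACK with the ECN-Echo flag set covers a lost
    packet that the receiver has accepted: the sender halves its
    congestion window (fast-recovery style) but retransmits nothing."""
    if ecn_echo:
        return max(1, cwnd // 2)  # throttle the flow
    return cwnd                   # plain ACK: no congestion signalled

# An ECN-Echo ACK halves a 10-segment window to 5; no packet is re-sent.
```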
The QoS offered by PRTP-ECN as compared to TCP was evaluated by simulation, where
potential improvements in average interarrival jitter, average throughput, and average goodput were examined. We also investigated whether PRTP-ECN connections are TCP-friendly
and fair against competing flows.
[Figure 1: Simulation topology. Sources S1, S2, and S3 connect to router R1, and sinks S4, S5, and S6 connect to router R2, over 10 Mbps, 0 ms links; routers R1 and R2 are connected by a 1.5 Mbps, 50 ms bottleneck link.]
4.1 Implementation
We used version 2.1b5 of the ns-2 network simulator [19] to conduct the simulations described in this paper. The TCP protocol was modeled by the FullTcp agent, while PRTP-ECN was simulated by PRTP, an agent developed by us.
The FullTcp agent is similar to the 4.4 BSD TCP implementation [18, 30]. This means,
among other things, that it uses a congestion control mechanism similar to TCP Reno's [1].
However, SACK [17] is not implemented in FullTcp. The PRTP-ECN agent, PRTP, inherits most of its functionality from the FullTcp agent. Only the retransmission mechanism
differs between FullTcp and PRTP.
Routers R1 and R2 had a single output queue for each attached link and used FCFS
scheduling. Both router buffers had a capacity of 25 segments, i.e., approximately twice the
bandwidth-delay product of the network path. All receivers used a fixed advertised window
size of 20 segments, which enabled each of the senders to fill the bottleneck link.
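The claim that 25 segments is about twice the bandwidth-delay product can be checked with a quick calculation (assuming 1460-byte segments, as elsewhere in this thesis, and an RTT of roughly twice the 50 ms one-way delay of the bottleneck link):

```python
bandwidth_bps = 1.5e6      # R1-R2 bottleneck bandwidth
rtt_s = 2 * 0.050          # round-trip time dominated by the 50 ms link
segment_bytes = 1460       # assumed maximum segment size

bdp_bytes = bandwidth_bps * rtt_s / 8
bdp_segments = bdp_bytes / segment_bytes
print(round(bdp_segments, 1))  # 12.8; a 25-segment buffer is about 2x this
```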
The traffic load was controlled by setting the mean sending rate of the UDP flow to
a fraction of the nominal bandwidth on the R1-R2 link. Tests were run for seven traffic
loads: 20%, 60%, 67%, 80%, 87%, 93%, and 97%. These seven traffic loads corresponded
approximately to packet-loss rates of 1%, 2%, 3%, 5%, 8%, 14%, and 20% in the reference
tests, i.e., the tests in which TCP was used at node S4. Tests were run for eight PRTP-ECN configurations (see Section 4.3), and each test was run 40 times to obtain statistically
significant results.
In all simulations, the UDP flow started at 0 s, while three cases of start times for the FTP
flows were studied. In the first case, the flow between nodes S1 and S4 started at 0 s, and the
flow between nodes S2 and S5 started at 600 ms. In the second case, the situation was the
reverse, i.e., the flow between nodes S1 and S4 started at 600 ms, and the flow between nodes
S2 and S5 started at 0 s. Finally, in the last case, both flows started at 0 s. Each simulation
run lasted 100 s.
The allowable steady-state packet loss frequency, f_{loss}, is defined as

    f_{loss} = \lim_{n \to \infty} \frac{loss(\sigma_n)}{n},    (2)

where \sigma_n denotes the packet sequence comprising all packets sent up to and including the nth packet and loss(\sigma_n) is a function that returns the number of lost packets in \sigma_n. Considering
that packet losses almost always occur in less favorable situations, the allowable steady-state
packet loss frequency may be seen as a rough estimate of the upper bound of the packet loss
frequency tolerated by a particular PRTP-ECN configuration.
We selected seven PRTP-ECN configurations that had allowable steady-state packet loss
frequencies ranging from 2% to 20%. Since our metric, the allowable steady-state packet
loss frequency, did not capture all aspects of a particular PRTP-ECN configuration, care
was taken to ensure that the selection was done in a consistent manner. Of the eligible
configurations for a particular packet loss frequency, we always selected the one with the
largest aging factor. If there were several configurations with the same aging factor, we
Configuration Name   af     rrl
PRTP-2               0.99   0.97
PRTP-3               0.99   0.96
PRTP-5               0.99   0.94
PRTP-8               0.99   0.91
PRTP-11              0.99   0.88
PRTP-14              0.99   0.85
PRTP-20              0.99   0.80

Table 1: The selected PRTP-ECN configurations.
TCP-friendliness A flow is said to be TCP-friendly if its arrival rate does not exceed the
arrival rate of a conformant TCP connection under the same circumstances [13]. In
this simulation study, we make use of the TCP-friendliness test presented by Floyd and
Fall [13]. According to their test, a flow is TCP-friendly if the following inequality
holds for its arrival rate \lambda:

    \lambda \leq \frac{1.5 \sqrt{2/3} \, B}{RTT \sqrt{p_{loss}}},    (4)
Protocol   Jitter (ms)   Throughput (bps)   Goodput (bps)     Fairness Index
TCP        27.5 ± 1.5    579315 ± 17992     578781 ± 17981    0.99 ± 0.003
PRTP-2     18.9 ± 0.8    662325 ± 14829     662082 ± 14832    0.98 ± 0.006
PRTP-3     18.6 ± 0.7    664734 ± 13949     664467 ± 13951    0.98 ± 0.006
PRTP-5     18.0 ± 0.7    669018 ± 14385     668883 ± 14395    0.98 ± 0.006
PRTP-8     17.3 ± 0.6    680175 ± 12561     680046 ± 12566    0.98 ± 0.005
PRTP-11    17.3 ± 0.6    677904 ± 12588     677778 ± 12586    0.98 ± 0.006
PRTP-14    17.0 ± 0.7    683313 ± 14350     683193 ± 14350    0.98 ± 0.006
PRTP-20    17.1 ± 0.6    681975 ± 13047     681855 ± 13047    0.98 ± 0.006

Table 2: Performance metrics for tests where the traffic load was 20%.
Protocol   Jitter (ms)   Throughput (bps)   Goodput (bps)     Fairness Index
TCP        101.8 ± 3.6   237156 ± 5479      235920 ± 5483     1.00 ± 0.002
PRTP-2     72.6 ± 2.8    268659 ± 6627      268248 ± 6630     0.98 ± 0.005
PRTP-3     63.5 ± 2.5    284100 ± 6707      283797 ± 6725     0.98 ± 0.007
PRTP-5     51.8 ± 1.4    309105 ± 5813      308934 ± 5814     0.95 ± 0.010
PRTP-8     50.0 ± 0.8    312003 ± 4111      311871 ± 4108     0.94 ± 0.007
PRTP-11    49.9 ± 1.1    312624 ± 5633      312483 ± 5640     0.94 ± 0.009
PRTP-14    49.8 ± 1.0    312618 ± 5071      312471 ± 5065     0.94 ± 0.008
PRTP-20    49.9 ± 1.0    311787 ± 4778      311652 ± 4783     0.94 ± 0.007

Table 3: Performance metrics for tests where the traffic load was 67%.
where \lambda is the arrival rate of the flow in Bps, B denotes the packet size in bytes, RTT is the minimum round-trip time in seconds, and p_{loss} is the packet loss frequency.
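Numerically, the bound of inequality (4) behaves as follows (a small illustration with example values of our own choosing, using the packet size B, minimum round-trip time RTT, and packet loss frequency p_loss of the Floyd and Fall test):

```python
from math import sqrt

def tcp_friendly_bound(packet_bytes, rtt_s, p_loss):
    """Maximum TCP-friendly arrival rate in Bps according to the Floyd
    and Fall test: 1.5 * sqrt(2/3) * B / (RTT * sqrt(p_loss))."""
    return 1.5 * sqrt(2.0 / 3.0) * packet_bytes / (rtt_s * sqrt(p_loss))

# 1460-byte packets, 100 ms minimum RTT, 2% packet loss frequency:
limit = tcp_friendly_bound(1460, 0.1, 0.02)  # roughly 126 kBps
# Quadrupling the loss frequency halves the admissible rate.
```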
5 Results
In the analysis of the simulation experiment, we made a TCP-friendliness test, and calculated
the average interarrival jitter, the average throughput, and the average goodput for the flow
between nodes S1 and S4. In addition, we calculated the average fairness in each run. We
let the mean, taken over all runs, be an estimate of a performance metric in a test. Of the
three primary factors studied in this experiment, the starting times of the two FTP flows were
found to have marginal impact on the results. For this reason, we focus our discussion on
one of the three cases of starting times: the one in which the FTP flow between the nodes S1
and S4 started 600 ms after the flow between nodes S2 and S5. It should be noted, however,
that the conclusions drawn from these tests also apply to the tests in the other two cases.
To make comparisons easier, the graphs show interarrival jitter, throughput, and goodput
for the PRTP-ECN configurations relative to TCP, i.e., the ratios between the metrics obtained for the PRTP-ECN configurations and the metrics obtained for TCP are plotted. As
a complement to the graphs, Tables 2, 3, and 4 show the estimates of the metrics together
with their 99%, two-sided, confidence interval for a selection of traffic loads.
Protocol   Jitter (ms)     Throughput (bps)   Goodput (bps)    Fairness Index
TCP        737.1 ± 112.3   39741 ± 4521       39093 ± 4406     0.92 ± 0.053
PRTP-2     722.6 ± 77.1    37812 ± 3710       37392 ± 3664     0.95 ± 0.033
PRTP-3     701.4 ± 97.0    38214 ± 3494       37848 ± 3464     0.95 ± 0.030
PRTP-5     582.7 ± 56.1    42669 ± 3434       42393 ± 3420     0.93 ± 0.042
PRTP-8     485.2 ± 36.0    46737 ± 2880       46557 ± 2864     0.91 ± 0.045
PRTP-11    425.2 ± 45.0    48396 ± 4512       48237 ± 4512     0.88 ± 0.046
PRTP-14    329.7 ± 28.2    53304 ± 3138       53184 ± 3131     0.86 ± 0.050
PRTP-20    242.8 ± 11.8    58173 ± 2853       58059 ± 2851     0.83 ± 0.047

Table 4: Performance metrics for tests where the traffic load was 97%.
[Figure 2: Relative interarrival jitter vs. traffic load (percentage of nominal bandwidth on link R1-R2) for the seven PRTP-ECN configurations.]
[Figure 3: Relative throughput vs. traffic load (percentage of nominal bandwidth on link R1-R2) for the seven PRTP-ECN configurations.]
Our evaluation of the throughput and goodput of PRTP-ECN also gave positive results.
As evident in Figures 3 and 4 and the tables, significant improvements in both throughput
and goodput were obtained using PRTP-ECN. For example, an application accepting a 20%
packet loss rate could increase its throughput, as well as its goodput, by as much as 48%.
However, applications that tolerate a packet loss rate of only a few percent may also experience improvements in throughput and goodput of as much as 20%. From the confidence
intervals, it follows that the improvements in throughput and goodput were significant and
that PRTP-ECN could provide less fluctuating throughput and goodput than TCP. A comparison of the throughputs and goodputs further suggest that PRTP-ECN is better than TCP at
utilizing bandwidth. This has not been statistically verified, however.
Recall from Section 4.2 that a traffic load approximately corresponds to a particular
packet loss rate. Taking this into account in analyzing the results, it may be concluded that a
PRTP-ECN configuration had its optimum in relative interarrival jitter, relative throughput,
and relative goodput when the packet loss frequency was almost the same as the allowable
steady-state packet loss frequency. This is a direct consequence of the way we defined the allowable steady-state packet loss frequency. At packet loss frequencies lower than the allowable steady-state packet loss frequency, the gain in performance was limited by the fact that it
was not necessary to make very many retransmissions in the first place. When the packet loss
frequency exceeded the allowable steady-state packet loss frequency, the situation was the
reverse. In these cases, PRTP-ECN had to increase the number of retransmissions in order
to uphold the reliability level, which had a negative impact on interarrival jitter, throughput,
and goodput. However, it should be noted that even in cases in which PRTP-ECN had to
increase the number of retransmissions, it performed far fewer retransmissions than TCP.
TCP-friendliness is a prerequisite for a protocol to be able to be deployed on a large scale
[Figure 4: Relative goodput of the PRTP-ECN configurations (PRTP-2 through PRTP-20) versus traffic load (percentage of nominal bandwidth on link R1-R2).]
[Figure: Fairness index of TCP and the PRTP-ECN configurations versus traffic load (percentage of nominal bandwidth on link R1-R2).]
Paper IV
1 Introduction
A large portion of the emerging multimedia applications use the UDP transport protocol, which, unlike TCP, provides neither flow nor congestion control. Often, as is the case with RealNetworks' RealPlayer [19] and Microsoft's Windows Media Services [13], these multimedia applications are somewhat responsive to network congestion, but not as much as TCP-based applications. Furthermore, they use proprietary algorithms to respond to congestion that are more or less incompatible with the one used by TCP. Consequently, a large-scale deployment of these UDP-based multimedia applications could lead to an unfair bandwidth allocation among competing traffic flows, which in a longer perspective could result in a congestion collapse [9].
In an attempt to broaden the spectrum of applications that can run on top of TCP, making
it possible for some of those best-effort multimedia applications that today use UDP to run on TCP, we have proposed an extension to TCP: PRTP-ECN [8]. PRTP-ECN aims at making TCP better suited for applications with soft real-time constraints, e.g., best-effort multimedia applications, while still being TCP-friendly. The principal idea behind PRTP-ECN is to trade reliability for reduced jitter and improved throughput.
PRTP-ECN is implemented as a partially reliable error recovery mechanism and in that
respect builds on the early work on retransmission-based error recovery schemes conducted
by Dempsey [3, 4] and Papadopoulos and Parulkar [17]. Independently of each other, they
demonstrated the feasibility of using a retransmission-based partially reliable error recovery
mechanism for multimedia communication. Extensive work on partial reliability in connection with multimedia has also been done at LAAS-CNRS [5] and at the University of Delaware [2], work that resulted in particular in POC (Partial Order Connection), a partially ordered and partially reliable transport protocol specifically targeting multimedia applications. More recent proposals for partially reliable multimedia protocols also factor in TCP-friendliness. Examples of this genre of protocols include TLTCP (Time-Lined TCP) [15] and the rate-based protocol proposed by Jacobs et al. [10]. Both these protocols are designed to provide a TCP-friendly delivery of time-sensitive data to applications that are loss tolerant, such as streaming media players. Furthermore, U-SCTP [21], an unreliable extension to SCTP (Stream Control Transmission Protocol) [20], has been proposed. It is able to provide a limited form of congestion-aware, partially reliable transport service. Considering that U-SCTP was proposed by the major designers of SCTP, this further emphasizes the need for a transport protocol that is both TCP-friendly and offers a more flexible transport service than TCP.
PRTP-ECN sets itself apart from many other partially reliable error recovery schemes
that have been proposed in that it reacts to congestion in the same way as standard TCP.
Furthermore, PRTP-ECN is completely compatible with standard TCP and is consequently
able to interwork with existing TCP implementations. In addition, PRTP-ECN only needs
to be implemented on the receiver side. Neither the TCP sender side nor any intermediate
network equipment such as routers, gateways, etc. are affected by PRTP-ECN.
To compare the performance of PRTP-ECN with TCP in terms of average interarrival jitter, average throughput, average goodput, and average link utilization, we conducted a simulation experiment. The primary objective of the simulation was to investigate whether PRTP-ECN performs better than TCP and whether the difference in performance between PRTP-ECN and TCP is statistically significant. Furthermore, we wanted to determine whether
PRTP-ECN exhibits a TCP-friendly behavior and competes fairly with standard TCP flows.
This paper gives a detailed description of this simulation experiment with an emphasis on
the statistical design and analysis of the experiment.
The remainder of this paper is organized as follows. Section 2 gives a brief overview
of the design principles behind PRTP-ECN. In Section 3, we discuss the organization of
the simulation experiment. In particular, we discuss the statistical design and analysis of
the simulation experiment and give a description of the simulation procedure. The results
of the simulation experiment are presented and discussed in Section 4. Finally, Section 5
summarizes the findings and gives some concluding remarks.
[Figure 1: Message sequence for PRTP-ECN. After the three-way handshake and ECN negotiation, the TCP sender sends data packets to the PRTP-ECN receiver. An out-of-sequence packet signals packet loss; if crl ≥ rrl, the receiver responds with an acknowledgement with the ECN-Echo flag set, and the sender takes a congestion action and confirms it in the next data packet.]
2 Overview of PRTP-ECN
The PRTP-ECN extension to TCP only involves changing the retransmission decision algorithm of TCP, i.e., PRTP-ECN retains most of the characteristics and mechanisms of TCP. In
particular, PRTP-ECN does not alter data transfer, flow control, multiplexing, or connection
characteristics of TCP in any way. This enables PRTP-ECN to transparently interwork with
existing TCP implementations. The modifications to TCP imposed by PRTP-ECN have been
isolated to the receiver side. No changes are required at the sender side.
The PRTP-ECN retransmission decision algorithm is parameterized. An application atop
PRTP-ECN explicitly prescribes a minimum acceptable reliability level by setting the parameters of the retransmission algorithm. Implicitly, the parameters govern the trade-off
between reliability, interarrival jitter, throughput, and goodput. By relaxing the reliability,
PRTP-ECN implicitly favors a reduction in interarrival jitter and an increase in throughput
and goodput.
Figure 1 illustrates how PRTP-ECN works. As long as no packets are lost, PRTP-ECN
behaves in the same way as standard TCP. When an out-of-sequence packet is received, it is
92
taken as an indication of packet loss and the retransmission decision algorithm is invoked.
This algorithm decides, on the basis of the success rate of previous packets, whether to
acknowledge all packets up to and including the out-of-sequence packet or to do the same
as standard TCP, i.e., acknowledge the last successfully received in-sequence packet and
wait for a retransmission. In PRTP-ECN, the success rate is measured as an exponentially
weighted moving average over all packets, lost and received, up to but not including the
out-of-sequence packet. This weighted moving average is called the current reliability level,
crl(n), and is defined as
crl(n) = \frac{\sum_{k=1}^{n} af^{\,n-k}\, p_k b_k}{\sum_{k=1}^{n} af^{\,n-k}\, b_k},    (1)

where n is the sequence number of the packet preceding the out-of-sequence packet, af is the weight or aging factor, and b_k denotes the number of bytes contained in packet k. The variable p_k is an indicator variable that only takes a value of 1 or 0: if the kth packet was successfully received, then p_k = 1; otherwise p_k = 0.
An application communicates its lower bound reliability level through the aging factor
and a second parameter called the required reliability level. The required reliability level, rrl ,
acts as a reference value. As long as crl(n) ≥ rrl, dropped packets need not be retransmitted
and are therefore acknowledged. If an out-of-sequence packet is received and crl(n) is below
rrl , PRTP-ECN acknowledges the last in-sequence packet, and waits for a retransmission.
In the remainder of this text, a PRTP-ECN protocol that has been assigned fixed values for
af and rrl is called a PRTP-ECN configuration.
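Equation (1) can be sketched as a direct computation over the packet history. The function below is a minimal illustration of the formula (the names `crl`, `received`, and `sizes` are ours, not PRTP-ECN's, and a real implementation would update the sums incrementally rather than rescanning the history):

```python
def crl(received, sizes, af):
    """Current reliability level per Equation (1).

    received[k-1] is 1 if packet k arrived and 0 if it was lost,
    sizes[k-1] is the number of bytes in packet k, and af is the
    aging factor; packet n is the one preceding the out-of-sequence
    packet that triggered the computation.
    """
    n = len(received)
    num = sum(af ** (n - k) * p_k * b_k
              for k, (p_k, b_k) in enumerate(zip(received, sizes), start=1))
    den = sum(af ** (n - k) * b_k
              for k, b_k in enumerate(sizes, start=1))
    return num / den

# With af = 1 every packet weighs the same, so crl is simply the
# byte-weighted fraction of packets received: 300/400 here.
print(crl([1, 1, 0, 1], [100, 100, 100, 100], af=1.0))  # → 0.75

# With af < 1 recent packets dominate, so the same loss pattern
# scores lower when the loss is the most recent event.
print(crl([0, 1], [100, 100], af=0.5) > crl([1, 0], [100, 100], af=0.5))
```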
Acknowledging lost packets interferes with the congestion control in standard TCP, which
interprets lost packets as a signal of congestion. PRTP-ECN remedies this by employing
Explicit Congestion Notification (ECN) [18]. More specifically, PRTP-ECN uses the TCP
portions of ECN and does not involve intervening routers. Instead, when an out-of-sequence
packet is received by PRTP-ECN, it sets the ECN-Echo flag in the acknowledgement of the
out-of-sequence packet. When an acknowledgement with the ECN-Echo flag set is received,
the sender takes the appropriate congestion measures but does not retransmit any packet.
The sender confirms the receipt of the congestion notification in the next data packet. The
PRTP-ECN session then continues in the same way as a regular TCP session until the next
packet drop.
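The behavior just described can be condensed into a receiver-side sketch. This is a toy model rather than the actual TCP modification: packets are unit-size records, so the sums of Equation (1) reduce to an aged success rate kept incrementally, and all class, field, and method names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class PrtpEcnReceiver:
    """Toy model of the PRTP-ECN retransmission decision."""
    af: float          # aging factor
    rrl: float         # required reliability level
    num: float = 0.0   # aged sum of successes (numerator of Eq. (1))
    den: float = 0.0   # aged sum of weights (denominator of Eq. (1))

    def _record(self, success):
        self.num = self.af * self.num + (1.0 if success else 0.0)
        self.den = self.af * self.den + 1.0

    def on_in_sequence(self, seq):
        self._record(True)
        return {"ack": seq, "ecn_echo": False}

    def on_out_of_sequence(self, seq, lost):
        # 'lost' packets preceding 'seq' were detected missing; crl is
        # computed over the history up to, not including, packet 'seq'.
        for _ in range(lost):
            self._record(False)
        crl = self.num / self.den if self.den else 1.0
        self._record(True)  # 'seq' itself did arrive
        if crl >= self.rrl:
            # Accept the loss: cumulatively acknowledge through 'seq'
            # and signal congestion via the ECN-Echo flag.
            return {"ack": seq, "ecn_echo": True}
        # Below the required reliability level: act like standard TCP,
        # re-ack the last in-sequence packet and await retransmission.
        return {"ack": seq - lost - 1, "ecn_echo": False}
```

For example, with af = 1 and rrl = 0.7, seven delivered packets followed by one loss give crl = 7/8 = 0.875 ≥ rrl, so the gap is acknowledged with ECN-Echo set; with rrl = 0.95, the same history falls below the threshold and the receiver waits for a retransmission.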
3 Simulation Experiment

This section describes the design of the simulation experiment. Two aspects are considered.
First, in Subsection 3.1, the statistical design of the simulation experiment and the techniques used to analyze the design are discussed. Second, in Subsection 3.2, we describe the
simulation model used in the simulation experiment.
The response variables corresponded to the performance metrics: average interarrival jitter, average throughput, average goodput, average link utilization, and average fairness index (further explained in Section 4.4), i.e., all performance metrics studied except TCP-friendliness. In the remainder of this section, we let M denote the set comprising the performance metrics studied excluding TCP-friendliness.
The simulation experiment comprised three factors: one primary factor and two secondary factors. The primary factor, protocol, had eight levels. Apart from TCP, we ran simulations on seven PRTP-ECN configurations. The secondary factors were traffic load and
the relative starting times of competing flows. The first secondary factor, traffic load, could
assume seven levels and the second secondary factor, relative starting times of competing
flows, three levels.
We were primarily interested in the effects of PRTP-ECN. Traffic load was introduced as
a way to indirectly control the packet loss rate. This was necessary since the performance of
PRTP-ECN was by way of its design directly dependent on the packet loss rate. The relative
starting times were included as a factor in the experiment to verify that PRTP-ECN shares
bandwidth fairly with TCP in all situations, irrespective of the bandwidth allocated to PRTP-ECN at the time a TCP flow starts. However, the relative starting times were observed to have only a marginal impact on all five performance metrics studied, including fairness.
A more detailed description of the three factors in the experiment and how they translate
into parameters in the simulation setup follows in Subsection 3.2, where the simulation
procedure is discussed. In the remainder of this section, we let P denote the set of levels of
the primary factor (protocol), T the set of levels of the first secondary factor (traffic load),
and S the set of levels of the other secondary factor (relative starting times of competing
flows).
The underlying effects model of our experiment was, for each of the five performance metrics \phi \in M,

Y^{\phi}_{ijkl} = \mu^{\phi} + \alpha^{\phi}_{i} + \beta^{\phi}_{j} + \gamma^{\phi}_{k} + (\alpha\beta)^{\phi}_{ij} + (\alpha\gamma)^{\phi}_{ik} + (\beta\gamma)^{\phi}_{jk} + (\alpha\beta\gamma)^{\phi}_{ijk} + \epsilon^{\phi}_{ijkl},    (2)

with Y^{\phi}_{ijkl} denoting the (ijkl)th observation of performance metric \phi. Here \mu^{\phi} is the overall mean value of metric \phi, \alpha^{\phi}_{i} is the effect of the ith protocol in P on metric \phi, \beta^{\phi}_{j} is the effect of the jth traffic load in T on metric \phi, and \gamma^{\phi}_{k} is the effect of the kth case of the relative starting times on metric \phi. The remaining terms, except \epsilon^{\phi}_{ijkl}, represent interaction effects between the three factors. Term \epsilon^{\phi}_{ijkl} is a random error component that incorporates all other sources of variability. In the effects model, the treatment and interaction effects are defined as deviations from the mean, i.e., the sums \sum_{i \in P} \alpha^{\phi}_{i}, \ldots, \sum_{k \in S} (\alpha\beta\gamma)^{\phi}_{ijk} are all equal to zero.
The following hypotheses were tested at a significance level of 0.01: H_0: \forall i\,(\alpha^{\phi}_{i} = 0) versus H_1: \exists i\,(\alpha^{\phi}_{i} \neq 0). The hypotheses were tested using a three-factor analysis of variance
(ANOVA) test. This test assumes that the random errors in the effects models are normally
and independently distributed with a mean of zero and a constant variance. We verified the
normality assumption with normal probability plots of the residuals and residual histograms.
The constant variance assumption was checked with plots of residuals versus fitted values
and by the Levene test [14]. Variance stabilizing transformations were employed to mitigate effects of non-homogeneous variances. The independence assumption was not verified.
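The Levene test referred to above admits a compact sketch. This is the classic mean-based variant (the text does not say which centering was used), and the final lookup of W in an F(k−1, N−k) distribution is omitted:

```python
def levene_W(*groups):
    """Mean-based Levene statistic for equality of variances.

    Each group is a list of observations; a large W (compared against
    an F(k-1, N-k) distribution) indicates unequal variances.
    """
    k = len(groups)
    N = sum(len(g) for g in groups)
    # Absolute deviations of each observation from its group mean.
    z = [[abs(y - sum(g) / len(g)) for y in g] for g in groups]
    zbar_i = [sum(zi) / len(zi) for zi in z]
    zbar = sum(sum(zi) for zi in z) / N
    between = sum(len(zi) * (m - zbar) ** 2 for zi, m in zip(z, zbar_i))
    within = sum((v - m) ** 2 for zi, m in zip(z, zbar_i) for v in zi)
    return (N - k) / (k - 1) * between / within

# The second group has a much larger spread, so W is large.
print(round(levene_W([1, 2, 3], [1, 2, 9]), 6))  # → 8.0
```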
[Figure 2: Simulation topology. Nodes S1, S2, and S3 connect to router R1, and nodes S4, S5, and S6 to router R2, over 10 Mbps, 0 ms links; routers R1 and R2 are connected by a 1.5 Mbps, 50 ms bottleneck link.]
By varying the interarrival times of the UDP packets, the UDP flow was transformed into a variable bit-rate flow.
The three factors in the simulation experiment translated to the following parameters:
protocol used at node S4, traffic load of the R1-R2 link, and starting times of the two FTP
flows.
The PRTP-ECN configurations were selected on the basis of their tolerance to packet
loss. More precisely, the PRTP-ECN configurations were selected on the basis of a metric,
the allowable steady-state packet loss frequency [7], which gives an estimate of the packet
loss tolerance in steady state for a given configuration. Seven configurations were selected,
with allowable steady-state packet loss frequencies approximately equal to 2%, 3%, 5%,
8%, 11%, 14%, and 20%. In the remainder of this paper, we denote these configurations,
PRTP-2, PRTP-3, PRTP-5, PRTP-8, PRTP-11, PRTP-14, and PRTP-20, respectively.
The traffic load was controlled by setting the mean sending rate of the UDP flow to a
fraction of the nominal bandwidth on the R1-R2 link. Tests were run for seven traffic loads,
20%, 60%, 67%, 80%, 87%, 93%, and 97%. These seven traffic loads corresponded approximately to the packet-loss rates of 1%, 2%, 3%, 5%, 8%, 14%, and 20% in the reference
tests.
In all simulations, the UDP flow started at 0 s, while three cases of start times for the FTP
flows were studied. In the first case, the flow between nodes S1 and S4 started at 0 s, and the
flow between nodes S2 and S5 started at 600 ms. In the second case, the situation was the
reverse, i.e., the flow between nodes S1 and S4 started at 600 ms, and the flow between nodes
S2 and S5 started at 0 s. Finally, in the last case, both flows started at 0 s. Each simulation
run lasted 100 s.
the transformed effects model had much longer tails while still showing a symmetrical distribution. This observation is confirmed by the normal probability plot of the residuals in
Figure 3(d).
As follows from Figure 3(f), the correspondence between the residuals and the average
interarrival jitter was far less obvious for the transformed effects model. It is also evident
from the plot in Figure 3(f) that the variance was not altogether constant, and this was confirmed by the Levene test (F > 8.265, P < 0.01). However, balanced ANOVA tests (equal
sample sizes in all treatments) are fairly robust toward minor violations of the constant variance assumption. Hence, the outcome of the ANOVA test conducted on the transformed
effects model was not affected by the somewhat fluctuating variance.
The transformed effects model had a coefficient of determination, R2 , equal to 99.3%,
i.e., it accounted for as much as 99.3% of the variability in average interarrival jitter. Considering that our effects model did not take into account the fact that, because of its design,
PRTP-ECNs performance is highly dependent on which specific packets are lost, this result
was very satisfactory and suggested that the model could be regarded as adequate.
It followed from the ANOVA test that the choice of protocol indeed had an impact on the average interarrival jitter. We had a positive interaction between protocol and traffic load,
i.e., the reduction in average interarrival jitter attained with PRTP-ECN was greater when
the traffic load increased. This interaction effect was due mainly to the way PRTP-ECN
reacted to packet loss. Contrary to TCP, PRTP-ECN accepted a limited amount of packet
loss. At low packet loss rates, the retransmissions had only a marginal impact on the average
interarrival jitter, which explains why TCP and PRTP-ECN had almost the same average
interarrival jitter at low traffic loads. However, at higher loss rates the difference in the
number of retransmissions made by TCP and PRTP-ECN became significant, which explains
the improvement in average interarrival jitter attained by using PRTP-ECN at greater traffic
loads.
The Tukey test (α = 0.01) indicated that all PRTP-ECN configurations had less average
interarrival jitter than TCP. This test also suggested that the higher the allowable steady-state
packet loss frequency, the better the jitter characteristics for a PRTP-ECN configuration.
[Figure 3: (c) Normal probability plot of residuals for the untransformed effects model. (e) Residuals vs. fitted values plot for the untransformed effects model. (f) Residuals vs. fitted values plot for the transformed effects model.]
Table 1: 99% confidence intervals for mean average interarrival jitter in msec.

Protocol |    20%     |    60%     |     67%     |     80%     |     87%     |     93%      |      97%
TCP      | 27.5 ± 1.5 | 82.4 ± 0.4 | 101.8 ± 3.6 | 178.5 ± 5.6 | 267.6 ± 8.7 | 460.3 ± 21.9 | 737.1 ± 112.3
PRTP-2   | 18.9 ± 0.8 | 53.2 ± 0.2 |  72.5 ± 2.8 | 145.7 ± 4.6 | 230.0 ± 6.7 | 420.7 ± 15.9 | 722.6 ± 77.1
PRTP-3   | 18.6 ± 0.7 | 46.6 ± 0.2 |  63.5 ± 2.5 | 135.8 ± 5.8 | 221.5 ± 6.2 | 404.0 ± 13.1 | 701.4 ± 97.0
PRTP-5   | 18.0 ± 0.7 | 40.7 ± 0.1 |  51.8 ± 1.4 | 108.4 ± 4.6 | 193.7 ± 6.4 | 361.3 ± 20.1 | 582.7 ± 56.1
PRTP-8   | 17.3 ± 0.6 | 40.3 ± 0.1 |  50.0 ± 0.8 |  87.9 ± 2.5 | 143.9 ± 4.8 | 306.1 ± 15.1 | 485.2 ± 36.0
PRTP-11  | 17.3 ± 0.6 | 40.3 ± 0.1 |  49.9 ± 1.1 |  85.6 ± 1.7 | 122.8 ± 3.6 | 242.9 ± 12.8 | 425.2 ± 45.0
PRTP-14  | 17.0 ± 0.7 | 39.8 ± 0.1 |  49.8 ± 1.0 |  84.9 ± 1.3 | 121.5 ± 2.8 | 191.2 ± 9.4  | 329.7 ± 28.2
PRTP-20  | 17.1 ± 0.6 | 40.0 ± 0.1 |  49.9 ± 1.0 |  85.4 ± 1.3 | 121.3 ± 2.8 | 182.0 ± 5.2  | 242.8 ± 11.8
Protocol |   20%    |   60%   |   67%   |   80%   |   87%   |  93%   |  97%
TCP      | 579 ± 18 | 279 ± 8 | 237 ± 5 | 148 ± 2 | 103 ± 2 | 61 ± 3 | 40 ± 5
PRTP-2   | 662 ± 15 | 337 ± 7 | 269 ± 7 | 158 ± 3 | 108 ± 2 | 62 ± 2 | 38 ± 4
PRTP-3   | 665 ± 14 | 353 ± 8 | 284 ± 7 | 162 ± 4 | 108 ± 2 | 63 ± 2 | 38 ± 3
PRTP-5   | 669 ± 14 | 372 ± 6 | 309 ± 6 | 178 ± 5 | 113 ± 2 | 68 ± 4 | 43 ± 3
PRTP-8   | 680 ± 13 | 373 ± 6 | 312 ± 4 | 190 ± 4 | 128 ± 3 | 71 ± 2 | 47 ± 3
PRTP-11  | 678 ± 13 | 373 ± 7 | 313 ± 6 | 192 ± 3 | 136 ± 3 | 78 ± 3 | 48 ± 5
PRTP-14  | 683 ± 14 | 375 ± 8 | 313 ± 5 | 192 ± 3 | 137 ± 3 | 86 ± 4 | 53 ± 3
PRTP-20  | 682 ± 13 | 374 ± 7 | 312 ± 5 | 191 ± 2 | 137 ± 3 | 86 ± 3 | 58 ± 3
Figure 5: Verification of the transformed effects model for average link utilization.
The ANOVA test indicated that the choice of protocol also had an impact on the average link utilization, and it followed from the Tukey test (α = 0.01) that all PRTP-ECN configurations
utilized the link better than TCP. The ns-2 trace files suggested that the reason PRTP-ECN
was able to better utilize the link was chiefly its robustness to packet loss. As the link
utilization approached the link capacity, the packet loss rate increased, which in the case
of TCP led to timeouts, followed by retransmissions and slow-start. In contrast, as long
as the packet loss rate was well below the allowable steady-state packet loss frequency,
PRTP-ECN did not experience any timeouts, did not have to perform any retransmissions,
and did not have to return to slow-start. Furthermore, the Tukey test indicated that the link
utilization increased with an increase in allowable steady-state packet loss frequency. This
is a direct consequence of what we discussed earlier concerning retransmissions: A higher
allowable packet loss frequency leads to fewer retransmissions and, consequently, to better
link utilization.
In conclusion, we can say that our statistical evaluation of the average link utilization
suggests that PRTP-ECN utilizes the link somewhat better than TCP. Furthermore, the gain
in link utilization seems to increase with an increase in allowable packet loss frequency.
The fairness index measures the equality of bandwidth sharing. If all flows sharing the
same link receive the same amount of bandwidth, the fairness index is 1 and the fairness
of the bandwidth allocation is 100%. As the disparity in bandwidth allocation increases,
fairness decreases and a bandwidth allocation scheme that favors only a selected few flows
has a fairness index near 0.
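The index described here is, presumably, Jain's fairness index from the cited report [12], f(x) = (Σᵢ xᵢ)² / (n · Σᵢ xᵢ²); the throughput numbers below are made up for illustration:

```python
def fairness_index(allocations):
    """Jain's fairness index: 1.0 for equal shares; it approaches
    1/n (not exactly 0) when a single flow takes everything."""
    n = len(allocations)
    total = sum(allocations)
    return total * total / (n * sum(x * x for x in allocations))

print(fairness_index([500, 500, 500]))            # → 1.0
print(round(fairness_index([900, 300, 300]), 3))  # → 0.758
```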
We used ANOVA to test the hypothesis that PRTP-ECN shared bandwidth in a fair manner with TCP. More precisely, we tested the hypothesis that the average fairness indexes
calculated in the PRTP-ECN simulations did not differ significantly from those calculated
in the TCP simulations. As before, the untransformed effects model was found inadequate.
The positive residuals especially exhibited a clear trend toward a decrease with an increased
fairness index. This is a direct consequence of the fairness index having an upper bound of 1.
The spread of residuals has, by definition, an upper bound given by the expression: 1 - fitted
value. As a consequence, the positive residuals must decrease with increased fitted values.
Since the fairness index shows a behavior similar to a proportion, we used an Omega transformation [11]. More specifically, the following transformation was used: Y' = 10 \log_{10}\!\left(\frac{Y}{1-Y}\right). The result of this transformation was quite satisfactory. As follows
from the normal probability plot in Figure 6(a), the distribution of the residuals was very
similar to a normal distribution but differed in that it was slightly skewed (a somewhat longer
upper tail than lower tail) and showed more peaks. As follows from Figure 6(b), the variance
of the residuals was still not constant, an observation confirmed by the Levene test: F > 37.597, P < 0.01. More importantly, however, Figure 6(b) shows that there was no obvious correlation between the residuals and the fitted values.

Figure 6: Verification of the transformed effects model for average fairness index.
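The Omega transformation applied to the fairness index, Y′ = 10 log₁₀(Y/(1−Y)), maps a proportion-like value in (0, 1) onto the whole real line, which is what makes the bounded index amenable to ANOVA; a small sketch with its inverse:

```python
import math

def omega(y):
    """Omega transform (in dB) of a proportion-like y in (0, 1)."""
    return 10.0 * math.log10(y / (1.0 - y))

def omega_inv(w):
    """Inverse transform: recover y from w = omega(y)."""
    r = 10.0 ** (w / 10.0)
    return r / (1.0 + r)

print(omega(0.5))                        # → 0.0
print(round(omega_inv(omega(0.94)), 6))  # → 0.94
```

Note how the transform stretches the crowded region near 1: fairness indexes 0.9 and 0.99 are 0.09 apart untransformed but about 10 dB apart after the transform.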
The ANOVA test suggested that TCP and PRTP-ECN were not equally fair. Furthermore,
the Tukey test (α = 0.01) indicated that TCP was more fair than either of the PRTP-ECN
configurations and that the fairness index was roughly inversely proportional to the allowable
steady-state packet loss frequency. The difference in fairness between TCP and PRTP-ECN
was marginal, however. As shown in Table 3, the fairness index of the PRTP-ECN configurations was above 94% at all traffic loads except the highest one. At the highest traffic load,
those PRTP-ECN configurations having an allowable steady-state packet loss frequency of
11% or more attained a fairness index of less than 90%. It should be noted, however, that
TCP had a fairness index of only 92% at the highest traffic load and that PRTP-ECN always
had a fairness index well above 80%. Furthermore, at the two highest traffic loads, there
were PRTP-ECN configurations that had an even better average fairness index than TCP.
A problem with the fairness index is that it assumes that all flows passing through a particular link are able to utilize bandwidth as it becomes available. This is not always true, however. In particular, it is not true in our simulation experiment. As we noted in Section 4.3,
when the link utilization approached the link capacity, the packet loss rate increased, which
led to timeouts and a reduced TCP sending rate and in turn to reduced link utilization.
We addressed the inability of the fairness index to differentiate between utilization of spare bandwidth and bandwidth conquered from contending flows by also considering TCP-friendliness, a criterion very closely connected to fairness. A flow is said to be TCP-friendly if its arrival rate does not exceed the arrival rate of a conformant TCP connection under the same circumstances [6]. In our simulation experiment, we tested whether PRTP-ECN is TCP-friendly using the TCP-friendliness test proposed by Floyd and Fall [6]. According to their test, a flow is TCP-friendly if the following inequality holds for its arrival
rate:
\lambda \leq \frac{1.5\,\sqrt{2/3}\; B}{RTT\,\sqrt{p_{loss}}},    (3)
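Equation (3) turns into a one-line check. The symbols follow the definitions accompanying the equation (arrival rate λ in Bps, packet size B in bytes, minimum round-trip time RTT in seconds, packet loss frequency p_loss); the numbers in the example are assumptions, not measurements from the experiment:

```python
import math

def tcp_friendly_limit(packet_size_bytes, rtt_s, p_loss):
    """Maximum arrival rate (bytes/s) of a TCP-friendly flow, Eq. (3)."""
    return (1.5 * math.sqrt(2.0 / 3.0) * packet_size_bytes
            / (rtt_s * math.sqrt(p_loss)))

def is_tcp_friendly(rate_bytes_per_s, packet_size_bytes, rtt_s, p_loss):
    return rate_bytes_per_s <= tcp_friendly_limit(
        packet_size_bytes, rtt_s, p_loss)

# Assumed example: 1000-byte packets, 100 ms RTT, 2% loss gives a
# limit of roughly 86.6 kB/s, so an 80 kB/s flow passes the test.
print(is_tcp_friendly(80_000, 1000, 0.1, 0.02))  # → True
```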
Table 3: 99% confidence intervals for mean average fairness index.

Protocol |     20%      |     60%      |     67%      |     80%      |     87%      |     93%      |     97%
TCP      | 0.99 ± 0.003 | 0.99 ± 0.003 | 1.00 ± 0.002 | 1.00 ± 0.001 | 1.00 ± 0.001 | 0.99 ± 0.004 | 0.92 ± 0.050
PRTP-2   | 0.98 ± 0.006 | 0.98 ± 0.006 | 0.99 ± 0.006 | 1.00 ± 0.001 | 1.00 ± 0.002 | 1.00 ± 0.003 | 0.95 ± 0.030
PRTP-3   | 0.98 ± 0.006 | 0.96 ± 0.009 | 0.98 ± 0.007 | 0.99 ± 0.002 | 1.00 ± 0.001 | 1.00 ± 0.003 | 0.95 ± 0.030
PRTP-5   | 0.98 ± 0.006 | 0.94 ± 0.008 | 0.95 ± 0.010 | 0.98 ± 0.008 | 0.99 ± 0.003 | 0.97 ± 0.030 | 0.93 ± 0.040
PRTP-8   | 0.98 ± 0.005 | 0.94 ± 0.008 | 0.94 ± 0.007 | 0.95 ± 0.008 | 0.96 ± 0.009 | 0.98 ± 0.010 | 0.91 ± 0.050
PRTP-11  | 0.98 ± 0.006 | 0.94 ± 0.011 | 0.94 ± 0.009 | 0.95 ± 0.008 | 0.94 ± 0.010 | 0.94 ± 0.030 | 0.88 ± 0.050
PRTP-14  | 0.98 ± 0.006 | 0.94 ± 0.012 | 0.94 ± 0.008 | 0.95 ± 0.008 | 0.94 ± 0.010 | 0.90 ± 0.040 | 0.86 ± 0.050
PRTP-20  | 0.98 ± 0.006 | 0.94 ± 0.009 | 0.94 ± 0.008 | 0.95 ± 0.007 | 0.94 ± 0.010 | 0.90 ± 0.040 | 0.83 ± 0.050
Table 4: Pass frequencies for the TCP-friendliness test.

Protocol | Pass Freq.
TCP      | 99.0%
PRTP-2   | 99.2%
PRTP-3   | 99.6%
PRTP-5   | 99.8%
PRTP-8   | 99.5%
PRTP-11  | 99.5%
PRTP-14  | 98.7%
PRTP-20  | 98.9%
where \lambda is the arrival rate of the flow in Bps, B denotes the packet size in bytes, RTT is the minimum round-trip time in seconds, and p_{loss} is the packet-loss frequency.
The results of the TCP-friendliness tests are given in Table 4. TCP was included in the
TCP-friendliness tests as a means to verify the TCP-friendliness calculations. As shown,
all PRTP-ECN configurations passed the TCP-friendliness test in almost all simulation runs.
The reason we did not have a 100% pass frequency was that the TCP-friendliness test
assumes a fairly constant round trip time [6], which was not completely fulfilled at higher
traffic loads. At higher traffic loads, the largest portion of the round trip time was queueing
delays, which exhibited non-negligible fluctuations. This is of course also the reason why
TCP failed the TCP-friendliness test in some simulation runs, which would have been impossible had all the assumptions of the test been fulfilled. Actually, as follows from Table 4, TCP
failed in more simulation runs than all but two PRTP-ECN configurations, PRTP-14 and
PRTP-20. This again suggests that PRTP-ECN indeed exhibited TCP-friendly behavior in
the simulation experiment.
5 Conclusions
To address the need for a TCP-friendly and fair transport protocol for multimedia applications, an extension to TCP, PRTP-ECN, is proposed. The performance of PRTP-ECN has been compared with TCP in an extensive simulation study designed as a fixed-factor factorial experiment. This paper gives a detailed description of this simulation experiment. The results of the experiment indicate that PRTP-ECN gives a reduced average interarrival jitter and an increased average throughput. At the same time, it exhibits TCP-friendly behavior and is reasonably fair against contending congestion-aware flows. In addition, PRTP-ECN seems to improve the link utilization. We also noted that almost the same results were obtained for average goodput as for average throughput, which suggests that the improvements in average throughput given by PRTP-ECN directly translate into improvements in average goodput.
References
[1] M. Allman, V. Paxson, and W. Stevens. TCP congestion control. RFC 2581, IETF,
April 1999.
[2] P. D. Amer, C. Chassot, T. Connolly, M. Diaz, and P. T. Conrad. Partial order transport
service for multimedia and other applications. ACM/IEEE Transactions on Networking,
2(5), October 1994.
[3] B. Dempsey. Retransmission-Based Error Control for Continuous Media Traffic in
Packet-Switched Networks. PhD thesis, University of Virginia, May 1994.
[4] B. Dempsey, T. Strayer, and A. Weaver. Adaptive error control for multimedia data
transfers. In International Workshop on Advanced Communications and Applications
for High-Speed Networks (IWACA), pages 279-289, Munich, Germany, March 1992.
[5] M. Diaz, A. Lopez, C. Chassot, and P. D. Amer. Partial order connections: A new
concept for high speed and multimedia services and protocols. Annals of Telecommunications, 49(5-6):270-281, 1994.
[6] S. Floyd and K. Fall. Promoting the use of end-to-end congestion control in the Internet.
ACM/IEEE Transactions on Networking, 7(4):458-472, August 1999.
[7] K-J Grinnemo and A. Brunstrom. Enhancing TCP for applications with soft real-time
constraints. In SPIE Multimedia Systems and Applications, pages 18-31, Denver, Colorado, USA, August 2001.
[8] K-J Grinnemo and A. Brunstrom. Evaluation of the QoS offered by PRTP-ECN, a TCP-compliant partially reliable transport protocol. In 9th International Workshop on Quality of Service (IWQoS), pages 217-231, Karlsruhe, Germany, June 2001.
[9] D. Hong, C. Albuquerque, C. Oliveira, and T. Suda. Evaluating the impact of emerging
streaming media applications on TCP/IP performance. IEEE Communications Magazine, 39(4):76-82, April 2001.
[10] S. Jacobs and A. Eleftheriadis. Streaming video using dynamic rate shaping and TCP
congestion control. Journal of Visual Communication and Image Representation, 9(3),
1998.
[11] R. Jain. The Art of Computer Systems Performance Analysis. John Wiley & Sons, Inc.,
April 1991.
[12] R. Jain, D. Chiu, and W. Hawe. A quantitative measure of fairness and discrimination
for resource allocation in shared computer systems. Technical Report DEC-TR-301,
Digital Equipment Corporation, September 1984.
[13] Microsoft Corp. Windows media homepage. http://www.microsoft.com/
windows/mediaplayer/default.asp.
106
[14] D. Montgomery. Design and Analysis of Experiments. John Wiley & Sons Inc., 2000.
[15] B. Mukherjee and T. Brecht. Time-lined TCP for the TCP-friendly delivery of streaming media. In IEEE International Conference on Network Protocols (ICNP), pages
165176, Osaka, Japan, November 2000.
[16] The network simulator ns-2. http://www.isi.edu/nsnam/ns.
[17] C. Papadopoulos and G. Parulkar. Retransmission-based error control for continuous
media applications. In 6th International Workshop on Network and Operating System
Support for Digital Audio and Video (NOSSDAV), pages 512, Zushi, Japan, April
1996.
[18] K. Ramakrishnan and S. Floyd. A proposal to add explicit congestion notification
(ECN) to IP. RFC 2481, IETF, January 1999.
[19] RealNetworks Inc. RealPlayer. http://www.real.com.
[20] R. Stewart, Q. Xie, K. Morneault, C. Sharp, H. Schwarzbauer, T. Taylor, I. Rytina,
M. Kalla, L. Zhang, and V. Paxson. Stream control transmission protocol. RFC 2960,
IETF, October 2000.
[21] Q. Xie, R. Stewart, C. Sharp, and I. Rytina. SCTP unreliable data mode extension.
Internet draft, IETF, April 2001. Work in Progress.
Paper V
1 Introduction
Over the last decade, we have seen an explosive growth of multimedia communications and applications [39]. A tremendous amount of traffic on today's networks consists not only of text but also of images, video, audio, and other continuous data streams. The exponential increase in the number of web servers and image-intensive web pages, combined with applications such as video broadcasting, multimedia conferencing, and distance learning, imposes new requirements on existing transport services. Many of these applications are sensitive to delay and interarrival jitter but can accommodate a limited amount of data loss.
The two predominant transport protocols in the Internet today, UDP [30] and TCP [31], fail to provide an adequate transport service to multimedia applications and other applications with soft real-time constraints. TCP provides an application with a completely reliable transport service, while UDP makes no provisions whatsoever for improving the IP service level. Furthermore, UDP has no built-in congestion control mechanism. It is widely believed [10, 16, 34] that congestion control mechanisms are critical to the stable functioning of the Internet. At present, the vast majority (90-95%) of Internet traffic uses the TCP protocol [18]. However, due to the growing popularity of streaming media applications, and because standard TCP is not suitable for the delivery of time-sensitive data, increasing numbers of applications are being implemented using UDP and other congestion-unaware transport protocols. The widespread use of protocols that do not implement congestion control or avoidance mechanisms may result in a congestive collapse of the Internet [16] similar to the one that occurred in October 1986 [21]. Taken together, this clearly demonstrates the need for further research in transport protocols for soft real-time applications.
This paper describes an extension to TCP called PRTP-ECN. PRTP-ECN aims at making TCP better suited for applications with soft real-time constraints, e.g., best-effort multimedia applications. It accomplishes this by trading reliability for reduced delay and interarrival jitter, and improved throughput. More precisely, PRTP-ECN converts TCP from a completely reliable transport protocol to a partially reliable one, i.e., a transport protocol accepting a prescribed packet loss rate. Implementing PRTP-ECN only involves modifying TCP's retransmission scheme; the rest of TCP is left unaffected. This makes PRTP-ECN compatible with standard TCP implementations and enables gradual deployment. It also follows that PRTP-ECN reacts to congestion in a way similar to TCP, which makes it TCP-friendly and reasonably fair.
The remainder of the paper is organized as follows. Section 2 reviews some related work. This is followed by a description of the PRTP-ECN retransmission scheme in Section 3. Section 4 presents a simple theoretical analysis of the PRTP-ECN reliability scheme, in which the packet loss behavior of PRTP-ECN both at startup and in steady state is studied. Furthermore, we investigate how PRTP-ECN reacts to packet loss bursts. In Section 5, we briefly describe a simulation study in which the interarrival jitter, throughput, and TCP-friendliness of PRTP-ECN were evaluated against standard TCP. The paper is concluded in Section 6 with a summary of our major findings.
2 Related Work
The PECC protocol [13, 14] provides a transport service for applications that desire to trade reliability for latency. Developed as an enhancement to XTP (the Xpress Transport Protocol) [40], PECC modifies the retransmission algorithm of XTP to provide a connection-oriented service under which a lost packet is retransmitted only when doing so will not add delay to the data delivery. Early work on partially reliable transport protocols for multimedia communication was also conducted by Papadopoulos and Parulkar [28], who suggested an ARQ scheme involving, among other things, selective repeat and retransmission expiration.
Extensive work on partial reliability in connection with multimedia transfer has been
done at LAAS-CNRS [15, 36] and at the University of Delaware [4, 5]. Their work resulted
in the proposal of a new transport protocol, POC (Partial Order Connection) [2, 3, 12]. The
POC approach toward realizing a partially reliable service combines a partitioning of the
media stream into objects with the notion of reliability classes. An application designates
individual objects as needing different levels of reliability, i.e., reliability classes are assigned
at the object level. By introducing the object concept and letting applications specify their
reliability requirements on a per-object basis, POC offers a very flexible transport service.
More recent work on partially reliable transport protocols has been carried out by Piecuch et al. [29]. They developed an application-level protocol, SRP (Selective Retransmission Protocol), which works on top of UDP. SRP implements two retransmission decision algorithms: equal loss latency and optimum quality. The retransmission algorithms differ in the way quality is measured. However, both algorithms base their retransmission decisions on the packet loss frequency and the latency.
In addition to general partially reliable protocols, a large number of transport protocols targeting specific applications have been proposed. For example, Rhee [35] presents a retransmission-based error control technique for interactive video, and Li et al. [23] propose a novel scheme for the distribution of MPEG-coded video over a best-effort network.
In contrast to PRTP-ECN, a large number of the proposed partially reliable transport protocols do not implement any congestion control at all, or employ one that does not interact
fairly with TCP. In addition, many of the proposed transport protocols are not able to interwork with the existing Internet infrastructure or require radical changes to extant transport
protocols. PRTP-ECN, on the other hand, entails only modifying the retransmission scheme
of TCP and is able to interact with standard TCP implementations.
A key feature of PRTP-ECN is its TCP-friendly behavior. In view of the negative effects
that TCP-unfriendly flows have on the stability and performance of the Internet, a large body
of work has accumulated describing various congestion control mechanisms for multimedia
applications. A majority of these mechanisms are either window based, as is TCP, or rate
based.
Examples of window-based congestion control schemes are TLTCP (Time-lined TCP) [24, 25] and the binomial congestion control algorithms proposed by Bansal and Balakrishnan [6, 7]. TLTCP works in the same way as TCP except that it imposes deadlines on outstanding data. It sends data in a similar fashion to TCP until the deadline for a section of data has passed. Upon expiration of a deadline, obsolete data are discarded and the sending window is moved forward to enable new data to be sent. The binomial congestion control algorithms suggested by Bansal and Balakrishnan may be seen as a generalization of
The PRTP-ECN extension to TCP only involves changing the retransmission decision algorithm of TCP, i.e., PRTP-ECN retains most of the characteristics and mechanisms of TCP. In particular, PRTP-ECN does not alter the data transfer, flow control, multiplexing, or connection characteristics of TCP in any way. This enables PRTP-ECN to transparently interwork with existing TCP implementations. The modifications to TCP imposed by PRTP-ECN are isolated to the receiver side; no changes are required at the sender.
The PRTP-ECN retransmission decision algorithm is parameterized. The application atop PRTP-ECN explicitly prescribes a minimum acceptable reliability level by setting the parameters of the retransmission algorithm. Implicitly, the parameters govern the trade-off between delay jitter, throughput, and goodput. By relaxing the reliability, the application obtains reduced interarrival jitter and improved throughput and goodput.
Figure 1 illustrates how PRTP-ECN works.

[Figure 1. Message sequence between a standard TCP sender and a PRTP-ECN receiver: after the three-way handshake with ECN negotiation, in-sequence data packets are acknowledged as usual; when a packet loss is detected through an out-of-sequence packet, the receiver evaluates crl ≥ rrl and, if the test succeeds, sends an acknowledgement with the ECN-Echo flag set, upon which the sender takes a congestion action and sets the CWR flag in the next data packet.]

As long as no packets are lost, PRTP-ECN behaves in the same way as standard TCP. When an out-of-sequence packet is received, it is interpreted as an indication of packet loss, and PRTP-ECN computes the current reliability level:

crl(n) = \frac{\sum_{k=0}^{n-1} af^{k} p_{k} b_{k}}{\sum_{k=0}^{n-1} af^{k} b_{k}},    (1)
where n is the number of packets received, not counting the out-of-sequence packet, af is the weight, or aging factor, and b_k denotes the number of bytes contained in packet k. The variable p_k is an indicator variable that takes the value 1 if the kth packet was successfully received and 0 otherwise. (In Eq. 1, packets are numbered starting at the packet preceding the out-of-sequence packet and going backwards to the first packet sent.)
An application communicates its lower-bound reliability level through the aging factor and a second parameter called the required reliability level. The required reliability level, rrl, acts as a reference value. As long as crl(n) ≥ rrl, dropped packets need not be retransmitted and are therefore acknowledged. If an out-of-sequence packet is received and crl(n) is below rrl, PRTP-ECN acknowledges the last in-sequence packet and waits for a retransmission.
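The receiver-side decision described above can be sketched as follows. This is an illustration, not the PRTP-ECN implementation: the function names and the list-based history representation are hypothetical, with crl computed as in Eq. 1 over a history ordered most recent first.

```python
def crl(history, af):
    """Current reliability level: exponentially aged fraction of
    successfully received bytes.  history is ordered most recent first;
    each entry is a (received, num_bytes) pair."""
    num = sum(af ** k * b for k, (received, b) in enumerate(history) if received)
    den = sum(af ** k * b for k, (_, b) in enumerate(history))
    return num / den

def on_out_of_sequence(history, af, rrl):
    """Decision taken when an out-of-sequence packet signals a loss;
    history already includes the lost packet as its most recent entry."""
    if crl(history, af) >= rrl:
        # Loss accepted: acknowledge it, but set the ECN-Echo flag so
        # that the sender still performs its congestion action.
        return "ack-with-ecn-echo"
    # Reliability too low: acknowledge only the last in-sequence
    # packet and wait for a retransmission.
    return "wait-for-retransmission"
```

With af = 0.9 and a history of one lost and three received 1000-byte packets, crl ≈ 0.71; a configuration with rrl = 0.6 accepts the loss, whereas one with rrl = 0.9 waits for a retransmission.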
Acknowledging lost packets interferes with the congestion control in standard TCP, which
interprets lost packets as a signal of congestion. PRTP-ECN remedies this by employing explicit congestion notification [32]. More specifically, PRTP-ECN uses the TCP portions of
ECN (Explicit Congestion Notification).
Normally, ECN involves both the IP and TCP layers. Upon incipient congestion, a router
sets a flag, the Congestion Experienced bit (CE), in the IP header of arriving packets. The
receiving end-node propagates the congestion signal back to the sender by setting the so-called ECN-Echo flag in the acknowledgement of a CE packet. When the sender receives
an ECN-Echo packet, it takes the same actions as are taken when a congestion loss has
occurred, e.g., reduces its congestion window. To provide robustness against dropped acknowledgements with the ECN-Echo flag set, the receiver continues to set the ECN-Echo
flag in acknowledgements until it receives a confirmation of the congestion action at the
sender. When the sender has taken the congestion action, it notifies the receiver by setting a
flag, the Congestion Window Reduced bit (CWR), in the following data packet.
PRTP-ECN does not involve intervening routers. Instead, when an out-of-sequence packet is acknowledged, the ECN-Echo flag is set in the acknowledgement packet. When receiving the acknowledgement, the sender takes the appropriate congestion actions but will not retransmit any packet.
4 Theoretical Analysis

This section theoretically analyzes how the response of PRTP-ECN to packet losses depends on af and rrl by studying the expression for crl in Section 3 (Eq. 1). The analysis makes the simplifying assumption that all packets are of equal size, which reduces the expression for crl(n) to

crl(n) = \frac{\sum_{k=0}^{n-1} af^{k} p_{k}}{\sum_{k=0}^{n-1} af^{k}}.    (2)
In the remainder of this text, a PRTP-ECN protocol that has been assigned fixed values for
af and rrl is called a PRTP-ECN configuration.
In Subsection 4.1, we consider the startup behavior of PRTP-ECN. More precisely, we investigate how the number of packets that must be successfully received until a packet loss is allowed, i.e., until crl ≥ rrl, depends on af, rrl, and the initial value of crl. The steady-state behavior of PRTP-ECN is considered in Subsection 4.2, where explicit formulae are derived for the upper and lower bounds on the required distance between packet losses, i.e., the number of packets that must be successfully received between two consecutive packet losses. Finally, the response of PRTP-ECN to packet loss bursts is considered in Subsection 4.3.
[Figure 2. The startup scenario: the packet sequence with weights af^k and the initial value of crl, crl_init, representing the history preceding the first packet.]
Assume that crl has been initialized to crl_init and that the first γ packets are successfully received. A packet loss is then allowed as soon as crl, evaluated with the lost packet included, is at least rrl, i.e., as soon as

crl(γ + 1) = \frac{af + \cdots + af^{γ} + af^{γ+1}\,crl_{init}}{1 + af + \cdots + af^{γ} + af^{γ+1}} ≥ rrl.    (3)

By observing that both the numerator and the denominator of crl are geometric series, we obtain:

crl(γ + 1) = \frac{af\,\frac{1 - af^{γ}}{1 - af} + af^{γ+1}\,crl_{init}}{\frac{1 - af^{γ+2}}{1 - af}}.    (4)

Solving crl(γ + 1) ≥ rrl for γ gives

γ ≥ \frac{\ln\left(\frac{af - rrl}{1 - crl_{init} + af(crl_{init} - rrl)}\right)}{\ln af} - 1,    (5)

and hence the number of packets that must be successfully received before a packet loss is allowed:

γ = \left\lceil \frac{\ln\left(\frac{af - rrl}{1 - crl_{init} + af(crl_{init} - rrl)}\right)}{\ln af} \right\rceil - 1.    (6)
Figures 3, 4, and 5 show how the required number of successfully received packets varies with af, rrl, and the initial value of crl. They illustrate how the initial behavior of PRTP-ECN depends on the configuration and how it can be controlled through the initial value used for crl. More packets must be received for high values of rrl or for low values of af. As expected, using a large initial value of crl allows PRTP-ECN to accept a packet loss more quickly without requiring a retransmission.
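The startup behavior can also be checked numerically by iterating the definition of crl directly. The sketch below is an illustration under two assumptions: packets are of unit size, and the initial value of crl is carried as a single virtual history term; the function name is hypothetical.

```python
def packets_until_loss_allowed(af, rrl, crl_init, max_packets=10_000):
    """Smallest number of successfully received packets, gamma, after
    which a packet loss is accepted, i.e., after which crl evaluated
    with the lost packet included is still at least rrl."""
    for gamma in range(max_packets):
        # Weights af^1 .. af^gamma for the received packets; the lost
        # packet has weight af^0 but contributes 0 to the numerator;
        # af^(gamma+1) carries the initial value crl_init (an assumption
        # of this sketch).
        num = sum(af ** k for k in range(1, gamma + 1)) + af ** (gamma + 1) * crl_init
        den = sum(af ** k for k in range(0, gamma + 1)) + af ** (gamma + 1)
        if num / den >= rrl:
            return gamma
    return None
```

For af = 0.9 and rrl = 0.8, a receiver starting from crl_init = 1 must receive five packets before a loss is accepted; raising rrl to 0.85, or starting from crl_init = 0, raises the count to nine, matching the trends described above.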
[Figure 6. The steady-state scenario for the lower bound: all packets preceding the first packet loss are successfully received; the second loss occurs at distance l from the first.]
Consider the scenario in which all n packets preceding the first packet loss were successfully received, and let l denote the distance to the second packet loss. Letting n grow without bound, the current reliability level at the second loss is

crl(n + l) = \lim_{n \to \infty} \frac{\sum_{k=1}^{l-1} af^{k} + \sum_{k=l+1}^{n+l-1} af^{k}}{\sum_{k=0}^{n+l-1} af^{k}}.    (7)

We decompose the expression for crl(n + l) into two rational expressions: the first expression covering the packets from the first to the second packet loss, and the second expression covering all packets up to and including the first packet loss. This gives us

crl(n + l) = \lim_{n \to \infty} \left( \frac{\sum_{k=1}^{l-1} af^{k}}{\sum_{k=0}^{n+l-1} af^{k}} + \frac{\sum_{k=l+1}^{n+l-1} af^{k}}{\sum_{k=0}^{n+l-1} af^{k}} \right).    (8)
Next, we reduce (8) by using the formula for the sum of a geometric series:

crl(n + l) = \lim_{n \to \infty} \left( \frac{af\,\frac{1 - af^{l-1}}{1 - af}}{\frac{1 - af^{n+l}}{1 - af}} + \frac{af^{l+1}\,\frac{1 - af^{n-1}}{1 - af}}{\frac{1 - af^{n+l}}{1 - af}} \right).    (9)

Since 0 < af < 1, letting n approach infinity yields

crl(n + l) = af(1 - af^{l-1}) + af^{l+1}.    (10)

Requiring crl(n + l) ≥ rrl and solving for l gives the lower bound on the required distance between packet losses:

l = \left\lceil \frac{\ln\left(\frac{af - rrl}{1 - af}\right)}{\ln af} \right\rceil.    (11)

For the upper bound, we instead consider the scenario in which the first packet loss occurs as soon as it is allowed, i.e., when crl(n) equals rrl. Writing crl(n) as the ratio of its numerator, N(n), to its denominator, D(n),

crl(n) = \frac{N(n)}{D(n)}.    (12)

Since crl(n) = rrl, we obtain the following relationship between the numerator and the denominator of crl at the time of the first packet loss:

N(n) = rrl \cdot D(n).    (13)
[Figure 8. The steady-state scenario for the upper bound: the first packet loss occurs when crl(n) = rrl, and the second loss at distance u, where crl(n + u) = rrl.]
Using Eq. 13, the contribution of all packets up to and including the first packet loss can be represented by the term af^u rrl in the numerator and af^u in the denominator of crl(n + u):

crl(n + u) = \frac{af + \cdots + af^{u-1} + af^{u}\,rrl}{1 + \cdots + af^{u-1} + af^{u}}    (14)

= \frac{af\,\frac{1 - af^{u-1}}{1 - af} + af^{u}\,rrl}{\frac{1 - af^{u}}{1 - af} + af^{u}}.    (15)

However, crl(n + u) = rrl, which gives us the following equation for the upper bound of the required distance between packet losses:

\frac{af\,\frac{1 - af^{u-1}}{1 - af} + af^{u}\,rrl}{\frac{1 - af^{u}}{1 - af} + af^{u}} = rrl.    (16)

Solving Eq. 16 for u yields

u = \left\lceil \frac{\ln\left(\frac{rrl - af}{rrl - 1}\right)}{\ln af} \right\rceil.    (17)
The graph in Figure 9 shows how u depends on af and rrl. Again, u increases with increasing rrl values and with decreasing af values. We can also see in Figures 7 and 9 that, for some configurations, l and u coincide, meaning that both bounds are tight for these configurations.
Since the allowable packet loss frequencies are bounded by the reciprocals of the required distances between packet losses, we obtain the following formulae for the upper and lower bounds of the allowable packet loss frequency:

fu = \frac{1}{l},    (18)

fl = \frac{1}{u}.    (19)
Although fu gives some appreciation for the maximum allowable packet loss frequency of PRTP-ECN, it is an overestimate. A more usable metric for the maximum packet loss frequency of a PRTP-ECN configuration is obtained by considering the scenario in which packets are lost whenever PRTP-ECN allows it, i.e., whenever crl ≥ rrl. Simulations suggest that the allowable packet loss frequency in this scenario approaches a limit, fs, as the total number of sent packets, n, reaches infinity, or, more formally stated:

fs \stackrel{\mathrm{def}}{=} \lim_{n \to \infty} \frac{loss(\sigma_{n})}{n},    (20)

where \sigma_n denotes the packet sequence comprising all packets sent up to and including the nth packet, and loss(\sigma_n) is a function that returns the number of lost packets in \sigma_n. Henceforth, we will call the metric fs the allowable steady-state packet loss frequency. The graphs in Figures 10, 11, and 12 illustrate how fs depends on af and rrl and how it relates to fu and fl. We can see that the configuration dictates the allowable steady-state packet loss frequency obtained by the protocol and that fs falls neatly in between the upper and lower bounds.
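The scenario defining fs can be approximated by direct simulation. The sketch below assumes unit-size packets and maintains the numerator and denominator of crl (Eq. 2) incrementally; the function name is illustrative.

```python
def greedy_loss_frequency(af, rrl, n=200_000):
    """Estimate fs: drop a packet whenever PRTP-ECN would allow it
    (crl, with the candidate counted as lost, still at least rrl);
    otherwise deliver it.  Returns the fraction of lost packets."""
    num = den = 0.0        # numerator / denominator of crl, unit-size packets
    losses = 0
    for _ in range(n):
        t_num, t_den = af * num, af * den + 1.0   # age history, add candidate
        if t_num / t_den >= rrl:                  # loss allowed: take it
            num, den = t_num, t_den               # candidate lost (p = 0)
            losses += 1
        else:
            num, den = t_num + 1.0, t_den         # candidate received (p = 1)
    return losses / n

# For af = 0.9 and rrl = 0.85, the bounds of Section 4.2 give l = 7 and
# u = 11, so the estimate should stay close to the interval [1/11, 1/7].
fs = greedy_loss_frequency(0.9, 0.85)
```

For these parameters the estimate comes out near fl = 1/11, within the range predicted by the upper and lower bounds.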
[Figure 13. The packet loss burst scenario: a burst of b consecutive packet losses following a long sequence of successfully received packets.]

Consider the scenario in which a burst of b consecutive packets is lost after a long sequence of n − 1 successfully received packets. Numbering packets as in Eq. 1, the lost packets occupy positions 0 through b − 1, and

crl(n + b) = \frac{\sum_{k=b}^{n+b-2} af^{k}}{\sum_{k=0}^{n+b-2} af^{k}}.    (21)

If we recognize the numerator and denominator as being geometric series, replace the sums with the corresponding explicit formulae, and let n grow without bound, we obtain:

crl(n + b) = \lim_{n \to \infty} af^{b}\,\frac{(1 - af^{n-1})/(1 - af)}{(1 - af^{n+b-1})/(1 - af)}    (22)

= af^{b}.    (23)
The problem of finding the maximum allowable packet loss burst length can now be formulated as finding

\sup \{ b \in \mathbb{N} : af^{b} ≥ rrl \}.    (24)

Since, for af ∈ ]0, 1[, the function f(x) = af^{x} is a strictly monotonically decreasing function, we have

\sup \{ b \in \mathbb{N} : af^{b} ≥ rrl \} = \left\lfloor \sup \{ x \in \mathbb{R} : af^{x} ≥ rrl \} \right\rfloor.    (25)

This gives us the following explicit formula for the maximum allowable packet loss burst length:

b = \left\lfloor \frac{\ln rrl}{\ln af} \right\rfloor.    (26)
Figure 14 illustrates how b depends on af and rrl . Larger packet loss burst lengths are
accepted for small values of rrl and for high values of af .
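Eq. 26 is easy to sanity-check against its definition: b is the largest integer for which af^b ≥ rrl still holds. A minimal sketch with an illustrative function name:

```python
import math

def max_burst_length(af, rrl):
    """Maximum allowable packet loss burst length, Eq. 26."""
    return math.floor(math.log(rrl) / math.log(af))

# For af = 0.9 and rrl = 0.7: af^3 = 0.729 >= 0.7 > af^4 = 0.6561, so b = 3.
b = max_burst_length(0.9, 0.7)
```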
5 Simulation Experiment
In the previous section, we theoretically analyzed the packet loss behavior of the PRTP-ECN retransmission scheme. A simulation study was done to examine the implications of the packet loss behavior of PRTP-ECN for interarrival jitter, throughput, and TCP-friendliness [19]. This section gives a condensed description of this simulation experiment.
5.1 Methodology
We used the ns-2 [26] network simulator to conduct the experiment reported in this paper.
Figure 15 shows the network topology used in all simulations. The primary factors in the
experiment were the protocol used at node S4, the traffic load on the link between nodes R1
and R2, and the starting times of the flows emanating from nodes S1 and S2. The last factor
was found to have a limited impact on the result, however, and is thus not discussed further
here.
In our experiment, simulations were made in pairs. In the reference tests, TCP was used
at both nodes S1 and S4 while, in the evaluation tests, the TCP agent at node S4 was replaced
with PRTP-ECN. In the simulations, two FTP applications attached to nodes S1 and S2 sent
data at a rate of 10 Mbps to receivers at nodes S4 and S5. The initial congestion window
of the TCP and PRTP-ECN agents was initialized to two segments [1]. All other agent
parameters were assigned their default values.
Background traffic was generated by a UDP flow between nodes S3 and S6. The UDP flow was produced by a constant-bit-rate traffic generator residing at node S3. The departure times of the packets from the traffic generator were randomized, however, resulting in a variable-bit-rate flow between nodes S3 and S6. In all simulations, a maximum transfer unit
[Figure 15. Simulation topology: sources S1, S2, and S3 are connected to router R1, and sinks S4, S5, and S6 to router R2, over 10 Mbps, 0 ms links; R1 and R2 are connected by a 1.5 Mbps, 50 ms bottleneck link.]
5.2 Results
Several performance metrics were evaluated in the analysis of the simulation experiment. Here, we consider only the average interarrival jitter, the average throughput, and the TCP-friendliness of the FTP flow between nodes S1 and S4.
The graph in Figure 16 shows the result of the evaluation of the interarrival jitter. The
traffic load expressed in percent of the bandwidth of link R1-R2 is on the horizontal axis,
and the relative interarrival jitter is on the vertical axis. The relative interarrival jitter was
calculated as the ratio of the measured interarrival jitter to the interarrival jitter measured in the reference test. As the graph shows, the PRTP-ECN configurations significantly decreased the interarrival jitter as compared to TCP. At low traffic loads, the reduction was about 30%. For packet loss rates in the neighborhood of 20%, the reduction was in some cases as much as 68%.
Figure 17 presents the result of the throughput evaluation. As before, we have traffic load on the horizontal axis. On the vertical axis, we have relative throughput, i.e., throughput relative to the throughput measured in the reference test. While the increase in throughput was not as pronounced as the reduction in interarrival jitter, it was still substantial. For example, an application accepting a 20% packet loss rate could increase its throughput by as much as 48%. Applications tolerating a packet loss rate of only a few percent could also experience improvements in throughput of as much as 20%. A comparison of the throughputs for PRTP-ECN and TCP also suggests that PRTP-ECN is better at utilizing the bandwidth than TCP. This has not been statistically verified, however.
Recall from Section 5.1 that a traffic load corresponds approximately to a particular packet loss rate. Taking this into account in analyzing the results, it may be concluded that a PRTP-ECN configuration had its optimum in both relative interarrival jitter and relative throughput when the packet loss frequency was almost the same as fs. This is a direct consequence of the way we defined fs. At packet loss frequencies lower than fs, the gain in performance was limited by the fact that not very many retransmissions were necessary in the first place. When the packet loss frequency exceeded fs, the situation was the reverse.
In these cases, PRTP-ECN had to increase the number of retransmissions in order to uphold
the reliability level, which had a negative impact on both interarrival jitter and throughput.
It should be noted, however, that even in cases when PRTP-ECN had to increase the number
of retransmissions, it performed far fewer retransmissions than TCP.
The TCP-friendliness of the FTP flow was evaluated using the TCP-friendliness test proposed by Floyd and Fall [16]. According to their test, a flow is said to be TCP-friendly if the following inequality holds for its arrival rate, λ:

λ ≤ \frac{1.5 \sqrt{2/3}\; β}{RTT \sqrt{p_{l}}},    (27)

where λ is the arrival rate of the flow in Bps, β denotes the packet size in bytes, RTT is the minimum round trip time in seconds, and p_l is the packet loss frequency. In analyzing the results of the TCP-friendliness tests, we said that a PRTP-ECN configuration was TCP-friendly if more than 95% of the runs in a test passed the TCP-friendliness test. The reason for not requiring a 100% pass frequency was that not even TCP managed to be TCP-friendly in all runs. Our simulation experiment suggests that PRTP-ECN is indeed TCP-friendly. As a matter of fact, all PRTP-ECN configurations passed the TCP-friendliness test.
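Applying the test of Eq. 27 amounts to comparing a measured arrival rate against the right-hand side. In the sketch below the function and parameter names are illustrative:

```python
import math

def tcp_friendly_rate_limit(packet_size, rtt, loss):
    """Right-hand side of Eq. 27: maximum TCP-friendly arrival rate in
    bytes per second, given the packet size in bytes, the minimum round
    trip time in seconds, and the packet loss frequency."""
    return 1.5 * math.sqrt(2.0 / 3.0) * packet_size / (rtt * math.sqrt(loss))

def is_tcp_friendly(rate, packet_size, rtt, loss):
    """A flow passes the Floyd-Fall test if its arrival rate does not
    exceed the limit."""
    return rate <= tcp_friendly_rate_limit(packet_size, rtt, loss)
```

For 1500-byte packets, a 100 ms minimum round trip time, and a 1% packet loss frequency, the limit is roughly 184,000 Bps, so a flow arriving at 150,000 Bps passes the test while one arriving at 200,000 Bps does not.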
6 Conclusion
This paper proposed an extension to TCP called PRTP-ECN. PRTP-ECN aims at making
TCP more suitable for applications with soft real-time constraints, e.g., best effort multimedia applications. It accomplishes this by trading reliability for reduced delay and interarrival
jitter, and improved throughput. More specifically, PRTP-ECN introduces a novel retransmission scheme that makes TCP a partially reliable transport protocol, i.e., a transport protocol accepting a prescribed packet loss rate.
Analytic models were developed for the packet loss behavior of PRTP-ECN. They establish how the application-controlled parameters of the protocol influence the response of PRTP-ECN to packet loss. In order to better appreciate the implications of the packet loss behavior of PRTP-ECN for interarrival jitter, throughput, and TCP-friendliness, a simulation experiment was done. The experiment suggests that PRTP-ECN is able to offer a service with significantly reduced interarrival jitter and improved throughput as compared to TCP, while at the same time being TCP-friendly.
References
[1] M. Allman, V. Paxson, and W. Stevens. TCP congestion control. RFC 2581, IETF,
April 1999.
[2] P. D. Amer, C. Chassot, T. Connolly, and M. Diaz. Partial order quality of service to support multimedia connections: Reliable channels. In 2nd High Performance Distributed Computing Conference, Spokane, Washington, USA, July 1993.
[3] P. D. Amer, C. Chassot, T. Connolly, and M. Diaz. Partial order quality of service to support multimedia connections: Unreliable channels. In International Networking Conference (INET), San Francisco, California, USA, August 1993.
[4] P. D. Amer, C. Chassot, T. Connolly, M. Diaz, and P. T. Conrad. Partial order transport
service for multimedia and other applications. ACM/IEEE Transactions on Networking,
2(5), October 1994.
[5] P. D. Amer, P. T. Conrad, E. Golden, S. Iren, and A. Caro. Partially-ordered, partially-reliable transport service for multimedia applications. In Advanced Telecommunications/Information Distribution Research Program (ATIRP) Conference, pages 215–220, College Park, Maryland, USA, January 1997.
[6] D. Bansal. Congestion control for streaming video and audio applications. Master's thesis, Massachusetts Institute of Technology (MIT), January 2001.
[7] D. Bansal and H. Balakrishnan. TCP-friendly congestion control for real-time streaming applications. Technical Report MIT-LCS-TR-806, Massachusetts Institute of Technology (MIT), May 2000.
[8] J-C. Bolot. End-to-end packet delay and loss behavior in the Internet. ACM Computer Communication Review (SIGCOMM), 23(4):289–298, September 1993.
[9] J-C. Bolot and A. Vega-Garcia. Control mechanisms for packet audio in the Internet. In 15th Annual Joint Conference of the IEEE Computer and Communications Societies (INFOCOM), San Francisco, California, USA, March 1996.
[10] B. Braden, D. Clark, J. Crowcroft, B. Davie, S. Deering, D. Estrin, S. Floyd, V. Jacobson, G. Minshall, C. Partridge, L. Peterson, K. Ramakrishnan, S. Shenker, J. Wroclawski, and L. Zhang. Recommendations on queue management and congestion avoidance in the Internet. RFC 2309, IETF, April 1998.
[11] R. Caceres, P. Danzig, S. Jamin, and D. Mitzel. Characteristics of wide-area TCP/IP conversations. ACM Computer Communication Review, 21(4):101–112, September 1991.
[12] T. Connolly, P. D. Amer, and P. T. Conrad. An extension to TCP: Partial order service.
RFC 1693, IETF, November 1994.
[13] B. Dempsey. Retransmission-Based Error Control for Continuous Media Traffic in
Packet-Switched Networks. PhD thesis, University of Virginia, May 1994.
[14] B. J. Dempsey, J. Liebeherr, and A. C. Weaver. On retransmission-based error control for continuous media traffic in packet-switched networks. Computer Networks and ISDN Systems, 28:719–736, 1996.
[15] M. Diaz, A. Lopez, C. Chassot, and P. D. Amer. Partial order connections: A new concept for high speed and multimedia services and protocols. Annals of Telecommunications, 49(5–6):270–281, 1994.
[16] S. Floyd and K. Fall. Promoting the use of end-to-end congestion control in the Internet. ACM/IEEE Transactions on Networking, 7(4):458–472, August 1999.
[17] S. Floyd, M. Handley, J. Padhye, and J. Widmer. Equation-based congestion control for unicast applications. ACM Computer Communication Review (SIGCOMM), 30(4):43–56, August 2000.
[18] Cooperative Association for Internet Data Analysis (CAIDA). Traffic workload overview. http://www.caida.org/outreach/resources/learn/trafficworkload/tcpudp.xml, June 1999.
[19] K-J Grinnemo and A. Brunstrom. Evaluation of the QoS offered by PRTP-ECN, a TCP-compliant partially reliable transport protocol. In 9th International Workshop on Quality of Service (IWQoS), pages 217–231, Karlsruhe, Germany, June 2001.
[20] M. Handley, J. Padhye, and S. Floyd. TCP friendly rate control (TFRC): Protocol
specification. Internet draft, IETF, May 2001. Work in Progress.
[21] V. Jacobson and M. J. Karels. Congestion avoidance and control. ACM Computer Communication Review (SIGCOMM), 18(4):314–329, August 1988.
[22] S. Karandikar, S. Kalyanaraman, P. Bagal, and B. Packer. TCP rate control. ACM
Computer Communication Review, 30(1), January 2000.
[23] X. Li, S. Paul, P. Pancha, and M. Ammar. Layered video multicast with retransmission
(LVMR): Evaluation of error recovery schemes. In 6th International Workshop on
Network and Operating System Support for Digital Audio and Video (NOSSDAV), St.
Louis, Missouri, USA, May 1997.
[24] B. Mukherjee. Time-lined TCP: A transport protocol for delivery of streaming media over the Internet. Master's thesis, University of Waterloo, 2000.
[25] B. Mukherjee and T. Brecht. Time-lined TCP for the TCP-friendly delivery of streaming media. In IEEE International Conference on Network Protocols (ICNP), pages 165–176, Osaka, Japan, November 2000.
[26] The network simulator ns-2. http://www.isi.edu/nsnam/ns.
[27] J. Padhye, V. Firoiu, D. Towsley, and J. Kurose. Modeling TCP throughput: A simple model and its empirical validation. ACM Computer Communication Review (SIGCOMM), 28(4):303–314, August 1998.
[28] C. Papadopoulos and G. Parulkar. Retransmission-based error control for continuous media applications. In 6th International Workshop on Network and Operating System Support for Digital Audio and Video (NOSSDAV), pages 5–12, Zushi, Japan, April 1996.
[29] M. Piecuch, K. French, G. Oprica, and M. Claypool. A selective retransmission protocol for multimedia on the Internet. In SPIE Multimedia Systems and Applications,
Boston, Massachusetts, USA, November 2000.
[30] J. Postel. User datagram protocol. RFC 768, IETF, August 1980.
[31] J. Postel. Transmission control protocol. RFC 793, IETF, September 1981.
[32] K. Ramakrishnan and S. Floyd. A proposal to add explicit congestion notification
(ECN) to IP. RFC 2481, IETF, January 1999.
[33] S. Ramesh and I. Rhee. Issues in model-based flow control. Technical Report TR-99-15, North Carolina State University, 1999.
[34] R. Rejaie, M. Handley, and D. Estrin. RAP: An end-to-end rate-based congestion control mechanism for realtime streams in the Internet. In 18th Annual Joint Conference of the IEEE Computer and Communications Societies (INFOCOM), New York, New York, USA, March 1999.
[35] I. Rhee. Error control techniques for interactive low-bit rate video transmission over the Internet. ACM Computer Communication Review (SIGCOMM), 28(4):290–301, September 1998.
[36] L. Rojas-Cardenas, L. Dairaine, P. Senac, and M. Diaz. An adaptive transport service
for multimedia streams. In IEEE International Conference on Multimedia Computing
and Systems (ICMCS), Florence, Italy, June 1999.
[37] H. Schulzrinne. RTP: A transport protocol for real-time applications. RFC 1889, IETF,
January 1996.
[38] D. Sisalem and H. Schulzrinne. The loss-delay based adjustment algorithm: A TCP-friendly adaptation scheme. In 8th International Workshop on Network and Operating System Support for Digital Audio and Video (NOSSDAV), pages 215–226, Cambridge, United Kingdom, July 1998.
[39] R. Steinmetz and K. Nahrstedt. Multimedia: Computing, Communications and Applications. Prentice Hall, 1995.
[40] W. Strayer, B. Dempsey, and A. Weaver. XTP: The Xpress Transfer Protocol. Addison-Wesley Publishing, July 1992.
[41] W-T Tan and A. Zakhor. Real-time Internet video using error resilient scalable compression and TCP-friendly transport protocol. IEEE Transactions on Multimedia, 1(2):172–186, June 1999.
[42] K. Thompson, G. Miller, and R. Wilder. Wide-area Internet traffic patterns and characteristics (extended version). IEEE Network, pages 10–23, November 1997.
Part II
Transport Service
for Telephony Signaling
Paper VI
1 Introduction
Few industries have experienced a more revolutionary change than the one which has shaken the telecommunications industry in the last fifteen years. In the beginning of the nineties, the telecommunication market basically comprised a number of national monopolies or national incumbent operators. Today, incumbency in the telecom market has come under siege as a result of country-by-country telecom liberalization, deregulation, privatization, and competition. This process spread rapidly from the U.S. Telecommunications Act of 1996, through telecom reforms in each of 27 European countries during the second half of the 1990s, to India's New Telecom Policy of 1999. Today, this process, and the industry re-alignment it is causing, is far from over.
The wireline landscape has changed dramatically over the past couple of years. A number of broadband access operators are competing for market share, and consumers are increasingly aware that they can make low-cost, or even free, calls to basically any destination in the world. This has positioned today's wireline operators at a crossroads. On the one hand, they need to decrease both capital and operational expenditures; on the other hand, they have a large installed base of legacy circuit-switched equipment that still generates the major part of their revenue.
The wireless landscape is also evolving. Although the wireless industry is still a large and dynamic industry that continues to enjoy significant growth worldwide, it needs sustained revenue growth and improved cost efficiency to protect margins. The wireless industry is today a mature industry that has been globally available for quite some time. Growth in subscribers, traffic, and, most importantly, revenues is by no means automatic. Entry costs for new users and tariffs must be continuously reduced to increase subscriber numbers and call minutes. Per-unit pricing for lucrative services such as voice and Short Message Service (SMS) is eroding sharply in most markets. Thus, there is a strong belief in the wireless industry that new services are needed to drive revenue growth. Further, due to the ever-increasing popularity of the Internet and Internet-based multimedia services, it is considered vital that future wireless networks seamlessly interwork with IP.
To address the challenges facing today's wireline and wireless industries, the so-called softswitch solution or architecture has evolved. The softswitch architecture provides a smooth first migration step from current circuit-switched fixed and cellular core networks to an all-IP, multi-service telecommunication network. Section 3 introduces the softswitch architecture, and discusses the incentives for both established incumbent operators and new competitive operators to embrace this architecture. As will become evident in Section 3, one of the key drivers for introducing the softswitch architecture is the promise of new revenue-generating applications and services. To this end, Section 4 surveys the application/service creation environments of the softswitch architecture.
At the heart of a telecommunication network is signaling: call signaling is paramount for managing call sessions, and bearer signaling for managing the actual media streams. Sections 5 and 6 discuss call and bearer signaling, respectively, in the softswitch architecture. Next, since legacy wireless and wireline circuit-switched core networks will most likely live on for the next decade or so, Section 7 examines the Internet Engineering Task Force (IETF) SIGnaling TRANsport (SIGTRAN) framework architecture for the transport of Signaling System No. 7 (SS7) signaling over IP. The report concludes in Section 8 with an outlook on the migration steps following the softswitch architecture. In particular, an overview of the 3rd Generation Partnership Project (3GPP) IP Multimedia Subsystem (IMS) is given.

For those readers who are less familiar with signaling in current fixed and cellular telecommunication networks, Section 2 provides a brief introduction to and summary of SS7, the most widely used signaling system in both the Public Switched Telephone Network (PSTN) and the Public Land Mobile Network (PLMN). The section also gives brief overviews of the architectures of current PSTN and PLMN networks, and describes how SS7 is integrated into these networks.
[Figure: Access signaling between subscribers and local exchanges (LE), and network signaling within the core network.]

[Figure: SS7 signaling modes: in associated mode, signaling follows the bearer path directly between exchanges; in quasi-associated and non-associated modes, signaling traffic is relayed via one or more Signaling Transfer Points.]
[Figure 3: An SS7 network with SSPs, STPs, and an SCP; bearer traffic flows between the SSPs, while signaling traffic is relayed via the STPs.]
hence are assigned an Originating Point Code (OPC) and a Destination Point Code (DPC).
Routing in SS7 is in part done on the basis of the DPC of a message.
To provide more bandwidth and/or redundancy, links are usually organized into groups
known as linksets. A linkset is a collection of links that share the same destination and are
for the most part established directly between SPs. When links are collected in linksets, the
total load of traffic is typically shared between active links. There can be up to 16 links in a
linkset, and a single SP may support a number of linksets between itself and other SPs.
When one SP is reachable from another SP, there is said to be a route between the two.
In other words, a route is the path that exists between any two SPs. The route may comprise
a single linkset or multiple linksets; the term simply refers to the existence of a network path
between two SPs. Where alternative routes exist between two SPs, they together constitute
a routeset. Figure 4 exemplifies the concepts of link, linkset, route, and routeset.
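The relationships above, i.e., links grouped into linksets, linksets composing routes, and alternative routes forming a routeset, can be sketched as a simple data model. The classes and the SLS-based load-sharing rule below are illustrative assumptions, not actual SS7 structures:

```python
from dataclasses import dataclass, field

@dataclass
class Linkset:
    """Up to 16 parallel links sharing the same adjacent destination SP."""
    destination: str                       # point code of the adjacent SP
    links: list = field(default_factory=list)

    def pick_link(self, sls: int):
        # Traffic is load-shared over the active links, here keyed by SLS.
        active = [link for link in self.links if link["active"]]
        return active[sls % len(active)] if active else None

@dataclass
class Route:
    """One path towards a (possibly non-adjacent) destination SP."""
    destination: str
    linksets: list = field(default_factory=list)  # traversed in order

@dataclass
class Routeset:
    """All alternative routes towards one destination."""
    destination: str
    routes: list = field(default_factory=list)

ls = Linkset("SP-B", links=[{"id": i, "active": True} for i in range(4)])
print(ls.pick_link(sls=5)["id"])   # SLS 5 maps to link 5 % 4 = 1
```

The SLS value thus spreads messages of one routeset evenly over the available links, while keeping messages with the same SLS on the same link (preserving their order).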
As illustrated in Figure 3, an SS7 network includes a number of different types of SPs.
In fact, there can be three different types of SPs in an SS7 network:
Service Switching Points (SSPs). SSPs are SS7-aware exchanges that originate, terminate, and, if integrated STPs (see below), forward calls. An SSP sends signaling messages to other SSPs to set up, manage, and release voice circuits required to complete a call. An SSP may also send queries to Service Control Points (SCPs), e.g., to
determine how to route a call or in connection with an IN service (see Section 2.6).
[Figure 4: Links, linksets, routes, and routesets in an SS7 network interconnecting SSPs, STPs, and an SCP.]
[Figure: The SS7 protocol stack between two exchanges: User Parts (UP) and SCCP provide message handling on top of the Network Service Part (NSP)/MTP, which provides message transfer and transmission.]
[Figure: The SS7 protocol stack in an exchange: TCAP over SCCP, and ISUP, both over MTP levels 1-3.]

[Figure: T1 and E1 framing formats: a T1 frame carries 24 voice circuits and a framing bit, while an E1 frame carries 30 voice circuits plus dedicated framing and signaling slots.]
[Figure 8: Routing label and other fields used by MTP-L3 for routing: the SIO, with its Service Indicator (SI), spare bits, and Network Indicator (NI), and the SIF, whose routing label comprises the DPC, OPC, and SLS, followed by the user data.]
the so-called Digital Signal (DS) service hierarchy. The basic unit of transmission on
a T1 trunk is 56 kbps and is designated DS-0A, and the basic transmission unit on an
E1 trunk is 64 kbps and is designated DS-0. A T1 trunk has a capacity of 24 DS-0As,
and an E1 trunk a capacity of 30 DS-0s.
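The trunk capacities above can be checked with a little arithmetic, using the standard T1/E1 figures:

```python
# Trunk capacities implied by the figures above (standard T1/E1 numbers).
SLOT_RATE = 64_000            # one timeslot (DS-0) in bit/s

t1_line_rate = 24 * SLOT_RATE + 8_000   # 24 slots + 8 kbit/s of framing bits
e1_line_rate = 32 * SLOT_RATE           # 32 slots, incl. framing + signaling

print(t1_line_rate)           # 1544000  (1.544 Mbit/s)
print(e1_line_rate)           # 2048000  (2.048 Mbit/s)

# Usable voice payload: 24 DS-0As on a T1 (56 kbit/s each, since robbed-bit
# signaling steals capacity), 30 DS-0s on an E1 (slot 0 carries framing and
# slot 16 carries signaling).
print(24 * 56_000)            # 1344000 bit/s of voice payload on a T1
print(30 * 64_000)            # 1920000 bit/s of voice payload on an E1
```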
MTP Level 2 (MTP-L2). MTP-L2, together with MTP-L1, provides for reliable signaling on a single signaling link between two adjacent SPs. Specifically, MTP-L2
incorporates such capabilities as message delimitation, link error detection, link error
correction, link error monitoring, and link flow control.
MTP Level 3 (MTP-L3). Basically, MTP-L3 extends the functionality of MTP-L2 to
signaling routes. The MTP-L3 functions can be divided into two basic categories: Signaling Message Handling (SMH) and Signaling Network Management (SNM). The
SMH functionality is performed on the basis of the routing label and the Service Information Octet (SIO) fields of an SS7 message (see Figure 8), and can further be divided into message discrimination, message distribution, and message routing.
Message discrimination is the task of determining whether an incoming signaling message is destined for the SP currently processing the message. It makes this determination using the DPC and Network Indicator (NI) fields of the message. When the
discrimination function has determined that a message is destined for the current SP,
it performs the message distribution function by examining the Service Indicator (SI)
field. The SI field indicates which MTP-L3 user (i.e., either SCCP or a UP protocol)
the message should be forwarded to for further processing.
Routing takes place when the current SP has determined that a received message is to
be sent to another SP. The selection of an outgoing link is done based on the values
of the DPC and the Signaling Link Selector (SLS) fields. Each SP that provides STP
functionality has a routing table that is continuously updated with link status information. By mapping the DPC and SLS values of the received message against this table,
a suitable outgoing link is obtained.
The purpose of the SNM functionality is to provide for signaling link management,
signaling route management, and signaling traffic management. Signaling link management entails the management of locally attached signaling links. In particular,
SNM includes link management capabilities for link activation, deactivation, restoration, and linkset activation. The signaling route management includes the functions
needed to distribute information to adjacent SPs about the status of signaling routes.
Finally, the signaling traffic management concerns the rerouting of signaling traffic
from failed routes. It also concerns route-level congestion control.
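The SMH steps described above, i.e., discrimination, distribution, and routing, can be sketched as follows. The message fields, point codes, and routing table below are simplified illustrations, not the actual MTP-L3 encodings:

```python
# Illustrative sketch of MTP-L3 signaling message handling; field layouts
# and table contents are invented for illustration.

OWN_PC, OWN_NI = 0x1234, 2           # this SP's point code / network indicator
USER_PARTS = {3: "SCCP", 5: "ISUP"}  # SI value -> MTP-L3 user part
ROUTING_TABLE = {                     # DPC -> candidate linksets
    0x2002: ["ls-A", "ls-B"],
}

def handle(msg):
    # 1. Discrimination: is the message for this SP at all?
    if msg["ni"] != OWN_NI:
        return ("discard", None)
    if msg["dpc"] == OWN_PC:
        # 2. Distribution: hand over to the user part named by the SI field.
        return ("deliver", USER_PARTS.get(msg["si"], "unknown"))
    # 3. Routing: pick an outgoing linkset from the DPC and SLS values.
    linksets = ROUTING_TABLE.get(msg["dpc"])
    if not linksets:
        return ("no-route", None)
    return ("forward", linksets[msg["sls"] % len(linksets)])

print(handle({"ni": 2, "dpc": 0x1234, "si": 5, "sls": 0}))  # ('deliver', 'ISUP')
print(handle({"ni": 2, "dpc": 0x2002, "si": 3, "sls": 3}))  # ('forward', 'ls-B')
```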
MTP only supports circuit-related signaling, and SCCP was added to SS7 primarily to provide for non-circuit-related signaling. In particular, it appeared in the second version of SS7, in 1984, to support non-circuit-related signaling in connection with IN and cellular networks.
The second major contribution of SCCP is a new routing mechanism, Global Title Routing (GTR), that complements MTP-L3 with incremental routing. A Global Title (GT) is
an address which in itself does not contain the information necessary to perform routing
in an SS7 network. There are numerous examples of GTs: in fixed networks, toll-free (e.g.,
020-numbers) and premium-rate numbers are examples of GTs, and in cellular networks, the
Mobile Subscriber Integrated Services Digital Network (MSISDN) and International Mobile
Subscriber Identity (IMSI) are examples of GTs.
GTR frees originating SPs from the burden of having to know every potential destination
to which they might have to route a message. When GTR is used, an SP, e.g., an SSP, does
not have to determine the final destination of a message. Instead, it might query an STP that
does GT translation, a so-called SCCP Relay Point (SRP), about the next SP along the route
towards the destination. The next SP is either the final destination or yet another SRP. If the
next SP is an SRP, a new GTT (Global Title Translation) is made when the message arrives
at this SP. The routing continues in this incremental way until the final SP is reached.
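The incremental nature of GTR can be illustrated with a toy example in which each SRP's translation table yields only the next hop towards the destination. All names and tables below are invented for illustration:

```python
# Sketch of incremental global-title routing; the translation tables and
# node names are hypothetical.

# Each SRP only knows the *next* hop for a GT prefix, not the final SP.
GTT_TABLES = {
    "SRP-1": {"020": "SRP-2"},     # toll-free numbers: ask the next relay
    "SRP-2": {"020": "SCP-9"},     # this relay knows the serving SCP
}
FINAL_SPS = {"SCP-9"}

def route_by_gt(origin_srp, global_title):
    hop, path = origin_srp, []
    while hop not in FINAL_SPS:
        # A Global Title Translation at each SRP yields only the next SP.
        prefix = global_title[:3]
        hop = GTT_TABLES[hop][prefix]
        path.append(hop)
    return path

print(route_by_gt("SRP-1", "020123456"))   # ['SRP-2', 'SCP-9']
```

The originating SP thus only needs a translation entry for the GT prefix; the burden of knowing the final destination is pushed towards the SRPs along the route.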
As mentioned earlier, in contrast to the NSP, the UP is to a large extent application dependent. However, two UP protocols stand out as being more important than others: the
Integrated Services Digital Network (ISDN) UP protocol (ISUP) and the Transaction Capabilities Application Part (TCAP).
ISUP is the UP protocol of the SS7 stack primarily responsible for all circuit-related signaling. It conveys the signaling necessary to establish and maintain call connections. Each
exchange gets the call signaling information from the previous exchange along the voice
circuit as the connection is being established. Thus, ISUP messages are forwarded through
the SS7 network from SSP to SSP parallel to the voice circuit being established.
To illustrate the functionality of ISUP, Figure 9 shows the basic steps of a call setup
between a calling party, A, and a called party, B, in the PSTN. The steps are as follows:
[Figure 9: Basic ISUP call setup between calling party A (behind SSP-1) and called party B (behind SSP-2), via STP-1: A's SETUP triggers an IAM towards SSP-2, which sends a SETUP to B; B's ALERTING is reported back as an ACM, and B's CONNECT as an ANM, after which A is connected and the conversation starts.]
(7) The call setup is complete, and the conversation can commence.
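The hop-by-hop exchange of Figure 9 can be sketched as follows. The topology matches the example above, while the code structure is a simplified illustration of how the IAM travels forward and the ACM and ANM travel back:

```python
# Sketch of the hop-by-hop ISUP exchange in Figure 9; the message names
# (IAM, ACM, ANM) are real ISUP messages, the topology is the example's.

path = ["SSP-1", "STP-1", "SSP-2"]

def call_setup(path):
    log = []
    # The IAM is forwarded hop by hop as the voice circuit is reserved.
    for a, b in zip(path, path[1:]):
        log.append((a, b, "IAM"))
    # The ACM (B is being alerted) and ANM (B answered) travel backwards.
    for msg in ("ACM", "ANM"):
        rev = list(reversed(path))
        for a, b in zip(rev, rev[1:]):
            log.append((a, b, msg))
    return log

for hop in call_setup(path):
    print(hop)       # six (from, to, message) hops in total
```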
The second major UP protocol is TCAP. TCAP was primarily introduced in SS7 to provide a generic transaction protocol for IN services and cellular networks. For example,
an SSP uses TCAP to query an SCP when it has to determine the route for a toll-free or
premium-rate call. TCAP is also used in connection with a mobile user roaming into a new
Mobile Switching Center (MSC)/Visitor Location Register (VLR) service area.
TCAP is primarily designed to be used for querying and retrieval of information from
SCPs. Logically, the TCAP protocol comprises two subparts: a component subpart and a
transaction subpart. Operations and their results are transmitted between an SP and an
SCP as components. There are four types of components:
Invoke. The Invoke component is used to send an operation to an SCP.
Return Result. The result of an Invoke component is returned in the form of a
Return Result component.
Return Error. If an operation fails, a Return Error component is returned.
Reject. The Reject component reports the receipt and rejection of an incorrect component such as a badly formed Invoke.
The component subpart is responsible for accepting components from a TCAP user and
delivering those components, in order, to the recipient TCAP user. To be able to do so, the
component subpart employs the transaction subpart.
The transaction subpart packetizes components into messages, and sends the messages
in the form of transactions to the recipient TCAP user. There are five types of transaction-subpart messages: Begin, Continue, End, Abort, and Unidirectional. A Begin message starts
a transaction; one or several Continue messages are used following a Begin message; and the
End message terminates a successful transaction. The Abort message is used to terminate
an unsuccessful transaction, i.e., a transaction in which an abnormal situation has occurred.
Unidirectional messages are used in transactions that contain only requests and no replies.
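A typical TCAP query, e.g., an SSP asking an SCP to translate a toll-free number, can be sketched using the component and transaction subparts above. The dictionary structures, transaction identifiers, and translation logic below are simplified assumptions, not real ASN.1 encodings:

```python
# Minimal sketch of a TCAP query through the component and transaction
# subparts; all structures are illustrative, not actual encodings.

def tcap_query(scp, operation, argument):
    # Component subpart: wrap the operation in an Invoke component.
    invoke = {"type": "Invoke", "invoke_id": 1,
              "operation": operation, "argument": argument}
    # Transaction subpart: a Begin message opens the transaction.
    reply = scp({"type": "Begin", "otid": 0x42, "components": [invoke]})
    # A successful transaction is terminated by an End message.
    assert reply["type"] == "End"
    return reply["components"][0]

def toll_free_scp(begin_msg):
    # A hypothetical SCP that translates a toll-free number into a
    # routable number and returns it in a Return Result component.
    number = begin_msg["components"][0]["argument"]
    result = {"type": "ReturnResult", "invoke_id": 1,
              "result": "+4654290000" if number.startswith("020") else None}
    return {"type": "End", "dtid": begin_msg["otid"], "components": [result]}

print(tcap_query(toll_free_scp, "translate", "020123456")["result"])
# prints +4654290000
```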
[Figure: A hierarchical national SS7 network: local exchanges (LE), regional (RTE), national (NTE), and international (ITE) transit exchanges, served by sectional (S-STP), regional (R-STP), national (N-STP), and international (I-STP) signaling transfer points and signaling end points (SEPs).]

[Figure: Signaling end points (SEPs) in a metropolitan area.]

[Figure: SS7 link types: A, B, C, D, E, and F links interconnecting SEPs and mated pairs of R-STPs and N-STPs at the regional and national levels.]
[Figure: The GSM network architecture: a Base Station Subsystem (BSS) of BTSs and BSCs, and a Network Switching Subsystem (NSS) with MSC/VLR, GMSC, HLR, AuC, and EIR, managed via an OMC/OSS and interfacing the PSTN, other PLMNs, and the Internet.]
roams into a new MSC service area, the VLR connected to that MSC will request data
about the subscription from the HLR of the phone. Later, if the phone makes a call,
the VLR will have the information needed for call setup without having to contact the
HLR.
Authentication Center (AuC). The AuC is a database that stores authentication and
encryption parameters for subscribers to enable subscriber verification, and to provide
[Figure: GSM signaling protocols: LAPDm on the radio interface, LAPD D-channel signaling between BTS and BSC, and SS7 signaling with BSSAP over SCCP/MTP between BSC and MSC, and MAP over TCAP/SCCP/MTP between MSC and HLR.]
[Figure: The UMTS network architecture: a Radio Access Network (RAN) of Node Bs and Radio Network Controllers (RNCs) connected to the same core-network nodes (MSC/VLR, GMSC, HLR, AuC, EIR) as in GSM.]
management information to the MSC from the BSC. In the remaining parts of the GSM architecture, the prevailing SS7 protocol is the Mobile Application Part (MAP) protocol. MAP
resides above TCAP. It is used to permit the network nodes of the NSS to communicate with
each other to provide services such as roaming, text messaging (i.e., SMS), and subscriber
authentication.
Over the past several years, the Universal Mobile Telecommunications System (UMTS)
has slowly begun to take market share from GSM. UMTS is actually not a new PLMN
[Figure: An SSP sending a query to an SCP and receiving a response.]
[Figure: The IN architecture: SSPs connected via STPs to SCPs, complemented by an Adjunct, an Intelligent Peripheral, and a Service Creation Environment (SCE).]
[Figure 18: The IN protocol stack: INAP over TCAP, SCCP, and MTP between an SSP and an SCP, with messages relayed via an STP.]
the traditional exchange, but enhanced to support IN processing. The SSP performs basic
call processing and provides trigger and event detection points for IN processing. The SCP,
Adjunct, and Intelligent Peripheral are all additional nodes that were added to support the IN
architecture:
SCP. The SCP stores service data and executes service logic for incoming messages.
The SCP acts on the information in a received message by invoking the appropriate
SLP, and retrieving the necessary data for service processing. It then responds with
instructions to the SSP about how to proceed with the call. The SCP can be specialized
for a particular type of service, or it can implement several types of services.
Adjunct. The Adjunct performs similar functions to an SCP but, contrary to an SCP,
an Adjunct is often co-located with the SSP.
Intelligent Peripheral. The Intelligent Peripheral provides specialized functions for
call processing including voice announcements, voice recognition, and digit collection.
Service Creation Environment (SCE). The SCE enables operators, service
providers, and third-party vendors to prototype, test, and deploy new applications and
services.
With respect to SS7, IN is implemented as UP protocols atop TCAP (see Figure 18).
Throughout Europe, the Intelligent Networking Application Part (INAP) is the prevailing
IN protocol. In brief, INAP is responsible for keeping track of the TCAP components exchanged between an SSP and an SCP. The INAP protocol ensures that the contents of the
IN operations sent in TCAP components follow a predefined syntax as regards permitted
parameters and their coding.
As mentioned in Section 1, both the wireline and wireless industries see the softswitch architecture as a key component of the next-generation telecommunication network. In fact,
several operators and vendors see the advent of the softswitch architecture as pivotal for
continued cost efficiency and revenue growth.
The term softswitch was coined by one of the founders of the Softswitch Consortium,
Ike Elliott, in the late nineties. Although frequently used, the term is quite elusive. In fact, to
our knowledge, there exists no precise definition of the term. Still, there seems to be a fairly
broad consensus on the principal components of the softswitch architecture and the salient
functions of a softswitch.
The principal idea behind the softswitch architecture is to separate the control and media functions of a traditional telecom switch. In particular, as illustrated in Figure 19, the
softswitch architecture prescribes a separation and/or distribution of the application, call
control, and media transport functions of legacy telecom switches. That is, the architecture decouples the underlying switching hardware from the control, service, and application
functions.
[Figure 19: The softswitch solution separates applications & services from transport.]

[Figure 20: The principal components of the softswitch architecture: softswitch, Feature/Application Server, Signaling Gateway, and Media Gateway.]
Figure 20 illustrates the distributed architecture that is generally agreed upon as the
softswitch architecture. The architecture is bearer independent, and could be applied to both
packet- and circuit-switched networks. However, given that the next-generation telecommunication networks are assumed to be packet switched, the softswitch architecture is almost
exclusively applied to packet-switched networks. In fact, in the contexts in which it is used, it is often
tacitly assumed that the underlying network is either IP-based or based on Asynchronous
Transfer Mode (ATM).
As shown in Figure 20, the principal components of the softswitch architecture are
softswitch, Media Gateway (MG), Signaling Gateway (SG), and Feature/Application Server
(AS). The softswitch constitutes the intelligence that coordinates all signaling such as
call-control signaling, operations and management signaling, and bearer signaling. The
name softswitch originates from the fact that the majority of signaling functionality in a
softswitch resides in software as compared to hardware in traditional telecom switches.
The primary functions typically found in a softswitch are depicted in Figure 21. The
Call Agent Function (CA-F) administers the call-control signaling and provides the call-state machine for end points. Its primary role is to provide the call logic, and in so doing
interact with CA-Fs in peer softswitches. It also acts as a proxy for the AS, and assists the
AS in providing services and applications to the end user. The Media Gateway Controller
Function (MGC-F) controls and monitors the MGs, i.e., is responsible for the bearer control.
Specifically, it controls the creation, modification, and deletion of media streams. If needed,
it also acts as a conduit for media parameter negotiation between other MGC-Fs and external
networks. A softswitch is often responsible for routing of signaling messages between peer
softswitches and non-softswitch networks such as PSTN and PLMN networks. In Figure 21,
the Router Function (R-F) embodies the softswitch routing functionality. Other functions
that are not shown in Figure 21 but still could be part of a softswitch include: Accounting
Function (A-F), Border Gateway Function (BG-F), and various proxies, e.g., for the Wireless
Application Protocol (WAP), Java APIs for Integrated Networks (JAIN), Parlay, and the Call
Processing Language (CPL).
The MG serves as a gateway between two separate networks, e.g., two packet-switched
networks under different administrative control, or two networks employing different bearer
technology such as IP to TDM, IP to ATM, or IP to 3G. Its primary role is to transform
media from one transmission format to another. For example, an MG may terminate voice
calls from a PSTN, compress and packetize voice data, and deliver compressed voice packets
to an IP network.
An SG has the same function as an MG but for control or signaling transport. It acts as a gateway for signaling between two Voice over IP (VoIP) networks, or between a VoIP and a
PSTN/PLMN network. Notably, an SS7 SG serves as a protocol mediator/translator between
an IP and a PSTN/PLMN network. For example, when a call originates in an IP network
that uses H.323 or the Session Initiation Protocol (SIP) (cf. Section 5) as signaling protocol,
and terminates in a PSTN/PLMN network, a translation from H.323/SIP to SS7 is made in
an SS7 SG.
The final component of the softswitch architecture is the AS. The AS accommodates the
service and feature applications made available to the customers of a service provider. Examples include call forwarding, conferencing, voice mail, and forward on busy. Some networks
enable inter-AS communication, which makes it possible to build complex, component-oriented applications.
It is important to understand that the softswitch architecture is a framework or logical
architecture which could be mapped to several different physical architectures. Particularly,
it could be mapped to both PSTN and PLMN networks. Figure 22 gives two examples of
how the softswitch architecture could be applied to a PSTN network.
Figure 22(a) shows a centralized physical architecture. The softswitch in this example
provides for both call and bearer control as well as basic application functions such as call
waiting and calling line identity. The MG and SG have the same roles as their logical counterparts in Figure 20 and serve as interfaces towards a PSTN.
Contrary to Figure 22(a), Figure 22(b) exemplifies a highly distributed architecture. In fact, there is no softswitch as such in this architecture. Instead, the functions of the softswitch have been spread over the Mediation Gateway and Feature Server. The Mediation Gateway functions as an MG, an SG, and a softswitch in that it provides media conversion, signaling conversion, call control, and basic routing functions. Service-level routing is provided by the Feature Server, which also accommodates certain service
logic. To offload the Mediation Gateway, a Media Server has been introduced. The Media
Server provides for specialized media resources such as Interactive Voice Response (IVR),
conferencing, fax, announcements, and speech recognition. It also handles the bearer interface to the Mediation Gateway.
In a PLMN network, the introduction of the softswitch architecture typically partitions
the MSC into two kinds of nodes: an MSC Server (MSC-S) and one or several Mobile Media
Gateways (M-MGs). As illustrated in Figure 23, the MSC-S acts as a softswitch, and thus
comprises the call- and bearer-control signaling of the legacy MSC. It interfaces with other
PLMN/PSTN networks via SGs. The M-MGs are controlled by the MSC-S, and, apart from
acting as MGs, the M-MGs comprise the switching functionality of the MSC.
[Figure 22: Two physical mappings of the softswitch architecture onto a PSTN: (a) a centralized architecture in which a softswitch (CA-F, MGC-F, R-F) together with an AS, an SG, and an MG connects IP phones and VoIP networks to the PSTN; (b) a distributed architecture in which a Mediation Gateway (MG-F, MGC-F, CA-F), a Feature Server (CA-F, R-F, AS-F), and a Media Server provide the corresponding functions.]
[Figure 23: The softswitch architecture in a PLMN: an MSC Server (CA-F, MGC-F, R-F) controls Mobile Media Gateways (M-MGs, each with MG-F and R-F), with SGs interfacing the PSTN and other PLMNs, and RNCs and Node Bs on the radio side.]
Considering the fairly large changes required to transform legacy circuit-switched wireline and wireless networks into IP-based softswitch networks, one might wonder what the
incentives are. Unfortunately, this question is more easily asked than answered. In fact, the incentives are plentiful and differ among the actors involved. Still, perhaps the most important incentive for introducing the softswitch architecture is that it changes the telecom market from being vertical to horizontal. This opens up opportunities for third-party developers, and will eventually bring the costs of telecom equipment down. The lower equipment costs will, in turn, lower the initial costs for market entrants, and thus spur the development of a truly competitive telecom market. Today, both the EU and U.S. wireline and wireless markets are fairly oligopoly-like, with a few operators dominating their respective markets, and this could change with the inception of the softswitch architecture.
Another compelling incentive for the softswitch architecture is that it enables the centralization of signaling equipment to a few populated areas. Less populated, rural areas can be controlled remotely. In fact, the softswitch architecture paves the way for virtual providers that, in the extreme case, only own the signaling equipment and lease the trunk lines from another telecom or cable operator.
Still another virtue of the softswitch architecture is its scalability. For example, the Cisco
BTS 10200 softswitch [25] can scale from a single CPU up to 12 CPUs and then offer support
to millions of subscribers. This should be compared with an Ericsson Telecommunication
[Figure 24: The IN call model: call processing in an SSP with IN progresses through Points In Call (PICs), e.g., digit collection, with Detection Points (DPs), e.g., digits collected, that hand control to the IN service logic in the SCP.]
In today's telecommunication networks, applications and services are implemented as Intelligent Network (IN) services (cf. Section 2.6). Compared with its predecessors, and the way services were implemented in these systems, today's IN-based telecommunication networks
represent a major leap forward. Notably, the IN concept introduced a generic representation
of SSP call-processing activities (see Figure 24). During call processing in a switch, a call
progresses through various states such as digit collection, translation, and routing. These
states existed before the inception of IN; however, before IN there was no agreement among vendors on exactly what constituted each state, and what transitional events marked the entry and exit of each state. IN defines a Basic Call State Model (BCSM), which unambiguously identifies the various states of call processing and the points during call processing where IN can occur, known as Points In Call (PICs) and Detection Points (DPs), respectively.
Although IN meant a major improvement compared with prior service solutions, and
although substantial investments have been made in writing IN applications and services
for the current SS7-based telecommunication networks, the promise of a thriving, competitive, and versatile telecom-service marketplace has yet to materialize. Vendors have invested
in proprietary service development and execution environments, which has effectively hindered a market for third-party application providers. Proprietary service platforms have also made the development of new services unnecessarily expensive, since the service development costs are not shared among operators. Furthermore, applications and services are typically developed in low-level, platform-dependent programming languages such as C.
This not only inhibits cross-platform development, but also requires developers with a high
level of proficiency in specific telecom platforms.
As mentioned in Section 3, a key incentive driving the development of the next-generation
telecommunication network and the softswitch architecture is to fulfill the promise of IN with
a viable service market. Particularly, operators want to build a service market that makes it
possible for them to compensate for shrinking margins on voice calls. To this end, the service development environments of the softswitch architecture comprise declarative, platform-independent programming languages, and high-level, imperative programming languages
such as Java and C++. The declarative languages are parsed and executed by open, standardized interpreters, and the imperative languages are executed on platforms with open, standardized Application Programming Interfaces (APIs). Thus, in both types of development
environments, the developers are shielded from most of the low-level signaling intricacies.
Figure 25: The creation, deployment, and execution of an XML-based application programming language.
operator are called server-side applications, and the majority of programs belong to this category. However, with the advent of more powerful terminals and end systems, a number of programming languages for client-side application development have been proposed, e.g., CPL [63] and LESS [88]. The remainder of this section briefly surveys some
of the more interesting server- and client-side application programming languages that have
been proposed in recent years.
<?xml version="1.0"?>
<vxml version="2.0"
      xmlns="http://www.w3.org/2001/vxml"
      xml:lang="en-US">
  <form>
    <field name="selection">
      <prompt>
        This is the ACME Weather Service.
        Please choose Today, Tomorrow, or Week.
      </prompt>
      <grammar type="application/x-nuance-gsl">
        [ today tomorrow week ]
      </grammar>
    </field>
    <block>
      <submit next="weather_service.jsp"/>
    </block>
  </form>
</vxml>
Figure 26: VoiceXML excerpt.
[Figure 27: Execution of a VoiceXML application: (1) a call is routed to an IVR server hosting a VoiceXML client; (2) the client fetches a document from a Web server over HTTP; (3) the Web server, backed by the weather forecast service, returns VoiceXML dialogs; (4) the client plays voice prompts and collects the caller's input.]
features of its own and make frequent use of Java in terms of JSPs and JavaScript.
Figure 27 illustrates the execution of a typical VoiceXML application, e.g., our previous
weather forecasting service. A caller dials the phone number of the service. The call is routed
to an IVR server with a VoiceXML client (1). The IVR server translates the phone number to
a Uniform Resource Locator (URL), and the VoiceXML client places an HyperText Transfer
Protocol (HTTP) request to the specified URL (2). The Web server at the URL responds
with a VoiceXML document that contains one or several of the dialogs of the service (3).
Finally, the VoiceXML client interprets the fetched document, and interacts with the caller
by playing voice prompts and collecting input (4).
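Steps (1)-(3) above can be sketched in Java. The phone number, URL, and returned document below are illustrative stand-ins, and the HTTP fetch is stubbed out; a real IVR would issue an actual HTTP GET and hand the document to its VoiceXML interpreter (step (4)):

```java
import java.util.Map;

public class VoiceXmlClientSketch {
    // Step (2): map the dialed number to the URL of the service
    // (both values are invented for this example).
    static final Map<String, String> NUMBER_TO_URL = Map.of(
        "+15550100", "http://weather.example.com/weather_service.vxml");

    static String translate(String dialedNumber) {
        return NUMBER_TO_URL.get(dialedNumber);
    }

    // Step (3): fetch the VoiceXML document (stubbed; a real client
    // performs an HTTP GET against the translated URL).
    static String fetchDocument(String url) {
        return "<vxml version=\"2.0\">...</vxml>";
    }

    public static void main(String[] args) {
        String url = translate("+15550100");
        System.out.println("GET " + url);
        System.out.println(fetchDocument(url));
    }
}
```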
A VoiceXML application often requires resources from outside the Web
server hosting the VoiceXML documents. A VoiceXML document can access the Web,
acting as a sort of voice-controlled browser: it can send information to Web servers and
convey the replies to the caller. Access to the Web also enables simultaneous development of Web and telephony services; often it is enough to write a VoiceXML frontend to
make a Web service accessible from a telephone.
Although a flexible and powerful language for single-party telephony services, VoiceXML
lacks support for multi-party services. To address this, the Call Control eXtensible Markup Language (CCXML) [31] was designed by W3C. CCXML complements VoiceXML by providing an elaborate call state model; support for multiple instances of VoiceXML interpreters;
the ability to trap external, asynchronous events such as on- or off-hook events; and the
ability to place outgoing calls.
In the same way as VoiceXML, a CCXML application consists of a number of XML documents. However, a CCXML document does not describe user dialogs. Instead, it describes
the actions to be taken in response to call transitions and events. For example, a
CCXML document might realize a call screening application by running a "person unavailable" VoiceXML dialog when a caller whose phone number is on a screening list attempts to call
a certain person.
While CCXML was designed to complement VoiceXML, the two languages are separate.
In fact, CCXML could be used to add call-state control to an arbitrary dialog system provided
the dialog system complies with certain requirements of CCXML.
4.1.2 CPL
Both VoiceXML and CCXML are examples of flexible, expressive languages which lend
themselves well to use by operators and trusted third-party developers. However, due
to their flexibility and expressiveness, languages such as these also raise safety and security
concerns. It is very difficult for an operator who employs VoiceXML, CCXML, or similar
languages for third-party development to protect itself from invalid or ill-conceived programs, e.g., programs that reveal security-sensitive information, or that consume excessive
amounts of system resources. Thus, to address the need for a language suitable for semi- and untrusted third-party developers, Lennox et al. designed the Call Processing Language
(CPL) [63].
CPL is not tied to any particular signaling protocol or architecture; however, it is designed
on the basis of SIP (cf. Section 5). Contrary to languages such as VoiceXML, it is very
restrictive: It provides no way of writing loops or recursion, and has no ability to invoke
restrictive: It provides no way of writing loops or recursion, and has no ability to invoke
external programs like JSPs or Web services. In fact, it is designed to prohibit any kind
of unsafe action, and a CPL program is always executed in a finite amount of time. To
ensure a bound on the program execution time, each action within a CPL program is always
time limited, and hence actions that interface with external resources, e.g., databases, have
timeouts.
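The bounded-execution principle, i.e., that every action interfacing with an external resource runs under a timeout, can be illustrated in Java. The sketch below uses a thread pool and a placeholder "lookup" action; it shows the idea only and is not how CPL interpreters are actually implemented:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class TimedAction {
    // Run an action but give up after the given number of milliseconds,
    // so the enclosing script is guaranteed to finish in finite time.
    static String runWithTimeout(Callable<String> action, long millis) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            return pool.submit(action).get(millis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            return "timeout";   // e.g., fall back to a default location
        } catch (Exception e) {
            return "failure";
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // A fast "database lookup" completes; a slow one is cut off.
        System.out.println(runWithTimeout(() -> "sip:kjgr@office.acme.com", 100));
        System.out.println(runWithTimeout(() -> { Thread.sleep(5000); return "late"; }, 100));
    }
}
```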
CPL is designed to be used for both client-side (e.g., phones) and server-side applications
and services. Like VoiceXML and CCXML, CPL is an XML application, and its syntax is
specified in a Document Type Definition (DTD). Semantically, a CPL program constitutes a
directed acyclic, i.e., loop-free, graph of call processing actions. The call processing actions
are, in turn, trees of language primitives or nodes. There are four principal classes of language primitives in CPL. First, there are the signaling actions, the primitive class that forms
the core of CPL. They control the broad behavior of the underlying signaling protocol. In
particular, they control such signaling actions as proxying, i.e., forwarding of a call to one or
several locations; redirection of calls; and responses to failures. Second, there are the switch
nodes, which correspond to the control or selection statements of ordinary programming languages, and which enable a CPL program to make decisions. Third, there are the location
nodes that specify the location for succeeding signaling actions. For example, a location
node could specify that a call should be proxied to a certain SIP address (see the example in
Figure 28). Finally, there are the non-signaling actions that permit a CPL program to perform
operations which are not dependent on, or affected by, the underlying signaling protocol. For
example, CPL provides a mail node which makes it possible for a CPL script to notify a user
than incoming calls. In fact, LESS extends CPL with triggers for timers, user interactions,
program-controlled events, and instant messaging.
4.1.4 XTML
The eXtensible Telephony Markup Language (XTML) [22] is a feature-rich and flexible
XML-based application programming language. It is a proprietary language of Pactolus
Communications Software Inc., and is the native language of their RapidFLEX software
architecture. Although the RapidFLEX platform targets SIP, XTML is oblivious to the call
signaling protocol used. In fact, it could equally well be used together with H.323.
Basically, an XTML application consists of a set of event handlers which respond to
given events. The events can be either signaling protocol-dependent, e.g., the arrival
of a SIP INVITE message, or protocol-independent, e.g., a timer that expires. The event
handlers are, in turn, made up of chains of actions which are linked together to reflect the
application call-flow. Compared with the previously described languages, e.g., VoiceXML
and CPL, XTML is designed to be easily extensible. The extensions can be written in both
XTML and general programming languages such as Java and C++.
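The XTML application model described above, event handlers made up of chains of actions keyed by events, can be rendered in Java for illustration. XTML itself is XML, and the event names and actions below are invented:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

public class EventHandlerSketch {
    // One chain of actions per event; actions append to a call-flow log.
    private final Map<String, List<Consumer<StringBuilder>>> handlers = new HashMap<>();

    // Append an action to the handler chain for the given event.
    void on(String event, Consumer<StringBuilder> action) {
        handlers.computeIfAbsent(event, e -> new ArrayList<>()).add(action);
    }

    // Dispatch an event: run its chain of actions in order.
    void dispatch(String event, StringBuilder callFlow) {
        for (Consumer<StringBuilder> action : handlers.getOrDefault(event, List.of()))
            action.accept(callFlow);
    }

    public static void main(String[] args) {
        EventHandlerSketch app = new EventHandlerSketch();
        StringBuilder flow = new StringBuilder();
        app.on("SIP_INVITE", f -> f.append("answer;"));        // protocol-dependent event
        app.on("SIP_INVITE", f -> f.append("playGreeting;"));
        app.on("TIMER_EXPIRED", f -> f.append("hangup;"));     // protocol-independent event
        app.dispatch("SIP_INVITE", flow);
        app.dispatch("TIMER_EXPIRED", flow);
        System.out.println(flow);   // answer;playGreeting;hangup;
    }
}
```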
4.1.5 SCML
The Service Creation Markup Language (SCML) [32] suite is part of the Java APIs for
Integrated Networks (JAIN) [8] standardization effort (see Section 4.2.2). The intention
of SCML is to provide a high-level scripting facility on top of the JAIN and Parlay [19]
APIs, and thus a simple service creation environment for non-telecommunication
experts. Although envisioned to cover a broad range of features, e.g., web and presence
services, and instant messaging, the SCML suite is currently very much a work in progress.
In fact, at the time of this writing, only preliminary versions of the SCML call control have
been presented.
The SCML call control is defined in terms of an XML Schema that is derived from the
general call-control model of the Java Call Control (JCC) API [13]. SCML
provides an elaborate event mechanism on par with those of CCXML and XTML. Furthermore, since it is defined using an XML Schema, SCML is fairly easy to extend.
An SCML program is typically downloaded to an AS. At startup, the program registers
interest in events with a softswitch. When an event is triggered, e.g., a call arrives at the
softswitch, the softswitch generates a JCC event which is converted to an XML message and
delivered, e.g., via the Simple Object Access Protocol (SOAP) [45, 46, 65], to the AS. The
AS executes the SCML program and returns an XML message to the softswitch.
Figure 29 shows an example program in SCML. The program implements a simple call
forwarding application which diverts incoming calls to Mr. Karl-Johan Grinnemo (employee
IDentity (ID): kjgr) to a voice mail service when he is already busy with another call.
<scml>
  <terminating>
    <address-switch field="terminating">
      <address is="sip:kjgr@office.acme.com">
        <disconnected causeCode="CAUSE_BUSY">
          <routeCall connectionPtr="conC">
            <arguments>
              <targetAddress>
                sip:kjgr@voicemail.acme.com
              </targetAddress>
            </arguments>
          </routeCall>
        </disconnected>
      </address>
    </address-switch>
  </terminating>
</scml>
Figure 29: A call forwarding application in SCML.
telecommunication networks. The key idea has been to design generic, technology-neutral
APIs that could be used by both operators and third-party developers alike. In particular,
the APIs should enable operators to provide, in a secure way, network capabilities to
third-party application developers.
Today, we have two dominating API framework proposals: OSA/Parlay [19] and JAIN [8].
The OSA/Parlay proposal emanates from a collaboration between the Parlay Group, European Telecommunications Standards Institute (ETSI), ITU-T, and 3GPP. In 1998, British
Telecom (BT), Microsoft, Nortel, Siemens, and Ulticom formed the Parlay Group to define
a set of programming language-neutral APIs for third-party development of telecom applications and services in PSTN. The initial API framework was published in December 1998.
Since then, the membership has grown and now includes companies such as Cisco, Ericsson,
Lucent, and IBM. Furthermore, the focus of the group has widened to cover both Internet
and PLMN. As the work on the second release of Parlay commenced, the API framework
was taken into ETSI and ITU-T in an attempt to make the API an international standard. At
about the same time, the Parlay Group initiated work within 3GPP on an open application
interface towards UMTS. Facing the risk of having several incompatible standards, the Parlay, ETSI, and 3GPP initiatives were combined into one working group, the Joint Working
Group (JWG), in the context of what is called the Open Service Access (OSA) framework.
At about the same time as the Parlay Group was formed, the Java APIs for Integrated
Networks (JAIN) community was initiated by Sun Microsystems and others. The objective
of JAIN is similar to that of OSA/Parlay; however, contrary to OSA/Parlay, JAIN only considers Java. Furthermore, JAIN takes a broader perspective on application development than
OSA/Parlay and considers not only client-side but also server-side applications. Still, there
Figure 31: Overview of the logical entities comprising the OSA/Parlay API framework.
Figure 32: The key steps in using the OSA/Parlay API framework.
the logical entities which implement one or more SCFs, and, in so doing, interact with the
internal nodes of the operator network (e.g., location and SMS servers). Although an SCS
might implement several SCFs or Parlay APIs, this is fairly unusual. Typically, an SCS only
implements a single API. The OSA Framework implements the core functionality of the
OSA/Parlay API, e.g., authentication and authorization, registration of SCSs, publication of
SCFs, and integrity/fault management. The communication between the ASs and the Framework/SCSs is made using either of the middleware technologies Common Object Request
Broker Architecture (CORBA) [71] or Distributed Component Object Model (DCOM) [2].
Figure 32 outlines the key steps in using the OSA/Parlay API framework. When a new
SCS is installed, it must authenticate itself (1) and register (2) with the OSA Framework.
The registration means that the SCS publishes its interface to the Framework. Next, when
an application wants to access the SCF provided by the SCS, it authenticates itself with
the Framework (3). It also obtains an instance of the SCS interface, a Service Manager in
Figure 33: Excerpt of a UML sequence diagram for an OSA/Parlay call forwarding application.
OSA/Parlay parlance (4-7). The application then accesses the SCF by invoking the methods
provided by the Service Manager.
To concretize the usage of the OSA/Parlay API, Figure 33 provides parts of a UML sequence diagram for a call forwarding application. Only the major actions have been included
in the example. In particular, those parts concerning the authentication have been deliberately omitted. The example begins with the application retrieving a reference to the OSA
Framework, typically an instance of the Framework interface (IpFramework) (1). The
application then calls upon the Framework to obtain a reference to the Service Manager of
the Generic Call Control (GCC) SCF (IpCallControlManager) (2). To
enable notification of incoming call events, the application registers a
callback interface, IpAppCallControlManager, with the Service Manager
(IpCallControlManager) (3). When a call arrives, the application is notified via the
IpAppCallControlManager callback interface (4). The application processes the call
(5), and routes it to the appropriate destination (6,7). At that time, the application is no longer
interested in controlling the call, and therefore deassigns the call (8). Note that this does not
mean that the call itself ends; it only means that there will be no further communication
between the call and the application.
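The callback pattern behind the sequence in Figure 33 can be sketched in much-simplified Java. The interface names follow the OSA/Parlay GCC SCF, but the signatures and the string-based call log are illustrative inventions, not the standardized API:

```java
public class ParlaySketch {
    // Application-side callback interface (cf. IpAppCallControlManager).
    interface IpAppCallControlManager {
        void reportNotification(String callingParty);
    }

    // Network-side Service Manager (cf. IpCallControlManager).
    static class IpCallControlManager {
        private IpAppCallControlManager app;
        void createNotification(IpAppCallControlManager cb) { app = cb; } // step (3)
        void incomingCall(String from) { app.reportNotification(from); }  // step (4)
    }

    static StringBuilder log = new StringBuilder();

    public static void main(String[] args) {
        IpCallControlManager scf = new IpCallControlManager();   // cf. (2) obtainScf()
        scf.createNotification(from -> {                         // (3) register callback
            log.append("route ").append(from).append(" -> voicemail; "); // (5)-(7)
            log.append("deassignCall");                          // (8) release control
        });
        scf.incomingCall("sip:alice@example.com");               // (4) notification
        System.out.println(log);
    }
}
```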
The OSA/Parlay API framework has undergone substantial development in recent
years, and, at the time of this writing, it comprises a comprehensive set of standardized SCFs. Specifically, the OSA/Parlay 3GPP release 6 encompasses
SCFs ranging from basic call control to multi-party call control, instant messaging, multimedia messaging, presence, Quality of Service (QoS), and charging. Furthermore, it includes
support for Web services, which not only entails new SCFs, but also a new definition language, Web Services Description Language (WSDL) [33, 36, 37], and a new communication
middleware, SOAP [45, 46, 65].
4.2.2 JAIN
The JAIN API framework is being developed within the Java Community Process (JCP) under the terms of Sun's Java Specification Participation Agreement (JSPA). The objective of
the JAIN initiative is to develop Java APIs that abstract the details of networks and protocol
implementations, and allow for the development of portable applications. The JAIN initiative is organized in two expert groups and several workgroups. It consists of a Protocols
Expert Group (PEG) that standardizes interfaces toward SS7 and IP signaling protocols, an
Application Expert Group (AEG) that primarily considers the APIs required for service creation within Java, and, finally, a number of workgroups whose task it is to develop prototype
implementations and feed the expert groups with their experiences and insights.
The JAIN API framework basically comprises two parts:
JAIN SS7 APIs that define implementation-agnostic APIs for the major SS7 protocols such
as ISUP, TCAP, MAP, INAP etc.
Java Service Logic Execution Environment (SLEE) that provides a generic Java-based
application platform for developing platform-independent applications and services.
From a historical viewpoint, the JAIN SS7 APIs could be seen as a predecessor to the Java
SLEE: Initially, JAIN only comprised a diverse, and relatively incoherent, set of telecom
APIs. However, this changed with the advent of the SLEE platform, which brought the APIs
together under a common framework. To this end, let us first consider the Java SS7 APIs.
All JAIN SS7 APIs are designed on the basis of the Factory and Observer design patterns.
In particular, as shown in Figure 34, each JAIN SS7 API is built up around five software
components: an SS7 factory class, JainSS7Factory; an interface class towards the SS7
stack, JainprotStack; an interface class towards the SS7 protocol, JainprotProvider;
a listener or callback interface class, JainprotListener, that enables the protocol to
communicate with the application; and, finally, event classes for all protocol primitives that
can be exchanged between the application and the protocol. This includes primitives for
protocol messages, primitives for error indications, primitives for timeout indications, and
primitives for network status indications. Together, the provider, listener, and event classes
implement the Observer design pattern. While the factory, listener, and event classes are
vendor neutral, each stack vendor is required to implement the stack and provider classes.
Figure 34: The five software components of a JAIN SS7 API, with the stack and provider classes implemented by each stack vendor.
As an example of the use of the JAIN SS7 APIs, consider the skeleton Java code in Figure 35 for an ISUP application. At lines 8-9, a Factory object for a particular SS7 stack
is created; in this example, a Factory object for an Ericsson SS7 stack. Next, using the
Factory object, an interface towards the SS7 stack is obtained in lines 11-14. Lines 16-18 show parts of the configuration of the SS7 stack. For example, in line 17, the stack
is assigned its point code. The initialization of the application concludes in lines 20-24. An interface towards the ISUP protocol is obtained in line 20. In line 21, the actual
ISUP application is created. As follows from line 2, the application implements a callback interface, JainIsupListener. In lines 23-24, the application and the ISUP protocol are interconnected. Particularly, the ISUP protocol is supplied with a reference to
the JainIsupListener interface, and the application is provided with a reference to
JainIsupProvider. The JainIsupListener interface only consists of one method,
processIsupEvent (lines 30-43). This method is invoked by the ISUP protocol, via the
JainIsupProvider class, each time it needs to notify the application about events that
 1  ...
 2  public class IsupApplication implements JainIsupListener
 3  {
 4      public static void main ( String args[] )
 5      {
 6          try
 7          {
 8              JainSS7Factory aFactory = JainSS7Factory.getInstance();
 9              aFactory.setPathName("com.ericsson");
10
11              JainIsupStack isupStack = null;
12              isupStack = ( JainIsupStackImpl )
13                  aFactory.createSS7Object
14                      ( "javax.jain.ss7.isup.JainIsupStackImpl" );
15              ...
16              isupStack.setVendorName ( "com.ericsson" );
17              isupStack.setSignalingPointCode( OPC );
18              isupStack.setStackName( "ericsson_stack" );
19              ...
20              JainIsupProvider aProvider = isupStack.createProvider();
21              JainIsupListenerImpl aListener = new IsupApplication();
22              ...
23              aProvider.addIsupListener( aListener, myUserAddress );
24              aListener.setIsupProvider( aProvider );
25              ...
26          }
27          ...
28      }
29
30      public void processIsupEvent( IsupEvent isupEvt )
31      {
32          switch ( isupEvt.getIsupPrimitive() )
33          {
34              case IsupConstants.ISUP_PRIMITIVE_SETUP:
35                  ...
36                  break;
37
38              case IsupConstants.ISUP_PRIMITIVE_ALERT:
39                  ...
40                  break;
41              ...
42          }
43      }
44  }
Figure 35: A skeleton ISUP application that uses the JAIN ISUP API.
Figure 36: The JAIN OAM API and the JMX architecture (JMX = Java Management eXtensions; JMXMP = JMX Messaging Protocol; RMI = Remote Method Invocation).
accessed by a management application, operations that can be invoked, and notifications that
can be emitted. MBeans are run within an MBean Server (2) and accessed via Agent Services
objects (3) that can perform management operations on the MBeans registered in the MBean
Server. The MBean Server relies on Remote Method Invocation (RMI) [7], JMX Messaging
Protocol (JMXMP) [11], or similar technologies to make the OAM MBeans accessible for
remote management applications (4).
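A minimal standard MBean of the kind described above can be written with the stock javax.management API: an attribute a management application can read, and an operation it can invoke, registered with the platform MBean Server. The CallCounter bean and its object name are invented for illustration and are not part of any JAIN API:

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Management interface; by convention named <ClassName>MBean.
interface CallCounterMBean {
    int getCalls();   // readable attribute "Calls"
    void reset();     // invokable operation
}

class CallCounter implements CallCounterMBean {
    private int calls;
    public int getCalls() { return calls; }
    public void reset() { calls = 0; }
    public void callArrived() { calls++; }   // internal, not exposed

    public static void main(String[] args) throws Exception {
        // Register the MBean with the platform MBean Server (step (2));
        // a remote manager would reach it via an RMI/JMXMP connector.
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        CallCounter bean = new CallCounter();
        ObjectName name = new ObjectName("example:type=CallCounter");
        server.registerMBean(bean, name);
        bean.callArrived();
        System.out.println(server.getAttribute(name, "Calls"));   // prints 1
    }
}
```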
Other types of special JAIN SS7 APIs include the JAIN Parlay APIs [6]. As mentioned
earlier, the JAIN Parlay APIs are basically an adaptation of the OSA/Parlay API framework
to Java. Like the OSA/Parlay API, the JAIN Parlay API defines an API for describing the
interaction between ASs external to an operator network and network resources (i.e., SCSs)
within the operator domain.
The second part of the JAIN API framework is the JAIN SLEE [14]. The SLEE comprises an application framework and component model similar to Enterprise Java Beans
(EJBs) [10]. It builds upon the JAIN SS7 APIs; however, in addition to providing platform-independent access to SS7 protocols, it also provides support for transactions, persistence,
load balancing, and pooling.
Figure 37 shows the principal parts of the JAIN SLEE architecture. Applications and
services in SLEE are implemented as collections of reusable objects or Service Building
Blocks (SBBs) (1). For example, assume that you want to implement a call forwarding
application. Instead of building the application from the ground up, you would build the
application from two pre-existing SBBs: a call SBB and a forwarding SBB.
SBBs are run from within a SLEE Server (2). All communication between the SBBs
and network resources, such as SS7 protocols, goes via the SLEE Server. Incoming messages from network resources are translated by the SLEE Server to events and routed to
the appropriate SBBs. Conversely, outgoing events from SBBs are translated by the SLEE
Server to messages and routed to the appropriate network resources. The SLEE event model
is based on a publish/subscribe model which means that event sources are decoupled from
event sinks via an indirection mechanism, Activity Contexts (3). Event sinks subscribe for
events by attaching to Activity Contexts, and event sources publish events to Activity Contexts. The SLEE-defined Activity Contexts maintain the relationships among event sources
and sinks. By using a publish/subscribe event model, sources and sinks need not be aware
of each other. At the same time, the model permits the SLEE to control and manage all source/sink
relationships, thus improving robustness.
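The indirection provided by an Activity Context can be sketched in a few lines of Java. This mimics the publish/subscribe decoupling only; the real JAIN SLEE interfaces differ, and the string events and lambda "SBBs" below are illustrative:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class ActivityContext {
    private final List<Consumer<String>> sinks = new ArrayList<>();

    // An event sink (e.g., an SBB) subscribes by attaching to the context.
    public void attach(Consumer<String> sink) { sinks.add(sink); }

    // An event source publishes to the context, unaware of the sinks.
    public void publish(String event) {
        for (Consumer<String> sink : sinks) sink.accept(event);
    }

    public static void main(String[] args) {
        ActivityContext ctx = new ActivityContext();
        ctx.attach(e -> System.out.println("call SBB got " + e));
        ctx.attach(e -> System.out.println("forwarding SBB got " + e));
        ctx.publish("INVITE");   // both attached SBBs receive the event
    }
}
```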
The SLEE architecture defines how applications (i.e., SBBs) running within the SLEE
interact with network resources through Resource Adaptors (RAs). An RA shields SLEE
applications from the intricacies of particular network resources, e.g., vendor-specific details,
and publishes a common interface towards the applications (4).
The SLEE architecture also includes some application facilities (5). The Timer facility
provides applications with the ability to perform periodic actions; the Alarm facility enables applications to generate alarm notifications to external management clients; the Trace
facility is used by applications to generate trace messages; and, finally, the Usage facility
provides applications with resource usage and network statistics. Furthermore, the SLEE
defines management interfaces using JMX and MBeans (6). These interfaces enable a
management application on an OAM node to access applications on remote SLEE Server
nodes.
To conclude our discussion about JAIN, it should be mentioned how the JAIN API framework relates to some other Java API efforts. The Open Mobile Alliance (OMA) [17] initiative is defining a set of Web services interfaces in WSDL [33, 36, 37] which complement
both the JAIN SS7 APIs and SLEE by providing SOAP-based [45, 46, 65] interfaces toward
network resources. Another Java API effort is the OSS through Java (OSS/J) [18] initiative
which is an umbrella initiative to provide OSSs with Java capabilities. The OSS/J APIs add
support for QoS management, trouble reports, telecom management, and billing to JAIN,
and thus supplement the JAIN OAM API. Finally, there is the SIP Servlet [9] technology
which, like JAIN SLEE, provides a platform-neutral application environment with transaction support. However, unlike JAIN SLEE, SIP Servlets are tightly coupled to the SIP
Figure 37: The principal parts of the JAIN SLEE architecture.
Figure 38: Call-control signaling, bearer signaling, and media paths in a softswitch architecture (MG = Media Gateway).
Figure 39: The H.323 architecture.
5.1 H.323
As briefly mentioned earlier, H.323 is not a call-control signaling protocol per se. Instead,
H.323 describes the principal logical components of a multimedia communication system
and further specifies how the components should communicate. Thus, H.323 is a framework specification which, in turn, references other ITU-T specifications for call signaling, control signaling, media transmission, and so on.
Figure 39 shows the H.323 architecture. It should be emphasized that this is a logical
architecture and that the components do not necessarily map to real physical devices.
As shown in Figure 39, a telecommunication network according to H.323 comprises one
or several zones. A zone is a logical demarcation and may straddle network segments that
are connected with routers, switches, or other network devices. It includes a gatekeeper and
at least one terminal. Optionally, it may include gateways and/or Multipoint Control Units
(MCUs).
A zone is administered and controlled by the gatekeeper. The gatekeeper performs the
following tasks:
Address Translation. Every device in a H.323 network has a network address that
uniquely identifies the device. Typically, in an IP environment, the address is an IP
address that is specified in the form of a URL. However, it is also possible to use E.164
addresses. A gatekeeper translates address aliases such as URLs and E.164 addresses
to IP addresses.
Admission Control. In a H.323 network, an end point, e.g., a terminal or gateway, has
to request access to the network before a call can be placed. A request for admission
specifies the bandwidth to be used by the end point, and the gatekeeper can choose to
accept or deny the request based on the bandwidth requested and the current network
state.
Bandwidth Management. Although bandwidth is initially provided through admission control, the bandwidth requirements may change during a call. The gatekeeper is
also responsible for mid-call bandwidth requests.
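The admission-control and bandwidth-management bookkeeping described above can be sketched as follows. The class and its simple remaining-capacity policy are illustrative assumptions, not part of the H.225 RAS specification:

```java
public class Gatekeeper {
    private long availableKbps;   // remaining zone capacity

    public Gatekeeper(long zoneCapacityKbps) {
        availableKbps = zoneCapacityKbps;
    }

    // Admission control: confirm (ACF, true) the requested bandwidth
    // only if it fits in the remaining capacity, else reject (false).
    public synchronized boolean admit(long requestedKbps) {
        if (requestedKbps > availableKbps) return false;
        availableKbps -= requestedKbps;
        return true;
    }

    // Mid-call bandwidth change: release the old reservation and try
    // the new one (simplified; a real gatekeeper would restore the old
    // reservation if the new request is rejected).
    public synchronized boolean rerequest(long oldKbps, long newKbps) {
        availableKbps += oldKbps;
        return admit(newKbps);
    }
}
```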
Optionally, a gatekeeper may provide call-control signaling. That is, a gatekeeper could
be the component responsible for routing call-signaling messages between H.323 end points.
Furthermore, a gatekeeper could perform call authorization, e.g., reject calls that originate
from certain addresses, or calls placed within certain time periods. A gatekeeper could also
handle call management and maintain information about all active calls. This information
could be used by the bandwidth-management function, or to re-route calls to different end
points to achieve load balancing.
As emphasized earlier, a gatekeeper is a logical component. In a physical network, the
gatekeeper could be a standalone device, but it could also be implemented as part of a gateway or MCU. Either way, the gatekeeper is commonly seen as the softswitch in the H.323
architecture.
The terminals and gateways in the H.323 architecture are typically referred to as the
end points since they are the components that originate and terminate signaling connections.
Terminals in H.323 could be anything ranging from a simple IP phone to a larger stationary
workstation. However, H.323 explicitly requires that all terminals must support the following
protocols:
The H.225 [58] call signaling protocol for call setup and release, and for Registration,
Admission, and Status (RAS) signaling.
The H.245 [61] control signaling protocol for exchanging terminal capabilities and for
the creation of media channels.
The Real-time Transport Protocol (RTP) and Real-time Transport Control Protocol
(RTCP) for media stream transport and control [79].
H.323 terminals must also support the G.711 [51] audio codec. Optional protocols in a terminal include additional audio codecs, video codecs, T.120 [52] data-conferencing protocols,
and MCU capabilities.
A gateway connects two dissimilar networks, typically a H.323 network with a PSTN
network. It provides translation of H.323 call-control protocols, i.e., H.225 and H.245, to,
e.g., SS7 and ISDN protocols. On the H.323 side, a gateway runs H.245 control signaling for
exchanging capabilities, H.225 call signaling for call setup and release, and H.225 RAS for
registration with the gatekeeper. On the other side, a gateway runs the signaling protocols of
the non-H.323 network, e.g., SS7 and ISDN protocols. A gateway may also perform media
translation, i.e., translation between different audio, video, and data formats. In a physical
network, a gateway could be co-located with a gatekeeper and/or MCU.
Figure 40: The H.323 protocol suite and its mapping onto the TCP/IP stack.
Figure 41: Time-sequence diagram for a gatekeeper-routed H.323 call setup.
To illustrate how the protocols in the H.323 protocol suite work together to accomplish a
call session, Figure 41 outlines the time-sequence diagram for a gatekeeper-routed call setup.
The reason we chose to show a gatekeeper-routed call, and not a directly routed call, is
that billing is much easier to accomplish in this type of call. Thus, we believe that this type
of call will prevail in future telecommunication networks.
It is assumed that the two end points, O (originating end point) and T (terminating end
point), have already registered with gatekeepers GO and GT, respectively. Furthermore, it is
assumed that the call only involves speech. The steps in the call setup are as follows:
(1) End point O sends an Admission ReQuest (ARQ) on the RAS channel to gatekeeper
GO and requests to make a call with a certain bandwidth. The admission is confirmed
(ACF) by GO.
(2) End point O sets up a H.225/Q.931 call signaling channel between itself and end point
T. This is done in several steps.
(a) End point O sends a Q.931 Setup request to GO, which, in turn, forwards the
request to end point T.
(b) End point T responds to the Q.931 Setup request by sending back a Q.931 Call
proceeding message to GO.
(c) End point T obtains admission for the call by issuing an admission request to
GT.
(d) Since gatekeeper-routed call signaling is used, end point T informs GO that the
call should be routed through GT. This is done with a Q.932 Facility message.
(e) GO releases the current H.225/Q.931 channel with end point T and sets up a new
channel which goes through GT. Note that this procedure also involves end point
T obtaining a new admission.
(f) When end point T, which typically is an IP phone, starts ringing, it sends back a
Q.931 Alert message to end point O.
(g) Later, when the called party answers the call, end point T sends back a Q.931
Connect message. This message sometimes contains the transport address for the H.245 control signaling.
(3) H.245 control signaling takes place between end points O and T.
(a) Terminal capabilities are negotiated.
(b) It is decided which of the end points is the master.
(c) Two logical audio channels are opened, one in each direction.
(4) Media communication takes place using RTP/RTCP.
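The message exchange above can be summarized as plain data. The following Python sketch lists the setup messages per channel; message and entity names follow the steps in the text, and the helper function is purely illustrative, not part of any H.323 stack.

```python
# Illustrative model of the gatekeeper-routed H.323 call setup described
# above. This is a sketch of the message flow, not an H.323 implementation.
CALL_SETUP = [
    # (step, channel, message)
    (1, "RAS",         "ARQ (O -> GO)"),
    (1, "RAS",         "ACF (GO -> O)"),
    (2, "H.225/Q.931", "Setup (O -> GO -> T)"),
    (2, "H.225/Q.931", "Call Proceeding (T -> GO)"),
    (2, "RAS",         "ARQ/ACF (T <-> GT)"),
    (2, "H.225/Q.931", "Facility: route via GT (T -> GO)"),
    (2, "H.225/Q.931", "Alert (T -> O)"),
    (2, "H.225/Q.931", "Connect (T -> O)"),
    (3, "H.245",       "Terminal capability negotiation"),
    (3, "H.245",       "Master/slave determination"),
    (3, "H.245",       "Open Logical Channel (one per direction)"),
    (4, "RTP/RTCP",    "Media communication"),
]

def messages_on(channel):
    """Return the setup messages carried on a given channel."""
    return [m for _, c, m in CALL_SETUP if c == channel]
```

Grouping the trace by channel makes the layering visible: admission control on RAS, call signaling on H.225/Q.931, and capability/channel negotiation on H.245, before any media flows.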
To conclude this description of H.323, it could be mentioned that recent versions of the
standard have been complemented with some other recommendations. Notably, H.450.1 [55]
specifies a new protocol for supplementary phone services in H.323. Other recommendations
in the H.450 series specify some common supplementary services such as call transfer, call
diversion, call park, and call hold. Also worth noting is the H.235 [60] security framework
for secure signaling in an H.323 network.
5.2 SIP
As briefly mentioned, SIP is a signaling protocol for initiating, managing, and terminating
multimedia sessions across IP networks. It can be run over any IP transport layer protocol,
e.g., TCP, UDP, and the Stream Control Transmission Protocol (SCTP) [84]. This is in sharp
contrast to H.323, which specifies a complete, vertically integrated system. Furthermore,
contrary to H.323, which is a binary peer-to-peer protocol, SIP is a text-encoded client-server protocol.
SIP, which was originally developed within the IETF Multiparty Multimedia Session
Control (MMUSIC) working group, forms part of the IETF's multimedia architecture effort.
As such, SIP is used in conjunction with several other IETF protocols, such as the Session
Description Protocol (SDP) [47, 70], RTP [79], the Media Gateway Control
Protocol (MGCP) [30, 41], and the MEdia GAteway COntrol (MEGACO)/H.248 [44, 62]
protocol.
Figure 42 pictures the elements of a SIP network. As shown, a SIP network is composed
of eight types of logical components: user agents, redirect servers, proxy servers, Back-to-Back User Agents (B2BUAs), registrars, location servers, presence servers, and events
servers. Each component has specific functions and participates in SIP communication as a
client, i.e., initiates requests, as a server, i.e., responds to requests, or as both. One physical device can have the functionality of more than one logical component. For example, a
network server that works as a proxy server might also function as a registrar.
User agents are client end-system applications that contain both user-agent client and
user-agent server functionality. Examples of physical devices that could be user agents include IP phones, workstations, telephony gateways, and various services such as automated
answering services. In a softswitch network, a user agent is typically configured with the
network address of the local redirect server, proxy server, or B2BUA. The redirect server
accepts a SIP request and maps the SIP address of the called party into zero (if there is no
known address) or more new addresses and returns them to the user agent. In contrast, a
proxy server does not return translated addresses to the user agent, but uses the addresses to
route the SIP request towards the destination user agent. It should be noted that a SIP request
may have to traverse several proxy servers on its way to a destination user agent.
It is useful to view proxy servers as SIP-level routers that forward SIP requests and
responses. However, SIP proxy servers employ routing logic that is commonly more sophisticated than plain routing-table forwarding. In particular, RFC 3261 [78] allows proxy servers
to perform actions such as validating requests, authenticating users, forking requests, resolving addresses, canceling pending calls, performing so-called record- and loose-routing, and handling routing loops.
Forking means that after having processed an incoming SIP request and resolved the destination address, the proxy server forwards the request to multiple addresses. Depending on how
the proxy server is configured, the forking could be parallel, sequential, or a mix of both. Record-routing is a SIP mechanism that allows SIP proxy servers to request being in the signaling
path of all future requests in a particular call, and loose-routing adds the possibility of having
several signaling paths in record-routing.
The RFC 3261 specification defines three types of proxy servers: stateless, stateful, and
call-stateful proxy servers. A stateless proxy is a simple message forwarder. When receiving
a SIP request, the stateless proxy processes the request without saving any state information.
[Figure 42: The elements of a SIP network: SIP user agents, B2BUAs, and SIP location servers within VoIP networks, with registration, subscription, location lookup, and routing interactions, and TRIP used to exchange routes between networks.]
invoked to resolve destination addresses. Location servers are not formally a SIP component;
however, they are still an important part of a SIP network. To store routable SIP addresses in
a location server, a user agent contacts a register server, or registrar. How the registrar, in turn,
uploads the SIP addresses to the location server is not specified. Some location servers use
the Lightweight Directory Access Protocol (LDAP) [49, 50, 85], others use CORBA [71].
Two more recent additions to the SIP network are the events and presence servers.
The events server is a general implementation of a notifier as prescribed by the event notification framework of RFC 3265 [76]. The notifier in this framework is responsible for receiving SIP event subscription requests, and sending notifications to subscribers when their
subscribed events have occurred. Presence is a service that allows a party to know the ability
and willingness of another party to participate in a call before a call attempt has been made.
A user interested in receiving presence information for another user, a so-called watcher, can
subscribe to his/her presence status at a presence server. The concept of a presence server
emanates from work within the IETF SIP for Instant Messaging and Presence Leveraging
Extensions (SIMPLE) working group to develop a framework architecture for presence and
instant messaging.
Typically, each operator has its own SIP network. To permit call control signaling between customers of different operators, the SIP networks have to exchange routing information. As illustrated in Figure 42, the IETF envisions the use of the Telephony Routing over IP
(TRIP) [77] protocol for this purpose. In TRIP, location servers communicate routing details
to location servers in both the same and different SIP networks using mechanisms similar
to those in the Border Gateway Protocol 4 (BGP-4) [75]. Examples of routing details
communicated include reachability of destinations and the routes towards these destinations,
and policy information. It should be noted that although TRIP was developed primarily for
SIP networks, it is not in any way dependent on SIP. In fact, TRIP could be used as a routing
protocol for H.323 networks as well.
There are only two types of messages in SIP: requests sent from a client to a server, and
responses sent in the opposite direction. The RFC 3261 specification defines six SIP request
types or methods. The six methods are as follows:
INVITE. This method initiates a call session, and invites other user agents or servers
to participate in the session. It includes a session description and, for two-party calls,
a description of the media the calling party wants to use in the session, e.g., G.711-encoded audio over RTP.
ACK. This method is used to acknowledge the reception of a final response to an INVITE. (The meaning of a final response will be explained below.) A client originating
an INVITE request issues an ACK request when it receives a final response for the
INVITE.
OPTIONS. The OPTIONS method makes it possible for a calling party to query a
called party about its capabilities in terms of supported SIP methods and media.
BYE. This method is used by a party in a call session to abandon the session.
CANCEL. This method cancels pending transactions. For example, if a SIP server
has received an INVITE but not yet returned a final response, it will stop processing
the INVITE upon receipt of a CANCEL.
REGISTER. A user agent sends a REGISTER request to a registrar to update the
location server about its current location.
Apart from these methods, a number of extensions have been added in RFCs and proposed in
Internet drafts. These include methods for event subscription and notification, SUBSCRIBE
and NOTIFY; methods for mid-call signaling; and a method, COMET, to ensure that certain
preconditions, such as QoS requirements, are met.
SIP responses are sent in response to SIP requests and indicate the outcome of the request.
They are represented by three-digit status codes, and are classified with respect to their most
significant digit. There are six classes of SIP responses:
100 Informational,
200 Success,
300 Redirection,
400 Client error,
500 Server failure, and
600 Global failure.
The informational SIP responses are used to indicate progress but do not terminate a SIP
transaction. The remaining classes of SIP responses are final, i.e., terminate SIP transactions.
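The classification rule can be stated in a few lines of code. The sketch below assumes well-formed three-digit status codes and simply keys on the most significant digit, flagging everything from 200 upward as final.

```python
# Classify a SIP status code by its most significant digit, per the six
# response classes listed above.
CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client error",
    5: "Server failure",
    6: "Global failure",
}

def classify(status):
    """Return (class name, is_final) for a three-digit SIP status code."""
    name = CLASSES[status // 100]
    # Only 1xx responses are provisional; responses in all other classes
    # are final and terminate the SIP transaction.
    return name, status >= 200
```

For example, `classify(180)` yields an informational, non-final result, while `classify(486)` yields a final client-error result.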
The structure of SIP messages is to a large extent influenced by HTTP. Figure 43 pictures
the structure of SIP messages. As depicted, SIP messages are composed of the following
three parts:
Start Line. Every SIP message begins with a start line. The start line conveys the
message type, i.e., the method type in requests and the status code in responses, and the
protocol version. Furthermore, in requests, the start line includes a request Uniform
Resource Identifier (URI) which gives the SIP address of the called party.
Header Fields. SIP header fields are used to convey message attributes and to modify
message meaning. They are similar in syntax and semantics to HTTP header fields.
In fact, some headers are borrowed from HTTP. Some examples of key SIP headers
include:
Via. Indicates the route taken by a SIP request/response.
From. Identifies the originator of a SIP request/response.
To. Identifies the recipient of a SIP request/response.
Call-ID. The Call-ID contains a unique identifier for a particular call session.
All requests and responses during this call session will contain this same Call-ID. The Call-ID is, for example, used by a SIP proxy server to keep track of several
simultaneous call sessions.
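As a concrete illustration of these parts, the following Python sketch assembles a minimal, hypothetical INVITE request. The addresses are made up, and several headers a real SIP stack would include (e.g., Contact and Max-Forwards) are omitted for brevity.

```python
# Build a minimal, hypothetical SIP INVITE from the three parts described
# above: start line, header fields, and an (empty) body. This is an
# illustration of the message layout, not a compliant SIP implementation.
def build_invite(caller, callee, call_id):
    start_line = f"INVITE sip:{callee} SIP/2.0"
    headers = [
        "Via: SIP/2.0/UDP host.example.com:5060",  # made-up sending host
        f"From: <sip:{caller}>",
        f"To: <sip:{callee}>",
        f"Call-ID: {call_id}",
        "CSeq: 1 INVITE",
        "Content-Length: 0",
    ]
    # As in HTTP, lines are CRLF-terminated, and a blank line separates
    # the header fields from the (here empty) body.
    return "\r\n".join([start_line, *headers, "", ""])

msg = build_invite("anna@emca.com", "kjgr@acme.com", "12345@host.example.com")
```

The same layout applies to responses; only the start line changes, e.g., to `SIP/2.0 200 OK`.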
[Figure 43: The structure of SIP messages: a start line, header fields, and a body, shown for a SIP request and a SIP/2.0 200 OK response; both carry Call-ID, CSeq, Content-Type, and Content-Length headers and an SDP session description as body.]
[Figure 44: Time-sequence diagram for a SIP call setup between user agents O and T via proxy servers PO and PT, with a location server LO: INVITE, 100 Trying, 183 Progress, 180 Ringing, 200 OK, ACK, media communication over RTP/RTCP, and finally BYE and OK.]
(5) When user agent T receives the INVITE request from PT, it sends a Ringing response
back to user agent O via PT and PO.
(6) The called party answers the call, which results in an OK response being sent back to
user agent O. Again, the response is routed via PT and PO.
(7) When user agent O receives the OK, it sends an ACK request, via PO and PT, to user
agent T. Now, the call setup is completed and media begins to flow.
(8) The called party abandons the call session, and a BYE request is sent from user agent
T, via PT and PO, to user agent O.
(9) User agent O responds to the BYE request with an OK response. The call session ends
when user agent T receives the OK response.
6. Bearer Signaling
As mentioned in the introduction to Section 5, bearer signaling denotes the type of signaling taking place between softswitches and MGs. Figure 45 illustrates the use of bearer
signaling. The softswitch acts as a Media Gateway Controller (MGC, cf. Section 3) which
controls several associated MGs. The MGs translate media data between the VoIP and PSTN
networks. Acting as an MGC, the softswitch directs the MGs as to which TDM time slot is
connected to which RTP stream. It may also direct the MGs to transcode media from one
format to another, or to mix various media streams together. Since bearer signaling is used by
softswitches/MGCs to control MGs, it is also referred to as gateway control signaling, and
the corresponding protocols as gateway control protocols.
Gateway control protocols have had a long and convoluted history. In the beginning of
1998, there were several competing proposals; however, the dominating one was the Media
Gateway Control Protocol (MGCP). Toward the end of 1998, the IETF formed the MEdia
GAteway COntrol (MEGACO) working group with the charter to propose a single gateway
control protocol. Since MGCP was the dominating gateway control protocol at that
time, there was strong support for making this protocol the IETF standard. However, the
MEGACO group never accepted MGCP as their choice. The closest to a standard MGCP
came was an informational RFC, RFC 3435 [30]. Instead, key aspects of MGCP, along
with many other inputs, were integrated in a new protocol, the MEGACO gateway control
protocol.
Parallel to the efforts of the IETF, the ITU-T study group SG-16 initiated work on an
H-series gateway control protocol, at that time called H.GCP but later designated H.248. To
avoid ending up with two differing and incompatible protocols, the IETF and ITU-T SG-16
began to work on a compromise between the MEGACO protocol and H.GCP. In the
summer of 1999, an agreement was reached between the two organizations to create an international standard, the MEGACO/H.248 protocol. During the following year, considerable
effort was made to merge the two standards, and in June 2000, the MEGACO/H.248 [44, 62]
protocol was approved by both standards bodies. Today, an overwhelming majority of vendors and operators envision the MEGACO/H.248 protocol as the bearer protocol of the next-generation telecommunication network.
[Figure 45: The use of bearer signaling: a softswitch/MGC controls MGs that translate between RTP streams in the VoIP network and TDM streams in the PSTN, with H.323/SIP signaling toward IP phones and a signaling gateway (SG) toward the PSTN switch.]
[Figure 46: The MEGACO/H.248 connection model: an MG hosting contexts (C1, C2), each containing terminations for RTP and TDM streams.]
[Figure 47: The structure of a MEGACO/H.248 message: a message contains transactions, each transaction contains actions, and each action contains commands.]
Figure 48: MEGACO/H.248 signaling during a call setup between two MGs.
It is assumed that users O and T are PSTN users and attached to their respective MG through persistent terminations.
The commands are as follows:
(1) User O at MGO picks up the phone, and a Notify command is sent to the MGC.
(2) The MGC instructs MGO, via a Modify command, to play a dial tone and to collect
dialed digits.
(3) When user O has dialed the phone number of the called party, user T connected to
MGT, a Notify command with the phone number of user T is sent to the MGC.
(4) The MGC creates a connection between MGO and MGT by issuing two Add commands and a Modify command. The Modify command is required to complement the
media settings of MGO with the settings of MGT. As soon as the connection between
MGO and MGT has been set up, MGT instructs the phone at user T to ring.
(5) When user T answers its phone, a Notify command is sent from MGT to MGC.
(6) The MGC sets up the media stream between MGO and MGT by issuing two Modify
commands, one to each MG. Encapsulated within each Modify command is
a request to stop the ringing signal and to notify about any on-hook event.
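The command sequence above can be captured as a simple trace. Entity and command names follow the text and Figure 48; real H.248 messages additionally carry transaction, context, and termination identifiers that this sketch omits.

```python
# Illustrative trace of the MEGACO/H.248 command flow in the call setup
# described above. A sketch, not an H.248 encoder.
COMMANDS = [
    # (step, direction, command, purpose)
    (1, "MGO -> MGC", "Notify", "off-hook event from user O"),
    (2, "MGC -> MGO", "Modify", "play dial tone, collect digits"),
    (3, "MGO -> MGC", "Notify", "dialed digits (number of user T)"),
    (4, "MGC -> MGO", "Add",    "create termination for the connection"),
    (4, "MGC -> MGT", "Add",    "create termination; T's phone rings"),
    (4, "MGC -> MGO", "Modify", "complement MGO media settings with MGT's"),
    (5, "MGT -> MGC", "Notify", "user T answered"),
    (6, "MGC -> MGO", "Modify", "stop ringing signal, watch for on-hook"),
    (6, "MGC -> MGT", "Modify", "stop ringing signal, watch for on-hook"),
]

def issued_by_mgc():
    """Commands the MGC sends to its gateways (everything but the Notifys)."""
    return [c for c in COMMANDS if c[1].startswith("MGC ->")]
```

The master/slave split is visible in the trace: the MGs only ever report events with Notify, while all Add and Modify commands originate at the MGC.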
7.1 SCTP
Since the IETF traditionally takes a rather conservative standpoint on new TCP/IP transport protocols, the development of the Stream Control Transmission Protocol (SCTP) was not initially an obvious choice. In fact, as a first step, the SIGTRAN working group evaluated
the two common transport protocols of the TCP/IP stack, UDP and TCP [80]. UDP was quickly ruled out since it did not meet the requirement of reliable,
in-order delivery. TCP, on the other hand, met this basic requirement but was found
to have some other severe limitations:
Head-of-Line Blocking (HoLB). TCP imposes a strict order-of-transmission on sent
data. This is too confining for SS7 signaling traffic. In particular, it creates an artificial ordering between independent signaling message flows, and thus lets time delays
due to packet losses and retransmissions in one flow impair the timely delivery of
the remaining flows sent over the same TCP connection. For example, consider a TCP
Figure 49: Interworking between a softswitch VoIP network and a legacy SS7 circuitswitched network according to SIGTRAN.
[Figure 50: The SIGTRAN protocol architecture: an adaptation component running over SCTP over IP.]
Figure 51: Head-of-line blocking in a TCP connection with simultaneous telephone call
attempts.
connection over which three simultaneous telephone call attempts are made (see Figure 51). The ISUP IAM message of call #1 is lost, which, by necessity, delays this
call attempt. However, due to TCP's order-of-transmission requirement, it delays the
remaining two call attempts as well. According to the study in [80], a packet-loss
frequency of 1% could delay 9% of subsequent packets by more than a one-way transfer
time.
Timer Granularity. The computation of the retransmission timer in TCP is commonly done using a coarse, non-tunable system clock. Although this is actually not a
limitation of the TCP protocol per se, it is indeed a limitation of most TCP implementations.
Availability and Reliability. TCP takes a prohibitively long time to detect connection
failures, and offers no mechanisms to recover from end point failures such as failed
network interfaces.
Message Boundaries. TCP is byte oriented and treats each data transmission as an
unstructured sequence of bytes. Thus, it would force SS7 signaling protocols to explicitly insert and track message boundaries.
Security. TCP hosts are susceptible to blind Denial-of-Service (DoS) attacks by SYN
packets.
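The head-of-line blocking effect described above can be illustrated with a toy delivery model. The timing assumptions below (one time unit per segment, a fixed retransmission delay) are an arbitrary simplification for illustration, not a TCP simulation.

```python
# Toy illustration of head-of-line blocking (cf. Figure 51): IAM messages
# from independent calls share one ordered TCP byte stream, so a lost
# segment delays every segment behind it until the retransmission arrives.
def tcp_deliver(segments, lost, rtx_delay):
    """Return {message: delivery time}. The segment at index `lost` is
    lost; it and all later segments wait for its retransmission."""
    times, t = {}, 0
    for i, msg in enumerate(segments):
        t += 1  # nominal per-segment transfer time
        if i >= lost:
            times[msg] = t + rtx_delay  # blocked behind the lost segment
        else:
            times[msg] = t
    return times
```

With `tcp_deliver(["IAM#1", "IAM#2", "IAM#3"], lost=0, rtx_delay=4)`, losing the first IAM delays all three call attempts; with `lost=2`, only the third is delayed.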
To overcome the above limitations of TCP, the SIGTRAN working group concluded that
a new transport protocol was necessary, and SCTP was ratified as a standard in October 2000. Although SCTP is a new transport protocol, separate from TCP, it inherits many
of its properties from TCP. Like TCP, SCTP provides a connection-oriented, reliable transport service on top of IP. It uses window-based congestion- and flow-control mechanisms
Figure 52: Avoiding HoLB in SCTP by sending simultaneous telephone call attempts over
separate streams.
that essentially work the same as the ones used in TCP SACK. In particular, a selective retransmission scheme is employed to correct packet losses and errors. However, unlike TCP,
and to address the shortcomings of TCP, SCTP also supports the following features:
Multiple Delivery Modes. SCTP supports several modes of delivery, including strict
order-of-transmission (like TCP), unordered (like UDP), and partially ordered delivery. The partially ordered delivery mode is provided through multi-streaming. The
multi-streaming feature of SCTP separates and transmits messages, or chunks, on multiple, logically independent streams. Streams are the facility offered by SCTP to send
separate signaling message flows on the same connection independently of each
other, and thus avoid unnecessary HoLB. Each stream provides reliable in-order delivery of messages, while no ordering is imposed between streams. Figure 52 illustrates the use of multiple streams by revisiting the example in Figure 51,
this time using an SCTP connection. Although the IAM message of call #1
is lost, it does not prevent the other two IAM messages from being delivered.
Tunable Timeout Settings. Although SCTP, like TCP, utilizes a non-tunable system
[Figure 53: An SCTP association between end point A (IP Address A) and a multi-homed end point B, reachable at several IP addresses across the IP network.]
the retransmission timer expires, SCTP increases an error counter for the primary
path and retransmits the message chunk. The primary path is considered unreachable
if the error counter reaches a predefined threshold, Path.Max.Retrans. Since
message chunks are not normally sent on a regular basis on alternate paths, another
reachability mechanism is used there. A special heartbeat chunk is sent periodically
on these paths, based on a configured heartbeat timer. Each time the retransmission
timer expires on a heartbeat chunk, the error counter of the corresponding path is
incremented. Again, when the error counter reaches Path.Max.Retrans, the path
is considered unreachable. The error counters of both the primary and alternate paths
are reset to zero each time a message or heartbeat chunk is successfully acknowledged.
SCTP also monitors the availability of the end points. Each end point monitors the
availability of its peer by keeping an error counter. This error counter keeps track of
the total number of consecutively missed acknowledgements for message and heartbeat chunks on all paths between the end point and its peer. When this error counter
reaches a predefined threshold, Association.Max.Retrans, the peer is considered unavailable. This will in effect bring an end to the whole association.
Message Boundary Preservation. SCTP preserves the message-framing boundaries
of applications by placing messages inside one or more chunks. Large messages are
partitioned into multiple chunks.
DoS Protection. To mitigate the impact of DoS attacks, SCTP employs a security
cookie mechanism during the establishment of an association.
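Two of the mechanisms above, per-stream in-order delivery and error-counter-based path monitoring, can be sketched as follows. This is a strong simplification: per-stream sequence numbers are assumed to start at zero, timer expirations are modeled as explicit events, and acknowledgements reset the counters per path; a real SCTP implementation tracks TSNs, SSNs, and timers in considerably more detail.

```python
class StreamReceiver:
    """Per-stream in-order delivery: a lost chunk blocks only its own
    stream, not the others (cf. the three IAM messages in Figure 52)."""
    def __init__(self):
        self.buffers = {}   # stream id -> {seq: message}
        self.expected = {}  # stream id -> next expected sequence number

    def receive(self, stream, seq, msg):
        """Buffer a chunk; return the messages now deliverable in order
        on this stream (other streams are unaffected)."""
        self.buffers.setdefault(stream, {})[seq] = msg
        nxt, out = self.expected.get(stream, 0), []
        while nxt in self.buffers[stream]:
            out.append(self.buffers[stream].pop(nxt))
            nxt += 1
        self.expected[stream] = nxt
        return out

class PathMonitor:
    """Error counters in the spirit of Path.Max.Retrans and
    Association.Max.Retrans, fed by retransmission/heartbeat timeouts."""
    def __init__(self, paths, path_max_retrans=5, assoc_max_retrans=10):
        self.errors = {p: 0 for p in paths}
        self.assoc_errors = 0
        self.path_max, self.assoc_max = path_max_retrans, assoc_max_retrans

    def timeout(self, path):
        """A retransmission or heartbeat timer expired on `path`."""
        self.errors[path] += 1
        self.assoc_errors += 1

    def ack(self, path):
        """A message or heartbeat chunk was acknowledged on `path`."""
        self.errors[path] = 0
        self.assoc_errors = 0

    def path_unreachable(self, path):
        return self.errors[path] >= self.path_max

    def peer_unavailable(self):
        return self.assoc_errors >= self.assoc_max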
Figure 54: The distinguishing features of peer-to-peer and user adaptation protocols.
the functionality of a single SCP or SEP may be distributed over several softswitches/MGCs,
while not so with peer-to-peer adaptation protocols.
Currently, there is only one peer-to-peer adaptation protocol defined: the MTP-L2 Peer-to-peer Adaptation protocol (M2PA) [43]. As the name suggests, this protocol emulates the
MTP-L2 layer of the SS7 stack. There are four user adaptation protocols defined:
MTP-L2 User Adaptation Layer (M2UA). The M2UA [66] protocol is primarily
defined for the transport of MTP-L2 user signaling, i.e., MTP-L3, between a SG and
206
SS7
SUA
IUA
V5UA
DUA
M3UA
M2PA
M2UA
SCTP
IP
a softswitch/MGC.
MTP-L3 User Adaptation Layer (M3UA). The M3UA [81] protocol is primarily
defined for the transport of MTP-L3 user signaling, e.g., ISUP and SCCP, between a
SG and a softswitch/MGC.
SCCP User Adaptation Layer (SUA). The SUA [64] protocol is primarily defined
for the transport of SCCP applications, such as TCAP and RANAP, between a SG and
a softswitch/MGC.
ISDN Q.921 User Adaptation Layer (IUA). The IUA [67] protocol is defined for
the transport of Q.931 ISDN signaling between a SG and a softswitch/MGC. Two
extensions to IUA have been defined: the V5.2 User Adaptation layer (V5UA) [86],
and the Digital Private Network Signaling System/Digital Access Signaling System 2
User Adaptation layer (DUA) [68] for transport of V5.2 access signaling and Private
Branch Exchange (PBX) signaling, respectively.
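The adaptation layers above can be summarized as a small lookup table; the one-line descriptions paraphrase the text.

```python
# The SIGTRAN adaptation protocols and what each one carries over SCTP/IP,
# paraphrasing the descriptions in the text.
ADAPTATION = {
    "M2PA": "MTP-L2 peer: emulates SS7 links over IP",
    "M2UA": "MTP-L2 user signaling, i.e., MTP-L3",
    "M3UA": "MTP-L3 user signaling, e.g., ISUP and SCCP",
    "SUA":  "SCCP applications, e.g., TCAP and RANAP",
    "IUA":  "ISDN Q.931 signaling",
    "V5UA": "V5.2 access signaling (IUA extension)",
    "DUA":  "PBX signaling (IUA extension)",
}
```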
Figure 55 shows the complete SIGTRAN protocol suite as it looks at the time of this
writing. The remainder of this section provides a more detailed description of M2PA and the
user adaptation protocols M2UA, M3UA, and SUA. IUA and its extensions are not discussed
any further since they have, as yet, not found any widespread use.
7.3 M2PA
M2PA allows operators to keep their existing network topology (i.e., SSPs, STPs, etc.) and
use IP to transport their SS7 messages instead of traditional TDM-based links. All
other elements of the legacy SS7 network remain the same, except that the signaling links
are now virtual. M2PA simply changes the transport to IP, and in that respect enables a
first, very conservative step towards an IP-based telecommunication network. M2PA also
provides a means for peer SS7 MTP-L3 layers in SGs to communicate directly, a setup
typically used for SS7 bypass signaling, where a managed IP network is run in parallel with a
highly loaded legacy SS7 network to offload signaling traffic. MTP-L3 is present on each SG
to provide routing and management of the MTP-L2/M2PA links. Because of the presence of
MTP-L3, each SG has its own SS7 point code. Figure 56 illustrates the two discussed use
cases of M2PA.
Since M2PA is a peer-to-peer adaptation protocol, it has basically the same responsibilities as MTP-L2. This means, among other things, that M2PA is responsible for MTP-L2
chores such as link activation/deactivation; maintenance of link status information; maintenance of sequence numbers and retransmit buffers for MTP-L3; and last, but not least,
maintenance of local and remote processor outage status.
7.4 M2UA
Figure 57 depicts the principal use case of M2UA. As already mentioned, M2UA is commonly used to transfer MTP-L2 user data between an MTP-L2 instance on a SG and an
MTP-L3 instance on a softswitch/MGC. Since M2UA is a user adaptation protocol, there is
a client-server relationship between the M2UA instance on the softswitch and the MTP-L2
instance on the SG. Basically, M2UA provides a means by which an MTP-L2 service may be
provided on a softswitch. Neither the MTP-L2 instance on the SG nor the MTP-L3 instance
on the softswitch is aware that they are remote from each other. Further, since the SG has
no MTP-L3 layer of its own, it has no SS7 point code. In fact, the SG is transparent to SS7
in the PSTN/PLMN network, and routing is instead done on the basis of the softswitches,
which do have SS7 point codes.
M2UA is typically used in the following cases:
SS7 links are physically remote from each other, which has resulted in a large number
of separate SGs. In this case, M2UA makes it possible for a single softswitch/MGC
to support several SGs. Since only the softswitch/MGC needs to have a point code,
the use of M2UA in this case conserves point codes, a scarce resource in today's
PSTN/PLMN networks.
There is a low density of SS7 links at a particular physical point in a legacy SS7
network. By using M2UA, an IP network may complement the legacy SS7 network at
this point in the network.
The SG function is co-located with an MG.
Figure 57 depicts M2UA as a peer to MTP-L2 in the SG. However, in many ways M2UA
is a user of MTP-L2. M2UA is responsible for initiating actions which would normally be
[Figure 56: The two use cases of M2PA: SS7 SEPs interconnected over IP via SGs running MTP-L3/M2PA/SCTP/IP in place of TDM-based links, and SS7 bypass signaling between SGs run in parallel with the legacy PSTN/PLMN network.]
[Figure 57: The principal use case of M2UA: MTP-L3 on a softswitch/MGC backhauled over SCTP (MTP-L3 over SCTP) to MTP-L2 on a SG with a nodal interworking function (NIF), connecting SEPs and STPs in the PSTN/PLMN.]
7.5 M3UA
At the present time, M3UA is the adaptation protocol offering the broadest functional
coverage. It is also the adaptation protocol selected by the majority of telecom equipment
manufacturers and operators. Furthermore, M3UA is the only adaptation protocol included
in 3GPP Release 5, the 2002 release of the standards for third-generation cellular networks. Like M2UA, M3UA is typically used between a SG and a softswitch/MGC. Figure 58
shows this use case. The SG receives SS7 signaling using the SS7 Message Transfer Parts
(MTPs) as transport over a standard SS7 link. The SG terminates MTP-L2 and MTP-L3.
[Figure 58: The principal use case of M3UA: an SS7 user part, e.g., ISUP, on a softswitch/MGC transported over SCTP to a SG with a nodal interworking function (NIF) that terminates MTP toward SEPs and STPs in the PSTN/PLMN.]
[Figure 59: A Routing Key (RK) example: the SG's routing database maps DPC 1.1.1 to RK-A, serving the MGC-A/MGC-B cluster, and DPC 1.1.2 to RK-C, serving MGC-C.]
At the softswitch, the M3UA layer maintains the status of configured SS7 destinations accessible via each SG, and routes messages accordingly. At the SG, the M3UA layer provides interworking with MTP-L3 management functions to support seamless operation of signaling between the SS7 and IP networks.
For example, the M3UA layer at the SG indicates to its supported softswitches when an SS7
signaling point is reachable or unreachable, or when SS7 network congestion occurs. Additionally, the M3UA layer at one of the supported softswitches may explicitly request the
state of a remote SS7 destination reachable via the SG by querying the SG M3UA layer.
Since MTP-L3 is terminated at the SG, SS7 point code routing ends at the SG. Routing in
the IP network is instead done using something called Routing Keys (RKs). That is, the SG
routes messages from the legacy PSTN/PLMN network to the appropriate softswitch in the
IP network using RKs. The RK is defined as a set of SS7 parameters and parameter values
that uniquely specify a destination for SS7 traffic in the IP network. Specifically, a RK is
used to route SS7 messages from the SG to a particular softswitch or cluster of softswitches.
As an example, it could be mentioned that the Cisco IP Transfer Point [23] permits RK
assignments for M3UA on the basis of the DPC, OPC, and SI (cf. Section 2).
Figure 59 provides an RK example. The STP denoted STP-A forwards an ISUP message
with DPC 1.1.1 to the SG. The SG looks up the DPC in its routing database and finds that
it matches the RK, RK-A. On the basis of this RK, it then routes the ISUP message to the
softswitch denoted MGC-A. Let us now assume that MGC-A becomes unreachable, and that
yet another ISUP message with DPC 1.1.1 arrives at the SG. Since MGC-A is clustered with
the softswitch denoted MGC-B, and thus shares its RK, the SG will re-route all traffic
normally destined for MGC-A to MGC-B. This illustrates one of the strengths of RKs:
they decouple softswitches from point codes and thus enable SS7-transparent management
of the IP network, e.g., transparent failover and load sharing.
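The RK lookup and failover behavior just described can be sketched in a few lines. This is an illustrative toy model, not code from any SG product: the table layout, the RK and softswitch names, and the reachability flags are all invented for illustration.

```python
# Illustrative sketch (not taken from the report): how an SG might resolve a
# Routing Key and fail over within a softswitch cluster. All names, parameter
# sets, and the table layout are invented for illustration.

ROUTING_KEYS = {
    # RK name -> matching SS7 parameter values and the cluster sharing the RK
    "RK-A": {"match": {"dpc": "1.1.1"}, "cluster": ["MGC-A", "MGC-B"]},
}
REACHABLE = {"MGC-A": True, "MGC-B": True}

def route(message):
    """Find the RK whose parameter values match the message, then pick the
    first reachable softswitch in that RK's cluster."""
    for rk_name, rk in ROUTING_KEYS.items():
        if all(message.get(k) == v for k, v in rk["match"].items()):
            for mgc in rk["cluster"]:
                if REACHABLE.get(mgc):
                    return rk_name, mgc
            raise RuntimeError(f"{rk_name}: no reachable softswitch")
    raise RuntimeError("no matching Routing Key")

assert route({"dpc": "1.1.1"}) == ("RK-A", "MGC-A")   # normal operation
REACHABLE["MGC-A"] = False                            # MGC-A becomes unreachable
assert route({"dpc": "1.1.1"}) == ("RK-A", "MGC-B")   # transparent failover
```

Because the point code only selects the RK, and the RK in turn selects any reachable member of the cluster, the failover is invisible to the SS7 side.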
7.6 SUA
SUA emulates the services of SCCP by providing support for reliable transfer of SCCP user
messages, including support for both connectionless and connection-oriented services. It
also provides SCCP management services to, for example, manage SCCP subsystems. As is
illustrated in Figure 60, SUA typically provides a means by which an SCCP user, e.g., TCAP
or RANAP, on a softswitch/MGC may be reached via an SG. From the perspective of an SS7
signaling point, the SCCP user is located at the SG. An SCCP message is routed to the SG
based on the point code and the SCCP subsystem number. The SG then translates the point
code and subsystem number of the SCCP messages to the corresponding RK, and routes the
SCCP messages to the appropriate softswitch/MGC. If an SCCP message contains a global
title, the SG may also perform global title translation before the RK translation.
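The two-stage translation just described (an optional global title translation, followed by the RK translation) can be sketched as follows. All table contents here — the names, point codes, subsystem numbers, and global titles — are invented for illustration:

```python
# Hypothetical sketch of the SG-side translation described above; all table
# contents (names, point codes, subsystem numbers, global titles) are invented.

GTT_TABLE = {"46701234567": ("2.2.2", 146)}     # global title -> (PC, SSN)
RK_TABLE = {("2.2.2", 146): "RK-B"}             # (PC, SSN) -> Routing Key
SOFTSWITCH = {"RK-B": "MGC-A"}                  # Routing Key -> softswitch/MGC

def route_sccp(msg):
    # Optional global title translation before the RK translation.
    if "global_title" in msg:
        pc, ssn = GTT_TABLE[msg["global_title"]]
    else:
        pc, ssn = msg["pc"], msg["ssn"]
    rk = RK_TABLE[(pc, ssn)]
    return SOFTSWITCH[rk]

assert route_sccp({"pc": "2.2.2", "ssn": 146}) == "MGC-A"
assert route_sccp({"global_title": "46701234567"}) == "MGC-A"
```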
8 Future Outlook
As mentioned in the introduction, the softswitch solution constitutes the first step along the
migration path towards the next-generation, IP-based, multi-service telecommunication network. Although only the future can tell with certainty what the next steps will be, several
standardization bodies have proposed, or are in the process of proposing, reference architectures for the next-generation network, including:
The International Packet Communication Consortium (IPCC). The IPCC [4] is a
continuation of the International Softswitch Consortium (ISC), which was founded in
1998. It is an international industry association dedicated to accelerating the deployment of voice and video over IP in wireline, wireless, and cable networks. Its
member list includes vendors as well as government agencies.
The MultiService Forum (MSF). Founded in 1998 by Cisco Systems, Worldcom,
and Telcordia, the mission of MSF [16] is not so much to develop new standards as
to bring together existing standards into a holistic network and services architecture.
MSF members include both vendors and operators.
ITU-T. In 2001, ITU-T started a new initiative, the Next Generation Network (NGN).
The aim of this initiative was to develop guidelines and standards for the next-generation
telecommunication network. In 2004, the work of the NGN initiative was
transferred to the Focus Group on Next Generation Networks (FGNGN) [3].
Figure 60: SUA interworking between the SS7 and IP networks. SEPs and STPs in the PSTN/PLMN reach an SCCP user (e.g., TCAP) on an MGC by carrying it over SCTP in the IP network; at the SG, a NIF couples the SS7 stack (SCCP over MTP-L3/L2/L1) with the SUA/SCTP/IP stack towards the MGCs.
ETSI. In 2003, ETSI merged its TIPHON and SPAN bodies into the Telecommunications and Internet converged Services
and Protocols for Advanced Networking (TISPAN) [20]. The main objective of
TISPAN is basically the same as it was for TIPHON: the standardization of a multi-service, multi-protocol, and multi-access network based on IP.
3GPP and 3GPP2. Both 3GPP [28] and 3GPP2 [1] were born out of ITU-T's International Mobile Telecommunications Initiative 2000 (IMT-2000) [5] to standardize
third-generation wireless communications. However, due to their success, and the fact
that the next-generation, IP-based core network is envisioned to be shared by fixed and
cellular communication, their scope has been extended. Today, 3GPP and 3GPP2 are
collaborating with ETSI TISPAN and others in standardizing the next-generation core
signaling system for wireless and wireline networks, the IP Multimedia Subsystem
(IMS) [29].
On the basis of these standardization efforts, a three-step migration path, as depicted in
Figure 61, has emerged:
Step 1: The Softswitch Solution. The first migration step, which is the step portrayed
in the foregoing sections of this report, first and foremost aims at reducing capital and
operating expenditures for operators. This step, as previously shown, involves the introduction of the softswitch, which enables the separation of application functions, call
control, and connectivity. One of the most important benefits of the softswitch solution
is that it enables the reuse of equipment from the traditional TDM-based telecommunication network, especially in the access network. When the softswitch solution is
introduced, access equipment can gradually be moved from circuit-switched nodes to
MGs.
Step 2: The IP Multimedia Service Introduction. While the first step primarily
aims at reducing capital and operating expenditures for operators, and gives less tangible benefits to the end users, the second migration step introduces a new IP-based
signaling subsystem, the IMS subsystem, that not only makes new multimedia services feasible, but also greatly facilitates the trend of fixed/cellular convergence. For
example, the cellular operator Orange has disclosed that it will use IMS to win
customers from British Telecom (BT) in the UK. Specifically, Orange will offer a
combination of wireless (using GSM) and wired (using VoIP over a Digital Subscriber
Line (DSL) technology) services provisioned and controlled by a common IMS infrastructure.
Step 3: Converged IMS-based Architecture. Although IMS was introduced in Step
2, it is envisioned that the demand for traditional PSTN services will continue, and
that a full migration to IMS is likely to take several years. Thus, in Step 3, as aging
access equipment is being replaced, Access Gateways (AGs) are deployed in the
network. The AGs provide telephony services over IP networks, and are controlled
via some bearer control protocol, such as MEGACO/H.248, by a softswitch. The
softswitches from Step 1 will remain in the network as long as there is a demand
for traditional PSTN services, and not until that demand disappears will they be completely replaced
by IMS. This protects investments and enables a smooth migration, on a port-by-port
basis, from the complete PSTN service set provided by the softswitch to IMS.
Figure 61: The three-step migration path. Step 1: softswitches with SGs and MGs interconnect the PSTN and PLMN over an IP network. Step 2: an IMS subsystem and VoIP access are introduced alongside the softswitches. Step 3: a converged IMS-based architecture in which AGs provide telephony services over IP. AG = Access Gateway; IMS = IP Multimedia Subsystem; MG = Media Gateway; PLMN = Public Land Mobile Network; PSTN = Public Switched Telephone Network; SG = Signaling Gateway; VoIP = Voice over IP.
From a signaling perspective, it appears that IMS is the key component of the next-generation network. In essence, IMS is an architecture for establishing, maintaining, and
tearing down a SIP session between two user agents (cf. Section 5.2). Although IMS is
envisioned to be a common signaling architecture for both fixed and cellular communication,
it is currently only defined for cellular communication, and then in particular for 3G UMTS
networks.
Since IMS has its roots in cellular communication, a key distinction is made between
the home and the visited network of an IMS user, e.g., a cellular phone. The main task for
Figure 62: The visited network provides connectivity to the home network.
the visited network is to provide connectivity to the home network, while it is the home
network that hosts user data, session control, services, and applications (see Figure 62).
Services are always controlled from the home network, regardless of which visited network
the user is roaming in. The advantage of this approach is that it
limits the functional and protocol dependencies between the home and visited networks, and
thereby minimizes the restrictions imposed on the services that can be deployed in the home
network. As a side effect, it also increases the rate at which services can actually be deployed.
Although this approach means that all control signaling goes through the home network, the
bearer traffic is routed independently of the signaling traffic and is thus able to follow a more
efficient path.
Figure 63 illustrates the IMS architecture. As shown, the IMS architecture consists of
the following principal components:
Call Session Control Function (CSCF). The IMS architecture is built around the
Call Session Control Function which in a sense constitutes the softswitch of IMS.
There are three different types of CSCFs: the Proxy CSCF (P-CSCF), the Interrogating
CSCF (I-CSCF), and the Serving CSCF (S-CSCF).
The P-CSCF is the first contact point for IMS users. In fact, the P-CSCF component
is the only IMS component used by a roaming user in a visited network. All SIP
signaling traffic from the IMS user goes via the P-CSCF, i.e., the P-CSCF is analogous
to a SIP proxy server. The functions performed by the P-CSCF include forwarding of
SIP registration and session invitation messages, and forwarding of accounting-related
information.
The I-CSCF is the first point of contact within the home network from a visited network. Its main responsibility is to query the Home Subscriber Server (HSS), the subscriber database, to find the location of the S-CSCF serving the user. Although this
is actually an optional component in IMS, it has a number of other responsibilities
as well. In particular, it provides a hiding functionality which makes it possible for
an operator to hide the topology, capacity, etc. of its network from other operators'
networks, and thus makes load sharing and other types of capacity management much
easier.
Figure 63: The IMS architecture. Two IMS domains, each with an HSS, P-CSCF, I-CSCF, and S-CSCF, are interconnected over an IP-based network. An MRFC (controlling an MRFP), an MGCF, a BGCF, and IMS-MGs with SGs provide media handling and interworking with PSTN/PLMN networks over an IP-based trunk; users attach via the RAN (BTS, RNC) and the SGSN/GGSN.
Finally, the S-CSCF is the brain of the IMS architecture. It is located in the home network and performs session control and registration services for IMS users. While the
user is engaged in a session, the S-CSCF maintains a session state and interacts with
ASs and accounting functions. Although an S-CSCF in the home network is responsible for all session control, it may forward specific requests to a P-CSCF in the visited
network on the basis of the requirements of the request, e.g., to provide information
about the local dialing plan.
Home Subscriber Server (HSS). The HSS is the main data storage for all subscriber
and service-related data in IMS. The data stored in the HSS includes user identities,
registration information, and access parameters. The HSS interfaces with the I-CSCF
and the S-CSCF to provide information about the location of the IMS user and its
subscription information.
Media Resource Function (MRF). The MRF holds the functionality for manipulating
multimedia streams in support of multi-party multimedia services, multimedia message
playback, and media conversion services. The MRF is split into two parts: an MRF
Controller (MRFC) and an MRF Processor (MRFP). The MRFC interprets SIP signaling received via an S-CSCF and uses MEGACO/H.248 (cf. Section 6) instructions to
control the MRFP. In other words, the MRFC is the part of the MRF that resides in the control
layer, while the MRFP resides in the connectivity layer.
Breakout Gateway Control Function (BGCF). The BGCF is one of the components
IMS provides for interworking with legacy circuit-switched networks. In particular,
the BGCF is responsible for choosing where a breakout to a circuit-switched network
should occur. The outcome of the selection process could either be that the breakout
should happen in the same network as that of the BGCF, or that the call should be
routed to another IP trunk network.
Media Gateway Control Function (MGCF). Like the BGCF, the MGCF is a component for enabling interworking between IMS and circuit-switched users. All incoming
call-control signaling from legacy PSTN/PLMN users is routed to the MGCF that
performs protocol conversion between ISUP and SIP. Similarly, all IMS-originated
sessions towards legacy PSTN/PLMN networks are routed via the MGCF. The MGCF
also controls media channels in the corresponding IMS-MG.
IMS Media Gateway (IMS-MG). The IMS-MG provides the connectivity-layer link
between circuit-switched PSTN/PLMN networks and IMS. Specifically, it performs
the media translation between the IP-based trunk network and legacy PSTN/PLMN
networks.
To give some appreciation of how the components of IMS interwork, let us consider the
IMS session initiation procedure. We assume that User A in Figure 64 wants to initiate
a session with User B. To simplify matters, we also assume that both users are in their
respective home networks. Prior to the session initiation, both users have gone through a so-called P-CSCF discovery procedure, in which they obtained an IP address for their P-CSCF,
and a registration procedure, in which they registered with the HSS of their home network.
The steps taken by User A when it initiates a session with User B are as follows:
(1) User A generates a SIP INVITE request and sends it to the P-CSCF in his home
network.
(2) The P-CSCF processes the request, and forwards it to the S-CSCF.
(3) The S-CSCF processes the request and determines an entry point of the home network
of User B, the I-CSCF.
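The forwarding chain in steps (1)–(3) can be sketched as a toy model. The host names and the table mapping a callee's domain to its entry point are invented for illustration; this is not IMS code, just the hop sequence:

```python
# Toy model (invented names) of steps (1)-(3): the SIP INVITE is forwarded
# from User A's P-CSCF via the S-CSCF to the I-CSCF of User B's home network.

def initiate_session(invite, home_a, entry_points):
    hops = [home_a["p_cscf"]]                 # (1) User A sends the INVITE to his P-CSCF
    hops.append(home_a["s_cscf"])             # (2) the P-CSCF forwards it to the S-CSCF
    callee_domain = invite["to"].split("@")[1]
    hops.append(entry_points[callee_domain])  # (3) the S-CSCF determines User B's I-CSCF
    return hops

home_a = {"p_cscf": "pcscf.home-a.example", "s_cscf": "scscf.home-a.example"}
entry_points = {"home-b.example": "icscf.home-b.example"}
route = initiate_session({"to": "sip:userB@home-b.example"}, home_a, entry_points)
print(route)   # ['pcscf.home-a.example', 'scscf.home-a.example', 'icscf.home-b.example']
```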
Figure 64: The IMS session initiation procedure between User A and User B (steps (1)–(7) through the P-CSCF, S-CSCF, I-CSCF, and HSS components of the two home networks; both users attach via BTS, RNC, SGSN, and GGSN).
Summary
Competitive market conditions and narrowing profit margins are driving network operators to
optimize their networks and to find new sources of revenue. In particular, today's incumbent
wireline and wireless operators are at a crossroads. They need to move to IP in order to cut
operating and capital expenditures. At the same time, they have made huge investments
in circuit-switched technology that are still delivering a major share of their total revenue.
To these operators, the softswitch offers an appealing solution. The softswitch solution lets
incumbents enjoy dramatically reduced costs, and at the same time provides support for a
still emerging new wave of revenue-generating services.
This report has given a fairly comprehensive survey of the softswitch solution from
a technical viewpoint. Basically, all components of the softswitch solution have been discussed: applications, call-control signaling, bearer signaling, and, last but not least, the interworking with the existing circuit-switched PSTN and PLMN networks. The report has
shown how the softswitch solution creates a decomposed architecture in which the signaling
and media functions are separated. The report has also shown how the decomposed architecture of the softswitch solution lends itself to more advanced and flexible applications and
services than are possible in the existing telecommunication networks.
Although the softswitch solution is central to the evolution of the current PSTN and
PLMN networks, it only represents the first migration step towards the envisioned next-generation, all-IP network. Thus, in the final section of the report, a future outlook was
provided which attempted to see beyond the softswitch solution. Central to this outlook was
IMS. The IMS architecture defines the logical elements necessary to implement multimedia
services across multiple network types, and the final section gave a brief overview of this
architecture and its salient components.
References
[1] The 3rd generation partnership project 2 (3GPP2). http://www.3gpp2.org.
[2] COM: Component object model technologies. http://www.microsoft.com/com.
[3] The focus group on next generation networks (FGNGN). http://www.itu.int/ITU-T/ngn/fgngn.
[4] International packet communications consortium (IPCC). http://www.ipccforum.org.
[29] 3GPP. 3rd generation partnership project; technical specification group services and
system aspects; IP multimedia subsystem (IMS); stage 2 (release 7). Technical Specification TS 23.228 v.7.1.0, 3GPP, September 2005.
[30] F. Andreasen and B. Foster. Media gateway control protocol (MGCP) version 1.0. RFC
3435, IETF, January 2003.
[31] R. J. Auburn. Voice browser call control: CCXML version 1.0. Working draft, W3C,
June 2005.
[32] J-L Bakker and R. Jain. Next generation service creation using XML scripting languages. In International Conference on Communications (ICC), New York, USA, April
2002.
[33] D. Booth and C. K. Liu. Web services description language (WSDL) version 2.0 part
0: Primer. Technical report, W3C, August 2005. Working Draft 3.
[34] T. Bray, J. Paoli, C. M. Sperberg-McQueen, E. Maler, and F. Yergeau. Extensible
markup language (XML) 1.0 (third edition). Recommendation, W3C, February 2004.
[35] T. Bray, J. Paoli, C. M. Sperberg-McQueen, E. Maler, and F. Yergeau. Voice extensible
markup language (VoiceXML) version 2.0. Recommendation, W3C, March 2004.
[36] R. Chinnici, H. Haas, A. Lewis, J. Moreau, D. Orchard, and S. Weerawarana. Web
services description language (WSDL) version 2.0 part 2: Adjuncts. Technical report,
W3C, August 2005. Working Draft 3.
[37] R. Chinnici, J. Moreau, A. Ryman, and S. Weerawarana. Web services description
language (WSDL) version 2.0 part 1: Core language. Technical report, W3C, August
2005. Working Draft 3.
[38] J. Davidson and J. Peters. Voice over IP Fundamentals. Cisco Press, March 2000.
[39] ETSI. Telecommunications and Internet protocol harmonization over networks
(TIPHON) release 4; architecture and reference points definition; network architecture
and reference points. Technical Specification TS 101 314 v. 4.1.1, ETSI, September
2003.
[40] V. Ferraro-Esparza, M. Gudmandsen, and K. Olsson. Ericsson telecom server platform
4. Ericsson Review, (3):104–113, 2002.
[41] B. Foster and C. Sivachelvan. Media gateway control protocol (MGCP) return code
usage. RFC 3661, IETF, December 2003.
[42] Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides. Design Patterns:
Elements of Reusable Object-Oriented Software. Addison-Wesley, 1995.
[43] T. George, B. Bidulock, R. Dantu, H. Schwarzbauer, and K. Morneault. Signaling
system 7 (SS7) message transfer part 2 (MTP2) - user peer-to-peer adaptation layer
(M2PA). RFC 4165, IETF, September 2005.
[44] C. Groves, M. Pantaleo, T. Anderson, and T. Taylor. Gateway control protocol version
1. RFC 3525, IETF, June 2003.
[45] M. Gudgin, M. Hadley, N. Mendelsohn, J. Moreau, and H. Nielsen. SOAP version 1.2
part 1: Messaging framework. Recommendation, W3C, June 2003.
[46] M. Gudgin, M. Hadley, N. Mendelsohn, J. Moreau, and H. Nielsen. SOAP version 1.2
part 2: Adjuncts. Recommendation, W3C, June 2003.
[47] M. Handley and V. Jacobson. SDP: Session description protocol. RFC 2327, IETF,
April 1998.
[48] M. Handley, H. Schulzrinne, E. Schooler, and J. Rosenberg. SIP: Session initiation
protocol. RFC 2543, IETF, March 1999.
[49] R. Harrison and K. Zeilenga. The lightweight directory access protocol (LDAP) intermediate response message. RFC 3771, IETF, April 2004.
[50] J. Hodges and R. Morgan. Lightweight directory access protocol (v3): Technical specification. RFC 3377, IETF, September 2002.
[51] ITU-T. Pulse code modulation (PCM) of voice frequencies. Recommendation G.711,
ITU-T, November 1988.
[52] ITU-T. Data protocols for multimedia conferencing. Recommendation T.120, ITU-T,
July 1996.
[53] ITU-T. Visual telephone systems and equipment for local area networks which provide
a non guaranteed quality of service. Recommendation H.323, ITU-T, November 1996.
[54] ITU-T. Digital subscriber signalling system no. 1 generic procedures for the control
of ISDN supplementary services. Recommendation Q.932, ITU-T, May 1998.
[55] ITU-T. Generic functional protocol for the support of supplementary services in H.323.
Technical Report H.450.1, ITU-T, February 1998.
[56] ITU-T. ISDN user-network interface layer 3 specification for basic call control. Recommendation Q.931, ITU-T, May 1998.
[57] ITU-T. Vocabulary of switching and signalling terms. Technical Report Q.9, ITU-T,
November 1998.
[58] ITU-T. Call signalling protocols and media stream packetization for packet-based multimedia communication systems. Recommendation H.225.0, ITU-T, July 2003.
[59] ITU-T. Packet-based multimedia communications systems. Recommendation H.323,
ITU-T, July 2003.
[60] ITU-T. Implementors guide for recommendations of the H.323 system (packet-based multimedia communications systems): H.323, H.225.0, H.245, H.246, H.283,
H.235, H.341, H.450 series, H.460 series, and H.500 series. Technical Report
H.Imp323/H.323/H.225.0/H.245/H.246/H.283/H.235/H.341, ITU-T, November 2004.
[61] ITU-T. Control protocol for multimedia communication. Recommendation H.245,
ITU-T, January 2005.
[62] ITU-T. Gateway control protocol: Version 3. Technical Report H.248.1, ITU-T,
September 2005.
[63] J. Lennox, X. Wu, and H. Schulzrinne. Call processing language (CPL): A language
for user control of Internet telephony services. RFC 3880, IETF, October 2004.
[64] J. Loughney, G. Sidebottom, L. Coene, G. Verwimp, J. Keller, and B. Bidulock. Signalling connection control part user adaptation layer (SUA). RFC 3868, IETF, October
2004.
[65] N. Mitra. SOAP version 1.2 part 0: Primer. Recommendation, W3C, June 2003.
[66] K. Morneault, R. Dantu, G. Sidebottom, B. Bidulock, and J. Heitz. Signaling system
7 (SS7) message transfer part 2 (MTP2) user adaptation layer. RFC 3331, IETF,
September 2002.
[67] K. Morneault, S. Rengasami, M. Kalla, and G. Sidebottom. ISDN Q.921-user adaptation layer. RFC 3057, IETF, February 2001.
[68] R. Mukundan, K. Morneault, and N. Mangalpally. Digital private network signaling
system (DPNSS)/digital access signaling system 2 (DASS 2) extensions to the IUA
protocol. RFC 4129, IETF, August 2005.
[69] F. D. Ohrtman. Softswitch Architecture for VoIP. McGraw-Hill, 2003.
[70] S. Olson, G. Camarillo, and A. B. Roach. Support for IPv6 in session description
protocol (SDP). RFC 3266, IETF, June 2002.
[71] OMG. Common object request broker architecture: Core specification. Recommendation Version 3.0.3, OMG, March 2004.
[72] L. Ong, I. Rytina, M. Garcia, H. Schwarzbauer, L. Coene, H. Lin, I. Juhasz, M. Holdrege, and C. Sharp. Framework architecture for signalling transport. RFC 2719, IETF,
October 1999.
[73] J. Postel. User datagram protocol. RFC 768, IETF, August 1980.
[74] J. Postel. Transmission control protocol. RFC 793, IETF, September 1981.
[75] Y. Rekhter and T. Li. A border gateway protocol 4 (BGP-4). RFC 1771, IETF, March
1995.
[76] A. B. Roach. Session initiation protocol (SIP)-specific event notification. RFC 3265,
IETF, June 2002.
[77] J. Rosenberg, H. Salama, and M. Squire. Telephony routing over IP (TRIP). RFC 3219,
IETF, January 2002.
[78] J. Rosenberg, H. Schulzrinne, G. Camarillo, A. Johnston, J. Peterson, R. Sparks,
M. Handley, and E. Schooler. SIP: Session initiation protocol. RFC 3261, IETF, June
2002.
[79] H. Schulzrinne, S. Casner, R. Frederick, and V. Jacobson. RTP: A transport protocol
for real-time applications. RFC 3550, IETF, July 2003.
[80] T. Seth, A. Broscius, C. Huitema, and H-A P. Lin. Performance requirements for
signaling in internet telephony. Internet draft, IETF, November 1998. Work in Progress.
[81] G. Sidebottom, K. Morneault, and J. Pastor-Balbas. Signaling system 7 (SS7) message
transfer part 3 (MTP3) user adaptation layer (M3UA). RFC 3332, IETF, September
2002.
[82] Signaling transport working group (SIGTRAN). http://www.ietf.org/html.charters/sigtran-charter.html.
[83] D. Sprague, R. Benedyk, D. Brendes, and J. Keller. Tekelec's transport adapter layer
interface. RFC 3094, IETF, April 2001.
[84] R. Stewart, Q. Xie, K. Morneault, C. Sharp, H. Schwarzbauer, T. Taylor, I. Rytina,
M. Kalla, L. Zhang, and V. Paxson. Stream control transmission protocol. RFC 2960,
IETF, October 2000.
[85] M. Wahl, T. Howes, and S. Kille. Lightweight directory access protocol (v3). RFC
2251, IETF, December 1997.
[86] E. Weilandt, N. Khanchandani, and S. Rao. V5.2-user adaptation layer (V5UA). RFC
3807, IETF, June 2004.
[87] X. Wu and H. Schulzrinne. Programmable end system services using SIP. In International Conference on Communications (ICC), Anchorage, Alaska, USA, May 2003.
[88] X. Wu and H. Schulzrinne. LESS: Language for end system services in Internet telephony. Internet draft, IETF, February 2005. Work in Progress.
Abbreviations
3GPP
3GPP2
A-F
ACE
ACF
ACK
ACM
AEG
AG
ANM
API
ARQ
AS
AT&T
ATM
AuC
AVP
B2BUA
BCSM
BG-F
BGCF
BGP-4
BSC
BSS
BSSAP
BT
BTS
CA-F
CAMEL
CAP
CAS
CCS
CCXML
CORBA
CPL
CPU
CSCF
DASS 2
DCOM
DoS
DP
DPC
DPNSS
DS
DSL
DSS1
DTD
DUA
EIR
EJB
ESML
ETSI
FDMA
FGNGN
GCC
GCP
GGSN
GMSC
GPRS
GSM
GT
GTR
GTT
HLR
HoLB
HSS
HTTP
I-CSCF
IAM
IBM
ID
IDL
IETF
IMS
IMS-MG
IMSI
IMT
IN
INAP
IP
IPCC
IPSP
ISC
ISDN
ISUP
ITE
ITU
ITU-T
IUA
IVR
JAIN
JCC
JCP
JMX
JMXMP
JSP
JSPA
JWG
LAN
LDAP
LE
LESS
LP
LSB
M-MG
M2PA
M2UA
M3UA
MAP
MC
MCU
MEGACO
MG
MGC
MGC-F
MGCF
MGCP
MMUSIC
MP
MRF
MRFC
MRFP
MSC
MSC-S
MSF
MSISDN
MTP
MTP-L1
MTP-L2
MTP-L3
NGN
NI
NIF
NSP
NSS
NTE
OAM
OMA
OMC
OPC
OSA
OSS
OSSJ
P-CSCF
PBX
PEG
PIC
PLMN
PSTN
QoS
R-F
RA
RAN
RANAP
RAS
RFC
RK
RMI
RNC
RTCP
RTE
RTP
S-CSCF
SACK
SBB
SCCP
SCE
SCF
SCML
SCP
SCS
SCTP
SDP
SEP
SG = Signaling Gateway
SGSN = Serving GPRS Support Node
SI = Service Indicator
SIGTRAN = SIGnaling TRANsport
SIMPLE = SIP for Instant Messaging and Presence Leveraging Extensions
SIO = Service Information Octet
SIP = Session Initiation Protocol
SLEE = Service Logic Execution Environment
SLP = Service Logic Program
SLS = Signaling Link Selector
SLT = Signaling Link Terminal
SMH = Signaling Message Handling
SMS = Short Message Service
SNM = Signaling Network Management
SOAP = Simple Object Access Protocol
SP = Signaling Point
SPAN = Services and Protocols for Advanced Networking
SRP = SCCP Relay Point
SS6 = Signaling System No. 6
SS7 = Signaling System No. 7
SSP = Service Switching Point
STP = Signaling Transfer Point
SUA = SCCP User Adaptation layer
TALI = Transport Adapter Layer Interface
TCAP = Transaction Capabilities Application Part
TCP = Transmission Control Protocol
TDM = Time Division Multiplexing
TDMA = Time Division Multiple Access
TE = Tandem Exchanges
TIPHON = Telecommunications and Internet Protocol Harmonization Over Networks
TISPAN = Telecommunications and Internet converged Services and Protocols for Advanced Networking
TRIP = Telephony Routing over IP
TSP4 = Telecommunication Server Platform 4
UDP = User Datagram Protocol
UML = Unified Modeling Language
UMTS = Universal Mobile Telecommunications System
UP = User Part
URI = Uniform Resource Identifier
URL = Uniform Resource Locator
UTRAN = UMTS Terrestrial Radio Access Network
V5UA = V5.2 User Adaptation Layer
VLR = Visitor Location Register
VoIP = Voice over IP
W3C = World Wide Web Consortium
WAN = Wide Area Network
WAP = Wireless Application Protocol
WCDMA = Wideband Code Division Multiple Access
WSDL = Web Services Description Language
XML = eXtensible Markup Language
XTML = eXtensible Telephony Markup Language
Paper VII
Torbjorn Andersson
TietoEnator AB
Karlstad, Sweden
andettor@tietoenator.com
Abstract
Mitigating the effects of Head-of-Line Blocking (HoLB) was one of the major reasons the IETF SIGTRAN working group developed SCTP, a new transport protocol for
PSTN signaling traffic, in the first place. However, studies of the impact of HoLB on TCP and SCTP have given ambiguous results as to whether HoLB has, in fact,
any significantly deteriorating effect on transmission delay. To this end, we have carried
out a detailed experimental study of the quantitative effects of HoLB. Our study suggests
that although HoLB could indeed incur a substantial delay penalty on a small fraction of
the messages in an SCTP session, it has only a marginal impact on the average end-to-end transmission delay. We observed improvements of only 0% to 18% in
average message transmission delay when using unordered as compared to ordered
delivery. Furthermore, there was large variability between different test runs, which
often made the impact of HoLB statistically insignificant.
1 Introduction
The communication industry is currently experiencing a period of dramatic and radical
change that strives towards a single converged all-IP network for voice, video, and
data. The reasons for this development are many. Operators are seeking ways to consolidate
their disparate communication platforms in order to reduce their development, operational,
and maintenance costs. Additionally, an all-IP network enables a multitude of new services,
with the promise of new revenue streams for the large number of carriers and equipment
manufacturers that saw their net profits plummet when the telecommunications boom abruptly
ended in 2000.
Leading this development is the telecom industry, with most major telecom carriers in
the process of readying Voice-over-IP (VoIP) services for mass deployment. According to
Frost & Sullivan [8], VoIP will account for approximately 75% of the world's voice services by
2007, and analysts project that the number of residential VoIP subscribers will rise 12-fold,
to about 12 million, by 2009 [4].
An important component of the currently launched VoIP networks is the Stream Control
Transmission Protocol (SCTP) [16]. Originally developed as a transport protocol for PSTN
signaling traffic over IP in the IETF SIGTRAN working group [14], SCTP has broadened
its use, and is today part of both the next-generation IP-based wireline and wireless core
networks. In the wireline core network, SCTP has been proposed as a viable alternative to
UDP and TCP for the transport of the Session Initiation Protocol (SIP) [12] in the softswitch
architecture [6]. In the wireless core network, SCTP is one of the signaling transport protocols in the IP Multimedia Subsystem (IMS) [2] architecture, the architecture defined by
3GPP [1] for the development of IP-based multimedia services in future mobile networks.
The origin of SCTP lies in studies of UDP and TCP as prospective transport protocols for PSTN signaling traffic. UDP was quickly ruled out since it did not meet the basic
requirements for reliable, in-order transport. While TCP met the basic requirements, it was
also found to have several limitations, one of which was its lack of features to prevent Head-of-Line Blocking (HoLB). Since TCP delivery is strictly ordered (i.e., sequential), a single
packet loss or reordering event in the network might introduce significant delays in the delivery of
subsequent packets, regardless of whether the packets are semantically dependent or
not. In fact, an analysis carried out by Telcordia [13] suggested that a 1% packet loss would
cause 9% of the packets to be delayed by more than a one-way transfer delay.
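The mechanism behind this effect can be illustrated with a minimal delay model. The numbers below (send interval, one-way delay, retransmission penalty) are invented for illustration; the point is only that under ordered delivery, one lost packet holds back every later packet until the retransmission arrives, while unordered delivery hands each packet to the application on arrival:

```python
# Minimal sketch (invented numbers) of how ordered delivery turns a single
# packet loss into delay for subsequent packets. Packets are sent every 10 ms
# over a path with a 75 ms one-way delay; packet 3 is lost and its
# retransmission arrives 200 ms late.
SEND_INTERVAL, ONE_WAY, RETX_PENALTY, N, LOST = 10, 75, 200, 10, 3

arrival = {}
for i in range(N):
    t = i * SEND_INTERVAL + ONE_WAY
    arrival[i] = t + RETX_PENALTY if i == LOST else t

# Unordered delivery: each packet is handed to the application on arrival.
unordered = dict(arrival)

# Ordered delivery: packet i cannot be delivered before all packets < i.
ordered, latest = {}, 0
for i in range(N):
    latest = max(latest, arrival[i])
    ordered[i] = latest

blocked = [i for i in range(N) if ordered[i] > unordered[i]]
print(blocked)   # packets after the loss that are held back by HoLB
```

In this toy scenario, every packet sent after the lost one is blocked until the retransmission arrives, even though all of them reached the receiver on time.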
However, more recent studies have given ambiguous results as to whether HoLB substantially impedes signaling traffic performance or not. A simulation study carried out by
Camarillo et al. [3] suggests that avoidance of HoLB does not give SCTP a substantial performance increase over TCP under normal conditions, while an experiment by De Marco et
al. [9] indicates the opposite. Thus, the benefit of avoiding HoLB in SCTP still remains an
open issue.
To shed some further light on this issue, we have conducted a detailed experimental study
of the effects of HoLB on SCTP. Contrary to previous studies, our study uses SCTP for both
ordered and unordered delivery. Furthermore, we consider the effect of HoLB for a number
of network conditions, not just a single one.
The study provides a quantitative analysis of the effects of HoLB on transmission delays. The major contribution of the paper is that it indicates that while HoLB could, indeed,
substantially increase the transmission delay of a small fraction of the messages in an SCTP
session, it has only a marginal impact on the average message transmission delay.
The remainder of this paper is organized as follows. In Section 2, we provide some
preliminaries on SCTP and HoLB. The design and setup of our experiment are described in
Section 3. Then, in Section 4, the results of the experiment are presented and discussed.
Finally, the paper ends in Section 5 with a brief summary, concluding remarks,
and an outlook on future work.
[Figure: A multi-homed SCTP association between End Point A (application on port 10) and End Point B (application on port 20), connected through an IP network by a primary path and an alternate path.]

[Figure: TCP and SCTP applications running on Host 1 and Host 2.]
3 Methodology
The network topology and test setup used in our experiment are depicted in Figure 4. As shown, the network topology basically consisted of two hosts interconnected by a single network path. The test flow was a paced SCTP flow between the Flow Under Test (FUT) Source and Sink applications. Tests were run for four different SCTP send rates: 133 Kbps, 200 Kbps, 400 Kbps, and no pacing at all (i.e., more than 400 Kbps). In all tests, an SCTP message size of 500 bytes was used, and a test run always comprised 1000 SCTP messages.
The traffic competing with the SCTP flow, i.e., the cross traffic, comprised 0, 1, 2, or
8 greedy TCP bulk flows (i.e., flows that always had messages to send). The cross traffic
was transmitted between the TCP sources and sinks, in other words between the same hosts
[Figure 4: Test setup. The FUT Source and TCP Sources 1..N run on the Traffic Source host, and the FUT Sink and TCP Sinks 1..N on the Traffic Sink host; SCTP and TCP run over IP on both hosts. The hosts are interconnected through a Path Emulator (bandwidth: 400 Kbps, path delay: 75 ms, buffer: 12 pkts or 32 pkts) and are both attached to the same NTP server.]
as the SCTP flow. In all tests, the cross traffic was started 5 s before the SCTP flow. This ensured that the TCP flows had left the slow-start phase and reached the congestion-avoidance, or stationary, phase before the SCTP flow was started.
The traffic sources and sinks ran on 2.8 GHz PCs with FreeBSD 4.10 as the operating system. As a consequence, the TCP sources and sinks ran atop a NewReno [5] implementation of TCP, and the SCTP sources and sinks on the FreeBSD kernel implementation of SCTP in the KAME stack [7]. However, the KAME SCTP implementation was upgraded with patch 23, the latest patch at the time the experiment started. To keep the local clocks of the traffic source and sink hosts synchronized, which was needed, e.g., to measure the end-to-end transmission delays, both hosts used NTP and were attached to the same NTP server. A 2.8 GHz PC with FreeBSD 4.10 and dummynet [11] acted as path emulator.
Tests were run for two different path configurations. Both configurations used a bandwidth of 400 Kbps, a one-way path propagation delay of 75 ms, and drop-tail queueing in dummynet. However, the in-bound buffer size of dummynet differed between the two configurations: one configuration used a buffer size of 12 IP packets, and the other a buffer size of 32 IP packets.
Although the in-bound buffer of dummynet was varied, the sizes of the SCTP send and receive buffers at the traffic source and sink hosts were kept constant, and dimensioned to prevent flow-control events. In particular, the SCTP receive buffer at the FUT Sink was configured to 134 KBytes, or 268 messages, while the SCTP send buffer at the FUT Source was set to only 47.5 KBytes, or 95 messages. Thus, the FUT Source could never have more than 95 outstanding messages, and so could never overwhelm the FUT Sink, which could accommodate 268 messages. The reason we wanted to avoid flow-control induced regulation of the SCTP test flow will become evident in Section 4.
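The dimensioning argument is simple arithmetic (a sketch restating the figures from the text; 1 KByte is taken as 1000 bytes, which is consistent with 134 KBytes corresponding to 268 messages):

```python
# Buffer sizes from the text, converted to whole 500-byte SCTP messages.
MSG_SIZE = 500                      # bytes per SCTP message

recv_buffer = 134_000               # FUT Sink receive buffer (134 KBytes)
send_buffer = 47_500                # FUT Source send buffer (47.5 KBytes)

recv_msgs = recv_buffer // MSG_SIZE   # messages the sink can accommodate
send_msgs = send_buffer // MSG_SIZE   # maximum outstanding messages at the source

# The sender can never have more outstanding messages than the receiver
# can buffer, so receiver-side flow control is never triggered.
assert send_msgs < recv_msgs
print(recv_msgs, send_msgs)  # 268 95
```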
Results
To be able to study the impact of HoLB on SCTP message transmission delays, all tests were run for both ordered and unordered SCTP transmissions. Additionally, to make the sample message transmission delays essentially independent, and thus obviate the strong correlation that does indeed exist between the transmission delays of consecutive SCTP messages, the average end-to-end transmission delay over the messages in a test run was used as one sample of the message transmission delay in a test. Each test was repeated 100 times, and the mean of the sample transmission delays from the test runs was used as the metric for the message transmission delay of SCTP in a test.
Table 1 summarizes the results of our experiment: E2Eo denotes the mean transmission delay for ordered delivery; E2Eu, the mean transmission delay for unordered delivery; and Δ, the difference between the two. Table 1 also shows the 95% confidence interval for Δ, which tells us whether the difference between ordered and unordered transmission delays was statistically significant or not. Finally, the relative improvement in transmission delay for unordered delivery as compared to ordered delivery is given in column Δrel. A positive value in this column says that unordered delivery performed better than the corresponding ordered delivery, and a negative value the opposite.
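Given the per-run sample means, the table entries can be computed along the following lines (a sketch; the normal-approximation z-quantile and the unpaired two-sample standard error are our assumptions, since the paper does not spell out the exact interval construction):

```python
import statistics

def compare(ordered_samples, unordered_samples, z=1.96):
    """Mean difference Delta between per-run ordered and unordered delays,
    a normal-approximation 95% confidence half-width for Delta, and the
    relative improvement of unordered over ordered delivery (percent)."""
    mo = statistics.mean(ordered_samples)
    mu = statistics.mean(unordered_samples)
    delta = mo - mu
    # Independent runs: variance of the difference of the two sample means.
    se = (statistics.variance(ordered_samples) / len(ordered_samples)
          + statistics.variance(unordered_samples) / len(unordered_samples)) ** 0.5
    rel = 100.0 * delta / mo      # positive: unordered delivery was faster
    return delta, z * se, rel

# Three hypothetical per-run mean delays (s) for each delivery mode.
d, ci, rel = compare([0.92, 0.94, 0.90], [0.85, 0.86, 0.84])
print(f"delta = {d:.3f} +/- {ci:.3f} s, rel = {rel:.2f}%")
```

With 100 runs per test, as in the experiment, the normal approximation of the t-quantile is reasonable.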
As follows from Table 1, the mean end-to-end transmission delays for messages sent with the unordered transport service of SCTP were between 0% and 18% lower than the transmission delays for messages sent with the ordered transport service. In other words, there was only a small difference between the two transport services. However, it also follows from the table that there was a large variability in the differences between the two services, i.e., in Δ. Notably, in the majority of the tests, Δ and its 95% confidence interval were of the same magnitude.
One reason for the large variability in Δ could be that factors such as packet-loss rate and packet-loss distribution were uncontrolled, and thus might have differed between corresponding ordered and unordered tests. To control for this, we artificially generated the end-to-end transmission delays for the tests with ordered delivery from the measured send and receive times of messages in the corresponding tests with unordered delivery. Table 2 illustrates how the generation was carried out. The messages with IDs 1 and 2 arrived at the FUT Sink in order, and were thus given the same receive times in ordered delivery as they had in unordered delivery. In contrast, the messages with IDs 3 and 4 arrived at the FUT Sink before the message with ID 2 and, because of this, were given the same receive time as that message, in this way simulating the reordering that would have taken place had it been a real test run. Finally, the message with ID 5 already had a receive time in the unordered test that was later than 23, the receive time given to message 4 in the ordered test, and thus kept its receive time from the unordered test.
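The generation procedure illustrated in Table 2 amounts to taking a running maximum over the unordered receive times (a sketch, verified against the Table 2 trace):

```python
# Artificially generate ordered-delivery receive times from an unordered
# trace: a message cannot be delivered before every earlier message has been
# delivered, so its ordered receive time is the running maximum of the
# unordered receive times up to and including itself.

def ordered_from_unordered(recv_times):
    out, latest = [], float("-inf")
    for t in recv_times:
        latest = max(latest, t)
        out.append(latest)
    return out

# Trace from Table 2 (messages 1-5).
send = [10, 11, 12, 13, 14]
recv_unordered = [20, 23, 18, 19, 25]

recv_ordered = ordered_from_unordered(recv_unordered)
delays_ordered = [r - s for r, s in zip(recv_ordered, send)]
print(recv_ordered)    # [20, 23, 23, 23, 25]
print(delays_ordered)  # [10, 12, 11, 10, 11]
```

The output reproduces the ordered-delivery columns of Table 2: messages 3 and 4 are held back until message 2 has arrived, while message 5 is unaffected.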
We recognize that in the general case this kind of artificial generation is not valid, since it does not take into account that the flow control might behave differently in the ordered delivery tests as compared to the tests with unordered delivery. Specifically, SCTP has to buffer packets that arrive out of order in the ordered delivery tests, whereas in the corresponding unordered tests it can immediately pass them on to the FUT Sink application. Thus, in the ordered delivery tests, SCTP might run out of buffer space and, as a result of
Queue size 12 packets:
Send Rate (Kbps)  # TCPs  E2Eo (s)  E2Eu (s)  Δ with 95% C.I. (s)       Δrel (%)
> 400             0       0.92      0.85      6.67·10^-2 ± 6.67·10^-5   7.27
> 400             1       1.49      1.34      1.42·10^-1 ± 6.06·10^-2   9.58
> 400             2       1.94      1.86      8.15·10^-2 ± 8.40·10^-2   4.20
> 400             8       6.34      5.52      8.23·10^-1 ± 3.91·10^-1   12.96
400               0       0.44      0.36      7.75·10^-2 ± 1.79·10^-3   17.67
400               1       1.46      1.37      8.85·10^-2 ± 6.88·10^-2   6.06
400               2       1.78      1.69      8.41·10^-2 ± 1.20·10^-1   4.73
400               8       6.19      5.65      5.40·10^-1 ± 3.12·10^-1   8.74
200               0       0.09      0.09      6.60·10^-5 ± 7.33·10^-5   0.08
200               1       0.42      0.40      2.54·10^-2 ± 6.26·10^-2   6.04
200               2       1.16      1.07      9.02·10^-2 ± 1.89·10^-1   7.74
200               8       5.98      5.76      2.12·10^-1 ± 4.38·10^-1   3.55
133               0       0.09      0.09      6.70·10^-4 ± 7.70·10^-5   0.78
133               1       0.18      0.17      1.02·10^-2 ± 8.41·10^-3   5.79
133               2       0.27      0.23      4.39·10^-2 ± 3.41·10^-2   15.98
133               8       5.91      5.31      5.96·10^-1 ± 3.63·10^-1   10.09

Queue size 32 packets:
Send Rate (Kbps)  # TCPs  E2Eo (s)  E2Eu (s)  Δ with 95% C.I. (s)       Δrel (%)
> 400             0       0.92      0.88      3.74·10^-2 ± 8.53·10^-5   4.07
> 400             1       1.42      1.21      2.09·10^-1 ± 3.43·10^-2   14.71
> 400             2       1.79      1.68      1.03·10^-1 ± 8.85·10^-2   5.75
> 400             8       4.72      4.42      2.97·10^-1 ± 2.39·10^-1   6.29
400               0       0.43      0.39      4.02·10^-2 ± 1.06·10^-2   9.38
400               1       1.12      0.95      1.73·10^-1 ± 4.66·10^-2   15.38
400               2       1.89      1.78      1.16·10^-1 ± 1.12·10^-1   6.14
400               8       4.51      4.33      1.81·10^-1 ± 2.51·10^-1   4.01
200               0       0.09      0.09      4.28·10^-4 ± 7.82·10^-5   0.50
200               1       0.38      0.33      5.16·10^-2 ± 1.28·10^-2   13.49
200               2       0.80      0.76      4.43·10^-2 ± 8.60·10^-2   5.54
200               8       4.39      3.94      4.44·10^-1 ± 2.66·10^-1   10.11
133               0       0.09      0.08      1.40·10^-3 ± 7.44·10^-5   1.62
133               1       0.28      0.26      1.66·10^-2 ± 1.05·10^-2   5.94
133               2       0.43      0.42      1.12·10^-2 ± 3.07·10^-2   2.61
133               8       3.86      3.86      -3.17·10^-3 ± 2.74·10^-1  -0.08

Table 1: Experiment results for ordered (E2Eo) and unordered (E2Eu) delivery; Δ = E2Eo - E2Eu.
                            Delivery Service
                  Unordered              Ordered
Msg ID  Send Time  Recv Time  E2E Delay  Recv Time  E2E Delay
1       10         20         10         20         10
2       11         23         12         23         12
3       12         18         6          23         11
4       13         19         6          23         10
5       14         25         11         25         11

Table 2: End-to-end transmission delay for ordered delivery, artificially generated from unordered delivery traces.
this, have to advertise a smaller receiver window. However, in our tests, as remarked in Section 3, the SCTP send and receive buffers at the Traffic Source and Sink were dimensioned
to prevent flow-control induced throttling.
The artificially generated end-to-end transmission delays for ordered delivery are shown in Table 3. The same notation as in Table 1 is used; however, the metrics that involve the generated transmission times have been given an extra superscript g to signify this fact.
It follows from Table 3 that mitigating the impact of uncontrolled factors, such as the actual packet-loss rate and packet-loss distribution, between ordered and unordered delivery tests did not have any substantial effect on the previous result. Thus, we conclude that HoLB had a fairly small impact on the mean message transmission delay in tests with ordered delivery.
Although HoLB had a small impact on the mean message transmission delays, it sometimes had a large impact on individual messages. Consider Tables 4, 5, and 6. Table 4 presents the distribution of the end-to-end transmission delay of individual messages in the tests with unordered delivery, and Table 5 the corresponding distribution for ordered delivery using the artificially generated message transmission delays. To facilitate a comparison between the two distributions in Tables 4 and 5, Table 6 shows the relative increase in percent between corresponding percentiles for unordered and ordered delivery. It can be observed that although the effect of HoLB was marginal for most messages, a small percentage was indeed substantially affected. For example, as follows from Table 6, in the test with queue size 12 packets, send rate 400 Kbps, and no cross traffic, the increase of the median (i.e., p50) was limited to 6.90%, while the 95th percentile increased by as much as 69.82%.
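The percentile comparison of Tables 4-6 boils down to computing matching percentiles of the two delay samples and their relative difference (a sketch with synthetic delays; the helper `pct` and the "inclusive" quantile method are illustrative choices, not the paper's exact estimator):

```python
import statistics

def pct(samples, p):
    """p-th percentile (p in 1..99) using the 'inclusive' quantile method."""
    return statistics.quantiles(samples, n=100, method="inclusive")[p - 1]

# Synthetic per-message delays (s) for the same messages under each service.
unordered = [0.10, 0.12, 0.15, 0.20, 0.90]
ordered   = [0.10, 0.14, 0.25, 0.60, 1.50]

for p in (50, 75, 90, 95):
    rel = 100.0 * (pct(ordered, p) - pct(unordered, p)) / pct(unordered, p)
    print(f"p{p}: +{rel:.1f}%")
```

As in the paper's example, the tail percentiles can grow far more than the median when a few messages are blocked for a long time.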
It also follows from Table 3 that neither the traffic load nor the size of the dummynet buffer had any significant impact on the effect of HoLB. To further investigate the reason for this, we collected some statistics on the actual HoLB events. The statistics were collected from our artificially generated transmission traces for ordered delivery, and are presented in Table 7. The # Evts columns show the average number of HoLB events that occurred in a test run. The remaining columns show the average length of a HoLB event, both in terms of the number of affected messages and in terms of the total delay penalty imposed on these messages by a HoLB event. A message was considered to be affected by a HoLB event if its delivery time was prolonged during our artificial generation (cf. Table 2). To give some appreciation of the precision of the lengths of the HoLB events, their 95% confidence intervals are also shown in the table.
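The event statistics can be derived from an unordered trace roughly as follows (a sketch; treating an event as a maximal run of consecutive prolonged messages is our reading of the text):

```python
def holb_events(recv_unordered):
    """Return a list of (affected_messages, total_delay_penalty) per HoLB
    event, computed from an unordered receive-time trace. A message is
    affected if its artificially generated ordered receive time (the running
    maximum) exceeds its own receive time; consecutive affected messages
    form one event."""
    events, affected, penalty = [], 0, 0.0
    latest = float("-inf")
    for t in recv_unordered:
        latest = max(latest, t)
        if latest > t:                 # delivery prolonged: part of an event
            affected += 1
            penalty += latest - t
        elif affected:                 # run of affected messages ended
            events.append((affected, penalty))
            affected, penalty = 0, 0.0
    if affected:
        events.append((affected, penalty))
    return events

# Table 2 trace: messages 3 and 4 are blocked behind message 2.
print(holb_events([20, 23, 18, 19, 25]))  # [(2, 9.0)]
```

On the Table 2 trace this yields one event affecting two messages with a total delay penalty of 9 time units, i.e., (23 - 18) + (23 - 19).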
Queue size 12 packets:
Send Rate (Kbps)  # TCPs  E2Eo^g (s)  E2Eu (s)  Δrel^g (%)
> 400             0       0.92        0.85      7.33
> 400             1       1.48        1.34      8.95
> 400             2       2.00        1.86      7.36
> 400             8       5.97        5.52      7.46
400               0       0.44        0.36      17.47
400               1       1.45        1.37      5.42
400               2       1.81        1.69      6.32
400               8       6.09        5.65      7.24
200               0       0.09        0.09      0.00
200               1       0.43        0.40      8.31
200               2       1.15        1.07      6.30
200               8       6.19        5.76      6.92
133               0       0.09        0.09      0.00
133               1       0.18        0.17      6.91
133               2       0.26        0.23      12.09
133               8       5.73        5.31      7.30

Queue size 32 packets:
Send Rate (Kbps)  # TCPs  E2Eo^g (s)  E2Eu (s)  Δrel^g (%)
> 400             0       0.92        0.88      4.20
> 400             1       1.31        1.21      7.56
> 400             2       1.81        1.68      6.85
> 400             8       4.69        4.42      5.70
400               0       0.42        0.39      8.68
400               1       1.08        0.95      11.72
400               2       1.93        1.78      7.79
400               8       4.55        4.33      5.02
200               0       0.09        0.09      0.00
200               1       0.36        0.33      8.39
200               2       0.84        0.76      9.87
200               8       4.15        3.94      5.09
133               0       0.08        0.08      0.00
133               1       0.27        0.26      2.32
133               2       0.45        0.42      6.13
133               8       4.05        3.86      4.65

Table 3: Experiment results with artificially generated end-to-end transmission delays for ordered delivery.
Queue size 12 packets:
Send Rate (Kbps)  # TCPs  p50   p75   p90   p95    p99
> 400             0       0.93  0.95  0.95  1.05   1.49
> 400             1       1.28  1.59  2.09  2.37   2.74
> 400             2       1.77  2.35  2.89  3.30   4.17
> 400             8       4.96  6.82  8.92  10.78  16.96
400               0       0.37  0.45  0.50  0.53   0.78
400               1       1.28  1.58  2.22  2.54   3.09
400               2       1.61  2.20  2.84  3.23   4.76
400               8       5.03  6.86  9.18  11.20  19.03
200               0       0.09  0.09  0.09  0.09   0.09
200               1       0.17  0.44  1.09  1.47   2.24
200               2       0.65  1.58  2.52  3.58   4.99
200               8       4.93  6.86  9.23  11.77  22.20
133               0       0.09  0.09  0.09  0.09   0.09
133               1       0.13  0.19  0.26  0.38   0.77
133               2       0.17  0.24  0.42  0.60   1.35
133               8       4.81  6.68  9.23  11.77  19.24

Queue size 32 packets:
Send Rate (Kbps)  # TCPs  p50   p75   p90   p95   p99
> 400             0       0.93  0.95  0.95  1.10  1.39
> 400             1       1.22  1.44  1.70  1.96  2.82
> 400             2       1.55  2.01  2.63  3.26  4.40
> 400             8       4.11  5.70  7.24  8.33  10.25
400               0       0.39  0.50  0.59  0.62  0.76
400               1       0.97  1.28  1.53  1.82  2.54
400               2       1.72  2.20  2.90  3.21  3.68
400               8       4.03  5.56  7.00  7.97  9.91
200               0       0.09  0.09  0.09  0.09  0.09
200               1       0.29  0.45  0.59  0.80  1.26
200               2       0.53  1.12  1.58  1.99  2.96
200               8       3.64  5.15  6.74  7.78  9.93
133               0       0.08  0.09  0.09  0.09  0.09
133               1       0.24  0.40  0.49  0.53  0.60
133               2       0.37  0.52  0.76  1.01  1.67
133               8       3.53  5.18  7.02  8.18  10.70

Table 4: Percentiles for the end-to-end transmission delay (s) of individual messages during unordered delivery.
Queue size 12 packets:
Send Rate (Kbps)  # TCPs  p50   p75   p90   p95    p99
> 400             0       0.93  0.95  1.20  1.34   1.54
> 400             1       1.40  1.76  2.26  2.50   2.88
> 400             2       1.93  2.48  2.99  3.39   4.28
> 400             8       5.36  7.25  9.51  11.35  18.89
400               0       0.40  0.50  0.76  0.90   1.15
400               1       1.35  1.69  2.30  2.65   3.09
400               2       1.73  2.30  2.94  3.35   4.83
400               8       5.43  7.34  9.67  11.82  20.08
200               0       0.09  0.09  0.09  0.09   0.09
200               1       0.18  0.53  1.17  1.58   2.34
200               2       0.76  1.67  2.65  3.81   5.01
200               8       5.32  7.25  9.74  12.58  23.15
133               0       0.09  0.09  0.09  0.09   0.09
133               1       0.13  0.20  0.30  0.50   0.87
133               2       0.18  0.28  0.53  0.71   1.49
133               8       5.16  7.10  9.95  12.39  20.17

Queue size 32 packets:
Send Rate (Kbps)  # TCPs  p50   p75   p90   p95   p99
> 400             0       0.93  0.95  1.07  1.23  1.43
> 400             1       1.32  1.52  1.92  2.11  2.97
> 400             2       1.66  2.17  2.80  3.48  4.49
> 400             8       4.37  5.95  7.54  8.61  10.45
400               0       0.39  0.54  0.63  0.84  1.23
400               1       1.04  1.41  1.91  2.22  3.01
400               2       1.84  2.43  3.10  3.38  4.35
400               8       4.25  5.78  7.28  8.31  10.18
200               0       0.09  0.09  0.09  0.09  0.09
200               1       0.30  0.47  0.78  0.96  1.33
200               2       0.61  1.22  1.79  2.26  3.18
200               8       3.84  5.35  6.93  8.03  10.11
133               0       0.08  0.09  0.09  0.09  0.09
133               1       0.24  0.41  0.50  0.54  0.83
133               2       0.38  0.56  0.89  1.06  1.73
133               8       3.74  5.39  7.19  8.39  10.91

Table 5: Percentiles for the artificially generated end-to-end transmission delay (s) of individual messages during ordered delivery.
We observe that both the HoLB frequency and the delay penalty incurred by a HoLB event typically increased with increasing traffic load. However, it can also be observed that in almost all tests, the length of the HoLB events, in terms of affected messages, was inversely related to the HoLB frequency and the traffic load (i.e., the number of cross-traffic TCP flows). At low traffic load, the HoLB frequency was often low since there was typically no congestion, and thus very few packet losses. However, since there were few packet losses, the sender window could grow fairly large. Thus, when a HoLB event did in fact occur, it could involve a relatively large number of packets or messages. Conversely, at high traffic load the HoLB frequency was large and the congestion window, and consequently the sender window, was
Queue size 12 packets (%):
Send Rate (Kbps)  # TCPs  p50    p75    p90    p95    p99
> 400             0       0.03   0.20   26.88  27.49  3.47
> 400             1       9.74   10.90  8.36   5.48   5.23
> 400             2       9.25   5.70   3.54   2.48   2.44
> 400             8       8.20   6.28   6.59   5.33   11.35
400               0       6.90   10.79  51.12  69.82  47.14
400               1       5.26   6.51   3.29   4.39   0.07
400               2       7.69   4.21   3.65   3.82   1.62
400               8       7.96   7.00   5.36   5.46   5.50
200               0       0.00   0.00   0.00   0.00   0.00
200               1       3.17   20.24  6.82   7.38   4.50
200               2       16.49  5.49   5.19   6.58   0.32
200               8       7.75   5.76   5.58   6.85   4.30
133               0       0.00   0.00   0.00   0.00   0.00
133               1       1.34   3.27   13.38  31.91  12.95
133               2       4.61   15.57  28.79  17.99  9.68
133               8       7.33   6.31   7.81   5.23   4.84

Queue size 32 packets (%):
Send Rate (Kbps)  # TCPs  p50    p75    p90    p95    p99
> 400             0       0.20   0.09   12.71  11.35  3.11
> 400             1       7.94   5.35   12.57  7.77   5.35
> 400             2       6.93   7.83   6.53   6.78   2.02
> 400             8       6.37   4.37   4.13   3.42   1.99
400               0       0.00   8.33   7.42   35.97  62.36
400               1       7.18   10.32  25.03  21.92  18.41
400               2       6.84   10.69  6.97   5.24   18.40
400               8       5.56   3.94   3.95   4.24   2.66
200               0       0.00   0.00   0.00   0.00   0.00
200               1       1.43   5.40   31.69  19.49  5.95
200               2       16.68  8.35   13.46  13.88  7.39
200               8       5.32   3.82   2.92   3.12   1.88
133               0       0.00   0.00   0.00   0.00   0.00
133               1       1.27   1.17   1.45   2.42   39.60
133               2       3.50   7.50   16.14  5.82   3.74
133               8       5.82   4.05   2.50   2.58   1.96

Table 6: Relative difference between percentiles for the end-to-end transmission delay of individual messages during unordered and ordered delivery.
small. As a result, a single HoLB event only affected a few packets or messages. Translated to our results on the mean transmission delays, this means that the delay penalties introduced by an increased HoLB frequency were more or less balanced out by fewer messages being affected at each HoLB event. Put differently, the effect of HoLB became almost the same for all studied traffic loads.
Table 7 also explains why the effect of HoLB on the mean transmission delays did not decrease with increased buffer size. Although the packet-loss rate, and thus the HoLB frequency, was larger in the tests with a 12-packet buffer, the number of messages affected by a HoLB event was mostly larger in the tests with a 32-packet buffer. Consequently, the larger number of retransmissions in the 12-packet buffer tests was essentially compensated for by the larger size of the HoLB events in the 32-packet buffer tests.
5 Conclusions
Avoiding HoLB when transporting PSTN signaling traffic over IP was one of the primary incentives for the IETF SIGTRAN working group to develop SCTP, a new transport protocol, in the first place. Although several studies have been made on the impact of HoLB on TCP and SCTP, their results are ambiguous and, to some extent, contradictory. We have therefore conducted a detailed experimental study on the impact of HoLB on ordered delivery in SCTP. Our study suggests that although HoLB could indeed incur a substantial delay penalty on a small fraction of the messages in an SCTP session, it has only a marginal impact on the average end-to-end transmission delay. We only observed improvements in the range of 0% to 18% in average message transmission delay when using unordered delivery as compared to ordered delivery. Additionally, it was evident that other factors often had a larger impact on the average transmission delay than HoLB: a large variability between different test runs made the impact of HoLB statistically insignificant in several tests. Also worth noting is that the impact of HoLB did not always increase with increasing traffic load; the reason for this is that not only the frequency of HoLB events, but also the number of affected messages, determined the impact of HoLB. Thus, the largest impact of HoLB occurred when the SCTP sender window was relatively large while there was still a certain amount of HoLB events.
At present, we have only studied the effect of HoLB on a constant SCTP flow; in future work, however, we plan to extend our scope to SCTP flows that better capture the properties of actual PSTN and SIP signaling traffic. In this context, it may also be appropriate to consider the multi-streaming feature of SCTP.
References
[1] The 3rd generation partnership project (3GPP). http://www.3gpp.org.
[2] 3GPP. IP multimedia subsystem (IMS); stage 2 (release 6). Technical Specification TS
23.228 v6.9.0, 3GPP, March 2005.
Queue size 12 packets:
Send Rate (Kbps)  # TCPs  # Evts  Length (msgs) with 95% C.I.  Delay Penalty (s) with 95% C.I.
> 400             0       24.00   9.75 ± 0.62                  2.80 ± 0.16
> 400             1       16.75   15.21 ± 0.92                 7.89 ± 0.94
> 400             2       14.80   18.20 ± 0.99                 9.97 ± 0.91
> 400             8       30.74   10.99 ± 0.32                 13.19 ± 1.20
400               0       7.84    31.38 ± 1.78                 9.76 ± 0.78
400               1       14.03   15.32 ± 0.76                 5.60 ± 0.52
400               2       15.13   15.82 ± 0.81                 7.57 ± 0.76
400               8       31.08   11.12 ± 0.31                 13.05 ± 1.38
200               0       0.00    0.00                         0.00
200               1       9.20    11.66 ± 0.63                 3.89 ± 0.37
200               2       13.56   14.19 ± 0.56                 5.34 ± 0.47
200               8       31.58   10.67 ± 0.28                 12.47 ± 1.10
133               0       0.00    0.00                         0.00
133               1       4.98    11.12 ± 0.43                 2.47 ± 0.15
133               2       10.60   11.22 ± 0.36                 3.00 ± 0.19
133               8       31.23   10.83 ± 0.29                 12.20 ± 0.87

Queue size 32 packets:
Send Rate (Kbps)  # TCPs  # Evts  Length (msgs) with 95% C.I.  Delay Penalty (s) with 95% C.I.
> 400             0       40.00   2.35 ± 0.23                  0.97 ± 0.06
> 400             1       6.06    32.66 ± 1.56                 16.37 ± 0.90
> 400             2       11.65   17.78 ± 1.09                 10.64 ± 0.91
> 400             8       14.87   16.14 ± 0.67                 17.64 ± 1.69
400               0       1.20    68.99 ± 5.07                 30.72 ± 2.45
400               1       9.07    23.05 ± 1.50                 13.91 ± 1.52
400               2       8.86    22.46 ± 1.44                 16.95 ± 1.98
400               8       14.79   15.46 ± 0.61                 14.86 ± 1.27
200               0       0.00    0.00                         0.00
200               1       4.42    20.85 ± 1.29                 6.86 ± 0.53
200               2       9.44    17.13 ± 1.20                 8.77 ± 1.04
200               8       15.03   15.21 ± 0.61                 13.38 ± 0.98
133               0       0.00    0.00                         0.00
133               1       1.20    20.58 ± 0.97                 5.21 ± 0.55
133               2       4.32    19.21 ± 0.96                 6.34 ± 0.38
133               8       14.69   15.14 ± 0.55                 12.84 ± 0.77

Table 7: Statistics for HoLB events calculated from artificially generated transmission traces for ordered delivery.
Paper VIII
1 Introduction
Unlike a datacom network, a telecom network logically comprises two networks: a transport
and a signaling network. The transport network carries the voice traffic, while the signaling
network carries the control information that is needed for the administration and supervision
of calls, and the management of the telecom network itself.
Traditionally, signaling traffic and voice traffic are both carried over TDM-based, circuit-switched connections. However, this is about to change. Using IP networks and protocols, telecom operators see ways to improve resource utilization and to reduce operational, maintenance, and network infrastructure costs. Still, the transition from TDM to IP
will not happen overnight, maybe never. The traditional telecom network represents a huge capital investment (more than $350 billion of legacy equipment is installed in the current telecom network [2]) and is still unsurpassed in terms of reliability and QoS [11]. To address
the situation of two different, co-existing networks, one TDM-based and one IP-based, the IETF SIGTRAN working group has developed an architecture for signaling traffic over IP. In particular, they have developed an architecture for running Signaling System No. 7 (SS7), the predominant signaling system in traditional TDM-based telecom networks, over IP. Together with the so-called SoftSwitch architecture, the SIGTRAN architecture [13] constitutes a complete solution for the integration of the two networks.
Interoperability between the traditional TDM-based telecom network and its IP counterpart requires that the signaling performance in the IP network be comparable to that of TDM. Although some time has passed since the SIGTRAN architecture was first published, it is still unclear whether it will perform comparably to the traditional telecom network [4], or whether it will lead to unacceptable performance degradations [5].
The SIGTRAN architecture specifies a common transport protocol for all SS7 signaling traffic, SCTP [17], and a number of adaptation layers that run on top of SCTP. Although several adaptation layers have been specified, it seems as if a majority of telecom companies have embraced the MTP-L3 User Adaptation Layer (M3UA) [16]. This adaptation layer mimics the functionality of MTP-L3, the SS7 transport layer, and makes it possible to run all layers of the SS7 stack above MTP-L3 on top of SCTP without modification.
The Message Transfer Part (MTP) of the SS7 stack, of which MTP-L3 is the topmost
layer, is not only responsible for the reliable transmission of signaling traffic, but also for
network redundancy. In particular, link failures in traditional TDM-based SS7 networks are
primarily managed by MTP. When a link failure occurs, this is detected by layer 2 in MTP
(MTP-L2). MTP-L2 informs MTP-L3 about the failed link, and a so-called changeover is
performed by MTP-L3. The changeover procedure diverts the signaling traffic carried by
the unavailable link to alternate links as quickly as possible while avoiding message loss,
duplication, or reordering.
To obtain network redundancy in a SIGTRAN network corresponding to that of a traditional SS7 network, SCTP supports so-called multi-homed associations. Multi-homed associations make it possible to manage several TCP-like connections, called paths in SCTP, as one redundant logical connection. When one path goes down, SCTP performs a failover and switches all traffic to an alternate path. A failover mechanism similar to the one in SCTP is also provided by M3UA; we therefore henceforth refer to failovers in SCTP as SCTP-controlled failovers.
This paper evaluates the performance of SCTP-controlled failovers in M3UA-based SIGTRAN networks, both in terms of SCTP failover times and in terms of the maximum Message Signal Unit (MSU) transfer times experienced by M3UA users during failovers. Moreover, the paper studies to what extent the performance of SCTP-controlled failovers correlates with the path propagation delay and with the SCTP parameter Path.Max.Retrans, the upper bound on the SCTP path error counter.
Our main contribution is to show that in order to achieve performance similar to the changeover procedure in a traditional SS7 network, SCTP has to be configured much more aggressively than what is recommended in RFC 2960. It is also shown that for the envisioned path propagation delays in future SIGTRAN networks, the effect of the path propagation delay on the failover performance is only minor.
[Figure 1: Evaluated network scenario. Two signaling end points, SEP1 and SEP2, each run an M3UA application atop M3UA/SCTP/IP and are connected through an IP network by a primary path and an alternative path.]
2 Methodology
The purpose of our experiment was to evaluate the performance of SCTP-controlled failovers in the typical network scenario depicted in Figure 1. SEP1 and SEP2 are two SIGTRAN signaling end points, each running an M3UA application. The two M3UA applications are engaged in a signaling session in which the SEP1 application acts as the source of the signaling traffic and the SEP2 application acts as the sink. During the signaling session, SEP2 becomes unreachable via its primary path; SCTP at SEP1 detects the failed primary path and performs a failover to the alternate path. When the failover has completed, the signaling session continues on the alternate path, and ends before the primary path has recovered.
To evaluate the failover performance of SCTP in the network scenario of Figure 1, we used the experiment setup illustrated in Figure 2. The flow of events in the test runs of the experiment closely mimicked the flow of events in the evaluated network scenario. The source application at SEP1 continuously sent MSUs to the sink application at SEP2. When 30 s of a test run had elapsed, i.e., more than enough time for SCTP to enter its stationary transmission behavior, the primary path was broken. A failover occurred, and the source application resumed its transmission on the alternate path. The test run ended 90 s after the primary path was taken down, which was enough time for SCTP to conclude the failover and regain its stationary transmission behavior on the alternate path.
[Figure 2: Experiment setup. The SEP1 source application and the SEP2 sink application (SEP2: a Sun Ultra 10) run atop M3UA/SCTP/IP on Solaris 8 hosts, with a test manager at SEP2. The primary and alternate paths are emulated by dummynet on two PCs, L1 (400 MHz) and L2 (230 MHz, running the L2 Path Manager), and the hosts are attached over a LAN to an NTP server.]
Parameter               SCTP Configuration
                        RFC2960     Telecom(p)
RTOinit                 3000 ms     80 ms
RTOmin                  1000 ms     80 ms
RTOmax                  60000 ms    150 ms
Path.Max.Retrans (p)    5           2, 3, 4, 5
Heartbeat Interval      30000 ms    30000 ms
SACK Timer              200 ms      40 ms

Table 1: SCTP configurations considered in the experiment: RFC2960 and Telecom(2) through Telecom(5).

[Table 2: The tested combinations of SCTP configuration (RFC2960, Telecom(2)-Telecom(5)) and path propagation delay.]
The Telecom(p) notation alludes to the fact that these configurations are all variations of Telecom(2), which is the configuration recommended by some large telecom companies. In particular, the other three Telecom configurations included in the experiment are all examples of SCTP configurations which, in terms of failover, are more conservative than Telecom(2).
Tests were performed with three different path propagation delays: 5 ms, 10 ms, and 20 ms. These delays are believed to represent typical path propagation delays in future dedicated SIGTRAN networks.
Only a subset of the possible combinations of path propagation delay and SCTP configuration was tested. Specifically, our experiment comprised the 11 tests listed in Table 2. Each test was run 10 times, giving a total of 110 test runs.
As follows from Table 2, RFC2960, Telecom(2), and Telecom(5) were tested with all three path propagation delays. This made it possible for us to study the correlation between failover performance and path propagation delay for, on the one hand, the SCTP configuration recommended by the IETF and, on the other hand, the two extremes, in terms of failover conservativeness, among the Telecom configurations. The SCTP configurations Telecom(3) and Telecom(4) were only tested with a path propagation delay of 10 ms. However, combined with the corresponding tests for Telecom(2) and Telecom(5), these tests enabled us to study the correlation between the SCTP failover performance and the SCTP parameter Path.Max.Retrans.
3 Results
As briefly mentioned in Section 2, event logging at SEP1 and SEP2 took place in all test runs. Specifically, the time the primary path was broken and the time the path failure was detected by SCTP at SEP1 were logged. The failover time in a test run was then calculated as the difference between the SCTP detection time and the actual time of the path failure.
The sending times of the MSUs by the source application and the reception times of the MSUs by the sink application were also logged during each test run. (Note that the timing of the MSUs occurred at the level of the M3UA application, and not at the SCTP level.) Based on these values, the MSU transfer times were calculated as the difference between the reception and sending times of the MSUs.
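Both metrics are plain differences of logged timestamps (a sketch; the log layout and all timestamps below are hypothetical, not values from the experiment):

```python
# Failover time: SCTP's detection time of the path failure minus the actual
# failure time. MSU transfer time: reception time minus sending time, both
# logged at the M3UA application level.

def failover_time(t_failure, t_detected):
    return t_detected - t_failure

def msu_transfer_times(send_log, recv_log):
    """send_log/recv_log map MSU sequence number -> timestamp (s)."""
    return {seq: recv_log[seq] - send_log[seq] for seq in recv_log}

send_log = {1: 30.000, 2: 30.010, 3: 30.020}
recv_log = {1: 30.005, 2: 30.480, 3: 30.490}   # MSUs 2-3 delayed by a failover

print(round(failover_time(30.010, 30.455), 3))               # 0.445
print(max(msu_transfer_times(send_log, recv_log).values()))  # worst-case MSU
```

The maximum over all per-MSU differences in a run gives the maximum MSU transfer time reported in the paper.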
Figure 3 and Table 3 summarize the results of the measurements of the failover times and the MSU transfer times for the three SCTP configurations RFC2960, Telecom(2), and Telecom(5). Recall from Section 2 that RFC2960 is the configuration of SCTP recommended in RFC 2960 [17]; that Telecom(2) is an SCTP configuration with strong proponents in the telecom sector; and that Telecom(5) is a conservative version of Telecom(2). In particular, Telecom(5) is a merge of Telecom(2) and RFC2960: the RTO parameters of Telecom(5) are the same as for Telecom(2), i.e., they are set with respect to the envisioned delays in future SIGTRAN networks, while the failover behavior of Telecom(5) is as conservative as that of RFC2960.
The lin-log graphs in Figure 3(a) plot the sample means of the measured failover times in the tests as a function of the path propagation delay. The sample means are also listed in Table 3, together with their corresponding 99% confidence intervals.
It follows from Table 3 that the mean failover times for RFC2960 were on the order of 63 s for all three path propagation delays considered. This is not surprising since, with five retransmissions until a path is abandoned (i.e., Path.Max.Retrans = 5), the theoretical failover time for RFC2960 (assuming that RTO = RTOmin, which was the case in all our tests) becomes exactly 63 s: 1 s + 2 s + 4 s + 8 s + 16 s + 32 s = 63 s.
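The same calculation works for any configuration in Table 1 (a sketch; it assumes, as stated in the text, that RTO starts at RTOmin and doubles on each timeout, capped at RTOmax):

```python
def theoretical_failover_time(rto_min_ms, rto_max_ms, path_max_retrans):
    """Sum of the p+1 consecutive retransmission timeouts that elapse before
    a path is abandoned, with RTO starting at rto_min_ms and doubling on
    each timeout (capped at rto_max_ms)."""
    total, rto = 0, rto_min_ms
    for _ in range(path_max_retrans + 1):
        total += rto
        rto = min(2 * rto, rto_max_ms)
    return total

# RFC2960: RTOmin = 1 s, RTOmax = 60 s, Path.Max.Retrans = 5
print(theoretical_failover_time(1000, 60000, 5))  # 63000 ms

# Telecom(2): RTOmin = 80 ms, RTOmax = 150 ms, Path.Max.Retrans = 2
print(theoretical_failover_time(80, 150, 2))      # 380 ms
```

The Telecom(2) figure of 380 ms is a theoretical lower bound; it is consistent with, though somewhat below, the measured failover times of 435 ms to 505 ms.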
As shown in Figure 3(a), the failover times for the Telecom configurations were several orders of magnitude smaller than for RFC2960. In particular, it follows from Table 3 that the failover times of Telecom(2) were mostly in the range of 435 ms to 505 ms, while Telecom(5) had roughly twice the failover times of Telecom(2).
As mentioned in Section 1, the path failure scenario corresponding to the one studied in our experiment is managed by the MTP-L3 changeover procedure in a traditional SS7 network. According to ITU-T recommendation Q.706 [8], the changeover time in an SS7 network must be less than or equal to 800 ms. Since basically the same applications will be used in future SIGTRAN networks as are used in current SS7 networks, it is reasonable to assume that the requirements are roughly the same. Thus, it follows from our experiment that RFC2960 most likely will fail to meet the Q.706 requirement on changeover. In fact, the failover times of RFC2960 were almost 80 times the changeover limit of Q.706. This is, of course, to be expected, and is in agreement with the results reported in [5] and [9]. More interestingly, we observe that while the failover times of Telecom(2) were well below the changeover limit of Q.706, this was not the case for Telecom(5). Thus, it seems that if
3. Results
257
1e+06
RFC2960
Telecom(2)
Telecom(5)
100000
10000
1000
100
10
1
0
10
15
Path Propagation Delay (ms)
20
25
1e+06
RFC2960
Telecom(2)
Telecom(5)
100000
10000
1000
100
10
1
0
10
15
Path Propagation Delay (ms)
20
25
258
SCTP is to be used for transfer of signaling traffic, it not only has to abandon the conservative
RTO settings of RFC 2960, but also has to switch from a failed path less conservatively than
recommended by RFC 2960.
Figure 3(a) and Table 3 also suggest that the path propagation delay had only a minor impact on the SCTP failover time, at least for propagation delays no greater than 20 ms, i.e., for those path propagation delays considered typical in future dedicated SIGTRAN networks. Specifically, when the path propagation delay was increased from 5 ms to 20 ms, the increase in mean failover time was much less than 1% for RFC2960, about 5% for Telecom(2), and close to 12% for Telecom(5).
Still, there was indeed a correlation between failover time and path propagation delay, although, as follows from Table 3, it could only be established for RFC2960 and Telecom(5). For these two SCTP configurations there was, with 99% confidence, an increase in failover time when the path propagation delay increased from 5 ms to 20 ms.
In the same way as for the failover times, Figure 3(b) and Table 3 give the results of the
measurements of the maximum MSU transfer times. To avoid having the SCTP slow start
and the transient behavior of SCTP during the termination of a test run interfere with the
results, the first and last seconds of a test run were excluded from the calculation.
The graphs show that the maximum MSU transfer times for RFC2960 and Telecom(2)
were almost the same as their failover times, while Telecom(5) had maximum MSU transfer
times about 380 ms shorter than its failover times. Contrary to the failover times, there is no ITU-T recommendation that explicitly governs the MSU transfer times. Instead, the upper bound on the MSU transfer times is determined by the application layers atop MTP-L3, i.e., the MTP-L3 stakeholders.
The primary stakeholders of MTP-L3 in terms of MSU transfer time are the ISUP (ISDN User Part) [7] and TCAP (Transaction Capabilities Application Part) [6] application protocols. The basic function of ISUP is to control the setup, connection, and teardown of telephone calls, while TCAP is an application protocol used by a large number of distributed SS7 applications. Examples of applications using TCAP include various Intelligent Networking (IN) applications and mobility-support applications in mobile networks (i.e., GSM and IS-41).
Although neither ISUP nor TCAP imposes any explicit requirements on MSU transfer times, analyses [1, 10, 15] suggest that the maximum permissible MSU transfer times with respect to these application protocols are in the range of 600 ms to 1000 ms, with 1000 ms being barely acceptable. With these figures in mind, it is obvious that RFC2960, with maximum MSU transfer times of about 63 s, did not comply with the ISUP/TCAP requirements. Again, as with the RFC2960 failover times, this was to be expected. Less expected was that Telecom(5) also had some difficulty passing the ISUP/TCAP requirements. As follows from Table 3, the mean maximum MSU transfer time for Telecom(5) at a path propagation delay of 20 ms was 718 ms. Considering that the ISUP/TCAP requirements are worst-case values, and that the measurements took place in a scenario with no competing traffic, Telecom(5) may not give adequate MSU transfer times during a failover in a real SIGTRAN network. Thus, the outcome of the maximum MSU transfer time measurements only reinforces that of the failover time measurements: If SCTP is to
[Table 3: 99% confidence intervals for failover performance vs. path propagation delay.]
[Two panels: failover time (top) and maximum MSU transfer time (bottom), in ms (0 to 1400), vs. Path.Max.Retrans for Telecom(p).]
Figure 4: Failover performance vs. Path.Max.Retrans for Telecom(p) (10 ms path propagation delay).
be used for signaling traffic, then it has to be much less conservative than recommended by
RFC 2960.
Figure 3(b) and Table 3 also show that the path propagation delay had only a minor impact on the maximum MSU transfer times experienced during a failover. Furthermore, the correlation between maximum MSU transfer time and path propagation delay was weak, and could only be established for Telecom(5).
We also performed a more detailed study of the impact of the SCTP parameter Path.Max.Retrans on the SCTP failover performance. The outcome of this study is compiled in the graphs in Figure 4. The graphs plot the sample means of the measured failover times and maximum MSU transfer times together with their 99% confidence intervals.
It follows from the graphs that the value of Path.Max.Retrans indeed had a major impact on the failover time. An increase of Path.Max.Retrans from 2 to 3 resulted in a relative increase of the mean failover time of about 40%, and when Path.Max.Retrans was increased from 3 to 4, or from 4 to 5, the relative increase was about 20% in both cases. Even more important, already with a Path.Max.Retrans of 4, SCTP failed to meet the failover requirement of Q.706, again reinforcing the need for SCTP to be much more aggressive than recommended by RFC 2960 if it is to be used for SS7 signaling transport.
The graphs also show that the value of Path.Max.Retrans had some influence on the maximum MSU transfer time. Specifically, the maximum MSU transfer time increased by approximately 35% when the value of Path.Max.Retrans was changed from 2 to 5. However, the maximum MSU transfer times were below the ISUP/TCAP requirements for all values of Path.Max.Retrans. Thus, in terms of MSU transfer time, there was no problem having Path.Max.Retrans configured as conservatively as recommended by RFC 2960 while still meeting the SS7 signaling transport needs.
4 Conclusions
This paper presents an evaluation of the performance of SCTP-controlled failovers in future
M3UA-based SIGTRAN networks. The evaluation suggests that in order to meet the failover
performance objectives of a traditional SS7 network, SCTP has to abandon the conservative failover behavior recommended by RFC 2960. Specifically, it has to set the parameter
Path.Max.Retrans to a value no larger than 3. In addition, it has to change from the
RTO-parameter configuration recommended by RFC 2960 to a parameter configuration far
more in line with the actual path propagation delays in the SIGTRAN network.
The evaluation also suggests that the configuration of the SCTP parameter Path.Max.Retrans has a major impact on the failover performance: especially in terms of failover time, but also, to some extent, in terms of the maximum MSU transfer time experienced by an M3UA application during failover.
In contrast, the evaluation indicates that for path propagation delays in the range of 5 ms to 20 ms, i.e., for path propagation delays believed to be representative of dedicated SIGTRAN networks, the path propagation delay has only a minor impact on the failover performance. (The dotted lines in the graphs of Figure 4 are only provided to make the trends clearer, and do not suggest that Path.Max.Retrans is continuous.)
Our future work includes studying the effects of introducing competing signaling traffic on the performance of SCTP-controlled failovers, and in particular the tradeoff between shorter failover times and spurious failovers. We also want to study to what extent the SCTP failover performance degrades with different levels and mixtures of competing traffic. Furthermore, it remains to find out how other configurable SCTP parameters, e.g., RTO_min and RTO_max, affect the failover performance.
References
[1] T. Seth, A. Broscius, C. Huitema, and H-A. P. Lin. Performance requirements for TCAP signaling in Internet telephony. Internet draft, IETF, February 1999. Work in Progress.
[2] W. Buga. The evolution of softswitch architecture. Annual Review of Communications, 55:73–76, 2002.
[3] A. L. Caro Jr., J. R. Iyengar, P. D. Amer, G. J. Heinz, and R. R. Stewart. Using SCTP multihoming for fault tolerance and load balancing. In SIGCOMM 2002 Poster Session, Pittsburgh, Pennsylvania, USA, August 2002.
[4] L. Coene and J. Pastor. Telephony signalling transport over SCTP applicability statement. Internet draft, IETF, August 2003. Work in Progress.
[5] K. D. Gradischnig and M. Tuexen. Signaling transport over IP-based networks using IETF standards. In 3rd International Workshop on the Design of Reliable Communication Networks (DRCN), pages 168–174, Budapest, Hungary, October 2001.
[6] ITU-T. Signalling system no. 7 functional description of transaction capabilities.
Recommendation Q.771, ITU-T, June 1997.
[7] ITU-T. Signalling system no. 7 ISDN user part functional description. Recommendation Q.761, ITU-T, December 1999.
[8] ITU-T. Signalling system no. 7 message transfer part signalling performance. Recommendation Q.706, ITU-T, March 1999.
[9] A. Jungmaier, E. P. Rathgeb, and M. Tuexen. On the use of SCTP in failover scenarios. In 6th World Multiconference on Systemics, Cybernetics and Informatics, Orlando, Florida, USA, July 2002.
[10] H-A P. Lin, K-M Yang, T. Seth, and C. Huitema. VoIP signaling performance requirements and expectations. Internet draft, IETF, October 1999. Work in Progress.
[11] P. Molinero-Fernandez, N. McKeown, and H. Zhang. Is IP going to take over the world (of communications)? ACM Computer Communication Review, 33:113–118, January 2003.
Paper IX
Impact of Traffic Load on SCTP Failovers
in SIGTRAN
Reprinted from
1 Introduction
Since Voice over IP (VoIP) roared into prominence during the latter part of the 1990s, the idea of a converged network based on IP technology for voice, video, and data has gained strong momentum. However, in spite of all the prospective advantages of IP, it would be naive to think that the transition from the traditional circuit-switched network to IP will happen overnight.
In light of this, the IETF Signaling Transport (SIGTRAN) working group has defined
an architecture, the SIGTRAN architecture [13], for seamless Signaling System #7 (SS7)
signaling between VoIP and the traditional telecom network. The SIGTRAN architecture
essentially comprises two components: a new IP transport protocol, the Stream Control
Transmission Protocol (SCTP) [16], specifically designed for signaling traffic; and an adaptation sublayer. The adaptation sublayer shields SS7 from SCTP and IP, and depending on
how much of the SS7 stack is run atop SCTP, different adaptation protocols are used. Examples of adaptation protocols include: M2PA [4] for adaptation of the SS7 MTP-L3 [6]
protocol to IP, and M3UA [15] for adaptation of SCCP [7] and user part protocols such as
ISUP [8].
It is widely recognized that to gain user acceptance, the SIGTRAN architecture has to perform comparably to the traditional circuit-switched telecom network [3]. In particular, it has to provide the same level of availability as a traditional SS7 network. Considering that ITU-T prescribes an availability level of 99.9988% [9], i.e., no more than 10 minutes of downtime per year, and that many telecom networks have an even higher availability level [11], this is indeed a great challenge.
To meet the stringent requirements of SS7, several availability mechanisms have been included in the SIGTRAN architecture, of which the SCTP failover mechanism is one of the more important ones, if not the most important one. It corresponds to the MTP-L3 changeover procedure, and enables rapid re-routing of traffic away from a failed signaling route within a SIGTRAN network. In particular, the SCTP failover mechanism constitutes part of SCTP's multi-homing support.
Although the SCTP failover mechanism plays a key role in the availability support of the SIGTRAN architecture, very few results are available on its actual performance in this context. Jungmaier et al. [10] have studied the SCTP failover performance in an M2PA-based network, and showed that it only meets the ITU-T requirements provided it is configured very aggressively and the network path propagation delays are very short. A similar result was obtained by Grinnemo et al. [5] when they performed measurements on the SCTP failover performance in an M3UA-based network.
Both the study in [10] and the one in [5] took place in unloaded networks, i.e., under quite unrealistic conditions. This paper advances the work in [5], and partly the work in [10], by studying the impact of traffic load on the SCTP failover performance in an M3UA-based SIGTRAN network. The main contribution of the paper is that it demonstrates that cross traffic, especially bursty cross traffic such as SS7 signaling traffic, could indeed significantly deteriorate the SCTP failover performance. Furthermore, the paper stresses the importance of keeping the router queues in a SIGTRAN network relatively small. In fact, the paper shows that bursty traffic in combination with ill-dimensioned router queues may well cause the SCTP failover mechanism to not comply with the ITU-T requirement on the MTP-L3 changeover procedure [9]. Finally, the paper identifies some issues regarding the design of the SCTP failover mechanism which in some cases negatively affect the failover performance.
The remainder of the paper is organized as follows. Section 2 gives a brief description
of the SCTP failover mechanism. Then, in Section 3 follows a description of the design and
execution of the experiment that underlies our study. Next, in Section 4, we elaborate on
the results of the experiment. Finally, in Section 5, the paper ends with some concluding
remarks and words on future work.
2 Failovers in SCTP
While IP networks have many virtues, high availability and reliability have traditionally not been seen as two of them. Unlike circuit-switched paths, which exhibit changeover and failover times on the order of milliseconds, measurements show that it may take well over ten seconds before the routers in the Internet reach a consistent view after a path failure [12]; in other words, too long for delay-sensitive SS7 signaling traffic.
In the SIGTRAN architecture, the unsuitability of IP for high-availability routing of SS7
signaling messages is addressed through various redundancy mechanisms at the transport
and adaptation layers. As previously mentioned, one of the most important network redundancy mechanisms in SIGTRAN is the SCTP failover mechanism.
An example of how the SCTP failover mechanism works is illustrated in Figure 1. In
this example, we have an SCTP connection, a so-called association, between two signaling
end points: SEP-A and SEP-B. The association comprises two routing paths: path #1 and
path #2. Since SCTP does not support load-sharing, one path in an association is always
designated the primary path and is the path on which signaling traffic is normally sent. The
remaining paths, if any, become backup or alternate paths. In our example, path #1 is the
primary path and path #2 an alternate path.
SCTP continuously monitors reachability on the primary and alternate paths: on an active primary path, SCTP probes for reachability using the transferred data packets themselves, while on idle alternate paths dedicated heartbeat packets are used. Furthermore, for each path (actually, for each network destination), SCTP keeps an error counter that counts the number of consecutively missed acknowledgements of data or heartbeat packets. A path is considered unreachable when the error counter of the path exceeds the value of the SCTP parameter Path.Max.Retrans. In the remaining discussion, it is assumed that the SCTP stacks at SEP-A and SEP-B are configured with Path.Max.Retrans set to 2.
As follows from the time line in Figure 1, a failure occurs on the primary path at time t1. At that time, the SCTP retransmission timeout (RTO) variable is assumed to be 240 ms, and it is assumed that there is outstanding traffic. Thus, at t2 = t1 + 240 ms, the SCTP retransmission timer, T3-rtx, expires and a timeout occurs; an SCTP packet worth of outstanding data is retransmitted on the alternate path, and the error counter of the primary path is incremented by one. Furthermore, the RTO variable is backed off, or more precisely

    RTO ← min{max(2 × RTO_cur, RTO_min), RTO_max},    (1)

where RTO_cur denotes the current value of the RTO variable, and RTO_min and RTO_max are SCTP parameters that limit the range of the RTO variable. Here, it is assumed that RTO_min is set to 80 ms and RTO_max to 250 ms.
At time t3 , new data is sent out on the primary path, and the T3-rtx timer is restarted with
the value of the updated RTO variable. The flow of events that occurred at times t2 and t3 are
repeated at times t4 and t5 . When time t6 is reached, the error counter of the primary path
becomes 3, i.e., greater than Path.Max.Retrans, and SCTP considers the path failed
and starts sending new data onto the alternate path. In other words, the failover concludes.
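The backoff rule in Equation (1) and the walkthrough above can be sketched as follows (an illustrative reconstruction, not code from the paper; the function names are ours). With RTO = 240 ms at the time of the failure, RTO_min = 80 ms, RTO_max = 250 ms, and Path.Max.Retrans = 2, the three timeout periods sum to a theoretical failover time of 740 ms:

```python
def backoff(rto_cur, rto_min, rto_max):
    """Equation (1): RTO <- min{max(2 * RTO_cur, RTO_min), RTO_max}."""
    return min(max(2 * rto_cur, rto_min), rto_max)

def failover_time(rto_at_failure, rto_min, rto_max, path_max_retrans):
    """Sum the timeout periods until the error counter exceeds
    Path.Max.Retrans and the failover concludes (as in Figure 1)."""
    total, rto = 0, rto_at_failure
    for _ in range(path_max_retrans + 1):
        total += rto                        # T3-rtx expires after one RTO
        rto = backoff(rto, rto_min, rto_max)
    return total

print(backoff(240, 80, 250))           # -> 250, as in Figure 1
print(failover_time(240, 80, 250, 2))  # -> 740 (= 240 + 250 + 250 ms)
```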
[Figure 1: Example of an SCTP failover. An association between SEP-A and SEP-B runs over a SIGTRAN network with a primary path (path #1) and an alternate path (path #2). The time line shows three successive T3-rtx timeouts on the primary path: the first after 240 ms, after which the RTO is backed off to min{max{2 × 240, 80}, 250} = 250 ms, where it remains, since min{max{2 × 250, 80}, 250} = 250 ms.]
3 Methodology
To be able to study the impact of traffic load on the SCTP failover performance, we considered the network scenario depicted in Figure 2.
In this scenario, two M3UA users at signaling end points SEP1 and SEP2 were engaged
in a signaling session over a SIGTRAN network with varying degrees of traffic load. The
session took place over a multi-path association with one primary and one alternate path.
Initially, all signaling traffic in the M3UA session was routed on the primary path. However,
30 s into the signaling session a failure occurred on the primary path. As a result, the signaling traffic was re-routed from the primary to the alternate path. The network scenario ended
[Figure 2: Experiment network setup. SEP1 and SEP2 (Sun Ultra 10, Solaris 8) ran the M3UA source and sink applications over SCTP, while SEP3–SEP6 (PCs, Red Hat 8) ran source and sink applications generating SCTP/UDP cross traffic. Router1 and Router2 (PCs, FreeBSD 5.0, dummynet), interconnected via Ethernet switches, emulated the primary and alternate paths: path delay 25 ms, bandwidth 1 Mbps, router queue 3, 6, or 13 KBytes.]
[Table of SCTP parameter settings: 250 ms, 80 ms, 250 ms, 2, 40 ms; from the surrounding text, these correspond to RTO_init, RTO_min, RTO_max, Path.Max.Retrans, and the SACK delay, respectively.]
[Figure 4: (a) failover time and (b) maximum MSU transfer time, in ms (0 to 1000), vs. cross-traffic load (CT-NONE, CT-LOW, CT-MEDIUM, CT-HIGH) for Router1 queue sizes of 3, 6, and 13 KBytes.]
4 Results
The SCTP failover performance was evaluated in terms of two metrics: the failover time experienced by the SEP1 source application, and the maximum Message Signal Unit (MSU) transfer time measured during failover in the M3UA session between SEP1 and SEP2. The sample means were used as estimates of the failover times and the max. MSU transfer times in the tests.
Figure 4 summarizes the results of our experiment. Figure 4 (a) shows how the SCTP failover time was affected by increasing traffic load at different router queue sizes, while Figure 4 (b) shows the same relationship for the max. MSU transfer time. The error bars depict the 99% confidence intervals, and the lines connecting the mean failover times and max. MSU transfer times are only supplied as a visualization aid to make the trends clearer.
As follows from Figure 4, the deteriorating effect of the cross traffic on the failover performance increased with increasing traffic load and router queue size. When the Router1 queue was only 3 KBytes, the cross traffic did not significantly affect the failover and max. MSU transfer times. However, as the queue size was increased, the effect of the cross traffic became more and more apparent. When the Router1 queue was 13 KBytes, the CT-HIGH cross traffic increased the failover time by more than 50% and the max. MSU transfer time by almost 40% as compared with no cross traffic at all.
The reason for the increased failover and max. MSU transfer times was the queueing delays that arose at Router1 when the router queue was fairly large and the cross traffic was bursty (i.e., when the short-term bandwidth requirement of the cross traffic sometimes exceeded the bandwidth capacity of the primary path). As a matter of fact, in previous tests with the same test flow, but with constant-bit-rate cross traffic, it was found that the traffic load had no significant impact on the failover performance provided it was less than the path capacity.
Another observation worth making concerns the SCTP failover times with regard to the ITU-T requirement on the MTP-L3 changeover procedure [9]. To comply with this requirement, the SCTP failover times should be no more than 800 ms. However, as follows from Figure 4, this requirement was only fulfilled in those cases in which the Router1 queue was relatively small (3 KBytes or 6 KBytes). In the tests with a router queue of 13 KBytes, or twice the bandwidth-delay product (to our knowledge a quite common configuration [1]), the failover times averaged well above 850 ms at medium (CT-MEDIUM) and high (CT-HIGH) traffic loads.
Interestingly, in all tests, the measured failover times were significantly larger than what could be expected given the RTOs, and the discrepancy grew with larger traffic loads and router queues. Consider, for example, the test with a 13 KByte Router1 queue and the CT-HIGH cross traffic. When this test was re-run with tracing of the RTO, the RTO at the time of the path failure, RTO_t, was measured to be 240 ms. Considering only the timeout periods, this gives a theoretical failover time of 240 ms + 250 ms + 250 ms = 740 ms (see Section 2). However, the measured failover time was 920 ms, or 180 ms larger than our estimate.
[Figure 5: Time line of a failover between SEP1 and SEP2. After each of the three T3-rtx timeouts on the primary path P1, the congestion window (cwnd) is reduced to 1 MTU, and about 80 ms elapse before the T3-rtx timer is restarted on P1.]
The reason for this discrepancy turned out to be substantial delays between the expiration of the T3-rtx timer and its restart during the failover (see Figure 5). When a timeout
occurred, the SCTP congestion window at SEP1 was reduced to 1 Maximum Transmission
Unit (MTU). As a result, no packets were sent out on the primary path, and the T3-rtx timer
was not restarted, until the amount of outstanding data went below 1 MTU. This meant, as
shown in Figure 5, an extra delay (apart from the timeout delay) of about 80 ms at each
timeout event.
Although an extra delay of 80 ms at each timeout during a failover already has to be considered quite large in this context (SS7 signaling), even larger delays could be expected in real-world SIGTRAN networks. Specifically, when large amounts of data are outstanding at the time of a path failure, it could take several transmission rounds before the T3-rtx timer of the primary path is restarted after a timeout.
Finally, as an aside, we would like to mention the significant penalty in terms of failover performance that could result from setting RTO_init, the initial value of the RTO, too low. Specifically, a too low value of RTO_init with respect to the round-trip time of the alternate path (see footnote 8) could result in one extra retransmission, and thus one extra timeout period, before SCTP considers the primary path failed. To gain some appreciation of the extent to which this could impede the failover performance in a SIGTRAN network, we re-ran the test with the Router1 queue set to 13 KBytes and with no cross traffic (CT-NONE), but this time with RTO_init at SEP1 and SEP2 configured to 80 ms instead of 250 ms. The result was an increase in failover time of about 180 ms, or 32%, compared with the original test (cf. Figure 4 (a)).
5 Conclusions
This paper studies the impact of traffic load on the SCTP failover performance in an M3UA-based SIGTRAN network. Two performance metrics are considered: the SCTP failover time, and the maximum transfer time experienced by an M3UA user during failover. The paper shows that cross traffic, especially bursty cross traffic such as SS7 signaling traffic, could indeed significantly deteriorate the SCTP failover performance. Furthermore, the paper demonstrates how important it is to configure the routers in a SIGTRAN network with relatively small queues. For example, in tests with bursty cross traffic and with router queues twice the bandwidth-delay product (to our knowledge a quite common configuration), the measured failover times were on average more than 50% longer than those measured with no cross traffic at all. In fact, in these situations, our study suggests that the SCTP failover performance may not even meet the ITU-T requirement on MTP-L3 changeovers.
Two important observations are also made in the paper concerning the SCTP failover behavior. First, it is shown that the delays which occur between the expiration of the SCTP retransmission timer (T3-rtx) and its restart during a failover could contribute significantly to the failover and max. MSU transfer times. Second, the paper comments on the extent to which a too low initial retransmission timeout (RTO) value, i.e., a too low value of the SCTP parameter RTO_init, could deteriorate the failover performance.
While cross traffic, T3-rtx restart delays, and low values of RTO_init could have a significant negative effect on the SCTP failover performance, a major factor remains the length of the timeout periods. Thus, in our future work, we intend to study ways of shortening these periods without threatening network stability. Specifically, we intend to study the effect of introducing a more relaxed RTO backoff mechanism.
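As a rough illustration of what a relaxed backoff could buy (our own sketch under stated assumptions, not a result from the thesis), replacing the factor 2 in the RTO backoff with a smaller factor shortens the RFC 2960-style failure detection time considerably. Here we assume, as in the earlier examples, that the RTO equals RTO_min = 1 s at the time of failure, with RTO_max = 60 s and Path.Max.Retrans = 5:

```python
def detection_time(rto_at_failure, rto_max, path_max_retrans, backoff_factor):
    """Sum of timeout periods until the path is declared failed, with the
    RTO multiplied by backoff_factor (capped at rto_max) per timeout."""
    total, rto = 0.0, rto_at_failure
    for _ in range(path_max_retrans + 1):
        total += rto
        rto = min(backoff_factor * rto, rto_max)
    return total

print(detection_time(1.0, 60.0, 5, 2.0))  # -> 63.0 s (standard doubling)
print(detection_time(1.0, 60.0, 5, 1.5))  # -> 20.78125 s with factor 1.5
```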
References
[1] M. Allman. TCP byte counting refinements. ACM Computer Communication Review, 29(3):14–22, July 1999.
8 Note that the first transmission round on the alternate path within a timeout period only comprises a single SCTP packet. Consequently, the SACK timer delay adds to the initial round-trip time in a timeout period on the alternate path, something that is easily overlooked when RTO_init is configured.
[2] A. T. Andersen. Modelling of packet traffic with matrix analytic methods. PhD thesis, Informatics and Mathematical Modelling, Technical University of Denmark, DTU,
September 1995.
[3] S. Fu and M. Atiquzzaman. SCTP: State of the art in research, products, and technical challenges. IEEE Communications Magazine, 42(4):64–76, April 2004.
[4] T. George, B. Bidulock, R. Dantu, H. J. Schwarzbauer, and K. Morneault. Signaling
system 7 (SS7) message transfer part 2 (MTP2) - user peer-to-peer adaptation layer
(M2PA). Internet draft, IETF, June 2004. Work in Progress.
[5] K-J Grinnemo and A. Brunstrom. Performance of SCTP-controlled failovers in M3UAbased SIGTRAN networks. In Advanced Simulation Technologies Conference (ASTC),
Applied Telecommunication Symposium (ATS), pages 8691, Arlington, Virginia, USA,
April 2004.
[6] ITU-T. Specifications of signalling system no. 7 message transfer part: Signalling
network functions and messages. Recommendation Q.704, ITU-T, July 1996.
[7] ITU-T. Specifications of signalling system no. 7 - signalling connection control part:
Signalling connection control part procedures. Recommendation Q.714, ITU-T, July
1996.
[8] ITU-T. Specifications of signalling system no. 7 - ISDN user part: ISDN user part
signalling procedures. Recommendation Q.764, ITU-T, July 1997.
[9] ITU-T. Signalling system no. 7 message transfer part signalling performance. Recommendation Q.706, ITU-T, March 1999.
[10] A. Jungmaier, E. P. Rathgeb, and M. Tuexen. On the use of SCTP in failover scenarios. In 6th World Multiconference on Systemics, Cybernetics and Informatics, Orlando, Florida, USA, July 2002.
[11] R. Kuhn. Sources of failure in the public switched telephone network. IEEE Computer, 30(4):31–36, April 1997.
[12] C. Labovitz, A. Ahuja, A. Bose, and F. Jahanian. Delayed Internet routing convergence. IEEE/ACM Transactions on Networking, 9(3):293–306, June 2001.
[13] L. Ong, I. Rytina, M. Garcia, H. Schwarzbauer, L. Coene, H. Lin, I. Juhasz, M. Holdrege, and C. Sharp. Framework architecture for signalling transport. RFC 2719, IETF,
October 1999.
[14] F. J. Scholtz. Statistical analysis of common channel signaling system no. 7 traffic. In 15th Internet Traffic Engineering and Traffic Management (ITC) Specialist Seminar, Würzburg, Germany, July 2002.
[15] G. Sidebottom, K. Morneault, and J. Pastor-Balbas. Signaling system 7 (SS7) message
transfer part 3 (MTP3) user adaptation layer (M3UA). RFC 3332, IETF, September
2002.
Paper X
Using Relaxed Timer Backoff to
Reduce SCTP Failover Times
Under submission
Adam Wolisz
Department of Electrical Engineering
Technical University Berlin, Germany
awa@ieee.org
Abstract
SCTP's multi-homing feature allows it to fail over to an alternate network path in case of network failures, which is vital for meeting the reliability requirements of many signaling applications. But using the standard RFC 2960 configuration, SCTP takes a minimum of 63 s to detect a failure and perform the failover, which is far too slow for these applications. The main cause of the long failover time lies in SCTP's use of a binary exponential timer backoff. This backoff is part of its congestion control scheme and also plays an important role in Karn's algorithm.
Existing approaches try to accelerate failure detection by disabling or limiting the exponential timer growth. This, however, partly defeats the purpose of the backoff, impairing congestion control and limiting SCTP's ability to adapt to delay variations. In case of unexpected delay increases, this may lead to unstable behavior and a large number of spurious retransmission timeouts.
Instead of just disabling or limiting the timer backoff, we propose to accelerate failure detection in a soft way by using smaller backoff factors. Based on existing research results from the area of MAC contention, we argue that properly chosen smaller backoff factors still yield stable behavior in case of congestion. Also, full adaptability to delay variations is maintained.
We present estimations of the failure detection times that can be achieved with this
approach in realistic network scenarios. Complementing the estimations, we present
results of simulations and measurements with a commercial SCTP stack for selected
scenarios.
1 Introduction
Over the past decade, the Internet has become a ubiquitous means of communication and has quickly surpassed traditional circuit-switched network traffic volumes. More and more telecommunications carriers, companies, and vendors have come to envision a next-generation network in which voice, video, and data converge onto a single IP-based infrastructure, operating over both wired and wireless physical media. They see Voice over IP (VoIP) as a natural step in this direction. Still, it is clear that the transition from the traditional public switched telephone network (PSTN) to VoIP will not happen overnight: With about 1.4 billion users, about $900 billion of worldwide revenue, and more than $350 billion of legacy equipment installed, the PSTN will most likely live on for at least a decade or so. Thus, to enable seamless interoperation of VoIP with the traditional PSTN, the IETF Signaling Transport (SIGTRAN) working group has defined a framework architecture [35] for the transport of PSTN signaling, i.e., Signaling System No. 7 (SS7) traffic, over IP.
The SIGTRAN architecture essentially comprises two components: a new transport protocol, the Stream Control Transmission Protocol (SCTP) [43], specifically designed for signaling traffic; and an adaptation sub-layer which essentially makes it possible to run existing
SS7 application protocols unaltered on top of SIGTRAN.
To allow interoperability between PSTN and VoIP networks, it is important that the SIGTRAN architecture meets the functional and performance requirements of SS7. In particular,
it must exhibit the same availability as a traditional SS7 network. To this end, several network redundancy mechanisms have been incorporated into the SIGTRAN architecture, with
the SCTP failover mechanism being among the most important ones.
The SCTP failover mechanism replaces the Changeover and Forced Rerouting procedures of SS7, and thus should approximately match the performance provided by these
procedures. Specifically, as will be shown in Section 3, an SCTP failover should take no
more than about 2 s to complete. However, several studies, e.g. [15, 26], indicate that SCTP
will not always be able to achieve a failover performance in that order of magnitude, especially if failovers should be carried out in a conservative manner to avoid false or spurious
failovers. To this end, this paper proposes a modified retransmission timeout strategy which
uses a relaxed timer backoff factor of less than the standard factor of two employed by
SCTP's exponential backoff mechanism [43]. The paper shows through simulations, complemented with validating experiments using a real SCTP stack, that the relaxed backoff
could significantly improve SCTP failover times. Specifically, the paper suggests that such
a strategy could substantially enlarge the range of permissible network delays (i.e., delays
over which SCTP remains compliant with SS7 failover requirements) for SIGTRAN networks, without having to resort to a much less conservative failover behavior. The
paper also presents strong arguments that SCTP remains stable with a relaxed timer backoff;
this is significant because the backoff mechanism is an important part of SCTP's congestion
control.
The remainder of the paper is organized as follows. Section 2 gives some preliminary
material on SCTP and the SCTP failover mechanism. This is followed in Section 3 with an
elaboration of the requirements in terms of failover time imposed on SCTP by SS7. Furthermore, Section 3 surveys previous and current work on improving the SCTP failover performance. Next, in Section 4, the stability of SCTP with a relaxed timer backoff is considered,
and arguments are presented showing that it is reasonable to believe that the exponential
timer backoff mechanism of SCTP remains stable with an appropriate backoff factor of less
than 2. Section 5 presents theoretical estimations of the failover times that could be expected
with a relaxed timer backoff. This is complemented in Section 6 with simulations of failover
times for a selection of backoff factors and network conditions. Section 6 also includes an
experimental validation of a subset of the executed simulations. Finally, Section 7 concludes
the paper and outlines future work.
data is sent that way. Any remaining paths serve as alternate paths, and will normally only
be used for retransmissions of dropped packets. Only if SCTP concludes that the primary
path has failed permanently is one of the alternate paths selected as the new primary path.
From that point on, new data is sent over this path. Note that the SCTP specifications
currently do not support the concurrent transmission of new data on multiple paths, which
could be used, e.g., for load sharing. Multi-homing is currently only used to enhance
failure resilience.
SCTP provides two kinds of probing mechanisms for monitoring the reachability of
the peer destination addresses, one for the primary path and one for the alternate paths.
For the primary path, SCTP keeps an error counter which counts the number of consecutively missed responses to data transmissions (i.e., acknowledgements), detected by expired retransmission timeouts. If the error counter exceeds a configurable threshold, called
Path.Max.Retrans, the destination address belonging to the primary path is considered
unreachable. If, on the other hand, an acknowledgement for data sent on the primary path is
received, the error counter is reset to zero.
The purpose of the threshold Path.Max.Retrans is to reduce the risk of false unreachability detections due to packet losses that may happen even if the destination address
is still reachable, e.g., in case of network congestion. Choosing a value for
Path.Max.Retrans involves a tradeoff: A high value reduces the risk of false detections, but a lower value allows for faster detection of real failures because fewer consecutive
timeouts are needed until the destination is considered unreachable. RFC 2960 recommends
a rather conservative default value of 5 for Path.Max.Retrans.
A similar error counter as for the primary path is also used for each alternate path. But
since alternate paths are not normally used for data transmissions, SCTP uses a heartbeat
mechanism for probing. Heartbeats are sent periodically, based on a configurable heartbeat
timer. If a heartbeat response on an alternate path is not received within a specified time
period, the error counter is incremented in the same way as described above. Again, when
a counter exceeds Path.Max.Retrans, the corresponding destination address is considered unreachable.
Figure 1 illustrates the flow of events taking place when a failure occurs on the primary
of the two paths of an association between two multi-homed Signaling End Points (SEPs),
SEP1 and SEP2. The SEPs in this example have two addresses each (A and B), and are
connected by the two paths A1-A2 and B1-B2, with A1-A2 being the initial primary path.
Both end points are assumed to be configured with the default values recommended by
RFC 2960, i.e., Path.Max.Retrans is assumed to be 5.
At (2), the primary path fails, which results in a timeout at (3) for the data chunks sent
earlier at (1). The timeout triggers the retransmission of the outstanding data at (3), which,
as mentioned earlier, takes place on an available alternate path, here B1-B2. Furthermore,
the error counter of the primary path is incremented by one.
A new retransmission timeout (RTO) is now calculated for the primary path. According
to RFC 2960, the RTO is doubled following each retransmission timeout. This is called
exponential timer backoff, and is an essential part of SCTP's congestion control mechanism,
following the principles described in [24]. Consequently, the RTO grows exponentially with each
consecutive timeout if the path failure persists. However, the RTO is bounded by the SCTP
Figure 1: Flow of events during a failover between SEP1 and SEP2 (primary path A1-A2, alternate path B1-B2; events (1)-(6), timeouts RTO 1 and RTO 2).
RTO is at least 1 s at the time of the path failure. Thus, the minimum failover time for SCTP
configured as recommended by RFC 2960 is 1 s + 2 s + 4 s + 8 s + 16 s + 32 s = 63 s, i.e.,
the sum of the time intervals between consecutive timeouts.
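The 63 s minimum failover time can be reproduced by summing the consecutive timeout intervals. The sketch below uses our own helper name and omits the RTO.Max clamp, which never triggers in this scenario:

```python
def timeout_intervals(rto_min=1.0, backoff=2.0, path_max_retrans=5):
    """Consecutive timeout intervals (in seconds) leading up to a failover:
    Path.Max.Retrans + 1 timeouts, with the RTO multiplied by the backoff
    factor after each one."""
    rto, intervals = rto_min, []
    for _ in range(path_max_retrans + 1):
        intervals.append(rto)
        rto *= backoff
    return intervals

# RFC 2960 defaults: RTO.Min = 1 s, backoff factor 2, Path.Max.Retrans = 5
ivals = timeout_intervals()
print(ivals, sum(ivals))  # -> [1.0, 2.0, 4.0, 8.0, 16.0, 32.0] 63.0
```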
As previously mentioned, the SCTP failover mechanism replaces the Changeover and Forced
Rerouting procedures of SS7 [18, 32], and thus should provide approximately the same
failover performance as these procedures. In this section, we quantify an appropriate performance target for our further considerations.
Some previous work [15, 26] has used an upper limit on the SS7 Changeover procedure
of 800 ms, but the performance figures are actually only indirectly specified by the ITU-T
standards. The ITU-T recommendation Q.703 [17] imposes, through the timer T7 ("Excessive delay of acknowledgement"), an upper limit on the link failure detection time of between
500 ms and 2 s, and recommendation Q.706 [19] prescribes upper limits on the Changeover
detection and response times of 500 ms and 300 ms, respectively. Together with the transmission and handling delays for the Changeover request and response messages, which typically
sum up to about 100 ms, we end up with an upper limit between 1.4 s and 2.9 s.
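The two bounds can be tallied directly from the figures just cited (this is simply our own breakdown of that arithmetic):

```python
# Components of the SS7 failover budget (ms), per Q.703/Q.706 as cited above
t7_link_failure_detection = (500, 2000)  # timer T7 lower/upper bound
changeover_detection = 500
changeover_response = 300
message_delays = 100                     # typical transmission + handling delays

low = t7_link_failure_detection[0] + changeover_detection + changeover_response + message_delays
high = t7_link_failure_detection[1] + changeover_detection + changeover_response + message_delays
print(low / 1000, high / 1000)  # -> 1.4 2.9
```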
This range for an upper limit on a failover is also in line with the SIGTRAN pre-study
work by Seth et al. [6, 39], which suggests an upper limit on a path failover of less than 2 s.
Further, a similar delay requirement is also imposed on SCTP by BICC [20], a common
call-control protocol in a GSM/WCDMA network.
Taking these considerations into account, we chose a target value of 2 s for the SCTP
failover time. As is evident from the failover example in Section 2, SCTP configured according to RFC 2960 does not even come close to meeting this target.
The reason for this mismatch between SS7 requirements and RFC 2960 is that the IETF
standardized SCTP with a conservative configuration that can be used over the Internet,
rather than specifically targeting controlled signaling networks. In the Telephony Signaling
Transport over SCTP Applicability Statement [11], the IETF acknowledges this fact, and
suggests several ways to adapt SCTP for SS7. They suggest setting Path.Max.Retrans
to a value less than 5 and/or setting RTOmin to less than 1 s. They also suggest more radical
solutions such as disabling or drastically limiting the exponential timer backoff. Typically,
the latter is done by setting RTOmax to a value much less than the 60 s recommended
by RFC 2960, thus curbing the growth of the RTO during the consecutive timeout events
leading to a failover. Examples of practical uses of this method include the SCTP/T stack by
Adax [2], and the work of Jungmaier et al. [26].
However, as recognized by the IETF, the suggested solutions to improve the SCTP
failover times are not without problems. Configuring Path.Max.Retrans to a value
lower than 5 increases the risk of false or spurious failovers. Limiting or disabling the exponential timer backoff mechanism can contribute to a destabilization of the network in case
of congestion [14, 16, 24, 40, 41] (see also Section 4). Furthermore, it impedes Karn's algorithm [27], which relies on the timer backoff to acquire a new round-trip time (RTT) estimate
in case of a sudden delay increase. The algorithm, which is mandatory for SCTP [43], states
that the RTT must not be measured using packets that have been retransmitted, to avoid
retransmission ambiguities [27, 45]. Instead, the backed-off RTO is used as the new RTO after
a timeout, ensuring that the RTO eventually becomes greater than the increased RTT. Without the timer backoff, SCTP may never be able to collect a correct RTT measurement in such
a situation.
Apart from the solutions put forth by the IETF to improve SCTP's failover times, Caro,
Iyengar et al. [7, 8, 22] have performed extensive studies on the SCTP failover and changeover
mechanisms. They have proposed a two-level threshold failover mechanism [7] which
disentangles failover detection from failover recovery: When the number of retransmissions
reaches the first, lower threshold, recovery begins and traffic is re-routed from the primary path
to one of the alternate paths. However, the primary path is not considered unreachable until
the number of retransmissions reaches the second, higher threshold. The two-level threshold
mechanism improves throughput during SCTP failovers, and could be used as a complement
to our proposal. Specifically, our proposal enables the use of a higher value for the second
threshold, and thus avoids unnecessary re-routings which, from a traffic-engineering viewpoint,
might have a destabilizing effect on a network.
The failover times could also be improved by extending SCTP with support for concurrent transmission on several paths. A path recovery in this context then translates to a redistribution of data from the failed path to the still active paths. Several recent works have been
concerned with concurrent multi-path transfers. For example, Iyengar et al. [21, 23] studied
the consequences of sending new data across several paths, not just the primary one. They
proposed modifications and extensions to SCTP to mitigate the negative effects, in terms of
packet reordering and impaired congestion control, which emerge with multi-path transfers.
In line with their work, Ye et al. [44] have proposed IPCC-SCTP, an enhancement to SCTP
for more efficient support of multi-homing. In IPCC-SCTP, the per-association congestion
control of SCTP has been replaced with an independent per-path congestion control. As a
successor to IPCC-SCTP, LS-SCTP [3] has been proposed, which extends IPCC-SCTP by
providing load sharing among multiple paths.
Others have also proposed load-sharing extensions to SCTP. Casetti et al. [10] have
suggested an extension to SCTP for bandwidth-aware load sharing. They have devised and
implemented a bandwidth-aware source scheduler in SCTP with the objective of maximizing
throughput. Furthermore, in a later incarnation of their work [9], they have incorporated
the low-pass bandwidth estimation technique of TCP Westwood+ [30] in their SCTP stack,
and developed Westwood SCTP. Still further examples of load sharing include the RivuS [1]
open-source project.
Although the number of research efforts on concurrent multi-path transfers and load sharing in SCTP is fairly large, it should be noted that this is still very much work in progress.
Requests to explicitly permit transmission over several paths have so far been rejected in
the IETF. As a matter of fact, it remains to be decided whether layer three is indeed the correct
layer for multi-path routing in the first place.
a certain point, the so-called knee, the rate at which the throughput increases declines. As
the load on the network continues to increase, the network capacity, the cliff, is eventually
reached. Beyond the cliff, throughput actually drops with increased load and the network
is said to be in a state of congestion collapse [33]. In this state, the majority of packets
being transmitted are actually retransmissions of discarded or presumed discarded packets.
The network is poorly utilized in this state despite high traffic demand, and the response
times are excessively long. Still worse, when congestion collapse is reached, the network
is not able to recover from this state by itself [33].
A transport protocol is said to be stable if it adjusts its sending rate in such a way that
the network always operates below the cliff, or, in other words, if it prevents the network
from reaching congestion collapse in the first place. As for TCP, the stability of SCTP
relies upon its congestion control and retransmission timer backoff mechanisms [24]. Thus,
to be able to use SCTP with a relaxed retransmission timer backoff in anything but
well-dimensioned, controlled network environments, the relaxed backoff has to be stable.
SCTP essentially uses the same retransmission timer backoff mechanism as TCP, and, as
follows from Jacobson [24], the argument for stability of the retransmission timer backoff
mechanism in TCP relies to a large extent upon the work of Kelly [28]. Kelly showed that
no backoff mechanism that backs off slower than exponential is stable when the number of
contending flows is infinite. In [24], Jacobson argues that the problem of stability of the
TCP retransmission timer backoff mechanism is equivalent to that of the stability of backoff
protocols for multiple access control (MAC) channels such as Ethernet. Several works, such
as those of Aldous [4] and Goodman [14], show that no matter how large the number
of stations is, as long as it is finite, a MAC contention protocol with a backoff factor of 2 is
stable. As a matter of fact, Song et al. [41] have shown that, provided the number of sending
stations is finite, the optimal backoff factor is not 2 but approximately 1.6. It has even been
shown by Hastad et al. [16] and Goldberg et al. [13] that, provided the number of stations is
finite, it suffices to use a superlinear polynomial backoff, e.g., a quadratic backoff, to obtain
stability.
Taken together, these findings make us believe that using a relaxed backoff factor of less
than 2 will not endanger the stability of SCTP. In this paper, we consider backoff factors of
1.5 and 1.75, which are close to or above the optimum found in [41], and well-suited for
efficient implementation using binary integer arithmetic.
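To illustrate the last point: with the RTO kept as an integer number of timer ticks or milliseconds, factors of 1.5 and 1.75 can be applied with shifts and adds alone. This is a sketch of the kind of implementation we have in mind, not code from an actual SCTP stack:

```python
def backoff_1_50(rto_ticks: int) -> int:
    # 1.5 = 1 + 1/2: one shift, one add
    return rto_ticks + (rto_ticks >> 1)

def backoff_1_75(rto_ticks: int) -> int:
    # 1.75 = 1 + 1/2 + 1/4: two shifts, two adds
    return rto_ticks + (rto_ticks >> 1) + (rto_ticks >> 2)

print(backoff_1_50(1000), backoff_1_75(1000))  # -> 1500 1750
```

The same shift-and-add expressions translate directly to fixed-point C in a kernel or protocol stack, which is why such factors are attractive in practice.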
To be able to theoretically estimate the failover times that could be expected by using a
relaxed backoff factor, we make a number of assumptions regarding the implementation of
the SCTP retransmission mechanism:
Minimum RTO: In order to achieve the desired failover performance, it is necessary to
set the minimum RTO (RTOmin) to values significantly smaller than the one recommended
in the standard [43] (i.e., 1 s). Some research results suggest that lowering
RTOmin can lead to an increased number of spurious timeouts [5], while others argue
that a lower bound should not be needed if a proper predictor is used for the RTO
calculations [12, 29].9 The latter work also demonstrates that the timer as specified
in RFC 2988 [36] is still conservative regardless of RTOmin, due to the fact that the
retransmission timer is restarted whenever an acknowledgement for the earliest outstanding segment (or TSN in the case of SCTP) is received, which delays the timeout by
roughly one round-trip time beyond the nominal RTO. Furthermore, [12] describes a
simple enhancement that can greatly reduce the number of spurious timeouts without
setting a static RTOmin: the RTO is not allowed to be smaller than the most recent
RTT sample plus a safety margin of two times the timer granularity. This dynamic
RTOmin was, to our knowledge, first introduced in the BSD implementation of TCP.
In any case, the choice of a particular value for RTOmin can be seen as a tradeoff
between the risk of spurious timeouts and the responsiveness of the timer. Given the available
research, we believe that choosing a lower RTOmin to meet our application requirements
bears only a limited risk of performance degradation due to increased spurious
timeouts. In the following we assume RTOmin to be set to a value equal to or smaller
than the RTT.
Timer Granularity: In order to use a low RTOmin, the timer granularity must be sufficiently fine. In cases where the RTO timer is implemented using a heartbeat timer,
the granularity of this timer effectively determines the lower limit for RTOmin. More
specifically, RTOmin should in this case be no less than twice the granularity of the
heartbeat timer [36].
Delayed Acknowledgements: RFC 2960 recommends the use of delayed acknowledgements with a delay of 200 ms. This adds to the RTT seen by the SCTP sender and
thus increases the RTO and prolongs the total failover time. To achieve the lowest
possible failover times, it should be possible to disable delayed acknowledgements, or
to set the delay to a lower value than recommended in the RFC. Note that this may
come at the cost of increased protocol overhead on the return path.
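The dynamic RTOmin mentioned in the Minimum RTO item above can be sketched in a few lines (function and parameter names are our own illustrations; units in milliseconds):

```python
def clamp_rto(rto_ms: int, last_rtt_ms: int, granularity_ms: int) -> int:
    """Dynamic RTO lower bound: never let the RTO drop below the most
    recent RTT sample plus a safety margin of twice the timer granularity."""
    return max(rto_ms, last_rtt_ms + 2 * granularity_ms)

print(clamp_rto(30, 40, 10))   # -> 60 (clamped up to 40 + 2*10)
print(clamp_rto(100, 40, 10))  # -> 100 (already above the bound)
```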
As described in Section 2, a path is declared inactive by SCTP, and a failover is initiated,
if more than Path.Max.Retrans consecutive timeouts occur on the path. The failover
time, i.e., the time that elapses from the first packet loss caused by the path failure until
the path is declared inactive, is thus approximately the sum of the lengths of these
timeouts.10 Taking the exponential timer backoff into account, the failover time can therefore be
calculated as follows:
    t_failover = t_RTO * Σ_{i=0}^{PMR} β^i = t_RTO * (β^(PMR+1) - 1) / (β - 1),    (1)
where t_RTO denotes the RTO at the time of the failure (before exponential backoff), PMR
the parameter Path.Max.Retrans, β the backoff factor, and t_failover the total failover
time.
9 Note that SCTP uses essentially the same timer algorithms as TCP [36], so research based on TCP is also valid for SCTP in this context.
10 In practice, the failover time can be somewhat longer because of processing delays within the protocol stack.
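Equation (1) is straightforward to evaluate numerically. The sketch below (our own helper, symbols as defined above) reproduces the 63 s worst case and previews the effect of relaxed factors:

```python
def failover_time(t_rto: float, beta: float, pmr: int) -> float:
    # Closed form of Eq. (1): t_RTO * (beta^(PMR+1) - 1) / (beta - 1)
    return t_rto * (beta ** (pmr + 1) - 1) / (beta - 1)

# RFC 2960 defaults: t_RTO = 1 s, beta = 2, Path.Max.Retrans = 5
print(failover_time(1.0, 2.0, 5))  # -> 63.0

# Relaxed factors with t_RTO = 80 ms (2 * RTT, RTT = 40 ms) and PMR = 4
print(round(failover_time(0.08, 1.5, 4), 3))   # -> 1.055
print(round(failover_time(0.08, 1.75, 4), 3))  # -> 1.644
print(round(failover_time(0.08, 2.0, 4), 3))   # -> 2.48
```

Note how, for the exemplary 40 ms RTT scenario discussed below, only the relaxed factors stay under the 2 s target.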
Figure 2: Estimated failover times for various values of Path.Max.Retrans and backoff
factors (RTT = 40 ms).
To obtain a failover time estimate for a given network scenario, we need to estimate
t_RTO. However, t_RTO is in itself a function of the RTT and its long- and short-term
variations [24, 36], neither of which we can say anything about in the general case. In order to
get an impression of the possible benefits in a real-world scenario, we thus make some further
assumptions. In particular, we choose an exemplary signaling network with an RTT of 40 ms
for our discussion. We also assume for the purpose of this discussion that t_RTO is 2 × RTT
(which is a fairly conservative assumption for a tightly controlled network11). Given these
assumptions, Figure 2 shows how the SCTP failover time depends on the backoff factor and
Path.Max.Retrans.
To further demonstrate the gain from using a relaxed backoff factor, we expand our discussion to RTTs in the interval 20 ms to 100 ms, and assume a fixed
Path.Max.Retrans of 4. Figure 3 shows how the SCTP failover time varies with the
backoff factor and the RTT. We observe that while standard SCTP exhibits failover times
longer than 2 s already at RTTs above 35 ms, the failover times of SCTP with a backoff
factor of 1.75 remain below 2 s almost up to an RTT of 50 ms, and with a backoff factor
of 1.5 almost up to 80 ms.

11 In [5], Allman et al. present captured traces of TCP traffic from the Internet which suggest that, before any backoffs take place, the RTO is typically 3-5 times the RTT. However, since we assume a dedicated SIGTRAN network with far less delay variation, we believe that using an RTO of 2 times the RTT is a fairly conservative assumption.

Figure 3: Estimated failover times for various RTTs and backoff factors
(Path.Max.Retrans = 4).
Figure 4: Simulation topology: SEP1 and SEP2, each with SCTP/TCP traffic sources and
sinks, connected over a primary path (A1-A2) and an alternate path (B1-B2); 10 Mbps LAN
links, 2 Mbps WAN links with RED queues of 50 packets, and 12 competing cross-traffic
flows.
The SCTP sender's RTO, SRTT, and RTTVAR variables at the time of the link failure.
This enabled us to distinguish between the two estimators used in the calculation of
the RTO (the mean and variance estimators of the RTT, i.e., SRTT and RTTVAR). As
pointed out in Section 5, the RTO at the time of the path failure (t_RTO) is the primary
factor determining the failover time for a given parameter setting.
For each parameter setting, 50 simulation runs were executed using different, mutually
independent random number streams for the individual runs. The averages over the fifty sets
of observations were computed along with 95% confidence intervals.
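For reference, the SRTT and RTTVAR estimators feed into the RTO roughly as prescribed by RFC 2988, which SCTP follows. The sketch below shows one update step with the standard gains (alpha = 1/8, beta = 1/4); the RTOmin/RTOmax clamping and initialization rules are omitted:

```python
def update_rto(srtt: float, rttvar: float, rtt_sample: float,
               g: float = 0.020, alpha: float = 1 / 8, beta: float = 1 / 4):
    """One RFC 2988-style update of SRTT and RTTVAR and the resulting RTO.
    g is the clock granularity; all times in seconds."""
    rttvar = (1 - beta) * rttvar + beta * abs(srtt - rtt_sample)
    srtt = (1 - alpha) * srtt + alpha * rtt_sample
    rto = srtt + max(g, 4 * rttvar)
    return srtt, rttvar, rto

# A single 48 ms sample against SRTT = 40 ms, RTTVAR = 10 ms
srtt, rttvar, rto = update_rto(0.040, 0.010, 0.048)
print(rto > srtt)  # the variance term keeps the RTO above the smoothed RTT
```

This makes concrete why the Mixed Traffic scenario, with its larger queueing-delay variance, produces a larger RTO at the time of the failure.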
Simulation parameter settings:

Link Delay: 20 ms, 40 ms, 80 ms, 100 ms
Backoff Factor: 1.5, 1.75, 2
Path.Max.Retrans: 4
SACK Delay: 0 ms, 20 ms
RTOinit: 3 s
RTOmin: 20 ms
RTOmax: 60 s
Figure 5: Impact of traffic scenario and delayed acknowledgements on RTO variables (Link
Delay = 40 ms). Error bars show 95% confidence intervals.
As follows from Figure 5, the Single Association and Multiple Associations scenarios
had similar RTT and RTT variance estimates, and, accordingly, the RTO at the time of the
failover differed very little between these two scenarios. This was no surprise, considering
that the WAN link was not saturated in either of these two scenarios. In contrast, due to
larger queueing delays and delay variances, the RTO in the Mixed Traffic scenario was considerably larger than in the other two scenarios. Figure 5 also shows that the use of delayed
acknowledgements increased the estimated RTT and its variance, and thus had a significant
impact on the RTO in the Single Association and Multiple Associations scenarios. For the
remainder of this section we only consider the simulations without delayed acknowledgements.
Figure 6 presents the failover times obtained in the Single Association and Multiple Associations scenarios. Since the two scenarios had almost the same RTO at the time of the
failover, they exhibited very similar failover times.
We observe from Figure 6 that with the standard backoff factor of 2, our target failover
time of 2 s was only met when the link delay was less than about 60 ms. In comparison, a
backoff factor of 1.5 expanded the permissible link delay far beyond 100 ms. In other words,
a reduction of the backoff factor from 2 to 1.5 allowed us to still achieve our performance
Figure 6: Failover times with 95% confidence intervals for different backoff factors in Single
and Multiple Associations scenarios (Path.Max.Retrans = 4, no delayed acknowledgements).
Figure 7: Failover times with 95% confidence intervals in Mixed Traffic scenario
(Path.Max.Retrans = 4, no delayed acknowledgements).
acknowledgements at all should be used (although especially the latter comes at the cost of
increased overhead on the return path). Judging from the performance observed in the
Mixed Traffic scenario, it seems unlikely that the failover performance required for
time-sensitive signaling applications can be achieved in a pure best-effort environment without
risking unstable behavior.
Figure 8: Measurement setup: two Solaris 8 SCTP end points (traffic source and sink)
connected via a primary path (A1-A2) and an alternate path (B1-B2) over 10 Mbps LANs,
with FreeBSD 5.0 Dummynet machines (L1, L2) emulating a bandwidth of 2 Mbps and
path delays of 10 ms, 20 ms, 40 ms, and 50 ms.
Conclusions
In this paper, we have proposed a new approach to lowering the failover times of SCTP for
use with time-critical signaling applications. This was motivated by the observation that
the aggressive tuning of the SCTP retransmission timer parameters commonly used today
to reduce failover times bears a number of risks with regard to network stability and
protocol performance. In particular, the adaptability to changing network delays is limited
and the congestion control mechanism is impaired, making such approaches risky
in environments where there are no strict delay guarantees and congestion cannot be ruled
out completely.
The key element of our proposal is to lower the factor used in SCTP's exponential timer
backoff to a value less than 2. Since the exponential backoff is the main contributing factor
to SCTP's long failover times, using a relaxed backoff helps shorten the failover times
significantly.
Figure 9: Validation of failover times (Path.Max.Retrans = 4, no delayed acknowledgements). Error bars show 95% confidence intervals.
References
[1] RivuS project. http://sourceforge.net/projects/rivus.
[2] The Need for High Performance Convergence. Adax, 2001. White Paper.
[3] A. A. El Al, T. Saadawi, and L. Myung. LS-SCTP: A bandwidth aggregation technique
for stream control transmission protocol. Elsevier Computer Communications, 27(10),
June 2004.
[4] D. J. Aldous. Ultimate instability of exponential backoff protocol for acknowledgement-based transmission control of random access communication channels. IEEE Transactions on Information Theory, 33(2), March 1987.
[5] M. Allman and V. Paxson. On estimating end-to-end network path properties. In
SIGCOMM, Cambridge, Massachusetts, USA, August 1999.
[6] T. Seth, A. Broscius, C. Huitema, and H-A P. Lin. Performance requirements for TCAP
signaling in Internet telephony. Internet draft, IETF, February 1999. Work in Progress.
[35] L. Ong, I. Rytina, M. Garcia, H. Schwarzbauer, L. Coene, H. Lin, I. Juhasz, M. Holdrege, and C. Sharp. Framework architecture for signalling transport. RFC 2719, IETF,
October 1999.
[36] V. Paxson and M. Allman. Computing TCP's retransmission timer. RFC 2988, IETF,
November 2000.
[37] L. Rizzo. Dummynet: A simple approach to the evaluation of network protocols. ACM
Computer Communication Review, 27(1), January 1997.
[38] SCTP module for ns-2. http://www.armandocaro.net/software.
[39] T. Seth, A. Broscius, C. Huitema, and H-A P. Lin. Performance requirements for
signaling in Internet telephony. Internet draft, IETF, November 1998. Work in Progress.
[40] S. Shenker. Some conjectures on the behavior of acknowledgement-based transmission
control of random access communication channels. In ACM SIGMETRICS, May 1987.
[41] N.-O. Song, B.-J. Kwak, and L. E. Miller. On the stability of exponential backoff.
Journal of Research of the National Institute of Standards and Technology, 108(4),
August 2003.
[42] R. Stewart, I. Arias-Rodriguez, K. Poon, A. Caro, and M. Tuexen. Stream control
transmission protocol (SCTP) specification errata and issues. Internet draft, IETF,
October 2005. Work in Progress.
[43] R. Stewart, Q. Xie, K. Morneault, C. Sharp, H. Schwarzbauer, T. Taylor, I. Rytina,
M. Kalla, L. Zhang, and V. Paxson. Stream control transmission protocol. RFC 2960,
IETF, October 2000.
[44] G. Ye, T. N. Saadawi, and M. Lee. IPCC-SCTP: An enhancement to the standard
SCTP to support multihoming efficiently. In 23rd IEEE International Performance
Computing and Communications Conference (IPCCC), Phoenix, Arizona, USA, April
2004.
[45] L. Zhang. Why TCP timers don't work well. In SIGCOMM, August 1986.