DOCTOR OF TECHNOLOGY
Contact information:
Karl-Johan Grinnemo
Telecom & Media
TietoEnator AB
Box 1038
SE-651 15 Karlstad, Sweden
Phone: +46 (0)5429 41 49
Fax: +46 (0)5429 40 01
Email: karl-johan.grinnemo@tietoenator.com
Printed in Sweden
Karlstads Universitetstryckeri
Karlstad, Sweden 2006
To my parents
Abstract
In recent years, Internet and IP technologies have made inroads into almost every communication market ranging from best-effort services such as email and Web, to soft real-time
applications such as VoIP, IPTV, and video. However, providing a transport service over
IP that meets the timeliness and availability requirements of soft real-time applications has
turned out to be a complex task. Although network solutions such as IntServ, DiffServ, MPLS, and VRRP have been suggested, these solutions often fail to provide an end-to-end transport service for soft real-time applications. Moreover, they have so far seen only modest deployment. In light of this, this thesis considers transport protocols for soft real-time applications.
Part I of the thesis focuses on the design and analysis of transport protocols for soft real-time multimedia applications with lax deadlines, such as image-intensive Web applications.
Many of these applications do not need a completely reliable transport service, and to this
end Part I studies so-called partially reliable transport protocols, i.e., transport protocols that
enable applications to explicitly trade reliability for improved timeliness. Specifically, Part
I investigates the feasibility of designing retransmission-based, partially reliable transport
protocols that are congestion aware and fair to competing traffic. Two transport protocols
are presented in Part I, PRTP and PRTP-ECN, which are both extensions to TCP for partial
reliability. Simulations and theoretical analysis suggest that these transport protocols could
give a substantial improvement in throughput and jitter compared to TCP. Additionally, the simulations indicate that PRTP-ECN is TCP friendly and fair to competing congestion-aware traffic such as TCP flows. Part I also presents a taxonomy for retransmission-based, partially reliable transport protocols.
Part II of the thesis considers the Stream Control Transmission Protocol (SCTP), which
was developed by the IETF to transfer telephony signaling traffic over IP. The main focus of
Part II is on evaluating the SCTP failover mechanism. Through extensive experiments, it is
suggested that in order to meet the availability requirements of telephony signaling, SCTP
has to be configured much more aggressively than is currently recommended by the IETF. Furthermore, ways to improve the transport service provided by SCTP, especially with regard to the failover mechanism, are suggested. Part II also studies the effects of Head-of-Line
Blocking (HoLB) on SCTP transmission delays. HoLB occurs when packets in one flow
block packets in another, independent, flow. The study suggests that the short-term effects
of HoLB could be substantial, but that the long-term effects are marginal.
Keywords: transport protocol, congestion control, partial reliability, soft real-time, SCTP,
failover, SIGTRAN, head-of-line blocking
Acknowledgments
This thesis has benefited from the help of many people. First and foremost, I would like
to acknowledge my supervisor, Prof. Anna Brunstrom (Department of Computer Science,
Karlstad University, Sweden), for being an excellent research mentor, and for guiding me
through my doctoral studies. Also, I would like to express my sincere gratitude to my employer and main sponsor, TietoEnator. Without their financial and administrative support,
I would not have been able to pursue any doctoral studies in the first place.
Second, I would like to thank my current co-supervisor, Dr. Reiner Ludwig (Senior Specialist, Ericsson Research, Aachen, Germany), for his assistance in my work on SCTP, and
Comments on My Participation
I am the principal contributor to all papers except Papers I, VII, and X. The taxonomy presented in Paper I builds upon earlier work conducted by Assistant Prof. Johan Garcia and Prof. Anna Brunstrom at the Dept. of Computer Science at Karlstad University. However, I have substantially reworked their taxonomy, and I am the main author of Paper I. In Paper
VII, Torbjörn Andersson, at that time an assistant at the Dept. of Computer Science at Karlstad
University, is responsible for the design and execution of the tests presented in the paper.
My participation in Paper VII includes the analysis of the test results and the writing of the
paper. Paper X is a joint effort between Ericsson Eurolab in Aachen and Karlstad University.
My work on this paper includes participation in the discussions which led to the proposed
retransmission timeout strategy; taking part in the design and analysis of the simulations and
experiments; executing the experiments; and co-authoring the paper.
Other Papers
Apart from the papers included in this thesis, I have authored or co-authored the following
papers:
[1] K. Asplund, A. Brunstrom, J. Garcia, K-J Grinnemo, and S. Schneyer. PRTP - A Partially Reliable Transport Protocol for Multimedia Applications: Background Information and Analysis. Karlstad University Studies 1999:5, Karlstad University, Sweden,
June 1999.
[2] K-J Grinnemo, J. Garcia, and A. Brunstrom. A Taxonomy and Survey of Retransmission-Based Partially Reliable Transport Protocols. Karlstad University Studies
2002:34, Karlstad University, Sweden, October 2002.
[3] K-J Grinnemo and A. Brunstrom. A Survey of TCP-Friendly Congestion Control
Mechanisms for Multimedia Traffic. Karlstad University Studies 2003:1, January
2003.
[4] K-J Grinnemo and A. Brunstrom. Impact of SCTP-controlled Failovers for M3UA
Users in a Dedicated SIGTRAN Network. In Proceedings of the Second Swedish
National Computer Networking Workshop (SNCNW). Stockholm, Sweden, September
2003.
[5] K-J Grinnemo and A. Brunstrom. Some Observations on the Performance of SCTP-controlled Failovers in M3UA-based SIGTRAN Networks. In Proceedings of the Second Swedish National Computer Networking Workshop (SNCNW). Karlstad, Sweden,
November 2004.
[6] S. Lindskog, K-J Grinnemo, and A. Brunstrom. Physical Separation for Data Protection based on SCTP Multihoming. In Proceedings of the Second Swedish National
Computer Networking Workshop (SNCNW). Karlstad, Sweden, November 2004.
[7] S. Lindskog, K-J Grinnemo, and A. Brunstrom. Data Protection Based on Physical
Separation: Concepts and Application Scenarios. In Proceedings of the International Conference on Computational Science and Its Applications (ICCSA). Singapore, May
2005.
[8] K-J Grinnemo, S. Baucke, A. Brunstrom, R. Ludwig, and A. Wolisz. An Easy Way
to Reduce SCTP Failover Times. In Proceedings of the Third Swedish National Computer Networking Workshop (SNCNW). Halmstad, Sweden, November 2005.
Contents

Introductory Summary
1 Introduction
2 Research Objectives
3 Contributions
3.1 Part I: Partially Reliable Transport Protocols for Multimedia Applications
3.2 Part II: Transport Service for Telephony Signaling
4 Thesis Outline
4.1 Part I: Partially Reliable Transport Protocols for Multimedia Applications
4.2 Part II: Transport Service for Telephony Signaling
5 Concluding Remarks
References

Part I: Partially Reliable Transport Protocols for Multimedia Applications

Paper I
1 Introduction
2 Preliminaries
3 The Taxonomy
3.1 Classification with Respect to Reliability Service
3.2 Classification with Respect to Error Control Scheme
5 Concluding Remarks

Paper II
1 Introduction
2 Protocol Design
5 Stationary Analysis
5.1 Simulation Experiment
5.2 Results
6 Transient Analysis
6.1 Simulation Experiment
6.2 Results
Conclusions

Paper III: Evaluation of the QoS Offered by PRTP-ECN - A TCP Compliant Partially Reliable Transport Protocol
1 Introduction
Related Work
Overview of PRTP-ECN
Results

Paper IV: A Simulation Based Performance Analysis of a TCP Extension for Best Effort Multimedia Applications
1 Introduction
Overview of PRTP-ECN
5 Conclusions

Paper V
1 Introduction
2 Related Work

Part II: Transport Service for Telephony Signaling

Paper VI: Towards the Next Generation Network: The Softswitch Solution
1 Introduction
Bearer Signaling
Future Outlook
Summary

Paper VII
Introduction
Methodology
Results
Conclusions

Paper VIII
Introduction
Methodology
Results
Conclusions

Paper IX
Introduction
Failovers in SCTP
Methodology
4 Results
5 Conclusions

Paper X
1 Introduction
Introductory Summary
1 Introduction
Over the course of the last decade, the phenomenal success of the Internet and the universal
adoption of Internet technologies have driven profound changes in the data- and telecommunication industry. Maybe the most remarkable outcome of this evolution is the vision of the
Internet as a ubiquitous service platform for basically every known communication service;
from being the platform of basic data services such as file transfer, email, and Web browsing,
to also becoming a platform for applications such as video broadcasting, IPTV, and, not least, wireline and wireless telecommunication. However, using the Internet as a ubiquitous service platform greatly challenges the overall architectural philosophy of the Internet Protocol (IP), which prescribes an end-to-end architecture with smart hosts and a dumb switching
network [19]. In particular, IP does not lend itself easily to soft real-time applications, such
as video and telephony, with time deadlines that need to be met most of the time, and with
fairly stringent availability requirements.
To address the timeliness and availability requirements of soft real-time applications,
quality-of-service (QoS) architectures such as the Integrated Services (IntServ) [5] and Differentiated Services (DiffServ) [7] architectures have been proposed; traffic engineering solutions such as MultiProtocol Label Switching (MPLS) [18] have been developed; and availability/redundancy solutions such as the Virtual Router Redundancy Protocol (VRRP) [11]
and IP-based Fast Rerouting [21] have been suggested. However, in spite of these network
solutions, IP often still has problems meeting soft real-time requirements. The reasons for this are many and include the fact that none of the proposed network solutions has so far enjoyed widespread deployment. They are also fairly expensive and need to be supported by a complex management architecture. Furthermore, even from a theoretical viewpoint, the proposed network solutions have difficulty providing a real-time service from one end host
to another [12]. Often the solutions fall back to the end-to-end architecture, and it becomes
the transport protocols of the end hosts that try to provide an end-to-end real-time service to
the best of their abilities. To this end, this thesis is concerned with transport protocols for
soft real-time applications. The thesis considers both timeliness and availability issues, and
focuses on two important categories of soft real-time applications in the Internet: multimedia
and telephony signaling.
Part I of the thesis considers the design and analysis of transport protocols for multimedia
applications; in particular, multimedia applications with lax deadlines such as image-intensive Web applications. Many of these applications do not require a completely reliable
transport service, and, in view of this, Part I studies so-called partially reliable transport
protocols. These protocols enable an application to explicitly trade reliability for improved
timeliness.
From an implementation perspective, we may differentiate between two major classes of
partially reliable transport protocols: open- and closed-loop protocols (cf. Paper I). Open-loop protocols comprise those protocols which do not employ feedback from the network or
end nodes when they perform error recovery, while closed-loop protocols do employ feedback. Part I studies a subclass of closed-loop protocols, retransmission-based protocols, i.e.,
partially reliable transport protocols which recover from packet losses by retransmitting lost
packets. Specifically, Part I investigates the feasibility of designing retransmission-based,
partially reliable transport protocols that are congestion aware and fair to contending flows.
Part II of the thesis considers the Stream Control Transmission Protocol (SCTP) [22].
The SCTP transport protocol was developed by the Internet Engineering Task Force
(IETF) [1] for the transfer of Public Switched Telephony Network (PSTN) signaling traffic
over IP. From a broad viewpoint, Part II studies how well SCTP meets the timeliness and
availability requirements of PSTN signaling in the SIGnaling TRANsport (SIGTRAN) architecture [16], i.e., in the interworking architecture between traditional telecom networks
and carrier-grade Voice over IP (VoIP) networks proposed by the IETF (cf. Paper VI).
The majority of Part II considers the performance of the network path recovery mechanism of SCTP, the so-called SCTP failover mechanism. SCTP supports redundant network
paths between two end points: one primary path and one or several backup or alternative
paths. Normally, all traffic goes on the primary path; however, if this path becomes unavailable, traffic is rerouted to one of the alternative paths. The detection of an unavailable
path, and the rerouting of traffic from the primary to the selected alternative path are done
by the SCTP failover mechanism. To be able to interwork with the corresponding recovery
mechanisms in the traditional telecom network, it is essential that SCTP exhibits the same
failover performance as these mechanisms. To this end, Part II evaluates the performance of
the SCTP failover mechanism, and, on the basis of this, suggests improvements to its current
design.
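The failover behaviour described above can be sketched as follows. This is a simplified model for illustration only, not an SCTP implementation: the per-path error counter and the Path.Max.Retrans threshold follow RFC 2960, while the class and method names are invented here.

```python
# Simplified sketch of the SCTP failover mechanism (RFC 2960): each
# path keeps an error counter that is incremented on every
# retransmission timeout and cleared on an acknowledgment; when the
# counter of the primary path exceeds Path.Max.Retrans, traffic is
# rerouted to an alternative path.

class Path:
    def __init__(self, name):
        self.name = name
        self.error_count = 0
        self.active = True

class Association:
    def __init__(self, primary, alternatives, path_max_retrans=5):
        self.primary = primary
        self.alternatives = list(alternatives)
        self.path_max_retrans = path_max_retrans

    def current_path(self):
        """Traffic goes on the primary path while it is active."""
        if self.primary.active:
            return self.primary
        # Failover: pick the first active alternative path.
        for p in self.alternatives:
            if p.active:
                return p
        raise RuntimeError("all paths unavailable")

    def on_timeout(self, path):
        """A retransmission timeout on `path` counts as one error."""
        path.error_count += 1
        if path.error_count > self.path_max_retrans:
            path.active = False  # path is declared unavailable

    def on_ack(self, path):
        """A successful acknowledgment clears the error counter."""
        path.error_count = 0
        path.active = True

# With the RFC 2960 default Path.Max.Retrans = 5, six consecutive
# timeouts are needed before the primary path is abandoned.
assoc = Association(Path("primary"), [Path("backup")])
for _ in range(6):
    assoc.on_timeout(assoc.primary)
print(assoc.current_path().name)  # -> backup
```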
Part II also studies the deteriorating effect of Head-of-Line Blocking (HoLB) on the end-to-end transmission delay. HoLB occurs when packets from one flow block packets from
another separate, independent flow. Since mitigating the impact of HoLB was one of the
main reasons SCTP was developed in the first place, Part II tries to quantify the effects of
HoLB on PSTN signaling traffic under various network conditions.
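The blocking effect can be illustrated with a toy model in which two independent message flows either share one strictly ordered stream (as with a single TCP connection) or use one stream each (as SCTP's multi-streaming allows). The arrival times are invented for illustration.

```python
# Toy illustration of head-of-line blocking: two independent message
# flows share one strictly ordered stream. arrivals[i] is the time at
# which message i (in sequence order) is available at the receiver;
# a retransmitted message arrives late and holds up everything after it.

def ordered_delivery(arrivals):
    """In-order delivery: message i is delivered only once all
    messages 0..i have arrived (single TCP-like stream)."""
    latest = 0.0
    deliver = []
    for t in arrivals:
        latest = max(latest, t)
        deliver.append(latest)
    return deliver

# Interleaved flows A and B; A's second message (index 2) is lost and
# retransmitted, arriving at t = 5.0 instead of t = 2.0.
flows    = ["A", "B", "A", "B", "A", "B"]
arrivals = [1.0, 1.5, 5.0, 2.5, 3.0, 3.5]

single = ordered_delivery(arrivals)

# With one independent stream per flow, each flow only waits for its
# own messages (the SCTP multi-streaming case):
per_flow = {}
multi = []
for f, t in zip(flows, arrivals):
    per_flow[f] = max(per_flow.get(f, 0.0), t)
    multi.append(per_flow[f])

print(single)  # flow B's messages at t=2.5 and 3.5 are held until 5.0
print(multi)   # flow B is unaffected by flow A's retransmission
```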
2 Research Objectives

3 Contributions
The main contributions of this thesis are summarized in this section. The contributions of
Part I are summarized in Subsection 3.1, and the contributions of Part II are summarized in
Subsection 3.2.
4 Thesis Outline
This thesis is arranged in two parts. Part I considers the design and analysis of retransmission-based, partially reliable transport protocols for soft real-time multimedia applications. It comprises five papers: Paper I - Paper V. Paper I presents a taxonomy and survey of
retransmission-based, partially reliable transport protocols. Furthermore, Paper I provides an
introduction to the subject. Paper II introduces PRTP, and Papers III - V discuss PRTP-ECN.
Part II of the thesis concerns the transport service offered by SCTP in the SIGTRAN architecture. It consists of five papers: Paper VI - Paper X. Paper VI provides a background to
our research on SCTP. It gives a comprehensive introduction to the softswitch solution, the
common way of implementing carrier-grade VoIP networks, and how softswitch networks
interwork with traditional telecom networks in the SIGTRAN architecture. Paper VII considers HoLB and its short- and long-term impact on SCTP traffic, and Paper VIII - Paper X
address SIGTRAN availability and the SCTP failover mechanism. In particular, Paper VIII
and Paper IX evaluate the SCTP failover performance in unloaded and loaded SIGTRAN
networks, and Paper X presents our study of using relaxed retransmission backoff schemes
to improve the failover performance.
A more detailed description of the papers included in this thesis is provided in Subsections 4.1 and 4.2, below.
performance of PRTP for long-lived connections in terms of average interarrival jitter, average throughput, and average fairness is considered. The second simulation experiment
also studies whether PRTP is TCP friendly or not. Finally, the third simulation experiment
evaluates the transient performance of PRTP compared to TCP. Specifically, the third simulation experiment studies the throughput performance of PRTP in a typical Web browsing
scenario. Since the throughput obtained in Web browsing is very much dependent on the
type of Internet connection, three types of connections are studied: fixed, modem, and GSM.
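The performance metrics named above can be made concrete with a small sketch. The exact definitions used in the papers are not reproduced here; the measures below (mean deviation of consecutive interarrival gaps for jitter, and Jain's fairness index over per-flow throughputs) are common choices and serve only as illustration.

```python
# Illustrative versions of two of the metrics evaluated in the
# simulations. These are common textbook definitions, assumed here
# for illustration; the papers' exact definitions may differ.

def mean_interarrival_jitter(arrival_times):
    """Mean absolute deviation between consecutive interarrival gaps."""
    gaps = [t2 - t1 for t1, t2 in zip(arrival_times, arrival_times[1:])]
    devs = [abs(g2 - g1) for g1, g2 in zip(gaps, gaps[1:])]
    return sum(devs) / len(devs)

def jain_fairness(throughputs):
    """Jain's fairness index: 1.0 when all flows get equal throughput."""
    n = len(throughputs)
    return sum(throughputs) ** 2 / (n * sum(x * x for x in throughputs))

# Perfectly periodic arrivals give zero jitter ...
print(mean_interarrival_jitter([0.0, 1.0, 2.0, 3.0]))  # -> 0.0
# ... and equal per-flow throughputs give a fairness index of 1.0.
print(jain_fairness([10.0, 10.0, 10.0]))               # -> 1.0
```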
Paper III: Evaluation of the QoS Offered by PRTP-ECN - A TCP Compliant Partially Reliable Transport Protocol
The simulations presented in Paper II found PRTP to be TCP unfriendly and not altogether
fair. To address this, PRTP-ECN was conceived. This paper considers PRTP-ECN: the
principal ideas behind the protocol and its design. The stationary performance of PRTP-ECN is evaluated using the same simulation testbed as was used in the stationary analysis
of PRTP (see Paper II). The paper gives a detailed description of the stationary analysis of
PRTP-ECN. Specifically, it evaluates the stationary performance of PRTP-ECN compared
to TCP in terms of average interarrival jitter, average throughput, average goodput, average
fairness, and TCP friendliness.
Paper IV: A Simulation Based Performance Analysis of a TCP Extension for Best Effort
Multimedia Applications
Like Paper III, this paper considers the stationary performance analysis of
PRTP-ECN. However, here the focus is on the statistical design and analysis of the simulation experiment. The simulation experiment is designed as a series of factorial experiments,
one for each studied performance metric. The paper elaborates on the underlying effects
model. Examples of issues discussed are model fitting, e.g., variance stabilizing transforms,
and the statistical hypotheses tested. In addition to the performance metrics studied in Paper III, the paper considers the link utilization of PRTP-ECN as compared to TCP.
5 Concluding Remarks
This thesis considers IP-based transport protocols for soft real-time applications. The thesis
is concerned with both timeliness and availability issues, and focuses on two particular categories of applications: multimedia and PSTN signaling. Part I of the thesis considers the design and analysis of a subclass of partially reliable transport protocols: retransmission-based,
partially reliable transport protocols. The objective of this work was to study the feasibility of designing retransmission-based, partially reliable transport protocols for soft real-time
applications that are compatible with existing Internet transport protocols; are congestion
aware; and are, if possible, fair and TCP friendly. Our work resulted in two extensions to
TCP for partial reliability, PRTP and PRTP-ECN, and Part I shows through simulations and
theoretical analysis that these protocols could give a substantial improvement in throughput
and jitter compared to TCP. Furthermore, the simulations in Part I suggest that while PRTP
is not altogether TCP friendly, PRTP-ECN is both TCP friendly and reasonably fair against
competing TCP flows. Part I of the thesis also presents a taxonomy for retransmission-based, partially reliable transport protocols which, apart from serving as a classification
framework, provides a uniform terminology for the subject. The work presented in Part I
opens up a number of avenues for future research including effective image and/or video
coding techniques for partially reliable transport protocols, and alternative partially reliable
retransmission schemes.
Part II of the thesis evaluates the transport service provided by SCTP, and studies to what
extent SCTP is able to meet PSTN signaling requirements in the IETF SIGTRAN architecture. The main focus of Part II is on the SCTP failover mechanism and its ability to meet the
availability requirements of PSTN signaling. Through extensive experiments, it is suggested
that in order to meet the availability requirements of PSTN signaling, SCTP has to be configured much more aggressively than is recommended in RFC 2960. Ways to improve the
transport service provided by SCTP are also presented. In particular, a relaxed retransmission scheme is proposed. Simulations and complementary experiments suggest that such a
retransmission scheme could significantly improve the SCTP failover performance. Part II
also studies the effects of HoLB on SCTP transmission delays. The study suggests that the
short-term effects of HoLB could be substantial, but that the long-term effects are marginal.
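As a rough illustration of why the failover configuration matters, the failover detection time can be estimated as the sum of the consecutive retransmission timeouts needed to exceed Path.Max.Retrans. The sketch below assumes the RFC 2960 defaults (RTO.Min = 1 s, RTO.Max = 60 s, Path.Max.Retrans = 5) and an RTO at its minimum when the failure occurs; the "relaxed" variant shown is illustrative only and is not the specific scheme proposed in Paper X.

```python
# Back-of-the-envelope failover-time estimate: the primary path is
# declared unavailable only after Path.Max.Retrans + 1 consecutive
# timeouts, so the detection time is roughly the sum of the successive
# RTO values. Parameter defaults follow RFC 2960; the starting RTO of
# 1 s (RTO.Min) is an assumption for illustration.

def failover_time(backoff, rto=1.0, rto_max=60.0, path_max_retrans=5):
    """Sum the RTOs of the consecutive timeouts needed for failover.
    `backoff` maps the current RTO to the next one."""
    total = 0.0
    for _ in range(path_max_retrans + 1):
        total += rto
        rto = min(backoff(rto), rto_max)
    return total

# Standard exponential backoff: the RTO doubles after every timeout,
# giving 1 + 2 + 4 + 8 + 16 + 32 = 63 seconds.
print(failover_time(lambda rto: 2 * rto))  # -> 63.0

# One conceivable relaxed scheme (illustrative only): no doubling at
# all, giving 6 * 1 = 6 seconds.
print(failover_time(lambda rto: rto))      # -> 6.0
```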
This thesis does not by any means signify the end of our study of timeliness and availability issues in SCTP. Currently, SCTP uses more or less the same congestion control mechanism as TCP, a mechanism which is believed by many, such as Camarillo
et al. [6], to be less than ideal for signaling traffic. In future research, we intend to study and
evaluate alternative congestion control mechanisms that, in better ways, take into account
the properties of signaling traffic, e.g., in terms of burstiness and duration. Also our research
on the SCTP failover mechanism will be continued. Notably, we intend to further our study
of relaxed retransmission timeout schemes and consider alternative solutions. Furthermore,
we intend to more formally analyze the stability of congestion control schemes that utilize a
relaxed retransmission strategy.
References
[1] Internet Engineering Task Force (IETF). http://www.ietf.org.
[2] 3GPP. 3rd generation partnership project; technical specification group services and
system aspects; IP multimedia subsystem (IMS); stage 2 (release 7). Technical Specification TS 23.228 v.7.1.0, 3GPP, September 2005.
[3] P. D. Amer, C. Chassot, T. Connolly, M. Diaz, and P. T. Conrad. Partial order transport
service for multimedia and other applications. IEEE/ACM Transactions on Networking,
2(5), October 1994.
[4] K. Asplund, J. Garcia, A. Brunstrom, and S. Schneyer. Decreasing transfer delay
through partial reliability. In Protocols for Multimedia Systems (PROMS), Cracow,
Poland, October 2000.
[5] R. Braden, D. Clark, and S. Shenker. Integrated services in the Internet architecture: An overview.
RFC 1633, IETF, June 1994.
[6] G. Camarillo and H. Schulzrinne. Signalling transport protocols. Technical report,
Dept. of Computer Science, Columbia University, February 2002.
[7] M. Carlson, W. Weiss, S. Blake, Z. Wang, D. Black, and E. Davies. An architecture for
differentiated services. RFC 2475, IETF, December 1998.
[8] L. Coene and J. Pastor-Balbas. Telephony signalling transport over stream control transmission protocol (SCTP) applicability statement. RFC 4166, IETF, February 2006.
[9] M. Diaz, A. Lopez, C. Chassot, and P. D. Amer. Partial order connections: A new concept for high speed and multimedia services and protocols. Annals of Telecommunications, 49(5-6):270-281, 1994.
[10] M. Enachescu, Y. Ganjali, A. Goel, N. McKeown, and T. Roughgarden. Part III: Routers with very small buffers. ACM Computer Communication Review, 35(3):83-89,
July 2005.
[11] R. Hinden. Virtual router redundancy protocol (VRRP). RFC 3768, IETF, April 2004.
[12] G. Huston. Internet Performance Survival Guide QoS Strategies for Multiservice
Networks. John Wiley & Sons, Inc., 1st edition, 2000.
[13] A. Jungmaier, E. P. Rathgeb, and M. Tuexen. On the use of SCTP in failover scenarios. In 6th World Multiconference on Systemics, Cybernetics and Informatics, Orlando,
Florida, USA, July 2002.
[14] G. De Marco, D. De Vito, M. Longo, and S. Loreto. SCTP as a transport for SIP: a
case study. In 7th World Multiconference on Systemics, Cybernetics and Informatics
(SCI), Orlando, Florida, USA, July 2003.
[15] B. Mukherjee and T. Brecht. Time-lined TCP for the TCP-friendly delivery of streaming media. In IEEE International Conference on Network Protocols (ICNP), pages 165-176, Osaka, Japan, November 2000.
[16] L. Ong, I. Rytina, M. Garcia, H. Schwarzbauer, L. Coene, H. Lin, I. Juhasz, M. Holdrege, and C. Sharp. Framework architecture for signalling transport. RFC 2719, IETF,
October 1999.
[17] G. Raina, D. Towsley, and D. Wischik. Part II: Control theory for buffer sizing. ACM Computer Communication Review, 35(3):79-82, July 2005.
[18] E. Rosen, A. Viswanathan, and R. Callon. Multiprotocol label switching architecture.
RFC 3031, IETF, January 2001.
[19] J. H. Saltzer, D. P. Reed, and D. D. Clark. End-to-end arguments in system design. ACM Transactions on Computer Systems, 2(4):277-288, November 1984.
[20] T. Seth, A. Broscius, C. Huitema, and H-A P. Lin. Performance requirements for
signaling in internet telephony. Internet draft, IETF, November 1998. Work in Progress.
[21] M. Shand and S. Bryant. IP fast reroute framework. Internet draft, IETF, March 2006.
Work in Progress.
[22] R. Stewart, Q. Xie, K. Morneault, C. Sharp, H. Schwarzbauer, T. Taylor, I. Rytina,
M. Kalla, L. Zhang, and V. Paxson. Stream control transmission protocol. RFC 2960,
IETF, October 2000.
[23] D. Wischik and N. McKeown. Part I: Buffer sizes for core routers. ACM Computer Communication Review, 35(3):75-78, July 2005.
Part I
Partially Reliable Transport Protocols
for Multimedia Applications
Paper I
1 Introduction
The two standard transport protocols in the TCP/IP suite, TCP [29] and UDP [28], date
back some thirty years. They were primarily designed to offer a service appropriate for the
prevailing applications in those days. TCP was designed to offer a completely reliable service, a service suitable for such applications as email, file transfer and remote login; UDP,
on the other hand, was designed to offer an unreliable service, a service appropriate for
simple query-response applications such as name servers and network management systems.
However, in recent years, we have witnessed a growing interest in distributed multimedia applications: a category of applications typically requiring a service that in terms of reliability
places itself somewhere between the services offered by TCP and UDP but that often has stringent demands on throughput, delay, and delay jitter. To address the needs of this class
of applications, the Real-time Transport Protocol (RTP) [31] and other application-support
protocols have been proposed. These protocols try to follow the principles of protocol architecture design outlined by Clark and Tennenhouse [9]. Specifically, they try to implement
the two key architectural principles of Clark and Tennenhouse: application level framing
and integrated layer processing. However, apart from giving the application more control
over transport level decisions, these protocols also impose a number of responsibilities on
the application, e.g., flow and congestion control, and the administration of the timing of the
application messages.
The fact that RTP and similar protocols impose a number of duties on the application that
are normally taken care of by the transport protocol, and that are essentially of no interest to
the application, has made many researchers argue that this is not the right way to go [21].
They re-emphasize the requirement that there should be a clean separation between the specification and implementation of a service. In particular, they argue for a transport protocol
that, in the same way as RTP, gives the application greater influence on the transport-level
decisions but that, in contrast to RTP, relieves the application of all transport-level responsibilities. To this end, a class of transport protocols has emerged that offers a service enabling
the application to trade some reliability for improved performance with respect to some or
all of the metrics: throughput, delay and delay jitter; a class of transport protocols that offers
a service more suitable for multimedia and other real-time applications than either TCP or
UDP. This class of protocols is commonly called partially reliable transport protocols.
From an implementation perspective, we may differentiate between two major classes
of partially reliable transport protocols: closed-loop and open-loop protocols. Closed-loop
protocols dynamically adapt their error control scheme on the basis of feedback on the current network conditions. In contrast, open-loop protocols do not use any feedback from the
network, but work instead by adding redundancy to the payload and thereby mitigating the
impact of losses.
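A minimal open-loop sketch, assuming a single XOR parity packet per block. This is one simple form of added redundancy chosen for illustration; actual open-loop protocols typically use more elaborate forward error correction codes.

```python
# Minimal open-loop example: instead of retransmitting, the sender adds
# redundancy -- here a single XOR parity packet over a block of
# equal-sized data packets -- so the receiver can repair any one loss
# in the block without feedback to the sender.

def xor_parity(packets):
    """XOR all packets together, byte by byte."""
    parity = bytes(len(packets[0]))
    for p in packets:
        parity = bytes(a ^ b for a, b in zip(parity, p))
    return parity

block = [b"pkt0", b"pkt1", b"pkt2"]
parity = xor_parity(block)

# Suppose packet 1 is lost in transit; XORing the surviving packets
# with the parity packet reconstructs it.
recovered = xor_parity([block[0], block[2], parity])
print(recovered)  # -> b'pkt1'
```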
While open-loop protocols have been extensively studied and used for multimedia communication during the past twenty years [5, 6], closed-loop techniques were not considered
appropriate for multimedia communication until ten years ago. It was then, at the beginning of the 1990s, that Dempsey [12], Papadopoulos and Parulkar [26], and others demonstrated the feasibility of using retransmission-based, closed-loop protocols for multimedia
communication. Since then, a large number of protocols belonging to this class have been
proposed. However, along with the growing interest in this class of protocols, a terminology
has evolved which in some parts is both incoherent and inconsistent. Consequently, it is currently difficult to analyze the relative merits of different protocols in a meaningful way. It is
also difficult to discern which efforts are likely to be most rewarding and thus should be considered in future research. This paper presents a taxonomy for these protocols, a taxonomy
which we believe not only gives a unified terminology but also clarifies the core principles
and thus could serve as a basis for future work in this area.
The taxonomy presented comprises two classification schemes: one that classifies the
protocols with respect to the reliability service they offer and one that classifies them with
respect to their error control scheme. The taxonomy has been inspired by the earlier work
of, among others, Diaz and Dempsey. In particular, our view of the concept of a partially
reliable service is to a large extent derived from the work on partially reliable and partially
ordered services by Diaz et al. [14]. Furthermore, our classification of retransmission-based,
partially reliable transport protocols with respect to their error control scheme is to some
degree influenced by the discussion of closed-loop, partially reliable transport protocols in
Dempsey's thesis [11]. However, in contrast to these and other works, our taxonomy maintains a clear distinction between the service offered by a protocol and its implementation.
This paper also provides a classification and survey of existing protocols. Specifically,
it shows how a selection of existing protocols are classified with respect to our taxonomy.
Notably, it follows from our classification that the majority of protocols employ a relatively small set of core principles. A subset of the classified protocols is further elaborated
in a survey. The survey complements the classification by illustrating how the majority of
the reliability services and error control schemes in our taxonomy have been implemented
in actual protocols.
The remainder of the paper is organized as follows. Section 2 provides some preparatory
material. In particular, the concept of a retransmission-based, partially reliable transport
protocol is defined. Our taxonomy is presented in Section 3. Section 4 gives a survey of
existing retransmission-based, partially reliable transport protocols and shows how they are
classified with respect to our taxonomy. Finally, Section 5 concludes the paper with a brief
summary of our proposed taxonomy and a discussion of the insight gained in developing the
taxonomy.
2 Preliminaries
The taxonomy of retransmission-based, partially reliable transport protocols presented in
this paper is based on the following definition of a partially reliable service.
Definition 1 Let r denote the reliability level offered by a service. A service is considered
partially reliable provided r ∈ ]0%, 100%[, i.e., 0% < r < 100%.
On the basis of Definition 1, we use the following definition of a retransmission-based,
partially reliable transport protocol.
Definition 2 A retransmission-based, partially reliable transport protocol is a transport
protocol that is explicitly designed to offer a partially reliable service and that uses an error
control scheme in which error recovery is made through retransmissions.
The reason why Definition 2 specifically states that a transport protocol must be explicitly
designed to offer a partially reliable service in order to be a partially reliable transport protocol is that, from a theoretical viewpoint, there is no such thing as a transport protocol offering
a completely reliable service. For example, TCP is designed to offer a completely reliable
service but is only able to do so as long as some conditions hold. In particular, TCP depends
on the ability of the IP protocol to provide end-to-end connectivity and assumes that the
TCP checksum is able to detect all conceivable kinds of bit errors. Consequently, TCP fails
to provide a completely reliable service on the rare occasions when the network becomes
partitioned or the TCP checksum fails to detect a corrupted packet [35].
[Figure 1: Classification with respect to the reliability service. The classification is made along three dimensions: specification of reliability level (implicit or explicit), granularity (message, message group or flow) and adaptiveness (non-adaptive or adaptive, with the reliability level specified per-flow, in-flow or per-message).]
3 The Taxonomy
As mentioned in Section 1, our taxonomy consists of two classification schemes: one that
classifies retransmission-based, partially reliable transport protocols with respect to their
offered reliability service and one that classifies them with respect to their error control
scheme. The former classification scheme is presented in Subsection 3.1 and the latter in
Subsection 3.2.
Granularity. The granularity of a protocol refers to the unit of data considered by the error control scheme when it makes retransmission decisions on lost or corrupted packets (cf. Subsection 3.2). As shown in Figure 1,
we distinguish between three main classes of protocols with respect to granularity:
message, message group and flow.
Message. In this context, the term message denotes messages exchanged between
application entities, i.e., Application Protocol Data Units (APDUs). When a
protocol has a granularity of a message, this means that the error control scheme
of the protocol performs retransmissions based on the status of one message.
This, for example, can be that the reliability service is based on the number of
times a message has been retransmitted, or that the reliability service is based on
a priority level assigned to a message.
Typically, protocols with a granularity of a message target video and audio streaming applications: both video and audio coders usually generate sequences of fixed-sized data blocks that fit conveniently into messages. Furthermore, video coding schemes such as MPEG-1, MPEG-2 and H.263 generate data blocks of different importance, e.g., H.263 generates I-, P- and B-frames, of which I-frames are more important than either the P- or B-frames. By using a protocol
with a granularity of a message, a video application using one of these video
coding schemes can exploit the fact that all data blocks are not equally important
and can assign different reliability levels to each data block.
Message Group. A protocol is said to have a granularity of a message group when
the retransmission decision of the protocol involves not only one message but a
pre-specified number of messages. Typically, a protocol adhering to this class
calculates the reliability level by counting the number of successfully received
messages within a fixed-length message group.
Protocols with a granularity of a message group primarily target the same niche
of applications as those with a granularity of a message, i.e., video and audio
streaming applications. Compared to protocols with a granularity of a message,
the major advantages of message group-based protocols are that they are generally easier to realize and, owing to their coarser granularity, entail less overhead.
However, the coarser granularity of message group-based protocols is not always
beneficial. While it could be argued that a granularity of a message group is
sufficient for many audio streaming applications, it is less than ideal for video
streaming applications: audio coders typically produce packets of more or less
the same importance, but, as said, this is generally not the case with video coders.
Flow. A flow refers to a unidirectional stream of messages from a single media source.
One or several flows constitute a session: a single connection and/or conversation
between one end point and one or several other end points. Common examples of
flows are video streams in a video broadcast application and audio streams in an
Internet radio application. When a protocol belonging to this class of protocols
makes a retransmission decision at a particular time during a flow, it bases its
retransmission decision on all packets sent and/or received in this flow up to this
time.
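The granularity classes above differ mainly in the scope of the data that a retransmission decision looks at. As a minimal sketch of message-group granularity, assuming a hypothetical per-group loss tolerance (the function and parameter names are illustrative, not taken from any particular protocol):

```python
def group_needs_retransmission(received_flags, min_received):
    """Decide whether any message in a fixed-length message group must be
    retransmitted, based only on how many messages of the group arrived.

    received_flags -- one boolean per message in the group (True = received)
    min_received   -- smallest number of received messages the application
                      tolerates within the group (a hypothetical parameter)
    """
    # The decision considers the group as a whole, not individual messages.
    return sum(received_flags) < min_received
```

With this coarse granularity, a group of eight messages of which six arrived needs no retransmission under a tolerance of six, regardless of which two messages were lost.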
[Figure: Sender-based and receiver-based error control schemes. In a sender-based scheme, both error detection and the retransmission decision component are located at the sender side; in a receiver-based scheme, error detection and the retransmission decision component are located at the receiver side, and a feedback component conveys retransmission requests to the retransmission component at the sender side.]
Our classification of protocols with respect to their error control scheme is exclusively based on the most salient
features of the retransmission decision component. In particular, classification is made along
the two dimensions of location and decision base.
[Figure: Classification with respect to the retransmission decision component, made along the two dimensions of location (sender or receiver) and decision base (priority; metrics, i.e., packet loss (a statistic such as a point estimate, or a sliding window), time (an absolute or relative deadline, or a statistic) and number of retransmissions; and reliability classes (PR/UR or R/PR/UR), with a separate decision base for the PR class).]

Location. The retransmission decision component of an error control scheme is located at either the sender side or the receiver side. Depending on the location of the retransmission decision component, we distinguish between sender-based protocols and
receiver-based protocols.
Although an analytical study by Marasli et al. [23] suggests that sender-based protocols in some instances exhibit a slightly better throughput performance than receiver-based protocols, the latter indeed possess some attractive features. Receiver-based
protocols are generally more scalable and versatile. An example is the case in which
a server serves several clients, all with different reliability service requirements. In
the sender-based case, the responsibility of providing the correct service to all clients
is primarily the server's. In the receiver-based case, the responsibility has to a large
degree been distributed to each one of the clients. Furthermore, in the receiver-based
case, the sender is not involved in the retransmission decisions and is consequently
able to simultaneously serve several clients with different retransmission decision
components.
In some protocols, the retransmission decision is normally made by the receiver but may in exceptional cases be ignored by the sender. For example, in the CM protocol proposed
by Papadopoulos and Parulkar [25], the sender has a retransmission buffer whose size
approximately follows the size of the playout buffer at the receiver side. When a
packet has been sent, it is placed in the retransmission buffer and, if necessary, the
oldest packet in the retransmission buffer is discarded. When a retransmission request
arrives at the sender for a packet that has already been discarded from the retransmission buffer, the retransmission request is ignored by the sender. Although it could be
argued that protocols like the CM protocol should be classified as hybrid sender-based
receiver-based protocols, we classify them as receiver-based protocols because only
the retransmission decision at the receiver side is made with regard to the requirements imposed by the data stream, i.e., they use a decision base that is either metrics-based, priority-based or based on reliability classes.
Decision Base. The decision base comprises the metrics, rules and/or heuristics that form the
basis for the retransmission decisions made by a retransmission decision component.
This dimension has three main classes: priority, metrics and reliability classes.
Priority. In priority-based protocols, each packet is assigned a priority based on its
relative importance. Retransmission of lost packets is always made in priority
order, where high priority packets are sent before packets with lower priorities.
Often, protocols are not pure priority-based protocols but are a combination of
priority and deadline based. Apart from having a priority, packets in combined
priority- and deadline-based protocols must also meet a deadline. In these protocols, lost packets are typically retransmitted in priority order until they have been
successfully received or their deadline has expired. This means that high priority
packets have a better chance of being delivered since, in the event of packet loss,
they are more likely to be retransmitted.
Priority-based protocols are very lightweight and have therefore found their use
in the area of audio and video broadcasting. For example, the CUDP [32] transport protocol targets audio and video file servers and SR-RTP [15] targets video
file servers.
Metrics. Protocols that base their retransmission decision on direct or indirect measurements of one or several properties of a flow, e.g., packet loss rate or timeliness, are considered metrics-based protocols. We distinguish between three
classes of metrics-based protocols: protocols that base their retransmission on
some kind of estimate of the packet loss rate; protocols that make retransmissions as long as the timeliness of the data flow is not violated; and protocols that
indirectly consider the timeliness of the data flow by restricting the number of
retransmissions made on individual messages.
Packet Loss. To our knowledge, the existing protocols use one of two approaches to estimate the packet loss rate. The first approach involves using
a statistic, e.g., the arithmetic average or mean packet loss rate. Protocols
belonging to this class usually monitor the packet loss rate by continuously
calculating an estimate or weighted estimate of the mean packet loss rate.
The second approach involves calculating the average packet loss rate over a
fixed-length sequence of packets. Since the packet loss-based protocols using this approach typically keep track of the fixed-length sequence of packets through the use of a sliding window mechanism, this subclass of packet
loss-based protocols is called the sliding window class.
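The two estimation approaches can be sketched as follows; the class names and the alpha and window parameters are illustrative, not taken from any particular protocol:

```python
from collections import deque

class EwmaLossEstimate:
    """Statistic-based estimate: an exponentially weighted moving average of
    the packet loss rate, taken over the whole flow."""
    def __init__(self, alpha=0.1):
        self.alpha = alpha   # weight of the newest observation (assumed value)
        self.rate = 0.0
    def update(self, lost):
        # Blend the newest loss observation (1.0 or 0.0) into the estimate.
        self.rate = (1 - self.alpha) * self.rate + self.alpha * (1.0 if lost else 0.0)

class SlidingWindowLossEstimate:
    """Sliding-window estimate: the average loss rate computed over the last
    `window` packets only."""
    def __init__(self, window=100):
        self.history = deque(maxlen=window)  # drops the oldest entry automatically
    def update(self, lost):
        self.history.append(lost)
    @property
    def rate(self):
        return sum(self.history) / len(self.history) if self.history else 0.0
```

The sliding-window estimate forgets a loss completely once it falls out of the window, whereas the statistic-based estimate lets old losses decay gradually.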
A problem with pure packet loss-based protocols is their obliviousness to time, which makes them less than ideal for audio, video and other time-sensitive applications.
Time. Time-based protocols base their retransmission decision on a metric that is a function of time. The class of time-based protocols can be further divided into the subclasses: deadline-based and statistic-based protocols.
In deadline-based protocols, the retransmission decision component issues
a retransmission request for a lost packet provided the retransmitted packet
is likely to meet the deadline of the lost packet. Typically, deadline-based
protocols are used by streaming media applications, where it is important
that packets arrive at the receiver before their playout time. There are two
major classes of deadline-based protocols: absolute deadline-based and relative deadline-based protocols. Absolute deadline-based protocols refer to
protocols in which deadlines are calculated with respect to the beginning of
a flow while, in relative deadline-based protocols, the deadline of a packet
is calculated relative to preceding packets.
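The distinction between absolute and relative deadlines, together with the typical test of whether a retransmission can still be useful, can be sketched as follows (the function names and the one-RTT retransmission delay estimate are assumptions made for illustration):

```python
def absolute_deadline(flow_start, offset):
    """Absolute deadline: calculated with respect to the beginning of the
    flow, e.g. the playout time of a packet in an evenly paced stream."""
    return flow_start + offset

def relative_deadline(previous_packet_deadline, spacing):
    """Relative deadline: calculated relative to a preceding packet."""
    return previous_packet_deadline + spacing

def worth_retransmitting(now, rtt_estimate, deadline):
    """A retransmitted copy is only useful if it can arrive before the
    deadline; one RTT is used here as a rough retransmission delay."""
    return now + rtt_estimate <= deadline
```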
The class of statistic-based protocols comprises all time-based protocols that
use a metric that is a function of statistical estimates of one or several time
related performance metrics, e.g., mean latency.
As far as we know, there exist no purely time-statistic-based protocols. However, there is at least one protocol, SRP [27], that uses a metric involving
both packet loss and time. In contrast to deadline-based protocols, SRP is
able to make explicit trade-offs between reliability level and timeliness.
Number of Retransmissions. Unlike time-based protocols, protocols belonging to
this class have no notion of time. Instead, they impose timely delivery of
packets by limiting the number of times a packet can be retransmitted. More
specifically, a retransmission counter is associated with each packet. The
counter is decreased every time the packet is retransmitted. When the retransmission counter of a packet reaches zero, the packet is discarded.
There are very few protocols that adhere to this class. In fact, we have not
found a single protocol that has a decision base only involving the number
of times a packet has been retransmitted. On the other hand, there is a
reliability class-based protocol, k-XP [3, 22], in which one reliability class
has a decision base consisting of the number of times a packet has been
retransmitted.
The reason that there are so few protocols that base their retransmission decision on the number of times a packet has been retransmitted is of course
that this is a very imprecise metric; it only indirectly governs the packet loss
rate and the timeliness. However, in combination with reliability classes,
it has proven itself quite useful. In particular, using the number of times a
packet has been retransmitted as a decision base has been proven useful in
implementing a better-than-best-effort reliability class, similar to the controlled-load service class of IntServ [7].
4.1 PECC
To be precise, PECC (Partially Error-Controlled Connection) is an error control scheme, not
a protocol. Developed by Dempsey et al. [11, 12] as an extension to the multi-service transport protocol Xpress Transfer Protocol [36] (XTP), it was primarily intended for continuous
media applications.
The application atop PECC specifies its service requirements explicitly on a per-flow basis through four parameters: fifo_min, window_length, window_density and max_gap. The fifo_min parameter indicates the minimum number of contiguous bytes
that must be queued before the receiver is permitted to issue a retransmission request. In
other words, fifo_min should be an estimate of the number of bytes consumed by the application during one round trip time.

Protocol | Specification of Reliability Level | Granularity | Adaptiveness | Decision Base | Location
Slack ARQ | Implicit | Message | Per-Flow | Absolute Deadline | Receiver
PECC | Explicit | Message Group | Per-Flow | Absolute Deadline, Sliding Window | Receiver
k-XP | Explicit | Message | Per-Message | Reliability Classes, R/PR/UR (Number of Retransmissions) | Sender
POCv2 | Explicit | Message | Per-Message | Reliability Classes, R/PR/UR (Relative Deadline) | Receiver
AOEC | Explicit | Message Group | Per-Flow | Sliding Window | Receiver
CUDP | Implicit | Message Group | Per-Flow | Priority | Sender
VDP | Implicit | Message | Per-Flow | Reliability Classes, PR/UR (Absolute Deadline) | Receiver
Jacobs/Eleftheriadis | Implicit | Message | Per-Flow | Absolute Deadline | Receiver
TLTCP | Implicit | Message | Per-Message | Absolute Deadline | Sender
SRP | Explicit | Flow | Per-Flow | Point Estimate of Packet Loss and Time | Receiver
Papadopoulos/Parulkar | Implicit | Message | Per-Flow | Absolute Deadline | Receiver
SR-RTP | Implicit | Message | Per-Flow | Absolute Deadline, Priority | Receiver
HPF | Explicit | Message | Per-Flow | Reliability Classes, R/PR/UR (Absolute Deadline) | Sender
MSP | Implicit | Message | Per-Flow | Absolute Deadline | Receiver
XUDP | Explicit | Message | Per-Flow | Reliability Classes, R/PR/UR (Absolute Deadline) | Sender
PR-SCTP | Implicit | Message | Per-Message | Absolute Deadline | Sender
PRTP-ECN | Explicit | Flow | In-Flow | Point Estimate of Packet Loss | Receiver

Table 1: Classification of retransmission-based, partially reliable transport protocols in our taxonomy. Columns two to four describe the reliability service; the last two columns describe the retransmission decision. The decision bases of the partially reliable classes in the reliability class schemes are appended in parentheses to the names of the schemes.

The two parameters window_length and
window_density specify the loss tolerance of the application. They specify that no more than window_density bytes are permitted to be lost out of window_length bytes of application data. From a practical viewpoint, this means that a reliability service at a granularity of a message group is offered: setting window_length to a number of bytes equal to or less than the size of a message is not meaningful¹. Finally, the last parameter, max_gap, puts a limit on burst losses, giving an upper bound on how many contiguous bytes are permitted to be lost by the application.
Since PECC is intended mainly for continuous media applications, it assumes that the XTP receiver logically places received data in a FIFO buffer that is emptied at an approximately constant rate (isochronously). Consequently, the fifo_min parameter serves as an absolute deadline for the data received. Together with the parameters window_length and window_density, this makes PECC an absolute deadline and sliding window-based protocol.
In PECC, the retransmission decision takes place at the receiver side. Every time an out-of-sequence packet is received, this is taken as an indication of one or more packets being lost and results in the invocation of the retransmission decision component of PECC. If there are more than fifo_min bytes in the FIFO buffer, a retransmission request is issued for the presumably lost data. Otherwise, PECC tries to skip as much data as is needed to facilitate a retransmission, i.e., to make the depth of the FIFO buffer greater than fifo_min bytes. However, when this is not possible without violating the max_gap parameter or the sliding window determined by the window_length and window_density parameters, PECC reports a failure to the application and skips the data anyway.
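The receiver-side decision just described can be sketched roughly as follows; all names except fifo_min and max_gap are hypothetical, and the window_length/window_density sliding-window check is collapsed into a single skippable_bytes value:

```python
def pecc_on_gap(fifo_depth, fifo_min, skippable_bytes, gap_bytes, max_gap):
    """Rough sketch of the PECC retransmission decision described above.

    fifo_depth      -- bytes currently queued in the receiver FIFO
    fifo_min        -- minimum depth needed to ride out one round trip
    skippable_bytes -- bytes that could be skipped without violating the
                       window_length/window_density loss tolerance
    gap_bytes       -- length of the presumed loss burst
    max_gap         -- upper bound on contiguous lost bytes
    """
    if fifo_depth > fifo_min:
        # Enough buffered data to wait out a retransmission.
        return "retransmit"
    if gap_bytes <= max_gap and skippable_bytes >= gap_bytes:
        # Skip data to deepen the FIFO beyond fifo_min, then retransmit.
        return "skip-then-retransmit"
    # Loss tolerance would be violated: report a failure and skip anyway.
    return "report-failure-and-skip"
```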
4.2 POCv2
POCv2 was proposed by Conrad et al. [10] at the University of Delaware in an attempt to design a transport protocol better suited to distributed multimedia applications. The origins of POCv2 can be traced back to the POC (Partial Order Connection) protocol [1, 2], which was developed as a joint effort between the University of Delaware and
protocol, which was developed as a joint effort between the University of Delaware and
LAAS/ENSICA.
Apart from being a partially reliable transport protocol, POCv2 builds upon the notion
of partial order. It considers a flow as consisting of a partially ordered sequence of messages, where each message corresponds to exactly one media object (e.g., an audio clip or a
component of a video frame).
POCv2 is a reliability class-based protocol: the POCv2 application decomposes a flow into
messages, specifies the partial order of the messages and assigns each message to one of
three reliability classes, reliable, partially reliable or unreliable. Furthermore, during a flow,
POCv2 enables an application to alter the reliability class assigned to a particular message.
¹ Notably, if window_length is assigned a value equal to or less than the size of a message, an unreliable service is obtained when window_density is set to 0, and a completely reliable service is obtained for all values of window_density greater than 0.
The retransmission decisions in POCv2 are made by the receiver. Whether or not a request for retransmission of a message should be issued depends on the reliability class of the
message: for reliable messages, retransmission requests are issued until they are successfully
received; partially reliable messages are retransmitted as long as the receiver has some messages to deliver to the application. The receiver will stop issuing retransmission requests for
a partially reliable message when there are no more messages to deliver to the application.
Declaring the partially reliable message as lost will make some messages succeeding the lost
message (with respect to the partial order among the messages) deliverable. In other words,
the decision base of the partially reliable class of messages is relative deadline-based, where
the deadlines for the partially reliable messages are measured with respect to their preceding
messages in the partial order. Finally, no retransmission requests are issued for unreliable
messages.
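The per-class receiver decision just described can be sketched as follows (the function and parameter names are illustrative, not part of POCv2 itself):

```python
def pocv2_should_request_retransmission(reliability_class, deliverable_messages_left):
    """Sketch of the POCv2 receiver decision described above.

    reliability_class         -- "reliable", "partially-reliable" or "unreliable"
    deliverable_messages_left -- whether the receiver still holds messages it
                                 can deliver to the application while waiting
    """
    if reliability_class == "reliable":
        return True                       # request until successfully received
    if reliability_class == "partially-reliable":
        return deliverable_messages_left  # give up once delivery would stall
    return False                          # unreliable: never request
```

Declaring a partially reliable message lost as soon as nothing else is deliverable is what makes this decision base relative deadline-based: the deadline is set by the preceding messages in the partial order.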
4.3 SRP
Commonly, retransmission-based partially reliable transport protocols that consider both the
timing and the packet loss requirements of a multimedia streaming application, e.g., PECC,
do so by simply giving the timing requirements priority over the packet loss requirements;
lost packets are retransmitted as long as this can be done without violating the timing requirements. In contrast, the SRP (Selective Retransmission Protocol) proposed by
Piecuch et al. [27] not only strives to offer a service that complies with the timing and packet
loss requirements imposed by the streaming application but also strives to offer the service
that gives the optimal trade-off between the two requirements.
During an SRP session, the SRP client is responsible for receiving a multimedia stream
from the server and issuing retransmission requests if necessary. The retransmission decisions of the receiver are governed by the maximum tolerable transmission delay and the
maximum tolerable packet loss rate of the application. These performance requirements are
communicated to SRP by the application at the inception of a flow and are specified on a
per-flow basis.
The SRP receiver assumes that packets arrive at a constant rate. The receiver thus maintains an estimate of the arrival time of the next packet in a flow. A packet is considered lost
if it has not been received before its expected arrival time has elapsed.
The expected arrival time of a packet is calculated as the arrival time of the previous
packet plus the round trip timeout (RTO) value maintained by the SRP receiver. Since it is
vital for the performance of SRP that the RTO is accurate, the receiver updates the RTO by
sending time probe packets to the sender at regular intervals.
The SRP receiver implements two retransmission decision algorithms: Equal Loss Latency (ELL) and Optimum Quality (OQ). ELL and OQ are based on the notions of loss ratio
and delay ratio. The loss ratio, r_loss, is the ratio of the current packet loss rate to the maximum tolerable packet loss rate, and the delay ratio, r_delay, is the ratio of the current transmission delay to the maximum tolerable transmission delay. As an estimate of the current transmission delay, one-half of the RTO is used. When a packet loss is
detected, ELL decides whether the lost packet should be retransmitted on the basis of which of the two ratios, r_loss or r_delay, is the greater.
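The two ratios follow directly from the definitions above. The ELL comparison below is an assumed reading ("retransmit when the loss requirement is under more pressure than the delay requirement"); the exact ELL and OQ rules are given in the SRP paper [27] and are not reproduced here:

```python
def srp_ratios(current_loss_rate, max_loss_rate, rto, max_delay):
    """Loss and delay ratios as defined above; one-half of the RTO serves as
    the estimate of the current transmission delay."""
    r_loss = current_loss_rate / max_loss_rate
    r_delay = (rto / 2.0) / max_delay
    return r_loss, r_delay

def ell_retransmit(r_loss, r_delay):
    """ASSUMPTION: ELL is sketched as retransmitting when the loss ratio
    exceeds the delay ratio; see [27] for the exact rule."""
    return r_loss > r_delay
```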
4.4 HPF
The HPF (Heterogeneous Packet Flows) protocol was designed by Li et al. [21] to effectively
support heterogeneous packet flows: for example, MPEG flows with frames of different priority or multiplexed audio/video streams. The primary motivation for HPF was to demonstrate the feasibility of designing a transport protocol that provides mechanisms for flow and
congestion control on a per-flow basis and mechanisms for reliability, sequencing, framing
and prioritization on a per-message basis.
The application atop HPF partitions the data stream into messages, e.g., MPEG frames,
and specifies the service requirements at the initiation of a flow on a per-message basis. In
particular, HPF enables an application to assign a message to one of the three reliability
classes: reliable, unreliable and unreliable delay-bounded. The application messages are
then treated as single entities by HPF, i.e., all packets belonging to the same application
message are assigned the same reliability class as the message.
HPF is a sender-based protocol, i.e., the retransmission decisions are made by the sender.
The retransmission policy for reliable and unreliable packets is simple: a reliable packet
is retransmitted until it has been successfully received, while an unreliable packet is never
retransmitted. The retransmission policy for unreliable delay-bounded packets is somewhat
more complex. Specifically, all delay-bounded packets are assigned a deadline based on the
transmission rate. A packet is only retransmitted if the estimated round trip time suggests that
the packet can be retransmitted and still meet its deadline. In other words, delay-bounded
packets use an absolute deadline-based decision base.
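A rough sketch of the sender-side policy just described (the function and parameter names are illustrative, not HPF's own):

```python
def hpf_should_retransmit(reliability_class, now, deadline, rtt_estimate):
    """Sketch of the HPF sender-side retransmission policy described above."""
    if reliability_class == "reliable":
        return True          # retransmit until acknowledged
    if reliability_class == "unreliable":
        return False         # never retransmit
    # Unreliable delay-bounded: retransmit only if the copy can still meet
    # the packet's deadline according to the round trip time estimate.
    return now + rtt_estimate <= deadline
```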
4.5 PR-SCTP
To address the shortcomings and limitations of TCP and UDP for the transport of telephony signaling messages, the SIGTRAN (Signaling Transport) working group of the IETF developed SCTP [34].
SCTP provides a message-based, reliable, and ordered transport service to an application. It accomplishes this by fragmenting the application messages into so-called chunks and
assigning a Transmission Sequence Number (TSN) to each chunk. In the same way as in
TCP, SCTP employs a cumulative acknowledgement mechanism: data chunks received are
acknowledged by informing the sender of the TSN of the next expected chunk.
While the development of SCTP was directly motivated by the transfer of the SS7 (Signaling System Number 7) signaling protocol to IP, SIGTRAN ensured that the design of
SCTP was general enough for the protocol to be suitable for applications with similar requirements. As a result of this design decision, a number of extensions to SCTP have been
proposed [33, 37]. One of the most recent is PR-SCTP [33].
PR-SCTP is SCTP extended with a framework for implementing partially reliable transport services. It entails adding two new items to SCTP: a new parameter and a new type of
chunk. The parameter introduced by PR-SCTP is used during the initialization of an SCTP
session by both sides to signal support for PR-SCTP to the other side. A PR-SCTP session
can only be initiated if both sides use PR-SCTP. If either the sending or the receiving side is
not using PR-SCTP, the other side has the option of ending the session or starting an SCTP
session instead. The new type of chunk introduced by PR-SCTP, Forward Cumulative TSN
(FCTSN), is used by the sender to inform the receiver that it should consider all chunks
having a TSN less than a certain value as having been received.
At present, only one service based on PR-SCTP has been proposed: timed reliability. When an application uses this service, it assigns deadlines to its messages. In contrast to
HPF, the deadlines are assigned continuously during the lifetime of a flow, and not only at
the flow inception. These deadlines are translated into chunk lifetimes by PR-SCTP. Before
a packet is transmitted or retransmitted, the PR-SCTP sender evaluates the lifetime of the
packet. When the lifetime of a packet has expired, it is discarded, and the sender informs the
receiver of this by sending it an FCTSN.
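The timed-reliability behaviour at the sender can be sketched as follows; the data layout is hypothetical, but the rule follows the description above: expired chunks are skipped and the Forward Cumulative TSN is advanced past them in TSN order:

```python
def advance_fwd_cum_tsn(unacked_chunks, now, cum_tsn):
    """Sketch of the timed-reliability rule described above: before
    (re)transmission, chunks whose lifetime has expired are dropped and the
    Forward Cumulative TSN is advanced past them.

    unacked_chunks -- dict mapping TSN -> expiry time (hypothetical layout)
    cum_tsn        -- highest TSN currently covered by cumulative acks
    """
    fwd = cum_tsn
    # TSNs can only be advanced contiguously; stop at the first chunk that
    # is out of order or has not yet expired.
    for tsn in sorted(unacked_chunks):
        if tsn == fwd + 1 and unacked_chunks[tsn] < now:
            fwd = tsn
        else:
            break
    return fwd  # if fwd > cum_tsn, an FCTSN chunk announcing fwd is sent
```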
4.6 PRTP-ECN
PRTP-ECN is an extension to TCP suggested by Grinnemo et al. [17, 18]. It aims to make
TCP better suited to applications with soft real-time constraints (e.g., best effort multimedia
applications). In particular, PRTP-ECN is an attempt to make this traditionally congestion-insensitive class of applications aware of congestion. An attractive feature of PRTP-ECN is
that it only entails modifying the retransmission decision component of TCP on the receiver
side; the sender side remains unaffected.
PRTP-ECN offers a flow-based reliability service. As long as no packets are lost in
a flow, PRTP-ECN behaves in the same way as standard TCP. When an out-of-sequence
packet is received, however, this is taken as an indication of packet loss, and the modified
retransmission decision component is invoked. This component decides, on the basis of the
success rate of all previous packets in a flow, whether to acknowledge all packets up to and
including the out-of-sequence packet or to do the same as standard TCP, i.e., acknowledge
the last successfully received in-sequence packet and wait for a retransmission.
The success rate of previous packets, called the current reliability level (crl), is calculated
as an exponentially weighted moving average over all packets up to but not including the out-of-sequence packet. It is calculated as

    crl(n) = ( Σ_{k=1}^{n} af^(n−k) p_k b_k ) / ( Σ_{k=1}^{n} af^(n−k) b_k ),    (3)

where n is the sequence number of the packet preceding the out-of-sequence packet, af is the weight or aging factor, and b_k denotes the number of bytes contained in packet k. The variable p_k is given a binary value: if the kth packet was successfully received, then p_k = 1, otherwise p_k = 0.
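Equation (3) translates directly into code; the list-of-pairs representation of the packet history below is an illustrative choice:

```python
def crl(af, packets):
    """Current reliability level as defined in equation (3): an exponentially
    weighted moving average over all packets up to, but not including, the
    out-of-sequence packet.

    af      -- aging factor (the weight)
    packets -- list of (p_k, b_k) pairs for k = 1..n: reception flag
               (1 or 0) and packet size in bytes
    """
    n = len(packets)
    num = sum(af ** (n - k) * p * b for k, (p, b) in enumerate(packets, start=1))
    den = sum(af ** (n - k) * b for k, (p, b) in enumerate(packets, start=1))
    return num / den
```

With af < 1, a recent loss depresses crl(n) more than an old one, which is what makes newly arrived packets weigh more heavily in the reliability estimate.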
An application communicates its lower bound reliability level through the aging factor
and a second parameter called the required reliability level (rrl). This parameter functions
as a target value. As long as crl(n) ≥ rrl, lost packets are acknowledged. However, if an out-of-sequence packet is received at a time when crl(n) is below rrl, the last in-sequence packet is acknowledged, forcing the sender to retransmit the lost packet. It should be noted
that an application is permitted to alter either af or rrl at any time during the lifetime of a
flow, i.e., PRTP-ECN is an example of an in-flow adaptive protocol.
Although PRTP-ECN is a flow-based protocol, it has some features in common with
message group-based protocols. In particular, unlike most flow-based protocols, PRTP-ECN does not consider all messages equally important. Instead, the aging
factor makes PRTP-ECN consider newly arrived messages to be more important than messages that arrived some time ago. Furthermore, to some degree, PRTP-ECN provides for an
application to control the maximum tolerable message burst loss length: more or less the
same reliability level is given by several different combinations of af and rrl. However,
the combinations differ from each other in that they translate to different upper limits on
message burst loss.
In order to decouple the error control and congestion control schemes of TCP, PRTP-ECN
uses the transport level flags of ECN [30] (Explicit Congestion Notification). In particular,
when lost packets are acknowledged, PRTP-ECN signals congestion to the sender side by
setting the explicit congestion notification flag in the acknowledgement packet. As a result,
when the sender receives an acknowledgement of a lost packet, it will act as though congestion has occurred, e.g., it will reduce its congestion window. However, it will not re-send
the lost packet.
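The receiver-side decision can be summarized in a few lines of Python (a simplified sketch; the Ack type and function names are our own, not from any PRTP-ECN implementation):

```python
from dataclasses import dataclass

@dataclass
class Ack:
    ack_number: int   # highest sequence number being acknowledged
    ece_flag: bool    # ECN-Echo flag: signals congestion to the sender

def on_out_of_sequence(crl_n, rrl, last_in_seq, out_of_seq):
    """PRTP-ECN's choice when an out-of-sequence packet reveals a loss.
    If the current reliability level is still at or above rrl, the hole
    is accepted: everything up to and including the out-of-sequence
    packet is acknowledged, with the ECN-Echo flag set so the sender
    reduces its congestion window without retransmitting. Otherwise the
    last in-sequence packet is acknowledged, forcing a retransmission."""
    if crl_n >= rrl:
        return Ack(ack_number=out_of_seq, ece_flag=True)
    return Ack(ack_number=last_in_seq, ece_flag=False)
```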
Concluding Remarks
This paper presents a taxonomy for retransmission-based, partially reliable transport protocols. The taxonomy comprises two classification schemes. The first classification scheme
classifies protocols with respect to the reliability service they offer. In this scheme, protocols are classified along three dimensions: specification of reliability level, granularity and
adaptiveness. The second classification scheme classifies protocols with respect to their retransmission decision component. It comprises two dimensions: location and decision base.
The paper also shows how existing protocols are classified according to this taxonomy and
gives a survey of a subset of the classified protocols. The surveyed subset of protocols is
selected such that it covers the majority of reliability services and error control schemes in
the taxonomy.
The taxonomy suggests that the majority of retransmission-based, partially reliable transport protocols use error control schemes that are simply variations of a relatively small set
of core principles. Specifically, most protocols make their retransmission decision on the
basis of one or several of the following core principles: an estimate of the average packet
loss rate, an absolute deadline, an upper bound on the number of retransmissions, priorities, or reliability classes. Notably, none of the core principles of the error control schemes explicitly involves delay jitter, a performance parameter that, for many multimedia
applications, is at least as important as the delay itself. Instead, delay jitter is almost always
indirectly controlled through buffers at the receiver side. Furthermore, the taxonomy and
survey suggest that, in a large number of protocols, the service interface is closely coupled
to the implementation of the error control scheme; in fact, many times the service interface
and implementation are intertwined. Taken together, these observations suggest that future
work on this class of protocols should consider including delay jitter as a parameter in the
error control scheme and strive for more generic and application-oriented service interfaces
that are decoupled from the implementation.
In summary, this paper presents a taxonomy for retransmission-based, partially reliable
transport protocols that gives a unified terminology and a framework for comparison and
evaluation of this class of protocols. In addition, we believe that the insight provided by the
taxonomy and survey in this paper can be used to guide future research in this area.
References
[1] P. D. Amer, C. Chassot, T. Connolly, and M. Diaz. Partial order quality of service to
support multimedia connections: Reliable channels. In 2nd High Performance Distributed Computing Conference, Spokane, Washington, USA, July 1993.
[2] P. D. Amer, C. Chassot, T. Connolly, and M. Diaz. Partial order quality of service
to support multimedia connections: Unreliable channels. In International Networking
Conference (INET), San Francisco, California, USA, August 1993.
[3] P. D. Amer, P. T. Conrad, E. Golden, S. Iren, and A. Caro. Partially-ordered, partially-reliable transport service for multimedia applications. In Advanced Telecommunications/Information Distribution Research Program (ATIRP) Conference, pages 215–220, College Park, Maryland, USA, January 1997.
[4] M. J. Andrews. XUDP: A real-time multimedia networking protocol. Bachelor thesis,
Worcester Polytechnic Institute, March 1997.
[5] G. Barberis and D. Pazzaglia. Analysis and design of a packet-voice receiver. IEEE
Transactions on Communications, 28(2):152–156, February 1981.
[6] V. Bhargava. Forward error correction schemes for digital communications. IEEE
Communications Magazine, 21:11–19, January 1983.
[7] R. Braden, D. Clark, and S. Shenker. Integrated services in the internet architecture.
RFC 1633, IETF, June 1994.
[8] Z. Chen, S-M. Tan, R. H. Campbell, and Y. Li. Real time video and audio in the
world wide web. In 4th International World Wide Web Conference (WWW), Boston,
Massachusetts, USA, December 1995.
[9] D. Clark and D. Tennenhouse. Architectural considerations for a new generation of protocols. ACM Computer Communication Review (SIGCOMM), pages 200–208, September 1990.
[10] P. T. Conrad, E. Golden, P. D. Amer, and R. Marasli. A multimedia document retrieval
system using partially-ordered/partially-reliable transport service. In Multimedia Computing and Networking, San Jose, California, USA, January 1996.
[24] B. Mukherjee and T. Brecht. Time-lined TCP for the TCP-friendly delivery of streaming media. In IEEE International Conference on Network Protocols (ICNP), pages 165–176, Osaka, Japan, November 2000.
[25] C. Papadopoulos. Error Control for Continuous Media and Large Scale Multicast
Applications. PhD thesis, Washington University, August 1999.
[26] C. Papadopoulos and G. Parulkar. Retransmission-based error control for continuous
media applications. In 6th International Workshop on Network and Operating System
Support for Digital Audio and Video (NOSSDAV), pages 5–12, Zushi, Japan, April
1996.
[27] M. Piecuch, K. French, G. Oprica, and M. Claypool. A selective retransmission protocol for multimedia on the Internet. In SPIE Multimedia Systems and Applications,
Boston, Massachusetts, USA, November 2000.
[28] J. Postel. User datagram protocol. RFC 768, IETF, August 1980.
[29] J. Postel. Transmission control protocol. RFC 793, IETF, September 1981.
[30] K. Ramakrishnan and S. Floyd. A proposal to add explicit congestion notification
(ECN) to IP. RFC 2481, IETF, January 1999.
[31] H. Schulzrinne. RTP: A transport protocol for real-time applications. RFC 1889, IETF,
January 1996.
[32] B. C. Smith. Cyclic-UDP: A priority-driven best effort protocol. Unpublished, May
1994.
[33] R. Stewart, M. Ramalho, Q. Xie, M. Tuexen, and P. T. Conrad. SCTP partial reliability
extension. Internet draft, IETF, May 2002. Work in Progress.
[34] R. Stewart, Q. Xie, K. Morneault, C. Sharp, H. Schwarzbauer, T. Taylor, I. Rytina,
M. Kalla, L. Zhang, and V. Paxson. Stream control transmission protocol. RFC 2960,
IETF, October 2000.
[35] J. Stone and C. Partridge. When the CRC and TCP checksums disagree. In ACM
Computer Communication Review (SIGCOMM), pages 309–319, Stockholm, Sweden,
August 2000.
[36] W. Strayer, B. Dempsey, and A. Weaver. XTP: The Xpress Transfer Protocol. Addison-Wesley Publishing, July 1992.
[37] Q. Xie, R. Stewart, C. Sharp, and I. Rytina. SCTP unreliable data mode extension.
Internet draft, IETF, April 2001. Work in Progress.
Paper II
1 Introduction
Traditionally, the Internet has been used by two kinds of applications: applications that require a completely reliable service, such as file transfers, remote login, and electronic mail,
and applications for which a best effort service suffices, such as network management and
name services. Over the course of the last decade, however, a large number of new applications have become part of the Internet's development and have started to erode this traditional
picture. In particular, we have experienced a growing interest in distributed applications with
soft real-time requirements, e.g., best effort multimedia applications.
While the two standard transport protocols, TCP [26] and UDP [25], successfully provide
transport services to traditional Internet applications, they fail to provide adequate transport
services to applications with soft real-time requirements. In particular, TCP provides a completely reliable transport service and gives priority to reliability over timeliness. While soft real-time applications commonly require timely delivery, they can do with a less than completely reliable transport service. In contrast, UDP provides a best effort service and makes
no attempt to recover from packet loss. Nevertheless, despite the fact that soft real-time
applications often tolerate packet loss, the amount that is tolerated is limited.
To better meet the needs of soft real-time applications in terms of timely delivery, a
new class of transport protocols has been proposed: retransmission-based, partially reliable
transport protocols. These protocols exploit the fact that many soft real-time applications
do not require a completely reliable transport service and therefore trade some reliability for
improved throughput and jitter.
Early work on retransmission-based, partially reliable transport protocols was done by
Dempsey [9], and Papadopoulos and Parulkar [24]. Independently of each other, they demonstrated the feasibility of using a retransmission-based partially reliable transport protocol for
multimedia communication. Further extensive work on retransmission-based partially reliable transport protocols was done by Diaz et al. [10] at LAAS-CNRS and by Amer et
al. [2] at the University of Delaware. Their work resulted in the proposal of the POC (Partial Order Connection) protocol, which has also been suggested as an extension to TCP [7].
Examples of more recent work on retransmission-based partially reliable transport protocols
are TLTCP [21] (Time-Lined TCP) and PR-SCTP [28] (Partially Reliable extension to the
Stream Control Transmission Protocol).
This paper presents PRTP, an extension for partial reliability to TCP. PRTP differs from
many other proposed extensions to TCP, such as POC, in that it does not involve any elaborate changes to standard TCP. In fact, PRTP only involves a change in the retransmission
decision scheme of standard TCP: lost packets are acknowledged provided the cumulative
packet reception success rate is always kept above the minimum guaranteed reliability level
communicated to PRTP by the application. Another attractive feature of PRTP is that it
only has to be implemented on the receiver side. Neither the sender nor any intermediary
network equipment, e.g., routers, need be aware of PRTP. Thus, PRTP provides for gradual
deployment.
This paper also presents a simulation-based performance evaluation of PRTP. The performance evaluation comprised two subactivities. Our simulation model of PRTP was first validated against a prototype of PRTP implemented in Linux.
Second, we made a performance analysis of PRTP. The performance analysis involved studying the stationary and the transient behavior of PRTP, i.e., the behavior of PRTP in long- as
well as short-lived connections.
The paper is organized as follows. Section 2 elaborates on the design of PRTP. Sections 3
and 4 discuss the implementation and validation of the PRTP simulation model. The performance analysis of PRTP is presented in Sections 5 and 6. Section 5 considers the stationary
performance of PRTP, while Section 6 considers its transient behavior. Section 7 gives a
brief summary and some concluding remarks.
2 Protocol Design
As mentioned above, PRTP is an extension for partial reliability to TCP that aims at making TCP better suited for applications with soft real-time constraints, e.g., best-effort multimedia applications. In particular, the main intent in PRTP is to trade some of the reliability offered by TCP for improved throughput and jitter.

[Figure 1: Timeline illustrating the operation of PRTP. The sender side runs standard TCP; the receiver side runs PRTP. When an out-of-sequence packet reveals a packet loss (1), PRTP either acknowledges all packets up to and including the out-of-sequence packet, provided crl ≥ rrl (2), or acknowledges the last in-sequence packet, forcing a retransmission (3).]
An attractive feature of PRTP is that it only involves changing the retransmission decision
scheme of standard TCP; the rest of the implementation of standard TCP is left as is. This
makes PRTP very easy to implement and compatible with existing TCP implementations. In
addition, PRTP only needs to be implemented on the receiver side. Neither the sender side
nor any intermediate network equipment such as routers are affected by PRTP.
PRTP allows an application to specify a reliability level between 0% and 100%. The
application is then guaranteed that this reliability level will be maintained until a new reliability level is specified or until the session is terminated. More precisely, the retransmission
decision scheme of PRTP is parameterized. The application atop PRTP explicitly specifies
a minimum acceptable reliability level by setting the parameters of the PRTP retransmission decision scheme. Implicitly, the parameters govern the trade-off between reliability,
throughput, and jitter. By relaxing the reliability, the application can receive less jitter and
better throughput.
The timeline in Figure 1 illustrates how PRTP works. As long as no packets are lost,
PRTP works in the same way as standard TCP. When an out-of-sequence packet is received
(1), however, this is taken as an indication of packet loss, and the PRTP retransmission
decision scheme is invoked. This scheme decides whether the lost packet should be retransmitted on the basis of the rate of successfully received packets at the time of the reception
of the out-of-sequence packet. To be more precise, this scheme calculates an exponentially
weighted moving average over all packets, lost and received, up to but not including the
out-of-sequence packet. This weighted moving average is called the current reliability level,
crl(n), and is defined as
crl(n) = \frac{\sum_{k=1}^{n} af^{\,n-k}\, p_k\, b_k}{\sum_{k=1}^{n} af^{\,n-k}\, b_k}.    (1)
In Equation 1, n is the sequence number of the packet preceding the out-of-sequence packet,
af is the weight or aging factor, and b_k is the number of bytes contained in packet k. The variable p_k is an indicator variable that only takes a value of 1 or 0: if the kth packet was successfully received, then p_k = 1; otherwise p_k = 0.
An application communicates its lower bound reliability level through the aging factor
and a second parameter called the required reliability level. The required reliability level,
rrl, acts as a reference value. As long as crl(n) ≥ rrl, lost packets need not be retransmitted and are therefore acknowledged (2). If an out-of-sequence packet is received and crl(n) is below rrl, PRTP acknowledges the last in-sequence packet, and waits for a retransmission
(3). In the remainder of this text, a PRTP protocol that has been assigned fixed values for af
and rrl is called a PRTP configuration.
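As an illustration, the computation of crl(n) and the resulting retransmission decision can be sketched as follows (a simplified model of Equation 1; the function names and the packet representation are our own, not taken from the PRTP implementation):

```python
def crl(af, packets):
    """Current reliability level (Equation 1): an exponentially weighted
    moving average over all packets up to, but not including, the
    out-of-sequence packet. `packets` holds (p_k, b_k) pairs, where p_k
    is 1 if packet k was successfully received and 0 otherwise, and b_k
    is the number of bytes in packet k."""
    n = len(packets)
    num = sum(af ** (n - k) * p * b for k, (p, b) in enumerate(packets, 1))
    den = sum(af ** (n - k) * b for k, (p, b) in enumerate(packets, 1))
    return num / den

def should_retransmit(af, rrl, packets):
    """A PRTP configuration (fixed af and rrl) asks for a retransmission
    only when crl(n) has dropped below the required reliability level."""
    return crl(af, packets) < rrl
```

For example, with af = 1.0 and one loss among four equal-sized packets, crl(n) is 0.75, so a configuration with rrl = 0.9 would force a retransmission while one with rrl = 0.7 would not.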
3 The PRTP Simulation Model

All simulations of PRTP in this paper were made with version 2.1b5 of the ns-2 [23] network
simulator. In ns-2, PRTP is implemented as an agent derived from the FullTcp class. In
particular, the PRTP agent only modifies the retransmission decision scheme of FullTcp
in the way detailed in Section 2. Besides this modification, the PRTP agent fully supports
the TCP Reno [11] congestion control mechanisms. Most notably, fast retransmit and fast
recovery are supported.
The FullTcp class in ns-2 strives to correctly model the 4.4 BSD implementation of
TCP [20]. However, in version 2.1b5 of ns-2, the FullTcp class still lacks support for some
of the congestion control mechanisms of 4.4 BSD, specifically for selective acknowledgements (SACK) [19] and timestamps [17]. A consequence of this is that our PRTP simulation
model also lacks support for these mechanisms.
4 Validation of the Simulation Model

This section considers the validation of our PRTP simulation model against a prototype of
PRTP implemented in Linux. Subsection 4.1 summarizes the validation setup and methodology, and Subsection 4.2 discusses the results of the validation.
[Figure 2: (a) The experiment testbed: nodes 1, 2, and 3 on a 10 Mbps Ethernet LAN with 0 ms propagation delay; node 3 runs the receiver application, and node 2 runs NIST Net with a uniform error model. (b) The corresponding ns-2 model.]
4.1 Methodology
The validation of our ns-2 simulation model was done using the experiment testbed depicted
in Figure 2(a). Nodes 1, 2, and 3 in Figure 2(a) denote three 233 MHz Pentium II PCs. Nodes
1 and 2 ran unmodified versions of Linux 2.2.14, while node 3 ran our PRTP prototype. All
three PCs were connected to a 10 Mbps Ethernet LAN. Traffic was introduced between nodes 1 and
3 by a constant bitrate traffic generator residing at node 1 that sent bulk data to a PRTP sink
application at node 3. The traffic between nodes 1 and 3 was routed through node 2 on which
NIST Net [22], a network emulation tool, was running. NIST Net was introduced to make it
possible for us to vary the propagation delay and packet loss frequency on the link between
nodes 1 and 3.
Our PRTP prototype [3] is derived from Linux 2.2.14, whose TCP implementation supports both the SACK and the timestamp options. Therefore, in order to mitigate the differences between our PRTP simulation model and our prototype implementation, neither of
these two options was turned on during the validation.
Figure 2(b) illustrates how the experiment testbed in Figure 2(a) was modeled in ns-2.
4.2 Results
The validation of the PRTP simulation model was confined to only one metric: the transfer
time. Figures 3, 4, 5, and 6 show the results of the simulations and experiments in which af was equal to 0.9 (Figures 3 and 4) and 1.0 (Figures 5 and 6). In each graph, the time it took to transfer 5 Mbytes of data is plotted against rrl. The 95% confidence interval for each transfer time is shown.
As follows from Figures 3, 4, 5, and 6, there was a close correspondence between the
transfer times observed in the experiments and the simulations. Still, the results of the experiments and the simulations differed in two important ways. In scenarios where the propagation delay was 125 ms and rrl was below 90%, the simulations tended to predict shorter
transfer times than were measured in the experiments. In the scenarios where rrl was higher
than 90%, the situation was the reverse: the transfer times observed in the simulations were
in most cases longer than those obtained in the experiments.
The first difference between the experiments and the simulations was an effect of our
PRTP prototype being hampered by more spurious timeouts than our simulation model.
This, in turn, seemed to be an effect of the retransmission timeout being calculated more
aggressively in Linux 2.2.14 than in ns-2, and consequently more aggressively in our PRTP
prototype than our simulation model. This also explains why the difference in transfer times
was marginal when the propagation delay was short (50 ms) and became apparent only when
the propagation delay was longer (125 ms). When the propagation delay was short, the time
it took to recover from a retransmission timeout was also very short. Together with the fact
that the number of retransmission timeouts occurring at low values of rrl was very small,
this made the fraction of time spent on recovering from timeouts almost negligible compared
with the total transfer time. However, when the propagation delay was long, each timeout
recovery took a substantial amount of time. Therefore, even though the number of timeouts
was small, the duration of each timeout recovery was long enough to make the total fraction
of the transfer time spent on recovering from timeouts non-negligible.
Figure 3: Transfer time against rrl for an aging factor of 0.9 and a propagation delay of 50 ms.

Figure 4: Transfer time against rrl for an aging factor of 0.9 and a propagation delay of 125 ms.

Figure 5: Transfer time against rrl for an aging factor of 1.0 and a propagation delay of 50 ms.

Figure 6: Transfer time against rrl for an aging factor of 1.0 and a propagation delay of 125 ms.

As mentioned earlier, the implementation of TCP in our PRTP simulation model and prototype differed in some important ways. Despite the fact that we had tried to make them behave as similarly as possible during the validation, some differences remained. In particular,
the TCP implementation of Linux 2.2.14 implemented the TCP NewReno [16] modification
to TCP Reno's fast recovery, which was found to be the reason for the second difference
between the simulations and the experiments.
TCP NewReno made our PRTP prototype more robust toward multiple packet losses
within a single sender window. This difference was of no importance at low rrl values
since PRTP then was quite tolerant to packet loss anyway. However, at rrl values greater
than 90%, PRTP more or less required that every lost packet was retransmitted, i.e., PRTP
worked almost in the same way as TCP. Thus, in those scenarios, our PRTP prototype reacted
to multiple packet losses in a TCP NewReno fashion, while our PRTP simulation model
reacted in the same way as TCP Reno. Consequently, when the packet loss rate was low
(1%), and therefore the chances of experiencing multiple packet losses were marginal, there
was no significant difference between our PRTP prototype and simulation model. At higher
packet loss rates (3% and 5%), however, the number of multiple packet losses was large
enough to make the transfer times obtained with the PRTP prototype significantly shorter
than in the simulations.
5 Stationary Analysis
The stationary analysis of PRTP studied the performance of PRTP in long-lived connections.
Three performance metrics were studied: average interarrival jitter, average throughput, and
average fairness. Average fairness was measured using Jain's fairness index [18]. In particular, the average fairness for n flows, each one acquiring an average bandwidth, b_i, on a given link, was calculated as
\text{Fairness index} \overset{\text{def}}{=} \frac{\left(\sum_{i=1}^{n} b_i\right)^2}{n \sum_{i=1}^{n} b_i^2}.    (2)
A problem with Jain's fairness index is that it essentially considers all protocols with better link utilization than TCP more or less unfair [4]. To address this problem, Jain's fairness
index was complemented with the TCP-friendliness test proposed by Floyd et al. [12]. According to this test, a flow is TCP-friendly provided its arrival rate does not exceed the arrival
rate of a conformant TCP connection under the same circumstances. Specifically, it means
that the following inequality must hold between the arrival rate of a flow, λ, the packet size, s, the minimum round-trip time, RTT, and the experienced packet loss rate, p_loss:

\lambda \le \frac{1.5\,\sqrt{2/3}\; s}{RTT\,\sqrt{p_{loss}}}.    (3)
Compared to Jain's fairness index, the main advantage of the TCP-friendliness test is that it accepts a certain skewness in the bandwidth allocation between competing flows. In particular, a flow is permitted to use more bandwidth than dictated by Jain's fairness index
provided it does not use more bandwidth than the theoretically most aggressive TCP flow
would have in the same situation.
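For concreteness, the two fairness measures can be computed as follows (a straightforward transcription of Equations 2 and 3; the function names are our own):

```python
from math import sqrt

def jain_fairness(bandwidths):
    """Jain's fairness index (Equation 2): equals 1.0 for a perfectly
    even allocation and tends toward 1/n as a single flow grabs all the
    bandwidth."""
    n = len(bandwidths)
    return sum(bandwidths) ** 2 / (n * sum(b * b for b in bandwidths))

def tcp_friendly_limit(packet_size, rtt, p_loss):
    """Upper bound on a TCP-friendly arrival rate (Equation 3):
    1.5 * sqrt(2/3) * s / (RTT * sqrt(p_loss))."""
    return 1.5 * sqrt(2 / 3) * packet_size / (rtt * sqrt(p_loss))
```

Note how the bound in Equation 3 falls with the square root of the packet loss rate: quadrupling p_loss halves the arrival rate a flow may use while still being considered TCP-friendly.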
[Figure 7: The network topology used in the stationary analysis: nodes 1-8 connected through two routers; the links have a bandwidth of 10 Mbps and a propagation delay of 0 ms, and the router buffer size is 25 segments.]
5.2 Results
The graphs in Figures 8, 9, and 10 show how the three performance metrics studied, average
interarrival jitter, average throughput, and average fairness, varied with the protocol used at
node 4 and the router link traffic load, i.e., the two primary factors. In each graph, the sample
mean of the observations obtained for a performance metric in the 40 runs comprising a
simulation was taken as a point estimate for the performance metric in that simulation.
Figure 8 illustrates how the average interarrival jitter of the seven PRTP configurations
varied with the traffic load. The traffic load is on the horizontal axis, and the average interarrival jitter relative to the average interarrival jitter obtained with TCP is on the vertical
axis.
As is evident from Figure 8, the largest reductions in average interarrival jitter with PRTP
compared to TCP were obtained at low traffic loads and with the PRTP configurations that
had the largest fs values. Specifically, both PRTP-14 and PRTP-20 gave a reduction in
average interarrival jitter of more than 150% when the traffic load was 20% (ca. 1% packet loss
rate). This was of course not surprising: when the traffic load was low, i.e., when the packet
loss rate was low, those PRTP configurations tolerating large packet loss rates almost never
had to make retransmissions of lost packets. Consequently, the interarrival jitter experienced
with these PRTP configurations at low traffic loads was primarily caused by variations in the
queueing delays at the two routers, and was therefore very low.
However, even though the largest reductions in average interarrival jitter at low traffic
loads were obtained with the PRTP configurations that had largest packet loss tolerances,
significant reductions were also obtained with PRTP configurations with relatively low fs
values. In particular, Figure 8 shows that, at a traffic load of 20%, PRTP-5 gave a reduction
in average interarrival jitter of 106%, and PRTP-8 gave a reduction in average interarrival
jitter of 137%. In fact, even PRTP-2 gave a non-negligible reduction in average interarrival
jitter when the traffic load was low: at a traffic load of 20%, PRTP-2 gave a reduction in
average interarrival jitter of 59%, and, at a traffic load of 60% (ca. 2% packet loss rate), the
reduction was 29%.
At high traffic loads, the largest reductions in average interarrival jitter were also obtained
with those PRTP configurations tolerating the largest packet loss rates: at a traffic load of
87% (ca. 8% packet loss rate), PRTP-14 gave a reduction in average interarrival jitter of
70%, and PRTP-20 gave a reduction of 94%. However, PRTP-8 and PRTP-11 also gave
substantial reductions in average interarrival jitter at high traffic loads. In fact, as shown in
Figure 8, these two PRTP configurations gave a substantial reduction in average interarrival
jitter for all seven traffic loads studied, from the lowest to the highest traffic load. Most
notably, they both reduced the average interarrival jitter compared to TCP at the two highest
traffic loads, 93% (ca. 14% packet loss rate) and 97% (ca. 20% packet loss rate), by more
than 30%.
Another noteworthy observation is the increase in the reduction of the average interarrival
jitter obtained with PRTP configurations PRTP-8, PRTP-11, PRTP-14, and PRTP-20, as
compared to TCP, when the traffic load increased from 93% to 97%. However, this was
not because these PRTP configurations actually reduced their average interarrival jitter when
the traffic load was increased. Instead, it was caused by a large increase in the average
interarrival jitter for TCP, PRTP-2, PRTP-3, and PRTP-5 when the traffic load increased
from 93% to 97%, i.e., when the packet loss rate reached approximately 20%. This in turn
seemed to be an effect of loss of retransmitted packets at times when the sending window
was only two packets, i.e., at times when the loss of packets unconditionally led to timeouts.
Specifically what happened at those times, was that one of the two packets in the sender
window was dropped, which, apart from leading to a timeout and a retransmission of the
dropped packet, also led to a doubling of the RTO (retransmission timeout). Then, when the
retransmitted packet was also dropped, it took twice the time of the first retransmission until
a second attempt to retransmit the dropped packet was made. PRTP configurations PRTP-8,
PRTP-11, PRTP-14, and PRTP-20 never found themselves in the same predicaments. This
was primarily because they managed to keep their sender windows large enough to enable
packet loss recovery through fast retransmit and fast recovery.
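The penalty described above can be illustrated with a small sketch of TCP's exponential timer backoff (our own simplification; a real TCP implementation also clamps the RTO between minimum and maximum bounds):

```python
def timeout_recovery_time(rto, consecutive_losses):
    """Total time spent waiting when a packet and each of its
    retransmissions are lost `consecutive_losses` times in a row: the
    RTO doubles after every expiry (TCP's exponential backoff)."""
    return sum(rto * 2 ** i for i in range(consecutive_losses))

# With an initial RTO of 1 s, losing both the original packet and its
# first retransmission costs 1 + 2 = 3 s before the second retry.
```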
Figure 9 shows the result of the throughput evaluation. As before, the traffic load is on
the horizontal axis, and, this time, the average throughput relative to the average throughput
of TCP is on the vertical axis.
A comparison between the results of the throughput evaluation and the results of the evaluation of the average interarrival jitter indicates that the improvements in average throughput
obtained with PRTP as compared to TCP were not of the same magnitude as the reductions
in average interarrival jitter. In addition, we observe that, in the same way as for average
interarrival jitter, the largest gains in average throughput compared to TCP, at both low and
high traffic loads, were obtained with the PRTP configurations that had the largest fs values.
For example, at a traffic load of 20%, PRTP-20 gave an improvement in average throughput
of almost 50%, while PRTP-2 gave an improvement in average throughput of only 26%; at
a traffic load of 97%, the difference was even larger: while PRTP-20 gave an improvement
in average throughput of as much as 75%, PRTP-2 gave an improvement of only 11%. The
reason the largest gains in average throughput were obtained with the PRTP configurations
with the largest packet loss tolerances was the same as for average interarrival jitter: the
PRTP configurations that tolerated the largest packet loss rates almost never had to make
retransmissions of lost packets, which not only resulted in reductions in average interarrival jitter but also in improvements in average throughput. Specifically, the retransmissions
performed by the TCP sender always entailed reductions of the TCP sender window. This
meant that at high packet loss rates, the TCP sender window was frequently less than four
packets. As a consequence, the TCP sender was often unable to detect packet losses through
fast retransmit at high packet loss rates. Instead, packet losses at those times were detected
through timeouts. In other words, at high packet loss rates, the retransmissions made by the
TCP sender were often preceded by a timeout period that drastically reduced the throughput
performance.
While the largest improvements in average throughput were indeed obtained with the
PRTP configurations with the largest packet loss tolerances, we observe that all seven PRTP
configurations in fact gave noticeable improvements in average throughput over TCP at low
traffic loads. At a traffic load of 20%, PRTP-2 (as already mentioned) gave an improvement
in average throughput of 26%; PRTP-3 gave an improvement of 31%; PRTP-5 gave an improvement of 39%; PRTP-8 gave an improvement of 44%; PRTP-11 and PRTP-14 gave an
improvement of 47%; and PRTP-20 gave an improvement of 48%.
As follows from Figure 9, when the traffic load increased from 20%, the relative throughput performance of the four PRTP configurations with the lowest packet loss tolerances,
PRTP-2, PRTP-3, PRTP-5, and PRTP-8, immediately started to decrease, while the relative
throughput performance of the three PRTP configurations with the highest packet loss tolerances, PRTP-11, PRTP-14, and PRTP-20, continued to increase. In particular, the relative
throughput performance of PRTP-11 and PRTP-14 continued to increase up to a traffic load
of 67% (ca. 3% packet loss rate), while the relative throughput performance of PRTP-20
peaked at the traffic load of 60%.
The reason for this behavior in the relative throughput performance of the seven PRTP
configurations was of course an effect of their different packet loss tolerances. The four
PRTP configurations with the lowest packet loss tolerances had to begin issuing retransmission requests for lost packets at lower packet loss rates, i.e., lower traffic loads, than the three
with the highest packet loss tolerances. Consequently, the relative throughput performance
of the four PRTP configurations with the lowest packet loss tolerances started to decrease at
much lower traffic loads than the three with the highest packet-loss tolerances.
One might have expected that the maximum relative throughput performance of the seven
PRTP configurations would have occurred closer to their respective fs values, i.e., closer
to their allowable steady-state packet loss frequencies: The number of retransmissions increased with increased traffic load, i.e., increased packet loss rate, for TCP, but should have
been kept low for a particular PRTP configuration as long as the packet loss rate was below fs . For example, one might have expected that the relative throughput performance of
PRTP-3 and PRTP-5 would have peaked at traffic loads of 67% (ca. 3% packet loss rate)
and 80% (ca. 5% packet loss rate) and that the relative throughput performance of PRTP-20
would have increased monotonically all the way up to the traffic load of 97% (ca. 20% packet
loss rate). However, as mentioned in Section 5.1, fs is defined on the basis of a certain ideal
packet loss scenario. Depending on how much the actual packet losses in a simulation differed from this packet loss scenario, the actual packet loss tolerance of a PRTP configuration
differed from fs .
It follows from Figure 9 that the relative throughput performance of all seven PRTP configurations increased when the traffic load increased from 93% to 97%. This was not because
the throughput performance of these PRTP configurations actually suddenly increased but
was instead, in the same way as for average interarrival jitter, an effect of the performance
of TCP deteriorating rapidly when the packet-loss rate approached 20%. In fact, in the same
way as for average interarrival jitter, this was due to TCP experiencing loss of retransmitted packets at times when the sender window was only two packets: the retransmissions
and the doubling of the RTO not only impeded the jitter performance but also impeded the
throughput performance.
Figure 10 shows the result of the evaluation of the average fairness index. Again, traffic
load is displayed on the horizontal axis, and the fairness index is displayed on the vertical
axis.
Since the fairness index, by the way it is defined (see Equation 2), is inversely proportional to the gain in average throughput obtained with PRTP, it is not surprising that the plots
of the fairness indexes for the seven PRTP configurations have shapes roughly opposite to
those of the corresponding plots for average throughput.
As follows from the plots of the fairness indexes for the PRTP configurations in Figure 10, PRTP was not altogether fair: While the reference TCP flow between nodes 1 and 4, except for the simulations with a traffic load of 97%, never acquired more than a 20% larger bandwidth than the competing TCP flow between nodes 2 and 5 (i.e., never had a fairness index less than 0.99), the PRTP configurations with large packet loss tolerances had a fairness index less than 0.9, i.e., acquired more than a 50% larger bandwidth than the TCP flow
between nodes 2 and 5, at a wide range of traffic loads. Most notably, PRTP-20 had an
average fairness index of 0.74 at a traffic load of 60% (ca. 2% packet loss rate), i.e., had a
bandwidth allocation that was four times that of the TCP flow between nodes 2 and 5.
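These values are consistent with Jain's fairness index [18] computed over the two TCP flows. A minimal sketch, assuming that Equation 2 is Jain's index taken over the n = 2 flows (the function name is ours):

```python
def jain_fairness(throughputs):
    """Jain's fairness index: (sum of x_i)^2 / (n * sum of x_i^2).
    Equals 1.0 for a perfectly fair allocation and decreases as the
    allocation becomes more skewed."""
    n = len(throughputs)
    return sum(throughputs) ** 2 / (n * sum(x * x for x in throughputs))

# One flow acquiring 20% more bandwidth than the other:
print(round(jain_fairness([1.2, 1.0]), 2))  # 0.99
# One flow acquiring four times the bandwidth of the other:
print(round(jain_fairness([4.0, 1.0]), 2))  # 0.74
```

The two sample ratios reproduce the 0.99 and 0.74 index values quoted above.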
The fact that PRTP was not as fair as TCP was expected and was a direct consequence of the way PRTP works: PRTP acknowledges not only successfully received packets but also lost packets, provided crl(n) ≥ rrl (see Section 2). In TCP, however, error control and congestion control are intertwined. Thus, in the cases in which PRTP acknowledges lost packets, it temporarily disables the congestion control mechanism of the sender-side TCP.
Although PRTP was not as fair as TCP, it could still, as mentioned in Section 5, have
exhibited a TCP-friendly behavior. However, as follows from Table 1, this was not the case.
Even at low packet loss tolerances, PRTP was TCP-unfriendly, and this became worse as the packet loss tolerance increased.
Protocol    Pass Freq.
TCP         96.6%
PRTP-2      41.6%
PRTP-3      36.6%
PRTP-5      26.8%
PRTP-8      14.9%
PRTP-11      5.8%
PRTP-14      1.0%
PRTP-20      0.0%

Table 1: TCP-friendliness test pass frequencies.
6 Transient Analysis
The previous section evaluated the performance of PRTP for long-lived connections. However, at present, short-lived connections in the form of Web traffic constitute by far the
largest fraction of the total Internet traffic [6]. To this end, we decided to complement our
study of the stationary behavior of PRTP with an analysis of its transient performance. Subsection 6.1 describes the simulation setup and methodology of the transient analysis, and
Subsection 6.2 discusses the results of the simulation experiment.
[Figure: Simulation setup. (a) Web browsing scenario: a Web client exchanges HTTP requests and responses (HTML pages with JPEG images) with a Web server. (b) Network model: nodes 1 and 2 connected by a link with bandwidth b kbps and propagation delay d ms.]
The Web browsing scenario was approximated by a file transfer scenario between an FTP client at node 1 and an FTP server at node 2; the
FTP server was modeled by an FTP application (Application/FTP) running atop a TCP
agent (FullTcp); and the FTP client was modeled by a TCP (FullTcp) or PRTP agent
running in LISTEN mode (i.e., worked as a sink and discarded all received packets). More
precisely, the FTP client was modeled by a TCP agent in the reference simulations and by a
PRTP agent in the comparative simulations.
In all simulations, the TCP and PRTP agents used a maximum segment size of 1460
bytes. Furthermore, the PRTP agent was configured in all simulations with an aging factor,
af , of 1.0 and a required reliability level, rrl, of 85%, which approximately translated to an
allowable steady-state packet loss frequency of 15%.
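The translation from rrl to an allowable packet loss frequency can be seen directly from the definition of the current reliability level (a sketch assuming the byte-weighted crl(n) of Section 2 and equally sized packets): with af = 1.0 all weights equal one, so

```latex
crl(n) = \frac{\sum_{k=1}^{n} p_k b_k}{\sum_{k=1}^{n} b_k}
       = \frac{\text{bytes received}}{\text{bytes sent}},
\qquad
crl(n) \ge rrl = 0.85
\;\Longleftrightarrow\;
\frac{\text{bytes lost}}{\text{bytes sent}} \le 0.15 .
```

Hence the quoted 15% allowable steady-state packet loss frequency.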
Simulations were run for seven different file sizes: 5 kB, 8 kB, 12 kB, 20 kB, 35 kB,
60 kB, and 1 MB; the 1 MB file was included to enable straightforward comparisons between
the performance of PRTP for long- and short-lived connections. Furthermore, simulations
were performed for three types of Internet connections: fixed, modem, and GSM.
The three types of Internet connections simulated were modeled using different bandwidths and propagation delays on the link between nodes 1 and 2. The link configurations shown in Table 2 were used. The link configurations for the fixed Internet connection were intended to model three typical LAN connections, the link configurations for the modem connection were intended to model a 56 kbps modem in two common Internet access scenarios, and the link configuration for the GSM connection was intended to model a non-transparent GSM data connection.
Internet connection   Bandwidth, b (kbps)
Fixed                 400, 150, 60
Modem                 33.6, 48
GSM                   7.68

Table 2: Link configurations.
6.2 Results
As mentioned, the primary objective of the transient analysis was to evaluate the performance
of PRTP compared to TCP in a typical Web browsing scenario for three types of Internet
connections: fixed, modem, and GSM. In contrast to the stationary analysis, the transient
analysis only evaluated the throughput performance of PRTP.
Figures 12 through 16 show the result of the transient analysis. The graphs in Figures 12,
14, and 16(a) show the actual throughputs obtained with TCP and PRTP and Figures 13,
15, and 16(b) show the throughputs of PRTP relative to TCP expressed in percent of the
throughputs of TCP. To better appreciate how the relative throughput of PRTP varied with
increased packet loss rate (i.e., increasing loss profile number) and increased file size for the
three studied types of Internet connections, the markers in the relative throughput graphs are
connected with lines.
Let us first consider the results of the simulations of the fixed Internet connection. The
graphs in Figures 12 and 13 show that the trend was that the relative throughput performance
of PRTP increased with increased packet loss rate and increased file size. The largest gains
in throughput obtained with PRTP compared to TCP were obtained when the packet loss rate
was 20% (i.e., loss profile #6) and the file was at least 60 kB.
On the other hand, Figures 12 and 13 also show that significant improvements in throughput were already obtained when the packet loss rate was as moderate as 10% and the file was
no larger than 35 kB. Specifically, when the packet loss rate was 10% (i.e., loss profile #5)
and the file was 35 kB, the relative throughput of PRTP was 245% in the simulation of a
400 kbps fixed Internet connection; 184% in the simulation of a 150 kbps fixed Internet connection; and 229% in the simulation of a 60 kbps fixed Internet connection.
Similar results were obtained for the simulations of the modem (see Figures 14 and 15)
and GSM connections (see Figure 16). The trend in the simulations of these two types of
Internet connections was also that the relative throughput performance of PRTP increased
with increased packet loss rate and increased file size. However, for these two types of
Internet connections, significant improvements in throughput were also obtained at moderate
packet loss rates and with relatively small files. When the packet loss rate was 10% and the
file was 20 kB, the relative throughput of PRTP was 216% in the simulation of a 33.6 kbps
modem connection; 197% in the simulation of a 48 kbps modem connection; and 227% in
the simulation of a GSM connection.
As follows from the relative throughput graphs in Figures 13, 15, and 16(b), the relative
throughput performance of PRTP exhibited large fluctuations between different loss profiles
and file sizes. The trend of the throughput performance of PRTP increasing with increased packet loss rate and increased file size was comparatively weak.
The large fluctuations in the relative throughput performance of PRTP were primarily an effect of our use of a deterministic error model instead of a stochastic one: in some simulations, a certain loss profile turned out to represent a particularly favorable packet loss pattern
for PRTP, and sometimes an unfavorable one. For example, the loss profile #2 represented a
particularly favorable packet loss pattern for PRTP when the file size was 12 kB and a fixed
Internet connection was simulated (see Figure 13). In the simulations of the 400 kbps and
150 kbps fixed Internet connections the relative throughput of PRTP increased from 100% to
175% when the file size increased from 8 kB to 12 kB, and in the simulation of the 60 kbps fixed Internet connection a similar increase was observed.
One cause of degraded throughput in some simulations was that the ACK of a retransmitted packet was lost at a time when the sender window was only one packet: the first packet in a sender window of three packets was lost. This led to a
timeout and a retransmission of the lost packet. Unfortunately, the ACK of the retransmitted
packet was also lost and yet another timeout and retransmission occurred. This time, the
timeout was twice as long and significantly impeded the throughput performance of PRTP.
To sum up, the transient analysis suggests that PRTP can give substantial improvements
in throughput for short-lived Internet connections as diverse as fixed, modem, and GSM.
However, the results are very preliminary and further simulations are needed to more firmly
establish them.
[Figure 16: GSM connection with bandwidth of 7.68 kbps and propagation delay of 310 ms. (a) Throughput; (b) relative throughput.]
7 Conclusions
This paper presents PRTP, an extension to TCP for partial reliability that aims at making TCP more suitable for multimedia applications. PRTP enables an application to prescribe a minimum guaranteed reliability level between 0% (i.e., a best-effort transport service such as UDP) and 100% (i.e., a completely reliable transport service such as TCP).
The major advantage of PRTP is that it only entails modifying the retransmission decision
scheme of TCP. Specifically, PRTP alters the retransmission decision scheme of TCP in such
a way that retransmissions of lost packets are made only when it is necessary in order to
uphold the minimum required reliability level.
The paper also gives a detailed description of a simulation-based performance evaluation of PRTP. The performance evaluation was made using the ns-2
network simulator and comprised two subactivities. First, our simulation model of PRTP was
validated against our implementation of PRTP in Linux: in short, this validation suggested
that our simulation model quite accurately modeled PRTP. Second, a performance analysis
of PRTP was conducted. This analysis entailed evaluating the stationary and the transient
performance of PRTP compared to TCP.
The stationary analysis indicated that significant reductions in average interarrival jitter
and improvements in average throughput could be obtained with PRTP at both low and high
traffic loads and with PRTP configured to tolerate low as well as high packet loss rates.
However, the stationary analysis also found PRTP to be less fair than TCP and not TCP-friendly.
The transient analysis entailed evaluating the throughput performance of PRTP in a typical Web browsing scenario. Three types of Internet connections were considered: fixed,
modem, and GSM. The results suggested that at packet loss rates as low as 10%, and for
files as small as 35 kB, throughput gains larger than 140% could be obtained with PRTP
irrespective of the Internet connection.
Taken together, the stationary and transient analyses clearly indicate that PRTP can give
significant performance improvements compared to TCP for a wide range of loss tolerant
applications. However, the stationary analysis also indicated that PRTP would most likely
exhibit a TCP-unfriendly behavior and therefore would not share bandwidth in a fair manner
with competing TCP flows. Consequently, we do not recommend using PRTP for wireline
applications. We believe that PRTP can indeed be a good choice for loss tolerant applications
in error prone wireless environments, e.g., GSM with a transparent link layer: error rates on
wireless links are much higher compared to the error rates in fiber and copper links used in
the fixed Internet. Thus, packet loss that is not related to congestion is much more common
and cannot always be compensated for by layer two retransmissions. Trying to retransmit on
layer two could, for example, trigger a TCP retransmission if it takes too much time.
Although the current version of PRTP is neither TCP-friendly nor altogether fair, and therefore not suitable for wireline applications, we intend to change this in future versions of the
protocol. In particular, we believe that PRTP could be made more TCP-friendly and fair in at
least two ways. First, PRTP could be modified so that when a lost packet is acknowledged,
ECN [27] (Explicit Congestion Notification) is used to signal congestion to the sender side
TCP. Second, PRTP could, at the same time it acknowledges a lost packet, also advertise a
smaller receiver window. Furthermore, it would be interesting to see whether the adverse effects that PRTP has on TCP could be reduced by active queueing techniques such as RED [5]
(Random Early Detection) and WFQ [8] (Weighted Fair Queueing).
References
[1] M. Allman, V. Paxson, and W. Stevens. TCP congestion control. RFC 2581, IETF,
April 1999.
[2] P. D. Amer, C. Chassot, T. Connolly, M. Diaz, and P. T. Conrad. Partial order transport
service for multimedia and other applications. ACM/IEEE Transactions on Networking,
2(5), October 1994.
[16] T. Henderson and S. Floyd. The NewReno modification to TCP's fast recovery algorithm. RFC 2582, IETF, April 1999.
[17] V. Jacobson, R. Braden, and D. Borman. TCP extensions for high performance. RFC
1323, IETF, May 1992.
[18] R. Jain, D. Chiu, and W. Hawe. A quantitative measure of fairness and discrimination
for resource allocation in shared computer systems. Technical Report DEC-TR-301,
Digital Equipment Corporation, September 1984.
[19] M. Mathis, J. Mahdavi, S. Floyd, and A. Romanow. TCP selective acknowledgement
options. RFC 2018, IETF, October 1996.
[20] M. K. McKusick, K. Bostic, M. J. Karels, and J. S. Quarterman. The Design and
Implementation of the 4.4 BSD Operating System. Addison-Wesley Publishing, 1996.
[21] B. Mukherjee and T. Brecht. Time-lined TCP for the TCP-friendly delivery of streaming media. In IEEE International Conference on Network Protocols (ICNP), pages 165-176, Osaka, Japan, November 2000.
[22] NIST Net. http://snad.ncsl.nist.gov/itg/nistnet.
[23] The network simulator ns-2. http://www.isi.edu/nsnam/ns.
[24] C. Papadopoulos and G. Parulkar. Retransmission-based error control for continuous media applications. In 6th International Workshop on Network and Operating System Support for Digital Audio and Video (NOSSDAV), pages 5-12, Zushi, Japan, April 1996.
[25] J. Postel. User datagram protocol. RFC 768, IETF, August 1980.
[26] J. Postel. Transmission control protocol. RFC 793, IETF, September 1981.
[27] K. Ramakrishnan and S. Floyd. A proposal to add explicit congestion notification
(ECN) to IP. RFC 2481, IETF, January 1999.
[28] R. Stewart, M. Ramalho, Q. Xie, M. Tuexen, and P. T. Conrad. SCTP partial reliability
extension. Internet draft, IETF, May 2002. Work in Progress.
Paper III
1 Introduction
Distribution of multimedia traffic such as streaming media over the Internet poses a major
challenge to existing transport protocols. Apart from having demands on throughput, many
multimedia applications are sensitive to delays and variations in those delays [27]. In addition, they often have an inherent tolerance for limited data loss [29].
The two prevailing transport protocols in the Internet today, TCP [21] and UDP [20], fail
to meet the QoS requirements of streaming media and other applications with soft real-time
constraints. TCP offers a fully reliable transport service at the cost of increased delay and
reduced throughput. UDP on the other hand introduces virtually no increase in delay or
reduction in throughput but provides no reliability enhancement over IP. In addition, UDP
leaves congestion control to the discretion of the application. If misused, this could impair
the stability of the Internet.
In this paper, we present a novel transport protocol, Partially Reliable Transport Protocol using ECN (PRTP-ECN), which offers a transport service that better complies with the
QoS requirements of applications with soft real-time requirements. PRTP-ECN is a receiver-based, partially reliable transport protocol that is implemented as an extension to TCP and
is able to work within the existing Internet infrastructure. It employs a congestion control
mechanism that largely corresponds to the one used in TCP. A simulation evaluation suggests
that, by trading reliability for latency, PRTP-ECN is able to offer a service with significantly
reduced interarrival jitter and increased throughput and goodput as compared to TCP. In addition, the evaluation implies that PRTP-ECN is TCP-friendly, which may not be the case in
some RTP/UDP solutions.
The paper is organized as follows. Section 2 discusses related work. Section 3 gives
a brief overview of the design principles behind PRTP-ECN and Section 4 describes the
design of the simulation experiment. The results of the simulation experiment are discussed
in Section 5. Finally, in Section 6, we summarize the major findings and indicate further
areas of study.
2 Related Work
3 Overview of PRTP-ECN
As mentioned above, PRTP-ECN is a partially reliable transport protocol. It is implemented as an extension to TCP, and differs from TCP only in the way it handles packet losses. PRTP-ECN need only be employed at the receiver side. An ECN-capable TCP is used at the sender side.
PRTP-ECN lets the QoS requirements imposed by the application govern the retransmission scheme. This is done by allowing the application to specify the parameters in a
retransmission decision algorithm. The parameters let the application directly prescribe an
acceptable packet loss rate and indirectly affect the interarrival jitter, throughput, and goodput. By relaxing the reliability requirement, the application obtains lower interarrival jitter and better throughput and goodput.
PRTP-ECN works in a way identical to TCP as long as no packets are lost. When an
out-of-sequence packet is received, this is taken as an indication of packet loss. PRTP-ECN
must then decide whether the lost data are needed to ensure the required reliability level
imposed by the application. This decision is based on the success rate of previous packets.
In PRTP-ECN, the success rate is measured as an exponentially weighted moving average
over all packets, lost and received, up to but not including the out-of-sequence packet. This
weighted moving average is called the current reliability level, crl(n), and is defined as

    crl(n) = \frac{\sum_{k=1}^{n} af^{n-k} p_k b_k}{\sum_{k=1}^{n} af^{n-k} b_k},    (1)
where n is the sequence number of the packet preceding the out-of-sequence packet, af is
the weight or aging factor, and bk denotes the number of bytes contained in packet k. The
variable pk is an indicator variable: if the kth packet was successfully received, then pk = 1; otherwise pk = 0.
The QoS requirements imposed on PRTP-ECN by the application translate into two parameters in the retransmission scheme: af and rrl. The required reliability level, rrl, is the reference in the feedback control system made up of the data flow between the sender and the receiver, and the flow of acknowledgements in the reverse direction. As long as crl(n) ≥ rrl, dropped packets need not be retransmitted and are therefore acknowledged. If an out-of-sequence packet is received and crl(n) is below rrl, PRTP-ECN acknowledges the last in-sequence packet and waits for a retransmission. In other words, PRTP-ECN does the same thing as TCP in this situation.
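The retransmission decision can be sketched as follows (our own simplified illustration of Equation 1 and the rrl test, not the actual PRTP-ECN agent code):

```python
def crl(received, sizes, af):
    """Current reliability level (Equation 1): a size-weighted,
    exponentially aged average over packets 1..n, where received[k-1]
    tells whether packet k arrived and sizes[k-1] is its size in bytes."""
    n = len(received)
    num = sum(af ** (n - k) * (1 if ok else 0) * b
              for k, (ok, b) in enumerate(zip(received, sizes), start=1))
    den = sum(af ** (n - k) * b
              for k, (ok, b) in enumerate(zip(received, sizes), start=1))
    return num / den

def accept_loss(received, sizes, af, rrl):
    """On an out-of-sequence packet: acknowledge the lost data (no
    retransmission) iff crl(n) >= rrl; otherwise fall back to TCP
    behavior and wait for a retransmission."""
    return crl(received, sizes, af) >= rrl

# With 17 of 20 equally sized packets received, af = 1.0, rrl = 0.85:
# crl = 17/20 = 0.85, so the loss is accepted and acknowledged.
```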
There is, however, a problem in acknowledging lost packets. In TCP, the retransmission scheme and the congestion control scheme are intertwined. An acknowledgement not
only signals the successful reception of one or several packets; it also indicates that there is
no noticeable congestion in the network between the sender and the receiver. PRTP-ECN
decouples these two schemes by using the TCP portions of ECN (Explicit Congestion Notification) [22].
The only requirement imposed on the network by PRTP-ECN is that the TCP implementation on the sender side must be ECN capable. It does not engage intermediary routers. In
the normal case, ECN enables direct notification of congestion instead of indirect notification
via missing packets. It engages both the IP and TCP layers. Upon incipient congestion, a
router sets a flag, the Congestion Experienced bit (CE), in the IP header of arriving packets.
When the receiver of a packet finds that the CE bit has been set, it sets a flag, the ECN-Echo flag, in the TCP header of the subsequent acknowledgement. Upon reception of an
acknowledgement having the ECN-Echo flag set, the sender halves its congestion window
and performs fast recovery. PRTP-ECN does not involve intermediate routers, however, and
correspondingly does not need the IP parts of ECN. It employs the ECN-Echo flag only to
signal congestion. When an out-of-sequence packet is acknowledged, the ECN-Echo flag is
set in the acknowledgement. When the acknowledgement is received, the sender will throttle
its flow but refrain from re-sending any packet.
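The sender-side half of this decoupling can be sketched as follows (a deliberately simplified illustration in our own notation; the real sender is an unmodified ECN-capable TCP):

```python
def on_ack(cwnd, ecn_echo):
    """Simplified reaction of the sender to an acknowledgement from a
    PRTP-ECN receiver. An ACK with the ECN-Echo flag set covers a lost
    packet that the receiver has accepted: the sender halves its
    congestion window (fast-recovery style) but retransmits nothing."""
    if ecn_echo:
        return max(1, cwnd // 2)  # throttle the flow
    return cwnd                   # plain ACK: no congestion signalled

# An ECN-Echo ACK halves a 10-segment window to 5; no packet is re-sent.
```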
The QoS offered by PRTP-ECN as compared to TCP was evaluated by simulation, where
potential improvements in average interarrival jitter, average throughput, and average goodput were examined. We also investigated whether PRTP-ECN connections are TCP-friendly
and fair against competing flows.
[Figure 1: Simulation topology. Sources S1, S2, and S3 connect to router R1, and sinks S4, S5, and S6 connect to router R2, over 10 Mbps, 0 ms links; routers R1 and R2 are connected by a 1.5 Mbps, 50 ms bottleneck link.]
4.1 Implementation
We used version 2.1b5 of the ns-2 network simulator [19] to conduct the simulations described in this paper. The TCP protocol was modeled by the FullTcp agent, while PRTP-ECN was simulated by PRTP, an agent developed by us.
The FullTcp agent is similar to the 4.4 BSD TCP implementation [18, 30]. This means,
among other things, that it uses a congestion control mechanism similar to TCP Reno's [1].
However, SACK [17] is not implemented in FullTcp. The PRTP-ECN agent, PRTP, inherits most of its functionality from the FullTcp agent. Only the retransmission mechanism
differs between FullTcp and PRTP.
Routers R1 and R2 had a single output queue for each attached link and used FCFS
scheduling. Both router buffers had a capacity of 25 segments, i.e., approximately twice the
bandwidth-delay product of the network path. All receivers used a fixed advertised window
size of 20 segments, which enabled each of the senders to fill the bottleneck link.
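The claim that 25 segments is about twice the bandwidth-delay product can be checked with a quick calculation (assuming 1460-byte segments, as elsewhere in this thesis, and an RTT of roughly twice the 50 ms one-way delay of the bottleneck link):

```python
bandwidth_bps = 1.5e6      # R1-R2 bottleneck bandwidth
rtt_s = 2 * 0.050          # round-trip time dominated by the 50 ms link
segment_bytes = 1460       # assumed maximum segment size

bdp_bytes = bandwidth_bps * rtt_s / 8
bdp_segments = bdp_bytes / segment_bytes
print(round(bdp_segments, 1))  # 12.8; a 25-segment buffer is about 2x this
```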
The traffic load was controlled by setting the mean sending rate of the UDP flow to
a fraction of the nominal bandwidth on the R1-R2 link. Tests were run for seven traffic
loads: 20%, 60%, 67%, 80%, 87%, 93%, and 97%. These seven traffic loads corresponded
approximately to packet-loss rates of 1%, 2%, 3%, 5%, 8%, 14%, and 20% in the reference
tests, i.e., the tests in which TCP was used at node S4. Tests were run for eight PRTP-ECN configurations (see Section 4.3), and each test was run 40 times to obtain statistically
significant results.
In all simulations, the UDP flow started at 0 s, while three cases of start times for the FTP
flows were studied. In the first case, the flow between nodes S1 and S4 started at 0 s, and the
flow between nodes S2 and S5 started at 600 ms. In the second case, the situation was the
reverse, i.e., the flow between nodes S1 and S4 started at 600 ms, and the flow between nodes
S2 and S5 started at 0 s. Finally, in the last case, both flows started at 0 s. Each simulation
run lasted 100 s.
The allowable steady-state packet loss frequency, f_{loss}, is defined as

    f_{loss} = \lim_{n \to \infty} \frac{loss(\sigma_n)}{n},    (2)

where \sigma_n denotes the packet sequence comprising all packets sent up to and including the nth packet and loss(\sigma_n) is a function that returns the number of lost packets in \sigma_n. Considering
that packet losses almost always occur in less favorable situations, the allowable steady-state
packet loss frequency may be seen as a rough estimate of the upper bound of the packet loss
frequency tolerated by a particular PRTP-ECN configuration.
We selected seven PRTP-ECN configurations that had allowable steady-state packet loss
frequencies ranging from 2% to 20%. Since our metric, the allowable steady-state packet
loss frequency, did not capture all aspects of a particular PRTP-ECN configuration, care
was taken to ensure that the selection was done in a consistent manner. Of the eligible
configurations for a particular packet loss frequency, we always selected the one with the
largest aging factor. If there were several configurations with the same aging factor, we
Configuration Name   af     rrl
PRTP-2               0.99   0.97
PRTP-3               0.99   0.96
PRTP-5               0.99   0.94
PRTP-8               0.99   0.91
PRTP-11              0.99   0.88
PRTP-14              0.99   0.85
PRTP-20              0.99   0.80

Table 1: The selected PRTP-ECN configurations.
TCP-friendliness A flow is said to be TCP-friendly if its arrival rate does not exceed the
arrival rate of a conformant TCP connection under the same circumstances [13]. In
this simulation study, we make use of the TCP-friendliness test presented by Floyd and
Fall [13]. According to their test, a flow is TCP-friendly if the following inequality
holds for its arrival rate \lambda:

    \lambda \leq \frac{1.5 \sqrt{2/3} \, B}{RTT \sqrt{p_{loss}}},    (4)
Protocol   Jitter (ms)   Throughput (bps)   Goodput (bps)     Fairness Index
TCP        27.5 ± 1.5    579315 ± 17992     578781 ± 17981    0.99 ± 0.003
PRTP-2     18.9 ± 0.8    662325 ± 14829     662082 ± 14832    0.98 ± 0.006
PRTP-3     18.6 ± 0.7    664734 ± 13949     664467 ± 13951    0.98 ± 0.006
PRTP-5     18.0 ± 0.7    669018 ± 14385     668883 ± 14395    0.98 ± 0.006
PRTP-8     17.3 ± 0.6    680175 ± 12561     680046 ± 12566    0.98 ± 0.005
PRTP-11    17.3 ± 0.6    677904 ± 12588     677778 ± 12586    0.98 ± 0.006
PRTP-14    17.0 ± 0.7    683313 ± 14350     683193 ± 14350    0.98 ± 0.006
PRTP-20    17.1 ± 0.6    681975 ± 13047     681855 ± 13047    0.98 ± 0.006

Table 2: Performance metrics for tests where the traffic load was 20%.
Protocol   Jitter (ms)   Throughput (bps)   Goodput (bps)     Fairness Index
TCP        101.8 ± 3.6   237156 ± 5479      235920 ± 5483     1.00 ± 0.002
PRTP-2     72.6 ± 2.8    268659 ± 6627      268248 ± 6630     0.98 ± 0.005
PRTP-3     63.5 ± 2.5    284100 ± 6707      283797 ± 6725     0.98 ± 0.007
PRTP-5     51.8 ± 1.4    309105 ± 5813      308934 ± 5814     0.95 ± 0.010
PRTP-8     50.0 ± 0.8    312003 ± 4111      311871 ± 4108     0.94 ± 0.007
PRTP-11    49.9 ± 1.1    312624 ± 5633      312483 ± 5640     0.94 ± 0.009
PRTP-14    49.8 ± 1.0    312618 ± 5071      312471 ± 5065     0.94 ± 0.008
PRTP-20    49.9 ± 1.0    311787 ± 4778      311652 ± 4783     0.94 ± 0.007

Table 3: Performance metrics for tests where the traffic load was 67%.
where \lambda is the arrival rate of the flow in Bps, B denotes the packet size in bytes, RTT is the minimum round-trip time in seconds, and p_{loss} is the packet loss frequency.
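Numerically, the bound of inequality (4) behaves as follows (a small illustration with example values of our own choosing, using the packet size B, minimum round-trip time RTT, and packet loss frequency p_loss of the Floyd and Fall test):

```python
from math import sqrt

def tcp_friendly_bound(packet_bytes, rtt_s, p_loss):
    """Maximum TCP-friendly arrival rate in Bps according to the Floyd
    and Fall test: 1.5 * sqrt(2/3) * B / (RTT * sqrt(p_loss))."""
    return 1.5 * sqrt(2.0 / 3.0) * packet_bytes / (rtt_s * sqrt(p_loss))

# 1460-byte packets, 100 ms minimum RTT, 2% packet loss frequency:
limit = tcp_friendly_bound(1460, 0.1, 0.02)  # roughly 126 kBps
# Quadrupling the loss frequency halves the admissible rate.
```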
5 Results
In the analysis of the simulation experiment, we made a TCP-friendliness test, and calculated
the average interarrival jitter, the average throughput, and the average goodput for the flow
between nodes S1 and S4. In addition, we calculated the average fairness in each run. We
let the mean, taken over all runs, be an estimate of a performance metric in a test. Of the
three primary factors studied in this experiment, the starting times of the two FTP flows were
found to have marginal impact on the results. For this reason, we focus our discussion on
one of the three cases of starting times: the one in which the FTP flow between the nodes S1
and S4 started 600 ms after the flow between nodes S2 and S5. It should be noted, however,
that the conclusions drawn from these tests also apply to the tests in the other two cases.
To make comparisons easier, the graphs show interarrival jitter, throughput, and goodput
for the PRTP-ECN configurations relative to TCP, i.e., the ratios between the metrics obtained for the PRTP-ECN configurations and the metrics obtained for TCP are plotted. As
a complement to the graphs, Tables 2, 3, and 4 show the estimates of the metrics together
with their 99%, two-sided, confidence interval for a selection of traffic loads.
Protocol   Jitter (ms)     Throughput (bps)   Goodput (bps)    Fairness Index
TCP        737.1 ± 112.3   39741 ± 4521       39093 ± 4406     0.92 ± 0.053
PRTP-2     722.6 ± 77.1    37812 ± 3710       37392 ± 3664     0.95 ± 0.033
PRTP-3     701.4 ± 97.0    38214 ± 3494       37848 ± 3464     0.95 ± 0.030
PRTP-5     582.7 ± 56.1    42669 ± 3434       42393 ± 3420     0.93 ± 0.042
PRTP-8     485.2 ± 36.0    46737 ± 2880       46557 ± 2864     0.91 ± 0.045
PRTP-11    425.2 ± 45.0    48396 ± 4512       48237 ± 4512     0.88 ± 0.046
PRTP-14    329.7 ± 28.2    53304 ± 3138       53184 ± 3131     0.86 ± 0.050
PRTP-20    242.8 ± 11.8    58173 ± 2853       58059 ± 2851     0.83 ± 0.047

Table 4: Performance metrics for tests where the traffic load was 97%.
[Figure 2: Relative interarrival jitter vs. traffic load (percentage of nominal bandwidth on link R1-R2) for the seven PRTP-ECN configurations.]
[Figure 3: Relative throughput vs. traffic load (percentage of nominal bandwidth on link R1-R2) for the seven PRTP-ECN configurations.]
Our evaluation of the throughput and goodput of PRTP-ECN also gave positive results.
As evident in Figures 3 and 4 and the tables, significant improvements in both throughput
and goodput were obtained using PRTP-ECN. For example, an application accepting a 20%
packet loss rate could increase its throughput, as well as its goodput, by as much as 48%.
However, applications that tolerate a packet loss rate of only a few percent may also experience improvements in throughput and goodput of as much as 20%. From the confidence
intervals, it follows that the improvements in throughput and goodput were significant and
that PRTP-ECN could provide less fluctuating throughput and goodput than TCP. A comparison of the throughputs and goodputs further suggest that PRTP-ECN is better than TCP at
utilizing bandwidth. This has not been statistically verified, however.
Recall from Section 4.2 that a traffic load approximately corresponds to a particular
packet loss rate. Taking this into account in analyzing the results, it may be concluded that a
PRTP-ECN configuration had its optimum in relative interarrival jitter, relative throughput,
and relative goodput when the packet loss frequency was almost the same as the allowable
steady-state packet loss frequency. This is a direct consequence of the way we defined the allowable steady-state packet loss frequency. At packet loss frequencies lower than the allowable steady-state packet loss frequency, the gain in performance was limited by the fact that it
was not necessary to make very many retransmissions in the first place. When the packet loss
frequency exceeded the allowable steady-state packet loss frequency, the situation was the
reverse. In these cases, PRTP-ECN had to increase the number of retransmissions in order
to uphold the reliability level, which had a negative impact on interarrival jitter, throughput,
and goodput. However, it should be noted that even in cases in which PRTP-ECN had to
increase the number of retransmissions, it performed far fewer retransmissions than TCP.
TCP-friendliness is a prerequisite for a protocol to be able to be deployed on a large scale
[Figure 4: Relative goodput of the PRTP-ECN configurations (PRTP-2 through PRTP-20) versus traffic load (percentage of nominal bandwidth on link R1-R2).]
[Figure: Fairness index of TCP and the PRTP-ECN configurations versus traffic load (percentage of nominal bandwidth on link R1-R2).]
Paper IV
1 Introduction
A large portion of the emerging multimedia applications use the UDP transport protocol, which, unlike TCP, provides neither flow nor congestion control. Often, as is the case with RealNetworks' RealPlayer [19] and Microsoft's Windows Media Services [13], these multimedia applications are somewhat responsive to network congestion, but not as much as TCP-based applications. Furthermore, they use proprietary algorithms to respond to congestion that are more or less incompatible with the one used by TCP. Consequently, a large-scale deployment of these UDP-based multimedia applications could lead to an unfair bandwidth allocation among competing traffic flows, which in a longer perspective could result in a congestion collapse [9].
In an attempt to broaden the spectrum of applications that can run on top of TCP, making
it possible for some of those best-effort multimedia applications that today use UDP to run on TCP, we have proposed an extension to TCP: PRTP-ECN [8]. PRTP-ECN aims at making TCP better suited for applications with soft real-time constraints, e.g., best-effort multimedia applications, while still being TCP-friendly. The principal idea behind PRTP-ECN is to trade reliability for reduced jitter and improved throughput.
PRTP-ECN is implemented as a partially reliable error recovery mechanism and in that
respect builds on the early work on retransmission-based error recovery schemes conducted
by Dempsey [3, 4] and Papadopoulos and Parulkar [17]. Independently of each other, they
demonstrated the feasibility of using a retransmission-based partially reliable error recovery
mechanism for multimedia communication. Extensive work on partial reliability in connection with multimedia has also been done at LAAS-CNRS [5] and at the University of Delaware [2], work that resulted in particular in POC (Partial Order Connection), a partially ordered and partially reliable transport protocol specifically targeting multimedia applications. More recent proposals for partially reliable multimedia protocols also factor in TCP-friendliness. Examples of this genre of protocols include TLTCP (Time-Lined TCP) [15] and the rate-based protocol proposed by Jacobs et al. [10]. Both these protocols are designed to provide a TCP-friendly delivery of time-sensitive data to applications that are loss tolerant, such as streaming media players. Furthermore, U-SCTP [21], an unreliable extension to SCTP (Stream Control Transmission Protocol) [20], has been proposed. It is able to provide a limited form of congestion-aware, partially reliable transport service. Considering that U-SCTP was proposed by the major designers of SCTP, this further emphasizes the need for a transport protocol that is both TCP-friendly and offers a more flexible transport service than TCP.
PRTP-ECN sets itself apart from many other partially reliable error recovery schemes
that have been proposed in that it reacts to congestion in the same way as standard TCP.
Furthermore, PRTP-ECN is completely compatible with standard TCP and is consequently
able to interwork with existing TCP implementations. In addition, PRTP-ECN only needs
to be implemented on the receiver side. Neither the TCP sender side nor any intermediate
network equipment such as routers, gateways, etc. are affected by PRTP-ECN.
To compare the performance of PRTP-ECN with TCP in terms of average interarrival jitter, average throughput, average goodput, and average link utilization, we conducted a simulation experiment. The primary objective of the simulation was to investigate whether PRTP-ECN performs better than TCP and whether the difference in performance between PRTP-ECN and TCP is statistically significant. Furthermore, we wanted to determine whether
PRTP-ECN exhibits a TCP-friendly behavior and competes fairly with standard TCP flows.
This paper gives a detailed description of this simulation experiment with an emphasis on
the statistical design and analysis of the experiment.
The remainder of this paper is organized as follows. Section 2 gives a brief overview
of the design principles behind PRTP-ECN. In Section 3, we discuss the organization of
the simulation experiment. In particular, we discuss the statistical design and analysis of
the simulation experiment and give a description of the simulation procedure. The results
of the simulation experiment are presented and discussed in Section 4. Finally, Section 5
summarizes the findings and gives some concluding remarks.
[Figure 1: Message sequence for PRTP-ECN. After the three-way handshake and ECN negotiation, the TCP sender sends data packets to the PRTP-ECN receiver. An out-of-sequence packet signals packet loss; if crl ≥ rrl, the receiver responds with an acknowledgement with the ECN-Echo flag set, and the sender takes a congestion action and confirms it in the next data packet.]
2 Overview of PRTP-ECN
The PRTP-ECN extension to TCP only involves changing the retransmission decision algorithm of TCP, i.e., PRTP-ECN retains most of the characteristics and mechanisms of TCP. In
particular, PRTP-ECN does not alter data transfer, flow control, multiplexing, or connection
characteristics of TCP in any way. This enables PRTP-ECN to transparently interwork with
existing TCP implementations. The modifications to TCP imposed by PRTP-ECN have been
isolated to the receiver side. No changes are required at the sender side.
The PRTP-ECN retransmission decision algorithm is parameterized. An application atop
PRTP-ECN explicitly prescribes a minimum acceptable reliability level by setting the parameters of the retransmission algorithm. Implicitly, the parameters govern the trade-off
between reliability, interarrival jitter, throughput, and goodput. By relaxing the reliability,
PRTP-ECN implicitly favors a reduction in interarrival jitter and an increase in throughput
and goodput.
Figure 1 illustrates how PRTP-ECN works. As long as no packets are lost, PRTP-ECN
behaves in the same way as standard TCP. When an out-of-sequence packet is received, it is
92
taken as an indication of packet loss and the retransmission decision algorithm is invoked.
This algorithm decides, on the basis of the success rate of previous packets, whether to
acknowledge all packets up to and including the out-of-sequence packet or to do the same
as standard TCP, i.e., acknowledge the last successfully received in-sequence packet and
wait for a retransmission. In PRTP-ECN, the success rate is measured as an exponentially
weighted moving average over all packets, lost and received, up to but not including the
out-of-sequence packet. This weighted moving average is called the current reliability level,
crl(n), and is defined as
crl(n) = \frac{\sum_{k=1}^{n} af^{\,n-k}\, p_k b_k}{\sum_{k=1}^{n} af^{\,n-k}\, b_k},    (1)

where n is the sequence number of the packet preceding the out-of-sequence packet, af is the weight or aging factor, and b_k denotes the number of bytes contained in packet k. The variable p_k is an indicator variable that only takes a value of 1 or 0: if the kth packet was successfully received, then p_k = 1; otherwise p_k = 0.
An application communicates its lower bound reliability level through the aging factor
and a second parameter called the required reliability level. The required reliability level, rrl ,
acts as a reference value. As long as crl(n) ≥ rrl, dropped packets need not be retransmitted
and are therefore acknowledged. If an out-of-sequence packet is received and crl(n) is below
rrl , PRTP-ECN acknowledges the last in-sequence packet, and waits for a retransmission.
In the remainder of this text, a PRTP-ECN protocol that has been assigned fixed values for
af and rrl is called a PRTP-ECN configuration.
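Equation (1) can be sketched as a direct computation over the packet history. The function below is a minimal illustration of the formula (the names `crl`, `received`, and `sizes` are ours, not PRTP-ECN's, and a real implementation would update the sums incrementally rather than rescanning the history):

```python
def crl(received, sizes, af):
    """Current reliability level per Equation (1).

    received[k-1] is 1 if packet k arrived and 0 if it was lost,
    sizes[k-1] is the number of bytes in packet k, and af is the
    aging factor; packet n is the one preceding the out-of-sequence
    packet that triggered the computation.
    """
    n = len(received)
    num = sum(af ** (n - k) * p_k * b_k
              for k, (p_k, b_k) in enumerate(zip(received, sizes), start=1))
    den = sum(af ** (n - k) * b_k
              for k, b_k in enumerate(sizes, start=1))
    return num / den

# With af = 1 every packet weighs the same, so crl is simply the
# byte-weighted fraction of packets received: 300/400 here.
print(crl([1, 1, 0, 1], [100, 100, 100, 100], af=1.0))  # → 0.75

# With af < 1 recent packets dominate, so the same loss pattern
# scores lower when the loss is the most recent event.
print(crl([0, 1], [100, 100], af=0.5) > crl([1, 0], [100, 100], af=0.5))
```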
Acknowledging lost packets interferes with the congestion control in standard TCP, which
interprets lost packets as a signal of congestion. PRTP-ECN remedies this by employing
Explicit Congestion Notification (ECN) [18]. More specifically, PRTP-ECN uses the TCP
portions of ECN and does not involve intervening routers. Instead, when an out-of-sequence
packet is received by PRTP-ECN, it sets the ECN-Echo flag in the acknowledgement of the
out-of-sequence packet. When an acknowledgement with the ECN-Echo flag set is received,
the sender takes the appropriate congestion measures but does not retransmit any packet.
The sender confirms the receipt of the congestion notification in the next data packet. The
PRTP-ECN session then continues in the same way as a regular TCP session until the next
packet drop.
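The behavior just described can be condensed into a receiver-side sketch. This is a toy model rather than the actual TCP modification: packets are unit-size records, so the sums of Equation (1) reduce to an aged success rate kept incrementally, and all class, field, and method names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class PrtpEcnReceiver:
    """Toy model of the PRTP-ECN retransmission decision."""
    af: float          # aging factor
    rrl: float         # required reliability level
    num: float = 0.0   # aged sum of successes (numerator of Eq. (1))
    den: float = 0.0   # aged sum of weights (denominator of Eq. (1))

    def _record(self, success):
        self.num = self.af * self.num + (1.0 if success else 0.0)
        self.den = self.af * self.den + 1.0

    def on_in_sequence(self, seq):
        self._record(True)
        return {"ack": seq, "ecn_echo": False}

    def on_out_of_sequence(self, seq, lost):
        # 'lost' packets preceding 'seq' were detected missing; crl is
        # computed over the history up to, not including, packet 'seq'.
        for _ in range(lost):
            self._record(False)
        crl = self.num / self.den if self.den else 1.0
        self._record(True)  # 'seq' itself did arrive
        if crl >= self.rrl:
            # Accept the loss: cumulatively acknowledge through 'seq'
            # and signal congestion via the ECN-Echo flag.
            return {"ack": seq, "ecn_echo": True}
        # Below the required reliability level: act like standard TCP,
        # re-ack the last in-sequence packet and await retransmission.
        return {"ack": seq - lost - 1, "ecn_echo": False}
```

For example, with af = 1 and rrl = 0.7, seven delivered packets followed by one loss give crl = 7/8 = 0.875 ≥ rrl, so the gap is acknowledged with ECN-Echo set; with rrl = 0.95, the same history falls below the threshold and the receiver waits for a retransmission.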
3 Simulation Experiment

This section describes the design of the simulation experiment. Two aspects are considered.
First, in Subsection 3.1, the statistical design of the simulation experiment and the techniques used to analyze the design are discussed. Second, in Subsection 3.2, we describe the
simulation model used in the simulation experiment.
The response variables corresponded to the performance metrics: average interarrival jitter, average throughput, average goodput, average link utilization, and average fairness index (further explained in Section 4.4), i.e., all performance metrics studied except TCP-friendliness. In the remainder of this section, we let M denote the set comprising the performance metrics studied excluding TCP-friendliness.
The simulation experiment comprised three factors: one primary factor and two secondary factors. The primary factor, protocol, had eight levels. Apart from TCP, we ran simulations on seven PRTP-ECN configurations. The secondary factors were traffic load and
the relative starting times of competing flows. The first secondary factor, traffic load, could
assume seven levels and the second secondary factor, relative starting times of competing
flows, three levels.
We were primarily interested in the effects of PRTP-ECN. Traffic load was introduced as
a way to indirectly control the packet loss rate. This was necessary since the performance of
PRTP-ECN was by way of its design directly dependent on the packet loss rate. The relative
starting times were included as a factor in the experiment to verify that PRTP-ECN shares
bandwidth fairly with TCP in all situations, irrespective of the bandwidth allocated to PRTP-ECN at the time a TCP flow starts. However, the relative starting times were observed to have only a marginal impact on all five performance metrics studied, including fairness.
A more detailed description of the three factors in the experiment and how they translate
into parameters in the simulation setup follows in Subsection 3.2, where the simulation
procedure is discussed. In the remainder of this section, we let P denote the set of levels of
the primary factor (protocol), T the set of levels of the first secondary factor (traffic load),
and S the set of levels of the other secondary factor (relative starting times of competing
flows).
The underlying effects model of our experiment was, for each of the five performance metrics \phi \in M,

Y^{\phi}_{ijkl} = \mu^{\phi} + \alpha^{\phi}_{i} + \beta^{\phi}_{j} + \gamma^{\phi}_{k} + (\alpha\beta)^{\phi}_{ij} + (\alpha\gamma)^{\phi}_{ik} + (\beta\gamma)^{\phi}_{jk} + (\alpha\beta\gamma)^{\phi}_{ijk} + \epsilon^{\phi}_{ijkl},    (2)

with Y^{\phi}_{ijkl} denoting the (ijkl)th observation of performance metric \phi. Here \mu^{\phi} is the overall mean value of metric \phi, \alpha^{\phi}_{i} is the effect of the ith protocol in P on metric \phi, \beta^{\phi}_{j} is the effect of the jth traffic load in T on metric \phi, and \gamma^{\phi}_{k} is the effect of the kth case of the relative starting times on metric \phi. The remaining terms, except \epsilon^{\phi}_{ijkl}, represent interaction effects between the three factors. Term \epsilon^{\phi}_{ijkl} is a random error component that incorporates all other sources of variability. In the effects model, the treatment and interaction effects are defined as deviations from the mean, i.e., the sums \sum_{i \in P} \alpha^{\phi}_{i}, \ldots, \sum_{k \in S} (\alpha\beta\gamma)^{\phi}_{ijk} are all equal to zero.
The following hypotheses were tested at a significance level of 0.01: H_0: \forall i\,(\alpha^{\phi}_{i} = 0) versus H_1: \exists i\,(\alpha^{\phi}_{i} \neq 0). The hypotheses were tested using a three-factor analysis of variance
(ANOVA) test. This test assumes that the random errors in the effects models are normally
and independently distributed with a mean of zero and a constant variance. We verified the
normality assumption with normal probability plots of the residuals and residual histograms.
The constant variance assumption was checked with plots of residuals versus fitted values
and by the Levene test [14]. Variance stabilizing transformations were employed to mitigate effects of non-homogeneous variances. The independence assumption was not verified.
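The Levene test referred to above admits a compact sketch. This is the classic mean-based variant (the text does not say which centering was used), and the final lookup of W in an F(k−1, N−k) distribution is omitted:

```python
def levene_W(*groups):
    """Mean-based Levene statistic for equality of variances.

    Each group is a list of observations; a large W (compared against
    an F(k-1, N-k) distribution) indicates unequal variances.
    """
    k = len(groups)
    N = sum(len(g) for g in groups)
    # Absolute deviations of each observation from its group mean.
    z = [[abs(y - sum(g) / len(g)) for y in g] for g in groups]
    zbar_i = [sum(zi) / len(zi) for zi in z]
    zbar = sum(sum(zi) for zi in z) / N
    between = sum(len(zi) * (m - zbar) ** 2 for zi, m in zip(z, zbar_i))
    within = sum((v - m) ** 2 for zi, m in zip(z, zbar_i) for v in zi)
    return (N - k) / (k - 1) * between / within

# The second group has a much larger spread, so W is large.
print(round(levene_W([1, 2, 3], [1, 2, 9]), 6))  # → 8.0
```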
[Figure 2: Simulation topology. Nodes S1, S2, and S3 connect to router R1, and nodes S4, S5, and S6 to router R2, over 10 Mbps, 0 ms links; routers R1 and R2 are connected by a 1.5 Mbps, 50 ms bottleneck link.]
By varying the interarrival times of the UDP packets, the UDP flow was transformed into a variable bit-rate flow.
The three factors in the simulation experiment translated to the following parameters:
protocol used at node S4, traffic load of the R1-R2 link, and starting times of the two FTP
flows.
The PRTP-ECN configurations were selected on the basis of their tolerance to packet
loss. More precisely, the PRTP-ECN configurations were selected on the basis of a metric,
the allowable steady-state packet loss frequency [7], which gives an estimate of the packet
loss tolerance in steady state for a given configuration. Seven configurations were selected,
with allowable steady-state packet loss frequencies approximately equal to 2%, 3%, 5%,
8%, 11%, 14%, and 20%. In the remainder of this paper, we denote these configurations,
PRTP-2, PRTP-3, PRTP-5, PRTP-8, PRTP-11, PRTP-14, and PRTP-20, respectively.
The traffic load was controlled by setting the mean sending rate of the UDP flow to a
fraction of the nominal bandwidth on the R1-R2 link. Tests were run for seven traffic loads,
20%, 60%, 67%, 80%, 87%, 93%, and 97%. These seven traffic loads corresponded approximately to the packet-loss rates of 1%, 2%, 3%, 5%, 8%, 14%, and 20% in the reference
tests.
In all simulations, the UDP flow started at 0 s, while three cases of start times for the FTP
flows were studied. In the first case, the flow between nodes S1 and S4 started at 0 s, and the
flow between nodes S2 and S5 started at 600 ms. In the second case, the situation was the
reverse, i.e., the flow between nodes S1 and S4 started at 600 ms, and the flow between nodes
S2 and S5 started at 0 s. Finally, in the last case, both flows started at 0 s. Each simulation
run lasted 100 s.
the transformed effects model had much longer tails while still showing a symmetrical distribution. This observation is confirmed by the normal probability plot of the residuals in
Figure 3(d).
As follows from Figure 3(f), the correspondence between the residuals and the average
interarrival jitter was far less obvious for the transformed effects model. It is also evident
from the plot in Figure 3(f) that the variance was not altogether constant, and this was confirmed by the Levene test (F > 8.265, P < 0.01). However, balanced ANOVA tests (equal
sample sizes in all treatments) are fairly robust toward minor violations of the constant variance assumption. Hence, the outcome of the ANOVA test conducted on the transformed
effects model was not affected by the somewhat fluctuating variance.
The transformed effects model had a coefficient of determination, R2 , equal to 99.3%,
i.e., it accounted for as much as 99.3% of the variability in average interarrival jitter. Considering that our effects model did not take into account the fact that, because of its design,
PRTP-ECNs performance is highly dependent on which specific packets are lost, this result
was very satisfactory and suggested that the model could be regarded as adequate.
It followed from the ANOVA test that the choice of protocol indeed had an impact on the average interarrival jitter. We had a positive interaction between protocol and traffic load,
i.e., the reduction in average interarrival jitter attained with PRTP-ECN was greater when
the traffic load increased. This interaction effect was due mainly to the way PRTP-ECN
reacted to packet loss. Contrary to TCP, PRTP-ECN accepted a limited amount of packet
loss. At low packet loss rates, the retransmissions had only a marginal impact on the average
interarrival jitter, which explains why TCP and PRTP-ECN had almost the same average
interarrival jitter at low traffic loads. However, at higher loss rates the difference in the
number of retransmissions made by TCP and PRTP-ECN became significant, which explains
the improvement in average interarrival jitter attained by using PRTP-ECN at greater traffic
loads.
The Tukey test (α = 0.01) indicated that all PRTP-ECN configurations had less average
interarrival jitter than TCP. This test also suggested that the higher the allowable steady-state
packet loss frequency, the better the jitter characteristics for a PRTP-ECN configuration.
[Figure 3: (c) Normal probability plot of residuals for the untransformed effects model. (e) Residuals vs. fitted values plot for the untransformed effects model. (f) Residuals vs. fitted values plot for the transformed effects model.]
Table 1: 99% confidence intervals for mean average interarrival jitter in msec.

Protocol |    20%     |    60%     |     67%     |     80%     |     87%     |     93%      |      97%
TCP      | 27.5 ± 1.5 | 82.4 ± 0.4 | 101.8 ± 3.6 | 178.5 ± 5.6 | 267.6 ± 8.7 | 460.3 ± 21.9 | 737.1 ± 112.3
PRTP-2   | 18.9 ± 0.8 | 53.2 ± 0.2 |  72.5 ± 2.8 | 145.7 ± 4.6 | 230.0 ± 6.7 | 420.7 ± 15.9 | 722.6 ± 77.1
PRTP-3   | 18.6 ± 0.7 | 46.6 ± 0.2 |  63.5 ± 2.5 | 135.8 ± 5.8 | 221.5 ± 6.2 | 404.0 ± 13.1 | 701.4 ± 97.0
PRTP-5   | 18.0 ± 0.7 | 40.7 ± 0.1 |  51.8 ± 1.4 | 108.4 ± 4.6 | 193.7 ± 6.4 | 361.3 ± 20.1 | 582.7 ± 56.1
PRTP-8   | 17.3 ± 0.6 | 40.3 ± 0.1 |  50.0 ± 0.8 |  87.9 ± 2.5 | 143.9 ± 4.8 | 306.1 ± 15.1 | 485.2 ± 36.0
PRTP-11  | 17.3 ± 0.6 | 40.3 ± 0.1 |  49.9 ± 1.1 |  85.6 ± 1.7 | 122.8 ± 3.6 | 242.9 ± 12.8 | 425.2 ± 45.0
PRTP-14  | 17.0 ± 0.7 | 39.8 ± 0.1 |  49.8 ± 1.0 |  84.9 ± 1.3 | 121.5 ± 2.8 | 191.2 ± 9.4  | 329.7 ± 28.2
PRTP-20  | 17.1 ± 0.6 | 40.0 ± 0.1 |  49.9 ± 1.0 |  85.4 ± 1.3 | 121.3 ± 2.8 | 182.0 ± 5.2  | 242.8 ± 11.8
Protocol |   20%    |   60%   |   67%   |   80%   |   87%   |  93%   |  97%
TCP      | 579 ± 18 | 279 ± 8 | 237 ± 5 | 148 ± 2 | 103 ± 2 | 61 ± 3 | 40 ± 5
PRTP-2   | 662 ± 15 | 337 ± 7 | 269 ± 7 | 158 ± 3 | 108 ± 2 | 62 ± 2 | 38 ± 4
PRTP-3   | 665 ± 14 | 353 ± 8 | 284 ± 7 | 162 ± 4 | 108 ± 2 | 63 ± 2 | 38 ± 3
PRTP-5   | 669 ± 14 | 372 ± 6 | 309 ± 6 | 178 ± 5 | 113 ± 2 | 68 ± 4 | 43 ± 3
PRTP-8   | 680 ± 13 | 373 ± 6 | 312 ± 4 | 190 ± 4 | 128 ± 3 | 71 ± 2 | 47 ± 3
PRTP-11  | 678 ± 13 | 373 ± 7 | 313 ± 6 | 192 ± 3 | 136 ± 3 | 78 ± 3 | 48 ± 5
PRTP-14  | 683 ± 14 | 375 ± 8 | 313 ± 5 | 192 ± 3 | 137 ± 3 | 86 ± 4 | 53 ± 3
PRTP-20  | 682 ± 13 | 374 ± 7 | 312 ± 5 | 191 ± 2 | 137 ± 3 | 86 ± 3 | 58 ± 3
Figure 5: Verification of the transformed effects model for average link utilization.
The ANOVA test indicated that the choice of protocol also had an impact on the average link utilization, and it followed from the Tukey test (α = 0.01) that all PRTP-ECN configurations
utilized the link better than TCP. The ns-2 trace files suggested that the reason PRTP-ECN
was able to better utilize the link was chiefly its robustness to packet loss. As the link
utilization approached the link capacity, the packet loss rate increased, which in the case
of TCP led to timeouts, followed by retransmissions and slow-start. In contrast, as long
as the packet loss rate was well below the allowable steady-state packet loss frequency,
PRTP-ECN did not experience any timeouts, did not have to perform any retransmissions,
and did not have to return to slow-start. Furthermore, the Tukey test indicated that the link
utilization increased with an increase in allowable steady-state packet loss frequency. This
is a direct consequence of what we discussed earlier concerning retransmissions: A higher
allowable packet loss frequency leads to fewer retransmissions and, consequently, to better
link utilization.
In conclusion, we can say that our statistical evaluation of the average link utilization
suggests that PRTP-ECN utilizes the link somewhat better than TCP. Furthermore, the gain
in link utilization seems to increase with an increase in allowable packet loss frequency.
The fairness index measures the equality of bandwidth sharing. If all flows sharing the
same link receive the same amount of bandwidth, the fairness index is 1 and the fairness
of the bandwidth allocation is 100%. As the disparity in bandwidth allocation increases,
fairness decreases and a bandwidth allocation scheme that favors only a selected few flows
has a fairness index near 0.
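The index described here is, presumably, Jain's fairness index from the cited report [12], f(x) = (Σᵢ xᵢ)² / (n · Σᵢ xᵢ²); the throughput numbers below are made up for illustration:

```python
def fairness_index(allocations):
    """Jain's fairness index: 1.0 for equal shares; it approaches
    1/n (not exactly 0) when a single flow takes everything."""
    n = len(allocations)
    total = sum(allocations)
    return total * total / (n * sum(x * x for x in allocations))

print(fairness_index([500, 500, 500]))            # → 1.0
print(round(fairness_index([900, 300, 300]), 3))  # → 0.758
```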
We used ANOVA to test the hypothesis that PRTP-ECN shared bandwidth in a fair manner with TCP. More precisely, we tested the hypothesis that the average fairness indexes
calculated in the PRTP-ECN simulations did not differ significantly from those calculated
in the TCP simulations. As before, the untransformed effects model was found inadequate.
The positive residuals especially exhibited a clear trend toward a decrease with an increased
fairness index. This is a direct consequence of the fairness index having an upper bound of 1.
The spread of residuals has, by definition, an upper bound given by the expression: 1 - fitted
value. As a consequence, the positive residuals must decrease with increased fitted values.
Since the fairness index shows a behavior similar to a proportion, we used an Omega transformation [11]. More specifically, the following transformation was used: Y' = 10 \log_{10}\!\left(\frac{Y}{1-Y}\right). The result of this transformation was quite satisfactory. As follows
from the normal probability plot in Figure 6(a), the distribution of the residuals was very
similar to a normal distribution but differed in that it was slightly skewed (a somewhat longer
upper tail than lower tail) and showed more peaks. As follows from Figure 6(b), the variance
of the residuals was still not constant, an observation confirmed by the Levene test: F > 37.597, P < 0.01. More importantly, however, Figure 6(b) shows that there was no obvious correlation between the residuals and the fitted values.

Figure 6: Verification of the transformed effects model for average fairness index.
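The Omega transformation applied to the fairness index, Y′ = 10 log₁₀(Y/(1−Y)), maps a proportion-like value in (0, 1) onto the whole real line, which is what makes the bounded index amenable to ANOVA; a small sketch with its inverse:

```python
import math

def omega(y):
    """Omega transform (in dB) of a proportion-like y in (0, 1)."""
    return 10.0 * math.log10(y / (1.0 - y))

def omega_inv(w):
    """Inverse transform: recover y from w = omega(y)."""
    r = 10.0 ** (w / 10.0)
    return r / (1.0 + r)

print(omega(0.5))                        # → 0.0
print(round(omega_inv(omega(0.94)), 6))  # → 0.94
```

Note how the transform stretches the crowded region near 1: fairness indexes 0.9 and 0.99 are 0.09 apart untransformed but about 10 dB apart after the transform.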
The ANOVA test suggested that TCP and PRTP-ECN were not equally fair. Furthermore,
the Tukey test (α = 0.01) indicated that TCP was more fair than either of the PRTP-ECN
configurations and that the fairness index was roughly inversely proportional to the allowable
steady-state packet loss frequency. The difference in fairness between TCP and PRTP-ECN
was marginal, however. As shown in Table 3, the fairness index of the PRTP-ECN configurations was above 94% at all traffic loads except the highest one. At the highest traffic load,
those PRTP-ECN configurations having an allowable steady-state packet loss frequency of
11% or more attained a fairness index of less than 90%. It should be noted, however, that
TCP had a fairness index of only 92% at the highest traffic load and that PRTP-ECN always
had a fairness index well above 80%. Furthermore, at the two highest traffic loads, there
were PRTP-ECN configurations that had an even better average fairness index than TCP.
A problem with the fairness index is that it assumes that all flows passing through a particular link are able to utilize bandwidth as it becomes available. This is not always true, however. In particular, it is not true in our simulation experiment. As we noted in Section 4.3,
when the link utilization approached the link capacity, the packet loss rate increased, which
led to timeouts and a reduced TCP sending rate and in turn to reduced link utilization.
We addressed the inability of the fairness index to differentiate between utilization of spare bandwidth and bandwidth conquered from contending flows by also considering TCP-friendliness, a criterion very closely connected to fairness. A flow is said to be TCP-friendly if its arrival rate does not exceed the arrival rate of a conformant TCP connection under the same circumstances [6]. In our simulation experiment, we tested whether PRTP-ECN is TCP-friendly using the TCP-friendliness test proposed by Floyd and Fall [6]. According to their test, a flow is TCP-friendly if the following inequality holds for its arrival
rate:
\lambda \leq \frac{1.5\,\sqrt{2/3}\; B}{RTT\,\sqrt{p_{loss}}},    (3)
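Equation (3) turns into a one-line check. The symbols follow the definitions accompanying the equation (arrival rate λ in Bps, packet size B in bytes, minimum round-trip time RTT in seconds, packet loss frequency p_loss); the numbers in the example are assumptions, not measurements from the experiment:

```python
import math

def tcp_friendly_limit(packet_size_bytes, rtt_s, p_loss):
    """Maximum arrival rate (bytes/s) of a TCP-friendly flow, Eq. (3)."""
    return (1.5 * math.sqrt(2.0 / 3.0) * packet_size_bytes
            / (rtt_s * math.sqrt(p_loss)))

def is_tcp_friendly(rate_bytes_per_s, packet_size_bytes, rtt_s, p_loss):
    return rate_bytes_per_s <= tcp_friendly_limit(
        packet_size_bytes, rtt_s, p_loss)

# Assumed example: 1000-byte packets, 100 ms RTT, 2% loss gives a
# limit of roughly 86.6 kB/s, so an 80 kB/s flow passes the test.
print(is_tcp_friendly(80_000, 1000, 0.1, 0.02))  # → True
```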
Table 3: 99% confidence intervals for mean average fairness index.

Protocol |     20%      |     60%      |     67%      |     80%      |     87%      |     93%      |     97%
TCP      | 0.99 ± 0.003 | 0.99 ± 0.003 | 1.00 ± 0.002 | 1.00 ± 0.001 | 1.00 ± 0.001 | 0.99 ± 0.004 | 0.92 ± 0.050
PRTP-2   | 0.98 ± 0.006 | 0.98 ± 0.006 | 0.99 ± 0.006 | 1.00 ± 0.001 | 1.00 ± 0.002 | 1.00 ± 0.003 | 0.95 ± 0.030
PRTP-3   | 0.98 ± 0.006 | 0.96 ± 0.009 | 0.98 ± 0.007 | 0.99 ± 0.002 | 1.00 ± 0.001 | 1.00 ± 0.003 | 0.95 ± 0.030
PRTP-5   | 0.98 ± 0.006 | 0.94 ± 0.008 | 0.95 ± 0.010 | 0.98 ± 0.008 | 0.99 ± 0.003 | 0.97 ± 0.030 | 0.93 ± 0.040
PRTP-8   | 0.98 ± 0.005 | 0.94 ± 0.008 | 0.94 ± 0.007 | 0.95 ± 0.008 | 0.96 ± 0.009 | 0.98 ± 0.010 | 0.91 ± 0.050
PRTP-11  | 0.98 ± 0.006 | 0.94 ± 0.011 | 0.94 ± 0.009 | 0.95 ± 0.008 | 0.94 ± 0.010 | 0.94 ± 0.030 | 0.88 ± 0.050
PRTP-14  | 0.98 ± 0.006 | 0.94 ± 0.012 | 0.94 ± 0.008 | 0.95 ± 0.008 | 0.94 ± 0.010 | 0.90 ± 0.040 | 0.86 ± 0.050
PRTP-20  | 0.98 ± 0.006 | 0.94 ± 0.009 | 0.94 ± 0.008 | 0.95 ± 0.007 | 0.94 ± 0.010 | 0.90 ± 0.040 | 0.83 ± 0.050
Table 4: Pass frequencies for the TCP-friendliness test.

Protocol | Pass Freq.
TCP      | 99.0%
PRTP-2   | 99.2%
PRTP-3   | 99.6%
PRTP-5   | 99.8%
PRTP-8   | 99.5%
PRTP-11  | 99.5%
PRTP-14  | 98.7%
PRTP-20  | 98.9%
where \lambda is the arrival rate of the flow in Bps, B denotes the packet size in bytes, RTT is the minimum round-trip time in seconds, and p_{loss} is the packet-loss frequency.
The results of the TCP-friendliness tests are given in Table 4. TCP was included in the
TCP-friendliness tests as a means to verify the TCP-friendliness calculations. As shown,
all PRTP-ECN configurations passed the TCP-friendliness test in almost all simulation runs.
The reason we did not have a 100% pass frequency was that the TCP-friendliness test
assumes a fairly constant round trip time [6], which was not completely fulfilled at higher
traffic loads. At higher traffic loads, the largest portion of the round trip time was queueing
delays, which exhibited non-negligible fluctuations. This is of course also the reason why
TCP failed the TCP-friendliness test in some simulation runs, which would have been impossible had all the assumptions of the test been fulfilled. Actually, as follows from Table 4, TCP
failed in more simulation runs than all but two PRTP-ECN configurations, PRTP-14 and
PRTP-20. This again suggests that PRTP-ECN indeed exhibited TCP-friendly behavior in
the simulation experiment.
5 Conclusions
To address the need for a TCP-friendly and fair transport protocol for multimedia applications, an extension to TCP, PRTP-ECN, is proposed. The performance of PRTP-ECN has been compared with TCP in an extensive simulation study designed as a fixed-factor factorial experiment. This paper gives a detailed description of this simulation experiment. The results of the experiment indicate that PRTP-ECN gives a reduced average interarrival jitter and an increased average throughput. At the same time, it exhibits TCP-friendly behavior and is reasonably fair against contending congestion-aware flows. In addition, PRTP-ECN seems to improve the link utilization. We also noted that almost the same results were obtained for average goodput as for average throughput, which suggests that the improvements in average throughput given by PRTP-ECN directly translate into improvements in average goodput.
References
[1] M. Allman, V. Paxson, and W. Stevens. TCP congestion control. RFC 2581, IETF,
April 1999.
[2] P. D. Amer, C. Chassot, T. Connolly, M. Diaz, and P. T. Conrad. Partial order transport
service for multimedia and other applications. ACM/IEEE Transactions on Networking,
2(5), October 1994.
[3] B. Dempsey. Retransmission-Based Error Control for Continuous Media Traffic in
Packet-Switched Networks. PhD thesis, University of Virginia, May 1994.
[4] B. Dempsey, T. Strayer, and A. Weaver. Adaptive error control for multimedia data
transfers. In International Workshop on Advanced Communications and Applications
for High-Speed Networks (IWACA), pages 279-289, Munich, Germany, March 1992.
[5] M. Diaz, A. Lopez, C. Chassot, and P. D. Amer. Partial order connections: A new
concept for high speed and multimedia services and protocols. Annals of Telecommunications, 49(5-6):270-281, 1994.
[6] S. Floyd and K. Fall. Promoting the use of end-to-end congestion control in the Internet.
ACM/IEEE Transactions on Networking, 7(4):458-472, August 1999.
[7] K-J Grinnemo and A. Brunstrom. Enhancing TCP for applications with soft real-time
constraints. In SPIE Multimedia Systems and Applications, pages 18-31, Denver, Colorado, USA, August 2001.
[8] K-J Grinnemo and A. Brunstrom. Evaluation of the QoS offered by PRTP-ECN, a TCP-compliant partially reliable transport protocol. In 9th International Workshop on Quality of Service (IWQoS), pages 217-231, Karlsruhe, Germany, June 2001.
[9] D. Hong, C. Albuquerque, C. Oliveira, and T. Suda. Evaluating the impact of emerging
streaming media applications on TCP/IP performance. IEEE Communications Magazine, 39(4):76-82, April 2001.
[10] S. Jacobs and A. Eleftheriadis. Streaming video using dynamic rate shaping and TCP
congestion control. Journal of Visual Communication and Image Representation, 9(3),
1998.
[11] R. Jain. The Art of Computer Systems Performance Analysis. John Wiley & Sons, Inc.,
April 1991.
[12] R. Jain, D. Chiu, and W. Hawe. A quantitative measure of fairness and discrimination
for resource allocation in shared computer systems. Technical Report DEC-TR-301,
Digital Equipment Corporation, September 1984.
[13] Microsoft Corp. Windows media homepage. http://www.microsoft.com/
windows/mediaplayer/default.asp.
106
[14] D. Montgomery. Design and Analysis of Experiments. John Wiley & Sons Inc., 2000.
[15] B. Mukherjee and T. Brecht. Time-lined TCP for the TCP-friendly delivery of streaming media. In IEEE International Conference on Network Protocols (ICNP), pages
165176, Osaka, Japan, November 2000.
[16] The network simulator ns-2. http://www.isi.edu/nsnam/ns.
[17] C. Papadopoulos and G. Parulkar. Retransmission-based error control for continuous
media applications. In 6th International Workshop on Network and Operating System
Support for Digital Audio and Video (NOSSDAV), pages 512, Zushi, Japan, April
1996.
[18] K. Ramakrishnan and S. Floyd. A proposal to add explicit congestion notification
(ECN) to IP. RFC 2481, IETF, January 1999.
[19] RealNetworks Inc. RealPlayer. http://www.real.com.
[20] R. Stewart, Q. Xie, K. Morneault, C. Sharp, H. Schwarzbauer, T. Taylor, I. Rytina,
M. Kalla, L. Zhang, and V. Paxson. Stream control transmission protocol. RFC 2960,
IETF, October 2000.
[21] Q. Xie, R. Stewart, C. Sharp, and I. Rytina. SCTP unreliable data mode extension.
Internet draft, IETF, April 2001. Work in Progress.
Paper V
1 Introduction
Over the last decade, we have seen an explosive growth of multimedia communications and applications [39]. A tremendous amount of traffic on today's networks consists not only of text but also of images, video, audio, and other continuous data streams. The exponential increase in the number of web servers and image-intensive web pages, combined with applications such as video broadcasting, multimedia conferencing, and distance learning, imposes new requirements on existing transport services. Many of these applications are sensitive to delay and interarrival jitter but can accommodate a limited amount of data loss.
The two predominant transport protocols in the Internet today, UDP [30] and TCP [31], fail to provide an adequate transport service to multimedia applications and other applications with soft real-time constraints. TCP provides an application with a completely reliable transport service, while UDP makes no provisions whatsoever for improving the IP service level. Furthermore, UDP has no built-in congestion control mechanism. It is widely believed [10, 16, 34] that congestion control mechanisms are critical to the stable functioning of the Internet. At present, the vast majority (90-95%) of Internet traffic uses the TCP protocol [18]. However, due to the growing popularity of streaming media applications, and because standard TCP is not suitable for the delivery of time-sensitive data, increasing numbers of applications are being implemented using UDP and other congestion-unaware transport protocols. The widespread use of protocols that do not implement congestion control or avoidance mechanisms may result in a congestive collapse of the Internet [16] similar to the one that occurred in October 1986 [21]. Taken together, this clearly demonstrates the need for further research in transport protocols for soft real-time applications.
This paper describes an extension to TCP called PRTP-ECN. PRTP-ECN aims at making TCP better suited for applications with soft real-time constraints, e.g., best-effort multimedia applications. It accomplishes this by trading reliability for reduced delay and interarrival jitter, and improved throughput. More precisely, PRTP-ECN converts TCP from a completely reliable transport protocol to a partially reliable one, i.e., a transport protocol accepting a prescribed packet loss rate. Implementing PRTP-ECN only involves modifying TCP's retransmission scheme; the rest of TCP is left unaffected. This makes PRTP-ECN compatible with standard TCP implementations and enables gradual deployment. It also follows that PRTP-ECN reacts to congestion in a way similar to TCP, which makes it TCP-friendly and reasonably fair.
The remainder of the paper is organized as follows. Section 2 reviews some related work. This is followed by a description of the PRTP-ECN retransmission scheme in Section 3. Section 4 presents a simple theoretical analysis of the PRTP-ECN reliability scheme, in which the packet loss behavior of PRTP-ECN both at startup and in steady state is studied. Furthermore, we investigate how PRTP-ECN reacts to packet loss bursts. In Section 5, we briefly describe a simulation study in which the interarrival jitter, throughput, and TCP-friendliness of PRTP-ECN were evaluated against standard TCP. The paper is concluded in Section 6 with a summary of our major findings.
2 Related Work
The PECC protocol [13, 14] provides a transport service for applications that desire to trade reliability for latency. Developed as an enhancement to XTP (the Xpress Transport Protocol) [40], PECC modifies the retransmission algorithm of XTP to provide a connection-oriented service under which a lost packet is retransmitted only when doing so will not add delay to the data delivery. Early work on partially reliable transport protocols for multimedia communication was also conducted by Papadopoulos and Parulkar [28], who suggested an ARQ scheme involving, among other things, selective repeat and retransmission expiration.
Extensive work on partial reliability in connection with multimedia transfer has been
done at LAAS-CNRS [15, 36] and at the University of Delaware [4, 5]. Their work resulted
in the proposal of a new transport protocol, POC (Partial Order Connection) [2, 3, 12]. The
POC approach toward realizing a partially reliable service combines a partitioning of the
media stream into objects with the notion of reliability classes. An application designates
individual objects as needing different levels of reliability, i.e., reliability classes are assigned
at the object level. By introducing the object concept and letting applications specify their
reliability requirements on a per-object basis, POC offers a very flexible transport service.
More recent work on partially reliable transport protocols has been carried out by Piecuch et al. [29]. They developed an application-level protocol, SRP (Selective Retransmission Protocol), which works on top of UDP. SRP implements two retransmission decision algorithms: equal loss latency and optimum quality. The retransmission algorithms differ in the way quality is measured. However, both algorithms base their retransmission decisions on the packet loss frequency and the latency.
In addition to general partially reliable protocols, a large number of transport protocols targeting specific applications have been proposed. For example, Rhee [35] presents a retransmission-based error control technique for interactive video, and Li et al. [23] propose a novel scheme for the distribution of MPEG-coded video over a best-effort network.
In contrast to PRTP-ECN, a large number of the proposed partially reliable transport protocols do not implement any congestion control at all, or employ one that does not interact
fairly with TCP. In addition, many of the proposed transport protocols are not able to interwork with the existing Internet infrastructure or require radical changes to extant transport
protocols. PRTP-ECN, on the other hand, entails only modifying the retransmission scheme
of TCP and is able to interact with standard TCP implementations.
A key feature of PRTP-ECN is its TCP-friendly behavior. In view of the negative effects
that TCP-unfriendly flows have on the stability and performance of the Internet, a large body
of work has accumulated describing various congestion control mechanisms for multimedia
applications. A majority of these mechanisms are either window based, as is TCP, or rate
based.
Examples of window-based congestion control schemes are TLTCP (Time-lined TCP) [24, 25] and the binomial congestion control algorithms proposed by Bansal and Balakrishnan [6, 7]. TLTCP works in the same way as TCP except that it imposes deadlines on outstanding data. It sends data in a similar fashion to TCP until the deadline for a section of data has passed. Upon expiration of a deadline, obsolete data are discarded and the sending window is moved forward to enable new data to be sent. The binomial congestion control algorithms suggested by Bansal and Balakrishnan may be seen as a generalization of
The PRTP-ECN extension to TCP only involves changing the retransmission decision algorithm of TCP, i.e., PRTP-ECN retains most of the characteristics and mechanisms of TCP. In particular, PRTP-ECN does not alter the data transfer, flow control, multiplexing, or connection characteristics of TCP in any way. This enables PRTP-ECN to transparently interwork with existing TCP implementations. The modifications to TCP imposed by PRTP-ECN are isolated to the receiver side; no changes are required at the sender.
The PRTP-ECN retransmission decision algorithm is parameterized. The application atop PRTP-ECN explicitly prescribes a minimum acceptable reliability level by setting the parameters of the retransmission algorithm. Implicitly, the parameters govern the trade-off between delay jitter, throughput, and goodput. By relaxing the reliability, the application obtains reduced interarrival jitter and improved throughput and goodput.
Figure 1 illustrates how PRTP-ECN works.

[Figure 1. Message sequence between a standard TCP sender and a PRTP-ECN receiver: after the three-way handshake with ECN negotiation, in-sequence data packets are acknowledged as usual; when a packet loss is detected through an out-of-sequence packet, the receiver evaluates crl ≥ rrl and, if the test succeeds, sends an acknowledgement with the ECN-Echo flag set, upon which the sender takes a congestion action and sets the CWR flag in the next data packet.]

As long as no packets are lost, PRTP-ECN behaves in the same way as standard TCP. When an out-of-sequence packet is received, it is interpreted as an indication of packet loss, and PRTP-ECN computes the current reliability level:

crl(n) = \frac{\sum_{k=0}^{n-1} af^{k} p_{k} b_{k}}{\sum_{k=0}^{n-1} af^{k} b_{k}},    (1)
where n is the number of packets received, not counting the out-of-sequence packet, af is the weight, or aging factor, and b_k denotes the number of bytes contained in packet k. The variable p_k is an indicator variable that takes the value 1 if the kth packet was successfully received and 0 otherwise. (In Eq. 1, packets are numbered starting at the packet preceding the out-of-sequence packet and going backwards to the first packet sent.)
An application communicates its lower-bound reliability level through the aging factor and a second parameter called the required reliability level. The required reliability level, rrl, acts as a reference value. As long as crl(n) ≥ rrl, dropped packets need not be retransmitted and are therefore acknowledged. If an out-of-sequence packet is received and crl(n) is below rrl, PRTP-ECN acknowledges the last in-sequence packet and waits for a retransmission.
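The receiver-side decision described above can be sketched as follows. This is an illustration, not the PRTP-ECN implementation: the function names and the list-based history representation are hypothetical, with crl computed as in Eq. 1 over a history ordered most recent first.

```python
def crl(history, af):
    """Current reliability level: exponentially aged fraction of
    successfully received bytes.  history is ordered most recent first;
    each entry is a (received, num_bytes) pair."""
    num = sum(af ** k * b for k, (received, b) in enumerate(history) if received)
    den = sum(af ** k * b for k, (_, b) in enumerate(history))
    return num / den

def on_out_of_sequence(history, af, rrl):
    """Decision taken when an out-of-sequence packet signals a loss;
    history already includes the lost packet as its most recent entry."""
    if crl(history, af) >= rrl:
        # Loss accepted: acknowledge it, but set the ECN-Echo flag so
        # that the sender still performs its congestion action.
        return "ack-with-ecn-echo"
    # Reliability too low: acknowledge only the last in-sequence
    # packet and wait for a retransmission.
    return "wait-for-retransmission"
```

With af = 0.9 and a history of one lost and three received 1000-byte packets, crl ≈ 0.71; a configuration with rrl = 0.6 accepts the loss, whereas one with rrl = 0.9 waits for a retransmission.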
Acknowledging lost packets interferes with the congestion control in standard TCP, which
interprets lost packets as a signal of congestion. PRTP-ECN remedies this by employing explicit congestion notification [32]. More specifically, PRTP-ECN uses the TCP portions of
ECN (Explicit Congestion Notification).
Normally, ECN involves both the IP and TCP layers. Upon incipient congestion, a router
sets a flag, the Congestion Experienced bit (CE), in the IP header of arriving packets. The
receiving end-node propagates the congestion signal back to the sender by setting the so-called ECN-Echo flag in the acknowledgement of a CE packet. When the sender receives
an ECN-Echo packet, it takes the same actions as are taken when a congestion loss has
occurred, e.g., reduces its congestion window. To provide robustness against dropped acknowledgements with the ECN-Echo flag set, the receiver continues to set the ECN-Echo
flag in acknowledgements until it receives a confirmation of the congestion action at the
sender. When the sender has taken the congestion action, it notifies the receiver by setting a
flag, the Congestion Window Reduced bit (CWR), in the following data packet.
PRTP-ECN does not involve intervening routers. Instead, when an out-of-sequence packet is acknowledged, the ECN-Echo flag is set in the acknowledgement packet. When receiving the acknowledgement, the sender takes the appropriate congestion actions but will not retransmit any packet.
4 Theoretical Analysis

This section theoretically analyzes how the response of PRTP-ECN to packet losses depends on af and rrl by studying the expression for crl in Section 3 (Eq. 1). The analysis makes the simplifying assumption that all packets are of equal size, which reduces the expression for crl(n) to

crl(n) = \frac{\sum_{k=0}^{n-1} af^{k} p_{k}}{\sum_{k=0}^{n-1} af^{k}}.    (2)
In the remainder of this text, a PRTP-ECN protocol that has been assigned fixed values for
af and rrl is called a PRTP-ECN configuration.
In Subsection 4.1, we consider the startup behavior of PRTP-ECN. More precisely, we investigate how the number of packets that must be successfully received until a packet loss is allowed, i.e., until crl ≥ rrl, depends on af, rrl, and the initial value of crl. The steady-state behavior of PRTP-ECN is considered in Subsection 4.2, where explicit formulae are derived for the upper and lower bounds on the required distance between packet losses, i.e., the number of packets that must be successfully received between two consecutive packet losses. Finally, the response of PRTP-ECN to packet loss bursts is considered in Subsection 4.3.
[Figure 2. The startup scenario: the packet sequence with weights af^k and the initial value of crl, crl_init, representing the history preceding the first packet.]
Assume that crl has been initialized to crl_init and that the first γ packets are successfully received. A packet loss is then allowed as soon as crl, evaluated with the lost packet included, is at least rrl, i.e., as soon as

crl(γ + 1) = \frac{af + \cdots + af^{γ} + af^{γ+1}\,crl_{init}}{1 + af + \cdots + af^{γ} + af^{γ+1}} ≥ rrl.    (3)

By observing that both the numerator and the denominator of crl are geometric series, we obtain:

crl(γ + 1) = \frac{af\,\frac{1 - af^{γ}}{1 - af} + af^{γ+1}\,crl_{init}}{\frac{1 - af^{γ+2}}{1 - af}}.    (4)

Solving crl(γ + 1) ≥ rrl for γ gives

γ ≥ \frac{\ln\left(\frac{af - rrl}{1 - crl_{init} + af(crl_{init} - rrl)}\right)}{\ln af} - 1,    (5)

and hence the number of packets that must be successfully received before a packet loss is allowed:

γ = \left\lceil \frac{\ln\left(\frac{af - rrl}{1 - crl_{init} + af(crl_{init} - rrl)}\right)}{\ln af} \right\rceil - 1.    (6)
Figures 3, 4, and 5 show how the required number of successfully received packets varies with af, rrl, and the initial value of crl. They illustrate how the initial behavior of PRTP-ECN depends on the configuration and how it can be controlled through the initial value used for crl. More packets must be received for high values of rrl or for low values of af. As expected, using a large initial value of crl allows PRTP-ECN to accept a packet loss more quickly without requiring a retransmission.
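The startup behavior can also be checked numerically by iterating the definition of crl directly. The sketch below is an illustration under two assumptions: packets are of unit size, and the initial value of crl is carried as a single virtual history term; the function name is hypothetical.

```python
def packets_until_loss_allowed(af, rrl, crl_init, max_packets=10_000):
    """Smallest number of successfully received packets, gamma, after
    which a packet loss is accepted, i.e., after which crl evaluated
    with the lost packet included is still at least rrl."""
    for gamma in range(max_packets):
        # Weights af^1 .. af^gamma for the received packets; the lost
        # packet has weight af^0 but contributes 0 to the numerator;
        # af^(gamma+1) carries the initial value crl_init (an assumption
        # of this sketch).
        num = sum(af ** k for k in range(1, gamma + 1)) + af ** (gamma + 1) * crl_init
        den = sum(af ** k for k in range(0, gamma + 1)) + af ** (gamma + 1)
        if num / den >= rrl:
            return gamma
    return None
```

For af = 0.9 and rrl = 0.8, a receiver starting from crl_init = 1 must receive five packets before a loss is accepted; raising rrl to 0.85, or starting from crl_init = 0, raises the count to nine, matching the trends described above.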
[Figure 6. The steady-state scenario for the lower bound: all packets preceding the first packet loss are successfully received; the second loss occurs at distance l from the first.]
Consider the scenario in which all n packets preceding the first packet loss were successfully received, and let l denote the distance to the second packet loss. Letting n grow without bound, the current reliability level at the second loss is

crl(n + l) = \lim_{n \to \infty} \frac{\sum_{k=1}^{l-1} af^{k} + \sum_{k=l+1}^{n+l-1} af^{k}}{\sum_{k=0}^{n+l-1} af^{k}}.    (7)

We decompose the expression for crl(n + l) into two rational expressions: the first expression covering the packets from the first to the second packet loss, and the second expression covering all packets up to and including the first packet loss. This gives us

crl(n + l) = \lim_{n \to \infty} \left( \frac{\sum_{k=1}^{l-1} af^{k}}{\sum_{k=0}^{n+l-1} af^{k}} + \frac{\sum_{k=l+1}^{n+l-1} af^{k}}{\sum_{k=0}^{n+l-1} af^{k}} \right).    (8)
Next, we reduce (8) by using the formula for the sum of a geometric series:

crl(n + l) = \lim_{n \to \infty} \left( \frac{af\,\frac{1 - af^{l-1}}{1 - af}}{\frac{1 - af^{n+l}}{1 - af}} + \frac{af^{l+1}\,\frac{1 - af^{n-1}}{1 - af}}{\frac{1 - af^{n+l}}{1 - af}} \right).    (9)

Since 0 < af < 1, letting n approach infinity yields

crl(n + l) = af(1 - af^{l-1}) + af^{l+1}.    (10)

Requiring crl(n + l) ≥ rrl and solving for l gives the lower bound on the required distance between packet losses:

l = \left\lceil \frac{\ln\left(\frac{af - rrl}{1 - af}\right)}{\ln af} \right\rceil.    (11)

For the upper bound, we instead consider the scenario in which the first packet loss occurs as soon as it is allowed, i.e., when crl(n) equals rrl. Writing crl(n) as the ratio of its numerator, N(n), to its denominator, D(n),

crl(n) = \frac{N(n)}{D(n)}.    (12)

Since crl(n) = rrl, we obtain the following relationship between the numerator and the denominator of crl at the time of the first packet loss:

N(n) = rrl \cdot D(n).    (13)
[Figure 8. The steady-state scenario for the upper bound: the first packet loss occurs when crl(n) = rrl, and the second loss at distance u, where crl(n + u) = rrl.]
Using Eq. 13, the contribution of all packets up to and including the first packet loss can be represented by the term af^u rrl in the numerator and af^u in the denominator of crl(n + u):

crl(n + u) = \frac{af + \cdots + af^{u-1} + af^{u}\,rrl}{1 + \cdots + af^{u-1} + af^{u}}    (14)

= \frac{af\,\frac{1 - af^{u-1}}{1 - af} + af^{u}\,rrl}{\frac{1 - af^{u}}{1 - af} + af^{u}}.    (15)

However, crl(n + u) = rrl, which gives us the following equation for the upper bound of the required distance between packet losses:

\frac{af\,\frac{1 - af^{u-1}}{1 - af} + af^{u}\,rrl}{\frac{1 - af^{u}}{1 - af} + af^{u}} = rrl.    (16)

Solving Eq. 16 for u yields

u = \left\lceil \frac{\ln\left(\frac{rrl - af}{rrl - 1}\right)}{\ln af} \right\rceil.    (17)
The graph in Figure 9 shows how u depends on af and rrl. Again, u increases with increasing rrl values and with decreasing af values. We can also see in Figures 7 and 9 that, for some configurations, l and u coincide, meaning that both bounds are tight for these configurations.
Since the allowable packet loss frequencies are bounded by the reciprocals of the required distances between packet losses, we obtain the following formulae for the upper and lower bounds of the allowable packet loss frequency:

fu = \frac{1}{l},    (18)

fl = \frac{1}{u}.    (19)
Although fu gives some appreciation for the maximum allowable packet loss frequency of PRTP-ECN, it is an overestimate. A more usable metric for the maximum packet loss frequency of a PRTP-ECN configuration is obtained by considering the scenario in which packets are lost whenever PRTP-ECN allows it, i.e., whenever crl ≥ rrl. Simulations suggest that the allowable packet loss frequency in this scenario approaches a limit, fs, as the total number of sent packets, n, reaches infinity, or, more formally stated:

fs \stackrel{\mathrm{def}}{=} \lim_{n \to \infty} \frac{loss(\sigma_{n})}{n},    (20)

where \sigma_n denotes the packet sequence comprising all packets sent up to and including the nth packet, and loss(\sigma_n) is a function that returns the number of lost packets in \sigma_n. Henceforth, we will call the metric fs the allowable steady-state packet loss frequency. The graphs in Figures 10, 11, and 12 illustrate how fs depends on af and rrl and how it relates to fu and fl. We can see that the configuration dictates the allowable steady-state packet loss frequency obtained by the protocol and that fs falls neatly in between the upper and lower bounds.
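The scenario defining fs can be approximated by direct simulation. The sketch below assumes unit-size packets and maintains the numerator and denominator of crl (Eq. 2) incrementally; the function name is illustrative.

```python
def greedy_loss_frequency(af, rrl, n=200_000):
    """Estimate fs: drop a packet whenever PRTP-ECN would allow it
    (crl, with the candidate counted as lost, still at least rrl);
    otherwise deliver it.  Returns the fraction of lost packets."""
    num = den = 0.0        # numerator / denominator of crl, unit-size packets
    losses = 0
    for _ in range(n):
        t_num, t_den = af * num, af * den + 1.0   # age history, add candidate
        if t_num / t_den >= rrl:                  # loss allowed: take it
            num, den = t_num, t_den               # candidate lost (p = 0)
            losses += 1
        else:
            num, den = t_num + 1.0, t_den         # candidate received (p = 1)
    return losses / n

# For af = 0.9 and rrl = 0.85, the bounds of Section 4.2 give l = 7 and
# u = 11, so the estimate should stay close to the interval [1/11, 1/7].
fs = greedy_loss_frequency(0.9, 0.85)
```

For these parameters the estimate comes out near fl = 1/11, within the range predicted by the upper and lower bounds.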
[Figure 13. The packet loss burst scenario: a burst of b consecutive packet losses following a long sequence of successfully received packets.]

Consider the scenario in which a burst of b consecutive packets is lost after a long sequence of n − 1 successfully received packets. Numbering packets as in Eq. 1, the lost packets occupy positions 0 through b − 1, and

crl(n + b) = \frac{\sum_{k=b}^{n+b-2} af^{k}}{\sum_{k=0}^{n+b-2} af^{k}}.    (21)

If we recognize the numerator and denominator as being geometric series, replace the sums with the corresponding explicit formulae, and let n grow without bound, we obtain:

crl(n + b) = \lim_{n \to \infty} af^{b}\,\frac{(1 - af^{n-1})/(1 - af)}{(1 - af^{n+b-1})/(1 - af)}    (22)

= af^{b}.    (23)
The problem of finding the maximum allowable packet loss burst length can now be formulated as finding

\sup \{ b \in \mathbb{N} : af^{b} ≥ rrl \}.    (24)

Since, for af ∈ ]0, 1[, the function f(x) = af^{x} is a strictly monotonically decreasing function, we have

\sup \{ b \in \mathbb{N} : af^{b} ≥ rrl \} = \left\lfloor \sup \{ x \in \mathbb{R} : af^{x} ≥ rrl \} \right\rfloor.    (25)

This gives us the following explicit formula for the maximum allowable packet loss burst length:

b = \left\lfloor \frac{\ln rrl}{\ln af} \right\rfloor.    (26)
Figure 14 illustrates how b depends on af and rrl . Larger packet loss burst lengths are
accepted for small values of rrl and for high values of af .
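Eq. 26 is easy to sanity-check against its definition: b is the largest integer for which af^b ≥ rrl still holds. A minimal sketch with an illustrative function name:

```python
import math

def max_burst_length(af, rrl):
    """Maximum allowable packet loss burst length, Eq. 26."""
    return math.floor(math.log(rrl) / math.log(af))

# For af = 0.9 and rrl = 0.7: af^3 = 0.729 >= 0.7 > af^4 = 0.6561, so b = 3.
b = max_burst_length(0.9, 0.7)
```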
5 Simulation Experiment
In the previous section, we theoretically analyzed the packet loss behavior of the PRTP-ECN retransmission scheme. A simulation study was done to examine the implications of the packet loss behavior of PRTP-ECN for interarrival jitter, throughput, and TCP-friendliness [19]. This section gives a condensed description of this simulation experiment.
5.1 Methodology
We used the ns-2 [26] network simulator to conduct the experiment reported in this paper.
Figure 15 shows the network topology used in all simulations. The primary factors in the
experiment were the protocol used at node S4, the traffic load on the link between nodes R1
and R2, and the starting times of the flows emanating from nodes S1 and S2. The last factor
was found to have a limited impact on the result, however, and is thus not discussed further
here.
In our experiment, simulations were made in pairs. In the reference tests, TCP was used
at both nodes S1 and S4 while, in the evaluation tests, the TCP agent at node S4 was replaced
with PRTP-ECN. In the simulations, two FTP applications attached to nodes S1 and S2 sent
data at a rate of 10 Mbps to receivers at nodes S4 and S5. The initial congestion window
of the TCP and PRTP-ECN agents was initialized to two segments [1]. All other agent
parameters were assigned their default values.
Background traffic was generated by a UDP flow between nodes S3 and S6. The UDP flow was produced by a constant-bit-rate traffic generator residing at node S3. The departure times of the packets from the traffic generator were randomized, however, resulting in a variable-bit-rate flow between nodes S3 and S6. In all simulations, a maximum transfer unit
[Figure 15. Simulation topology: sources S1, S2, and S3 are connected to router R1, and sinks S4, S5, and S6 to router R2, over 10 Mbps, 0 ms links; R1 and R2 are connected by a 1.5 Mbps, 50 ms bottleneck link.]
5.2 Results
Several performance metrics were evaluated in the analysis of the simulation experiment. Here, we consider only the average interarrival jitter, the average throughput, and the TCP-friendliness of the FTP flow between nodes S1 and S4.
The graph in Figure 16 shows the result of the evaluation of the interarrival jitter. The
traffic load expressed in percent of the bandwidth of link R1-R2 is on the horizontal axis,
and the relative interarrival jitter is on the vertical axis. The relative interarrival jitter was
calculated as the ratio of the measured interarrival jitter to the interarrival jitter measured in the reference test. As the graph shows, the PRTP-ECN configurations significantly decreased the interarrival jitter as compared to TCP. At low traffic loads, the reduction was about 30%. For packet loss rates in the neighborhood of 20%, the reduction was in some cases as much as 68%.
Figure 17 presents the result of the throughput evaluation. As before, we have traffic load on the horizontal axis. On the vertical axis, we have relative throughput, i.e., throughput relative to the throughput measured in the reference test. While the increase in throughput was not as pronounced as the reduction in interarrival jitter, it was still substantial. For example, an application accepting a 20% packet loss rate could increase its throughput by as much as 48%. Applications tolerating a packet loss rate of only a few percent could also experience improvements in throughput of as much as 20%. A comparison of the throughputs for PRTP-ECN and TCP also suggests that PRTP-ECN is better at utilizing the bandwidth than TCP. This has not been statistically verified, however.
Recall from Section 5.1 that a traffic load corresponds approximately to a particular packet loss rate. Taking this into account in analyzing the results, it may be concluded that a PRTP-ECN configuration had its optimum in both relative interarrival jitter and relative throughput when the packet loss frequency was almost the same as fs. This is a direct consequence of the way we defined fs. At packet loss frequencies lower than fs, the gain in performance was limited by the fact that not very many retransmissions were necessary in the first place. When the packet loss frequency exceeded fs, the situation was the reverse.
In these cases, PRTP-ECN had to increase the number of retransmissions in order to uphold
the reliability level, which had a negative impact on both interarrival jitter and throughput.
It should be noted, however, that even in cases when PRTP-ECN had to increase the number
of retransmissions, it performed far fewer retransmissions than TCP.
The TCP-friendliness of the FTP flow was evaluated using the TCP-friendliness test proposed by Floyd and Fall [16]. According to their test, a flow is said to be TCP-friendly if the following inequality holds for its arrival rate, λ:

λ ≤ \frac{1.5 \sqrt{2/3}\; β}{RTT \sqrt{p_{l}}},    (27)

where λ is the arrival rate of the flow in Bps, β denotes the packet size in bytes, RTT is the minimum round trip time in seconds, and p_l is the packet loss frequency. In analyzing the results of the TCP-friendliness tests, we said that a PRTP-ECN configuration was TCP-friendly if more than 95% of the runs in a test passed the TCP-friendliness test. The reason for not requiring a 100% pass frequency was that not even TCP managed to be TCP-friendly in all runs. Our simulation experiment suggests that PRTP-ECN is indeed TCP-friendly. As a matter of fact, all PRTP-ECN configurations passed the TCP-friendliness test.
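Applying the test of Eq. 27 amounts to comparing a measured arrival rate against the right-hand side. In the sketch below the function and parameter names are illustrative:

```python
import math

def tcp_friendly_rate_limit(packet_size, rtt, loss):
    """Right-hand side of Eq. 27: maximum TCP-friendly arrival rate in
    bytes per second, given the packet size in bytes, the minimum round
    trip time in seconds, and the packet loss frequency."""
    return 1.5 * math.sqrt(2.0 / 3.0) * packet_size / (rtt * math.sqrt(loss))

def is_tcp_friendly(rate, packet_size, rtt, loss):
    """A flow passes the Floyd-Fall test if its arrival rate does not
    exceed the limit."""
    return rate <= tcp_friendly_rate_limit(packet_size, rtt, loss)
```

For 1500-byte packets, a 100 ms minimum round trip time, and a 1% packet loss frequency, the limit is roughly 184,000 Bps, so a flow arriving at 150,000 Bps passes the test while one arriving at 200,000 Bps does not.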
6 Conclusion
This paper proposed an extension to TCP called PRTP-ECN. PRTP-ECN aims at making
TCP more suitable for applications with soft real-time constraints, e.g., best effort multimedia applications. It accomplishes this by trading reliability for reduced delay and interarrival
jitter, and improved throughput. More specifically, PRTP-ECN introduces a novel retransmission scheme that makes TCP a partially reliable transport protocol, i.e., a transport protocol accepting a prescribed packet loss rate.
Analytic models were developed for the packet loss behavior of PRTP-ECN. They establish how the application-controlled parameters of the protocol influence the response of PRTP-ECN to packet loss. In order to better appreciate the implications of the packet loss behavior of PRTP-ECN for interarrival jitter, throughput, and TCP-friendliness, a simulation experiment was done. The experiment suggests that PRTP-ECN is able to offer a service with significantly reduced interarrival jitter and improved throughput as compared to TCP, while at the same time being TCP-friendly.
References
[1] M. Allman, V. Paxson, and W. Stevens. TCP congestion control. RFC 2581, IETF,
April 1999.
[2] P. D. Amer, C. Chassot, T. Connolly, and M. Diaz. Partial order quality of service to support multimedia connections: Reliable channels. In 2nd High Performance Distributed Computing Conference, Spokane, Washington, USA, July 1993.
[3] P. D. Amer, C. Chassot, T. Connolly, and M. Diaz. Partial order quality of service to support multimedia connections: Unreliable channels. In International Networking Conference (INET), San Francisco, California, USA, August 1993.
[4] P. D. Amer, C. Chassot, T. Connolly, M. Diaz, and P. T. Conrad. Partial order transport
service for multimedia and other applications. ACM/IEEE Transactions on Networking,
2(5), October 1994.
[5] P. D. Amer, P. T. Conrad, E. Golden, S. Iren, and A. Caro. Partially-ordered, partially-reliable transport service for multimedia applications. In Advanced Telecommunications/Information Distribution Research Program (ATIRP) Conference, pages 215–220, College Park, Maryland, USA, January 1997.
[6] D. Bansal. Congestion control for streaming video and audio applications. Master's thesis, Massachusetts Institute of Technology (MIT), January 2001.
[7] D. Bansal and H. Balakrishnan. TCP-friendly congestion control for real-time streaming applications. Technical Report MIT-LCS-TR-806, Massachusetts Institute of Technology (MIT), May 2000.
[8] J-C. Bolot. End-to-end packet delay and loss behavior in the Internet. ACM Computer Communication Review (SIGCOMM), 23(4):289–298, September 1993.
[9] J-C. Bolot and A. Vega-Garcia. Control mechanisms for packet audio in the Internet. In 15th Annual Joint Conference of the IEEE Computer and Communications Societies (INFOCOM), San Francisco, California, USA, March 1996.
[10] B. Braden, D. Clark, J. Crowcroft, B. Davie, S. Deering, D. Estrin, S. Floyd, V. Jacobson, G. Minshall, C. Partridge, L. Peterson, K. Ramakrishnan, S. Shenker, J. Wroclawski, and L. Zhang. Recommendations on queue management and congestion avoidance in the Internet. RFC 2309, IETF, April 1998.
[11] R. Caceres, P. Danzig, S. Jamin, and D. Mitzel. Characteristics of wide-area TCP/IP conversations. ACM Computer Communication Review, 21(4):101–112, September 1991.
[12] T. Connolly, P. D. Amer, and P. T. Conrad. An extension to TCP: Partial order service.
RFC 1693, IETF, November 1994.
[13] B. Dempsey. Retransmission-Based Error Control for Continuous Media Traffic in
Packet-Switched Networks. PhD thesis, University of Virginia, May 1994.
[14] B. J. Dempsey, J. Liebeherr, and A. C. Weaver. On retransmission-based error control for continuous media traffic in packet-switched networks. Computer Networks and ISDN Systems, 28:719–736, 1996.
[15] M. Diaz, A. Lopez, C. Chassot, and P. D. Amer. Partial order connections: A new concept for high speed and multimedia services and protocols. Annals of Telecommunications, 49(5–6):270–281, 1994.
[16] S. Floyd and K. Fall. Promoting the use of end-to-end congestion control in the Internet. ACM/IEEE Transactions on Networking, 7(4):458–472, August 1999.
[17] S. Floyd, M. Handley, J. Padhye, and J. Widmer. Equation-based congestion control for unicast applications. ACM Computer Communication Review (SIGCOMM), 30(4):43–56, August 2000.
[18] Cooperative Association for Internet Data Analysis (CAIDA). Traffic workload overview. http://www.caida.org/outreach/resources/learn/trafficworkload/tcpudp.xml, June 1999.
[19] K-J Grinnemo and A. Brunstrom. Evaluation of the QoS offered by PRTP-ECN, a TCP-compliant partially reliable transport protocol. In 9th International Workshop on Quality of Service (IWQoS), pages 217–231, Karlsruhe, Germany, June 2001.
[20] M. Handley, J. Padhye, and S. Floyd. TCP friendly rate control (TFRC): Protocol
specification. Internet draft, IETF, May 2001. Work in Progress.
[21] V. Jacobson and M. J. Karels. Congestion avoidance and control. ACM Computer Communication Review (SIGCOMM), 18(4):314–329, August 1988.
[22] S. Karandikar, S. Kalyanaraman, P. Bagal, and B. Packer. TCP rate control. ACM
Computer Communication Review, 30(1), January 2000.
[23] X. Li, S. Paul, P. Pancha, and M. Ammar. Layered video multicast with retransmission
(LVMR): Evaluation of error recovery schemes. In 6th International Workshop on
Network and Operating System Support for Digital Audio and Video (NOSSDAV), St.
Louis, Missouri, USA, May 1997.
[24] B. Mukherjee. Time-lined TCP: A transport protocol for delivery of streaming media over the Internet. Master's thesis, University of Waterloo, 2000.
[25] B. Mukherjee and T. Brecht. Time-lined TCP for the TCP-friendly delivery of streaming media. In IEEE International Conference on Network Protocols (ICNP), pages 165–176, Osaka, Japan, November 2000.
[26] The network simulator ns-2. http://www.isi.edu/nsnam/ns.
[27] J. Padhye, V. Firoiu, D. Towsley, and J. Kurose. Modeling TCP throughput: A simple model and its empirical validation. ACM Computer Communication Review (SIGCOMM), 28(4):303–314, August 1998.
[28] C. Papadopoulos and G. Parulkar. Retransmission-based error control for continuous media applications. In 6th International Workshop on Network and Operating System Support for Digital Audio and Video (NOSSDAV), pages 5–12, Zushi, Japan, April 1996.
[29] M. Piecuch, K. French, G. Oprica, and M. Claypool. A selective retransmission protocol for multimedia on the Internet. In SPIE Multimedia Systems and Applications,
Boston, Massachusetts, USA, November 2000.
[30] J. Postel. User datagram protocol. RFC 768, IETF, August 1980.
[31] J. Postel. Transmission control protocol. RFC 793, IETF, September 1981.
[32] K. Ramakrishnan and S. Floyd. A proposal to add explicit congestion notification
(ECN) to IP. RFC 2481, IETF, January 1999.
[33] S. Ramesh and I. Rhee. Issues in model-based flow control. Technical Report TR-99-15, North Carolina State University, 1999.
[34] R. Rejaie, M. Handley, and D. Estrin. RAP: An end-to-end rate-based congestion control mechanism for realtime streams in the Internet. In 18th Annual Joint Conference of the IEEE Computer and Communications Societies (INFOCOM), New York, New York, USA, March 1999.
[35] I. Rhee. Error control techniques for interactive low-bit rate video transmission over the Internet. ACM Computer Communication Review (SIGCOMM), 28(4):290–301, September 1998.
[36] L. Rojas-Cardenas, L. Dairaine, P. Senac, and M. Diaz. An adaptive transport service
for multimedia streams. In IEEE International Conference on Multimedia Computing
and Systems (ICMCS), Florence, Italy, June 1999.
[37] H. Schulzrinne. RTP: A transport protocol for real-time applications. RFC 1889, IETF,
January 1996.
[38] D. Sisalem and H. Schulzrinne. The loss-delay based adjustment algorithm: A TCP-friendly adaptation scheme. In 8th International Workshop on Network and Operating System Support for Digital Audio and Video (NOSSDAV), pages 215–226, Cambridge, United Kingdom, July 1998.
[39] R. Steinmetz and K. Nahrstedt. Multimedia: Computing, Communications and Applications. Prentice Hall, 1995.
[40] W. Strayer, B. Dempsey, and A. Weaver. XTP: The Xpress Transfer Protocol. Addison-Wesley Publishing, July 1992.
[41] W-T Tan and A. Zakhor. Real-time Internet video using error resilient scalable compression and TCP-friendly transport protocol. IEEE Transactions on Multimedia, 1(2):172–186, June 1999.
[42] K. Thompson, G. Miller, and R. Wilder. Wide-area Internet traffic patterns and characteristics (extended version). IEEE Network, pages 10–23, November 1997.
Part II
Transport Service
for Telephony Signaling
Paper VI
1 Introduction
Few industries have experienced a more revolutionary change than the one which has shaken the telecommunications industry in the last fifteen years. In the beginning of the nineties, the telecommunication market basically comprised a number of national monopolies or national incumbent operators. Today, incumbency in the telecom market has come under siege as a result of country-by-country telecom liberalization, deregulation, privatization, and competition. This process spread rapidly from the U.S. Telecommunications Act of 1996, through telecom reforms in each of 27 European countries during the second half of the 1990s, to India's New Telecom Policy of 1999. Today, this process, and the industry re-alignment it is causing, is far from over.
The wireline landscape has changed dramatically over the past couple of years. A number of broadband access operators are competing for market share, and consumers are increasingly aware that they can make low-cost, or even free, calls to basically any destination in the world. This has positioned today's wireline operators at a crossroads. On the one hand, they need to decrease both capital and operational expenditures; on the other hand, they have a large installed base of legacy circuit-switched equipment that still generates the major part of their revenue.
The wireless landscape is also evolving. Although the wireless industry is still a large and dynamic industry that continues to enjoy significant growth worldwide, it needs sustained revenue growth and improved cost efficiency to protect margins. The wireless industry is today a mature industry that has been globally available for quite some time. Growth in subscribers, traffic, and, most importantly, revenues is by no means automatic. Entry costs for new users and tariffs must be continuously reduced to increase subscriber numbers and call minutes. Per-unit pricing for lucrative services such as voice and Short Message Service (SMS) is eroding sharply in most markets. Thus, there is a strong belief in the wireless industry that new services are needed to drive revenue growth. Further, due to the ever-increasing popularity of the Internet and Internet-based multimedia services, it is considered vital that future wireless networks seamlessly interwork with IP.
To address the challenges facing today's wireline and wireless industries, the so-called softswitch solution or architecture has evolved. The softswitch architecture provides a smooth first migration step from current circuit-switched fixed and cellular core networks to an all-IP, multi-service telecommunication network. Section 3 introduces the softswitch architecture, and discusses the incentives for both established incumbent operators and new competitive operators to embrace this architecture. As will become evident in Section 3, one of the key drivers for introducing the softswitch architecture is the promise of new revenue-generating applications and services. To this end, Section 4 surveys the application/service creation environments of the softswitch architecture.
At the heart of a telecommunication network is signaling: call signaling is paramount for managing call sessions, and bearer signaling for managing the actual media streams. Sections 5 and 6 discuss call and bearer signaling, respectively, in the softswitch architecture. Next, since legacy wireless and wireline circuit-switched core networks will most likely live on for the next decade or so, Section 7 examines the Internet Engineering Task Force (IETF) SIGnaling TRANsport (SIGTRAN) framework architecture for the transport of Signaling System No. 7 (SS7) signaling over IP. The report concludes in Section 8 with an outlook on the migration steps following the softswitch architecture. In particular, an overview of the 3rd Generation Partnership Project (3GPP) IP Multimedia Subsystem (IMS) is given.

For those readers who are less familiar with signaling in current fixed and cellular telecommunication networks, Section 2 provides a brief introduction to and summary of SS7, the most widely used signaling system in both the Public Switched Telephone Network (PSTN) and the Public Land Mobile Network (PLMN). The section also gives brief overviews of the architectures of current PSTN and PLMN networks, and describes how SS7 is integrated into these networks.
[Figure: Access signaling between subscribers and local exchanges (LE), and network signaling within the core network.]

[Figure: SS7 signaling modes: in associated mode, signaling follows the bearer path directly between exchanges; in quasi-associated and non-associated modes, signaling traffic is relayed via one or more Signaling Transfer Points.]
[Figure 3: An SS7 network with SSPs, STPs, and an SCP; bearer traffic flows between the SSPs, while signaling traffic is relayed via the STPs.]
hence are assigned an Originating Point Code (OPC) and a Destination Point Code (DPC).
Routing in SS7 is in part done on the basis of the DPC of a message.
To provide more bandwidth and/or redundancy, links are usually organized into groups
known as linksets. A linkset is a collection of links that share the same destination and are
for the most part established directly between SPs. When links are collected in linksets, the
total load of traffic is typically shared between active links. There can be up to 16 links in a
linkset, and a single SP may support a number of linksets between itself and other SPs.
When one SP is reachable from another SP, there is said to be a route between the two.
In other words, a route is the path that exists between any two SPs. The route may comprise
a single linkset or multiple linksets; the term simply refers to the existence of a network path
between two SPs. Where alternative routes exist between two SPs, they together constitute
a routeset. Figure 4 exemplifies the concepts of link, linkset, route, and routeset.
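The relationships above, i.e., links grouped into linksets, linksets composing routes, and alternative routes forming a routeset, can be sketched as a simple data model. The classes and the SLS-based load-sharing rule below are illustrative assumptions, not actual SS7 structures:

```python
from dataclasses import dataclass, field

@dataclass
class Linkset:
    """Up to 16 parallel links sharing the same adjacent destination SP."""
    destination: str                       # point code of the adjacent SP
    links: list = field(default_factory=list)

    def pick_link(self, sls: int):
        # Traffic is load-shared over the active links, here keyed by SLS.
        active = [link for link in self.links if link["active"]]
        return active[sls % len(active)] if active else None

@dataclass
class Route:
    """One path towards a (possibly non-adjacent) destination SP."""
    destination: str
    linksets: list = field(default_factory=list)  # traversed in order

@dataclass
class Routeset:
    """All alternative routes towards one destination."""
    destination: str
    routes: list = field(default_factory=list)

ls = Linkset("SP-B", links=[{"id": i, "active": True} for i in range(4)])
print(ls.pick_link(sls=5)["id"])   # SLS 5 maps to link 5 % 4 = 1
```

The SLS value thus spreads messages of one routeset evenly over the available links, while keeping messages with the same SLS on the same link (preserving their order).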
As illustrated in Figure 3, an SS7 network includes a number of different types of SPs.
In fact, there can be three different types of SPs in an SS7 network:
Service Switching Points (SSPs). SSPs are SS7-aware exchanges that originate, terminate, and, if integrated STPs (see below), forward calls. An SSP sends signaling messages to other SSPs to set up, manage, and release voice circuits required to complete a call. An SSP may also send queries to Service Control Points (SCPs), e.g., to
determine how to route a call or in connection with an IN service (see Section 2.6).
[Figure 4: Links, linksets, routes, and routesets in an SS7 network interconnecting SSPs, STPs, and an SCP.]
[Figure: The SS7 protocol stack between two exchanges: User Parts (UP) and SCCP provide message handling on top of the Network Service Part (NSP)/MTP, which provides message transfer and transmission.]
[Figure: The SS7 protocol stack in an exchange: TCAP over SCCP, and ISUP, both over MTP levels 1-3.]

[Figure: T1 and E1 framing formats: a T1 frame carries 24 voice circuits and a framing bit, while an E1 frame carries 30 voice circuits plus dedicated framing and signaling slots.]
[Figure 8: Routing label and other fields used by MTP-L3 for routing: the SIO, with its Service Indicator (SI), spare bits, and Network Indicator (NI), and the SIF, whose routing label comprises the DPC, OPC, and SLS, followed by the user data.]
the so-called Digital Signal (DS) service hierarchy. The basic unit of transmission on
a T1 trunk is 56 kbps and is designated DS-0A, and the basic transmission unit on an
E1 trunk is 64 kbps and is designated DS-0. A T1 trunk has a capacity of 24 DS-0As,
and an E1 trunk a capacity of 30 DS-0s.
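The trunk capacities above can be checked with a little arithmetic, using the standard T1/E1 figures:

```python
# Trunk capacities implied by the figures above (standard T1/E1 numbers).
SLOT_RATE = 64_000            # one timeslot (DS-0) in bit/s

t1_line_rate = 24 * SLOT_RATE + 8_000   # 24 slots + 8 kbit/s of framing bits
e1_line_rate = 32 * SLOT_RATE           # 32 slots, incl. framing + signaling

print(t1_line_rate)           # 1544000  (1.544 Mbit/s)
print(e1_line_rate)           # 2048000  (2.048 Mbit/s)

# Usable voice payload: 24 DS-0As on a T1 (56 kbit/s each, since robbed-bit
# signaling steals capacity), 30 DS-0s on an E1 (slot 0 carries framing and
# slot 16 carries signaling).
print(24 * 56_000)            # 1344000 bit/s of voice payload on a T1
print(30 * 64_000)            # 1920000 bit/s of voice payload on an E1
```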
MTP Level 2 (MTP-L2). MTP-L2, together with MTP-L1, provides for reliable signaling on a single signaling link between two adjacent SPs. Specifically, MTP-L2
incorporates such capabilities as message delimitation, link error detection, link error
correction, link error monitoring, and link flow control.
MTP Level 3 (MTP-L3). Basically, MTP-L3 extends the functionality of MTP-L2 to
signaling routes. The MTP-L3 functions can be divided into two basic categories: Signaling Message Handling (SMH) and Signaling Network Management (SNM). The
SMH functionality is performed on the basis of the routing label and the Service Information Octet (SIO) fields of an SS7 message (see Figure 8), and can further be divided into message discrimination, message distribution, and message routing.
Message discrimination is the task of determining whether an incoming signaling message is destined for the SP currently processing the message. It makes this determination using the DPC and Network Indicator (NI) fields of the message. When the
discrimination function has determined that a message is destined for the current SP,
it performs the message distribution function by examining the Service Indicator (SI)
field. The SI field indicates which MTP-L3 user (i.e., either SCCP or a UP protocol)
the message should be forwarded to for further processing.
Routing takes place when the current SP has determined that a received message is to
be sent to another SP. The selection of an outgoing link is done based on the values
of the DPC and the Signaling Link Selector (SLS) fields. Each SP that provides STP
functionality has a routing table that is continuously updated with link status information. By mapping the DPC and SLS values of the received message against this table,
a suitable outgoing link is obtained.
The purpose of the SNM functionality is to provide for signaling link management,
signaling route management, and signaling traffic management. Signaling link management entails the management of locally attached signaling links. In particular,
SNM includes link management capabilities for link activation, deactivation, restoration, and linkset activation. The signaling route management includes the functions
needed to distribute information to adjacent SPs about the status of signaling routes.
Finally, the signaling traffic management concerns the rerouting of signaling traffic
from failed routes. It also concerns route-level congestion control.
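The SMH steps described above, i.e., discrimination, distribution, and routing, can be sketched as follows. The message fields, point codes, and routing table below are simplified illustrations, not the actual MTP-L3 encodings:

```python
# Illustrative sketch of MTP-L3 signaling message handling; field layouts
# and table contents are invented for illustration.

OWN_PC, OWN_NI = 0x1234, 2           # this SP's point code / network indicator
USER_PARTS = {3: "SCCP", 5: "ISUP"}  # SI value -> MTP-L3 user part
ROUTING_TABLE = {                     # DPC -> candidate linksets
    0x2002: ["ls-A", "ls-B"],
}

def handle(msg):
    # 1. Discrimination: is the message for this SP at all?
    if msg["ni"] != OWN_NI:
        return ("discard", None)
    if msg["dpc"] == OWN_PC:
        # 2. Distribution: hand over to the user part named by the SI field.
        return ("deliver", USER_PARTS.get(msg["si"], "unknown"))
    # 3. Routing: pick an outgoing linkset from the DPC and SLS values.
    linksets = ROUTING_TABLE.get(msg["dpc"])
    if not linksets:
        return ("no-route", None)
    return ("forward", linksets[msg["sls"] % len(linksets)])

print(handle({"ni": 2, "dpc": 0x1234, "si": 5, "sls": 0}))  # ('deliver', 'ISUP')
print(handle({"ni": 2, "dpc": 0x2002, "si": 3, "sls": 3}))  # ('forward', 'ls-B')
```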
MTP only supports circuit-related signaling, and SCCP was added to SS7 primarily to provide for non-circuit-related signaling. In particular, it appeared in the second version of SS7, in 1984, to support non-circuit-related signaling in connection with IN and cellular networks.
The second major contribution of SCCP is a new routing mechanism, Global Title Routing (GTR), that complements MTP-L3 with incremental routing. A Global Title (GT) is
an address which in itself does not contain the information necessary to perform routing
in an SS7 network. There are numerous examples of GTs: in fixed networks, toll-free (e.g.,
020-numbers) and premium-rate numbers are examples of GTs, and in cellular networks, the
Mobile Subscriber Integrated Services Digital Network (MSISDN) and International Mobile
Subscriber Identity (IMSI) are examples of GTs.
GTR frees originating SPs from the burden of having to know every potential destination
to which they might have to route a message. When GTR is used, an SP, e.g., an SSP, does
not have to determine the final destination of a message. Instead, it might query an STP that
does GT translation, a so-called SCCP Relay Point (SRP), about the next SP along the route
towards the destination. The next SP is either the final destination or yet another SRP. If the
next SP is an SRP, a new GTT (Global Title Translation) is made when the message arrives
at this SP. The routing continues in this incremental way until the final SP is reached.
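The incremental nature of GTR can be illustrated with a toy example in which each SRP's translation table yields only the next hop towards the destination. All names and tables below are invented for illustration:

```python
# Sketch of incremental global-title routing; the translation tables and
# node names are hypothetical.

# Each SRP only knows the *next* hop for a GT prefix, not the final SP.
GTT_TABLES = {
    "SRP-1": {"020": "SRP-2"},     # toll-free numbers: ask the next relay
    "SRP-2": {"020": "SCP-9"},     # this relay knows the serving SCP
}
FINAL_SPS = {"SCP-9"}

def route_by_gt(origin_srp, global_title):
    hop, path = origin_srp, []
    while hop not in FINAL_SPS:
        # A Global Title Translation at each SRP yields only the next SP.
        prefix = global_title[:3]
        hop = GTT_TABLES[hop][prefix]
        path.append(hop)
    return path

print(route_by_gt("SRP-1", "020123456"))   # ['SRP-2', 'SCP-9']
```

The originating SP thus only needs a translation entry for the GT prefix; the burden of knowing the final destination is pushed towards the SRPs along the route.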
As mentioned earlier, in contrast to the NSP, the UP is to a large extent application dependent. However, two UP protocols stand out as being more important than others: the
Integrated Services Digital Network (ISDN) UP protocol (ISUP) and the Transaction Capabilities Application Part (TCAP).
ISUP is the UP protocol of the SS7 stack primarily responsible for all circuit-related signaling. It conveys the signaling necessary to establish and maintain call connections. Each
exchange gets the call signaling information from the previous exchange along the voice
circuit as the connection is being established. Thus, ISUP messages are forwarded through
the SS7 network from SSP to SSP parallel to the voice circuit being established.
To illustrate the functionality of ISUP, Figure 9 shows the basic steps of a call setup
between a calling party, A, and a called party, B, in the PSTN. The steps are as follows:
[Figure 9: Basic ISUP call setup between calling party A (behind SSP-1) and called party B (behind SSP-2), via STP-1: A's SETUP triggers an IAM towards SSP-2, which sends a SETUP to B; B's ALERTING is reported back as an ACM, and B's CONNECT as an ANM, after which A is connected and the conversation starts.]
(7) The call setup is complete, and the conversation can commence.
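The hop-by-hop exchange of Figure 9 can be sketched as follows. The topology matches the example above, while the code structure is a simplified illustration of how the IAM travels forward and the ACM and ANM travel back:

```python
# Sketch of the hop-by-hop ISUP exchange in Figure 9; the message names
# (IAM, ACM, ANM) are real ISUP messages, the topology is the example's.

path = ["SSP-1", "STP-1", "SSP-2"]

def call_setup(path):
    log = []
    # The IAM is forwarded hop by hop as the voice circuit is reserved.
    for a, b in zip(path, path[1:]):
        log.append((a, b, "IAM"))
    # The ACM (B is being alerted) and ANM (B answered) travel backwards.
    for msg in ("ACM", "ANM"):
        rev = list(reversed(path))
        for a, b in zip(rev, rev[1:]):
            log.append((a, b, msg))
    return log

for hop in call_setup(path):
    print(hop)       # six (from, to, message) hops in total
```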
The second major UP protocol is TCAP. TCAP was primarily introduced in SS7 to provide a generic transaction protocol for IN services and cellular networks. For example,
an SSP uses TCAP to query an SCP when it has to determine the route for a toll-free or
premium-rate call. TCAP is also used in connection with a mobile user roaming into a new
Mobile Switching Center (MSC)/Visitor Location Register (VLR) service area.
TCAP is primarily designed to be used for querying and retrieval of information from
SCPs. Logically, the TCAP protocol comprises two subparts: a component subpart and a
transaction subpart. Operations and their results are transmitted between an SP and an
SCP as components. There are four types of components:
Invoke. The Invoke component is used to send an operation to an SCP.
Return Result. The result of an Invoke component is returned in the form of a
Return Result component.
Return Error. If an operation fails, a Return Error component is returned.
Reject. The Reject component reports the receipt and rejection of an incorrect component such as a badly formed Invoke.
The component subpart is responsible for accepting components from a TCAP user and
delivering those components, in order, to the recipient TCAP user. To be able to do so, the
component subpart employs the transaction subpart.
The transaction subpart packetizes components into messages, and sends the messages
in the form of transactions to the recipient TCAP user. There are five types of transaction-subpart messages: Begin, Continue, End, Abort, and Unidirectional. A Begin message starts
a transaction; one or several Continue messages are used following a Begin message; and the
End message terminates a successful transaction. The Abort message is used to terminate
an unsuccessful transaction, i.e., a transaction in which an abnormal situation has occurred.
Unidirectional messages are used in transactions that contain only requests and no replies.
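A typical TCAP query, e.g., an SSP asking an SCP to translate a toll-free number, can be sketched using the component and transaction subparts above. The dictionary structures, transaction identifiers, and translation logic below are simplified assumptions, not real ASN.1 encodings:

```python
# Minimal sketch of a TCAP query through the component and transaction
# subparts; all structures are illustrative, not actual encodings.

def tcap_query(scp, operation, argument):
    # Component subpart: wrap the operation in an Invoke component.
    invoke = {"type": "Invoke", "invoke_id": 1,
              "operation": operation, "argument": argument}
    # Transaction subpart: a Begin message opens the transaction.
    reply = scp({"type": "Begin", "otid": 0x42, "components": [invoke]})
    # A successful transaction is terminated by an End message.
    assert reply["type"] == "End"
    return reply["components"][0]

def toll_free_scp(begin_msg):
    # A hypothetical SCP that translates a toll-free number into a
    # routable number and returns it in a Return Result component.
    number = begin_msg["components"][0]["argument"]
    result = {"type": "ReturnResult", "invoke_id": 1,
              "result": "+4654290000" if number.startswith("020") else None}
    return {"type": "End", "dtid": begin_msg["otid"], "components": [result]}

print(tcap_query(toll_free_scp, "translate", "020123456")["result"])
# prints +4654290000
```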
[Figure: A hierarchical national SS7 network: local exchanges (LE), regional (RTE), national (NTE), and international (ITE) transit exchanges, served by sectional (S-STP), regional (R-STP), national (N-STP), and international (I-STP) signaling transfer points and signaling end points (SEPs).]

[Figure: Signaling end points (SEPs) in a metropolitan area.]

[Figure: SS7 link types: A, B, C, D, E, and F links interconnecting SEPs and mated pairs of R-STPs and N-STPs at the regional and national levels.]
[Figure: The GSM network architecture: a Base Station Subsystem (BSS) of BTSs and BSCs, and a Network Switching Subsystem (NSS) with MSC/VLR, GMSC, HLR, AuC, and EIR, managed via an OMC/OSS and interfacing the PSTN, other PLMNs, and the Internet.]
roams into a new MSC service area, the VLR connected to that MSC will request data
about the subscription from the HLR of the phone. Later, if the phone makes a call,
the VLR will have the information needed for call setup without having to contact the
HLR.
Authentication Center (AuC). The AuC is a database that stores authentication and
encryption parameters for subscribers to enable subscriber verification, and to provide
[Figure: GSM signaling protocols: LAPDm on the radio interface, LAPD D-channel signaling between BTS and BSC, and SS7 signaling with BSSAP over SCCP/MTP between BSC and MSC, and MAP over TCAP/SCCP/MTP between MSC and HLR.]
[Figure: The UMTS network architecture: a Radio Access Network (RAN) of Node Bs and Radio Network Controllers (RNCs) connected to the same core-network nodes (MSC/VLR, GMSC, HLR, AuC, EIR) as in GSM.]
management information to the MSC from the BSC. In the remaining parts of the GSM architecture, the prevailing SS7 protocol is the Mobile Application Part (MAP) protocol. MAP
resides above TCAP. It is used to permit the network nodes of the NSS to communicate with
each other to provide services such as roaming, text messaging (i.e., SMS), and subscriber
authentication.
Over the past several years, the Universal Mobile Telecommunications System (UMTS)
has slowly begun to take market share from GSM. UMTS is actually not a new PLMN
[Figure: An SSP sending a query to an SCP and receiving a response.]
[Figure: The IN architecture: SSPs connected via STPs to SCPs, complemented by an Adjunct, an Intelligent Peripheral, and a Service Creation Environment (SCE).]
[Figure 18: The IN protocol stack: INAP over TCAP, SCCP, and MTP between an SSP and an SCP, with messages relayed via an STP.]
the traditional exchange, but enhanced to support IN processing. The SSP performs basic
call processing and provides trigger and event detection points for IN processing. The SCP,
Adjunct, and Intelligent Peripheral are all additional nodes that were added to support the IN
architecture:
SCP. The SCP stores service data and executes service logic for incoming messages.
The SCP acts on the information in a received message by invoking the appropriate
SLP, and retrieving the necessary data for service processing. It then responds with
instructions to the SSP about how to proceed with the call. The SCP can be specialized
for a particular type of service, or it can implement several types of services.
Adjunct. The Adjunct performs similar functions to an SCP but, contrary to an SCP,
an Adjunct is often co-located with the SSP.
Intelligent Peripheral. The Intelligent Peripheral provides specialized functions for
call processing including voice announcements, voice recognition, and digit collection.
Service Creation Environment (SCE). The SCE enables operators, service
providers, and third-party vendors to prototype, test, and deploy new applications and
services.
With respect to SS7, IN is implemented as UP protocols atop TCAP (see Figure 18).
Throughout Europe, the Intelligent Networking Application Part (INAP) is the prevailing
IN protocol. In brief, INAP is responsible for keeping track of the TCAP components exchanged between an SSP and an SCP. The INAP protocol ensures that the contents of the
IN operations sent in TCAP components follow a predefined syntax as regards permitted
parameters and their coding.
As mentioned in Section 1, both the wireline and wireless industries see the softswitch architecture as a key component of the next-generation telecommunication network. In fact,
several operators and vendors see the advent of the softswitch architecture as pivotal for
continued cost efficiency and revenue growth.
The term softswitch was coined by one of the founders of the Softswitch Consortium,
Ike Elliott, in the late nineties. Although frequently used, the term is quite elusive. In fact, to
our knowledge, there exists no precise definition of the term. Still, there seems to be a fairly
broad consensus on the principal components of the softswitch architecture and the salient
functions of a softswitch.
The principal idea behind the softswitch architecture is to separate the control and media functions of a traditional telecom switch. In particular, as illustrated in Figure 19, the
softswitch architecture prescribes a separation and/or distribution of the application, call
control, and media transport functions of legacy telecom switches. That is, the architecture decouples the underlying switching hardware from the control, service, and application
functions.
[Figure 19: The softswitch solution separates applications & services from transport.]

[Figure 20: The principal components of the softswitch architecture: softswitch, Feature/Application Server, Signaling Gateway, and Media Gateway.]
Figure 20 illustrates the distributed architecture that is generally agreed upon as the
softswitch architecture. The architecture is bearer independent, and could be applied to both
packet- and circuit-switched networks. However, given that the next-generation telecommunication networks are assumed to be packet switched, the softswitch architecture is almost
exclusively applied to packet-switched networks. In fact, in the contexts in which it is used, it is often
tacitly assumed that the underlying network is either IP-based or based on Asynchronous
Transfer Mode (ATM).
As shown in Figure 20, the principal components of the softswitch architecture are
softswitch, Media Gateway (MG), Signaling Gateway (SG), and Feature/Application Server
(AS). The softswitch constitutes the intelligence that coordinates all signaling such as
call-control signaling, operations and management signaling, and bearer signaling. The
name softswitch originates from the fact that the majority of signaling functionality in a
softswitch resides in software as compared to hardware in traditional telecom switches.
The primary functions typically found in a softswitch are depicted in Figure 21. The
Call Agent Function (CA-F) administers the call-control signaling and provides the call-state machine for end points. Its primary role is to provide the call logic, and in so doing
interact with CA-Fs in peer softswitches. It also acts as a proxy for the AS, and assists the
AS in providing services and applications to the end user. The Media Gateway Controller
Function (MGC-F) controls and monitors the MGs, i.e., is responsible for the bearer control.
Specifically, it controls the creation, modification, and deletion of media streams. If needed,
it also acts as a conduit for media parameter negotiation between other MGC-Fs and external
networks. A softswitch is often responsible for routing of signaling messages between peer
softswitches and non-softswitch networks such as PSTN and PLMN networks. In Figure 21,
the Router Function (R-F) embodies the softswitch routing functionality. Other functions
that are not shown in Figure 21 but still could be part of a softswitch include: Accounting
Function (A-F), Border Gateway Function (BG-F), and various proxies, e.g., for the Wireless
Application Protocol (WAP), Java APIs for Integrated Networks (JAIN), Parlay, and the Call
Processing Language (CPL).
The MG serves as a gateway between two separate networks, e.g., two packet-switched
networks under different administrative control, or two networks employing different bearer
technology such as IP to TDM, IP to ATM, or IP to 3G. Its primary role is to transform
media from one transmission format to another. For example, an MG may terminate voice
calls from a PSTN, compress and packetize voice data, and deliver compressed voice packets
to an IP network.
An SG has the same function as an MG but for control or signaling transport. It acts as a gateway for signaling between two Voice over IP (VoIP) networks, or between a VoIP and a
PSTN/PLMN network. Notably, an SS7 SG serves as a protocol mediator/translator between
an IP and a PSTN/PLMN network. For example, when a call originates in an IP network
that uses H.323 or the Session Initiation Protocol (SIP) (cf. Section 5) as signaling protocol,
and terminates in a PSTN/PLMN network, a translation from H.323/SIP to SS7 is made in
an SS7 SG.
The final component of the softswitch architecture is the AS. The AS accommodates the
service and feature applications made available to the customers of a service provider. Examples include call forwarding, conferencing, voice mail, and forward on busy. Some networks
enable inter-AS communication, which makes it possible to build complex, component-oriented applications.
It is important to understand that the softswitch architecture is a framework or logical
architecture which could be mapped to several different physical architectures. Particularly,
it could be mapped to both PSTN and PLMN networks. Figure 22 gives two examples of
how the softswitch architecture could be applied to a PSTN network.
Figure 22(a) shows a centralized physical architecture. The softswitch in this example
provides for both call and bearer control as well as basic application functions such as call
waiting and calling line identity. The MG and SG have the same roles as their logical counterparts in Figure 20 and serve as interfaces towards a PSTN.
Contrary to Figure 22(a), Figure 22(b) exemplifies a highly distributed architecture. In fact, there is no softswitch as such in this architecture. Instead, the functions of the softswitch have been spread over the Mediation Gateway and Feature Server. The Mediation Gateway functions as an MG, an SG, and a softswitch in that it provides media conversion, signaling conversion, call control, and basic routing functions. Service-level routing is provided by the Feature Server, which also accommodates certain service
logic. To offload the Mediation Gateway, a Media Server has been introduced. The Media
Server provides for specialized media resources such as Interactive Voice Response (IVR),
conferencing, fax, announcements, and speech recognition. It also handles the bearer interface to the Mediation Gateway.
In a PLMN network, the introduction of the softswitch architecture typically partitions
the MSC into two kinds of nodes: an MSC Server (MSC-S) and one or several Mobile Media
Gateways (M-MGs). As illustrated in Figure 23, the MSC-S acts as a softswitch, and thus
comprises the call- and bearer-control signaling of the legacy MSC. It interfaces with other
PLMN/PSTN networks via SGs. The M-MGs are controlled by the MSC-S, and, apart from
acting as MGs, the M-MGs comprise the switching functionality of the MSC.
[Figure 22: Two physical mappings of the softswitch architecture onto a PSTN: (a) a centralized architecture in which a softswitch (CA-F, MGC-F, R-F) together with an AS, an SG, and an MG connects IP phones and VoIP networks to the PSTN; (b) a distributed architecture in which a Mediation Gateway (MG-F, MGC-F, CA-F), a Feature Server (CA-F, R-F, AS-F), and a Media Server provide the corresponding functions.]
[Figure 23: The softswitch architecture in a PLMN: an MSC Server (CA-F, MGC-F, R-F) controls Mobile Media Gateways (M-MGs, each with MG-F and R-F), with SGs interfacing the PSTN and other PLMNs, and RNCs and Node Bs on the radio side.]
Considering the fairly large changes required to transform legacy circuit-switched wireline and wireless networks into IP-based softswitch networks, one might wonder what the
incentives are. Unfortunately, this question is more easily asked than answered. In fact, the incentives are plentiful and differ among the actors involved. Still, perhaps the most important incentive for introducing the softswitch architecture is that it changes the telecom market from being vertical to horizontal. This opens up opportunities for third-party developers, and will eventually bring the costs of telecom equipment down. The lower equipment costs will, in turn, lower the initial costs for market entrants, and thus spur the development of a truly competitive telecom market. Today, both the EU and U.S. wireline and wireless markets are fairly oligopoly-like, with a few operators dominating their respective markets, and this could change with the inception of the softswitch architecture.
Another compelling incentive for the softswitch architecture is that it enables the centralization of signaling equipment to a few populated areas. Less populated, rural areas can be controlled remotely. In fact, the softswitch architecture paves the way for virtual providers that, in the extreme case, only own the signaling equipment and lease the trunk lines from another telecom or cable operator.
Still another virtue of the softswitch architecture is its scalability. For example, the Cisco
BTS 10200 softswitch [25] can scale from a single CPU up to 12 CPUs and then offer support
to millions of subscribers. This should be compared with an Ericsson Telecommunication
[Figure 24: The IN call model: call processing in an SSP with IN progresses through Points In Call (PICs), e.g., digit collection, with Detection Points (DPs), e.g., digits collected, that hand control to the IN service logic in the SCP.]
In today's telecommunication networks, applications and services are implemented as Intelligent Network (IN) services (cf. Section 2.6). Compared with its predecessors, and the way services were implemented in these systems, today's IN-based telecommunication networks
represent a major leap forward. Notably, the IN concept introduced a generic representation
of SSP call-processing activities (see Figure 24). During call processing in a switch, a call
progresses through various states such as digit collection, translation, and routing. These
states existed before the inception of IN; however, before IN there was no agreement among vendors on exactly what constituted each state, and what transitional events marked the entry and exit of each state. IN defines a Basic Call State Model (BCSM), which unambiguously identifies the various states of call processing and the points during call processing where IN can occur, known as Points In Call (PICs) and Detection Points (DPs), respectively.
Although IN meant a major improvement compared with prior service solutions, and
although substantial investments have been made in writing IN applications and services
for the current SS7-based telecommunication networks, the promise of a thriving, competitive, and versatile telecom-service marketplace has yet to materialize. Vendors have invested
in proprietary service development and execution environments, which has effectively hindered a market for third-party application providers. Proprietary service platforms have also made the development of new services unnecessarily expensive, since the service development costs are not shared among operators. Furthermore, applications and services are typically developed in low-level, platform-dependent programming languages such as C.
This not only inhibits cross-platform development, but also requires developers with a high
level of proficiency in specific telecom platforms.
As mentioned in Section 3, a key incentive driving the development of the next-generation
telecommunication network and the softswitch architecture is to fulfill the promise of IN with
a viable service market. Particularly, operators want to build a service market that makes it
possible for them to compensate for shrinking margins on voice calls. To this end, the service development environments of the softswitch architecture comprise declarative, platform-independent programming languages, and high-level, imperative programming languages
such as Java and C++. The declarative languages are parsed and executed by open, standardized interpreters, and the imperative languages are executed on platforms with open, standardized Application Programming Interfaces (APIs). Thus, in both types of development
environments, the developers are shielded from most of the low-level signaling intricacies.
Figure 25: The creation, deployment, and execution of an XML-based application programming language.
operator are called server-side applications, and the majority of programs belong to this category. However, with the advent of more powerful terminals and end systems, a number of programming languages for client-side application development have been proposed, e.g., CPL [63] and LESS [88]. The remainder of this section briefly surveys some
of the more interesting server- and client-side application programming languages that have
been proposed in recent years.
<?xml version="1.0"?>
<vxml version="2.0"
      xmlns="http://www.w3.org/2001/vxml"
      xml:lang="en-US">
  <form>
    <field name="selection">
      <prompt>
        This is the ACME Weather Service.
        Please choose Today, Tomorrow, or Week.
      </prompt>
      <grammar type="application/x-nuance-gsl">
        [ today tomorrow week ]
      </grammar>
    </field>
    <block>
      <submit next="weather_service.jsp"/>
    </block>
  </form>
</vxml>
Figure 26: VoiceXML excerpt.
[Figure 27: Execution of a VoiceXML application: (1) a call is routed to an IVR server hosting a VoiceXML client; (2) the client fetches a document from a Web server over HTTP; (3) the Web server, backed by the weather forecast service, returns VoiceXML dialogs; (4) the client plays voice prompts and collects the caller's input.]
features of its own and make frequent use of Java in terms of JSPs and JavaScript.
Figure 27 illustrates the execution of a typical VoiceXML application, e.g., our previous
weather forecasting service. A caller dials the phone number of the service. The call is routed
to an IVR server with a VoiceXML client (1). The IVR server translates the phone number to
a Uniform Resource Locator (URL), and the VoiceXML client places an HyperText Transfer
Protocol (HTTP) request to the specified URL (2). The Web server at the URL responds
with a VoiceXML document that contains one or several of the dialogs of the service (3).
Finally, the VoiceXML client interprets the fetched document, and interacts with the caller
by playing voice prompts and collecting input (4).
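Steps (1)-(3) above can be sketched in Java. The phone number, URL, and returned document below are illustrative stand-ins, and the HTTP fetch is stubbed out; a real IVR would issue an actual HTTP GET and hand the document to its VoiceXML interpreter (step (4)):

```java
import java.util.Map;

public class VoiceXmlClientSketch {
    // Step (2): map the dialed number to the URL of the service
    // (both values are invented for this example).
    static final Map<String, String> NUMBER_TO_URL = Map.of(
        "+15550100", "http://weather.example.com/weather_service.vxml");

    static String translate(String dialedNumber) {
        return NUMBER_TO_URL.get(dialedNumber);
    }

    // Step (3): fetch the VoiceXML document (stubbed; a real client
    // performs an HTTP GET against the translated URL).
    static String fetchDocument(String url) {
        return "<vxml version=\"2.0\">...</vxml>";
    }

    public static void main(String[] args) {
        String url = translate("+15550100");
        System.out.println("GET " + url);
        System.out.println(fetchDocument(url));
    }
}
```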
A VoiceXML application often requires resources from outside the Web
server hosting the VoiceXML documents. A VoiceXML document can access the Web,
acting as a sort of voice-controlled browser: it can send information to Web servers and
convey the replies to the caller. Access to the Web also enables simultaneous development of Web and telephony services; often it is enough to write a VoiceXML frontend to
make a Web service accessible from a telephone.
Although a flexible and powerful language for single-party telephony services, VoiceXML
lacks support for multi-party services. To address this, the Call Control eXtensible Markup Language (CCXML) [31] was designed by W3C. CCXML complements VoiceXML by providing an elaborate call state model; support for multiple instances of VoiceXML interpreters;
the ability to trap external, asynchronous events such as on- or off-hook events; and the
ability to place outgoing calls.
In the same way as VoiceXML, a CCXML application consists of a number of XML documents. However, a CCXML document does not describe user dialogs. Instead, it describes
the actions to be taken in response to call transitions and events. For example, a
CCXML document might realize a call screening application by running a "person unavailable" VoiceXML dialog when a caller whose phone number is on a screening list attempts to call
a certain person.
While CCXML was designed to complement VoiceXML, the two languages are separate.
In fact, CCXML could be used to add call-state control to an arbitrary dialog system provided
the dialog system complies with certain requirements of CCXML.
4.1.2 CPL
Both VoiceXML and CCXML are examples of flexible, expressive languages which lend
themselves well to use by operators and trusted third-party developers. However, due
to their flexibility and expressiveness, languages such as these also raise safety and security
concerns. It is very difficult for an operator who employs VoiceXML, CCXML, or similar
languages for third-party development to protect itself from invalid or ill-conceived programs, e.g., programs that reveal security-sensitive information, or that consume excessive
amounts of system resources. Thus, to address the need for a language suitable for semi- and untrusted third-party developers, Lennox et al. designed the Call Processing Language
(CPL) [63].
CPL is not tied to any particular signaling protocol or architecture; however, it is designed
on the basis of SIP (cf. Section 5). Contrary to languages such as VoiceXML, it is very
restrictive: It provides no way of writing loops or recursion, and has no ability to invoke
restrictive: It provides no way of writing loops or recursion, and has no ability to invoke
external programs like JSPs or Web services. In fact, it is designed to prohibit any kind
of unsafe action, and a CPL program is always executed in a finite amount of time. To
ensure a bound on the program execution time, each action within a CPL program is always
time limited, and hence actions that interface with external resources, e.g., databases, have
timeouts.
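The bounded-execution principle, i.e., that every action interfacing with an external resource runs under a timeout, can be illustrated in Java. The sketch below uses a thread pool and a placeholder "lookup" action; it shows the idea only and is not how CPL interpreters are actually implemented:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class TimedAction {
    // Run an action but give up after the given number of milliseconds,
    // so the enclosing script is guaranteed to finish in finite time.
    static String runWithTimeout(Callable<String> action, long millis) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            return pool.submit(action).get(millis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            return "timeout";   // e.g., fall back to a default location
        } catch (Exception e) {
            return "failure";
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // A fast "database lookup" completes; a slow one is cut off.
        System.out.println(runWithTimeout(() -> "sip:kjgr@office.acme.com", 100));
        System.out.println(runWithTimeout(() -> { Thread.sleep(5000); return "late"; }, 100));
    }
}
```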
CPL is designed to be used for both client-side (e.g., phones) and server-side applications
and services. Like VoiceXML and CCXML, CPL is an XML application, and its syntax is
specified in a Document Type Definition (DTD). Semantically, a CPL program constitutes a
directed acyclic, i.e., loop-free, graph of call processing actions. The call processing actions
are, in turn, trees of language primitives or nodes. There are four principal classes of language primitives in CPL. First, there are the signaling actions, the primitive class that forms
the core of CPL. They control the broad behavior of the underlying signaling protocol. In
particular, they control such signaling actions as proxying, i.e., forwarding of a call to one or
several locations; redirection of calls; and responses to failures. Second, there are the switch
nodes, which correspond to the control or selection statements of ordinary programming languages, and which enable a CPL program to make decisions. Third, there are the location
nodes that specify the location for succeeding signaling actions. For example, a location
node could specify that a call should be proxied to a certain SIP address (see the example in
Figure 28). Finally, there are the non-signaling actions that permit a CPL program to perform
operations which are not dependent on, or affected by, the underlying signaling protocol. For
example, CPL provides a mail node which makes it possible for a CPL script to notify a user
than incoming calls. In fact, LESS extends CPL with triggers for timers, user interactions,
program-controlled events, and instant messaging.
4.1.4 XTML
The eXtensible Telephony Markup Language (XTML) [22] is a feature-rich and flexible
XML-based application programming language. It is a proprietary language of Pactolus
Communications Software Inc., and is the native language of their RapidFLEX software
architecture. Although the RapidFLEX platform targets SIP, XTML is oblivious to the call
signaling protocol used. In fact, it could equally well be used together with H.323.
Basically, an XTML application consists of a set of event handlers which respond to
given events. The events can be either signaling protocol-dependent, e.g., the arrival
of a SIP INVITE message, or protocol-independent, e.g., a timer that expires. The event
handlers are, in turn, made up of chains of actions which are linked together to reflect the
application call-flow. Compared with the previously described languages, e.g., VoiceXML
and CPL, XTML is designed to be easily extensible. The extensions can be written in both
XTML and general programming languages such as Java and C++.
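The XTML application model described above, event handlers made up of chains of actions keyed by events, can be rendered in Java for illustration. XTML itself is XML, and the event names and actions below are invented:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

public class EventHandlerSketch {
    // One chain of actions per event; actions append to a call-flow log.
    private final Map<String, List<Consumer<StringBuilder>>> handlers = new HashMap<>();

    // Append an action to the handler chain for the given event.
    void on(String event, Consumer<StringBuilder> action) {
        handlers.computeIfAbsent(event, e -> new ArrayList<>()).add(action);
    }

    // Dispatch an event: run its chain of actions in order.
    void dispatch(String event, StringBuilder callFlow) {
        for (Consumer<StringBuilder> action : handlers.getOrDefault(event, List.of()))
            action.accept(callFlow);
    }

    public static void main(String[] args) {
        EventHandlerSketch app = new EventHandlerSketch();
        StringBuilder flow = new StringBuilder();
        app.on("SIP_INVITE", f -> f.append("answer;"));        // protocol-dependent event
        app.on("SIP_INVITE", f -> f.append("playGreeting;"));
        app.on("TIMER_EXPIRED", f -> f.append("hangup;"));     // protocol-independent event
        app.dispatch("SIP_INVITE", flow);
        app.dispatch("TIMER_EXPIRED", flow);
        System.out.println(flow);   // answer;playGreeting;hangup;
    }
}
```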
4.1.5 SCML
The Service Creation Markup Language (SCML) [32] suite is part of the Java APIs for
Integrated Networks (JAIN) [8] standardization effort (see Section 4.2.2). The intention
of SCML is to provide a high-level scripting facility on top of the JAIN and Parlay [19]
APIs, and thus a simple service creation environment for non-telecommunication
experts. Although envisioned to cover a broad range of features, e.g., web and presence
services, and instant messaging, the SCML suite is currently very much a work in progress.
In fact, at the time of this writing, only preliminary versions of the SCML call control have
been presented.
The SCML call control is defined in terms of an XML Schema that is derived from the
general call-control model of the Java Call Control (JCC) API [13]. SCML
provides an elaborate event mechanism on par with those of CCXML and XTML. Furthermore, since it is defined using an XML Schema, SCML is fairly easy to extend.
An SCML program is typically downloaded to an AS. At startup, the program registers
interest in events with a softswitch. When an event is triggered, e.g., a call arrives at the
softswitch, the softswitch generates a JCC event which is converted to an XML message and
delivered, e.g., via the Simple Object Access Protocol (SOAP) [45, 46, 65], to the AS. The
AS executes the SCML program and returns an XML message to the softswitch.
Figure 29 shows an example program in SCML. The program implements a simple call
forwarding application which diverts incoming calls to Mr. Karl-Johan Grinnemo (employee
IDentity (ID): kjgr) to a voice mail service when he is already busy with another call.
<scml>
  <terminating>
    <address-switch field="terminating">
      <address is="sip:kjgr@office.acme.com">
        <disconnected causeCode="CAUSE_BUSY">
          <routeCall connectionPtr="conC">
            <arguments>
              <targetAddress>
                sip:kjgr@voicemail.acme.com
              </targetAddress>
            </arguments>
          </routeCall>
        </disconnected>
      </address>
    </address-switch>
  </terminating>
</scml>
Figure 29: A call forwarding application in SCML.
telecommunication networks. The key idea has been to design generic, technology-neutral
APIs that could be used by both operators and third-party developers alike. In particular,
the APIs should enable operators to provide, in a secure way, network capabilities to
third-party application developers.
Today, we have two dominating API framework proposals: OSA/Parlay [19] and JAIN [8].
The OSA/Parlay proposal emanates from a collaboration between the Parlay Group, European Telecommunications Standards Institute (ETSI), ITU-T, and 3GPP. In 1998, British
Telecom (BT), Microsoft, Nortel, Siemens, and Ulticom formed the Parlay Group to define
a set of programming language-neutral APIs for third-party development of telecom applications and services in PSTN. The initial API framework was published in December 1998.
Since then, the membership has grown and now includes companies such as Cisco, Ericsson,
Lucent, and IBM. Furthermore, the focus of the group has widened to cover both Internet
and PLMN. As the work on the second release of Parlay commenced, the API framework
was taken into ETSI and ITU-T in an attempt to make the API an international standard. At
about the same time, the Parlay Group initiated work within 3GPP on an open application
interface towards UMTS. Facing the risk of having several incompatible standards, the Parlay, ETSI, and 3GPP initiatives were combined into one working group, the Joint Working
Group (JWG), in the context of what is called the Open Service Access (OSA) framework.
At about the same time as the Parlay Group was formed, the Java APIs for Integrated
Networks (JAIN) community was initiated by Sun Microsystems and others. The objective
of JAIN is similar to that of OSA/Parlay; however, contrary to OSA/Parlay, JAIN only considers Java. Furthermore, JAIN takes a broader perspective on application development than
OSA/Parlay and considers not only client-side but also server-side applications. Still, there
Figure 31: Overview of the logical entities comprising the OSA/Parlay API framework.
Figure 32: The key steps in using the OSA/Parlay API framework.
the logical entities which implement one or more SCFs, and, in so doing, interact with the
internal nodes of the operator network (e.g., location and SMS servers). Although an SCS
might implement several SCFs or Parlay APIs, this is fairly unusual. Typically, an SCS only
implements a single API. The OSA Framework implements the core functionality of the
OSA/Parlay API, e.g., authentication and authorization, registration of SCSs, publication of
SCFs, and integrity/fault management. The communication between the ASs and the Framework/SCSs is made using either of the middleware technologies Common Object Request
Broker Architecture (CORBA) [71] or Distributed Component Object Model (DCOM) [2].
Figure 32 outlines the key steps in using the OSA/Parlay API framework. When a new
SCS is installed, it must authenticate itself (1) and register (2) with the OSA Framework.
The registration means that the SCS publishes its interface to the Framework. Next, when
an application wants to access the SCF provided by the SCS, it authenticates itself with
the Framework (3). It also obtains an instance of the SCS interface, a Service Manager in
Figure 33: Excerpt of a UML sequence diagram for an OSA/Parlay call forwarding application.
OSA/Parlay parlance (4-7). The application then accesses the SCF by invoking the methods
provided by the Service Manager.
To concretize the usage of the OSA/Parlay API, Figure 33 provides parts of a UML sequence diagram for a call forwarding application. Only the major actions have been included
in the example. In particular, those parts concerning the authentication have been deliberately omitted. The example begins with the application retrieving a reference to the OSA
Framework, typically an instance of the Framework interface (IpFramework) (1). The
application then calls upon the Framework to obtain a reference to the Service Manager of
the Generic Call Control (GCC) SCF (IpCallControlManager) (2). To
enable notification of incoming call events, the application registers a
callback interface, IpAppCallControlManager, with the Service Manager
(IpCallControlManager) (3). When a call arrives, the application is notified via the
IpAppCallControlManager callback interface (4). The application processes the call
(5), and routes it to the appropriate destination (6,7). At that time, the application is no longer
interested in controlling the call, and therefore deassigns the call (8). Note that this does not
mean that the call itself ends; it only means that there will be no further communication
between the call and the application.
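The callback pattern behind the sequence in Figure 33 can be sketched in much-simplified Java. The interface names follow the OSA/Parlay GCC SCF, but the signatures and the string-based call log are illustrative inventions, not the standardized API:

```java
public class ParlaySketch {
    // Application-side callback interface (cf. IpAppCallControlManager).
    interface IpAppCallControlManager {
        void reportNotification(String callingParty);
    }

    // Network-side Service Manager (cf. IpCallControlManager).
    static class IpCallControlManager {
        private IpAppCallControlManager app;
        void createNotification(IpAppCallControlManager cb) { app = cb; } // step (3)
        void incomingCall(String from) { app.reportNotification(from); }  // step (4)
    }

    static StringBuilder log = new StringBuilder();

    public static void main(String[] args) {
        IpCallControlManager scf = new IpCallControlManager();   // cf. (2) obtainScf()
        scf.createNotification(from -> {                         // (3) register callback
            log.append("route ").append(from).append(" -> voicemail; "); // (5)-(7)
            log.append("deassignCall");                          // (8) release control
        });
        scf.incomingCall("sip:alice@example.com");               // (4) notification
        System.out.println(log);
    }
}
```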
The OSA/Parlay API framework has undergone substantial development in recent
years, and, at the time of this writing, it comprises a comprehensive set of standardized SCFs. Specifically, the OSA/Parlay 3GPP release 6 encompasses
SCFs ranging from basic call control to multi-party call control, instant messaging, multimedia messaging, presence, Quality of Service (QoS), and charging. Furthermore, it includes
support for Web services, which not only entails new SCFs, but also a new definition language, Web Services Description Language (WSDL) [33, 36, 37], and a new communication
middleware, SOAP [45, 46, 65].
4.2.2 JAIN
The JAIN API framework is being developed within the Java Community Process (JCP) under the terms of Sun's Java Specification Participation Agreement (JSPA). The objective of
the JAIN initiative is to develop Java APIs that abstract the details of networks and protocol
implementations, and allow for the development of portable applications. The JAIN initiative is organized in two expert groups and several workgroups. It consists of a Protocols
Expert Group (PEG) that standardizes interfaces toward SS7 and IP signaling protocols, an
Application Expert Group (AEG) that primarily considers the APIs required for service creation within Java, and, finally, a number of workgroups whose task it is to develop prototype
implementations and feed the expert groups with their experiences and insights.
The JAIN API framework basically comprises two parts:
JAIN SS7 APIs that define implementation-agnostic APIs for the major SS7 protocols such
as ISUP, TCAP, MAP, INAP etc.
Java Service Logic Execution Environment (SLEE) that provides a generic Java-based
application platform for developing platform-independent applications and services.
From a historical viewpoint, the JAIN SS7 APIs could be seen as a predecessor to the Java
SLEE: Initially, JAIN only comprised a diverse, and relatively incoherent, set of telecom
APIs. However, this changed with the advent of the SLEE platform, which brought the APIs
together under a common framework. To this end, let us first consider the Java SS7 APIs.
All JAIN SS7 APIs are designed on the basis of the Factory and Observer design patterns.
In particular, as shown in Figure 34, each JAIN SS7 API is built up around five software
components: an SS7 factory class, JainSS7Factory; an interface class towards the SS7
stack, JainprotStack; an interface class towards the SS7 protocol, JainprotProvider;
a listener or callback interface class, JainprotListener, that enables the protocol to
communicate with the application; and, finally, event classes for all protocol primitives that
can be exchanged between the application and the protocol. This includes primitives for
protocol messages, primitives for error indications, primitives for timeout indications, and
primitives for network status indications. Together, the provider, listener, and event classes
implement the Observer design pattern. While the factory, listener, and event classes are
vendor neutral, each stack vendor is required to implement the stack and provider classes.
Figure 34: The five software components of a JAIN SS7 API, with the stack and provider classes implemented by each stack vendor.
As an example of the use of the JAIN SS7 APIs, consider the skeleton Java code in Figure 35 for an ISUP application. At lines 8-9, a Factory object for a particular SS7 stack
is created; in this example, a Factory object for an Ericsson SS7 stack. Next, using the
Factory object, an interface towards the SS7 stack is obtained in lines 11-14. Lines 16-18 show parts of the configuration of the SS7 stack. For example, in line 17, the stack
is assigned its point code. The initialization of the application concludes in lines 20-24. An interface towards the ISUP protocol is obtained in line 20. In line 21, the actual
ISUP application is created. As follows from line 2, the application implements a callback interface, JainIsupListener. In lines 23-24, the application and the ISUP protocol are interconnected. Particularly, the ISUP protocol is supplied with a reference to
the JainIsupListener interface, and the application is provided with a reference to
JainIsupProvider. The JainIsupListener interface only consists of one method,
processIsupEvent (lines 30-43). This method is invoked by the ISUP protocol, via the
JainIsupProvider class, each time it needs to notify the application about events that
 1  ...
 2  public class IsupApplication implements JainIsupListener
 3  {
 4      public static void main ( String args[] )
 5      {
 6          try
 7          {
 8              JainSS7Factory aFactory = JainSS7Factory.getInstance();
 9              aFactory.setPathName("com.ericsson");
10
11              JainIsupStack isupStack = null;
12              isupStack = ( JainIsupStackImpl )
13                  aFactory.createSS7Object
14                      ( "javax.jain.ss7.isup.JainIsupStackImpl" );
15              ...
16              isupStack.setVendorName ( "com.ericsson" );
17              isupStack.setSignalingPointCode( OPC );
18              isupStack.setStackName( "ericsson_stack" );
19              ...
20              JainIsupProvider aProvider = isupStack.createProvider();
21              JainIsupListenerImpl aListener = new IsupApplication();
22              ...
23              aProvider.addIsupListener( aListener, myUserAddress );
24              aListener.setIsupProvider( aProvider );
25              ...
26          }
27          ...
28      }
29
30      public void processIsupEvent( IsupEvent isupEvt )
31      {
32          switch ( isupEvt.getIsupPrimitive() )
33          {
34              case IsupConstants.ISUP_PRIMITIVE_SETUP:
35                  ...
36                  break;
37
38              case IsupConstants.ISUP_PRIMITIVE_ALERT:
39                  ...
40                  break;
41              ...
42          }
43      }
44  }
Figure 35: A skeleton ISUP application that uses the JAIN ISUP API.
Figure 36: The JAIN OAM API and the JMX architecture (JMX = Java Management eXtensions; JMXMP = JMX Messaging Protocol; RMI = Remote Method Invocation).
accessed by a management application, operations that can be invoked, and notifications that
can be emitted. MBeans are run within an MBean Server (2) and accessed via Agent Services
objects (3) that can perform management operations on the MBeans registered in the MBean
Server. The MBean Server relies on Remote Method Invocation (RMI) [7], JMX Messaging
Protocol (JMXMP) [11], or similar technologies to make the OAM MBeans accessible for
remote management applications (4).
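A minimal standard MBean of the kind described above can be written with the stock javax.management API: an attribute a management application can read, and an operation it can invoke, registered with the platform MBean Server. The CallCounter bean and its object name are invented for illustration and are not part of any JAIN API:

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Management interface; by convention named <ClassName>MBean.
interface CallCounterMBean {
    int getCalls();   // readable attribute "Calls"
    void reset();     // invokable operation
}

class CallCounter implements CallCounterMBean {
    private int calls;
    public int getCalls() { return calls; }
    public void reset() { calls = 0; }
    public void callArrived() { calls++; }   // internal, not exposed

    public static void main(String[] args) throws Exception {
        // Register the MBean with the platform MBean Server (step (2));
        // a remote manager would reach it via an RMI/JMXMP connector.
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        CallCounter bean = new CallCounter();
        ObjectName name = new ObjectName("example:type=CallCounter");
        server.registerMBean(bean, name);
        bean.callArrived();
        System.out.println(server.getAttribute(name, "Calls"));   // prints 1
    }
}
```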
Other types of special JAIN SS7 APIs include the JAIN Parlay APIs [6]. As mentioned
earlier, the JAIN Parlay APIs are basically an adaptation of the OSA/Parlay API framework
to Java. Like the OSA/Parlay API, the JAIN Parlay API defines an API for describing the
interaction between ASs external to an operator network and network resources (i.e., SCSs)
within the operator domain.
The second part of the JAIN API framework is the JAIN SLEE [14]. The SLEE comprises an application framework and component model similar to Enterprise Java Beans
(EJBs) [10]. It builds upon the JAIN SS7 APIs; however, in addition to providing platform-independent access to SS7 protocols, it also provides support for transactions, persistence,
load balancing, and pooling.
Figure 37 shows the principal parts of the JAIN SLEE architecture. Applications and
services in SLEE are implemented as collections of reusable objects or Service Building
Blocks (SBBs) (1). For example, assume that you want to implement a call forwarding
application. Instead of building the application from the ground up, you would build the
application from two pre-existing SBBs: a call SBB and a forwarding SBB.
SBBs are run from within a SLEE Server (2). All communication between the SBBs
and network resources, such as SS7 protocols, goes via the SLEE Server. Incoming messages from network resources are translated by the SLEE Server to events and routed to
the appropriate SBBs. Conversely, outgoing events from SBBs are translated by the SLEE
Server to messages and routed to the appropriate network resources. The SLEE event model
is based on a publish/subscribe model which means that event sources are decoupled from
event sinks via an indirection mechanism, Activity Contexts (3). Event sinks subscribe for
events by attaching to Activity Contexts, and event sources publish events to Activity Contexts. The SLEE-defined Activity Contexts maintain the relationships among event sources
and sinks. By using a publish/subscribe event model, sources and sinks need not be aware
of each other. At the same time, the model permits the SLEE to control and manage all source/sink
relationships, thus improving robustness.
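The indirection provided by an Activity Context can be sketched in a few lines of Java. This mimics the publish/subscribe decoupling only; the real JAIN SLEE interfaces differ, and the string events and lambda "SBBs" below are illustrative:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class ActivityContext {
    private final List<Consumer<String>> sinks = new ArrayList<>();

    // An event sink (e.g., an SBB) subscribes by attaching to the context.
    public void attach(Consumer<String> sink) { sinks.add(sink); }

    // An event source publishes to the context, unaware of the sinks.
    public void publish(String event) {
        for (Consumer<String> sink : sinks) sink.accept(event);
    }

    public static void main(String[] args) {
        ActivityContext ctx = new ActivityContext();
        ctx.attach(e -> System.out.println("call SBB got " + e));
        ctx.attach(e -> System.out.println("forwarding SBB got " + e));
        ctx.publish("INVITE");   // both attached SBBs receive the event
    }
}
```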
The SLEE architecture defines how applications (i.e., SBBs) running within the SLEE
interact with network resources through Resource Adaptors (RAs). An RA shields SLEE
applications from the intricacies of particular network resources, e.g., vendor-specific details,
and publishes a common interface towards the applications (4).
The SLEE architecture also includes some application facilities (5). The Timer facility
provides applications with the ability to perform periodic actions; the Alarm facility enables applications to generate alarm notifications to external management clients; the Trace
facility is used by applications to generate trace messages; and, finally, the Usage facility
provides applications with resource usage and network statistics. Furthermore, the SLEE
defines management interfaces using JMX and MBeans (6). These interfaces enable a
management application on an OAM node to access applications on remote SLEE Server
nodes.
To conclude our discussion about JAIN, it should be mentioned how the JAIN API framework relates to some other Java API efforts. The Open Mobile Alliance (OMA) [17] initiative is defining a set of Web services interfaces in WSDL [33, 36, 37] which complement
both the JAIN SS7 APIs and SLEE by providing SOAP-based [45, 46, 65] interfaces toward
network resources. Another Java API effort is the OSS through Java (OSS/J) [18] initiative
which is an umbrella initiative to provide OSSs with Java capabilities. The OSS/J APIs add
support for QoS management, trouble reports, telecom management, and billing to JAIN,
and thus supplement the JAIN OAM API. Finally, there is the SIP Servlet [9] technology
which, like JAIN SLEE, provides a platform-neutral application environment with transaction support. However, unlike JAIN SLEE, SIP Servlets are tightly coupled to the SIP
Figure 37: The principal parts of the JAIN SLEE architecture.
Figure 38: Call-control signaling, bearer signaling, and media paths in a softswitch architecture (MG = Media Gateway).
Figure 39: The H.323 architecture.
5.1 H.323
As briefly mentioned earlier, H.323 is not a call-control signaling protocol per se. Instead,
H.323 describes the principal logical components of a multimedia communication system
and further specifies how the components should communicate. Thus, H.323 is a framework specification which, in turn, references other ITU-T specifications for call signaling, control signaling, media transmission, and so on.
Figure 39 shows the H.323 architecture. It should be emphasized that this is a logical
architecture and that the components do not necessarily map to real physical devices.
As shown in Figure 39, a telecommunication network according to H.323 comprises one
or several zones. A zone is a logical demarcation and may straddle network segments that
are connected with routers, switches, or other network devices. It includes a gatekeeper and
at least one terminal. Optionally, it may include gateways and/or Multipoint Control Units
(MCUs).
A zone is administered and controlled by the gatekeeper. The gatekeeper performs the
following tasks:
Address Translation. Every device in a H.323 network has a network address that
uniquely identifies the device. Typically, in an IP environment, the address is an IP
address that is specified in the form of a URL. However, it is also possible to use E.164
addresses. A gatekeeper translates address aliases such as URLs and E.164 addresses
to IP addresses.
Admission Control. In a H.323 network, an end point, e.g., a terminal or gateway, has
to request access to the network before a call can be placed. A request for admission
specifies the bandwidth to be used by the end point, and the gatekeeper can choose to
accept or deny the request based on the bandwidth requested and the current network
state.
Bandwidth Management. Although bandwidth is initially provided through admission control, the bandwidth requirements may change during a call. The gatekeeper is
also responsible for mid-call bandwidth requests.
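The admission-control and bandwidth-management bookkeeping described above can be sketched as follows. The class and its simple remaining-capacity policy are illustrative assumptions, not part of the H.225 RAS specification:

```java
public class Gatekeeper {
    private long availableKbps;   // remaining zone capacity

    public Gatekeeper(long zoneCapacityKbps) {
        availableKbps = zoneCapacityKbps;
    }

    // Admission control: confirm (ACF, true) the requested bandwidth
    // only if it fits in the remaining capacity, else reject (false).
    public synchronized boolean admit(long requestedKbps) {
        if (requestedKbps > availableKbps) return false;
        availableKbps -= requestedKbps;
        return true;
    }

    // Mid-call bandwidth change: release the old reservation and try
    // the new one (simplified; a real gatekeeper would restore the old
    // reservation if the new request is rejected).
    public synchronized boolean rerequest(long oldKbps, long newKbps) {
        availableKbps += oldKbps;
        return admit(newKbps);
    }
}
```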
Optionally, a gatekeeper may provide call-control signaling. That is, a gatekeeper could
be the component responsible for routing call-signaling messages between H.323 end points.
Furthermore, a gatekeeper could perform call authorization, e.g., reject calls that originate
from certain addresses, or calls placed within certain time periods. A gatekeeper could also
handle call management and maintain information about all active calls. This information
could be used by the bandwidth-management function, or to re-route calls to different end
points to achieve load balancing.
As emphasized earlier, a gatekeeper is a logical component. In a physical network, the
gatekeeper could be a standalone device, but it could also be implemented as part of a gateway or MCU. Either way, the gatekeeper is commonly seen as the softswitch in the H.323
architecture.
The terminals and gateways in the H.323 architecture are typically referred to as the
end points since they are the components that originate and terminate signaling connections.
Terminals in H.323 could be anything ranging from a simple IP phone to a larger stationary
workstation. However, H.323 explicitly requires that all terminals must support the following
protocols:
The H.225 [58] call signaling protocol for call setup and release, and for Registration,
Admission, and Status (RAS) signaling.
The H.245 [61] control signaling protocol for exchanging terminal capabilities and for
the creation of media channels.
The Real-time Transport Protocol (RTP) and Real-time Transport Control Protocol
(RTCP) for media stream transport and control [79].
H.323 terminals must also support the G.711 [51] audio codec. Optional protocols in a terminal include additional audio codecs, video codecs, T.120 [52] data-conferencing protocols,
and MCU capabilities.
A gateway connects two dissimilar networks, typically a H.323 network with a PSTN
network. It provides translation of H.323 call-control protocols, i.e., H.225 and H.245, to,
e.g., SS7 and ISDN protocols. On the H.323 side, a gateway runs H.245 control signaling for
exchanging capabilities, H.225 call signaling for call setup and release, and H.225 RAS for
registration with the gatekeeper. On the other side, a gateway runs the signaling protocols of
the non-H.323 network, e.g., SS7 and ISDN protocols. A gateway may also perform media
translation, i.e., translation between different audio, video, and data formats. In a physical
network, a gateway could be co-located with a gatekeeper and/or MCU.
Figure 40: The H.323 protocol suite and its mapping onto the TCP/IP stack.
Figure 41: Time-sequence diagram for a gatekeeper-routed H.323 call setup.
To illustrate how the protocols in the H.323 protocol suite work together to accomplish a
call session, Figure 41 outlines the time-sequence diagram for a gatekeeper-routed call setup.
The reason we chose to show a gatekeeper-routed call, and not a directly routed call, is
that billing is much easier to accomplish in this type of call. Thus, we believe that this type
of call will prevail in future telecommunication networks.
It is assumed that the two end points, O (originating end point) and T (terminating end
point), have already registered with gatekeepers GO and GT, respectively. Furthermore, it is
assumed that the call only involves speech. The steps in the call setup are as follows:
(1) End point O sends an Admission ReQuest (ARQ) on the RAS channel to gatekeeper
GO and requests to make a call with a certain bandwidth. The admission is confirmed
(ACF) by GO.
(2) End point O sets up a H.225/Q.931 call signaling channel between itself and end point
T. This is done in several steps.
(a) End point O sends a Q.931 Setup request to GO, which, in turn, forwards the
request to end point T.
(b) End point T responds to the Q.931 Setup request by sending back a Q.931 Call
proceeding message to GO.
(c) End point T obtains admission for the call by issuing an admission request to
GT.
(d) Since gatekeeper-routed call signaling is used, end point T informs GO that the
call should be routed through GT. This is done with a Q.932 Facility message.
(e) GO releases the current H.225/Q.931 channel with end point T and sets up a new
channel which goes through GT. Note that this procedure also involves end point
T obtaining a new admission.
(f) When end point T, which typically is an IP phone, starts ringing, it sends back a
Q.931 Alert message to end point O.
(g) Later, when the called party answers the call, end point T sends back a Q.931
Connect message. This message sometimes contains the transport address for the H.245 control signaling.
(3) H.245 control signaling takes place between end points O and T.
(a) Terminal capabilities are negotiated.
(b) It is decided which of the end points is the master.
(c) Two logical audio channels are opened, one in each direction.
(4) Media communication takes place using RTP/RTCP.
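The message exchange above can be summarized as plain data. The following Python sketch lists the setup messages per channel; message and entity names follow the steps in the text, and the helper function is purely illustrative, not part of any H.323 stack.

```python
# Illustrative model of the gatekeeper-routed H.323 call setup described
# above. This is a sketch of the message flow, not an H.323 implementation.
CALL_SETUP = [
    # (step, channel, message)
    (1, "RAS",         "ARQ (O -> GO)"),
    (1, "RAS",         "ACF (GO -> O)"),
    (2, "H.225/Q.931", "Setup (O -> GO -> T)"),
    (2, "H.225/Q.931", "Call Proceeding (T -> GO)"),
    (2, "RAS",         "ARQ/ACF (T <-> GT)"),
    (2, "H.225/Q.931", "Facility: route via GT (T -> GO)"),
    (2, "H.225/Q.931", "Alert (T -> O)"),
    (2, "H.225/Q.931", "Connect (T -> O)"),
    (3, "H.245",       "Terminal capability negotiation"),
    (3, "H.245",       "Master/slave determination"),
    (3, "H.245",       "Open Logical Channel (one per direction)"),
    (4, "RTP/RTCP",    "Media communication"),
]

def messages_on(channel):
    """Return the setup messages carried on a given channel."""
    return [m for _, c, m in CALL_SETUP if c == channel]
```

Grouping the trace by channel makes the layering visible: admission control on RAS, call signaling on H.225/Q.931, and capability/channel negotiation on H.245, before any media flows.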
To conclude this description of H.323, it could be mentioned that recent versions of the
standard have been complemented with some other recommendations. Notably, H.450.1 [55]
specifies a new protocol for supplementary phone services in H.323. Other recommendations
in the H.450 series specify some common supplementary services such as call transfer, call
diversion, call park, and call hold. Also worth noting is the H.235 [60] security framework
for secure signaling in an H.323 network.
5.2 SIP
As briefly mentioned, SIP is a signaling protocol for initiating, managing, and terminating
multimedia sessions across IP networks. It can be run over any IP transport layer protocol,
e.g., TCP, UDP, and the Stream Control Transmission Protocol (SCTP) [84]. This is in sharp
contrast to H.323, which specifies a complete, vertically integrated system. Furthermore,
contrary to H.323, which is a binary peer-to-peer protocol, SIP is a text-encoded client-server protocol.
SIP, which was originally developed within the IETF Multiparty Multimedia Session
Control (MMUSIC) working group, forms part of the IETF's multimedia architecture effort.
As such, SIP is used in conjunction with several other IETF protocols, such as the Session
Description Protocol (SDP) [47, 70], RTP [79], the Media Gateway Control
Protocol (MGCP) [30, 41], and the MEdia GAteway COntrol (MEGACO)/H.248 [44, 62]
protocol.
Figure 42 pictures the elements of a SIP network. As shown, a SIP network is composed
of eight types of logical components: user agents, redirect servers, proxy servers, Back-to-Back User Agents (B2BUAs), registrars, location servers, presence servers, and events
servers. Each component has specific functions and participates in SIP communication as a
client, i.e., initiates requests, as a server, i.e., responds to requests, or as both. One physical device can have the functionality of more than one logical component. For example, a
network server that works as a proxy server might also function as a registrar.
User agents are client end-system applications that contain both user-agent client and
user-agent server functionality. Examples of physical devices that could be user agents include IP phones, workstations, telephony gateways, and various services such as automated
answering services. In a softswitch network, a user agent is typically configured with the
network address of the local redirect server, proxy server, or B2BUA. The redirect server
accepts a SIP request and maps the SIP address of the called party into zero (if there is no
known address) or more new addresses and returns them to the user agent. In contrast, a
proxy server does not return translated addresses to the user agent, but uses the addresses to
route the SIP request towards the destination user agent. It should be noted that a SIP request
may have to traverse several proxy servers on its way to a destination user agent.
It is useful to view proxy servers as SIP-level routers that forward SIP requests and
responses. However, SIP proxy servers employ routing logic that is commonly more sophisticated than plain routing-table forwarding. In particular, RFC 3261 [78] allows proxy servers
to perform actions such as validating requests, authenticating users, forking requests, resolving addresses, canceling pending calls, performing so-called record- and loose-routing, and handling routing loops.
Forking means that after having processed an incoming SIP request and resolved the destination address, the proxy server forwards the request to multiple addresses. Depending on how
the proxy server is configured, the forking could be parallel, sequential, or a mix of both. Record-routing is a SIP mechanism that allows SIP proxy servers to request being in the signaling
path of all future requests in a particular call, and loose-routing adds the possibility of having
several signaling paths in record-routing.
The RFC 3261 specification defines three types of proxy servers: stateless, stateful, and
call-stateful proxy servers. A stateless proxy is a simple message forwarder. When receiving
a SIP request, the stateless proxy processes the request without saving any state information.
[Figure 42: The elements of a SIP network: SIP user agents, B2BUAs, and SIP location servers within VoIP networks, with registration, subscription, location lookup, and routing interactions, and TRIP used to exchange routes between networks.]
invoked to resolve destination addresses. Location servers are not formally a SIP component;
however, they are still an important part of a SIP network. To store routable SIP addresses in
a location server, a user agent contacts a register server, or registrar. How the registrar, in turn,
uploads the SIP addresses to the location server is not specified. Some location servers use
the Lightweight Directory Access Protocol (LDAP) [49, 50, 85], others use CORBA [71].
Two more recent additions to the SIP network are the events and presence servers.
The events server is a general implementation of a notifier as prescribed by the event notification framework of RFC 3265 [76]. The notifier in this framework is responsible for receiving SIP event subscription requests, and sending notifications to subscribers when their
subscribed events have occurred. Presence is a service that allows a party to know the ability
and willingness of another party to participate in a call before a call attempt has been made.
A user interested in receiving presence information for another user, a so-called watcher, can
subscribe to his/her presence status at a presence server. The concept of a presence server
emanates from work within the IETF SIP for Instant Messaging and Presence Leveraging
Extensions (SIMPLE) working group to develop a framework architecture for presence and
instant messaging.
Typically, each operator has its own SIP network. To permit call control signaling between customers of different operators, the SIP networks have to exchange routing information. As illustrated in Figure 42, the IETF envisions the use of the Telephony Routing over IP
(TRIP) [77] protocol for this purpose. In TRIP, location servers communicate routing details
to location servers in both the same and different SIP networks using mechanisms similar
to those in the Border Gateway Protocol 4 (BGP-4) [75]. Examples of routing details
communicated include reachability of destinations and the routes towards these destinations,
and policy information. It should be noted that although TRIP was developed primarily for
SIP networks, it is not in any way dependent on SIP. In fact, TRIP could be used as a routing
protocol for H.323 networks as well.
There are only two types of messages in SIP: requests sent from a client to a server, and
responses sent in the opposite direction. The RFC 3261 specification defines six SIP request
types or methods. The six methods are as follows:
INVITE. This method initiates a call session, and invites other user agents or servers
to participate in the session. It includes a session description and, for two-party calls,
a description of the media the calling party wants to use in the session, e.g., G.711-encoded audio over RTP.
ACK. This method is used to acknowledge the reception of a final response to an INVITE. (The meaning of a final response will be explained below.) A client originating
an INVITE request issues an ACK request when it receives a final response for the
INVITE.
OPTIONS. The OPTIONS method makes it possible for a calling party to query a
called party about its capabilities in terms of supported SIP methods and media.
BYE. This method is used by a party in a call session to abandon the session.
CANCEL. This method cancels pending transactions. For example, if a SIP server
has received an INVITE but not yet returned a final response, it will stop processing
the INVITE upon receipt of a CANCEL.
REGISTER. A user agent sends a REGISTER request to a registrar to update the
location server about its current location.
Apart from these methods, a number of extensions have been added in RFCs and proposed in
Internet drafts. These include methods for event subscription and notification, SUBSCRIBE
and NOTIFY; methods for mid-call signaling; and a method, COMET, to ensure that certain
preconditions, such as QoS requirements, are met.
SIP responses are sent in response to SIP requests and indicate the outcome of the request.
They are represented by three-digit status codes, and are classified with respect to their most
significant digit. There are six classes of SIP responses:
100 Informational,
200 Success,
300 Redirection,
400 Client error,
500 Server failure, and
600 Global failure.
The informational SIP responses are used to indicate progress but do not terminate a SIP
transaction. The remaining classes of SIP responses are final, i.e., terminate SIP transactions.
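The classification rule can be stated in a few lines of code. The sketch below assumes well-formed three-digit status codes and simply keys on the most significant digit, flagging everything from 200 upward as final.

```python
# Classify a SIP status code by its most significant digit, per the six
# response classes listed above.
CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client error",
    5: "Server failure",
    6: "Global failure",
}

def classify(status):
    """Return (class name, is_final) for a three-digit SIP status code."""
    name = CLASSES[status // 100]
    # Only 1xx responses are provisional; responses in all other classes
    # are final and terminate the SIP transaction.
    return name, status >= 200
```

For example, `classify(180)` yields an informational, non-final result, while `classify(486)` yields a final client-error result.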
The structure of SIP messages is to a large extent influenced by HTTP. Figure 43 pictures
the structure of SIP messages. As depicted, SIP messages are composed of the following
three parts:
Start Line. Every SIP message begins with a start line. The start line conveys the
message type, i.e., the method type in requests and the status code in responses, and the
protocol version. Furthermore, in requests, the start line includes a request Uniform
Resource Identifier (URI) which gives the SIP address of the called party.
Header Fields. SIP header fields are used to convey message attributes and to modify
message meaning. They are similar in syntax and semantics to HTTP header fields.
In fact, some headers are borrowed from HTTP. Some examples of key SIP headers
include:
Via. Indicates the route taken by a SIP request/response.
From. Identifies the originator of a SIP request/response.
To. Identifies the recipient of a SIP request/response.
Call-ID. The Call-ID contains a unique identifier for a particular call session.
All requests and responses during this call session will contain this same Call-ID. The Call-ID is, for example, used by a SIP proxy server to keep track of several
simultaneous call sessions.
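As a concrete illustration of these parts, the following Python sketch assembles a minimal, hypothetical INVITE request. The addresses are made up, and several headers a real SIP stack would include (e.g., Contact and Max-Forwards) are omitted for brevity.

```python
# Build a minimal, hypothetical SIP INVITE from the three parts described
# above: start line, header fields, and an (empty) body. This is an
# illustration of the message layout, not a compliant SIP implementation.
def build_invite(caller, callee, call_id):
    start_line = f"INVITE sip:{callee} SIP/2.0"
    headers = [
        "Via: SIP/2.0/UDP host.example.com:5060",  # made-up sending host
        f"From: <sip:{caller}>",
        f"To: <sip:{callee}>",
        f"Call-ID: {call_id}",
        "CSeq: 1 INVITE",
        "Content-Length: 0",
    ]
    # As in HTTP, lines are CRLF-terminated, and a blank line separates
    # the header fields from the (here empty) body.
    return "\r\n".join([start_line, *headers, "", ""])

msg = build_invite("anna@emca.com", "kjgr@acme.com", "12345@host.example.com")
```

The same layout applies to responses; only the start line changes, e.g., to `SIP/2.0 200 OK`.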
[Figure 43: The structure of SIP messages: a start line, header fields, and a body, shown for a SIP request and a SIP/2.0 200 OK response; both carry Call-ID, CSeq, Content-Type, and Content-Length headers and an SDP session description as body.]
[Figure 44: Time-sequence diagram for a SIP call setup between user agents O and T via proxy servers PO and PT, with a location server LO: INVITE, 100 Trying, 183 Progress, 180 Ringing, 200 OK, ACK, media communication over RTP/RTCP, and finally BYE and OK.]
(5) When user agent T receives the INVITE request from PT, it sends a Ringing response
back to user agent O via PT and PO.
(6) The called party answers the call, which results in an OK response being sent back to
user agent O. Again, the response is routed via PT and PO.
(7) When user agent O receives the OK, it sends an ACK request, via PO and PT, to user
agent T. Now, the call setup is completed and media begins to flow.
(8) The called party abandons the call session, and a BYE request is sent from user agent
T, via PT and PO, to user agent O.
(9) User agent O responds to the BYE request with an OK response. The call session ends
when user agent T receives the OK response.
6. Bearer Signaling
As mentioned in the introduction to Section 5, bearer signaling denotes the type of signaling taking place between softswitches and MGs. Figure 45 illustrates the use of bearer
signaling. The softswitch acts as a Media Gateway Controller (MGC, cf. Section 3) which
controls several associated MGs. The MGs translate media data between the VoIP and PSTN
networks. Acting as an MGC, the softswitch directs the MGs as to which TDM time slot is
connected to which RTP stream. It may also direct the MGs to transcode media from one
format to another, or to mix various media streams together. Since bearer signaling is used by
softswitches/MGCs to control MGs, it is also referred to as gateway control signaling, and
the corresponding protocols as gateway control protocols.
Gateway control protocols have had a long and convoluted history. In the beginning of
1998, there were several competing proposals; however, the dominating one was the Media
Gateway Control Protocol (MGCP). Toward the end of 1998, the IETF formed the MEdia
GAteway COntrol (MEGACO) working group with the charter to propose a single gateway
control protocol. Since MGCP was the dominating gateway control protocol at that
time, there was strong support for making this protocol the IETF standard. However, the
MEGACO group never accepted MGCP as their choice. The closest to a standard MGCP
came was an informational RFC, RFC 3435 [30]. Instead, key aspects of MGCP, along
with many other inputs, were integrated in a new protocol, the MEGACO gateway control
protocol.
Parallel to the efforts of the IETF, the ITU-T study group SG-16 initiated work on an
H-series gateway control protocol, at that time called H.GCP but later designated H.248. To
avoid ending up with two differing and incompatible protocols, the IETF and ITU-T SG-16
began to work on a compromise between the MEGACO protocol and H.GCP. In the
summer of 1999, an agreement was reached between the two organizations to create an international standard, the MEGACO/H.248 protocol. During the following year, considerable
effort was made to merge the two standards, and in June 2000, the MEGACO/H.248 [44, 62]
protocol was approved by both standards bodies. Today, an overwhelming majority of vendors and operators envision the MEGACO/H.248 protocol as the bearer protocol of the next-generation telecommunication network.
[Figure 45: The use of bearer signaling: a softswitch/MGC controls MGs that translate between RTP streams in the VoIP network and TDM streams in the PSTN, with H.323/SIP signaling toward IP phones and a signaling gateway (SG) toward the PSTN switch.]
[Figure 46: The MEGACO/H.248 connection model: an MG hosting contexts (C1, C2), each containing terminations for RTP and TDM streams.]
[Figure 47: The structure of a MEGACO/H.248 message: a message contains transactions, each transaction contains actions, and each action contains commands.]
Figure 48: MEGACO/H.248 signaling during a call setup between two MGs.
It is assumed that users O and T are PSTN users and attached to their respective MG through persistent terminations.
The commands are as follows:
(1) User O at MGO picks up the phone, and a Notify command is sent to the MGC.
(2) The MGC instructs MGO, via a Modify command, to play a dial tone and to collect
dialed digits.
(3) When user O has dialed the phone number of the called party, user T connected to
MGT, a Notify command with the phone number of user T is sent to the MGC.
(4) The MGC creates a connection between MGO and MGT by issuing two Add commands and a Modify command. The Modify command is required to complement the
media settings of MGO with the settings of MGT. As soon as the connection between
MGO and MGT has been set up, MGT instructs the phone at user T to ring.
(5) When user T answers its phone, a Notify command is sent from MGT to MGC.
(6) The MGC sets up the media stream between MGO and MGT by issuing two Modify
commands, one to each MG. Encapsulated within each Modify command is
a request to stop the ringing signal and to notify about any on-hook event.
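The command sequence above can be captured as a simple trace. Entity and command names follow the text and Figure 48; real H.248 messages additionally carry transaction, context, and termination identifiers that this sketch omits.

```python
# Illustrative trace of the MEGACO/H.248 command flow in the call setup
# described above. A sketch, not an H.248 encoder.
COMMANDS = [
    # (step, direction, command, purpose)
    (1, "MGO -> MGC", "Notify", "off-hook event from user O"),
    (2, "MGC -> MGO", "Modify", "play dial tone, collect digits"),
    (3, "MGO -> MGC", "Notify", "dialed digits (number of user T)"),
    (4, "MGC -> MGO", "Add",    "create termination for the connection"),
    (4, "MGC -> MGT", "Add",    "create termination; T's phone rings"),
    (4, "MGC -> MGO", "Modify", "complement MGO media settings with MGT's"),
    (5, "MGT -> MGC", "Notify", "user T answered"),
    (6, "MGC -> MGO", "Modify", "stop ringing signal, watch for on-hook"),
    (6, "MGC -> MGT", "Modify", "stop ringing signal, watch for on-hook"),
]

def issued_by_mgc():
    """Commands the MGC sends to its gateways (everything but the Notifys)."""
    return [c for c in COMMANDS if c[1].startswith("MGC ->")]
```

The master/slave split is visible in the trace: the MGs only ever report events with Notify, while all Add and Modify commands originate at the MGC.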
7.1 SCTP
Since the IETF traditionally takes a rather conservative standpoint on new TCP/IP transport protocols, the development of the Stream Control Transmission Protocol (SCTP) was not initially an obvious choice. In fact, as a first step, the SIGTRAN working group evaluated
the two common transport protocols of the TCP/IP stack, UDP and TCP [80]. UDP was quickly ruled out since it did not meet the requirement of reliable,
in-order delivery. TCP, on the other hand, met this basic requirement but was found
to have some other severe limitations:
Head-of-Line Blocking (HoLB). TCP imposes a strict order-of-transmission on sent
data. This is too confining for SS7 signaling traffic. In particular, it creates an artificial ordering between independent signaling message flows, and thus lets time delays
due to packet losses and retransmissions in one flow impair the timely delivery of
the remaining flows sent over the same TCP connection. For example, consider a TCP
Figure 49: Interworking between a softswitch VoIP network and a legacy SS7 circuitswitched network according to SIGTRAN.
[Figure 50: The SIGTRAN protocol architecture: an adaptation component running over SCTP over IP.]
Figure 51: Head-of-line blocking in a TCP connection with simultaneous telephone call
attempts.
connection over which three simultaneous telephone call attempts are made (see Figure 51). The ISUP IAM message of call #1 is lost, which, by necessity, delays this
call attempt. However, due to TCP's order-of-transmission requirement, it delays the
remaining two call attempts as well. According to the study in [80], a packet-loss
frequency of 1% could delay 9% of subsequent packets by more than a one-way transfer
time.
Timer Granularity. The computation of the retransmission timer in TCP is commonly done using a coarse, non-tunable system clock. Although this is actually not a
limitation of the TCP protocol per se, it is indeed a limitation of most TCP implementations.
Availability and Reliability. TCP takes a prohibitively long time to detect connection
failures, and offers no mechanisms to recover from end point failures such as failed
network interfaces.
Message Boundaries. TCP is byte oriented and treats each data transmission as an
unstructured sequence of bytes. Thus, it would force SS7 signaling protocols to explicitly insert and track message boundaries.
Security. TCP hosts are susceptible to blind Denial-of-Service (DoS) attacks by SYN
packets.
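The head-of-line blocking effect described above can be illustrated with a toy delivery model. The timing assumptions below (one time unit per segment, a fixed retransmission delay) are an arbitrary simplification for illustration, not a TCP simulation.

```python
# Toy illustration of head-of-line blocking (cf. Figure 51): IAM messages
# from independent calls share one ordered TCP byte stream, so a lost
# segment delays every segment behind it until the retransmission arrives.
def tcp_deliver(segments, lost, rtx_delay):
    """Return {message: delivery time}. The segment at index `lost` is
    lost; it and all later segments wait for its retransmission."""
    times, t = {}, 0
    for i, msg in enumerate(segments):
        t += 1  # nominal per-segment transfer time
        if i >= lost:
            times[msg] = t + rtx_delay  # blocked behind the lost segment
        else:
            times[msg] = t
    return times
```

With `tcp_deliver(["IAM#1", "IAM#2", "IAM#3"], lost=0, rtx_delay=4)`, losing the first IAM delays all three call attempts; with `lost=2`, only the third is delayed.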
To overcome the above limitations of TCP, the SIGTRAN working group concluded that
a new transport protocol was necessary, and SCTP was ratified as a standard in October 2000. Although SCTP is a new transport protocol, separate from TCP, it inherits many
of its properties from TCP. Like TCP, SCTP provides a connection-oriented, reliable transport service on top of IP. It uses window-based congestion- and flow-control mechanisms
Figure 52: Avoiding HoLB in SCTP by sending simultaneous telephone call attempts over
separate streams.
that essentially work the same as the ones used in TCP SACK. In particular, a selective retransmission scheme is employed to correct packet losses and errors. However, unlike TCP,
and to address the shortcomings of TCP, SCTP also supports the following features:
Multiple Delivery Modes. SCTP supports several modes of delivery, including strict
order-of-transmission (like TCP), unordered (like UDP), and partially ordered delivery. The partially ordered delivery mode is provided through multi-streaming. The
multi-streaming feature of SCTP separates and transmits messages, or chunks, on multiple, logically independent streams. Streams are the facility offered by SCTP to send
separate signaling message flows on the same connection independently of each
other, and thus avoid unnecessary HoLB. Each stream provides reliable in-order delivery of messages, while no ordering is imposed between streams. Figure 52 illustrates the use of multiple streams by revisiting the example in Figure 51,
this time using an SCTP connection. Although the IAM message of call #1
is lost, it does not prevent the other two IAM messages from being delivered.
Tunable Timeout Settings. Although SCTP, like TCP, utilizes a non-tunable system
[Figure 53: An SCTP association between end point A (IP Address A) and a multi-homed end point B, reachable at several IP addresses across the IP network.]
the retransmission timer expires, SCTP increases an error counter for the primary
path and retransmits the message chunk. The primary path is considered unreachable
if the error counter reaches a predefined threshold, Path.Max.Retrans. Since
message chunks are not normally sent on a regular basis on alternate paths, another
reachability mechanism is used there. A special heartbeat chunk is sent periodically
on these paths, based on a configured heartbeat timer. Each time the retransmission
timer expires on a heartbeat chunk, the error counter of the corresponding path is
incremented. Again, when the error counter reaches Path.Max.Retrans, the path
is considered unreachable. The error counters of both the primary and alternate paths
are reset to zero each time a message or heartbeat chunk is successfully acknowledged.
SCTP also monitors the availability of the end points. Each end point monitors the
availability of its peer by keeping an error counter. This error counter keeps track of
the total number of consecutively missed acknowledgements for message and heartbeat chunks on all paths between the end point and its peer. When this error counter
reaches a predefined threshold, Association.Max.Retrans, the peer is considered unavailable. This will in effect bring an end to the whole association.
Message Boundary Preservation. SCTP preserves the message-framing boundaries
of applications by placing messages inside one or more chunks. Large messages are
partitioned into multiple chunks.
DoS Protection. To mitigate the impact of DoS attacks, SCTP employs a security
cookie mechanism during the establishment of an association.
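Two of the mechanisms above, per-stream in-order delivery and error-counter-based path monitoring, can be sketched as follows. This is a strong simplification: per-stream sequence numbers are assumed to start at zero, timer expirations are modeled as explicit events, and acknowledgements reset the counters per path; a real SCTP implementation tracks TSNs, SSNs, and timers in considerably more detail.

```python
class StreamReceiver:
    """Per-stream in-order delivery: a lost chunk blocks only its own
    stream, not the others (cf. the three IAM messages in Figure 52)."""
    def __init__(self):
        self.buffers = {}   # stream id -> {seq: message}
        self.expected = {}  # stream id -> next expected sequence number

    def receive(self, stream, seq, msg):
        """Buffer a chunk; return the messages now deliverable in order
        on this stream (other streams are unaffected)."""
        self.buffers.setdefault(stream, {})[seq] = msg
        nxt, out = self.expected.get(stream, 0), []
        while nxt in self.buffers[stream]:
            out.append(self.buffers[stream].pop(nxt))
            nxt += 1
        self.expected[stream] = nxt
        return out

class PathMonitor:
    """Error counters in the spirit of Path.Max.Retrans and
    Association.Max.Retrans, fed by retransmission/heartbeat timeouts."""
    def __init__(self, paths, path_max_retrans=5, assoc_max_retrans=10):
        self.errors = {p: 0 for p in paths}
        self.assoc_errors = 0
        self.path_max, self.assoc_max = path_max_retrans, assoc_max_retrans

    def timeout(self, path):
        """A retransmission or heartbeat timer expired on `path`."""
        self.errors[path] += 1
        self.assoc_errors += 1

    def ack(self, path):
        """A message or heartbeat chunk was acknowledged on `path`."""
        self.errors[path] = 0
        self.assoc_errors = 0

    def path_unreachable(self, path):
        return self.errors[path] >= self.path_max

    def peer_unavailable(self):
        return self.assoc_errors >= self.assoc_max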
Figure 54: The distinguishing features of peer-to-peer and user adaptation protocols.
the functionality of a single SCP or SEP may be distributed over several softswitches/MGCs,
while not so with peer-to-peer adaptation protocols.
Currently, there is only one peer-to-peer adaptation protocol defined: the MTP-L2 Peer-to-peer Adaptation protocol (M2PA) [43]. As the name suggests, this protocol emulates the
MTP-L2 layer of the SS7 stack. There are four user adaptation protocols defined:
MTP-L2 User Adaptation Layer (M2UA). The M2UA [66] protocol is primarily
defined for the transport of MTP-L2 user signaling, i.e., MTP-L3, between a SG and
206
SS7
SUA
IUA
V5UA
DUA
M3UA
M2PA
M2UA
SCTP
IP
a softswitch/MGC.
MTP-L3 User Adaptation Layer (M3UA). The M3UA [81] protocol is primarily
defined for the transport of MTP-L3 user signaling, e.g., ISUP and SCCP, between a
SG and a softswitch/MGC.
SCCP User Adaptation Layer (SUA). The SUA [64] protocol is primarily defined
for the transport of SCCP applications, such as TCAP and RANAP, between a SG and
a softswitch/MGC.
ISDN Q.921 User Adaptation Layer (IUA). The IUA [67] protocol is defined for
the transport of Q.931 ISDN signaling between a SG and a softswitch/MGC. Two
extensions to IUA have been defined: the V5.2 User Adaptation layer (V5UA) [86],
and the Digital Private Network Signaling System/Digital Access Signaling System 2
User Adaptation layer (DUA) [68] for transport of V5.2 access signaling and Private
Branch Exchange (PBX) signaling, respectively.
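The adaptation layers above can be summarized as a small lookup table; the one-line descriptions paraphrase the text.

```python
# The SIGTRAN adaptation protocols and what each one carries over SCTP/IP,
# paraphrasing the descriptions in the text.
ADAPTATION = {
    "M2PA": "MTP-L2 peer: emulates SS7 links over IP",
    "M2UA": "MTP-L2 user signaling, i.e., MTP-L3",
    "M3UA": "MTP-L3 user signaling, e.g., ISUP and SCCP",
    "SUA":  "SCCP applications, e.g., TCAP and RANAP",
    "IUA":  "ISDN Q.931 signaling",
    "V5UA": "V5.2 access signaling (IUA extension)",
    "DUA":  "PBX signaling (IUA extension)",
}
```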
Figure 55 shows the complete SIGTRAN protocol suite as it looks at the time of this
writing. The remainder of this section provides a more detailed description of M2PA and the
user adaptation protocols M2UA, M3UA, and SUA. IUA and its extensions are not discussed
any further since they have, as yet, not found any widespread use.
7.3 M2PA
M2PA allows operators to keep their existing network topology (i.e., SSPs, STPs, etc.) and
use IP to transport their SS7 messages instead of traditional TDM-based links. All
other elements of the legacy SS7 network remain the same, except that the signaling links
are now virtual. M2PA simply changes the transport to IP, and in that respect enables a
first, very conservative step towards an IP-based telecommunication network. M2PA also
provides a means for peer SS7 MTP-L3 layers in SGs to communicate directly, a setup
typically used for SS7 bypass signaling, where a managed IP network is run in parallel with a
highly loaded legacy SS7 network to offload signaling traffic. MTP-L3 is present on each SG
to provide routing and management of the MTP-L2/M2PA links. Because of the presence of
MTP-L3, each SG has its own SS7 point code. Figure 56 illustrates the two discussed use
cases of M2PA.
Since M2PA is a peer-to-peer adaptation protocol, it has basically the same responsibilities as MTP-L2. This means, among other things, that M2PA is responsible for MTP-L2
chores such as link activation/deactivation; maintenance of link status information; maintenance of sequence numbers and retransmit buffers for MTP-L3; and last, but not least,
maintenance of local and remote processor outage status.
7.4 M2UA
Figure 57 depicts the principal use case of M2UA. As already mentioned, M2UA is commonly used to transfer MTP-L2 user data between an MTP-L2 instance on a SG and an
MTP-L3 instance on a softswitch/MGC. Since M2UA is a user adaptation protocol, there is
a client-server relationship between the M2UA instance on the softswitch and the MTP-L2
instance on the SG. Basically, M2UA provides a means by which an MTP-L2 service may be
provided on a softswitch. Neither the MTP-L2 instance on the SG nor the MTP-L3 instance
on the softswitch is aware that they are remote from each other. Further, since the SG has
no MTP-L3 layer of its own, it has no SS7 point code. In fact, the SG is transparent to SS7
in the PSTN/PLMN network, and routing is instead done on the basis of the softswitches,
which do have SS7 point codes.
M2UA is typically used in the following cases:
SS7 links are physically remote from each other, which has resulted in a large number
of separate SGs. In this case, M2UA makes it possible for a single softswitch/MGC
to support several SGs. Since only the softswitch/MGC needs to have a point code,
the use of M2UA in this case conserves point codes, a scarce resource in today's
PSTN/PLMN networks.
There is a low density of SS7 links at a particular physical point in a legacy SS7
network. By using M2UA, an IP network may complement the legacy SS7 network at
this point in the network.
The SG function is co-located with an MG.
Figure 57 depicts M2UA as a peer to MTP-L2 in the SG. However, in many ways M2UA
is a user of MTP-L2. M2UA is responsible for initiating actions which would normally be
[Figure 56: The two use cases of M2PA: SS7 SEPs interconnected over IP via SGs running MTP-L3/M2PA/SCTP/IP in place of TDM-based links, and SS7 bypass signaling between SGs run in parallel with the legacy PSTN/PLMN network.]
[Figure 57: The principal use case of M2UA: MTP-L3 on a softswitch/MGC backhauled over SCTP (MTP-L3 over SCTP) to MTP-L2 on a SG with a nodal interworking function (NIF), connecting SEPs and STPs in the PSTN/PLMN.]
7.5 M3UA
At the present time, M3UA is the adaptation protocol offering the broadest functional
coverage. It is also the adaptation protocol selected by the majority of telecom equipment
manufacturers and operators. Furthermore, M3UA is the only adaptation protocol included
in 3GPP Release 5, the 2002 release of the standards for third-generation cellular networks. Like M2UA, M3UA is typically used between a SG and a softswitch/MGC. Figure 58
shows this use case. The SG receives SS7 signaling using the SS7 Message Transfer Parts
(MTPs) as transport over a standard SS7 link. The SG terminates MTP-L2 and MTP-L3.
[Figure 58: The principal use case of M3UA: an SS7 user part, e.g., ISUP, on a softswitch/MGC transported over SCTP to a SG with a nodal interworking function (NIF) that terminates MTP toward SEPs and STPs in the PSTN/PLMN.]
[Figure 59: A Routing Key (RK) example: the SG's routing database maps DPC 1.1.1 to RK-A, serving the MGC-A/MGC-B cluster, and DPC 1.1.2 to RK-C, serving MGC-C.]
At the softswitch, the M3UA layer maintains the status of configured SS7 destinations accessible via each SG, and routes messages accordingly. At the SG, the M3UA layer provides interworking with MTP-L3 management functions to support seamless operation of signaling between the SS7 and IP networks.
For example, the M3UA layer at the SG indicates to its supported softswitches when an SS7
signaling point is reachable or unreachable, or when SS7 network congestion occurs. Additionally, the M3UA layer at one of the supported softswitches may explicitly request the
state of a remote SS7 destination reachable via the SG by querying the SG M3UA layer.
Since MTP-L3 is terminated at the SG, SS7 point code routing ends at the SG. Routing in
the IP network is instead done using something called Routing Keys (RKs). That is, the SG
routes messages from the legacy PSTN/PLMN network to the appropriate softswitch in the
IP network using RKs. The RK is defined as a set of SS7 parameters and parameter values
that uniquely specify a destination for SS7 traffic in the IP network. Specifically, a RK is
used to route SS7 messages from the SG to a particular softswitch or cluster of softswitches.
As an example, it could be mentioned that the Cisco IP Transfer Point [23] permits RK
assignments for M3UA on the basis of the DPC, OPC, and SI (cf. Section 2).
Figure 59 provides an RK example. The STP denoted STP-A forwards an ISUP message
with DPC 1.1.1 to the SG. The SG looks up the DPC in its routing database and finds that
it matches the RK, RK-A. On the basis of this RK, it then routes the ISUP message to the
softswitch denoted MGC-A. Let us now assume that MGC-A becomes unreachable, and that
yet another ISUP message with DPC 1.1.1 arrives at the SG. Since MGC-A is clustered with
the softswitch denoted MGC-B, and thus shares its RK, the SG will re-route all traffic
normally destined for MGC-A to MGC-B. This illustrates one of the strengths of RKs:
they decouple softswitches from point codes and thus enable SS7-transparent management
of the IP network, e.g., transparent failover and load sharing.
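The RK lookup and failover behavior just described can be sketched in a few lines. This is an illustrative toy model, not code from any SG product: the table layout, the RK and softswitch names, and the reachability flags are all invented for illustration.

```python
# Illustrative sketch (not taken from the report): how an SG might resolve a
# Routing Key and fail over within a softswitch cluster. All names, parameter
# sets, and the table layout are invented for illustration.

ROUTING_KEYS = {
    # RK name -> matching SS7 parameter values and the cluster sharing the RK
    "RK-A": {"match": {"dpc": "1.1.1"}, "cluster": ["MGC-A", "MGC-B"]},
}
REACHABLE = {"MGC-A": True, "MGC-B": True}

def route(message):
    """Find the RK whose parameter values match the message, then pick the
    first reachable softswitch in that RK's cluster."""
    for rk_name, rk in ROUTING_KEYS.items():
        if all(message.get(k) == v for k, v in rk["match"].items()):
            for mgc in rk["cluster"]:
                if REACHABLE.get(mgc):
                    return rk_name, mgc
            raise RuntimeError(f"{rk_name}: no reachable softswitch")
    raise RuntimeError("no matching Routing Key")

assert route({"dpc": "1.1.1"}) == ("RK-A", "MGC-A")   # normal operation
REACHABLE["MGC-A"] = False                            # MGC-A becomes unreachable
assert route({"dpc": "1.1.1"}) == ("RK-A", "MGC-B")   # transparent failover
```

Because the point code only selects the RK, and the RK in turn selects any reachable member of the cluster, the failover is invisible to the SS7 side.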
7.6 SUA
SUA emulates the services of SCCP by providing support for reliable transfer of SCCP user
messages, including support for both connectionless and connection-oriented services. It
also provides SCCP management services to, for example, manage SCCP subsystems. As is
illustrated in Figure 60, SUA typically provides a means by which an SCCP user, e.g., TCAP
or RANAP, on a softswitch/MGC may be reached via an SG. From the perspective of an SS7
signaling point, the SCCP user is located at the SG. An SCCP message is routed to the SG
based on the point code and the SCCP subsystem number. The SG then translates the point
code and subsystem number of the SCCP messages to the corresponding RK, and routes the
SCCP messages to the appropriate softswitch/MGC. If an SCCP message contains a global
title, the SG may also perform global title translation before the RK translation.
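The two-stage translation just described (an optional global title translation, followed by the RK translation) can be sketched as follows. All table contents here — the names, point codes, subsystem numbers, and global titles — are invented for illustration:

```python
# Hypothetical sketch of the SG-side translation described above; all table
# contents (names, point codes, subsystem numbers, global titles) are invented.

GTT_TABLE = {"46701234567": ("2.2.2", 146)}     # global title -> (PC, SSN)
RK_TABLE = {("2.2.2", 146): "RK-B"}             # (PC, SSN) -> Routing Key
SOFTSWITCH = {"RK-B": "MGC-A"}                  # Routing Key -> softswitch/MGC

def route_sccp(msg):
    # Optional global title translation before the RK translation.
    if "global_title" in msg:
        pc, ssn = GTT_TABLE[msg["global_title"]]
    else:
        pc, ssn = msg["pc"], msg["ssn"]
    rk = RK_TABLE[(pc, ssn)]
    return SOFTSWITCH[rk]

assert route_sccp({"pc": "2.2.2", "ssn": 146}) == "MGC-A"
assert route_sccp({"global_title": "46701234567"}) == "MGC-A"
```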
8 Future Outlook
As mentioned in the introduction, the softswitch solution constitutes the first step along the
migration path towards the next-generation, IP-based, multi-service telecommunication network. Although only the future can tell with certainty what the next steps will be, several
standardization bodies have proposed, or are in the process of proposing, reference architectures for the next-generation network, including:
The International Packet Communication Consortium (IPCC). The IPCC [4] is a
continuation of the International Softswitch Consortium (ISC), which was founded in
1998. It is an international industry association dedicated to accelerating the deployment of voice and video over IP in wireline, wireless, and cable networks. Its
member list includes vendors as well as government agencies.
The MultiService Forum (MSF). Founded in 1998 by Cisco Systems, Worldcom,
and Telcordia, the mission of MSF [16] is not so much to develop new standards as
to bring together existing standards into a holistic network and services architecture.
MSF members include both vendors and operators.
ITU-T. In 2001, ITU-T started a new initiative, the Next Generation Network (NGN).
The aim of this initiative was to develop guidelines and standards for the next-generation
telecommunication network. In 2004, the work of the NGN initiative was
transferred to the Focus Group on Next Generation Networks (FGNGN) [3].
Figure 60: SUA interworking between the SS7 and IP networks. SEPs and STPs in the PSTN/PLMN reach an SCCP user (e.g., TCAP) on an MGC by carrying it over SCTP in the IP network; at the SG, a NIF couples the SS7 stack (SCCP over MTP-L3/L2/L1) with the SUA/SCTP/IP stack towards the MGCs.
ETSI. In 2003, ETSI merged its TIPHON and SPAN bodies into the Telecommunications and Internet converged Services
and Protocols for Advanced Networking (TISPAN) [20]. The main objective of
TISPAN is basically the same as it was for TIPHON: the standardization of a multi-service, multi-protocol, and multi-access network based on IP.
3GPP and 3GPP2. Both 3GPP [28] and 3GPP2 [1] were born out of ITU-T's International Mobile Telecommunications Initiative 2000 (IMT-2000) [5] to standardize
third-generation wireless communications. However, due to their success, and the fact
that the next-generation, IP-based core network is envisioned to be shared by fixed and
cellular communication, their scope has been extended. Today, 3GPP and 3GPP2 are
collaborating with ETSI TISPAN and others in standardizing the next-generation core
signaling system for wireless and wireline networks, the IP Multimedia Subsystem
(IMS) [29].
On the basis of these standardization efforts, a three-step migration path, as depicted in
Figure 61, has emerged:
Step 1: The Softswitch Solution. The first migration step, which is the step portrayed
in the foregoing sections of this report, first and foremost aims at reducing capital and
operating expenditures for operators. This step, as previously shown, involves the introduction of the softswitch, which enables the separation of application functions, call
control, and connectivity. One of the most important benefits of the softswitch solution
is that it enables the reuse of equipment from the traditional TDM-based telecommunication network, especially in the access network. When the softswitch solution is
introduced, access equipment can gradually be moved from circuit-switched nodes to
MGs.
Step 2: The IP Multimedia Service Introduction. While the first step primarily
aims at reducing capital and operating expenditures for operators, and gives less tangible benefits to the end users, the second migration step introduces a new IP-based
signaling subsystem, the IMS subsystem, that not only makes new multimedia services feasible, but also greatly facilitates the trend of fixed/cellular convergence. For
example, the cellular operator Orange has disclosed that it will use IMS to win
customers from British Telecom (BT) in the UK. Specifically, Orange will offer a
combination of wireless (using GSM) and wired (using VoIP over a Digital Subscriber
Line (DSL) technology) services provisioned and controlled by a common IMS infrastructure.
Step 3: Converged IMS-based Architecture. Although IMS was introduced in Step
2, it is envisioned that the demand for traditional PSTN services will continue, and
that a full migration to IMS is likely to take several years. Thus, in Step 3, as aging
access equipment is being replaced, Access Gateways (AGs) are deployed in the
network. The AGs provide telephony services over IP networks, and are controlled
via some bearer control protocol, such as MEGACO/H.248, by a softswitch. The
softswitches from Step 1 will remain in the network as long as there is a demand
for traditional PSTN services, and not until that demand disappears will they be completely replaced
by IMS. This protects investments and enables a smooth migration, on a port-by-port
basis, from the complete PSTN service set provided by the softswitch to IMS.
Figure 61: The three-step migration path. Step 1: softswitches with SGs and MGs interconnect the PSTN and PLMN over an IP network. Step 2: an IMS subsystem and VoIP access are introduced alongside the softswitches. Step 3: a converged IMS-based architecture in which AGs provide telephony services over IP. AG = Access Gateway; IMS = IP Multimedia Subsystem; MG = Media Gateway; PLMN = Public Land Mobile Network; PSTN = Public Switched Telephone Network; SG = Signaling Gateway; VoIP = Voice over IP.
From a signaling perspective, it appears that IMS is the key component of the next-generation network. In essence, IMS is an architecture for establishing, maintaining, and
tearing down a SIP session between two user agents (cf. Section 5.2). Although IMS is
envisioned to be a common signaling architecture for both fixed and cellular communication,
it is currently only defined for cellular communication, and then in particular for 3G UMTS
networks.
Since IMS has its roots in cellular communication, a key distinction is made between
the home and the visited network of an IMS user, e.g., a cellular phone. The main task for
Figure 62: The visited network provides connectivity to the home network.
the visited network is to provide connectivity to the home network, while it is the home
network that hosts user data, session control, services, and applications (see Figure 62).
Services are always controlled from the home network, regardless of which visited network
the user is roaming in. The advantage of this approach is that it
limits the functional and protocol dependencies between the home and visited networks, and
thereby minimizes the restrictions imposed on the services that can be deployed in the home
network. As a side effect, it also increases the rate at which services can actually be deployed.
Although this approach means that all control signaling goes through the home network, the
bearer traffic is routed independently of the signaling traffic and is thus able to follow a more
efficient path.
Figure 63 illustrates the IMS architecture. As shown, the IMS architecture consists of
the following principal components:
Call Session Control Function (CSCF). The IMS architecture is built around the
Call Session Control Function which in a sense constitutes the softswitch of IMS.
There are three different types of CSCFs: the Proxy CSCF (P-CSCF), the Interrogating
CSCF (I-CSCF), and the Serving CSCF (S-CSCF).
The P-CSCF is the first contact point for IMS users. In fact, the P-CSCF component
is the only IMS component used by a roaming user in a visited network. All SIP
signaling traffic from the IMS user goes via the P-CSCF, i.e., the P-CSCF is analogous
to a SIP proxy server. The functions performed by the P-CSCF include forwarding of
SIP registration and session invitation messages, and forwarding of accounting-related
information.
The I-CSCF is the first point of contact within the home network from a visited network. Its main responsibility is to query the Home Subscriber Server (HSS), the subscriber database, to find the location of the S-CSCF serving the user. Although this
is actually an optional component in IMS, it has a number of other responsibilities
as well. In particular, it provides a hiding functionality which makes it possible for
an operator to hide the topology, capacity, etc. of its network from other operators'
networks, and thus makes load sharing and other types of capacity management much
easier.
Figure 63: The IMS architecture. Two IMS domains, each with an HSS, P-CSCF, I-CSCF, and S-CSCF, are interconnected over an IP-based network. An MRFC (controlling an MRFP), an MGCF, a BGCF, and IMS-MGs with SGs provide media handling and interworking with PSTN/PLMN networks over an IP-based trunk; users attach via the RAN (BTS, RNC) and the SGSN/GGSN.
Finally, the S-CSCF is the brain of the IMS architecture. It is located in the home network and performs session control and registration services for IMS users. While the
user is engaged in a session, the S-CSCF maintains a session state and interacts with
ASs and accounting functions. Although an S-CSCF in the home network is responsible for all session control, it may forward specific requests to a P-CSCF in the visited
network on the basis of the requirements of the request, e.g., to provide information
about the local dialing plan.
Home Subscriber Server (HSS). The HSS is the main data storage for all subscriber
and service-related data in IMS. The data stored in the HSS includes user identities,
registration information, and access parameters. The HSS interfaces with the I-CSCF
and the S-CSCF to provide information about the location of the IMS user and its
subscription information.
Media Resource Function (MRF). The MRF holds the functionality for manipulating
multimedia streams in support of multi-party multimedia services, multimedia message
playback, and media conversion services. The MRF is split into two parts: an MRF
Controller (MRFC) and an MRF Processor (MRFP). The MRFC interprets SIP signaling received via an S-CSCF and uses MEGACO/H.248 (cf. Section 6) instructions to
control the MRFP. In other words, the MRFC is the part of the MRF that resides in the control
layer, while the MRFP resides in the connectivity layer.
Breakout Gateway Control Function (BGCF). The BGCF is one of the components
IMS provides for interworking with legacy circuit-switched networks. In particular,
the BGCF is responsible for choosing where a breakout to a circuit-switched network
should occur. The outcome of the selection process could either be that the breakout
should happen in the same network as that of the BGCF, or that the call should be
routed to another IP trunk network.
Media Gateway Control Function (MGCF). Like the BGCF, the MGCF is a component for enabling interworking between IMS and circuit-switched users. All incoming
call-control signaling from legacy PSTN/PLMN users is routed to the MGCF that
performs protocol conversion between ISUP and SIP. Similarly, all IMS-originated
sessions towards legacy PSTN/PLMN networks are routed via the MGCF. The MGCF
also controls media channels in the corresponding IMS-MG.
IMS Media Gateway (IMS-MG). The IMS-MG provides the connectivity-layer link
between circuit-switched PSTN/PLMN networks and IMS. Specifically, it performs
the media translation between the IP-based trunk network and legacy PSTN/PLMN
networks.
To give some appreciation of how the components of IMS interwork, let us consider the
IMS session initiation procedure. We assume that User A in Figure 64 wants to initiate
a session with User B. To simplify matters, we also assume that both users are in their
respective home networks. Prior to the session initiation, both users have gone through a so-called P-CSCF discovery procedure, in which they obtained an IP address for their P-CSCF,
and a registration procedure, in which they registered with the HSS of their home network.
The steps taken by User A when it initiates a session with User B are as follows:
(1) User A generates a SIP INVITE request and sends it to the P-CSCF in his home
network.
(2) The P-CSCF processes the request, and forwards it to the S-CSCF.
(3) The S-CSCF processes the request and determines an entry point of the home network
of User B, the I-CSCF.
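The forwarding chain in steps (1)–(3) can be sketched as a toy model. The host names and the table mapping a callee's domain to its entry point are invented for illustration; this is not IMS code, just the hop sequence:

```python
# Toy model (invented names) of steps (1)-(3): the SIP INVITE is forwarded
# from User A's P-CSCF via the S-CSCF to the I-CSCF of User B's home network.

def initiate_session(invite, home_a, entry_points):
    hops = [home_a["p_cscf"]]                 # (1) User A sends the INVITE to his P-CSCF
    hops.append(home_a["s_cscf"])             # (2) the P-CSCF forwards it to the S-CSCF
    callee_domain = invite["to"].split("@")[1]
    hops.append(entry_points[callee_domain])  # (3) the S-CSCF determines User B's I-CSCF
    return hops

home_a = {"p_cscf": "pcscf.home-a.example", "s_cscf": "scscf.home-a.example"}
entry_points = {"home-b.example": "icscf.home-b.example"}
route = initiate_session({"to": "sip:userB@home-b.example"}, home_a, entry_points)
print(route)   # ['pcscf.home-a.example', 'scscf.home-a.example', 'icscf.home-b.example']
```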
Figure 64: The IMS session initiation procedure between User A and User B (steps (1)–(7) through the P-CSCF, S-CSCF, I-CSCF, and HSS components of the two home networks; both users attach via BTS, RNC, SGSN, and GGSN).
Summary
Competitive market conditions and narrowing profit margins are driving network operators to
optimize their networks and to find new sources of revenue. In particular, today's incumbent
wireline and wireless operators are at a crossroads. They need to move to IP in order to cut
operating and capital expenditures. At the same time, they have made huge investments
in circuit-switched technology that are still delivering a major share of their total revenue.
To these operators, the softswitch offers an appealing solution. The softswitch solution lets
incumbents enjoy dramatically reduced costs, and at the same time provides support for a
still emerging new wave of revenue-generating services.
This report has given a fairly comprehensive survey of the softswitch solution from
a technical viewpoint. Basically, all components of the softswitch solution have been discussed: applications, call-control signaling, bearer signaling, and, last but not least, the interworking with the existing circuit-switched PSTN and PLMN networks. The report has
shown how the softswitch solution creates a decomposed architecture in which the signaling
and media functions are separated. The report has also shown how the decomposed architecture of the softswitch solution lends itself to more advanced and flexible applications and
services than are possible in the existing telecommunication networks.
Although the softswitch solution is central to the evolution of the current PSTN and
PLMN networks, it only represents the first migration step towards the envisioned next-generation, all-IP network. Thus, in the final section of the report, a future outlook was
provided which attempted to see beyond the softswitch solution. Central to this outlook was
IMS. The IMS architecture defines the logical elements necessary to implement multimedia
services across multiple network types, and the final section gave a brief overview of this
architecture and its salient components.
References
[1] The 3rd generation partnership project 2 (3GPP2). http://www.3gpp2.org.
[2] COM: Component object model technologies. http://www.microsoft.com/com.
[3] The focus group on next generation networks (FGNGN). http://www.itu.int/ITU-T/ngn/fgngn.
[4] International packet communications consortium (IPCC). http://www.ipccforum.org.
[29] 3GPP. 3rd generation partnership project; technical specification group services and
system aspects; IP multimedia subsystem (IMS); stage 2 (release 7). Technical Specification TS 23.228 v.7.1.0, 3GPP, September 2005.
[30] F. Andreasen and B. Foster. Media gateway control protocol (MGCP) version 1.0. RFC
3435, IETF, January 2003.
[31] R. J. Auburn. Voice browser call control: CCXML version 1.0. Working draft, W3C,
June 2005.
[32] J-L Bakker and R. Jain. Next generation service creation using XML scripting languages. In International Conference on Communications (ICC), New York, USA, April
2002.
[33] D. Booth and C. K. Liu. Web services description language (WSDL) version 2.0 part
0: Primer. Technical report, W3C, August 2005. Working Draft 3.
[34] T. Bray, J. Paoli, C. M. Sperberg-McQueen, E. Maler, and F. Yergeau. Extensible
markup language (XML) 1.0 (third edition). Recommendation, W3C, February 2004.
[35] T. Bray, J. Paoli, C. M. Sperberg-McQueen, E. Maler, and F. Yergeau. Voice extensible
markup language (VoiceXML) version 2.0. Recommendation, W3C, March 2004.
[36] R. Chinnici, H. Haas, A. Lewis, J. Moreau, D. Orchard, and S. Weerawarana. Web
services description language (WSDL) version 2.0 part 2: Adjuncts. Technical report,
W3C, August 2005. Working Draft 3.
[37] R. Chinnici, J. Moreau, A. Ryman, and S. Weerawarana. Web services description
language (WSDL) version 2.0 part 1: Core language. Technical report, W3C, August
2005. Working Draft 3.
[38] J. Davidson and J. Peters. Voice over IP Fundamentals. Cisco Press, March 2000.
[39] ETSI. Telecommunications and Internet protocol harmonization over networks
(TIPHON) release 4; architecture and reference points definition; network architecture
and reference points. Technical Specification TS 101 314 v. 4.1.1, ETSI, September
2003.
[40] V. Ferraro-Esparza, M. Gudmandsen, and K. Olsson. Ericsson telecom server platform
4. Ericsson Review, (3):104–113, 2002.
[41] B. Foster and C. Sivachelvan. Media gateway control protocol (MGCP) return code
usage. RFC 3661, IETF, December 2003.
[42] Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides. Design Patterns:
Elements of Reusable Object-Oriented Software. Addison-Wesley, 1995.
[43] T. George, B. Bidulock, R. Dantu, H. Schwarzbauer, and K. Morneault. Signaling
system 7 (SS7) message transfer part 2 (MTP2) - user peer-to-peer adaptation layer
(M2PA). RFC 4165, IETF, September 2005.
[44] C. Groves, M. Pantaleo, T. Anderson, and T. Taylor. Gateway control protocol version
1. RFC 3525, IETF, June 2003.
[45] M. Gudgin, M. Hadley, N. Mendelsohn, J. Moreau, and H. Nielsen. SOAP version 1.2
part 1: Messaging framework. Recommendation, W3C, June 2003.
[46] M. Gudgin, M. Hadley, N. Mendelsohn, J. Moreau, and H. Nielsen. SOAP version 1.2
part 2: Adjuncts. Recommendation, W3C, June 2003.
[47] M. Handley and V. Jacobson. SDP: Session description protocol. RFC 2327, IETF,
April 1998.
[48] M. Handley, H. Schulzrinne, E. Schooler, and J. Rosenberg. SIP: Session initiation
protocol. RFC 2543, IETF, March 1999.
[49] R. Harrison and K. Zeilenga. The lightweight directory access protocol (LDAP) intermediate response message. RFC 3771, IETF, April 2004.
[50] J. Hodges and R. Morgan. Lightweight directory access protocol (v3): Technical specification. RFC 3377, IETF, September 2002.
[51] ITU-T. Pulse code modulation (PCM) of voice frequencies. Recommendation G.711,
ITU-T, November 1988.
[52] ITU-T. Data protocols for multimedia conferencing. Recommendation T.120, ITU-T,
July 1996.
[53] ITU-T. Visual telephone systems and equipment for local area networks which provide
a non guaranteed quality of service. Recommendation H.323, ITU-T, November 1996.
[54] ITU-T. Digital subscriber signalling system no. 1 generic procedures for the control
of ISDN supplementary services. Recommendation Q.932, ITU-T, May 1998.
[55] ITU-T. Generic functional protocol for the support of supplementary services in H.323.
Technical Report H.450.1, ITU-T, February 1998.
[56] ITU-T. ISDN user-network interface layer 3 specification for basic call control. Recommendation Q.931, ITU-T, May 1998.
[57] ITU-T. Vocabulary of switching and signalling terms. Technical Report Q.9, ITU-T,
November 1998.
[58] ITU-T. Call signalling protocols and media stream packetization for packet-based multimedia communication systems. Recommendation H.225.0, ITU-T, July 2003.
[59] ITU-T. Packet-based multimedia communications systems. Recommendation H.323,
ITU-T, July 2003.
[60] ITU-T. Implementors guide for recommendations of the H.323 system (packet-based multimedia communications systems): H.323, H.225.0, H.245, H.246, H.283,
H.235, H.341, H.450 series, H.460 series, and H.500 series. Technical Report
H.Imp323/H.323/H.225.0/H.245/H.246/H.283/H.235/H.341, ITU-T, November 2004.
[61] ITU-T. Control protocol for multimedia communication. Recommendation H.245,
ITU-T, January 2005.
[62] ITU-T. Gateway control protocol: Version 3. Technical Report H.248.1, ITU-T,
September 2005.
[63] J. Lennox, X. Wu, and H. Schulzrinne. Call processing language (CPL): A language
for user control of Internet telephony services. RFC 3880, IETF, October 2004.
[64] J. Loughney, G. Sidebottom, L. Coene, G. Verwimp, J. Keller, and B. Bidulock. Signalling connection control part user adaptation layer (SUA). RFC 3868, IETF, October
2004.
[65] N. Mitra. SOAP version 1.2 part 0: Primer. Recommendation, W3C, June 2003.
[66] K. Morneault, R. Dantu, G. Sidebottom, B. Bidulock, and J. Heitz. Signaling system
7 (SS7) message transfer part 2 (MTP2) user adaptation layer. RFC 3331, IETF,
September 2002.
[67] K. Morneault, S. Rengasami, M. Kalla, and G. Sidebottom. ISDN Q.921-user adaptation layer. RFC 3057, IETF, February 2001.
[68] R. Mukundan, K. Morneault, and N. Mangalpally. Digital private network signaling
system (DPNSS)/digital access signaling system 2 (DASS 2) extensions to the IUA
protocol. RFC 4129, IETF, August 2005.
[69] F. D. Ohrtman. Softswitch Architecture for VoIP. McGraw-Hill, 2003.
[70] S. Olson, G. Camarillo, and A. B. Roach. Support for IPv6 in session description
protocol (SDP). RFC 3266, IETF, June 2002.
[71] OMG. Common object request broker architecture: Core specification. Recommendation Version 3.0.3, OMG, March 2004.
[72] L. Ong, I. Rytina, M. Garcia, H. Schwarzbauer, L. Coene, H. Lin, I. Juhasz, M. Holdrege, and C. Sharp. Framework architecture for signalling transport. RFC 2719, IETF,
October 1999.
[73] J. Postel. User datagram protocol. RFC 768, IETF, August 1980.
[74] J. Postel. Transmission control protocol. RFC 793, IETF, September 1981.
[75] Y. Rekhter and T. Li. A border gateway protocol 4 (BGP-4). RFC 1771, IETF, March
1995.
[76] A. B. Roach. Session initiation protocol (SIP)-specific event notification. RFC 3265,
IETF, June 2002.
[77] J. Rosenberg, H. Salama, and M. Squire. Telephony routing over IP (TRIP). RFC 3219,
IETF, January 2002.
[78] J. Rosenberg, H. Schulzrinne, G. Camarillo, A. Johnston, J. Peterson, R. Sparks,
M. Handley, and E. Schooler. SIP: Session initiation protocol. RFC 3261, IETF, June
2002.
[79] H. Schulzrinne, S. Casner, R. Frederick, and V. Jacobson. RTP: A transport protocol
for real-time applications. RFC 3550, IETF, July 2003.
[80] T. Seth, A. Broscius, C. Huitema, and H-A P. Lin. Performance requirements for
signaling in internet telephony. Internet draft, IETF, November 1998. Work in Progress.
[81] G. Sidebottom, K. Morneault, and J. Pastor-Balbas. Signaling system 7 (SS7) message
transfer part 3 (MTP3) user adaptation layer (M3UA). RFC 3332, IETF, September
2002.
[82] Signaling transport working group (SIGTRAN). http://www.ietf.org/html.charters/sigtran-charter.html.
[83] D. Sprague, R. Benedyk, D. Brendes, and J. Keller. Tekelec's transport adapter layer
interface. RFC 3094, IETF, April 2001.
[84] R. Stewart, Q. Xie, K. Morneault, C. Sharp, H. Schwarzbauer, T. Taylor, I. Rytina,
M. Kalla, L. Zhang, and V. Paxson. Stream control transmission protocol. RFC 2960,
IETF, October 2000.
[85] M. Wahl, T. Howes, and S. Kille. Lightweight directory access protocol (v3). RFC
2251, IETF, December 1997.
[86] E. Weilandt, N. Khanchandani, and S. Rao. V5.2-user adaptation layer (V5UA). RFC
3807, IETF, June 2004.
[87] X. Wu and H. Schulzrinne. Programmable end system services using SIP. In International Conference on Communications (ICC), Anchorage, Alaska, USA, May 2003.
[88] X. Wu and H. Schulzrinne. LESS: Language for end system services in Internet telephony. Internet draft, IETF, February 2005. Work in Progress.
Abbreviations
3GPP
3GPP2
A-F
ACE
ACF
ACK
ACM
AEG
AG
ANM
API
ARQ
AS
AT&T
ATM
AuC
AVP
B2BUA
BCSM
BG-F
BGCF
BGP-4
BSC
BSS
BSSAP
BT
BTS
CA-F
CAMEL
CAP
CAS
CCS
CCXML
CORBA
CPL
CPU
CSCF
DASS 2
DCOM
DoS
DP
DPC
DPNSS
DS
DSL
DSS1
DTD
DUA
EIR
EJB
ESML
ETSI
FDMA
FGNGN
GCC
GCP
GGSN
GMSC
GPRS
GSM
GT
GTR
GTT
HLR
HoLB
HSS
HTTP
I-CSCF
IAM
IBM
ID
IDL
IETF
IMS
IMS-MG
IMSI
IMT
IN
INAP
IP
IPCC
IPSP
ISC
ISDN
ISUP
ITE
ITU
ITU-T
IUA
IVR
JAIN
JCC
JCP
JMX
JMXMP
JSP
JSPA
JWG
LAN
LDAP
LE
LESS
LP
LSB
M-MG
M2PA
M2UA
M3UA
MAP
MC
MCU
MEGACO
MG
MGC
MGC-F
MGCF
MGCP
MMUSIC
MP
MRF
MRFC
MRFP
MSC
MSC-S
MSF
MSISDN
MTP
MTP-L1
MTP-L2
MTP-L3
NGN
NI
NIF
NSP
NSS
NTE
OAM
OMA
OMC
OPC
OSA
OSS
OSSJ
P-CSCF
PBX
PEG
PIC
PLMN
PSTN
QoS
R-F
RA
RAN
RANAP
RAS
RFC
RK
RMI
RNC
RTCP
RTE
RTP
S-CSCF
SACK
SBB
SCCP
SCE
SCF
SCML
SCP
SCS
SCTP
SDP
SEP
SG = Signaling Gateway
SGSN = Serving GPRS Support Node
SI = Service Indicator
SIGTRAN = SIGnaling TRANsport
SIMPLE = SIP for Instant Messaging and Presence Leveraging Extensions
SIO = Service Information Octet
SIP = Session Initiation Protocol
SLEE = Service Logic Execution Environment
SLP = Service Logic Program
SLS = Signaling Link Selector
SLT = Signaling Link Terminal
SMH = Signaling Message Handling
SMS = Short Message Service
SNM = Signaling Network Management
SOAP = Simple Object Access Protocol
SP = Signaling Point
SPAN = Services and Protocols for Advanced Networking
SRP = SCCP Relay Point
SS6 = Signaling System No. 6
SS7 = Signaling System No. 7
SSP = Service Switching Point
STP = Signaling Transfer Point
SUA = SCCP User Adaptation layer
TALI = Transport Adapter Layer Interface
TCAP = Transaction Capabilities Application Part
TCP = Transmission Control Protocol
TDM = Time Division Multiplexing
TDMA = Time Division Multiple Access
TE = Tandem Exchanges
TIPHON = Telecommunications and Internet Protocol Harmonization Over Networks
TISPAN = Telecommunications and Internet converged Services and Protocols for Advanced Networking
TRIP = Telephony Routing over IP
TSP4 = Telecommunication Server Platform 4
UDP = User Datagram Protocol
UML = Unified Modeling Language
UMTS = Universal Mobile Telecommunications System
UP = User Part
URI = Uniform Resource Identifier
URL = Uniform Resource Locator
UTRAN = UMTS Terrestrial Radio Access Network
V5UA = V5.2 User Adaptation Layer
VLR = Visitor Location Register
VoIP = Voice over IP
W3C = World Wide Web Consortium
WAN = Wide Area Network
WAP = Wireless Application Protocol
WCDMA = Wideband Code Division Multiple Access
WSDL = Web Services Description Language
XML = eXtensible Markup Language
XTML = eXtensible Telephony Markup Language
Paper VII
Torbjorn Andersson
TietoEnator AB
Karlstad, Sweden
andettor@tietoenator.com
Abstract
Mitigating the effects of Head-of-Line Blocking (HoLB) was one of the major reasons the IETF SIGTRAN working group developed SCTP, a new transport protocol for
PSTN signaling traffic, in the first place. However, studies of the impact of HoLB on TCP and SCTP have given ambiguous results as to whether HoLB has, in fact,
any significantly deteriorating effect on transmission delay. To this end, we have carried
out a detailed experimental study of the quantitative effects of HoLB. Our study suggests
that although HoLB could indeed incur a substantial delay penalty on a small fraction of
the messages in an SCTP session, it has only a marginal impact on the average end-to-end transmission delay. We observed improvements of only 0% to 18% in
average message transmission delay when using unordered as compared to ordered
delivery. Furthermore, there was large variability between different test runs, which
often made the impact of HoLB statistically insignificant.
1 Introduction
The communication industry is currently experiencing a period of dramatic and radical
change that strives towards a single converged all-IP network for voice, video, and
data. The reasons for this development are many. Operators are seeking ways to consolidate
their disparate communication platforms in order to reduce their development, operational,
and maintenance costs. Additionally, an all-IP network enables a multitude of new services,
with the promise of new revenue streams for the large number of carriers and equipment
manufacturers that saw their net profits plummet when the telecommunications boom abruptly
ended in 2000.
Leading this development is the telecom industry, with most major telecom carriers in
the process of readying Voice-over-IP (VoIP) services for mass deployment. According to
Frost & Sullivan [8], VoIP will account for approximately 75% of the world's voice services by
2007, and analysts project that the number of residential VoIP subscribers will rise 12-fold,
to about 12 million, by 2009 [4].
An important component of the currently launched VoIP networks is the Stream Control
Transmission Protocol (SCTP) [16]. Originally developed as a transport protocol for PSTN
signaling traffic over IP in the IETF SIGTRAN working group [14], SCTP has broadened
its use, and is today part of both the next-generation IP-based wireline and wireless core
networks. In the wireline core network, SCTP has been proposed as a viable alternative to
UDP and TCP for the transport of the Session Initiation Protocol (SIP) [12] in the softswitch
architecture [6]. In the wireless core network, SCTP is one of the signaling transport protocols in the IP Multimedia Subsystem (IMS) [2] architecture, the architecture defined by
3GPP [1] for the development of IP-based multimedia services in future mobile networks.
The origin of SCTP lies in studies of UDP and TCP as prospective transport protocols for PSTN signaling traffic. UDP was quickly ruled out since it did not meet the basic
requirements for reliable, in-order transport. While TCP met the basic requirements, it was
also found to have several limitations, one of which was its lack of features to prevent Head-of-Line Blocking (HoLB). Since TCP delivery is strictly ordered (i.e., sequential), a single
packet loss or reordering event in the network might introduce significant delays in the delivery of
subsequent packets, regardless of whether the packets are semantically dependent or
not. In fact, an analysis carried out by Telcordia [13] suggested that a 1% packet loss would
cause 9% of the packets to be delayed by more than a one-way transfer delay.
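The mechanism behind this effect can be illustrated with a minimal delay model. The numbers below (send interval, one-way delay, retransmission penalty) are invented for illustration; the point is only that under ordered delivery, one lost packet holds back every later packet until the retransmission arrives, while unordered delivery hands each packet to the application on arrival:

```python
# Minimal sketch (invented numbers) of how ordered delivery turns a single
# packet loss into delay for subsequent packets. Packets are sent every 10 ms
# over a path with a 75 ms one-way delay; packet 3 is lost and its
# retransmission arrives 200 ms late.
SEND_INTERVAL, ONE_WAY, RETX_PENALTY, N, LOST = 10, 75, 200, 10, 3

arrival = {}
for i in range(N):
    t = i * SEND_INTERVAL + ONE_WAY
    arrival[i] = t + RETX_PENALTY if i == LOST else t

# Unordered delivery: each packet is handed to the application on arrival.
unordered = dict(arrival)

# Ordered delivery: packet i cannot be delivered before all packets < i.
ordered, latest = {}, 0
for i in range(N):
    latest = max(latest, arrival[i])
    ordered[i] = latest

blocked = [i for i in range(N) if ordered[i] > unordered[i]]
print(blocked)   # packets after the loss that are held back by HoLB
```

In this toy scenario, every packet sent after the lost one is blocked until the retransmission arrives, even though all of them reached the receiver on time.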
However, more recent studies have given ambiguous results as to whether HoLB substantially impedes signaling traffic performance or not. A simulation study carried out by
Camarillo et al. [3] suggests that avoidance of HoLB does not give SCTP a substantial performance increase over TCP under normal conditions, while an experiment by De Marco et
al. [9] indicates the opposite. Thus, the benefit of avoiding HoLB in SCTP still remains an
open issue.
To shed some further light on this issue, we have conducted a detailed experimental study
of the effects of HoLB on SCTP. Contrary to previous studies, our study uses SCTP for both
ordered and unordered delivery. Furthermore, we consider the effect of HoLB for a number
of network conditions, not just a single one.
The study provides a quantitative analysis of the effects of HoLB on transmission delays. The major contribution of the paper is that it indicates that while HoLB could, indeed,
substantially increase the transmission delay of a small fraction of the messages in an SCTP
session, it has only a marginal impact on the average message transmission delay.
The remainder of this paper is organized as follows. In Section 2, we provide some
preliminaries on SCTP and HoLB. The design and setup of our experiment are described in
Section 3. Then, in Section 4, the results of the experiment are presented and discussed.
Finally, the paper ends in Section 5 with a brief summary, concluding remarks,
and an outlook on future work.
[Figure: A multi-homed SCTP association between End Point A (application on port 10) and End Point B (application on port 20), connected through an IP network by a primary path and an alternate path.]

[Figure: TCP and SCTP applications running on Host 1 and Host 2.]
3 Methodology
The network topology and test setup used in our experiment are depicted in Figure 4. As shown, the network topology basically consisted of two hosts interconnected by a single network path. The test flow was a paced SCTP flow between the Flow Under Test (FUT) Source and Sink applications. Tests were run for four different SCTP send rates: 133 Kbps, 200 Kbps, 400 Kbps, and no pacing at all (i.e., more than 400 Kbps). In all tests, an SCTP message size of 500 bytes was used, and a test run always comprised 1000 SCTP messages.
The traffic competing with the SCTP flow, i.e., the cross traffic, comprised 0, 1, 2, or
8 greedy TCP bulk flows (i.e., flows that always had messages to send). The cross traffic
was transmitted between the TCP sources and sinks, in other words between the same hosts
[Figure 4: Test setup. The FUT Source and TCP Sources 1..N run on the Traffic Source host, and the FUT Sink and TCP Sinks 1..N on the Traffic Sink host; SCTP and TCP run over IP on both hosts. The hosts are interconnected through a Path Emulator (bandwidth: 400 Kbps, path delay: 75 ms, buffer: 12 pkts or 32 pkts) and are both attached to the same NTP server.]
as the SCTP flow. In all tests, the cross traffic was started 5 s before the SCTP flow. This ensured that the TCP flows had left the slow-start phase and reached the congestion-avoidance, or stationary, phase before the SCTP flow was started.
The traffic sources and sinks ran on 2.8 GHz PCs with FreeBSD 4.10 as the operating system. As a consequence, the TCP sources and sinks ran atop a NewReno [5] implementation of TCP, and the SCTP sources and sinks on the FreeBSD kernel implementation of SCTP in the KAME stack [7]. However, the KAME SCTP implementation was upgraded with patch 23, the latest patch at the time the experiment started. To keep the local clocks of the traffic source and sink hosts synchronized, which was needed, e.g., to measure the end-to-end transmission delays, both hosts used NTP and were attached to the same NTP server. A 2.8 GHz PC with FreeBSD 4.10 and dummynet [11] acted as path emulator.
Tests were run for two different path configurations. Both configurations used a bandwidth of 400 Kbps, a one-way path propagation delay of 75 ms, and drop-tail queueing in dummynet. However, the in-bound buffer size of dummynet differed between the two configurations: one configuration used a buffer size of 12 IP packets, and the other a buffer size of 32 IP packets.
Although the in-bound buffer of dummynet was varied, the sizes of the SCTP send and receive buffers at the traffic source and sink hosts were kept constant, and dimensioned to prevent flow-control events. In particular, the SCTP receive buffer at the FUT Sink was configured to 134 KBytes, or 268 messages, while the SCTP send buffer at the FUT Source was set to only 47.5 KBytes, or 95 messages. Thus, the FUT Source could never have more than 95 outstanding messages, and so could never overwhelm the FUT Sink, which could accommodate 268 messages. The reason we wanted to avoid flow-control induced regulation of the SCTP test flow will become evident in Section 4.
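The dimensioning argument is simple arithmetic (a sketch restating the figures from the text; 1 KByte is taken as 1000 bytes, which is consistent with 134 KBytes corresponding to 268 messages):

```python
# Buffer sizes from the text, converted to whole 500-byte SCTP messages.
MSG_SIZE = 500                      # bytes per SCTP message

recv_buffer = 134_000               # FUT Sink receive buffer (134 KBytes)
send_buffer = 47_500                # FUT Source send buffer (47.5 KBytes)

recv_msgs = recv_buffer // MSG_SIZE   # messages the sink can accommodate
send_msgs = send_buffer // MSG_SIZE   # maximum outstanding messages at the source

# The sender can never have more outstanding messages than the receiver
# can buffer, so receiver-side flow control is never triggered.
assert send_msgs < recv_msgs
print(recv_msgs, send_msgs)  # 268 95
```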
Results
To be able to study the impact of HoLB on SCTP message transmission delays, all tests were run for both ordered and unordered SCTP transmissions. Additionally, to make the sample message transmission delays essentially independent, and thus obviate the strong correlation that does indeed exist between the transmission delays of consecutive SCTP messages, the average end-to-end transmission delay over the messages in a test run was used as one sample of the message transmission delay in a test. Each test was repeated 100 times, and the mean of the sample transmission delays from the test runs was used as the metric for the message transmission delay of SCTP in a test.
Table 1 summarizes the results of our experiment: E2Eo denotes the mean transmission delay for ordered delivery; E2Eu, the mean transmission delay for unordered delivery; and Δ, the difference between the two. Table 1 also shows the 95% confidence interval for Δ, which tells us whether the difference between ordered and unordered transmission delays was statistically significant or not. Finally, the relative improvement in transmission delay for unordered delivery as compared to ordered delivery is given in column Δrel. A positive value in this column says that unordered delivery performed better than the corresponding ordered delivery, and a negative value the opposite.
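Given the per-run sample means, the table entries can be computed along the following lines (a sketch; the normal-approximation z-quantile and the unpaired two-sample standard error are our assumptions, since the paper does not spell out the exact interval construction):

```python
import statistics

def compare(ordered_samples, unordered_samples, z=1.96):
    """Mean difference Delta between per-run ordered and unordered delays,
    a normal-approximation 95% confidence half-width for Delta, and the
    relative improvement of unordered over ordered delivery (percent)."""
    mo = statistics.mean(ordered_samples)
    mu = statistics.mean(unordered_samples)
    delta = mo - mu
    # Independent runs: variance of the difference of the two sample means.
    se = (statistics.variance(ordered_samples) / len(ordered_samples)
          + statistics.variance(unordered_samples) / len(unordered_samples)) ** 0.5
    rel = 100.0 * delta / mo      # positive: unordered delivery was faster
    return delta, z * se, rel

# Three hypothetical per-run mean delays (s) for each delivery mode.
d, ci, rel = compare([0.92, 0.94, 0.90], [0.85, 0.86, 0.84])
print(f"delta = {d:.3f} +/- {ci:.3f} s, rel = {rel:.2f}%")
```

With 100 runs per test, as in the experiment, the normal approximation of the t-quantile is reasonable.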
As follows from Table 1, the mean end-to-end transmission delays for messages sent with the unordered transport service of SCTP were between 0% and 18% lower than the transmission delays for messages sent with the ordered transport service. In other words, there was only a small difference between the two transport services. However, it also follows from the table that there was a large variability in the differences between the two services, i.e., in Δ. Notably, in the majority of the tests, Δ and its 95% confidence interval were of the same magnitude.
One reason for the large variability in Δ could be that factors such as packet-loss rate and packet-loss distribution were uncontrolled, and thus might have differed between corresponding ordered and unordered tests. To control for this, we artificially generated the end-to-end transmission delays for the tests with ordered delivery from the measured send and receive times of messages in the corresponding tests with unordered delivery. Table 2 illustrates how the generation was carried out. The messages with IDs 1 and 2 arrived at the FUT Sink in order, and were thus given the same receive times in ordered delivery as they had in unordered delivery. In contrast, the messages with IDs 3 and 4 arrived at the FUT Sink before the message with ID 2 and, because of this, were given the same receive time as that message, in this way simulating the reordering that would have taken place had it been a real test run. Finally, the message with ID 5 already had a receive time in the unordered test that was later than 23, the receive time given to message 4 in the ordered test, and thus kept its receive time from the unordered test.
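The generation procedure illustrated in Table 2 amounts to taking a running maximum over the unordered receive times (a sketch, verified against the Table 2 trace):

```python
# Artificially generate ordered-delivery receive times from an unordered
# trace: a message cannot be delivered before every earlier message has been
# delivered, so its ordered receive time is the running maximum of the
# unordered receive times up to and including itself.

def ordered_from_unordered(recv_times):
    out, latest = [], float("-inf")
    for t in recv_times:
        latest = max(latest, t)
        out.append(latest)
    return out

# Trace from Table 2 (messages 1-5).
send = [10, 11, 12, 13, 14]
recv_unordered = [20, 23, 18, 19, 25]

recv_ordered = ordered_from_unordered(recv_unordered)
delays_ordered = [r - s for r, s in zip(recv_ordered, send)]
print(recv_ordered)    # [20, 23, 23, 23, 25]
print(delays_ordered)  # [10, 12, 11, 10, 11]
```

The output reproduces the ordered-delivery columns of Table 2: messages 3 and 4 are held back until message 2 has arrived, while message 5 is unaffected.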
We recognize that in the general case this kind of artificial generation is not valid, since it does not take into account that the flow control might behave differently in the ordered delivery tests as compared to the tests with unordered delivery. Specifically, SCTP has to buffer packets that arrive out of order in the ordered delivery tests, whereas in the corresponding unordered tests it can immediately pass them on to the FUT Sink application. Thus, in the ordered delivery tests, SCTP might run out of buffer space and, as a result of
Queue size 12 packets:
Send Rate (Kbps)  # TCPs  E2Eo (s)  E2Eu (s)  Δ with 95% C.I. (s)       Δrel (%)
> 400             0       0.92      0.85      6.67·10^-2 ± 6.67·10^-5   7.27
> 400             1       1.49      1.34      1.42·10^-1 ± 6.06·10^-2   9.58
> 400             2       1.94      1.86      8.15·10^-2 ± 8.40·10^-2   4.20
> 400             8       6.34      5.52      8.23·10^-1 ± 3.91·10^-1   12.96
400               0       0.44      0.36      7.75·10^-2 ± 1.79·10^-3   17.67
400               1       1.46      1.37      8.85·10^-2 ± 6.88·10^-2   6.06
400               2       1.78      1.69      8.41·10^-2 ± 1.20·10^-1   4.73
400               8       6.19      5.65      5.40·10^-1 ± 3.12·10^-1   8.74
200               0       0.09      0.09      6.60·10^-5 ± 7.33·10^-5   0.08
200               1       0.42      0.40      2.54·10^-2 ± 6.26·10^-2   6.04
200               2       1.16      1.07      9.02·10^-2 ± 1.89·10^-1   7.74
200               8       5.98      5.76      2.12·10^-1 ± 4.38·10^-1   3.55
133               0       0.09      0.09      6.70·10^-4 ± 7.70·10^-5   0.78
133               1       0.18      0.17      1.02·10^-2 ± 8.41·10^-3   5.79
133               2       0.27      0.23      4.39·10^-2 ± 3.41·10^-2   15.98
133               8       5.91      5.31      5.96·10^-1 ± 3.63·10^-1   10.09

Queue size 32 packets:
Send Rate (Kbps)  # TCPs  E2Eo (s)  E2Eu (s)  Δ with 95% C.I. (s)       Δrel (%)
> 400             0       0.92      0.88      3.74·10^-2 ± 8.53·10^-5   4.07
> 400             1       1.42      1.21      2.09·10^-1 ± 3.43·10^-2   14.71
> 400             2       1.79      1.68      1.03·10^-1 ± 8.85·10^-2   5.75
> 400             8       4.72      4.42      2.97·10^-1 ± 2.39·10^-1   6.29
400               0       0.43      0.39      4.02·10^-2 ± 1.06·10^-2   9.38
400               1       1.12      0.95      1.73·10^-1 ± 4.66·10^-2   15.38
400               2       1.89      1.78      1.16·10^-1 ± 1.12·10^-1   6.14
400               8       4.51      4.33      1.81·10^-1 ± 2.51·10^-1   4.01
200               0       0.09      0.09      4.28·10^-4 ± 7.82·10^-5   0.50
200               1       0.38      0.33      5.16·10^-2 ± 1.28·10^-2   13.49
200               2       0.80      0.76      4.43·10^-2 ± 8.60·10^-2   5.54
200               8       4.39      3.94      4.44·10^-1 ± 2.66·10^-1   10.11
133               0       0.09      0.08      1.40·10^-3 ± 7.44·10^-5   1.62
133               1       0.28      0.26      1.66·10^-2 ± 1.05·10^-2   5.94
133               2       0.43      0.42      1.12·10^-2 ± 3.07·10^-2   2.61
133               8       3.86      3.86      -3.17·10^-3 ± 2.74·10^-1  -0.08

Table 1: Experiment results for ordered (E2Eo) and unordered (E2Eu) delivery; Δ = E2Eo - E2Eu.
                            Delivery Service
                  Unordered              Ordered
Msg ID  Send Time  Recv Time  E2E Delay  Recv Time  E2E Delay
1       10         20         10         20         10
2       11         23         12         23         12
3       12         18         6          23         11
4       13         19         6          23         10
5       14         25         11         25         11

Table 2: End-to-end transmission delay for ordered delivery, artificially generated from unordered delivery traces.
this, have to advertise a smaller receiver window. However, in our tests, as remarked in Section 3, the SCTP send and receive buffers at the Traffic Source and Sink were dimensioned
to prevent flow-control induced throttling.
The artificially generated end-to-end transmission delays for ordered delivery are shown in Table 3. The same notation as in Table 1 is used; however, the metrics that involve the generated transmission times have been given an extra superscript g to signify this fact.
It follows from Table 3 that mitigating the impact of uncontrolled factors, such as the actual packet-loss rate and packet-loss distribution, between ordered and unordered delivery tests did not have any substantial effect on the previous result. Thus, we conclude that HoLB had a fairly small impact on the mean message transmission delay in tests with ordered delivery.
Although HoLB had a small impact on the mean message transmission delays, it sometimes had a large impact on individual messages. Consider Tables 4, 5, and 6. Table 4 presents the distribution of the end-to-end transmission delay of individual messages in the tests with unordered delivery, and Table 5 the corresponding distribution for ordered delivery using the artificially generated message transmission delays. To facilitate a comparison between the two distributions in Tables 4 and 5, Table 6 shows the relative increase in percent between corresponding percentiles for unordered and ordered delivery. It can be observed that although the effect of HoLB was marginal for most messages, a small percentage was indeed substantially affected. For example, as follows from Table 6, in the test with queue size 12 packets, send rate 400 Kbps, and no cross traffic, the increase of the median (i.e., p50) was limited to 6.90%, while the 95th percentile increased by as much as 69.82%.
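The percentile comparison of Tables 4-6 boils down to computing matching percentiles of the two delay samples and their relative difference (a sketch with synthetic delays; the helper `pct` and the "inclusive" quantile method are illustrative choices, not the paper's exact estimator):

```python
import statistics

def pct(samples, p):
    """p-th percentile (p in 1..99) using the 'inclusive' quantile method."""
    return statistics.quantiles(samples, n=100, method="inclusive")[p - 1]

# Synthetic per-message delays (s) for the same messages under each service.
unordered = [0.10, 0.12, 0.15, 0.20, 0.90]
ordered   = [0.10, 0.14, 0.25, 0.60, 1.50]

for p in (50, 75, 90, 95):
    rel = 100.0 * (pct(ordered, p) - pct(unordered, p)) / pct(unordered, p)
    print(f"p{p}: +{rel:.1f}%")
```

As in the paper's example, the tail percentiles can grow far more than the median when a few messages are blocked for a long time.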
It also follows from Table 3 that neither the traffic load nor the size of the dummynet buffer had any significant impact on the effect of HoLB. To further investigate the reason for this, we collected some statistics on the actual HoLB events. The statistics were collected from our artificially generated transmission traces for ordered delivery, and are presented in Table 7. The # Evts columns show the average number of HoLB events that occurred in a test run. The remaining columns show the average length of a HoLB event, both in terms of the number of affected messages and in terms of the total delay penalty imposed on these messages by a HoLB event. A message was considered to be affected by a HoLB event if its delivery time was prolonged during our artificial generation (cf. Table 2). To give some appreciation of the precision of the lengths of the HoLB events, their 95% confidence intervals are also shown in the table.
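The event statistics can be derived from an unordered trace roughly as follows (a sketch; treating an event as a maximal run of consecutive prolonged messages is our reading of the text):

```python
def holb_events(recv_unordered):
    """Return a list of (affected_messages, total_delay_penalty) per HoLB
    event, computed from an unordered receive-time trace. A message is
    affected if its artificially generated ordered receive time (the running
    maximum) exceeds its own receive time; consecutive affected messages
    form one event."""
    events, affected, penalty = [], 0, 0.0
    latest = float("-inf")
    for t in recv_unordered:
        latest = max(latest, t)
        if latest > t:                 # delivery prolonged: part of an event
            affected += 1
            penalty += latest - t
        elif affected:                 # run of affected messages ended
            events.append((affected, penalty))
            affected, penalty = 0, 0.0
    if affected:
        events.append((affected, penalty))
    return events

# Table 2 trace: messages 3 and 4 are blocked behind message 2.
print(holb_events([20, 23, 18, 19, 25]))  # [(2, 9.0)]
```

On the Table 2 trace this yields one event affecting two messages with a total delay penalty of 9 time units, i.e., (23 - 18) + (23 - 19).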
Queue size 12 packets:
Send Rate (Kbps)  # TCPs  E2Eo^g (s)  E2Eu (s)  Δrel^g (%)
> 400             0       0.92        0.85      7.33
> 400             1       1.48        1.34      8.95
> 400             2       2.00        1.86      7.36
> 400             8       5.97        5.52      7.46
400               0       0.44        0.36      17.47
400               1       1.45        1.37      5.42
400               2       1.81        1.69      6.32
400               8       6.09        5.65      7.24
200               0       0.09        0.09      0.00
200               1       0.43        0.40      8.31
200               2       1.15        1.07      6.30
200               8       6.19        5.76      6.92
133               0       0.09        0.09      0.00
133               1       0.18        0.17      6.91
133               2       0.26        0.23      12.09
133               8       5.73        5.31      7.30

Queue size 32 packets:
Send Rate (Kbps)  # TCPs  E2Eo^g (s)  E2Eu (s)  Δrel^g (%)
> 400             0       0.92        0.88      4.20
> 400             1       1.31        1.21      7.56
> 400             2       1.81        1.68      6.85
> 400             8       4.69        4.42      5.70
400               0       0.42        0.39      8.68
400               1       1.08        0.95      11.72
400               2       1.93        1.78      7.79
400               8       4.55        4.33      5.02
200               0       0.09        0.09      0.00
200               1       0.36        0.33      8.39
200               2       0.84        0.76      9.87
200               8       4.15        3.94      5.09
133               0       0.08        0.08      0.00
133               1       0.27        0.26      2.32
133               2       0.45        0.42      6.13
133               8       4.05        3.86      4.65

Table 3: Experiment results with artificially generated end-to-end transmission delays for ordered delivery.
Queue size 12 packets:
Send Rate (Kbps)  # TCPs  p50   p75   p90   p95    p99
> 400             0       0.93  0.95  0.95  1.05   1.49
> 400             1       1.28  1.59  2.09  2.37   2.74
> 400             2       1.77  2.35  2.89  3.30   4.17
> 400             8       4.96  6.82  8.92  10.78  16.96
400               0       0.37  0.45  0.50  0.53   0.78
400               1       1.28  1.58  2.22  2.54   3.09
400               2       1.61  2.20  2.84  3.23   4.76
400               8       5.03  6.86  9.18  11.20  19.03
200               0       0.09  0.09  0.09  0.09   0.09
200               1       0.17  0.44  1.09  1.47   2.24
200               2       0.65  1.58  2.52  3.58   4.99
200               8       4.93  6.86  9.23  11.77  22.20
133               0       0.09  0.09  0.09  0.09   0.09
133               1       0.13  0.19  0.26  0.38   0.77
133               2       0.17  0.24  0.42  0.60   1.35
133               8       4.81  6.68  9.23  11.77  19.24

Queue size 32 packets:
Send Rate (Kbps)  # TCPs  p50   p75   p90   p95   p99
> 400             0       0.93  0.95  0.95  1.10  1.39
> 400             1       1.22  1.44  1.70  1.96  2.82
> 400             2       1.55  2.01  2.63  3.26  4.40
> 400             8       4.11  5.70  7.24  8.33  10.25
400               0       0.39  0.50  0.59  0.62  0.76
400               1       0.97  1.28  1.53  1.82  2.54
400               2       1.72  2.20  2.90  3.21  3.68
400               8       4.03  5.56  7.00  7.97  9.91
200               0       0.09  0.09  0.09  0.09  0.09
200               1       0.29  0.45  0.59  0.80  1.26
200               2       0.53  1.12  1.58  1.99  2.96
200               8       3.64  5.15  6.74  7.78  9.93
133               0       0.08  0.09  0.09  0.09  0.09
133               1       0.24  0.40  0.49  0.53  0.60
133               2       0.37  0.52  0.76  1.01  1.67
133               8       3.53  5.18  7.02  8.18  10.70

Table 4: Percentiles for the end-to-end transmission delay (s) of individual messages during unordered delivery.
Queue size 12 packets:
Send Rate (Kbps)  # TCPs  p50   p75   p90   p95    p99
> 400             0       0.93  0.95  1.20  1.34   1.54
> 400             1       1.40  1.76  2.26  2.50   2.88
> 400             2       1.93  2.48  2.99  3.39   4.28
> 400             8       5.36  7.25  9.51  11.35  18.89
400               0       0.40  0.50  0.76  0.90   1.15
400               1       1.35  1.69  2.30  2.65   3.09
400               2       1.73  2.30  2.94  3.35   4.83
400               8       5.43  7.34  9.67  11.82  20.08
200               0       0.09  0.09  0.09  0.09   0.09
200               1       0.18  0.53  1.17  1.58   2.34
200               2       0.76  1.67  2.65  3.81   5.01
200               8       5.32  7.25  9.74  12.58  23.15
133               0       0.09  0.09  0.09  0.09   0.09
133               1       0.13  0.20  0.30  0.50   0.87
133               2       0.18  0.28  0.53  0.71   1.49
133               8       5.16  7.10  9.95  12.39  20.17

Queue size 32 packets:
Send Rate (Kbps)  # TCPs  p50   p75   p90   p95   p99
> 400             0       0.93  0.95  1.07  1.23  1.43
> 400             1       1.32  1.52  1.92  2.11  2.97
> 400             2       1.66  2.17  2.80  3.48  4.49
> 400             8       4.37  5.95  7.54  8.61  10.45
400               0       0.39  0.54  0.63  0.84  1.23
400               1       1.04  1.41  1.91  2.22  3.01
400               2       1.84  2.43  3.10  3.38  4.35
400               8       4.25  5.78  7.28  8.31  10.18
200               0       0.09  0.09  0.09  0.09  0.09
200               1       0.30  0.47  0.78  0.96  1.33
200               2       0.61  1.22  1.79  2.26  3.18
200               8       3.84  5.35  6.93  8.03  10.11
133               0       0.08  0.09  0.09  0.09  0.09
133               1       0.24  0.41  0.50  0.54  0.83
133               2       0.38  0.56  0.89  1.06  1.73
133               8       3.74  5.39  7.19  8.39  10.91

Table 5: Percentiles for the artificially generated end-to-end transmission delay (s) of individual messages during ordered delivery.
We observe that both the HoLB frequency and the delay penalty incurred by a HoLB event typically increased with increasing traffic load. However, it can also be observed that in almost all tests, the length of the HoLB events, in terms of affected messages, was inversely related to the HoLB frequency and the traffic load (i.e., the number of cross-traffic TCP flows). At low traffic load, the HoLB frequency was often low since there was typically no congestion, and thus very few packet losses. However, since there were few packet losses, the sender window could grow fairly large. Thus, when a HoLB event did in fact occur, it could involve a relatively large number of packets or messages. Conversely, at high traffic load the HoLB frequency was large and the congestion window, and consequently the sender window, was
Queue size 12 packets (%):
Send Rate (Kbps)  # TCPs  p50    p75    p90    p95    p99
> 400             0       0.03   0.20   26.88  27.49  3.47
> 400             1       9.74   10.90  8.36   5.48   5.23
> 400             2       9.25   5.70   3.54   2.48   2.44
> 400             8       8.20   6.28   6.59   5.33   11.35
400               0       6.90   10.79  51.12  69.82  47.14
400               1       5.26   6.51   3.29   4.39   0.07
400               2       7.69   4.21   3.65   3.82   1.62
400               8       7.96   7.00   5.36   5.46   5.50
200               0       0.00   0.00   0.00   0.00   0.00
200               1       3.17   20.24  6.82   7.38   4.50
200               2       16.49  5.49   5.19   6.58   0.32
200               8       7.75   5.76   5.58   6.85   4.30
133               0       0.00   0.00   0.00   0.00   0.00
133               1       1.34   3.27   13.38  31.91  12.95
133               2       4.61   15.57  28.79  17.99  9.68
133               8       7.33   6.31   7.81   5.23   4.84

Queue size 32 packets (%):
Send Rate (Kbps)  # TCPs  p50    p75    p90    p95    p99
> 400             0       0.20   0.09   12.71  11.35  3.11
> 400             1       7.94   5.35   12.57  7.77   5.35
> 400             2       6.93   7.83   6.53   6.78   2.02
> 400             8       6.37   4.37   4.13   3.42   1.99
400               0       0.00   8.33   7.42   35.97  62.36
400               1       7.18   10.32  25.03  21.92  18.41
400               2       6.84   10.69  6.97   5.24   18.40
400               8       5.56   3.94   3.95   4.24   2.66
200               0       0.00   0.00   0.00   0.00   0.00
200               1       1.43   5.40   31.69  19.49  5.95
200               2       16.68  8.35   13.46  13.88  7.39
200               8       5.32   3.82   2.92   3.12   1.88
133               0       0.00   0.00   0.00   0.00   0.00
133               1       1.27   1.17   1.45   2.42   39.60
133               2       3.50   7.50   16.14  5.82   3.74
133               8       5.82   4.05   2.50   2.58   1.96

Table 6: Relative difference between percentiles for the end-to-end transmission delay of individual messages during unordered and ordered delivery.
small. As a result, a single HoLB event only affected a few packets or messages. Translated to our results on the mean transmission delays, this means that the delay penalties introduced by an increased HoLB frequency were more or less balanced out by fewer messages being affected at each HoLB event. Put differently, the effect of HoLB became almost the same for all studied traffic loads.
Table 7 also explains why the effect of HoLB on the mean transmission delays did not decrease with increased buffer size. Although the packet-loss rate, and thus the HoLB frequency, was larger in the tests with a 12-packet buffer, the number of messages affected by a HoLB event was mostly larger in the tests with a 32-packet buffer. Consequently, the larger number of retransmissions in the 12-packet buffer tests was essentially compensated for by the larger size of the HoLB events in the 32-packet buffer tests.
5 Conclusions
Avoiding HoLB when transporting PSTN signaling traffic over IP was one of the primary incentives for the IETF SIGTRAN working group to develop SCTP, a new transport protocol, in the first place. Although several studies have been made on the impact of HoLB on TCP and SCTP, their results are ambiguous and, to some extent, contradictory. We have therefore conducted a detailed experimental study on the impact of HoLB on ordered delivery in SCTP. Our study suggests that although HoLB could indeed incur a substantial delay penalty on a small fraction of the messages in an SCTP session, it has only a marginal impact on the average end-to-end transmission delay. We only observed improvements in the range of 0% to 18% in average message transmission delay when using unordered delivery as compared to ordered delivery. Additionally, it was evident that other factors often had a larger impact on the average transmission delay than HoLB: a large variability between different test runs made the impact of HoLB statistically insignificant in several tests. Also worth noting is that the impact of HoLB did not always increase with increasing traffic load; the reason for this is that not only the frequency of HoLB events, but also the number of affected messages, determined the impact of HoLB. Thus, the largest impact of HoLB occurred when the SCTP sender window was relatively large while there was still a certain amount of HoLB events.
At present, we have only studied the effect of HoLB on a constant SCTP flow; in future work, however, we plan to extend our scope to SCTP flows that better capture the properties of actual PSTN and SIP signaling traffic. In this context, it may also be appropriate to consider the multi-streaming feature of SCTP.
References
[1] The 3rd generation partnership project (3GPP). http://www.3gpp.org.
[2] 3GPP. IP multimedia subsystem (IMS); stage 2 (release 6). Technical Specification TS
23.228 v6.9.0, 3GPP, March 2005.
Queue size 12 packets:
Send Rate (Kbps)  # TCPs  # Evts  Length (msgs) with 95% C.I.  Delay Penalty (s) with 95% C.I.
> 400             0       24.00   9.75 ± 0.62                  2.80 ± 0.16
> 400             1       16.75   15.21 ± 0.92                 7.89 ± 0.94
> 400             2       14.80   18.20 ± 0.99                 9.97 ± 0.91
> 400             8       30.74   10.99 ± 0.32                 13.19 ± 1.20
400               0       7.84    31.38 ± 1.78                 9.76 ± 0.78
400               1       14.03   15.32 ± 0.76                 5.60 ± 0.52
400               2       15.13   15.82 ± 0.81                 7.57 ± 0.76
400               8       31.08   11.12 ± 0.31                 13.05 ± 1.38
200               0       0.00    0.00                         0.00
200               1       9.20    11.66 ± 0.63                 3.89 ± 0.37
200               2       13.56   14.19 ± 0.56                 5.34 ± 0.47
200               8       31.58   10.67 ± 0.28                 12.47 ± 1.10
133               0       0.00    0.00                         0.00
133               1       4.98    11.12 ± 0.43                 2.47 ± 0.15
133               2       10.60   11.22 ± 0.36                 3.00 ± 0.19
133               8       31.23   10.83 ± 0.29                 12.20 ± 0.87

Queue size 32 packets:
Send Rate (Kbps)  # TCPs  # Evts  Length (msgs) with 95% C.I.  Delay Penalty (s) with 95% C.I.
> 400             0       40.00   2.35 ± 0.23                  0.97 ± 0.06
> 400             1       6.06    32.66 ± 1.56                 16.37 ± 0.90
> 400             2       11.65   17.78 ± 1.09                 10.64 ± 0.91
> 400             8       14.87   16.14 ± 0.67                 17.64 ± 1.69
400               0       1.20    68.99 ± 5.07                 30.72 ± 2.45
400               1       9.07    23.05 ± 1.50                 13.91 ± 1.52
400               2       8.86    22.46 ± 1.44                 16.95 ± 1.98
400               8       14.79   15.46 ± 0.61                 14.86 ± 1.27
200               0       0.00    0.00                         0.00
200               1       4.42    20.85 ± 1.29                 6.86 ± 0.53
200               2       9.44    17.13 ± 1.20                 8.77 ± 1.04
200               8       15.03   15.21 ± 0.61                 13.38 ± 0.98
133               0       0.00    0.00                         0.00
133               1       1.20    20.58 ± 0.97                 5.21 ± 0.55
133               2       4.32    19.21 ± 0.96                 6.34 ± 0.38
133               8       14.69   15.14 ± 0.55                 12.84 ± 0.77

Table 7: Statistics for HoLB events calculated from artificially generated transmission traces for ordered delivery.
Paper VIII
1 Introduction
Unlike a datacom network, a telecom network logically comprises two networks: a transport
and a signaling network. The transport network carries the voice traffic, while the signaling
network carries the control information that is needed for the administration and supervision
of calls, and the management of the telecom network itself.
Traditionally, signaling traffic and voice traffic are both carried over TDM-based, circuit-switched connections. However, this is about to change. Using IP networks and protocols, telecom operators see ways to improve resource utilization and to reduce operational, maintenance, and network infrastructure costs. Still, the transition from TDM to IP
will not happen overnight, maybe never. The traditional telecom network represents a huge capital investment (more than $350 billion of legacy equipment is installed in the current telecom network [2]) and is still unsurpassed in terms of reliability and QoS [11]. To address
the situation of two different, co-existing networks, one TDM-based and one IP-based, the IETF SIGTRAN working group has developed an architecture for signaling traffic over IP. In particular, they have developed an architecture for running Signaling System No. 7 (SS7), the predominant signaling system in traditional TDM-based telecom networks, over IP. Together with the so-called SoftSwitch architecture, the SIGTRAN architecture [13] constitutes a complete solution for the integration of the two networks.
Interoperability between the traditional TDM-based telecom network and its IP counterpart requires that the signaling performance in the IP network be comparable to that of TDM. Although some time has passed since the SIGTRAN architecture was first published, it is still unclear whether it will perform comparably to the traditional telecom network [4], or whether it will lead to unacceptable performance degradations [5].
The SIGTRAN architecture specifies a common transport protocol for all SS7 signaling traffic, SCTP [17], and a number of adaptation layers that run on top of SCTP. Although several adaptation layers have been specified, it seems as if a majority of telecom companies have embraced the MTP-L3 User Adaptation Layer (M3UA) [16]. This adaptation layer mimics the functionality of MTP-L3, the SS7 transport layer, and makes it possible to run all layers of the SS7 stack above MTP-L3 on top of SCTP without modification.
The Message Transfer Part (MTP) of the SS7 stack, of which MTP-L3 is the topmost
layer, is not only responsible for the reliable transmission of signaling traffic, but also for
network redundancy. In particular, link failures in traditional TDM-based SS7 networks are
primarily managed by MTP. When a link failure occurs, this is detected by layer 2 in MTP
(MTP-L2). MTP-L2 informs MTP-L3 about the failed link, and a so-called changeover is
performed by MTP-L3. The changeover procedure diverts the signaling traffic carried by
the unavailable link to alternate links as quickly as possible while avoiding message loss,
duplication, or reordering.
To obtain network redundancy in a SIGTRAN network corresponding to that of a traditional SS7 network, SCTP supports so-called multi-homed associations. Multi-homed associations make it possible to manage several TCP-like connections, called paths in SCTP, as one redundant logical connection. When one path goes down, SCTP performs a failover and switches all traffic to an alternate path. A failover mechanism similar to the one in SCTP is also provided by M3UA; we therefore henceforth refer to failovers in SCTP as SCTP-controlled failovers.
This paper evaluates the performance of SCTP-controlled failovers in M3UA-based SIGTRAN networks, both in terms of SCTP failover times and in terms of the maximum Message Signal Unit (MSU) transfer times experienced by M3UA users during failovers. Moreover, the paper studies to what extent the performance of SCTP-controlled failovers correlates with the path propagation delay and with the SCTP parameter Path.Max.Retrans, the upper bound on the SCTP path error counter.
Our main contribution is to show that in order to achieve performance similar to the changeover procedure in a traditional SS7 network, SCTP has to be configured much more aggressively than what is recommended in RFC 2960. It is also shown that for the envisioned path propagation delays in future SIGTRAN networks, the effect of the path propagation delay on the failover performance is only minor.
[Figure 1: Evaluated network scenario. Two signaling end points, SEP1 and SEP2, each run an M3UA application atop M3UA/SCTP/IP and are connected through an IP network by a primary path and an alternative path.]
2 Methodology
The purpose of our experiment was to evaluate the performance of SCTP-controlled failovers in the typical network scenario depicted in Figure 1. SEP1 and SEP2 are two SIGTRAN signaling end points, each running an M3UA application. The two M3UA applications are engaged in a signaling session in which the SEP1 application acts as the source of the signaling traffic and the SEP2 application acts as the sink. During the signaling session, SEP2 becomes unreachable via its primary path; SCTP at SEP1 detects the failed primary path and performs a failover to the alternate path. When the failover has completed, the signaling session continues on the alternate path, and ends before the primary path has recovered.
To evaluate the failover performance of SCTP in the network scenario of Figure 1, we used the experiment setup illustrated in Figure 2. The flow of events in the test runs of the experiment closely mimicked the flow of events in the evaluated network scenario. The source application at SEP1 continuously sent MSUs to the sink application at SEP2. When 30 s of a test run had elapsed, i.e., more than enough time for SCTP to enter its stationary transmission behavior, the primary path was broken. A failover occurred, and the source application resumed its transmission on the alternate path. The test run ended 90 s after the primary path was taken down, which was enough time for SCTP to conclude the failover and regain its stationary transmission behavior on the alternate path.
[Figure 2: Experiment setup. The SEP1 source application and the SEP2 sink application (SEP2: a Sun Ultra 10) run atop M3UA/SCTP/IP on Solaris 8 hosts, with a test manager at SEP2. The primary and alternate paths are emulated by dummynet on two PCs, L1 (400 MHz) and L2 (230 MHz, running the L2 Path Manager), and the hosts are attached over a LAN to an NTP server.]
Parameter               SCTP Configuration
                        RFC2960     Telecom(p)
RTOinit                 3000 ms     80 ms
RTOmin                  1000 ms     80 ms
RTOmax                  60000 ms    150 ms
Path.Max.Retrans (p)    5           2, 3, 4, 5
Heartbeat Interval      30000 ms    30000 ms
SACK Timer              200 ms      40 ms

Table 1: SCTP configurations considered in the experiment: RFC2960 and Telecom(2) through Telecom(5).

[Table 2: The tested combinations of SCTP configuration (RFC2960, Telecom(2)-Telecom(5)) and path propagation delay.]
The Telecom(p) notation alludes to the fact that these configurations are all variations of Telecom(2), which is the configuration recommended by some large telecom companies. In particular, the other three Telecom configurations included in the experiment are all examples of SCTP configurations which, in terms of failover, are more conservative than Telecom(2).
Tests were performed with three different path propagation delays: 5 ms, 10 ms, and 20 ms. These delays are believed to represent typical path propagation delays in future dedicated SIGTRAN networks.
Only a subset of the possible combinations of path propagation delay and SCTP configuration was tested. Specifically, our experiment comprised the 11 tests listed in Table 2. Each test was run 10 times, giving a total of 110 test runs.
As follows from Table 2, RFC2960, Telecom(2), and Telecom(5) were tested with all three path propagation delays. This made it possible for us to study the correlation between failover performance and path propagation delay for, on the one hand, the SCTP configuration recommended by the IETF and, on the other hand, the two extremes, in terms of failover conservativeness, among the Telecom configurations. The SCTP configurations Telecom(3) and Telecom(4) were only tested with a path propagation delay of 10 ms. However, combined with the corresponding tests for Telecom(2) and Telecom(5), these tests enabled us to study the correlation between the SCTP failover performance and the SCTP parameter Path.Max.Retrans.
3 Results
As briefly mentioned in Section 2, event logging at SEP1 and SEP2 took place in all test runs. Specifically, the time the primary path was broken and the time the path failure was detected by SCTP at SEP1 were logged. The failover time in a test run was then calculated as the difference between the SCTP detection time and the actual time of the path failure.
The sending times of the MSUs by the source application and the reception times of the MSUs by the sink application were also logged during each test run. (Note that the timing of the MSUs occurred at the level of the M3UA application, and not at the SCTP level.) Based on these values, the MSU transfer times were calculated as the difference between the reception and sending times of the MSUs.
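Both metrics are plain differences of logged timestamps (a sketch; the log layout and all timestamps below are hypothetical, not values from the experiment):

```python
# Failover time: SCTP's detection time of the path failure minus the actual
# failure time. MSU transfer time: reception time minus sending time, both
# logged at the M3UA application level.

def failover_time(t_failure, t_detected):
    return t_detected - t_failure

def msu_transfer_times(send_log, recv_log):
    """send_log/recv_log map MSU sequence number -> timestamp (s)."""
    return {seq: recv_log[seq] - send_log[seq] for seq in recv_log}

send_log = {1: 30.000, 2: 30.010, 3: 30.020}
recv_log = {1: 30.005, 2: 30.480, 3: 30.490}   # MSUs 2-3 delayed by a failover

print(round(failover_time(30.010, 30.455), 3))               # 0.445
print(max(msu_transfer_times(send_log, recv_log).values()))  # worst-case MSU
```

The maximum over all per-MSU differences in a run gives the maximum MSU transfer time reported in the paper.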
Figure 3 and Table 3 summarize the results of the measurements of the failover times and the MSU transfer times for the three SCTP configurations RFC2960, Telecom(2), and Telecom(5). Recall from Section 2 that RFC2960 is the configuration of SCTP recommended in RFC 2960 [17]; that Telecom(2) is an SCTP configuration with strong proponents in the telecom sector; and that Telecom(5) is a conservative version of Telecom(2). In particular, Telecom(5) is a merge of Telecom(2) and RFC2960: the RTO parameters of Telecom(5) are the same as for Telecom(2), i.e., they are set with respect to the envisioned delays in future SIGTRAN networks, while the failover behavior of Telecom(5) is as conservative as that of RFC2960.
The lin-log graphs in Figure 3(a) plot the sample means of the measured failover times in the tests as a function of the path propagation delay. The sample means are also listed in Table 3, together with their corresponding 99% confidence intervals.
It follows from Table 3 that the mean failover times for RFC2960 were on the order of 63 s for all three path propagation delays considered. This is not surprising since, with five retransmissions until a path is abandoned (i.e., Path.Max.Retrans = 5), the theoretical failover time for RFC2960 (assuming that RTO = RTOmin, which was the case in all our tests) becomes exactly 63 s: 1 s + 2 s + 4 s + 8 s + 16 s + 32 s = 63 s.
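The same calculation works for any configuration in Table 1 (a sketch; it assumes, as stated in the text, that RTO starts at RTOmin and doubles on each timeout, capped at RTOmax):

```python
def theoretical_failover_time(rto_min_ms, rto_max_ms, path_max_retrans):
    """Sum of the p+1 consecutive retransmission timeouts that elapse before
    a path is abandoned, with RTO starting at rto_min_ms and doubling on
    each timeout (capped at rto_max_ms)."""
    total, rto = 0, rto_min_ms
    for _ in range(path_max_retrans + 1):
        total += rto
        rto = min(2 * rto, rto_max_ms)
    return total

# RFC2960: RTOmin = 1 s, RTOmax = 60 s, Path.Max.Retrans = 5
print(theoretical_failover_time(1000, 60000, 5))  # 63000 ms

# Telecom(2): RTOmin = 80 ms, RTOmax = 150 ms, Path.Max.Retrans = 2
print(theoretical_failover_time(80, 150, 2))      # 380 ms
```

The Telecom(2) figure of 380 ms is a theoretical lower bound; it is consistent with, though somewhat below, the measured failover times of 435 ms to 505 ms.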
As shown in Figure 3(a), the failover times for the Telecom configurations were several orders of magnitude smaller than for RFC2960. In particular, it follows from Table 3 that the failover times of Telecom(2) were mostly in the range of 435 ms to 505 ms, while Telecom(5) had roughly twice the failover times of Telecom(2).
As mentioned in Section 1, the path failure scenario corresponding to the one studied in our experiment is managed by the MTP-L3 changeover procedure in a traditional SS7 network. According to ITU-T recommendation Q.706 [8], the changeover time in an SS7 network must be less than or equal to 800 ms. Since basically the same applications will be used in future SIGTRAN networks as are used in current SS7 networks, it is reasonable to assume that the requirements are roughly the same. Thus, it follows from our experiment that RFC2960 most likely will fail to meet the Q.706 requirement on changeover. In fact, the failover times of RFC2960 were almost 80 times the changeover limit of Q.706. This is, of course, to be expected, and is in agreement with the results reported in [5] and [9]. More interestingly, we observe that while the failover times of Telecom(2) were well below the changeover limit of Q.706, this was not the case for Telecom(5). Thus, it seems that if
3. Results
257
1e+06
RFC2960
Telecom(2)
Telecom(5)
100000
10000
1000
100
10
1
0
10
15
Path Propagation Delay (ms)
20
25
1e+06
RFC2960
Telecom(2)
Telecom(5)
100000
10000
1000
100
10
1
0
10
15
Path Propagation Delay (ms)
20
25
258
SCTP is to be used for transfer of signaling traffic, it not only has to abandon the conservative
RTO settings of RFC 2960, but also has to switch from a failed path less conservatively than
recommended by RFC 2960.
Figure 3(a) and Table 3 also suggest that the path propagation delay had only a minor impact on the SCTP failover time, at least for propagation delays no greater than 20 ms, i.e., for those path propagation delays considered typical in future dedicated SIGTRAN networks. Specifically, when the path propagation delay was increased from 5 ms to 20 ms, the increase in mean failover time was much less than 1% for RFC2960, about 5% for Telecom(2), and close to 12% for Telecom(5).
Still, there was indeed a correlation between failover time and path propagation delay, although, as follows from Table 3, it could only be established for RFC2960 and Telecom(5). For these two SCTP configurations there was, with 99% confidence, an increase in failover time when the path propagation delay increased from 5 ms to 20 ms.
In the same way as for the failover times, Figure 3(b) and Table 3 give the results of the
measurements of the maximum MSU transfer times. To avoid having the SCTP slow start
and the transient behavior of SCTP during the termination of a test run interfere with the
results, the first and last seconds of a test run were excluded from the calculation.
The graphs show that the maximum MSU transfer times for RFC2960 and Telecom(2)
were almost the same as their failover times, while Telecom(5) had maximum MSU transfer
times about 380 ms shorter than its failover times. Contrary to the failover times, there is no ITU-T recommendation that explicitly governs the MSU transfer times. Instead, the upper bound on the MSU transfer times is determined by the application layers atop MTP-L3, i.e., the MTP-L3 stakeholders.
The primary stakeholders of MTP-L3 in terms of MSU transfer time are the ISUP (ISDN User Part) [7] and TCAP (Transaction Capabilities Application Part) [6] application protocols. The basic function of ISUP is to control the setup, connection, and teardown of telephone calls, while TCAP is an application protocol used by a large number of distributed SS7 applications. Examples of applications using TCAP include various Intelligent Networking (IN) applications and mobility-support applications in mobile networks (i.e., GSM and IS-41).
Although neither ISUP nor TCAP imposes any explicit requirements on MSU transfer times, analyses [1, 10, 15] suggest that the maximum permissible MSU transfer times with respect to these application protocols are in the range of 600 ms to 1000 ms, with 1000 ms being barely acceptable. With these figures in mind, it is obvious that RFC2960, with maximum MSU transfer times of about 63 s, did not comply with the ISUP/TCAP requirements. Again, as with the RFC2960 failover times, this was to be expected. Less expected was that Telecom(5) also had some difficulty passing the ISUP/TCAP requirements. As follows from Table 3, the mean maximum MSU transfer time for Telecom(5) at a path propagation delay of 20 ms was 718 ms. Considering that the ISUP/TCAP requirements are worst-case values, and that the measurements took place in a scenario with no competing traffic, Telecom(5) may not give adequate MSU transfer times during a failover in a real SIGTRAN network. Thus, the outcome of the maximum MSU transfer time measurements only reinforces that of the failover time measurements: If SCTP is to
[Table 3: 99% confidence intervals for failover performance vs. path propagation delay.]
[Two panels: failover time (top) and maximum MSU transfer time (bottom), in ms (0 to 1400), vs. Path.Max.Retrans for Telecom(p).]
Figure 4: Failover performance vs. Path.Max.Retrans for Telecom(p) (10 ms path propagation delay).
be used for signaling traffic, then it has to be much less conservative than recommended by
RFC 2960.
Figure 3(b) and Table 3 also show that the path propagation delay had only a minor impact on the maximum MSU transfer times experienced during a failover. Furthermore, the correlation between maximum MSU transfer time and path propagation delay was weak, and could only be established for Telecom(5).
We also performed a more detailed study of the impact of the SCTP parameter Path.Max.Retrans on the SCTP failover performance. The outcome of this study is compiled in the graphs in Figure 4. The graphs plot the sample means of the measured failover times and maximum MSU transfer times together with their 99% confidence intervals.
It follows from the graphs that the value of Path.Max.Retrans indeed had a major impact on the failover time. An increase of Path.Max.Retrans from 2 to 3 resulted in a relative increase of the mean failover time of about 40%, and when Path.Max.Retrans was increased from 3 to 4, or from 4 to 5, the relative increase was about 20% in both cases. Even more important, already with a Path.Max.Retrans of 4, SCTP failed to meet the failover requirement of Q.706, again reinforcing the need for SCTP to be much more aggressive than recommended by RFC 2960 if it is to be used for SS7 signaling transport.
The graphs also show that the value of Path.Max.Retrans had some influence on the maximum MSU transfer time. Specifically, the maximum MSU transfer time increased by approximately 35% when the value of Path.Max.Retrans was changed from 2 to 5. However, the maximum MSU transfer times were below the ISUP/TCAP requirements for all values of Path.Max.Retrans. Thus, in terms of MSU transfer time, there was no problem having Path.Max.Retrans configured as conservatively as recommended by RFC 2960 while still meeting the SS7 signaling transport needs.
4 Conclusions
This paper presents an evaluation of the performance of SCTP-controlled failovers in future
M3UA-based SIGTRAN networks. The evaluation suggests that in order to meet the failover
performance objectives of a traditional SS7 network, SCTP has to abandon the conservative failover behavior recommended by RFC 2960. Specifically, it has to set the parameter
Path.Max.Retrans to a value no larger than 3. In addition, it has to change from the
RTO-parameter configuration recommended by RFC 2960 to a parameter configuration far
more in line with the actual path propagation delays in the SIGTRAN network.
The evaluation also suggests that the configuration of the SCTP parameter Path.Max.Retrans has a major impact on the failover performance: especially in terms of failover time, but also, to some extent, in terms of the maximum MSU transfer time experienced by an M3UA application during failover.
In contrast, the evaluation indicates that for path propagation delays in the range of 5 ms to 20 ms, i.e., for path propagation delays believed to be representative of dedicated SIGTRAN networks, the path propagation delay has only a minor impact on the failover performance. (The dotted lines in the graphs of Figure 4 are only provided to make the trends clearer, and do not suggest that Path.Max.Retrans is continuous.)
Our future work includes studying the effects of introducing competing signaling traffic on the performance of SCTP-controlled failovers, and in particular the tradeoff between shorter failover times and spurious failovers. We also want to study to what extent the SCTP failover performance degrades with different levels and mixtures of competing traffic. Furthermore, it remains to find out how other configurable SCTP parameters, e.g., RTO_min and RTO_max, affect the failover performance.
References
[1] T. Seth, A. Broscius, C. Huitema, and H-A. P. Lin. Performance requirements for TCAP signaling in Internet telephony. Internet draft, IETF, February 1999. Work in Progress.
[2] W. Buga. The evolution of softswitch architecture. Annual Review of Communications, 55:73–76, 2002.
[3] A. L. Caro Jr., J. R. Iyengar, P. D. Amer, G. J. Heinz, and R. R. Stewart. Using SCTP multihoming for fault tolerance and load balancing. In SIGCOMM 2002 Poster Session, Pittsburgh, Pennsylvania, USA, August 2002.
[4] L. Coene and J. Pastor. Telephony signalling transport over SCTP applicability statement. Internet draft, IETF, August 2003. Work in Progress.
[5] K. D. Gradischnig and M. Tuexen. Signaling transport over IP-based networks using IETF standards. In 3rd International Workshop on the Design of Reliable Communication Networks (DRCN), pages 168–174, Budapest, Hungary, October 2001.
[6] ITU-T. Signalling system no. 7 functional description of transaction capabilities.
Recommendation Q.771, ITU-T, June 1997.
[7] ITU-T. Signalling system no. 7 ISDN user part functional description. Recommendation Q.761, ITU-T, December 1999.
[8] ITU-T. Signalling system no. 7 message transfer part signalling performance. Recommendation Q.706, ITU-T, March 1999.
[9] A. Jungmaier, E. P. Rathgeb, and M. Tuexen. On the use of SCTP in failover scenarios. In 6th World Multiconference on Systemics, Cybernetics and Informatics, Orlando, Florida, USA, July 2002.
[10] H-A P. Lin, K-M Yang, T. Seth, and C. Huitema. VoIP signaling performance requirements and expectations. Internet draft, IETF, October 1999. Work in Progress.
[11] P. Molinero-Fernandez, N. McKeown, and H. Zhang. Is IP going to take over the world (of communications)? ACM Computer Communication Review, 33:113–118, January 2003.
Paper IX
Impact of Traffic Load on SCTP Failovers
in SIGTRAN
Reprinted from
1 Introduction
Since Voice over IP (VoIP) roared into prominence during the latter part of the 1990s, the idea of a converged network based on IP technology for voice, video, and data has gained strong momentum. However, in spite of all the prospective advantages of IP, it would be naive to think that the transition from the traditional circuit-switched network to IP will happen overnight.
In light of this, the IETF Signaling Transport (SIGTRAN) working group has defined
an architecture, the SIGTRAN architecture [13], for seamless Signaling System #7 (SS7)
signaling between VoIP and the traditional telecom network. The SIGTRAN architecture
essentially comprises two components: a new IP transport protocol, the Stream Control
Transmission Protocol (SCTP) [16], specifically designed for signaling traffic; and an adaptation sublayer. The adaptation sublayer shields SS7 from SCTP and IP, and depending on
how much of the SS7 stack is run atop SCTP, different adaptation protocols are used. Examples of adaptation protocols include: M2PA [4] for adaptation of the SS7 MTP-L3 [6]
protocol to IP, and M3UA [15] for adaptation of SCCP [7] and user part protocols such as
ISUP [8].
It is widely recognized that to gain user acceptance, the SIGTRAN architecture has to perform comparably to the traditional circuit-switched telecom network [3]. In particular, it has to provide the same level of availability as a traditional SS7 network. Considering that ITU-T prescribes an availability level of 99.9988% [9], i.e., no more than 10 minutes of downtime per year, and that many telecom networks have an even higher availability level [11], this is indeed a great challenge.
To meet the stringent requirements of SS7, several availability mechanisms have been included in the SIGTRAN architecture, of which the SCTP failover mechanism is one of the more important ones, if not the most important one. It corresponds to the MTP-L3 changeover procedure, and enables rapid re-routing of traffic away from a failed signaling route within a SIGTRAN network. In particular, the SCTP failover mechanism constitutes part of SCTP's multi-homing support.
Although the SCTP failover mechanism plays a key role in the availability support of the SIGTRAN architecture, very few results are available on its actual performance in this context. Jungmaier et al. [10] have studied the SCTP failover performance in an M2PA-based network, and showed that it only meets the ITU-T requirements provided it is configured very aggressively and the network path propagation delays are very short. A similar result was obtained by Grinnemo et al. [5] when they performed measurements on the SCTP failover performance in an M3UA-based network.
Both the study in [10] and the one in [5] took place in unloaded networks, i.e., under quite unrealistic conditions. This paper advances the work in [5], and partly the work in [10], by studying the impact of traffic load on the SCTP failover performance in an M3UA-based SIGTRAN network. The main contribution of the paper is that it demonstrates that cross traffic, especially bursty cross traffic such as SS7 signaling traffic, could indeed significantly deteriorate the SCTP failover performance. Furthermore, the paper stresses the importance of keeping the router queues in a SIGTRAN network relatively small. In fact, the paper shows that bursty traffic in combination with ill-dimensioned router queues may well cause the SCTP failover mechanism to not comply with the ITU-T requirement on the MTP-L3 changeover procedure [9]. Finally, the paper identifies some issues regarding the design of the SCTP failover mechanism which in some cases negatively affect the failover performance.
The remainder of the paper is organized as follows. Section 2 gives a brief description
of the SCTP failover mechanism. Then, in Section 3 follows a description of the design and
execution of the experiment that underlies our study. Next, in Section 4, we elaborate on
the results of the experiment. Finally, in Section 5, the paper ends with some concluding
remarks and words on future work.
2 Failovers in SCTP
While IP networks have many virtues, high availability and reliability have traditionally not been seen as two of them. Unlike circuit-switched paths, which exhibit changeover and failover times on the order of milliseconds, measurements show that it may take well over ten seconds before the routers in the Internet reach a consistent view after a path failure [12]; in other words, too long for delay-sensitive SS7 signaling traffic.
In the SIGTRAN architecture, the unsuitability of IP for high-availability routing of SS7
signaling messages is addressed through various redundancy mechanisms at the transport
and adaptation layers. As previously mentioned, one of the most important network redundancy mechanisms in SIGTRAN is the SCTP failover mechanism.
An example of how the SCTP failover mechanism works is illustrated in Figure 1. In
this example, we have an SCTP connection, a so-called association, between two signaling
end points: SEP-A and SEP-B. The association comprises two routing paths: path #1 and
path #2. Since SCTP does not support load-sharing, one path in an association is always
designated the primary path and is the path on which signaling traffic is normally sent. The
remaining paths, if any, become backup or alternate paths. In our example, path #1 is the
primary path and path #2 an alternate path.
SCTP continuously monitors reachability on the primary and alternate paths: on an active primary path, SCTP probes for reachability using the transferred data packets themselves, while on idle alternate paths dedicated heartbeat packets are used. Furthermore, for each path (actually, for each network destination), SCTP keeps an error counter that counts the number of consecutively missed acknowledgements of data or heartbeat packets. A path is considered unreachable when the error counter of the path exceeds the value of the SCTP parameter Path.Max.Retrans. In the remaining discussion, it is assumed that the SCTP stacks at SEP-A and SEP-B are configured with Path.Max.Retrans set to 2.
As follows from the time line in Figure 1, a failure occurs on the primary path at time t1. At that time, the SCTP retransmission timeout (RTO) variable is assumed to be 240 ms, and it is assumed that there is outstanding traffic. Thus, at t2 = t1 + 240 ms, the SCTP retransmission timer, T3-rtx, expires and a timeout occurs; an SCTP packet worth of outstanding data is retransmitted on the alternate path, and the error counter of the primary path is incremented by one. Furthermore, the RTO variable is backed off, or more precisely

    RTO ← min{max(2 × RTO_cur, RTO_min), RTO_max},    (1)

where RTO_cur denotes the current value of the RTO variable, and RTO_min and RTO_max are SCTP parameters that limit the range of the RTO variable. Here, it is assumed that RTO_min is set to 80 ms and RTO_max to 250 ms.
At time t3 , new data is sent out on the primary path, and the T3-rtx timer is restarted with
the value of the updated RTO variable. The flow of events that occurred at times t2 and t3 are
repeated at times t4 and t5 . When time t6 is reached, the error counter of the primary path
becomes 3, i.e., greater than Path.Max.Retrans, and SCTP considers the path failed
and starts sending new data onto the alternate path. In other words, the failover concludes.
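The backoff rule in Equation (1) and the walkthrough above can be sketched as follows (an illustrative reconstruction, not code from the paper; the function names are ours). With RTO = 240 ms at the time of the failure, RTO_min = 80 ms, RTO_max = 250 ms, and Path.Max.Retrans = 2, the three timeout periods sum to a theoretical failover time of 740 ms:

```python
def backoff(rto_cur, rto_min, rto_max):
    """Equation (1): RTO <- min{max(2 * RTO_cur, RTO_min), RTO_max}."""
    return min(max(2 * rto_cur, rto_min), rto_max)

def failover_time(rto_at_failure, rto_min, rto_max, path_max_retrans):
    """Sum the timeout periods until the error counter exceeds
    Path.Max.Retrans and the failover concludes (as in Figure 1)."""
    total, rto = 0, rto_at_failure
    for _ in range(path_max_retrans + 1):
        total += rto                        # T3-rtx expires after one RTO
        rto = backoff(rto, rto_min, rto_max)
    return total

print(backoff(240, 80, 250))           # -> 250, as in Figure 1
print(failover_time(240, 80, 250, 2))  # -> 740 (= 240 + 250 + 250 ms)
```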
[Figure 1: Example of an SCTP failover. An association between SEP-A and SEP-B runs over a SIGTRAN network with a primary path (path #1) and an alternate path (path #2). The time line shows three successive T3-rtx timeouts on the primary path: the first after 240 ms, after which the RTO is backed off to min{max{2 × 240, 80}, 250} = 250 ms, where it remains, since min{max{2 × 250, 80}, 250} = 250 ms.]
3 Methodology
To be able to study the impact of traffic load on the SCTP failover performance, we considered the network scenario depicted in Figure 2.
In this scenario, two M3UA users at signaling end points SEP1 and SEP2 were engaged
in a signaling session over a SIGTRAN network with varying degrees of traffic load. The
session took place over a multi-path association with one primary and one alternate path.
Initially, all signaling traffic in the M3UA session was routed on the primary path. However,
30 s into the signaling session a failure occurred on the primary path. As a result, the signaling traffic was re-routed from the primary to the alternate path. The network scenario ended
[Figure 2: Experiment network setup. SEP1 and SEP2 (Sun Ultra 10, Solaris 8) ran the M3UA source and sink applications over SCTP, while SEP3–SEP6 (PCs, Red Hat 8) ran source and sink applications generating SCTP/UDP cross traffic. Router1 and Router2 (PCs, FreeBSD 5.0, dummynet), interconnected via Ethernet switches, emulated the primary and alternate paths: path delay 25 ms, bandwidth 1 Mbps, router queue 3, 6, or 13 KBytes.]
[Table of SCTP parameter settings: 250 ms, 80 ms, 250 ms, 2, 40 ms; from the surrounding text, these correspond to RTO_init, RTO_min, RTO_max, Path.Max.Retrans, and the SACK delay, respectively.]
[Figure 4: (a) failover time and (b) maximum MSU transfer time, in ms (0 to 1000), vs. cross-traffic load (CT-NONE, CT-LOW, CT-MEDIUM, CT-HIGH) for Router1 queue sizes of 3, 6, and 13 KBytes.]
4 Results
The SCTP failover performance was evaluated in terms of two metrics: the failover time experienced by the SEP1 source application, and the maximum Message Signal Unit (MSU) transfer time measured during failover in the M3UA session between SEP1 and SEP2. The sample means were used as estimates of the failover times and the max. MSU transfer times in the tests.
Figure 4 summarizes the results of our experiment. Figure 4 (a) shows how the SCTP failover time was affected by increasing traffic load at different router queue sizes, while Figure 4 (b) shows the same relationship for the max. MSU transfer time. The error bars depict the 99% confidence intervals, and the lines connecting the mean failover times and max. MSU transfer times are only supplied as a visualization aid to make the trends clearer.
As follows from Figure 4, the deteriorating effect of the cross traffic on the failover performance increased with increasing traffic load and router queue size. When the Router1 queue was only 3 KBytes, the cross traffic did not significantly affect the failover and max. MSU transfer times. However, as the queue size was increased, the effect of the cross traffic became more and more apparent. When the Router1 queue was 13 KBytes, the CT-HIGH cross traffic increased the failover time by more than 50% and the max. MSU transfer time by almost 40% as compared with no cross traffic at all.
The reason for the increased failover and max. MSU transfer times was the queueing delays that arose at Router1 when the router queue was fairly large and the cross traffic was bursty (i.e., when the short-term bandwidth requirement of the cross traffic sometimes exceeded the bandwidth capacity of the primary path). As a matter of fact, in previous tests with the same test flow, but with constant-bit-rate cross traffic, it was found that the traffic load had no significant impact on the failover performance provided it was less than the path capacity.
Another observation worth making concerns the SCTP failover times with regard to the ITU-T requirement on the MTP-L3 changeover procedure [9]. To comply with this requirement, the SCTP failover times should be no more than 800 ms. However, as follows from Figure 4, this requirement was only fulfilled in those cases in which the Router1 queue was relatively small (3 KBytes or 6 KBytes). In the tests with a router queue of 13 KBytes, or twice the bandwidth-delay product (to our knowledge a quite common configuration [1]), the failover times averaged well above 850 ms at medium (CT-MEDIUM) and high (CT-HIGH) traffic loads.
Interestingly, in all tests, the measured failover times were significantly larger than what could be expected given the RTOs, and the discrepancy grew with larger traffic loads and router queues. Consider, for example, the test with a 13 KByte Router1 queue and the CT-HIGH cross traffic. When this test was re-run with tracing of the RTO, the RTO at the time of the path failure, RTO_t, was measured to be 240 ms. Considering only the timeout periods, this gives a theoretical failover time of 240 ms + 250 ms + 250 ms = 740 ms (see Section 2). However, the measured failover time was 920 ms, or 180 ms larger than our estimate.
[Figure 5: Time line of a failover between SEP1 and SEP2. After each of the three T3-rtx timeouts on the primary path P1, the congestion window (cwnd) is reduced to 1 MTU, and about 80 ms elapse before the T3-rtx timer is restarted on P1.]
The reason for this discrepancy turned out to be substantial delays between the expiration of the T3-rtx timer and its restart during the failover (see Figure 5). When a timeout
occurred, the SCTP congestion window at SEP1 was reduced to 1 Maximum Transmission
Unit (MTU). As a result, no packets were sent out on the primary path, and the T3-rtx timer
was not restarted, until the amount of outstanding data went below 1 MTU. This meant, as
shown in Figure 5, an extra delay (apart from the timeout delay) of about 80 ms at each
timeout event.
Although an extra delay of 80 ms at each timeout during a failover already has to be considered quite large in this context (SS7 signaling), even larger delays could be expected in real-world SIGTRAN networks. Specifically, when large amounts of data are outstanding at the time of a path failure, it could take several transmission rounds before the T3-rtx timer of the primary path is restarted after a timeout.
Finally, as an aside, we would like to mention the significant penalty in terms of failover performance that could result from setting RTO_init, the initial value of the RTO, too low. Specifically, a too low value of RTO_init with respect to the round-trip time of the alternate path (see footnote 8) could result in one extra retransmission, and thus one extra timeout period, before SCTP considers the primary path failed. To gain some appreciation of the extent to which this could impede the failover performance in a SIGTRAN network, we re-ran the test with the Router1 queue set to 13 KBytes and with no cross traffic (CT-NONE), but this time with RTO_init at SEP1 and SEP2 configured to 80 ms instead of 250 ms. The result was an increase in failover time of about 180 ms, or 32%, compared with the original test (cf. Figure 4 (a)).
5 Conclusions
This paper studies the impact of traffic load on the SCTP failover performance in an M3UA-based SIGTRAN network. Two performance metrics are considered: the SCTP failover time, and the maximum transfer time experienced by an M3UA user during failover. The paper shows that cross traffic, especially bursty cross traffic such as SS7 signaling traffic, could indeed significantly deteriorate the SCTP failover performance. Furthermore, the paper demonstrates how important it is to configure the routers in a SIGTRAN network with relatively small queues. For example, in tests with bursty cross traffic and with router queues twice the bandwidth-delay product (to our knowledge a quite common configuration), the measured failover times were on average more than 50% longer than those measured with no cross traffic at all. In fact, in these situations, our study suggests that the SCTP failover performance may not even meet the ITU-T requirement on MTP-L3 changeovers.
Two important observations are also made in the paper concerning the SCTP failover behavior. First, it is shown that the delays which occur between the expiration of the SCTP retransmission timer (T3-rtx) and its restart during a failover could contribute significantly to the failover and max. MSU transfer times. Second, the paper comments on the extent to which a too low initial retransmission timeout (RTO) value, i.e., a too low value of the SCTP parameter RTO_init, could deteriorate the failover performance.
While cross traffic, T3-rtx restart delays, and low values of RTO_init could have a significant negative effect on the SCTP failover performance, a major factor remains the length of the timeout periods. Thus, in our future work, we intend to study ways of shortening these periods without threatening network stability. Specifically, we intend to study the effect of introducing a more relaxed RTO backoff mechanism.
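As a rough illustration of what a relaxed backoff could buy (our own sketch under stated assumptions, not a result from the thesis), replacing the factor 2 in the RTO backoff with a smaller factor shortens the RFC 2960-style failure detection time considerably. Here we assume, as in the earlier examples, that the RTO equals RTO_min = 1 s at the time of failure, with RTO_max = 60 s and Path.Max.Retrans = 5:

```python
def detection_time(rto_at_failure, rto_max, path_max_retrans, backoff_factor):
    """Sum of timeout periods until the path is declared failed, with the
    RTO multiplied by backoff_factor (capped at rto_max) per timeout."""
    total, rto = 0.0, rto_at_failure
    for _ in range(path_max_retrans + 1):
        total += rto
        rto = min(backoff_factor * rto, rto_max)
    return total

print(detection_time(1.0, 60.0, 5, 2.0))  # -> 63.0 s (standard doubling)
print(detection_time(1.0, 60.0, 5, 1.5))  # -> 20.78125 s with factor 1.5
```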
References
[1] M. Allman. TCP byte counting refinements. ACM Computer Communication Review, 29(3):14–22, July 1999.
8 Note that the first transmission round on the alternate path within a timeout period only comprises a single SCTP packet. Consequently, the SACK timer delay adds to the initial round-trip time in a timeout period on the alternate path, something that is easily overlooked when RTO_init is configured.
[2] A. T. Andersen. Modelling of packet traffic with matrix analytic methods. PhD thesis, Informatics and Mathematical Modelling, Technical University of Denmark, DTU,
September 1995.
[3] S. Fu and M. Atiquzzaman. SCTP: State of the art in research, products, and technical challenges. IEEE Communications Magazine, 42(4):64–76, April 2004.
[4] T. George, B. Bidulock, R. Dantu, H. J. Schwarzbauer, and K. Morneault. Signaling
system 7 (SS7) message transfer part 2 (MTP2) - user peer-to-peer adaptation layer
(M2PA). Internet draft, IETF, June 2004. Work in Progress.
[5] K-J Grinnemo and A. Brunstrom. Performance of SCTP-controlled failovers in M3UAbased SIGTRAN networks. In Advanced Simulation Technologies Conference (ASTC),
Applied Telecommunication Symposium (ATS), pages 8691, Arlington, Virginia, USA,
April 2004.
[6] ITU-T. Specifications of signalling system no. 7 message transfer part: Signalling
network functions and messages. Recommendation Q.704, ITU-T, July 1996.
[7] ITU-T. Specifications of signalling system no. 7 - signalling connection control part:
Signalling connection control part procedures. Recommendation Q.714, ITU-T, July
1996.
[8] ITU-T. Specifications of signalling system no. 7 - ISDN user part: ISDN user part
signalling procedures. Recommendation Q.764, ITU-T, July 1997.
[9] ITU-T. Signalling system no. 7 message transfer part signalling performance. Recommendation Q.706, ITU-T, March 1999.
[10] A. Jungmaier, E. P. Rathgeb, and M. Tuexen. On the use of SCTP in failover scenarios. In 6th World Multiconference on Systemics, Cybernetics and Informatics, Orlando, Florida, USA, July 2002.
[11] R. Kuhn. Sources of failure in the public switched telephone network. IEEE Computer, 30(4):31–36, April 1997.
[12] C. Labovitz, A. Ahuja, A. Bose, and F. Jahanian. Delayed Internet routing convergence. IEEE/ACM Transactions on Networking, 9(3):293–306, June 2001.
[13] L. Ong, I. Rytina, M. Garcia, H. Schwarzbauer, L. Coene, H. Lin, I. Juhasz, M. Holdrege, and C. Sharp. Framework architecture for signalling transport. RFC 2719, IETF,
October 1999.
[14] F. J. Scholtz. Statistical analysis of common channel signaling system no. 7 traffic. In 15th Internet Traffic Engineering and Traffic Management (ITC) Specialist Seminar, Würzburg, Germany, July 2002.
[15] G. Sidebottom, K. Morneault, and J. Pastor-Balbas. Signaling system 7 (SS7) message
transfer part 3 (MTP3) user adaptation layer (M3UA). RFC 3332, IETF, September
2002.
Paper X
Using Relaxed Timer Backoff to
Reduce SCTP Failover Times
Under submission
Adam Wolisz
Department of Electrical Engineering
Technical University Berlin, Germany
awa@ieee.org
Abstract
SCTP's multi-homing feature allows it to fail over to an alternate network path in case of network failures, which is vital for meeting the reliability requirements of many signaling applications. But using the standard RFC 2960 configuration, SCTP takes a minimum of 63 s to detect a failure and perform the failover, which is far too slow for these applications. The main cause of the long failover time lies in SCTP's use of a binary exponential timer backoff. This backoff is part of its congestion control scheme and also plays an important role in Karn's algorithm.
Existing approaches try to accelerate failure detection by disabling or limiting the exponential timer growth. This, however, partly defeats the purpose of the backoff, impairing congestion control and limiting SCTP's ability to adapt to delay variations. In case of unexpected delay increases, this may lead to unstable behavior and a large number of spurious retransmission timeouts.
Instead of just disabling or limiting the timer backoff, we propose to accelerate failure detection in a soft way by using smaller backoff factors. Based on existing research results from the area of MAC contention, we argue that properly chosen smaller backoff factors still yield stable behavior in case of congestion. Also, full adaptability to delay variations is maintained.
We present estimations of the failure detection times that can be achieved with this
approach in realistic network scenarios. Complementing the estimations, we present
results of simulations and measurements with a commercial SCTP stack for selected
scenarios.
1 Introduction
Over the past decade, the Internet has become a ubiquitous means of communication and has quickly surpassed traditional circuit-switched network traffic volumes. More and more telecommunications carriers, companies, and vendors have come to envision a next-generation network in which voice, video, and data converge onto a single IP-based infrastructure, operating over both wired and wireless physical media. They see Voice over IP (VoIP) as a natural step in this direction. Still, it is clear that the transition from the traditional public switched telephone network (PSTN) to VoIP will not happen overnight: With about 1.4 billion users, about $900 billion of worldwide revenue, and more than $350 billion of legacy equipment installed, the PSTN will most likely live on for at least a decade or so. Thus, to enable seamless interoperation of VoIP with the traditional PSTN, the IETF Signaling Transport (SIGTRAN) working group has defined a framework architecture [35] for the transport of PSTN signaling, i.e., Signaling System No. 7 (SS7) traffic, over IP.
The SIGTRAN architecture essentially comprises two components: a new transport protocol, the Stream Control Transmission Protocol (SCTP) [43], specifically designed for signaling traffic; and an adaptation sub-layer which essentially makes it possible to run existing
SS7 application protocols unaltered on top of SIGTRAN.
To allow interoperability between PSTN and VoIP networks, it is important that the SIGTRAN architecture meets the functional and performance requirements of SS7. In particular,
it must exhibit the same availability as a traditional SS7 network. To this end, several network redundancy mechanisms have been incorporated into the SIGTRAN architecture, with
the SCTP failover mechanism being among the most important ones.
The SCTP failover mechanism replaces the Changeover and Forced Rerouting procedures of SS7, and thus should approximately match the performance provided by these
procedures. Specifically, as will be shown in Section 3, an SCTP failover should take no
more than about 2 s to complete. However, several studies, e.g. [15, 26], indicate that SCTP
will not always be able to achieve a failover performance in that order of magnitude, especially if failovers should be carried out in a conservative manner to avoid false or spurious
failovers. To this end, this paper proposes a modified retransmission timeout strategy which
uses a relaxed timer backoff factor of less than the standard factor of two employed by
SCTP's exponential backoff mechanism [43]. The paper shows through simulations, complemented with validating experiments using a real SCTP stack, that the relaxed backoff
could significantly improve SCTP failover times. Specifically, the paper suggests that such
a strategy could substantially enlarge the range of permissible network delays (i.e., delays
over which SCTP remains compliant with SS7 failover requirements) for SIGTRAN networks, without having to resort to a much less conservative failover behavior. The
paper also presents strong arguments that SCTP remains stable with a relaxed timer backoff;
this is significant because the backoff mechanism is an important part of SCTP's congestion
control.
The remainder of the paper is organized as follows. Section 2 gives some preliminary
material on SCTP and the SCTP failover mechanism. This is followed in Section 3 with an
elaboration of the requirements in terms of failover time imposed on SCTP by SS7. Furthermore, Section 3 surveys previous and current work on improving the SCTP failover performance. Next, in Section 4, the stability of SCTP with a relaxed timer backoff is considered,
and arguments are presented showing that it is reasonable to believe that the exponential
timer backoff mechanism of SCTP remains stable with an appropriate backoff factor of less
than 2. Section 5 presents theoretical estimations of the failover times that could be expected
with a relaxed timer backoff. This is complemented in Section 6 with simulations of failover
times for a selection of backoff factors and network conditions. Section 6 also includes an
experimental validation of a subset of the executed simulations. Finally, Section 7 concludes
the paper and outlines future work.
data is sent that way. Any remaining paths serve as alternate paths, and will normally only
be used for retransmissions of dropped packets. Only if SCTP concludes that the primary
path has failed permanently is one of the alternate paths selected as the new primary path.
From that point on, new data is sent over this path. Note that the SCTP specifications
currently do not support the concurrent transmission of new data on multiple paths, which
could be used, e.g., for load sharing. Multi-homing is currently only used to enhance
failure resilience.
SCTP provides two kinds of probing mechanisms for monitoring the reachability of
the peer destination addresses, one for the primary path and one for the alternate paths.
For the primary path, SCTP keeps an error counter which counts the number of consecutively missed responses to data transmissions (i.e., acknowledgements), detected by expired retransmission timeouts. If the error counter exceeds a configurable threshold, called
Path.Max.Retrans, the destination address belonging to the primary path is considered
unreachable. If, on the other hand, an acknowledgement for data sent on the primary path is
received, the error counter is reset to zero.
The purpose of the threshold Path.Max.Retrans is to reduce the risk of false unreachability detections due to packet losses that may happen even if the destination address
is still reachable, e.g., in case of network congestion. Choosing a value for
Path.Max.Retrans involves a tradeoff: A high value reduces the risk of false detections, but a lower value allows for faster detection of real failures because fewer consecutive
timeouts are needed until the destination is considered unreachable. RFC 2960 recommends
a rather conservative default value of 5 for Path.Max.Retrans.
A similar error counter as for the primary path is also used for each alternate path. But
since alternate paths are not normally used for data transmissions, SCTP uses a heartbeat
mechanism for probing. Heartbeats are sent periodically, based on a configurable heartbeat
timer. If a heartbeat response on an alternate path is not received within a specified time
period, the error counter is incremented in the same way as described above. Again, when
a counter exceeds Path.Max.Retrans, the corresponding destination address is considered unreachable.
Figure 1 illustrates the flow of events taking place when a failure occurs on the primary
of the two paths of an association between two multi-homed Signaling End Points (SEPs),
SEP1 and SEP2. The SEPs in this example have two addresses each (A and B), and are
connected by the two paths A1-A2 and B1-B2, with A1-A2 being the initial primary path.
Both end points are assumed to be configured with the default values recommended by
RFC 2960, i.e., Path.Max.Retrans is assumed to be 5.
At (2), the primary path fails, which results in a timeout at (3) for the data chunks sent
earlier at (1). The timeout triggers the retransmission of the outstanding data at (3), which,
as mentioned earlier, takes place on an available alternate path, here B1-B2. Furthermore,
the error counter of the primary path is incremented by one.
A new retransmission timeout (RTO) is now calculated for the primary path. According
to RFC 2960, the RTO is doubled following each retransmission timeout. This is called
exponential timer backoff, and is an essential part of SCTP's congestion control mechanism,
following the principles described in [24]. Consequently, the RTO grows exponentially with each
consecutive timeout if the path failure persists. However, the RTO is bounded by the SCTP
Figure 1: Flow of events during a failover between SEP1 and SEP2 (primary path A1-A2, alternate path B1-B2; events (1)-(6), timeouts RTO 1 and RTO 2).
RTO is at least 1 s at the time of the path failure. Thus, the minimum failover time for SCTP
configured as recommended by RFC 2960 is 1 s + 2 s + 4 s + 8 s + 16 s + 32 s = 63 s, i.e.,
the sum of the time intervals between consecutive timeouts.
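The 63 s minimum failover time can be reproduced by summing the consecutive timeout intervals. The sketch below uses our own helper name and omits the RTO.Max clamp, which never triggers in this scenario:

```python
def timeout_intervals(rto_min=1.0, backoff=2.0, path_max_retrans=5):
    """Consecutive timeout intervals (in seconds) leading up to a failover:
    Path.Max.Retrans + 1 timeouts, with the RTO multiplied by the backoff
    factor after each one."""
    rto, intervals = rto_min, []
    for _ in range(path_max_retrans + 1):
        intervals.append(rto)
        rto *= backoff
    return intervals

# RFC 2960 defaults: RTO.Min = 1 s, backoff factor 2, Path.Max.Retrans = 5
ivals = timeout_intervals()
print(ivals, sum(ivals))  # -> [1.0, 2.0, 4.0, 8.0, 16.0, 32.0] 63.0
```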
As previously mentioned, the SCTP failover mechanism replaces the Changeover and Forced
Rerouting procedures of SS7 [18, 32], and thus should provide approximately the same
failover performance as these procedures. In this section, we quantify an appropriate performance target for our further considerations.
Some previous work [15, 26] has used an upper limit on the SS7 Changeover procedure
of 800 ms, but the performance figures are actually only indirectly specified by the ITU-T
standards. The ITU-T recommendation Q.703 [17] imposes, through the timer T7 ("Excessive delay of acknowledgement"), an upper limit on the link failure detection time of between
500 ms and 2 s, and recommendation Q.706 [19] prescribes upper limits on the Changeover
detection and response times of 500 ms and 300 ms, respectively. Together with the transmission and handling delays for the Changeover request and response messages, which typically
sum up to about 100 ms, we end up with an upper limit between 1.4 s and 2.9 s.
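The two bounds can be tallied directly from the figures just cited (this is simply our own breakdown of that arithmetic):

```python
# Components of the SS7 failover budget (ms), per Q.703/Q.706 as cited above
t7_link_failure_detection = (500, 2000)  # timer T7 lower/upper bound
changeover_detection = 500
changeover_response = 300
message_delays = 100                     # typical transmission + handling delays

low = t7_link_failure_detection[0] + changeover_detection + changeover_response + message_delays
high = t7_link_failure_detection[1] + changeover_detection + changeover_response + message_delays
print(low / 1000, high / 1000)  # -> 1.4 2.9
```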
This range for an upper limit on a failover is also in line with the SIGTRAN pre-study
work by Seth et al. [6, 39], which suggests an upper limit on a path failover of less than 2 s.
Further, a similar delay requirement is also imposed on SCTP by BICC [20], a common
call-control protocol in a GSM/WCDMA network.
Taking these considerations into account, we chose a target value of 2 s for the SCTP
failover time. As is evident from the failover example in Section 2, SCTP configured according to RFC 2960 does not even come close to meeting this target.
The reason for this mismatch between SS7 requirements and RFC 2960 is that the IETF
standardized SCTP with a conservative configuration that can be used over the Internet,
rather than specifically targeting controlled signaling networks. In the Telephony Signaling
Transport over SCTP Applicability Statement [11], the IETF acknowledges this fact, and
suggests several ways to adapt SCTP for SS7. They suggest setting Path.Max.Retrans
to a value less than 5 and/or setting RTOmin to less than 1 s. They also suggest more radical
solutions such as disabling or drastically limiting the exponential timer backoff. Typically,
the latter is done by setting RTOmax to a value much less than the 60 s recommended
by RFC 2960, thus curbing the growth of the RTO during the consecutive timeout events
leading to a failover. Examples of practical uses of this method include the SCTP/T stack by
Adax [2], and the work of Jungmaier et al. [26].
However, as recognized by the IETF, the suggested solutions to improve the SCTP
failover times are not without problems. Configuring Path.Max.Retrans to a value
lower than 5 increases the risk of false or spurious failovers. Limiting or disabling the exponential timer backoff mechanism can contribute to a destabilization of the network in case
of congestion [14, 16, 24, 40, 41] (see also Section 4). Furthermore, it impedes Karn's algorithm [27], which relies on the timer backoff to acquire a new round-trip time (RTT) estimate
in case of a sudden delay increase. The algorithm, which is mandatory for SCTP [43], states
that the RTT must not be measured using packets that have been retransmitted, to avoid
retransmission ambiguities [27, 45]. Instead, the backed-off RTO is used as the new RTO after
a timeout, ensuring that the RTO eventually becomes greater than the increased RTT. Without the timer backoff, SCTP may never be able to collect a correct RTT measurement in such
a situation.
Apart from the solutions put forth by the IETF to improve SCTP's failover times, Caro,
Iyengar et al. [7, 8, 22] have performed extensive studies on the SCTP failover and changeover
mechanisms. They have proposed a two-level threshold failover mechanism [7] which
disentangles failover detection from failover recovery: When the number of retransmissions
reaches the first, lower threshold, recovery begins and traffic is re-routed from the primary path
to one of the alternate paths. However, the primary path is not considered unreachable until
the number of retransmissions reaches the second, higher threshold. The two-level threshold
mechanism improves throughput during SCTP failovers, and could be used as a complement
to our proposal. Specifically, our proposal enables the use of a higher value for the second
threshold, and thus avoids unnecessary re-routings which, from a traffic-engineering viewpoint,
might have a destabilizing effect on a network.
The failover times could also be improved by extending SCTP with support for concurrent transmission on several paths. A path recovery in this context then translates to a redistribution of data from the failed path to the still active paths. Several recent works have been
concerned with concurrent multi-path transfers. For example, Iyengar et al. [21, 23] studied
the consequences of sending new data across several paths, not just the primary one. They
proposed modifications and extensions to SCTP to mitigate the negative effects, in terms of
packet reordering and impaired congestion control, which emerge with multi-path transfers.
In line with their work, Ye et al. [44] have proposed IPCC-SCTP, an enhancement to SCTP
for more efficient support of multi-homing. In IPCC-SCTP, the per-association congestion
control of SCTP has been replaced with an independent per-path congestion control. As a
successor to IPCC-SCTP, LS-SCTP [3] has been proposed, which extends IPCC-SCTP by
providing load sharing among multiple paths.
Others have also proposed load-sharing extensions to SCTP. Casetti et al. [10] have
suggested an extension to SCTP for bandwidth-aware load sharing. They have devised and
implemented a bandwidth-aware source scheduler in SCTP with the objective of maximizing
throughput. Furthermore, in a later incarnation of their work [9], they have incorporated
the low-pass bandwidth estimation technique of TCP Westwood+ [30] in their SCTP stack,
and developed Westwood SCTP. Still further examples of load sharing include the RivuS [1]
open-source project.
Although the number of research efforts on concurrent multi-path transfers and load sharing in SCTP is fairly large, it should be noted that this is still very much work in progress.
Requests to explicitly permit transmission over several paths have so far been rejected in
the IETF. As a matter of fact, it remains to be decided whether layer three is indeed the correct
layer for multi-path routing in the first place.
a certain point, the so-called knee, the rate at which the throughput increases declines. As
the load on the network continues to increase, the network capacity, the cliff, is eventually
reached. Beyond the cliff, throughput actually drops with increased load and the network
is said to be in a state of congestion collapse [33]. In this state, the majority of packets
being transmitted are actually retransmissions of discarded or presumed discarded packets.
The network is poorly utilized in this state despite high traffic demand, and the response
times are excessively long. Still worse, when congestion collapse is reached, the network
is not able to recover from this state by itself [33].
A transport protocol is said to be stable if it adjusts its sending rate in such a way that
the network always operates below the cliff, or, in other words, if it prevents the network
from reaching congestion collapse in the first place. As for TCP, the stability of SCTP
relies upon its congestion control and retransmission timer backoff mechanisms [24]. Thus,
to be able to use SCTP with a relaxed retransmission timer backoff in anything but
well-dimensioned, controlled network environments, the relaxed backoff has to be stable.
SCTP essentially uses the same retransmission timer backoff mechanism as TCP, and, as
follows from Jacobson [24], the argument for stability of the retransmission timer backoff
mechanism in TCP relies to a large extent upon the work of Kelly [28]. Kelly showed that
no backoff mechanism that backs off slower than exponential is stable when the number of
contending flows is infinite. In [24], Jacobson argues that the problem of stability of the
TCP retransmission timer backoff mechanism is equivalent to that of the stability of backoff
protocols for multiple access control (MAC) channels such as Ethernet. Several works, such
as those of Aldous [4] and Goodman [14], show that no matter how large the number
of stations is, as long as it is finite, a MAC contention protocol with a backoff factor of 2 is
stable. As a matter of fact, Song et al. [41] have shown that, provided the number of sending
stations is finite, the optimal backoff factor is not 2 but approximately 1.6. It has even been
shown by Hastad et al. [16] and Goldberg et al. [13] that, provided the number of stations is
finite, it suffices to use a superlinear polynomial backoff, e.g., a quadratic backoff, to obtain
stability.
Taken together, these findings make us believe that using a relaxed backoff factor of less
than 2 will not endanger the stability of SCTP. In this paper, we consider backoff factors of
1.5 and 1.75, which are close to or above the optimum found in [41], and well-suited for
efficient implementation using binary integer arithmetic.
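To illustrate the last point: with the RTO kept as an integer number of timer ticks or milliseconds, factors of 1.5 and 1.75 can be applied with shifts and adds alone. This is a sketch of the kind of implementation we have in mind, not code from an actual SCTP stack:

```python
def backoff_1_50(rto_ticks: int) -> int:
    # 1.5 = 1 + 1/2: one shift, one add
    return rto_ticks + (rto_ticks >> 1)

def backoff_1_75(rto_ticks: int) -> int:
    # 1.75 = 1 + 1/2 + 1/4: two shifts, two adds
    return rto_ticks + (rto_ticks >> 1) + (rto_ticks >> 2)

print(backoff_1_50(1000), backoff_1_75(1000))  # -> 1500 1750
```

The same shift-and-add expressions translate directly to fixed-point C in a kernel or protocol stack, which is why such factors are attractive in practice.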
To be able to theoretically estimate the failover times that could be expected by using a
relaxed backoff factor, we make a number of assumptions regarding the implementation of
the SCTP retransmission mechanism:
Minimum RTO: In order to achieve the desired failover performance, it is necessary to
set the minimum RTO (RTOmin) to values significantly smaller than the one recommended
in the standard [43] (i.e., 1 s). Some research results suggest that lowering
RTOmin can lead to an increased number of spurious timeouts [5], while others argue
that a lower bound should not be needed if a proper predictor is used for the RTO
calculations [12, 29].9 The latter work also demonstrates that the timer as specified
in RFC 2988 [36] is still conservative regardless of RTOmin, due to the fact that the
retransmission timer is restarted whenever an acknowledgement for the earliest outstanding segment (or TSN in the case of SCTP) is received, which delays the timeout by
roughly one round-trip time beyond the nominal RTO. Furthermore, [12] describes a
simple enhancement that can greatly reduce the number of spurious timeouts without
setting a static RTOmin: the RTO is not allowed to be smaller than the most recent
RTT sample plus a safety margin of two times the timer granularity. This dynamic
RTOmin was, to our knowledge, first introduced in the BSD implementation of TCP.
In any case, the choice of a particular value for RTOmin can be seen as a tradeoff
between the risk of spurious timeouts and the responsiveness of the timer. Given the available
research, we believe that choosing a lower RTOmin to meet our application requirements
bears only a limited risk of performance degradation due to increased spurious
timeouts. In the following we assume RTOmin to be set to a value equal to or smaller
than the RTT.
Timer Granularity: In order to use a low RTOmin, the timer granularity must be sufficiently fine. In cases where the RTO timer is implemented using a heartbeat timer,
the granularity of this timer effectively determines the lower limit for RTOmin. More
specifically, RTOmin should in this case be no less than twice the granularity of the
heartbeat timer [36].
Delayed Acknowledgements: RFC 2960 recommends the use of delayed acknowledgements with a delay of 200 ms. This adds to the RTT seen by the SCTP sender and
thus increases the RTO and prolongs the total failover time. To achieve the lowest
possible failover times, it should be possible to disable delayed acknowledgements, or
to set the delay to a lower value than recommended in the RFC. Note that this may
come at the cost of increased protocol overhead on the return path.
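The dynamic RTOmin mentioned in the Minimum RTO item above can be sketched in a few lines (function and parameter names are our own illustrations; units in milliseconds):

```python
def clamp_rto(rto_ms: int, last_rtt_ms: int, granularity_ms: int) -> int:
    """Dynamic RTO lower bound: never let the RTO drop below the most
    recent RTT sample plus a safety margin of twice the timer granularity."""
    return max(rto_ms, last_rtt_ms + 2 * granularity_ms)

print(clamp_rto(30, 40, 10))   # -> 60 (clamped up to 40 + 2*10)
print(clamp_rto(100, 40, 10))  # -> 100 (already above the bound)
```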
As described in Section 2, a path is declared inactive by SCTP, and a failover is initiated,
if more than Path.Max.Retrans consecutive timeouts occur on the path. The failover
time, i.e., the time that elapses from the first packet loss caused by the path failure until
the path is declared inactive, is thus approximately the sum of the lengths of these
timeouts.10 Taking the exponential timer backoff into account, the failover time can therefore be
calculated as follows:
    t_failover = t_RTO * Σ_{i=0}^{PMR} β^i = t_RTO * (β^(PMR+1) - 1) / (β - 1),    (1)
where t_RTO denotes the RTO at the time of the failure (before exponential backoff), PMR
the parameter Path.Max.Retrans, β the backoff factor, and t_failover the total failover
time.
9 Note that SCTP uses essentially the same timer algorithms as TCP [36], so research based on TCP is also valid for SCTP in this context.
10 In practice, the failover time can be somewhat longer because of processing delays within the protocol stack.
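Equation (1) is straightforward to evaluate numerically. The sketch below (our own helper, symbols as defined above) reproduces the 63 s worst case and previews the effect of relaxed factors:

```python
def failover_time(t_rto: float, beta: float, pmr: int) -> float:
    # Closed form of Eq. (1): t_RTO * (beta^(PMR+1) - 1) / (beta - 1)
    return t_rto * (beta ** (pmr + 1) - 1) / (beta - 1)

# RFC 2960 defaults: t_RTO = 1 s, beta = 2, Path.Max.Retrans = 5
print(failover_time(1.0, 2.0, 5))  # -> 63.0

# Relaxed factors with t_RTO = 80 ms (2 * RTT, RTT = 40 ms) and PMR = 4
print(round(failover_time(0.08, 1.5, 4), 3))   # -> 1.055
print(round(failover_time(0.08, 1.75, 4), 3))  # -> 1.644
print(round(failover_time(0.08, 2.0, 4), 3))   # -> 2.48
```

Note how, for the exemplary 40 ms RTT scenario discussed below, only the relaxed factors stay under the 2 s target.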
Figure 2: Estimated failover times for various values of Path.Max.Retrans and backoff
factors (RTT = 40 ms).
To obtain a failover time estimate for a given network scenario, we need to estimate
t_RTO. However, t_RTO is in itself a function of the RTT and its long- and short-term
variations [24, 36], neither of which we can say anything about in the general case. In order to
get an impression of the possible benefits in a real-world scenario, we thus make some further
assumptions. In particular, we choose an exemplary signaling network with an RTT of 40 ms
for our discussion. We also assume for the purpose of this discussion that t_RTO is 2 × RTT
(which is a fairly conservative assumption for a tightly controlled network11). Given these
assumptions, Figure 2 shows how the SCTP failover time depends on the backoff factor and
Path.Max.Retrans.
To further demonstrate the gain from using a relaxed backoff factor, we expand our discussion to RTTs in the interval 20 ms to 100 ms, and assume a fixed
Path.Max.Retrans of 4. Figure 3 shows how the SCTP failover time varies with the
backoff factor and the RTT. We observe that while standard SCTP exhibits failover times
longer than 2 s already at RTTs above 35 ms, the failover times of SCTP with a backoff
factor of 1.75 remain below 2 s almost up to an RTT of 50 ms, and with a backoff factor
of 1.5 almost up to 80 ms.

11 In [5], Allman et al. present captured traces of TCP traffic from the Internet which suggest that, before any backoffs take place, the RTO is typically 3-5 times the RTT. However, since we assume a dedicated SIGTRAN network with far less delay variation, we believe that using an RTO of 2 times the RTT is a fairly conservative assumption.

Figure 3: Estimated failover times for various RTTs and backoff factors
(Path.Max.Retrans = 4).
Figure 4: Simulation topology: SEP1 and SEP2, each with SCTP/TCP traffic sources and
sinks, connected over a primary path (A1-A2) and an alternate path (B1-B2); 10 Mbps LAN
links, 2 Mbps WAN links with RED queues of 50 packets, and 12 competing cross-traffic
flows.
The SCTP sender's RTO, SRTT, and RTTVAR variables at the time of the link failure.
This enabled us to distinguish between the two estimators used in the calculation of
the RTO (the mean and variance estimators of the RTT, i.e., SRTT and RTTVAR). As
pointed out in Section 5, the RTO at the time of the path failure (t_RTO) is the primary
factor determining the failover time for a given parameter setting.
For each parameter setting, 50 simulation runs were executed using different, mutually
independent random number streams for the individual runs. The averages over the fifty sets
of observations were computed along with 95% confidence intervals.
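For reference, the SRTT and RTTVAR estimators feed into the RTO roughly as prescribed by RFC 2988, which SCTP follows. The sketch below shows one update step with the standard gains (alpha = 1/8, beta = 1/4); the RTOmin/RTOmax clamping and initialization rules are omitted:

```python
def update_rto(srtt: float, rttvar: float, rtt_sample: float,
               g: float = 0.020, alpha: float = 1 / 8, beta: float = 1 / 4):
    """One RFC 2988-style update of SRTT and RTTVAR and the resulting RTO.
    g is the clock granularity; all times in seconds."""
    rttvar = (1 - beta) * rttvar + beta * abs(srtt - rtt_sample)
    srtt = (1 - alpha) * srtt + alpha * rtt_sample
    rto = srtt + max(g, 4 * rttvar)
    return srtt, rttvar, rto

# A single 48 ms sample against SRTT = 40 ms, RTTVAR = 10 ms
srtt, rttvar, rto = update_rto(0.040, 0.010, 0.048)
print(rto > srtt)  # the variance term keeps the RTO above the smoothed RTT
```

This makes concrete why the Mixed Traffic scenario, with its larger queueing-delay variance, produces a larger RTO at the time of the failure.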
Simulation parameter settings:

Link Delay: 20 ms, 40 ms, 80 ms, 100 ms
Backoff Factor: 1.5, 1.75, 2
Path.Max.Retrans: 4
SACK Delay: 0 ms, 20 ms
RTOinit: 3 s
RTOmin: 20 ms
RTOmax: 60 s
Figure 5: Impact of traffic scenario and delayed acknowledgements on RTO variables (Link
Delay = 40 ms). Error bars show 95% confidence intervals.
As follows from Figure 5, the Single Association and Multiple Associations scenarios
had similar RTT and RTT variance estimates, and, accordingly, the RTO at the time of the
failover differed very little between these two scenarios. This was no surprise, considering
that the WAN link was not saturated in either of these two scenarios. In contrast, due to
larger queueing delays and delay variances, the RTO in the Mixed Traffic scenario was considerably larger than in the other two scenarios. Figure 5 also shows that the use of delayed
acknowledgements increased the estimated RTT and its variance, and thus had a significant
impact on the RTO in the Single Association and Multiple Associations scenarios. For the
remainder of this section we only consider the simulations without delayed acknowledgements.
Figure 6 presents the failover times obtained in the Single Association and Multiple Associations scenarios. Since the two scenarios had almost the same RTO at the time of the
failover, they exhibited very similar failover times.
We observe from Figure 6 that with the standard backoff factor of 2, our target failover
time of 2 s was only met when the link delay was less than about 60 ms. In comparison, a
backoff factor of 1.5 expanded the permissible link delay far beyond 100 ms. In other words,
a reduction of the backoff factor from 2 to 1.5 allowed us to still achieve our performance
Figure 6: Failover times with 95% confidence intervals for different backoff factors in Single
and Multiple Associations scenarios (Path.Max.Retrans = 4, no delayed acknowledgements).
Figure 7: Failover times with 95% confidence intervals in Mixed Traffic scenario
(Path.Max.Retrans = 4, no delayed acknowledgements).
acknowledgements at all should be used (although especially the latter comes at the cost of
increased overhead on the return path). Judging from the performance observed in the
Mixed Traffic scenario, it seems unlikely that the failover performance required for
time-sensitive signaling applications can be achieved in a pure best-effort environment without
risking unstable behavior.
Figure 8: Measurement setup: two Solaris 8 SCTP end points (traffic source and sink)
connected via a primary path (A1-A2) and an alternate path (B1-B2) over 10 Mbps LANs,
with FreeBSD 5.0 Dummynet machines (L1, L2) emulating a bandwidth of 2 Mbps and
path delays of 10 ms, 20 ms, 40 ms, and 50 ms.
Conclusions
In this paper, we have proposed a new approach to lowering the failover times of SCTP for
use with time-critical signaling applications. This was motivated by the observation that
the aggressive tuning of the SCTP retransmission timer parameters commonly used today
to reduce failover times bears a number of risks with regard to network stability and
protocol performance. In particular, the adaptability to changing network delays is limited
and the congestion control mechanism is impaired, making such approaches risky
in environments where there are no strict delay guarantees and congestion cannot be ruled
out completely.
The key element of our proposal is to lower the factor used in SCTP's exponential timer
backoff to a value less than 2. Since the exponential backoff is the main contributing factor
to SCTP's long failover times, using a relaxed backoff helps shorten the failover times
significantly.
Figure 9: Validation of failover times (Path.Max.Retrans = 4, no delayed acknowledgements). Error bars show 95% confidence intervals.
References
[1] RivuS project. http://sourceforge.net/projects/rivus.
[2] The Need for High Performance Convergence. Adax, 2001. White Paper.
[3] A. A. El Al, T. Saadawi, and L. Myung. LS-SCTP: A bandwidth aggregation technique
for stream control transmission protocol. Elsevier Computer Communications, 27(10),
June 2004.
[4] D. J. Aldous. Ultimate instability of exponential backoff protocol for acknowledgement-based transmission control of random access communication channels. IEEE Transactions on Information Theory, 33(2), March 1987.
[5] M. Allman and V. Paxson. On estimating end-to-end network path properties. In
SIGCOMM, Cambridge, Massachusetts, USA, August 1999.
[6] T. Seth, A. Broscius, C. Huitema, and H-A P. Lin. Performance requirements for TCAP
signaling in Internet telephony. Internet draft, IETF, February 1999. Work in Progress.
[35] L. Ong, I. Rytina, M. Garcia, H. Schwarzbauer, L. Coene, H. Lin, I. Juhasz, M. Holdrege, and C. Sharp. Framework architecture for signalling transport. RFC 2719, IETF,
October 1999.
[36] V. Paxson and M. Allman. Computing TCP's retransmission timer. RFC 2988, IETF,
November 2000.
[37] L. Rizzo. Dummynet: A simple approach to the evaluation of network protocols. ACM
Computer Communication Review, 27(1), January 1997.
[38] SCTP module for ns-2. http://www.armandocaro.net/software.
[39] T. Seth, A. Broscius, C. Huitema, and H-A P. Lin. Performance requirements for
signaling in Internet telephony. Internet draft, IETF, November 1998. Work in Progress.
[40] S. Shenker. Some conjectures on the behavior of acknowledgement-based transmission
control of random access communication channels. In ACM SIGMETRICS, May 1987.
[41] N.-O. Song, B.-J. Kwak, and L. E. Miller. On the stability of exponential backoff.
Journal of Research of the National Institute of Standards and Technology, 108(4),
August 2003.
[42] R. Stewart, I. Arias-Rodriguez, K. Poon, A. Caro, and M. Tuexen. Stream control
transmission protocol (SCTP) specification errata and issues. Internet draft, IETF,
October 2005. Work in Progress.
[43] R. Stewart, Q. Xie, K. Morneault, C. Sharp, H. Schwarzbauer, T. Taylor, I. Rytina,
M. Kalla, L. Zhang, and V. Paxson. Stream control transmission protocol. RFC 2960,
IETF, October 2000.
[44] G. Ye, T. N. Saadawi, and M. Lee. IPCC-SCTP: An enhancement to the standard
SCTP to support multihoming efficiently. In 23rd IEEE International Performance
Computing and Communications Conference (IPCCC), Phoenix, Arizona, USA, April
2004.
[45] L. Zhang. Why TCP timers don't work well. In SIGCOMM, August 1986.