Using Real-Time Measurements in Support of Real-Time Network Management
James L. Alberi, Ta Chen, Sumit Khurana, Allen Mcintosh,
Marc Pucci, Ravichander Vaidyanathan
Telcordia Technologies, Inc.
I. INTRODUCTION
Increased reliability is necessary if the Internet is to carry
information such as voice, video and other enhanced services. Network
congestion, which arises from the statistical nature of packet
forwarding, is a serious issue that could impede the reliable, timely
delivery of data and the desired quality of service. At present, humans
monitor the network for congestion with a variety of data collection
mechanisms and take corrective actions on an ad hoc basis. Putting
humans in the control loop yields corrective actions that are too slow
because of delays in collecting data and too error-prone because of the
complexity of the network. This paper presents Rondo, an automated
control system that manages congestion in near real time in core
networks. The Rondo system is composed of three subsystems: a data
collection system, a rebalancing algorithm and an element management
system.
Congestion in the network causes undesirable effects on three basic
parameters that are often components of quality of service, namely
delay, loss and jitter. Rondo reallocates data flows over network links
to optimize an objective function that will be discussed in more detail
in the full paper.
We will also detail the relationship to other work in the full paper.
II. DATA COLLECTION
The Rondo data-collection subsystem is near real-time rather than hard
real-time because it does not guarantee scheduling of tasks, which would
be difficult because of its scale and distributed architecture. The
architecture is key to the rapid collection and processing of data. With
its basic abstractions of data streams and stream policies, the data
subsystem has the capacity to perform preliminary computations necessary
for network control close to the point of data collection, e.g. trending
or threshold detection. The architecture also encompasses a diverse
range of data sources including router-resident data, RMON, RTR, and
customized active probes. Rondo converts the data from each source to a
data stream that is a specialization of a generic data stream. The
collection system supports solicited reading of the data stream and
event notification to other systems on specified conditions.
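The stream abstraction described above can be illustrated with a small
sketch. This is not Rondo's implementation: the class names, the
callback signature, the 0.8 threshold and the window size are all
assumptions made for illustration. It shows a generic stream that
supports solicited reads, a simple trend computation and threshold
events, plus one specialization that converts raw octet counters into
link utilization near the point of collection.

```python
from collections import deque


class DataStream:
    """Hypothetical generic stream of (timestamp, value) samples."""

    def __init__(self, name, window=5, threshold=None):
        self.name = name
        self.threshold = threshold           # assumed policy: notify above this
        self.samples = deque(maxlen=window)  # sliding window of recent samples
        self.listeners = []

    def subscribe(self, callback):
        """Register callback(stream_name, event, value) for notifications."""
        self.listeners.append(callback)

    def read(self):
        """Solicited read: the most recent sample, or None if empty."""
        return self.samples[-1] if self.samples else None

    def trend(self):
        """Average change in value per unit time across the window."""
        if len(self.samples) < 2:
            return 0.0
        (t0, v0), (t1, v1) = self.samples[0], self.samples[-1]
        return (v1 - v0) / (t1 - t0) if t1 != t0 else 0.0

    def push(self, timestamp, value):
        """Store a sample and notify listeners if the threshold is crossed."""
        self.samples.append((timestamp, value))
        if self.threshold is not None and value > self.threshold:
            for callback in self.listeners:
                callback(self.name, "threshold_exceeded", value)


class LinkUtilizationStream(DataStream):
    """Hypothetical specialization: octet counters -> link utilization."""

    def __init__(self, name, capacity_bps, threshold=0.8, window=5):
        super().__init__(name, window, threshold)
        self.capacity_bps = capacity_bps
        self._last = None  # previous (timestamp, octets) reading

    def push_counter(self, timestamp, octets):
        # Counter wraparound is ignored in this sketch.
        if self._last is not None:
            t_prev, o_prev = self._last
            dt = timestamp - t_prev
            if dt > 0:
                bits = (octets - o_prev) * 8
                self.push(timestamp, bits / (dt * self.capacity_bps))
        self._last = (timestamp, octets)
```

A consumer such as the rebalancing subsystem would call subscribe() to
receive events and read() or trend() to poll the stream on demand.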
III. REBALANCING ALGORITHM
A primary goal of Rondo is a near real-time response to network
congestion. Rapid response requires that the rebalancing algorithm be
simple, efficient and effective. This requirement conflicts with
choosing an algorithm that optimally allocates bandwidth and utilization
over a large network. As a result, the architecture of Rondo's
rebalancing mechanism lends itself to a whole spectrum of algorithms
from a simple first-fit strategy to full off-line optimization.
Different algorithms can be plugged in to suit the needs of a given
situation. One example that yields good results is a constrained
shortest-path algorithm, derived from Dijkstra's algorithm, that
considers bandwidth demands, current link utilizations and other factors.
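A constrained shortest-path search of this general kind can be sketched
as follows. The sketch is an illustration under stated assumptions, not
the algorithm used in Rondo: the graph representation, the function
name and in particular the link cost (one hop plus the utilization the
link would have after the flow is placed) are all hypothetical. Links
without enough residual bandwidth for the demand are pruned, and
Dijkstra's algorithm runs over what remains.

```python
import heapq


def constrained_shortest_path(graph, src, dst, demand):
    """Return a list of nodes from src to dst, or None if no path fits.

    graph[u][v] = (capacity, used), both in the same bandwidth units.
    """
    dist = {src: 0.0}
    prev = {}
    heap = [(0.0, src)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue  # stale heap entry
        done.add(u)
        if u == dst:
            break
        for v, (capacity, used) in graph.get(u, {}).items():
            if capacity - used < demand:
                continue  # constraint: link cannot carry the demand
            # Assumed cost: hop count plus utilization after placement.
            cost = 1.0 + (used + demand) / capacity
            nd = d + cost
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if dst not in dist:
        return None
    # Walk the predecessor chain back from dst to src.
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    path.reverse()
    return path


# Example topology (hypothetical): directed links with (capacity, used).
example = {
    "A": {"B": (100, 90), "C": (100, 20), "D": (100, 95)},
    "B": {"D": (100, 10)},
    "C": {"D": (100, 20)},
    "D": {},
}
```

With this topology, a small demand of 5 units still fits on the heavily
loaded direct link A-D, a demand of 20 units is steered around it
through C, and a demand of 90 units cannot be placed at all.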
IV. ELEMENT MANAGEMENT
Multiprotocol Label Switching (MPLS) facilitates the automated control
of a network, although other mechanisms might be possible. MPLS
establishes routes through a core network that are under complete
control of the management system. Both the hop-by-hop route and the
class of service are under control of the Rondo system. We will detail
how our system implements this control in the full paper.
V. EXPERIMENTAL NETWORK
Our experimental network models the core network of a national service
provider. The network has ten routers interconnected in a mesh, each one
representing a large POP. Varying network loadings are generated with a
combination of hardware and software load generators. Data will be
presented on the effectiveness and dynamic stability of the control
system under the generated loads.
VI. ISSUES
Various issues in automated network control need further exploration.
Among these are the dynamic stability of the control system under
increasing scale, the interaction of network failure with automatic
control, and the effects of traffic shaping and admission control.