IP Streams, Flows and Torrents: Measuring Stream Distributions in Real Time
Nevil Brownlee, The University of Auckland / CAIDA
RTFM (RFCs 2720-2724) considers network traffic as being made up of Flows, which are arbitrary groupings of packets defined only by attributes of their end-points. RTFM flows are also bi-directional, with the user specifying which end-point is the flow's source.
This paper extends RTFM's view of traffic by adding two further concepts, torrents and streams. A 'torrent' is the collection of flows which make up the total traffic on a network link. An RTFM meter may measure all packets in the torrent, or may select only a small subset of them.
'Streams' (which could be referred to as 'bi-directional microflows') are individual IP sessions (e.g. TCP or UDP) between ports on pairs of hosts. An RTFM meter has been extended to maintain queues of streams belonging to currently active flows, so as to produce distributions summarising stream behaviour. This was done so as to implement packet-pair matching for the 'turnaround time' attributes proposed in RFC 2724.
Streams introduce a new dimension into RTFM, making it possible to observe distributions of stream sizes and lifetimes. They also introduce new complexities into the meter's packet-matching algorithms.
When using streams in the meter, flows provide a first level of packet matching, selecting the packets of interest. If a flow's direction is specified (e.g. packets from A to B), all its streams will have that direction, so the meter can easily decide whether a packet within a stream is a request (source to destination) or a response. A flow, however, can be very coarse-grained, leaving its direction unspecified. To cope with such flows (e.g. all UDP packets), the meter must do bi-directional packet matching for streams.
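As an illustration of bi-directional matching (a minimal sketch, not the meter's actual code), the following Python fragment maps both directions of a session onto one stream key. The names Packet, stream_key and StreamTable are hypothetical, and treating the first packet seen on a stream as the request is an assumption made here for the example only.

```python
from typing import NamedTuple, Tuple


class Packet(NamedTuple):
    proto: str
    src: str
    sport: int
    dst: str
    dport: int


def stream_key(pkt: Packet) -> Tuple[tuple, bool]:
    """Return (key, forward). Both directions of a session yield the same key;
    'forward' says whether the packet travels from the lower-ordered end-point
    to the higher-ordered one."""
    a = (pkt.src, pkt.sport)
    b = (pkt.dst, pkt.dport)
    forward = a <= b
    lo, hi = (a, b) if forward else (b, a)
    return (pkt.proto, lo, hi), forward


class StreamTable:
    """Classifies packets as request or response when the flow gives no direction.
    Assumption: the first packet seen on a stream defines the request direction."""

    def __init__(self) -> None:
        self._first_dir = {}

    def classify(self, pkt: Packet) -> Tuple[tuple, str]:
        key, forward = stream_key(pkt)
        first = self._first_dir.setdefault(key, forward)
        return key, ("request" if forward == first else "response")
```

When the flow's direction is specified, the lookup of the first-seen direction is unnecessary: every packet from the flow's source is a request.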
To associate pairs of packets (e.g. request/response pairs for ICMP echo or for DNS over UDP), the meter maintains a queue of packets for each stream. In its memory management the meter implements a variety of timeout schemes for flows, streams and packet queues; these include fixed times (flows), protocol-based (TCP streams) and dynamic (packet queues). Details of the dynamic timeout algorithm will be discussed.
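The per-stream packet queue can be sketched as follows. This is an illustrative Python fragment under stated assumptions, not the meter's dynamic timeout algorithm: the class name PacketPairMatcher, its parameters, and the rule of setting the timeout to a multiple of a smoothed turnaround time are all hypothetical choices made for the example.

```python
from collections import deque


class PacketPairMatcher:
    """Queue of unanswered requests for one stream (hypothetical sketch)."""

    def __init__(self, initial_timeout=2.0, multiplier=4.0, alpha=0.125,
                 min_timeout=0.1):
        self.timeout = initial_timeout   # seconds a request may wait in the queue
        self.multiplier = multiplier     # assumed: timeout = multiplier * smoothed time
        self.alpha = alpha               # weight given to the newest turnaround time
        self.min_timeout = min_timeout   # floor so the timeout cannot collapse to zero
        self.smoothed = None             # smoothed turnaround time, None until first match
        self.pending = deque()           # (ident, timestamp) in arrival order
        self.lost = 0                    # requests that timed out unanswered

    def _expire(self, now):
        # Requests are queued in time order, so only the head needs checking.
        while self.pending and now - self.pending[0][1] > self.timeout:
            self.pending.popleft()
            self.lost += 1

    def request(self, ident, now):
        self._expire(now)
        self.pending.append((ident, now))

    def response(self, ident, now):
        """Return the turnaround time, or None if no queued request matches."""
        self._expire(now)
        for i, (pid, sent) in enumerate(self.pending):
            if pid == ident:
                del self.pending[i]
                elapsed = now - sent
                if self.smoothed is None:
                    self.smoothed = elapsed
                else:
                    self.smoothed = ((1 - self.alpha) * self.smoothed
                                     + self.alpha * elapsed)
                self.timeout = max(self.min_timeout,
                                   self.multiplier * self.smoothed)
                return elapsed
        return None
```

Here ident stands for whatever identifies a pair, for example the ICMP echo sequence number or the DNS query ID. Requests that leave the queue by timeout are counted as lost, which is one way a loss rate could be derived alongside the turnaround times.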
Stream measurement work using a meter located at SDSC (the San Diego Supercomputer Center) will be presented. This work has concentrated on three measurement projects:
- 10s data rates: Data rate distributions for all traffic on a busy Internet link. These give a good overview of the link's behaviour.
- DNS response: DNS packet matching has been used to investigate response time and loss rates to the root and gTLD nameservers. Plots will be presented showing the behaviour of various roots, and comparing them with their corresponding gTLD servers.
- Stream sizes and lifetimes: Plots showing how these vary during the day will be presented. Differences between UDP, non-web TCP and web TCP traffic will be discussed.
For packet-pair matching to work as expected, it is important that the meter sees all the request and response packets within a flow. The meter was used to discover which of the netblocks using the SDSC link met this 'symmetry' condition. A lack of such symmetry can influence stream size distributions; current work on this will be described.