RIPE 36

RIPE Meeting: 36
Working Group: Test Traffic
Status: Final
Revision Number: 1

Please mail comments/suggestions on:


Draft minutes Test Traffic Working Group at RIPE 36


Chair Matthew Robinson
Scribe Rene Wilhelm

30 attendees


Agenda

A. Administrative stuff
B. Presentation by Henk Uijterwaal on current project and future plans.
C. Short presentation on current RFCs available about TT
C2 TTM as membership service
D. More analysis of current data
E. Branding
F. AOB

---------

A. Administrative stuff

o Agenda bashing

Matthew adds point C2 "TTM as Membership" to the agenda, to discuss
the draft document Henk posted to the wg mailing list prior to RIPE36.

-------------
B. Presentation by Henk Uijterwaal on current status and plans

[ Slides are on-line at
http://www.ripe.net/ripencc/mem-services/ttm/Talks/0005_RIPE36_Status/ ]

o Manpower changes

- Johann Gutauer (M.Sc student) left on April 30; he is now busy
finishing his thesis. Some results will be presented in this talk.
- 2nd Network Engineer starting August 1. Tasks involve operations
(installation & support for new boxes) and system programming.


o E-mail

Henk reminded attendees of the e-mail addresses related to test-traffic
measurements:

- tt-ops _at_ ripe _dot_ net Test-Box operators at RIPE NCC; to be used for
all operational issues. Personal mail addresses should NOT be
used, as that can delay response (vacation, sickness etc.)

- tt-host _at_ ripe _dot_ net Mailing list for Test-Box operators. As this list
is used by the NCC for communication with all sites hosting a
test-box, it is essential each site has at least one address on it.
Please remember this list when people leave.

- tt-wg _at_ ripe _dot_ net Mailing list for the working group


o Status of measurement network

On May 11, 9:00 GMT, 43 boxes were out in the field, of which
2 were off, 5 in setup phase and 6 in watch state (i.e. they have some
problems and their data are not useful). This leaves 30 boxes
actively taking data.

Occasionally, boxes from the first series end up in an undefined state,
where the host has to reboot or power cycle the machine to get it back
on-line. The second series of boxes does not show this behaviour.
Thorough study indicated the problem was related to memory size,
and the O/S version: indeed, after adding memory and upgrading the
O/S, all problems disappeared on the boxes at/near RIPE NCC.

Henk will send memory upgrades to the test-box hosts when he gets
back to Amsterdam. The O/S will be upgraded in the coming weeks;
instructions for hosts (e.g. when to swap disks) will follow.

Henk presented some of the issues surrounding scalability
of test-box maintenance, regarding both operating system and
TTM data-taking software. The publicly available GNU software package
Cfengine will bring relief in this area; all TTM operations will
gradually be brought under its control.


o Turning TTM into a regular service

At RIPE35 there was consensus on the model (paying a one-time fee
for a test-box and paying annual fee for the service). The next
step is to produce a "service contract". A first draft exists at
http://www.ripe.net/test-traffic/RIPE36/note.html
Discussion postponed to agenda point C2.

The current draft still needs feedback from a number of people;
once it is finalized, it will be published as a RIPE document.

Still open is the issue of site surveys and installation support.
The standard contract assumes the host will find a suitable spot for
the box and will take care of installing all hardware. However, if there
is a need, EXTRA services could be delivered by an engineer from the NCC:

- Site survey: look for a suitable spot for the antenna
- Installation support: install box & antenna, check connectivity,
firewall configuration, etc.

Henk and Daniel emphasized that such extra services would be charged
separately, as they require additional resources. They will therefore
only be instituted if there is sufficient interest.

Question: what about the situation where many ISPs are at the same
location (for example an Internet exchange)?

Answer: one has the following options:
- share a box; the box is installed in the network of one ISP,
parties agree on sharing the results.
- install a re-radiation unit: one antenna on the roof, signal is
re-radiated in the computer room, where each box has its own
antenna. (this re-radiation may need approval from the property owner).
- when only two boxes are involved, a splitter can be used:
signal of one antenna delivered to two receivers.


o The Next series of boxes

[image of prototype on slides]

Essentially, the 2000 version of current PC hardware.
The Trimble Palisade will be used for GPS signal reception:
one unit, housing both antenna and receiver. A standard UTP cable
connects the Trimble to the box, so much larger distances (100s of
meters) between antenna and PC are possible.

There are still some things to do for the new series of boxes,
most notably testing the new PC hardware, finalizing its specifications
and producing a mounting bracket for the antenna (standard option is a
short pole). Henk estimates new boxes will start shipping to customers
in 6-10 weeks, depending on vendors.

o Analysis and results

Henk highlights the analysis work which has been done at the NCC
as well as plans for the (near) future:

- New version of the daily plots: complete rewrite, better layout
and more statistics. [see example on slides]. Test version was
available at the time of the meeting, will move to production
shortly after returning to Amsterdam.

- Plots on Demand: currently we generate >5000 plots each day; this
takes a lot of computing resources, does not scale (the number of
plots grows as N-squared) and not all plots are looked at by a human.
We want to move to a situation where only a subset of plots is
automatically generated, the rest on demand via a web interface.
Advantage: the user can tune some of the plot parameters (time period,
axis scales).

- Routing Vector Database: the performance of the current system
did not scale, so a complete rewrite was needed; this was out-sourced
to the NCC software group. Importing and querying data are now
a lot faster. The new version will be switched on after RIPE36.

The improvements in both routing vector database and daily plot
code should reestablish the (desired) situation where the most popular
plots are available early in the morning.

- Network Alarms: continue to run smoothly. Statistics updated and
more analysis done. Alarms tend to occur on a few paths on any given
day, but no single path consistently produces more alarms. Further
development will depend on the feedback we get from test-box hosts.

- Trends in the data: how delays develop over longer periods.
As it is hard to look at individual measurements for long time
intervals, data are summarized in percentiles, which yields
a few numbers per day. Henk showed some examples of the results
obtained in this analysis. Next steps are to turn experimental
code into production software and make results available on
the web site.

- performance scores: still in the planning/design state;
discussion about this is postponed until some first
results are available (RIPE37).

- delay variations: a measure of short-term jitter on delays.
The Internet Draft by the IPPM working group seems to have
converged. Implementing this in TTM is on our list of
projects for a student.

- throughput: IPPM does not seem to agree on the method to
measure throughput. Treno has been abandoned by its author and
the corresponding Internet Draft has meanwhile expired.

- relation between delays and traceroutes: part of a research
project proposed by Delft University. They intend to use
the data collected by TTM.
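
The percentile summary mentioned under "Trends in the data" can be
sketched as follows. This is a minimal illustration only, not the NCC
analysis code: the function name, the choice of percentiles (2.5, 50
and 97.5), the nearest-rank method and the sample delays are all
assumptions made for the example.

```python
import math


def daily_percentiles(delays_ms, percentiles=(2.5, 50.0, 97.5)):
    """Summarize one day of one-way delays (ms) as a few percentile values.

    Uses the nearest-rank method; the percentile choice is an assumption.
    """
    if not delays_ms:
        raise ValueError("no measurements for this day")
    ordered = sorted(delays_ms)
    n = len(ordered)
    summary = {}
    for p in percentiles:
        # Nearest rank: smallest value with at least p% of samples at or below it.
        rank = max(1, math.ceil(p / 100.0 * n))
        summary[p] = ordered[rank - 1]
    return summary


# Hypothetical delays for one path on one day (ms), including two outliers.
sample = [12.1, 11.9, 12.4, 30.2, 12.0, 12.3, 11.8, 12.2, 45.0, 12.1]
print(daily_percentiles(sample))
```

Each path then contributes only a handful of numbers per day, which
can be plotted over months without handling individual measurements.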


-------------
C. Overview of IPPM RFCs and Internet Drafts

[ Slides are on-line at
http://www.ripe.net/ripencc/mem-services/ttm/Talks/0005_RIPE36_IPPM/ ]

The IETF has several working groups dealing with performance and
benchmarking. IPPM (IP Performance Metrics) is the most relevant
to the Test Traffic Measurements.

Henk summarized the IPPM WG charter and presented an overview
of the RFCs published so far as well as the current internet drafts.
Those interested in the work of IPPM are encouraged to subscribe to
the IPPM mailing list via ippm-request _at_ advanced _dot_ org or visit the
mail archive at http://www.advanced.org/IPPM/archive/


TTM Reference Paper

At the previous meeting it was suggested RIPE NCC write a reference
paper, a more or less scientific paper that describes what is being
done and how, followed by a discussion of the results.

A poll of the audience showed people are interested in seeing such a
paper. The action should be pursued by first posting a possible
table-of-contents to the tt-wg mailing list for discussion.
Henk feels the document does not belong in the IPPM wg, nor in the
IETF in general. He asks for input on where people think it should
be published; preferably a journal or conference with reasonable
coverage and peer reviews.


-------------
C2. TTM as Membership Service

Matthew summarized the concepts from the "providing Test Traffic
Measurement as a Membership Service" draft document.

Some discussion followed:

Q. About the 3000 EUR service charge, is that a per-box
or a per-member charge?
A. Henk: the charge is per box, per annum.

Q. What about current boxes? What should they pay for?
A. Daniel Karrenberg: we recognize the current hosts are pioneers,
a transition scheme can be discussed but eventually everyone should
pay for the service. In the case where one organisation takes more
than one test-box, RIPE NCC are willing to consider volume discounts.

In response to questions Mike Norris had brought up earlier on the
tt-wg mailing list, Daniel answers that TTM membership will be structured
similarly to current membership: those who buy the service get some rights
to vote on the RIPE NCC budget, activity plan and executive board.
The intention is to do the same thing for other (future) services.

Daniel raised the worries he has about assuring the validity of
measurements, i.e. how to prevent people from treating traffic to/from
the boxes preferentially to improve scores. His current thoughts
are to include in the contract something along the lines of
"don't cheat!"; in case of suspected fraud the service contract
would be terminated. Motivation behind this: some people cheating
can undermine the usefulness of the whole project.

Question: QoS already has some hooks to differentiate in handling
incoming/outgoing traffic; when do you consider something to be
cheating?

Daniel: traffic from a TTM box should be treated identically to other
traffic of the test-box host, with *no* preferences on traffic from the
test-box's IP address. Differences by type of traffic (QoS) are
acceptable when applied to all the test-box host's traffic.

------------
D. More analysis of current data

A discussion on types of analysis possible with the current
test-boxes and their data set:

- derive other quantities, such as throughput?
- can a single metric be derived for each site?
- scaling of metrics? N-squared? overlap in measurements.


Daniel answered:

- N-squared: we are looking at it. The Delft University project is part
of that. We do not want to reduce the mesh by putting in assumptions, but
(ideally) derive redundancy from the data itself. That is not an easy
problem. Any ideas welcome; we are aware and are trying to find solutions.

- single metric: that's what we want to do with the scores.
also one number from one site to each other site.

- throughput; pretty delicate for us. So far we followed IPPM, but for
throughput discussion hasn't finalized; religious arguments going on
over there. Daniel sees two options:

1. wait till something useful is there
2. define something for TTM ourselves

Although it would not be too difficult to implement, Daniel feels the
second option would jeopardize the scientifically defensible way of doing
things; it would be more an ad-hoc type of metric. If a large number of
hosts want it, RIPE NCC will implement it, but Daniel has a strong
preference to wait; his concern is that the ad-hoc stuff could overshadow
the high-quality work like the one-way delay metrics and the information
derived from them.
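
The N-squared concern can be made concrete with a small calculation.
This is a sketch, assuming a full mesh in which every box measures
towards every other box in each direction; the function name is
hypothetical, and the box counts other than the 30 active and 43
deployed boxes reported under item B are illustrative.

```python
def directed_paths(n_boxes):
    """Number of one-way measurement paths in a full mesh of n_boxes."""
    return n_boxes * (n_boxes - 1)


for n in (30, 43, 100):
    print(n, "boxes ->", directed_paths(n), "one-way paths")
```

Doubling the number of boxes roughly quadruples the number of paths,
which is why anything produced per path grows much faster than the
number of hosts.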

Question: due to queueing effects, small test packets could get better
treatment than larger payload traffic; would it not be interesting to
measure delays in big streams?

Henk: yes, that is something to add to our list. Other topics would
be sending out bursts of packets and varying the packet sizes.
At the moment we send small packets, but we want to do larger
packets as well. Also thinking about measuring inter-packet delays;
not by sending TCP streams, but by reducing the time interval
between consecutive packets. This is much easier for us as it only
involves changing the parameters of our one-way delay measurements.
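
One way to look at inter-packet delays using only one-way measurements
is to compare the spacing of packets at the receiver with their spacing
at the sender. The following is a sketch under assumptions, not the TTM
implementation: the function name and the timestamps are invented for
illustration.

```python
def spacing_changes(send_times, recv_times):
    """Change in inter-packet spacing between sender and receiver.

    For consecutive packets i-1 and i this is
    (recv[i] - recv[i-1]) - (send[i] - send[i-1]),
    which equals the difference of their one-way delays.
    """
    if len(send_times) != len(recv_times):
        raise ValueError("need one receive time per send time")
    changes = []
    for i in range(1, len(send_times)):
        send_gap = send_times[i] - send_times[i - 1]
        recv_gap = recv_times[i] - recv_times[i - 1]
        changes.append(recv_gap - send_gap)
    return changes


# Hypothetical timestamps in milliseconds: packets sent 10 ms apart.
send = [0, 10, 20, 30]
recv = [15, 25, 38, 46]
print(spacing_changes(send, recv))
```

A positive value means the second packet of a pair was delayed more
than the first; shrinking the send interval only changes the input
timestamps, not the calculation.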

Remark from the audience: it must be recognised as a stream
by QoS and other techniques.

Henk: the packets have the same port and the same IP address.

Question: multicast one way delay vs. unicast?

Henk: haven't thought about that, but it is interesting.
Multicast must be available to the destination(s), though.


---------
E. Branding

By this time the tt-wg session had used up its allocated time-slot.
As db-wg were anxious to take over the room, this agenda item
was largely skipped.

Mike Norris' suggestion to put a small logo on TTM hosts' web pages
saying "I'm participating in the Test Traffic project" was welcomed
by the audience.

---------
F. AOB

Henk demonstrated a visualization of six months of network alarm data;
this nicely showed that alarms do occur in bursts and clusters, but
no single connection shows significantly more alarms than others.

Daniel asked participants to think of a better name for the project;
test-traffic sounds rather dull.

Matthew closed the meeting after defining the following
action points for RIPE37:

1. design a cool logo for inclusion in web pages
2. come up with a flashy name for test-traffic