Draft Minutes RIPE35 Test-Traffic Working Group version 1.1
- Date: Tue, 21 Mar 2000 11:11:18 +0100
Chair: Daniel Karrenberg
Scribe: Rene Wilhelm
A. Administrative stuff
- appointment of scribe
- agenda bashing
- review of action items
B. Henk to give the usual update on TTM plus plans for the next months
C. Presentation of current state of thinking on moving TTM to
a membership service
D. Discussion of work on summarizing our results into a small set
of numbers that express the quality of the connectivity of a site
A. Administrative stuff
Daniel Karrenberg chairs the meeting, as Matthew Robinson had
announced he could not make it to RIPE35 for personal reasons.
- agenda bashing
item D was removed from the agenda: the work at the NCC on this
got delayed, and Daniel felt it was better to discuss it when
a more concrete proposal was available, rather than start a
philosophical discussion on what would be possible.
- review of action items
There were no outstanding action items, other than the
RIPE NCC to continue with the Test-Traffic Measurements Project.
B. Henk to give the usual update on TTM plus plans for the next months
Slides are on-line at http://www.ripe.net/test-traffic/Talks/0002_RIPE35/
- The existing measurement network
Status: 43 boxes installed, mostly in Europe, one in Israel,
four in the USA, one about to go to New Zealand.
35 taking data, some installation and operational problems.
Improving uptime, lessons for the next series
Boxes generally run fine, but if there is a problem they
tend to be down for weeks. The number of spare parts needed
was underestimated; more will be ordered with the next series.
Shipping of boxes and spare parts will be made easier.
Response time from the RIPE NCC's side will also be improved.
- The new series of test-boxes
A next series of boxes is planned for the coming months;
these will have to be purchased by the host (as discussed in
Henk's second presentation, "turning TTM into a service")
Experience with the present boxes showed the current design
of the clocks to often be a problem for installation:
The GPS Receiver is installed in the test-box, while the antenna
is located on the roof or out the window; coaxial cable is needed
between them. This can delay installation (running coax cable
through a building which is not the ISP's property) and imposes
strict limits on the maximum distance between box and antenna.
The solution is to move the receiver closer to the antenna
and use existing structured wiring (UTP cables) to transport
signals to the test box. A design document has been written,
now looking for products. Possible candidates are Trimble's
Palisade and AccutimeII which meet all requirements. Tests
are in progress.
Some software components and procedures need to be updated
to allow the project to scale from 40 to 100+ remotely
deployed boxes. Keeping software up to date, and ensuring
data taking jobs run properly are areas which will be
- Analysis, Products and Results
It has always been the policy that hosts can get a copy of all data
sent and received by their test-box. In the past months we have seen the first
requests for this. Data are provided in ROOT files; more info at
Rewritten using ROOT's graphing tools: better layout, small box
with statistics. In the next few weeks, the plots on the web site
will all be created with the new look. We are also looking into
creating a CGI interface to generate plots on demand: with 100+
boxes it will be impractical to make plots for all active paths,
because users are not likely to check the hundreds of plots which
would otherwise be created for their box.
The current set-up did not scale for the increasing number of
route vectors stored (timeouts, high load on web server).
It is now being rewritten, using an SQL database, and
expected to be ready in a few weeks. Functionality will be the
same, the user interface will not change.
The goal of the network alarms is to detect unusual changes in
network delays and warn test-box hosts. The algorithm was presented
at the previous RIPE meeting: if 95% of measurements are outside the
expected range, an alarm is set and an e-mail is sent to a specified address.
In a next step the e-mail could be replaced (or supplemented)
by another means of notification that would integrate the
test-box alarms more with existing network monitoring systems.
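The alarm rule summarised above can be sketched as follows. This is an
illustration only: the minutes give just the 95% criterion, so the window
of recent delays, the bounds of the expected range and the function name
are all assumptions, not the NCC's implementation.

```python
from typing import Sequence


def alarm_triggered(delays_ms: Sequence[float],
                    expected_low_ms: float,
                    expected_high_ms: float,
                    threshold: float = 0.95) -> bool:
    """Return True if at least `threshold` of the delays lie outside
    the expected range [expected_low_ms, expected_high_ms]."""
    if not delays_ms:
        return False  # no measurements, nothing to judge
    outside = sum(1 for d in delays_ms
                  if d < expected_low_ms or d > expected_high_ms)
    return outside / len(delays_ms) >= threshold


# Hypothetical window: 96 of 100 delays lie far above an assumed
# expected band of 10-20 ms.
window = [15.0] * 4 + [80.0] * 96
print(alarm_triggered(window, 10.0, 20.0))
```

How the expected range is derived and how long the window is are among
the parameters that would need tuning with feedback from the hosts.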
First analysis of the recorded alarms shows that on average
about 12 messages are sent per host per day. Weekend vs. weekday
effects are also clearly visible in number-of-alarms-per-day
The algorithms for sending alarms still have room for fine-tuning,
as a lot of parameters are involved. However, this is hard to do
without input from the hosting sites. Henk asked hosts to provide
feedback in cases where a host receives a lot of messages,
for example: did alarms coincide with changes in connections,
problems at the site? Or did a host receive an unusual number
of messages even though nothing was changed?
Ingo Luetkebohle asked which connections frequently generated
alarms. Henk did not know the answer, but said that further
analysis of the alarms generated during the past months is on
his action list. This topic will be included.
Trends in the data
This analysis topic deals with the question of how delays
develop over longer periods of time (>1 month).
Data from individual measurements are summarized in percentiles
(2.5% = best case, 50% = normal case and 97.5% = worst case).
With a handful of numbers each month, these are much easier to
plot for an extended time period. Now the challenge in the analysis
is to automatically find those plots which could be of interest to
human operators. Henk showed some examples. Next steps are to
finish research by May 1st, then turn experimental code into
production software and make results available on the web.
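The percentile summary described above can be sketched as follows. The
2.5%, 50% and 97.5% levels come from the presentation; the linear
interpolation between neighbouring values and all function names are
assumptions made for this illustration.

```python
import math
from typing import Dict, Iterable, Sequence


def percentile(sorted_values: Sequence[float], p: float) -> float:
    """Percentile p (0-100) of an already sorted sequence, using
    linear interpolation between the two nearest ranks."""
    if not sorted_values:
        raise ValueError("no measurements to summarise")
    rank = (len(sorted_values) - 1) * p / 100.0
    lower = math.floor(rank)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = rank - lower
    return (sorted_values[lower]
            + (sorted_values[upper] - sorted_values[lower]) * fraction)


def summarise_delays(delays_ms: Iterable[float]) -> Dict[str, float]:
    """Reduce one period of delay measurements to three numbers."""
    ordered = sorted(delays_ms)
    return {
        "best": percentile(ordered, 2.5),     # best case
        "normal": percentile(ordered, 50.0),  # normal case (median)
        "worst": percentile(ordered, 97.5),   # worst case
    }
```

Calling summarise_delays once per path and per month would give the
handful of numbers that are then plotted over the longer period.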
Network performance scores
As discussion of this item had been removed from the agenda,
Henk only touched briefly upon the reasons for looking into this,
and what directions our thoughts are taking. A more detailed
proposal will be available for discussion at the next RIPE meeting.
Next (possible) analysis steps
Modelling of data, N-square problem, Full TCP sessions, QoS,
DiffServ, Traffic Engineering work ...
Suggestions are always welcome!
Daniel: alarm delivery via SNMP could be an option but will not
be started by RIPE NCC unless there is real interest from the
test-box hosts and someone is willing to start a joint project with us
to get this implemented: what would be practical for test-box
Implement polling the box for data as well?
Mixed feelings on best approach
- some like e-mail, simple delivery scheme, leave it up to the
host/user to post-process it (e.g. procmail filters and scripts
to add to syslog)
- others prefer syslog: fast, lightweight (UDP) delivery, not suffering
from TCP and higher protocol (sendmail) timeouts. syslog and SNMP can
both be input streams to network monitoring software (e.g. netcore)
- a poll of the audience showed about 8 of those present use SNMP
RIPE NCC will provide syslog to those interested.
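For those weighing the syslog option, a minimal sketch of delivering an
alarm as a single UDP datagram is given below. It is not the NCC's
implementation: the facility (local0), severity (warning), tag, message
text and collector address are all assumptions. Only the priority
calculation (facility * 8 + severity) and the conventional syslog UDP
port 514 are standard.

```python
import socket

LOCAL0 = 16   # syslog facility "local0" (assumed choice)
WARNING = 4   # syslog severity "warning" (assumed choice)


def build_syslog_datagram(text: str,
                          facility: int = LOCAL0,
                          severity: int = WARNING,
                          tag: str = "ttm-alarm") -> bytes:
    """Build a BSD-style syslog message: "<PRI>tag: text"."""
    priority = facility * 8 + severity
    return f"<{priority}>{tag}: {text}".encode("ascii", "replace")


def send_alarm(text: str, collector_host: str,
               collector_port: int = 514) -> None:
    """Send one alarm as a single UDP datagram. There is no connection
    set-up and no retry, so a slow or unreachable collector cannot
    stall the sender."""
    datagram = build_syslog_datagram(text)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(datagram, (collector_host, collector_port))


# Hypothetical usage; the collector name is a placeholder.
# send_alarm("delay out of expected range", "loghost.example.net")
```

The price of this simplicity is that UDP delivery is not confirmed, so
an alarm can be lost without the sender noticing.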
Comment by Frode Greisen: everyone performs pings inside their network
to check latencies, the test-box network provides a unique environment
in which more interesting things could be done, e.g. traffic flow and
Test-Traffic measurements are more than just pings: we do one-way
delay measurements, which are of much more value than the round-trip
measurements provided by ping and are more accurate than ping.
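The difference between a one-way delay and a figure derived from a
round trip can be illustrated with a small sketch. It assumes that the
clocks at both ends are synchronised, so that a send timestamp and a
receive timestamp can be subtracted directly; the timestamps and
function names are made up for the example.

```python
def one_way_delay_ms(sent_at_ms: float, received_at_ms: float) -> float:
    """Delay in one direction, from timestamps taken on two
    synchronised clocks."""
    return received_at_ms - sent_at_ms


def half_round_trip_ms(forward_ms: float, backward_ms: float) -> float:
    """What halving a round-trip time would suggest for each direction."""
    return (forward_ms + backward_ms) / 2.0


# Hypothetical asymmetric path: 30 ms out, 90 ms back.
forward = one_way_delay_ms(1000.0, 1030.0)
backward = one_way_delay_ms(2000.0, 2090.0)
print(forward, backward, half_round_trip_ms(forward, backward))
```

A round-trip measurement reports only the sum of the two directions, so
on an asymmetric path it cannot show which direction is the slow one.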
RIPE NCC does not want to perform passive measurements because of
privacy problems: the project depends on trust, and not all participants
might like capturing all traffic passing by. Simulated loads and packet
trains are in scope, but we prefer to wait for IETF IPPM definitions
before starting that. On the other hand, if the RIPE community has a
demand for this, we could look at starting something and bring it into the
IETF working group. Daniel agreed to look at possibilities in the
direction of TCP performance.
One person commented that delay measurement data are already interesting:
they provide an archive over time. More effort should be put into
analysing the current data rather than collecting different data.
Mike Norris liked the idea of a single number for network performance
but commented that like a stock market index, the number is created from
a changing base, requiring frequent re-calibration. Scores will depend
on the rest of the test-box network. He is interested in the correlation
with other statistics, like RIPE host count, size of routing table etc.
Daniel answered that Test-Traffic Measurements depend on the trust
RIPE NCC has from ISPs; we do not want to ask for SNMP access to
ISPs' routers. Someone else noted ISPs do know how much traffic they
carry, so they could perform the correlation themselves.
C. Presentation of current state of thinking on moving TTM to membership
- Until now, the Test-Traffic measurement project was a project paid from
the general NCC budget. As such there were some limitations:
- 1 test-box per site gave everybody a chance to participate
- in principle for RIPE-NCC members only (but a few boxes were
installed outside RIPE region where collaboration with hosts
Some (larger) sites expressed interest in more than one box.
Also there was a lot of interest from non-members, though often they
did not want to become an LIR just for the sake of participating in
the Test-Traffic Measurements project.
Finally, it was clear that not all current RIPE-NCC members
are interested in the TTM service. Thus, the Annual General Meeting
of the NCC membership adopted the resolution that the Test-Traffic
Service should generate a self-sufficient budget in 2001.
(Assuming 100 test-boxes to be in the field by the end of 2000, an
estimated EUR 300,000 needs to be generated from service fees in 2001.)
Henk presented the current ideas on turning TTM into a separate
service that can be bought from the NCC. He stressed that the figures
are preliminary and could be off by 10%; volume discounts for hardware,
for example, had not yet been discussed with supplier(s).
The basic model:
Buying a test-box:
about EUR 3000 for the hardware, shipping and installation.
Taxes, import duties, etc to be paid by the host site
Pre-configured test-box, 3 years warranty, some possibilities
to swap broken box with spare, installation support by email
No service fee in 2000, from 2001 onwards EUR 3000/year
(Assuming a total budget of EUR 300,000 and 100 boxes)
Service fee includes: operation of the test-box as a black-box
by the NCC; access to all data taken from/to the box at host's site;
access to all products based on the TTM data; maintenance and
Definition: measurements on internal network, host does not want
to share data and results OR other measurements that can be done
with a test-box.
Model: host buys another pre-configured test-box from RIPE NCC.
The root password will be communicated; after installation, the host
is on its own. RIPE NCC will not collect or analyze the data and
the box *cannot* be used in the normal measurement chain.
Build own test-box
Model: host buys hardware and constructs the test-box himself,
then hands it over to NCC Test-Traffic operations who will turn it
into a black box. Any interest in this?
Next steps for turning TTM into service:
- With feedback received at this meeting a draft proposal will be
written which will be posted to the tt-wg@localhost mailing list
for further discussion. From there plans will be finalized,
text for a service agreement will be drafted and a glossy brochure
will be produced for participating sites' management.
If there is consensus, the plan is to have this in place before
Henk opened the discussion with the following questions:
- Do you like this concept?
- If not, it's back to the drawing-board, but what should be changed?
- Would you be interested in paying for this service?
Daniele Bovio commented he sees the purpose of starting to charge
a fee for the service; there's value in RIPE NCC collecting and
reporting on the data. However, he questioned
the option of private experiments, where boxes would be sold to
anyone interested. In his view, it is not in the interest of the
RIPE NCC to start this business; it might be better to sell the
technology to a company that would handle marketing and support
for those types of installations.
Daniel Karrenberg replied that some hosts had already asked for multiple
boxes, to be used for monitoring the ISP's internal networks. Currently, the
NCC can sustain the demand, but he agreed it would not be good if many
such requests came in.
Mike Norris felt one should be careful about opening the public
project to non-members of RIPE NCC. He agreed with the planned
It was asked if it would be possible to have results from both internal
and RIPE boxes presented in one report. Henk did not feel this was
a good idea: we can guarantee the integrity of data from boxes
operated by the NCC; if we start mixing data from those boxes with
data from private experiments, we could lose the trust we have now.
Comment: in order to get the charging approved, we need details
(in the form of a written document) on what will be done with the
money paid for TTM services.
Henk/Daniel: this is what the service agreement is all about:
a contract signed by both RIPE NCC and the hosting organisation
detailing rights and obligations. In addition the NCC publishes
yearly charging schemes and activity plans on which members can vote
during the Annual General Meeting. Daniel stressed the importance
of selling TTM services via a membership construction: it gives
those who are not an LIR but do pay for TTM the opportunity to
participate in the formal structures of RIPE NCC.
Henk reminded the group that test-box hosts get access to all data and
results obtained with the box at their site. Internally, these
can be used as desired; external publications, however, can only
be made after review in the TT working group.
Jake Khuon feels the RIPE NCC is doing good work in this field. He
would be interested in seeing a conclusive report, a paper published.
There was no other business.