Draft Minutes RIPE35 Test-Traffic Working Group version 1.1

To: "RIPE NCC WebMaster" < >
From: "Nick Webby Reid" < >
Date: Tue, 21 Mar 2000 11:11:18 +0100
Draft Minutes RIPE35 Test-Traffic Working Group
==============================================
version 1.1
----------------------------------------------

Chair:              Daniel Karrenberg
Scribe:             Rene Wilhelm
Attendees:          48


Agenda:

A.  Administrative stuff
      - appointment of scribe
      - agenda bashing
      - review of action items

B. Henk to give the usual update on TTM plus plans for the next months

C. Presentation of current state of thinking on moving TTM to
   a membership service

D. Discussion of work on summarizing our results into a small set
   of numbers that express the quality of the connectivity of a site

Z. AOB

-----------------------------------------------------------------------
A. Administrative stuff

   Daniel Karrenberg chairs the meeting, as Matthew Robinson had
   announced he could not make it to RIPE35 for personal reasons.

   - agenda bashing

     Item D was removed from the agenda: the work at the NCC on this
     has been delayed, and Daniel felt it was better to discuss it when
     a more concrete proposal is available, rather than start a
     philosophical discussion on what would be possible.

   - review of action items

     There were no outstanding action items, other than the
     RIPE NCC to continue with the Test-Traffic Measurements Project.


-----------------------------------------------------------------------

B. Henk to give the usual update on TTM plus plans for the next months

 Slides are on-line at http://www.ripe.net/test-traffic/Talks/0002_RIPE35/

 -  The existing measurement network

        Status:  43 boxes installed, mostly in Europe, one in Israel,
          four in the USA, one about to go to New Zealand.
          35 taking data, some installation and operational problems.

        Improving uptime, lessons for the next series
          Boxes run generally fine but if there is a problem, then they
          tend to be down for weeks. The number of spare parts needed
          was underestimated, more will be ordered with the next series.
          Shipping of boxes and spare parts will be made easier.
          Response time from the RIPE NCC's side will also be improved.

 -  The new series of test-boxes

        Hardware
          A next series of boxes is planned for the coming months;
          these will have to be purchased by the host (as discussed in
          Henk's second presentation, "turning TTM into a service")

          Experience with the present boxes showed the current design
          of the clocks to often be a problem for installation:
          The GPS Receiver is installed in the test-box, while the antenna
          is located on the roof or out the window; coaxial cable is needed
          between them. This can delay installation (running coax cable
          through a building which is not the ISP's property) and imposes
          strict limits on the maximum distance between box and antenna.

          The solution is to move the receiver closer to the antenna
          and use existing structured wiring (UTP cables) to transport
          signals to the test box. A design document has been written,
          now looking for products. Possible candidates are Trimble's
          Palisade and AccutimeII which meet all requirements. Tests
          are in progress.

        Software
          Some software components and procedures need to be updated
          to allow the project to scale from 40 to 100+ remotely
          deployed boxes. Keeping software up to date, and ensuring
          data taking jobs run properly are areas which will be
          automated more.

 -  Analysis, Products and Results

        Raw Data
          It has always been the policy that hosts can get a copy of all data
          sent and received by their box. In the past months we have seen the
          first requests for this. Data are provided in ROOT files; more info at
          http://www.ripe.net/test-traffic/Host_testbox/root.html

        Daily Plots
          Rewritten using ROOT's graphing tools: better layout, small box
          with statistics.  In the next few weeks, the plots on the web site
          will all be created with the new look.  We are also looking into
          creating a CGI interface to generate plots on demand: with 100+
          boxes it will be impractical to make plots for all active paths,
          especially because users are not likely to check the hundreds of
          plots which would otherwise be created for their box.

        Routing Vectors
          The current set-up did not scale for the increasing number of
          route vectors stored (timeouts, high load on web server).
          It is now being rewritten, using an SQL database, and
          expected to be ready in a few weeks. Functionality will be the
          same, the user interface will not change.

        Network alarms
          The goal of the network alarms is to detect unusual changes in
          network delays and warn test-box hosts. The algorithm was discussed
          at the previous RIPE meeting: if 95% of measurements are outside
          the expected range, an alarm is set and e-mail is sent to a
          specified address.
          In a next step the e-mail could be replaced (or supplemented)
          by another means of notification that would integrate the
          test-box alarms more with existing network monitoring systems.
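
          A minimal sketch of the alarm rule described above, for
          illustration only. The minutes do not say how the expected range
          is derived or over which window the 95% is counted, so the function
          name, the fixed range bounds and the "at least 95%" reading are all
          assumptions.

```python
from typing import Sequence


def alarm_needed(delays_ms: Sequence[float],
                 expected_low_ms: float,
                 expected_high_ms: float,
                 threshold: float = 0.95) -> bool:
    """Return True if at least `threshold` of the delays lie outside the range.

    The range bounds are passed in by the caller; how they would be
    estimated from historical data is outside the scope of this sketch.
    """
    if not delays_ms:
        return False  # no measurements, nothing to judge
    outside = sum(1 for d in delays_ms
                  if d < expected_low_ms or d > expected_high_ms)
    return outside / len(delays_ms) >= threshold


# Hypothetical batches of 20 one-way delays (ms); expected range 10-30 ms.
normal = [12.0, 15.5, 14.2, 13.8] * 5
shifted = [45.0] * 19 + [14.0]   # 19 of 20 measurements outside the range

print(alarm_needed(normal, 10.0, 30.0))
print(alarm_needed(shifted, 10.0, 30.0))
```

          Lowering the threshold or widening the range are two of the
          parameters that feedback from hosting sites could help tune.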

          First analysis of the recorded alarms shows that on average
          about 12 messages are sent per host per day. Weekend vs. weekday
          effects are also clearly visible in number-of-alarms-per-day graphs.

          The algorithms for sending alarms still have room for fine-tuning
          as a lot of parameters are involved.  However, this is hard to do
          without input from the hosting sites. Henk asked hosts to provide
          feedback in cases where they receive a lot of messages,
          for example: did alarms coincide with changes in connections or
          problems at the site? Or did a host receive an unusual number
          of messages even though nothing was changed?

          Ingo Luetkebohle asked which connections frequently generated
          alarms. Henk did not know the answer, but said that further
          analysis of the alarms generated during the past months is on
          his action list.  This topic will be included.

        Trends in the data
          This analysis topic deals with the question of how delays
          develop over longer periods of time (>1 month).
          Data from individual measurements are summarized in percentiles
          (2.5% = best case, 50% = normal case and 97.5% = worst case).
          With a handful of numbers each month, these are much easier to
          plot for an extended time period. Now the challenge in the analysis
          is to automatically find those plots which could be of interest for
          human operators. Henk showed some examples.  Next steps are to
          finish research by May 1st, then turn experimental code into
          production software and make results available on the web.
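
          The percentile summary can be sketched as follows. This is an
          illustration under assumptions, not the analysis code used by the
          NCC: the helper names, the linear-interpolation percentile
          definition and the sample data are all hypothetical.

```python
from typing import Dict, List, Sequence


def percentile(sorted_values: Sequence[float], p: float) -> float:
    """Percentile (p in 0..100) of an already sorted sequence,
    using linear interpolation between neighbouring values."""
    if not sorted_values:
        raise ValueError("no data")
    rank = (len(sorted_values) - 1) * p / 100.0
    lo = int(rank)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = rank - lo
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * frac


def summarize(delays_ms: Sequence[float]) -> Dict[str, float]:
    """Reduce one period of delay measurements to three numbers."""
    ordered: List[float] = sorted(delays_ms)
    return {
        "best": percentile(ordered, 2.5),     # 2.5%  = best case
        "normal": percentile(ordered, 50.0),  # 50%   = normal case
        "worst": percentile(ordered, 97.5),   # 97.5% = worst case
    }


# Hypothetical month of measurements: delays of 1.0 .. 41.0 ms.
delays = [float(d) for d in range(1, 42)]
summary = summarize(delays)
print(summary)
```

          Storing only these three values per path and per month keeps
          long-term plots small even as the number of boxes grows.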

        Network performance scores
          As discussion of this item had been removed from the agenda,
          Henk only touched briefly upon the reasons for looking into this,
          and what directions our thoughts are taking. A more detailed
          proposal will be available for discussion at the next RIPE meeting
          (RIPE36).

        Next (possible) analysis steps
          Modelling of data, N-square problem, Full TCP sessions, QoS,
          DiffServ, Traffic Engineering work ...
          Suggestions are always welcome!

 -  Questions/Discussion

     Daniel: alarm delivery via SNMP could be an option but will not
     be started by RIPE NCC unless there is real interest from the
     test-box hosts and someone is willing to start a joint project with us
     to get this implemented: what would be practical for test-box
     operations? Implement polling the box for data as well?

     Mixed feelings on best approach
     - some like e-mail, simple delivery scheme, leave it up to the
       host/user to post-process it (e.g. procmail filters and scripts
       to add to syslog)
     - others prefer syslog, fast low-weight (UDP) delivery, not suffering
       from TCP and higher protocol (sendmail) timeouts. syslog and SNMP can
       both be input streams to network monitoring software (e.g. netcore)
     - a poll of the audience showed about 8 of those present use SNMP
     RIPE NCC will provide syslog to those interested.
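
     As an illustration of the syslog option, the sketch below sends one
     alarm line as a UDP syslog datagram using Python's standard
     logging.handlers.SysLogHandler. The logger name, message layout,
     facility (local0) and box names are assumptions made for the example;
     the minutes do not specify the format RIPE NCC will use.

```python
import logging
import logging.handlers


def format_alarm(source: str, target: str, median_ms: float) -> str:
    """Build a one-line alarm text (hypothetical layout)."""
    return f"delay alarm {source}->{target} median={median_ms:.1f}ms"


def make_alarm_logger(host: str, port: int = 514) -> logging.Logger:
    """Return a logger that emits each record as one UDP syslog datagram."""
    logger = logging.getLogger("ttm-alarm")
    logger.setLevel(logging.WARNING)
    logger.handlers.clear()    # avoid duplicate handlers on repeated calls
    logger.propagate = False   # keep alarms out of the root logger
    handler = logging.handlers.SysLogHandler(
        address=(host, port),  # a (host, port) tuple selects UDP by default
        facility=logging.handlers.SysLogHandler.LOG_LOCAL0,
    )
    handler.setFormatter(logging.Formatter("ttm-alarm: %(message)s"))
    logger.addHandler(handler)
    return logger


if __name__ == "__main__":
    # Hypothetical box names; 127.0.0.1 stands in for the host's syslog server.
    log = make_alarm_logger("127.0.0.1")
    log.warning(format_alarm("tt01", "tt02", 48.3))
```

     Because UDP is connectionless, the sender does not block or time out
     when the receiver is unreachable, which is the property the syslog
     proponents asked for; the cost is that a lost datagram goes unnoticed.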


    Comment by Frode Greisen: everyone performs pings inside their network
      to check latencies, the test-box network provides a unique environment
      in which more interesting things could be done, e.g. traffic flow and
      throughput.

    Daniel answered:
      Test-Traffic measurements are more than just pings: we do one-way
      delay measurements, which are of much more value than the round-trip
      measurements provided by ping and are more accurate than ping.
      RIPE NCC does not want to perform passive measurements because of
      privacy problems: the project depends on trust, and some participants
      might not like all passing traffic being captured. Simulated loads and
      packet trains are in scope, but we prefer to wait for IETF IPPM defined
      metrics before starting that. On the other hand, if the RIPE community
      has a demand for this, we could look into starting something and bring
      it into the IETF working group. Daniel agreed to look at possibilities
      in the direction of TCP performance.

    One person commented that the delay measurement data are already
    interesting, as they provide an archive over time. More effort should be
    put into analysing the current data rather than collecting different
    data or other metrics.

    Mike Norris liked the idea of a single number for network performance
    but commented that like a stock market index, the number is created from
    a changing base, requiring frequent re-calibration. Scores will depend
    on the rest of the test-box network. He is interested in the correlation
    with other statistics, like RIPE host count, size of routing table etc.

    Daniel answered that Test-Traffic Measurements depend on the trust
    RIPE NCC has from ISPs; we do not want to ask for SNMP access to
    ISPs' routers. Someone else noted ISPs do know how much traffic they
    carry, so they could perform the correlation themselves.



C. Presentation of current state of thinking on moving TTM to a membership
   service

 Slides at http://www.ripe.net/ripencc/mem-services/ttm/Talks/0002_RIPE35_TTM/

 -  Until now, the Test-Traffic Measurements project was paid from
    the general NCC budget. As such there were some limitations:

    - 1 test-box per site gave everybody a chance to participate
    - in principle for RIPE-NCC members only (but a few boxes were
      installed outside the RIPE region where collaboration with hosts
      is beneficial)

    Some (larger) sites expressed interest in more than one box.
    Also there was a lot of interest from non-members, though often they
    did not want to become an LIR just for the sake of participating in
    the Test Traffic Measurements project.

    Finally, it was clear that not all current RIPE-NCC members
    are interested in the TTM service. Thus, the Annual General Meeting
    of the NCC membership adopted the resolution that the Test-Traffic
    Service should generate a self-sufficient budget in 2001.
    (Assuming 100 test-boxes to be in the field by the end of 2000, an
    estimated EUR 300,000 need to be generated from service fees in 2001)

    Henk presented the current ideas on turning TTM into a separate
    service that can be bought from the NCC. He stressed that the figures
    are preliminary and could be off by 10%; volume discounts for hardware,
    for example, had not yet been discussed with supplier(s).

    The basic model:

      Buying a test-box:
        about EUR 3000 for the hardware, shipping and installation.
        Taxes, import duties, etc to be paid by the host site

      Package Includes:
        Pre-configured test-box, 3 years warranty, some possibilities
        to swap broken box with spare, installation support by email

      Service fee:
        No service fee in 2000, from 2001 onwards EUR 3000/year
        (Assuming a total budget of EUR 300,000 and 100 boxes)

        Service fee includes: operation of the test-box as a black-box
        by the NCC; access to all data taken from/to the box at host's site;
        access to all products based on the TTM data; maintenance and support.
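
        The arithmetic behind the service fee can be written out as a
        short sketch. The budget and box count are the preliminary figures
        given above; the function name and the 75-box variation are
        hypothetical and only show how the fee depends on deployment.

```python
def annual_fee_per_box(total_budget_eur: float, boxes: int) -> float:
    """Split the yearly service budget evenly over all boxes in the field."""
    if boxes <= 0:
        raise ValueError("need at least one box")
    return total_budget_eur / boxes


# Figures from the minutes: EUR 300,000 budget, 100 boxes.
print(annual_fee_per_box(300_000, 100))   # 3000.0

# Hypothetical variation (not from the minutes): only 75 boxes deployed.
print(annual_fee_per_box(300_000, 75))    # 4000.0
```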

    Private experiments

      Definition: measurements on internal network, host does not want
        to share data and results OR other measurements that can be done
        with a test-box.

      Model: host buys another pre-configured test-box from RIPE NCC.
        The root password will be communicated; after installation, the
        host is on its own. RIPE NCC will not collect or analyze the data
        and the box *cannot* be used in the normal measurement chain.

     Build own test-box

       Model: host buys hardware and constructs the test-box himself,
         then hands it over to NCC Test-Traffic operations who will turn it
         into a black box. Any interest in this?


    Next steps for turning TTM into service:

    -  With feedback received at this meeting a draft proposal will be
       written which will be posted to the tt-wg@localhost mailing list
       for further discussion. From there plans will be finalized,
       text for a service agreement will be drafted and a glossy brochure
       will be produced for participating sites' management.

       If there is consensus, the plan is to have this in place before RIPE36.

 - Discussion

     Henk opened the discussion with the following questions:

     - Do you like this concept?
     - If not, it's back to the drawing-board, but what should be changed?
     - Would you be interested in paying for this service?

     Daniele Bovio commented that he sees the purpose of starting to charge
     a fee for the service; there is value in RIPE NCC collecting and
     reporting on the data. However, he questioned the option of
     private experiments, where boxes would be sold to
     anyone interested. In his view, it is not in the interest of the
     RIPE NCC to start this business; it might be better to sell the
     technology to a company who would handle marketing and support
     for those types of installations.

     Daniel Karrenberg replied that some hosts have already asked for
     multiple boxes, to be used for monitoring the ISP's internal networks.
     Currently, the NCC can sustain the demand, but he agreed it would not
     be good if many such requests came in.

     Mike Norris felt one should be careful about opening the public
     project to non-members of RIPE NCC. He agreed with the planned
     funding.

     It was asked if it would be possible to have results from both internal
     and RIPE boxes presented in one report. Henk did not feel this was
     a good idea: we can guarantee the integrity of data from boxes
     operated by the NCC; if we start mixing data from those boxes with
     data from private experiments, we could lose the trust we have now.

     Comment: in order to get the charging approved, we need details
     (in the form of a written document) on what will be done with the
     money paid for TTM services.

     Henk/Daniel: this is what the service agreement is all about:
     a contract signed by both RIPE NCC and the hosting organisation
     detailing rights and obligations. In addition the NCC publishes
     yearly charging schemes and activity plans on which members can vote
     during the Annual General Meeting. Daniel stressed the importance
     of selling TTM services via a membership construction: it gives
     those who are not an LIR but do pay for TTM the opportunity to
     participate in the formal structures of RIPE NCC.

     Henk reminded the group that test-box hosts get access to all data and
     results obtained with the box at their site. Internally, these
     can be used as desired; external publications, however, can only
     be made after review in the TT working group.

     Jake Khuon feels the RIPE NCC is doing good work in this field. He
     would be interested in seeing a conclusive report, a paper published.

Z. AOB

    There was no other business.