[atlas] Incident report for 2019-10-02 (was: Error: No suitable probes and delayed results)
Moritz Muller
moritz.muller at sidn.nl
Thu Oct 3 13:15:44 CEST 2019
Hi Robert,

Thanks a lot for the update and good luck with the review.

Moritz

> On 3 Oct 2019, at 11:45, Robert Kisteleki <robert at ripe.net> wrote:
>
>
> On 2019-10-03 08:16, Moritz Muller wrote:
>> Hi,
>>
>> In our experiment we're trying to assign certain probes to a ping measurement but until now we always get the error message "NO SUITABLE PROBES".
>> According to the documentation, this is a sign that a probe might not have enough resources.
>> However, when I check on the measurement a few hours later I do see results, but its state is still "NO SUITABLE PROBES".
>> See https://atlas.ripe.net/measurements/23016956/#!probes
>>
>> Is that a common problem when selecting certain probes for measurements?
>>
>> Moritz
>>
>
> Hello,
>
> Yesterday afternoon we had an operational problem within RIPE Atlas that
> had consequences visible to users. I strongly suspect the above is a
> side-effect of this.
>
> Due to a combination of two configuration errors and a spike in requests
> from users, the core infrastructure received an unreasonably high amount
> of measurement requests from an internal process related to IPmap. The
> measurements scheduler and participant management subsystems struggled
> to keep up with this load and eventually things started piling up.
>
> The issue started at approximately 12 UTC. We identified the root cause
> about an hour later. Processes started to normalise late in the
> afternoon, and processing the backlog finished sometime after midnight.
>
> We're working on a post-mortem, and reviewing the code and configuration
> in order to prevent this error from happening again.
>
> Apologies for the inconvenience,
> Robert
>