Clean up of unreferenced person objects

Abstract
--------

This is a proposal for a one-time cleanup of unreferenced person
objects. It also addresses the flagging of objects prior to deletion
and an opt-in white pages.

  * Clean up of unreferenced person objects
      o Abstract
      o Intended Audience
      o Introduction
      o Background
      o Changes to objects
      o Flagged
      o White pages
      o Bulk cleanup procedure
          + Month 1
          + Month 2
          + Month 3
          + Notes
      o References
      o Follow up
          + WG mailing list references
          + RIPE Meeting minutes references

Introduction
------------

According to our database consistency statistics program (dbconstat
[1]) we currently have 460,573 unreferenced person/role objects [2].
Some of these may be maintained, but are still unreferenced.

Any personal data not referenced by Internet resources do not fit
within the primary purpose of the RIPE Database. They should not be
stored in the RIPE Database beyond a reasonable 'work in progress'
period.

A secondary purpose of the RIPE Database has been proposed. This will
be a 'white pages' listing of 'significant' people within the Internet
community. Those wishing to be linked to these white pages must
'opt-in' and explicitly consent to their personal data being displayed
in a public database. We regard this consent as given when the user
provides the authentication to modify the person object. Those doing
so will be exempt from the cleanup process.

One of the other current proposals is to require all objects to be
maintained. This may result in many of the currently unreferenced
person/role objects being referenced in a mntner object which in turn
maintains the person/role object. This pair of mutually referencing
objects may not be linked to any other object. This may have a
significant effect on the original proposal. We therefore propose a
change to this one-time cleanup to accommodate this effect and aim to
hit the majority of the previously identified objects which should be
removed.
The rest will be taken care of later with the introduction of a regular
cleanup script.

The new proposal will still target any unreferenced person/role
objects. It will also target person objects and their referencing
mntner object, provided the two objects reference only each other.

Even though person/role objects must now be maintained, this does not
mean they will all be referenced by a mntner. They may be maintained by
an existing mntner which references other person/role objects. This
allows for the maintained person/role objects to still be
unreferenced.

Targeting 'loose' mntner objects will catch the mutually referencing
pairs. There may be many more of these when it is required to maintain
person objects. In this case we will only target the person/mntner
object pairs. To include role objects implies person/role/mntner
groups with many more references. This is too complicated to handle
within the scope of this one-time cleanup process.

Background
----------

The RIPE NCC has had a mandate to delete these objects since RIPE 40:
http://www.ripe.net/ripe/wg/db/minutes/ripe-40.html (2001)

At RIPE 41, the Database Working Group agreed that "maintained objects
will now be removed" and "gave a mandate to the RIPE NCC to continue
with the cleanup process."
http://www.ripe.net/ripe/wg/db/minutes/ripe-41.html (2002)

The cleanup process of 2003
(http://www.ripe.net/db/news/unref-cleanup-200304.html) involved using
a script to run periodic cleanups. This script was put in place.
However, it failed about 18 months ago. Because of other priorities, we
have not had the time to examine this issue again until now.

The graph showing the increase in these unreferenced person objects [3]
indicates that the cleanup script appears not to have been performing
correctly. At the start of 2006, there were about 300,000 unreferenced
person/role objects. Since then the number has risen steadily, with a
large jump of almost 50,000 in February 2007.
The graph also shows a slightly higher rate of increase in these
objects since February 2007.

At RIPE 54 (2007) it was agreed that a new cleanup process is needed.
Concerns regarding use as 'white pages' and the issue of tagging
objects prior to deletion to delay re-use of nic-hdls have been
addressed in this new proposal.

Because redundant personal data, with no explicit consent to remain, is
a serious data protection issue, we want to take a new approach to this
in the future. Once the initial cleanup is in progress, we will create
a new proposal for a new, regular cleanup procedure. This will be sent
to the Data Protection Task Force [4] and then to the rest of the
community.

Changes to objects
------------------

Add a "not-ref:" attribute to person/role objects. This indicates that
the person/role object is not referenced and records the date when it
last became unreferenced. It is generated by the database software and
takes a date stamp. It can also take an extra tag 'FLAGGED'. It cannot
be changed in any way by users.

    person:         [mandatory]  [single]
    address:        [mandatory]  [multiple]
    phone:          [mandatory]  [multiple]
    fax-no:         [optional]   [multiple]
    e-mail:         [optional]   [multiple]
    org:            [optional]   [multiple]
    nic-hdl:        [mandatory]  [single]
    not-ref:        [generated]  [single]
    remarks:        [optional]   [multiple]
    notify:         [optional]   [multiple]
    abuse-mailbox:  [optional]   [multiple]
    mnt-by:         [mandatory]  [multiple]
    changed:        [mandatory]  [multiple]
    source:         [mandatory]  [single]

Create a new org-type for organisation objects:

    org-type:       WHITE-PAGES

This type can only be set by the database administrators.

Flagged
-------

When an object is deleted the nic-hdl becomes immediately re-usable by
anyone. To allow sufficient time for users to decide what they want to
do with their (unreferenced) person/role objects, the objects will be
flagged before deletion. During this flagged period the user can make
appropriate references, link the object to the white pages (maintained
person objects only) or delete it themselves.
If nothing is done, the object will be automatically deleted once the
set flagged period is over.

The cleanup script will append the 'FLAGGED' tag to the "not-ref:"
generated attribute in the selected (unreferenced) person/role
objects:

    not-ref:        FLAGGED

If the object has a "notify:" attribute or is maintained, the user will
be notified of this modification by the normal update mechanism. A
customised notification message will be used to make it clear to the
user why this notification has been sent and to reference a web page
with further details.

If the user wishes to keep this object they can reference it either
directly or indirectly from an Internet resource or apply to link it
to the white pages (maintained person objects only). Either of these
operations, if successful, will result in the "not-ref:" attribute
being removed from the object by the database software.

After an agreed period, if this "not-ref:" attribute is still present
with the 'FLAGGED' tag the object will be deleted. The user will be
notified by the normal update mechanism with the standard message, if
the appropriate contact attributes are present.

White pages
-----------

There are some people who have a 'significant presence' within the
Internet community, but who are not directly responsible for any
Internet resource. These people may want or need to be easily
contactable by the community. To facilitate this we will introduce a
white pages to the RIPE Database. Users can have their person objects
linked to the white pages.

The RIPE NCC will create a number of organisation objects to define
the categories of the white pages. For example:

    RIPE NCC board
    RIPE WG chairs
    RIPE working groups
    etc

These organisation objects will have a new "org-type:" of
'WHITE-PAGES'. For each category there will be a moderator who will
approve applications. The nic-hdl of the moderator(s) will be stored in
a "remarks:" attribute in the organisation objects.
Details of the white pages and how to apply will be available on the
RIPE website.

A user can apply to have their person object linked to the white pages.
They should select the category and contact the moderator. The user
needs to send their full person object to the moderator. This should
either include the plain text password or be a signed message
providing the authentication to modify this person object.
Unmaintained person objects will not be linked to the white pages.

If the moderator approves the application s/he will add an "org:"
attribute referencing the appropriate organisation object, together
with the required authentication, and submit the message as an update
to the RIPE Database software.

If the user no longer wishes to be linked to the white pages they can
remove the "org:" attribute from the person object. However, if the
object is not referenced it will be tagged and then deleted.

Person objects that are referenced by Internet resources can still be
linked to the white pages if the user wishes to be known in a specific
category. Unreferenced person objects and person/mntner pairs will be
exempt from the cleanup process if linked to the white pages.

You can only apply for an existing, maintained person object with an
"e-mail:" attribute to be linked to the white pages. The person
objects must be created and maintained in the normal way before
applying. If the object is later modified to remove the "e-mail:"
attribute it will lose its white pages status.

The nominated moderator can be the RIPE NCC for such categories as
RIPE NCC board and RIPE WG chairs, the WG chairs (or co-chairs) for
their own WG category, and so on.

Requests for additional white pages categories can be sent to Customer
Services at RIPE NCC. These requests will be forwarded to the WG chairs
mailing list for approval. If approved the RIPE NCC will create the new
organisation object, update the web page and notify the moderator.
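
The eligibility rule stated above (an existing, maintained person
object with an "e-mail:" attribute) could be checked along the
following lines. This is only an illustrative sketch in Python, not
part of the RIPE Database software: the function names, the simplified
one-attribute-per-line parsing and all sample values (nic-hdl, mntner
name, addresses) are assumptions made for the example.

```python
def parse_object(text):
    """Parse a simplified RPSL-style object into (attribute, value) pairs.

    Assumption: one attribute per line, no continuation lines.
    """
    attrs = []
    for line in text.strip().splitlines():
        name, sep, value = line.partition(":")
        if sep:
            attrs.append((name.strip().lower(), value.strip()))
    return attrs


def white_pages_eligible(attrs):
    """True for a maintained person object that has an e-mail attribute."""
    if not attrs or attrs[0][0] != "person":
        return False  # only person objects can be linked
    names = {name for name, _ in attrs}
    return "mnt-by" in names and "e-mail" in names


# Hypothetical sample objects; none of these values are real.
MAINTAINED = """
person:  Jane Example
address: 1 Example Street
phone:   +00 0 000 0000
e-mail:  jane@example.net
nic-hdl: EX1-RIPE
mnt-by:  EXAMPLE-MNT
source:  RIPE
"""

UNMAINTAINED = """
person:  John Example
address: 2 Example Street
phone:   +00 0 000 0001
e-mail:  john@example.net
nic-hdl: EX2-RIPE
source:  RIPE
"""

ok = white_pages_eligible(parse_object(MAINTAINED))
not_ok = white_pages_eligible(parse_object(UNMAINTAINED))
```

The check covers only the mechanical preconditions. Approval by the
category moderator remains a separate, manual step.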

At RIPE 54 there was no discussion on who should be allowed to use the
white pages. Without any limits or restrictions applied, it is open to
abuse and could become an address book with thousands or tens of
thousands of entries. We would simply move the problem of unreferenced
person objects from one place to another. Although the RIPE NCC is
Data Controller for the RIPE Database, it is not in a position to
decide on the usage of the white pages. This is why we have introduced
the moderator position to put this responsibility on members of the
community who are in a position to make these decisions.

We do not expect a high demand for this service or for creating many
new white pages categories. If the demand does become excessive then
it becomes a social networking service for technical people. In that
case we should reassess the purpose and implementation. Our first
option is to keep it simple and constrained.

Bulk cleanup procedure
----------------------

Month 1

  * Select 80,000 unreferenced person/role objects or person/mntner
    pairs and tag them. (These will be the first 80,000 found by an SQL
    query on the database.)

Month 2

  * Check selected objects.
  * Those still unreferenced and tagged:
      o Delete using normal update process.
      o Run updates overnight to avoid any unnecessary load on the
        servers.
  * Select next 80,000 unreferenced person/role objects or
    person/mntner pairs and tag them.

Month 3

  * Repeat process until complete.

Notes
-----

This procedure will take about 6 months to clear the current backlog,
not including the extra time that may be necessary due to the increases
over that period.

We will announce the start of the cleanup to the Working Group mailing
lists and as a news item on our web site home page.
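
As a rough illustration of the monthly cycle described under "Bulk
cleanup procedure", the sketch below models one pass in Python: delete
whatever is still flagged and unreferenced, then flag the next batch.
It is not the actual cleanup script. The Obj class, the monthly_run
function and the sample nic-hdls are hypothetical, an in-memory list
stands in for the SQL query, and person/mntner pairs are not modelled.
The batch size and backlog figures are the ones quoted in this
proposal.

```python
import math
from dataclasses import dataclass

BATCH_SIZE = 80_000  # objects tagged per month (from the proposal)
BACKLOG = 460_573    # unreferenced person/role objects (from the Introduction)


@dataclass
class Obj:
    """Hypothetical stand-in for a person/role object."""
    nic_hdl: str
    referenced: bool = False
    flagged: bool = False


def monthly_run(db, batch_size=BATCH_SIZE):
    """One monthly pass: delete still-flagged objects, then flag the next batch."""
    # Delete objects flagged in the previous pass that are still unreferenced.
    survivors = [o for o in db if not (o.flagged and not o.referenced)]
    deleted = len(db) - len(survivors)
    # An object that has become referenced loses its flag.
    for o in survivors:
        if o.referenced:
            o.flagged = False
    # Flag the first batch_size unreferenced objects found.
    candidates = [o for o in survivors if not o.referenced]
    for o in candidates[:batch_size]:
        o.flagged = True
    return survivors, deleted


# Number of batches needed for the quoted backlog, ignoring new growth.
batches = math.ceil(BACKLOG / BATCH_SIZE)

# Tiny worked example with a batch size of 2.
db = [Obj(f"EX{i}-RIPE") for i in range(5)]
db, deleted = monthly_run(db, batch_size=2)  # month 1: flags EX0 and EX1
db[0].referenced = True                      # the user references EX0 in time
db, deleted = monthly_run(db, batch_size=2)  # month 2: deletes EX1, flags EX2 and EX3
```

With the quoted figures the backlog divides into six batches, which is
in line with the estimate of about 6 months given in the Notes.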

References
----------

[1] dbconstat
    http://www.ripe.net/projects/dbconstat/index.html
[2] current unreferenced person/role objects
    http://www.ripe.net/projects/dbconstat/html/cons-current.html
[3] graph of unreferenced person object increase
    http://www.ripe.net/projects/dbconstat/cons-unrefpero.html
[4] Data Protection Task Force
    http://www.ripe.net/ripe/tf/dp/index.html