WeRelate:Suggestions/Automatically delete "emtpy" pages for living people

Original Suggestion and Discussion

I am proposing that pages for living people that are not connected to any spouse pages be deleted from the database automatically by bot. There are thousands of such pages on WR that were created upon gedcom upload prior to the new system of excluding these individuals. These pages were stripped of their information and are essentially empty. I believe keeping them on WR causes confusion to new users. I have seen new users enter vital information for these people, with the mistaken belief that WR accepts pages for living people. The bot should delete pages that are not connected to a spouse, such as Person:Living Rank (12), and that contain no other information. Living pages that are connected to spouses, like Person:Living Piontek (1), would likely need to be dealt with manually. --Jennifer (JBS66) 06:23, 17 November 2011 (EST)

Not an issue that I have particularly been involved in, but why differentiate those with spouses? It seems like those should be deleted as well and the spouse left with an unknown partner? On a different issue: I don't think it is limited to the Lane family, just my knowledge of it, but "Living" was actually used as a name for a while (e.g., Person:Living Lane (105) and Person:Living Lane (106)), so I was also led to wonder how exactly is it determined that people are living? --Jrich 08:54, 17 November 2011 (EST)
I did not know that Living was actually a given name, thank you for posting examples of that. I suggested that only those without spouses be deleted by bot, because I believe those with spouses will need to be evaluated by a person before being deleted. There are thousands of pages that were scrubbed of their data and are left essentially useless. I was hoping that we could devise a way to remove these pages in an automated way. The basic criteria of no spouse, no birth or death date, no text whatsoever on the page is a place we may be able to start in removing these from the database. You asked how these people are determined to be living. I believe the gedcom uploader looks at the date of birth and the parents' dates of birth - if the person was born within the last 110 years and does not have a death date, they are marked as living. In the current gedcom upload process, these pages are excluded from upload. Dallan would have more info on the specifics of this process though. --Jennifer (JBS66) 09:16, 17 November 2011 (EST)
Again, I have not been working on this issue, so my questions may be naive. What are the spouses being evaluated for? That a living person has snuck in and not been labeled living? The case I looked at, the living person's spouse had a birth and death date and seemed legitimate. Presumably this kind of page (the spouse) doesn't want to be deleted, so the choices would seem to be: leave the living person as Living, or delete their page and leave the valid spouse married to Unknown. To me, either one still almost begs some detail-oriented person (a good trait in genealogy) to add information to complete the picture. In the big picture, the person represented by the Living page does actually exist, and will some day become a legitimate object for shared research without endangering privacy, so leaving the page as a placeholder doesn't seem all that objectionable to me, if it was properly identified as such. If labeling the page as "Living" is not sufficient, could a template be added that warns people not to add data to living people (so users stumbling on the page, that don't know the rules, won't be tempted to do so prematurely)? Many decades from now, when the person is no longer living, somebody can come in and legitimately add all the data and remove the template. --Jrich 09:46, 17 November 2011 (EST)
The gedcom uploader determines that someone is living if they've been born/christened in the past 110 years without anything in a death or burial field, or if they have no birth/christening date but their spouse or children are determined to be living, or, I believe, if their parents were born within the past 130 years. Currently, the uploader doesn't import anyone it determines to be living, even if they have a spouse. Previously, it would create empty pages for these people, with a name of "Living Surname", a gender, and nothing else. What if I write a program to delete all pages that have only a given name of "Living," a surname, a gender, and nothing else? Also, to avoid upsetting people, the program would delete only pages that were created by gedcom upload and haven't been edited since. Pages that have been edited on-line would have to be deleted manually after a human review, if someone wants to do that. Yes, we'd be deleting placeholder pages, but the deleted pages would all be empty, and it wouldn't be difficult to re-create them after the person had passed away. This would make the website conform to our current policy of "no pages for living" (see WeRelate:Policy). It might inadvertently delete a few pages for dead people whose given name is "Living", which is a concern, but if there's nothing else on the page, we haven't lost a lot of information by deleting those pages. Thoughts?--Dallan 16:54, 23 November 2011 (EST)
What about the family pages where there are one or more living spouses? Would those need to be deleted by hand? --Jennifer (JBS66) 16:58, 23 November 2011 (EST)
I could apply the same rules that the current gedcom uploader uses: a family page is created only if there are multiple deceased family members, or if there is one deceased family member and an event or some other information. So the program could also remove the associated family pages of a living person if they didn't meet one of those two criteria.--Dallan 23:32, 23 November 2011 (EST)
Sounds like a good plan to me! --Jennifer (JBS66) 06:46, 25 November 2011 (EST)
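
For anyone picking this up later, the "living" test Dallan describes above could be sketched roughly as follows. This is only an illustration, not the uploader's actual code: the Person class, its field names, and the function names are all hypothetical, the 130-year parent rule is the one Dallan gives with an "I believe", and the sketch assumes that anything in a death or burial field always means "not living". Relatives are checked one level deep only, using their own dates, which is a simplification.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Person:
    """Hypothetical stand-in for one individual in an uploaded gedcom."""
    birth_year: Optional[int] = None      # birth or christening year, if any
    has_death_or_burial: bool = False     # anything in a death or burial field
    spouses: List["Person"] = field(default_factory=list)
    children: List["Person"] = field(default_factory=list)
    parents: List["Person"] = field(default_factory=list)


def born_recently_without_death(p: Person, current_year: int, window: int = 110) -> bool:
    """Rule 1: born/christened within `window` years and no death/burial data."""
    return (
        p.birth_year is not None
        and not p.has_death_or_burial
        and current_year - p.birth_year <= window
    )


def is_presumed_living(p: Person, current_year: int) -> bool:
    # Assumption: a death or burial entry always rules out "living".
    if p.has_death_or_burial:
        return False
    # Rule 1: the person's own birth/christening date decides when present.
    if p.birth_year is not None:
        return current_year - p.birth_year <= 110
    # Rule 2: no date, but a spouse or child looks living by their own dates.
    if any(born_recently_without_death(r, current_year) for r in p.spouses + p.children):
        return True
    # Rule 3: no date, but a parent was born within the past 130 years.
    return any(
        par.birth_year is not None and current_year - par.birth_year <= 130
        for par in p.parents
    )
```

A person with no dates and no dated relatives comes out as "not living" here, which may or may not match what the uploader really does.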

There are also thousands of Family:Unknown and Unknown (1253) pages that contain no spouses, no text, and only one child. Might we be able to delete these as well? We would need to keep the pages with multiple children though. --Jennifer (JBS66) 09:49, 21 December 2011 (EST)

Yes, that makes sense.--Dallan 23:12, 21 December 2011 (EST)
Dallan, is there an API for writing bots? - Jdfoote1 13:08, 7 June 2012 (EDT)
I just made one publicly available. See WeRelate:Call for code volunteers for more information. I'd greatly appreciate it if you wanted to write a bot to delete the empty pages for living people.--Dallan 17:43, 10 June 2012 (EDT)
The above link goes to a redirect which goes to an empty page. The most relevant current page appears to be WeRelate:Bots for page maintenance --Tfmorris 15:44, 8 May 2015 (UTC)
Yes, both links go to maintenance committee pages for Website features and Bots which is appropriate in this case. --cos1776 20:21, 19 November 2016 (UTC)

Specifically, the bot should delete Person pages if:

  • given name is "Living",
  • page contains no other information except surname and gender,
  • there are no links to family pages that won't also be deleted (see below), and
  • comment on last edit is "gedcom upload"

and the bot should delete family pages if:

  • page contains no other information except for links to people, and
  • of the people that are linked to, either all of them are living (according to the rules above) or all but one of them are living.

--Dallan 17:55, 10 June 2012 (EDT)
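
As a starting point for a volunteer, the two rule sets above could be sketched like this. It is a minimal illustration under stated assumptions, not a working bot: it does not call the WeRelate API at all, and PersonPage, FamilyPage, their fields, and the function names are all hypothetical. It reads "living (according to the rules above)" as "the person page meets the given name, no-other-information, and last-edit-comment rules", and it additionally assumes a family page is only a candidate if at least one linked person is such a page.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class PersonPage:
    """Hypothetical summary of a Person page, already fetched by some other means."""
    given: str
    has_other_content: bool = False       # anything beyond surname and gender
    last_edit_comment: str = ""
    families: List[str] = field(default_factory=list)   # linked family page titles


@dataclass
class FamilyPage:
    """Hypothetical summary of a Family page."""
    members: List[str] = field(default_factory=list)    # linked person page titles
    has_other_content: bool = False       # anything beyond links to people


def is_empty_living(p: PersonPage) -> bool:
    """Person rules 1, 2 and 4: untouched 'Living' placeholder from a gedcom upload."""
    return (
        p.given == "Living"
        and not p.has_other_content
        and p.last_edit_comment == "gedcom upload"
    )


def family_deletable(f: FamilyPage, people: Dict[str, PersonPage]) -> bool:
    """Family rules: only links to people, and all or all but one of them living."""
    if f.has_other_content:
        return False
    living = [m for m in f.members if m in people and is_empty_living(people[m])]
    others = len(f.members) - len(living)
    # Assumption: require at least one living member so unrelated families are untouched.
    return len(living) >= 1 and others <= 1


def pages_to_delete(
    people: Dict[str, PersonPage], families: Dict[str, FamilyPage]
) -> Tuple[Set[str], Set[str]]:
    """Return (person titles, family titles) that meet the criteria."""
    fams = {t for t, f in families.items() if family_deletable(f, people)}
    # Person rule 3: every linked family page must itself be deleted.
    persons = {
        t for t, p in people.items()
        if is_empty_living(p) and all(ft in fams for ft in p.families)
    }
    return persons, fams
```

Family pages are evaluated first because the third person rule depends on them. A person linked to a family page that is not in the input is kept, which errs on the side of not deleting.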

Status (2012)

per Dallan (2012): Status: asking for someone to write a bot to implement this request

Neutral Watchers

Admin follow up (Nov 2016)

We are currently working through the backlog of old suggestions.

Analysis: If a bot was written, it does not appear to have been implemented. A simple Search shows that the database still contains approximately:

  • 12,332 Person pages where Given name = Living
  • 5,065 Person pages where Given name = Unknown and Surname = Unknown
  • 4,210 Family pages where Husband given = Living
  • 9,043 Family pages where Wife given = Living

Status: Open
Priority: 2 (medium)

--cos1776 20:21, 19 November 2016 (UTC)

Additional comments

The majority of Living pages have already been removed manually (many thanks to Susan Irish, who I believe did most of this cleanup). I am slowly working through the remaining 11,090 person pages, salvaging as many as I can, since more than half of these pages seem to be for deceased individuals. Consequently, I will move this suggestion to the archive page.--DataAnalyst 17:36, 8 April 2017 (UTC)