WeRelate:Suggestions/Automatically delete "emtpy" pages for living people

Status: asking for someone to write a bot to implement this request

I am proposing that pages for Living people that are not connected to any spouse pages be deleted from the database automatically by a bot. There are thousands of such pages on WR that were created upon gedcom upload prior to the new system of excluding these individuals. These pages were stripped of their information and are essentially empty. I believe keeping them on WR confuses new users. I have seen new users enter vital information for these people, in the mistaken belief that WR accepts pages for living people. The bot should delete pages that are not connected to a spouse, such as Person:Living Rank (12), and that contain no other information. Living pages that are connected to spouses, like Person:Living Piontek (1), would likely need to be dealt with manually. --Jennifer (JBS66) 06:23, 17 November 2011 (EST)

Not an issue that I have particularly been involved in, but why differentiate those with spouses? It seems like those should be deleted as well and the spouse left with an unknown partner. On a different issue: I don't think it is limited to the Lane family, that is just where I know of it, but "Living" was actually used as a name for a while (e.g., Person:Living Lane (105) and Person:Living Lane (106)), so I was also led to wonder how exactly it is determined that people are living. --Jrich 08:54, 17 November 2011 (EST)
I did not know that Living was actually a given name; thank you for posting examples of that. I suggested that only those without spouses be deleted by bot because I believe those with spouses will need to be evaluated by a person before being deleted. There are thousands of pages that were scrubbed of their data and are left essentially useless. I was hoping that we could devise a way to remove these pages automatically. The basic criteria of no spouse, no birth or death date, and no text whatsoever on the page are a place we may be able to start in removing these from the database. You asked how these people are determined to be living. I believe the gedcom uploader looks at the date of birth and the dates of the parents' births - if the person was born within the last 110 years and does not have a death date, they are marked as living. In the current gedcom upload process, these pages are excluded from upload. Dallan would have more info on the specifics of this process though. --Jennifer (JBS66) 09:16, 17 November 2011 (EST)
Again, I have not been working on this issue, so my questions may be naive. What are the spouses being evaluated for? That a living person has snuck in and not been labeled living? In the case I looked at, the living person's spouse had a birth and death date and seemed legitimate. Presumably this kind of page (the spouse) should not be deleted, so the choices would seem to be: leave the living person as Living, or delete their page and leave the valid spouse married to Unknown. To me, either one still almost begs some detail-oriented person (a good trait in genealogy) to add information to complete the picture. In the big picture, the person represented by the Living page does actually exist, and will some day become a legitimate object for shared research without endangering privacy, so leaving the page as a placeholder doesn't seem all that objectionable to me, if it is properly identified as such. If labeling the page as "Living" is not sufficient, could a template be added that warns people not to add data to living people (so users who stumble on the page without knowing the rules won't be tempted to do so prematurely)? Many decades from now, when the person is no longer living, somebody can come in and legitimately add all the data and remove the template. --Jrich 09:46, 17 November 2011 (EST)
The gedcom uploader determines that someone is living if they were born/christened in the past 110 years without anything in a death or burial field, or if they have no birth/christening date but their spouse or children are determined to be living, or if their parents were born within the past 130 years, I believe. Currently, the uploader doesn't import anyone it determines to be living, even if they have a spouse. Previously, it would create empty pages for these people, with a name of "Living Surname", a gender, and nothing else. What if I write a program to delete all pages that have only a given name of "Living", a surname, a gender, and nothing else? Also, to avoid upsetting people, the program would delete only pages that were created by gedcom upload and haven't been edited since. Pages that have been edited online would have to be deleted manually after a human review, if someone wants to do that. Yes, we'd be deleting placeholder pages, but the deleted pages would all be empty, and it wouldn't be difficult to re-create them after the person had passed away. This would make the website conform to our current policy of "no pages for living" (see WeRelate:Policy). It might inadvertently delete a few pages for dead people whose given name is "Living", which is a concern, but if there's nothing else on the page, we haven't lost a lot of information by deleting those pages. Thoughts?--Dallan 16:54, 23 November 2011 (EST)
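For anyone who wants to reason about the living test described above, here is a minimal Python sketch of it. It is not the uploader's actual code: the Individual class, its field names, and the is_presumed_living function are all hypothetical, and the sketch assumes that any death or burial entry rules a person out as living under every rule, which the description above only states for the first one. The 110- and 130-year thresholds are the ones quoted above, and the 130-year figure was itself given as an "I believe".

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass(eq=False)  # eq=False: compare by identity, safe for cyclic family links
class Individual:
    # Hypothetical record shape; the real uploader's data model is not shown here.
    birth_year: Optional[int] = None        # birth or christening year, if any
    has_death_or_burial: bool = False       # anything in a death or burial field
    spouses: List["Individual"] = field(default_factory=list)
    children: List["Individual"] = field(default_factory=list)
    parents: List["Individual"] = field(default_factory=list)


def is_presumed_living(person: Individual, current_year: int,
                       _seen: Optional[Set[int]] = None) -> bool:
    """Apply the three rules described in the post above (assumed reading)."""
    if _seen is None:
        _seen = set()
    if id(person) in _seen:
        return False                        # already being checked; avoid loops
    _seen.add(id(person))

    if person.has_death_or_burial:
        return False                        # assumption: applies to every rule
    if person.birth_year is not None:
        # Rule 1: born/christened in the past 110 years, no death or burial.
        return current_year - person.birth_year <= 110
    # Rule 2: no birth date, but a spouse or child is determined to be living.
    for relative in person.spouses + person.children:
        if is_presumed_living(relative, current_year, _seen):
            return True
    # Rule 3: no birth date, but a parent was born within the past 130 years.
    return any(p.birth_year is not None and current_year - p.birth_year <= 130
               for p in person.parents)
```

For example, a person with no dates whose child was born in 1950 would be presumed living in 2011, while a person with no dates and no relatives would not.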
What about the family pages where there are one or more living spouses? Would those need to be deleted by hand? --Jennifer (JBS66) 16:58, 23 November 2011 (EST)
I could apply the same rules that the current gedcom uploader uses: a family page is created only if there are multiple deceased family members, or if there is one deceased family member and an event or some other information. So the program could also remove the associated family pages of a living person if they didn't meet one of those two criteria.--Dallan 23:32, 23 November 2011 (EST)
Sounds like a good plan to me! --Jennifer (JBS66) 06:46, 25 November 2011 (EST)

There are also thousands of Family:Unknown and Unknown (1253) pages that contain no spouses, no text, and only one child. Might we be able to delete these as well? We would need to keep the pages with multiple children though. --Jennifer (JBS66) 09:49, 21 December 2011 (EST)

Yes, that makes sense.--Dallan 23:12, 21 December 2011 (EST)
Dallan, is there an API for writing bots? - Jdfoote1 13:08, 7 June 2012 (EDT)
I just made one publicly available. See WeRelate:Call for code volunteers for more information. I'd greatly appreciate it if you wanted to write a bot to delete the empty pages for living people.--Dallan 17:43, 10 June 2012 (EDT)

Specifically, the bot should delete Person pages if:

  • given name is "Living",
  • page contains no other information except surname and gender,
  • there are no links to family pages that won't also be deleted (see below), and
  • comment on last edit is "gedcom upload"

and the bot should delete family pages if:

  • page contains no other information except for links to people, and
  • of the people that are linked to, either all of them are living (according to the rules above) or all but one of them are living.

--Dallan 17:55, 10 June 2012 (EDT)
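
As a starting point for whoever writes the bot, the two sets of criteria above could be sketched in Python as follows. This is only an illustration under stated assumptions: PersonPage, FamilyPage, and the three functions are hypothetical names, not part of the WeRelate API, and the flags (has_other_info, last_edit_comment) stand in for whatever the bot would actually read from each page. Because the person rule and the family rule refer to each other, the sketch assumes that "living" in the family rule means a page that meets the other three person criteria.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(eq=False)  # eq=False: compare by identity, safe for cyclic links
class PersonPage:
    # Hypothetical summary of a Person page; not the real page format.
    given_name: str
    has_other_info: bool = False       # anything beyond surname and gender
    last_edit_comment: str = ""
    families: List["FamilyPage"] = field(default_factory=list)


@dataclass(eq=False)
class FamilyPage:
    # Hypothetical summary of a Family page.
    has_other_info: bool = False       # anything beyond links to people
    members: List[PersonPage] = field(default_factory=list)


def is_empty_living(person: PersonPage) -> bool:
    """Person criteria that depend only on the person's own page."""
    return (person.given_name == "Living"
            and not person.has_other_info
            and person.last_edit_comment == "gedcom upload")


def family_deletable(family: FamilyPage) -> bool:
    """Only links to people, and all or all but one of them are living."""
    if family.has_other_info:
        return False
    not_living = sum(1 for m in family.members if not is_empty_living(m))
    return not_living <= 1


def person_deletable(person: PersonPage) -> bool:
    """Empty Living page whose family pages would all be deleted too."""
    return is_empty_living(person) and all(
        family_deletable(f) for f in person.families)
```

Under this reading, a Living page linked to a family with two or more non-living members is left alone, since that family page would be kept.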