MySource talk:Inactive or Deleted Data

Deleted/Inactive "MySource" Pages [24 June 2017]

The following discussion was extracted from a portion of the Wanted Pages topic held up to this point relating to deleted and inactive MySource pages at User_talk:Cos1776 in order to show the history of this issue pertinent to creation and editing of this page.

I've noticed that among the top dozen Wanted Pages now are multiple MySources that were Speedy Deleted by one of the system administrators, but are still referenced in the linked person pages as source citations for the fact events entered. For the multiple person pages I checked, those linked person pages are still active, but because the MySource was deleted, the source citation now shows in red as containing no information because of the deletion action. So now the red-linked MySource becomes a Wanted Page.

So this brings up some questions in my mind:

  1. Were the referenced and imported GEDCOM files (identified as MySources) deleted because the data file itself was deleted (or removed) at the importer's request? Or is this a maintenance and clean-up action conducted independently by the system administrator?
  2. Relating to the SysAdmin action, is having only one user watching a MySource considered adequate and justifiable reason to Speedy Delete the source?
  3. Once deleted (if the action is considered valid), should it then show up as a Wanted Page?

Please opine or advise. --BobC 17:09, 15 June 2017 (UTC)

I think you have to look at each case individually to find the answers to why a MySource page was deleted and whether or not it was the appropriate choice. Since most of these date back to the beginning, I would venture a guess that either the Admins didn't understand the process or there might have been a decision made way back when to use the Wanted pages list to flag pages in need of further editing to remove the undesired citation links. Nowadays, I examine all Sources coming in and exclude those that only refer to personal trees or files, so these types of MySource pages and links are not generated. However, I cannot speak for all of the reviewers, so some might still be coming in. --cos1776 19:49, 15 June 2017 (UTC)
I don't fault the sysadmins or volunteers for doing what they needed to do to perform maintenance and clean-up. I think it is a system fault that allows deletion of information that connects to other data or pages. The system then moves the deleted link to the "Wanted Pages" list because it connects to other pages without having its own content. --BobC 18:05, 22 June 2017 (UTC)
Interesting thought! I could imagine either something like a pop up that warns you if the page you are about to delete still has links to it or the ability to remove all links at once from each affected page. The latter would solve the issue, and I think it is what some admins think is happening when they delete a MySource. Either that or, as I have wondered, they may have been told to delete MySources in this way, specifically so that they would appear on the Wanted pages list. I don't know the answer, so I asked an admin who has been around for a while. I am awaiting a response. --cos1776 23:07, 22 June 2017 (UTC)
Re: #2 - I would say that the decision to delete any page should be based on the circumstances and not on the number of watchers. I have deleted some pages with multiple watchers if it came in via GEDCOM a long time ago, there were no sources, the watchers are inactive, and/or it is too far back, etc. --cos1776 19:49, 15 June 2017 (UTC)
Understood. I agree that the circumstances should take priority over the number of "watchers," but the level of activity of users watching these pages should be considered also. Relating specifically to GEDCOMs, the MySource Page Portal states: "When you upload a GEDCOM, all of your sources with title fields are listed as MySources, whether or not they meet the above criteria. You have the option when you review your GEDCOM to link your sources to the community Sources and to edit your MySource pages." So that tells me that GEDCOMs are and should be considered valid MySources. --BobC 18:05, 22 June 2017 (UTC)
I think you may be reading too much into that quoted sentence which likely dates to before the auto-matching program was in place. I think it is simply explaining how the program used to work. Before being updated, all matching had to be done by hand. This can be a tedious task, so many people took the easy way out and simply left all of their sources as MySources. This is why so many older trees only cite MySources. It is also true that in the beginning, the admins doing the reviewing were hesitant to exclude any sources, regardless of the quality. That is why so many older trees cite things like personal GEDCOM files, Ancestry Family Trees, IGI, etc. Nowadays, the review program will do its best to auto-match the incoming sources to our existing source database first with relatively good success. Whatever it can't match gets left to the user (preferred) or to the reviewer (usually) to decide if it can be matched, if it should remain as a MySource, or if it should be excluded from entry. Citations to personal GEDCOM files are usually denied now, so there is no need to remove them page-by-page later. --cos1776 23:07, 22 June 2017 (UTC)
Re: #3 - The best way to stop the creation of undesired MySource pages or citation links to them is to catch them before they are imported. If they make it in, there is currently no other way to resolve them, except to manually edit each page as we do now. Once all links are resolved, the MySource page will automatically disappear from the "Wanted pages" list, because there are no longer any other pages trying to link to it. I hope this makes sense? --cos1776 19:49, 15 June 2017 (UTC)
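The behaviour described above can be illustrated with a minimal sketch. It assumes that a title is "wanted" when at least one page links to it and no page with that title exists. The function name, page titles, and data layout are hypothetical, chosen only for illustration; this is not the wiki software's actual implementation.

```python
from collections import Counter

def wanted_pages(existing_pages, links):
    """Return {missing_title: incoming_link_count} for link targets with no page.

    existing_pages: set of titles that currently exist (hypothetical data).
    links: list of (source_title, target_title) pairs (hypothetical data).
    """
    counts = Counter(
        target for _source, target in links if target not in existing_pages
    )
    return dict(counts)

# Two person pages still cite a MySource whose page has been deleted.
existing = {"Person:A Smith (1)", "Person:B Smith (2)"}
links = [
    ("Person:A Smith (1)", "MySource:Example GEDCOM"),
    ("Person:B Smith (2)", "MySource:Example GEDCOM"),
]
before = wanted_pages(existing, links)

# Manually editing each person page removes the citation links one by one.
links_after_cleanup = [
    (source, target) for source, target in links
    if target != "MySource:Example GEDCOM"
]
after = wanted_pages(existing, links_after_cleanup)

print(before)
print(after)
```

In this model the deleted MySource stays on the list for as long as any page links to it, and drops off by itself once the last link is edited away, without any separate action on the list.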
Had to read this one a few times to digest and attempt to comprehend it. Not sure I do in full, but thanks for your response. --BobC 18:05, 22 June 2017 (UTC)

Okay, I had to think about it awhile, but I came up with a way to remove the deleted MySource page links from the Wanted Pages listing. I redirected those old MySource pages to a new page entitled MySource:Inactive or Deleted Data. Hope that helps clean up the listing without affecting the ability to restore those individual pages (if ever necessary or desired) or to further manipulate the person pages for which the deleted files were cross-referenced. --BobC 17:40, 22 June 2017 (UTC)

I'm not sure this is the best approach. Unless there is a second step planned to fix the links, you are just moving the problem to a different page that most admins won't know about. Clearing page titles off the Wanted pages list is not the primary goal. The primary goal is to ensure that data is entered into the correct data field and in an acceptable format. If we resolve the bad data entries, the Wanted pages list clears itself.
Creating (or undeleting) and then redirecting page titles does not solve the problem of bad data entries on each linked page. This is especially true when redirecting Place page titles, because it lessens the efficiency of our search engine. (ex. click "Browse Smith in Neshoba" (left menu) for Person:Abigail Smith (55) - her page does not get returned because the search engine is looking for the redirect and not for the original entry that is still in her Place field).
For MySources, the concern is a little different. If it is not a valid source, then I don't think we should restore the link to a page that says that it may not be a valid source. That seems like an unnecessary middle step. Why is that better than removing the undesired citations from the linked pages? I get that it is easier, but it doesn't really solve the problem. --cos1776 23:07, 22 June 2017 (UTC)
I understand what you're saying. While it may not have been the literally correct approach or the ideal solution, at this point it may be better, and simply more feasible, than tackling that "second step" you mention, which would be to individually evaluate and edit all 5,108 links related to these 17 deleted pages. Those links were of no concern to anyone besides the handful of us who monitor some of these lesser-known special pages and perform maintenance on the problem areas they identify, since the pages they point to seem to have been abandoned by the folks who submitted them.
While the process I used may have appeared to do so in an indirect way, the function I performed did not really restore the pages. It only redirected the deleted source pages to one page that reflects the deletion action previously performed (which may or may not have been appropriate and functional, since the pages then appeared in the Wanted Pages listing, and which may actually have been valid MySource pages at the time they were created, as previously stated) and explains the systemic processing and linking problem inherent in the system. Only removing the 5,108 undesired source citations from all of the linked person pages would have truly allowed complete deletion of the MySource pages. Are any of us prepared to do that?
Appreciate your feedback. Thanks again. --BobC 03:48, 23 June 2017 (UTC)
I know it is tedious work, but there are a handful of dedicated users who are actively pecking away at this all the time (even tracking the progress). Absent a programming improvement to make this process easier, we have to work with the current hand we are dealt. Since folks usually go to the Wanted pages list to see what needs to be done, let's see if we can come up with a way to "redirect" them now to the pages you have created, MySource:Inactive or Deleted Data and Place:Inappropriate placename usage (any others?), so they don't miss the ones that have been removed from the list.
For the Place name redirects, I would encourage you to either put them back onto the list (remove redirect and delete the page) or print out/screenshot the current list now, before the titles disappear this weekend, so that you/we can know which ones still need to be addressed. I made this same mistake initially with redirecting some Place names, so I have my own little list that I am slowly working to resolve - mostly Norwegian and German places.
I try to remain optimistic that we can eventually have a better method to fix the old bad links and prevent new ones from being entered. I have also put in a suggestion for an update to the left menu Browse search, so that we will be able to simply redirect the bad Place names. Please consider voting for this suggestion if you agree. Thanks, --cos1776 11:38, 23 June 2017 (UTC)--BobC 23:27, 24 June 2017 (UTC)

Next step? [16 July 2017]

BobC - how would you like to handle the next step in this process? I wasn't sure if you had already reviewed the MySources in this list to know if there were any whose citations should be retained or if they were all candidates for removal. I can clear this list if I know the following:

  1. MySource title
  2. For each citation that currently links to this MySource, do you wish to retain the citation on each page as a "Citation only" (no link) or to remove it completely from the page?
  3. Once step 2 is complete, the MySource page will no longer have any other pages linking to it, so should it be retained or deleted?
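As a rough illustration of the three decisions listed above, the sketch below models them on in-memory data. The classes, field names, and example titles are all hypothetical, and nothing here reflects how the site actually stores citations or how an admin tool would edit them.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Citation:
    text: str
    mysource: Optional[str] = None  # linked MySource title; None means "Citation only"

@dataclass
class PersonPage:
    title: str
    citations: List[Citation] = field(default_factory=list)

def resolve_mysource(pages, mysource_title, keep_as_citation_only):
    """Step 2: unlink or remove every citation to mysource_title.

    Returns the number of citations that were changed or removed.
    """
    touched = 0
    for page in pages:
        kept = []
        for citation in page.citations:
            if citation.mysource != mysource_title:
                kept.append(citation)  # unrelated citation, leave as is
                continue
            touched += 1
            if keep_as_citation_only:
                # Retain the text but drop the link to the MySource page.
                kept.append(Citation(text=citation.text, mysource=None))
        page.citations = kept
    return touched

def remaining_links(pages, mysource_title):
    """Count citations that still link to mysource_title."""
    return sum(
        citation.mysource == mysource_title
        for page in pages
        for citation in page.citations
    )

# Step 1: the MySource title to resolve (hypothetical example data).
title = "MySource:Example GEDCOM"
pages = [
    PersonPage("Person:A Smith (1)",
               [Citation("GEDCOM file", title), Citation("1850 census")]),
    PersonPage("Person:B Smith (2)", [Citation("GEDCOM file", title)]),
]

touched = resolve_mysource(pages, title, keep_as_citation_only=False)

# Step 3: once nothing links to the page, retaining or deleting it is a free choice.
no_links_left = remaining_links(pages, title) == 0
print(touched, no_links_left)
```

Passing `keep_as_citation_only=True` instead would keep the citation text on each person page while still leaving no links to the MySource page.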

These particular ones had already been deleted once, so we can guess that is the choice for them, but going forward, I could envision that we could apply a form of this process to cleaning up all of the old citations to GEDCOMs, ex. [1]. For those, it is still important that we examine each case first to decide the best approach. I would still be willing to bet that most can be completely removed, but every once in a while there is one with some useful information in the citations.

Anyhow, would you like to make the call for those that currently appear on your list, or would you like for me to just start removing the citations? Regards, --cos1776 11:57, 5 July 2017 (UTC)

My apologies. I am a little tired this morning after a busy holiday weekend. In re-reading both of our comments above, I think we have already covered the allocation of items on this list and we all agreed their citations could be removed. I will proceed with this task, but will hold off starting for a little bit, in case you object. Thanks, --cos1776 12:35, 5 July 2017 (UTC)
Follow up - all MySource citations which used to link to the pages in the list have now been resolved. Most citations were removed as they only cited personal GEDCOM files, but some were converted to a regular Source citation or a "Citation only" if I thought the text should be retained. After the links were resolved, the corresponding MySource pages were deleted.
Going forward, I think we could continue to use a system like this to log MySources that should be regular Sources, but it is probably not necessary to separately log MySources whose citations should simply be removed (like the GEDCOM file names with no further info). The previous system of flagging them by putting them on the Wanted pages list (i.e. deleting the MySource page first) could remain as is. It would be nice if the admins would log a reason for deletion however, so there is a record. What are your thoughts on this? --cos1776 01:46, 17 July 2017 (UTC)