WeRelate talk:Duplicate pages patrol


Willing to Serve on the Duplicate Pages Control Committee [20 October 2014]

  • Jillaine 14:16, 13 August 2012 (EDT)
  • I'm willing to give it a try.--GayelKnott 19:24, 13 August 2012 (EDT)
Jillaine and Gayel, great! I'll add your names to the main patrol page. --Jennifer (JBS66) 09:12, 14 August 2012 (EDT)

I'll be working on these. Besides simple duplicates, there seem to be a fair number of messed up relationships related to the entries on the duplicate report. --Robert.shaw 18:46, 20 August 2012 (EDT)

Thank you! I'll add your name to the project page. --Jennifer (JBS66) 06:48, 21 August 2012 (EDT)

I would like to serve on this committee. --Beth 21:34, 26 October 2012 (EDT)

Great! I've added your name to the main page. --Jennifer (JBS66) 09:52, 27 October 2012 (EDT)

I WOULD LIKE TO HELP--Shly45 13:42, 20 October 2014 (UTC)

Do/Should we be co-ordinating? [23 August 2012]

I've tried a few and have several questions.

The first is, with three of us working on the same page, should we be making any attempt to co-ordinate our efforts?

The second is, what do the different categories mean? For instance, "Multiple parents A" and "Multiple parents C"? I'm sure there is something obvious, but I haven't spotted it yet.

The third is, how drastic do we get with the merging, and then the later cleaning up? On the few I've done so far I've tried to walk a middle-line, maybe leaning a bit towards letting it be if there is a potential question, but is that appropriate?

Thoughts, suggestions, anyone?--GayelKnott 22:30, 20 August 2012 (EDT)

Good questions. I've just started on WeRelate myself so haven't a good perspective yet. I think coordination is not a critical issue right now since few people seem to be working on this. The chance of hitting the same entry at the same time seems low, and if you see one someone else fixed (before a new report is generated), you probably wouldn't waste too much time on it before seeing things are ok.
The report page seems to have, from my observation, three groups of entries, in order:
  • Simple duplicates -- uncomplicated cases where the program thinks there might be a match
  • "Multiple parents <letter>" -- cases where a person is a child in two or more families
  • "Multiple spouses <letter>" -- cases where a single family has more than two spouses included in it (e.g. husband + 2 wives)
The "<letter>" is just a grouping based on the first letter of one of the persons' names in the case. --Robert.shaw 13:53, 22 August 2012 (EDT)
Thanks. As for the "<letter>" being a grouping based on the name of one of the persons in the group, that's what I thought at first, but it doesn't seem to always be true. Mostly I was just curious.--GayelKnott 22:56, 23 August 2012 (EDT)
I just checked all 6 links under "Multiple parents I", only 2 of which have an "I" name in the link title; all 6 have a child with a name beginning with "I". So at least this case did show that pattern.
I'm concentrating on the sections in the last part of the alphabet of "Multiple parents" and "Multiple spouses", so if you're concerned about collisions you can avoid those sections. --Robert.shaw 00:06, 24 August 2012 (EDT)
Okay. I wasn't worried about collisions so much as just thinking we should spread out. I started at the end of the alphabet for duplicate families and will work my way up.--GayelKnott 02:02, 24 August 2012 (EDT)
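
As a rough illustration of the report groups described in this thread, here is a minimal Python sketch. The data shapes and names (`Person`, `Family`, `multiple_parents`, `multiple_spouses`) are hypothetical and are not WeRelate's actual code. Grouping by the child's first initial follows the pattern observed above for "Multiple parents I" and is only an assumption for the general case.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Person:
    name: str
    # Titles of the family pages in which this person appears as a child.
    child_of: list = field(default_factory=list)


@dataclass
class Family:
    title: str
    husbands: list = field(default_factory=list)
    wives: list = field(default_factory=list)


def multiple_parents(people):
    """Group people who are a child in two or more families by first initial."""
    groups = defaultdict(list)
    for person in people:
        if len(set(person.child_of)) >= 2:
            # Assumed grouping rule: first letter of the child's name.
            letter = person.name[:1].upper() or "?"
            groups[letter].append(person.name)
    return dict(groups)


def multiple_spouses(families):
    """Return titles of families that list more than two spouses in total."""
    return [f.title for f in families if len(f.husbands) + len(f.wives) > 2]
```

A family with a husband and two wives would be returned by `multiple_spouses`, and a person listed as a child on two different family pages would appear under their initial in `multiple_parents`.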

Duplicate Dutch families [24 August 2012]

In case you find a Dutch duplicate (of Dutch people after 1800), I would prefer that you leave the merging to Jennifer or me. This is because Dutch family names are not always recognised as such; they might be patronymics. I am not involved in the early Dutch settlers so I leave it up to those who wish to check them.--Klaas (Ekjansen) 02:43, 24 August 2012 (EDT)

Okay, I'll proceed with caution on the Dutch families. (Unless they're mine, of course.) --GayelKnott 12:16, 24 August 2012 (EDT)
Sure, your Dutch relationships you will watch and merge yourself.--Klaas (Ekjansen) 18:06, 24 August 2012 (EDT)

Pages that lack information necessary to determine if the pages should be merged [5 November 2012]

We need a method to tag pages that we have reviewed, but are not able to merge due to the failure of the uploader to include vital information, such as dates, places, etc. If any of these pages are updated, it would be fantastic if we could be notified. --Beth 21:38, 29 October 2012 (EDT)

Good idea. I agree.--GayelKnott 18:30, 30 October 2012 (EDT)

I was surprised to see so many empty profile pages, i.e., no dates. I thought the GEDCOM upload tool prevented that from happening. Jillaine 01:20, 31 October 2012 (EDT)

It appears that many of the poor pages in the Family Duplicates Report were uploaded years ago. My guess is that the upload tool back then didn't sanity-check the GEDCOM files. The site continues to suffer from the effects to this day. --Robert.shaw 12:51, 31 October 2012 (EDT)

So this makes me wonder: wouldn't it be better to figure out a way to global delete empty profile pages? Jillaine 13:35, 31 October 2012 (EDT)

The "empty" pages don't bother me so much as the pages that seem to have been messed up by someone trying to clean up a problem. (It's easy enough to do, especially late at night.)

For example, William Richmond and Mary Kaylor (1) and William Richmond and Mary Kaylor (2). Looking at the histories, at least some of the confusion seems to have come from cleaning up small problems, along with an attempt to connect people in ways that really didn't work at all. This is not the first example I've found, just the most recent. I usually just leave messages on the talk pages suggesting that there seem to be problems, although perhaps the messages/warnings should be left on the main page.

Unfortunately, the original contributors usually seem not to be paying attention to changes, although I have had one contributor respond with changes. --GayelKnott 00:01, 1 November 2012 (EDT)

--Beth 22:24, 3 November 2012 (EDT)

Bulk of Remaining Dupes need Research [3 November 2012]

Very few of the remaining dupes are "simple"-- i.e., obvious. They seem to reflect disagreement about parentage requiring research by those familiar with the pertinent lines. I'm not sure how much more this "committee" can do. Jillaine 14:38, 31 October 2012 (EDT)

Actually, there are always new dupes every week, some of which actually can be merged. But I agree, the bulk of the dupes seem to be more convoluted.--GayelKnott 00:04, 1 November 2012 (EDT)

Is there a way to distinguish new from old dupes?--Jillaine 07:10, 1 November 2012 (EDT)

There is a suggestions item: Tweak to Duplicates Report. I think this would be helpful for this committee. If others agree, please watch that suggestions page, add your comments, and I can talk to Dallan about adding this to his Roadmap. --Jennifer (JBS66) 07:18, 1 November 2012 (EDT)
I already signed up for that suggestion; however we do not know when the suggestion will be implemented. I suggest that we create a template for duplicates that one cannot make a decision on and use this to streamline the list. I have no known method of finding the new duplicates. How many new duplicates occur on average per month? --Beth 00:44, 3 November 2012 (EDT)

--- Is there a "More Research Needed" Template? Seems to me I've seen one someplace, but don't remember where. Wherever possible, I think it's a good idea to encourage people to assume some responsibility for the information they post. I'm starting to leave more messages on talk pages asking about possible duplicates (which also serves as a tentative warning to others). In the case of Simon Newcombe and Clara Conrad and James Newcombe and Clara Conrad, where it appears that the poster has just uploaded conflicting information from Ancestry.com, it would be nice to have a template indicating that these pages need more work. --GayelKnott 18:33, 3 November 2012 (EDT)

Hi Gayel, I understand your proposal but I think that a "more research needed" template is not specific to the problem. That template could probably be placed on every page. There is always more research needed on all of my projects. One rarely has exhausted all of the possible research on a person or family.
I agree all pages could potentially have more research, and I think most of us have pages where there are real questions, but the examples above (which were uploaded fairly recently) were particularly bad, and not the only ones he had uploaded. There was no reason for the duplicates to exist other than that he was just regurgitating bad research from elsewhere. That's where some sort of template would be useful. This maybe should have been a different topic, I don't know.--GayelKnott 15:27, 5 November 2012 (EST)

My focus is to improve upon the data on the duplicate page to eliminate everyone reviewing the same pages. Yes, I can look at all of the pages and change them to red but that is a total waste of my time. I am sure that some of you have already reviewed them. We need to kick the reviewed pages out by some template. The pages will be linked to the template so if there is an improvement made later, one may add them back. Any suggestions are appreciated. I could just get rid of them by adding them to the no merge template but editing with a note that one may merge when pertinent information is added.

I do realize you are concerned about repeated "reviews" of duplicate pages. So far, I mostly just let the blue/purple colour of the link tell me whether or not I've already checked it. Since I normally only do a check about once a week, when I can take a reasonable chunk of time to work on the pages, I do see a few new ones every week, and can go straight to them.
As for your suggestion of a "Do Not Merge", that's certainly one possibility. One of the problems I have is not knowing how "intrusive" we should be, and I suspect there is no way to standardize that -- even for one person.--GayelKnott 15:27, 5 November 2012 (EST)
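
On the question raised in this thread of telling new dupes from old ones: one possible approach, sketched here in Python, is to compare two saved copies of the report. This is only an illustration under stated assumptions. The function name `new_duplicates` and the representation of each report as a list of page-title pairs are hypothetical, and nothing here reflects how the WeRelate report is actually produced.

```python
def new_duplicates(previous_report, current_report):
    """Return candidate pairs in the current report that were absent from the previous one.

    Each report is assumed to be a list of (title_a, title_b) tuples.
    Pairs are compared without regard to order, so (A, B) equals (B, A).
    """
    seen = {frozenset(pair) for pair in previous_report}
    return [pair for pair in current_report if frozenset(pair) not in seen]
```

Keeping last week's list and running it against this week's would leave only the pairs that have not been looked at yet, which is roughly what the link colour does for a single reviewer.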

How often is Duplicate Families Page regenerated? [11 November 2012]

When I have time in a given day, I visit the duplicate families page and work away at a few dupes. I'm seeing all of my previous click-throughs still there-- at least they're purple. How often is the list generated such that those dupes that have been merged are no longer on the list? Thanks. Jillaine 12:34, 1 November 2012 (EDT)

It should be getting re-generated every day. Can you give me an example of duplicates that have been merged but are still showing up?--Dallan 07:31, 11 November 2012 (EST)

Vote [21 November 2012]

We need to make a decision.

So vote about what to do about the pages.

Votes are for

Do Nothing

Add to the do not merge template with a comment

Create a new template

Or other suggested option

--Beth 21:44, 9 November 2012 (EST)

Beth I assume you are talking about pages that have no info on them? I say delete 'em. Jillaine 06:23, 10 November 2012 (EST)

Yes Jillaine, the pages that have no dates, etc. so one cannot determine if they are duplicates. So y'all vote and I guess we should submit our proposal to the committee. Not sure how this new administrative system works.--Beth 08:48, 10 November 2012 (EST)

I'd be willing to go along with a merge of pages with no information with a template added to the talk page noting that a merge has occurred. (Not everyone just starting to use WeRelate understands merges and that they can be undone.)

I'm a lot more hesitant to delete pages with minimal information -- some of them are obviously just bad reporting, but others are justified for one reason or another. I also would not be willing to merge other pages where it looks like maybe there is a match just to save us time and energy. For one thing, I've seen pages where that was done in the past and the result was an even worse mess than before, which would take a lot of work to figure out and undo, even for someone experienced with WeRelate.--GayelKnott 14:31, 10 November 2012 (EST)

just to be clear: when I say I vote to delete the pages, I'm talking about pages that-- except for the name-- are otherwise completely blank of data-- no birth, death or marriage info. Jillaine 17:35, 10 November 2012 (EST)
I would like us to have a template that says "May be a duplicate of xx". I could imagine many situations where that might be useful - as more information comes to light you can then move to a more definite conclusion, and merge, or alternatively conclusively establish that they are two separate people (in which case a template saying "definitely not xx" might also be helpful where people have in the past confused the two people). AndrewRT 16:31, 17 November 2012 (EST)
So we'd put this "may be a duplicate" template on pages where we couldn't conclusively say yes or no, in order to remove the duplicate from the list? That seems like a good idea to me. What do others think?--Dallan 12:44, 20 November 2012 (EST)
I vote for the "may be a duplicate" template. --Beth 13:02, 20 November 2012 (EST)
I recommend improving the "Compare pages" page to have a "Needs more data" button as well as a "Not a match" button. The new "Needs more data" (or whatever) button would cause a new "may or may not be a duplicate" template to be put on the Talk page of the family or person, just like "Not a match" now puts a {{nomerge}} template there. Presence of either template would keep the potential match pair off of the duplicates report. The new template would have different phrasing for humans, indicating that more research is needed to decide if the two pages are indeed duplicates or not. --Robert.shaw 16:23, 20 November 2012 (EST)
I like "Needs more data" better than "May be a duplicate".--Dallan 23:35, 20 November 2012 (EST)
re "So we'd put this "may be a duplicate" template on pages where we couldn't conclusively say yes or no, in order to remove the duplicate from the list?", I vote yes. Jillaine 06:03, 21 November 2012 (EST)
I'm willing to go with either a "may be a duplicate" template or a "Needs more data, may be a duplicate" template/button. Not sure what potential problems the latter might cause, but it's worth a try. --GayelKnott 14:56, 21 November 2012 (EST)
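
The proposal above, that the presence of either template on a talk page would keep a pair off the duplicates report, can be sketched roughly as follows in Python. This is an assumption-laden illustration, not WeRelate's implementation: `nomerge` is the template named in the discussion, but `needsmoredata` is a hypothetical name for the proposed template, the `{{template|Other page title}}` syntax is assumed, and `filter_report` and `mentions_template` are made-up helpers.

```python
import re

# "nomerge" comes from the discussion; "needsmoredata" is a hypothetical
# name for the proposed "Needs more data" template.
SUPPRESSING_TEMPLATES = ("nomerge", "needsmoredata")


def mentions_template(talk_text, other_title):
    """True if the talk text has a suppressing template naming other_title.

    Assumes the syntax {{template|Other page title}}.
    """
    pattern = r"\{\{\s*(%s)\s*\|\s*%s\s*\}\}" % (
        "|".join(SUPPRESSING_TEMPLATES),
        re.escape(other_title),
    )
    return re.search(pattern, talk_text, flags=re.IGNORECASE) is not None


def filter_report(pairs, talk_pages):
    """Drop pairs where either talk page carries a template naming the other page.

    pairs: list of (title_a, title_b); talk_pages: dict of title -> talk text.
    """
    kept = []
    for a, b in pairs:
        if mentions_template(talk_pages.get(a, ""), b):
            continue
        if mentions_template(talk_pages.get(b, ""), a):
            continue
        kept.append((a, b))
    return kept
```

Because both templates are handled the same way by the filter, the difference between "Not a match" and "Needs more data" would lie only in the wording shown to human readers, as suggested above.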

Serving on Duplicate Pages Patrol [20 December 2013]

I would be glad to work on this project. I think it is wonderful that this site does not want duplicate profiles and family pages. I am retired and spend most mornings online or on my computer. I have plenty of time and opportunity to put in on this.--Ila123 12:59, 2 April 2013 (EDT)

Wow - I just now noticed this. I've left a message for Ila123 asking if they still want to serve.--Dallan 18:32, 10 December 2013 (UTC)

Is there a place for an occasional participant? I'm still working on my own family tree, but every once in a while I need a break from it, and "dropping in" to do a few merges gives me a nice break. I have done this in the past. Should I just keep doing this on a "drop in" basis without becoming part of the official patrol?--DataAnalyst 21:08, 20 December 2013 (UTC)

Here's another volunteer [31 March 2014]

After seeing a competing site turn into a mess, with weird naming rules, I decided to return to WeRelate after some years of silence. I found this page, and the duplicates report attached to it, and already did a few merges.

Since this is an open site, it looks like I don't really need permission for merging, but this page suggests that it may be better to talk anyway, so here I am. I did loads of merges on that other site, but with the mess only increasing, I gave up.

Two questions first:

  1. Are there any pages you think I should read before going on with more merges than I already did?
  2. Is there a way to mark merge candidates as too vague or confusing to decide upon?

Thanks, --Enno 20:03, 31 March 2014 (UTC)

Hi, Enno. Always good to have more help. To answer your questions, the only reading that I know of is Help:Merging pages, which you may already have seen. And no there is no way to mark questionable pages, although it was talked about a while back. --GayelKnott 03:19, 1 April 2014 (UTC)
Regarding your second question:
Most of what remains to resolve is people with 2 (or more) sets of parents, or marriages with 3 (or more) spouses (in the same marriage). When I could not resolve duplicate parents, I added a section called Lineage. Here is an example of the notes I added to such a page: Johanna Ketner. In this case, I leaned in one direction, but in some cases you might not have enough info to even do that.
Hope this helps, and welcome to the duplicates brigade (although I am an informal member as you have been). --DataAnalyst 03:24, 1 April 2014 (UTC)

Committee Roll Call & Update [18 October 2016]

Hello - I am in the process of updating the information on the status of our admin structure and maintenance committees. The members of this committee are currently listed as:

  • Dallan, Liaison to the Overview Committee
  • Jillaine
  • GayelKnott
  • Robert.shaw

Please respond here to let us know that you are still active on this committee and whether or not you wish to continue in this capacity.

To help us quantify the work that is being done, please include a brief list of the tasks that you perform most frequently and an estimate of the average amount of time per month that you currently spend on these tasks.
Thank you in advance for your help, --cos1776 13:41, 17 October 2016 (UTC)

Still active. Think I'm probably the only one. - 17:21, 18 October 2016 GayelKnott
Haven't been active since the big backlog due to early Gedcoms was worked off. Seems like most entries now are either temporary ones due to incomplete recent activity or unresolvable cases of long standing. I'm not inclined to continue being listed on this team. --robert.shaw 19:47, 18 October 2016 (UTC)