WeRelate talk:Duplicate review

The duplicate review project has ended. The vast majority of duplicate families, though not all, have been merged. The remainder will be merged as they are discovered through normal usage. THANK YOU to everyone who participated!

If you're interested in helping to merge pages, please let people know on this page. I'd like to get a group of people together so we can decide together how best to tackle this problem. I'm open to suggestions on the best way to proceed.

Count me in. --Beth 19:42, 16 October 2008 (EDT)

Let me think... Yes. --Amelia 21:57, 16 October 2008 (EDT)

I'll help. I'm working on my own trees right now, but I should be a pro at this soon. --jillaine 15:06, 19 November 2008 (EST)

Topics


Merge with three families does not seem to work [20 October 2008]

I tried to merge the following but it did not work.

  • Tommy Coker and Anne Camberdella (1)
  • Tommy Coker and Anne Camberdella (2)
  • Tommy Coker and Anne Camberdella (3)

--Beth 00:11, 18 October 2008 (EDT)

Never mind; it works. I just did not know how to do it. You can't merge them all at once. <g>--Beth 07:42, 19 October 2008 (EDT)


You should be able to merge several families at once. There are two ways to merge families: click on one of the lines in the duplicates files listed on WeRelate:Duplicate review, or click on "Find duplicates" in the "More" menu when you're looking at a family page, then check the boxes next to the families you want to merge. Either way takes you to the Special:Compare screen. On this screen, check the boxes at the top of the page above each family that you want to merge. If you check the boxes above multiple families, then you'll merge all of the families at once.--Dallan 12:10, 20 October 2008 (EDT)


Thanks Dallan, it does work; must have just messed up on my initial attempt to merge multiple families. Sorry about that.--Beth 17:46, 20 October 2008 (EDT)

Procedure when active users have duplicate pages [3 March 2009]

Hello everyone,

The first name on the C list has a duplicate page from an active user; so I left Scot a message on his talk page regarding the duplicate. I figure active users will probably prefer to do their own merges. What is the guideline for this? --Beth 07:57, 18 October 2008 (EDT)


I was just conversing with my husband about this! It's not that I don't want somebody else merging pages I created, I just feel that I have resources at my disposal to make sure the merge is accurate. Perhaps we could break the Duplicates Report down further - maybe by country or active user? Then I could easily see which families I would have more knowledge to tackle. Also, I know that I've personally contributed to a few hundred of the duplicates. I uploaded my Gedcom in large chunks - my husband's side, and my side. It's amazing how many ancestors we have in common! I've been merging them as I find duplicates, and would like to continue to work to merge into other users' pages.--JBS66 08:23, 18 October 2008 (EDT)


How about if I create a screen where you can view a list of families that you are watching that are also in the duplicate list?--Dallan 12:10, 20 October 2008 (EDT)


Wonderful, Dallan; you can do just about anything. Love it!--Beth 15:55, 20 October 2008 (EDT)


I think this will be valuable. It will be nice to have on one screen the duplicates that I am watching - easier than going family by family as I've been doing thus far! Thank you!!!--JBS66 16:17, 20 October 2008 (EDT)


This is done. Check out Special:ShowDuplicates. This feature is also available from the MyRelate menu. Also, you can view duplicates for any user, not just yourself. I'll put out a news item about this feature tonight or tomorrow.

Also, 97 users have more than 100 duplicate families in their watchlists. I'm going to start writing these users to ask them to help with the merging.--Dallan 17:34, 20 October 2008 (EDT)


Hello Dallan, the above does not seem to work for me, so I noticed today (doing the Kenz list); all I get is 'find duplicates' and from that I can compare to individuals; user pages only give 'contributions', no such thing as a list of all possible doubles. The list of doubles I got through your link, constantly referred me back to my own pages. Does it work in Europe or do we need an adaptor :):). Yours, Leo--Leo Bijl 15:35, 3 March 2009 (EST)


The Special:ShowDuplicates shows your own duplicates. If you want to see another user's, and you know their username, you can use the same link, but add /USERNAME after it, like we did for Kenz. If you want to see ALL potential duplicates, that's here.--Jennifer (JBS66) 15:54, 3 March 2009 (EST)


Lawson Aday [20 October 2008]

The family of Lawson Aday has three possible duplicates on the duplicate page; however only 2 and 3 are duplicates. One cannot merge only 3 to 2. How do we fix these pages?--Beth 07:52, 19 October 2008 (EDT)


Here's one way around it: From the Duplicate families report

  • click on the link for Lawson Aday & duplicates
  • click link for Lawson Aday & Living Unknown (3)
  • choose More --> find duplicates
  • click in the checkbox for only (2) and Compare
  • This link should also do it: [[1]]
    This link didn't end up working--JBS66 08:13, 19 October 2008 (EDT)
Brilliant; worked like a charm. Thanks.--Beth 08:58, 19 October 2008 (EDT)

Hmm, the above link should have worked also.--Dallan 12:10, 20 October 2008 (EDT)


Duplicate sources [19 November 2008]

Do I need to remove the duplicate sources manually or does the merge adjust the new source numbers to cite the correct event?--Beth 08:54, 19 October 2008 (EDT)

Okay, I tested this. For now you need to remove duplicate sources manually. I did not merge a duplicate source and you not only lose the source but the citation as well. --Beth 09:12, 19 October 2008 (EDT)


The merge renumbers source citations so that if an event on a merging page refers to a citation S1, you decide to keep both the event and the citation, and the citation becomes S3 on the merged page, then the merged event will refer to S3 in the merged page.

On the other hand, if you keep the event but remove the citation from the merging page, then the event won't cite anything in the merged page, even if there is a duplicate/similar citation on the merged page. There are currently two ways around this problem:

  1. Keep the citation along with the event on the merging page, and remove the citation from the merge target.
  2. Edit the merged page after the merge is finished to add a reference from the merged event to the duplicate/similar citation on the merged page.

If this situation comes up a lot, I could try to identify citations in the merged page that were the same as the dropped citations and have events reference those citations in the merged page. The problem is identifying when a kept citation is the same as a dropped citation. That is, is "MySource:User A/My Citation" the same as "MySource:User B/My Citation"? I'm open to suggestions on this.--Dallan 12:10, 20 October 2008 (EDT)
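As a rough illustration of the renumbering described above, here is a minimal Python sketch. The function names and the dictionary layout for events are assumptions made up for this example; it is not the site's actual merge code.

```python
def renumber_on_merge(target_count, merging_citation_ids, kept_ids):
    """Map citation ids on the merging page to new ids on the merged page.

    target_count: number of citations kept on the merge target (hypothetical input).
    merging_citation_ids: ids on the merging page, in page order, e.g. ["S1", "S2"].
    kept_ids: the ids the user chose to keep.
    """
    mapping = {}
    next_number = target_count + 1
    for old_id in merging_citation_ids:
        if old_id in kept_ids:
            mapping[old_id] = f"S{next_number}"
            next_number += 1
    return mapping


def remap_event(event, mapping):
    """Return a copy of the event whose citation references follow the mapping.

    References to dropped citations are removed, so the event ends up
    citing nothing, which is the problem case discussed above.
    """
    new_refs = [mapping[ref] for ref in event["sources"] if ref in mapping]
    return {**event, "sources": new_refs}


# The merge target already has S1 and S2; the merging page keeps S1 and drops S2.
mapping = renumber_on_merge(2, ["S1", "S2"], {"S1"})
birth = remap_event({"type": "Birth", "sources": ["S1"]}, mapping)
death = remap_event({"type": "Death", "sources": ["S2"]}, mapping)
```

Here the kept citation becomes S3 and the birth event follows it, while the death event loses its reference because its citation was dropped.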


From now on if two pages have exactly the same source citation, when you merge the pages both citations will be merged into a single citation. Similarly, if one page has multiple source citations that are exactly the same, if you edit and save the page the duplicate source citations will be merged. Eventually we'll modify the GEDCOM upload to keep it from generating duplicate source citations in the first place.--Dallan 20:33, 28 October 2008 (EDT)

Exactly the same down to page and text, or the same based only on the title? I assume the first since I have seen sources multiplied, one for Name, one for Birth and one for Death? However, just want to be sure, because sometimes I cite a source for birth and death separately for clarity... --Jrich 20:50, 28 October 2008 (EDT)

Exactly the same down to page and text (and every other field). So for example if the source citation for a birth was S1 and the source citation with the exact same fields for death was S2, when you saved the page birth and death would both cite S1. The main reason for this is to help remove duplicate source citations in pages created from GEDCOM files. I've seen a number of pages where the exact same citation is repeated multiple times for birth, death, and other events. By having the system merge duplicate citations automatically when it saves a page, then when you come across a page with repeated citations during a merge, you don't have to worry about removing the duplicate ones. You can keep them all and trust that the system will remove exact duplicates.--Dallan 20:30, 29 October 2008 (EDT)
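The exact-duplicate rule described above could look something like the following Python sketch. The data layout (citations as dictionaries with an "id" field, events with a "sources" list) and the function name are assumptions for illustration only, not the real implementation.

```python
def dedupe_citations(citations, events):
    """Collapse citations whose fields are all identical and repoint events.

    Two citations count as duplicates only if every field other than the
    id matches exactly (title, page, text, and so on).
    """
    seen = {}    # tuple of fields -> id of the surviving citation
    id_map = {}  # old id -> surviving id
    merged = []
    for citation in citations:
        key = tuple(sorted((k, v) for k, v in citation.items() if k != "id"))
        if key not in seen:
            new_id = f"S{len(merged) + 1}"
            seen[key] = new_id
            merged.append({**citation, "id": new_id})
        id_map[citation["id"]] = seen[key]

    new_events = []
    for event in events:
        refs = []
        for ref in event["sources"]:
            new_ref = id_map[ref]
            if new_ref not in refs:  # do not cite the same citation twice
                refs.append(new_ref)
        new_events.append({**event, "sources": refs})
    return merged, new_events


citations = [
    {"id": "S1", "title": "Town records", "page": "12", "text": ""},
    {"id": "S2", "title": "Town records", "page": "12", "text": ""},
]
events = [
    {"type": "Birth", "sources": ["S1"]},
    {"type": "Death", "sources": ["S2"]},
]
merged, new_events = dedupe_citations(citations, events)
```

With this input only one citation survives, and both birth and death cite S1, matching the example given above. A citation that differs in any field, such as the page, would be kept separately.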


Sources seem to be an ongoing issue. It appears to me that most source citations from uploaded Gedcoms become mysources when they upload, probably usually a result of differences in syntax. One letter of a different case, a comma instead of a colon, all kinds of little things can prevent the system from differentiating a mysource from a source. As a result I think it mandatory that a page be edited after a merge. Often I cannot guess how to change the text in a source citation to take advantage of the drop down feature and end up doing a search, only to find several duplicate entries, each with slightly varying format or syntax. Yesterday I edited a page that in the text field had an AFN number followed by 16 lines of text stating the information had come from the AF. Other than that, there was no source data at all. Including this kind of drivel when merging with a well sourced page seems counterproductive. Who cares if and when a person downloaded Joe Blow's gedcom from WFT unless Joe cited a real source and you included that. Many pages have sources entered in the text box or in the note fields that need to be converted to source citations. Often, a little text from the source can be included, obviating the need to identify what event came from it. As far as editing an active user's pages is concerned, I don't mind unless it results in a lot of useless clutter like some of what happened over the week-end. Most edits to pages I have uploaded are to siblings and their spouses and descendants. I included such people that are not my direct ancestors as hooks hoping to attract cousins who might have better information than I on our common ancestors, which they would not otherwise recognize.--Scot 19:27, 18 November 2008 (EST)


Merge procedure; merging additional source and results [20 October 2008]

If you merge two pages that have a different source for the same exact fact and check to keep the 2 different sources; the results are an alternate fact.

You need to manually edit the page after the merge to move the additional source to the event field and then remove the alternate event.

At least, I have confirmed this behaviour for the name fact. The merge creates an alternate name with the new source; even though the names are identical.


Also note that I am referring to duplicate pages created by the same person.

If you are merging pages by different users; you may wish to leave the alternate fact.--Beth 09:33, 19 October 2008 (EDT)

I've been treating "duplicate" events/names which are sourced only with junk like "asodifja.ged" as unsourced for most purposes like this -- if the information duplicates what's there, I don't merge it, if it's different, I merge it but remove the source.--Amelia 13:15, 19 October 2008 (EDT)
Well, I have not removed the sources; at least I will know the source and could use that information in evaluating the evidence. --Beth 22:03, 19 October 2008 (EDT)

Just to be clear, you're saying that you have the same name or event in two pages - the only thing that differs about them is the source, and you want to keep both citations in the merged page? If so, does this situation come up a lot? I could try to detect this situation and not add the duplicate event but instead add the source citation to the existing event on the merged page.--Dallan 12:10, 20 October 2008 (EDT)


That is what I mean exactly; but so far this scenario has not come up a lot. --Beth 19:47, 20 October 2008 (EDT)


Updated merge report, fixed bug [20 October 2008]

I've updated the merge report to remove families that have been merged as of yesterday evening.

I also fixed a bug where changes to a person's name or birth/death information during the merge weren't getting copied to the family page - the family page continued to show the pre-merge name and birth/death information. This won't happen in the future, but pages that have already been merged are going to continue to show the old information on the family pages. I'll write a program to update them (which probably won't happen for a few months). In the meantime, if you want to see the updated information on the family page, you can edit the person page, delete the name/birth/death and save the page, then re-add the name/birth/death and save the page again. This will cause the family page to be updated with the correct information.--Dallan 12:10, 20 October 2008 (EDT)


You can enter specific titles to merge now [20 October 2008]

If you already know the titles of the pages you want to merge, you can now enter them by going to Special:Compare. I'll add this as an option on the "Admin" menu later on today.--Dallan 12:57, 20 October 2008 (EDT) (This has been added to the Admin menu now.)


Unmerging [20 October 2008]

If you merge incorrectly and want to undo it, here's what to do:

For each page involved in the merge, both people and families, both the redirected and merged pages:

  1. Navigate to the page. For redirected pages, make sure that you're looking at the redirected page by clicking on the "redirected from page title" link at the top of the merged page. You can see a list of the pages that were involved in the merge by viewing your "Contributions" screen (click on MyRelate, then Contributions).
  2. Click on the "History" link.
  3. Click on the time-and-date link on the revision just before the merge occurred (the line just below the line with the merge comment).
  4. Click on the "Edit" link.
  5. You should see a warning about editing an out-of-date revision of the page. Ignore the warning and save the page. Add a summary comment about unmerging.

If you want to check to make sure that you've unmerged everything correctly, after you've edited all of the pages navigate to each page again:

  1. Click on the "History" link.
  2. Click on the radio button (the circle) next to the revision just before the merge occurred and then click on the "Compare selected versions" button.
  3. The only changes you should see between the pre-merge revision and the current revision should involve re-ordering data elements and standardizing place links. Person and family links should be the same between the two revisions.

That's it! Eventually I'll simplify unmerging, but this process should work in the meantime.

Note: If a page that was redirected has a talk page associated with it, you'll see a "View the talk page" link instead of the "redirected from page title" link at the top of the merged page. This is annoying, and it's something I'm going to fix in a few weeks. In the meantime, if this happens to you, add "?redirect=no" (without the quotes) to the end of the URL line and press enter to navigate to the redirect page. --Dallan 22:05, 20 October 2008 (EDT)
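For anyone who would rather build the "?redirect=no" address with a script than type it, here is a small Python sketch using the standard urllib.parse module. The function name and the example.org address are placeholders for illustration, not real site URLs.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def with_redirect_no(url):
    """Return the URL with redirect=no set, keeping any other query parameters."""
    parts = urlsplit(url)
    # Drop any existing redirect parameter so it is not repeated.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "redirect"
    ]
    query.append(("redirect", "no"))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Placeholder address; substitute the address of the redirected page.
print(with_redirect_no("https://www.example.org/wiki/Person:Jane_Doe_(1)"))
```

If the address already has a query string, the parameter is joined with "&" rather than a second "?".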


Changes in merged individuals not showing up on family pages [21 October 2008]

There was a bug in merging that I just fixed yesterday: Changes made to the names, birth, and death events weren't getting copied properly to the family pages. It shouldn't happen from now on, but pages that have already been merged are going to continue to show the old dates on the family pages. I'll write a program to update them early next year. In the meantime, if you want to see the updated dates, you can edit the person page, delete the date and save the page, then re-add the date and save the page again. This will cause the family page to be updated with the correct date.--Dallan 10:37, 21 October 2008 (EDT)


What to do when pages should not be merged [25 January 2009]

If you come across two families with the same title that should not be merged, and you want to let others know that they should not be merged, edit the Talk page of one of the families and add a template

{{nomerge|Family:Title of the other family}}

somewhere on the page. You can do this for Person pages as well. In the next few days I'll modify the merge function to prohibit merging pages when one of these templates is found.--Dallan 11:28, 21 October 2008 (EDT)
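To show how a check for this template might work, here is a minimal Python sketch that scans Talk page text for nomerge templates. The regular expression, the function names, and the example titles are assumptions for illustration; the real merge function may detect the template differently.

```python
import re

# Matches {{nomerge|Some title}} and captures the title (assumed syntax,
# based on the template shown above).
NOMERGE = re.compile(r"\{\{\s*nomerge\s*\|\s*([^}|]+?)\s*\}\}", re.IGNORECASE)


def nomerge_targets(talk_text):
    """Return the page titles named in nomerge templates on a Talk page."""
    return [match.group(1) for match in NOMERGE.finditer(talk_text)]


def merge_blocked(talk_text, other_title):
    """True if the Talk page says this page must not be merged with other_title."""
    return other_title in nomerge_targets(talk_text)


# Hypothetical Talk page text and titles.
talk = "Different couple, same names. {{nomerge|Family:John Doe and Jane Roe (2)}}"
print(nomerge_targets(talk))
print(merge_blocked(talk, "Family:John Doe and Jane Roe (2)"))
```

This sketch compares titles exactly, so a title written with underscores instead of spaces would need to be normalized first.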

I used this to replace my emboldened comments on Judah Hopkins and his two wives both named Hannah Mayo, but it never warns me during a merge (I've gone up to but no further than the last cancel). Someone would have to actually look at the talk page to see it. Hopefully someone would be careful enough to do this, but who knows? Is it possible to have the Merge process check for the presence of this and warn of it early in the process? --Jrich 10:04, 25 October 2008 (EDT)

Yes, it will, soon, I promise. (Early next week.)--Dallan 01:17, 26 October 2008 (EDT)

The nomerge templates work now.--Dallan 20:28, 28 October 2008 (EDT)
Is it supposed to be stopping these pages from showing up on my page of duplicates? Because all it seems to have done is make both of them appear.--Amelia 10:58, 2 November 2008 (EST)
Mine too. Plus most of the others listed as duplicates have no match that even looks close. Either the scoring, or the threshold for reporting, seems to generate too many potential duplicates. --Jrich 11:48, 2 November 2008 (EST) [Oops, spoke too soon, now my list of duplicates is empty so you probably just fixed this? --Jrich 13:27, 2 November 2008 (EST)]

I changed the show duplicates page to take you directly to the compare screen listing the pages that the system believes are probable-duplicates -- either because they have the same-named husband and wife, or because they show up as alternate husbands/wives/parents for the same family/person. And yes, I had some problems with the code to switch things over the other day so the duplicates list was empty for awhile, but it should be working fine now.

That's a good point about the nomerge template. It doesn't stop the people from showing up on your "show duplicates" screen, just keeps you from being able to merge them. I'll change the system so they don't show up on your show duplicates screen either (later this week or early next week).--Dallan 08:36, 5 November 2008 (EST)


Hi Dallan,

I'd really like to see a Don't Merge checkmark on the compare page that adds the NoMerge template to one of the talk pages automatically. This will probably only work when the compare page shows 2 candidates, but since those are the vast majority, implementing the checkmark there will be a good start.

The main reason to ask for this is that IMO marking pairs as non-duplicates should be just as easy as merging them. Putting the right template on the right page is not easy now, and it's likely to go wrong if one doesn't use the right syntax, or puts in the wrong page reference.

--Enno 06:58, 25 January 2009 (EST)


This makes sense to me. I'll add it to the todo list (after GEDCOM export, so February timeframe).--Dallan 14:54, 28 January 2009 (EST)


Would it be possible to...? [22 October 2008]

Dallan,

How difficult would it be to add a line to the Compare Pages to merge screen? Underneath each child, you have Given Name...Death Place. Would it be difficult to add Spouses (just like you have Parents underneath husband and wife)? I ask because I did a lot of merging last night (thanks to your new user specific duplicates screen!). I found that I occasionally had to take the extra step of opening the children's pages to make sure they were the same person. In Québec families, there can be many Mary or Joseph's in the same family.--JBS66 15:11, 21 October 2008 (EDT)


This is done now.--Dallan 15:14, 22 October 2008 (EDT)


What do the different colors mean? [22 October 2008]

On the "Merge pages - Experimental" what do the colors mean,

  1. some items are GREEN in one column with no checkbox and on opposite side is a plain column with a check box; Does this mean the green items WILL NOT be saved?
  2. some items are YELLOW with a checkbox on one side and plain with a check box on the other side.
  3. some items are PINK --Kristy 06:15, 22 October 2008 (EDT)

What I've observed is that Green means the field is the same for the families being compared, and pink means the field is different. As for yellow, I haven't figured that out yet...--JBS66 07:12, 22 October 2008 (EDT)


I think yellow means a portion of the field is the same across the families - like if the birth date for one was Abt. 1628 and the birth date for the other was just 1628.--JBS66 07:22, 22 October 2008 (EDT)


I've just added a "color key" to the pages. Green means exact match on a complete date or place or name, yellow means the information matched but some pieces are missing (e.g., the date is a year only, or the place is just a country or a US state), or the information is a "partial-match" (e.g., the name sounds similar but is spelled differently, or one place is a county but the other one includes the town also). Red means that the information differs.

When I compare people, I add two points for each green item, one point for each yellow item, and subtract two points for each red item. If the total is at least 9 or 10 points, then I generally believe that the two pages are duplicates. That's just a general rule-of-thumb that I use; it's obviously not correct in every situation, but it's a simple starting point.--Dallan 15:14, 22 October 2008 (EDT)
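The rule of thumb above can be written down as a short Python sketch. The function names and the default threshold of 9 are assumptions for illustration (the comment says "at least 9 or 10"), and this is a manual rule, not the scoring the system itself uses to find duplicates.

```python
# Points per color, as described in the rule of thumb above.
POINTS = {"green": 2, "yellow": 1, "red": -2}


def duplicate_score(colors):
    """Add up the points for the colored fields on the compare screen."""
    return sum(POINTS[color] for color in colors)


def looks_like_duplicate(colors, threshold=9):
    """True if the total reaches the (assumed) threshold."""
    return duplicate_score(colors) >= threshold


# Four exact matches, two partial matches, one conflict: 8 + 2 - 2 = 8.
weak = ["green"] * 4 + ["yellow"] * 2 + ["red"]
# Five exact matches and one partial match: 10 + 1 = 11.
strong = ["green"] * 5 + ["yellow"]

print(duplicate_score(weak), looks_like_duplicate(weak))
print(duplicate_score(strong), looks_like_duplicate(strong))
```

A single red field costs as much as an exact match gains, so one conflict can push an otherwise promising pair below the threshold.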


Open in a new tab/or link [24 October 2008]

This is a "minor" request, and easily doable manually, but it would be nice if, when we work with a list of links (Families with Possible Duplicates), clicking on one of the links opened the new page in a new tab or a new window automatically. That way folks don't have to manually click on the back button so many times to get back to the list of links to work on (like when I forget to manually open in a new tab). I know this can be done in HTML (I have links within my own website set up this way) but don't know if it can be done in Wiki language. --Kristy 10:12, 22 October 2008 (EDT)


It's possible to do this, but opening new windows/tabs is confusing to some people so I'm reluctant to do it. You can hold down the "control" key when you click on a link to open it in a new tab. That's what I suggest.--Dallan 15:14, 22 October 2008 (EDT)


OK, well I usually right click and choose "open in new tab." I have been working on this for HOURS at a time and thought the addition of this feature, just on pages with a LONG list to be edited, would help. I have been working on my "Families with possible duplicates" for about five hours already today! I want to thank you Dallan, (I think.. My vision is getting blurry and I now have no circulation to my feet! ha ha) for creating this page to make it easier to find the ones I have to work on. It is a GREAT help in this overwhelming merge process! I am getting the hang of it now, and thanks for the explanation on the color codes on the "compare Page". --Kristy 15:31, 22 October 2008 (EDT)


I'll make it a separate link next to the title. I don't want you to get RSI :-)--Dallan 13:53, 23 October 2008 (EDT)

  • AHHH thank you, I see it! that will help me a lot! (been working for HOURS on my list again today!) --Kristy 00:26, 24 October 2008 (EDT)

Several enhancements [22 October 2008]

Several enhancements to merging today. Remember, merging is still experimental so if you notice anything going wrong (like pages redirecting back to themselves) please let me know. Also, if you tell me the page titles that are in error I can look at the page history to see what went wrong. Thank you!

  • You can select the page you want to be the target of the merge (the page the other pages merge into). I've made this available while we merge the existing pages. Once the existing pages have been merged, I plan to remove this ability and always merge into the older page, so that newcomers don't merge established pages into newly-created pages.
  • If a Person page shows multiple parent or spouse families, or a Family page shows multiple husbands or wives, you can select "Compare parents" or "Compare spouses", etc. from the "More" menu to compare and merge those probable-duplicates. Eventually I'll add these to the probable-duplicates report (multiple parents, husbands, or wives, but not multiple spouses).
  • The Compare screen shows additional information: the spouse-families of children, and all names, birth, and death events (not just the primary one).

Lost children [23 October 2008]

On the compare page, I had one column with, say, 4 kids and the other column with one child. It correctly matched up the one child, and I left the option at "don't merge" for the other 3 kids, as there was no child to merge them with. I thought this would just carry over those three children, but they appear now to be "lost in cyber space".

Was I supposed to do a drop down for the second child and choose "child two" and have it merge into a blank space? I guess so... hmmm --Kristy 21:41, 22 October 2008 (EDT)


You shouldn't have to. They should be added automatically to the merged family. (This discussion is continued on User talk:Dallan)--Dallan 13:32, 23 October 2008 (EDT)


Merging Problem [23 October 2008]

I performed the following merge this morning:
Pages merged successfully:

On the Compare pages to merge screen,

Since Geneviève Daunais was listed twice, as both Child 1 and Child 3, I thought they were different versions of the same person. In actuality, they were the same page, Person:Geneviève_Daunais_(1). So, what ended up happening is that she was redirected to herself, and she is not appearing on the Family:Antoine Daunais and Marie Richard (1) page. What is the best way for me to fix this?--JBS66 10:43, 23 October 2008 (EDT)


Thank-you for reporting this! I rolled back Person:Geneviève Daunais (1) to the version just before she was redirected, so her page is ok now and she's been added back into the family. I've also fixed this bug so it shouldn't happen again. (The system won't attempt to merge the same page into itself.)--Dallan 13:32, 23 October 2008 (EDT)
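A guard of the kind described above might look like this Python sketch. The function name and the title normalization (treating underscores as spaces) are assumptions for illustration, not the actual fix.

```python
def plan_merges(target, candidates):
    """Return the titles to redirect into target, skipping the target itself
    and any title that appears more than once."""

    def normalize(title):
        # Assumed rule: underscores and spaces name the same page.
        return title.replace("_", " ").strip()

    seen = {normalize(target)}
    plan = []
    for title in candidates:
        key = normalize(title)
        if key in seen:
            continue  # same page as the target or already planned
        seen.add(key)
        plan.append(key)
    return plan


# The same page listed as both Child 1 and Child 3 yields nothing to merge.
print(plan_merges("Person:Geneviève Daunais (1)", ["Person:Geneviève_Daunais_(1)"]))
```

With a check like this, selecting the same page twice results in no redirect rather than a page redirected to itself.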


Thank you for fixing this!! You know, I wouldn't have noticed this happened if I had not written down each of the children before I proceeded with the merge. Unless I noticed how many children there were, I would have gone to the family page and thought everything looked alright. Is there any way to create a report of pages that redirect to themselves? I'm wondering if User:Msscarlet1957's previous question today about a child being lost in cyberspace might be related or this might have happened to others.--JBS66 13:55, 23 October 2008 (EDT)


Yes, that's on my list of things to do.--Dallan 15:04, 23 October 2008 (EDT)


Fear, uncertainty and doubt [26 October 2008]

I ran my duplicate report and picked a family to start in on, when I quickly became concerned about sources, images and text. The merge tool gives the impression of merging just dates and not the rest of the data. For many of my families I have worked to populate the sources, images and text. What happens to this data when a merge is performed? Some of the earlier comments seem to point to the sources being merged and you can then go fix it up afterwards, that's good. What about images? And Text?

I anticipate that the answer is partly being careful about which page you use as the target of the merge. It's probably more sensible to pick the "better" page to be the merge target. I have two issues with that. The first is cosmetic. The site has a style preference for the lower numbers. Not that important, but if the higher number page has the better data and should be the merge target, it goes against the style preference. I can get over that. The second is what if both pages have a lot of text. I imagine this probably won't happen that often, and the answer is to merge two well populated pages by hand. Of course, you can't tell any of that beforehand from the compare, so carefully checking things out seems warranted before you try anything.

I searched the help for keyword "merge" hoping for some insight and guidance and came up empty. I realize the feature is all of a couple days old so that was a bit much to ask for. However, without a better understanding of exactly what is going to happen when I merge I am paralyzed by fear, uncertainty, and doubt. I do not want to lose the work I've done nor ruin work someone else has done.--Srblac 19:51, 23 October 2008 (EDT)


You're right, there's not any documentation yet. I thought I would write the documentation after things settled down a bit. It's good to ask questions, since they'll eventually form the basis of the documentation :-).

Merge actually happens in two phases: in the first phase you're just comparing names, dates, and places and choosing whether or not to merge. After you press the "Merge" button you're shown all of the names, events, sources, images, notes, and text on each of the pages you've chosen to merge. Each item has a checkbox next to it. If the checkbox is checked, the item will be added to the page. Items from the merge target that are kept will appear on the final page before items imported from other pages; text from the merge target appears above text imported from other pages. If you do merge two well-populated pages, it doesn't hurt to view the final result and edit it in case it needs some tweaking. Also, since it's a wiki it's not too difficult to unmerge in case you change your mind.

I'd suggest keeping the lower-numbered page as the merge target unless there's a good reason not to. And if anyone has suggestions for a better name to put on the "Merge" button that appears at the bottom of the Compare screen so that it didn't sound like the actual merging was going to happen immediately when you press that button, I'd like to hear them. I've been trying to come up with a better name but haven't thought of anything yet.--Dallan 23:24, 23 October 2008 (EDT)
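The ordering rule for the second phase described above (checked items only, merge target first) can be pictured with a small Python sketch. Representing each item as a (text, checked) pair and the function name are assumptions for illustration, not how the merge screen stores its data.

```python
def assemble_merged_items(target_items, imported_items):
    """Build the final item list from the step 2 checkboxes.

    Each item is a (text, checked) pair. Checked items from the merge
    target come first, followed by checked items from the other pages.
    """
    kept_target = [text for text, checked in target_items if checked]
    kept_imported = [text for text, checked in imported_items if checked]
    return kept_target + kept_imported


# Hypothetical items on two pages being merged.
target = [("Birth 1628", True), ("Note: needs a source", False)]
other = [("Birth Abt. 1628", True), ("Image: headstone", True)]
print(assemble_merged_items(target, other))
```

Unchecked items are simply left out, which is why it is worth reviewing the checkboxes before finishing the merge.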

  • Does it have to be one word? If not what about "proof your merges" or maybe "Review Merge" --Kristy 00:28, 24 October 2008 (EDT)

I do agree the merge button is confusing. There is a finality in the concept of button controls. We're used to seeing DONE, SUBMIT, CANCEL... where these actions generally can't be undone. How about a link instead, like proceed to step 2 and put Step 1 - Compare pages to merge at the top of the first screen?--JBS66 06:26, 24 October 2008 (EDT)


I've changed it to "Proceed to step 2" to see if that works better. It's certainly better than "Merge". Thanks!--Dallan 12:01, 24 October 2008 (EDT)


I saw the post on Help:Merging pages and that made me feel better. I took the plunge and tried merging on a page I felt I could stand to have problems on. Well - Everything worked great! The text, sources, everything came over as I would have expected. Alt events were created for things that were iffy. I could go in afterwards and tidy up. Yeeess! The first power of this wiki is the ability to display information more richly than ever before, but collaboration is the second power and the merge feature takes that to new heights! Thanks!--Srblac 07:25, 26 October 2008 (EDT)


Merge screen ideas [24 October 2008]

Some of us are merging like mad :~) and have come to understand the merging screens and process. However, I believe for most WeRelate users, the merge screens are confusing. There is often a lot of information on one screen that is visually overwhelming. I’d like to put a couple of suggestions out there for future enhancements that might be beneficial.

  • Consider limiting the number of individuals to merge side-by-side to 2 (after the initial clean-up is done).
  • Break the information that is on each screen into a tab-form (hard to explain in words - here's a link to an example [2]). Essentially, instead of one long screen that appears daunting, it could be that you review just the marriage tab, then the children tab, then the images tab (with the images showing side-by-side)...--JBS66 07:00, 24 October 2008 (EDT)

Those are good ideas. I think once most of these initial merges are complete, most merges in the future will naturally be just two people at a time. I'm planning to make the person and family edit screens tabbed screens in the future; I'll do the same with the merge screen then.--Dallan 12:01, 24 October 2008 (EDT)


Help page [24 October 2008]

I've edited Help:Merging pages to show the new instructions for merging pages. If you notice anything incorrect or something that should be added, please feel free to fix it.

Also, things seem to be going pretty smoothly so I'm going to announce the find duplicates and merge functionality on the Main Page.--Dallan 12:01, 24 October 2008 (EDT)


Adding newly merged pages to my tree [28 October 2008]

I know you intend to get rid of "trees" Dallan, but as I am merging... If the page to keep is "not" my page, that newly merged page is not then in my tree. This fact I only just now realized after three days of merging!

I have been working hard at trying to get cousins to come in and adopt my tree so as to start collaboration. If the "trees" are eliminated, how will those people come in and start watching my "tree" and be part of the effort? Also, now that all those pages are merged into someone else's tree, and are no longer in MY tree because of the merge, anyone who comes in will not pick up those pages when (and if) they adopt my tree. Is this as clear as mud? (hard to explain) --Kristy 15:18, 24 October 2008 (EDT)


Argh, I'd forgotten about updating trees. You're watching the merge targets now, and the other people who were watching the other pages are watching the merge targets now as well. So you and others will get notified when those pages change. The only thing is that they won't show up in the family tree explorer unless you click on the "Tree +" button in the upper right corner to add them to your tree. I'll automatically add the merge targets to your tree on Monday.

Merged pages are being added to trees as of this morning. But since pages merged before today were not added to people's family trees, I'm going to make showing the family tree explorer on every page, and having it show people and families even if they're not in your tree, a high priority. Hopefully it will be implemented in a week or two.--Dallan 18:51, 28 October 2008 (EDT)

I'm thinking that in the future, rather than "adopt" someone else's tree, people will just watch pages they're interested in, and the family tree explorer will appear on every page, and list the ancestors and descendants regardless of tree membership. I'll add a "watch ancestors also" option on the "watch" menu item.

If anyone else is in a similar situation and wants me to add the merge targets that they've created into their tree, leave me a message. On Monday I'll fix this problem for merges that happen in the future.--Dallan 01:17, 26 October 2008 (EDT)

  • I myself am against eliminating the adopt a tree program. That is the whole reason I joined WeRelate: to find cousins, have them adopt the tree and thus begin collaboration. Without that option the site is of little use for me, other than as just another Gedcom upload, such as WorldConnect. None of my cousins are going to take time to manually set themselves to watch each page of a particular family tree (my CLARK tree for example has 530+ people in it) and setting to only watch the Ancestors of one person is of little advantage. (just my humble opinion)--Kristy 09:07, 26 October 2008 (EDT)

It's a good point. I'm going to initiate a conversation on the watercooler about this.--Dallan 18:51, 28 October 2008 (EDT)


Proposal to Manage Pages [27 October 2008]

I believe the ability to merge pages will move WeRelate to a different kind of tool than it was prior to having that ability. Yes, WeRelate is a wiki, but it is a different kind of wiki. In Wikipedia an author has to work to manage a page. WeRelate allows for the automated authoring of pages by submitting a GEDCOM file. The natural barriers to wiki authorship are greatly reduced in the WeRelate wiki.

As with any wiki the more time put in by a dedicated researcher the more valuable the page. Automated authoring dilutes that value. At the moment this is not a big problem on the site. It's a good way to harvest raw data and the merge tool is helping us to sort through it all.

However, as merging proceeds the merged data has more value and being inundated with automatic pages is more worrisome. Of course we will ask people making submissions to be careful what they submit and introduce them to our tools, but unfortunately new people are by definition unskilled and naive about the proper way to do things. Relying on new people to properly manage the wiki is a poor strategy. New people performing lots of merges is a potential danger.

So what do we do?

I created a Proposal for Managed Pages to discuss what to do and why it could work. Regardless of whether this particular approach is taken, I think the WeRelate community needs to think about protecting the large amounts of value being added by the new merge feature. --Srblac 11:22, 26 October 2008 (EDT)


Minor change to merge procedure [27 October 2008]

Up until now on the merge screen, when you unchecked a yellow box (to say that you did not want to keep that information) and the box to the far right was also unchecked, the system would automatically check the box on the far right, under the assumption that if you did not want to keep the partially-matching information, you probably wanted to keep the information on the right. After thinking about it, I believe that while the assumption may be true sometimes, having the system check boxes automatically is too "tricky". So I've removed this capability. The system will no longer check some boxes if you uncheck others. I just wanted to let people know.--Dallan 22:45, 26 October 2008 (EDT)


Yes I appreciate the change, Dallan, as on a page with several name variations, it was then automatically putting check marks in those other names, causing me problems if I did not see that happen! --Kristy 11:02, 27 October 2008 (EDT)


Merge target now appears on the right when comparing pages [29 October 2008]

I thought it was confusing that the default merge target (the earliest-created page) would always appear on the right when merging pages, but would sometimes appear on the left when comparing pages. So I've made a change to show the default merge target on the right also when comparing pages. Please let me know if this causes any problems.--Dallan 13:53, 29 October 2008 (EDT)


Don't worry about merging pages from WaltK [29 October 2008]

A GEDCOM with a lot of duplicate people got past the maximum-size check a few days ago (I need to improve the check) and resulted in about 6,000 duplicate families being added. I noticed the increase today and deleted the tree.--Dallan 20:30, 29 October 2008 (EDT)


Duplicate husbands and wives and parents now appear in the duplicates list [30 October 2008]

I've added person pages to the "Show duplicates" list when the person is one of multiple husbands or wives of a family. I've also added family pages to the list when the family is one of multiple parent-families of a person. When you try to find duplicates for these cases, it's not as straightforward as the typical case where the family has the same husband and wife name as another family. I've tried a few cases and it doesn't seem too difficult to find the duplicates, but please let me know if it's not easy to spot the duplicates in the search results screen in general.

To make finding duplicates easier, especially as we start adding less-obvious duplicates to the list, it's possible to change "Show duplicates" to link you directly to a Compare screen showing just the system-identified duplicates, but then you'd miss out on the opportunity to find possibly even more duplicates by reviewing the search results yourself. So I'm reluctant to link directly from the Show duplicates screen to the Compare screen, but I can if finding duplicates in the search results list proves too difficult.--Dallan 20:30, 29 October 2008 (EDT)

Hi Dallan,
I got to this topic because I was wondering what you were going for when you added all these people who don't seem to have duplicates to the show duplicates page. Is there some way to get to the Compare screen for the system-identified ones? Because I just tried about four new ones that showed up on my show duplicates page, and none of them have any even remotely similarly named pages.--Amelia 23:08, 29 October 2008 (EDT)
Ok, I'll change show duplicates to take you directly to the compare screen.--Dallan 10:47, 30 October 2008 (EDT)

Person:Henry England (2) [19 November 2008]

Good old King Henry I of England came up as someone with an unusually large number of possible-duplicate families. I was going to try to merge some of the families he belongs to, but there are so many families - many of them look like they should not be combined - and I don't know enough of the history, so this is way beyond me. Does anyone want to take a whack at it?--Dallan 19:18, 3 November 2008 (EST)


This seems like a good opening to bring up issues I have raised with you, Dallan and Jrm03063. First of all, there is a great website created by Stewart Baldwin and others called the Henry Project. It is at [[3]]. It is well researched and documented, easy to navigate, and I am sure that it is the best online resource for this data. It is meant to include the ancestors of Henry II, which of course includes Henry I. However, I have been concerned about how to handle the naming of medieval folks. Not only are many of them known by names in several different languages, inviting confusion, but surnames were also unknown. I for one do not want to see a page "Henry Unknown, King of England", "Henry England" King of England, nor "Henry King" King of England. However, because there is a surname field, submitters feel compelled to enter something there. The possibilities are endless, rendering individuals unrecognizable, let alone mergeable. I would like to see a way to leave the surname field empty, perhaps with a non-printing character if there is such a thing. Then if, say, the alternate name field could include a sobriquet, nickname, alias or some such entry, that along with prefix and suffix titles could be used to identify them. We could have, for example, King (Prefix), Henry (Given name), II King of England (Suffix); Alt Name 'Henry "Curtmantle"' (Sobriquet). I wrote today; I see I have over 300 merger candidates myself, so I have plenty to do without having to deal with unresolved issues in the medieval space. I have intentionally omitted nobility and medieval ancestors from my database as I believe they are the subject of much scholarly research and are best left to experts. I have been reading the newsgroup/mailing list, soc.genealogy.medieval, for some 12 years. Many of the participants are well credentialed genealogists.
Several of them have worked to create the Henry project, also wikipedia seems to have pretty good stuff on the Carolingians, so maybe that isn't a priority today, with so much else to be done.--Scot 20:36, 18 November 2008 (EST)


Update [7 November 2008]

Just a couple of quick updates:

  • I fixed the watchlists and the family trees so that anyone watching a redirected page is now watching the redirect target as well. Similarly for trees - if a redirected page was in your tree, the redirect target is in it now as well. This should fix the bugs we had with merge targets not being added to people's watchlists or trees.
  • Starting tomorrow morning, nomerge pages won't show up anymore in the Show Duplicates screen.
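The watchlist fix in the first bullet can be pictured with a small sketch. This is only an illustration of the rule as described, not the site's actual code: the function name `follow_redirects` and the dictionary shapes are assumptions, as is the choice to follow chains of redirects rather than a single hop.

```python
def follow_redirects(watchlists, redirects):
    """Return new watchlists in which anyone watching a redirected page
    also watches the page it redirects to.

    watchlists: dict mapping user name -> set of page titles (hypothetical shape)
    redirects:  dict mapping redirected title -> redirect target (hypothetical shape)
    """
    resolved = {}
    for user, titles in watchlists.items():
        extra = set()
        for title in titles:
            seen = set()  # guards against redirect loops
            while title in redirects and title not in seen:
                seen.add(title)
                title = redirects[title]
                extra.add(title)
        # Keep the original pages and add every redirect target found.
        resolved[user] = set(titles) | extra
    return resolved
```

The same function could be applied to tree membership, since the second sentence of that bullet describes the identical rule for trees.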

Thank-you for all of your hard work on this! We'll start inviting people with lots of duplicates in their trees to join the process early next week.--Dallan 14:40, 7 November 2008 (EST)


Generally, Dallan, How Do you Think It's Going? [21 November 2008]

What's your sense of how well community-driven merging is going? Are you happy with it? Is it creating more headaches? What can we do better?

-- jillaine 16:08, 21 November 2008 (EST)


Unsubscribers with a lot of duplicates [28 February 2009]

Overwhelmingly the folks doing most of the merging have voted to keep the abandoned gedcoms. I was just suggesting deletion to save people slogging through a bunch of redundant data. Thanks for your input.  :-)--sq 23:45, 28 February 2009 (EST)


Here is a list of users who have unsubscribed from WeRelate - that is, they have elected to not receive any change notification emails, and who have more than 100 duplicates in their Show Duplicates list.

Any thoughts on what we should do with these pages? Many of them have already been merged, but at least in the case of the top few on the list, there are many more left to go. I thought we could start thinking about the problem of what to do with inactive trees by using these as examples.

If we delete their trees, it means that any pages they are watching that nobody else is watching get removed. If those pages are linked to from people/families that others are watching, those other pages are edited to remove the links to the deleted pages. So if John and Mary had a child named Jim, and Jim was watched only by the user whose tree we were deleting, but the family page for John and Mary was watched by someone else, then we would remove the link to Jim from the John and Mary family page.

Dallan, this one (above) looks pretty good. Granted, I only made about a dozen random checks on it, but on those, this gedcom appears to have more data than others-- although I haven't yet seen any source information. jillaine 11:53, 23 November 2008 (EST)
I just clicked on the TreeDeletionImpact link above and it took about 4 minutes for the page to display. It turns out this tree has over 72,000(!) pages in it. There are over 1000 other pages that link to pages that would be deleted if the tree were deleted though. Any more thoughts on whether to keep this tree (or the other trees in the list) now that we can see the deletion impact?--Dallan 14:16, 5 January 2009 (EST)
I tried to download the Deletion Impact for a half an hour and it still only had about 20%, so I stopped it.
As a general matter, I am not at all a fan of deleting trees that have a lot of overlap with existing work, especially when the tree was uploaded early like this one and has a lot of merged pages. I know I have personally merged a ton of users' pages, and I'm sure I've barely made a dent. And a lot of these duplicates have the (1) name that is handy for identifying dups. I vote to keep this one.--Amelia 00:52, 25 February 2009 (EST)
I try to merge a lot of pages from Genealogist84 because I have the opinion that these give good support to, among other things, a complete overview of medieval life; we can never have enough bricks to build the house!
Dallan, this one (above) appears to be duplicating itself-- i.e., of the dozen or so random ones I examined, the dupes are referring back to the same gedcom. In addition, again, just about a dozen random checks, I'm not seeing much if any source information. So unless there's far more source information throughout it, I'd cast my vote for deleting this one.
Looking at the list of pages that would be deleted, they only link to the users MySource pages and don't link to any other pages of interest to active users. I vote to delete this tree --sq 14:32, 6 February 2009 (EST)
No strong feelings.--Amelia 00:52, 25 February 2009 (EST)
The only pages that would be deleted were contributed by this user; only 93 of over 9000 pages are watched by any other user and many of those are watched by an inactive user; there are more than 500 duplicate pages--I checked three sets--they didn't seem to contribute sources or other significant data, and 2/3 were internal duplicates. However, 7 out of 10 possible deletions involve pages that have been merged and edited by major contributors who are not watching them. I vote to keep this tree.--sq 15:36, 19 February 2009 (EST)
Dallan, this one is a mishmash of good and bad. There are a fair amount of internal dupes, but in many cases, this GEDCOM provides more data than others. Again, without deeper analysis, it's difficult to call. I'm on the fence about this one. It requires a fair amount of cleanup internally as well as compared to others. But there's a fair amount of good info. How do you decide when it's worth it to keep? jillaine 12:17, 23 November 2008 (EST)
I recognize this name, which probably means I've done a lot of the merging, and thus deletion will probably have all sorts of annoying consequences. I vote keep.--Amelia 00:52, 25 February 2009 (EST)
I checked the possible deletions and 90% have been merged by major contributors but are not being watched by them, only about 500 of the 4000+ pages are being watched by more than 2 people. I think we should keep this one .--sq 15:36, 19 February 2009 (EST)
This one does not appear to add a sufficient amount of new/distinct information to be worth keeping (and also has a lot of internal dupes as well). jillaine 12:30, 23 November 2008 (EST)
I have merged a ton of this stuff. Keep.--Amelia 00:52, 25 February 2009 (EST)
I think this one can go. It includes Isaac and Sarah. Yeah, from the Bible. TONS of internal dupes. And in those cases where it's duping someone else's, it doesn't add much. (Again, this from examination of only 12-15 pages.) jillaine 12:36, 23 November 2008 (EST)
Of over 2600 pages, about 500 are watched by multiple users. This is the royal lines of Europe all the way back to Abraham. 5 out of 20 possible deleted pages have been merged by major contributors that are not watching the pages. I don't know about this one.
Don't have an opinion.--Amelia 00:52, 25 February 2009 (EST)
Yikes. This one is almost all internal duplicates. Pairs and triples. I vote get rid of it. jillaine 12:38, 23 November 2008 (EST)
Very few people watching these pages, 1 out of ten possible deletion impact pages have been merged by major users not watching the page. I vote to delete this tree
Delete.--Amelia 00:52, 25 February 2009 (EST)
This one has a LOT of internal duplicates, but otherwise appears to have decent information and is not too duplicative elsewhere. I'd say hold onto for now. jillaine 12:42, 23 November 2008 (EST)
Extremely few people are watching these pages, only 6 out of 5815 pages are being watched by more than one person; of the (2) possible deletion impact pages none have been merged, approximately 60 of the possible duplicates are internal and multiple internal duplicates the rest of the 200+ dups are Family:unknown and unknown pages that are end of line pages with no original data; pages in this tree have dates and places for vital events but no sources. My opinion is that unsourced genealogy is of little value. We have almost 6000 unsourced abandoned pages. I vote to delete this tree .--sq 13:23, 24 February 2009 (EST)
Vote delete based on the lack of impact to others.--Amelia 00:52, 25 February 2009 (EST)
Most dupes are internal; with a few exceptions, does not add much to the larger community of knowledge (i.e., external dupes provide as much if not more info). Toss. jillaine 12:48, 23 November 2008 (EST)
Mostly internal dupes, 108 out of 2967 pages being watched by others; deletion impact affects only unsourced pages, mostly for MASS. colonists, I think we have enough active people doing early colonists; I checked a number of pages that are unwatched by others, they are unsourced and generally have only names, a few have dates. I don't believe this helps anybody. I vote delete this tree
No opinion. I recognize the name, but it's not huge. And that's as far as I'm getting tonight.--Amelia 00:52, 25 February 2009 (EST)
With a few exceptions, does not add much to the larger community (i.e., external dupes provide as much if not more info). Toss. jillaine 13:06, 23 November 2008 (EST)
1318 pages largely unsourced, lots of dates and places for vital events, mostly external dupes a few have been merged by Amelia and are not being watched; 1400's-1800, early colonists; 204 pages being watched by more than one person, this type of unsourced data is easily available from other sources, I don't think WeRelate needs to maintain 1318 abandoned pages. We do need to get Amelia's opinion on this one. I vote to delete this tree
I checked the deletion impact and dealt with those that bothered me. Deleting is fine, it's a pretty lousy tree. But it does have some pages that will be saved from deletion because they have been merged with SelenaManley pages below. I'm more in favor of that one going, so perhaps delete that one and see what it does to this one before this goes permanently.--Amelia 00:33, 27 February 2009 (EST)
This is another 50/50, but on this one I'd vote on the "keep" side. jillaine 13:15, 23 November 2008 (EST)
3927 pages, unsourced, lots of places and dates for vital info, lots of internal duplicates, external dupes tend to be dupes of other abandoned gedcoms, not merged, largely early American and their English ancestors; only source is Ancestry submitted trees (:-(); a little over 200 of the 3927 are being watched by more than one person. It appears many of these pages have already been merged. I don't think it has anything else to contribute. I vote to delete this tree
I'm fine with deleting this one, though see above.--Amelia 00:33, 27 February 2009 (EST)
Does not appear to add significantly to the greater compilation. Could go. jillaine 13:21, 23 November 2008 (EST)
3844 pages, external dupes seem to be largely merged, names, dates and places for vital information, unsourced, early American colonists, fallout deletions are mostly MySource pages where the user refers to his/her own files with various on-line services; over 800 of the 3844 pages are watched by more than one person. My feeling is that the important data has already been merged. I vote to delete this tree
I thought this one would be a disaster, but looking at the deletion impact, I'm not seeing anything too worrisome. And getting rid of the lousy sources has been annoying for years. Delete.--Amelia 00:33, 27 February 2009 (EST)
More cases of info either matching, being less than, or far uglier than others contributions. Toss. jillaine 13:27, 23 November 2008 (EST)
I would agree except that User:Susan Irish, who is very active, has merged many of these pages and is not watching them. I will ask her to look at it. They may be integral parts of her tree.--sq 13:23, 24 February 2009 (EST)
I think this one is a keeper. It's odd. A lot of the "potential dupes" are not dupes at all. And the data appears to be decent, although it appears there are a number of internal dupes. jillaine 13:35, 23 November 2008 (EST)
5985 pages. Unsourced, dates and places for vital data; User:Susan Irish has merged some of these and is not watching them. Again early colonists. Tree deletion impacts her own pages and mysource pages; 270 pages out of 5985 are being watched by others; some internal dupes, external dupes are largely ignored by other inactive users. We need to get Susan's opinion on this one.--sq 13:23, 24 February 2009 (EST)
Dallan, I think this is a keeper; it offers a fair amount of new info compared to others. jillaine 13:42, 23 November 2008 (EST)
2588 pages; some merges by major contributors that are not watching the pages, many of the unmerged possible dupes are duplicates of other abandoned pages which have been merged by major contributors who are not watching those pages either; a little over 300 of the 2588 pages are being watched; the deletion impact pages are linked to this user's mysource and family or person pages that have been merged by JRM. I think we need opinions from Amelia, JRM, CTFrog and Susan Irish before this otherwise deletable gedcom is deleted.--sq 13:23, 24 February 2009 (EST)

Note: A trick you can use on the Search screen to see if any of these users' pages are also being watched by you is to add your user name to the keywords box (e.g., User:Dallan), check Exact matches only, and press Search. --Dallan 17:35, 22 November 2008 (EST)

Dallan, is there a way to tell which pages this user has that no one else is watching? I've merged a ton of stuff from all these users, and a lot of the pages I randomly click on are pages that have already been merged once and need it again -- that's not really evidence that this user has dumped on the system, and it doesn't tell us the damage when their gedcom is removed. What I really want to know is how many red links are going to show up on already "fixed" pages I'm watching, though I'm not sure how to tell that. At least a list of "will be deleted" would help us spot damage and watch pages we want to keep.--Amelia 13:01, 23 November 2008 (EST)
That's a good idea. In order to know if a tree should be deleted, it would help to know which pages that others are watching link to the pages that would be deleted. Let me work on this. I think we need to have a way to show this information before deleting any trees.--Dallan 18:28, 27 December 2008 (EST)

I have a question about your comment that "those other pages are edited to remove the links to the deleted pages". I've seen a few examples where a tree has been deleted, and the page that shared watchers contains red-links. A couple of examples are: Family:Charles Cloutier and Louise Morin (1), Person:Marie Martin (43), and Family:Jean Gagnon and Marguerite Cochon (1). This goes back to the conversation on WeRelate_talk:Watercooler#Duplicates_and_deleting_trees_.5B18_November_2008.5D.--JBS66 21:35, 23 November 2008 (EST)

JBS, in those cases, only the family page was being watched by multiple members. The person pages that got deleted were apparently only being watched by the one member that deleted the tree. Watching a family page does not protect the person pages that are attached to it. In other words, you have to watch the person pages within the family as well if you don't want them deleted. I think some enhancement to this is on Dallan's to-do list where watching a family will also automatically add you to the watch list of the individuals. But in the meantime, the deleting of a bunch of trees would undoubtedly riddle quite a few already merged families with these red links. --Ronni 01:15, 24 November 2008 (EST)
This is a bug. The links to a deleted page are supposed to be removed from the pages being kept rather than letting them turn red, but this is obviously not always occurring. I'll fix it next week. I don't want to create a bunch of red links when we delete trees; I'd rather remove the links to the deleted pages.--Dallan 18:28, 27 December 2008 (EST)
Thank-you for pointing out this bug! I found it and fixed it so the links will be removed in the future instead of turning red. For links that have already turned red, I'll have to write a program to go through and remove those links. Sometime later this year.--Dallan 11:43, 5 January 2009 (EST)
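The deletion rule described at the top of this topic (pages watched only by the owner of the deleted tree are removed, and links to them are removed from pages being kept rather than left red) can be sketched roughly as follows. This is a hedged illustration, not the site's implementation: `tree_deletion_impact` and the two dictionary shapes are hypothetical names chosen for the example.

```python
def tree_deletion_impact(watchers, links, user):
    """Work out what deleting one user's tree would do.

    watchers: dict mapping page title -> set of users watching it (hypothetical shape)
    links:    dict mapping page title -> set of titles it links to (hypothetical shape)
    user:     the user whose tree is being deleted

    Returns (doomed, edited): the set of pages that would be deleted, and a
    dict of kept pages -> sorted list of links that would be removed from them.
    """
    # A page is deleted only if nobody other than this user is watching it.
    doomed = {title for title, who in watchers.items() if who == {user}}

    edited = {}
    for page, targets in links.items():
        if page in doomed:
            continue  # the page itself is going away
        removed = targets & doomed
        if removed:
            # Remove the links instead of letting them turn red.
            edited[page] = sorted(removed)
    return doomed, edited
```

For the John, Mary and Jim example, calling this with Jim watched only by the departing user would report Jim's page as deleted and the John and Mary family page as edited, which is the kind of list Amelia asks for above.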

Member with two accounts [23 November 2008]

User:Terryavis and User:Terry A appear to be the same person. Each account has an uploaded gedcom, and the two appear to overlap each other. Can one of the accounts/trees be deleted, thereby getting rid of the duplicates? --Ronni 21:14, 22 November 2008 (EST)

Scratch this. I see Terry A is now working on their own merges. --Ronni 16:17, 23 November 2008 (EST)

After no-merge added, people still on compare page with box checked [24 April 2011]

Please see:

(mmm.... tried to make this a link without success. [formatted to allow linewrap-originally had no spaces])

[[Special:Compare&ns=Family&compare=John_Miller_and_Mary_Unknown_%281%29

%7CJohn_Miller_and_Mary_Unknown_%282%29
%7CJohn_Miller_and_Mary_Unknown_%2812%29
%7CJohn_Miller_and_Mary_Unknown_%2813%29
| John Miller & Mary Unknown Compare page]]

URL

You'll see that the right hand column describes a 19th century couple and all other columns describe a 17th century couple. I added "no merge" instructions to the 19th century family and each of the two individuals. But when I re-load, despite the visible instructions not to merge, their merge check boxes are still checked and I can't uncheck them. I fear going through to the next step while those are checked. Please advise.

-- jillaine 09:05, 23 November 2008 (EST)

Your page is checked because it is the lowest numbered page and hence the target. But it does say do not merge with the other pages, and if somebody heeds the warnings, they won't proceed. If they do proceed, step 2 will remove the nomerge pages from consideration and tell them there are no pages to merge.
You could also add the nomerge template to both the duplicate pages, though I am not sure that does more than generate more warnings, increasing their likelihood of being effective. --Jrich 10:19, 23 November 2008 (EST)
Jrich, Thanks! I was afraid to proceed, but with your encouragement, I proceeded, and my 19th century John Miller and Mary Unknown were not listed, but I could go ahead and merge the other three dupes. Cool. Thanks. jillaine 11:43, 23 November 2008 (EST)
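As a rough picture of what Jrich describes, here is a sketch of choosing the merge target from a set of compared pages. It makes several assumptions that are not confirmed anywhere on this page: that the number in parentheses at the end of the title is what "lowest numbered" refers to, that a nomerge page is simply dropped before the target is chosen, and that the target is then the lowest-numbered page that remains. The names `page_index` and `plan_merge` are made up for the example.

```python
import re


def page_index(title):
    """Return the number in parentheses at the end of a page title."""
    match = re.search(r"\((\d+)\)\s*$", title)
    if match is None:
        raise ValueError("no index in title: %r" % title)
    return int(match.group(1))


def plan_merge(titles, nomerge):
    """Pick a merge target and the pages to merge into it.

    titles:  page titles selected on the Compare screen
    nomerge: set of titles flagged "do not merge" (a simplification; assumed)

    Returns (target, others); (None, []) means there are no pages to merge.
    """
    candidates = [t for t in titles if t not in nomerge]
    if len(candidates) < 2:
        return None, []
    # Sort numerically so that (12) comes after (2), not before it.
    candidates.sort(key=page_index)
    return candidates[0], candidates[1:]
```

With the four John Miller and Mary Unknown pages and page (1) flagged, this sketch would leave pages (2), (12) and (13) to be merged, which matches what jillaine saw.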

Give me some more! [27 December 2008]

I've gotten through my Merge page, and there is no more. But I don't believe that. There must be more. Do I have to wait 24 hours for a new list?

In the meantime, I'll peruse those unedited GEDCOMs above and see what if any advice I have for you about deleting them.

-- jillaine 11:44, 23 November 2008 (EST)


It is possible for new pages to show up on your list the next morning. Here's what can happen: suppose that as part of merging two family pages together you merge the husbands: John Smith (2) into John Smith (1). It's possible that John Smith (1) now has two sets of parents: his original parents plus those from John Smith (2). If the two parent families didn't have the same spouse names in the title, say one was John Smith and Nancy Jones (1) and the other was John Smith and Nancy Unknown (1), then the two sets of parents would not have been detected as possible duplicates initially and wouldn't show up on your list. But now that they both appear as parents of John Smith, the system will detect that they are possible duplicates and will add them to your duplicates list overnight so that they show up the next day.

Thank-you for working on this!--Dallan 18:28, 27 December 2008 (EST)
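The John Smith scenario above can be traced with a short sketch. It is an illustration under stated assumptions, not the actual overnight job: `merge_person`, `possible_duplicate_parents` and the dictionary of parent families are hypothetical, and the only rule modelled is the one described, namely that a person with more than one parent family makes those families possible duplicates.

```python
def merge_person(parent_families, source, target):
    """Merge `source` into `target`, combining their parent families.

    parent_families: dict mapping person title -> set of parent-family titles
    Returns a new dict; the input is left unchanged.
    """
    merged = {person: set(fams) for person, fams in parent_families.items()}
    merged[target] = merged.get(target, set()) | merged.pop(source, set())
    return merged


def possible_duplicate_parents(parent_families):
    """List people who now have more than one parent family."""
    return {
        person: sorted(fams)
        for person, fams in parent_families.items()
        if len(fams) > 1
    }


before = {
    "John Smith (1)": {"John Smith and Nancy Jones (1)"},
    "John Smith (2)": {"John Smith and Nancy Unknown (1)"},
}
after = merge_person(before, "John Smith (2)", "John Smith (1)")
print(possible_duplicate_parents(after))
```

Before the merge neither person has two parent families, so nothing is flagged; after it, John Smith (1) carries both, which is why the pair only appears on the list the next day.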


Statistics Update [27 December 2008]

I just wanted to give everyone an update: Over 30,000 person pages and 12,000 family pages have been merged over the past two months. This is fantastic! By my estimation we have merged roughly 10-15% of the duplicates in the database already. And User:Npowell and I are currently working to integrate merge into the GEDCOM upload process so that this problem doesn't happen anymore in the future. In January I plan to re-invite people to participate and tell them how many of their pages have been merged already. We're getting there!--Dallan 18:44, 27 December 2008 (EST)


New special pages for trees [5 January 2009]

I'm going to announce this on the Watercooler, but I thought I would talk about it here as well since it deals with the issue of deciding whether or not to delete trees. I've added three new special pages for trees:

I'm hoping the last two can help us decide whether or not to delete trees that have a lot of duplicates. I'll add links to these pages for the trees listed in the "Unsubscribers with a lot of duplicates" topic.--Dallan 12:28, 5 January 2009 (EST)


Acknowledgements... Rants... ? [8 April 2009]

I decided to take on the "A" page. I just got through the Family:Male Andersons

I'm wiped out!

Turns out a chunk of the dupes were multiple gedcoms by the same person. After I got through the Andersons I went to said person's talk page and encouraged them to do their own deduping. But I don't think they're around anymore.

It was nice to see Dallan's recent stats over at the watercooler.

-- jillaine 14:14, 8 April 2009 (EDT)


Duplicate pages in doubt [16 April 2009]

Hello everyone,

Regarding possible duplicate pages that do not have enough information to confirm the duplication; I have today marked them as not a match, but wondering if I should do that. What I would prefer to do is to mark all of the pages for speedy delete since they usually have no sources to back up the data and seem to have just dropped the gedcom here and have not returned to work on duplicate pages or returned to work on anything else.--Beth 22:03, 14 April 2009 (EDT)


Beth,

If you look above, you'll see that there seemed to be enough "votes" or opinions to hold on to abandoned GEDCOMs. I interpret that decision to extend to abandoned pages -- i.e., we don't delete them. As someone elsewhere pointed out, unsourced data is not necessarily bad data; it may still be useful. On some other page, we've been discussing a way to automatically tag such pages with a wikipedia-like "this page needs attention" template.

-- jillaine 07:36, 15 April 2009 (EDT)


Yes, Jillaine, I am aware of the consensus, but every time I spend my volunteer time on pages that I feel should be deleted I come back to it. We have new users all of the time and the consensus may change one day; but my main question is regarding the possible duplicate pages that do not have enough information to tell if they are a match. Am I supposed to mark them not a match? For instance one page may have no dates or places so it is impossible for me to tell if these are the same people. --Beth 08:16, 15 April 2009 (EDT)


I tend to agree with Beth. I've been trying to devote some time to "community service" cleaning up duplicates. But I keep running into 2007 gedcoms that have never been touched; they are terrible and are really not worth the time it takes to untangle the mess. There are plenty of other copies of this information on the site, so nothing would be lost if the messiest of them were simply deleted. I'm not proposing deleting the whole gedcom, just individual pages. Martin Aucoin (7) seems to be his own father! Why spend 2 hours untangling this mess when deleting a few pages would fix it? --Jlanoux 14:37, 15 April 2009 (EDT)


Well, as a good Quaker, I won't block re-consensus! ;-) Especially on this topic. And especially as a fellow community service mergerer. But perhaps we should distinguish which we're talking about here:
  • Abandoned GEDCOMs that contain data that are NOT duplicative of data on other pages
  • Abandoned GEDCOMs that contain data that ARE duplicative of data on other pages
It seems like Beth is talking about the first; and the subsequent concurrer is talking about the second. Or did I misunderstand (it's been known to happen)? -- jillaine 17:55, 15 April 2009 (EDT)

Yes, Beth is talking about abandoned unsourced gedcoms whether duplicated or not if no other user is interested in the gedcom. Before deletion, users could watch pages or copy the trees for the gedcoms that are of interest to them. I would venture to say that one can probably find this data on Rootsweb or Ancestry, so I don't understand why we need the unsourced gedcoms on WeRelate. The problem with unsourced gedcoms is that one is not able to evaluate the evidence and make any judgments about the conclusions because the data is simply not presented. Also it is almost impossible to determine if a page is a duplicate when a page has no dates or locations. If you delete the individual pages that are in the duplicate merge list then that will most likely lead to gaps in the remaining gedcom. Maybe one could sort out the other duplicates first in the gedcom and then delete the problem page as Jlanoux suggested. I am not placing any blame on the users who uploaded their unsourced gedcoms and basically left the site. That is what some people have become accustomed to doing, and they were likely not clear on the way a Wiki works. I am saying that the Wiki does not lend itself very well to abandoned unsourced gedcoms. I guess one can dream that eventually someone will adopt the pages, but I believe that would be a rare event. --Beth 18:41, 15 April 2009 (EDT)


Beth is back again. Perfect example of why at least these pages should be dumped and possibly both the gedcoms. There are no sources on these pages. There are 2 separate users involved who dropped gedcoms off and left. Both of them have the mother dying before her marriage. Both of them have 2 different children born after the mother's death. Exactly what is one supposed to do with these errors? I detached one child and then discovered the marriage after death. These are the duplicates: Family:Rhys Ap Gruffydd and Gwenllian Madog (1) and the other family. And there has been some criticism of the label junk genealogy but for 2 different people to manage to have the same error with the parents, then they must have at some point copied some erroneous tree and never checked the data for errors. Most any good genie program will identify errors for you, so yes I label this junk. --Beth 21:34, 16 April 2009 (EDT)


The other day I got the weekly NEHGS newsletter, and I nearly fell out of my seat when I read an article entitled "Safeguarding your research against tech-savvy thieves". The chief advice was to publish your data without sources, and to make people contact you, so you can screen them. I guess the worry is that some websites collect donated family trees and then make money providing access to your research. I don't advocate this position, as anybody who has read my Talk page is aware. But it seemed pertinent to this discussion. --Jrich 21:56, 16 April 2009 (EDT)


Maybe off topic somewhat but I am not surprised in the least little bit. Many people decline to enter sources for various reasons. Some of the reasons listed are that people pirate their work and give them no credit, companies publish their trees and charge people, and related family members will contact them if there are no sources but might not if there are sources. As I stated before these researchers would not be Wiki inclined so they are of no concern on the WeRelate site. --Beth 22:15, 16 April 2009 (EDT)


May we archive this page? [16 April 2009]

Does anyone have any objections to my creating an archive for this page? It contains a lot of information on it that is no longer pertinent; and it seems like it would be more helpful to people who are doing duplicate review and merging if it were a bit more focused. If someone else wants to do it, that's fine; I just um... have a lot of time on my hands now, and would be willing to do it. Didn't want to do anything, though, without checking. -- jillaine 07:38, 15 April 2009 (EDT)

Jillaine, of course you can archive this page. I recommend archiving through 2008, but do a quick run through and make sure that all pertinent items are in Help:Merging pages before archiving. --Beth 22:33, 16 April 2009 (EDT)


Is this a bug in the merge program father is listed as son of father [15 April 2009]

Help; I cannot figure out how to merge this page. The father is listed as his own son twice, but I think this is only in the duplicate window. Is this a bug in the merge program? Check this out.

Family: Daniel Allen and Eliza Martin (1) Daniel Allen and Eliza Martin (2) Daniel Allen and Eliza Martin (3) Daniel Allen and Eliza Martin (4) --Beth 19:24, 15 April 2009 (EDT)


I have no knowledge of this family. I think this is a bug in the data. But it is an interesting mess. The father and the son have identical names and there are multiple instances of both of them.

Here's what I find (F=family page, D=Daniel E=Eliza, D4 = Daniel Allen (4))

  • F1 marries D4 and E4, has child D3
  • F2 marries D18 and E11
  • F3 marries D18 and E11
  • F4 marries D17 and E12, has child D18

Based on birthdates, D4 and D17 are the same person, and D3 and D18 are the same person. All versions of Eliza are one person (i.e., merge E4, E11, E12), and she is (see below) the wife of D4/D17, and the mother of D3/D18. F2 and F3 are simply erroneous and should be deleted, so that Eliza is only married to the merge of D4 and D17 (i.e., merge F1 and F4).

Looking for a little guidance, I quickly found Find a Grave. Based on this one source, not an exhaustive search, Daniel was a Mormon and had multiple wives, presumably more than just one at one time, and Eliza is shown as his wife. He did have a son with the same name but no wife is shown for the son on this page.

Don't know how reliable this source is so my analysis could be wrong, but unless the creator of this data wants to step in and clean it up (probably not, as his last activity was Feb 2008), this is how I would clean it up. --Jrich 20:44, 15 April 2009 (EDT)


Thanks, it has been a long day, don't think I will tackle this one tonight. --Beth 21:02, 15 April 2009 (EDT)

Watched pages [17 April 2009]

Hey, I don't want to have all of the pages that I merge added to my watch list.--Beth 20:16, 17 April 2009 (EDT)

Beth, you can easily avoid that. Go to My relate, then Preferences, then Editing; there you can switch it off. Leo

Thanks Leo, it has been so long since I looked at the Preferences, I did not remember that feature. I have reset my preferences. Much appreciated.--Beth 20:40, 17 April 2009 (EDT)

Are there really no more A's to de-dupe? [23 April 2009]

I just tried to go into the "A" list of dupes, and it's empty. Something must be wrong, right? --- jillaine 16:30, 21 April 2009 (EDT)

Hi Jillaine, nothing is wrong. I finished the A list. I plan to work on the B list next. I will complete a letter every 2 weeks or so. --Beth 10:42, 22 April 2009 (EDT)


Woo hoo!! Way to go, Beth! I think anyone who cleans out a letter's page should get a prize! I also did a chunk of work on the A's a week or so ago, and had just started on the B's. Should I continue such, or should I work on some other letter? Should we place our initials by letter page links so we know who's working on what? Would that be helpful? -- jillaine 12:40, 22 April 2009 (EDT)
You can continue on the B's; if we all work on the B's that one will soon be empty and we can move on to the C's. However there is no system, feel free to work on any letter.--Beth 16:38, 22 April 2009 (EDT)

There's something immensely satisfying about a blank "duplicate review report" page. Will, do, Beth! jillaine 19:30, 22 April 2009 (EDT)

A small step with many more to go, but having done a few merges myself when I want a change of pace, I know how much hard work this represents. Whew! --Jrich 12:43, 23 April 2009 (EDT)


Regarding merging and need ability to edit duplicates page [22 April 2009]

It would be helpful if we could add a notation after the duplicates names on the duplicate page. I sometimes post a message on a user's page regarding duplicates. I don't expect a response and usually don't receive one, but I leave these pages until I am at the end of the letter designation and then go back to them and make a decision without the user's input. This way if the user complains later I can reference my earlier post when they were not responsive. Anyway, it would be nice to be able to post: sent user a message on such and such a date after the duplicate listing. --Beth 21:02, 22 April 2009 (EDT)


You're too polite; I just merge. 95% of the time, I don't hear anything from the person; the other 5% (and that's generous), my merges inspire the person to do their own merging/clean-up. jillaine 23:01, 22 April 2009 (EDT)

History not restored with merge [13 June 2009]

May we have the history of the pages left intact with the merges? When someone asks me a question regarding the merge, I find it necessary to unmerge and review and respond and then remerge, unless I have missed something.--Beth 18:55, 19 May 2009 (EDT)

To see the history of the merged page you have to navigate to the page that got redirected in the merge. Here are two ways to do it:
  • In the "Review merge" screen, click on the merged page that you want to see the history for -- this takes you to the page before it was redirected -- then click on the "History" link.
  • In the merge target, click on the "More" menu, then on "What links here." The merged page should show up as a redirect in this list. Click on the merged page to navigate to the page, then click on the "History" link.
I hope this helps.--Dallan 12:11, 13 June 2009 (EDT)

Different notification requested to send to user when pages are merged [13 June 2009]

Could you change my notification when I merge a user's pages? The user assumes that I am related. Perhaps the message could direct the user to the watchers of the page for questions? --Beth 18:58, 19 May 2009 (EDT)

I'll add the ability for you to add a comment when you do the merge.--Dallan 12:11, 13 June 2009 (EDT)
I want to second Beth's request. AND I'll use the merge template more frequently. (I've only been using it on family pages where MORE than two pages were merged.) People have been leaving messages for me about why I "changed," for example, Bailey to Baily, and really such conversations should happen on the talk page of the person page in question, not the merger's page. Thanks. jillaine 15:56, 13 June 2009 (EDT)

Finished the B's but need the Semi-protected removed from the rest of the pages [13 June 2009]

There are 2 semi-protected pages on the B's that remain; but there is no reason for them to be protected. Please remove this so I can do an automatic merge; unless there is a volunteer for the manual merge; they will just remain on this list. --Beth 19:01, 19 May 2009 (EDT)

As an admin, you should be able to merge semi-protected pages. Is this not working?--Dallan 12:11, 13 June 2009 (EDT)

User:Ccbreland has many internal dupes [13 June 2009]

I've left a message for User:Ccbreland about internal dupes in their GEDCOM. The information is nicely sourced, but there appears to be a lot of duplication. I'm not holding my breath that they'll de-dupe themselves (which I've requested) since they've done nothing here since the November 2008 upload of their GEDCOM. Just want to give y'all a heads-up. jillaine 11:33, 12 June 2009 (EDT)

I can remove the GEDCOM if you all would prefer.--Dallan 12:11, 13 June 2009 (EDT)
This one is nicely sourced, so I don't really want to see its content go. I was just posting this message here for other de-dupers, so they know what I've done. Thanks. jillaine 15:58, 13 June 2009 (EDT)

User:Crownedraven also has multiple dupes [19 June 2009]

Left a message for this person, too. (Last contribution made December 2008.) In her case, I found at least one instance of THREE identical families in the same gedcom. jillaine 11:44, 12 June 2009 (EDT)


I've seen a lot of internal dups in various families that look like somebody updated their data as they discovered new facts. So one version was the original, a birthdate added to one child and there's a second, duplicated version of the whole family, the marriage location pinpointed, and there's a third version. Whether they did this duplication in their own data or on WeRelate by reloading I never investigated, but I suspect most of this type of activity pre-dated the release of the new GEDCOM upload program. --Jrich 12:13, 12 June 2009 (EDT)

Jrich, why would someone create a new page for each new update? Why wouldn't they simply edit the existing page? And at least at the time this person uploaded, WeRelate wasn't allowing uploaded "updates" to have the same name, right, Dallan? This person, from their talk page, only has one GEDCOM/Tree. jillaine 16:01, 13 June 2009 (EDT)
I have no idea. Why else would one user be the only watcher on two or three versions of the same family with some differences but very few? I was just trying to figure out what was happening and my guess was they wanted to update WeRelate every time their data at home changed, so they re-uploaded the family, perhaps their whole tree, each time they discovered a new change. For example: [4] or [5] --Jrich 17:35, 13 June 2009 (EDT)
We have a check for when someone uploads a GEDCOM that appears to contain many of the same people as an existing upload. That's why you have to delete your existing tree before you can re-upload your GEDCOM. In the examples you listed, the duplicates were created on the same day within a few minutes of each other. This happens when the person has duplicates within their GEDCOM - multiple people in the GEDCOM with exactly or nearly the same information. This can happen if they import someone else's GEDCOM into theirs and don't merge all of the duplicates. We have a warning for this now in the new GEDCOM upload process.--Dallan 20:49, 19 June 2009 (EDT)
Same goes for this GEDCOM; I can remove it if you would prefer.--Dallan 12:11, 13 June 2009 (EDT)
Let me take a closer look at this one, Dallan, and see how strong or weak it is. Thanks. jillaine 16:01, 13 June 2009 (EDT)
It's not bad; it's not great, but it's better than a lot of others. Let's keep. jillaine 16:05, 13 June 2009 (EDT)

Advice on HannaHaney GEDCOM? [19 June 2009]

Actually, I'm not 100% certain there IS a GEDCOM here. I was deduping the "D" page and I've started checking to see who is watching the page and, as you're seeing above, finding that a number of dupes are in a single gedcom.

So was the case with Family:Alexander_Danley_and_Sarah_Hanna_(2) and Family:Alexander_Danley_and_Sarah_Hanna_(1).

Only one person User:RandyHanna is watching both of these pages. But you'll see that if you go to their page, there is no Tree on it, although if you look at their TALK page, you see that in April 2007 (!!!), a GEDCOM was uploaded successfully: HannaHaney.ged

It appears Randy never launched the family tree viewer. In fact, may have done nothing after uploading the GEDCOM.

There are no sources that I can see.

Shall we spend time de-duping this gedcom or should we ask Dallan to toss it?

-- jillaine 11:51, 12 June 2009 (EDT)


Dallan, if no one else objects, I'd recommend tossing this one. There's no source data that I can see. Isn't there a way to do one of those analyses to see how many other people watch these pages? Or did I dream that? (I've been away a couple months...) jillaine 16:09, 13 June 2009 (EDT)
NEVER MIND. I see you posted that link below. Thanks, Dallan. That's what I was looking for. jillaine 16:10, 13 June 2009 (EDT)
Dallan, I've run the deletion impact program on this GEDCOM. IMHO, the impact is low and as far as I can tell, all internal to the GEDCOM in question. My vote is to get rid of this GEDCOM. jillaine 16:18, 13 June 2009 (EDT)
Fine with me. I don't know what message is sent when a gedcom is deleted, but maybe one can add, something to the effect, that the person is welcome to reupload their gedcom under the new gedcom process. --Beth 19:28, 13 June 2009 (EDT)
I'll leave User:RandyHanna a message on his talk page that there are a lot of internal duplicates in his GEDCOM and that he's invited to merge them or we'll remove their GEDCOM. If he doesn't respond by Monday I'll remove it.--Dallan 20:57, 19 June 2009 (EDT)

User:Rlevans Duplicate GEDCOMs [19 June 2009]

This one is kinda (?) interesting. In this case TWO identical GEDCOMs (by same name) uploaded in August 2007. The second one was even held for review because the system recognized it might be a dupe, but then it completed the upload anyway. See User_talk:Rlevans.

It appears to be a real mess. And very little sourced. Can we agree to ask for this one to be deleted?

-- jillaine 12:11, 12 June 2009 (EDT)

Did the deletion impact program on this; almost all impact is internal to the GEDCOM. My vote: Toss. jillaine 16:21, 13 June 2009 (EDT)

No objections. --Beth 19:29, 13 June 2009 (EDT)

I left a similar message on this person's talk page. I'll remove the tree on Monday if they don't respond.--Dallan 21:43, 19 June 2009 (EDT)

User:Jolayne multiple dupes in LARGE GEDCOM [19 June 2009]

User:Jolayne has an 18k+ GEDCOM with multiple dupes, including:

  1. Family:James Davis and Bethiah Leach (1)
  2. Family:James Davis and Bethiah Leach (2)
  3. Family:James Davis and Bethiah Leach (3)

I started to post to her Talk page, but she hasn't been around for a LONG time and has never responded to messages posted to her TALK page. Advice? It's a huge GEDCOM.

jillaine 12:42, 12 June 2009 (EDT)

No objections. --Beth 19:31, 13 June 2009 (EDT)

Beth, are you sure? Do we know anything about this user? The GEDCOM was huge by WeRelate standards; seems like someone must have been familiar with her work. Also, she has a lot of watchers besides herself, meaning that some percentage of her pages are intertwined with the interests of others. I realize that when her GEDCOM is deleted, it won't delete those pages that have watchers other than herself, right? It's the pages linked to those pages that I'm worried about. jillaine 23:59, 13 June 2009 (EDT)
I don't know anything about this user. I have merged a bunch of her pages; so some of the many watchers could be because of the merges. Leaving in the morning on a short vacation. We have so many duplicate pages and just found out about the gedcom deletion impact today. Would like to evaluate more of the gedcoms upon my return. I thought perhaps we could select more for deletion and attach a note with the deletion to encourage them to reupload under the new gedcom upload. This way the user would be required to merge their pages during the upload. --Beth 00:54, 14 June 2009 (EDT)
I'm inclined to not delete this one. The deletion impact page lists 1750 pages that would lose links if we deleted this GEDCOM. On Special:ShowDuplicates/Jolayne it looks like there are roughly 500 duplicates. I could ask one of my children to start doing the easy merges. They wouldn't be able to do anything too involved though.--Dallan 22:33, 19 June 2009 (EDT)
I'm going to leave them a message to see if maybe they'll get involved in the merging. It's worth a try.--Dallan 22:35, 19 June 2009 (EDT)

A gazillion uploaded GEDCOMs? :-( [13 June 2009]

Please review User talk:Dwkiefer Notice the number of uploaded GEDCOMs. Notice how many of them are dupes to each other (by name). Notice that the person hasn't been online since.

Please advise... This is really frustrating.

-- Merging Jillaine jillaine 12:43, 12 June 2009 (EDT)

Just in case you're not aware of it, you can go to Special:TreeDeletionImpact, enter a user name, press "Go", then select one of their trees and press "Go" again to see the impact of deleting their tree -- what pages link to pages that would be deleted if the tree were deleted. If a tree owner doesn't respond within a reasonable amount of time to a request to get involved in merging duplicates within their tree, and the deletion impact looks like it would be relatively minor, I don't mind if the people doing the merging prefer to have the tree removed rather than to merge it.--Dallan 12:11, 13 June 2009 (EDT)
Harumph. This person's stuff is a mix of good and not so good; and a number of people are watching various pages. Let's leave his/hers alone for now. Thanks, Dallan. jillaine 16:41, 13 June 2009 (EDT)

Meaning of ^^ after a name [19 June 2009]

What does Thomas ^^ designate? --Beth 22:06, 19 June 2009 (EDT)


It is probably a method within the constraints of that person's genealogy software to designate direct ancestors, or something similar. I use an apostrophe to do that in mine, and it is one of the reasons I do input by hand rather than GEDCOMs, because I forget to remove all this personal stuff that doesn't belong in a shared, community tree. I have seen names with generation numbers and such. I usually delete them. I figure if it doesn't mean anything to me, it probably doesn't mean much to others, either. A little harsh perhaps, but I think it is more likely that some contributor mistakes WeRelate for another one of those GEDCOM graveyards. --Jrich 22:24, 19 June 2009 (EDT)

Thanks, I thought it might be some popular usage not in my knowledge base. --Beth 22:49, 19 June 2009 (EDT)

User:Siusaidh 22k - delete? [28 June 2009]

Uploaded in late 2007. No activity since early 2008. Very early. Abandoned GEDCOM. Over 500 pages link to it, but what I can't tell from the Special tool is: is anyone WATCHING these pages? Are they only linked to from WITHIN the GEDCOM?

Anyway, other opinions? Can we delete?

-- jillaine 18:13, 26 June 2009 (EDT) (working on the Ms)

I recognize this user, which I generally consider a reason not to delete. He/She has a lot of early New England work.--Amelia 23:54, 26 June 2009 (EDT)
Thanks, Amelia. I'll spend some time on his/her tree, then, de-duping. I, too, am working on early New England. jillaine 11:45, 27 June 2009 (EDT)
I was also going to suggest not to delete this GEDCOM. The 500+ pages listed by the deletion impact tool are mostly pages that other people are watching. The tool only lists pages that link to to-be-deleted pages that aren't themselves going to be deleted, either because they're not in Siusaidh's tree or because they have multiple watchers. It doesn't list pages in the tree that are being watched only by Siusaidh. I'll leave a message for Siusaidh telling them about the project and asking for their help to de-dupe.--Dallan 17:05, 27 June 2009 (EDT)
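
Dallan's description of what the deletion impact tool lists can be sketched roughly as follows. This is only an illustration in Python, not the code behind Special:TreeDeletionImpact: the function name `deletion_impact`, the data structures, and the sample page names are hypothetical, and it assumes a page is removed only when the tree owner is its sole watcher.

```python
def deletion_impact(tree_pages, watchers, links, owner):
    """Sketch of the deletion-impact idea described in this thread.

    tree_pages: set of page titles in the owner's tree (hypothetical input)
    watchers:   page title -> set of user names watching it
    links:      page title -> list of page titles it links to
    owner:      user name of the tree owner

    Returns (doomed, impacted): pages that would be deleted, and surviving
    pages that link to at least one deleted page.
    """
    # Assumption: only pages watched solely by the owner get deleted.
    doomed = {p for p in tree_pages if watchers.get(p, set()) == {owner}}
    impacted = {
        src
        for src, targets in links.items()
        if src not in doomed and doomed.intersection(targets)
    }
    return doomed, sorted(impacted)


# Made-up sample data: pages A, B, C are in the tree; D belongs to someone else.
tree_pages = {"A", "B", "C"}
watchers = {
    "A": {"Siusaidh"},
    "B": {"Siusaidh", "Amelia"},
    "C": {"Siusaidh"},
    "D": {"Amelia"},
}
links = {"A": ["C"], "B": ["A"], "C": [], "D": ["C"]}

doomed, impacted = deletion_impact(tree_pages, watchers, links, "Siusaidh")
print(sorted(doomed))
print(impacted)
```

In this sample, A links to C but is not reported, because A itself would be deleted; only the surviving pages B and D count toward the impact.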
Dallan, I've been working on their duplicates, and have received NO messages from them which makes me wonder if they're even accepting messages from WeRelate. I'm almost halfway through their duplicate list. Thanks for further explanation about how the dupe-impact tool works. Makes me feel better. jillaine 07:37, 28 June 2009 (EDT)

I've gotten through as much as I can of this user's initial dupes. I'll check back in a couple of days to see what else shows up. (I've found that the Show Duplicates page often comes up with new dupes after the initial page has gone through.)

I'd say I de-duped about 90% of the initial dupes. The remainder I could not tell. I also found some that were clearly NOT dupes.

jillaine 15:26, 28 June 2009 (EDT)


Delete User:Kama gedcom please? [26 June 2009]

This user uploaded in Jan 2008 and has done nothing since. Many internal dupes. Deletion impact is minimal.

Please delete.

-- jillaine 20:12, 26 June 2009 (EDT)

I spot-checked some of the pages that would be deleted if this tree were deleted. I came across Family:William Taft and Helen Herron (1). We want to keep this family page for President Taft. Since Kama is the only watcher, it would be deleted if we deleted Kama's tree. Also, a lot of the pages in the Special:ShowDuplicates/Kama list are scandinavian. Those pesky scandinavians (I'm half Norwegian) have a very shallow name pool: everyone's named Ole, Lars, Maren, or Anna :-). Anyway, I spot-checked some of the families listed in ShowDuplicates and most didn't look like duplicates -- they just had the same names. I'll ask User:Taylor to work on merging the ones that are duplicates.--Dallan 17:50, 27 June 2009 (EDT)

User Redcowboyhat - delete [26 June 2009]

Ditto. Internal dupes. Minimal impact. no activity for many months. delete please? -- jillaine 20:17, 26 June 2009 (EDT)

I checked Special:ShowDuplicates/Redcowboyhat and it looks like there aren't too many duplicates. I'll ask User:Taylor to clean these up.--Dallan 17:50, 27 June 2009 (EDT)

Delete user GEDCOM: Mattammatt [26 June 2009]

ditto.

-- jillaine 20:47, 26 June 2009 (EDT)

I checked Special:ShowDuplicates/Mattammatt and it looks like there aren't too many duplicates. I'll ask User:Taylor to clean these up.--Dallan 17:50, 27 June 2009 (EDT)



Happyacres [26 June 2009]

Delete please. Minimal deletion impact. Mucho internal dupes. jillaine 20:56, 26 June 2009 (EDT)

I checked Special:ShowDuplicates/Happyacres and although there are a lot of duplicates, most of them don't look like they're internal to the GEDCOM; they're duplicates with people in other trees. (When a GEDCOM has internal duplicates the ShowDuplicates page will list a family with one index number followed by the same family title with the next-higher number.) I'll ask Taylor to clean them up.--Dallan 17:50, 27 June 2009 (EDT)
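
The rule of thumb in the parenthetical above (same family title, next-higher index number) could be checked with a small sketch like this. It is an illustration in Python only, not part of WeRelate: the function name `looks_internal` and the regular expression are assumptions, and they rely on page titles ending in an index such as "(1)".

```python
import re

# Assumed title shape: "<names> (<index>)", e.g. "James Davis and Bethiah Leach (1)".
TITLE_RE = re.compile(r"^(?P<name>.+) \((?P<index>\d+)\)$")


def looks_internal(title_a, title_b):
    """Return True when two titles share a name and have adjacent index numbers."""
    a = TITLE_RE.match(title_a)
    b = TITLE_RE.match(title_b)
    if not a or not b:
        return False
    same_name = a["name"] == b["name"]
    adjacent = abs(int(a["index"]) - int(b["index"])) == 1
    return same_name and adjacent


print(looks_internal("James Davis and Bethiah Leach (1)",
                     "James Davis and Bethiah Leach (2)"))
print(looks_internal("James Davis and Bethiah Leach (1)",
                     "James Davis and Bethiah Leach (3)"))
```

Adjacent numbers are only a hint that the pages came from the same upload; each pair still has to be compared by hand before merging.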

Thank you, Dallan! [28 June 2009]

For:

  1. "Do not merge" button on "Compare pages" screen
  2. making me a "trusted merger" so I can merge semi-protected pages.
  3. for explaining how ShowDuplicates (for another user) and the deletion-impact tool work; I was working from an inaccurate understanding of this (sorry for the resulting posts above)

Makes things a LOT easier...

jillaine 07:27, 28 June 2009 (EDT)


Another thought on uploaded GEDCOMs [29 June 2009]

(Please move this if it belongs elsewhere; I'm thinking about it because of the de-duping I've been doing.)

If there is a copyright message that indicates people should not contribute information that they didn't write or for which they do not have copyright, doesn't that include that they should NOT be uploading GEDCOMs that consist of a series of GEDCOMs or other data simply copied from others' imported GEDCOMs? Could we perhaps add to the GEDCOM upload section a reminder about copyright and, if it's already there, add a parenthetical "(this includes work of others downloaded -- as trees or gedcoms -- from web sites such as ancestry.com or familysearch.org)"? I'm just seeing SO many "sources" as Public Trees, Ancestral Files, OneworldTrees, etc., it's making me sick!

This month's rant,

jillaine 13:43, 28 June 2009 (EDT)

Well, based upon what I've seen I think prohibiting GEDCOMs that contained information from other GEDCOMs would limit the number of uploaded GEDCOMs severely :-). I think that one of our long-term missions needs to be to help people do primary research. Not that we'll try to import a lot of original records: Ancestry and others do a fine job with that. But we should direct people to the appropriate sources (free or paid) where they can look up original records based upon what they've entered about their ancestors.
BTW, excluding filename-only GEDCOM sources during the upload process is high on my ToDo list; then at least we won't see sources containing only "Someone'sFamilyTree.GED".--Dallan 14:32, 29 June 2009 (EDT)
okay. i understand.
re: your BTW, above: kisses kisses kisses and hugs! THANKS! jillaine 16:59, 29 June 2009 (EDT)

I am SO bored!!!! [13 July 2009]

This is a brutally boring project. I have GOT to do something different for awhile... like go paint an oil portrait of someone or something. I'm taking a break... jillaine 17:09, 10 July 2009 (EDT)

Good idea. In fact if you wait for about a month my son Taylor will have reviewed and merged most of the "easy" ones. That will just leave the harder ones, and at least they're not as boring! :-) --Dallan 09:41, 13 July 2009 (EDT)

Thanks for understanding and your support, Dallan. As you've seen, I'm switching my attention to help and FAQ language for awhile. jillaine 10:12, 13 July 2009 (EDT)

Please delete GEDCOM of User:Donald_Pate [17 July 2009]

For background, read:

WeRelate_talk:Junk_Genealogy#How_about_a_.22death_row.22_status.3F_.5B14_July_2009.5D

and:

Person_talk:Mary_Jones_(380)

and see this as a really STRONG example for why this GEDCOM is so bad:

[6]

jillaine 07:22, 15 July 2009 (EDT)


I'll leave him a message and remove the tree.--Dallan 11:20, 16 July 2009 (EDT)


Bill (Q) pointed out that this user is-- probably as a result of your message-- de-duping his GEDCOM. Look at his contributions. But I'm not sure that merging alone is going to fix the problems with his GEDCOM. He's got a much bigger problem than dupes. FYI... -- jillaine 15:34, 16 July 2009 (EDT)
It looks like the recent edits are just a result of his tree being deleted -- family members are being removed from families that are being watched by other people.--Dallan 12:44, 17 July 2009 (EDT)

New duplicates list

moved from the primary page
Yikes. that's still a long list. I'm almost done with my source renaming review; Solveig, if you want to assign me a set, I'll help out. Or you could work from the top, and I could work from the bottom... jillaine 11:21, 16 September 2009 (EDT)
The potential duplicates that got missed in the original lists are cases where an individual has multiple sets of parents (generally due to two previously-separate individuals getting merged), or a family page that lists multiple husbands or wives (not many of these). I'd be glad for the help. Dallan has just inserted headers in various places in the list.
Would you like to start with the "Multiple Spouses A" header (near the bottom) and I'll start with the "Multiple Parents A" header? thanks!  :) --sq 14:47, 16 September 2009 (EDT)
Wow! I'll help out too. I was doing dupes while drinking my morning tea. After I work on any names I may recognize, I'll do all the unknowns and no names. --Judy (jlanoux) 14:03, 16 September 2009 (EDT)
Thank you! I really appreciate it. I've created bullet items above for different sections of the list, about 700 matches in each section. Would you mind signing up for a section?--sq 14:47, 16 September 2009 (EDT)

Problematic Pages

moved from the primary page

Solveig, there are some real messes buried in this last list. I spent some time pulling apart some nasty knots, but many of them require a fresher mind (i.e., I should do this in the morning). My sense is that previous de-dupes may have created some errors. Are you or others seeing this too? jillaine 16:31, 17 September 2009 (EDT)

Yes Jillaine, some of them are real messes; that is why I left them. Seemed to me that more merging would just make a bigger mess. --Beth 18:47, 17 September 2009 (EDT)
Yes, I think the general rule should be: when in doubt, don't merge. I don't think we want to spend a lot of time on each family. If it's an obvious duplicate, then merge, but otherwise leave it alone and move on. The contributors who know more about the specific family can come in later and review the difficult cases if they want to.--Dallan 19:54, 17 September 2009 (EDT)

Doozie: George Wright (8) et al [18 September 2009]

Person:George_Wright_(8)


He is really messed up. He's a child to himself; he married his mother; etc etc etc.

When I look at the history, the two different merges -- each of which involved 4-6 dupes each -- look right in and of themselves, but somehow merge #1 and merge #2 merged together, and I can't figure out how to detangle THAT. I'm leaving it alone. But it's a mess. I just wonder how many other messes we created in Phase I. jillaine 23:18, 17 September 2009 (EDT)

Jillaine, this seems to be messed up by one merge, but the merges made sense at the time. I just can't find the merge that messed it up to fix up. Maybe like you say not so late at night. This one seems doable. --Beth 00:31, 18 September 2009 (EDT)
By all means, Beth... go for it! ;-) (And glad to know I'm not the only one who could not find the merge that messed it up.) jillaine 08:14, 18 September 2009 (EDT)

Doozie: Hugh Cotton (1) et al [18 September 2009]

Here's another one that I really don't want to touch. Father and son mixed up. They have the same names. And in the case of Person:Hugh_Cotton_(1), he's married to his mother. 14th century researchers requested to clean this up, puhleeeze?! jillaine 17:40, 18 September 2009 (EDT)


I wouldn't worry too much about it. This is all over Ancestry sans sources, and it all says she died at the age of six and gave birth 11 years posthumously, so if you are wrong, we haven't lost much.--Scot 19:26, 18 September 2009 (EDT)


Found a live person! [19 September 2009]

Well this is interesting... in de-duping, I found someone who is apparently getting around the Living rules:

Family:James Britt and Shirley Acker (1)

What caught my eye was a couple, each of whom was born about 1948/9 and died 1950, but they married in 1971!

Advice?

jillaine 20:42, 18 September 2009 (EDT)

We have a rule on this enforced by the software, and I think it's grounds to delete them. I also delete any "Living" and "Unnamed" living people too because they are useless pages that shouldn't be here either, but that's not based on an official policy as far as I know.--Amelia 17:07, 19 September 2009 (EDT)
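The inconsistency jillaine spotted (a death date earlier than the marriage date) is the kind of thing a simple check could flag. The sketch below is illustrative only; the `Spouse` class and `married_after_death` function are hypothetical and do not reflect how the WeRelate software enforces the Living rule.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Spouse:
    # Hypothetical record; years only, for simplicity.
    name: str
    birth_year: Optional[int]
    death_year: Optional[int]


def married_after_death(spouses: List[Spouse], marriage_year: Optional[int]) -> List[str]:
    """Return names of spouses whose recorded death year precedes the marriage year."""
    if marriage_year is None:
        return []
    return [
        s.name
        for s in spouses
        if s.death_year is not None and s.death_year < marriage_year
    ]
```

For a couple recorded as dying in 1950 but marrying in 1971, both names would be returned, suggesting the death dates were entered only to get past the living-person check.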

Mis-use of "Alternative" Spouse? [21 September 2009]

A lot of these "second-round" de-dupes have "alternative spouses" for the same Family page, but the alternative spouse has a completely different name.

Is it possible that there's a glitch somewhere that is reading second spouses as alternative spouses?

Just curious. There just seems to be too many of them.

-- jillaine 21:01, 18 September 2009 (EDT)

Not sure what you are asking. If somebody has a second spouse, that should be a separate Family page, i.e., there should be two marriages listed on their Person page. So what other purpose for "Alternative spouse" is there than listing a different person who could possibly be the spouse of the indicated marriage? In other words, once there is a spouse entered, why does the page allow you to add yet another, unless this is specifically what is intended to happen? I.e., two sisters where you don't know which is the wife, or two women with the same given name but different surnames? I asked this question a long time ago on the watercooler but apparently too generalized for people to make sense of it. For example, Person:John Libby (4). "Of his first wife nothing is known", except that some sources say her name is Judith and other says Sarah. If you don't list Judith as an alternate, will some new GEDCOM upload now create a duplicate because they don't see a John and Judith to match to? Likewise if you don't list Sarah as an alternate, will some new GEDCOM upload now create a duplicate? But you don't want to create separate Family pages for both because there was only one marriage. So is the problem here that this situation should not be flagged as a possible duplicate, not that it is a misuse of the alternate spouse feature? --Jrich 12:11, 19 September 2009 (EDT)
The gedcom upload program looks for potential family matches based upon the primary name, birthdate, and deathdate of all spouses and also the marriage date and place. So if you list two spouses on the family page, one named Judith and the other named Sarah, then GEDCOMs being uploaded with one spouse or the other should match the family page.--Dallan 11:27, 21 September 2009 (EDT)
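To illustrate why listing both candidate wives helps matching, here is a rough sketch that compares names only. It is an assumption-laden simplification: the real matcher also weighs birth, death, and marriage dates and places, and the function names here are hypothetical.

```python
def _norm(name: str) -> str:
    """Normalize a name for comparison (assumed: case- and whitespace-insensitive)."""
    return " ".join(name.lower().split())


def family_matches(page_husbands, page_wives, gedcom_husband, gedcom_wife) -> bool:
    """Return True if the uploaded couple matches any listed husband AND any listed wife."""
    husbands = {_norm(h) for h in page_husbands}
    wives = {_norm(w) for w in page_wives}
    return _norm(gedcom_husband) in husbands and _norm(gedcom_wife) in wives
```

With a family page listing wives "Judith Unknown" and "Sarah Unknown", an upload containing either John and Judith or John and Sarah would match the one page, so neither upload would create a duplicate family.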
I understand the distinction you're making and how alternative is supposed to be used. What I'm saying is that I'm getting a sense that alternative spouse is NOT being used in the way it's supposed to be. That said, your other point is well taken-- perhaps the pages are correct and there really are that many cases where an alternative spouse is offered (number just feels too high to me), and that the dupes program shouldn't be marking these, then, as dupes. I'll pay closer attention when I see these and try to determine if they really are "alternatives". I'll start making a list of the ones I'm worried about. jillaine 13:38, 19 September 2009 (EDT)
I'm not sure if this is exactly what you're talking about, but I'm seeing a lot of cases where the "dupes" are two versions of probably the same marriage, with variations on the wife -- Sarah v. Judith (as above), or Elizabeth Smith v. Elizabeth Unknown, or Elizabeth Smith v. Elizabeth Jones. The problem is that although technically they shouldn't have different family pages, we don't know which one is correct or better documented in most cases, so how do we know how the one family page should be named? Do we want to merge these and change the wife's name to "unknown" in the family title? I think that will hinder matching. But while leaving two separate pages is less correct than it could be, I don't think we ignorant volunteers should be doing these merges. Someone who has looked at at least some source should pick one. In the meantime, maybe a template to mark the pages to ask watchers to do something?--Amelia 17:12, 19 September 2009 (EDT)

I can imagine three cases for having multiple spouses listed in a family:

  1. The person creating the family page didn't understand that they were supposed to create separate family pages for separate marriages.
  2. Two family pages were merged, but the merger didn't merge the spouse pages so both spouses are added to the family page.
  3. The person creating the page wasn't sure of the spouse so listed multiple.

I'd just pass over the first and third cases. Merging the obvious duplicates is already a big job. I think we ought to leave the other cases to the people who contributed the information.--Dallan 11:13, 21 September 2009 (EDT)


Doozie: John (de) Gaynesford [21 September 2009]

I'm having difficulty pulling this Gordian knot apart:

Something's awry. It's particularly confusing because there is a son John who marries a woman also named Margaret.

Thanks for any help. jillaine 15:47, 20 September 2009 (EDT)

Perhaps this would help: [Greater Medieval Houses of England and Wales, 1300-1500]


I found a source which I added to the page. It doesn't look to me like there is a problem, just 4 generations of Johns: John,I married Magery de la Poyle (Pole); John,II m Christina unknown; John,III m unknown; John,IV m Margaret unknown. I don't have an idea of the source quality, but this line is all over Ancestry, unsourced, e.g. [[7]].


Can I just say? [12 October 2009]

This is mind-numbing. Time for a break. (How we doin' Dallan?) jillaine 19:37, 12 October 2009 (EDT)

Really good! We're 25% of the way through (roughly 1500 out of 6000 reviewed). I realize that it's slow going. But it's not a race. Also, we can be more conservative with these -- when in doubt, leave it for the original contributors to decide. And when you leave it for the original contributors to decide, you don't need to (and shouldn't) put a nomerge template on the pages. That will speed things up a bit hopefully.--Dallan 15:05, 15 October 2009 (EDT)

This round surfacing more bad gedcoms [15 October 2009]

Not sure how this gedcom missed the last review of "toss-or-keep" or maybe it did and it managed to be kept. I'm speaking of the default tree for User:Riggsy23@gmail.com. The user hasn't been on for almost a year, and appears not to be replying to a July 2009 talk post by another user.

As we're going through the second round (well, last?) of deduping, this gedcom is creating headaches in part because it has TONS of pages without any dates at all. Can we delete it please? And can we perhaps be a bit more liberal in having such requests fulfilled as the last round of dedupers (I'm not alone on this) are finding really nasty merge candidates? Just seems like we've got more headaches than necessary, and what we're spending a lot of time trying to accurately merge are really kaka GEDCOMs. Seems we could be spending our time in better ways.

-- jillaine 17:40, 13 October 2009 (EDT)


Candidates for Deleting? [16 October 2009]

The following are my suggestions for deletion consideration; other de-dupers please add your own suggestions here:

  1. User:Riggsy23@gmail.com - User hasn't been on for almost a year, not replying to a July 2009 talk post by another user; gedcom has TONS of pages without any dates at all.
  2. User:Xioma - many internal dupes; many pages without dates; a post to Taylor's talk page implies this person may be hostile to wiki collaboration. I've recently posted a request that they dedupe their own GEDCOM.
  3. User:rebeccahill - We've discussed her before. I believe it was decided to keep her GEDCOM. Not only does she have a lot of internal dupes; she also has a lot of no-date pages. She has never responded to repeated efforts to contact her. I'd like a re-review of the value of her GEDCOM. Amelia are you watching?

-- Jillaine 09:36, 14 October 2009 (EDT)


I'll leave messages for the people that haven't already been contacted that they need to clean up their gedcom or I'll remove them. I'll do that now for Riggsy23, Xioma, and rebeccahill.--Dallan 15:05, 15 October 2009 (EDT)

I've left messages and told them that we would remove their trees if they did not reply by Oct 19th. Riggsy23 and Xioma don't actually have a lot of duplicates: Special:ShowDuplicates/Riggsy23@gmail.com, Special:ShowDuplicates/Xioma, so the reason to delete them would be more because their pages aren't very good. In both cases the tree deletion impact would be minimal (fewer than 50 or so pages that wouldn't be deleted link to pages that would be deleted). Rebeccahill and Angel71982 both have significant duplicates though: Special:ShowDuplicates/Rebeccahill, Special:ShowDuplicates/Angel71982, so keeping these trees would represent a lot more work.--Dallan 15:50, 15 October 2009 (EDT)
I can't speak about the Angel71982 tree, but I don't think we should delete Rebeccahill until Amelia weighs in. This gedcom interconnects with -- I believe -- a lot of her (and my) work. Some of it's good, but dang; it's got so many dupes (she wrote, tossing yet another piece of yummy chocolate in her mouth). Jillaine 21:03, 15 October 2009 (EDT)
I went through a few of the deletion impact on rebeccahill, and a majority of them are one click away from a page that's watched by five people and/or has been cleaned up with VR/Savage/GM cites. Deleting it is going to cause a mess. It's not the cleanest thing ever, but the research isn't bad. Her dupes aren't necessarily internal, I would wager, either - more a measure of how much the tree overlaps with other people.--Amelia 11:49, 16 October 2009 (EDT)
That's what I was afraid of. Okay, Dallan, leave RebeccaHill alone. I've already started de-duping her. Jillaine 16:00, 16 October 2009 (EDT)

List of people with Internal Dupes [14 October 2009]

  • User:Fuzzface - recent upload; appears to be deduping him/herself; I left a message of thanks and encouragement on their Talk page. Deserves watching.
  • User:Angel71982 - 2007 upload and abandon. Internal dupes. Not sure there are any sources.
  • User:Mcummens - July 2009 upload. Doesn't look like they went through the Review process, although the gedcom was released. I posted a message on their talk page requesting that they do their own Duplicate Review.

--Jillaine 09:40, 14 October 2009 (EDT)


I'll do the same for Angel71982. Regarding Mcummens, Special:ShowDuplicates/Mcummens shows only about a dozen duplicates, several of which have already been merged, so maybe this user responded to your message. Many of the remaining duplicates involve living people (which is a different discussion). It doesn't seem worth deleting the tree over just a few remaining duplicates.--Dallan 15:05, 15 October 2009 (EDT)


Don't merge the Ball family [27 October 2009]

Solveig and Jillaine, our new member genealogist, Persisto, uploaded a gedcom on the Ball family (as in George Washington). He refused to merge his pages with the existing ones because of all the errors on them. I meant to return the file to user review but missed and hit the import button so the new pages are created. I'm discussing the merge issue with him and hoping he doesn't just pull his data because of the hassle. I'm not sure we would recognize the errors so I don't think we should merge them ourselves.

Do we have a policy for handling this issue? Serious researchers are getting discouraged. --Judy (jlanoux) 23:03, 23 October 2009 (EDT)

I think the "policy" is that the person who knows what he is talking about removes the information that he thinks is wrong. If someone else disagrees, that's what the wiki is for, but I think it highly unlikely. Assure him he can delete bad information at will if that's his problem. If it's that he doesn't want to deal, then, while I hate to see people who know what they are talking about leave, if they're not willing to engage the wiki, then I don't know how we expect to keep them.--Amelia 23:10, 23 October 2009 (EDT)
Persisto is a serious researcher and I would hate to lose him. I have not uploaded a gedcom under the new system and don't know how it works but if serious researchers are getting discouraged maybe we should reconsider our gedcom rules and application of same. --Beth 23:33, 23 October 2009 (EDT)
It wasn't the upload process that was the problem. He had completed the review process, but had marked everything "not a match" to avoid 'contamination'. He objects to merging his carefully researched data into a page full of errors. The one example of an error he cited was a case of same name-different people. It's not the kind of thing we would catch when merging. If we hold off a few days from merging these new pages maybe we'll have time to work something out. I've always wished that user names would appear on the merge screen. --Judy (jlanoux) 00:37, 24 October 2009 (EDT)

Once I was awake this morning, I looked at a few of the pages in question and compared. Frankly, I'm underwhelmed. There isn't much content on either the old or the new pages. I don't detect a ring of authority. He doesn't include sources for the most part and where he does it's just a line in the text box with something like "Hill, p 44". I sent him a note and explained that he will have to provide sources if he wants to change someone else's data. But, compared to the pages on the Early Acadians, this is a walk in the park. It wouldn't take an hour. (He did process his gedcom extremely quickly yesterday.) As far as I'm concerned, they will be merged by us if he refuses to, but let's wait a few days to see what evolves. --Judy (jlanoux) 09:23, 24 October 2009 (EDT)

I agree that the best policy is to tell people that if they believe that information on the existing pages is incorrect, they should replace it with their correct information. They can do this during gedcom upload (for pages that aren't semi-protected). If they do replace existing data, they really should have sources as you say; otherwise they can just add what they have as alternate events without removing the original information.
I'll put adding watchers to the merge screen on my todo list. It's a good idea.--Dallan 19:15, 27 October 2009 (EDT)

Hey Solveig [11 November 2009]

Just wanted you to know that I periodically come in here and do some dupe clean-up. So you're not alone.

-- Jillaine 18:31, 11 November 2009 (EST)


Ann Batchelder's many families [24 April 2011]

Ann Batchelder is from a prominent early American family (her father was Stephen Bachiler, progenitor of poets, presidents, statesmen, generals and movie stars), but she has several families of people that are obvious duplicates with alternate spellings. Some of these pages have quite a bit of content, others less. For the Sanborn line the Family:John Sambourne and Anne Batchelder (1) seems to be the choice as the primary one. All the families in the Sanborn line have easily found books on them, such as the Philbrick, Phelps, and Chandler families, in addition to the Sanborn family itself. I'll eventually be connecting to this family, so it would be great if this plethora of Sanborns was whittled down a bit. — Parsa 01:48, 24 April 2011 (EDT)

This is somewhat of a finished project and this page is not very active. Normally, users would just be expected to clean up such pages that they have knowledge of as they find them, as part of their contributing to a family. But I can take a look at this page since I see she is mentioned in GMB giving me a great starting point, and since I should have some research in my files being connected to a sibling. It appears without checking sources, which I will do before I make any changes, that there are duplicates of her two marriages and the duplicate Family pages just need merging. --Jrich 09:35, 24 April 2011 (EDT)

Thanks. Sorry I didn't notice about the dates in the discussion. I noted later on the blog that the merging campaign had ended. —Parsa 12:04, 24 April 2011 (EDT)


Will the duplicate list continue to update? [28 April 2011]

Although I've worked my section of the list a few times, it does grow again as new pages are added. --Judy (jlanoux) 19:02, 25 April 2011 (EDT)

Yes. Everyone's duplicate list will continue to be updated every night. I think we should still encourage people to check their lists and remove duplicates from them. It's just the big duplicate-removal project that has ended. And people can still help out and merge others' duplicates as they come across them.--Dallan 11:05, 28 April 2011 (EDT)
Sorry if that wasn't clear. I didn't mean my USER duplicate list. I meant the one here that you created for the project. I find that I have to come back here every few months to check the section of the alphabet I signed up for because the list starts growing again. So to clarify...Is the duplicate list here on the project page going to continue to be updated? or did you kill the job that produces it? --Judy (jlanoux) 17:43, 28 April 2011 (EDT)
Oh, sorry for misunderstanding. Yes, that job will continue to run every night as well.--Dallan 17:46, 28 April 2011 (EDT)