WeRelate talk:GEDCOM review


Volunteering (2012)

It appears that the greatest need right now on the wiki is for GEDCOM Review. I am very new, but am a careful researcher and have worked on projects like this in the past. I am volunteering my services as a GEDCOM reviewer. Let me know when and where to start.

Ron


Hello, I responded in private email. --sq 23:57, 29 September 2012 (EDT)


GEDCOM upload from Jan 2012 (living people) [19 December 2012]

A GEDCOM was uploaded (and subsequent information added) from this user back in Jan 2012. I cannot find a page they've submitted that contains any vital information and I suspect many may still be living. The user has not responded to requests to add additional info to pages or delete those that are living. Is it possible to delete their tree? --Jennifer (JBS66) 11:34, 19 December 2012 (EST)


Volunteering [22 January 2014]

I am willing to help with GEDCOM reviews. Please contact me if this is an area which needs the help.--Khaentlahn 07:20, 2 May 2013 (EDT)

Responded on User's talk page. --Jennifer (JBS66) 17:24, 5 May 2013 (EDT)

Is there a place to read about what skills are necessary for GEDCOM review? I am wondering if I have what it takes to become an admin to review my own uploads. About how long does it take to review a GEDCOM and email the member if necessary? --janiejac 14:32, 20 January 2014 (UTC)


Hello, First read over WeRelate:GEDCOM_review. This is basically the protocol for checking gedcoms. If you would like to help out, I would have you check gedcom uploads on particular days. On the first few days, I would explain why certain gedcoms were acceptable and others were not. After we have encountered the most common problems, I would have you look over the gedcoms first and email me with your analysis. After you're comfortable, you would do it on your own. Depending on the day, doing gedcoms takes 15 to 30 minutes. You would do as many as you feel comfortable doing on any one day. Heather and Jennifer and I would welcome your help. The more people help out, the lighter the load. It also means that if someone is out for a few days, the gedcoms are still uploaded. I look forward to hearing from you. Let me know, and welcome to the team. --sq 05:45, 21 January 2014 (UTC)


Hi Janie, I just want to add one thing to what Solveig has said. You mentioned "I am wondering if I have what it takes to become an admin to review my own uploads". Are you considering joining GEDCOM review solely to review your own files and speed up imports? In that case, I would probably caution against it. Optimally, volunteers would be reviewing other users' GEDCOM files. If that is what you intended (to help out in general), then.. it would be great to have your help :-) --Jennifer (JBS66) 12:15, 21 January 2014 (UTC)

Well, yes the motivation for the question was to be able to review my own uploads. But once I learn how it is done and how much time it will take to do, then possibly I could do it for others also. If you could teach other users to do this, wouldn't it help relieve the load from the current reviewers? Is there a concern that users would let bad info be uploaded? One of the reasons I like to upload even small gedcoms to WeRelate is that this program catches my errors. But I do understand the need for caution and don't know how much time I would be willing to put into this. It looks like it would take a good bit of time/effort on your part just to train me; so perhaps this isn't the best use of our time. --janiejac 15:08, 22 January 2014 (UTC)


Family Matches [5 September 2013]

I'm just following up on an issue with a recent GEDCOM. After the file was uploaded, Amelia noticed a number of duplicate family pages in recent edits. Since the file is already imported, I can't determine the cause exactly, but I suspect the Family Matches were inaccurately marked as "Not a match". We may want to add a quick check before uploading GEDCOMs that the Family Matches weren't willfully marked "not a match" in haste. --Jennifer (JBS66) 06:47, 5 September 2013 (EDT)


Gedcom for User:Run4fun [12 November 2013]

Hi Dallan or SQ, since there have been recent problems (Letters, Numbers and Special Characters in the name field) with two Gedcoms for User:Run4Fun, I took a look into a recent Gedcom and inadvertently "claimed it". Jennifer (JBS66) suggested I leave a message here so one of you could complete the approval process. I didn't see anything similar to the two previous Gedcoms.

Best regards,

Jim:)--Delijim 14:50, 12 November 2013 (UTC)


Is there a place to report poor quality GEDCOMs? [2 June 2019]

I've been helping out on the duplicates page recently and just spent the last 4 hours untangling some very messed up records that were added by this user in Jan 2012 (info copied from Ancestry.com public trees, and apparently quite indiscriminately). I enjoyed the puzzle-solving, so that's okay, but it makes me concerned about the quality of the rest of the GEDCOM. I noticed that Jennifer (JBS66) had asked this user to review and clean up duplicates the day after the GEDCOM was uploaded, but I can't tell if anything was done.

What is the appropriate action to take? If the GEDCOM is deleted, we lose the work I just did. Even if you keep the records I touched, there are related ones (which might or might not be accurate) that I did not touch.

BTW, in case someone wants to delete most of the GEDCOM, this is what I fixed:

Samuel Griffin (7), his wives and children (but not his parents or grandchildren)
everyone in the tree of Granville Jenkins (1) (which is now disconnected from the rest of the GEDCOM)
everyone in the tree of Daniel Jenkins (15) (which is now disconnected from the rest of the GEDCOM)
Willie Griffin (4) - I disconnected him from his parents (wrong family), which leaves him orphaned. I did not validate his info - maybe his record should just be deleted.--DataAnalyst 19:01, 27 December 2013 (UTC)

The program will not let me into the review page! Why?--JlMack 20:45, 2 June 2019 (UTC)


volunteering [4 July 2014]

Hi - I'm happy to help out. Where do I start? Particularly Australian, but very familiar with the UK.--Wongers 12:03, 4 July 2014 (UTC)

Thank you! I'll ask Solveig to contact you.--Dallan 01:03, 5 July 2014 (UTC)

Offer to help [24 December 2015]

I notice that things move slowly here, but maybe waiting a week for admin review might put some people off adding data. If you would like another hand I am available to help.--Rmg 09:09, 24 December 2015 (UTC)


my first couple of review attempts [1 aug 2016]

hi all, i am in contact with the first two people whose gedcoms i review.

the first person is Dutch (like me) and the upload is only 30 people or so.

the second person is Danish, and the gedcom review is at http://www.werelate.org/gedcom/index.php?gedcomId=12054

there are a lot of issues with this second gedcom, but specifically with places. i see a place like Søllested which seems to not exist although Wikipedia knows about the place. https://en.wikipedia.org/wiki/S%C3%B8llested

so now i am backing out on this second gedcom, not knowing what to do next.

do let me know if this is the right place to post this message. thx Ron woepwoep 04:01, 31 July 2016 (UTC)


Hi and welcome to the team. It's wonderful that you want to help out. It is truly appreciated. There are a few things you need to know.

You will notice on the Gedcom review page that the upload date appears at the far left. After the date, you will see the status of the gedcom. When a gedcom is uploaded, it immediately goes into "user review." Most people don't clean their data, and we eventually delete the gedcom. If a user cleans their data, they can then submit the gedcom for upload into the database.

"User review" will then change to "Needs admin review." That's where we come in. We do not review files until they say "Needs admin review." Then we follow the instructions at GEDCOM_review and use the Gedcom_upload_messages where appropriate. Mostly we approve gedcoms in which 50% or more of the people in the file have one date and place, and all the couples have both spouses, a marriage date, or one or more children. I like to exclude couples that are marked "living." If the gedcom passes muster, we upload it. If not, we leave a message on the user's talk page explaining why it was rejected and then we "Return to User Review."

Generally, if a gedcom is going to be rejected, it is rejected because there is not enough data, i.e. most people in the file do not have one event and place or there are too many errors.
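
For new reviewers, the rule of thumb above can be written out as a rough check. This is only an illustrative sketch, not how the review program works: the Event, Person and Family classes and the passes_admin_review function are hypothetical names, "one date and place" is assumed to mean a single event carrying both, and the handling of couples marked "living" is left out because it is a reviewer's judgement call.

```python
from dataclasses import dataclass, field
from typing import List, Optional


# Hypothetical, simplified records; a real GEDCOM carries far more detail.
@dataclass
class Event:
    date: Optional[str] = None
    place: Optional[str] = None


@dataclass
class Person:
    name: str
    events: List[Event] = field(default_factory=list)


@dataclass
class Family:
    husband: Optional[Person] = None
    wife: Optional[Person] = None
    marriage_date: Optional[str] = None
    children: List[Person] = field(default_factory=list)


def person_has_date_and_place(person: Person) -> bool:
    # Assumption: "one date and place" means one event that has both.
    return any(e.date and e.place for e in person.events)


def family_is_acceptable(family: Family) -> bool:
    # A couple needs both spouses, or a marriage date, or at least one child.
    both_spouses = family.husband is not None and family.wife is not None
    return both_spouses or bool(family.marriage_date) or len(family.children) > 0


def passes_admin_review(people: List[Person], families: List[Family],
                        threshold: float = 0.5) -> bool:
    # An empty file has nothing worth importing.
    if not people:
        return False
    documented = sum(1 for p in people if person_has_date_and_place(p))
    if documented / len(people) < threshold:
        return False
    return all(family_is_acceptable(f) for f in families)
```

A file that fails this kind of check would be the typical candidate for "Return to User Review" with a message on the user's talk page.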

Thanks again for helping out. --sq 23:14, 31 July 2016 (UTC)

Thanks Solveig this is very helpful. I do hope i haven't done anything stupid.
However i did introduce myself in both cases, saying i am still learning the gedcom review process.
So the worst that could happen i hope is that i make new friends.
I do have a question about 'early'. What to do if the data looks good and have both dates and sources, yet the flag 'early' is set and the data are to be excluded? Why is this policy? My family tree goes back to 1460. How to undo 'early' for the entire gedcom if i have a feeling that the uploader brought in a quality gedcom?
Thx, Ron woepwoep 02:51, 1 August 2016 (UTC)

Re: Most people don't clean their data, and we eventually delete the gedcom. [6 December 2016]

It seems that we are losing a lot of potentially good data and users at this stage. Sample message Imported without user review indicates that occasionally an admin will step in to help new users with this 2nd task. I think it would be beneficial to everyone if this was done more often. We would get cleaner data with more sources, and new users would have a better experience. I have tried it on a couple of files waiting in the queue with mixed results. I purposefully chose some small files with few problems.

  • One sailed right through and was imported without issue (I made a mistake of not preserving that file or user name, so after it disappeared from the queue, I can no longer find it.)
  • 4 files (v42755_0294745052ugj2723472kx (2).ged, Matt Jivin Family Tree (2).ged, Romans Familie-2.ged and McHaney Family Tree.ged) appear to be stuck in the Importing status.
  • One did not change status at all (Clink Family Tree (1).ged)

So - I have 3 questions:

  1. Other than the extra work for admins, is there a reason why we do not help new users more at this stage?
  2. Can someone explain what is causing some of the files to be stuck in the Importing stage and what can be done to resolve this?
  3. Would it be possible for admins to download the user's GEDCOM file, clean it up off site, and reload the cleaned version? I could do this so much faster offsite on my own, and this would solve the problem of Warnings that have been resolved continuing to prevent import.

Thank you, --cos1776 14:36, 22 November 2016 (UTC)

Partial follow up: Dallan has cleared some of the files that were stuck and deleted some that would not import. He does not know what caused them to fail import. The other questions are still open. As of yesterday, 18 files from the backlog were successfully imported. --cos1776 20:42, 6 December 2016 (UTC)

Committee Roll Call & Update [5 January 2018]

Hello - I am in the process of updating the information on the status of our admin structure and maintenance committees. The members of this committee are currently listed as:

  • Dallan, Liaison to the Overview Committee
  • Solveig
  • JBS66
  • Klaas
  • Khaentlahn

Please respond here to let us know that you are still active on this committee and whether or not you wish to continue in this capacity. Also, if there are additional users on the committee not listed, please let us know.

To help us quantify the work that is being done, please include a brief list of the tasks that you perform most frequently and an estimate of the average amount of time per month that you currently spend on these tasks.
Thank you in advance for your help, --cos1776 13:49, 17 October 2016 (UTC)

i would be willing to help out.
i am not very skilled on the inside (genealogy)
but do know a couple of things about databases and php / css programming.
thx Ron woepwoep 15:49, 17 October 2016 (UTC)
Thank you, Ron. Your reply is much appreciated.
Ok, folks... last call to respond if you wish to remain on this committee. I will update the main page after the holiday weekend on Monday (28 Nov). Along with Mentoring, I view the actions of this committee as very important when it comes to creating a good first impression and welcoming atmosphere for our new users. You have all done a good job in the past, and your service is much appreciated. I hope that you will want to continue to help in this capacity. We are quite short of volunteers.
Thank you, --cos1776 13:03, 22 November 2016 (UTC)

2018 updates

I would like to take a break this year from monitoring, cleaning and processing the GEDCOMs that come in from non-admins. I don't want to just stop without letting the committee know that I will no longer be keeping a regular eye on the review queue. Would any of you be willing to take on this responsibility? Thanks, --cos1776 20:18, 5 January 2018 (UTC)


Notification of proposed change in procedure [16 December 2016]

The Overview Committee would like to increase data quality and compliance and decrease the amount of time spent on page maintenance; therefore, the following change in the GEDCOM review procedure is proposed:

Please do not import files that have not been cleaned up and matched and adequately sourced.

Ex. Two files were recently imported

  1. Andrews_St Louis (2).ged from User:Bandrusa and
  2. Stults.ged from User:Sstults

that should have undergone more cleanup and matching in the review program. The first one generated pages like Person:Sarah Roop (6), which contains dubious links to family pages, citations to Ancestry trees and several unmatched places and sources. The second one came in with very few sources, and now the contributor has followed it up with another file that is similar.

We are wasting too many volunteer hours on cleaning up messes after import, hours that could be significantly decreased by this one simple change. All members of this team should know how to fix formatting and match places and sources in the review program. Please just ask if you don't. You can either do the clean up yourself as a service to the contributor, or better yet, contact the contributor and teach them how to do it for themselves. It doesn't matter who does the work as long as it gets done before import!

As for sources, for now, the minimum goal is one source citation per person page with a little flexibility. This is a pretty low standard, but we want to encourage learning and not scare off people who have taken the time to upload their data. Use your best judgement for this, but do not hesitate to return files with a request for added sources if there are too few. And please, reach out to new contributors. A little encouragement from a seasoned user can make a big difference in success and retention rates.

We have a great opportunity to make this site better by this one small change in procedure. If you have any reasons why this change should not be enacted, please respond below a.s.a.p. If all are in agreement or no response is posted by 31 Dec 2016, please consider this change effective on 1 Jan 2017.
Thank you, --cos1776 17:48, 16 December 2016 (UTC)


Gedcom Imports May Be Stuck [23 August 2019]

Would someone with more knowledge about the GEDCOM upload back-end please take a look at why, since at least the 16th of August, there are new GEDCOMs which are still showing "Waiting for analysis"? Could the GEDCOM process be stuck by chance? Thank you in advance for all the hard work.--khaentlahn 21:05, 23 August 2019 (UTC)

Another user has left a message for Dallan on his Talk page which is probably the best way to get a response for gedcom queue problems these days. Please feel free to add a message there as well. hth, --cos1776 22:13, 23 August 2019 (UTC)

Did I import correctly? [4 December 2019]

It has been a few days. Just wondering if the import completed correctly, since I failed the first two times.--Spectrejazz 05:40, 3 December 2019 (UTC)


PS: I'm happy to volunteer my time to help review GEDCOMs.--Spectrejazz 08:02, 4 December 2019 (UTC)


volunteer [9 June 2020]

how do i volunteer--ForestSteve 21:07, 2 June 2020 (UTC)


I am willing to help where I can, but to be honest I am very confused by the GEDCOM review and everything else on this site. I am not trying to be negative; I have been doing Family History for 25 years and I thought I could help. I really do not understand how this site works, but I am willing to learn and help if I can.--ForestSteve 19:37, 7 June 2020 (UTC)


Hi, here's a Loom video on how to do GEDCOM review: https://www.loom.com/share/61246ef39a82453a806a67747f865b4a Let me know how it goes. And thanks for offering to help.--Dallan 21:50, 9 June 2020 (UTC)


Checking status of my GEDCOM [5 June 2020]

Hi, I think it's been about a week since I clicked "Ready to Import" on my GEDCOM. Just wondering if I've done everything correctly - usually it doesn't take this long? Sorry if I'm just being impatient! Thanks, Jocelyn--jocelyn_K_B 07:43, 4 June 2020 (UTC)

Hi Jocelyn, I'm sorry for the delay. You've done everything correctly; we've been negligent about monitoring GEDCOMs lately. We've asked the people who have volunteered to help us, and given them your GEDCOM to practice on (because it's perfect - all ready to import so it should be easy for them). I'll check back again tomorrow and if it hasn't been imported, I'll import it.--Dallan

That's great, thanks very much!--jocelyn_K_B 08:17, 5 June 2020 (UTC)


I'll be reviewing [30 June 2020]

I responded to a call by Dallan on Watercooler for volunteers, and after some email exchanges he set me up to do Gedcom reviewing. I'll be trying to monitor the incoming list regularly. --robert.shaw 19:30, 30 June 2020 (UTC)


Proposed changes to be implemented soon [26 August 2021]

Hi

I've made several changes to the GEDCOM uploader in the Sandbox, which I wish to implement to production soon. The primary reason I am changing the GEDCOM uploader is to edit dates the same as in the wiki (changes implemented since last October). I made a few other changes in event-handling at the same time.

Please let me know if you have questions or concerns about the following changes:

  • The Uploader will edit all dates and provide an error message for each date it can't interpret.
      • Benefit: Fewer bad dates imported to WeRelate. (GEDCOM files can be imported with a few errors, but not a lot.)
      • Benefit: Users will no longer be able to import pages for living people by entering death date of "Bef XXXX" where XXXX is a future year.
      • Downside: Users not following GEDCOM date standards will be frustrated by having to correct their dates. (But at least they won't leave it for others to do.)
      • Downside: Users with dates in other languages may have to wait for me to enhance the date edit. Month names and abbreviations for Dutch, German, French and Spanish are already handled, but only a handful of modifiers are currently handled.
  • The Uploader will automatically reformat any date it can interpret to the correct GEDCOM (and WeRelate) standard date format. If this requires significant interpretation (e.g., transform "1870-1880" to "From 1870 to 1880" or transform "1900-08-07" to "7 Aug 1900") an alert will be created.
      • Benefit: Fewer warning messages, as any ambiguous date that can be automatically interpreted will no longer cause a warning message (although it will cause an alert).
      • Downside: More alerts for users who upload GEDCOMs with dates not following GEDCOM standards. Alerts do not count towards preventing the file from being imported, but the user is required to click on each one before importing.
  • The Uploader will treat each of the following event types in the GEDCOM file as the equivalent WeRelate event type (currently these are all treated as event type Other with the type appended to the description field).
      • Citizenship
      • Employment
      • Funeral
      • Illness
      • Living
      • Obituary
      • Pension
      • Stillborn
      • Marriage Notice
  • The Uploader will:
      • Treat the tag CHRA (Adult Christening) as a Baptism event
      • Treat the tag BASM (Bas Mitzvah) as a Bat Mitzvah event (they are the same thing).
      • Treat the tag DNA or _DNA as an event of type DNA rather than as an uninterpreted tag.
          • RootsMagic (which I use) doesn't put any detail into the GEDCOM for DNA. Does anyone have experience with other software and should we be worried about DNA detail causing the GEDCOM parser to fail?
  • The Uploader will ignore the _SDATE (Sort Date) tag - it will no longer be added to the event description as "Secondary Date". (This tag is used by RootsMagic and possibly other desktop software to supply a fully-qualified sort date when the event date is only a year or month/year, uses a modifier or has a split year. Users can also set it to customize sorting of events without dates. WeRelate uses its own sort algorithm and doesn't support user-customized sort order, so the "Secondary Date" is just clutter.)
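
To make the date handling described in the list above more concrete, here is a rough sketch covering only the two example transformations and the future-year "Bef" case. It is not the uploader's actual code: normalize_date and its (date, alert, error) return shape are made up for illustration, only English month abbreviations are covered, and treating a future "Bef" year as an error is an assumption about how that case might be flagged.

```python
import re
from datetime import date

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

YEAR_RANGE = re.compile(r"^(\d{4})-(\d{4})$")
ISO_DATE = re.compile(r"^(\d{4})-(\d{2})-(\d{2})$")
BEFORE = re.compile(r"^Bef (\d{4})$", re.IGNORECASE)
# Already-standard forms: "1900", "Aug 1900" or "7 Aug 1900".
STANDARD = re.compile(r"^(?:(?:\d{1,2} )?(?:" + "|".join(MONTHS) + r") )?\d{4}$")


def normalize_date(raw, today=None):
    """Return a (normalized_date, alert, error) tuple for one date string."""
    today = today or date.today()
    text = raw.strip()

    m = BEFORE.match(text)
    if m:
        # Assumption: a "Bef" year in the future is reported as an error.
        if int(m.group(1)) > today.year:
            return None, None, f"'{raw}': year is in the future"
        return f"Bef {m.group(1)}", None, None

    m = YEAR_RANGE.match(text)
    if m:
        # Significant interpretation, so an alert accompanies the result.
        return (f"From {m.group(1)} to {m.group(2)}",
                f"'{raw}' interpreted as a date range", None)

    m = ISO_DATE.match(text)
    if m:
        year, month, day = (int(g) for g in m.groups())
        try:
            date(year, month, day)  # rejects impossible dates such as 1900-02-30
        except ValueError:
            return None, None, f"'{raw}' is not a valid calendar date"
        return (f"{day} {MONTHS[month - 1]} {year}",
                f"'{raw}' interpreted as year-month-day", None)

    if STANDARD.match(text):
        return text, None, None

    return None, None, f"'{raw}' could not be interpreted"
```

Under these assumptions, normalize_date("1870-1880") gives "From 1870 to 1880" with an alert, and a string that matches none of the patterns comes back with an error message and no date.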

If you would like to look at some test GEDCOMs uploaded to the Sandbox, please sign into the Sandbox as Test1 (password: testexplore) or Test2 (password: testmore) and check out the latest GEDCOM referenced in that user's Talk page. Please don't submit the files for import. Note: I plan to do some more complete testing over the next few days, so the uploaded files might disappear for a few minutes at a time.

If you want to try out a test GEDCOM of your own, please let me know. You can add the file with Test3 (password: testagain) if someone else hasn't already done so, and then I have to manually run the uploader before you can review it.

I appreciate any feedback. If you want to take a look but don't have time right now, please let me know so that I can hold off until you have time (within the next month or so). Thanks.

--DataAnalyst 19:28, 19 August 2021 (UTC)


These all look like very good improvements to the uploader. The alerts on the date interpretations seem like a good thing as people have cultural or idiosyncratic variations which might go awry. Nice that you pointed out the implications of having more Alerts. I'll try to find time to play in the sandbox in the next three days, but don't hold off more than that on my account. Thanks for this great work! --robert.shaw 20:13, 19 August 2021 (UTC)
Hi. I just finished my testing and the 2 currently uploaded GEDCOMs are in good shape for a quick review - freshly uploaded. The one uploaded by Test2 (focusing on event types) has a potential match that I haven't processed, if you want to take a look at that. It looks good to me, but if you spot something odd, let me know. (One of the event types loaded as "Other" - that was deliberate - it is a Misc event type in my software.) The other file focuses on date errors and alerts. I don't think it will take you more than 10 minutes to look at both uploaded files. I'll wait until at least Monday before implementing, and of course, you can still check out these files after that.--DataAnalyst 17:32, 21 August 2021 (UTC)
I've looked at the review pages of Test1 and Test2. I also applied the match on the Test2 review, and did an update for one of the persons. I didn't see any problems.
Although I've done a 23andMe DNA test, I've never entered stuff about it into a tree or a GEDCOM. Looking around, I didn't find much on what syntax might be used for recording in a GEDCOM. (There was a semi-spec by chronoplexsoftware for DNA extensions to the new GEDCOM 7.0.) I'd expect WeRelate's GEDCOM parser wouldn't get very sick from encountering it; generally such parsers tend to just bypass things they don't understand. --robert.shaw 23:46, 21 August 2021 (UTC)
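
As a rough illustration of the tag handling discussed in this thread (CHRA, BASM, DNA/_DNA and _SDATE) and of a parser simply passing over tags it does not recognise, a sketch might look like the following. The function names and return values are hypothetical and are not taken from WeRelate's parser; lines that carry a cross-reference id (such as "0 @I1@ INDI") are not handled.

```python
# Mapping taken from the proposal in this thread; everything else is illustrative.
TAG_TO_EVENT = {
    "CHRA": "Baptism",      # Adult Christening
    "BASM": "Bat Mitzvah",  # Bas Mitzvah
    "DNA": "DNA",
    "_DNA": "DNA",
}
IGNORED_TAGS = {"_SDATE"}   # sort date: dropped rather than kept as a description


def classify_tag(tag):
    """Return ("event", type), ("ignore", None) or ("uninterpreted", tag)."""
    tag = tag.strip().upper()
    if tag in IGNORED_TAGS:
        return ("ignore", None)
    if tag in TAG_TO_EVENT:
        return ("event", TAG_TO_EVENT[tag])
    # Unknown tags are passed over instead of causing a failure.
    return ("uninterpreted", tag)


def classify_line(line):
    """Classify a simple "LEVEL TAG [VALUE]" GEDCOM line by its tag."""
    parts = line.strip().split(" ", 2)
    if len(parts) < 2 or not parts[0].isdigit():
        return ("uninterpreted", line.strip())
    return classify_tag(parts[1])
```

With this shape, any DNA detail that follows the tag on the same line is not inspected, so it cannot by itself make the classification step fail.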

The changes were implemented 26 Aug 2021.--DataAnalyst 15:27, 26 August 2021 (UTC)


Happy to volunteer [23 January 2023]

Hi GEDCOM reviewers!

Really enjoy GEDCOM sleuthing, so I'd be REALLY happy to help.

Please message me, but I'll also check back here.--Parelb 17:39, 23 January 2023 (UTC)