WeRelate talk:Junk Genealogy

Scot's Initial Proposal [4 June 2008]

I am becoming more frustrated, disillusioned and concerned every day. Some 3 years ago it occurred to me that the wiki model could be a great opportunity for genealogy. When I found this site, as well as several others, I was excited that someone else had come to the same conclusion. However, it appears that any collaborative effort is being overwhelmed by the amount of junk genealogy being uploaded to the site. I have been thinking about how to prevent the WeRelate site from becoming just another repository for misinformation like so many others. Downloading GEDCOMs from WorldConnect or the AF and then uploading them for someone else to edit is not genealogy. What we should be doing is compiling data, examining sources, weeding out the trash and creating a credible database. How can we encourage collaboration by those individuals who are serious researchers and eliminate those who are just waving their pedigree to massage their egos?

Some thoughts:

  1. If a person joins WeRelate, uploads a GEDCOM and walks away, he contributes nothing. How about this: if after uploading he does no editing for a certain period, say 90 days, then his upload is purged, except for pages edited by others, just as if he had removed it himself.
  2. For person pages for individuals from before 1500 or so, allow the surname field to remain empty without the unknown tag. The title prefix or suffix can be used to differentiate individuals. Because people feel a need to enter something into the surname field, this creates an incredible number of variations for the same person if he does not have a surname. I have many instances of duplicates because they are entered in different languages. I realize that place names eventually became toponymics and are used as surnames, but in medieval times they simply indicated where a person was from. Often these appear with of, de, van, etc. preceding them, and each variation, if used as a surname, results in a duplicate entry.
  3. Accept no data for individuals prior to 1600 without a source reference.
  4. Screen the sources and reject submissions based on questionable sources or those known to be flawed.
  5. Perhaps have two separate databases with separate rules for submission. One for medieval, royal, historical and celebrity figures, where submissions are restricted as stated above. Don't allow GEDCOM uploads to this section, only individual pages. A second database remains as it is now but with the purge function for inactive users. Allow linkage to individuals in the other database within immediate families.

Maybe this seems rather Draconian, but I feel some kind of control must be implemented to prevent the site from becoming a hopeless morass like most others. Opinions, anyone?--Scot 19:30, 9 April 2008 (EDT)


1. Try to imagine WeRelate is good enough to survive in some form 100 years from now. I believe the next stage for amateur genealogy/history is modelling of families, communities, places and events. Who's to say something contributed yesterday and then abandoned (e.g. the only online copy of someone's wedding or a school photograph) wouldn't be of interest to someone else, even though it has not been looked at for a period of 20 years?

That person may not have sourced it correctly, or identified all the people in the photo, but may have left enough clues for someone else to identify it correctly.

In the UK you have to spend seven pounds for a copy of a certificate - perhaps in 5 years all the info will be online for 7p and it would be affordable to model entire communities.

How do you know what is useful? It's in the eye of the beholder.

Perhaps the junk could be left out there, but the WeRelate community should develop a simple quality accreditation system. That which is sourced correctly (as identified by WeRelate accredited officers) can be coded as such.

A page with a lower quality rating cannot then 'damage' one with a higher one.

Perhaps volunteers could adopt a geographic location and ask other interested parties to connect with them. With the right tools a volunteer could keep things in check. One thing I like about the wiki is that you can host a family tree, a one-name study or a one-place study - theoretically they should be able to co-exist.--Dsrodgers34 01:14, 4 June 2008 (EDT)


Some Responses

It has occurred to me that abandoned GEDCOM uploads are very much a mixed bag. It may sometimes be that the person just didn't take to the site and their data is still pretty good. Other times, well.... Anyway, I agree with the thrust of Scot's argument - there need to be some steps taken to prevent WeRelate from becoming a sewer.

GEDCOM genealogy has made it easy for people to accumulate a data set that they would never have any hope of seriously maintaining - even given many lifetimes. We should encourage people to upload only data that they are serious about working on. Maybe the size of a particular GEDCOM upload should be limited unless by special arrangement (5000 or so?). Likewise a GEDCOM that goes back before 1500 AD.

The other problem, of course, is the management of abandoned data. I've got a fairly small tree compared to some (~3000). I try to source things in detail and would like to think the pages are the sort of thing the site would want to host indefinitely. There should be a way to designate trees as a permanent part of the collection (genealogy goes on forever - good research done in the 1850s is still working for us - but I don't think anyone reading this dates to 1850...do they?). On the other hand, if a tree is loaded and worked a little then abandoned for a while, it probably shouldn't automatically persist forever. If someone in the user community wants to adopt the tree, maybe it just goes to them. If no one in the user community wants it, or perhaps if the user community actively requests its removal, then after a time it goes away (if we want to be nice about it, maybe it gets archived to a named GEDCOM and tossed into the digital library?).--Jrm03063 22:04, 9 April 2008 (EDT)

Anything in an abandoned tree can be retained by anyone, simply by editing the page. My 90 day suggestion is only for the period after the initial upload. If they don't do match/merge for any of their data we can assume that they aren't interested in maintaining it. After that a longer period of inactivity could be required before declaring the data abandoned. If the match/merge utility works well, duplicate entries might not hang around so long, so it will be easier to find and evaluate data in recent uploads. Again, any pages that are merged or edited will be retained, and the rest of the tree can be taken as not of interest to anyone searching.--Scot 01:12, 10 April 2008 (EDT)
There are several types of users on WeRelate presently. I looked at some of the users who registered in April 2007. First, you have those who only registered; second, you have the users who registered, created a profile and listed the surnames and places that they are researching but have no other contributions; third, you have the users who created a profile, uploaded a GEDCOM and have no contributions since the initial GEDCOM upload; and fourth, you have the users who have an active file and recent contributions.

I believe that we should eliminate all unsourced GEDCOMs after 6 months, unless they include pages watched by someone other than the user or an agent of WeRelate. --Beth 08:31, 10 April 2008 (EDT)


I agree in principle; the problem I see is that there probably aren't any utterly unsourced GEDCOMs, though there are plenty of essentially unsourced GEDCOMs. The former would be a GEDCOM entirely without source records and citations. The latter would be a GEDCOM with "OneWorldTree", the GEDCOM upload date, and other sorts of essentially useless sources. Trying to make software know the difference wouldn't be all that easy.

I think we need to hook on other criteria to decide that something is both abandoned and useless.

Also, having one or more pages in an otherwise abandoned tree watched probably doesn't say a lot about the quality of the tree generally - though anyone watching a portion of a tree should be consulted before the remainder of the tree goes away. When I merge duplicate families, I don't concern myself with any question about whether an originating page comes from a "good" tree. I only try to understand whether the various pages are talking about the same people (or at least, the same fantasy about people).--Jrm03063 11:43, 10 April 2008 (EDT)


First, this topic should probably be moved to a separate page, because it soon is going to take on a life of its own.

Secondly, I am in 100% agreement with Scot. Becoming "Draconian" in principle is not going to turn away the masses, because from what I can tell, there doesn't seem to be a mad rush taking place to genealogy wikis anyway. Why is that? Becoming a little more picky about what gets uploaded and about what stays uploaded is instead going to attract those researchers who are serious about collaboration and who don't take offense when a bad source or bad information has been revealed. --Ronni 12:41, 10 April 2008 (EDT)


Dallan's Response [10 May 2008]

A few thoughts:

I don't think we want to automatically remove abandoned trees, because abandoned trees that are of good quality are worth keeping around, and the system can't tell the difference between good quality and junk. So let's focus on removing junk trees. Under what conditions then would we want to remove a junk tree? I can think of four; perhaps there are more?

  1. The junk tree contains a lot of internal duplicates (duplicates within the tree itself)
  2. The junk tree overlaps with existing trees, and the tree uploader didn't merge the pages
  3. The junk tree overlaps with existing trees, and the tree uploader merged the pages but in so doing added a bunch of "bad" data from their tree to existing pages
  4. The junk tree overlaps with a well-sourced tree that I am trying to upload, and merging my tree with it is going to add my well-sourced data to a bunch of pages with "bad" data

I'd like to consider each of these cases in turn.

  1. There's nothing to do here but delete the tree, as has happened already with the tree that contained a large number of internal duplicates for the Normans. If someone finds a tree with a large number of internal duplicates, I think we should contact the submitter and delete the tree.
  2. I think the best way to resolve this is to require the tree submitter to go through a match+merge step (where they are shown the probable-overlapping trees and can choose which pages to merge) within say 7 or 14 days of uploading the tree. If they have not completed the match+merge step within 3 days they get a warning, and the tree is removed if they have not completed it within 7 or 14 days. Trees that aren't determined to overlap any existing trees don't have to go through this step of course.
  3. This is a more difficult problem: The tree submitter merged their pages into an existing tree, but the merger resulted in a bunch of questionable data and sources being copied into otherwise good pages. We may need to have an option in merging to not append data from the new pages onto the existing pages.
  4. This is the opposite of the previous problem. I am trying to submit a new tree, and it overlaps an existing junk tree. But if I merge my pages into the existing tree I don't want my good data appended to a bunch of junk. We may need to have an option in merging to have the data from the new pages replace the data on the existing pages.
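
The deadline rule proposed in case 2 can be sketched in a few lines of Python. This is only an illustration, not WeRelate's code: the function name, its parameters, and the choice of 14 days as the default removal deadline (the comment says "7 or 14") are all assumptions.

```python
from datetime import date


def match_merge_status(uploaded_on, today, overlaps_existing, merge_completed,
                       warn_after_days=3, remove_after_days=14):
    """Return 'ok', 'warn', or 'remove' for an uploaded tree.

    Hypothetical helper: trees that overlap nothing, or whose uploader has
    finished the match+merge step, are always 'ok'.
    """
    if not overlaps_existing or merge_completed:
        return "ok"
    age_in_days = (today - uploaded_on).days
    if age_in_days >= remove_after_days:
        return "remove"
    if age_in_days >= warn_after_days:
        return "warn"
    return "ok"
```

For example, an overlapping tree uploaded on 1 April and still unmerged on 5 April would get a warning, and the same tree still unmerged on 20 April would be flagged for removal.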

BTW, don't get discouraged. Match+merge is something that should have been implemented a long time ago, but it's not an insurmountable problem. As part of match+merge we'll have a screen that shows all of the probable duplicates between two trees, and lets the tree contributors discuss and select which pages to merge. This will hopefully make merging much easier than it is now.--Dallan 17:08, 10 April 2008 (EDT)


Question about #3/4 above - at some point in the past, we talked about a function that during GEDCOM upload would identify the duplicates and do a merge if necessary. At that point, the user would have the option of just not uploading the duplicate people. That takes care of 1) those people in well-sourced trees that are placemarkers (like spouse's parents one hasn't pursued); 2) chunks of badly researched trees; and 3) situations in between where you want to see what's there before deciding one way or another. With some instructions, hopefully most offenders will recognize themselves and not upload their junk onto "nice" pages. Is something like that happening? If so, we're talking about people that ignored that instruction, which adds another dimension. But, that said, I also think you do really need to have an option of not appending the data from one page or another to the merged page. I would say, based on hundreds of merges, that far more often than not, one page is either junk or functionally, but not literally, identical (that is, one user says b. Windsor, CN, the other says b. Windsor, CT - this is why I'm still hand-merging, because these are human decisions). So to avoid creating more work and more junk, I would think a "use data from ___ page" option would be highly useful. --Amelia 08:35, 4 May 2008 (EDT)
The duplicate-detection hasn't been implemented yet. I agree that we need to implement it. And a "use data from ___ page" option when merging is also a great idea.--Dallan 15:26, 6 May 2008 (EDT)
See Gen Methods Archives for a recent comment on this problem and wikis, particularly in the context of the LDS site. Q 08:54, 7 May 2008 (EDT)
Interesting discussion. From what I've been told the new family search wiki is primarily a way for them to get their research outlines into a form that others can extend. It should also allow them to more easily post their own material online that is currently available only at the family history library (the "half-sheets" at the reference desks). I think it's a great step for them.--Dallan 10:44, 10 May 2008 (EDT)
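
The merge options raised in this thread (append as now, leave the existing page untouched, or "use data from ___ page") can be illustrated with a small Python sketch. Everything here is assumed for the sake of the example: pages are modelled as plain dictionaries of field name to a list of values, and the policy names are invented; none of this reflects how WeRelate stores or merges pages.

```python
def merge_pages(existing, incoming, policy="append"):
    """Merge two person pages, each a dict of field -> list of values.

    Hypothetical policies:
      'append'        - add incoming values not already present
      'keep_existing' - ignore the incoming page's data
      'use_incoming'  - replace the existing data with the incoming data
    """
    if policy == "keep_existing":
        return {field: list(values) for field, values in existing.items()}
    if policy == "use_incoming":
        return {field: list(values) for field, values in incoming.items()}
    if policy != "append":
        raise ValueError("unknown merge policy: " + policy)

    merged = {field: list(values) for field, values in existing.items()}
    for field, values in incoming.items():
        slot = merged.setdefault(field, [])
        for value in values:
            if value not in slot:
                slot.append(value)
    return merged
```

With the Windsor example above, `append` would leave both "Windsor, CT" and "Windsor, CN" on the merged page, which is exactly the clutter the other two policies are meant to avoid.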

Medieval Genealogy

Medieval genealogy (pre-1600) is a slightly different problem than the problem of "junk genealogies" because (a) there are only a few people that we have records for pre-1600, and (b) those people didn't generally have surnames, and the birth dates are often approximated. I'm not unwilling to prohibit people from uploading pre-1600 people, but I'd like to first see if we can merge uploaded pre-1600 people into well-sourced existing pre-1600 pages, and rather than append their probably-lesser-quality information onto the existing pages, we would not modify the existing pages.--Dallan 17:18, 10 April 2008 (EDT)


Speedy Delete [20 April 2008]

While I don't particularly think wholesale deletion of "abandoned" "poor quality" trees is a good idea, a feature that would be good to have is something akin to "Candidate for Speedy Delete" on other wikis. In truth, that capability is already partly in place, in that it's present under the "More" pulldown menu (at least when you are at the article level)---specifically, if you are the only person watching a page you can delete it anytime you want, as per the following guidance:

If you are the only person watching a page, click the More link in the upper right corner of the screen under the blue bar. Select Delete, enter a reason for the deletion and click Delete Page.

But I'll bet there are a lot of pages, such as duplicates, where the author is no longer paying attention, and a duplicated or otherwise unneeded article (kind words for junk) could be removed with no loss to anyone. It's probably not a good idea to allow just anyone to do that, but I think it's something that could be done suitably by an administrator---if they knew that someone thought a particular page could be done away with. If there were a repository where people could nominate candidates for speedy deletion, someone from the admin side would go through the list, review, and make a rational decision about how to handle the article. That might mean notifying the original creator, denying the request, or perhaps immediate deletion if that were appropriate.

Now, in truth, I don't really know of any articles so messed up that I think I'd delete them. But I DO encounter lots of duplicates---usually by the same author. I suspect that there's something in the process of GEDCOM uploads that creates them. Possibly they re-upload their GEDCOM periodically to sweep up any changes that they've entered in their genealogy program, and the upload program can't identify things that haven't changed, and just creates everything anew (hence, lots of duplicates). I don't know why the duplicates are there, but the fact is, they are---and they might be candidates for speedy deletion. Q 19:32, 10 April 2008 (EDT)


I like this idea. If a page or set of pages isn't just poor quality, but duplicates pages already in the system and is causing merge work without contributing any new information to those pages, I could see marking them for speedy deletion. Then you have a human being instead of a computer making the final delete decision. What are others' thoughts on this?--Dallan 15:08, 15 April 2008 (EDT)


I agree that human interaction is needed in making these kinds of decisions. I also believe we shouldn't let this issue fall by the wayside. I just came across a GEDCOM uploaded in February that has many duplicate pages in it. The GEDCOM is by no means considered "junk," however. But it was uploaded and the user has not edited it since nor have they contributed another page to WeRelate. --Ronni 22:46, 19 April 2008 (EDT)

One of the things Dallan has indicated would be in place eventually is a search result tabulation similar to the browse function, but including more information than just the name---i.e., DOB/POB/Spouse/Father/Mother type information. Such a tabulation would make it easier to spot duplicates of this sort---especially if it included the identity of the submitter. That way, if you ran a search and found that John Smith had created four separate cards for a "Jeremy Black", all with similar DOBs, DODs, spouses, etc., you'd be fairly sure that some of them were duplicates. Q 10:17, 20 April 2008 (EDT)
Including the identity of the submitter is a good idea. I don't have that readily available, but the list of users watching the page is available. I'll include the watching users in the search results.--Dallan 17:30, 24 April 2008 (EDT)
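
As a rough illustration of how such a tabulation could surface same-submitter duplicates, here is a small Python sketch. The record layout (`title`, `name`, `birth_year`, `watchers`), the page titles, and the two-year tolerance are all assumptions made for the example, not features of the WeRelate search.

```python
from collections import defaultdict


def probable_duplicates(pages, max_year_gap=2):
    """Group search results by (normalized name, watcher) and flag groups
    of two or more pages whose birth years fall within max_year_gap."""
    by_key = defaultdict(list)
    for page in pages:
        for watcher in page["watchers"]:
            by_key[(page["name"].strip().lower(), watcher)].append(page)

    flagged = {}
    for key, group in by_key.items():
        years = [p["birth_year"] for p in group]
        if len(group) > 1 and max(years) - min(years) <= max_year_gap:
            flagged[key] = [p["title"] for p in group]
    return flagged


# Hypothetical search results, echoing the "Jeremy Black" example above.
results = [
    {"title": "Jeremy Black (1)", "name": "Jeremy Black", "birth_year": 1750, "watchers": ["John Smith"]},
    {"title": "Jeremy Black (2)", "name": "Jeremy Black", "birth_year": 1751, "watchers": ["John Smith"]},
    {"title": "Jeremy Black (3)", "name": "jeremy black ", "birth_year": 1750, "watchers": ["John Smith"]},
    {"title": "Jeremy Black (4)", "name": "Jeremy Black", "birth_year": 1820, "watchers": ["Mary Jones"]},
]
flagged = probable_duplicates(results)
```

A human would still make the final call; the grouping only narrows down where to look.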

Status flags to mark state of data [4 May 2008]

Interesting discussion. The data I have loaded up for my small tree cannot be described as bad data, but it is poorly organized and presented. I am a beginner in genealogy, and when I used Family Tree to hold my data I didn't put the data in the correct places. When I uploaded a GEDCOM it looked like a dump of mixed facts. This can be very confusing to anyone who tries to sort through it.

Many people will be new to genealogy as well as new to computers as well as new to wikis. They will not do things correctly at the beginning and will be frustrated and overwhelmed at times by the amount of work needed to get details into their proper places. There is a learning curve, and it can be daunting, especially when you are used to immediate gratification, fast food, and no-line-up service. This can be a lot of work.

The advantage of keeping these people active in your/our wiki is that they do bring very good data related to themselves. The closer the family ties the better the data will be. It is intuitive. So, I agree requiring better credentials for "historical" data makes sense.

The advantage for me to have a presence on this wiki is to increase the chances of contacting another person with an interest in the same people. To establish a tree for this purpose requires only the basic name, bmd stats, and locations. The details don't need to be 100% as that is why you want to find other people, to compare notes. I have connected with one person because of this wiki and we were able to share some info.

My suggestion would be to have two classes or statuses for data. Or even multiple status flags: raw, draft, under construction, basic, vitals only, etc. Then when you run merges etc., you could include or exclude based on status flag. You/we will need to develop a set of criteria for assigning a status flag. A disposition rule could also be auto-set based on a status. For example, if status = raw, then delete 6 months after last update.

In the records management profession the concept of transitory and official records is well understood. Transitory information is used to create an official or final record. While official records may be kept permanently, transitory records are not. Disposition is driven by a rule for the record series (class). Based on a triggering event, a countdown of a specified time starts. When the end of the time period is reached the record is proposed for destruction. If no one can declare a reason to keep the record it is deleted.

Triggering events could be last date record amended, last date record viewed, and record status = "transitory". And/or include activity of record owner, or persons with interest in record. NO activity, transitory records, nothing happening for x months, then delete.
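
A disposition rule of this kind is easy to sketch in Python. The following is a minimal illustration under stated assumptions: the status names, the 6-month retention for "raw" data (taken from the example above), and the function names are all hypothetical, and "last activity" stands in for whichever triggering event is chosen.

```python
from datetime import date

# Assumed retention periods in months; statuses not listed are kept indefinitely.
RETENTION_MONTHS = {"raw": 6}


def months_between(earlier, later):
    """Whole calendar months elapsed from earlier to later."""
    months = (later.year - earlier.year) * 12 + (later.month - earlier.month)
    if later.day < earlier.day:
        months -= 1
    return months


def propose_for_deletion(status, last_activity, today, has_objection=False):
    """Return True if a record should be proposed for deletion.

    A record is proposed only when its status has a retention period, that
    period has run out since the last activity, and nobody has declared a
    reason to keep it.
    """
    limit = RETENTION_MONTHS.get(status)
    if limit is None or has_objection:
        return False
    return months_between(last_activity, today) >= limit
```

Note that the function only proposes deletion; as in the records management model, a person still gets the chance to object before anything is removed.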

As an aside, how many people are familiar with the 5 steps to change management? Awareness, understanding, acceptance, commitment, action. It seems to me that there is a lot of change management required in a wiki to get people to move together in an agreeable direction.

Thxs Peter --PeterP 08:48, 13 April 2008 (EDT)


Good points.

I am reminded that just because an article is not being edited, does not mean that it is not being viewed. Just because an article is not being viewed does not mean it is not valued.

Eventually, there is NO user of this site, Dallan included, that will not cease editing their articles. It would not be good for this site if people could not contribute to it with confidence that their contributions would remain.

Also, in the same vein, if the criteria for deletion came to be that they had to be "good" articles (or at least not "bad" articles), you'd need to be able to define criteria for good and bad articles. An obvious criterion would be that they meet BCG standards. How many articles on this site meet those standards? Not mine, I know.

Q 09:24, 13 April 2008 (EDT)


Wikipedia has a set of templates that anyone can add to a page to say for example that it doesn't contain source citations or is biased. These serve as flags for others to improve the articles. But articles not meeting wikipedia's criteria don't get deleted except in special circumstances. I'd be in favor of coming up with a set of templates along these lines to flag articles. But I wouldn't want to delete pages just because they weren't good quality and haven't been edited in awhile. I've uploaded my genealogy and many of those pages haven't been edited in quite awhile, and many don't have good source citations. But I'm hoping that they'll get better over time. I'd personally hate to see them deleted.--Dallan 15:08, 15 April 2008 (EDT)


Hi Dallan,

If one chooses to allow any GEDCOM upload without any criteria, then I certainly vote for some kind of status flags. One could be junk, or more politely unsourced, and a second could be sourced but only with meaningless sources such as WFT #3 or so-and-so's GEDCOM, etc. All of these could be under one status flag.

One could isolate these trees as unavailable for automatic merging until some person chooses to edit the pages.

Then perhaps after a certain time period, perhaps one year, active users on WeRelate could vote whether or not to keep the pages or delete the pages. You could place a warning on the registration page that possibly one's pages could be deleted in the future. --Beth 10:10, 4 May 2008 (EDT)


Something along these lines makes sense. I'm not sure exactly how, and I'm not sure whether removing unsourced abandoned trees should be proactive or reactive, but it seems like we should come up with something in this area.--Dallan 15:26, 6 May 2008 (EDT)


GEDCOMs in the digital library? [6 May 2008]

Do we support the upload of GEDCOM files to the digital library? I'm struck that, for some people who are not yet sure about whether they want to commit to the process of wiki genealogy, or if they have an unusually large GEDCOM (say, over 2K people) we should encourage them instead to protect their current GEDCOM by loading it to the digital library with whatever cover material they can muster. Then, instead of uploading their entire GEDCOM into werelate, we give them guidance on how to carve up their work to upload piecemeal.

The GEDCOM standard has been a help for genealogy, but a hindrance as well. Instead of folks focusing on a small set of ancestors that they reasonably have the time and interest to properly research, they become slaves to the maintenance of a large data file that often turns out to contain tremendous amounts of crap. Tell them to abandon that stuff and focus on a more reasonable set of goals and they'll run off screaming. On the other hand, tell them to archive their work in a labelled and maintained repository like the digital library, while carving out the subset that they really want to actively continue work on for use in WeRelate, and we may all be better off.

I know that Dallan is looking at ways to improve the upload process, so that duplication can be suppressed at the start, but that's only part of the challenge. The real challenge is uploads of data that the user never really intends to actively work with.--Jrm03063 15:11, 5 May 2008 (EDT)


You should be able to upload your GEDCOM to the digital library. I haven't tried it, but I've added GEDCOM as an accepted file type to the library. I hadn't thought about having people upload a complete GEDCOM to the digital library and then copy just a portion of it to the wiki, but that seems like a really good idea. We could even have links from the wiki pages that were on the boundary of what was carved out of the GEDCOM pointing back into the GEDCOM file. (I wish there were two of me.)--Dallan 15:26, 6 May 2008 (EDT)


Usage Statistics [15 May 2008]

"Junk" genealogy is a fact of life in genealogy. It's been around for a looong time and is not a recent phenomenon, but it has taken on a life of its own with the internet. I suspect that for most services, such as Ancestry, there's really no advantage in purging junk. Perhaps the philosophy that rules is "something, anything, is better than nothing". There's a certain amount of truth to that, unpleasant though it is. The more people use a site, the more successful it will be, at least in terms of survival.

With that in mind, here's a small summary of traffic on the main genealogy wikis and some other sites for comparison. These data are from Quantcast.com, and are for the month of April. I've added some interpretive information about each (number of articles and functionality comments).

Datum Type                    Ancestry  GenCircles.com  Genealogy  WeRelate  WikiTree  FamilySearch  Rodovid
Wiki?                         No        No              Yes        Yes       Yes       Yes           Yes
Rank                          287       21268           108397     136773    184445    287304        ND
Unique Hits per Month         4.8M      102996          15192      11432     7892      4550          ND
Visits per Month              44.9M     371021          3709       3524      0         6533          ND
Visits/Unique                 9.30      3.60            0.24       0.31      0.00      1.44          ND
Audience Comp (Passersby)     59        66              85         83        83        75            ND
Audience Comp (Regulars)      38        34              15         17        17        25            ND
Audience Comp (Addicts)       4         0               0          0         0         0             ND
Share of Visits (Passersby)   12        27              71         74        71        53            ND
Share of Visits (Regulars)    41        73              29         26        29        47            ND
Share of Visits (Addicts)     47        0               0          0         0         0             ND
Number of Articles            AVBN*     ABN*            20K        2M        ?         ?             100K
GEDCOM Support                Yes       Yes             partial    Yes       assisted  No            ?
Guided Data Entry*            ?         ?               templates  Yes       templates No            Yes
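
The visits/unique row is simply visits per month divided by unique hits per month. A short Python check, using the exact figures from the table, is given below as an illustration; Ancestry is left out because its inputs are only given rounded (4.8M and 44.9M), and the variable and function names are made up for the example.

```python
# (unique hits per month, visits per month), copied from the table above
traffic = {
    "GenCircles.com": (102996, 371021),
    "Genealogy": (15192, 3709),
    "WeRelate": (11432, 3524),
    "WikiTree": (7892, 0),
    "FamilySearch": (4550, 6533),
}


def visits_per_unique(unique_hits, visits):
    """Average number of visits per unique visitor, to two decimals."""
    return round(visits / unique_hits, 2)


ratios = {site: visits_per_unique(u, v) for site, (u, v) in traffic.items()}
for site, ratio in ratios.items():
    print(f"{site:15s} {ratio:.2f}")
```

A ratio below 1 (as for Genealogy and WeRelate) means fewer recorded visits than unique visitors, which suggests the two Quantcast figures are estimated separately and should be compared with some caution.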

None of the genealogy wikis shown here are anywhere close to Ancestry or even GenCircles. In terms of traffic Genealogy and WeRelate are about the same. FamilySearch has already grown to more site visits than either G or WR, but that may be because it's new. Rodovid seems to be losing ground; last month it garnered about 1900 hits. It's now dropped off the board (insufficient data to report), though it's still active. The high traffic count for Genealogy is, I believe, due in part to recent changes in layout (much better looking than it used to be), but I don't think that's the real driver. It's being visited more, but actual page creation seems to have dropped off. What's really driving the visitation numbers for Genealogy is its connection to Wikia, and I believe an advertising campaign that's made it somewhat more visible.

Among wikis the major distinction is the total number of articles. WeRelate is clearly the front runner here, with 2M. Its most serious competitor in terms of site activity is Genealogy with 20K articles. WikiTree has 100K articles, but its activity is lower.

The greater number of articles on WeRelate is almost certainly due to its GEDCOM import capability. Genealogy has a similar capability, but it's not been effectively implemented. Rodovid may have this capability (I'm told) but it's not obvious. WikiTree can do it but it requires the operator to insert it---not automatic---that's probably why it has 100K articles, but the fact that it's not automatic is a major barrier for it.

The point of this is that I believe that what is driving WeRelate's success is its GEDCOM import capability coupled with a well thought out manual data entry system. It's the GEDCOM load that brings the useful traffic. None of the other wikis have functioning GEDCOM support.

On this site getting folks to do more than dump their GEDCOM is a challenge, but first you have to get them here. My guess is that much less than 10% of the people who dump a GedCom ever do more on the site---perhaps that's 1% who really stick. What's really going to drive the further success of the site is that small percent---these are the people ANY wiki needs---dedicated users who do more than simply dump a GEDCOM. Ultimately, they are the ones that are going to make the site work. But to get them you have to cast a large net---and ANYTHING that diminishes the number of people trying the site, is also going to diminish the number of users that turn into dedicated users.

Which is why you have to be very careful about doing things that will turn off those who come to the site for its GEDCOM dumping capability. That's a number that you want to increase, not decrease. Otherwise we might find ourselves struggling along like Rodovid with traffic so low it doesn't get picked up in the statistics. It would be nice to encourage people to do well with their genealogy, so that nothing here could be described as junk. No one else (wiki or otherwise) has succeeded with that, and putting up with junk genealogy is a small price to pay in return for persistence.

And finally, I might add that I've personally developed a fondness for "Junk Genealogy". True it is junk, but there's some utility in having about a million people looking for information. Even if they don't understand the need for citing sources, there's usually enough of a clue in their work that, once spotted, you can seek out the original data yourself. I LIKE having lots of folks looking for the same things I'm interested in. The fact that many of them don't know how to report what they find, or make effective use of it, is a small price to pay for having all of those busy hands finding good stuff. Q 10:28, 7 May 2008 (EDT)


I own Family Tree Legends and am a member of GenCircles. My family files were transferred to MyHeritage and I suspect that all of the GenCircles' files have been transferred there also. The transfer was an automatic transfer; without my knowledge. That did not bother me. I have now found the burial place of a great great grandfather and successfully ordered his funeral home records.
The family sites may be public or private; your option. I have not used the new program but I suspect it works similarly to Family Tree Legends. The genealogy software is on your personal computer and you enter data into your program as usual. The data entered is automatically entered on your web page on the MyHeritage site. You receive notification of Smart Matches as one did with GenCircles.
The capability of having a genie program with reports and charts, together with the capability of automatically creating web page entries with no duplicate typing, is unsurpassed in my opinion.
It would be fantastic if WeRelate had a similar capability. WeRelate is not difficult, but I have not had much assistance with my data because others seem to think it is difficult to learn how to use the site.
Because you have GenCircles in your chart, I thought you might be interested in the new site. Here is the link for MyHeritage. [1] --Beth 10:51, 7 May 2008 (EDT)

It pains me, but I think this analysis is sound. Most of the "junk" genealogy I've encountered over the last six weeks or so was really just inadequate genealogy - vast wastelands of unsourced names with nothing but dates for birth, death, and marriage. Much more often than not, the information is correct or (at least) flawed in a way that is well known or has been documented as a flaw in the literature. It can be an odious task to work through merging the stuff, but I think that was mostly because we've got a backlog of a couple of years of stuff that was almost entirely unmerged. I noticed that individual trees, added to a reasonably well merged space, can be merged in pretty quickly when you know how to go about it.

I still think that we should encourage folks to think critically about their purposes before uploading a GEDCOM. The paradigm shift from __my__ tree to __our__ shared genealogy space is a serious jump for folks, and it can't be reinforced enough. If they are simply looking for a place to archive their GEDCOM (or perhaps their TMG database or whatever), then the digital library may be a better choice. If they don't have an interest in working cooperatively, leaving their database where it can be picked up by another researcher may be the best thing to do. If they have a large GEDCOM but a core set of folks that they are really interested in working, they may want to take a hybrid approach - GEDCOM to the digital library and a subset uploaded to WeRelate. Having offered that guidance, we probably have to trust that folks will make good decisions more often than not. When they make very bad decisions, we can always fall back on the recently used informal approach with one notorious upload - deleted by popular demand.--Jrm03063 11:25, 7 May 2008 (EDT)


I agree with everything that's been said. Thank-you for the analysis! These are all great ideas.--Dallan 10:44, 10 May 2008 (EDT)


Junk genealogy? I learned very early to be very careful when uploading anywhere. So I have 'special' gedcoms to upload with hardly any sources named. If I find I like and trust the site, I upload a better file or do as I started here. Add them manually, as I have time. So, my files would be listed in this 'junk' talk, as I have not been able to add a lot lately. Or what constitutes true 'junk'?

WeRelate has as many entries as it does BECAUSE it can take in GEDCOMs. It is more complicated than the other gen wiki sites. And I still dislike the search here.

I have gotten many leads in 'junk genealogy' files. They tell me which direction to go or to just look elsewhere. I don't believe that a file of 10,000 or more can have sources compiled by one person; it has to be a file that was put together by taking other people's work. So what is 'junk'?

Abandoned files? It is a lot harder to understand how this place works. Perhaps they have gone away just frustrated. Perhaps they check back periodically to see if there are changes. People tend to do what is easy when putting up their files. Not many have time to learn a new format.

I don't know if anyone has even expressed an interest in my files. The only person watching is my cousin who I told to join so she could add if she chose. She hasn't. I have never gotten any messages from here.

I'm not sure what I will do now. If I am going to be 'junked' I would prefer to delete my own files. Just my ramblings.--Twigs 11:51, 15 May 2008 (EDT)


I think some people define it as unknown (or unknowable) genealogy, lacking in source support. I think the term is a little more abstract for us however, and it probably is more a function of the contributor's conduct than of any fundamental qualities of their data at any point in time. Whether the space of your interest is ten people or ten thousand, since we're sharing the space and any overlapping research, we hope that anyone jumping in will be interested in improving the quality of their contribution going forward - regardless of where they start.

I suppose it could be put another way. Imagine a group of people doing old-fashioned genealogy collectively. Maybe they share a file cabinet at the local historical society and the group has a set of general conventions for how to record information and sources. The group tries very hard to be diligent about getting their information correct and complete, as well as citing sources so that other researchers can review and expand their work - but of course it still is of uneven quality. Now imagine someone showing up at a meeting of the group, throwing vast chunks of material they don't understand (or plan to understand) into the group's cabinet. Then, they just disappear. What is the group to do with such a contribution? Does someone suspend their own research interests and start wading through the contribution to bring it up to the quality standards of the community? Or do they just extract it, set it aside, and wait for someone with actual interest in that area to adopt the stuff and take responsibility for it?

If what you are doing would seem brusque in a group meeting around a table once a month, then it would probably be received unenthusiastically in this context.

If you have a genuine interest but are simply starting small to see if this all works for you - great! Welcome! Nice to have you here! Ask for help any time! If you're just looking for some place to archive a GEDCOM without any intentions of working the stuff further, then I suggest either the digital library or another site that archives GEDCOMs from any source.

I guess it's all a wordy way to say "play nice".--Jrm03063 13:37, 15 May 2008 (EDT)


One more thing - repeated uploads will not do what you think. The shared data space would wind up with both the old and new information, and someone would have to merge the material. I'm curious - why hold back sources? What is the issue of trust that concerns you?--Jrm03063 13:43, 15 May 2008 (EDT)


I meant no offense by commenting. I did not know that was not playing nice. I am sorry.--Twigs 15:54, 15 May 2008 (EDT)


Oh my goodness! Of course you're "playing nice" - you're talking to folks - that's participating in the group! ....and I'm only one person in this community. I'm only sharing my idea of things, which hopefully is something like the mainstream, but who knows? I was just trying to help you understand how one other person sees this space and what's behind this weird notion of "junk genealogy".--Jrm03063 17:54, 15 May 2008 (EDT)


Leaving data from gedcoms sourced by WFT etc. [22 July 2008]

Hello everyone,

I have changed my mind. I think that we should leave all of the pages uploaded on WeRelate. Using Dallan's new search engine, I discovered a gedcom that had been uploaded in May of this year. This is a family that I have researched. I have not removed the source for WFT nor have I deleted data sourced by WFT that I do not have. To date there have been no conflicts in the data.

What I have chosen to do is to enter the data that I have and source my data. The user does not have a profile, and I don't intend to contact the user. If she receives a notification via email and contacts me, that is fine.

I created a new tree and am adding the pages to it as I edit them. I am researching the Coker line and this person is researching the Meadors line, so some of the pages will not be added to my newly created tree. You can view the history of one of the pages here: Person:Elijah Coker (1). I use FTE all of the time and am not sure about the navigation if the Family Tree Explorer is dished.

Trees that have been uploaded via gedcom with no activity that have duplicate pages in another inactive gedcom should be automatically merged.--Beth 20:48, 22 July 2008 (EDT)
