The problem of the commons



A discussion on "Pando" and the importance of Good Sourcing

The use of original sources, such as US Census records, helps ensure Genealogy Well Done
Ultimately, a key problem for WeRelate is what was once called "The Problem of the Commons".

This was a fairly popular term back in the 1960s, when the environmental movement was getting off the ground. It referred to the idea that some things are "owned by all", and that just because you can do something doesn't mean you should be allowed to; that is, a common resource (such as the village green) can't be polluted simply to meet a single individual's needs and desires.

Colonial times offer a good example in the damming of small streams for gristmills and the like. Folks who owned the land on both sides of a stream sometimes felt they had the right to do whatever they wanted with it. That became a problem because downstream settlers often depended on the stream flowing freely. A commonly cited example: migratory fish species often spawned in the headwaters of these streams. Damming the streams meant the fish couldn't spawn, and the downstream adult populations plummeted, to the loss of those downstreamers who depended on the fish as part of their food supply. It is for this reason that colonial governments insisted on authorizing the construction of mills and such.

The "Pando" Concept

In the case of WeRelate, the "Problem of the Commons" is front and center with the concept of "Pando". Theoretically, we have one "card" (or page) per person. People can't be allowed to change data on that one card just because they have something different from what's there now. Why? Because that card is "owned" in common by everyone else who uses WeRelate and is interested in that person.

If you scan the data on Ancestry trees (or other genealogical websites), it's common to find literally hundreds of separate cards (or pages) for a given person. It's also fairly common to find a dozen different dates of birth for the same person, as well as a variety of dates of death, spouses, parents (or multiple spouses and parents), etc. Each date or other piece of information presented is obviously considered "correct" by the card's author. This works out just fine on a site like Ancestry, where anyone can create a new and different "card" for their ancestor, giving exactly what they think the data should be. But it doesn't work out so well on WeRelate, because we only have that one card for each individual.

That creates a problem: Whose data gets to go on that card? How do you decide which data is "right"?

What is a "Good Source"?

The answer to that lies with "sources". If a DOB of 4 July 1818 is sourced to a particular valid, original document (say, a family bible), then there's some justification for saying a DOB of 4 July 1818 is probably correct.

But if you just have the DOB, and not the source, there's no particular justification for that particular data element. Without the source, 4 July 1818 is no better or worse than 5 July 1818, or 4 August 1818, or 1 January 1800. Without the sources there's no way to decide which of many different alternatives is "the right one". But a date with a sound source should always "win", at least in the short term, until a better source comes to hand.

But what if someone says:

"My date is so sourced. See here, it came from Bob's Big GedCom. It's sourced, so it's right. So, since it's right, I want my date used. Which means your wrong date has to go".

The key here is in the phrase "sound source". Even though this person knows where they got their information, it's not soundly sourced. "Bob's Big GedCom" is, simply put, hearsay evidence. It has no fundamental reliability. A family bible record would normally take precedence over "Bob's Big GedCom". Perhaps Bob got that date from a family bible record. If so, he didn't include it in his data. He might be right, but someone with a different date, sourced to a sound document, should still win. If you don't provide your sources, your data may or may not be right, but either way it's suspect, simply because it has no source.[1]

What I'm saying is that WeRelate's problem of the commons has a natural solution in the form of "best sourced data is shown".
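For readers who think in code, here is a minimal sketch of the "best sourced data is shown" rule. It is only an illustration under stated assumptions: the `Claim` record, the two source kinds, and their numeric ranks are hypothetical, not anything WeRelate actually implements.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical reliability ranks, for illustration only; higher is better.
SOURCE_RANK = {
    "original_document": 2,  # e.g. a family bible or census record
    "derivative": 1,         # e.g. someone else's GEDCOM or online tree
}


@dataclass
class Claim:
    value: str                         # the asserted fact, e.g. a date of birth
    source: Optional[str] = None       # citation text, or None if unsourced
    source_kind: Optional[str] = None  # key into SOURCE_RANK


def rank(claim: Claim) -> int:
    """Unsourced claims rank 0; sourced claims rank by the kind of source."""
    if claim.source is None:
        return 0
    return SOURCE_RANK.get(claim.source_kind, 0)


def best_claim(claims):
    """Return the best-sourced claim, or None if no claim has a usable source."""
    ranked = [c for c in claims if rank(c) > 0]
    if not ranked:
        # Without sources there is no way to pick among the alternatives.
        return None
    return max(ranked, key=rank)


claims = [
    Claim("5 July 1818"),
    Claim("4 August 1818", "Bob's Big GedCom", "derivative"),
    Claim("4 July 1818", "Family bible", "original_document"),
]
winner = best_claim(claims)
print(winner.value)
```

The displayed value is recomputed whenever a new claim arrives, so a date "wins" only until a better source comes to hand.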

Ultimately, because of Pando, things should become "self policed", but only when the users understand the value of a "well-documented" source, and why "Bob's Big GedCom" or other undocumented sources may not be good sources.


  1. Probably not the best example. Family bibles seem by their nature to be highly reliable, but in fact we often don't know when the information was recorded, or by whom. Sometimes the data was recorded long after the fact, and may be just guesswork.