User:Jrich/The Need for Sources




Sources Are Required in a Collaborative Effort Such as WeRelate

In WeRelate, we are collaborating with people we don't know. This means you cannot expect other people to simply take your word for something. Since no living persons should be entered, it is likely neither person has personal knowledge of the target being documented, or at best, it is so long ago as to be rendered questionable by the failings of normal human memory. So providing good sources is the only way of "proving" what we assert. Presenting data on a WeRelate page is like trying a case in court, where you must marshal evidence to prove your case, and disprove the other case(s). As they say on "Law and Order" all the time, it is not what you know, but what you can prove that matters.

People that do not provide the sources for the data they input are not helping the process of collaboration. Somebody that looks at a page with no sources is not helped. Furthermore, it sets an example that encourages others to follow the same lazy, self-centered genealogy practices. Sources are needed to justify each fact: birth, marriage, death, both the date and the location. A good source is not just a website, or even a book, that asserts something - though sometimes that is the best we can do. Even when done by a respected genealogist, an assertion still leaves us unable to assess the credibility of the assertion, and judge it against other possibilities. It is necessary to know why the fact is asserted. This usually requires a reference to a contemporary document, such as a will, church records, vital records, a diary, a newspaper, etc. Often, multiple sources are needed for each fact when discrepancies arise.

People that do not provide sources should not be surprised if their data gets changed without consultation or respect. It will be impossible for people merging duplicates to have any clue why their data value should be preserved. Without sources, it becomes too easy to assume somebody misidentified the person, or made a typo, etc. Only by providing sources, can one show that there is actually reason for believing the facts they have presented.

Collaboration is a Process, Not an Immediate Result

It seems to me that the process of identifying sources at WeRelate is the same process as dealing with the genealogical data: collaboration. That means expecting sources to look only mature, polished and complete is unreasonable. Instead, just like the data they support, they are often part of an immature research project. But such work can still give a base to build on. For that reason, I don't think any source or data should be removed simply because it fails to meet somebody's idea of a good source, unless that person has a better-quality source to contribute. Merely recognize that situation for what it is: early in the process. If it bothers you, make it better, but don't disrespect the work somebody did getting that far. It is often the result of years of searching even to find poor sources for some facts, but this very iffy piece of data may be all some other researcher needs to then find good sources, and solidify the toehold. If you remove the bottom rungs of the ladder, only a tall person can start climbing to the top.

I believe the key is to crowd out bad genealogy with good genealogy, not simply to remove bad genealogy. Removing the bad stuff simply paves the way for somebody else to come along and reenter it.

The most important thing in identifying sources is to let other people find them. Unfortunately, it is hard to say there is a standard way of doing this. There are multiple styles of bibliography. A recent search I did yielded six major styles, and that list didn't mention Elizabeth Shown Mills, whose style seems to be the one accepted by many genealogists. Most of these styles have much more ambitious agendas than simply making it possible to locate that source and inspect it. I think too many people get hung up on dotting the i's and crossing the t's, without regard to the very small percentage of times these issues make a difference. And when it does make a difference, I suspect collaboration will make sure that an entry has the i's dotted and the t's crossed. This is not arguing for sloppiness and inaccuracy. It is only suggesting more tolerance for less than perfect work, if said work meets the minimum standard of identifying the source so another person can find it. If not, then collaborate, by improving it.

Distinguishing People: A Suggestion to WeRelate Users

After many months of adding pages where I try to ensure I am not creating duplicate pages, out of exasperation, I would like to humbly suggest that users should try not to create a page where the only information displayed is the name of the person. Relationships help (though they in turn are just names), but dates and locations are critical for identifying a person. If you don't have information beyond the name, my opinion is that the proper thing to do is not to create a page for that person. This is particularly true if all or part of the person's name is Unknown. There is no reason to create a page for Mary Unknown, unless you need to store a birth date, death date, or some other fact that distinguishes this Mary Unknown from the literally millions of other Mary Unknowns.

A page with only a name matches too many searches, when the person found is not even remotely of interest, and is an impediment to other researchers. It is frustrating to be searching for, say, John and Jane Doe in colonial times, and have to scan 4 or 5 pages named this, often following links two or more generations away to determine dates and locations, only to find out the pages titled John and Jane Doe are about modern people living nowhere near the area of interest. If the page even had an estimated marriage date, or a location, chances are it could have been ruled out from the search page with no digging. Taking the time to add such distinguishing information is merely being considerate of others who will be looking at the page you create.

For example, if you have a marriage with a date, create the Family page. But do not create the Person page for the husband and wife unless you have some information that needs to go there. If you know somebody's parents' names, but nothing else, do not create the Family page, merely note the relationship in the notes on the child's page, and leave it for another researcher to create a robust Family page.

What Makes WeRelate Different

I was just browsing one of the standard genealogy forums. A posting asked for information on some past ancestor of name XYZ, and a person replied with a message saying "My XYZ line is", giving several generations, and never mentioning the person asked about. Other replies merely stated facts without explaining how they were known, as if the replying author's acceptance of those facts was sufficient to make them true.

As a user, what strikes me as different from this typical behavior, with regard to WeRelate, is that WeRelate is not about your genealogy. Even your ancestors that happen to be in WeRelate aren't about you; they are about reaching a consensus about what was. You have no more right or authority with regard to that page than someone who might even be a non-descendant.

To that end, I have some suggestions that users could consider as they work with WeRelate. I don't pretend to have any authority in saying this, other than noticing that many WeRelate users apparently fall into the old behavior patterns, such as marking their ancestors, or writing about the whole history of a surname on a Person page, or not providing sources.

  1. Input your data with the reader in mind. Put data on the page in a form that will be useful to others. Recognize that all you know about the reader is that they happen to be looking at that one page. Don't assume they know the information presented on some other page, or that they have seen any particular source, or that they are necessarily a descendant of that person.
  2. Provide your sources. Expect that you may have to convince a person who disagrees with your conclusion. To reach a consensus, we must strive to find the primary evidence that justifies all the stated information, so different hypotheses can be compared and judged by the whole community dispassionately on the basis of verifiable facts. Attempt to link all appropriate source references to source pages to identify your sources unambiguously.
  3. Be precise but don't say more than is known. If all you know is the baptism date, don't use it as a birth date. If you have a will date, don't pretend it is the death date. Explain your estimates, guesses, assumptions and hunches, and don't present these approximations as known facts.
  4. Collaborate. Improve any page you touch. Ask if the change you are about to make adds information that will be meaningful and credible to other people interested in that page, or if you are making changes for the sake of personal preference. No page belongs to you. Be flexible about spellings and formatting if existing data communicates adequately.
  5. Avoid creating pages that have no facts, just a name. Consider that a person looking for the page would like to get an idea of the century they lived in, the general region they lived in, parents and spouse if possible, in order to know who it is. If none of this is provided, it clutters up search lists with unidentifiable people. See the topic above on distinguishing people for more detail. --Jrich 12:27, 3 May 2010 (EDT)

We don't need no stinking Ancestry

I just spent the better part of 2 days cleaning up one colonial family entered by a user that thought copying data from Ancestry Family Trees to WeRelate was a good idea. The data was full of mistakes and much of what was right was presented in an inaccurate way (e.g., leaving the Bef. and Aft. qualifiers off dates, representing baptism dates as birth dates, etc.).

This type of data is not needed in WeRelate.

All we know about colonial families tends to come from a small number of primary sources. When we know the birth of a child, it is probably because their birth was recorded in the town's vital records. This gets repeated in a book, then to a website, and so on, and so on, until it ends up in an Ancestry Family Tree. It is a secret whispered around the room. What we want to do is get fewer iterations away from the original source, not more. Unfortunately, this is compounded, because it seems that those who believe the whispered secrets, instead of going back to the source, are usually the same ones that lack the genealogical experience and knowledge to make good judgments about the plausibility of the secrets they've heard.

Putting data into WeRelate is not an activity one should be doing to show off their "research". It should be done because 1) you want your data reviewed by a wider audience, and possibly corrected, AND 2) you think the data will be useful to other people interested in the Person or Family you are working on. Certainly, data on websites like Ancestry, which is likely free at your local public library or Family History Center, or in AFN files on, can be considered to have been seen by anybody that wants to see it. So copying this data to WeRelate has little value. In fact, naked data, without sources, has little value. Unfortunately, Internet genealogy is so full of mistakes generated by bad genealogical practices, such as name matching, believing anonymous submissions, relying on single secondary sources, and cousin collecting, that to protect oneself one must start out assuming data is wrong unless sources are there to explain how we know it. What has value is trying to identify and cite the primary evidence that lets us know the facts.

How to Spot Good and Bad Genealogy

The goal of genealogy should be truth. We don't have first-hand knowledge of most of the events we document in genealogy, so we can only approach truth as an asymptote by using what evidence we can find. But the truth is what we should strive to identify for all ancestors, good person or bad.

The Genealogical Proof Standard (GPS) has 5 main points:

  • reasonably exhaustive search
  • source citations
  • analysis/correlation of information
  • resolution of conflicting evidence
  • coherent conclusion

Signs of Good Genealogy

  • Generally each fact is supported by a source citation, ideally that quotes or references a primary source. Very few, or no, facts lack a source citation.
  • Estimated/approximate values are explained.
  • Citation of vital records, wills, deeds, church records, etc., is frequent.
  • All data is not simply a re-hash of a single secondary or tertiary source. When plausible, conflicting theories are presented and discussed.
  • The data is not self-contradicting, unless accompanied by a discussion.
  • The data is interpreted in an appropriate historical context.

Signs of Bad Genealogy

Bad genealogy is usually done by individuals that use a methodology inconsistent with the GPS. Everybody makes errors, but some people work in a way that increases the frequency of mistakes. Such practices include

  • poor record keeping
  • relying too much on secondary sources, or worse, Internet family trees
  • laziness and/or rush to fill in the blanks
  • lack of knowledge of historical context

Poor Record Keeping

The chief problem here is lack of, or ambiguous, source citations.

There is much discussion on lack of sources on this page already. Unless a source citation is present, a reader might as well assume the person copied the data from the first website they found having the right name listed on it. Therefore, it is probably wrong. Even if a website appears to answer your problems, if there are no sources cited, do not use it as more than a research hypothesis, i.e., it is not ready to post in public. Only after good, confirming sources are found, should it be posted.

Since over time, conclusions may change, good source citations should indicate what the source said, either as quote (if reasonable and not copyright protected) or by abstract. A researcher should only attribute to a source what it says, and not give the appearance that it supports facts about which it does not have anything to say.

If conversion of data is required, good source citations will retain the original format. For example, if a date is given numerically, the original numeric form should be retained in the citation. The place names and spellings given by a source should be retained as given. Conversions may be added if it is clear they are provided by the researcher and not the source.

If there are many examples of data being switched between like-named persons, or siblings within a family, etc., this may be a sign that the researcher is not a reliable transcriber of data. Similarly, if old documents are rendered into modern language without indication, the data could have picked up a few of the researcher's assumptions or biases. The use of years instead of precise dates, baptism dates for birth dates, and will dates instead of death dates may all indicate imprecise or poor record-keeping.

Relying on poor sources

Every genealogist has made errors. Relying on secondary sources exposes you to these problems. Even multiple secondary sources may not reduce this problem, since many secondary sources are just copies of the first one. Look for sources that explain the facts by references to primary documents, or cite the primary documents themselves as well as a secondary source.

The more a secondary source can be confirmed by locating supporting primary documents, the more likely that this source is reliable. Wills are usually excellent sources, since they usually document multiple relationships that can identify and confirm the participants multiple times over, they were reviewed by courts in a contemporary timeframe, and the participants had a fiscal motivation to see that it was done right. Secondary sources that frequently refer to information contained in wills are usually more reliable.

Citation of multiple sources makes it more likely that the researcher is aware of known controversies and is not relying on outdated research. In other words, they are more likely to have done a reasonably exhaustive survey themselves, making your citation of them more reliable.

Genealogies of whole families (not just single branches) may be more aware of such troublesome situations as multiple individuals having the same name, and may be useful in sorting them out. Town histories often have good access to town records, but much less access to records about events that happened outside of town. Genealogies by family associations may have access to the research of many people, and were probably reviewed by multiple knowledgeable reviewers, excepting those organizations made up of only one person.


Lazy genealogy usually means the research has not been exhaustive, that poor record keeping was involved, and that poor analysis was done. Signs of lazy genealogy may include:

  • No sources
  • Inconsistent data, such as a child born after a parent dies, indicating a lack of analysis
  • Dates for events with no places, indicating no confirmation or poor record-keeping
  • Only one child done out of a family
  • Only one marriage out of multiple marriages listed
  • Readily available facts missing, indicating cursory research, or lack of receptiveness to conflicting information

Some people are interested in how many cousins they can collect. The breadth of their research prevents them from doing "exhaustive" research on anybody. Other people abhor a data vacuum and will fill in the first thing they find just to have an answer. Both of these tendencies may have many of the same symptoms as laziness.

Lack of historical knowledge

It is hard to fault people for lack of knowledge, as we were all there once upon a time. But lack of knowledge can be a sign that a researcher is unable to use good judgment in analyzing their data, making it less reliable, and possibly resulting in good data being changed to be incorrect.

One of the biggest issues with history is the change from the Julian to the Gregorian calendar (1752 in England and its American colonies). Symptoms include:

  • Dates off by one, two, or three months (inability to interpret numeric dates)
  • Dates off by one year (genealogy software has options set wrong, or misinterpretation of dates)
  • Dates given as an interval of exactly one year (don't understand double dating notation)
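The first symptom above usually comes from reading an old numeric month with modern eyes. Before the 1752 change, when the English year began in March, records that number the months commonly count March as the first month. A minimal sketch of that mapping follows; the function name `old_style_month` is hypothetical, and whether a given record actually numbers months this way has to be judged from the record itself.

```python
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def old_style_month(number: int) -> str:
    """Name of month `number` when March is counted as month 1."""
    if not 1 <= number <= 12:
        raise ValueError(f"month number out of range: {number}")
    # Month 1 is March (index 2), so shift by one and wrap around the year.
    return MONTHS[(number + 1) % 12]
```

Under this numbering, `old_style_month(1)` gives March and `old_style_month(11)` gives January, which is why a date copied with modern month numbers ends up two months off.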

Other historical issues that are common:

  • specification of birth dates instead of baptisms for colonial people born in England
  • referencing events in towns before an area was populated
  • insufficient attention paid to location and distance

Good suggestions left on my Talk page for additional points will be brought forward. --Jrich 15:21, 29 August 2010 (EDT)

Resolving Differences

I just was working on a page that commented "DOD either 5 or 15 Mar 1822". This is impossible, if you think about it.

The person died on exactly one day. Whatever date they died on, it is set in history, and it is not "5 or 15". What the person meant to say is that different sources give the date of death as 5 or 15 Mar 1822, depending on which source you believe.

Semantics, you say? No, actually it is an important distinction if one ever wants to resolve this problem. Because the question isn't on what day did the person die. We can't actually know that, since we didn't live back then. Rather, all we can do is ask, which source do we think is the most likely to be correct? Which date is consistent with other known facts from sources we think are credible, also? Since the contributor (in this case) didn't list the sources, they didn't give us any tools to help resolve this problem. We don't even know why the contributor thought there was a problem. We are completely clueless because no sources were given. We have to start from square one.

How many times have you seen postings in a genealogical forum where some posted assertion is refuted by another person by saying "I have...[something different]"? What a quandary! We have your opinion, and we have somebody else's opinion. Well, that tells me... nothing! If all it takes is one person to "have" something, then we'll never get an answer. Only by naming sources do we really say anything. Only then can we hope to decide which answer is most likely.

In the above case, the vital records recorded by the town clerk soon after the death, then transcribed and published, said 5 Mar 1822. Now there is a lot of room for error in this process. The town clerk could have been given the wrong information by his informant, he could have mis-recorded it, he could have had sloppy handwriting that was misread, there could be a typesetting error at the printer, etc. But the presumption is that everybody did the best they could, they were all good at their job, and probably there is no error. Unless there is evidence of one...

What evidence, you say? No source was given, I have no evidence to show this town clerk made an error. Exactly! We don't know what the WeRelate contributor was looking at when they added "or 15" to their comment. Did they locate a gravestone and the gravestone says 15 Mar? I don't know, it's not in Find A Grave. I could search and search, not even knowing if what I am looking for exists. Do I want to bother? Heck no! As far as I can tell, it was just a comment saying that two Ancestral Files gave different dates. Sheesh! Like that never happens. --Jrich 22:56, 14 March 2011 (EDT)

Which source came first?

Two grandsons of the original Giles Rickard, both named John, both married women named Mary. In the births of their children both John Rickard Senior and John Rickard Junior are listed with a wife named Mary. Fortunately, the Senior and Junior let us identify clearly who was father of which children. Senior being older is the one born 1652, while Junior being younger is the one born 1657. Not so easy identifying the wives, and which one married which John Rickard. One is Mary Cook, a granddaughter of Mayflower passenger Francis Cook, and the other is sometimes said to be Mary Snow, a daughter of William Snow and Rebecca Brown (still working on this).

An article in Mayflower Quarterly is supposed to resolve this issue, at least according to a mention in a book by the same author (if he says so himself?). I will write myself a note so that next time I schedule a field trip to my relatively local genealogy library, a few weeks from now, it will be one of many things I look up. But in the meantime, I thought I'd check if there was a synopsis posted on the Internet somewhere. If I could find the basis for the article's assertion, maybe I could confirm it now, rather than leaving this research thread hanging for a couple of weeks. Often, the answers to these controversies lie in widely accessible documents; it just takes someone to point them out. So I search for the title of the article, and get 23 hits. For a brief moment, I'm pleasantly surprised.

22 of those hits use the article title in the exact same sentence. Someone apparently posted a remark saying the author cleared up the mystery in such and such an article, and all the other industrious researchers simply cut and pasted this sentence into their own websites. Not a word is varied. 1 person probably looked up the article, wrote the sentence, and twenty-one cases of plagiarism by people who are too lazy to write their own sentence, much less bother reading the article themselves. (And unfortunately, the original didn't provide the synopsis that I was hoping to find, so of course, neither did the other twenty-one.)

By the way, the twenty-third mention? It was WeRelate! Apparently in an orgy of source citation, this article about Francis Cook's granddaughter was listed on Francis Cook's page. Now, it did not tie into any information mentioned on the page, was never referenced by a footnote, nor is the granddaughter mentioned on his page. I'm hardly in a position to complain when somebody actually cites a source, and I haven't read the article yet, but I have to wonder how relevant it will be to the rest of Francis Cook's children and grandchildren? Does it really need to be listed there, suggesting that it is recommended reading for anybody interested in any part of Francis Cook's family? I'll find out in a couple of weeks. --Jrich 11:56, 26 March 2011 (EDT)

I have since looked up this article, and thought I'd post this quick followup. I find that this article reached its conclusion by doing a handwriting analysis. I did not find it conclusive by itself, but combined with other circumstantial evidence, the conclusion that John Rickard Junior married Mary Cooke seems, by far, the most likely arrangement. I have posted abstracts in the source citations on Family:John Rickard and Mary Cooke (2) so future readers will have a good idea of its possible value to their research. And no, it is probably not useful to the general Francis Cooke researcher, only to this particular branch. As an incidental comment, I found some of the most interesting parts of the article to be comments by the author, Historian of the General Society of Mayflower Descendants, about the responsibility of making decisions about who is, and who isn't, accepted as a Mayflower Descendant. --Jrich 14:46, 24 April 2011 (EDT)

One About is as Good as Another (Precision with Dates)

When one sees an old style date expressed as 3 Jan 1702/03, one does not need to enter a date of 3 Jan 1702 and an alternate date of 3 Jan 1703. This notation is not meant to tell us the researcher was unsure of the year; rather, it represents a single date.

When one sees an old style date expressed as 3 Jan 1702/03, one does not need to enter between 3 Jan 1702 and 3 Jan 1703. This notation is not meant as shorthand for an interval that is conveniently exactly one year long; rather, it represents a single date.

When one sees an old style date expressed as 3 Jan 1702/03, one does not get to pick the date one likes the best. Nor should one pick just one. This introduces ambiguity and makes the date less useful than the double dating style, which indicates one date precisely with no ambiguity.
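To make the point concrete, here is a small sketch of reading a double-dated value as one date carrying two year labels. The names `DoubleDate` and `parse_double_date` are hypothetical, made up for this illustration, and the accepted format (day, month name, year/year) is an assumption; real records vary.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class DoubleDate:
    """One single day, labelled with both year reckonings."""
    day: int
    month: str
    old_style_year: int  # year when the new year begins on 25 March
    new_style_year: int  # year when the new year begins on 1 January


_PATTERN = re.compile(r"^\s*(\d{1,2})\s+([A-Za-z]+)\s+(\d{4})/(\d{1,4})\s*$")


def parse_double_date(text: str) -> DoubleDate:
    """Parse a value such as '3 Jan 1702/03' into a single DoubleDate."""
    match = _PATTERN.match(text)
    if match is None:
        raise ValueError(f"not a double-dated value: {text!r}")
    day, month, first, suffix = match.groups()
    old_year = int(first)
    # Expand the abbreviated second year: 1702/03 -> 1703, 1699/00 -> 1700.
    new_year = int(first[: len(first) - len(suffix)] + suffix)
    if new_year < old_year:
        new_year += 10 ** len(suffix)
    # The two labels always name consecutive years; anything else is an error.
    if new_year != old_year + 1:
        raise ValueError(f"years are not consecutive in {text!r}")
    return DoubleDate(int(day), month, old_year, new_year)
```

Calling `parse_double_date("3 Jan 1702/03")` returns one record with both year labels attached to the same day. There is no second date and no interval anywhere in the result, which is the whole point of the notation.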

If you don't understand the Julian to Gregorian calendar change and the month numbering shift of 1752, and/or you don't understand double-dating, you should not involve yourself in any genealogy before the year 1753. (1753 applies to the U.S. and other English territories. In different countries, different cutoffs may apply.)

If a date is entered Abt 1702, you should not enter the date as 1702. This is misleading and completely misrepresents what the first date is saying. The same goes for other qualifiers, such as Before and After.

If a date is entered Abt 1702, it is not really helpful to enter Abt 1703 as an alternate date. One about is as good as another and they both adequately indicate the date is not known exactly.

Thank you.

Rant du Jour

Family:Benjamin Harrington and Abigail Bigelow (1)

A family having children from 1685 to 1703, stops, then has children from 1723 to 1734. Come on! Does this even make sense to people? You don't even have to know much genealogy to be able to say this can't be right. So why would you copy it to WeRelate and other sites?

I look up the Ancestral File for alleged daughter Lydia, and 22 of 23 submissions show the wrong parents. Yet right on the Internet, free, no subscription or membership required, is a full transcription of the Watertown vital records, recorded by the town clerk who probably shook hands with the father, maybe even pinched the cheek of little Lydia, clearly naming her parents as George & Abial, not the Benjamin and Abigail who would have been 65 and 63 respectively when Lydia was born.

The town clerk has precisely recorded all the appropriate dates using double-dating. But the ancestral file these pages were copied from simply dropped it. If you don't understand the Julian calendar, you should not be touching any genealogy before 1753. It was clear, and now somebody's ignorance has made it essentially incorrect (most literature assumes plain 1727 means 1726/27, which is a year off from the recorded 1727/28). For Phineas, who died as an infant, they used the 1729 part for the birth, and the /30 part for the death: I guess you get to choose?

My pet peeve: the death date of Ruth is given with no location, no source and no marriage. Yes, she married. And there is no way one could know the death date unless one knew that. So somewhere along the way, somebody had that information and simply threw it away, keeping only the dates. Because they don't understand the chain of logic that goes into proof, they are merely interested in filling in some blanks.

I know a lot of garbage crept into WeRelate before some minimal safeguards were added to the GEDCOM upload. But why is it exactly the people who do things wrong who are the ones who seem to feel the need to celebrate themselves by broadcasting their freshly copied mistakes all around the Internet? News flash: it's not about you, it's about the ancestors, and how about trying to get it right, rather than find that 100,000th cousin, who probably isn't your cousin anyway, because you copied the wrong parents five generations ago?