About the WeRelate Transcript of the Genealogical Dictionary of the First Settlers of New England

Origin and Copyright

The original Genealogical Dictionary of the First Settlers of New England was published in four separate volumes between 1860 and 1862. Each volume contained appended additions and corrections. Further additions and corrections appeared in the New England Historical and Genealogical Register (NEHGR), Vol. XXVII, No. 2, April 1873, pp. 135-139. That content is out of copyright.

The content presented here, however, comes from a corrected electronic transcript prepared by Dr. Robert A. Kraft and Benjamin Dunning in 1994. Dr. Kraft is the Berg Professor of Religious Studies, Emeritus, in the School of Arts and Sciences at the University of Pennsylvania. The transcript appears here by his very kind permission. The copyright page of the transcript reads as follows:

"The electronic version has been adapted under the direction of Robert Kraft (assisted by Benjamin Dunning) from materials supplied by Automated Archives, 1160 South State, Suite 250, Orem UT 84058 in the following ways: missing lines have been added wherever they could be located (vol. 2 could not easily be checked since line format was not replicated; the corrections found in vols 1-4 have been integrated into the text; page numbers have been represented between double brackets; hyphens have been resolved, and some abbreviated names. NOTE that letter by letter verification has NOT yet been attempted. Copyright for the new electronic version by Robert Kraft, July 1994."

Creation of the WeRelate Version

Dr. Kraft's transcript exists in the form of four separate readable ASCII files, corresponding to the volumes of the original work. For presentation on WeRelate web pages, the content has been separated back into pages that correspond to those of the original published work. This partitioning has a number of benefits.

  • The web and original publication page paradigms are consistent
  • The page field of a source citation can present an active link to take a user to the appropriate page of the transcript
  • The standard talk/discussion page, backing every MediaWiki page, provides a handy place to discuss issues related to particular pages of the transcription, such as errors discovered after publication of the April 1873 corrections in NEHGR.
  • Pages of the transcript are tagged with the places to which they refer, the surnames of the people described, and the year range of events. Searches on those attributes therefore stand a fair chance of returning relevant specific pages of the transcript.
  • The surname index, which appears on the root page of the transcript, is not - itself - part of the transcript. Instead, it was created by the same text processing program that recognized the surname sections and page partitioning in Dr. Kraft's ASCII files.
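The page partitioning described above can be sketched roughly as follows. This is only an illustration, not the program actually used for WeRelate. It assumes that the double-bracket page numbers mentioned on the copyright page look like `[[16]]` and that each marker precedes the text of its page; the function name `split_pages` is hypothetical.

```python
import re

# Assumption: a page number between double brackets, e.g. "[[16]]".
PAGE_MARKER = re.compile(r"\[\[(\d+)\]\]")


def split_pages(volume_text):
    """Return {page_number: page_text} for the ASCII text of one volume.

    Assumes each marker comes BEFORE the text of the page it numbers.
    Text ahead of the first marker (front matter) is ignored.
    """
    pages = {}
    # With one capture group, re.split returns
    # [text_before_first_marker, number, text, number, text, ...]
    parts = PAGE_MARKER.split(volume_text)
    for i in range(1, len(parts) - 1, 2):
        pages[int(parts[i])] = parts[i + 1].strip("\n")
    return pages


sample = "front matter\n[[16]]\nADAMS, THOMAS, Braintree\n[[17]]\nmore text\n"
pages = split_pages(sample)
```

Each key of the returned dictionary could then become one wiki page, which is what makes a volume and page citation directly linkable.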

Transcript Creation Defects

Most of the initial steps - partitioning Kraft's transcript into pages, turning "See xxxxx" references into active links, and setting the initial values for places, surnames, and the date range - were performed by a simple text processing program. While quite effective, the implementation was not without flaws:

  • Surname sections were recognized by hallmarks that were typical, but not universal. Additional sections not recognized will have to be edited into the index by hand, with appropriate changes to page headers and addition of the proper section at the appropriate location in the transcript.
  • A bug in the processing of Surname sections sometimes doubles the surname so that it appears as LASTLAST.
  • The '£' character (among other special characters) did not translate correctly and appears as '�'.
  • The '£' character can be recreated (on most Windows keyboards) by holding down the -Alt- key and typing '163' on the numeric keypad, almost always on the right side of the keyboard.
  • The "See surname" idiom was replaced correctly with an active link only when it appeared immediately after a section name. When the phrase appears in the normal stream of text, no replacement occurred. Use the {{savagepage|vol|page|See surname}} template to create the link by hand.
  • Only a small number of common early New England place names (about 60) were known to the program, and only a few names in England. Places other than those will not be included in the initial place list for a transcript page.
  • Discrepancies in the transcription - such as spelling Billerica as Bi11erica (Arabic numeral '1' instead of lower case letter 'l') - were not recognized, so that spelling is not matched to Billerica, Massachusetts. Similarly, a date such as l777 (instead of 1777) is not recognized as a date and does not contribute to the date range for a page. Related to these are defects that appear to stem from optical character recognition; for example, the letter g is sometimes recognized as the digit nine. Errors of this type, while essentially "baked in" to the Kraft transcript, are still defects in the faithful recreation of the text that Kraft intended.
  • Savage makes use of the asterisk ('*') to designate members of the Massachusetts General Court. From time to time, these appear as the first character of a line of the transcription, where they are inadvertently interpreted as requesting a MediaWiki bullet item.

It is always appropriate to modify the transcript appearance in order to resolve defects of these types.
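Two of the defects listed above lend themselves to simple tooling. The following is a minimal sketch, not part of the original processing program: `escape_leading_asterisks` wraps a first-column asterisk in the `<nowiki>` escape, and `find_ocr_suspects` lists tokens that mix letters and digits (such as Bi11erica or l777) so that a person can review them. Both function names are hypothetical, and the second will also flag legitimate mixed tokens, so it only suggests candidates.

```python
import re


def escape_leading_asterisks(text):
    """Wrap an asterisk in the first column of any line in <nowiki> tags
    so that the wiki does not render the line as a bullet item."""
    return re.sub(r"(?m)^\*", "<nowiki>*</nowiki>", text)


# A token containing at least one letter AND at least one digit.
SUSPECT_TOKEN = re.compile(r"\b(?=\w*[A-Za-z])(?=\w*\d)\w+\b")


def find_ocr_suspects(text):
    """Return mixed letter/digit tokens for human review (may include
    false positives; nothing is corrected automatically)."""
    return SUSPECT_TOKEN.findall(text)


fixed = escape_leading_asterisks("*JOHN, Boston\nplain * text")
suspects = find_ocr_suspects("of Bi11erica, b. l777, d. 1801")
```

Running the escape a second time leaves the text unchanged, because an escaped line no longer begins with a bare asterisk.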

Common Practices

  • The simplest and most important use of the transcript is as a linkable target for page citations. A template has been created to make this very simple. For example, Thomas Adams is discussed in volume 1, page 16 of Savage. The page field of the Savage citation for Thomas Adams contains the template {{savagepg|1|16}} -> 1:16.
  • The transcript is meant to be annotated with links to any WeRelate Person page to which it unambiguously refers (for example, see volume 1, page 16).
  • The transcript narrative also embeds mention of important reference works upon which Savage relied. When recognized, these should be turned into active links to appropriate WeRelate "Source" pages.
  • Normal wiki practice is to create a link only the first time a term (or person) is seen in some context. We depart from that rule for Savage, since those not familiar with Savage can easily be confused about "who" is meant in the narrative. Instead, consider it proper to link every unambiguous reference.
  • There are a number of categories that are useful as link targets in the transcription. Among these are King Philip's War, Passengers on the Mayflower, Francis (1634) Passengers, Falls fight, and others. Not only are these useful for a reader of the transcript; the collection of Savage transcript back-links for each category also provides a survey of the places where that event is mentioned in Savage.
  • At present, other terms - such as place names in the body of the transcript - should not be linked (so as to allow linked names of individuals to stand out from the page clearly).
  • The transcript appearance should not change (in general) from that of the original Kraft ASCII files - or more particularly - from what we can properly infer as the intentions of Dr. Kraft. Faithfully maintain the line breaks, punctuation, capitalization, and retained abbreviations of the original text (as modified by application of the additions and errata). Repair of typographical, transcription, or WeRelate pre-processing errors (see above) is highly desirable.
  • The content of Savage consists of surname-specific sections. Those sections are already designated, in the WeRelate transcript, by Template:Savagetranscriptsection, which provides specific formatting for the situation. Those sections further consist of sketches that are associated with the family of one particular individual. In the original Savage, the only thing visually distinguishing the start of a new sketch is appearance of the given name in all capitals. For enhanced readability in the transcript, Template:Savagetranscriptsketch was created to designate such locations and provide more identifiable formatting than that of either the original or the Kraft transcription.
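When many citations need to be filled in, the page-citation template from the first item above can be generated rather than typed. This is a small sketch under stated assumptions: it uses only the `{{savagepg|volume|page}}` form quoted in this section, and the helper name `savage_citation` is hypothetical, not an existing WeRelate tool.

```python
def savage_citation(volume, page):
    """Build the wiki text for a Savage page citation,
    e.g. volume 1, page 16 -> {{savagepg|1|16}}."""
    if volume not in (1, 2, 3, 4):
        raise ValueError("Savage was published in four volumes (1-4)")
    if page < 1:
        raise ValueError("page numbers start at 1")
    return "{{savagepg|" + str(volume) + "|" + str(page) + "}}"


citation = savage_citation(1, 16)
```

The resulting string is what would go in the page field of the Savage source citation on a Person or Family page.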

When Savage Got it Wrong

  • Defective, but properly transcribed, content should not be corrected or discussed on the transcript narrative pages proper. Instead, use Template:Savagetranscriptdefect to designate the erroneous content, along with an index that indicates which defect this is on that page. The template will render the erroneous content with a strikethrough and place an active link at the end of the content to the corresponding discussion page and section named "Defect number".
  • Discussion and supporting research for WHY the designated content is considered defective should be summarized in the appropriate section of the talk page. See examples of this practice on pages 58 and 227 of volume 4. Notice on page 227 that two separate sections of narrative may be associated with the same error index. This is appropriate if there is indeed ONE error associated with both passages. Separate defects on a single page should be assigned simple increasing integer labels.

Survey of the Nuts and Bolts

The only direct wiki formatting operations that should be found in the transcript source are the line break, <br>, and the formatting escape around asterisks in the first column, <nowiki>*</nowiki>. All the other formatting operations are controlled by the following templates:

Supporting templates, used with transcript content other than in the transcript source proper, include:

Where We Stand - May 2013

The public phase of this project began in February of 2012, when 2343 pages of body content (volumes 1-4: 497, 577, 596, and 673 pages respectively), extracted from the Kraft files, were uploaded (exclusive of some introductory and closing material). A few statistics to this date:

  • 9180 individual person "sketches" designated, by insertion of the sketch template (40% of about 22,000 total).
  • 7591 total person page links in the transcript to 5219 unique person pages
  • 869 transcript pages link one or more person pages
  • 940+ Person and Family pages cite transcript pages via this template.
  • The most often linked person is Increase Mather, linked from 18 pages.
  • The most extensively linked transcript page is Volume 4, page 299, with 82 total links.
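The figures above can be cross-checked with a few lines of arithmetic. This sketch uses only the numbers quoted in this section; the variable names are illustrative, and the sketch total is the approximate figure given above, so the resulting fraction is approximate as well.

```python
# Body pages uploaded per volume, as quoted above.
pages_per_volume = {1: 497, 2: 577, 3: 596, 4: 673}
total_pages = sum(pages_per_volume.values())

# Sketches designated so far versus the estimated total ("about 22,000").
sketches_done = 9180
sketches_estimated = 22000
fraction_done = sketches_done / sketches_estimated

# Average number of person links on a page that has at least one link.
links_per_linked_page = 7591 / 869

print(total_pages, round(fraction_done, 3), round(links_per_linked_page, 1))
```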