WeRelate:Source renaming project

We have nearly 1 million Source pages in the wiki, and most of them pre-date our new source page titling rules. Because of this, we need to rename them. Renaming will be done automatically, but it will require some human help to review the proposed renamings to ensure they are correct. This is a huge project, and we need your help to complete it.

Renaming the existing sources will be a big help for new users. Having so many sources that don't follow the titling rules makes it very difficult to learn how to create new source pages correctly. Renaming the sources will also make finding sources and identifying duplicate sources easier, and will help prepare the way for automatically generating source citations.

New source page title rules

The new rules construct the source page title from various fields in the source depending upon the source type:

Source type                    Source page title
Books and Articles             Author. Title
Government / Church records    Place covered. Title
Newspapers                     Title (Place issued)
Periodicals                    Title (Publisher)
Miscellaneous/Unknown          Author. Title


  • If the author, place covered, or place issued is missing for Books/Articles, Government/Church records, or Newspapers respectively, then the Source page title is simply the source title field.
  • Author is the first author listed in surname, given name(s) format; e.g., Doe, Jane A.
  • Place covered is small-to-large format; e.g., Chicago, Cook, Illinois, United States
  • Place issued is also in small-to-large format, but unlike place covered, does not have to include every level; e.g., Chicago, IL is sufficient
  • For Books, the title may be followed by the year published (in parentheses) to distinguish multiple editions of the same book.
  • Leading articles (A, An, The) in the title are omitted when generating the source page title.
  • Government/church records that have been compiled/transcribed by someone into a book use the Book source type.
  • A new source type of Website is for unique compilations of material. Web pages that are a copy of an offline book or a specific government/church record set should be of type Book or Government/Church record respectively. In general, you should prefer the "Government/church records" or "Book" source types over "Website", unless the website contains a hodge-podge of material that can't reasonably be thought of as a government/church record set or a copy of an offline book; e.g., the website contains a variety of information from different types of records. Only in these cases should you call it a website.
  • Source pages for archives, libraries, or historical societies ought to be renamed as Repository pages.
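The table and rules above can be sketched in code. This is only an illustration of the rules as written on this page, not the actual renaming program: the `Source` class, its field names, the source-type strings, and the `page_title` function are all hypothetical. The fallback to the bare title for Periodicals and Miscellaneous sources with a missing field, and the trimming of a trailing period from the author, are assumptions that the rules above do not state.

```python
import re
from dataclasses import dataclass

# Leading articles (A, An, The) are omitted from the generated page title.
LEADING_ARTICLE = re.compile(r"^(a|an|the)\s+", re.IGNORECASE)


@dataclass
class Source:
    # Hypothetical field names, chosen for this sketch only.
    source_type: str
    title: str
    author: str = ""         # first author, "Surname, Given name(s)" format
    place_covered: str = ""  # small-to-large, e.g. "Chicago, Cook, Illinois, United States"
    place_issued: str = ""   # small-to-large, may skip levels, e.g. "Chicago, IL"
    publisher: str = ""
    year: str = ""           # optional, used for Books to distinguish editions


def strip_leading_article(title: str) -> str:
    return LEADING_ARTICLE.sub("", title.strip(), count=1)


def page_title(src: Source) -> str:
    title = strip_leading_article(src.title)
    kind = src.source_type

    if kind in ("Book", "Article", "Miscellaneous"):
        # Assumption: drop a trailing period on the author ("Doe, Jane A.")
        # so the separator does not produce a doubled period.
        author = src.author.strip().rstrip(".")
        result = f"{author}. {title}" if author else title
        if kind == "Book" and src.year:
            result += f" ({src.year})"
        return result

    if kind == "Government / Church records":
        return f"{src.place_covered}. {title}" if src.place_covered else title

    if kind == "Newspaper":
        return f"{title} ({src.place_issued})" if src.place_issued else title

    if kind == "Periodical":
        # Assumption: fall back to the bare title when the publisher is missing.
        return f"{title} ({src.publisher})" if src.publisher else title

    return title
```

For example, a Book titled "The History of Chicago" by "Doe, Jane A." published in 1901 would come out as "Doe, Jane A. History of Chicago (1901)" under this sketch.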

Ultimately source citations will be generated from various fields in the source depending upon the source type. It is important for source pages to have the correct field values so that the generated citation texts will be correct.

During the automated renaming, for sources of type "Miscellaneous" (the majority of the sources) the system will attempt to guess the correct source type by looking at the source author and will rename the source page accordingly.

Review process

We would like to start the automated renaming by the end of August, hopefully starting sometime August 24-28. Below are links to lists of page renamings, where you can see the current and renamed source page titles. The lists have been created based upon the last person to modify the source.

If you see your name in the list, would you please

  • review the proposed renamings for your list and edit the source pages to change the fields necessary so that the new source page title will be correct, and
  • sign your name next to the list so that we know it is being reviewed.

If you do not see your name in the list or if you finish your list and would like to help out, would you please

  • Choose one of the "Other" lists to review, and
  • sign your name next to it so that we know it is being worked on.

For each source, the lists contain a link to a source page showing the current title, followed by the new title underneath. If the new title is incorrect, please edit the source page and set the source type, title, author, place covered and/or place issued fields so that the new title will be generated correctly according to the rules in the table above. You don't need to (and shouldn't) rename the source. It will be renamed according to the rules in the table above next week.

The lists will be refreshed every morning so you will be able to see the results of your changes from the previous day.

If you have general comments or questions on the process, or if you notice any systemic problems during your review, please leave a message on the talk page.

Thank you!

Renamings to review

  • Amelia.Gerlicher Done (!) --Amelia 00:34, 22 August 2009 (EDT)
  • Beth Done !Beth 21:00, 19 August 2009 (EDT); non-census items double-check by jillaine 10:58, 1 September 2009 (EDT)
  • BobC -- Done! (Sorry, but I renamed the pages themselves. I see now I wasn't suppose to do that.) BobC 16:29, 21 August 2009 (EDT) Second portion done. This time I didn't "rename." -- BobC 12:47, 24 August 2009 (EDT) Next portion under my UserID done --BobC 17:36, 30 August 2009 (EDT) Reviewed new entries --BobC 09:50, 3 September 2009 (EDT) Latest review complete. --BobC 12:59, 23 September 2009 (EDT)
  • Ceyockey -- Changed my mind; CEY has some odd ones that appear self-created and um, I hesitate to touch CEY's work. jillaine 14:05, 2 September 2009 (EDT)
  • Dallan--Dallan 11:12, 19 August 2009 (EDT)
  • DFree completed, some might need to be redone?
  • Gewurztraminer --dayna 15:07, 19 August 2009 (EDT) - done
  • JBS66 A-Mi done, excepting the foreign language sources --dayna 08:51, 23 September 2009 (EDT)
  • Jillaine -- DONE jillaine 15:06, 2 September 2009 (EDT)
  • Jlanoux --Judy (jlanoux) 17:27, 19 August 2009 (EDT) - Done
  • Jrich --Jrich 08:56, 20 August 2009 (EDT)
  • Jstump--finished
  • Kennebec1 --kennebec1 16:53, 19 August 2009 (EDT) done except for several pending naming clarity from discussions... --kennebec1 22:21, 21 August 2009 (EDT) Ok, I think its really done now. --Brenda (kennebec1) 16:44, 31 August 2009 (EDT)
  • Leo Bijl --sq 16:59, 27 August 2009 (EDT)
  • Mksmith --Mike (mksmith) 16:34, 19 August 2009 (EDT) -- Done
  • Quolla6--Q 11:05, 21 August 2009 (EDT)
  • Skater --Taylor rechecked
  • Solveig --Solveig done rechecked
  • Taylor --Taylor rechecked

Sources that other people have edited or someone links to:

These lists used to contain 1000 sources each; they have been split in half so that each list now contains 500 sources to review.

Sources that are neither human-edited nor linked to:

  • WeRelate agent -- this list represents a 1% sample of 900,000 sources, mostly from the Family History Library Catalog. It doesn't need to be reviewed. It's here in case you want to see how the renaming process guesses whether to use Place. Title or Author. Title format for these sources. Please leave a message on the talk page if you notice something systemically wrong.

You may notice that User:Dallan, User:Solveig, or User:Taylor have already fixed some of the sources in your list. This is because we are working on separate lists that show likely problems or show sources with a source type of something other than "Website" that are linked to rootsweb or other websites. These lists overlap somewhat with the lists above, but they don't include all of the potential problems. That's why we need people to review the lists above.

Duplicates

It's likely that the renaming process will attempt to rename several sources that currently have different titles to the same title. Dallan will post a list of these "possible duplicate" sources on Friday, August 21. We'll ask people to review and possibly merge these sources once the list is posted.

Ok, this project is much larger than I originally thought. I'll hold off on the duplicates for a week. Once we get the sources reviewed we can start renaming the non-duplicates, which are the vast majority. We can then figure out what to do with the duplicates the beginning of September.--Dallan 00:01, 21 August 2009 (EDT)

More on duplicates

I've narrowed the duplicates list down to roughly 600 sets. I don't expect the list to change from here on out. Each set of duplicates will need to be resolved one way or another, either by:

  • merging the sources and deleting all but one of them, or
  • changing the source-type, title, author, or place-covered fields so that the duplicate sources will end up being assigned different page titles during the renaming.

The duplicates list is refreshed each morning, so duplicates that have been resolved will be removed from the list the following day.
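As a rough illustration of what a set of "possible duplicates" means here, the sketch below groups proposed renamings by their new title and keeps only the titles that more than one source would receive. It is not the program used to build the duplicates list; the function name `find_title_collisions` and the input format (pairs of current title and proposed new title) are assumptions, and titles are compared as exact strings.

```python
from collections import defaultdict


def find_title_collisions(proposed):
    """Group (current_title, new_title) pairs by new title.

    Returns a dict mapping each new title that would be assigned to more
    than one source to the sorted list of current titles that collide.
    """
    by_new_title = defaultdict(list)
    for current_title, new_title in proposed:
        by_new_title[new_title].append(current_title)
    return {
        new_title: sorted(current_titles)
        for new_title, current_titles in by_new_title.items()
        if len(current_titles) > 1
    }
```

Once a set is resolved, either by merging the sources or by editing fields so the generated titles differ, it would no longer appear in the result.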

I'd like to divide the duplicates list into the following segments so that multiple people can work on it without stepping on each other's toes. Could someone please sign up for each segment below? Thank you!

  • A-F -- DONE! (woo woo!) jillaine 19:36, 16 September 2009 (EDT)
  • G-K--Amelia 23:41, 31 August 2009 (EDT) ------- Don't appear to be done, I'll start K --Judy (Well, you're welcome to work if you want, but all that's left to do is check for edition dates, and I've done at least 50% of that; none of the ones left would really cause a problem if Dallan merged them automatically--Amelia 12:06, 17 September 2009 (EDT)) Great, then we are done! Judy
  • L-N -- --Judy (jlanoux) 16:50, 10 September 2009 (EDT) Done!!!
  • O-S -- --Brenda (kennebec1) 08:36, 15 September 2009 (EDT) Done!!!
  • T-end -- sq 09:33, 15 September 2009 (EDT) done--sq 15:29, 15 September 2009 (EDT)
The last list has some authors with really long names; if you shorten the author name, the system will be able to include enough of the source title in the page title that the page title will become unique.

--Dallan 18:28, 31 August 2009 (EDT)

Various source lists

Website sources

Which of these should we delete, and which should we change to a source type of Miscellaneous?
Many of these should be changed to "Repositories" rather than remain as "Sources" (even as "Finding Aid" sources) --BobC 14:29, 27 August 2009 (EDT)
Due to the length of the notes previously here I transferred the discussion relating to this Other Subject list from the project page to the talk page --BobC 12:47, 21 September 2009 (EDT)



Miscellaneous sources having a human author and a record-oriented subject (10% sample of all Misc sources)

We're considering titling these sources using place-title format.
We're planning to title these sources using author-title format.

Manuscripts

Should we remove the Manuscript collection source type, or rename it to Manuscripts?
I am surprised during my review of LDS holdings during the source renaming project review how much of their collection is identified as manuscripts. Since the Manuscript source type was removed in the past couple weeks I have been saving them as Miscellaneous sources instead. Maybe in the broad scope of source documentation, true documented manuscripts are a minuscule percent and can be considered Miscellaneous source types. You decide. --BobC 16:44, 1 September 2009 (EDT)
I think Miscellaneous is alright for them. I wonder how many of the things the FHLC calls "manuscripts" would be classified as "records" in our nomenclature. I don't know. I think that reducing the number of source-type options is worthwhile if we can remove something that isn't used widely overall.--Dallan 22:17, 1 September 2009 (EDT)
I think many of the Manuscripts are actually private records that were microfilmed at historical societies or some similar path, at least the ones I've seen lately. So they're probably closest to books (well, really, to the websites we've deleted since a lot of them are like 4 pages, but that's too hard to figure out).--Amelia 00:51, 2 September 2009 (EDT)

Periodicals

Will be titled using "title (publisher)" format

US County Censuses

Should we title these using "NNNN U.S. Census Population Schedule"?