WeRelate:Place review

Please help us review the place database!

We are preparing for much-improved indexing at the end of this month. The new indexer will index pages within 15 minutes of their being edited (instead of overnight) and will provide much better search over names, dates, and places. To make ready for it, we need to rename the place pages to include all levels in the jurisdictional hierarchy. So Place:Sydney, Australia will become Place:Sydney, New South Wales, Australia. We already renamed places for the US and Canada last spring; it's now time to rename the rest of the places in the world.

One caution about renaming: once the pages have been renamed, it will be _much harder_ to correct a place's name or its located-in field, because you will have to change not only the place in question but also all of its contained places. For example, if New South Wales were incorrectly located-in "Northern Territory", then when we do the automated renaming, Sydney would be incorrectly renamed to "Sydney, New South Wales, Northern Territory, Australia". All other cities in New South Wales would also be incorrectly renamed.

So before we start the automated renaming on November 3rd or 10th, we're asking for your help. We'd like you to pick a few countries (as many as you like, really), review their places, and make sure that the names and located-in fields are correct. If you have any questions, please don't hesitate to ask.

Thank you for your help!

New place page format

In order to make it easier for people to edit place pages correctly, and to ensure that pages that link to newly-entered places are indexed properly, the following changes will be made to the place page edit form once the automated renaming is complete:

  • preferred name: will now be the text before the first comma in the place title. There won't be a separate edit field for preferred name.
  • located-in place: will now be the text after the first comma in the place title. For example, the located-in place for Place:Sydney, New South Wales, Australia will be Place:New South Wales, Australia. There won't be a separate edit field for located-in.
  • previously-located-in: will be renamed to also-located-in, to handle places that are simultaneously located in multiple jurisdictions (e.g., civil, parish, and military jurisdictions). Although the place index currently contains only civil jurisdictions, this change will allow people to enter additional jurisdictions in the future.
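The first two rules above amount to splitting the place title at its first comma. Here is a minimal sketch of that split; the function name `split_place_title` is made up for illustration and is not part of the WeRelate software.

```python
def split_place_title(title):
    """Split a place title at the first comma.

    Returns (preferred name, located-in title). The located-in title is
    None when the title has no comma (e.g. a country page).
    """
    name, sep, located_in = title.partition(",")
    return name.strip(), (located_in.strip() if sep else None)


name, located_in = split_place_title("Sydney, New South Wales, Australia")
print(name)        # Sydney
print(located_in)  # New South Wales, Australia
```

A title without a comma, such as "Australia", yields no located-in place.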

Also, it will be helpful for the upcoming new indexing scheme if old place names were implemented as redirects pointing to the current place name. So if County X is currently located in Province Y, but it used to be located in Province Z, then when creating a place page for "X,Z", it would be helpful if that page were a redirect to "X,Y". It shouldn't be necessary to create pages for every previous name of a place because an improved place matcher under development should be able to match places so long as the current place page lists its old name and old located-in place (as also-located-in). But if you do create a page for a previous name of a place, it would be helpful if that page were a redirect to the current name of the place. The reason is that pages that refer to redirected places will be indexed under both the source and target of the redirect. So a page that links to place "X,Z" will be indexed under both "X,Z" and "X,Y", to make that page findable for people searching either province.
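To illustrate the indexing behaviour described above, here is a small sketch. It assumes redirects are available as a simple mapping from the old title to the current title; the name `index_keys` and the data layout are hypothetical, not the actual indexer.

```python
def index_keys(linked_place, redirects):
    """Return the place titles a page is indexed under for one place link.

    `redirects` maps a redirect page's title to the title it points to.
    A link to a redirected place is indexed under both source and target.
    """
    target = redirects.get(linked_place)
    if target is None:
        return [linked_place]
    return [linked_place, target]


redirects = {"X, Z": "X, Y"}           # old province Z redirects to current province Y
print(index_keys("X, Z", redirects))   # ['X, Z', 'X, Y']
print(index_keys("X, Y", redirects))   # ['X, Y']
```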

Review process

Some places are located-in the wrong places because of mistakes made when merging the Getty Thesaurus of Geographic Place Names, the Family History Library Catalog, and place information from Wikipedia. I tried to be pretty careful during the merge process, but there were a lot of places that needed merging and duplicate places didn't always have exactly the same title. Sometimes a district got incorrectly merged with an inhabited place, resulting in the inhabited places under the district being put incorrectly under the merged inhabited place.

We know that there's a lot that could be done to clean up the place database, but for the immediate timeframe we'd like to focus primarily on correcting place names and located-in fields, since once we rename the place pages to include all of the levels in the jurisdictional hierarchy, correcting a place's name or its located-in field will involve renaming not only the place but also all of its contained places.

The review steps:

1. Please pick a country to review from the list below. Edit this page and place your name beside it so that others will know you are working on the country and not duplicate your work. (Hint: don't start out reviewing really large countries; save them for later when you're more comfortable with the review process. They'll go faster then.)

2. In the list of countries below, the first link for each country takes you to a "placelist" page that lists all of the places for that country in a nested-list. This gives you a quick overview of the entire place hierarchy for that country. The second link takes you to a "placehighlight" page that lists places that I think are worth focusing on especially in the review. These are either places at the top of the hierarchy (just under the country), or places that have "strange" hierarchies -- like a state inside of another state.

3. As you review these two pages, if you suspect that a place might be located-in the wrong place, follow the links that are located near the top of the place page to the Getty, Wikipedia, and the Family History Library Catalog websites. (Many pages link to just one or two of the three possible sources.) Take a look at the hierarchy for the place on those websites and compare it to the place's hierarchy on WeRelate.

4. Fixing problems:

  • If you notice a problem somewhere along the hierarchy where a place page is listed as being located-in the wrong place, edit the page with the incorrect located-in field and correct it. You don't need to rename the page. We'll rename all place pages automatically once all of the countries have been reviewed.
  • If you want to correct the place's name (i.e., the text in the place title before the first comma is not the correct name of the place), then rename the place page and put the correct text in the place title.
  • If you notice that a place has been merged incorrectly (e.g., the place page lists two sources, one for Getty and one for the Family History Library Catalog, but each source should have resulted in a separate place page), create a new place page, copy the template referring to the source (e.g., Getty) and any other information you want to copy over to the new place page, and remove the template for that source from the existing place page.

5. If you find that you need to merge two places into one place:

  • Choose one of the places to keep and the other page to replace. If one of the places already has the "correct" title, keep that one. Otherwise, if one of the places has contained places and the other does not, keep the place with contained places. Otherwise it doesn't matter which one you choose to keep.
  • Copy any information you want to retain from the place to replace into the place to keep.
  • Redirect the place to replace to the place to keep by editing the place to replace and entering #redirect [[Place:title of place to keep]] in the big text box. Remove all other text from the big text box and save the page.
  • Alternatively, if the place to replace has no contained places, and when you click on the "what links here" link at the bottom of the page, the only pages that link to it are other place and template pages, then you can go ahead and delete the page. But redirecting works just as well. Later we'll update the wikipedia templates and sources to link to the redirected-to pages, and then we'll delete redirect pages that point to other places in the same country and that aren't linked to from any other pages.
  • Please don't delete places with contained places. Redirect them to point to the correct place.
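The choice in step 5 of which page to keep, and the redirect text to enter, can be sketched as follows. This is only an illustration: the dictionary keys `title` and `contained` and both function names are hypothetical, and titles are assumed to be stored without the "Place:" prefix.

```python
def choose_place_to_keep(a, b, correct_title=None):
    """Return (keep, replace) for two duplicate places.

    Each place is a dict with a 'title' string and a 'contained' list.
    Preference order: the correctly titled page, then the page that has
    contained places; otherwise the choice is arbitrary.
    """
    if correct_title is not None:
        if a["title"] == correct_title:
            return a, b
        if b["title"] == correct_title:
            return b, a
    if a["contained"] and not b["contained"]:
        return a, b
    if b["contained"] and not a["contained"]:
        return b, a
    return a, b  # it doesn't matter which one is kept


def redirect_text(keep_title):
    """Text to put in the big text box of the page being replaced."""
    return "#redirect [[Place:%s]]" % keep_title


first = {"title": "Foo", "contained": []}
second = {"title": "Foo (county)", "contained": ["Bar"]}
keep, replace = choose_place_to_keep(first, second)
print(keep["title"])                 # Foo (county)
print(redirect_text(keep["title"]))  # #redirect [[Place:Foo (county)]]
```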

6. When you have finished reviewing a country, add "reviewed" after your name in the list below so that everyone will know that that country has been reviewed.

7. Please log the hours you spend on this project in the administrator log (even if you are not an administrator). We need to keep track of volunteer hours spent on administration tasks to support our continuing non-profit status with the IRS.

If you have any questions, please leave them on this page's talk page.

The complete list of places and the highlighted places for each country will be updated every morning so that you can see the effects of your modifications from the previous day.

Helpful hints

The "placelist" and "placehighlight" pages list places in the form "proposed new name <= current name". Unless we make changes, the proposed new names are the ones we'll end up with. The "placelist" pages enumerate all places in a country, but the "placehighlight" pages generally show only a small subset of the pages -- pages that either

  • are high-up in the place hierarchy for the country and so are likely to contain a lot of other places, which means that their names will be repeated in the titles of a lot of other place pages during the renaming, or
  • have "suspicious" jurisdictional hierarchies (e.g., a state inside of a district, or a place inside of a city); these are cases where the located-in field on one of the pages in the hierarchy is likely to be wrong.
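As a rough illustration of what a "suspicious" hierarchy check could look like, the sketch below flags a place whose type is at the same level as, or a higher level than, the place it is located in. This is not the actual "placehighlight" logic; the type names, the `LEVEL` ranking, and the function name are all assumptions made for the example.

```python
# Hypothetical ranking: a smaller number means higher up in the hierarchy.
LEVEL = {"country": 0, "state": 1, "district": 2, "city": 3}


def suspicious_pairs(places):
    """Return (place, located-in place) pairs that look suspicious.

    `places` maps a title to (type, located-in title or None). A pair is
    flagged when the contained place is not strictly lower in the ranking
    than its container, e.g. a state inside a district or a city in a city.
    """
    flagged = []
    for title, (place_type, parent) in places.items():
        if parent is None or parent not in places:
            continue
        child_level = LEVEL.get(place_type)
        parent_level = LEVEL.get(places[parent][0])
        if child_level is None or parent_level is None:
            continue  # unknown type: nothing to compare
        if child_level <= parent_level:
            flagged.append((title, parent))
    return flagged


places = {
    "A": ("country", None),
    "B": ("district", "A"),
    "C": ("state", "B"),   # a state inside a district
    "D": ("city", "C"),
}
print(suspicious_pairs(places))  # [('C', 'B')]
```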

What to focus on

As you review places, it would be helpful to focus on two things:

  • The new place name, which is the title of the place page up to the first comma (and removing any "type" words -- see below). This is most important for places that are high-up in the place hierarchy and so contain a lot of other places. (We're ignoring the preferred name because in general, the title of the place page up to the first comma appears to be "better" than the preferred name when they differ.) The new title of the place page will start with this name.
  • The located-in field. This is most important for the places listed in the "placehighlight" pages that are either high up in the place hierarchy or are listed under "suspicious" jurisdictional paths. The new title of the located-in page will be used as the rest of the title of this place page during the renaming.

So during the renaming, the new title of a place will become the name of the place (text of the title up to the first comma and removing any "type" words), followed by a comma, followed by the new title of its located-in place.
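The renaming rule above is recursive: a place's new title depends on the new title of its located-in place. Here is a minimal sketch of that rule, assuming the located-in fields are available as a mapping from current title to current located-in title. The function name and data layout are hypothetical, and removal of "type" words is left to an optional `clean_name` callback.

```python
def new_title(title, located_in, clean_name=lambda name: name, _seen=frozenset()):
    """Compute the proposed new title for a place.

    `located_in` maps a current title to the current title of its
    located-in place; top-level places (countries) have no entry.
    """
    if title in _seen:
        raise ValueError("located-in cycle at %r" % title)
    name = clean_name(title.split(",", 1)[0].strip())
    parent = located_in.get(title)
    if parent is None:
        return name
    return name + ", " + new_title(parent, located_in, clean_name, _seen | {title})


located_in = {
    "Sydney, Australia": "New South Wales, Australia",
    "New South Wales, Australia": "Australia",
}
print(new_title("Sydney, Australia", located_in))
# Sydney, New South Wales, Australia

# A wrong located-in field higher up propagates to every contained place:
wrong = dict(located_in)
wrong["New South Wales, Australia"] = "Northern Territory, Australia"
wrong["Northern Territory, Australia"] = "Australia"
print(new_title("Sydney, Australia", wrong))
# Sydney, New South Wales, Northern Territory, Australia
```

The second call shows why the located-in fields near the top of the hierarchy matter most: one mistake there is repeated in the title of every place below it.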

In general, I think we should always try to get the first-level (e.g., state/province) divisions for a country correct. Regarding second-level (e.g., district/county) divisions, I'd say that when there are fewer than 100 of them and when most of them already show up correctly as second-level divisions because they are listed as such in FHLC or Getty, then we should try to make sure that we have all of the second-level divisions for the country, and that each is located in the correct first-level division. (Chances are there will be a page in Wikipedia listing all of the second-level divisions and showing what first-level division each one is located in.) Otherwise, I think it will be too much work to identify second-level jurisdictions at this time.

Once you've made sure that the top one or two levels in the place hierarchy are correct, look at the "suspicious" jurisdictional paths in the "placehighlight" page. See if you can figure out if these are correct, or if the located-in field for one of the places in the hierarchy of these "suspicious" places needs to be changed.

Thanks!

Use Wikipedia

Use Wikipedia to check whether the top 1-2 levels of the hierarchy are correct. If you are not very familiar with a country's divisions, read the Wikipedia article about the country. Click on the link to the WeRelate place page for the country that is found at the top of the "placehighlight" page for the country, then click on the "wikipedia" link at the top of the country's place page to go to the Wikipedia article for the country. Look for a section in the article like "Administrative divisions". Check to make sure that our top-level administrative divisions for the country (which are listed in the "placehighlight" page and also the "placelist" page for the country), match those found in Wikipedia.

Island-based countries

We'd like to include islands / island-groups in the place hierarchy for island-based countries (e.g., Place:Bahamas). If you review one of these countries, please review the places that are contained directly within the country page. For each place, if it has a link to "getty" or "wikipedia" at the top of the place page, and getty / wikipedia says that it is located on a particular island, then set the located-in field of the place to that island. (If the island isn't already in the place index, please create a page for it.) If the place only has a link to "family history library catalog", don't worry about trying to set the island; just leave it directly located-in the country.

Removing spurious regions from the place hierarchy

Sometimes a place of type "region" is a real political/administrative division (e.g., a state), and sometimes it's not (e.g., "New England"). When we did the merging, we assumed that regions should always be included in the place hierarchy. But for countries like Place:England and Place:Egypt, this is probably a bad assumption. In Egypt for example, the Governorates are currently listed under regions, but the regions don't appear as administrative divisions according to the Wikipedia article on Egypt and they have alternate names like "Upper Egypt", so they're likely not political/administrative divisions. If you come across a case like this, the thing to do is to edit each of the places contained within the spurious regions (in the case of Egypt, the 26 Governorates), and change their located-in place to the country. That will remove regions from the place hierarchy when we rename the pages.

Geonames.org template

[[User:Knarrows]] has found that http://www.geonames.org often has useful information on places, including the place's hierarchy and latitude & longitude. She has created a source-geonames template that you can add to place pages to link them to Geonames, much like we link to Getty, FHLC, and Wikipedia. For example, to link to the Geonames page for Memphis, Egypt, add {{source-geonames|Memphis|EG}} to the top of the place page. The two-character country codes can be found at http://www.geonames.org/countries/. Feel free to look up place information on http://www.geonames.org and to add this template to place pages if you like.
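If you are adding the template to many pages, a tiny helper like the one below can build the template text. The helper name is made up for illustration; only the {{source-geonames|name|code}} form comes from the example above.

```python
def geonames_template(place_name, country_code):
    """Build the source-geonames template text for a place page."""
    code = country_code.strip().upper()
    if len(code) != 2 or not code.isalpha():
        raise ValueError("expected a two-character country code, got %r" % country_code)
    return "{{source-geonames|%s|%s}}" % (place_name.strip(), code)


print(geonames_template("Memphis", "eg"))  # {{source-geonames|Memphis|EG}}
```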

Hierarchy paths displayed in placehighlight

If it would be helpful for you if additional hierarchy paths were displayed in the "placehighlight" page, please let me know. It's very easy to add more paths.

What to do when places are listed under historic places

Poland has undergone a lot of changes since WWII. Several of its pre-WWII voivodeships have been split and given to other countries; others have been combined to form current voivodeships. When you come across a country like Poland, where WeRelate lists (possibly hundreds of) inhabited places under provinces that are no longer current, what's the best thing to do? Long term, we want the place under the historic province to be a redirect to the place under the current province (possibly in another country), and the current place page to list the historic province as an "also located in" place. But doing this manually for possibly hundreds of places is not feasible. Instead, do the following:

  • Create a talk page for the country if there is not one already
  • In the talk page, list
    1. the historic province
    2. the "target" province(s) or country/ies that the inhabited places in the historic province were moved into
    3. whether all or just some of the inhabited places in the historic province should be moved into the target province
    4. the year the move occurred.
  • Add the talk page next to the country line in the list of countries below.
  • That's it -- you don't need to do anything more with places in the historic province.

After the renaming, we'll automatically edit the place pages.

  • If you specified that all of the inhabited places should be moved into the target province:
    • Each inhabited place (cities, towns, villages, etc. -- anything that doesn't itself contain places) will become a redirect to an inhabited place with the same name under the target province. If there is already a place with that name (or with a matching alternate name) under the target province, the information from the place under the historic province will be added to the page for the existing place under the target province.
    • The place under the target province will be edited so that its start year under the target province is the year that the move occurred, and the historic province will be added as an "also located in" place with an end year being the year the move occurred.
  • If you specified that just some of the inhabited places should be moved into the target province/country, or if the historical place was split among multiple target provinces/countries:
    • Each inhabited place is searched for under each target province/country. Only if a matching place is found (one with an identical preferred or alternate name after any accents are removed) will we redirect the old place to the new place. If no matching place is found, we'll just leave the place under the historical province so that someone can come along later and redirect it if necessary.
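The name matching described above (identical preferred or alternate name after accents are removed) might look something like the sketch below. It is an illustration only, not the actual matcher: the dictionary keys and function names are hypothetical, and accents are removed by Unicode decomposition, which does not change letters such as the Polish "ł" that have no decomposed form.

```python
import unicodedata


def fold(name):
    """Remove accents (combining marks) from a name."""
    decomposed = unicodedata.normalize("NFKD", name)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch)).strip()


def find_match(old_place, candidates):
    """Return the first candidate sharing a preferred or alternate name.

    Each place is a dict with a 'preferred' name and a list of 'alternates'.
    Returns None when nothing matches, in which case the old place is left
    under the historic province.
    """
    old_names = {fold(n) for n in [old_place["preferred"]] + old_place["alternates"]}
    for candidate in candidates:
        names = {fold(n) for n in [candidate["preferred"]] + candidate["alternates"]}
        if old_names & names:
            return candidate
    return None


old = {"preferred": "Krakow", "alternates": []}
targets = [
    {"preferred": "Tarnów", "alternates": []},
    {"preferred": "Kraków", "alternates": ["Cracow"]},
]
match = find_match(old, targets)
print(match["preferred"] if match else None)  # Kraków
```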

See Place talk:Poland for an example.

(forgot to press "save" earlier -- now you can see the page for an example)

"Type" words that will be omitted from new place names

We're currently planning to omit the following "type" words from the new place page titles:

Words to be omitted if they appear at the end of the place name

  • Autonomous
  • County
  • Department
  • District
  • Division
  • Federal
  • Governorate
  • Municipality
  • Oblast
  • Prefecture
  • Province
  • Rural
  • State
  • Urban
  • Voivodship

Phrases to be omitted if they appear inside parentheses at the end of the place names

  • (arrondissement)
  • (autonomous county)
  • (autonomous province)
  • (Bezirk)
  • (Bezirke)
  • (borough)
  • (Canton)
  • (Comuna)
  • (county borough)
  • (county)
  • (department)
  • (Departmento)
  • (Diocese)
  • (Diozese)
  • (District)
  • (District Council)
  • (district municipality)
  • (département)
  • (former county)
  • (former province)
  • (general region)
  • (governorate)
  • (Kanton)
  • (municipality)
  • (Municipio)
  • (national district)
  • (oblast)
  • (parish)
  • (prefecture)
  • (province)
  • (region)
  • (région)
  • (state)
  • (union territory)
  • (unitary authority)
  • (voivodship)

Phrases to be omitted if they appear at the beginning of place names

  • County
  • Arrondissement of
  • Canton of
  • County of
  • County Borough of
  • London Borough of
  • Metropolitan Borough of
  • Municipal Borough of
  • Province of
  • Regional District of
  • Royal Borough of

So Place:Rayleigh Urban District, Essex would become just Place:Rayleigh, Essex, England. If you would like some of these words to be retained, or if you would like to see additional "type" words to be removed from place names, please leave a comment on the Talk page.
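Here is a minimal sketch of how the three kinds of "type" words could be removed from a place name. It is not the actual renaming code. The parenthesized list is abbreviated, matching is assumed to be case-sensitive, and trailing words are assumed to be removed repeatedly (which is what the Rayleigh example suggests, since both "Urban" and "District" disappear).

```python
import re

END_WORDS = {
    "Autonomous", "County", "Department", "District", "Division", "Federal",
    "Governorate", "Municipality", "Oblast", "Prefecture", "Province",
    "Rural", "State", "Urban", "Voivodship",
}
# Abbreviated for the example; see the full list above.
PAREN_PHRASES = {"arrondissement", "county", "department", "province", "region", "state"}
PREFIXES = [
    "County", "Arrondissement of", "Canton of", "County of", "County Borough of",
    "London Borough of", "Metropolitan Borough of", "Municipal Borough of",
    "Province of", "Regional District of", "Royal Borough of",
]


def strip_type_words(name):
    """Remove "type" words from the text of a title before the first comma."""
    name = name.strip()
    # 1. A listed phrase inside parentheses at the end of the name.
    match = re.match(r"^(.*\S)\s*\(([^()]*)\)$", name)
    if match and match.group(2) in PAREN_PHRASES:
        name = match.group(1)
    # 2. A listed phrase at the beginning; try longer phrases first so that
    #    "County Borough of" is preferred over plain "County".
    for prefix in sorted(PREFIXES, key=len, reverse=True):
        if name.startswith(prefix + " "):
            name = name[len(prefix) + 1:]
            break
    # 3. Listed words at the end, removed repeatedly, but never the last word.
    words = name.split()
    while len(words) > 1 and words[-1] in END_WORDS:
        words.pop()
    return " ".join(words)


print(strip_type_words("Rayleigh Urban District"))  # Rayleigh
```

The name passed in is only the part of the title before the first comma; the rest of the new title comes from the located-in place.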

Countries to review