WeRelate talk:Famous categories

This page continues an earlier discussion between the administrators.

Identifying existing famous people vs. adding new famous people

As I was reading through the emails I realized that for the "found your famous ancestors" feature, it's more important to identify (and merge) famous people that already exist in WeRelate, not to add new people to WeRelate, because the new people wouldn't be connected to anyone. Incorporating Gary Boyd Roberts' work would be worthwhile, but I don't think it's urgent and I'd want to get his permission.--Dallan 11:33, 18 May 2009 (EDT)


Equating famous people to people in Wikipedia [20 May 2009]

I agree that it's not a one-to-one correspondence, but I can't think of an easier way to get started. For some users, just finding out that a wikipedia article exists on one of their ancestors will be interesting in and of itself. So I think in addition to the main categories of famous people that we come up with, we should have a "Wikipedia" catch-all category that lists everyone else with a wikipedia article.

What about people who are genealogically famous but who do not have a wikipedia article? We have two choices:

  1. Anyone having one of the 12 "famous" categories listed on their Person page will be considered famous.
  2. We tell people to create a wikipedia article for the person they want to consider famous -- that only people with wikipedia articles will be considered famous.

If we go with option (1), then we need to monitor who's being added to these famous categories (and currently there's not a way to get an email when a page is added to a category - I would have to add that), and as administrators we'd have to decide whether a person added to one of the famous categories really qualifies as famous.

If we go with option (2), we rely upon Wikipedia to set the bar for who a famous person is. There are currently roughly 165,000 articles about dead people at Wikipedia. After reviewing a few of them, it seems to me that the bar for admitting an article on someone is reasonably low.

I'd like to start out going with option (2) and see if it becomes a problem. Option (1) means more work for us as administrators and has a significant chance of generating hard feelings. If it turns out that some people we want to consider famous don't get admitted to Wikipedia, we can make exceptions for those people.--Dallan 11:33, 18 May 2009 (EDT)

I don't really understand why we'd need to monitor under option (1). The bar is so low under the 12 categories that I don't really think it would be a problem. I'm not a fan of telling people that they have to go figure out if someone merits a WP article and add it if they have to. The example that's causing issues for me is that only maybe half of the Mayflower passengers have a page. I know how to create articles on Wikipedia and have done it, but I'm not real interested in doing it for the other passengers. And I'm kind of dedicated to this cause. On the other hand, we generally have pretty good pages here on most of the passengers already. Which leads to the next complication, that we already have a fully populated and inter-linked Mayflower category. What happens to that? --Amelia 00:51, 19 May 2009 (EDT)
What I'm hoping to avoid with option (1) is having people populate the categories with people that nobody's ever heard of. I wouldn't want people adding their grandfather to the "Artists and musicians" category just because he played the clarinet in high school for example. Perhaps we could avoid most of this through careful naming of the categories: "Noteworthy artists and musicians" or something.
I hadn't realized that Wikipedia didn't have articles on all of the Mayflower passengers. That certainly lessens my desire for option (2). I figured that with 250,000 articles (it's more than I originally thought -- see below), Wikipedia would surely have had articles on most of the figures important to genealogy.
If we have an existing category with the same name as a proposed "famous" category, people already in that category would stay in the category. The proposed process would possibly add the category to more WeRelate pages, and add links to the wikipedia articles for some of the pages already in the category.
How would people feel about going with option (2), and if so, can we come up with better category names for some of the more ambiguous categories like "Artists and musicians" or "Scholars"?--Dallan 14:54, 19 May 2009 (EDT)
I think your question makes more sense if you said option (1)... But, at the risk of being too literal, how about "Famous artists and musicians" and "Famous scholars"? --Amelia 00:57, 21 May 2009 (EDT)

Why do this now - it's not as important as other things?

To be honest it's because my daughter wants to earn some money and I don't feel comfortable putting her on merging. She could add source-wikipedia templates to existing pages though. I'm thinking that for the roughly 150,000 Wikipedia people that we don't already have at WeRelate, the system would search WeRelate for those people and if it finds a probable match to an existing WeRelate page, then it will add the Wikipedia page to a list of pages to review. My guess is that this list would be much smaller. A person (my daughter and possibly others) would then go through the list, make sure that the pages do indeed match, and add the source-wikipedia template to the page if they do.--Dallan 11:33, 18 May 2009 (EDT)
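The review-list step described above could be sketched roughly as follows. This is only an illustration: the PersonRecord fields, the exact-name comparison, and the rule that known years must agree are all assumptions, not how the actual matching works. Exact-name matching would miss pages titled under a different name, so a real matcher would need to be looser.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PersonRecord:
    # Hypothetical record shape; the real page data is not described here.
    title: str
    name: str
    birth_year: Optional[int]
    death_year: Optional[int]


def normalize(name: str) -> str:
    """Lower-case the name and collapse runs of whitespace."""
    return " ".join(name.lower().split())


def is_probable_match(wp: PersonRecord, wr: PersonRecord) -> bool:
    """Assumed rule: same name, at least one year known on both sides, no conflicts."""
    if normalize(wp.name) != normalize(wr.name):
        return False
    pairs = [(wp.birth_year, wr.birth_year), (wp.death_year, wr.death_year)]
    known = [(a, b) for a, b in pairs if a is not None and b is not None]
    return bool(known) and all(a == b for a, b in known)


def build_review_list(wikipedia_people, werelate_people):
    """Return (Wikipedia title, WeRelate title) pairs for a person to review by hand."""
    by_name = {}
    for wr in werelate_people:
        by_name.setdefault(normalize(wr.name), []).append(wr)
    review = []
    for wp in wikipedia_people:
        for wr in by_name.get(normalize(wp.name), []):
            if is_probable_match(wp, wr):
                review.append((wp.title, wr.title))
    return review


wikipedia = [PersonRecord("Jane Austen", "Jane Austen", 1775, 1817)]
werelate = [PersonRecord("Person:Jane Austen (1)", "Jane Austen", 1775, None)]
review_list = build_review_list(wikipedia, werelate)
```

Nothing is added to a page automatically in this sketch; it only produces the list that a reviewer would then check.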


Wikipedia people who are related to WeRelate people but who don't have a page in WeRelate [19 May 2009]

I think I finally understand what User:Jrm03063 is suggesting. Rather than go the full length of the suggestion, how about a list of Wikipedia people who link to or are linked to by one or more Wikipedia people with WeRelate pages. The list would be:

  • Wikipedia title
    • WeRelate titles that link to or are linked to by the Wikipedia page

Identifying which Wikipedia pages are for people vs. on some other topic is fairly easy. They generally belong to a "nnnn births" or "nnnn deaths" category. So creating this list would be pretty straightforward.
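As a rough sketch of the "nnnn births" / "nnnn deaths" test, assuming the categories of an article are available as a plain list of strings (the function names here are made up for illustration):

```python
import re

# Matches category names such as "1775 births" or "1817 deaths".
# Decade or BC categories (e.g. "1770s births") are deliberately not matched.
YEAR_CATEGORY = re.compile(r"^(\d{1,4}) (births|deaths)$")


def birth_death_years(categories):
    """Return (birth_year, death_year); either may be None if no category gives it."""
    birth = death = None
    for category in categories:
        match = YEAR_CATEGORY.match(category.strip())
        if not match:
            continue
        year = int(match.group(1))
        if match.group(2) == "births":
            birth = year
        else:
            death = year
    return birth, death


def is_person_article(categories):
    """Treat an article as being about a person if it has a birth or death year category."""
    birth, death = birth_death_years(categories)
    return birth is not None or death is not None
```

For example, an article in "1775 births" and "1817 deaths" would be treated as a person, while an article with neither kind of category would be skipped.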

I think we should actually create two lists: one for people born before 1500 and one for people born after 1500. The pre-1500 list could include a link to create a WeRelate page based upon the Wikipedia page title. But once this project is complete I think we need to re-consider how the pre-1500 pages are titled.--Dallan 11:33, 18 May 2009 (EDT)

Sounds like a good start. I figure it's apt to focus on likely WP people, instead of just all of them. BTW, how did you figure out that WP had 165,000 people in it? --Jrm03063
The number is actually 250,000 articles about dead people on Wikipedia. I looked at everyone who had a category of "X births" or "Y deaths", where X was between 100 and 1900, or Y was between 100 and 2009. (The 165,000 number restricted X to 1000-1900 or Y to 1000-1950; there are a lot of articles in Wikipedia about people who died in the past 60 years.)
I know this misses some people -- people without either a birth or death year, but given the current approach that we're not going to add the wikipedia link to a WeRelate person page unless we're pretty sure that the wikipedia article is for the same person, I don't mind overlooking people who don't have any dates at all. It's interesting that for an encyclopedia that says that they don't want to focus on people, nearly 10% of their English-language articles are about people.
It looks like we currently have roughly 3700 people at WeRelate linked to Wikipedia articles. Based upon what I've seen so far, it appears that this process will add wikipedia links to another 2000-4000 people -- people like Agatha Christie (Person:Agatha Miller (1)), Jane Austen (Person:Jane Austen (1)), Desi Arnaz (Person:Desiderio Arnaz (1)), and Bing Crosby (Person:Harry Crosby (1)).--Dallan 14:54, 19 May 2009 (EDT)
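The year ranges behind the two counts above can be written as a small filter. This is a sketch only; it assumes the ranges are inclusive and that the birth and death years have already been pulled from the "X births" / "Y deaths" categories.

```python
def counted_as_dead_person(birth_year, death_year,
                           birth_range=(100, 1900), death_range=(100, 2009)):
    """True if the birth year or the death year falls inside its (inclusive) range."""
    def within(year, bounds):
        return year is not None and bounds[0] <= year <= bounds[1]

    return within(birth_year, birth_range) or within(death_year, death_range)


# The narrower ranges that gave the earlier 165,000 figure.
NARROW = {"birth_range": (1000, 1900), "death_range": (1000, 1950)}

# Someone born in 1903 who died in 1977 is counted under the wide ranges only.
wide_result = counted_as_dead_person(1903, 1977)
narrow_result = counted_as_dead_person(1903, 1977, **NARROW)
```

People with neither a birth nor a death year are never counted, which matches the point above about overlooking undated articles.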

Displaying famous relatives

For efficiency's sake, I'd prefer to break up the display of famous relatives into two steps:

  1. Show a pedigree of someone, highlighting their ancestors who are themselves famous or who have famous descendants. Lines in their pedigree that do not lead to famous ancestors or ancestors with famous descendants would not be expanded. For each highlighted person, we would identify whether the person was themselves famous or had famous descendants for up to 12 categories of "famousness." However, we wouldn't say how many famous descendants they had in each category -- just whether famous descendants existed in each category.
  2. When you clicked on someone in the pedigree, we would display a descendancy for that person highlighting their famous descendants. Lines in their descendancy that did not lead to famous descendants would not be expanded. Each highlighted person would be identified as belonging to one or more of 12 categories of "famousness" as before.

By breaking this up into two steps, the system doesn't have to display ancestors and descendants simultaneously, so it limits the amount of information that has to be retrieved from the database at each step.--Dallan 11:33, 18 May 2009 (EDT)
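The two steps above might be sketched like this. It is an in-memory illustration only: the dictionaries standing in for the database, the person IDs, and the function names are all hypothetical, and the family data is assumed to contain no cycles.

```python
def descendant_categories(children, famous):
    """Map each person to the set of famous categories found among their descendants."""
    memo = {}

    def visit(person):
        if person in memo:
            return memo[person]
        categories = set()
        for child in children.get(person, ()):
            categories |= famous.get(child, set())
            categories |= visit(child)
        memo[person] = categories
        return categories

    for person in children:
        visit(person)
    return memo


def pruned_pedigree(person, parents, famous, desc_cats):
    """Step 1: nested dict of ancestors, keeping only lines that reach a highlighted ancestor."""
    def highlighted(p):
        return bool(famous.get(p)) or bool(desc_cats.get(p))

    def build(p):
        kept = {}
        for parent in parents.get(p, ()):
            subtree = build(parent)
            if highlighted(parent) or subtree:
                kept[parent] = subtree
        return kept

    return build(person)


def pruned_descendancy(person, children, famous, desc_cats):
    """Step 2: nested dict of descendants, keeping only lines that reach a famous descendant."""
    def build(p):
        return {
            child: build(child)
            for child in children.get(p, ())
            if famous.get(child) or desc_cats.get(child)
        }

    return build(person)


# Hypothetical family: "cousin" is the only famous person.
children = {"gp1": ["dad", "uncle"], "uncle": ["cousin"], "dad": ["me"],
            "gp2": ["mom"], "mom": ["me"]}
parents = {"me": ["dad", "mom"], "dad": ["gp1"], "mom": ["gp2"]}
famous = {"cousin": {"Notable writers"}}

desc_cats = descendant_categories(children, famous)
step1 = pruned_pedigree("me", parents, famous, desc_cats)
step2 = pruned_descendancy("gp1", children, famous, desc_cats)
```

In this example the pedigree keeps only the line through "dad" to "gp1", and clicking "gp1" expands only the line through "uncle" to "cousin". Each step reads in one direction only, which is the point of splitting the display in two.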


How do we identify which category(ies) to put a person into? [20 May 2009]

For efficiency's sake, I'd like to limit the number of "famous" categories to 12, plus the catch-all "has a wikipedia article" category. User:Solveig and I have created a "first cut" at 12 categories based upon the Wikipedia categories that a person belongs to on the primary page: WeRelate:Famous categories.

It's a pretty simple strategy -- look for a set of keywords or phrases in the wikipedia categories, and if you find one, then put that person into the famous category. The strategy isn't perfect -- players for the Washington Senators will be placed into the "Political leaders" category, but I think the error rate will be pretty low.
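A minimal sketch of the keyword approach, including the Washington Senators false positive. The keyword lists and the Wikipedia category strings below are invented for illustration and are not the ones on WeRelate:Famous categories; plain case-insensitive substring matching is also an assumption.

```python
# Hypothetical keyword lists; the real ones live on WeRelate:Famous categories.
FAMOUS_KEYWORDS = {
    "Political leaders": ["senators", "governors", "presidents"],
    "Artists and musicians": ["painters", "composers", "musicians"],
}


def famous_categories(wikipedia_categories, keywords=FAMOUS_KEYWORDS):
    """Return the famous categories whose keywords appear in any Wikipedia category name."""
    found = set()
    for famous_category, words in keywords.items():
        for wikipedia_category in wikipedia_categories:
            lowered = wikipedia_category.lower()
            if any(word in lowered for word in words):
                found.add(famous_category)
                break
    return sorted(found)


senator = famous_categories(["United States Senators from Ohio"])
ballplayer = famous_categories(["Washington Senators players"])
```

Both calls return ["Political leaders"], so the ballplayer is misfiled exactly as described above. Tightening a keyword to a longer phrase would be one way to reduce that kind of error.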

Would people please review this list? It's just a first cut -- maybe we should have different categories, or maybe the keywords need to be different? Please feel free to edit the keywords and/or the categories and/or leave comments on this talk page. It's easy for me to re-generate the list of matching Wikipedia categories for each category.

I'm thinking that the wikipedia-update program that turns the source-wikipedia templates into wikipedia text inclusions would automatically add the "famous" categories to the Person pages by matching the keywords to the Wikipedia categories that the person belongs to.--Dallan 11:33, 18 May 2009 (EDT)

So you're saying that the key to being famous is the category? I prefer that to having a WP template, because it's just not always all that useful for an already well-sourced page. Can we have subcategories? There are categories right now for U.S. Vice Presidents, Senators, Governors, etc., all of which I'd find a much better clue when looking at the list than "political leader." And having Ben Franklin and the 1802 governor of Ohio in your tree are just not all that equivalent. On the Notable People talk page, there was some debate that fizzled out about distinguishing the "really" famous from the mildly notable, and I think given that the WP list is so broad, that we might be well-served by somehow highlighting that so people don't lose their kings among their Continental Army officers.--Amelia 11:10, 19 May 2009 (EDT)
I'd like to make the key to being famous the category, with "has a wikipedia article" as a 13th catch-all category. The thing is that for efficiency I'd like to limit the number of "famous" categories to either 12, 20, or 28, depending upon what we show in the "ancestors with famous descendants" pedigree chart:
  • If in the pedigree chart we show the categories that each person's famous descendants are in; e.g., John Henry has famous descendants in: Political leaders, Artists and musicians, and Scholars categories, then I'd like to limit the number of different "famous" categories to 12.
  • Alternatively, if in the pedigree chart we show a count of each person's famous descendants; e.g., John Henry has 15 famous descendants, without displaying the categories that they're in until you click on John Henry to get a descendancy chart showing his famous descendants, then I can increase the number of "famous" categories to 20.
  • Alternatively, if in the pedigree chart we just say that a person has famous descendants; e.g., John Henry has famous descendants, without displaying the categories or a count until you click on John Henry to get the descendancy chart, then I can increase the number of "famous" categories to 28.
Thoughts on this? If we want to distinguish vice presidents from senators, governors, etc. we'll need to go with a higher number of categories. As an alternative, we could group some of these fine-grain categories into a single famous "Political leaders" "super-category", but the pedigree and descendancy charts would just display the "Political leaders" super-category; you wouldn't see the fine-grain category until you opened the page for an individual person.--Dallan 14:54, 19 May 2009 (EDT)
If we already had additional categories we wanted to add to the list, we could do that as well. We just need to come up with 12, 20, or 28 "famous" categories and get them populated.--Dallan 15:10, 19 May 2009 (EDT)
So if we put Senator Jones into the "U.S. Senators" category, which is a subcategory of super-category "Political Leaders", does he also need to be in category "Political Leaders" in order to show up in these various situations? I've been trying to maintain a browseable hierarchy with these categories (on the theory that a category with 8000 entries is useless and that these are categories someone might also like to browse) but I think it's fine to have such a person in both if we maintain the subcategories for browseability and add a note to "political leaders" explaining how it works. This would work particularly well if you could program the display to include their name -- i.e. "Senator Harry Jones of Ohio" instead of/in addition to "Henry Jones (1)". We would probably have to change a number of people's names over time to take advantage of that functionality, but it would really help distinguish who people are in a broad category.
Another benefit of broad (and thus fewer) categories that then have subcategories is that we can expand them more easily later. If we start with U.S. Senators as a major category, we can't equivalently elevate politicians of every other country later when the user base merits it.--Amelia 01:08, 21 May 2009 (EDT)

WP's List of Categories [18 May 2009]

Hello Dallan,

WP's Category:Other is a good selection.

The rest of WP's list is kind of leaning toward a White European list; most non-white, non-European categories are under Other. Maybe we could add a note to mention this under Category:Other?

Is it possible to have a broader category than the Mayflower? As you know, each European empire (Spanish, Dutch, etc.) had its own first colony. Maybe something like Category: First North American European Immigrants? We could then have the Mayflower as a subcategory under that.

Debbie Freeman --DFree 15:52, 18 May 2009 (EDT)

I'm open to other categories. The categories include every category that appears on more than about a dozen person pages at Wikipedia, about 13,000 categories in all. Solveig reviewed the categories and tried to group them into 12 super-categories. If you (or anyone) find additional categories in the list to group into new super-categories, please propose them. This list is just a first cut.--Dallan 14:54, 19 May 2009 (EDT)

possible add to Category: Other [18 May 2009]

Hello Dallan,

How about under Category: Other we also add WP - List of Justices of the Supreme Court of the United States. It is a list from the beginning in 1789. There are birth and death dates on a nice list. Debbie Freeman --DFree 16:21, 18 May 2009 (EDT)

People would be assigned to the 12 super-categories based upon the categories that appear on their wikipedia pages, so we need to look for a US Supreme Court Justices category: Wikipedia:Category:United States Supreme Court justices. Currently, this category is grouped into the "Political leaders" category. I've thought about breaking this apart into separate categories for presidents and vice-presidents, senators and congressmen, and justices. It's a question of how many categories we want to have. (See the discussion in the section "How do we identify which category(ies) to put a person into?".)
Alternatively, we could come up with additional categories on our own that were not based upon wikipedia categories, but then we'd need to add the Category links to the pages belonging to these categories manually - the system wouldn't be able to add them automatically.--Dallan 14:54, 19 May 2009 (EDT)

New proposal for famous categories [27 June 2009]

After looking at Category:Notable people, I'd like to propose the following:

We use Category:Notable people as the main category. Anyone in this category or one of its sub-categories will be considered "famous".

Category:Notable people can have up to 13 "broad" direct sub-categories. When people view a pedigree of their ancestors with famous descendants, for each ancestor in their pedigree with famous descendants, they'll be able to see in which of the 13 broad sub-categories the person has famous descendants, in addition to the catch-all "has a wikipedia article" category.

Each of the 13 broad sub-categories can have fine-grained sub-categories. So for example a category of "Notable actors" might have "American actors" and "French actors" as fine-grained sub-categories. When people view a descendancy chart of someone with famous descendants, they'll be able to see the fine-grained sub-categories that each famous descendant belongs to. So if I click on someone with descendants in the "Notable actors" broad category, I'll be able to see for each famous descendant whether they were an American actor, French actor, etc.
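The relationship between fine-grained and broad categories could be sketched as a walk up the category tree. The parent map below is a hypothetical stand-in for the real category pages, and it assumes each category has a single parent.

```python
ROOT = "Notable people"

# Hypothetical parent links (sub-category -> parent category).
PARENT_OF = {
    "Notable actors": ROOT,
    "American actors": "Notable actors",
    "French actors": "Notable actors",
    "Political leaders": ROOT,
    "U.S. Senators": "Political leaders",
}


def broad_category(category, parent_of=PARENT_OF, root=ROOT):
    """Return the direct sub-category of root that contains category, or None."""
    seen = set()
    current = category
    while current in parent_of and current not in seen:
        seen.add(current)  # guard against accidental cycles in the category tree
        parent = parent_of[current]
        if parent == root:
            return current
        current = parent
    return None


def is_famous(page_categories, parent_of=PARENT_OF, root=ROOT):
    """A page counts as famous if it is in root or in any category under root."""
    return any(
        c == root or broad_category(c, parent_of, root) is not None
        for c in page_categories
    )
```

With this shape, the pedigree view would show broad_category("American actors"), which is "Notable actors", while the descendancy view would show "American actors" itself.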

I'd like to propose the following broad categories and the fine-grained categories within them:

  • Mayflower passengers
  • Notable historical figures
    • Signers of U.S. Declaration of Independence
    • Outlaws
    • American historical figures, French historical figures, British historical figures, etc. (we don't have keywords to populate these categories automatically from wikipedia categories; they'll need to be populated by hand)
  • Political leaders
    • U.S. First Ladies
    • U.S. Governors
    • U.S. House of Representatives
    • U.S. Presidents
    • U.S. Senators
    • U.S. Supreme Court Justices
    • U.S. Vice Presidents
  • American Revolutionary War figures (this category could be put under Notable military figures, but given the popularity of the DAR, maybe we should keep it separate?)
  • Notable military figures
    • Sub-categories by country
  • Nobility
    • U.K. Monarchs, French nobility, etc.
  • Notable athletes
    • Olympic athletes
  • Notable actors
    • Sub-categories by country
  • Notable writers
    • Sub-categories by country
  • Notable artists
    • Sub-categories by country
  • Notable musicians
    • Sub-categories by country
  • Notable scholars
    • Sub-categories by country
  • Notable religious leaders
    • Sub-categories by religion

If people like this new proposal, I'll re-work the keywords on the main page to assign pages to the proper fine-grained categories based upon the wikipedia categories they are in.--Dallan 17:16, 27 May 2009 (EDT)

I realized that I need to limit it to 13 broad categories, not 14, so I removed Outlaws as a category. They'll still fall into the catch-all "has a wikipedia article" broad category. Alternatively, we could put them under the "Notable historical figures" category, but it seemed kind of strange to mix outlaws with signers of the Declaration of Independence (though I'm sure they were classified as such at the time).--Dallan 09:27, 30 May 2009 (EDT)
I like the idea of outlaws being historical figures :-) A couple questions; 1) Can we have multiple layers of subcategories? Governors and Senators have/can be further divided by state. That would also solve some of any controversy over whether to do country or field for other categories. 2) Do we want to have a particular policy for people who are notable only because of their children/grandchildren/parents? Presidential relatives are very susceptible to this. We can make great use of the pages, but they're going to be mostly duplicative for this purpose (It's not real illuminating to see that you're related to both "U.S. Presidents" and "Children of U.S. Presidents". Well duh.) 3) Do we want to incorporate in this 'notable genealogical interest' categories like founders of various towns? Those people are largely not on Wikipedia, but like the Mayflower, they are often people with lots of descendants, so we'd instantly have lots of people who could use the feature. On the other hand, it might clog the system considerably, for the same reason.
Other comments: I think religion should be by denomination, since that's easily determinable and usually the more notable fact. Musicians by country, since the few famous ones I can think of off the top of my head all played multiple instruments/styles. Scholars I think country, since whether someone is a sociologist or a historian or a whatever might be hard to determine. --Amelia 10:01, 30 May 2009 (EDT)
I've put Outlaws under Notable historical figures. :-) Yes, we can have multiple layers of sub-categories. I don't have a good opinion on what to do about people who are notable only because of their relatives. How about not categorizing them and letting them fall into the catch-all "has a wikipedia article" category if they have a wikipedia article? I like the idea of considering people like town founders famous. As you say, these will probably be the only famous people in many people's trees (including my own). Should we categorize them under Notable historical figures?--Dallan 11:12, 23 June 2009 (EDT)
If we can have multiple layers of sub-categories, does it make sense to have, say, "Children of U.S. Presidents" just be a subcategory of U.S. Presidents? Or would that show up strange somewhere? I think town founders go great under notable historical figures.--Amelia 00:58, 26 June 2009 (EDT)
That's a good idea. In the ancestors view the common ancestor would indicate that he/she has famous "political leader" descendants. And when you clicked on the common ancestor the famous descendants would indicate to which specific sub-category(ies) they belonged.--Dallan 11:48, 27 June 2009 (EDT)