I am a graduate student at the University of Virginia in Charlottesville, VA, studying systems and information engineering. Genealogy has been interesting to me ever since I served as a public representative for my church in Yekaterinburg, Russia. It seems to me that the genealogical community could really benefit from several systems engineering principles that could help with collaboration, information management, and research resource allocation.
Thoughts and Ideas
Disclaimer: These thoughts aren't necessarily correct or even well thought out. They're just ideas...
(1.) Change focus from personal family history to collaborative family history
I have been thinking that family history (as genealogy is called by The Church of Jesus Christ) is not very family-oriented. For example, the ancestral database software developed by The Church is even called "Personal Ancestral File" (PAF), rather than "Family Ancestral File" or just "Ancestral File." Moreover, the direction of PAF software development has been to stabilize larger and larger database structures, rather than to develop PAF servers, workflow software, or collaboration functionality. The development of large-scale wikis, as well as The Church's efforts to unify research and establish a collaborative ancestral file, is a good start, but fundamental changes need to take place at the primary pedagogical level of family history teaching.
(2.) Not family tree, but rather family network
If you have never seen the concept of a visual thesaurus, you should check out this link. In a visual thesaurus, synonyms are shown as a web: the word of interest sits in the center, connected to nodes that each represent one meaning of the word. From each node extend the synonyms that share that meaning. For example, if the center word is 'power', one node might represent the definition from mathematical notation, with the synonyms 'exponent' and 'index', while another node might represent the definition of physical strength, with synonyms such as 'might' and 'strength'. Families and family histories should be thought of as networks, not trees. Trees imply a single central, important node, which leads back to the idea of personal rather than family histories. Visual family networks could be graphical illustrations of the data in a database. The center node could be the person of interest, and the surrounding nodes would be possible fathers, possible mothers, possible children, possible neighbors, and possible localities. The length of each line could represent the probability of a match with the center of interest.
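As a rough sketch of the family network idea, here is how candidate relatives and their match probabilities might be turned into line lengths for drawing. Everything named here (CandidateLink, line_length, the length range, the names, and the probabilities) is made up for illustration. I also assume that a more probable match gets a shorter line, which is only one possible reading of the idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateLink:
    """A possible relationship between the center person and another record."""
    role: str           # e.g. "possible father", "possible neighbor"
    name: str
    probability: float  # estimated probability of a true match, 0.0 to 1.0


def line_length(probability, min_len=1.0, max_len=10.0):
    """Map a match probability to a line length (assumption: likelier = shorter)."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    return max_len - probability * (max_len - min_len)


def layout(links):
    """Return (role, name, length) tuples, closest (most probable) first."""
    rows = [(link.role, link.name, line_length(link.probability)) for link in links]
    return sorted(rows, key=lambda row: row[2])


# Hypothetical data for one center person.
links = [
    CandidateLink("possible father", "Ivan Petrov", 0.9),
    CandidateLink("possible neighbor", "Pyotr Sidorov", 0.3),
    CandidateLink("possible mother", "Anna Petrova", 0.6),
]

for role, name, length in layout(links):
    print(f"{role}: {name} (line length {length:.1f})")
```

A real tool would hand these lengths to a graph-drawing library; the point is only that the probabilities live in the database and the picture is derived from them.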
(3.) Bayes nets
Bayesian networks are directed networks that show how probabilities are influenced by the relationships between random variables. We use intuitive Bayesian nets constantly (although sometimes our intuitive calculations are off). For example, if we cough we might not think we are sick, but if we cough and feel weak then we believe that the probability of being sick is higher. Multiple verifications lead us to a stronger belief, or confidence, in the truth of some conclusion. Excellent genealogists have great Bayes nets in their minds. They are able to conclude from clues about places, dates, other relationships, etc. whether a record is of possible interest or not. These sets of influences between information and degrees of belief in conclusions need to be quantified. If they are, then a computer would be able to start with a GEDCOM of 100% accurate information and crawl databases (that have good metadata) to establish potential extensions to the GEDCOM network, with probabilities and links to the resources. The user could then follow the links to the resources with the highest probabilities first.
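To make the cough example concrete, here is a minimal sketch of the simplest possible case: a "naive" Bayes update, which assumes every clue is independent of the others given the conclusion. A full Bayesian network would also model how the clues depend on each other. The function names, the prior, the likelihood ratios, and the record names are all invented numbers for illustration, not measured values.

```python
def update_belief(prior, likelihood_ratios):
    """Return the probability of a conclusion after applying each clue.

    Each likelihood ratio says how much more likely the clue is when the
    conclusion is true than when it is false (assumed independent clues).
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must be strictly between 0 and 1")
    odds = prior / (1.0 - prior)
    for ratio in likelihood_ratios:
        odds *= ratio
    return odds / (1.0 + odds)


def rank_candidates(prior, candidates):
    """Sort candidate records by their updated probability, highest first."""
    scored = [(name, update_belief(prior, ratios)) for name, ratios in candidates.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Made-up numbers: 10% chance of being sick before any symptoms.
cough_only = update_belief(0.1, [2.0])           # cough alone
cough_and_weak = update_belief(0.1, [2.0, 3.0])  # cough and weakness together
print(f"cough only: {cough_only:.2f}, cough and weak: {cough_and_weak:.2f}")

# Hypothetical records, each with ratios for clues such as place and date.
records = {
    "census entry": [4.0, 2.0],
    "parish register entry": [0.5],
}
for name, probability in rank_candidates(0.2, records):
    print(f"{name}: {probability:.2f}")
```

The ranked list is what the user would work through, following the most probable resource first.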
(4.) Utilize semantic technologies, such as RDF
Genealogy has a very characteristic database structure, and there is much information of which we are not taking advantage. Resource Description Framework (RDF) is a way of representing metadata, that is, data about data. It describes the relationships between various aspects of databases so that the database structure can be easily ported and searched. I've just begun to think about this, but will write more when I have some time to think about the details.
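RDF states everything as subject-predicate-object triples, so as a first rough sketch the idea can be tried out with plain Python tuples and no RDF library at all. The identifiers and predicate names below (person:1, hasFather, bornIn, and so on) are hypothetical and not taken from any real genealogy vocabulary.

```python
# Each fact is one (subject, predicate, object) triple.
triples = {
    ("person:1", "hasName", "Anna Petrova"),
    ("person:1", "hasFather", "person:2"),
    ("person:1", "bornIn", "place:Yekaterinburg"),
    ("person:2", "hasName", "Ivan Petrov"),
    ("person:2", "bornIn", "place:Yekaterinburg"),
}


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that fit a pattern; None means "anything"."""
    return sorted(
        t for t in triples
        if (subject is None or t[0] == subject)
        and (predicate is None or t[1] == predicate)
        and (obj is None or t[2] == obj)
    )


# Who was born in Yekaterinburg?
for subject, _, _ in match(triples, predicate="bornIn", obj="place:Yekaterinburg"):
    print(subject)
```

Because every fact has the same three-part shape, two databases that agree on the predicate names could be merged simply by taking the union of their triples.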