For example, two researchers hold similar family data. One, who has been researching Smith, has recorded a marriage to a Brown. The other holds the database for Brown and also sees the marriage to Smith. On the Web, it is only necessary for each of the two databases to contain a link to the appropriate place in the other. The result is a distributed database containing the union of all the data held in them. This is achieved without the two researchers combining their respective datasets into one that is held in duplicate at both sites.
The advantage of linking is that each researcher can continue to explore their own area, and amend and update the information under their domain of responsibility, without the permission or active cooperation of the others involved. Conversely, if there were only one database with several contributors, then the normal configuration management and change control problems of a larger multi-author system would rear their heads.
There are at least three different methods of making genealogical data available on the web. They are based on different philosophies and methods of organisation. More methods will develop as people experiment with the technology and as facilities become more advanced. Therefore we can say with certainty that the way we do it now is not likely to be the way it will always be done. Further, the three different methods have different ways of referring to individual records. We cannot assume that any one way is better than any other.
What does an individual reference look like? Here are some examples, extracted from GenWeb around the net:
Here we see that various methods of identifying individuals are used. Some use a file path and filename to specify the individual, where the path and name represent either the person's name or a person index. Others use a script to access the record, again supplying either the person's name or their index. Some also use a byte offset index.
The philosophy from the point of view of the data provider contrasts with the view from the data user. The data user will save URLs pointing at interesting pieces of data in their private Bookmark or Hotlist file for later use. The idea that the data will "go away" is alien to them.
When a data provider also starts to become a data consumer, such as is the case when data sharing happens, access to the internals of a data set becomes a debating point.
Herbert Stoyan, who has a database of a large number of the German Nobility, has proposed the following model naming scheme. This is what he uses for his data:
This, of course, only works for people who have such a form of noble title. It has been suggested that this could be expanded for use with the public at large by including a date, such as a birth date. On the surface this may seem a fine solution, but it runs aground on the same rocks that have sunk many other attempts to use name and DOB as a universal identifier. Regular readers of comp.risks will be familiar with the issue. For the genealogist things are even more complicated than for, say, the Social Security office. This is because not all the details about an individual in our database are known. For many entries in my database the date of birth is not known; more than this, for many of them their complete name is not known. The purpose of genealogical study is often to find out some of the missing details. To support access only to records that are completely researched seems, in my view, to defeat the purpose of improving genealogical resource provision by using GenWeb in the first place.
The only unique and unchanging way I have discovered of naming individuals comes from those societies with traditional oral genealogical histories. Those are, for example, the Nordic and Icelandic sagas and the Gaelic or Celtic histories. In those traditions people have names like Erik Magnusson Haraldsson Ignoldsson or Ruaidri macToirrdellbaig moicConchobaig O'Brien or Tewdwr ap Gryffydd ap Gwynedd and so on. These strings of names can continue as long as is necessary to link an individual to a unique ancestor. This may work for the male lines in these traditions but often breaks down when considering females in these patriarchal societies. Long strings of forenames used to make unambiguous name references are unwieldy and could not make a reasonable standard to adopt, even though they are often the de facto method of referring to people in many genealogical and lineage works of reference in the English language (such as "The Complete Peerage").
If we accept that this is inevitable, we must also then consider the following corollaries:
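As a rough illustration of the ancestor-chain naming described above, the following Python sketch builds a Welsh-style "ap" name by walking father links. The record layout, the numeric keys and the function name are assumptions made for this example only; they are not taken from any GenWeb system.

```python
def patronymic_chain(person_id, people, separator=" ap "):
    """Join a person's forename with those of successive known fathers.

    `people` maps a hypothetical record id to (forename, father_id or None).
    The walk stops at the first unknown father, or if a cycle is detected.
    """
    names = []
    seen = set()
    current = person_id
    while current is not None and current in people and current not in seen:
        seen.add(current)
        forename, father_id = people[current]
        names.append(forename)
        current = father_id
    return separator.join(names)


# Hypothetical three-generation line using the names from the text.
PEOPLE = {
    1: ("Tewdwr", 2),
    2: ("Gryffydd", 3),
    3: ("Gwynedd", None),
}

chain = patronymic_chain(1, PEOPLE)
```

The sketch also shows the weakness noted above: the name grows with every generation needed to make it unambiguous, and it says nothing useful when the father is not recorded.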
The problem with using names as the only means of access is one of performance. The index number is, in the short term, a much more rapid method of accessing an individual record.
Perhaps what is needed is a standard enquiry to ask any genealogical system to translate a short-term name into a permanent one.
Birger Wathne considers that centralising access to genealogical records may be of benefit and has made the following proposal:
Either let all GenWeb URLs be of the form
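One way to picture such an enquiry is a pair of lookups kept by each data provider. The Python sketch below is only an assumption about how this might look: the class, its method names and the use of a plain string as the permanent name are all invented for illustration.

```python
class GenealogyStore:
    """Toy record store with volatile index numbers and stable names."""

    def __init__(self):
        self._name_by_index = {}   # short-term key -> permanent name
        self._index_by_name = {}   # permanent name -> current short-term key

    def add(self, index, permanent_name):
        self._name_by_index[index] = permanent_name
        self._index_by_name[permanent_name] = index

    def to_permanent(self, index):
        """The enquiry itself: short-term index -> permanent name, or None."""
        return self._name_by_index.get(index)

    def to_index(self, permanent_name):
        """Reverse lookup: permanent name -> whichever index is current."""
        return self._index_by_name.get(permanent_name)

    def renumber(self, old_index, new_index):
        """Simulate the provider reorganising the database."""
        name = self._name_by_index.pop(old_index)
        self._name_by_index[new_index] = name
        self._index_by_name[name] = new_index


store = GenealogyStore()
store.add(17, "Erik Magnusson Haraldsson Ignoldsson")
saved_name = store.to_permanent(17)   # what a data user would keep
store.renumber(17, 203)               # the provider rebuilds the index
current_index = store.to_index(saved_name)
```

A user who saved the permanent name rather than the index can still find the record after the renumbering, while day-to-day access continues to use the faster index.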
genweb.fixed.location would be a hostname pointed to by a high-level DNS server, so we could be sure the name would be fixed even if the server has to move. I wonder if there is a DNS namespace for fixed services?
genweb-url is a gateway program that returns the 'real' URL to the data.
BASE= points to a unique database descriptor each provider gets when registering his base(s). This is used by the genweb-url program to build the real URL.
INDEX= is the private key into the database. The contents of this part should be left to the provider of the base.
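The behaviour described for genweb-url can be sketched in a few lines of Python. This is not Wathne's implementation: the registry contents, the example.org hosts, the URL templates and the function name are all hypothetical, and the sketch takes the BASE and INDEX values as plain arguments rather than assuming any particular URL form.

```python
from urllib.parse import quote

# Hypothetical registry: database descriptor -> URL template chosen by
# the provider when registering the base.
REGISTRY = {
    "smith-uk": "http://smith.example.org/people/{index}.html",
    "brown-us": "http://brown.example.org/cgi-bin/lookup?id={index}",
}


def resolve(base, index, registry=REGISTRY):
    """Return the provider's real URL for (base, index).

    Returns None when the base has not been registered.
    """
    template = registry.get(base)
    if template is None:
        return None
    # The index is opaque to the gateway: it is escaped, never interpreted.
    return template.format(index=quote(index, safe=""))


real_url = resolve("smith-uk", "I0042")
```

If a provider moves, only the template in the registry changes; every published BASE and INDEX pair stays valid. The cost is that every lookup depends on the one registry being reachable.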
The moving host or database problem can often be solved by mechanisms currently built into web servers and other associated software, without resorting to a centralised authoritarian scheme. If an individual database moves to another server, then the name mapping and proxy serving capabilities of the modern HTTP server can be used to fulfil a request from the new location. If a site changes name altogether, then the aliasing mechanisms in DNS can be used to leave dormant pointers from the old name to the new one.
The centralising scheme does not resolve the problem of naming and locating records, but in fact perpetuates it. It could be used as a way of providing a centralised enquiry bureau to find suitable data, but this was not part of the original proposal. A centralised GenWeb server is much like putting the Mormon IGI online.
Another problem with centralising authority for accessing genealogical records is that it creates a bottleneck and a single point of failure. A non-centralised scheme has the advantages of resilience and a multiplicity of contributors. If one person gets tired, bored or overloaded, no one else's work needs to stop and wait.
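The name mapping idea can be illustrated independently of any particular server. The Python sketch below is an assumption about the general shape of such a mapping, not the configuration of a real HTTP server; the paths, host name and function name are hypothetical.

```python
# Hypothetical table kept at the old site: old path prefix -> new location.
MOVED = {
    "/genweb/smith/": "http://newhost.example.org/smith/",
}


def redirect_for(path, moved=MOVED):
    """Return (status, location) for a request path.

    A moved path yields a permanent redirect (301) to the new location;
    anything else is served locally, shown here as (200, None).
    """
    # Try the longest prefixes first so that more specific rules win.
    for old_prefix in sorted(moved, key=len, reverse=True):
        if path.startswith(old_prefix):
            return 301, moved[old_prefix] + path[len(old_prefix):]
    return 200, None


status, location = redirect_for("/genweb/smith/I0042.html")
```

Old URLs saved in a user's bookmark file keep working for as long as the old site retains the table, and no central registry is involved.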
Department of Computer Science
Hull, UK, HU6 7RX