Check data
Precision, reliability

A large part of the job is achieving data reliability. Ideally, every piece of information stored in the g5 database should be checked against an official document. This is a huge task because each birth date has to be verified by a human.
The most precise source we can hope for is a certificate from the hospital (HC). In practice, we have birth certificates (BC) from the civil registries, in which the birth time is often rounded to the hour. The exact time of birth then remains unknown. BCs are not ideal but are usable for statistical tests.
So in the g5 context, a birth time is considered reliable if it is backed by a BC that is available and verifiable by anyone.

Currently, very few birth times have been verified, and only to resolve questions raised during g5 development.
See also the Acts page.

Trust - data reliability

Trust = level of reliability of a piece of information.
Five levels of reliability are defined in g5:
  • 1 - Hospital Certificate (HC)
    Original document available and verifiable by anyone.
  • 2 - Birth Certificate (BC)
    Original document available and verifiable by anyone.
  • 3 - Birth Record (BR)
    (= copy of the BC by an officer - may contain mistakes)
    Original document available and verifiable by anyone.
  • 4 - to check
    Data that is a priori serious (Cura, Newalchemypress, Astrodatabank) but contains errors.
    Needs to be checked against the BC.
  • 5 - the rest
    Data without birth time or taken from the web (Wikidata, web sites).
These levels are defined as constants of the class g5\model\DB5.
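
As an illustration, the five levels could be represented as integer constants, where a lower number means a more reliable source. The sketch below is written in Python for readability only. The names Trust, HC, BC, BR, CHECK, REST and is_at_least_as_reliable are hypothetical and are not taken from g5\model\DB5; only the numeric values 1 to 5 come from the list above.

```python
from enum import IntEnum


class Trust(IntEnum):
    """Hypothetical names; only the values 1 to 5 come from the documentation."""
    HC = 1      # Hospital Certificate
    BC = 2      # Birth Certificate
    BR = 3      # Birth Record (copy of the BC by an officer)
    CHECK = 4   # to check: needs to be matched against a BC
    REST = 5    # the rest: no birth time, or taken from the web


def is_at_least_as_reliable(a: Trust, b: Trust) -> bool:
    """Return True if level a is at least as reliable as level b.

    A lower number means a more reliable source.
    """
    return a <= b
```

With such a representation, comparing two sources reduces to comparing integers: is_at_least_as_reliable(Trust.HC, Trust.BC) is true, while is_at_least_as_reliable(Trust.CHECK, Trust.BC) is false.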

Most data handled by g5 is level 4; very little is level 2.

Note: as far as France is concerned, it is now possible to check BCs online. But in Gauquelin's and Müller's time, there were two possibilities: go to the archives in person and consult the BCs, or send a letter and receive BRs. This means that the Gauquelin and Müller data are mostly based on BRs. A BR may differ from the original BC because the officer can make a copying error, or copy the time of registration instead of the time of birth.
Raw data used by g5 may contain errors from several sources:
  • Copying error by the officer who issued a BR.
  • Error by Gauquelin or Müller when integrating the BR into their files.
  • Error when the original paper files were converted to electronic form (for example the error on GNR in Müller 1083 physicians).
  • Bugs in the g5 program should be added to this list...

G5 integration

Persons have two fields to express reliability:
  • trust specifies the default trust level of the person.
  • trust-details is an array associating specific fields with a trust level.

This model makes it possible to indicate the reliability of each field separately.
Example: a person with trust = 4 and trust-details = {"name.fame": 2} has all fields at trust level 4 except one, person.data['name']['fame'], which is level 2.
When a person is imported into the database, it takes the trust level of its source by default.
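
The lookup rule described above can be sketched as follows. This is a minimal illustration in Python, not g5's actual code. The functions field_trust and import_person, the dictionary layout and the field path "birth.date" are assumptions made for the example; only the trust and trust-details fields and the "name.fame" key come from the text.

```python
from typing import Any, Dict


def field_trust(person: Dict[str, Any], field_path: str) -> int:
    """Return the trust level of one field, e.g. "name.fame".

    A level listed in trust-details takes precedence; otherwise the
    person's default trust level applies.
    """
    details = person.get("trust-details", {})
    return details.get(field_path, person["trust"])


def import_person(data: Dict[str, Any], source_trust: int) -> Dict[str, Any]:
    """Hypothetical import helper: the person inherits the trust level of its source."""
    return {"data": data, "trust": source_trust, "trust-details": {}}


# A person imported from a level 4 source, with one field verified at level 2.
person = import_person({"name": {"fame": "Example"}}, source_trust=4)
person["trust-details"]["name.fame"] = 2

print(field_trust(person, "name.fame"))   # 2 (overridden in trust-details)
print(field_trust(person, "birth.date"))  # 4 (falls back to the default)
```

Storing only the exceptions in trust-details keeps the common case compact: a person whose fields all share the trust level of its source needs no per-field entry.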