G5 integration
Raw files are imported into the database with the following commands:
    php run-g5.php gauq E1 raw2tmp small
    php run-g5.php gauq E3 raw2tmp small
    php run-g5.php gauq E1 tmp2db
    php run-g5.php gauq E3 tmp2db
Step raw2tmp performs place restoration, geonames matching and timezone offset computation.
Importing Arno Müller's 1083 medical doctors makes it possible to restore some names in E3.
raw2tmp
This command needs a parameter to indicate what it should print:
    php run-g5.php gauq E1 raw2tmp
    MISSING PARAMETER : raw2tmp needs a parameter to specify which output it displays.
    Can be :
      small : echoes only global report
      tz : echoes the records for which timezone information is missing
      geo : echoes the records for which geonames matching couldn't be done
      full : equivalent to tz and geo
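The four accepted values can be summarised as a small lookup. The sketch below is written in Python for illustration only: it is not the PHP code of run-g5.php, and the names RECORD_LISTS and record_lists_for are hypothetical.

```python
# Illustrative sketch, not the code of run-g5.php.
# Maps each raw2tmp parameter to the record lists it echoes.
RECORD_LISTS = {
    "small": (),            # global report only
    "tz": ("tz",),          # records with missing timezone information
    "geo": ("geo",),        # records for which geonames matching failed
    "full": ("tz", "geo"),  # equivalent to tz and geo
}


def record_lists_for(param):
    """Return the record lists echoed for a given raw2tmp parameter."""
    if param not in RECORD_LISTS:
        raise ValueError("raw2tmp needs one of: " + ", ".join(RECORD_LISTS))
    return RECORD_LISTS[param]
```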
NUM
IMPORTANT: files E1 and E3 differ from other Cura files because record numbers between 1 and 999 are prefixed with zeroes. To keep NUM consistent across the generated CSV files, these zeroes are removed to form the NUM of generated files.
This is done in step raw2tmp.
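As an illustration, removing the zero prefix amounts to a round trip through an integer. This is a minimal Python sketch, not the PHP code of raw2tmp; the function name normalize_num is hypothetical.

```python
def normalize_num(raw_num):
    """Remove the zero prefix of an E1/E3 record number, e.g. '0517' -> '517'."""
    # int() drops leading zeroes; str() gives back the textual NUM.
    return str(int(raw_num))
```

Numbers of 1000 and above are left unchanged, so the same function can be applied to every record.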
Place
The main problem of files E1 and E3 is that the timezone offset is not given. It can be computed from the place and the date, which means that the place must first be identified so that it can be linked to a timezone. This is done through geonames.org, and the result is included in the generated file. This gives partial results:
    php run-g5.php gauq E1 raw2tmp
    Importing E1
    2153 lines parsed
    267 places not matched
    346 timezone offsets not computed
    1540 persons stored precisely (71.53 %)
    php run-g5.php gauq E3 raw2tmp
    Importing E3
    1539 lines parsed
    224 places not matched
    266 timezone offsets not computed
    1049 persons stored precisely (68.16 %)
The unmatched cases have two causes:
- Some place names differ from the geonames.org names; this can be fixed by completing function compute_geo(), where the correspondence between Cura names and geonames.org names can be written.
- Timezone computation refuses to handle unclear cases: during WW1, some parts of France were sometimes French, sometimes German, and no offset is computed for them.
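The first cause can be handled with a plain correspondence table. The following Python sketch only illustrates the idea and is not the content of compute_geo(): the table name, the function name and the single entry are assumptions (the entry expands the abbreviation Boulogne-Billt, which appears in E1 for a place in department 92).

```python
# Hypothetical correspondence table: Cura spelling -> geonames.org spelling.
# The single entry is an illustration, not taken from compute_geo().
CURA_TO_GEONAMES = {
    "Boulogne-Billt": "Boulogne-Billancourt",
}


def geonames_name(cura_place):
    """Return the spelling to look up on geonames.org for a Cura place name."""
    # Names without an entry are looked up unchanged.
    return CURA_TO_GEONAMES.get(cura_place, cura_place)
```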
Names
Family and given names are split for the common case. For some records, the NAME column of Cura files mixes several pieces of information: sometimes the maiden name of a woman, or the public pseudonym of a person. More precise code needs to be written to handle this.
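A minimal Python sketch of what "common case" could mean, assuming the family name is written in upper case and followed by the given name, as in most records quoted on this page. This is not the project's PHP code; split_name is a hypothetical name, and every other shape of value is deliberately left unhandled.

```python
def split_name(name):
    """Split a Cura NAME value into (family, given) for the common case.

    Returns None when the value does not look like 'FAMILY Given'.
    """
    tokens = name.split()
    family = []
    # Leading fully upper-case tokens form the family name.
    while tokens and tokens[0] == tokens[0].upper():
        family.append(tokens.pop(0))
    # A particle, an upper-case token or a parenthesis in the remainder
    # means the record is not a common case.
    unusual = any(t[0].islower() or t == t.upper() or "(" in t for t in tokens)
    if not family or not tokens or unusual:
        return None
    return " ".join(family), " ".join(tokens)
```

With this rule, "DANIAUD Jean" gives ("DANIAUD", "Jean"), while "GOLDE Alice (DUPUY)" or "COQUET James de" give None and would need the more precise code mentioned above.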
Notes
Some fields contain supplementary information, written before the name in Cura files: + - * or L. Their meaning is explained on Cura pages. This information is included in the generated files in column NOTE.
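Separating the markers from the name can be sketched with a regular expression. This Python version is an illustration, not the project's PHP code; split_note is a hypothetical name, and the sketch assumes that markers may be combined (as in "+*") and are always followed by a space.

```python
import re

# Markers listed in the text: + - * or L, written before the name.
NOTE_RE = re.compile(r"^([+\-*L]+)\s+(.+)$")


def split_note(name_field):
    """Return (note, name); note is '' when no marker precedes the name."""
    match = NOTE_RE.match(name_field)
    if match:
        return match.group(1), match.group(2)
    return "", name_field
```

Because a space is required after the markers, a name that merely starts with the letter L is left untouched.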
Planetary sectors
Cura pages contain two lists: one with birth data and one with planetary sectors. These two lists can be merged without ambiguity, using field NUM.
Planetary sectors are included in the generated files.
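The merge on NUM can be sketched as a dictionary lookup. This is an illustrative Python version, not the project's PHP code; merge_on_num is a hypothetical name, and rows are assumed to be dictionaries with a unique NUM key.

```python
def merge_on_num(birth_rows, sector_rows):
    """Merge two lists of dicts sharing a unique 'NUM' key into one list."""
    sectors_by_num = {row["NUM"]: row for row in sector_rows}
    merged = []
    for birth in birth_rows:
        # A NUM missing from the sector list raises KeyError:
        # both lists are expected to describe the same records.
        merged.append({**birth, **sectors_by_num[birth["NUM"]]})
    return merged
```

The result keeps the order of the birth data list and adds the sector fields to each record.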
Small errors
Cura files contain several typos.
Errors in 902gdE1.html
- Error in Saturn sector number:
  0517 26 28 20 01 lt3 DANIAUD Jean
  Error not fixed yet.
- Error in Moon sector:
  0560 o8 05 07 36 29 DELAY Maurice
  Value o8 should probably be replaced with 08, but this was not checked.
  Error not fixed yet.
- Typos in the names of 27 persons (O replaced by zero, A by 3, S by 5, G by 6, B by 8).
  These typos are present in both lists (list with birth data and list with sectors).
  These errors are fixed in raw2tmp code.
  0483 EX COURB0T Henri 12 02 1902 20:00 Courchelettes 59
  0642 PH + DUB0S René 20 02 1901 10:00 St-Brice 78
  0654 MI DUD0GNON Martial 01 06 1900 22:00 Ambazac 87
  0743 MI FERRIERES de SAUVEB0EUF Guy 12 10 1919 22:00 Tours 37
  0746 MI FEUVRIER Ch3rles 29 01 1915 01:00 Damprichart 25
  0879 PH G0DECHOT Roger 15 12 1922 04:00 Nancy 54
  0880 MI * G0DEFROY Charles 29 12 1888 07:00 La Flèche 72
  0881 PH - G0DLEWSKI Guy 20 04 1913 09:00 St-Mandé 94
  0882 PH G0DLEWSKI Jean 08 08 1919 15:00 St-Mandé 94
  0883 PH G0DLEWSKI Stanislas 02 12 1919 23:30 Sorgues 84
  0885 PH G0LDE Alice (DUPUY) 10 08 1928 23:40 Corbeil-Essonnes 91
  0886 EX G0MA Michel 12 03 1932 23:30 Montcrabeau 47
  0887 MI G0MBEAUD Jean 29 06 1907 15:00 Billère 66
  0888 EX G0MEZ Francine (LE FOYER) 12 10 1932 01:40 Boulogne-Billt 92
  0890 PH G0RET Pierre 27 08 1907 11:00 Rosières-en-Santer 80
  0891 PH G0SSEREZ Maurice 18 03 1911 08:00 Montpellier 34
  0892 PH G0UDAL Gaston 11 05 1910 06:00 Bagnac 46
  0893 EX G0UDARD Jean-Michel 13 11 1939 21:00 Montpellier 34
  0894 PH +* G0UGEROT Henri 02 07 1881 05:00 St-Ouen 75
  0897 MI G0UJON Pierre 27 09 1910 06:00 Maisons-Lafitte 78
  0898 PH G0UMAIN André 16 06 1910 15:00 Pau 64
  0899 PH + G0UNELLE Hugues 27 02 1903 03:30 Châteauroux 36
  0901 EX G0UZE-RENAL Christine 30 12 1914 08:30 Mouchard 39
  0998 MI L HETTIER de B0ISLAMBERT Cl. 26 07 1906 12:00 Hérouvillette 14
  1119 MI * LAB0UCHERE René 13 02 1890 05:00 Paris 8ème 75
  1797 PH R0BERT François 15 09 1914 15:00 St-Avold 57
  1801 PH R0BLIN Jean 24 06 1914 11:00 Guer 56
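The digit-for-letter typos follow a fixed substitution, which can be sketched as below. This Python code is an illustration and not the way raw2tmp fixes the records, which is not described here; fix_name_typos is a hypothetical name. It must be applied to the name only, since places and dates legitimately contain digits (for example "Paris 8ème").

```python
# Digit -> letter substitutions listed in the text.
DIGIT_TO_LETTER = {"0": "O", "3": "A", "5": "S", "6": "G", "8": "B"}


def fix_name_typos(name):
    """Replace the digits 0, 3, 5, 6, 8 found in a name by the intended letters."""
    fixed = []
    for char in name:
        letter = DIGIT_TO_LETTER.get(char, char)
        # Follow the case of the preceding letter ('Ch3rles' -> 'Charles').
        if char in DIGIT_TO_LETTER and fixed and fixed[-1].islower():
            letter = letter.lower()
        fixed.append(letter)
    return "".join(fixed)
```

For example, fix_name_typos("COURB0T Henri") returns "COURBOT Henri" and fix_name_typos("FEUVRIER Ch3rles") returns "FEUVRIER Charles".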
Errors in 902gdE3.html
- A space is missing between the Mars and Jupiter sector numbers for record 0811:
  0811 04 26 2104 08 IZARD Georges
  Error fixed.
- The same type of typo found in the E1 file occurs for 9 records.
  These errors are corrected by the program.
  0309 PAI CHARB0NNIER Pierre 24 08 1897 13:00 Vienne 38
  0383 JO C0QUET James de 16 07 1898 01:00 Bordeaux 33
  0488 AC DESCRIERES Georges (BER6É-D.) 15 04 1930 09:00 Bordeaux 33
  0497 PO DESTREMAU 8ernard 11 02 1917 17:30 Paris 16ème 75
  0703 PO GIAC0BBI François 19 07 1919 09:00 Venaco 20
  0784 PO HERISS0N Charles 12 10 1831 10:00 Surgy 58
  0836 PO J05PIN Lionel 12 07 1937 23:10 Meudon 92
  1308 PO RIB0T Alexandre 03 02 1842 01:00 St-Omer 62
  1326 PO R0BERT Pierre 17 05 1875 23:30 Montbrison 42
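Malformed sector rows such as the ones quoted in this section can be detected with a simple pattern. The Python sketch below is an illustration, not part of the G5 program; is_well_formed is a hypothetical name, and the row layout (four-digit NUM, five two-digit sector numbers, then the name) is an assumption inferred from the quoted rows.

```python
import re

# Assumed layout, inferred from the rows quoted in the text:
# four-digit NUM, five two-digit sector numbers, then the name.
SECTOR_ROW_RE = re.compile(r"^\d{4}( \d{2}){5} \S")


def is_well_formed(row):
    """Tell whether a sector row has a NUM followed by five two-digit values."""
    return SECTOR_ROW_RE.match(row) is not None
```

The three erroneous rows quoted above (0517, 0560 and 0811) are rejected, while "0811 04 26 21 04 08 IZARD Georges", with the missing space restored, is accepted.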
TODO
- More information can be extracted from names: maiden name, nobility, nickname.