Ertel 4391 generalities
From newalchemypress.com : In the last year of his life, Professor Ertel kindly sent his main data-collection, of the 4,391 sports champion.
I also found information about this file in these places :
- "Raising the Hurdle for the Athletes' Mars Effect: Association Co-Varies With Eminence" by Suitbert Ertel, published in Journal of Scientific Exploration, Vol. 2, No. 1, pp. 53-82, 1988, available on scientificexploration.org web site. Referred to as [Ertel 88] in this page.
- "The Tenacious Mars Effect", a book by Suitbert Ertel and Kenneth Irving, 1996. Referred to as [TME 96] in this page.
- "Is the "Mars Effect" Genuine?" by Paul Kurtz, Jan Willem Nienhuys, Ranjit Sandhu, published in Journal of Scientific Exploration, Vol. 11, No. 1, pp. 19-39, 1997, available on scientificexploration.org web site. Referred to as [KNS 97] in this page.
An incomplete and approximate file
One very strange feature of this file : many records don't contain a birth time. This is surprising because about 800 of the records without birth time were published by Gauquelin (LERRCP series), and were therefore known to Ertel. As noted by newalchemypress.com, birth places, longitudes and latitudes are not given.
Analysis of this file shows several errors and omissions, for example :
- The file contains 4384 records ; 7 records are missing to reach 4391.
- The file contains 553 records from Gauquelin 1955 list (instead of 568).
- In [TME 96], Ertel talks about 192 records common to CSICOP and Gauquelin (file D10). In the file, 192 records have a CSICOP id, but only 190 match. Not fixed.
- 5 associations with CSICOP file are erroneous (out of 190 - error rate 2.63 %). Fixed in step tweak2tmp.
- 26 associations with file D10 are erroneous (out of 349 - error rate 7.45 %). Fixed in step tweak2tmp.
- 6 other erroneous associations with Gauquelin data could be identified. Fixed in step tweak2tmp.
- American football players and European football (soccer) players have the same code FOOT. Fixed in step tmp2db.
- Rugby league and rugby union players are not differentiated. Partially fixed with the inclusion of CFEPP data in g5 database.
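The error rates quoted in this list are plain ratios. A minimal Python check (the function name is only illustrative, it is not part of g5) :

```python
def error_rate(n_errors, n_total):
    """Percentage of erroneous associations, rounded to 2 decimals."""
    return round(100 * n_errors / n_total, 2)


# Figures taken from the list above.
csicop_rate = error_rate(5, 190)   # erroneous CSICOP associations
d10_rate = error_rate(26, 349)     # erroneous D10 associations
print(csicop_rate, d10_rate)
```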
Integration to g5
The input file used by g5 is data/raw/ertel/3a_sports-utf8.txt ; it is an exact copy of file 3a_sports.txt retrieved from newalchemypress.com, converted to UTF-8.
Full execution
The full set of transformations can be executed with the commands :
php run-g5.php ertel sport raw2tmp
php run-g5.php ertel sport tweak2tmp
php run-g5.php ertel sport fixA1 update
php run-g5.php ertel sport tmp2db
These commands must be executed AFTER having imported LERRCP (Gauquelin) files.
raw2tmp
The first step is to generate the file data/tmp/ertel/ertel-4384-sport.csv :
php run-g5.php ertel sport raw2tmp
This step copies the contents of 3a_sports-utf8.txt to ertel-4384-sport.csv with the following modifications :
- Column GQID is created, containing a Gauquelin id (a string like "A1-123") from columns QUEL and G_NR for records originating from A1, D6 and D10. Left empty for records of other origin.
- Column NAME is renamed FNAME.
- Column VORNAME is renamed GNAME.
- Columns GEBDATUM and STUND produce a column DATE, ISO 8601.
- Column NATION produces column CY, ISO 3166.
- Column MF produces column SEX, containing "M" or "F" (column is empty for males in 3a_sports-utf8.txt).
- Column SPORTART is renamed SPORT.
- Column INDGRUP is renamed IG (individual or collective sport) ; contains the same value as in 3a_sports-utf8.txt.
- In column QUEL, "*G:D10" is replaced by "G:D10" (for records with NR = 2872 and 4080).
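To make these rules more concrete, here is a partial Python sketch of the conversion of one record. It is not the g5 code (which is PHP) : the function name convert_row is hypothetical, the record is assumed to be already parsed as a dict of strings, and MF is assumed to contain "F" for females. DATE and CY are left out because the raw formats of GEBDATUM, STUND and NATION are not described on this page.

```python
# QUEL codes of the published Gauquelin files, mapped to the prefix used in GQID.
QUEL_TO_GAUQUELIN_FILE = {"G:A01": "A1", "G:D06": "D6", "G:D10": "D10"}

# Columns that are simply renamed.
RENAMED = {"NAME": "FNAME", "VORNAME": "GNAME", "SPORTART": "SPORT", "INDGRUP": "IG"}


def convert_row(raw):
    """Convert one raw record (dict of strings) to the tmp layout (partial sketch)."""
    out = {new: raw.get(old, "") for old, new in RENAMED.items()}
    # "*G:D10" is a typo for "G:D10".
    quel = raw.get("QUEL", "").strip()
    if quel == "*G:D10":
        quel = "G:D10"
    out["QUEL"] = quel
    # GQID only exists for records coming from A1, D6 or D10.
    prefix = QUEL_TO_GAUQUELIN_FILE.get(quel)
    g_nr = raw.get("G_NR", "").strip()
    out["GQID"] = f"{prefix}-{g_nr}" if prefix and g_nr else ""
    # Assumption: MF is "F" for females and empty for males.
    out["SEX"] = "F" if raw.get("MF", "").strip().upper() == "F" else "M"
    return out


# Made-up record, for illustration only (not taken from the file).
example = convert_row({
    "NAME": "Doe", "VORNAME": "Jane", "QUEL": "*G:D10", "G_NR": "123",
    "MF": "F", "SPORTART": "TENN", "INDGRUP": "I",
})
print(example)
```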
tweak2tmp
Human corrections are stored in file data/db/init/newalch-tweak/ertel-4384-sport.yml.
They are used to modify the tmp file.
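The idea of this step can be sketched in Python as follows. This is only an illustration : the real layout of the yml file is not described here, so the corrections are represented as a hypothetical list of dicts keyed by NR, and apply_tweaks is not a g5 function. The two corrections used in the example are the sport code fixes mentioned in the "Sport codes" section of this page.

```python
def apply_tweaks(rows, tweaks):
    """Return a copy of rows where each tweak overrides the fields of the row with the same NR."""
    by_nr = {tweak["NR"]: tweak for tweak in tweaks}
    result = []
    for row in rows:
        fixed = dict(row)                          # do not modify the input
        fixed.update(by_nr.get(row["NR"], {}))     # no tweak => row unchanged
        result.append(fixed)
    return result


rows = [
    {"NR": "4348", "SPORT": "TRA"},
    {"NR": "2378", "SPORT": "TRAV"},
    {"NR": "9999", "SPORT": "FOOT"},   # placeholder record without correction
]
tweaks = [
    {"NR": "4348", "SPORT": "FOOT"},
    {"NR": "2378", "SPORT": "TRAC"},
]
fixed_rows = apply_tweaks(rows, tweaks)
print(fixed_rows)
```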
Fix Gauquelin A1 names
This step is included in the restoration process of file A1, not part of Ertel 4391 process.
Fixes 100% of the remaining unidentified names in A1.
php run-g5.php ertel sport fixA1
WRONG USAGE - fixA1 needs one parameter.
Can be :
'report' : echoes the list of names that will be modified by 'update'
'update' : updates file A1
php run-g5.php ertel sport fixA1 update
Nb missing names in Gauquelin A1 : 117
Ertel 4391 contains : 2084 lines from A1
Nb corrections : 117
Generate skeptics' files
This paragraph is obsolete, it describes a previous version of the code, not re-integrated yet.
Draft code ; in ertel-4384-sport.csv, the associations between lines and skeptic ids are sometimes wrong or missing.
Can come from a problem in ertel-4384-sport.csv, a misunderstanding of column meanings or a bug in the code.
php run-g5.php ertel sport ertel2skeptics
PARAMETER MISSING
Possible values for parameter :
all : Generate all skeptic files
cpara : Generate 5-cpara/535-cpara.csv
cpara-full : Generate 5-cpara/611-cpara-full.csv
cpara-lowers : Generate 5-cpara/76-cpara-lowers.csv
cfepp : Generate 5-cfepp/925-cfepp.csv
csicop : Generate 5-csicop/192-csicop.csv
php run-g5.php ertel sport ertel2skeptics all
CPARA : 535 records saved - stored in data/5-tmp/cpara/535-cpara.csv
CPARA full : 611 records saved - stored in data/5-tmp/cpara/611-cpara-full.csv
CPARA lowers : 76 records saved - stored in data/5-tmp/cpara/76-cpara-lowers.csv
CSICOP : 192 records saved - stored in data/5-tmp/csicop/192-csicop.csv
CFEPPP : 925 records saved - stored in data/5-tmp/cfepp/925-cfepp.csv
Comité Para
In ertel-4384-sport.csv, 611 records have a PARA_NR value.
In [TME 96] p SE-18, Ertel talks about 611 records :
- 535 records published in the official test.
- 76 records that were computed but not retained for the test because not eminent enough. Ertel called them Para Lowers and used these records to show a selection bias in Comité Para data.
It corresponds with ertel-4384-sport.csv : out of the 611 PARA_NR,
- 535 records have column QUEL = G:A01 (come from file A1) ; they have birth date and time
- 76 records have column QUEL = GCPAR ; they have birth day but not birth time.
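The 535 / 76 split can be obtained by grouping the records that have a PARA_NR by their QUEL value. A minimal Python sketch, assuming the csv has been loaded as a list of dicts (count_para_by_quel is a hypothetical helper, and the three records below are made up for the demonstration) :

```python
from collections import Counter


def count_para_by_quel(rows):
    """Count the records having a non-empty PARA_NR, grouped by QUEL."""
    return Counter(
        row["QUEL"] for row in rows if row.get("PARA_NR", "").strip()
    )


# Made-up records ; on the full file the expected result is
# 535 for "G:A01" and 76 for "GCPAR".
sample = [
    {"QUEL": "G:A01", "PARA_NR": "12"},
    {"QUEL": "GCPAR", "PARA_NR": "*3"},
    {"QUEL": "GMINI", "PARA_NR": ""},
]
para_counts = count_para_by_quel(sample)
print(para_counts)
```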
Links to CSICOP
Date comparisons done to build CSICOP test showed errors in Ertel's file.
Two records have a CSINR = 0 ; they correspond to existing records in file D10, but are absent from CSICOP file.
Ertel Id | Gauquelin id | Person |
---|---|---|
2285 | D10-726 | Kono Tom (Tomio) 1930-07-27 |
2873 | D10-894 | Miller John L. 1947-04-29 |
tweak2tmp
.
One record without CSINR could be identified during CSICOP merge :
Miller Freddie 1911-04-03 ; NR = 2872 ; CSID = 254
It brings to 191 the number of CSICOP records present in Ertel's file.
Looking at the file
NOTE : the following code was written to try to understand the content of the file.
I don't understand all the columns, so consider the information given about the file as suppositions ; there may be mistakes.
php run-g5.php ertel sport look
PARAMETER MISSING
Possible values for parameter : sport, quel, date, eminence, ids, mars
For example :
php run-g5.php ertel sport look sport
Links to other datasets
An interest of this file is the presence of columns indicating the ids of the records in other data sets.
The information is contained in the following columns :
- QUEL : origin of the record
- NR : id (number) in Ertel's reference - unique id within this file.
- PARA_NR : id in Comité Para test
- CFEPNR : id in CFEPP test
- CSINR : id in CSICOP test
- G55 : presence in Gauquelin 1955 experiment
- G_NR has a different meaning, depending on the value of QUEL :
  - For records published by Gauquelin's LERRCP (A1 D6 D10), G_NR is the id of the records within these files.
  - For unpublished records (QUEL = GCPAR, GMINI, GMING, G_ADD, GMINV, GMIND, G_79), it contains a unique id within a given QUEL value (for QUEL = GCPAR, G_NR = PARA_NR). Here G_NR looks like an id given by Ertel when a Gauquelin id was not available.
php run-g5.php ertel sport look ids
lists the number of records associated to the external datasets :
Gauquelin G_NR : 4384 (100 %)
Gauquelin 1955 G55 : 553 (12.61 %)
Comité Para PARA_NR : 611 (13.94 %)
CSICOP CSINR : 192 (4.38 %)
CFEPP CFEPNR : 925 (21.1 %)
Column QUEL seems to indicate the origin of the records.
php run-g5.php ertel sport look quel
gives the different values of QUEL and corresponding number of records :
[G:A01] => 2087
[G:D06] => 450
[G:D10] => 351
[GCPAR] => 76
[GMIND] => 453
[GMING] => 115
[GMINI] => 599
[GMINV] => 24
[G_79F] => 27
[G_ADD] => 202
Ertel's subsamples
Combining the output of the two previous commands permits a reconstitution of Table 1 given in [Ertel 88], p 59.
This table and accompanying notes describe the samples used by Ertel to build his pool of 4391 records.
Notes :
- New indicates the number of new records brought by this file, not present in Gauquelin or skeptics' files. This file brings 1828 new records.
- P : Published / Unpublished
- QUEL : value of column QUEL in the original file.
- Id : Name of the column concerning the sample in the original file.
- Ner = number claimed by Ertel. Ng5 = number found by g5, using columns QUEL and Id.
- Ertel subsample : name of the subsample in Ertel article.
- g5 match : Link to the other groups present in database that could be matched with Ertel's file. An empty cell means that these records are only found in Ertel's file. (links directly point to opengauquelin.org groups)
| New | P | QUEL | NQUEL | Id | NId | Ertel subsample | g5 match | Comments |
|---|---|---|---|---|---|---|---|---|
| | | G:A01 | 2087 | | | | A1 | Ner = 2087, Ng5 = 2087. QUEL = G:A01 => records come from Gauquelin file A1. |
| 0 | P | | | G55 | | 1 - First French | | Ner = 567, Ng5 = 553. Records where column GAUQ1955 is not empty. Records coming from Michel Gauquelin "L'influence des astres", 1955. Gauquelin 1955 restoration identifies 564 Gauquelin 1955 records present in file A1. |
| 0 | P | | | | 1202 | 2 - First European | | Ner = 1189, Ng5 = 1202. Records where QUEL = G:A01 and GAUQ1955 empty and PARA_NR empty. From [Ertel 88] : Note: 1202 = 2087 - 553 - 332: the number in sample 2 (1202) equals to the number found in A1 (2087), minus the number coming from 1955 book (553), minus the number of new records brought by Comité Para test (332). |
| 332 | P | | | PARA_NR | 535 | 6 - Para champions | | Ner = 535, Ng5 = 535. Records with QUEL = G:A01 and PARA_NR = number from 1 to 535. List published by Comité Para for its 1976 test. From [Ertel 88] : "Since Gauquelin had already 203 athletes from the Para sample in his earlier studies (1955, 1960), only 332 are gained". |
| 76 | U | GCPAR | 76 (76) | PARA_NR | 76 | 7 - Para lowers | | Ner = 76, Ng5 = 76. Records with QUEL = GCPAR and PARA_NR = string from *1 to *76. 535 + 76 = 611 ; numbers given in [Ertel 88] correspond to the content of the file. No birth time - only birth day. From [Ertel 88] : These 76 are part of a group of 241 soccer players gathered for the Comité Para test (1976). They were not retained in Comité Para experiment because considered less eminent and remained unpublished. They were copied by Ertel when he visited Gauquelin laboratory. Ertel copied only 76 because it was for only 76 out of 241 (ranks 1-76) that mars sector was computed. |
| | | G:D10 *G:D10 | 351 | | | | D10 | Ner = 351, Ng5 = 351. G:D10 records come from Gauquelin file D10. Gauquelin file contains 352 sportsmen, not 351. (Code *G:D10 is a typo concerning 2 records, present in D10 Gauquelin file) |
| 0 | P | | | CSINR | 192 | 8 - CSICOP-U.S. | D10, CSICOP | Ner = 192, Ng5 = 192. Related to the 1979 CSICOP test (US skeptics). Out of the 408 (Ertel cites 409) CSICOP records, Ertel uses only 192 also gathered by Gauquelin because mars 36 sector information was not available for CSICOP data. In file Ertel 4391, 2 records don't match D10, leading to 190 effective matches. TODO CHECK if these 2 records are new data. |
| 0 | P | | | | 159 | 12 - GAUQ-U.S. | D10 | Ner = 158, Ng5 = 159. Remaining sportsmen of D10, not already included in CSICOP test. 159 = 351 - 192 |
| 599 | U | GMINI | 599 | | | 3 - Italian football | | Ner = 600, Ng5 = 599. Unpublished by Gauquelin (not famous enough), copied manually by Ertel in Gauquelin's laboratory. No birth time - only birth day. In file Ertel 4391, all records marked GMINI are marked sport = FOOT and country = IT. Possible meaning : Gauquelin MINor Italian |
| 115 | U | GMING | 115 | | | 4 - German various | | Ner = 117, Ng5 = 115. Unpublished by Gauquelin (not famous enough), copied manually by Ertel. Possible meaning : Gauquelin MINor German. No birth time - only birth day. In file Ertel 4391, all records marked GMING have country = DE. |
| 202 | U | G_ADD | 202 | | | 5 - French occasionals | | Ner = 204, Ng5 = 202. Unpublished by Gauquelin (not famous enough), copied manually by Ertel in Gauquelin's laboratory. No birth time - only birth day. Considered as "low-low-ranking" by Gauquelin. |
| 0 | P | G:D06 | 450 | | | 9 - Second European | D6 | Ner = 450, Ng5 = 450. G:D06 records come from Gauquelin file D6. No birth time - only birth day. Gauquelin file contains 449 sportsmen, not 450. => TODO : check (understand Ertel note: "In an appendix to D6 he listed 15 additional athletes whose birth dates had been received too late for inclusion. They were added to the present pool"). |
| 24 | U | GMINV | 24 | | | 10 - Italian cyclists | | Ner = 24, Ng5 = 24. Unpublished by Gauquelin (not famous enough), copied manually by Ertel in Gauquelin's laboratory. Supposition from [KNS 97] p 25 : GMINV could mean Gauquelin MINor Vélo. No birth time - only birth day. In file Ertel 4391, all records marked GMINV are marked sport = CYCL and country = IT. |
| 453 | U | GMIND | 453 | | | 11 - Lower French | | Ner = 455, Ng5 = 453. Unpublished by Gauquelin (not famous enough), copied manually by Ertel in Gauquelin's laboratory. Supposition from [KNS 97] p 25 : GMIND could mean Gauquelin MINor Dictionary. No birth time - only birth day. |
| 27 | U | G_79F | 27 | | | 13 - Plus special | | Ner = 27, Ng5 = 27. Supplementary data sent by Gauquelin to Ertel after his visit in Paris. No birth time - only birth day. |
| TOTAL : 1828 new birth dates without time | | | | | | | | |
Counted in file Ertel 4391 = 553 + 1202 + 332 + 76 + 192 + 159 + 599 + 115 + 202 + 450 + 24 + 453 + 27 = 4384
Numbers coming from [Ertel 88] = 567 + 1189 + 332 + 76 + 192 + 158 + 600 + 117 + 204 + 450 + 24 + 455 + 27 = 4391
Each sum matches its expected total (4384 records present in the file, 4391 records claimed by Ertel), which is an indication that sample restoration is correct.
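These two sums can be re-checked with a few lines of Python. The subsample numbers are copied from the Ner / Ng5 values of the table ; the variable names are only illustrative.

```python
# Ertel subsample number => (Ner, Ng5), copied from the table above.
subsamples = {
    1: (567, 553), 2: (1189, 1202), 6: (332, 332), 7: (76, 76),
    8: (192, 192), 12: (158, 159), 3: (600, 599), 4: (117, 115),
    5: (204, 202), 9: (450, 450), 10: (24, 24), 11: (455, 453),
    13: (27, 27),
}

total_ertel = sum(ner for ner, _ in subsamples.values())
total_g5 = sum(ng5 for _, ng5 in subsamples.values())

# Subsamples where the number claimed by Ertel differs from the number found in the file.
differences = {k: ner - ng5 for k, (ner, ng5) in subsamples.items() if ner != ng5}

print(total_ertel, total_g5, differences)
```

The differences are spread over 7 subsamples and partly compensate each other (sample 2 contains more records in the file than announced), which gives the net gap of 7 records.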
Data sources
18 distinct sources can be extracted from column ZITATE.
These lists associate the source code with the number of records found in this source.
Sorted by source code :
[A] => 91 [B] => 90 [C] => 31 [D] => 1564 [E] => 65 [F] => 185 [G] => 84 [H] => 141 [J] => 28 [K] => 353 [M] => 21 [O] => 590 [R] => 28 [S] => 327 [T] => 164 [W] => 137 [X] => 143 [Y] => 67
Sorted by number of records :
[D] => 1564 [O] => 590 [K] => 353 [S] => 327 [F] => 185 [T] => 164 [X] => 143 [H] => 141 [W] => 137 [A] => 91 [B] => 90 [G] => 84 [Y] => 67 [E] => 65 [C] => 31 [R] => 28 [J] => 28 [M] => 21
O
, which makes 18 codes.
I checked : there is an exact match between column ZITATE and Ertel's list.
Birth dates
php run-g5.php ertel sport look date
BUG in date : 46 Albani Peppino : - -
N total : 4384
N with birth time : 2086 (47.58 %)
N without birth time : 2297 (52.4 %)
N without birth time from Gauquelin LERRCP : 802
Among the 802 missing times coming from Gauquelin, 3 come from A1 and 799 from D6 and D10.
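The figures of this output are consistent with each other, as this small Python check shows (numbers copied from the output ; variable names are illustrative). The record reported as a bug is counted in the total but neither in the "with birth time" nor in the "without birth time" group.

```python
n_total = 4384
n_with_time = 2086
n_without_time = 2297

# Records counted in neither group (the record reported as "BUG in date").
n_unparsed = n_total - n_with_time - n_without_time

pct_with_time = round(100 * n_with_time / n_total, 2)
pct_without_time = round(100 * n_without_time / n_total, 2)

# Missing times among records coming from Gauquelin LERRCP files.
n_missing_lerrcp = 3 + 799   # A1, then D6 and D10

print(n_unparsed, pct_with_time, pct_without_time, n_missing_lerrcp)
```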
Eminence
php run-g5.php ertel sport look eminence
Columns ZITRANG, ZITSUM, ZITATE and ZITSUM_OD deal with eminence.
- ZITRANG is the eminence rank (1 - 6).
- ZITSUM is the number of citations.
- ZITATE is the list of sources where the person is cited.
- ZITSUM_OD : I don't know - equals ZITSUM or ZITSUM - 1
Here are ranks and citation counts, associated with the number of records.
Ranks (ZITRANG) :
[1] => 2242 [2] => 1108 [3] => 549 [4] => 251 [5] => 101 [6] => 133
Citation counts (ZITSUM) :
[0] => 2242 [1] => 1108 [2] => 549 [3] => 251 [4] => 101 [5] => 75 [6] => 37 [7] => 18 [8] => 3
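Comparing the two lists suggests a simple relation : rank = number of citations + 1, with all records having 5 citations or more grouped in rank 6. This is only a supposition deduced from the counts, not a rule stated by Ertel on this page. The Python sketch below checks that the supposition is compatible with the two lists (rank_from_citations is a hypothetical name).

```python
from collections import Counter

# Counts copied from the output above.
rank_counts = {1: 2242, 2: 1108, 3: 549, 4: 251, 5: 101, 6: 133}
citation_counts = {0: 2242, 1: 1108, 2: 549, 3: 251, 4: 101, 5: 75, 6: 37, 7: 18, 8: 3}


def rank_from_citations(n_citations):
    """Supposed relation between ZITSUM and ZITRANG (assumption, see above)."""
    return min(n_citations + 1, 6)


# Rebuild the rank distribution from the citation distribution.
derived_rank_counts = Counter()
for n_citations, n_records in citation_counts.items():
    derived_rank_counts[rank_from_citations(n_citations)] += n_records

print(dict(derived_rank_counts) == rank_counts)
print(sum(rank_counts.values()))
```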
Sport codes
php run-g5.php ertel sport look sport
Two columns are related to the sport of the persons : SPORTART and INDGRUP.
INDGRUP contains 'I' or 'G', indicating if this is an individual or collective sport.
The file contains 5 mistakes :
Incoherent association sport / IG, line Cachemire Jacques : BASK I
Incoherent association sport / IG, line David Wilfried : CYCL G
Incoherent association sport / IG, line Frey Andre : FOOT I
Incoherent association sport / IG, line Richard René : HAND I
Incoherent association sport / IG, line Windal Claude : HOCK I
Errors on INDGRUP are fixed in step tweak2tmp.
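One possible way to detect such incoherences is to consider that the most frequent INDGRUP value of a sport is the right one, and to report the records that deviate from it. The Python sketch below illustrates this idea ; it is an assumption about how the check can be done, not a description of the g5 code, and apart from the first record the sample data are placeholders.

```python
from collections import Counter, defaultdict


def find_incoherent_ig(rows):
    """Return the rows whose INDGRUP differs from the majority value of their sport."""
    ig_by_sport = defaultdict(Counter)
    for row in rows:
        ig_by_sport[row["SPORTART"]][row["INDGRUP"]] += 1
    # Majority value of INDGRUP for each sport code.
    expected = {
        sport: counts.most_common(1)[0][0] for sport, counts in ig_by_sport.items()
    }
    return [row for row in rows if row["INDGRUP"] != expected[row["SPORTART"]]]


sample = [
    {"NAME": "Cachemire", "SPORTART": "BASK", "INDGRUP": "I"},  # reported above
    {"NAME": "placeholder 1", "SPORTART": "BASK", "INDGRUP": "G"},
    {"NAME": "placeholder 2", "SPORTART": "BASK", "INDGRUP": "G"},
    {"NAME": "placeholder 3", "SPORTART": "TENN", "INDGRUP": "I"},
]
incoherent = find_incoherent_ig(sample)
print(incoherent)
```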
Sport codes are mostly composed of 4 letters.
This list shows IG and SPORT columns, and the number of records associated with each sport :
I AIRP : 396
I ALPI : 9
I AUTO : 109
I AVIR : 1
I BADM : 1
G BASE : 25
G BASK : 80
I BILL : 10
I BOBSL : 2
I BOWL : 3
I BOXI : 252
I CANO : 4
I CYCL : 669
I FENC : 41
G FOOT : 1465
I GOLF : 32
I GYMN : 24
G HAND : 12
G HOCK : 21
I HORS : 22
I ICES : 16
I JUDO : 5
I MOTO : 4
G PELOT : 18
I RODE : 1
I ROLL : 3
I ROWI : 22
G RUGB : 413
I SHOO : 13
I SKII : 86
I SWIM : 62
I TENN : 89
I TRA : 1
I TRAC : 407
I TRAV : 1
G VOLL : 4
I WALK : 6
I WEIG : 25
I WRES : 19
I YACH : 11
- One code is composed of 3 letters (TRA), and this looks like a mistake. It corresponds to NR 4348 Charles Young, and his sport is American football. This code is changed to FOOT in step tweak2tmp.
- Code TRAV corresponds to NR 2378 Martin Lauer, and his sport is track and field. This code is changed to TRAC in step tweak2tmp.
Mars sectors
Concerned columns are :
- MARS contains the sectors when the circle is divided in 36.
- MA12 contains the sectors when the circle is divided in 12.
- MA_ : importance of the sector in the observation of effects.
Combining this information with eminence rank, it should be possible to reproduce Ertel's famous curves of 1988.
The command
php run-g5.php ertel sport look mars
generates the following table.
This table lists the different values found in the 3 columns.
It is interesting because it shows the difference between the 12-sector and 36-sector systems :
In the 36-sector system, sectors 9 and 36 are also considered as important, and they are outside sectors 1 and 4 of the 12-sector system.
This shows that the 12-sector system does not catch all the information about the observed statistical effect.
This argument is used to say that using the 36-sector system is more efficient to observe the effects.
MARS | MA12 | MA_ (importance) |
---|---|---|
1 | 1 | 2 |
2 | 1 | 2 |
3 | 1 | 2 |
4 | 2 | 1 |
5 | 2 | 1 |
6 | 2 | 1 |
7 | 3 | 0 |
8 | 3 | 0 |
9 | 3 | 2 |
10 | 4 | 2 |
11 | 4 | 2 |
12 | 4 | 2 |
13 | 5 | 1 |
14 | 5 | 1 |
15 | 5 | 0 |
16 | 6 | 0 |
17 | 6 | 0 |
18 | 6 | 1 |
19 | 7 | 1 |
20 | 7 | 1 |
21 | 7 | 1 |
22 | 8 | 0 |
23 | 8 | 0 |
24 | 8 | 0 |
25 | 9 | 0 |
26 | 9 | 1 |
27 | 9 | 1 |
28 | 10 | 1 |
29 | 10 | 1 |
30 | 10 | 1 |
31 | 11 | 1 |
32 | 11 | 0 |
33 | 11 | 0 |
34 | 12 | 0 |
35 | 12 | 1 |
36 | 12 | 2 |
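In this table, each sector of the 12-sector system groups 3 consecutive sectors of the 36-sector system. The Python sketch below encodes the table (values of MA_ copied from it) and lists the sectors of maximal importance that fall outside sectors 1 and 4 of the 12-sector system. Function and variable names are illustrative, not taken from g5.

```python
def ma12_from_mars(mars):
    """Sector in the 12-sector system for a sector (1 to 36) of the 36-sector system."""
    return (mars - 1) // 3 + 1


# MA_ values for MARS = 1 to 36, copied from the table above.
MA_VALUES = [
    2, 2, 2, 1, 1, 1, 0, 0, 2, 2, 2, 2,
    1, 1, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0,
    0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 2,
]
importance = {mars: MA_VALUES[mars - 1] for mars in range(1, 37)}

# Sectors of the 36-sector system with the highest importance.
key_sectors_36 = [mars for mars, value in importance.items() if value == 2]

# Among them, those not located in sectors 1 and 4 of the 12-sector system.
outside_key_12 = [m for m in key_sectors_36 if ma12_from_mars(m) not in (1, 4)]

print(key_sectors_36)
print(outside_key_12)
```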