Gauquelin5 installation

This software has been developed and tested on Linux. It should also work on Windows and macOS, but has not been tested there.
This installation guide has been tested on Debian 10 and Ubuntu 20.04.

Prerequisites

Before starting, the following must be installed on your machine:
  • PHP (version 8.0 or higher)
    See for example https://www.linuxtechi.com/install-php-8-on-debian-10/.
    Some PHP extensions are also necessary:
    sudo apt install php8.0-pgsql
    sudo apt install php8.0-yaml
    sudo apt install php8.0-mbstring
    sudo apt install php8.0-zip
  • PostgreSQL
    Version 12, from the stable Debian repository, is sufficient.
    sudo apt install postgresql
  • Python 3, already installed by default on Debian and Ubuntu.
  • Git
    sudo apt install git
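
The list above can be checked with a small script. This is only a sketch: the helper names (have_cmd, have_php_ext, php_is_8) are hypothetical and not part of g5, and the extension names are assumed to be the ones reported by php -m for the packages listed above.

```shell
# Report which prerequisites are visible on this machine.
# Helper names are hypothetical; they are not part of g5.

# Succeeds if the given command is found in PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Succeeds if the given PHP extension is listed by `php -m`.
have_php_ext() {
    have_cmd php && php -m 2>/dev/null | grep -qix "$1"
}

# Succeeds if the PHP found in PATH is version 8.0 or higher.
php_is_8() {
    have_cmd php && php -r 'exit(version_compare(PHP_VERSION, "8.0.0", ">=") ? 0 : 1);'
}

for cmd in php psql python3 git; do
    if have_cmd "$cmd"; then echo "ok       $cmd"; else echo "MISSING  $cmd"; fi
done

if php_is_8; then echo "ok       php >= 8.0"; else echo "MISSING  php >= 8.0"; fi

# Assumed extension names, as printed by `php -m`.
for ext in pgsql yaml mbstring zip; do
    if have_php_ext "$ext"; then
        echo "ok       php extension $ext"
    else
        echo "MISSING  php extension $ext"
    fi
done
```

The script only prints a report and does not install anything.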

Install and configure g5

Open a terminal and clone the repository to your local machine:
git clone --depth=1 https://github.com/tig12/gauquelin5.git
Note: --depth=1 is optional; it only saves disk space and bandwidth.

Instead of cloning the repository, you can download the code.
In the rest of this document, the directory gauquelin5/ is called the root directory.
All commands that run the program must be issued from the root directory.

Directory structure

The important files and directories are:
gauquelin5/
    ├── data/
    │   ├── db/
    │   └── raw/
    ├── docs/
    ├── src/
    ├── vendor/
    ├── config.yml.dist
    └── run-g5.php
The files you need to know about are:
  • run-g5.php is the single entry point of the program.
  • data/ contains the data generated and manipulated by the program.
  • config.yml.dist contains sample configuration directives.

Configuration

Create a file config.yml by copying config.yml.dist:
cp config.yml.dist config.yml
Edit config.yml and adapt the values of the following sections:

dirs

This section specifies the unversioned directories that contain data.
The values can be either absolute paths or paths relative to the root directory.
The default values are all relative to the root directory:
dirs:
  output: data/output
  tmp:    data/tmp
After installation, the data/ directory contains two sub-directories: db/ and raw/.
These directories contain data necessary to run g5 and are versioned with the program. Their locations are fixed and cannot be configured.

The other sub-directories of data/ are not versioned (they are ignored by git).

With the default values of section dirs, directory data/ contains:
gauquelin5/
    ├── data/
    │   ├── db/
    │   ├── output/
    │   ├── raw/
    │   └── tmp/
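
The rule for the values of section dirs (absolute path, or path relative to the root directory) can be illustrated as follows. This is a sketch of the rule only, not the code used by g5; resolve_dir and the example paths are hypothetical.

```shell
# Hypothetical helper: print the absolute path for a value of section dirs.
#   $1 = root directory
#   $2 = value read from config.yml
resolve_dir() {
    case "$2" in
        /*) printf '%s\n' "$2" ;;       # absolute path: used as is
        *)  printf '%s\n' "$1/$2" ;;    # relative path: anchored at the root directory
    esac
}

resolve_dir /home/me/gauquelin5 data/output    # prints /home/me/gauquelin5/data/output
resolve_dir /home/me/gauquelin5 /mnt/big/tmp   # prints /mnt/big/tmp
```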

db5

This section concerns the g5 database, which stores the data imported by the program.

In sub-section postgresql, specify the parameters used to connect to a local PostgreSQL database.

geonames

G5 uses geonames.org to match place names with geonames ids and to obtain better geographical information.
Geonames information is stored in a local PostgreSQL database generated by the geonames2postgres software.
Sub-section postgresql specifies the parameters used to connect to this database; they MUST be identical to the parameters used by geonames2postgres (see below).

openg

G5 generates a database that the openg software accesses via postgrest.
The two programs are designed to work together, so the settings of this section must match the corresponding settings of openg.

Prepare g5 database

You must have a database and credentials (user, password) corresponding to the values specified in config.yml.
The following instructions use psql, but the same can be done with other PostgreSQL clients, such as pgAdmin.
These instructions use the values (user name, database name, etc.) given in config.yml.dist; adapt them to the values used in your config.yml file.

So, if your config.yml contains these settings:
db5:
  postgresql: 
    dbhost: localhost
    dbport: 5432
    dbuser: g5_pg_username
    dbpassword: g5_pg_password
    dbname: g5_pg_dbname
    schema: g5_pg_schema
    
openg:
  postgrest:
    user: web_anon
The database is created with:
sudo -s -u postgres
psql 

postgres=# create user g5_pg_username;
postgres=# alter role g5_pg_username with createdb;
postgres=# alter user g5_pg_username with encrypted password 'g5_pg_password';
postgres=# alter user g5_pg_username set search_path to g5_pg_schema;
postgres=# create database g5_pg_dbname owner g5_pg_username;
postgres=# \c g5_pg_dbname
g5_pg_dbname=# create schema g5_pg_schema authorization g5_pg_username;
These additional steps allow the openg program to use the g5 database:
g5_pg_dbname=# create role web_anon nologin;
g5_pg_dbname=# grant usage on schema g5_pg_schema to web_anon;
g5_pg_dbname=# create role authenticator noinherit login password 'postgrest_password';
g5_pg_dbname=# grant web_anon to authenticator;
\q

exit    # exit postgres user
  • Note 1: if the value of g5_pg_schema is public (the PostgreSQL default schema), then the instructions:
    alter user g5_pg_username set search_path to g5_pg_schema;
    and
    create schema g5_pg_schema authorization g5_pg_username;
    are unnecessary.
  • Note 2: postgrest_password is not present in the g5 config.yml, but is present in the file postgrest.conf of openg.
To connect to the database:
psql -d g5_pg_dbname -U g5_pg_username -W -h localhost
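
As an alternative to the individual options, psql also accepts a connection URI. The sketch below builds one from the values of section db5 / postgresql; pg_uri is a hypothetical helper, and a password containing special characters would need to be percent-encoded first.

```shell
# Hypothetical helper: build a connection URI from the db5 / postgresql values.
#   $1 = dbuser  $2 = dbpassword  $3 = dbhost  $4 = dbport  $5 = dbname
pg_uri() {
    printf 'postgresql://%s:%s@%s:%s/%s\n' "$1" "$2" "$3" "$4" "$5"
}

uri=$(pg_uri g5_pg_username g5_pg_password localhost 5432 g5_pg_dbname)
echo "$uri"

# Once the database exists, the URI can be used for a quick check
# (left commented out because it needs a running server):
#   psql "$uri" -c 'select current_user, current_schema();'
```

If the search_path was set as described above, the commented query should report g5_pg_username and g5_pg_schema.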

Generate Geonames database

Install geonames2postgres

Prerequisites: geonames2postgres needs two libraries to run:
sudo apt install python3-yaml
sudo apt install python3-psycopg2
This software copies data from geonames.org into a local database. The data are located in two directories of download.geonames.org: export/dump/ and export/zip/. Files of these directories are first copied to the local machine, then stored in a PostgreSQL database, which is used by g5.

Clone the software:
git clone https://github.com/tig12/geonames2postgres.git

Configuration

cd geonames2postgres/
cp config.yml.dist config.yml
In config.yml, specify in dir-countries and dir-postal where the geonames.org files are stored on your machine.

You also need to specify the PostgreSQL connection parameters of the database where the data will be stored.

Note: these parameters MUST be the same as the parameters of section geonames / postgresql of the g5 config.

Download Geonames data on your local machine

This needs to be done for all countries handled by g5, for example with this bash script:
countries=( AT BE CH CL CZ DE DK DZ ES FR GB GF GP IT LU MA MC MQ MU NL PL RU SE TN US )
cd /path/to/countries   # replace by the path specified in config.yml
for i in "${countries[@]}"
do
    # -c resumes an interrupted download, -n keeps files that were already extracted
    wget -c "http://download.geonames.org/export/dump/$i.zip" && unzip -n "$i.zip"
done

cd /path/to/postal      # replace by the path specified in config.yml
for i in "${countries[@]}"
do
    wget -c "http://download.geonames.org/export/zip/$i.zip" && unzip -n "$i.zip"
done
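
After the downloads, it can be useful to check that no country was skipped, for instance because of a network error. The sketch below lists the country codes for which no .zip file is present in a directory; missing_zips is a hypothetical helper and the two paths are the same placeholders as in the script above.

```shell
# Hypothetical helper: print the country codes that have no <code>.zip file.
#   $1 = directory to inspect, remaining arguments = country codes
missing_zips() {
    _dir=$1
    shift
    for _c in "$@"; do
        [ -f "$_dir/$_c.zip" ] || printf '%s\n' "$_c"
    done
}

# Same country list as in the download script.
set -- AT BE CH CL CZ DE DK DZ ES FR GB GF GP IT LU MA MC MQ MU NL PL RU SE TN US

echo "Missing in countries directory:"
missing_zips /path/to/countries "$@"   # replace by the path specified in config.yml
echo "Missing in postal directory:"
missing_zips /path/to/postal "$@"      # replace by the path specified in config.yml
```

An empty list under each heading means that every archive was downloaded.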

Generate Geonames database

  1. Prepare a database with the credentials stored in the geonames2postgres config.yml, following the same steps as for the creation of the g5 database (sudo -s -u postgres etc.).
  2. Run geonames2postgres with all available countries to fill the database:
    python3 geonames2postgres.py ALL