Gauquelin5 installation

This software has been developed and tested on Linux. It should also work on Windows and macOS, but this has not been tested.
This installation guide has been tested on Debian 12.

Prerequisites

Before starting, the following must be installed on your machine:
  • PHP; the current code uses PHP 8.3. See https://packages.sury.org/php/README.txt to add a package repository providing this version.
    sudo apt install php8.3
    Some PHP extensions are also necessary:
    sudo apt install php8.3-{pgsql,yaml,mbstring,zip}
  • PostgreSQL; the current code uses version 16.
    sudo apt install postgresql
  • Python 3, already available by default.
  • Git
    sudo apt install git
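
Before going further, it can help to check that the prerequisites listed above are actually present. The following is a minimal sketch, not part of g5: the function names missing_commands and missing_php_extensions are hypothetical, and the extension names are assumed to appear in the output of php -m under the same names as in the apt packages.

```shell
# Print each command given as argument that is not found in PATH.
missing_commands() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
    done
}

# Print each PHP extension given as argument that `php -m` does not list.
missing_php_extensions() {
    loaded="$(php -m 2>/dev/null)"
    for ext in "$@"; do
        printf '%s\n' "$loaded" | grep -qixF "$ext" || echo "$ext"
    done
}

# Commands and extensions taken from the prerequisites list above.
missing_commands php psql python3 git
if command -v php >/dev/null 2>&1; then
    missing_php_extensions pgsql yaml mbstring zip
fi
```

If nothing is printed, all the commands and extensions were found.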

Install and configure g5

Open a terminal and clone the repository to your local machine:
git clone --depth=1 https://github.com/tig12/g5.git
Note: --depth=1 is optional; it only saves disk space and bandwidth.

Instead of cloning the repository, you can download the code.
In the rest of this doc, directory g5/ is called the root directory.
All the commands to run the program are issued from the root directory.

Directory structure

The important files and directories are:
g5/
    ├── data/
    │   ├── db/
    │   └── raw/
    ├── docs/
    ├── src/
    ├── vendor/
    ├── config.yml.dist
    └── run-g5.php
The files and directories you need to know about are:
  • run-g5.php is the single entry point of the program.
  • data/ contains the data generated and manipulated by the program.
  • config.yml.dist contains sample configuration directives.

Configuration

Create a file config.yml by copying config.yml.dist:
cp config.yml.dist config.yml
Edit config.yml and adapt the values of the following sections:

dirs

This section specifies the unversioned directories containing data.
The values can be either absolute paths or paths relative to the root directory.
The default values are all relative to the root directory:
dirs:
  output: data/output
  tmp:    data/tmp
After installation, directory data/ contains two sub-directories: db/ and raw/.
These directories contain data necessary to run g5, and are versioned with the program. Their locations are fixed and cannot be configured.

The other sub-directories of data/ are not versioned; they are ignored by git.

With the default values of section dirs, directory data/ contains:
g5/
    ├── data/
    │   ├── db/
    │   ├── output/
    │   ├── raw/
    │   └── tmp/
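
The path rule of section dirs (absolute paths are used as they are, other paths are taken relative to the root directory) can be illustrated with a small shell sketch. The function name resolve_dir and the example paths are hypothetical; they are not part of g5, and only Unix-style paths are considered.

```shell
# resolve_dir ROOT PATH: print PATH unchanged if it is absolute,
# otherwise print it relative to ROOT.
resolve_dir() {
    case "$2" in
        /*) printf '%s\n' "$2" ;;
        *)  printf '%s\n' "${1%/}/$2" ;;
    esac
}

resolve_dir /home/user/g5 data/output   # prints /home/user/g5/data/output
resolve_dir /home/user/g5 /mnt/big/tmp  # prints /mnt/big/tmp
```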

db5

This section concerns the g5 database, used to store the data imported by the program.

In sub-section postgresql, specify the parameters used to connect to a local PostgreSQL database.

geonames

G5 uses geonames.org to match place names to geonames ids and to obtain better geographical information.
Geonames information is stored in a local postgres database generated by the geonames2postgres software.
Section postgresql specifies the parameters used to connect to this database. They MUST be identical to the parameters used by geonames2postgres (see below).

openg

G5 generates a database that the openg software accesses via postgrest.
The two programs are designed to work together, so the settings of this section must match the corresponding settings of openg.

Prepare g5 database

You must have a database and credentials (user, password) corresponding to the values specified in config.yml.
The following instructions use psql, but the same can be done with other PostgreSQL clients, such as pgAdmin.
These instructions use the values (user name, database name, etc.) given in config.yml.dist; adapt them to the values used in your config.yml file.

So, if your config.yml contains these settings:
db5:
  postgresql: 
    dbhost: localhost
    dbport: 5432
    dbuser: g5_pg_username
    dbpassword: g5_pg_password
    dbname: g5_pg_dbname
    schema: g5_pg_schema
    
openg:
  postgrest:
    user: web_anon
The database is created with:
sudo -s -u postgres
psql 

postgres=# create user g5_pg_username;
postgres=# alter role g5_pg_username with createdb;
postgres=# alter user g5_pg_username with encrypted password 'g5_pg_password';
postgres=# create database g5_pg_dbname owner g5_pg_username;
If the schema defined in config.yml is different from public:
postgres=# \c g5_pg_dbname
g5_pg_dbname=# create schema g5_pg_schema authorization g5_pg_username;
g5_pg_dbname=# alter user g5_pg_username set search_path to g5_pg_schema;
The database is now ready to use with g5. To connect to the database:
psql -d g5_pg_dbname -U g5_pg_username -W -h localhost
(to exit psql, type \q or press Ctrl+D)
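
The statements above can also be generated from the configuration values and piped to psql, which avoids typing them one by one. This is only a sketch under assumptions: the function g5_db_sql is hypothetical (it is not provided by g5), the values are assumed to be simple identifiers, and the password must not contain a single quote.

```shell
# g5_db_sql USER PASSWORD DBNAME SCHEMA: print the statements shown above,
# filled in with the given values.
g5_db_sql() {
    cat <<EOF
create user $1;
alter role $1 with createdb;
alter user $1 with encrypted password '$2';
create database $3 owner $1;
EOF
    # The schema statements are only needed when the schema is not "public".
    if [ "$4" != "public" ]; then
        cat <<EOF
\c $3
create schema $4 authorization $1;
alter user $1 set search_path to $4;
EOF
    fi
}

# Possible usage, with the values of config.yml.dist:
# g5_db_sql g5_pg_username g5_pg_password g5_pg_dbname g5_pg_schema \
#     | sudo -u postgres psql -v ON_ERROR_STOP=1
```

Reviewing the printed statements before piping them to psql is a reasonable precaution, since they create a role and a database.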

Postgrest (optional)

Postgrest gives access to the g5 database through a REST API.
It is used by the openg program (a web application to browse the database, used to run opengauquelin.org).
sudo -s -u postgres
psql 

postgres=# \c g5_pg_dbname

g5_pg_dbname=# create role postgrest_anonymous_user nologin;
g5_pg_dbname=# grant usage on schema g5_pg_schema to postgrest_anonymous_user;
g5_pg_dbname=# create role authenticator noinherit login password 'postgrest_password';
g5_pg_dbname=# grant postgrest_anonymous_user to authenticator;
  • Note: postgrest_anonymous_user (web_anon in the postgrest documentation) and postgrest_password are not present in g5 config.yml, but are present in file postgrest.conf of openg.

Generate Geonames database

Here, "Geonames database" refers to an auxiliary database used to associate places with geonames.org.
It is required by some steps of the g5 database creation.

Install geonames2postgres

Prerequisites: geonames2postgres needs two libraries to run:
sudo apt install python3-{yaml,psycopg2}
This software copies data from two directories of geonames.org into a local database. The files of these directories are first copied to the local machine, then stored in a postgres database, which is used by g5.

Clone the software:
git clone https://github.com/tig12/geonames2postgres.git

Configuration

cd geonames2postgres/
cp config.yml.dist config.yml
In config.yml, use dir-countries and dir-postal to specify where the geonames.org files are stored on your machine.

You also need to specify the PostgreSQL connection parameters of the database where the data are stored.

Note: these parameters MUST be the same as the parameters of section geonames/postgresql of the g5 config.

Download Geonames data on your local machine

This needs to be done for all countries handled by g5, for example with this script:
countries=( AT BE CH CL CZ DE DK DZ ES FR GB GF GP IT LU MA MC MQ MU NL PL RU SE TN US )
cd /path/to/countries   # replace with the path specified in config.yml
for i in "${countries[@]}"
do
    # -c resumes a partial download; -n keeps files that were already extracted
    wget -c "http://download.geonames.org/export/dump/$i.zip" && unzip -n "$i.zip"
done

cd /path/to/postal      # replace with the path specified in config.yml
for i in "${countries[@]}"
do
    wget -c "http://download.geonames.org/export/zip/$i.zip" && unzip -n "$i.zip"
done
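
After the downloads, it may be worth checking that every country was actually extracted before filling the database. The sketch below is not part of geonames2postgres: the function missing_country_files is hypothetical, and it assumes that each archive extracts a file named after the country code (for example FR.txt), which should be verified against the files present in your directories.

```shell
# missing_country_files DIR CODE...: print each country code for which
# DIR/CODE.txt does not exist or is empty.
missing_country_files() {
    dir="$1"
    shift
    for code in "$@"; do
        [ -s "$dir/$code.txt" ] || echo "$code"
    done
}

# Possible usage (replace the path with the one specified in config.yml):
# missing_country_files /path/to/countries AT BE CH CL CZ DE
```

In bash, the countries array of the download script can be passed directly as "${countries[@]}". An empty output means that a non-empty file was found for every code.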

Generate Geonames database

  1. Prepare a database, in the same way as for the g5 database (sudo -s -u postgres etc.), with the credentials stored in file config.yml of geonames2postgres.
  2. Execute geonames2postgres with all available countries to fill the database:
    python3 geonames2postgres.py ALL