Prerequisites
Before starting, you need to have the following installed on your machine:
- PHP; the current code uses PHP 8.3. See https://packages.sury.org/php/README.txt
    sudo apt install php8.3
  Some PHP extensions are also necessary:
    sudo apt install php8.3-{pgsql,yaml,mbstring,zip}
- PostgreSQL; the current code uses version 16.
    sudo apt install postgresql
- Python3, already available by default.
- Git
    sudo apt install git
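The prerequisites can be checked quickly from a terminal. The following is a minimal sketch, not part of g5: the function name missing_tools is hypothetical, and the sketch only checks that the commands exist in PATH, not their versions or the PHP extensions.

```shell
# Print the name of every command given as argument that is not found in PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Commands provided by the packages listed in the prerequisites.
missing=$(missing_tools php psql python3 git)
if [ -n "$missing" ]; then
    echo "Missing tools:" $missing
else
    echo "All required tools were found."
fi
```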
Install and configure g5
Open a terminal and clone the repository on your local machine:
    git clone --depth=1 https://github.com/tig12/g5.git
Note: --depth=1 is optional; it is only useful to save disk space and bandwidth.
Instead of cloning the repository, you can download the code.
In the rest of this doc, directory g5/ is called the root directory.
All the commands to run the program are issued from the root directory.
Directory structure
The important files and directories are:
    g5/
    ├── data/
    │   ├── db/
    │   └── raw/
    ├── docs/
    ├── src/
    ├── vendor/
    ├── config.yml.dist
    └── run-g5.php
The files you need to know about are:
- run-g5.php is the single entry point to use the program.
- data/ contains the data generated and manipulated by the program.
- config.yml.dist contains sample configuration directives.
Configuration
Create a file config.yml by copying config.yml.dist:
    cp config.yml.dist config.yml
Edit config.yml and adapt some values:
dirs
This section specifies unversioned directories containing data. The values can be either absolute paths or paths relative to the root directory.
Default values are all relative to the root directory:
    dirs:
      output: data/output
      tmp: data/tmp
At program installation, the data/ directory contains two sub-directories: db/ and raw/.
These directories contain data necessary to run g5, and are versioned with the program. Their locations are imposed and not configurable.
Other sub-directories of data/ are not versioned; they are ignored by git.
With the default values of section dirs, directory data/ contains:
    gauquelin5/
    ├── data/
    │   ├── db/
    │   ├── output/
    │   ├── raw/
    │   └── tmp/
db5
This section concerns the g5 database, used to store the data imported by the program.
In sub-section postgresql, specify the parameters used to connect to a local PostgreSQL database.
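As an illustration of how these parameters fit together, the sketch below assembles them into a PostgreSQL connection URI that psql accepts. The function name pg_uri is hypothetical, the values are those of config.yml.dist, and a password containing special characters would need percent-encoding, which this sketch does not handle.

```shell
# Build a connection URI from the db5/postgresql settings.
# Arguments: dbhost dbport dbuser dbpassword dbname
pg_uri() {
    printf 'postgresql://%s:%s@%s:%s/%s\n' "$3" "$4" "$1" "$2" "$5"
}

# Values from config.yml.dist; adapt them to your config.yml.
uri=$(pg_uri localhost 5432 g5_pg_username g5_pg_password g5_pg_dbname)
echo "$uri"
# Once the database exists, the settings can be tested with:
#   psql "$uri" -c 'select 1'
```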
geonames
G5 uses geonames.org to match place names to Geonames ids and to obtain better geographical information. Geonames information is stored in a local PostgreSQL database generated by the geonames2postgres software.
Section postgresql specifies the parameters used to connect to this database. They MUST be identical to the parameters used by geonames2postgres (see below).
openg
G5 generates a database that the openg software accesses via Postgrest. The two programs are designed to work together, so the settings of this section must match the corresponding settings of openg.
Prepare g5 database
You must have a database and credentials (user, password) corresponding to the values specified in the config. The following instructions use psql, but this can be done with other PostgreSQL clients, like pgAdmin.
These instructions use the values (user name, db name etc.) given in config.yml.dist; adapt them to the values you use in your config.yml file.
So, if your config.yml contains these settings:
    db5:
      postgresql:
        dbhost: localhost
        dbport: 5432
        dbuser: g5_pg_username
        dbpassword: g5_pg_password
        dbname: g5_pg_dbname
        schema: g5_pg_schema
    openg:
      postgrest:
        user: web_anon
The database creation is:
    sudo -s -u postgres
    psql
    postgres=# create user g5_pg_username;
    postgres=# alter role g5_pg_username with createdb;
    postgres=# alter user g5_pg_username with encrypted password 'g5_pg_password';
    postgres=# create database g5_pg_dbname owner g5_pg_username;
If the schema defined in config.yml is different from public:
    postgres=# \c g5_pg_dbname
    g5_pg_dbname=# create schema g5_pg_schema authorization g5_pg_username;
    g5_pg_dbname=# alter user g5_pg_username set search_path to g5_pg_schema;
The database is now ready to use with g5. To connect to the database:
    psql -d g5_pg_dbname -U g5_pg_username -W -h localhost
(to exit psql, type \q or ctrl d)
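The statements of this section can also be generated for your own settings instead of being typed at the psql prompt. The following is a minimal sketch under stated assumptions: the function name g5_db_sql is hypothetical and not part of g5, and the values must not contain single quotes or spaces.

```shell
# Print the SQL statements of this section for the given settings.
# Arguments: dbuser dbpassword dbname schema
g5_db_sql() {
    cat <<EOF
create user $1;
alter role $1 with createdb;
alter user $1 with encrypted password '$2';
create database $3 owner $1;
EOF
    # The schema statements are only needed when the schema is not "public".
    if [ "$4" != "public" ]; then
        cat <<EOF
\\c $3
create schema $4 authorization $1;
alter user $1 set search_path to $4;
EOF
    fi
}

# Preview the statements for the values of config.yml.dist.
g5_db_sql g5_pg_username g5_pg_password g5_pg_dbname g5_pg_schema
```

After checking the preview, the output can be piped to psql run as the postgres user, for example by appending "| sudo -u postgres psql" to the last command.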
Postgrest (optional)
Postgrest provides access to the g5 database through a REST API.
It is used by the openg program (a web application to browse the database, used to run opengauquelin.org).
    sudo -s -u postgres
    psql
    postgres=# \c g5_pg_dbname
    g5_pg_dbname=# create role postgrest_anonymous_user nologin;
    g5_pg_dbname=# grant usage on schema g5_pg_schema to postgrest_anonymous_user;
    g5_pg_dbname=# create role authenticator noinherit login password 'postgrest_password';
    g5_pg_dbname=# grant postgrest_anonymous_user to authenticator;
Note: postgrest_anonymous_user (web_anon in the Postgrest doc) and postgrest_password are not present in the g5 config.yml, but are present in file postgrest.conf of openg.
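To show where the roles created above end up, the sketch below prints the three Postgrest settings that usually carry them: db-uri, db-schemas and db-anon-role. This is an assumption about what a matching postgrest.conf could look like, not a copy of the file shipped with openg. The function name postgrest_conf is hypothetical, and the host and port are assumed to be localhost and 5432.

```shell
# Print a minimal Postgrest configuration for the roles created above.
# Arguments: authenticator_password dbname schema anonymous_role
postgrest_conf() {
    cat <<EOF
db-uri = "postgres://authenticator:$1@localhost:5432/$2"
db-schemas = "$3"
db-anon-role = "$4"
EOF
}

# Values used in this section; adapt them to your own settings.
postgrest_conf postgrest_password g5_pg_dbname g5_pg_schema postgrest_anonymous_user
```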
Generate Geonames database
Here, "Geonames database" designates an auxiliary database used to associate places to geonames.org. This is required because some steps of g5 database creation need it.
Install geonames2postgres
Prerequisites: geonames2postgres needs two libraries to run:
    sudo apt install python3-{yaml,psycopg2}
This software stores in a local database the data copied from geonames.org, which are located in two directories:
- http://download.geonames.org/export/dump, containing cities and administrative areas.
- http://download.geonames.org/export/zip, containing postal codes.
Clone the software:
git clone https://github.com/tig12/geonames2postgres.git
Configuration
cd geonames2postgres/
cp config.yml.dist config.yml
In config.yml, specify in dir-countries and dir-postal where the geonames.org files are stored on your machine.
You also need to specify the PostgreSQL connection parameters of the database where the data are stored.
Note: these parameters MUST be the same as the parameters geonames/postgres of the g5 config.
Download Geonames data on your local machine
This needs to be done for all the countries handled by g5, for example with this script:
    countries=( AT BE CH CL CZ DE DK DZ ES FR GB GF GP IT LU MA MC MQ MU NL PL RU SE TN US )
    # Cities and administrative areas
    cd /path/to/countries # replace by the path specified in config.yml
    for i in "${countries[@]}"
    do
        wget -c "http://download.geonames.org/export/dump/$i.zip" && unzip -n "$i.zip"
    done
    # Postal codes
    cd /path/to/postal # replace by the path specified in config.yml
    for i in "${countries[@]}"
    do
        wget -c "http://download.geonames.org/export/zip/$i.zip" && unzip -n "$i.zip"
    done
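Since a failed download is easy to miss in a long loop, the result can be checked afterwards. The following is a minimal sketch; it assumes that each archive extracts a file named after the country code (for example FR.txt), and the function name missing_countries is hypothetical.

```shell
# Print every country code whose extracted file CODE.txt is absent from a directory.
# Arguments: directory code [code ...]
missing_countries() {
    dir=$1
    shift
    for code in "$@"; do
        [ -f "$dir/$code.txt" ] || printf '%s\n' "$code"
    done
}

# Replace /path/to/countries by the path specified in config.yml,
# and list all the country codes used in the download script.
missing_countries /path/to/countries AT BE CH FR
```

An empty output means that all the listed countries were found; the same check applies to the postal codes directory.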
Generate Geonames database
- Prepare a database, similar to the creation of the g5 database (sudo -s -u postgres etc.), with the credentials stored in file config.yml of geonames2postgres.
- Execute geonames2postgres with all available countries to fill the database:
    python3 geonames2postgres.py ALL