In-Sylva Information System
Short architecture description
- In-Sylva Information System relies on Docker.
- It is built on a microservices architecture.
- Each microservice runs independently in its own Docker container.
- You will find information about each microservice in its respective repository's README.md.
- You will find schematics explaining the project's architecture and databases in the ./documentation folder.
- In-Sylva Information System access is managed with Keycloak authentication.
- In-Sylva Information System has been successfully tested on Debian (9 and 10) hosts.
Requirements
- docker >= 17.12.0
- docker-compose
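To verify that the installed versions meet these requirements:

```bash
docker --version          # should report 17.12.0 or later
docker-compose --version
```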
Getting source code
To download this repository, you can use the following command:
git clone https://forgemia.inra.fr/in-sylva-development/in-sylva.information-system.git
You will find the other microservices' repositories in the in-sylva development GitLab group. Use git clone to download each project's source code if you want to inspect or modify it.
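For example, a sketch of cloning one of them, where <repository-name> is a placeholder for an actual project in the group:

```bash
# <repository-name> is a placeholder; pick a project from the in-sylva development group
git clone https://forgemia.inra.fr/in-sylva-development/<repository-name>.git
```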
Build project
Local development
Execute this command to build docker images for development:
./build.sh -k id_rsa -e dev
For production
Execute this command to build docker images for production:
./build.sh -k id_rsa -e prod -d <url> -ip <IP_address> -p <port>
where you set:
- <url> as the access URL (e.g., http://www.mydomain.world/insylva/)
- <IP_address> as the IP address of the server on which the in-sylva applications are running
- <port> as the port number of the server on which the in-sylva applications are running
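For instance, a production build might look like the following; the domain, IP address, and port are illustrative placeholders, not real deployment values:

```bash
./build.sh -k id_rsa -e prod -d http://www.mydomain.world/insylva/ -ip 203.0.113.10 -p 443
```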
SSL Certificates
Production and pre-production
To handle SSL certificates, the reverse-proxy (nginx) service uses the certbot tool to generate and renew the certificates on production servers.
Certificates are stored in the ssl_certificates/pem directory.
The certificates are renewed on production and preproduction every day at 2 A.M. using a cron job. The cron job is defined in the crons/certificate_auto_renewer_installer file.
On deployment, the playbook .ansible/playbook.yml will execute the crons/certificate_auto_renewer_installer script to install the cron job.
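For reference, the installed entry presumably resembles a daily 2 A.M. certbot renewal; the sketch below is an assumption, the actual entry lives in crons/certificate_auto_renewer_installer:

```bash
# hypothetical crontab entry; see crons/certificate_auto_renewer_installer for the real one
0 2 * * * certbot renew --quiet
```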
Local development
For local development, the reverse-proxy service uses self-signed certificates. These certificates must be stored in the ssl_certificates/pem directory.
To generate them, you can use the following command:
docker compose -f docker-compose.certs.yml up install-certs-dev
It will generate the certificates and store them in the ssl_certificates/pem directory.
Run project
The first time start-in-sylva.sh is executed, a .env file is created.
The script will exit, inviting you to edit this file with your own values.
This step is mandatory, as the file contains the necessary configuration for each microservice.
The .env file contains an explanation for each value, so take time to understand them; otherwise the project will not work properly.
⚠️ The project will need to be rebuilt after editing environment variables.
So the first time you want to run this project, you should:
- Execute ./start-in-sylva.sh (note: if it is not executable, run chmod +x start-in-sylva.sh)
- Edit the .env configuration file
- Build the project
- Follow the instructions below
After that, you will need to run ./start-in-sylva.sh to start the project.
At this point, all microservices' containers should be running, but not fully functional yet.
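Putting the first run together, a minimal session for the development environment could look like this (commands taken from the steps above):

```bash
chmod +x start-in-sylva.sh   # only if the script is not executable
./start-in-sylva.sh          # first run: creates .env, then exits
# edit .env with your own values
./build.sh -k id_rsa -e dev  # rebuild after editing environment variables
./start-in-sylva.sh          # start all microservices' containers
```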
Keycloak configuration
- Go to pgAdmin and log in using credentials from the .env file (PGADMIN_DEFAULT_EMAIL and PGADMIN_DEFAULT_PASSWORD)
- Create access to the postgres server:
  - Click on Add New Server
  - Add a name in the Name field (e.g., insylva)
  - In the Connection tab, add the postgres container's IP address in Host name/address. Two ways to find it:
    - Go to portainer's containers list, then find the in-sylva.postgres row and the IP Address column
    - Or run ip address as root on the host container
  - In the Username and Password fields, add the corresponding credentials from the .env file (POSTGRES_USER and POSTGRES_PASSWORD)
  - Click Save
- Then open a query tab on the keycloak database (public schema) and execute this SQL query (a command-line alternative is sketched after this list):
  update REALM set ssl_required = 'NONE' where id = 'master';
- Restart the keycloak container using portainer
- Connect to keycloak using credentials from the .env file (KEYCLOAK_USER and KEYCLOAK_PASSWORD)
- On the page's top-left corner, click on Master, select the Add Realm button, and import the realm-export.json file located in the ./keycloak/ subfolder.
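If you prefer the command line to pgAdmin for the SQL step above, here is a minimal sketch, assuming the database is named keycloak and substituting the POSTGRES_USER value from your .env file (an alternative, not the documented procedure):

```bash
# <POSTGRES_USER> is the value from your .env file
docker exec -it in-sylva.postgres \
  psql -U <POSTGRES_USER> -d keycloak \
  -c "update REALM set ssl_required = 'NONE' where id = 'master';"
```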
Admin user creation
Create an admin user for the system. This step is mandatory to access the portal.
- In a terminal, execute:
  curl --location --request POST 'http://localhost:4000/user/create-system-user'
- Restart the login container using portainer
Upload in-sylva standard
- Connect to the portal using credentials given in .env (IN_SYLVA_ADMIN_USERNAME and IN_SYLVA_ADMIN_PASSWORD)
- In the Fields tab, you can upload a standard in CSV format. Note: a version of this file can be found here.
Application access
Portal tool
The portal is accessible:
- at http://localhost:3000/portal for the development environment
- at the URL set as a build parameter for production (e.g., http://www.mydomain.world/si/portal)
This application allows you to:
- Access in-sylva microservices tools: Portainer, PgAdmin, Kibana, mongo-express, Elasticsearch, Keycloak
- Manage in-sylva administration (users' accounts, roles and groups, sources, policies)
- Upload metadata records to the system
Search tool
The search tool is accessible:
- at http://localhost:3001/search for the development environment
- at the URL set as a build parameter for production (e.g., http://www.mydomain.world/si/search)
This application allows you to:
- Search for metadata records in the catalog (basic and advanced search)
- Export metadata records after a specific search
Data dump and restore
Scripts used to dump and restore data are provided in the dump_restore_tools directory.
According to your own backup policy, you can use insylva_bdds_dump_all.sh to dump all data from the microservices of the SI (postgres, mongodb, and elasticsearch).
The result of the dump procedure is a .tar archive stored in the dump_restore_tools directory.
On the hosting machine, you can install a cron job that runs the insylva_bdds_dump_all.sh script each day.
The crontab may contain several lines, and you MUST adapt these with the full path to your in-sylva SI installation:

    # for monthly dump
    0 0 1 * * bash -c 'cd dump_restore_tools && ./insylva_bdds_dump_all.sh'
    # each Friday, generate the weekly dump (replaced each week)
    10 0 * * 5 bash -c 'cd dump_restore_tools && ./insylva_bdds_dump_all.sh'
    # each day (Monday to Thursday), generate the daily dump
    30 0 * * 1-4 bash -c 'cd dump_restore_tools && ./insylva_bdds_dump_all.sh'
    # (optional) synchronize dump storage to an S3 resource (see below for configuration)
    30 1 * * 1-5 bash -c 'cd dump_restore_tools && ./send_dumps_to_s3.sh'
S3 configuration file
For the last point (synchronizing dump archives to an S3 storage), you have to create a file named s3config_file in the dump_restore_tools directory.
This file should be generated with the command:
s3cmd --configure -c dump_restore_tools/s3config_file
If you decide to activate this command, you will have, on your S3 resource, exactly the same dump files as in the dump_restore_tools/DUMPS directory.
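As an illustration, the synchronization performed by send_dumps_to_s3.sh presumably amounts to something like the following sketch; <bucket> is a placeholder and the actual logic lives in the script:

```bash
# hypothetical equivalent of send_dumps_to_s3.sh; <bucket> is a placeholder
s3cmd -c dump_restore_tools/s3config_file sync dump_restore_tools/DUMPS/ s3://<bucket>/DUMPS/
```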
The script insylva_bdds_restore_all.sh can be used to restore an archive. To properly restore data, you have to start from a new installation: redo all the above installation and setup procedures, then run the restore script and follow the instructions given at the end to restart the microservices' containers.
Health check for elasticsearch
After a reboot, the search-api container needs to be restarted once the elasticsearch container is fully started. This is done automatically by a script executed after reboot via crontab.
To set this up on a new host, add the following line to your crontab:
@reboot /usr/local/insylva/in-sylva.information-system/tools/restart_search_api.sh
If you encounter a problem with the search tool (e.g., results are empty), you can also manually run this script.
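For example, assuming the installation path used in the crontab line above (sudo may be required depending on your docker setup):

```bash
sudo /usr/local/insylva/in-sylva.information-system/tools/restart_search_api.sh
```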
Attention
In most cases, you do not need to generate a Certificate Authority (a pem file with openssl) nor to edit docker-compose.yml.
If you want to change those files and settings, please read the instructions below carefully.
For production workloads, make sure the host setting vm.max_map_count
is set to at least 262144.
On the Open Distro for Elasticsearch Docker image, this setting is the default.
To check this, start a Bash session in the container and run: cat /proc/sys/vm/max_map_count
To increase this value, you have to modify the host operating system.
On the RPM installation, you can add the following line at the end of the host machine /etc/sysctl.conf
file:
vm.max_map_count=262144
Then run sudo sysctl -p
to reload.
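To apply the new value immediately (until the next reboot) and verify it:

```bash
sudo sysctl -w vm.max_map_count=262144   # apply now, without rebooting
sysctl vm.max_map_count                  # check the current value
```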
This value is checked when you run the build.sh script. A warning message will be displayed if vm.max_map_count is incompatible with the Open Distro for Elasticsearch Docker image.
The docker-compose.yml file also contains several key settings: bootstrap.memory_lock=true, ES_JAVA_OPTS=-Xms512m -Xmx512m, nofile 65536, and port 9600.
These settings respectively:
- Disable memory swapping (along with memlock)
- Set the size of the Java heap (we recommend half of system RAM)
- Set a limit of 65536 open files for the Elasticsearch user
- Allow you to access Performance Analyzer on port 9600
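To check from the host that these settings were applied to the running container, a hedged sketch; the container name in-sylva.elasticsearch is an assumption based on the naming pattern of the other containers:

```bash
# container name is assumed; adjust to the name shown in portainer
docker inspect in-sylva.elasticsearch --format '{{.HostConfig.Ulimits}}'    # memlock and nofile limits
docker exec in-sylva.elasticsearch env | grep ES_JAVA_OPTS                  # Java heap size
```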