**Translated by @xing**
Foreword
The official Join PlatON Network document suggests using `nohup` and `&` to run the command in the background when starting a validator node. It is a simple solution, but it also has some problems:
- To terminate the process, you need to find the PID with the `ps` command and then terminate the process with the `kill` command. But `kill` can easily terminate the wrong process by accident, which may interrupt the running node while it reads or writes the database. In that case, the only fix is to re-sync the node.
- The log cannot be split, so the log file keeps growing and puts pressure on disk space.
- The process does not restart automatically when the node exits with an error, which causes the node to go offline.
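For reference, the manual workflow described above looks roughly like the following. This is only a sketch: `sleep 300` stands in for the real node command, and the temporary directory and the `node.pid` file are hypothetical names introduced here. Recording the PID at launch is one way to avoid looking it up with `ps` and killing the wrong process.

```shell
# Sketch of the manual nohup workflow; `sleep 300` is a placeholder for the node command.
workdir=$(mktemp -d)

# Start in the background, immune to hangups, with all output going to one log file.
nohup sleep 300 > "$workdir/node.log" 2>&1 &

# Record the PID right away instead of searching for it later with ps.
echo $! > "$workdir/node.pid"
node_pid=$(cat "$workdir/node.pid")

# Later: stop exactly that process and wait for it to exit.
kill "$node_pid"
wait "$node_pid" 2>/dev/null || true
```

Even with the PID recorded, nothing here splits `node.log` or restarts the process after a crash.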
This post shows how to use the open-source tool `supervisor` to manage the PlatON node process and avoid the issues above.
About supervisor
Supervisor is a process management tool written in Python that makes it easy to start, restart, and shut down processes.
Besides managing a single process, it can also manage multiple processes at the same time. For example, if the server unfortunately goes down and all processes stop, Supervisor can start all of them again at the same time.
Note: All following steps are demonstrated on Ubuntu 18.04
Installing Supervisor
Choose one of the following options.
Install with `apt` (Recommended)
sudo apt update && sudo apt install -y supervisor
Install with `pip`
pip install supervisor
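After either install method, it can help to confirm that both executables are on the `PATH`. The sketch below uses a hypothetical helper named `check_tool` (not part of Supervisor); the paths it prints depend on how Supervisor was installed.

```shell
# Report where a command is installed, or that it is missing.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: $(command -v "$1")"
  else
    echo "$1: not found"
    return 1
  fi
}

# Both the server and the client should be found after installation.
check_tool supervisord || true
check_tool supervisorctl || true
```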
Files installed by Supervisor
Supervisor has many configuration parameters, and we will only use some of them. To learn about Supervisor configuration in detail, please refer to the official doc here.
- Two executables: `supervisord` is the server side (/usr/bin/supervisord); `supervisorctl` is the client side (/usr/bin/supervisorctl).
- Two kinds of configuration: the `supervisord` main configuration file (/etc/supervisor/supervisord.conf), and the per-process configuration files under /etc/supervisor/conf.d/.
The main configuration file `supervisord.conf` does not need to be modified by default. To manage the PlatON process with `supervisor`, we need to create a PlatON process management configuration file under /etc/supervisor/conf.d/; we can name it `platon.conf`.
The file name must end with `.conf`.
To make management even easier, we can write a PlatON startup script. Let's name it `start.sh` and put it under /root/platon-node:
#!/bin/bash
dir=/root/platon-node
mode="fast"
# The following is a demo command; replace it with the command for your own node.
# exec replaces the shell with platon, so signals sent by supervisor reach the node directly.
exec /usr/bin/platon --identity platon --datadir "$dir/data" --port 16789 --testnet --rpcport 6789 --rpcapi "db,platon,net,web3,admin,personal" --rpc --nodekey "$dir/data/nodekey" --cbft.blskey "$dir/data/blskey" --verbosity 3 --rpcaddr 127.0.0.1 --syncmode "$mode"
Run `chmod +x /root/platon-node/start.sh` to make the script executable.
`platon.conf` example:
[program:platon]
directory=/root/platon-node ;Working directory for the process
command=bash start.sh ;Command that starts the process
autostart = true ;Start the process when `supervisord` starts
startsecs = 5 ;The process counts as successfully started if it stays up for 5 seconds
startretries = 1 ;Number of start retries, default value is 3
autorestart = unexpected ;[unexpected,true,false] Default value is `unexpected`, which restarts the process only when it exits with an unexpected exit code
user = root ;User that runs the process
redirect_stderr = true ;Redirect `stderr` to `stdout`, default `false`
stdout_logfile=/var/log/platon.out.log ;Log path
stdout_logfile_maxbytes = 30MB ;Split log config: maximum size of the `stdout` log file, default value is 50MB
stdout_logfile_backups = 20 ;Split log config: number of `stdout` log backup files to keep
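With the two split-log settings above, the worst-case disk usage of the PlatON log can be estimated as the size limit times the number of backups plus the one active file. A quick back-of-the-envelope check (the variable names are just for illustration):

```shell
max_mb=30     # stdout_logfile_maxbytes, in MB
backups=20    # stdout_logfile_backups

# One active log file plus the retained backups.
total_mb=$(( max_mb * (backups + 1) ))
echo "worst-case log usage: ${total_mb} MB"
```

Lowering either value reduces the disk footprint at the cost of keeping less history.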
Start Supervisor
systemctl enable supervisor # Start supervisor automatically at system boot
systemctl start supervisor # Start supervisor
supervisorctl reload # Restart supervisord so that it loads the management configuration
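When only `platon.conf` changes later on, `supervisorctl reread` followed by `supervisorctl update` is a lighter alternative that applies the change without restarting `supervisord` itself. The sketch below wraps these in a hypothetical helper, `apply_platon_config`, and assumes the program is named `platon` as in the example configuration.

```shell
# Apply configuration changes without restarting supervisord itself.
apply_platon_config() {
  if ! command -v supervisorctl >/dev/null 2>&1; then
    echo "supervisorctl not found" >&2
    return 1
  fi
  # Re-read the configuration files and report what changed.
  supervisorctl reread || return 1
  # Add, remove, or restart only the programs whose configuration changed.
  supervisorctl update || return 1
  # Show the current state of the platon program.
  supervisorctl status platon
}
```

Call `apply_platon_config` (as root or with `sudo` privileges) after editing the file.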
PlatON process management commands
supervisorctl start platon # Start
supervisorctl stop platon # Stop
supervisorctl restart platon # Restart
supervisorctl # Open the interactive management shell
Log
tail /var/log/supervisor/supervisord.log # Check whether the PlatON process started
tail /var/log/platon.out.log # Log of the PlatON process
Note: Check the `supervisord` log file and the program log files frequently. When a process crashes or throws an exception, it writes the details to `stderr`, so you can check the related log file to identify the issue.
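To make that routine check quicker, a small filter can pull out only the suspicious lines. This is a sketch with a hypothetical helper, `recent_errors`; the keywords it matches (`error`, `fatal`, `crit`) are an assumption about what the node log contains, so adjust them to your actual output.

```shell
# Print the last N lines (default 20) of a log that mention an error-like keyword.
recent_errors() {
  log_file=$1
  count=${2:-20}
  grep -iE 'error|fatal|crit' "$log_file" | tail -n "$count"
}
```

For example, `recent_errors /var/log/platon.out.log 50` shows up to the 50 most recent matching lines.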
Common issue
Error `unix:///var/lock/supervisor.sock no such file` when running `supervisorctl status`
This error occurs when you try to use `supervisorctl` before `supervisor` has been started.
Alternatively, you might need to create the file `supervisor.sock` under /var/lock/ to solve the issue.
touch /var/lock/supervisor.sock
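Before reaching for a workaround, it may be worth checking whether the socket actually exists. The sketch below uses a hypothetical helper, `check_socket`. Note that the `-S` test is true only for a real Unix socket, which `supervisord` creates when it starts.

```shell
# Report whether a Unix socket exists at the given path.
check_socket() {
  sock=$1
  if [ -S "$sock" ]; then
    echo "socket present: $sock"
  else
    echo "no socket at $sock"
    return 1
  fi
}
```

For example, `check_socket /var/lock/supervisor.sock || systemctl start supervisor` starts the service only when the socket is missing; use the socket path from your own error message.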