Managing PlatON Process with Supervisor

**Translated by @xing**

Foreword

The official document Join PlatON Network suggests using nohup and & to run the validator node in the background. It’s a simple solution, but it has some problems:

  • To stop the process, you have to find its PID with the ps command and then terminate it with the kill command. But kill can easily terminate the wrong process by accident, which may interrupt the running node's database reads and writes; in that case, the only fix is to re-sync the node.
  • The log cannot be split, so the log file keeps growing and puts pressure on disk space.
  • The process won’t restart automatically when the node exits with an error, so the node goes offline.
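For reference, the manual workflow behind these problems looks roughly like the following sketch. It is only an illustration: `sleep 300` stands in for the real node command, and the temporary directory and file names are made up for the example. Recording the PID at start time avoids guessing it from `ps` output later, but nothing here rotates the log or restarts the process after a crash.

```shell
#!/bin/bash
# Illustration only: `sleep 300` stands in for the real node command,
# and the working directory and file names are hypothetical.
workdir=$(mktemp -d)

# Start the process in the background, detached from the terminal.
nohup sleep 300 > "$workdir/node.log" 2>&1 &
echo $! > "$workdir/node.pid"          # $! is the PID of the last background job

# Later, stop exactly that process instead of searching through `ps` output.
pid=$(cat "$workdir/node.pid")
kill "$pid"                            # sends SIGTERM
wait "$pid" 2>/dev/null || true        # reap the child; its exit status is non-zero after SIGTERM

if kill -0 "$pid" 2>/dev/null; then    # kill -0 only tests whether the PID exists
    status="still running"
else
    status="stopped"
fi
echo "$status"
rm -rf "$workdir"
```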

This post shows how to use the open-source tool Supervisor to manage the PlatON node process and avoid the issues above.

About supervisor

Supervisor is a process management tool written in Python. With it you can easily start, restart, and shut down a process.

Besides managing a single process, it can also manage multiple processes at the same time. For example, if the server unfortunately goes down and all processes shut down, Supervisor can start all of those processes again at the same time.

Note: All of the following steps were demonstrated on Ubuntu 18.04.

Installing Supervisor

Official doc

Choose one of the following options

Install with apt (Recommended)

sudo apt update && sudo apt install -y supervisor

Install with pip

pip install supervisor

Files installed with Supervisor

Supervisor has many configuration parameters, and we will only use some of them. To learn more about Supervisor configuration, see the official doc here.

  1. Two executables: supervisord is the server side (/usr/bin/supervisord), and supervisorctl is the client side (/usr/bin/supervisorctl).
  2. Two sets of configuration: the main supervisord configuration file (/etc/supervisor/supervisord.conf), and the per-process configuration files under /etc/supervisor/conf.d/.

The main configuration file supervisord.conf doesn’t need to be modified by default. To manage the PlatON process with Supervisor, we need to create a process configuration file under /etc/supervisor/conf.d/; we can name it platon.conf.

The file name must end with .conf.

To make management even easier, we can write a PlatON startup script. Let’s name it start.sh and put it under /root/platon-node:

#!/bin/bash

dir=/root/platon-node
mode="fast"

# The following is a demo command; replace it with the command for your own node
/usr/bin/platon --identity platon --datadir $dir/data --port 16789 --testnet --rpcport 6789 --rpcapi "db,platon,net,web3,admin,personal" --rpc --nodekey $dir/data/nodekey --cbft.blskey $dir/data/blskey --verbosity 3 --rpcaddr 127.0.0.1 --syncmode $mode

Run chmod +x /root/platon-node/start.sh to make the script executable.
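One detail worth checking with a wrapper script like this: `command=bash start.sh` makes Supervisor track the `bash` process, and the node typically runs as its child, so a stop signal goes to the shell rather than directly to the node. A common shell idiom, not part of the original script, is to prefix the final command with `exec` so that the command replaces the shell and keeps the same PID. The sketch below demonstrates the effect with `sleep 300` as a stand-in for the platon command; the script is a temporary file created only for the demo, and reading `/proc/<pid>/comm` assumes Linux.

```shell
#!/bin/bash
# Illustration only: `sleep 300` stands in for the platon command line,
# and the script is a throwaway temporary file.
script=$(mktemp)
cat > "$script" <<'EOF'
#!/bin/bash
# exec replaces the shell with the command instead of starting a child process
exec sleep 300
EOF

bash "$script" &                # same form as `command=bash start.sh`
pid=$!
sleep 1                         # give the shell a moment to reach the exec line

name=$(cat "/proc/$pid/comm")   # name of the program now running under that PID (Linux only)
echo "PID $pid is running: $name"

kill "$pid"                     # the signal reaches the stand-in process directly
wait "$pid" 2>/dev/null || true # reap the child; its exit status is non-zero after SIGTERM
rm -f "$script"
```

If you apply the same idea to start.sh, it would mean writing `exec /usr/bin/platon ...` on the last line; test this on your own node before relying on it.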

platon.conf example:

[program:platon]
directory=/root/platon-node                ;Working directory for the process
command=bash start.sh                      ;Command that starts the process
autostart = true                           ;Start the process when `supervisord` starts
startsecs = 5                              ;The process counts as successfully started if it stays up for 5 seconds
startretries = 1                           ;Number of start retries, default value is 3
autorestart = unexpected                   ;[unexpected,true,false] Default value is `unexpected`: restart only when the process exits with an unexpected exit code
user = root                                ;User that runs the process
redirect_stderr = true                     ;Redirect `stderr` to `stdout`, default `false`
stdout_logfile=/var/log/platon.out.log     ;Log path
stdout_logfile_maxbytes = 30MB             ;Log rotation: maximum size of the `stdout` log file, default value is 50MB
stdout_logfile_backups = 20                ;Log rotation: number of `stdout` log backup files to keep
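As a rough check on disk usage, the two rotation settings bound the space the PlatON log can take: the active log file plus every backup, each up to `stdout_logfile_maxbytes`. The sketch below only does that arithmetic with the values from the example configuration; the variable names are made up for illustration.

```shell
#!/bin/bash
# Values copied from the example platon.conf.
max_mb=30        # stdout_logfile_maxbytes, in MB
backups=20       # stdout_logfile_backups

# One active log file plus the rotated backups, each up to max_mb.
total_mb=$(( max_mb * (backups + 1) ))
echo "Worst-case log size: ${total_mb} MB"
```

Adjust the two values to fit the free space on the disk that holds /var/log.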

Start Supervisor

systemctl enable supervisor      # Start supervisor automatically at system boot
systemctl start supervisor       # Start supervisor now
supervisorctl reload             # Restart supervisord so it loads the process configuration

PlatON process management commands

supervisorctl start platon      # Start
supervisorctl stop platon       # Stop
supervisorctl restart platon    # Restart
supervisorctl                   # Open the interactive management shell

Logs

tail /var/log/supervisor/supervisord.log   # Shows whether the PlatON process started
tail /var/log/platon.out.log               # Log of the PlatON process

Note: Check the supervisord log file and the program log files frequently. When the process crashes or throws an exception, the error is written to stderr, so you can check the related log file to identify the issue.

Common issue

Error when running supervisorctl status: unix:///var/lock/supervisor.sock no such file

This error is thrown when you try to use supervisorctl before Supervisor has been started.

Otherwise, you might need to create the file supervisor.sock under /var/lock/ to solve the issue.

touch /var/lock/supervisor.sock
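Before creating anything by hand, it can help to check whether the socket is actually there, since the error above is reported when Supervisor has not been started. The following is a small sketch; the helper name `check_sock` is made up, and the path is the one from the error message.

```shell
#!/bin/bash
# check_sock is a hypothetical helper: it reports whether the given path is a Unix socket.
check_sock() {
    if [ -S "$1" ]; then       # -S is true only for an existing socket file
        echo "present"
    else
        echo "missing"
    fi
}

sock=/var/lock/supervisor.sock   # path taken from the error message
if [ "$(check_sock "$sock")" = "missing" ]; then
    echo "No socket at $sock; try: sudo systemctl start supervisor"
else
    echo "Socket found; supervisorctl should be able to connect"
fi
```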