Diagnosing File Integrity on Storage

This article details the steps to verify that all files stored on disk by Open Bee™ Portal are present and intact.

It is highly recommended that you run this check after restoring a backup.

Prerequisites

The diagnostic tool runs from a console (command-line interface).

Python must be installed.

Python's YAML and MySQL libraries are also required:

pip install pyyaml mysql-connector==2.1.4
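Before running the diagnostic, it can help to confirm that both libraries are importable by the Python interpreter you intend to use. The snippet below is a minimal sketch, not part of the Open Bee™ tooling. It assumes the usual import names for these packages (yaml for pyyaml, mysql.connector for mysql-connector), and the helper names is_importable and missing_modules are hypothetical.

```python
import importlib.util


def is_importable(name):
    """Return True if the module `name` can be located by this interpreter."""
    try:
        return importlib.util.find_spec(name) is not None
    except ModuleNotFoundError:
        # Raised for dotted names when the parent package is absent.
        return False


def missing_modules(names):
    """Return the entries of `names` that cannot be located."""
    return [name for name in names if not is_importable(name)]


# Assumed import names: "yaml" (pyyaml) and "mysql.connector" (mysql-connector).
missing = missing_modules(["yaml", "mysql.connector"])
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All prerequisite modules were found.")
```

If a module is reported missing, rerun the pip command above with the same interpreter that will run the script.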

Run the script

Download the script here: https://download.myopenbee.com/software/diagnostic/OBP_datacheck_onpremise.py

wget https://download.myopenbee.com/software/diagnostic/OBP_datacheck_onpremise.py
chmod +x OBP_datacheck_onpremise.py

Run the script: 

./OBP_datacheck_onpremise.py -c /var/www/openbeeportal/
  • The script takes a single argument: the Open Bee™ Portal installation path.
  • The -c option performs a quick check of the stored files. Only their presence is checked.
  • Without the -c option, the hash of each file is also recalculated and compared with the recorded value.
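To illustrate what recalculating a hash involves, and why the full check takes longer than the -c check, here is a minimal sketch that reads a file in chunks and computes two digests. It is not the script's actual implementation. SHA-256 is an assumption (the article only mentions SHA2), and the function name file_digests is hypothetical.

```python
import hashlib


def file_digests(path, chunk_size=1024 * 1024):
    """Return (sha256_hex, md5_hex) for the file at `path`.

    The file is read in chunks so that large documents do not
    have to fit in memory.
    """
    sha256 = hashlib.sha256()  # assumed member of the SHA-2 family
    md5 = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            sha256.update(chunk)
            md5.update(chunk)
    return sha256.hexdigest(), md5.hexdigest()
```

Every byte of every stored file must be read to compute these digests, whereas a presence check only needs to test whether each path exists.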

Result 

  • The script creates a log/data_check_onprem.log file. It contains the same output that the script prints to the screen.
  • With the -c option: the first and last lines of the output report the difference between the number of documents recorded in the database and the number of documents in the file system. The remaining lines list the files that may be missing.
  • Without the -c option: a log/data_check_onprem.csv file is also created. It contains the list of anomalies and information about the associated documents (id, path in Open Bee™ Portal, storage path, SHA-2 and MD5 fingerprints, etc.).
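The CSV report can be loaded for further processing, for example to count the anomalies. The following is a rough sketch only: the delimiter, the encoding, and the presence of a header row are assumptions, since the article does not describe the file layout, and the function name read_anomalies is hypothetical.

```python
import csv


def read_anomalies(csv_path, delimiter=","):
    """Return the non-empty rows of the report as lists of strings.

    The comma delimiter and UTF-8 encoding are assumptions; adjust
    them if the generated file uses something else.
    """
    with open(csv_path, newline="", encoding="utf-8") as fh:
        return [row for row in csv.reader(fh, delimiter=delimiter) if row]


# Example usage (the path depends on where the script wrote its log folder):
# rows = read_anomalies("log/data_check_onprem.csv")
# print(len(rows), "rows in the report")
```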

 
