Checking for Errors
It is recommended to periodically run the fsck.s3ql and s3ql_verify commands (in this order) to ensure that the file system is consistent, and that there has been no data corruption or data loss in the storage backend.
fsck.s3ql is intended to detect and correct problems with the internal file system structure caused, for example, by a file system crash or a bug in S3QL. It assumes that the storage backend can be fully trusted: if the backend reports that a specific storage object exists, fsck.s3ql takes that as proof that the data is present and intact.
In contrast, the s3ql_verify command is intended to check the consistency of the storage backend. It assumes that the internal file system data is correct and verifies that all data can actually be retrieved from the backend. Running s3ql_verify may therefore take much longer than running fsck.s3ql.
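The recommended routine can be sketched as a short shell snippet. The storage URL below is a placeholder for your own file system, and the file system must be unmounted while fsck.s3ql runs:

```shell
# Hypothetical storage URL; substitute your own.
STORAGE_URL='s3://mybucket/myfs'

if command -v fsck.s3ql >/dev/null 2>&1; then
    # Repair the internal file system structure first ...
    fsck.s3ql "$STORAGE_URL"
    # ... then check that every object is actually retrievable.
    s3ql_verify "$STORAGE_URL"
else
    echo 'S3QL is not installed on this host' >&2
fi
```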
Checking and repairing internal file system errors
fsck.s3ql checks that the internal file system structure is consistent and attempts to correct any problems it finds. If an S3QL file system has not been unmounted correctly for any reason, you need to run fsck.s3ql before you can mount the file system again.
The fsck.s3ql command has the following syntax:
fsck.s3ql [options] <storage url>
This command accepts the following options:
- --log <target>
  Destination for log messages. Specify "none" for standard output or "syslog" for the system logging daemon. Anything else will be interpreted as a file name. Log files will be rotated when they reach 1 MiB, and at most 5 old log files will be kept. Default: ~/.s3ql/fsck.log
- --cachedir <path>
  Store cached data in this directory (default: ~/.s3ql)
- --debug-modules <modules>
  Activate debugging output from specified modules (use commas to separate multiple modules, 'all' for everything). Debug messages will be written to the target specified by the --log option.
- --debug
  Activate debugging output from all S3QL modules. Debug messages will be written to the target specified by the --log option.
- --quiet
  be really quiet
- --backend-options <options>
  Backend specific options (separate by commas). See backend documentation for available options.
- --version
  just print program version and exit
- --authfile <path>
  Read authentication credentials from this file (default: ~/.s3ql/authinfo2)
- --compress <algorithm-lvl>
  Compression algorithm and compression level to use when storing new data. algorithm may be any of lzma, bzip2, zlib, or none. lvl may be any integer from 0 (fastest) to 9 (slowest). Default: lzma-6
- --keep-cache
  Do not purge locally cached files on exit.
- --batch
  If user input is required, exit without prompting.
- --force
  Force checking even if file system is marked clean.
- --force-remote
  Force use of remote metadata even when this would likely result in data loss.
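As an illustration, two common invocations using the options above; the storage URL is a placeholder:

```shell
# Hypothetical storage URL; substitute your own.
STORAGE_URL='s3://mybucket/myfs'

if command -v fsck.s3ql >/dev/null 2>&1; then
    # Unattended check, e.g. from cron: log to syslog, never prompt.
    fsck.s3ql --log syslog --batch "$STORAGE_URL"
    # Re-check a file system even though it is marked clean.
    fsck.s3ql --force "$STORAGE_URL"
else
    echo 'fsck.s3ql is not installed on this host' >&2
fi
```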
Detecting and handling backend data corruption
The s3ql_verify command verifies all data in the file system. In contrast to fsck.s3ql, s3ql_verify does not trust the object listing returned by the backend, but actually attempts to retrieve every object. By default, s3ql_verify will attempt to retrieve just the metadata for every object (for e.g. the S3-compatible or Google Storage backends this corresponds to a HEAD request for each object), which is generally sufficient to determine if the object still exists. When specifying the --data option, s3ql_verify will instead read every object entirely. To determine how much data will be transmitted in total when using --data, look at the "After compression" row in the s3qlstat output.
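The estimate can be read off with a simple pipeline. The s3qlstat output below is an illustrative sample with invented values, not taken from a real file system; in practice you would pipe the real command (s3qlstat <mountpoint>) instead of the variable:

```shell
# Illustrative sample of s3qlstat output (invented values).
sample_output='Directory entries:    1488
Inodes:               1390
Data blocks:          1002
Total data size:      44 GiB
After de-duplication: 35 GiB
After compression:    30 GiB'

# s3ql_verify --data downloads roughly the "After compression" amount.
# With a mounted file system: s3qlstat /path/to/mountpoint | awk ...
printf '%s\n' "$sample_output" | awk -F': *' '/After compression/ {print $2}'
# prints "30 GiB"
```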
s3ql_verify is not able to correct any data corruption that it finds. Instead, a list of the corrupted and/or missing objects is written to a file and the decision about the proper course of action is left to the user. If you have administrative access to the backend server, you may want to investigate the cause of the corruption or check if the missing/corrupted objects can be restored from backups. If you believe that the missing/corrupted objects are indeed lost irrevocably, you can use the remove_objects.py script (from the contrib directory of the S3QL distribution) to explicitly delete the objects from the storage backend. After that, you should run fsck.s3ql. Since the (now explicitly deleted) objects should no longer be included in the object index reported by the backend, fsck.s3ql will identify the objects as missing, update the internal file system structures accordingly, and move the affected files into the lost+found directory.
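This workflow can be sketched as follows. The storage URL and file names are placeholders, and the remove_objects.py invocation is an assumption; consult the script's own help before running it:

```shell
# Placeholders; substitute your own storage URL and file names.
STORAGE_URL='s3://mybucket/myfs'
MISSING='missing.txt'
CORRUPTED='corrupted.txt'

if command -v s3ql_verify >/dev/null 2>&1; then
    # Step 1: record which objects are missing or corrupted.
    s3ql_verify --missing-file "$MISSING" --corrupted-file "$CORRUPTED" \
        "$STORAGE_URL"
    # Step 2: only if the objects cannot be restored from backups,
    # delete them explicitly (invocation is illustrative, check --help):
    # python contrib/remove_objects.py "$STORAGE_URL" "$MISSING"
    # Step 3: let fsck.s3ql update the metadata and move affected
    # files into lost+found:
    # fsck.s3ql "$STORAGE_URL"
else
    echo 's3ql_verify is not installed on this host' >&2
fi
```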
The s3ql_verify command has the following syntax:
s3ql_verify [options] <storage url>
This command accepts the following options:
- --log <target>
  Destination for log messages. Specify "none" for standard output or "syslog" for the system logging daemon. Anything else will be interpreted as a file name. Log files will be rotated when they reach 1 MiB, and at most 5 old log files will be kept. Default: None
- --debug-modules <modules>
  Activate debugging output from specified modules (use commas to separate multiple modules, 'all' for everything). Debug messages will be written to the target specified by the --log option.
- --debug
  Activate debugging output from all S3QL modules. Debug messages will be written to the target specified by the --log option.
- --quiet
  be really quiet
- --version
  just print program version and exit
- --cachedir <path>
  Store cached data in this directory (default: ~/.s3ql)
- --backend-options <options>
  Backend specific options (separate by commas). See backend documentation for available options.
- --authfile <path>
  Read authentication credentials from this file (default: ~/.s3ql/authinfo2)
- --missing-file <name>
  File to store keys of missing objects.
- --corrupted-file <name>
  File to store keys of corrupted objects.
- --data
  Read every object completely, instead of checking just the metadata.
- --parallel PARALLEL
  Number of connections to use in parallel.
- --start-with <n>
  Skip over the first <n> objects and start verifying with object <n>+1.
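For example, a full data check with several parallel connections, resumable via --start-with; the storage URL is a placeholder:

```shell
# Hypothetical storage URL; substitute your own.
STORAGE_URL='s3://mybucket/myfs'

if command -v s3ql_verify >/dev/null 2>&1; then
    # Read every object completely, using 8 parallel connections.
    s3ql_verify --data --parallel 8 \
        --missing-file missing.txt --corrupted-file corrupted.txt \
        "$STORAGE_URL"
    # If the run was interrupted after object 50000, resume with:
    # s3ql_verify --data --start-with 50000 "$STORAGE_URL"
else
    echo 's3ql_verify is not installed on this host' >&2
fi
```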