Borg backup - deduplication and cache location

RoGeorge:
Last weekend I tried Borg backup on my main desktop (Kubuntu 20.04 LTS).
https://www.borgbackup.org/
https://borgbackup.readthedocs.io/en/stable/
https://github.com/borgbackup

Borg's features seem impressive: data deduplication, compression, encryption, mounting, pruning, and it can back up whole partitions (by piping dd's STDOUT to borg), etc.  The backups are filesystem agnostic: Borg only deals with files, and implements its own filesystem representation for saved backups (just files, no hardlinks or other features that might be filesystem specific).  The drawback is that the backups are not usable directly (like a mirror disk would be); Borg must be installed before the backups can be decoded.  The backups themselves, though, can be moved or copied just like any file.

Deduplication is done before compression and encryption, and a cache directory of file chunks and their checksum IDs is created locally.

Q1.  Does anybody know why the cache is kept by default in ~/.cache/borg/ and not on the same disk where the source files are sitting?
Q2.  There is an environment variable that can be set to change the location of the cache, BORG_CACHE_DIR.  I tried to move the cache with it, and it was ignored.  Is setting it enough, or must the variable be set AND exported?

--- Code: ---BORG_CACHE_DIR=/new/location/for/borg_cache/;  \
borg blablabla_commands_and_parameters
--- End code ---

The backups are saved on a slow 10MB/s NAS and are a few TB in total, so the first backup will be very time consuming.

Any other advice about borgbackup, info I should know, or suggestions on how to split/organize the backup repositories?

Nominal Animal:

--- Quote from: RoGeorge on June 20, 2022, 10:28:23 am ---Q2.  There is a system variable that can be set to change the location of the cache, BORG_CACHE_DIR.  I've tried to move it and it was ignored.  Is this enough, or the variable must be set AND exported?

--- Code: ---BORG_CACHE_DIR=/new/location/for/borg_cache/;  \
borg blablabla_commands_and_parameters
--- End code ---

--- End quote ---
The syntax is
    BORG_CACHE_DIR=/new/location/for/borg_cache/ borg args...
or
    export BORG_CACHE_DIR=/new/location/for/borg_cache/ ; borg args...

The mixed version you used does not pass the variable to the command.

To see this for yourself, try:
    VAL=x ; sh -c 'echo $VAL'
    (outputs an empty line)

    VAL=x sh -c 'echo $VAL'
    (outputs x)

    export VAL=x ; sh -c 'echo $VAL'
    (outputs x)

RoGeorge:
I've used export, and it works as expected, thank you.
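
A quick way to confirm that an exported variable really reaches child processes is the sketch below. It uses a plain sh as a stand-in for borg (an assumption made only so it runs without borg installed), and the cache path is the placeholder from the question, not a real location.

```shell
# Placeholder cache path, copied from the example in the question.
export BORG_CACHE_DIR=/new/location/for/borg_cache/

# A child process (plain sh here, standing in for borg) inherits
# the exported variable through its environment.
seen_by_child=$(sh -c 'printf "%s" "$BORG_CACHE_DIR"')
echo "child sees: $seen_by_child"
```

Any borg command started from the same shell after the export should inherit the variable the same way.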

Bash has always surprised me in how it processes the command line. I was never able to correctly predict how it works, and never took the time to read the docs.

For example

--- Code: ---aaa@zub:~$ X1=present echo $X1

aaa@zub:~$ echo $X1

aaa@zub:~$ X1=present; echo $X1
present
aaa@zub:~$ echo $X1
present
aaa@zub:~$ unset X1
aaa@zub:~$ echo $X1

aaa@zub:~$ X1=present; \
> echo $X1
present
aaa@zub:~$

--- End code ---

The last format is what I've used with borg, but it didn't work, maybe borg spawns some other hidden instance of bash, and that's why the variable is not seen, IDK.  With export it works as expected.

For now I think I'll stay with Borg for backup.  It's a pity to use ZFS on the desktop and yet another thing for backup, but the NAS I have is a dedicated RAID 5 box with an ARM CPU and some frozen-in-time Linux.  The NAS speaks SMB 1.0 and NFS, and it is rather slow, only 10MB/s max, though the LAN link is 1Gbps.  I only rarely start it, for manual backups a few times a year.  So far it has been very reliable (for ~10 years), and I'm afraid to mess with it by trying to upgrade it, or to reformat it as ZFS.

Since we are talking about Borg backup, there is a GUI for it, called Vorta, but I've decided to use the command line and craft my own scripts for manual backup.  Regarding Vorta, I've noticed that if the password manager is disabled (so no keyring), then Vorta saves the password in clear text! in a table in its own database, and the table remains as a leftover even after uninstalling Vorta, so I filed a security bug for Vorta and uninstalled it.

Borg is quite fast at incremental backups.  For example, a few hundred GB that took many hours the first time finishes in less than 15 minutes when doing incremental backups.  Before borg I was doing manual copy/paste and then deleting the old version, which usually took a whole day, if not a whole weekend, to complete!  ;D

ve7xen:

--- Quote from: RoGeorge on June 20, 2022, 08:02:54 pm ---The last format is what I've used with borg, but it didn't work, maybe borg spawns some other hidden instance of bash, and that's why the variable is not seen, IDK.  With export it works as expected.
--- End quote ---

The issue with your examples that might be leading to confusion is that bash's internal state is used to expand the variables *before* the command is executed. So 'echo' is not getting passed the variable name, but the expanded text, and if the variable doesn't exist in the shell, the empty string is substituted, even if the variable gets passed in the command's environment. 'echo' doesn't even do its own variable expansion; if '$X1' were passed to it, it would echo '$X1' verbatim.

So in the end there are two behaviours here. In the form:


--- Code: ---var=value command
--- End code ---

The environment variable 'var' is set in the *command's* environment, not in the shell itself, so it disappears once the command is complete. The command obviously needs to know what to do with that environment variable.


--- Code: ---var=value ; command
--- End code ---

This sets the variable 'var' in the *shell* itself, since the assignment is not associated with a command (the command is separate due to the ';'). Since it has not been exported, it (importantly, here) *doesn't* get added to the command's environment.

When you export the variable, it will automatically be added to any command/subprocess' environment until you unset it or start a new shell instance (that isn't a child of one where it's been exported, anyway).

If you want to experiment, the 'env' command is probably a better way to help understand what's being passed, because of the variable expansion issue described above. For example:

--- Code: ---$ X1=present env | grep X1
X1=present
$ X1=present ; env | grep X1
$ export X1=present
$ env | grep X1
X1=present

--- End code ---
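
The expansion-timing point can also be seen by changing only the quoting. This is a minimal sketch, assuming a POSIX sh is available; the names out_parent and out_child are made up for illustration. With double quotes the calling shell expands $X1 before the command starts, with single quotes the child sh expands it from its own environment.

```shell
# Make sure X1 is not already set in the calling shell.
unset X1

# Double quotes: the calling shell expands $X1 (unset, so empty)
# before sh starts, so the child only ever runs "echo ".
out_parent=$(X1=present sh -c "echo $X1")

# Single quotes: the text $X1 reaches the child unexpanded, and the
# child expands it from the environment it was started with.
out_child=$(X1=present sh -c 'echo $X1')

echo "expanded by caller: '$out_parent'"
echo "expanded by child:  '$out_child'"
```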

Foxxz:

--- Quote from: RoGeorge on June 20, 2022, 10:28:23 am ---Q1.  Anybody knows why the cache is kept by default in ~/.cache/borg/ and not on the same disk where the source files are sitting?

Any other advice about borgbackup, or info I should know, or how to split/organize the backup repositories?

--- End quote ---

The user cache directory is the best default place to store the file checksums. There's no guarantee that a location the user can read files from can also be written to. Plus you'd have cache files all over the place.

Other advice:
You can have multiple backups and multiple machines back up to the same borg repo. The de-duplication is repo-wide and can save a lot of disk space if the same files appear on different machines sending data to the same repo. The downside is that the repo can only be in use/locked by a single running instance of borg, which can be a pain. The cache for the repo will be copied to any machine/user accessing the repo.

You can mount borg backups as a FUSE filesystem, which is pretty awesome but slow. Unmount with "fusermount -u <mountpoint>"
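
A rough sketch of the mount/unmount round trip, wrapped in a helper function. The function name browse_archive and the paths in the example call are hypothetical, not from this thread; it assumes borg is installed with FUSE support and that the repository passphrase (if any) is supplied the usual way.

```shell
# Hypothetical helper: mount one archive, list its top level, unmount.
# Arguments: repository path, archive name, mount point directory.
browse_archive() {
    repo=$1
    archive=$2
    mountpoint=$3

    mkdir -p "$mountpoint" || return 1
    borg mount "${repo}::${archive}" "$mountpoint" || return 1
    ls -l "$mountpoint"
    # borg umount is a convenience wrapper; on Linux it has the same
    # effect as: fusermount -u "$mountpoint"
    borg umount "$mountpoint"
}

# Example call (placeholder paths):
# browse_archive /mnt/nas/borg_repo my-archive "$HOME/borg_mount"
```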
