Dar Documentation


DAR - Frequently Asked Questions


Questions:

I restore/save all files but dar reported some files have been ignored, what are those ignored files?
Dar hangs when using it with pipes, why?
Why, when I restore one file, does dar report that 3 files have been restored?
While compiling dar I get the following message : " g++: /lib/libattr.a: No such file or directory", what can I do?
I cannot find the binary package for my distro, where to look for?
Why does dar report "ignored" files when I make a backup without any filter?
Can I use different filters between a full backup and a differential backup? Would not dar consider some file not included in the filter to be deleted?
Once running, dar makes the whole system slower and slower, then stops with the message "killed"! How to overcome this problem?
I have a backup, how can I change the size of its slices?
I have a backup in one slice, how can I split it in several slices?
I have a backup in several slices, how can I merge them all into a single file?
I have a backup, how can I change its encryption scheme?
I have a backup, how can I change its compression algorithm?
Which options can I use with which commands?
Why does dar report corruption for the archive I have transferred with FTP?
Why does DAR save UID/GID instead of plain user and group names?
Dar_Manager does not accept encrypted archives, how to work around this?
How to overcome the lack of static linking on MacOS X?
Why cannot dar use the full power of my multi-processor computer?
Is libdar thread-safe, and in what way?

How to solve "configure: error: Cannot find size_t type"?
How to search for questions (and their answers) about known problems similar to mine?
Why does dar tell me that it failed to open a directory which I have excluded?


Answers:

I restore/save all files but dar reported some files have been ignored, what are those ignored files?
When restoring or saving, all files are considered by default. But if you restrict the operation to certain files, all the other files are "ignored"; this is the case when using -P, -X, -I or -g.

Dar hangs when using it with pipes, why?
Dar can produce an archive on its standard output if you give '-' as basename, but it cannot read an archive from its standard input. To feed an archive to dar through pipes, you need dar_slave and two pipes. The first pipe transmits orders from dar to dar_slave, telling dar_slave which portion of the archive is requested; the second pipe goes the other way and carries the requested data from dar_slave back to dar. This way, only the needed data gets transmitted over the pipes, which would not be possible with a single pipe.
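For illustration, the two-pipe setup can be sketched as follows (the pipe locations and the archive basename "my_archive" are examples; the dar and dar_slave invocations are shown commented out since they require dar to be installed):

```shell
# Two named pipes, one per direction of the dar <-> dar_slave dialogue.
d=$(mktemp -d)
mkfifo "$d/todar" "$d/toslave"

# dar_slave reads dar's orders from toslave and writes archive data to todar:
#   dar_slave -o "$d/todar" -i "$d/toslave" my_archive &
# dar reads archive data from todar and writes its orders to toslave:
#   dar -x - -i "$d/todar" -o "$d/toslave"

ls -l "$d"        # both entries are listed with 'p' (pipe) as file type
rm -r "$d"
```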

Why, when I restore one file, does dar report that 3 files have been restored?
If you restore, for example, the file usr/bin/emacs, dar will first restore usr (if the directory already exists, its date and ownership get restored and all existing files are preserved), then usr/bin, and finally usr/bin/emacs. Thus 3 inodes have been restored or modified while only one file was asked for restoration.

While compiling dar I get the following message : " g++: /lib/libattr.a: No such file or directory", what can I do?
The problem comes from an inconsistency in your distro (Redhat and Slackware seem(ed) concerned at least): dar (libtool) finds the /usr/lib/gcc-lib/i386-redhat-linux/3.3.3/../../../libattr.la file to link against. This file records where the static and dynamic libattr libraries are located, and expects both under /lib. While the dynamic libattr is indeed there, the static version has been moved to /usr/lib. A workaround is to make a symbolic link:

ln -s /usr/lib/libattr.a /lib/libattr.a


I cannot find the binary package for my distro, where to look for?
For any binary package, ask your distro maintainer to include dar (if not already done), and check the web site of your preferred distro for a dar package.

Why does dar report "ignored" files when I make a backup without any filter?
If you give dar an argument which is not an option, up to version 2.2.x it is interpreted as "save only that file" (it has since been replaced by the -g option). Let's take an example:

dar -c my-files -y -s 2045M data

or since version 2.3.0:

dar -c my-files -y -s 2045M -g data

reports

--------------------------------------------
31754 inode(s) saved
with 0 hard link(s) recorded
0 inode(s) not saved (no file change)
0 inode(s) failed to save (filesystem error)
147 files(s) ignored (excluded by filters)
0 files(s) recorded as deleted from reference backup
--------------------------------------------


The 147 files and directories are those that dar has excluded because they are not "data" in the current directory (or, if using the -R option, in the directory pointed to by -R).


Can I use different filters between a full backup and a differential backup? Would not dar consider some file not included in the filter to be deleted?
Yes, you can. And no, there is no risk of dar deleting the files that were not selected for the differential backup. Here is the way dar works:

During a backup process, when a file is ignored due to filter exclusion, an "ignored" entry is added to the catalogue. At the end of the backup, dar compares both catalogues (the one of reference and the new one built during the backup process), and adds a "detruit" (French for "destroyed") entry when an entry of the reference is not present in the new catalogue. Thus, if an "ignored" entry is present, no "detruit" will be added for that name. Then all "ignored" entries are removed and the catalogue is dumped into the archive.
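The rule above can be illustrated with a toy set computation, using sorted name lists as stand-ins for the catalogues (this only illustrates the logic, it is not dar's actual code):

```shell
# reference: catalogue of the reference archive
# new:       catalogue built during the current backup
# ignored:   entries excluded by filters during the current backup
tmp=$(mktemp -d)
printf 'a\nb\nc\n' > "$tmp/reference"
printf 'a\n'       > "$tmp/new"       # b and c are absent from the new backup
printf 'c\n'       > "$tmp/ignored"   # c was excluded by a filter

# "detruit" candidates: in the reference, not in the new catalogue,
# and not recorded as ignored -- here only 'b' qualifies.
comm -23 "$tmp/reference" "$tmp/new" | comm -23 - "$tmp/ignored"   # prints: b
rm -r "$tmp"
```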


Once running, dar makes the whole system slower and slower, then stops with the message "killed"! How to overcome this problem?
Dar needs virtual memory to work. Virtual memory is the RAM plus the swap space. Dar's memory requirement grows with the number of files saved, not with the amount of data saved. If you have a few huge files, you are unlikely to hit any memory limitation. At the opposite extreme, saving a plethora of files (either big or small) will make dar request a lot of virtual memory. Dar needs this memory to build the catalogue (the table of contents) of the archive it creates. The same holds for a differential backup, except that dar also needs to load in memory the catalogue of the archive of reference, which most of the time makes a differential backup use twice as much memory as a full backup.

Anyway, the solution is:
  1. Read the limitations file to understand the problem and be aware of the limitations you will introduce at step 3 below.
  2. If you can, add swap space to your system (under Linux, you can either add a swap partition or a swap file, which is less constraining but also a bit less efficient). Bob Barry provided a script that can give you a rough estimation of the required virtual memory (doc/samples/dar_rqck.bash).
  3. If this is not enough, or if you don't want to or cannot add swap space, recompile dar giving the --enable-mode=64 argument to the configure script.
  4. If this is not enough and you have some money, you can add some RAM to your system.
  5. If all that fails, ask for support on the dar-support mailing-list.
There is still a workaround, which is to make several smaller archives of the files to back up. For example, make one backup for everything in /usr/local, another for everything in /var, and so on. These backups can be full or differential. The drawback is small, as you can store these archives side by side and use them at will. Moreover, you can feed a single dar_manager database with all these different archives, which will hide from you the fact that there are several full and differential archives covering different sets of files.
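The split-backup workaround can be sketched as a dry run like the one below (directory names and archive basenames are only examples; the dar commands are printed rather than executed, so remove the leading 'echo' to actually run them with dar installed):

```shell
# Print one full-backup command per directory tree to be saved separately.
for dir in usr/local var home; do
    name=$(printf '%s' "$dir" | tr '/' '_')        # usr/local -> usr_local
    echo dar -c "backup_$name" -R / -g "$dir" -z
done
```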


I have a backup, how can I change the size of its slices?
dar_xform is your friend!

dar_xform -s <size> original_archive new_archive

dar_xform will create a new archive with slices of the requested size (you can also use the -S option to give the first slice a different size). Note that dar_xform neither decrypts nor uncompresses the archive, so this is a very fast operation. See the dar_xform man page for more.


I have a backup in one slice, how can I split it in several slices?
dar_xform is your friend!

dar_xform -s <size> original_archive new_archive

see above for more.

I have a backup in several slices, how can I merge them all into a single file?
dar_xform is your friend!

dar_xform original_archive new_archive

dar_xform without the -s option creates a single-slice archive. See the dar_xform man page for more.


I have a backup, how can I change its encryption scheme?
The merging feature lets you do that. Merging has two roles: putting the contents of two archives into one archive, and at the same time filtering files so that certain files are not copied into the resulting archive. The merging feature can take two, but also just one, archive as input, so we will use it in a special way:
  • a single input (our original archive)
  • no file filtering (so we keep all the files)
  • Keep files compressed (no decompression/re-compression) to speed up the process
dar -+ new_archive -A original_archive -K "<new_algo>:new pass" -ak

If the original archive was encrypted, you need to add the -J option to provide its encryption key. And if you don't want to have the password in clear on the command line (where it can be seen with top or ps by other users), simply provide "<algo>:" and dar will ask you for the password on the fly; if using blowfish you can then just provide ":" for the keys:

dar -+ new_archive -A original_archive -K ":" -J ":" -ak

Note that you can also change slicing (and even compression, see below) of the archive at the same time thanks to -s option:

dar -+ new_archive -A original_archive -K ":" -J ":" -ak -s 1G

I have a backup, how can I change its compression algorithm?
Same thing as above: we will use the merging feature.

to use bzip2 compression:

dar -+ new_archive -A original_archive -y

to use gzip compression

dar -+ new_archive -A original_archive -z

to use no compression at all:

dar -+ new_archive -A original_archive

Note that you can also change encryption scheme and slicing at the same time you change compression:

dar -+ new_archive -A original_archive -y -K ":" -J ":" -s 1G

Which options can I use with which commands?
DAR provides seven commands:

-c   to create a new archive
-x   to extract files from a given archive
-l    to list the contents of a given archive
-d   to compare the contents of an archive with filesystem
-t    to test the coherence of a given archive
-C  to isolate an archive (extract its catalogue to a usually small file)
-+   to merge two archives into one, or create a sub-archive from one or two other archives

For each command, the table below lists the available options (those marked OK):



Option      -c   -x   -l   -d   -t   -C   -+
--------------------------------------------
-v          OK   OK   OK   OK   OK   OK   OK
-vs         OK   OK   --   OK   OK   --   OK
-b          OK   OK   OK   OK   OK   OK   OK
-n          OK   OK   --   --   --   OK   OK
-w          OK   OK   --   --   --   OK   OK
-wa         --   OK   --   --   --   --   --
-R          OK   OK   --   OK   --   --   --
-X          OK   OK   OK   OK   OK   --   OK
-I          OK   OK   OK   OK   OK   --   OK
-P          OK   OK   --   OK   OK   --   OK
-g          OK   OK   --   OK   OK   --   OK
-]          OK   OK   --   OK   OK   --   OK
-[          OK   OK   --   OK   OK   --   OK
-u          OK   OK   --   --   --   --   OK
-U          OK   OK   --   --   --   --   OK
-i          OK   OK   OK   OK   OK   OK   OK
-o          OK   OK   OK   OK   OK   OK   OK
-O          OK   OK   --   OK   --   --   --
-H          OK   OK   --   --   --   --   --
-E          OK   OK   OK   OK   OK   OK   OK
-F          OK   --   --   --   --   OK   OK
-K          OK   OK   OK   OK   OK   OK   OK
-J          OK   --   --   --   --   OK   OK
-#          OK   OK   OK   OK   OK   OK   OK
-*          OK   --   --   --   --   OK   OK
-B          OK   OK   OK   OK   OK   OK   OK
-N          OK   OK   OK   OK   OK   OK   OK
-e          OK   --   --   --   --   OK   OK
-aSI        OK   OK   OK   OK   OK   OK   OK
-abinary    OK   OK   OK   OK   OK   OK   OK
-Q          OK   OK   OK   OK   OK   OK   OK
-aa         OK   --   --   OK   --   --   --
-ac         OK   --   --   OK   --   --   --
-am         OK   OK   OK   OK   OK   OK   OK
-an         OK   OK   OK   OK   OK   OK   OK
-acase      OK   OK   OK   OK   OK   OK   OK
-ar         OK   OK   OK   OK   OK   OK   OK
-ag         OK   OK   OK   OK   OK   OK   OK
-j          OK   OK   OK   OK   OK   OK   OK
-z          OK   --   --   --   --   OK   OK
-y          OK   --   --   --   --   OK   OK
-s          OK   --   --   --   --   OK   OK
-S          OK   --   --   --   --   OK   OK
-p          OK   --   --   --   --   OK   OK
-@          --   --   --   --   --   --   OK
-$          --   --   --   --   --   --   OK
-~          --   --   --   --   --   --   OK
-%          --   --   --   --   --   --   OK
-D          OK   --   --   --   --   --   OK
-Z          OK   --   --   --   --   --   OK
-Y          OK   --   --   --   --   --   OK
-m          OK   --   --   --   --   --   OK
-ak         --   --   --   --   --   --   OK
-af         OK   --   --   --   --   --   --
--nodump    OK   --   --   --   --   --   --
-G          OK   --   --   --   --   OK   OK
-M          OK   --   --   --   --   --   --
-,          OK   --   --   --   --   --   --
-k          --   OK   --   --   --   --   --
-r          --   OK   --   --   --   --   --
-f          --   OK   --   --   --   --   --
-ae         --   OK   --   --   --   --   --
-T          --   --   OK   --   --   --   --
-as         --   --   OK   --   --   --   --
-q          OK   OK   OK   OK   OK   OK   OK



Why does dar report corruption for the archive I have transferred with FTP?

Dar archives are binary files; they must be transferred in binary mode when using FTP. This is done as follows with the command-line ftp client:

ftp <somewhere>
<login>
<password>
bin
put <file>
get <file>
bye

If you transfer an archive (or any other binary file) in ascii mode (the opposite of binary mode), the 8th bit of each byte is lost and the archive becomes impossible to recover (due to the destruction of this information). Be very careful to test your archive after transferring it back to your host, to be sure you can safely delete the original file.
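As a quick first check after transferring an archive back (before running 'dar -t' on it), you can compare checksums of the sent and retrieved copies. The sketch below simulates the round trip with a local copy; with real FTP, the 'cp' would be your put and get:

```shell
tmp=$(mktemp -d)
head -c 1024 /dev/urandom > "$tmp/archive.1.dar"   # stand-in for a real slice
cp "$tmp/archive.1.dar" "$tmp/retrieved.1.dar"     # stand-in for put + get

a=$(cksum < "$tmp/archive.1.dar")
b=$(cksum < "$tmp/retrieved.1.dar")
[ "$a" = "$b" ] && echo "checksums match" || echo "TRANSFER CORRUPTED"
rm -r "$tmp"
```

A matching checksum only proves the transfer itself was clean; 'dar -t' remains the real coherence test of the archive.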


Why does DAR save UID/GID instead of plain user and group names?

A file's properties do not include the name of its owner nor the name of its owning group; instead they contain two numbers, the user ID and the group ID (UID and GID for short). The /etc/passwd file associates these numbers with names and other properties, like the login shell, the home directory, the password (see also /etc/shadow). Thus, when you list a directory (with the 'ls' command for example, or with any GUI program), the listing application opens each directory, where it finds a list of names each associated with an inode number; it then fetches the inode attributes of each file and looks, among other information, for the UID and the GID. To display the real user and group names, the listing application calls a standard C library routine that does the lookup in /etc/passwd, possibly in NIS if configured, and in any other additional system [this way applications do not have to bother with the many possible system configurations: the same API is used whatever the system is]. The lookup returns the name if it exists, and the listing application displays, for each file found in a directory, its attributes together with the user and group names as returned by the system.

As you can see, user and group names are not part of any file attribute, but UID and GID *are*. Dar is primarily a backup tool: it preserves file properties as much as possible, to be able to restore files as close as possible to their original state. Thus a file saved with UID=3 will be restored with UID=3. The name corresponding to UID 3 may or may not exist, and if it exists it may or may not be the same name as on the original system; the file will be restored with UID 3 anyway.

Scenario with dar's way of restoring

Thus, when doing backup and restoration of a crashed system, you can be confident that the restoration will not be distorted by the bootable system you have used to launch dar. Assume UID 1 was labeled 'bin' on your real (crashed) system, but UID 1 is labeled 'admin' on the boot system, while UID 2 is labeled 'bin' there: files owned by 'bin' on the system to restore will be restored under UID 1, not under UID 2 as used by the temporary boot system. At that point, still running from the boot system, an 'ls' will show that the files originally owned by 'bin' are now owned by user 'admin'.

This is really a mirage: your restoration also restores the /etc/passwd file and the other system configuration files (like the NIS configuration files if they were used), so at reboot time on the newly restored real system, UID 1 will again be associated with user 'bin' as expected, and the files originally owned by 'bin' will again be listed as owned by 'bin'.

Scenario with plain name way of restoring

If dar had done otherwise, restoring the files owned by 'bin' to the UID corresponding to 'bin', these files would have been given UID 2 (the one used by the temporary bootable system used to launch dar). But once the real restored system is booted, this UID 2 would map to some other user, not to 'bin', which is mapped to UID 1 in the restored /etc/passwd.

Now, if you want to change some UID/GID when moving a set of files from one live system to another, there is no problem as long as you do not run dar under the 'root' account. Accounts other than 'root' are usually not allowed to modify UID/GID, so files restored by dar will get the user and group ownership of the dar process, i.e. of the user who launched dar.

But if you really need to move a directory tree containing files with different ownerships, and you want to preserve these ownerships from one live system to another while the corresponding UID/GID do not match between the two systems, dar can still help you:

  • Save your directory tree on the source live system
  • From the root account on the destination live system, do the following:
  • restore the archive in an empty directory
  • change the UID/GID of files according to the ones used by the destination system with the commands:
find /path/to/restored/archive -uid <old UID>  -print -exec chown <new name> {} \;

find /path/to/restored/archive -gid <old GID> -print -exec chgrp <new name> {} \;

The first command lets you remap one UID to another for all files under the /path/to/restored/archive directory.
The second command lets you remap one GID to another for all files under the same directory.

Example on how to globally modify ownership of a directory tree user by user

For example, you have on the source system three users: Pierre (UID 100), Paul (UID 101), Jacques (UID 102)
but on the destination system these same users are mapped to different UIDs: Pierre has UID 101, Paul has UID 102 and Jacques has UID 100.

We temporarily need an unused UID on the destination system; we will assume UID 680 is not used. Then, after restoring the archive in the directory /tmp/A, we do the following:

find /tmp/A -uid 100 -print -exec chown 680 {} \;
find /tmp/A -uid 101 -print -exec chown pierre {} \;
find /tmp/A -uid 102 -print -exec chown paul {} \;
find /tmp/A -uid 680 -print -exec chown jacques  {} \;

which is:
change files of UID 100 to UID 680 (the files of Jacques are now under the temporary UID 680 and UID 100 is now freed)
change files of UID 101 to UID 100 (the files of Pierre get their UID of the destination live system, UID 101 is now freed)
change files of UID 102 to UID 101 (the files of Paul get their UID of the destination live system, UID 102 is now freed)
change files of UID 680 to UID 102 (the files of Jacques, which had been temporarily moved to UID 680, are now set to their UID on the destination live system; UID 680 is no longer used).

You can then move the modified files to their appropriate destination, or make a new dar archive to be restored in the appropriate place if you want to use some of dar's features, for example restoring only those files that are more recent than the ones present on the filesystem.



Dar_Manager does not accept encrypted archives, how to work around this?

Yes, that's true: dar_manager does not accept encrypted archives. The first reason is that, as long as a dar_manager database cannot itself be encrypted, it makes little sense to add encrypted archives to it. The second reason is that the dar_manager database would have to hold the key of each encrypted archive, making it the weakest point of your data security: breaking the database encryption would provide access to every encryption key and, with access to the original archives, to the data of any archive added to the database.

There is however a feature in the pipe to let dar_manager encrypt its databases, then another one to let dar_manager store the different archive keys, then yet another to pass a key from dar_manager to dar outside of the command-line (which would otherwise expose the keys to the sight of other users on a multi-user system), and then still another to feed the database with the archive keys, also without using the command-line... Well, there are a lot of features to add and test before you can expect to find this in a released version of dar.

In the meanwhile, you can proceed as follows:
  • isolate your encrypted archive into an unencrypted 'extracted catalogue': do not use the -K option while isolating; you will however need the -J option to let dar read the encrypted archive. Note that, still for key protection, you are encouraged to put the '-J <key>' option in a DCF file (Dar Command File, a plain file with a list of options to be passed to dar) with restricted permissions, and give this filename to dar's -B option. This prevents other users of your system from getting a chance to read the key you have used for your archives,
  • add these extracted catalogues to the dar_manager database of your choice,
  • change the name and path of the added catalogues to point to your real encrypted archives (-b and -p options of dar_manager).
Note that since the database is not encrypted, it exposes the file listing (not the files' contents) of your encrypted archives to anyone able to read it; it is thus recommended to set restrictive permissions on this database file.

When the time comes to use dar_manager to restore some files, you will have to make dar_manager pass the key to dar so that dar can restore the needed files from the archive. This can be done in several ways: on dar_manager's command-line, in the dar_manager database, or in a DCF file.
  1. dar_manager's command-line: simply pass -e "-K <key>" to dar_manager. Note that this exposes the key twice: on dar_manager's command-line and on dar's command-line.
  2. dar_manager database: the database can store a constant set of options to be passed to dar. This is done using the -o option or the -i option. The -o option exposes the arguments you want passed to dar, because they appear on dar_manager's command-line, while the -i option lets you do the same thing interactively, which is a better choice. However, even though -i is a safe way to feed the dar_manager database with the '-K <key>' option, dar will still receive this option on its command-line, so the key will still be visible to other users on the same system.
  3. The last and best way is to use a DCF file with restrictive permissions. This file will hold the '-K <key>' option so that dar can read the encrypted archives, and dar_manager will ask dar to read this file thanks to the '-B <filename>' option, which you give either on dar_manager's command-line (-e "-B <filename>" ...) or as a stored option in the database (-o -B <filename>).
Note that you must prevent other users from reading any file holding the archive key; this covers the dar_manager database as well as any DCF file you may temporarily use. Second note: in this workaround we have assumed that all encrypted archives share the same key.
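For illustration, creating such a restricted DCF file could look like the sketch below (the directory, cipher and key are placeholders, and the final dar_manager invocation is only indicative):

```shell
# Create a key-holding DCF file readable by its owner only.
umask 077                                  # files created below get mode 600
keydir=$(mktemp -d)                        # e.g. a private directory
keyfile="$keydir/archive-key.dcf"
cat > "$keyfile" <<'EOF'
-K "bf:not a real key"
EOF
ls -l "$keyfile"                           # shows -rw------- permissions

# dar_manager could then hand it over to dar, for example:
#   dar_manager -B my_database -e "-B $keyfile" -r some/file
```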


How to overcome the lack of static linking on MacOS X?

The answer comes from Dave Vasilevsky in an email to the dar-support mailing-list. I let him explain how to do it:

Pure-static executables aren't used on OS X. However, Mac OS X does have other ways to build portable binaries. Here is how to build portable binaries on OS X.

First, you have to make sure that dar only uses operating-system libraries that exist on the oldest version of OS X that you care about.
You do this by specifying one of Apple's SDKs, for example:

export CPPFLAGS="-isysroot /Developer/SDKs/MacOSX10.2.8.sdk"
export LDFLAGS="-Wl,-syslibroot,/Developer/SDKs/MacOSX10.2.8.sdk"


Second, you have to make sure that any non-system libraries that dar links against are linked in statically. To do this, edit dar/src/dar_suite/Makefile, changing LDADD to '../libdar/.libs/libdar.a'. If any other non-system libs are used (such as gettext), change the makefiles so they are also linked in statically. Apple should really give us a way to force the linker to do this automatically!

Some caveats:

* If you build for 10.3 or lower, you will not get EA support, and therefore you will not be able to save special Mac information like
resource forks.
* To work on both ppc and x86 Macs, you need to build a universal binary. For instructions, use Google :-)
* To make a 10.2-compatible binary, you must build with GCC 3.3.
* These instructions won't work for the 10.1 SDK, that one is harder to use.

Why cannot dar use the full power of my multi-processor computer?

Parallel computing is a science by itself. Having specialized in that area during my studies, I can briefly explain the constraints here. A program can use several processors if the algorithm it implements can be parallelized. Such an algorithm can be cut, either statically (at programming time) or dynamically (at execution time), into several independent execution threads. These execution threads must be as autonomous as possible, if you don't want one thread waiting for another (which is not what we want). The constraint is this: if you cannot obtain threads with no or very little communication and dependency between them, parallelization is not worth it.

Back to dar. From a very abstract point of view, dar works by fetching files from the filesystem and appending their data to a single file (the archive). For each file, dar records in memory the location of the data, and once all files have been treated, this location information (contained in the so-called "catalogue") is appended at the end of the archive.

One could say that to parallelize file treatment, instead of proceeding file by file, let's process all files at the same time (or rather N files at the same time). OK, but first you would lose a lot of disk performance, as the disk heads would spend most of their time seeking from one of the N files' data to another's. Second, to add a file to the archive you must know the position of the end of the last added file, which cannot be known in advance because of compression and/or encryption; thus a given thread would have to wait for another to finish before it could in turn drop the data of the file it owns... As you can guess, parallelizing this way would bring worse performance than the sequential algorithm.

Another possibility is to have several threads doing:
  • file lookup (report which file are present on filesystem)
  • file filtering (determine which file to save, which file to compress, and so on)
  • file compression
  • file encryption
This would be a bit better, but: file lookup is very fast and does not consume much CPU, and the same goes for file filtering. File compression and file encryption, instead, are very CPU-intensive. Thus, first, if you only use compression OR encryption, parallelizing this way will not bring much extra power, as neither the compression nor the encryption of a single file can itself be parallelized (compressing a file is done sequentially, same for encrypting it). Roughly, you would get the same execution time as the sequential execution. Second, if you use neither compression nor encryption, your CPU will stay idle most of the time and dar's execution time will only depend on the speed of your hard disk, so you get no improvement here either. Last, only if you use both encryption and compression could you gain some performance from parallelization, but dar could use at most two CPUs, no more! And the gain would be less than 2 (it would not be twice as fast, but much less), as for a given amount of data compression needs much more time than encryption; the encryption thread would thus spend most of its time waiting for compressed data.

OK, you have maybe found another possibility: having N threads for compression and M threads for encryption. Assuming encryption is faster than compression, we could choose N > M. We could also have a fixed value for N and a dynamic value for M depending on how fast compression is running. Well, this would let dar compress and encrypt several files at the same time and, assuming that reading and writing time is negligible compared to compression time (which remains to be demonstrated, as several files potentially have to be read at the same time), we could maybe have a real performance gain. But... while several files can now be compressed at the same time, only one can be written to disk at a given time. Thus, between the time the compression of a file starts and the time it finishes, all other threads have to keep their compressed data in memory. Then a next thread can drop its data to the archive while all the others keep compressing into memory (RAM). We will quickly run out of RAM! Either your computer will start to swap, or you have to store the data back to disk in a temporary file, which then has to be read again and written back to the archive. Doing so would bring huge disk performance degradation, as the disk would serve to read the file's data, write its compressed data to a temporary file, read that compressed data back, and write it to the archive.

Last, when using parallelization there is always a cost due to inter-thread communication and concurrent I/O operations on the hardware (here, the hard disk is used at the same time to read the files to back up and to write them into the archive). This cost becomes negligible when the number of parallel threads increases, assuming all threads are kept busy... and here lies the bottleneck: the archive creation itself, which seems to prevent any really impressive parallelization.

In conclusion, unless you can find another way to parallelize dar, a parallelized version of dar would not bring noticeable improvement. Parallelization is strongly related to the algorithm used: some algorithms are well adapted to it, some others are not.

Is libdar thread-safe, and in what way?

libdar is the part of dar's source code that has been restructured to be usable by external programs (like kdar). It has been modified to be used in a multi-threaded environment, so *yes*, libdar is thread-safe. However, thread-safe does not mean that you can skip all precautions in your programs while using libdar (or any other library).

Let's take an example, considering a simple library that provides two functions, both receiving the address of an integer as argument. The first increments the given integer until a specific user key is pressed, while the second decrements the given integer until another user key is pressed. This library is thread-safe in the sense that it holds no static variable and no particular state at any given time. It is just a set of two functions.

Now, your multi-threaded program is the following: at a given time you have one thread running the first library function while another runs the second. All will work fine unless you pass both threads the same integer. One thread would then increment it while the other decrements it, and you would not get the behavior you would expect in a single-threaded environment. The problem would be the same if, instead of using an external library, you accessed this same integer from two different threads at the same time.

Care must thus be taken that two different threads do not act on the same variables at the same time. This is made possible by POSIX mutexes, which delimit a portion of code (known as a critical section) that cannot be entered by a thread while another one is executing it (such a thread is suspended until the other thread exits the critical section).
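As a loose shell analogy of such a critical section (illustrative only, this is not libdar code), flock(1) below plays the role of a mutex protecting a counter file updated by two concurrent jobs; without the lock, concurrent read-modify-write cycles could lose updates:

```shell
dir=$(mktemp -d)
echo 0 > "$dir/counter"

bump() {                                  # increment the counter 100 times
    i=0
    while [ "$i" -lt 100 ]; do
        (
            flock 9                       # enter the critical section
            n=$(cat "$dir/counter")       # read ...
            echo $((n + 1)) > "$dir/counter"   # ... modify and write back
        ) 9> "$dir/lock"                  # leaving the subshell releases it
        i=$((i + 1))
    done
}

bump & bump &                             # two concurrent "threads"
wait
cat "$dir/counter"                        # prints 200: no update was lost
rm -r "$dir"
```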

For libdar this is the same: you must take care not to have two or more threads acting on the same data at the same time. Libdar provides a set of classes, which can be seen as types (like a C struct) with associated functions (known as methods in the object-oriented world). From these classes, your program creates objects: each object *is* a variable. Technically, invoking a method on an object is exactly the same as invoking a function and passing it a hidden pointer to the object; semantically, invoking a method is a way to read or modify this variable (= the object). Thus, if you plan to act on a given object from several threads at the same time, you must use a POSIX mutex or any other means of mutual exclusion between all your threads, so that only one thread may read or modify this variable (= this object) at a given time.
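The equivalence between a method call and a function receiving a hidden object pointer can be sketched like this (illustrative code, not libdar's API):

```cpp
// A method call on an object...
struct Counter {
    int v = 0;
    void add(int n) { v += n; }   // called as: c.add(3)
};

// ...is technically the same as a plain function receiving a pointer
// to the object as a hidden first argument:
void counter_add(Counter *self, int n) { self->v += n; }  // counter_add(&c, 3)
```

Either way, the call reads and modifies the variable (= the object), which is why a shared object needs the same mutex protection as a shared plain variable.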

Note that internally libdar uses some static variables, by which I mean variables that exist even when no thread is running a libdar function or method. These variables are enclosed in critical sections so that libdar's users may use the library normally; in other words, this is transparent to you. For example, the call-cancellation mechanism uses an array holding the tid (thread id) of each call that must be canceled: if you wish to cancel a libdar call run by thread 10, another thread adds the tid 10 to this list. At regular checkpoints, every libdar function checks whether this list contains the tid the call is run from; if so, the call aborts/returns and the thread can continue its execution outside libdar code. As you see, several threads may read or write this array of tids at the same time; thanks to a set of mutexes this is transparent to you, and for this reason libdar can be said to be thread-safe.
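The cancellation mechanism described above can be sketched as follows; the class and method names are hypothetical, not libdar's actual API:

```cpp
#include <mutex>
#include <set>
#include <thread>

// Hypothetical sketch of the tid list described above: one thread registers
// the tid of the call to cancel, while the thread running the libdar call
// polls the list at regular checkpoints. A mutex guards every access.
class CancelRegistry {
    std::set<std::thread::id> pending;  // tids of calls to be canceled
    std::mutex m;                       // critical section around 'pending'
public:
    void request_cancel(std::thread::id tid) {
        std::lock_guard<std::mutex> g(m);
        pending.insert(tid);
    }

    // Called at checkpoints by the thread running the call; returns true
    // (and clears the request) if this thread's call must abort.
    bool should_abort(std::thread::id tid) {
        std::lock_guard<std::mutex> g(m);
        return pending.erase(tid) > 0;
    }
};
```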

How to solve "configure: error: Cannot find size_t type"?

This error shows up when you lack support for C++ compilation. Check that the gcc compiler has been built with C++ support activated, or, if you are using a gcc binary from a distro, double-check that you have installed the C++ support package for gcc.
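One quick way to check is to try a trivial compilation by hand; this is a sketch assuming a POSIX shell, and the package names mentioned are only examples (they vary between distros):

```shell
# Check whether a C++ compiler is reachable, then try a trivial compilation.
if command -v g++ >/dev/null 2>&1; then
    echo 'int main() { return 0; }' > conftest.cpp
    if g++ conftest.cpp -o conftest; then
        echo "g++ works: C++ support is present"
    else
        echo "g++ found but compilation failed: check your C++ installation"
    fi
    rm -f conftest.cpp conftest
else
    echo "g++ not found: install the C++ compiler (e.g. package g++ or gcc-c++)"
fi
```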

How to search for questions (and their answers) about known problems similar to mines?

Before sending an email to the dar-support mailing-list, you are asked to first look through the already sent emails to see whether your problem has not already been exposed and solved. This is, first, the fastest way for you to get an answer to your problem, and, for me, a way to preserve time for development.

But yes, there are now tons of emails to read before having a chance to find the answer to your problem. Fortunately, there is a search engine at gmane (see the dark green area at the bottom of the page).

This search engine is available for all the dar-related mailing lists archived at gmane.

Why dar tells me that he failed to open a directory, while I have excluded this directory?

Reading the contents of a directory is done using the usual system calls (opendir/readdir/closedir). The first call (opendir) lets dar designate which directory to inspect, then dar calls readdir to get the next entry of the opened directory. Once nothing is left to read, closedir is called. The problem here is that dar cannot start reading a directory, do some processing, and then start reading another directory: the opendir/readdir/closedir system calls are not re-entrant.

This is particularly critical for dar, as it does a depth-first lookup of the directory tree. In other words, starting from the root, if we have two directories A and B, dar reads A's contents, then the contents of each of A's subdirectories; once finished, it reads the next entry of the root directory (which is B), then reads the contents of B and of each of its subdirectories; once finished with B, it must go back to the root again and read the next entry. In the meanwhile, dar has had to open many directories to get their contents.

For this reason dar caches the directory contents: when it first meets a directory, it reads its whole contents and stores them in RAM. Only after that does dar decide whether or not to include a given directory. But by that point its contents have already been read, so you may get the message that dar failed to read a given directory's contents even though you explicitly specified not to include that particular directory in the backup.
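A sketch of this "read first, filter afterwards" behavior (illustrative code, not dar's actual source):

```cpp
#include <dirent.h>
#include <string>
#include <vector>

// The whole directory is read into RAM *before* any inclusion/exclusion
// filter is consulted, mirroring the caching described above. If opendir()
// fails, this is where dar would report that it failed to open a directory,
// even one the user asked to exclude.
std::vector<std::string> read_dir_contents(const std::string &path) {
    std::vector<std::string> entries;
    DIR *d = opendir(path.c_str());
    if (d == nullptr)
        return entries;  // "failed to open directory" would be warned here
    for (dirent *e = readdir(d); e != nullptr; e = readdir(d)) {
        std::string name = e->d_name;
        if (name != "." && name != "..")
            entries.push_back(name);  // cached before any filtering decision
    }
    closedir(d);
    return entries;
}
```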