||
- .\".so aufs.tmac
- .
- .eo
- .de TQ
- .br
- .ns
- .TP \$1
- ..
- .de Bu
- .IP \(bu 4
- ..
- .ec
- .\" end of macro definitions
- .
- .\" ----------------------------------------------------------------------
- .TH aufs 5 \*[AUFS_VERSION] Linux "Linux Aufs User's Manual"
- .SH NAME
- aufs \- advanced multi layered unification filesystem. version \*[AUFS_VERSION]
- .\" ----------------------------------------------------------------------
- .SH DESCRIPTION
- Aufs is a stackable unification filesystem, like Unionfs, which unifies
- several directories and provides a single merged directory.
- In the early days, aufs was an entirely re-designed and re-implemented
- Unionfs Version 1.x series. After
- many original ideas, approaches and improvements, it
- became totally different from Unionfs while keeping the basic features.
- See Unionfs Version 1.x series for the basic features.
- Recently, the Unionfs Version 2.x series began taking some of the same
- approaches as aufs.
- .\" ----------------------------------------------------------------------
- .SH MOUNT OPTIONS
- At mount-time, the order of interpreting options is,
- .RS
- .Bu
- simple flags, except xino/noxino and udba=inotify
- .Bu
- branches
- .Bu
- xino/noxino
- .Bu
- udba=inotify
- .RE
- At remount-time,
- the options are interpreted in the given order,
- i.e. left to right.
- .RS
- .Bu
- create or remove
- whiteout-base(\*[AUFS_WH_BASE]) and
- whplink-dir(\*[AUFS_WH_PLINKDIR]) if necessary
- .RE
- .
- .TP
- .B br:BRANCH[:BRANCH ...] (dirs=BRANCH[:BRANCH ...])
- Adds new branches.
- (cf. Branch Syntax).
- Aufs rejects a branch which is an ancestor or a descendant of another
- branch. Such branches are called overlapped. When the branch is a
- loopback-mounted directory, aufs also checks the source fs-image file of the
- loopback device. If the source file is a descendant of another branch, it will
- be rejected too.
- After mounting aufs or adding a branch, if you move a branch under
- another branch and make it a descendant of that branch, aufs will not
- work correctly.
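- As a minimal sketch, the following mounts aufs with one writable and one
- readonly branch. The paths /rw, /ro and /aufs are hypothetical, and `none'
- is used as the device name only as a placeholder.
- .nf
- # mount -t aufs -o br:/rw=rw:/ro=ro none /aufs
- .fi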
- .
- .TP
- .B [ add | ins ]:index:BRANCH
- Adds a new branch.
- The index begins with 0.
- Aufs creates
- whiteout-base(\*[AUFS_WH_BASE]) and
- whplink-dir(\*[AUFS_WH_PLINKDIR]) if necessary.
- If there is the same named file on the lower branch (larger index),
- aufs will hide the lower file.
- You can only see the highest file.
- The result may be confusing if the added branch has whiteouts (including
- diropq): they may or may not hide the lower entries.
- .\" It is recommended to make sure that the added branch has no whiteout.
- Even if a process has already mapped a file by mmap(2) with MAP_SHARED
- and the same named file exists on the lower branch,
- the process still refers to the file on the lower (hidden)
- branch after adding the branch.
- If you want to update the contents of a process address space after
- adding, you need to restart your process or open/mmap the file again.
- .\" Usually, such files are executables or shared libraries.
- (cf. Branch Syntax).
- .
- .TP
- .B del:dir
- Removes a branch.
- Aufs does not remove
- whiteout-base(\*[AUFS_WH_BASE]) and
- whplink-dir(\*[AUFS_WH_PLINKDIR]) automatically.
- For example, when you add a RO branch which was unified as RW, you
- will see whiteout-base or whplink-dir on the added RO branch.
- If a process is referencing a file/directory on the branch being deleted
- (by open, mmap, current working directory, etc.), aufs will return an
- error EBUSY.
- .
- .TP
- .B mod:BRANCH
- Modifies the permission flags of the branch.
- Aufs creates or removes
- whiteout-base(\*[AUFS_WH_BASE]) and/or
- whplink-dir(\*[AUFS_WH_PLINKDIR]) if necessary.
- If the branch permission is being changed from `rw' to `ro', and a process
- is mapping a file by mmap(2)
- .\" with MAP_SHARED
- on the branch, the process may or may not
- be able to modify its mapped memory region after modifying branch
- permission flags.
- Additionally, when you enable CONFIG_IMA (in linux-2.6.30 and later), IMA
- may produce some wrong messages. But this is equivalent to the case where the
- filesystem is changed to `ro' in an emergency.
- (cf. Branch Syntax).
- .
- .TP
- .B append:BRANCH
- equivalent to `add:(last index + 1):BRANCH'.
- (cf. Branch Syntax).
- .
- .TP
- .B prepend:BRANCH
- equivalent to `add:0:BRANCH'.
- (cf. Branch Syntax).
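- The branch manipulation options above can be given at remount time.
- A minimal sketch, assuming an aufs mounted on the hypothetical mount point
- /aufs and hypothetical branch directories /ro2 and /ro3:
- .nf
- # mount -o remount,add:1:/ro2=ro /aufs
- # mount -o remount,append:/ro3 /aufs
- # mount -o remount,mod:/ro2=ro+wh /aufs
- # mount -o remount,del:/ro3 /aufs
- .fi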
- .
- .TP
- .B xino=filename
- Use external inode number bitmap and translation table.
- When CONFIG_AUFS_EXPORT is enabled, external inode generation table too.
- It is set to
- <FirstWritableBranch>/\*[AUFS_XINO_FNAME] by default, or
- \*[AUFS_XINO_DEFPATH].
- Comma character in filename is not allowed.
- The files are created per aufs and per branch filesystem, and then
- unlinked. So you
- cannot find these files, but they exist and are read/written frequently by
- aufs.
- (cf. External Inode Number Bitmap, Translation Table and Generation Table).
- If you enable CONFIG_SYSFS, the path of the xino files is not shown in
- /proc/mounts (and /etc/mtab); instead it is shown in
- <sysfs>/fs/aufs/si_<id>/xi_path.
- Otherwise, it is shown in /proc/mounts unless it is the default
- path.
- .
- .TP
- .B noxino
- Stop using external inode number bitmap and translation table.
- If you use this option,
- some applications will not work correctly.
- .\" And pseudo link feature will not work after the inode cache is
- .\" shrunk.
- (cf. External Inode Number Bitmap, Translation Table and Generation Table).
- .
- .TP
- .B trunc_xib
- Truncate the external inode number bitmap file. The truncation is done
- automatically when you delete a branch unless you specify the
- `notrunc_xib' option.
- (cf. External Inode Number Bitmap, Translation Table and Generation Table).
- .
- .TP
- .B notrunc_xib
- Stop truncating the external inode number bitmap file when you delete
- a branch.
- (cf. External Inode Number Bitmap, Translation Table and Generation Table).
- .
- .TP
- .B create_policy | create=CREATE_POLICY
- .TQ
- .B copyup_policy | copyup | cpup=COPYUP_POLICY
- Policies to select one among multiple writable branches. The default
- values are `create=tdp' and `cpup=tdp'.
- link(2) and rename(2) systemcalls have an exception. In aufs, they
- try keeping their operations in the branch where the source exists.
- (cf. Policies to Select One among Multiple Writable Branches).
- .
- .TP
- .B verbose | v
- Print some information.
- Currently, it only reports busy files (or inodes) when deleting a branch.
- .
- .TP
- .B noverbose | quiet | q | silent
- Disable `verbose' option.
- This is the default value.
- .
- .TP
- .B sum
- df(1)/statfs(2) returns the total number of blocks and inodes of
- all branches.
- Note that there are cases that systemcalls may return ENOSPC, even if
- df(1)/statfs(2) shows that aufs has some free space/inode.
- .
- .TP
- .B nosum
- Disable `sum' option.
- This is the default value.
- .
- .TP
- .B dirwh=N
- Watermark to actually remove a dir at rmdir(2) and rename(2).
- If the target dir which is being removed or renamed (destination dir)
- has a huge number of whiteouts, i.e. the dir is empty logically but not
- physically, the cost to remove/rename the single
- dir may be very high.
- Aufs is
- required to unlink all of the whiteouts internally before issuing
- rmdir/rename to the branch.
- To reduce the cost of a single systemcall,
- aufs renames the target dir to a whiteout-ed temporary name and
- invokes a pre-created
- kernel thread to remove the whiteout-ed children and the target dir.
- The rmdir/rename systemcall returns just after kicking the thread.
- When the number of whiteout-ed children is less than the value of
- dirwh, aufs removes them in a single systemcall instead of passing them to
- another thread.
- This value is ignored when the branch is NFS.
- The default value is \*[AUFS_DIRWH_DEF].
- .\" .
- .\" .TP
- .\" .B rdcache=N
- .
- .TP
- .B rdblk=N
- Specifies the size in bytes of the internal VDIR block which is allocated at
- a time.
- The VDIR block will be allocated several times when necessary. If your
- directory has millions of files, you may want to expand this size.
- The default value is defined as \*[AUFS_RDBLK_DEF].
- The size has to be larger than NAME_MAX (usually 255) and kmalloc\-able
- (the maximum limit depends on your system; at least 128KB is available
- for every system).
- You can reset the value to the default at any time by specifying rdblk=def.
- (cf. Virtual or Vertical Directory Block).
- .
- .TP
- .B rdhash=N
- Specifies the size of the internal VDIR hash table which is used to compare
- the file names under the same named directory on multiple branches.
- The VDIR hash table will be allocated in readdir(3)/getdents(2),
- rmdir(2) and rename(2) for the existing target directory. If your
- directory has millions of files, you may want to expand this size.
- The default value is defined as \*[AUFS_RDHASH_DEF].
- The size has to be larger than zero, and it will be multiplied by 4 or 8
- (for 32\-bit and 64\-bit respectively, currently). The result must be
- kmalloc\-able
- (the maximum limit depends on your system; at least 128KB is available
- for every system).
- You can reset the value to the default at any time by specifying rdhash=def.
- (cf. Virtual or Vertical Directory Block).
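- For example, to return both sizes to their defaults at remount time, as a
- minimal sketch assuming an aufs mounted on the hypothetical mount point /aufs:
- .nf
- # mount -o remount,rdblk=def /aufs
- # mount -o remount,rdhash=def /aufs
- .fi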
- .
- .TP
- .B plink
- .TQ
- .B noplink
- Specifies whether to use the `pseudo link' feature or not.
- The default is `plink', which means this feature is used.
- (cf. Pseudo Link)
- .
- .TP
- .B clean_plink
- Removes all pseudo-links in memory.
- In order to make pseudo-link permanent, use
- `auplink' utility just before one of these operations,
- unmounting aufs,
- using `ro' or `noplink' mount option,
- deleting a branch from aufs,
- adding a branch into aufs,
- or changing your writable branch to readonly.
- If you installed both of /sbin/mount.aufs and /sbin/umount.aufs, and your
- mount(8) and umount(8) support them,
- `auplink' utility will be executed automatically and flush pseudo-links.
- (cf. Pseudo Link)
- .
- .TP
- .B udba=none | reval | inotify
- Specifies the level of UDBA (User's Direct Branch Access) test.
- (cf. User's Direct Branch Access and Inotify Limitation).
- .
- .TP
- .B diropq=whiteouted | w | always | a
- Specifies whether mkdir(2) and rename(2) of a directory make the created
- directory `opaque' or not.
- In other words, whether or not to create `\*[AUFS_WH_DIROPQ]' under the
- created or renamed directory.
- When you specify diropq=w or diropq=whiteouted, aufs will not create
- it if the
- directory was not whiteouted or opaqued. If the directory was whiteouted
- or opaqued, the created or renamed directory will be opaque.
- When you specify diropq=a or diropq=always, aufs will always create
- it regardless of whether
- the directory was whiteouted/opaqued or not.
- The default value is diropq=w, which means not to create it when it is unnecessary.
- If you define CONFIG_AUFS_COMPAT at aufs compiling time, the default will be
- diropq=a.
- You need to consider this option if you are planning to add a branch later
- since `diropq' affects the same named directory on the added branch.
- .
- .TP
- .B warn_perm
- .TQ
- .B nowarn_perm
- When adding a branch, aufs will issue a warning about the uid/gid/permission of
- the branch directory being added,
- when they differ from the existing branch's. This difference may or
- may not impose a security risk.
- If you are sure that there is no problem and want to stop the warning,
- use `nowarn_perm' option.
- The default is `warn_perm' (cf. DIAGNOSTICS).
- .
- .TP
- .B shwh
- .TQ
- .B noshwh
- By default (noshwh), aufs doesn't show the whiteouts and
- they just hide the same named entries in the lower branches. The
- whiteouts themselves never appear.
- If you enable CONFIG_AUFS_SHWH and specify the `shwh' option, aufs
- will show you the names of the whiteouts
- while keeping their ability to hide the lower entries.
- Honestly speaking, I am rather confused by these `visible whiteouts'.
- But a user who originally requested this feature wrote a nice how-to
- document about this feature. See Tips file in the aufs CVS tree.
- .\" ----------------------------------------------------------------------
- .SH Module Parameters
- .TP
- .B nwkq=N
- The number of kernel threads named \*[AUFS_WKQ_NAME].
- Those threads stay in the system while the aufs module is loaded,
- and handle the special I/O requests from aufs.
- The default value is \*[AUFS_NWKQ_DEF].
- The special I/O requests from aufs include a part of copy-up, lookup,
- directory handling, pseudo-link, xino file operations and the
- delegated access to branches.
- For example, Unix filesystems allow you to rmdir(2) a directory which has no write
- permission bit, if its parent directory has the write permission bit. In aufs, the
- directory being removed may or may not have a whiteout or `dir opaque' mark as its
- child. And aufs needs to unlink(2) them before rmdir(2).
- Therefore aufs delegates the actual unlink(2) and rmdir(2) to another kernel
- thread which has been created already and has superuser privilege.
- If you enable CONFIG_SYSFS, you can check this value through
- <sysfs>/module/aufs/parameters/nwkq.
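- As a minimal sketch, the parameter can be given when loading the module with
- modprobe(8) and then read back. The value 4 is an arbitrary example, and /sys
- is assumed to be the directory where sysfs is mounted.
- .nf
- # modprobe aufs nwkq=4
- $ cat /sys/module/aufs/parameters/nwkq
- .fi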
- .
- .TP
- .B brs=1 | 0
- Specifies to use the branch path data file under sysfs or not.
- If the number of your branches is large or their paths are long
- and you hit the limitation of mount(8) or /etc/mtab, you need to
- enable CONFIG_SYSFS and set the aufs module parameter brs=1.
- When this parameter is set to 1, aufs does not show the `br:' (or dirs=)
- mount option through /proc/mounts (and /etc/mtab). So you can
- avoid the page limitation of
- mount(8) or /etc/mtab.
- Aufs shows branch paths through <sysfs>/fs/aufs/si_XXX/brNNN.
- Actually the file under sysfs also has a size limitation, but I don't
- think it is harmful.
- There is one more side effect of setting this parameter to 1.
- If you rename your branch, the branch path written in /etc/mtab will become
- obsolete and a future remount will meet some error due to the
- unmatched parameters (Remember that mount(8) may take the options from
- /etc/mtab and pass them to the systemcall).
- If you set 1, /etc/mtab will not hold the branch path and you will not
- meet such trouble. On the other hand, the entries for the
- branch path under sysfs are generated dynamically. So they never become obsolete.
- But I don't think users want to rename branches so often.
- If CONFIG_SYSFS is disabled, this parameter is always set to 0.
- .
- .TP
- .B sysrq=key
- Specifies MagicSysRq key for debugging aufs.
- You need to enable both of CONFIG_MAGIC_SYSRQ and CONFIG_AUFS_DEBUG.
- Currently this is for developers only.
- The default is `a'.
- .
- .TP
- .B debug= 0 | 1
- Specifies disable(0) or enable(1) debug print in aufs.
- This parameter can be changed dynamically.
- You need to enable CONFIG_AUFS_DEBUG.
- Currently this is for developers only.
- The default is `0' (disable).
- .\" ----------------------------------------------------------------------
- .SH Entries under Sysfs and Debugfs
- See linux/Documentation/ABI/*/{sys,debug}fs-aufs.
- .\" ----------------------------------------------------------------------
- .SH Branch Syntax
- .TP
- .B dir_path[ =permission [ + attribute ] ]
- .TQ
- .B permission := rw | ro | rr
- .TQ
- .B attribute := wh | nolwh
- dir_path is a directory path.
- The keyword after `dir_path=' is a
- permission flags for that branch.
- Comma, colon and the permission flags string (including `=') in the path
- are not allowed.
- Any filesystem can be a branch, with some exceptions which are not accepted, such as
- sysfs, procfs and unionfs.
- If you specify such a filesystem as an aufs branch, aufs will return an error
- saying it is unsupported.
- Cramfs in linux stable release has strange inodes and it makes aufs
- confused. For example,
- .nf
- $ mkdir -p w/d1 w/d2
- $ > w/z1
- $ > w/z2
- $ mkcramfs w cramfs
- $ sudo mount -t cramfs -o ro,loop cramfs /mnt
- $ find /mnt -ls
- 76 1 drwxr-xr-x 1 jro 232 64 Jan 1 1970 /mnt
- 1 1 drwxr-xr-x 1 jro 232 0 Jan 1 1970 /mnt/d1
- 1 1 drwxr-xr-x 1 jro 232 0 Jan 1 1970 /mnt/d2
- 1 1 -rw-r--r-- 1 jro 232 0 Jan 1 1970 /mnt/z1
- 1 1 -rw-r--r-- 1 jro 232 0 Jan 1 1970 /mnt/z2
- .fi
- Both directories and both files have the same inode number, with a link
- count of one. Aufs cannot handle such inodes correctly.
- Currently, aufs includes a tiny workaround for such inodes. But some
- applications may not work correctly since the aufs inode number for such an
- inode will change silently.
- If you do not have any empty files, empty directories or special files,
- inodes on cramfs will be all fine.
- A branch should not be shared as the writable branch between multiple
- aufs. A readonly branch can be shared.
- The maximum number of branches is configurable at compile time (127 by
- default).
- When an unknown permission or attribute is given, aufs sets ro to that
- branch silently.
- .SS Permission
- .
- .TP
- .B rw
- Readable and writable branch. Set as default for the first branch.
- If the branch filesystem is mounted as readonly, you cannot set it `rw.'
- .\" A filesystem which does not support link(2) and i_op\->setattr(), for
- .\" example FAT, will not be used as the writable branch.
- .
- .TP
- .B ro
- Readonly branch and it has no whiteouts on it.
- Set as default for all branches except the first one. Aufs never issues
- a write operation or a lookup operation for whiteouts to this branch.
- .
- .TP
- .B rr
- Real readonly branch, special case of `ro', for natively readonly
- branch. Assuming the branch is natively readonly, aufs can optimize
- some internal operation. For example, if you specify `udba=inotify'
- option, aufs does not set inotify for the things on rr branch.
- Set by default for a branch whose fs-type is either `iso9660',
- `cramfs' or `romfs' (and `squashfs' for linux\-2.6.29 and later).
- When your branch exists on a slower device and you have some
- capacity on your hdd, you may want to try the ulobdev tool in the ULOOP sample.
- It can cache the contents of the real devices on another faster device,
- so you will be able to get the better access performance.
- The ulobdev tool is for a generic block device, and the ulohttp is for a
- filesystem image on http server.
- If you want to spin down your hdd to save the
- battery life or something, then you may want to use ulobdev to save the
- access to the hdd, too.
- See $AufsCVS/sample/uloop for details.
- .SS Attribute
- .
- .TP
- .B wh
- Readonly branch and it has/might have whiteouts on it.
- Aufs never issues a write operation to this branch, but it does look up whiteouts.
- Use this as `<branch_dir>=ro+wh'.
- .
- .TP
- .B nolwh
- Usually, aufs creates a whiteout as a hardlink on a writable
- branch. This attribute prohibits aufs from creating hardlinked
- whiteouts, including the source file of all hardlinked whiteouts
- (\*[AUFS_WH_BASE]).
- If you do not like hardlinks, or your writable branch does not support
- link(2), then use this attribute.
- But I am afraid a filesystem which does not support link(2) natively
- will fail in other places such as copy-up.
- Use this as `<branch_dir>=rw+nolwh'.
- You may also want to try the `noplink' mount option, although it is not recommended.
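- .PP
- Putting the syntax together, here is a minimal sketch with hypothetical paths:
- /rw is writable without hardlinked whiteouts, /ro1 is a readonly branch which
- may contain whiteouts, and /cdrom is a natively readonly branch.
- .nf
- # mount -t aufs -o br:/rw=rw+nolwh:/ro1=ro+wh:/cdrom=rr none /aufs
- .fi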
- .\" .SS FUSE as a branch
- .\" A FUSE branch needs special attention.
- .\" The struct fuse_operations has a statfs operation. It is OK, but the
- .\" parameter is struct statvfs* instead of struct statfs*. So almost
- .\" all user\-space implementation will call statvfs(3)/fstatvfs(3) instead of
- .\" statfs(2)/fstatfs(2).
- .\" In glibc, [f]statvfs(3) issues [f]statfs(2), open(2)/read(2) for
- .\" /proc/mounts,
- .\" and stat(2) for the mountpoint. With this situation, a FUSE branch will
- .\" cause a deadlock in creating something in aufs. Here is a sample
- .\" scenario,
- .\" .\" .RS
- .\" .\" .IN -10
- .\" .Bu
- .\" create/modify a file just under the aufs root dir.
- .\" .Bu
- .\" aufs acquires a write\-lock for the parent directory, ie. the root dir.
- .\" .Bu
- .\" A library function or fuse internal may call statfs for a fuse branch.
- .\" The create=mfs mode in aufs will surely call statfs for each writable
- .\" branches.
- .\" .Bu
- .\" FUSE in kernel\-space converts and redirects the statfs request to the
- .\" user\-space.
- .\" .Bu
- .\" the user\-space statfs handler will call [f]statvfs(3).
- .\" .Bu
- .\" the [f]statvfs(3) in glibc will access /proc/mounts and issue
- .\" stat(2) for the mountpoint. But those require a read\-lock for the aufs
- .\" root directory.
- .\" .Bu
- .\" Then a deadlock occurs.
- .\" .\" .RE 1
- .\" .\" .IN
- .\"
- .\" In order to avoid this deadlock, I would suggest not to call
- .\" [f]statvfs(3) from fuse. Here is a sample code to do this.
- .\" .nf
- .\" struct statvfs stvfs;
- .\"
- .\" main()
- .\" {
- .\" statvfs(..., &stvfs)
- .\" or
- .\" fstatvfs(..., &stvfs)
- .\" stvfs.f_fsid = 0
- .\" }
- .\"
- .\" statfs_handler(const char *path, struct statvfs *arg)
- .\" {
- .\" struct statfs stfs
- .\"
- .\" memcpy(arg, &stvfs, sizeof(stvfs))
- .\"
- .\" statfs(..., &stfs)
- .\" or
- .\" fstatfs(..., &stfs)
- .\"
- .\" arg->f_bfree = stfs.f_bfree
- .\" arg->f_bavail = stfs.f_bavail
- .\" arg->f_ffree = stfs.f_ffree
- .\" arg->f_favail = /* any value */
- .\" }
- .\" .fi
- .\" ----------------------------------------------------------------------
- .SH External Inode Number Bitmap, Translation Table and Generation Table (xino)
- By default, aufs uses one external bitmap file per aufs and one external
- inode number translation table file per branch
- filesystem.
- Additionally when CONFIG_AUFS_EXPORT is enabled, one external inode
- generation table is added.
- The bitmap (and the generation table) is for recycling aufs inode numbers
- and the others
- are tables for converting an inode number on a branch to
- an aufs inode number. The default path
- is `first writable branch'/\*[AUFS_XINO_FNAME].
- If there is no writable branch, the
- default path
- will be \*[AUFS_XINO_DEFPATH].
- .\" A user who executes mount(8) needs the privilege to create xino
- .\" file.
- If you enable CONFIG_SYSFS, the path of the xino files is not shown in
- /proc/mounts (and /etc/mtab); instead it is shown in
- <sysfs>/fs/aufs/si_<id>/xi_path.
- Otherwise, it is shown in /proc/mounts unless it is the default
- path.
- Those files are always open and are read/written frequently by aufs.
- If your writable branch is on flash memory device, it is recommended
- to put xino files on other than flash memory by specifying `xino='
- mount option.
- The
- maximum file size of the bitmap is, basically, the amount of the
- number of all the files on all branches divided by 8 (the number of
- bits in a byte).
- For example, on a 4KB page size system, if you have 32,768 (or
- 2,599,968) files in aufs world,
- then the maximum file size of the bitmap is 4KB (or 320KB).
- The
- maximum file size of the table will
- be `max inode number on the branch x size of an inode number'.
- For example in 32bit environment,
- .nf
- $ df -i /branch_fs
- /dev/hda14 2599968 203127 2396841 8% /branch_fs
- .fi
- and /branch_fs is a branch of the aufs. When the inode numbers are
- assigned contiguously (without `hole'), the maximum xino file size for
- /branch_fs will be 2,599,968 x 4 bytes = about 10 MB. But not all of
- its disk blocks may actually be allocated.
- When the inode numbers are assigned discontiguously, the maximum size of
- the xino file will be the largest inode number on a branch x 4 bytes.
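- The size estimates above can be worked through as follows, for 2,599,968
- files (or inodes) in a 32bit environment with 4KB pages. Rounding the bitmap
- up to whole pages is an assumption made here to match the 320KB figure.
- .nf
- bitmap: 2,599,968 / 8 = 324,996 bytes, rounded up to 80 pages x 4KB = 320KB
- table:  2,599,968 x 4 bytes = 10,399,872 bytes (about 10 MB)
- .fi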
- Additionally, the file size is limited to LLONG_MAX or the s_maxbytes
- in filesystem's superblock (s_maxbytes may be smaller than
- LLONG_MAX). So the
- support-able largest inode number on a branch is less than
- 2305843009213693950 (LLONG_MAX/4\-1).
- This is the current limitation of aufs.
- In a 64bit environment, this limitation becomes stricter and the
- largest supported inode number is less than LLONG_MAX/8\-1.
- The xino files are always hidden, i.e. removed. So you cannot
- do `ls \-l xino_file'.
- If you enable CONFIG_DEBUG_FS, you can check this information through
- <debugfs>/aufs/<si_id>/{xib,xi[0-9]*,xigen}. xib is for the bitmap file,
- xi0 is for the first branch, and xi1 is for the next. xigen is for the
- generation table.
- xib and xigen are in the format of,
- .nf
- <blocks>x<block size> <file size>
- .fi
- Note that a filesystem usually has a
- feature called pre-allocation, which means a number of
- blocks are allocated automatically, and then deallocated
- silently when the filesystem thinks they are unnecessary.
- Do not be surprised by sudden changes in the number of
- blocks when the filesystem on which the xino files are placed supports the
- pre-allocation feature.
- The rest are hidden xino file information in the format of,
- .nf
- <file count>, <blocks>x<block size> <file size>
- .fi
- If the file count is larger than 1, it means some of your branches are
- on the same filesystem and the xino file is shared by them.
- Note that the file size may not be equal to the blocks actually consumed
- since the xino file is a sparse file, i.e. it has holes which do not
- consume any disk blocks.
- Once you unmount aufs, the xino files for that aufs are totally gone.
- It means that the inode number is not permanent across umount or
- shutdown.
- The xino files should be created on a filesystem other than NFS.
- If your first writable branch is NFS, you will need to specify an xino
- file path which is not on NFS.
- Also if you are going to remove the branch where the xino files exist or
- change the branch permission to readonly, you need to use the xino option
- before del/mod of the branch.
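- A minimal sketch of this order, assuming an aufs on the hypothetical mount
- point /aufs whose xino files currently live on the hypothetical writable
- branch /rw, and a hypothetical new xino path /var/tmp/aufs.xino on a non-NFS
- filesystem:
- .nf
- # mount -o remount,xino=/var/tmp/aufs.xino /aufs
- # auplink /aufs flush
- # mount -o remount,mod:/rw=ro /aufs
- .fi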
- The bitmap file can be truncated.
- For example, if you delete a branch which has huge number of files,
- many inode numbers will be recycled and the bitmap will be truncated
- to smaller size. Aufs does this automatically when a branch is
- deleted.
- You can truncate it anytime you like if you specify `trunc_xib' mount
- option. But when the accessed inode number was not deleted, nothing
- will be truncated.
- If you do not want to truncate it (it may be slow) when you delete a
- branch, specify `notrunc_xib' after `del' mount option.
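- For example, with an aufs on the hypothetical mount point /aufs and a
- hypothetical branch /ro2, the first command below truncates the bitmap on
- demand and the second deletes a branch without truncating it:
- .nf
- # mount -o remount,trunc_xib /aufs
- # mount -o remount,del:/ro2,notrunc_xib /aufs
- .fi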
- If you do not want to use xino, use noxino mount option. Use this
- option with care, since the inode number may be changed silently and
- unexpectedly anytime.
- For example, this can happen after
- an rmdir failure, a recursive chmod/chown/etc. on a large and deep directory,
- or anything else.
- Some applications will then not work correctly.
- .\" When the inode number has been changed, your system
- .\" can be crazy.
- If you want to change the xino default path, use xino mount option.
- After you add branches, the persistence of inode number may not be
- guaranteed.
- At remount time, cached but unused inodes are discarded.
- And the newly appeared inode may have different inode number at the
- next access time. The inodes in use have the persistent inode number.
- When aufs assigned an inode number to a file, and if you create the
- same named file on the upper branch directly, then the next time you
- access the file, aufs may assign another inode number to the file even
- if you use xino option.
- Some applications may treat the file whose inode number has been
- changed as totally different file.
- .\" ----------------------------------------------------------------------
- .SH Pseudo Link (hardlink over branches)
- Aufs supports `pseudo link' which is a logical hard-link over
- branches (cf. ln(1) and link(2)).
- In other words, it covers a file copied-up by link(2) and a copied-up file which was
- hard-linked on a readonly branch filesystem.
- When you have files named fileA and fileB which are
- hardlinked on a readonly branch, if you write something into fileA,
- aufs copies-up fileA to a writable branch, and write(2) the originally
- requested thing to the copied-up fileA. On the writable branch,
- fileA is not hardlinked.
- But aufs remembers it was hardlinked, and handles fileB as if it existed
- on the writable branch, by referencing fileA's inode on the writable
- branch as fileB's inode.
- Once you unmount aufs, the plink info for that aufs kept in memory is totally
- gone.
- It means that the pseudo-link is not permanent.
- If you want to make plink permanent, try `auplink' utility just before
- one of these operations,
- unmounting your aufs,
- using `ro' or `noplink' mount option,
- deleting a branch from aufs,
- adding a branch into aufs,
- or changing your writable branch to readonly.
- This utility reproduces all real hardlinks on a writable branch by linking
- them, and removes the pseudo-link info in memory and the temporary links on the
- writable branch.
- Since this utility accesses your branches directly, you cannot hide them by
- `mount \-\-bind /tmp /branch' or something.
- If you are willing to rebuild your aufs with the same branches later, you
- should use auplink utility before you umount your aufs.
- If you installed both of /sbin/mount.aufs and /sbin/umount.aufs, and your
- mount(8) and umount(8) support them,
- `auplink' utility will be executed automatically and flush pseudo-links.
- .nf
- # auplink /your/aufs/root flush
- # umount /your/aufs/root
- or
- # auplink /your/aufs/root flush
- # mount -o remount,mod:/your/writable/branch=ro /your/aufs/root
- or
- # auplink /your/aufs/root flush
- # mount -o remount,noplink /your/aufs/root
- or
- # auplink /your/aufs/root flush
- # mount -o remount,del:/your/aufs/branch /your/aufs/root
- or
- # auplink /your/aufs/root flush
- # mount -o remount,append:/your/aufs/branch /your/aufs/root
- .fi
- The plinks are kept both in memory and on disk. When they consume too many
- resources on your system, you can use the `auplink' utility at any time and
- safely throw away the unnecessary pseudo-links.
- Additionally, the `auplink' utility is very useful for some security reasons.
- For example, suppose you have a directory whose permission flags
- are 0700, and a file whose permission flags are 0644 under the 0700 directory. Usually,
- all files under the 0700 directory are private and no one else can see
- the file. But when the directory is 0711 and someone else knows the 0644
- filename, they can read the file.
- Basically, aufs pseudo-link feature creates a temporary link under the
- directory whose owner is root and the permission flags are 0700.
- But when the writable branch is NFS, aufs sets 0711 to the directory.
- When the 0644 file is pseudo-linked, the temporary link, of course the
- contents of the file is totally equivalent, will be created under the
- 0711 directory. The filename will be generated by its inode number.
- While it is hard to know the generated filename, someone else may try to peek at
- the temporary pseudo-linked file with a software tool which tries every name
- from one to MAX_INT or something.
- In this case, the 0644 file will be read unexpectedly.
- I am afraid that leaving the temporary pseudo-links can be a security hole.
- It makes sense to execute `auplink /your/aufs/root flush'
- periodically, when your writable branch is NFS.
- When your writable branch is not NFS, or all users are careful enough to set 0600
- to their private files, you do not have to worry about this issue.
- If you do not want this feature, use `noplink' mount option.
- .SS The behaviours of plink and noplink
- This sample shows that the `f_src_linked2' with `noplink' option cannot follow
- the link.
- .nf
- none on /dev/shm/u type aufs (rw,xino=/dev/shm/rw/.aufs.xino,br:/dev/shm/rw=rw:/dev/shm/ro=ro)
- $ ls -li ../r?/f_src_linked* ./f_src_linked* ./copied
- ls: ./copied: No such file or directory
- 15 -rw-r--r-- 2 jro jro 2 Dec 22 11:03 ../ro/f_src_linked
- 15 -rw-r--r-- 2 jro jro 2 Dec 22 11:03 ../ro/f_src_linked2
- 22 -rw-r--r-- 2 jro jro 2 Dec 22 11:03 ./f_src_linked
- 22 -rw-r--r-- 2 jro jro 2 Dec 22 11:03 ./f_src_linked2
- $ echo abc >> f_src_linked
- $ cp f_src_linked copied
- $ ls -li ../r?/f_src_linked* ./f_src_linked* ./copied
- 15 -rw-r--r-- 2 jro jro 2 Dec 22 11:03 ../ro/f_src_linked
- 15 -rw-r--r-- 2 jro jro 2 Dec 22 11:03 ../ro/f_src_linked2
- 36 -rw-r--r-- 2 jro jro 6 Dec 22 11:03 ../rw/f_src_linked
- 53 -rw-r--r-- 1 jro jro 6 Dec 22 11:03 ./copied
- 22 -rw-r--r-- 2 jro jro 6 Dec 22 11:03 ./f_src_linked
- 22 -rw-r--r-- 2 jro jro 6 Dec 22 11:03 ./f_src_linked2
- $ cmp copied f_src_linked2
- $
- none on /dev/shm/u type aufs (rw,xino=/dev/shm/rw/.aufs.xino,noplink,br:/dev/shm/rw=rw:/dev/shm/ro=ro)
- $ ls -li ../r?/f_src_linked* ./f_src_linked* ./copied
- ls: ./copied: No such file or directory
- 17 -rw-r--r-- 2 jro jro 2 Dec 22 11:03 ../ro/f_src_linked
- 17 -rw-r--r-- 2 jro jro 2 Dec 22 11:03 ../ro/f_src_linked2
- 23 -rw-r--r-- 2 jro jro 2 Dec 22 11:03 ./f_src_linked
- 23 -rw-r--r-- 2 jro jro 2 Dec 22 11:03 ./f_src_linked2
- $ echo abc >> f_src_linked
- $ cp f_src_linked copied
- $ ls -li ../r?/f_src_linked* ./f_src_linked* ./copied
- 17 -rw-r--r-- 2 jro jro 2 Dec 22 11:03 ../ro/f_src_linked
- 17 -rw-r--r-- 2 jro jro 2 Dec 22 11:03 ../ro/f_src_linked2
- 36 -rw-r--r-- 1 jro jro 6 Dec 22 11:03 ../rw/f_src_linked
- 53 -rw-r--r-- 1 jro jro 6 Dec 22 11:03 ./copied
- 23 -rw-r--r-- 2 jro jro 6 Dec 22 11:03 ./f_src_linked
- 23 -rw-r--r-- 2 jro jro 6 Dec 22 11:03 ./f_src_linked2
- $ cmp copied f_src_linked2
- cmp: EOF on f_src_linked2
- $
- .fi
- .\"
- .\" If you add/del a branch, or link/unlink the pseudo-linked
- .\" file on a branch
- .\" directly, aufs cannot keep the correct link count, but the status of
- .\" `pseudo-linked.'
- .\" Those files may or may not keep the file data after you unlink the
- .\" file on the branch directly, especially the case of your branch is
- .\" NFS.
- If you add a branch which has fileA or fileB, aufs does not follow the
- pseudo link. The file on the added branch has no relation to the same
- named file(s) on the lower branch(es).
- If you use the noxino mount option, pseudo link will not work after the
- kernel shrinks the inode cache.
- This feature will not work for squashfs before version 3.2 since its
- inode is tricky.
- When an inode is hardlinked, the squashfs inodes have the same inode
- number and the correct link count, but the in-memory inode objects are
- different. Squashfs (before v3.2) generates an inode for each name, even
- if they are hardlinked.
- .\" ----------------------------------------------------------------------
- .SH User's Direct Branch Access (UDBA)
- UDBA means modifying a branch filesystem manually or directly,
- i.e. bypassing aufs.
- While aufs is designed and implemented to be safe after UDBA,
- it can confuse you and your aufs, and some information such as the
- aufs inode will be incorrect.
- For example, if you rename a file on a branch directly, the file on
- aufs may
- or may not be accessible through both the old and the new name,
- because aufs caches various information about the files on
- branches and the cache still remains after UDBA.
- Aufs has a mount option named `udba' which specifies the level of testing at
- access time for whether UDBA has happened.
- .
- .TP
- .B udba=none
- Aufs trusts the dentry and the inode cache on the system, and never
- tests for UDBA. With this option, aufs runs fastest, but it may show
- you incorrect data.
- Additionally, if you often modify a branch
- directly, aufs will not be able to trace the changes of inodes on the
- branch. It can be a cause of wrong behaviour, deadlock or anything else.
- It is recommended to use this option only when you are sure that
- nobody accesses a file on a branch directly.
- It might be difficult for you to achieve a real `no UDBA' world when you
- cannot stop your users doing `find / \-ls' or something.
- If you really want to forbid UDBA for all of your users, here is a trick
- for it.
- With this trick, users cannot see the
- branches directly and aufs runs with no problem, except for the `auplink' utility.
- But if you are not familiar with aufs, this trick may
- confuse you.
- .nf
- # d=/tmp/.aufs.hide
- # mkdir $d
- # for i in $branches_you_want_to_hide
- > do
- > mount -n --bind $d $i
- > done
- .fi
- When you unmount the aufs, delete/modify the branch by remount, or you
- want to show the hidden branches again, unmount the bound
- /tmp/.aufs.hide.
- .nf
- # umount -n $branches_you_want_to_unbound
- .fi
- If you use a FUSE filesystem which supports hardlinks as an aufs branch,
- you should not set this option, since FUSE makes an inode object for
- each hardlink (at least in linux\-2.6.23). When your FUSE filesystem
- maintains them at link/unlink time, it is equivalent
- to `direct branch access' for aufs.
- .
- .TP
- .B udba=reval
- Aufs tests only the existence of a file which existed before. If
- the file was removed on the branch directly, aufs
- discards the cache about the file and
- looks it up again. So the data will be updated.
- This test is kept at the minimum level which preserves performance and ensures the
- existence of a file.
- This is the default, and aufs still runs fast.
- This rule leads to some unexpected situations, but I hope they are
- harmless. They depend totally upon the cache. Here are just a few
- examples.
- .
- .RS
- .Bu
- If the file is cached as negative or
- non-existent, aufs does not test it, and the file is still handled as
- negative after a user creates the file on a branch directly. If the
- file is not cached, aufs will look it up normally and find the file.
- .
- .Bu
- When the file is cached as positive or existing, and a user creates a
- file with the same name directly on the upper branch, aufs detects that the cached
- inode of the file still exists and will show you the old (cached)
- file which is on the lower branch.
- .
- .Bu
- When the file is cached as positive or existing, and a user renames the
- file by rename(2) directly, aufs detects that the inode of the file
- still exists. You may or may not see both the old and the new files.
- Todo: If aufs also tests the name, we can detect this case.
- .RE
- If your outer modification (UDBA) is rare and you can ignore the
- temporary and minor differences between virtual aufs world and real
- branch filesystem, then try this mount option.
- .
- .TP
- .B udba=inotify
- Aufs sets `inotify' on all the accessed directories on its branches
- and receives the events about each dir and its children. It consumes
- resources, cpu and memory. And I am afraid that the performance will be
- hurt, but it is the strictest test level.
- There are some limitations of linux inotify, see also Inotify
- Limitation.
- So it is recommended to leave udba at its default usually, and set it
- to inotify by remount when you need it.
- When a user accesses a file for which UDBA was notified before, the cached data
- about the file will be discarded and aufs looks it up again. So the data will
- be updated.
- When an error condition occurs between UDBA and aufs operation, aufs
- will return an error, including EIO.
- To use this option, you need to enable CONFIG_INOTIFY and
- CONFIG_AUFS_UDBA_INOTIFY.
- Renaming or removing a directory on a branch directly may reveal the same named
- directory on the lower branch. Aufs tries to look up the renamed
- directory and the revealed directory again and to assign different inode
- numbers to them. But the inode numbers, including those of their children, can be a
- problem. The inode numbers will be changed silently, and
- aufs may produce a warning. If you rename a directory repeatedly and
- reveal/hide the lower directory, then aufs may confuse their inode
- numbers too. It depends upon the system cache.
- When you make a directory in aufs and mount another filesystem on it,
- the directory in aufs cannot be removed, as expected, because it is a
- mount point. But the same named directory on the writable branch can
- be removed, if someone wants. It is just an empty directory, not
- a mount point.
- Aufs cannot stop such a direct rmdir, but produces a warning about it.
- If a pseudo-linked file is hardlinked or unlinked on the branch
- directly, its inode link count in aufs may be incorrect. It is
- recommended to flush the pseudo-links with the auplink script.
- .\" ----------------------------------------------------------------------
- .SH Linux Inotify Limitation
- Unfortunately, current inotify (linux\-2.6.18) has some limitations,
- and aufs inherits them.
- .SS IN_ATTRIB, updating atime
- When a file/dir on a branch is accessed directly, the inode atime (access
- time, cf. stat(2)) may or may not be updated. In some cases, inotify
- does not fire this event. So the aufs inode atime may remain old.
- .SS IN_ATTRIB, updating nlink
- When the link count of a file on a branch is incremented by link(2)
- directly,
- inotify fires IN_CREATE to the parent
- directory, but not IN_ATTRIB to the file. So the aufs inode nlink may
- remain old.
- .SS IN_DELETE, removing file on NFS
- When a file on an NFS branch is deleted directly, inotify may or may
- not fire
- the IN_DELETE event. It depends upon the status of the dentry
- (DCACHE_NFSFS_RENAMED flag).
- When the event is not fired, the file on aufs still seems to exist. Aufs and any user can see
- the file.
- .SS IN_IGNORED, deleted rename target
- When a file/dir on a branch is unlinked by rename(2) directly, inotify
- fires IN_IGNORED which means the inode is deleted. Actually, in some
- cases, the inode survives, for example when the rename target is linked or
- opened. In this case, the inotify watch set by aufs is removed by VFS and
- inotify,
- and aufs cannot receive the events anymore. So aufs may show you
- incorrect data about the file/dir.
- .\" ----------------------------------------------------------------------
- .SH Virtual or Vertical Directory Block (VDIR)
- In order to provide the merged view of file listing, aufs builds an
- internal directory block in memory. For readdir, aufs performs readdir()
- internally for each dir on branches, merges their entries while
- eliminating the whiteout\-ed ones, and attaches the result to the opened file (dir)
- object. So the file object has its entry list until it is closed. The
- entry list will be updated when the file position is zero (by
- rewinddir(3)) and the list has become obsolete.
- Some people may say it can be a security hole or invite a DoS attack
- since the opened and once readdir\-ed dir (file object) holds its entry
- list and puts pressure on system memory. But I would say it is similar
- to files under /proc or /sys. The virtual files in them also hold a
- memory page (generally) while they are opened. When an idea to reduce
- memory for them is introduced, it will be applied to aufs too.
- The dynamically allocated memory block for the names of entries has a
- unit of \*[AUFS_RDBLK_DEF] bytes by default.
- While building dir blocks, aufs creates a hash list (hashed and divided by
- \*[AUFS_RDHASH_DEF] by default) and judges whether
- the entry is whiteouted by its upper branch or already listed.
- These values are suitable for normal environments. But you may have
- millions of files or very long filenames under a single directory. For
- such cases, you may need to customize these values by specifying rdblk=
- and rdhash= aufs mount options.
- For instance, there are 97 files under my /bin, and the total name
- length is 597 bytes.
- .nf
- $ \\ls -1 /bin | wc
- 97 97 597
- .fi
- Strictly speaking, 97 end\-of\-line codes are
- included. But it is OK since aufs VDIR also stores the name length in 1
- byte. In this case, you do not need to customize the default values. The 597 bytes
- of filenames will be stored in 2 VDIR memory blocks (597 <
- \*[AUFS_RDBLK_DEF] x 2).
- And 97 filenames are distributed among \*[AUFS_RDHASH_DEF] lists, so one
- list will point to 4 names on average. To judge whether a name is whiteouted or
- not, the number of comparisons will be 4. 2 memory allocations
- and 4 comparisons cost little (even if the directory is opened for a long
- time). So you do not need to customize.
- If your directory has millions of files, then you will need to specify
- rdblk= and rdhash=.
- .nf
- $ ls -U /mnt/rotating-rust | wc -l
- 1382438
- .fi
- In this case, assuming the average length of filenames is 6, in order to
- get better time performance I would
- recommend setting rdblk to $((128*1024)) or $((64*1024)), and
- rdhash to $((8*1024)) or $((4*1024)).
- You can change these values of the active aufs mount by "mount -o
- remount".
- This customization is not for
- reducing the memory space, but for reducing the time spent on memory
- allocations and name comparisons. A larger value is faster, in
- general. Of course, you will need system memory. This is a generic
- "time\-vs\-space" problem.
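- The estimates in this section can be reproduced with a few lines of POSIX
- shell. This is only a rough sketch of the simple model used in the text
- (one extra byte per name, names spread evenly over the hash lists). The
- values 512 for rdblk and 32 for rdhash are assumptions about the defaults,
- and the helper names ceil_div and vdir_estimate are made up for this
- illustration. Check the values of your own aufs build.
- .nf

```shell
#!/bin/sh
# Rough VDIR cost model following the estimate in the text.
# rdblk=512 and rdhash=32 are ASSUMED defaults, not taken from the source.

# ceil_div A B: print A/B rounded up, using POSIX shell arithmetic.
ceil_div() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

# vdir_estimate NFILES NAME_BYTES RDBLK RDHASH
# NAME_BYTES already includes one byte per name, as "ls -1 | wc -c" reports.
vdir_estimate() {
    blocks=$(ceil_div "$2" "$3")
    chain=$(ceil_div "$1" "$4")
    echo "blocks=$blocks compares=$chain"
}

# The /bin example: 97 names, 597 bytes.
vdir_estimate 97 597 512 32
# The large directory: 1382438 names of about 6+1 bytes each.
vdir_estimate 1382438 $((1382438 * 7)) $((128 * 1024)) $((8 * 1024))
```

- .fi
- With the assumed defaults the first call prints `blocks=2 compares=4',
- which matches the numbers given for /bin.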
- .\" ----------------------------------------------------------------------
- .SH Copy On Write, or aufs internal copyup and copydown
- Every stackable filesystem which implements copy\-on\-write supports the
- copyup feature. The feature is to copy a file/dir from the lower branch
- to the upper internally. When you have one readonly branch and one
- upper writable branch, and you append a string to a file which exists on
- the readonly branch, then aufs will copy the file from the readonly
- branch to the writable branch with its directory hierarchy. It means one
- write(2) involves several logical/internal mkdir(2), creat(2), read(2),
- write(2) and close(2) systemcalls
- before the actual expected write(2) is performed. Sometimes it may take
- a long time, particularly when the file is very large.
- If CONFIG_AUFS_DEBUG is enabled, aufs produces a message saying `copying
- a large file.'
- You may see the message when you change the xino file path or
- truncate the xino/xib files. Sometimes those files can be large and may
- take a long time to handle them.
- .\" ----------------------------------------------------------------------
- .SH Policies to Select One among Multiple Writable Branches
- Aufs has some policies to select one among multiple writable branches
- when you are going to write/modify something. There are two kinds of
- policies, one for creating something new and the other for
- internal copy-up.
- You can select them by specifying the mount option `create=CREATE_POLICY'
- or `cpup=COPYUP_POLICY.'
- These policies have no meaning when you have only one writable
- branch. If they had any effect there, it would only hurt the performance.
- .SS Exceptions for Policies
- In every case below, even if the policy says that the branch where a
- new file should be created is /rw2, the file will be created on /rw1.
- .
- .Bu
- If there is a readonly branch with the `wh' attribute above the
- policy-selected branch and the parent dir is marked as opaque,
- or the target (creating) file is whiteouted on the ro+wh branch, then
- the policy will be ignored and the target file will be created on the
- nearest writable branch above the ro+wh branch.
- .RS
- .nf
- /aufs = /rw1 + /ro+wh/diropq + /rw2
- /aufs = /rw1 + /ro+wh/wh.tgt + /rw2
- .fi
- .RE
- .
- .Bu
- If there is a writable branch above the policy-selected branch and the
- parent dir is marked as opaque or the target file is whiteouted on the
- branch, then the policy will be ignored and the target file will be
- created on the highest one among the upper writable branches which have
- diropq or whiteout. In case of whiteout, aufs removes it as usual.
- .RS
- .nf
- /aufs = /rw1/diropq + /rw2
- /aufs = /rw1/wh.tgt + /rw2
- .fi
- .RE
- .
- .Bu
- link(2) and rename(2) systemcalls are exceptions in every policy.
- They try to select the branch where the source exists whenever possible, since
- copying up a large file will take a long time. If that is not possible, i.e. the
- branch where the source exists is readonly, then they will follow the
- copyup policy.
- .
- .Bu
- There is an exception for rename(2) when the target exists.
- If the rename target exists, aufs compares the indices of the branches
- where the source and the target exist and selects the higher
- one. If the selected branch is readonly, then aufs follows the copyup
- policy.
- .SS Policies for Creating
- .
- .TP
- .B create=tdp | top\-down\-parent
- Selects the highest writable branch where the parent dir exists. If
- the parent dir does not exist on a writable branch, then the internal
- copyup will happen. The policy for this copyup is always `bottom-up.'
- This is the default policy.
- .
- .TP
- .B create=rr | round\-robin
- Selects a writable branch in round robin. When you have two writable
- branches and create 10 new files, 5 files will be created on each
- branch.
- The mkdir(2) systemcall is an exception. When you create 10 new directories,
- all are created on the same branch.
- .
- .TP
- .B create=mfs[:second] | most\-free\-space[:second]
- Selects the writable branch which has the most free space. In order to keep
- the performance, you can specify the duration (`second') which makes
- aufs hold the index of the last selected writable branch until the
- specified number of seconds expires. The first time you create something in aufs
- after the specified seconds have expired, aufs checks the amount of free
- space of all writable branches by an internal statfs call
- and the held branch index will be updated.
- The default value is \*[AUFS_MFS_SECOND_DEF] seconds.
- .
- .TP
- .B create=mfsrr:low[:second]
- Selects a writable branch in most-free-space mode first, and then
- round-robin mode. If the selected branch has less free space than the
- specified value `low' in bytes, then aufs re-tries in round-robin mode.
- .\" `G', `M' and `K' (case insensitive) can be followed after `low.' Or
- Try the shell arithmetic expansion which is defined by POSIX.
- For example, $((10 * 1024 * 1024)) for 10M.
- You can also specify the duration (`second') which is equivalent to
- that of the `mfs' mode.
- .
- .TP
- .B create=pmfs[:second]
- Selects a writable branch where the parent dir exists, as in tdp
- mode. When the parent dir exists on multiple writable branches, aufs
- selects the one which has the most free space, as in mfs mode.
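- The idea behind the `mfs' family of policies can be illustrated in user
- space. The following is only a sketch under assumptions: aufs itself makes
- the decision in the kernel by an internal statfs call, while this example
- asks df(1) instead. The helper names avail_kb and pick_mfs are hypothetical
- and are not part of aufs or its utilities.
- .nf

```shell
#!/bin/sh
# User-space illustration of the "most free space" selection.
# pick_mfs is a hypothetical helper, not an aufs tool.

# avail_kb DIR: available 1024-byte blocks on the filesystem holding DIR.
avail_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# pick_mfs DIR...: print the first directory with the largest free space.
pick_mfs() {
    best=
    best_kb=-1
    for dir in "$@"; do
        kb=$(avail_kb "$dir")
        if [ "$kb" -gt "$best_kb" ]; then
            best=$dir
            best_kb=$kb
        fi
    done
    echo "$best"
}

# Example: choose between two candidate branch directories.
pick_mfs /tmp /
```

- .fi
- Calling df for every new file would be slow, which is the reason why the
- real policy caches the selected branch for the given number of seconds.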
- .SS Policies for Copy-Up
- .
- .TP
- .B cpup=tdp | top\-down\-parent
- Equivalent to the same named policy for create.
- This is the default policy.
- .
- .TP
- .B cpup=bup | bottom\-up\-parent
- Selects the writable branch where the parent dir exists and which
- is the nearest one above the copyup-source.
- .
- .TP
- .B cpup=bu | bottom\-up
- Selects the nearest upper writable branch from the copyup-source,
- regardless of the existence of the parent dir.
- .\" ----------------------------------------------------------------------
- .SH Exporting Aufs via NFS
- Aufs supports NFS-exporting.
- Since aufs has no actual block device, you need to add the NFS `fsid' option when
- exporting. Refer to the NFS manual for the details of this option.
- There are some limitations or requirements.
- .RS
- .Bu
- The branch filesystem must support NFS-exporting.
- .Bu
- NFSv2 is not supported. When you mount the exported aufs from your NFS
- client, you will need to specify some NFS options like v3 or nfsvers=3,
- especially if it is nfsroot.
- .Bu
- If the size of the NFS file handle on your branch filesystem is large,
- aufs will
- not be able to handle it. The maximum size of an NFSv3 file
- handle for a filesystem is 64 bytes. Aufs uses 24 bytes on a 32bit
- system, plus 12 bytes more on a 64bit system. The rest is room for the file
- handle of a branch filesystem.
- .Bu
- The External Inode Number Bitmap, Translation Table and Generation Table
- (xino) is
- required since NFS file
- handle is based upon inode number. The mount option `xino' is enabled
- by default.
- The external inode generation table and its debugfs entry
- (<debugfs>/aufs/si_*/xigen) are created when CONFIG_AUFS_EXPORT is
- enabled even if you don't actually export aufs.
- The external inode generation table only grows and is never
- truncated. You might need to pay attention to the free space of the
- filesystem where xino files are placed. By default, it is the first
- writable branch.
- .Bu
- The branch filesystems must be accessible, which means `not hidden.'
- It means you need to `mount \-\-move' when you use initramfs and
- switch_root(8), or chroot(8).
- .RE
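- The room which is left for the file handle of a branch filesystem follows
- directly from the figures in the list above. A small sketch, which reads
- `plus 12 bytes' as 12 bytes in addition to the 24 on a 64bit system (the
- function name fh_room is made up for this example):
- .nf

```shell
#!/bin/sh
# Room left for the branch filesystem's file handle, from the figures
# quoted in the text: 64 bytes in total, 24 used by aufs on 32bit
# systems, and (as assumed here) 12 more on 64bit systems.
fh_room() {
    max=64
    used=24
    if [ "$1" = 64 ]; then
        used=$((used + 12))
    fi
    echo $((max - used))
}

echo "32bit: $(fh_room 32) bytes, 64bit: $(fh_room 64) bytes"
```

- .fi
- A branch filesystem whose own file handle is larger than this room cannot
- be exported through aufs.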
- .\" ----------------------------------------------------------------------
- .SH Dentry and Inode Caches
- If you want to clear caches on your system, there are several tricks
- for that. If your system ram is low,
- try `find /large/dir \-ls > /dev/null'.
- It will read many inodes and dentries and cache them. Then old caches will be
- discarded.
- But when you have large ram or you do not have such a large
- directory, it is not effective.
- If you want to discard the cache within a certain filesystem,
- try `mount \-o remount /your/mntpnt'. Some filesystems may return an error such as
- EINVAL, but VFS discards the unused dentry/inode caches on the
- specified filesystem.
- .\" ----------------------------------------------------------------------
- .SH Compatible/Incompatible with Unionfs Version 1.x Series
- If you compile aufs with \-DCONFIG_AUFS_COMPAT, the dirs= option and the =nfsro
- branch permission flag are available. They are interpreted as the
- br: option and the =ro flag respectively.
- The `debug', `delete' and `imap' options are ignored silently. When you
- compile aufs without \-DCONFIG_AUFS_COMPAT, these three options are
- also ignored, but a warning message is issued.
- Since the `delete' option is ignored, and in order to keep filesystem consistency, aufs tries
- writing to only one branch in a single systemcall. It means
- aufs may copyup even if the copyup-src branch is specified as writable.
- For example, you have two writable branches and a large regular file
- on the lower writable branch. When you issue rename(2) to the file on aufs,
- aufs may copyup it to the upper writable branch.
- If this behaviour is not what you want, then you should rename(2) it
- on the lower branch directly.
- There is also a simple shell
- script `unionctl' under the sample subdirectory, which is compatible with
- unionctl(8) in
- Unionfs Version 1.x series, except for the \-\-query action.
- This script executes mount(8) with the `remount' option and uses the
- add/del/mod aufs mount options.
- If you are familiar with Unionfs Version 1.x series and want to use unionctl(8), you can
- try this script instead of using mount \-o remount,... directly.
- Aufs does not support the ioctl(2) interface.
- This script depends heavily upon mount(8) in the
- util\-linux\-2.12p package, and you need to mount /proc to use this script.
- If your mount(8) version differs, you can try modifying this
- script. It is very easy.
- The unionctl script is just a sample usage of the aufs remount
- interface.
- Aufs uses the external inode number bitmap and translation table by
- default.
- The default branch permission for the first branch is `rw', and the
- rest is `ro.'
- The whiteout is for hiding files on lower branches. It is also applied
- to stop readdir from going to lower branches.
- The latter case is called `opaque directory.' Any
- whiteout is an empty file, which means a whiteout is just a mark.
- In the case of hiding lower files, the name of whiteout is
- `\*[AUFS_WH_PFX]<filename>.'
- And in the case of stopping readdir, the name is
- `\*[AUFS_WH_PFX]\*[AUFS_WH_PFX].opq' or
- `\*[AUFS_WH_PFX]__dir_opaque.' The name depends upon your compile
- configuration
- CONFIG_AUFS_COMPAT.
- .\" All of newly created or renamed directory will be opaque.
- All whiteouts are hardlinked,
- including `<writable branch top dir>/\*[AUFS_WH_BASE].'
- A hardlink on an ordinary (disk based) filesystem does not
- consume a new inode. But in linux tmpfs, the number of free
- inodes is decremented by link(2). It is recommended to specify the
- nr_inodes option for your tmpfs if you meet ENOSPC. Use this option
- after checking with `df \-i.'
- When you rmdir or rename-to a dir which has a number of whiteouts,
- aufs renames the dir to a temporary whiteouted-name like
- `\*[AUFS_WH_PFX]<dir>.<random hex>.' Then it removes the dir after the actual operation.
- cf. mount option `dirwh.'
- .\" ----------------------------------------------------------------------
- .SH Incompatible with an Ordinary Filesystem
- stat(2) returns the inode info from the first existing inode among
- the branches, except for the directory link count.
- Aufs usually computes a directory link count larger than the exact value, in
- order to keep UNIX filesystem semantics, or in order to keep find(1) quiet.
- The size of a directory may be wrong too, but it should do no harm.
- The timestamp of a directory will not be updated when a file is
- created or removed under it on a lower branch.
- The test for permission bits has two cases. One is for a directory,
- and the other is for a non-directory. In the case of a directory, aufs
- checks the permission bits of all existing directories. It means you
- need the correct privilege for the directories including those on the lower
- branches.
- The test for a non-directory is simpler. It checks only the
- topmost inode.
- statfs(2) returns the information of the first branch, except for
- namelen, when `nosum' is specified (the default). The namelen is
- decreased by the whiteout prefix length. And the block size may differ
- from st_blksize which is obtained by stat(2).
- Remember, seekdir(3) and telldir(3) are not defined in POSIX. They may
- not work as you expect. Try rewinddir(3) or re-open the dir.
- The whiteout prefix (\*[AUFS_WH_PFX]) is reserved on all branches. Users should
- not handle a filename which begins with this prefix.
- In order to allow a future whiteout, the maximum filename length is limited to
- the longest value \- \*[AUFS_WH_PFX_LEN]. It may be a violation of POSIX.
- If you dislike the difference between the aufs entries in /etc/mtab
- and /proc/mounts, and if you are using mount(8) in util\-linux package,
- then try ./mount.aufs utility. Copy the script to /sbin/mount.aufs.
- This simple utility tries updating
- /etc/mtab. If you do not care about /etc/mtab, you can ignore this
- utility.
- Remember this utility depends heavily upon mount(8) in the
- util\-linux\-2.12p package, and you need to mount /proc.
- Since aufs uses its own inode and dentry, your system may cache a huge
- number of inodes and dentries. It can be twice as many as all of the files
- in your union.
- It means that unmounting or remounting readonly at shutdown time may
- take a long time, since mount(2) in VFS tries freeing all of the cache
- on the target filesystem.
- When you open a directory, aufs will open several directories
- internally.
- It means you may reach the limit of the number of file descriptors.
- And when the lower directory cannot be opened, aufs will close all the
- opened upper directories and return an error.
- A sub-mount under a branch
- of a local filesystem
- is ignored.
- For example, if you have mounted another filesystem on
- /branch/another/mntpnt, the files under `mntpnt' will be ignored by aufs.
- It is recommended to mount the sub-mount under the mounted aufs.
- For example,
- .nf
- # sudo mount /dev/sdaXX /ro_branch
- # d=another/mntpnt
- # sudo mount /dev/sdbXX /ro_branch/$d
- # mkdir -p /rw_branch/$d
- # sudo mount -t aufs -o br:/rw_branch:/ro_branch none /aufs
- # sudo mount -t aufs -o br:/rw_branch/${d}:/ro_branch/${d} none /aufs/$d
- .fi
- There are several characters which are not allowed in a branch
- directory path and xino filename. See the details in Branch Syntax and Mount
- Option.
- The file-lock, which means fcntl(2) with F_SETLK, F_SETLKW or F_GETLK, flock(2)
- and lockf(3), is applied to the virtual aufs file only, not to the file on a
- branch. It means you can break the lock by accessing a branch directly.
- TODO: check `security' to hook locks, as inotify does.
- I/O to a named pipe or local socket is not handled by aufs, even
- if it exists in aufs. If the pipe/socket is copied-up after the reader and the
- writer have established their connection, they keep using the old one
- instead of the copied-up one.
- The fsync(2) and fdatasync(2) systemcalls return 0 which means success, even
- if the given file descriptor is not opened for writing.
- I am afraid this behaviour may violate some standards. Checking the
- behaviour of fsync(2) on ext2, aufs decided to return success.
- If you want to use disk-quota, you should set it up on your writable
- branch since aufs does not have its own block device.
- When your aufs is the root directory of your system, and your system
- tells you some of the filesystems were not unmounted cleanly, try this
- procedure when you shutdown your system.
- .nf
- # mount -no remount,ro /
- # for i in $writable_branches
- # do mount -no remount,ro $i
- # done
- .fi
- If your xino file is on a hard drive, you also need to specify
- `noxino' option or `xino=/your/tmpfs/xino' at remounting root
- directory.
- rename(2) of a directory may return EXDEV even if both src and tgt
- are on the same aufs. When the rename-src dir exists on multiple
- branches and the lower dir has child(ren), aufs has to copyup all its
- children. It can be a recursive copyup. Current aufs does not support
- such a huge copyup operation at one time in kernel space; instead it
- produces a warning and returns EXDEV.
- Generally, mv(1) detects this error and tries mkdir(2) and
- rename(2) or copy/unlink recursively. So the result is harmless.
- If your application which issues rename(2) for a directory does not
- support EXDEV, it will not work on aufs.
- This specification also applies to the case when the src directory
- exists on the lower readonly branch and it has child(ren).
- If a sudden accident such as a power failure happens while aufs is
- operating, and the regular fsck for branch filesystems is completed after
- the disaster, you need an extra fsck for aufs writable branches. It is
- necessary to check whether a whiteout remains incorrectly or not,
- e.g. the real filename and the whiteout for it under the same parent
- directory. If such a whiteout remains, aufs cannot handle the file
- correctly.
- To check the consistency from aufs' point of view, you can use a
- simple shell script called /sbin/auchk. It serves as a fsck tool for
- aufs, and it checks for illegal whiteouts, remaining
- pseudo-links and remaining aufs-temp files. If they are found, the
- utility reports them and asks whether to delete them or not.
- It is recommended to execute /sbin/auchk for every writable branch
- filesystem before mounting aufs if the system experienced a crash.
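- To show what an `illegal whiteout' looks like, here is a minimal sketch
- which lists whiteouts that sit next to the real file they should hide. It
- is not auchk and does not replace it. It assumes that the whiteout prefix
- is `.wh.' and the function name find_stale_whiteouts is made up for this
- example. Entries beginning with the doubled prefix (opaque marks and the
- like) are skipped.
- .nf

```shell
#!/bin/sh
# Report whiteouts that sit next to the real file they are meant to hide.
# ASSUMPTION: the whiteout prefix is ".wh."; this is a sketch, not auchk.
find_stale_whiteouts() {
    find "$1" -name '.wh.*' ! -name '.wh..wh.*' | while IFS= read -r wh; do
        dir=${wh%/*}
        name=${wh##*/}
        real=$dir/${name#.wh.}
        # Report the whiteout only when the hidden name also exists.
        if [ -e "$real" ] || [ -L "$real" ]; then
            echo "$wh"
        fi
    done
}

# Demonstration on a scratch directory instead of a real branch.
demo=$(mktemp -d)
touch "$demo/.wh.gone" "$demo/.wh.both" "$demo/both"
find_stale_whiteouts "$demo"
rm -rf "$demo"
```

- .fi
- The sketch only reports and never deletes. Point it at a writable branch
- directory, not at the aufs mount point, since whiteouts are not visible
- through aufs.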
- .\" ----------------------------------------------------------------------
- .SH EXAMPLES
- The mount options are interpreted from left to right at remount-time.
- These examples
- show how the options are handled. (assuming /sbin/mount.aufs was
- installed)
- .nf
- # mount -v -t aufs -o br:/day0:/base none /u
- none on /u type aufs (rw,xino=/day0/.aufs.xino,br:/day0=rw:/base=ro)
- # mount -v -o remount,\\
- prepend:/day1,\\
- xino=/day1/xino,\\
- mod:/day0=ro,\\
- del:/day0 \\
- /u
- none on /u type aufs (rw,xino=/day1/xino,br:/day1=rw:/base=ro)
- .fi
- .nf
- # mount -t aufs -o br:/rw none /u
- # mount -o remount,append:/ro /u
- different uid/gid/permission, /ro
- # mount -o remount,del:/ro /u
- # mount -o remount,nowarn_perm,append:/ro /u
- #
- (there is no warning)
- .fi
- .\" If you want to expand your filesystem size, aufs may help you by
- .\" adding an writable branch. Since aufs supports multiple writable
- .\" branches, the old writable branch can be being writable, if you want.
- .\" In this example, any modifications to the files under /ro branch will
- .\" be copied-up to /new, but modifications to the files under /rw branch
- .\" will not.
- .\" And the next example shows the modifications to the files under /rw branch
- .\" will be copied-up to /new/a.
- .\"
- .\" Todo: test multiple writable branches policy. cpup=nearest, cpup=exist_parent.
- .\"
- .\" .nf
- .\" # mount -v -t aufs br:/rw:/ro none /u
- .\" none on /u type aufs (rw,xino=/rw/.aufs.xino,br:/rw=rw:/ro=ro)
- .\" # mkfs /new
- .\" # mount -v -o remount,add:1:/new=rw /u
- .\" none on /u type aufs (rw,xino=/rw/.aufs.xino,br:/rw=rw:/new=rw:/ro=ro)
- .\" .fi
- .\"
- .\" .nf
- .\" # mount -v -t aufs br:/rw:/ro none /u
- .\" none on /u type aufs (rw,xino=/rw/.aufs.xino,br:/rw=rw:/ro=ro)
- .\" # mkfs /new
- .\" # mkdir /new/a new/b
- .\" # mount -v -o remount,add:1:/new/b=rw,prepend:/new/a,mod:/rw=ro /u
- .\" none on /u type aufs (rw,xino=/rw/.aufs.xino,br:/new/a=rw:/rw=ro:/new/b=rw:/ro=ro)
- .\" .fi
- When you use aufs as the root filesystem, consider excluding some
- directories from the union. For example, /tmp and /var/log do not need
- to be stacked in many cases, since they usually need neither copyup nor whiteout.
- Also, a swapfile on aufs (a regular file, not a block device) is not
- supported.
- To exclude a specific directory from aufs, try bind mounting.
- There is also a good sample for network-booted diskless machines. See
- sample/ for details.
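- To illustrate the bind-mount approach, here is a minimal sketch rather
- than a tested recipe. It assumes the union is mounted on /u as in the
- examples above, and that the directories /u/tmp and /u/var/log exist in
- the union to serve as mount points. Adjust the paths to your own layout.
- .nf
- # mount -o bind /tmp /u/tmp
- # mount -o bind /var/log /u/var/log
- .fi
- After these commands, accesses under /u/tmp and /u/var/log go directly
- to the underlying directories and bypass aufs.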
- .\" ----------------------------------------------------------------------
- .SH DIAGNOSTICS
- When you add a branch to your union, aufs may warn you about the
- privilege or security of the branch, namely the permission bits,
- owner and group of the top directory of the branch.
- For example, when your upper writable branch has a world-writable top
- directory,
- a malicious user can create arbitrary files directly on the writable branch,
- in effect performing a copyup and modification by hand. This can be a
- security issue.
- When you mount or remount your union without the common \-o ro mount option
- and without a writable branch, aufs will warn you that the first branch
- should be writable.
- .\" It is discouraged to set both of `udba' and `noxino' mount options. In
- .\" this case the inode number under aufs will always be changed and may
- .\" reach the end of inode number which is a maximum of unsigned long. If
- .\" the inode number reaches the end, aufs will return EIO repeatedly.
- When udba is set to a value other than inotify and you change something
- on a branch filesystem directly, aufs may later detect mismatches with
- its cache. If the mismatch is critical, aufs returns EIO.
- When an error occurs, aufs prints a kernel message that includes the
- `errno'. The priority (log level) of the message is ERR or WARNING,
- depending on the message itself.
- You can convert the `errno' into an error message with perror(3),
- strerror(3) or a similar function.
- For example, the `errno' in the message `I/O Error, write failed (\-28)'
- is 28, which means ENOSPC or `No space left on device'.
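- As a quick illustration of this conversion, the following one-liner
- looks up errno 28 from a shell. It is only a sketch and assumes python3
- is installed, which aufs itself does not require. Any program that calls
- strerror(3) would serve equally well.
- .nf
- # python3 -c 'import os; print(os.strerror(28))'
- No space left on device
- .fi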
- When CONFIG_AUFS_BR_RAMFS is enabled, you can specify ramfs as an aufs
- branch. Because ramfs is simple, it does not itself set a maximum link
- count. Under aufs this is very dangerous, particularly for
- whiteouts, so aufs sets the maximum link count for ramfs. The
- value is 32000, borrowed from ext2.
- .\" .SH Current Limitation
- .
- .\" ----------------------------------------------------------------------
- .\" SYNOPSIS
- .\" briefly describes the command or function's interface. For commands, this
- .\" shows the syntax of the command and its arguments (including options); bold-
- .\" face is used for as-is text and italics are used to indicate replaceable
- .\" arguments. Brackets ([]) surround optional arguments, vertical bars (|) sep-
- .\" arate choices, and ellipses (...) can be repeated. For functions, it shows
- .\" any required data declarations or #include directives, followed by the func-
- .\" tion declaration.
- .
- .\" DESCRIPTION
- .\" gives an explanation of what the command, function, or format does. Discuss
- .\" how it interacts with files and standard input, and what it produces on
- .\" standard output or standard error. Omit internals and implementation
- .\" details unless they're critical for understanding the interface. Describe
- .\" the usual case; for information on options use the OPTIONS section. If
- .\" there is some kind of input grammar or complex set of subcommands, consider
- .\" describing them in a separate USAGE section (and just place an overview in
- .\" the DESCRIPTION section).
- .
- .\" RETURN VALUE
- .\" gives a list of the values the library routine will return to the caller and
- .\" the conditions that cause these values to be returned.
- .
- .\" EXIT STATUS
- .\" lists the possible exit status values or a program and the conditions that
- .\" cause these values to be returned.
- .
- .\" USAGE
- .\" describes the grammar of any sublanguage this implements.
- .
- .\" FILES
- .\" lists the files the program or function uses, such as configuration files,
- .\" startup files, and files the program directly operates on. Give the full
- .\" pathname of these files, and use the installation process to modify the
- .\" directory part to match user preferences. For many programs, the default
- .\" installation location is in /usr/local, so your base manual page should use
- .\" /usr/local as the base.
- .
- .\" ENVIRONMENT
- .\" lists all environment variables that affect your program or function and how
- .\" they affect it.
- .
- .\" SECURITY
- .\" discusses security issues and implications. Warn about configurations or
- .\" environments that should be avoided, commands that may have security impli-
- .\" cations, and so on, especially if they aren't obvious. Discussing security
- .\" in a separate section isn't necessary; if it's easier to understand, place
- .\" security information in the other sections (such as the DESCRIPTION or USAGE
- .\" section). However, please include security information somewhere!
- .
- .\" CONFORMING TO
- .\" describes any standards or conventions this implements.
- .
- .\" NOTES
- .\" provides miscellaneous notes.
- .
- .\" BUGS
- .\" lists limitations, known defects or inconveniences, and other questionable
- .\" activities.
- .SH COPYRIGHT
- Copyright \(co 2005\-2009 Junjiro R. Okajima
- .SH AUTHOR
- Junjiro R. Okajima
- .\" SEE ALSO
- .\" lists related man pages in alphabetical order, possibly followed by other
- .\" related pages or documents. Conventionally this is the last section.
|