unshare - run program with some namespaces unshared from parent
unshare [options] [program [arguments]]
Unshares the indicated namespaces from the parent process and then executes the specified program. If program is not given, then "${SHELL}" is run (default: /bin/sh).
The namespaces can optionally be made persistent by bind mounting /proc/pid/ns/type files to a filesystem path, and can then be entered with nsenter(1) even after the program terminates (except for PID namespaces, where a permanently running init process is required). Once a persistent namespace is no longer needed, it can be unpersisted with umount(8). See the EXAMPLES section for more details.
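As a small illustration of the /proc/pid/ns/type files mentioned above, the sketch below compares the namespace links of two processes. The helper name same_ns is hypothetical and not part of util-linux; the sketch assumes a Linux /proc filesystem and permission to read both links.

```shell
#!/bin/sh
# same_ns PID1 PID2 TYPE
# Hypothetical helper: succeeds when both processes are in the same
# namespace of the given type (for example uts, mnt, net, pid, ipc, user).
# Returns 2 when one of the links cannot be read.
same_ns() {
    a=$(readlink "/proc/$1/ns/$3") || return 2
    b=$(readlink "/proc/$2/ns/$3") || return 2
    [ "$a" = "$b" ]
}

# A process always shares every namespace with itself.
if same_ns "$$" "$$" uts; then
    echo "same UTS namespace"
else
    echo "different UTS namespace (or link not readable)"
fi
```

The same comparison can be applied to a shell PID and the PID of a program started through unshare to check that the requested namespace really differs.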
The namespaces to be unshared are indicated via options. Unshareable namespaces are:
mount namespace
Mounting and unmounting filesystems will not affect the rest of the system, except for filesystems which are explicitly marked as shared (with mount --make-shared; see /proc/self/mountinfo or findmnt -o+PROPAGATION for the shared flags). For further details, see mount_namespaces(7) and the discussion of the CLONE_NEWNS flag in clone(2).
Since util-linux version 2.27, unshare automatically sets propagation to private in a new mount namespace to make sure that the new namespace is really unshared. This feature can be disabled with the option --propagation unchanged. Note that private is the kernel default.
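To check which propagation flag currently applies to a path, a sketch along the following lines may help. The helper name propagation_of is hypothetical; the script assumes that findmnt from util-linux is installed and can read the mount table.

```shell
#!/bin/sh
# propagation_of PATH
# Hypothetical helper: print the propagation flags of the filesystem
# that contains PATH, as reported by findmnt.
propagation_of() {
    findmnt --first-only --noheadings --output PROPAGATION --target "$1"
}

if command -v findmnt >/dev/null 2>&1; then
    propagation_of / || echo "findmnt could not inspect /"
else
    echo "findmnt not installed; skipping"
fi
```

Running the helper once on the host and once inside unshare --mount is one way to observe the effect of the default private propagation.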
UTS namespace
Setting hostname or domainname will not affect the rest of the system. For further details, see namespaces(7) and the discussion of the CLONE_NEWUTS flag in clone(2).
IPC namespace
The process will have an independent namespace for POSIX message queues as well as System V message queues, semaphore sets and shared memory segments. For further details, see namespaces(7) and the discussion of the CLONE_NEWIPC flag in clone(2).
network namespace
The process will have independent IPv4 and IPv6 stacks, IP routing tables, firewall rules, the /proc/net and /sys/class/net directory trees, sockets, etc. For further details, see namespaces(7) and the discussion of the CLONE_NEWNET flag in clone(2).
PID namespace
Children will have a distinct set of PID-to-process mappings from their parent. For further details, see pid_namespaces(7) and the discussion of the CLONE_NEWPID flag in clone(2).
cgroup namespace
The process will have a virtualized view of /proc/self/cgroup, and new cgroup mounts will be rooted at the namespace cgroup root. For further details, see cgroup_namespaces(7) and the discussion of the CLONE_NEWCGROUP flag in clone(2).
user namespace
The process will have a distinct set of UIDs, GIDs and capabilities. For further details, see user_namespaces(7) and the discussion of the CLONE_NEWUSER flag in clone(2).
-i, --ipc[=file]
Unshare the IPC namespace. If file is specified, then a persistent namespace is created by a bind mount.
-m, --mount[=file]
Unshare the mount namespace. If file is specified, then a persistent namespace is created by a bind mount. Note that file has to be located on a filesystem with the propagation flag set to private. Use the command findmnt -o+PROPAGATION when not sure about the current setting. See also the examples below.
-n, --net[=file]
Unshare the network namespace. If file is specified, then a persistent namespace is created by a bind mount.
-p, --pid[=file]
Unshare the PID namespace. If file is specified, then a persistent namespace is created by a bind mount. See also the --fork and --mount-proc options.
-u, --uts[=file]
Unshare the UTS namespace. If file is specified, then a persistent namespace is created by a bind mount.
-U, --user[=file]
Unshare the user namespace. If file is specified, then a persistent namespace is created by a bind mount.
-C, --cgroup[=file]
Unshare the cgroup namespace. If file is specified, then a persistent namespace is created by a bind mount.
-f, --fork
Fork the specified program as a child process of unshare rather than running it directly. This is useful when creating a new PID namespace.
--kill-child[=signame]
When unshare terminates, have signame be sent to the forked child process. Combined with --pid this allows for an easy and reliable killing of the entire process tree below unshare. If not given, signame defaults to SIGKILL. This option implies --fork.
--mount-proc[=mountpoint]
Just before running the program, mount the proc filesystem at mountpoint (default is /proc). This is useful when creating a new PID namespace. It also implies creating a new mount namespace since the /proc mount would otherwise mess up existing programs on the system. The new proc filesystem is explicitly mounted as private (with MS_PRIVATE|MS_REC).
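The following sketch combines --mount-proc with --user and --map-root-user so that it can be tried without root. Whether unprivileged user namespaces are permitted is an assumption about the local kernel configuration, and mounting proc may still be refused inside some containers, so the script probes first and skips otherwise. The helper name in_new_pidns is hypothetical.

```shell
#!/bin/sh
# in_new_pidns COMMAND [ARGS...]
# Hypothetical helper: run a command as the first process of a new PID
# namespace with a freshly mounted /proc. The --user --map-root-user pair
# supplies the privileges needed to mount proc when not run as root.
in_new_pidns() {
    unshare --user --map-root-user --fork --pid --mount-proc "$@"
}

# Probe first: unprivileged user namespaces may be disabled.
if command -v unshare >/dev/null 2>&1 && in_new_pidns true 2>/dev/null; then
    # /proc/self resolves to the caller's PID as seen in the new namespace.
    in_new_pidns readlink /proc/self
else
    echo "cannot create the namespaces here; skipping"
fi
```

Because the command is forked as the first process in the new PID namespace, readlink is expected to report 1 where the probe succeeds.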
-r, --map-root-user
Run the program only after the current effective user and group IDs have been mapped to the superuser UID and GID in the newly created user namespace. This makes it possible to conveniently gain capabilities needed to manage various aspects of the newly created namespaces (such as configuring interfaces in the network namespace or mounting filesystems in the mount namespace) even when run unprivileged. As a mere convenience feature, it does not support more sophisticated use cases, such as mapping multiple ranges of UIDs and GIDs. This option implies --setgroups=deny.
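The effect of the mapping can be inspected from inside the new user namespace. The sketch below assumes that unprivileged user namespaces are enabled on the system and skips otherwise; the helper name as_mapped_root is hypothetical.

```shell
#!/bin/sh
# as_mapped_root COMMAND [ARGS...]
# Hypothetical helper: run a command in a new user namespace in which the
# current effective UID and GID are mapped to the superuser.
as_mapped_root() {
    unshare --user --map-root-user "$@"
}

if command -v unshare >/dev/null 2>&1 && as_mapped_root true 2>/dev/null; then
    as_mapped_root id -u                      # effective UID inside the namespace
    as_mapped_root cat /proc/self/setgroups   # setgroups policy in the namespace
    as_mapped_root cat /proc/self/uid_map     # the single mapped UID range
else
    echo "user namespaces unavailable; skipping"
fi
```

Where the probe succeeds, the first command should report UID 0 and the second should report deny, in line with the option description; the last line shows the one-entry map from UID 0 to the invoking user's UID.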
--propagation private|shared|slave|unchanged
Recursively set the mount propagation flag in the new mount namespace. The default is to set the propagation to private. It is possible to disable this feature with the argument unchanged. The option is silently ignored when the mount namespace (--mount) is not requested.
--setgroups allow|deny
Allow or deny the setgroups(2) system call in a user namespace.
To be able to call setgroups(2), the calling process must at least have CAP_SETGID. But since Linux 3.19 a further restriction applies: the kernel gives permission to call setgroups(2) only after the GID map (/proc/pid/gid_map) has been set. The GID map is writable by root when setgroups(2) is enabled (i.e., allow, the default), and the GID map becomes writable by unprivileged processes when setgroups(2) is permanently disabled (with deny).
-V, --version
Display version information and exit.
-h, --help
Display help text and exit.
Mounting the proc and sysfs filesystems as root in a user namespace has to be restricted so that a less privileged user cannot gain more access to sensitive files that a more privileged user made unavailable. In short, the rule for proc and sysfs is that the mount must be as close to a bind mount as possible.
# unshare --fork --pid --mount-proc readlink /proc/self
1
Establish a PID namespace, ensure we're PID 1 in it against a newly mounted procfs instance.
$ unshare --map-root-user --user sh -c whoami
root
Establish a user namespace as an unprivileged user with a root user within it.
# touch /root/uts-ns
# unshare --uts=/root/uts-ns hostname FOO
# nsenter --uts=/root/uts-ns hostname
FOO
# umount /root/uts-ns
Establish a persistent UTS namespace, and modify the hostname. The namespace is then entered with nsenter. The namespace is destroyed by unmounting the bind reference.
# mount --bind /root/namespaces /root/namespaces
# mount --make-private /root/namespaces
# touch /root/namespaces/mnt
# unshare --mount=/root/namespaces/mnt
Establish a persistent mount namespace referenced by the bind mount /root/namespaces/mnt. This example shows a portable solution, because it makes sure that the bind mount is created on a filesystem whose propagation flag is set to private.
# unshare -pf --kill-child -- bash -c "(sleep 999 &) && sleep 1000" &
# pid=$!
# kill $pid
Reliably kill the subprocesses of program. When unshare gets killed, everything below it gets killed as well. Without --kill-child, the children of program would have been orphaned and re-parented to PID 1.