aboutsummaryrefslogtreecommitdiffstats
path: root/fs/namei.c
Commit message (Collapse)AuthorAgeFilesLines
...
* Allow O_PATH for symlinksAl Viro2011-03-151-6/+19
| | | | | | | | | | | | | | | | | | | At that point we can't do almost nothing with them. They can be opened with O_PATH, we can manipulate such descriptors with dup(), etc. and we can see them in /proc/*/{fd,fdinfo}/*. We can't (and won't be able to) follow /proc/*/fd/* symlinks for those; there's simply not enough information for pathname resolution to go on from such point - to resolve a symlink we need to know which directory does it live in. We will be able to do useful things with them after the next commit, though - readlinkat() and fchownat() will be possible to use with dfd being an O_PATH-opened symlink and empty relative pathname. Combined with open_by_handle() it'll give us a way to do realink-by-handle and lchown-by-handle without messing with more redundant syscalls. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* New kind of open files - "location only".Al Viro2011-03-151-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | New flag for open(2) - O_PATH. Semantics: * pathname is resolved, but the file itself is _NOT_ opened as far as filesystem is concerned. * almost all operations on the resulting descriptors shall fail with -EBADF. Exceptions are: 1) operations on descriptors themselves (i.e. close(), dup(), dup2(), dup3(), fcntl(fd, F_DUPFD), fcntl(fd, F_DUPFD_CLOEXEC, ...), fcntl(fd, F_GETFD), fcntl(fd, F_SETFD, ...)) 2) fcntl(fd, F_GETFL), for a common non-destructive way to check if descriptor is open 3) "dfd" arguments of ...at(2) syscalls, i.e. the starting points of pathname resolution * closing such descriptor does *NOT* affect dnotify or posix locks. * permissions are checked as usual along the way to file; no permission checks are applied to the file itself. Of course, giving such thing to syscall will result in permission checks (at the moment it means checking that starting point of ....at() is a directory and caller has exec permissions on it). fget() and fget_light() return NULL on such descriptors; use of fget_raw() and fget_raw_light() is needed to get them. That protects existing code from dealing with those things. There are two things still missing (they come in the next commits): one is handling of symlinks (right now we refuse to open them that way; see the next commit for semantics related to those) and another is descriptor passing via SCM_RIGHTS datagrams. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* fs: Don't allow to create hardlink for deleted fileAneesh Kumar K.V2011-03-151-1/+5
| | | | | | | | | | | Add inode->i_nlink == 0 check in VFS. Some of the file systems do this internally. A followup patch will remove those instance. This is needed to ensure that with link by handle we don't allow to create hardlink of an unlinked file. The check also prevent a race between unlink and link Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* New AT_... flag: AT_EMPTY_PATHAl Viro2011-03-141-10/+19
| | | | | | | | | | For name_to_handle_at(2) we'll want both ...at()-style syscall that would be usable for non-directory descriptors (with empty relative pathname). Introduce new flag (AT_EMPTY_PATH) to deal with that and corresponding LOOKUP_EMPTY; teach user_path_at() and path_init() to deal with the latter. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* open-style analog of vfs_path_lookup()Al Viro2011-03-141-28/+52
| | | | | | | | | | | | | new function: file_open_root(dentry, mnt, name, flags) opens the file vfs_path_lookup would arrive to. Note that name can be empty; in that case the usual requirement that dentry should be a directory is lifted. open-coded equivalents switched to it, may_open() got down exactly one caller and became static. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* reduce vfs_path_lookup() to do_path_lookup()Al Viro2011-03-141-52/+43
| | | | | | | | | | | | | | | New lookup flag: LOOKUP_ROOT. nd->root is set (and held) by caller, path_init() starts walking from that place and all pathname resolution machinery never drops nd->root if that flag is set. That turns vfs_path_lookup() into a special case of do_path_lookup() *and* gets us down to 3 callers of link_path_walk(), making it finally feasible to rip the handling of trailing symlink out of link_path_walk(). That will not only simply the living hell out of it, but make life much simpler for unionfs merge. Trailing symlink handling will become iterative, which is a good thing for stack footprint in a lot of situations as well. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* untangle do_lookup()Al Viro2011-03-141-85/+56
| | | | | | | | | | | | | | | | | | | | | | That thing has devolved into rats nest of gotos; sane use of unlikely() gets rid of that horror and gives much more readable structure: * make a fast attempt to find a dentry; false negatives are OK. In RCU mode if everything went fine, we are done, otherwise just drop out of RCU. If we'd done (RCU) ->d_revalidate() and it had not refused outright (i.e. didn't give us -ECHILD), remember its result. * now we are not in RCU mode and hopefully have a dentry. If we do not, lock parent, do full d_lookup() and if that has not found anything, allocate and call ->lookup(). If we'd done that ->lookup(), remember that dentry is good and we don't need to revalidate it. * now we have a dentry. If it has ->d_revalidate() and we can't skip it, call it. * hopefully dentry is good; if not, either fail (in case of error) or try to invalidate it. If d_invalidate() has succeeded, drop it and retry everything as if original attempt had not found a dentry. * now we can finish it up - deal with mountpoint crossing and automount. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* path_openat: clean ELOOP handling a bitAl Viro2011-03-141-8/+6
| | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* do_last: kill a rudiment of old ->d_revalidate() workaroundAl Viro2011-03-141-5/+0
| | | | | | | | | | | | There used to be time when ->d_revalidate() couldn't return an error. So intents code had lookup_instantiate_filp() stash ERR_PTR(error) in nd->intent.open.filp and had it checked after lookup_hash(), to catch the otherwise silent failures. That had been introduced by commit 4af4c52f34606bdaab6930a845550c6fb02078a4. These days ->d_revalidate() can and does propagate errors back to callers explicitly, so this check isn't needed anymore. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* fold __open_namei_create() and open_will_truncate() into do_last()Al Viro2011-03-141-48/+26
| | | | | | ... and clean up a bit more Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* do_last: unify may_open() call and everyting after itAl Viro2011-03-141-37/+22
| | | | | | | | | We have a bunch of diverging codepaths in do_last(); some of them converge, but the case of having to create a new file duplicates large part of common tail of the rest and exits separately. Massage them so that they could be merged. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* move may_open() from __open_name_create() to do_last()Al Viro2011-03-141-5/+7
| | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* expand finish_open() in its only callerAl Viro2011-03-141-52/+38
| | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* sanitize pathname component hash calculationAl Viro2011-03-141-23/+19
| | | | | | | | Lift it to lookup_one_len() and link_path_walk() resp. into the same place where we calculated default hash function of the same name. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* kill __lookup_one_len()Al Viro2011-03-141-26/+15
| | | | | | only one caller left Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* switch non-create side of open() to use of do_last()Al Viro2011-03-141-33/+67
| | | | | | | | | | | | | | | | | | | | Instead of path_lookupat() doing trailing symlink resolution, use the same scheme as on the O_CREAT side. Walk with LOOKUP_PARENT, then (in do_last()) look the final component up, then either open it or return error or, if it's a symlink, give the symlink back to path_openat() to be resolved there. The really messy complication here is RCU. We don't want to drop out of RCU mode before the final lookup, since we don't want to bounce parent directory ->d_count without a good reason. Result is _not_ pretty; later in the series we'll clean it up. For now we are roughly back where we'd been before the revert done by Nick's series - top-level logics of path_openat() is cleaned up, do_last() does actual opening, symlink resolution is done uniformly. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* get rid of nd->fileAl Viro2011-03-141-8/+7
| | | | | | | Don't stash the struct file * used as starting point of walk in nameidata; pass file ** to path_init() instead. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* get rid of the last LOOKUP_RCU dependencies in link_path_walk()Al Viro2011-03-141-8/+13
| | | | | | | | | | | | New helper: terminate_walk(). An error has happened during pathname resolution and we either drop nd->path or terminate RCU, depending the mode we had been in. After that, nd is essentially empty. Switch link_path_walk() to using that for cleanup. Now the top-level logics in link_path_walk() is back to sanity. RCU dependencies are in the lower-level functions. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* make nameidata_dentry_drop_rcu_maybe() always leave RCU modeAl Viro2011-03-141-5/+11
| | | | | | | | Now we have do_follow_link() guaranteed to leave without dangling RCU and the next step will get LOOKUP_RCU logics completely out of link_path_walk(). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* make handle_dots() leave RCU mode on errorAl Viro2011-03-141-11/+12
| | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* clear RCU on all failure exits from link_path_walk()Al Viro2011-03-141-14/+16
| | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* pull handling of . and .. into inlined helperAl Viro2011-03-141-14/+16
| | | | | | getting LOOKUP_RCU checks out of link_path_walk()... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* kill out_dput: in link_path_walk()Al Viro2011-03-141-11/+4
| | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* separate -ESTALE/-ECHILD retries in do_filp_open() from real workAl Viro2011-03-141-29/+20
| | | | | | | | | | | new helper: path_openat(). Does what do_filp_open() does, except that it tries only the walk mode (RCU/normal/force revalidation) it had been told to. Both create and non-create branches are using path_lookupat() now. Fixed the double audit_inode() in non-create branch. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* switch do_filp_open() to struct open_flagsAl Viro2011-03-141-79/+9
| | | | | | | | | take calculation of open_flags by open(2) arguments into new helper in fs/open.c, move filp_open() over there, have it and do_sys_open() use that helper, switch exec.c callers of do_filp_open() to explicit (and constant) struct open_flags. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* Collect "operation mode" arguments of do_last() into a structureAl Viro2011-03-141-22/+35
| | | | | | | | | | | | No point messing with passing shitloads of "operation mode" arguments to do_open() one by one, especially since they are not going to change during do_filp_open(). Collect them into a struct, fill it and pass to do_last() by reference. Make sure that lookup intent flags are correctly set and removed - we want them for do_last(), but they make no sense for __do_follow_link(). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* clean up the failure exits after __do_follow_link() in do_filp_open()Al Viro2011-03-141-8/+5
| | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* pull security_inode_follow_link() into __do_follow_link()Al Viro2011-03-141-6/+7
| | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* pull dropping RCU on success of link_path_walk() into path_lookupat()Al Viro2011-03-141-18/+12
| | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* untangle the "need_reval_dot" messAl Viro2011-03-141-63/+44
| | | | | | | | | | | | | instead of ad-hackery around need_reval_dot(), do the following: set a flag (LOOKUP_JUMPED) in the beginning of path, on absolute symlink traversal, on ".." and on procfs-style symlinks. Clear on normal components, leave unchanged on ".". Non-nested callers of link_path_walk() call handle_reval_path(), which checks that flag is set and that fs does want the final revalidate thing, then does ->d_revalidate(). In link_path_walk() all the return_reval stuff is gone. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* merge component type recognitionAl Viro2011-03-141-26/+22
| | | | | | no need to do it in three places... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* merge path_init and path_init_rcuAl Viro2011-03-141-83/+35
| | | | | | | | | | Actual dependency on whether we want RCU or not is in 3 small areas (as it ought to be) and everything around those is the same in both versions. Since each function has only one caller and those callers are on two sides of if (flags & LOOKUP_RCU), it's easier and cleaner to merge them and pull the checks inside. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* sanitize path_walk() messAl Viro2011-03-141-92/+56
| | | | | | | | | New helper: path_lookupat(). Basically, what do_path_lookup() boils to modulo -ECHILD/-ESTALE handler. path_walk* family is gone; vfs_path_lookup() is using link_path_walk() directly, do_path_lookup() and do_filp_open() are using path_lookupat(). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* take RCU-dependent stuff around exec_permission() into a new helperAl Viro2011-03-141-11/+14
| | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* kill path_lookup()Al Viro2011-03-141-4/+3
| | | | | | | all remaining callers pass LOOKUP_PARENT to it, so flags argument can die; renamed to kern_path_parent() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* nd->inode is not set on the second attempt in path_walk()Al Viro2011-03-081-0/+1
| | | | | | | | We leave it at whatever it had been pointing to after the first link_path_walk() had failed with -ESTALE. Things do not work well after that... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* minimal fix for do_filp_open() raceAl Viro2011-03-041-3/+10
| | | | | | | | | | | | | | | | failure exits on the no-O_CREAT side of do_filp_open() merge with those of O_CREAT one; unfortunately, if do_path_lookup() returns -ESTALE, we'll get out_filp:, notice that we are about to return -ESTALE without having trying to create the sucker with LOOKUP_REVAL and jump right into the O_CREAT side of code. And proceed to try and create a file. Usually that'll fail with -ESTALE again, but we can race and get that attempt of pathname resolution to succeed. open() without O_CREAT really shouldn't end up creating files, races or not. The real fix is to rearchitect the whole do_filp_open(), but for now splitting the failure exits will do. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* vfs: fix BUG_ON() in fs/namei.c:1461Linus Torvalds2011-02-161-5/+4
| | | | | | | | | | | | | | | | | | | | | | When Al moved the nameidata_dentry_drop_rcu_maybe() call into the do_follow_link function in commit 844a391799c2 ("nothing in do_follow_link() is going to see RCU"), he mistakenly left the BUG_ON(inode != path->dentry->d_inode); behind. Which would otherwise be ok, but that BUG_ON() really needs to be _after_ dropping RCU, since the dentry isn't necessarily stable otherwise. So complete the code movement in that commit, and move the BUG_ON() into do_follow_link() too. This means that we need to pass in 'inode' as an argument (just for this one use), but that's a small thing. And eventually we may be confident enough in our path lookup that we can just remove the BUG_ON() and the unnecessary inode argument. Reported-and-tested-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* get rid of nameidata_dentry_drop_rcu() calling nameidata_drop_rcu()Al Viro2011-02-151-8/+0
| | | | | | can't happen anymore and didn't work right anyway Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* drop out of RCU in return_revalAl Viro2011-02-151-19/+6
| | | | | | ... thus killing the need to handle drop-from-RCU in d_revalidate() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* split do_revalidate() into RCU and non-RCU casesAl Viro2011-02-151-17/+30
| | | | | | fixing oopsen in lookup_one_len() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* in do_lookup() split RCU and non-RCU cases of need_revalidateAl Viro2011-02-151-15/+16
| | | | | | and use unlikely() instead of gotos, for fsck sake... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* nothing in do_follow_link() is going to see RCUAl Viro2011-02-151-8/+7
| | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* Fix possible filp_cachep memory corruptionLinus Torvalds2011-02-111-10/+10
| | | | | | | | | | | | | | | | | | | | | | In commit 31e6b01f4183 ("fs: rcu-walk for path lookup") we started doing path lookup using RCU, which then falls back to a careful non-RCU lookup in case of problems (LOOKUP_REVAL). So do_filp_open() has this "re-do the lookup carefully" looping case. However, that means that we must not release the open-intent file data if we are going to loop around and use it once more! Fix this by moving the release of the open-intent data to the function that allocates it (do_filp_open() itself) rather than the helper functions that can get called multiple times (finish_open() and do_last()). This makes the logic for the lifetime of that field much more obvious, and avoids the possible double free. Reported-by: J. R. Okajima <hooanon05@yahoo.co.jp> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Cc: Nick Piggin <npiggin@kernel.dk> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* vfs - fix dentry ref count in do_lookup()Ian Kent2011-01-181-1/+3
| | | | | | | | | | | There is a ref count problem in fs/namei.c:do_lookup(). When walking in ref-walk mode, if follow_managed() returns a fail we need to drop dentry and possibly vfsmount. Clean up properly, as we do in the other caller of follow_managed(). Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* Take the completion of automount into new helperAl Viro2011-01-171-26/+5
| | | | | | ... and shift it from namei.c to namespace.c Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* sanitize vfsmount refcounting changesAl Viro2011-01-161-24/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of splitting refcount between (per-cpu) mnt_count and (SMP-only) mnt_longrefs, make all references contribute to mnt_count again and keep track of how many are longterm ones. Accounting rules for longterm count: * 1 for each fs_struct.root.mnt * 1 for each fs_struct.pwd.mnt * 1 for having non-NULL ->mnt_ns * decrement to 0 happens only under vfsmount lock exclusive That allows nice common case for mntput() - since we can't drop the final reference until after mnt_longterm has reached 0 due to the rules above, mntput() can grab vfsmount lock shared and check mnt_longterm. If it turns out to be non-zero (which is the common case), we know that this is not the final mntput() and can just blindly decrement percpu mnt_count. Otherwise we grab vfsmount lock exclusive and do usual decrement-and-check of percpu mnt_count. For fs_struct.c we have mnt_make_longterm() and mnt_make_shortterm(); namespace.c uses the latter in places where we don't already hold vfsmount lock exclusive and opencodes a few remaining spots where we need to manipulate mnt_longterm. Note that we mostly revert the code outside of fs/namespace.c back to what we used to have; in particular, normal code doesn't need to care about two kinds of references, etc. And we get to keep the optimization Nick's variant had bought us... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* Unexport do_add_mount() and add in follow_automount(), not ->d_automount()David Howells2011-01-151-7/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Unexport do_add_mount() and make ->d_automount() return the vfsmount to be added rather than calling do_add_mount() itself. follow_automount() will then do the addition. This slightly complicates things as ->d_automount() normally wants to add the new vfsmount to an expiration list and start an expiration timer. The problem with that is that the vfsmount will be deleted if it has a refcount of 1 and the timer will not repeat if the expiration list is empty. To this end, we require the vfsmount to be returned from d_automount() with a refcount of (at least) 2. One of these refs will be dropped unconditionally. In addition, follow_automount() must get a 3rd ref around the call to do_add_mount() lest it eat a ref and return an error, leaving the mount we have open to being expired as we would otherwise have only 1 ref on it. d_automount() should also add the the vfsmount to the expiration list (by calling mnt_set_expiry()) and start the expiration timer before returning, if this mechanism is to be used. The vfsmount will be unlinked from the expiration list by follow_automount() if do_add_mount() fails. This patch also fixes the call to do_add_mount() for AFS to propagate the mount flags from the parent vfsmount. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* Allow d_manage() to be used in RCU-walk modeDavid Howells2011-01-151-7/+8
| | | | | | | | | | | | | | Allow d_manage() to be called from pathwalk when it is in RCU-walk mode as well as when it is in Ref-walk mode. This permits __follow_mount_rcu() to call d_manage() directly. d_manage() needs a parameter to indicate that it is in RCU-walk mode as it isn't allowed to sleep if in that mode (but should return -ECHILD instead). autofs4_d_manage() can then be set to retain RCU-walk mode if the daemon accesses it and otherwise request dropping back to ref-walk mode. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* Remove a further kludge from __do_follow_link()David Howells2011-01-151-6/+2
| | | | | | | | | | | | Remove a further kludge from __do_follow_link() as it's no longer required with the automount code. This reverts the non-helper-function parts of 051d381259eb57d6074d02a6ba6e90e744f1a29f, which breaks union mounts. Reported-by: vaurora@redhat.com Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>