diff options
author | Linus Torvalds <torvalds@linux-foundation.org> | 2017-11-16 11:41:22 -0800 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2017-11-16 11:41:22 -0800 |
commit | 487e2c9f44c4b5ea23bfe87bb34679f7297a0bce (patch) | |
tree | e9dcf16175078ae2bed9a2fc120e6bd0b28f48e9 /include | |
parent | b630a23a731a436f9edbd9fa00739aaa3e174c15 (diff) | |
parent | 98bf40cd99fcfed0705812b6cbdbb3b441a42970 (diff) | |
download | lwn-487e2c9f44c4b5ea23bfe87bb34679f7297a0bce.tar.gz lwn-487e2c9f44c4b5ea23bfe87bb34679f7297a0bce.zip |
Merge tag 'afs-next-20171113' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
Pull AFS updates from David Howells:
"kAFS filesystem driver overhaul.
The major points of the overhaul are:
(1) Preliminary groundwork is laid for supporting network-namespacing
of kAFS. The remainder of the namespacing work requires some way
to pass namespace information to submounts triggered by an
automount. This requires something like the mount overhaul that's
in progress.
(2) sockaddr_rxrpc is used in preference to in_addr for holding
addresses internally and add support for talking to the YFS VL
server. With this, kAFS can do everything over IPv6 as well as
IPv4 if it's talking to servers that support it.
(3) Callback handling is overhauled to be generally passive rather
than active. 'Callbacks' are promises by the server to tell us
about data and metadata changes. Callbacks are now checked when
we next touch an inode rather than actively going and looking for
it where possible.
(4) File access permit caching is overhauled to store the caching
information per-inode rather than per-directory, shared over
subordinate files. Whilst older AFS servers only allow ACLs on
directories (shared to the files in that directory), newer AFS
servers break that restriction.
To improve memory usage and to make it easier to do mass-key
removal, permit combinations are cached and shared.
(5) Cell database management is overhauled to allow lighter locks to
be used and to make cell records autonomous state machines that
look after getting their own DNS records and cleaning themselves
up, in particular preventing races in acquiring and relinquishing
the fscache token for the cell.
(6) Volume caching is overhauled. The afs_vlocation record is got rid
of to simplify things and the superblock is now keyed on the cell
and the numeric volume ID only. The volume record is tied to a
superblock and normal superblock management is used to mediate
the lifetime of the volume fscache token.
(7) File server record caching is overhauled to make server records
independent of cells and volumes. A server can be in multiple
cells (in such a case, the administrator must make sure that the
VL services for all cells correctly reflect the volumes shared
between those cells).
Server records are now indexed using the UUID of the server
rather than the address since a server can have multiple
addresses.
(8) File server rotation is overhauled to handle VMOVED, VBUSY (and
similar), VOFFLINE and VNOVOL indications and to handle rotation
both of servers and addresses of those servers. The rotation will
also wait and retry if the server says it is busy.
(9) Data writeback is overhauled. Each inode no longer stores a list
of modified sections tagged with the key that authorised it in
favour of noting the modified region of a page in page->private
and storing a list of keys that made modifications in the inode.
This simplifies things and allows other keys to be used to
actually write to the server if a key that made a modification
becomes useless.
(10) Writable mmap() is implemented. This allows a kernel to be build
entirely on AFS.
Note that Pre AFS-3.4 servers are no longer supported, though this can
be added back if necessary (AFS-3.4 was released in 1998)"
* tag 'afs-next-20171113' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: (35 commits)
afs: Protect call->state changes against signals
afs: Trace page dirty/clean
afs: Implement shared-writeable mmap
afs: Get rid of the afs_writeback record
afs: Introduce a file-private data record
afs: Use a dynamic port if 7001 is in use
afs: Fix directory read/modify race
afs: Trace the sending of pages
afs: Trace the initiation and completion of client calls
afs: Fix documentation on # vs % prefix in mount source specification
afs: Fix total-length calculation for multiple-page send
afs: Only progress call state at end of Tx phase from rxrpc callback
afs: Make use of the YFS service upgrade to fully support IPv6
afs: Overhaul volume and server record caching and fileserver rotation
afs: Move server rotation code into its own file
afs: Add an address list concept
afs: Overhaul cell database management
afs: Overhaul permit caching
afs: Overhaul the callback handling
afs: Rename struct afs_call server member to cm_server
...
Diffstat (limited to 'include')
-rw-r--r-- | include/linux/wait_bit.h | 15 | ||||
-rw-r--r-- | include/trace/events/afs.h | 293 | ||||
-rw-r--r-- | include/uapi/linux/magic.h | 1 |
3 files changed, 303 insertions, 6 deletions
diff --git a/include/linux/wait_bit.h b/include/linux/wait_bit.h index af0d495430d7..61b39eaf7cad 100644 --- a/include/linux/wait_bit.h +++ b/include/linux/wait_bit.h @@ -26,6 +26,8 @@ struct wait_bit_queue_entry { { .flags = p, .bit_nr = WAIT_ATOMIC_T_BIT_NR, } typedef int wait_bit_action_f(struct wait_bit_key *key, int mode); +typedef int wait_atomic_t_action_f(atomic_t *counter, unsigned int mode); + void __wake_up_bit(struct wait_queue_head *wq_head, void *word, int bit); int __wait_on_bit(struct wait_queue_head *wq_head, struct wait_bit_queue_entry *wbq_entry, wait_bit_action_f *action, unsigned int mode); int __wait_on_bit_lock(struct wait_queue_head *wq_head, struct wait_bit_queue_entry *wbq_entry, wait_bit_action_f *action, unsigned int mode); @@ -34,7 +36,7 @@ void wake_up_atomic_t(atomic_t *p); int out_of_line_wait_on_bit(void *word, int, wait_bit_action_f *action, unsigned int mode); int out_of_line_wait_on_bit_timeout(void *word, int, wait_bit_action_f *action, unsigned int mode, unsigned long timeout); int out_of_line_wait_on_bit_lock(void *word, int, wait_bit_action_f *action, unsigned int mode); -int out_of_line_wait_on_atomic_t(atomic_t *p, int (*)(atomic_t *), unsigned int mode); +int out_of_line_wait_on_atomic_t(atomic_t *p, wait_atomic_t_action_f action, unsigned int mode); struct wait_queue_head *bit_waitqueue(void *word, int bit); extern void __init wait_bit_init(void); @@ -51,10 +53,11 @@ int wake_bit_function(struct wait_queue_entry *wq_entry, unsigned mode, int sync }, \ } -extern int bit_wait(struct wait_bit_key *key, int bit); -extern int bit_wait_io(struct wait_bit_key *key, int bit); -extern int bit_wait_timeout(struct wait_bit_key *key, int bit); -extern int bit_wait_io_timeout(struct wait_bit_key *key, int bit); +extern int bit_wait(struct wait_bit_key *key, int mode); +extern int bit_wait_io(struct wait_bit_key *key, int mode); +extern int bit_wait_timeout(struct wait_bit_key *key, int mode); +extern int bit_wait_io_timeout(struct wait_bit_key *key, int mode); +extern int atomic_t_wait(atomic_t *counter, unsigned int mode); /** * wait_on_bit - wait for a bit to be cleared @@ -251,7 +254,7 @@ wait_on_bit_lock_action(unsigned long *word, int bit, wait_bit_action_f *action, * outside of the target 'word'. */ static inline -int wait_on_atomic_t(atomic_t *val, int (*action)(atomic_t *), unsigned mode) +int wait_on_atomic_t(atomic_t *val, wait_atomic_t_action_f action, unsigned mode) { might_sleep(); if (atomic_read(val) == 0) diff --git a/include/trace/events/afs.h b/include/trace/events/afs.h index 8b95c16b7045..6b59c63a8e51 100644 --- a/include/trace/events/afs.h +++ b/include/trace/events/afs.h @@ -30,6 +30,38 @@ enum afs_call_trace { afs_call_trace_work, }; +enum afs_fs_operation { + afs_FS_FetchData = 130, /* AFS Fetch file data */ + afs_FS_FetchStatus = 132, /* AFS Fetch file status */ + afs_FS_StoreData = 133, /* AFS Store file data */ + afs_FS_StoreStatus = 135, /* AFS Store file status */ + afs_FS_RemoveFile = 136, /* AFS Remove a file */ + afs_FS_CreateFile = 137, /* AFS Create a file */ + afs_FS_Rename = 138, /* AFS Rename or move a file or directory */ + afs_FS_Symlink = 139, /* AFS Create a symbolic link */ + afs_FS_Link = 140, /* AFS Create a hard link */ + afs_FS_MakeDir = 141, /* AFS Create a directory */ + afs_FS_RemoveDir = 142, /* AFS Remove a directory */ + afs_FS_GetVolumeInfo = 148, /* AFS Get information about a volume */ + afs_FS_GetVolumeStatus = 149, /* AFS Get volume status information */ + afs_FS_GetRootVolume = 151, /* AFS Get root volume name */ + afs_FS_SetLock = 156, /* AFS Request a file lock */ + afs_FS_ExtendLock = 157, /* AFS Extend a file lock */ + afs_FS_ReleaseLock = 158, /* AFS Release a file lock */ + afs_FS_Lookup = 161, /* AFS lookup file in directory */ + afs_FS_FetchData64 = 65537, /* AFS Fetch file data */ + afs_FS_StoreData64 = 65538, /* AFS Store file data */ + afs_FS_GiveUpAllCallBacks = 65539, /* AFS Give up all our callbacks on a server */ + afs_FS_GetCapabilities = 65540, /* AFS Get FS server capabilities */ +}; + +enum afs_vl_operation { + afs_VL_GetEntryByNameU = 527, /* AFS Get Vol Entry By Name operation ID */ + afs_VL_GetAddrsU = 533, /* AFS Get FS server addresses */ + afs_YFSVL_GetEndpoints = 64002, /* YFS Get FS & Vol server addresses */ + afs_VL_GetCapabilities = 65537, /* AFS Get VL server capabilities */ +}; + #endif /* end __AFS_DECLARE_TRACE_ENUMS_ONCE_ONLY */ /* @@ -42,6 +74,37 @@ enum afs_call_trace { EM(afs_call_trace_wake, "WAKE ") \ E_(afs_call_trace_work, "WORK ") +#define afs_fs_operations \ + EM(afs_FS_FetchData, "FS.FetchData") \ + EM(afs_FS_FetchStatus, "FS.FetchStatus") \ + EM(afs_FS_StoreData, "FS.StoreData") \ + EM(afs_FS_StoreStatus, "FS.StoreStatus") \ + EM(afs_FS_RemoveFile, "FS.RemoveFile") \ + EM(afs_FS_CreateFile, "FS.CreateFile") \ + EM(afs_FS_Rename, "FS.Rename") \ + EM(afs_FS_Symlink, "FS.Symlink") \ + EM(afs_FS_Link, "FS.Link") \ + EM(afs_FS_MakeDir, "FS.MakeDir") \ + EM(afs_FS_RemoveDir, "FS.RemoveDir") \ + EM(afs_FS_GetVolumeInfo, "FS.GetVolumeInfo") \ + EM(afs_FS_GetVolumeStatus, "FS.GetVolumeStatus") \ + EM(afs_FS_GetRootVolume, "FS.GetRootVolume") \ + EM(afs_FS_SetLock, "FS.SetLock") \ + EM(afs_FS_ExtendLock, "FS.ExtendLock") \ + EM(afs_FS_ReleaseLock, "FS.ReleaseLock") \ + EM(afs_FS_Lookup, "FS.Lookup") \ + EM(afs_FS_FetchData64, "FS.FetchData64") \ + EM(afs_FS_StoreData64, "FS.StoreData64") \ + EM(afs_FS_GiveUpAllCallBacks, "FS.GiveUpAllCallBacks") \ + E_(afs_FS_GetCapabilities, "FS.GetCapabilities") + +#define afs_vl_operations \ + EM(afs_VL_GetEntryByNameU, "VL.GetEntryByNameU") \ + EM(afs_VL_GetAddrsU, "VL.GetAddrsU") \ + EM(afs_YFSVL_GetEndpoints, "YFSVL.GetEndpoints") \ + E_(afs_VL_GetCapabilities, "VL.GetCapabilities") + + /* * Export enum symbols via userspace. */ @@ -51,6 +114,8 @@ enum afs_call_trace { #define E_(a, b) TRACE_DEFINE_ENUM(a); afs_call_traces; +afs_fs_operations; +afs_vl_operations; /* * Now redefine the EM() and E_() macros to map the enums to the strings that @@ -178,6 +243,234 @@ TRACE_EVENT(afs_call, __entry->where) ); +TRACE_EVENT(afs_make_fs_call, + TP_PROTO(struct afs_call *call, const struct afs_fid *fid), + + TP_ARGS(call, fid), + + TP_STRUCT__entry( + __field(struct afs_call *, call ) + __field(enum afs_fs_operation, op ) + __field_struct(struct afs_fid, fid ) + ), + + TP_fast_assign( + __entry->call = call; + __entry->op = call->operation_ID; + if (fid) { + __entry->fid = *fid; + } else { + __entry->fid.vid = 0; + __entry->fid.vnode = 0; + __entry->fid.unique = 0; + } + ), + + TP_printk("c=%p %06x:%06x:%06x %s", + __entry->call, + __entry->fid.vid, + __entry->fid.vnode, + __entry->fid.unique, + __print_symbolic(__entry->op, afs_fs_operations)) + ); + +TRACE_EVENT(afs_make_vl_call, + TP_PROTO(struct afs_call *call), + + TP_ARGS(call), + + TP_STRUCT__entry( + __field(struct afs_call *, call ) + __field(enum afs_vl_operation, op ) + ), + + TP_fast_assign( + __entry->call = call; + __entry->op = call->operation_ID; + ), + + TP_printk("c=%p %s", + __entry->call, + __print_symbolic(__entry->op, afs_vl_operations)) + ); + +TRACE_EVENT(afs_call_done, + TP_PROTO(struct afs_call *call), + + TP_ARGS(call), + + TP_STRUCT__entry( + __field(struct afs_call *, call ) + __field(struct rxrpc_call *, rx_call ) + __field(int, ret ) + __field(u32, abort_code ) + ), + + TP_fast_assign( + __entry->call = call; + __entry->rx_call = call->rxcall; + __entry->ret = call->error; + __entry->abort_code = call->abort_code; + ), + + TP_printk(" c=%p ret=%d ab=%d [%p]", + __entry->call, + __entry->ret, + __entry->abort_code, + __entry->rx_call) + ); + +TRACE_EVENT(afs_send_pages, + TP_PROTO(struct afs_call *call, struct msghdr *msg, + pgoff_t first, pgoff_t last, unsigned int offset), + + TP_ARGS(call, msg, first, last, offset), + + TP_STRUCT__entry( + __field(struct afs_call *, call ) + __field(pgoff_t, first ) + __field(pgoff_t, last ) + __field(unsigned int, nr ) + __field(unsigned int, bytes ) + __field(unsigned int, offset ) + __field(unsigned int, flags ) + ), + + TP_fast_assign( + __entry->call = call; + __entry->first = first; + __entry->last = last; + __entry->nr = msg->msg_iter.nr_segs; + __entry->bytes = msg->msg_iter.count; + __entry->offset = offset; + __entry->flags = msg->msg_flags; + ), + + TP_printk(" c=%p %lx-%lx-%lx b=%x o=%x f=%x", + __entry->call, + __entry->first, __entry->first + __entry->nr - 1, __entry->last, + __entry->bytes, __entry->offset, + __entry->flags) + ); + +TRACE_EVENT(afs_sent_pages, + TP_PROTO(struct afs_call *call, pgoff_t first, pgoff_t last, + pgoff_t cursor, int ret), + + TP_ARGS(call, first, last, cursor, ret), + + TP_STRUCT__entry( + __field(struct afs_call *, call ) + __field(pgoff_t, first ) + __field(pgoff_t, last ) + __field(pgoff_t, cursor ) + __field(int, ret ) + ), + + TP_fast_assign( + __entry->call = call; + __entry->first = first; + __entry->last = last; + __entry->cursor = cursor; + __entry->ret = ret; + ), + + TP_printk(" c=%p %lx-%lx c=%lx r=%d", + __entry->call, + __entry->first, __entry->last, + __entry->cursor, __entry->ret) + ); + +TRACE_EVENT(afs_dir_check_failed, + TP_PROTO(struct afs_vnode *vnode, loff_t off, loff_t i_size), + + TP_ARGS(vnode, off, i_size), + + TP_STRUCT__entry( + __field(struct afs_vnode *, vnode ) + __field(loff_t, off ) + __field(loff_t, i_size ) + ), + + TP_fast_assign( + __entry->vnode = vnode; + __entry->off = off; + __entry->i_size = i_size; + ), + + TP_printk("vn=%p %llx/%llx", + __entry->vnode, __entry->off, __entry->i_size) + ); + +/* + * We use page->private to hold the amount of the page that we've written to, + * splitting the field into two parts. However, we need to represent a range + * 0...PAGE_SIZE inclusive, so we can't support 64K pages on a 32-bit system. + */ +#if PAGE_SIZE > 32768 +#define AFS_PRIV_MAX 0xffffffff +#define AFS_PRIV_SHIFT 32 +#else +#define AFS_PRIV_MAX 0xffff +#define AFS_PRIV_SHIFT 16 +#endif + +TRACE_EVENT(afs_page_dirty, + TP_PROTO(struct afs_vnode *vnode, const char *where, + pgoff_t page, unsigned long priv), + + TP_ARGS(vnode, where, page, priv), + + TP_STRUCT__entry( + __field(struct afs_vnode *, vnode ) + __field(const char *, where ) + __field(pgoff_t, page ) + __field(unsigned long, priv ) + ), + + TP_fast_assign( + __entry->vnode = vnode; + __entry->where = where; + __entry->page = page; + __entry->priv = priv; + ), + + TP_printk("vn=%p %lx %s %lu-%lu", + __entry->vnode, __entry->page, __entry->where, + __entry->priv & AFS_PRIV_MAX, + __entry->priv >> AFS_PRIV_SHIFT) + ); + +TRACE_EVENT(afs_call_state, + TP_PROTO(struct afs_call *call, + enum afs_call_state from, + enum afs_call_state to, + int ret, u32 remote_abort), + + TP_ARGS(call, from, to, ret, remote_abort), + + TP_STRUCT__entry( + __field(struct afs_call *, call ) + __field(enum afs_call_state, from ) + __field(enum afs_call_state, to ) + __field(int, ret ) + __field(u32, abort ) + ), + + TP_fast_assign( + __entry->call = call; + __entry->from = from; + __entry->to = to; + __entry->ret = ret; + __entry->abort = remote_abort; + ), + + TP_printk("c=%p %u->%u r=%d ab=%d", + __entry->call, + __entry->from, __entry->to, + __entry->ret, __entry->abort) + ); + #endif /* _TRACE_AFS_H */ /* This part must be outside protection */ diff --git a/include/uapi/linux/magic.h b/include/uapi/linux/magic.h index aa50113ebe5b..1a6fee974116 100644 --- a/include/uapi/linux/magic.h +++ b/include/uapi/linux/magic.h @@ -47,6 +47,7 @@ #define OPENPROM_SUPER_MAGIC 0x9fa1 #define QNX4_SUPER_MAGIC 0x002f /* qnx4 fs detection */ #define QNX6_SUPER_MAGIC 0x68191122 /* qnx6 fs detection */ +#define AFS_FS_MAGIC 0x6B414653 #define REISERFS_SUPER_MAGIC 0x52654973 /* used by gcc */ /* used by file system utilities that |