| author | Zijing Yin <yzjaurora@gmail.com> | 2026-05-29 06:57:17 -0700 |
|---|---|---|
| committer | Jakub Kicinski <kuba@kernel.org> | 2026-06-02 12:54:58 -0700 |
| commit | 5893cc75a19146b1365867dcf7f01ed420f702ed (patch) | |
| tree | 82e6a2d1adce731fc7082082269915f7d7e9e95c /drivers/net/netdevsim | |
| parent | b04015d769cc59f938e54a83ab59d6cc6cead0de (diff) | |
netdevsim: fib: fix use-after-free of FIB data via debugfs
Writing to the netdevsim debugfs file
"netdevsim/netdevsimN/fib/nexthop_bucket_activity" enters
nsim_nexthop_bucket_activity_write(), which looks up a nexthop in
data->nexthop_ht under rtnl_lock(). If a network namespace teardown,
devlink reload or device deletion runs concurrently, nsim_fib_destroy()
frees that rhashtable (and the surrounding nsim_fib_data) while the
write is still in flight, leading to a slab-use-after-free:

  BUG: KASAN: slab-use-after-free in nsim_nexthop_bucket_activity_write+0xb9e/0xdf0
  Read of size 4 at addr ff1100001a379808 by task syz.0.11967/27894

  CPU: 0 UID: 0 PID: 27894 Comm: syz.0.11967 Not tainted 7.1.0-rc4-gf6f1bfc1980a #4
  Call Trace:
   nsim_nexthop_bucket_activity_write+0xb9e/0xdf0
   full_proxy_write+0x135/0x1a0
   vfs_write+0x2e2/0x1040
   ksys_write+0x146/0x270
   __x64_sys_write+0x76/0xb0
   do_syscall_64+0xb9/0x5b0
   entry_SYSCALL_64_after_hwframe+0x74/0x7c

  Allocated by task 15957:
   rhashtable_init_noprof+0x3ec/0x860
   nsim_fib_create+0x371/0xca0
   nsim_drv_probe+0xd60/0x15c0
   ...
   new_device_store+0x425/0x7f0

  Freed by task 24:
   rhashtable_free_and_destroy+0x10d/0x620
   nsim_fib_destroy+0xc9/0x1c0
   nsim_dev_reload_destroy+0x1e7/0x530
   nsim_dev_reload_down+0x6b/0xd0
   devlink_reload+0x1b5/0x770
   devlink_pernet_pre_exit+0x25d/0x3a0
   ops_undo_list+0x1b7/0xb90
   cleanup_net+0x47f/0x8a0

  The buggy address belongs to the object at ff1100001a379800
   which belongs to the cache kmalloc-1k of size 1024

The freed 1k object is the bucket table of data->nexthop_ht. Shortly
after, the dangling table is dereferenced again and the machine also
takes a GPF in __rht_bucket_nested() from the same call site.
The root cause is a lifetime mismatch: the debugfs files reference
nsim_fib_data (the writer dereferences data->nexthop_ht), but the
files can outlive that data and can exist before it is initialized.
nsim_fib_destroy() freed both rhashtables and only removed the debugfs
directory afterwards, and nsim_fib_create() created the debugfs files
before the rhashtables were initialized and, on the error path, freed
them before removing the files. debugfs keeps the file itself alive
across a ->write() via debugfs_file_get()/debugfs_file_put()
(fs/debugfs/file.c), but it does not keep data->nexthop_ht alive, so the
in-flight writer dereferenced freed memory. rtnl_lock() in the writer
does not help, because the teardown path does not take rtnl around
rhashtable_free_and_destroy().
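
The mismatch can be replayed in a small userspace model. This is only a
sketch under stated assumptions: struct model, file_get(), file_put() and
file_remove_drained() are hypothetical stand-ins, not kernel APIs, and the
real debugfs removal blocks until users drain instead of returning a
status. The interleaving is replayed sequentially so the outcome is
deterministic.

```c
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model {
	int active_users;  /* starts at 1: the file's own initial reference */
	bool file_removed; /* once set, new users are rejected */
	bool data_freed;   /* stands in for freeing data->nexthop_ht */
};

static void model_init(struct model *m)
{
	m->active_users = 1;
	m->file_removed = false;
	m->data_freed = false;
}

/* A writer enters ->write(); fails once removal has started. */
static bool file_get(struct model *m)
{
	if (m->file_removed)
		return false;
	m->active_users++;
	return true;
}

/* A writer leaves ->write(). */
static void file_put(struct model *m)
{
	m->active_users--;
}

/*
 * Start (or re-check) removal: reject new users, drop the initial
 * reference once, and report whether every user has drained.
 */
static bool file_remove_drained(struct model *m)
{
	if (!m->file_removed) {
		m->file_removed = true;
		m->active_users--;
	}
	return m->active_users == 0;
}

/* Old order: data freed first. Returns true if the writer saw freed data. */
static bool old_order_sees_freed_data(void)
{
	struct model m;
	bool stale;

	model_init(&m);
	(void)file_get(&m);            /* writer is in flight */
	m.data_freed = true;           /* teardown frees the tables first */
	stale = m.data_freed;          /* writer dereferences the tables */
	file_put(&m);
	(void)file_remove_drained(&m); /* file removed too late */
	return stale;
}

/* New order: file removed first; data is freed only after the drain. */
static bool new_order_sees_freed_data(void)
{
	struct model m;
	bool stale;

	model_init(&m);
	(void)file_get(&m);            /* writer is in flight */
	if (file_remove_drained(&m))   /* false: writer still holds a reference */
		m.data_freed = true;
	stale = m.data_freed;          /* writer dereferences live tables */
	file_put(&m);
	if (file_remove_drained(&m))   /* true: safe to free now */
		m.data_freed = true;
	return stale;
}
```

In this model the file-level reference protects only the file: with the
old ordering the writer holds a valid reference and still observes freed
data, while with the new ordering the free is deferred until the writer
has dropped its reference.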
Fix it by making the debugfs interface exist only while the data it
exposes is valid, keeping nsim_fib_create() and nsim_fib_destroy()
symmetric:
- In nsim_fib_destroy(), tear down the debugfs files before the data
structures they reference. debugfs_remove_recursive() drops the
initial active-user reference and then waits for every in-flight
->write() to drop its reference before returning, and rejects new
opens (__debugfs_file_removed(), fs/debugfs/inode.c). Once it returns,
no debugfs accessor can reach the FIB data, so the rhashtables and
nsim_fib_data can be destroyed safely. This also covers the bool knobs
in the same directory, which store pointers into the same
nsim_fib_data, and the final kfree(data).
- In nsim_fib_create(), create the debugfs files after the rhashtables
and notifiers are set up. This closes the same race on the
error-unwind path, where a concurrent writer could otherwise observe a
half-constructed instance or a table that the unwind has already
freed. (With only the destroy-side change, a writer racing the create
window instead dereferences an uninitialized data->nexthop_ht.)
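
The ordering rule in the two points above (publish the interface last,
retract it first, unwind in reverse on error) can be sketched as a
simplified userspace model. The step names, the trace buffer and the
three-stage split are assumptions made for illustration; the real
function has more stages and different labels.

```c
#include <string.h>

/* Hypothetical event log; records the order in which steps ran. */
static char trace[128];

static void note(const char *ev)
{
	strcat(trace, ev);
	strcat(trace, " ");
}

enum step { STEP_NONE, STEP_TABLES, STEP_NOTIFIERS, STEP_DEBUGFS };

/* Run one setup step, or fail it if the caller asked for that. */
static int do_step(enum step s, enum step fail_at, const char *name)
{
	if (s == fail_at)
		return -1;
	note(name);
	return 0;
}

/* Set up the data first and expose the debugfs files last. */
static int model_create(enum step fail_at)
{
	int err;

	trace[0] = '\0';
	err = do_step(STEP_TABLES, fail_at, "+tables");
	if (err)
		goto err_out;
	err = do_step(STEP_NOTIFIERS, fail_at, "+notifiers");
	if (err)
		goto err_tables_destroy;
	err = do_step(STEP_DEBUGFS, fail_at, "+debugfs");
	if (err)
		goto err_notifiers_unregister;
	return 0;

	/* Unwind in reverse; the files never existed on any of these paths. */
err_notifiers_unregister:
	note("-notifiers");
err_tables_destroy:
	note("-tables");
err_out:
	return err;
}

/* Mirror of model_create(): retract the files before touching the data. */
static void model_destroy(void)
{
	trace[0] = '\0';
	note("-debugfs");
	note("-notifiers");
	note("-tables");
}
```

Because the debugfs step is the last one that can fail in this model, no
error label ever has to remove files while tables are being freed, which
is the property the reordered labels in the patch aim for.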
This is reproducible by racing, in a loop, writes to
/sys/kernel/debug/netdevsim/netdevsimN/fib/nexthop_bucket_activity
against a teardown of the same netdevsim instance -- a devlink reload
("devlink dev reload netdevsim/netdevsimN"), destroying the network
namespace it lives in, or "echo N > /sys/bus/netdevsim/del_device". It
was found with syzkaller; a syzkaller reproducer is available. A
standalone C reproducer does not trigger it reliably because the race
needs the netns-teardown/reload path.
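
For reference, the writer half of such a loop might look like the sketch
below. It is not the syzkaller reproducer. The helper names are
hypothetical, and the payload format (a nexthop ID and a bucket index as
two space-separated integers, newline-terminated as "echo" would send
them) is an assumption that should be checked against the driver. The
teardown side still has to be driven separately, and as noted above a
standalone program is not expected to hit the race reliably.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Write one "<nhid> <bucket>\n" record to path (assumed payload format).
 * Returns the number of bytes written, or -1 on any error.
 */
static ssize_t write_activity(const char *path, unsigned int nhid,
			      unsigned short bucket)
{
	char buf[32];
	ssize_t ret;
	int len, fd;

	len = snprintf(buf, sizeof(buf), "%u %hu\n", nhid, bucket);
	if (len < 0 || (size_t)len >= sizeof(buf))
		return -1;

	/* Reopen for every write so each attempt starts at offset 0. */
	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -1;
	ret = write(fd, buf, (size_t)len);
	close(fd);
	return ret;
}

/* Issue the same write repeatedly; returns how many attempts succeeded. */
static unsigned long hammer(const char *path, unsigned long iterations)
{
	unsigned long ok = 0;
	unsigned long i;

	for (i = 0; i < iterations; i++)
		if (write_activity(path, 1, 0) > 0)
			ok++;
	return ok;
}
```

Failed writes are expected while the instance is being torn down and
recreated, so the loop counts successes instead of stopping on the first
error.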
Cc: <stable+noautosel@kernel.org> # netdevsim is a test harness, it's never loaded on production systems
Signed-off-by: Zijing Yin <yzjaurora@gmail.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20260529135718.1804031-1-yzjaurora@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Diffstat (limited to 'drivers/net/netdevsim')
| -rw-r--r-- | drivers/net/netdevsim/fib.c | 17 |
1 files changed, 9 insertions, 8 deletions
diff --git a/drivers/net/netdevsim/fib.c b/drivers/net/netdevsim/fib.c
index 1a42bdbfaa41..55bcdefadc9b 100644
--- a/drivers/net/netdevsim/fib.c
+++ b/drivers/net/netdevsim/fib.c
@@ -1562,14 +1562,11 @@ struct nsim_fib_data *nsim_fib_create(struct devlink *devlink,
 	data->devlink = devlink;
 
 	nsim_dev = devlink_priv(devlink);
-	err = nsim_fib_debugfs_init(data, nsim_dev);
-	if (err)
-		goto err_data_free;
 
 	mutex_init(&data->nh_lock);
 	err = rhashtable_init(&data->nexthop_ht, &nsim_nexthop_ht_params);
 	if (err)
-		goto err_debugfs_exit;
+		goto err_nh_lock_destroy;
 
 	mutex_init(&data->fib_lock);
 	INIT_LIST_HEAD(&data->fib_rt_list);
@@ -1600,6 +1597,10 @@ struct nsim_fib_data *nsim_fib_create(struct devlink *devlink,
 		goto err_nexthop_nb_unregister;
 	}
 
+	err = nsim_fib_debugfs_init(data, nsim_dev);
+	if (err)
+		goto err_fib_notifier_unregister;
+
 	devl_resource_occ_get_register(devlink,
 				       NSIM_RESOURCE_IPV4_FIB,
 				       nsim_fib_ipv4_resource_occ_get,
@@ -1622,6 +1623,8 @@ struct nsim_fib_data *nsim_fib_create(struct devlink *devlink,
 				       data);
 	return data;
 
+err_fib_notifier_unregister:
+	unregister_fib_notifier(devlink_net(devlink), &data->fib_nb);
 err_nexthop_nb_unregister:
 	unregister_nexthop_notifier(devlink_net(devlink), &data->nexthop_nb);
 err_rhashtable_fib_destroy:
@@ -1633,10 +1636,8 @@ err_rhashtable_nexthop_destroy:
 	rhashtable_free_and_destroy(&data->nexthop_ht, nsim_nexthop_free,
 				    data);
 	mutex_destroy(&data->fib_lock);
-err_debugfs_exit:
+err_nh_lock_destroy:
 	mutex_destroy(&data->nh_lock);
-	nsim_fib_debugfs_exit(data);
-err_data_free:
 	kfree(data);
 	return ERR_PTR(err);
 }
@@ -1653,6 +1654,7 @@ void nsim_fib_destroy(struct devlink *devlink, struct nsim_fib_data *data)
 					 NSIM_RESOURCE_IPV4_FIB_RULES);
 	devl_resource_occ_get_unregister(devlink,
 					 NSIM_RESOURCE_IPV4_FIB);
+	nsim_fib_debugfs_exit(data);
 	unregister_fib_notifier(devlink_net(devlink), &data->fib_nb);
 	unregister_nexthop_notifier(devlink_net(devlink), &data->nexthop_nb);
 	cancel_work_sync(&data->fib_flush_work);
@@ -1665,6 +1667,5 @@ void nsim_fib_destroy(struct devlink *devlink, struct nsim_fib_data *data)
 	WARN_ON_ONCE(!list_empty(&data->fib_rt_list));
 	mutex_destroy(&data->fib_lock);
 	mutex_destroy(&data->nh_lock);
-	nsim_fib_debugfs_exit(data);
 	kfree(data);
 }
