diff options
author | Alexei Starovoitov <ast@plumgrid.com> | 2013-11-19 19:12:34 -0800 |
---|---|---|
committer | David S. Miller <davem@davemloft.net> | 2013-11-20 15:28:44 -0500 |
commit | dcdfdf56b4a6c9437fc37dbc9cee94a788f9b0c4 (patch) | |
tree | f240417b7245048480497491fc82b464f14ebadd /net | |
parent | 4f837c3b117c4fdae72d901034d0565d16af7966 (diff) | |
download | lwn-dcdfdf56b4a6c9437fc37dbc9cee94a788f9b0c4.tar.gz lwn-dcdfdf56b4a6c9437fc37dbc9cee94a788f9b0c4.zip |
ipv4: fix race in concurrent ip_route_input_slow()
CPUs can ask for local route via ip_route_input_noref() concurrently.
if nh_rth_input is not cached yet, CPUs will proceed to allocate
equivalent DSTs on 'lo' and then will try to cache them in nh_rth_input
via rt_cache_route()
Most of the time they succeed, but on occasion the following two lines:
orig = *p;
prev = cmpxchg(p, orig, rt);
in rt_cache_route() do race and one of the cpus fails to complete cmpxchg.
But ip_route_input_slow() doesn't check the return code of rt_cache_route(),
so dst is leaking. dst_destroy() is never called and 'lo' device
refcnt doesn't go to zero, which can be seen in the logs as:
unregister_netdevice: waiting for lo to become free. Usage count = 1
Adding mdelay() between above two lines makes it easily reproducible.
Fix it similar to nh_pcpu_rth_output case.
Fixes: d2d68ba9fe8b ("ipv4: Cache input routes in fib_info nexthops.")
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Diffstat (limited to 'net')
-rw-r--r-- | net/ipv4/route.c | 8 |
1 files changed, 6 insertions, 2 deletions
diff --git a/net/ipv4/route.c b/net/ipv4/route.c index f428935c50db..f8da28278014 100644 --- a/net/ipv4/route.c +++ b/net/ipv4/route.c @@ -1776,8 +1776,12 @@ local_input: rth->dst.error= -err; rth->rt_flags &= ~RTCF_LOCAL; } - if (do_cache) - rt_cache_route(&FIB_RES_NH(res), rth); + if (do_cache) { + if (unlikely(!rt_cache_route(&FIB_RES_NH(res), rth))) { + rth->dst.flags |= DST_NOCACHE; + rt_add_uncached_list(rth); + } + } skb_dst_set(skb, &rth->dst); err = 0; goto out; |