Age | Commit message (Collapse) | Author |
|
Both ip6_input_finish (non-GRO) and esp6_gro_receive (GRO) strip
the IPv6 header without adjusting skb->csum accordingly. As a
result CHECKSUM_COMPLETE breaks and "hw csum failure" is written
to the kernel log by netdev_rx_csum_fault (dev.c).
Fix skb->csum by substracting the checksum value of the pulled IPv6
header using a call to skb_postpull_rcsum.
This affects both transport and tunnel modes.
Note that the fix occurs far from the place that the header was
pulled. This is based on existing code, see:
ipv6_srh_rcv() in exthdrs.c and rawv6_rcv() in raw.c
Signed-off-by: Yossi Kuperman <yossiku@mellanox.com>
Signed-off-by: Ilan Tayari <ilant@mellanox.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
xfrm6_transport_finish rebuilds the IPv6 header based on the
original one and pushes it back without fixing skb->csum.
Therefore, CHECKSUM_COMPLETE is no longer valid and the packet
gets dropped.
Fix skb->csum by calling skb_postpush_rcsum.
Note: A valid IPv4 header has checksum 0, unlike IPv6. Thus,
the change is not needed in the sibling xfrm4_transport_finish
function.
Signed-off-by: Yossi Kuperman <yossiku@mellanox.com>
Signed-off-by: Ilan Tayari <ilant@mellanox.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
Keep the device's reported ip_summed indication in case crypto
was offloaded by the device. Subtract the csum values of the
stripped parts (esp header+iv, esp trailer+auth_data) to keep
value correct.
Note: CHECKSUM_COMPLETE should be indicated only if skb->csum
has the post-decryption offload csum value.
Signed-off-by: Ariel Levkovich <lariel@mellanox.com>
Signed-off-by: Ilan Tayari <ilant@mellanox.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
Two minor conflicts in virtio_net driver (bug fix overlapping addition
of a helper) and MAINTAINERS (new driver edit overlapping revamp of
PHY entry).
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This reverts commit 3a3a4e3054137c5ff5d4d306ec834f6d25d7f95b.
inet6_add_protocol and inet6_del_protocol include casts that remove the
effect of the const annotation on their parameter, leading to possible
runtime crashes.
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Since commit 67a51780aebb ("ipv6: udp: leverage scratch area
helpers") udp6_recvmsg() read the skb len from the scratch area,
to avoid a cache miss.
But the UDP6 rx path support RFC 2675 UDPv6 jumbograms, and their
length exceeds the 16 bits available in the scratch area. As a side
effect the length returned by recvmsg() is:
<ingress datagram len> % (1<<16)
This commit addresses the issue allocating one more bit in the
IP6CB flags field and setting it for incoming jumbograms.
Such field is still in the first cacheline, so at recvmsg()
time we can check it and fallback to access skb->len if
required, without a measurable overhead.
Fixes: 67a51780aebb ("ipv6: udp: leverage scratch area helpers")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
There is no need to go through sk->sk_net to access the net namespace
and its sysctl variables because we allocate the sock and initialize
sk_net just a few lines earlier in the same routine.
Signed-off-by: Jakub Sitnicki <jkbs@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
prequeue is a tcp receive optimization that moves part of rx processing
from bh to process context.
This only works if the socket being processed belongs to a process that
is blocked in recv on that socket.
In practice, this doesn't happen anymore that often because nowadays
servers tend to use an event driven (epoll) model.
Even normal client applications (web browsers) commonly use many tcp
connections in parallel.
This has measureable impact only in netperf (which uses plain recv and
thus allows prequeue use) from host to locally running vm (~4%), however,
there were no changes when using netperf between two physical hosts with
ixgbe interfaces.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Discussion during NFWS 2017 in Faro has shown that the current
conntrack behaviour is unreasonable.
Even if conntrack module is loaded on behalf of a single net namespace,
its turned on for all namespaces, which is expensive. Commit
481fa373476 ("netfilter: conntrack: add nf_conntrack_default_on sysctl")
attempted to provide an alternative to the 'default on' behaviour by
adding a sysctl to change it.
However, as Eric points out, the sysctl only becomes available
once the module is loaded, and then its too late.
So we either have to move the sysctl to the core, or, alternatively,
change conntrack to become active only once the rule set requires this.
This does the latter, conntrack is only enabled when a rule needs it.
Reported-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
We no longer place these on a list so they can be const.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
This is a preparatory patch for adding fib support to the netdev family.
The netdev family receives the packets from ingress hook. At this point
we have no guarantee that the ip header is linear. So this patch
replaces ip_hdr with skb_header_pointer in order to address that
possible situation.
Signed-off-by: Pablo M. Bermudo Garay <pablombg@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
When an early demuxed packet reaches __udp6_lib_lookup_skb(), the
sk reference is retrieved and used, but the relevant reference
count is leaked and the socket destructor is never called.
Beyond leaking the sk memory, if there are pending UDP packets
in the receive queue, even the related accounted memory is leaked.
In the long run, this will cause persistent forward allocation errors
and no UDP skbs (both ipv4 and ipv6) will be able to reach the
user-space.
Fix this by explicitly accessing the early demux reference before
the lookup, and properly decreasing the socket reference count
after usage.
Also drop the skb_steal_sock() in __udp6_lib_lookup_skb(), and
the now obsoleted comment about "socket cache".
The newly added code is derived from the current ipv4 code for the
similar path.
v1 -> v2:
fixed the __udp6_lib_rcv() return code for resubmission,
as suggested by Eric
Reported-by: Sam Edwards <CFSworks@gmail.com>
Reported-by: Marc Haber <mh+netdev@zugschlus.de>
Fixes: 5425077d73e0 ("net: ipv6: Add early demux handler for UDP unicast")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The inet6_protocol structure is only passed as the first argument to
inet6_add_protocol or inet6_del_protocol, both of which are declared as
const. Thus the inet6_protocol structure itself can be const.
Also drop __read_mostly where present on the newly const structures.
Done with the help of Coccinelle.
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
RFC 2465 defines ipv6IfStatsOutFragFails as:
"The number of IPv6 datagrams that have been discarded
because they needed to be fragmented at this output
interface but could not be."
The existing implementation, instead, would increase the counter
twice in case we fail to allocate room for single fragments:
once for the fragment, once for the datagram.
This didn't look intentional though. In one of the two affected
affected failure paths, the double increase was simply a result
of a new 'goto fail' statement, introduced to avoid a skb leak.
The other path appears to be affected since at least 2.6.12-rc2.
Reported-by: Sabrina Dubroca <sdubroca@redhat.com>
Fixes: 1d325d217c7f ("ipv6: ip6_fragment: fix headroom tests and skb leak")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The last (4th) argument of tcp_rcv_established() is redundant as it
always equals to skb->len and the skb itself is always passed as 2th
agrument. There is no reason to have it.
Signed-off-by: Ilya V. Matveychikov <matvejchikov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
In some cases, offset can overflow and can cause an infinite loop in
ip6_find_1stfragopt(). Make it unsigned int to prevent the overflow, and
cap it at IPV6_MAXPLEN, since packets larger than that should be invalid.
This problem has been here since before the beginning of git history.
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
KMSAN reported use of uninitialized memory in skb_set_hash_from_sk(),
which originated from the TCP request socket created in
cookie_v6_check():
==================================================================
BUG: KMSAN: use of uninitialized memory in tcp_transmit_skb+0xf77/0x3ec0
CPU: 1 PID: 2949 Comm: syz-execprog Not tainted 4.11.0-rc5+ #2931
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
TCP: request_sock_TCPv6: Possible SYN flooding on port 20028. Sending cookies. Check SNMP counters.
Call Trace:
<IRQ>
__dump_stack lib/dump_stack.c:16
dump_stack+0x172/0x1c0 lib/dump_stack.c:52
kmsan_report+0x12a/0x180 mm/kmsan/kmsan.c:927
__msan_warning_32+0x61/0xb0 mm/kmsan/kmsan_instr.c:469
skb_set_hash_from_sk ./include/net/sock.h:2011
tcp_transmit_skb+0xf77/0x3ec0 net/ipv4/tcp_output.c:983
tcp_send_ack+0x75b/0x830 net/ipv4/tcp_output.c:3493
tcp_delack_timer_handler+0x9a6/0xb90 net/ipv4/tcp_timer.c:284
tcp_delack_timer+0x1b0/0x310 net/ipv4/tcp_timer.c:309
call_timer_fn+0x240/0x520 kernel/time/timer.c:1268
expire_timers kernel/time/timer.c:1307
__run_timers+0xc13/0xf10 kernel/time/timer.c:1601
run_timer_softirq+0x36/0xa0 kernel/time/timer.c:1614
__do_softirq+0x485/0x942 kernel/softirq.c:284
invoke_softirq kernel/softirq.c:364
irq_exit+0x1fa/0x230 kernel/softirq.c:405
exiting_irq+0xe/0x10 ./arch/x86/include/asm/apic.h:657
smp_apic_timer_interrupt+0x5a/0x80 arch/x86/kernel/apic/apic.c:966
apic_timer_interrupt+0x86/0x90 arch/x86/entry/entry_64.S:489
RIP: 0010:native_restore_fl ./arch/x86/include/asm/irqflags.h:36
RIP: 0010:arch_local_irq_restore ./arch/x86/include/asm/irqflags.h:77
RIP: 0010:__msan_poison_alloca+0xed/0x120 mm/kmsan/kmsan_instr.c:440
RSP: 0018:ffff880024917cd8 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff10
RAX: 0000000000000246 RBX: ffff8800224c0000 RCX: 0000000000000005
RDX: 0000000000000004 RSI: ffff880000000000 RDI: ffffea0000b6d770
RBP: ffff880024917d58 R08: 0000000000000dd8 R09: 0000000000000004
R10: 0000160000000000 R11: 0000000000000000 R12: ffffffff85abf810
R13: ffff880024917dd8 R14: 0000000000000010 R15: ffffffff81cabde4
</IRQ>
poll_select_copy_remaining+0xac/0x6b0 fs/select.c:293
SYSC_select+0x4b4/0x4e0 fs/select.c:653
SyS_select+0x76/0xa0 fs/select.c:634
entry_SYSCALL_64_fastpath+0x13/0x94 arch/x86/entry/entry_64.S:204
RIP: 0033:0x4597e7
RSP: 002b:000000c420037ee0 EFLAGS: 00000246 ORIG_RAX: 0000000000000017
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00000000004597e7
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: 000000c420037ef0 R08: 000000c420037ee0 R09: 0000000000000059
R10: 0000000000000000 R11: 0000000000000246 R12: 000000000042dc20
R13: 00000000000000f3 R14: 0000000000000030 R15: 0000000000000003
chained origin:
save_stack_trace+0x37/0x40 arch/x86/kernel/stacktrace.c:59
kmsan_save_stack_with_flags mm/kmsan/kmsan.c:302
kmsan_save_stack mm/kmsan/kmsan.c:317
kmsan_internal_chain_origin+0x12a/0x1f0 mm/kmsan/kmsan.c:547
__msan_store_shadow_origin_4+0xac/0x110 mm/kmsan/kmsan_instr.c:259
tcp_create_openreq_child+0x709/0x1ae0 net/ipv4/tcp_minisocks.c:472
tcp_v6_syn_recv_sock+0x7eb/0x2a30 net/ipv6/tcp_ipv6.c:1103
tcp_get_cookie_sock+0x136/0x5f0 net/ipv4/syncookies.c:212
cookie_v6_check+0x17a9/0x1b50 net/ipv6/syncookies.c:245
tcp_v6_cookie_check net/ipv6/tcp_ipv6.c:989
tcp_v6_do_rcv+0xdd8/0x1c60 net/ipv6/tcp_ipv6.c:1298
tcp_v6_rcv+0x41a3/0x4f00 net/ipv6/tcp_ipv6.c:1487
ip6_input_finish+0x82f/0x1ee0 net/ipv6/ip6_input.c:279
NF_HOOK ./include/linux/netfilter.h:257
ip6_input+0x239/0x290 net/ipv6/ip6_input.c:322
dst_input ./include/net/dst.h:492
ip6_rcv_finish net/ipv6/ip6_input.c:69
NF_HOOK ./include/linux/netfilter.h:257
ipv6_rcv+0x1dbd/0x22e0 net/ipv6/ip6_input.c:203
__netif_receive_skb_core+0x2f6f/0x3a20 net/core/dev.c:4208
__netif_receive_skb net/core/dev.c:4246
process_backlog+0x667/0xba0 net/core/dev.c:4866
napi_poll net/core/dev.c:5268
net_rx_action+0xc95/0x1590 net/core/dev.c:5333
__do_softirq+0x485/0x942 kernel/softirq.c:284
origin:
save_stack_trace+0x37/0x40 arch/x86/kernel/stacktrace.c:59
kmsan_save_stack_with_flags mm/kmsan/kmsan.c:302
kmsan_internal_poison_shadow+0xb1/0x1a0 mm/kmsan/kmsan.c:198
kmsan_kmalloc+0x7f/0xe0 mm/kmsan/kmsan.c:337
kmem_cache_alloc+0x1c2/0x1e0 mm/slub.c:2766
reqsk_alloc ./include/net/request_sock.h:87
inet_reqsk_alloc+0xa4/0x5b0 net/ipv4/tcp_input.c:6200
cookie_v6_check+0x4f4/0x1b50 net/ipv6/syncookies.c:169
tcp_v6_cookie_check net/ipv6/tcp_ipv6.c:989
tcp_v6_do_rcv+0xdd8/0x1c60 net/ipv6/tcp_ipv6.c:1298
tcp_v6_rcv+0x41a3/0x4f00 net/ipv6/tcp_ipv6.c:1487
ip6_input_finish+0x82f/0x1ee0 net/ipv6/ip6_input.c:279
NF_HOOK ./include/linux/netfilter.h:257
ip6_input+0x239/0x290 net/ipv6/ip6_input.c:322
dst_input ./include/net/dst.h:492
ip6_rcv_finish net/ipv6/ip6_input.c:69
NF_HOOK ./include/linux/netfilter.h:257
ipv6_rcv+0x1dbd/0x22e0 net/ipv6/ip6_input.c:203
__netif_receive_skb_core+0x2f6f/0x3a20 net/core/dev.c:4208
__netif_receive_skb net/core/dev.c:4246
process_backlog+0x667/0xba0 net/core/dev.c:4866
napi_poll net/core/dev.c:5268
net_rx_action+0xc95/0x1590 net/core/dev.c:5333
__do_softirq+0x485/0x942 kernel/softirq.c:284
==================================================================
Similar error is reported for cookie_v4_check().
Fixes: 58d607d3e52f ("tcp: provide skb->hash to synack packets")
Signed-off-by: Alexander Potapenko <glider@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
After rcu conversions performance degradation in forward tests isn't that
noticeable anymore.
See next patch for some numbers.
A followup patcg could then also remove genid from the policies
as we do not cache bundles anymore.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
revert c386578f1cdb4dac230395 ("xfrm: Let the flowcache handle its size by default.").
Once we remove flow cache, we don't have a flow cache limit anymore.
We must not allow (virtually) unlimited allocations of xfrm dst entries.
Revert back to the old xfrm dst gc limits.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
flow cache is removed in next commit.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Rename udp{4,6}_ufo_fragment() to udp{4,6}_tunnel_segment() and only
handle tunnel segmentation.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
We leak the temporary allocated resources in error paths,
fix this by freeing them.
Fixes: fca11ebde3f ("esp4: Reorganize esp_output")
Fixes: 383d0350f2c ("esp6: Reorganize esp_output")
Fixes: 3f29770723f ("ipsec: check return value of skb_to_sgvec always")
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
Lennert reported a failure to add different mpls encaps in a multipath
route:
$ ip -6 route add 1234::/16 \
nexthop encap mpls 10 via fe80::1 dev ens3 \
nexthop encap mpls 20 via fe80::1 dev ens3
RTNETLINK answers: File exists
The problem is that the duplicate nexthop detection does not compare
lwtunnel configuration. Add it.
Fixes: 19e42e451506 ("ipv6: support for fib route lwtunnel encap attributes")
Signed-off-by: David Ahern <dsahern@gmail.com>
Reported-by: João Taveira Araújo <joao.taveira@gmail.com>
Reported-by: Lennert Buytenhek <buytenh@wantstofly.org>
Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Tested-by: Lennert Buytenhek <buytenh@wantstofly.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull memdup_user() conversions from Al Viro:
"A fairly self-contained series - hunting down open-coded memdup_user()
and memdup_user_nul() instances"
* 'work.memdup_user' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
bpf: don't open-code memdup_user()
kimage_file_prepare_segments(): don't open-code memdup_user()
ethtool: don't open-code memdup_user()
do_ip_setsockopt(): don't open-code memdup_user()
do_ipv6_setsockopt(): don't open-code memdup_user()
irda: don't open-code memdup_user()
xfrm_user_policy(): don't open-code memdup_user()
ima_write_policy(): don't open-code memdup_user_nul()
sel_write_validatetrans(): don't open-code memdup_user_nul()
|
|
refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Currently, when the link for $DEV is down, this command succeeds but the
address is removed immediately by DAD (1):
ip addr add 1111::12/64 dev $DEV valid_lft 3600 preferred_lft 1800
In the same situation, this will succeed and not remove the address (2):
ip addr add 1111::12/64 dev $DEV
ip addr change 1111::12/64 dev $DEV valid_lft 3600 preferred_lft 1800
The comment in addrconf_dad_begin() when !IF_READY makes it look like
this is the intended behavior, but doesn't explain why:
* If the device is not ready:
* - keep it tentative if it is a permanent address.
* - otherwise, kill it.
We clearly cannot prevent userspace from doing (2), but we can make (1)
work consistently with (2).
addrconf_dad_stop() is only called in two cases: if DAD failed, or to
skip DAD when the link is down. In that second case, the fix is to avoid
deleting the address, like we already do for permanent addresses.
Fixes: 3c21edbd1137 ("[IPV6]: Defer IPv6 device initialization until the link becomes ready.")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.
This patch uses refcount_inc_not_zero() instead of
atomic_inc_not_zero_hint() due to absense of a _hint()
version of refcount API. If the hint() version must
be used, we might need to revisit API.
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
A set of overlapping changes in macvlan and the rocker
driver, nothing serious.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pablo Neira Ayuso says:
====================
Netfilter updates for net-next
The following patchset contains Netfilter updates for your net-next
tree. This batch contains connection tracking updates for the cleanup
iteration path, patches from Florian Westphal:
X) Skip unconfirmed conntracks in nf_ct_iterate_cleanup_net(), just set
dying bit to let the CPU release them.
X) Add nf_ct_iterate_destroy() to be used on module removal, to kill
conntrack from all namespace.
X) Restart iteration on hashtable resizing, since both may occur at
the same time.
X) Use the new nf_ct_iterate_destroy() to remove conntrack with NAT
mapping on module removal.
X) Use nf_ct_iterate_destroy() to remove conntrack entries helper
module removal, from Liping Zhang.
X) Use nf_ct_iterate_cleanup_net() to remove the timeout extension
if user requests this, also from Liping.
X) Add net_ns_barrier() and use it from FTP helper, so make sure
no concurrent namespace removal happens at the same time while
the helper module is being removed.
X) Use NFPROTO_MAX in layer 3 conntrack protocol array, to reduce
module size. Same thing in nf_tables.
Updates for the nf_tables infrastructure:
X) Prepare usage of the extended ACK reporting infrastructure for
nf_tables.
X) Remove unnecessary forward declaration in nf_tables hash set.
X) Skip set size estimation if number of element is not specified.
X) Changes to accomodate a (faster) unresizable hash set implementation,
for anonymous sets and dynamic size fixed sets with no timeouts.
X) Faster lookup function for unresizable hash table for 2 and 4
bytes key.
And, finally, a bunch of asorted small updates and cleanups:
X) Do not hold reference to netdev from ipt_CLUSTER, instead subscribe
to device events and look up for index from the packet path, this
is fixing an issue that is present since the very beginning, patch
from Xin Long.
X) Use nf_register_net_hook() in ipt_CLUSTER, from Florian Westphal.
X) Use ebt_invalid_target() whenever possible in the ebtables tree,
from Gao Feng.
X) Calm down compilation warning in nf_dup infrastructure, patch from
stephen hemminger.
X) Statify functions in nftables rt expression, also from stephen.
X) Update Makefile to use canonical method to specify nf_tables-objs.
From Jike Song.
X) Use nf_conntrack_helpers_register() in amanda and H323.
X) Space cleanup for ctnetlink, from linzhang.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
The commit b65ac44674dd ("udp: try to avoid 2 cache miss on dequeue")
leveraged the scratched area helpers for UDP v4 but I forgot to
update accordingly the IPv6 code path.
This change extends the scratch area usage to the IPv6 code, synching
the two implementations and giving some performance benefit.
IPv6 is again almost on the same level of IPv4, performance-wide.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add support for extended error reporting.
Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add support for extended error reporting.
Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add support for extended error reporting.
Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In __ip6_datagram_connect(), reset sk->sk_v6_daddr and inet->dport if
error occurs.
In udp_v6_early_demux(), check for sk_state to make sure it is in
TCP_ESTABLISHED state.
Together, it makes sure unconnected UDP socket won't be considered as a
valid candidate for early demux.
v3: add TCP_ESTABLISHED state check in udp_v6_early_demux()
v2: fix compilation error
Fixes: 5425077d73e0 ("net: ipv6: Add early demux handler for UDP unicast")
Signed-off-by: Wei Wang <weiwan@google.com>
Acked-by: Maciej Żenczykowski <maze@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next
Steffen Klassert says:
====================
pull request (net-next): ipsec-next 2017-06-23
1) Use memdup_user to spmlify xfrm_user_policy.
From Geliang Tang.
2) Make xfrm_dev_register static to silence a sparse warning.
From Wei Yongjun.
3) Use crypto_memneq to check the ICV in the AH protocol.
From Sabrina Dubroca.
4) Remove some unused variables in esp6.
From Stephen Hemminger.
5) Extend XFRM MIGRATE to allow to change the UDP encapsulation port.
From Antony Antony.
6) Include the UDP encapsulation port to km_migrate announcements.
From Antony Antony.
Please pull or let me know if there are problems.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec
Steffen Klassert says:
====================
pull request (net): ipsec 2017-06-23
1) Fix xfrm garbage collecting when unregistering a netdevice.
From Hangbin Liu.
2) Fix NULL pointer derefernce when exiting a network namespace.
From Hangbin Liu.
3) Fix some error codes in pfkey to prevent a NULL pointer derefernce.
From Dan Carpenter.
4) Fix NULL pointer derefernce on allocation failure in pfkey.
From Dan Carpenter.
5) Adjust IPv6 payload_len to include extension headers. Otherwise
we corrupt the packets when doing ESP GRO on transport mode.
From Yossi Kuperman.
6) Set nhoff to the proper offset of the IPv6 nexthdr when doing ESP GRO.
From Yossi Kuperman.
Please pull or let me know if there are problems.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The memory allocation size is controlled by user-space,
if it is too large just fail silently and return NULL,
not to mention there is a fallback allocation later.
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|