diff options
author | Linus Torvalds <torvalds@linux-foundation.org> | 2013-11-19 15:50:47 -0800 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2013-11-19 15:50:47 -0800 |
commit | 1ee2dcc2245340cf4ac94b99c4d00efbeba61824 (patch) | |
tree | 5339e55add987379525177011dde6e3756b1430c /net/ipv4 | |
parent | 4457e6f6c9f6052cd5f44a79037108b5ddeb0ce7 (diff) | |
parent | 091e0662ee2c37867ad918ce7b6ddd17f0e090e2 (diff) | |
download | lwn-1ee2dcc2245340cf4ac94b99c4d00efbeba61824.tar.gz lwn-1ee2dcc2245340cf4ac94b99c4d00efbeba61824.zip |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:
"Mostly these are fixes for fallout due to merge window changes, as
well as cures for problems that have been with us for a much longer
period of time"
1) Johannes Berg noticed two major deficiencies in our genetlink
registration. Some genetlink protocols we passing in constant
counts for their ops array rather than something like
ARRAY_SIZE(ops) or similar. Also, some genetlink protocols were
using fixed IDs for their multicast groups.
We have to retain these fixed IDs to keep existing userland tools
working, but reserve them so that other multicast groups used by
other protocols can not possibly conflict.
In dealing with these two problems, we actually now use less state
management for genetlink operations and multicast groups.
2) When configuring interface hardware timestamping, fix several
drivers that simply do not validate that the hwtstamp_config value
is one the driver actually supports. From Ben Hutchings.
3) Invalid memory references in mwifiex driver, from Amitkumar Karwar.
4) In dev_forward_skb(), set the skb->protocol in the right order
relative to skb_scrub_packet(). From Alexei Starovoitov.
5) Bridge erroneously fails to use the proper wrapper functions to make
calls to netdev_ops->ndo_vlan_rx_{add,kill}_vid. Fix from Toshiaki
Makita.
6) When detaching a bridge port, make sure to flush all VLAN IDs to
prevent them from leaking, also from Toshiaki Makita.
7) Put in a compromise for TCP Small Queues so that deep queued devices
that delay TX reclaim non-trivially don't have such a performance
decrease. One particularly problematic area is 802.11 AMPDU in
wireless. From Eric Dumazet.
8) Fix crashes in tcp_fastopen_cache_get(), we can see NULL socket dsts
here. Fix from Eric Dumzaet, reported by Dave Jones.
9) Fix use after free in ipv6 SIT driver, from Willem de Bruijn.
10) When computing mergeable buffer sizes, virtio-net fails to take the
virtio-net header into account. From Michael Dalton.
11) Fix seqlock deadlock in ip4_datagram_connect() wrt. statistic
bumping, this one has been with us for a while. From Eric Dumazet.
12) Fix NULL deref in the new TIPC fragmentation handling, from Erik
Hugne.
13) 6lowpan bit used for traffic classification was wrong, from Jukka
Rissanen.
14) macvlan has the same issue as normal vlans did wrt. propagating LRO
disabling down to the real device, fix it the same way. From Michal
Kubecek.
15) CPSW driver needs to soft reset all slaves during suspend, from
Daniel Mack.
16) Fix small frame pacing in FQ packet scheduler, from Eric Dumazet.
17) The xen-netfront RX buffer refill timer isn't properly scheduled on
partial RX allocation success, from Ma JieYue.
18) When ipv6 ping protocol support was added, the AF_INET6 protocol
initialization cleanup path on failure was borked a little. Fix
from Vlad Yasevich.
19) If a socket disconnects during a read/recvmsg/recvfrom/etc that
blocks we can do the wrong thing with the msg_name we write back to
userspace. From Hannes Frederic Sowa. There is another fix in the
works from Hannes which will prevent future problems of this nature.
20) Fix route leak in VTI tunnel transmit, from Fan Du.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (106 commits)
genetlink: make multicast groups const, prevent abuse
genetlink: pass family to functions using groups
genetlink: add and use genl_set_err()
genetlink: remove family pointer from genl_multicast_group
genetlink: remove genl_unregister_mc_group()
hsr: don't call genl_unregister_mc_group()
quota/genetlink: use proper genetlink multicast APIs
drop_monitor/genetlink: use proper genetlink multicast APIs
genetlink: only pass array to genl_register_family_with_ops()
tcp: don't update snd_nxt, when a socket is switched from repair mode
atm: idt77252: fix dev refcnt leak
xfrm: Release dst if this dst is improper for vti tunnel
netlink: fix documentation typo in netlink_set_err()
be2net: Delete secondary unicast MAC addresses during be_close
be2net: Fix unconditional enabling of Rx interface options
net, virtio_net: replace the magic value
ping: prevent NULL pointer dereference on write to msg_name
bnx2x: Prevent "timeout waiting for state X"
bnx2x: prevent CFC attention
bnx2x: Prevent panic during DMAE timeout
...
Diffstat (limited to 'net/ipv4')
-rw-r--r-- | net/ipv4/datagram.c | 2 | ||||
-rw-r--r-- | net/ipv4/ip_tunnel.c | 4 | ||||
-rw-r--r-- | net/ipv4/ip_vti.c | 1 | ||||
-rw-r--r-- | net/ipv4/ping.c | 49 | ||||
-rw-r--r-- | net/ipv4/raw.c | 4 | ||||
-rw-r--r-- | net/ipv4/tcp.c | 6 | ||||
-rw-r--r-- | net/ipv4/tcp_metrics.c | 10 | ||||
-rw-r--r-- | net/ipv4/tcp_output.c | 7 | ||||
-rw-r--r-- | net/ipv4/udp.c | 7 |
9 files changed, 41 insertions, 49 deletions
diff --git a/net/ipv4/datagram.c b/net/ipv4/datagram.c index b28e863fe0a7..19e36376d2a0 100644 --- a/net/ipv4/datagram.c +++ b/net/ipv4/datagram.c @@ -57,7 +57,7 @@ int ip4_datagram_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len) if (IS_ERR(rt)) { err = PTR_ERR(rt); if (err == -ENETUNREACH) - IP_INC_STATS_BH(sock_net(sk), IPSTATS_MIB_OUTNOROUTES); + IP_INC_STATS(sock_net(sk), IPSTATS_MIB_OUTNOROUTES); goto out; } diff --git a/net/ipv4/ip_tunnel.c b/net/ipv4/ip_tunnel.c index caf01176a5e4..90ff9570d7d4 100644 --- a/net/ipv4/ip_tunnel.c +++ b/net/ipv4/ip_tunnel.c @@ -454,6 +454,8 @@ int ip_tunnel_rcv(struct ip_tunnel *tunnel, struct sk_buff *skb, tstats->rx_bytes += skb->len; u64_stats_update_end(&tstats->syncp); + skb_scrub_packet(skb, !net_eq(tunnel->net, dev_net(tunnel->dev))); + if (tunnel->dev->type == ARPHRD_ETHER) { skb->protocol = eth_type_trans(skb, tunnel->dev); skb_postpull_rcsum(skb, eth_hdr(skb), ETH_HLEN); @@ -461,8 +463,6 @@ int ip_tunnel_rcv(struct ip_tunnel *tunnel, struct sk_buff *skb, skb->dev = tunnel->dev; } - skb_scrub_packet(skb, !net_eq(tunnel->net, dev_net(tunnel->dev))); - gro_cells_receive(&tunnel->gro_cells, skb); return 0; diff --git a/net/ipv4/ip_vti.c b/net/ipv4/ip_vti.c index 5d9c845d288a..52b802a0cd8c 100644 --- a/net/ipv4/ip_vti.c +++ b/net/ipv4/ip_vti.c @@ -126,6 +126,7 @@ static netdev_tx_t vti_tunnel_xmit(struct sk_buff *skb, struct net_device *dev) if (!rt->dst.xfrm || rt->dst.xfrm->props.mode != XFRM_MODE_TUNNEL) { dev->stats.tx_carrier_errors++; + ip_rt_put(rt); goto tx_error_icmp; } tdev = rt->dst.dev; diff --git a/net/ipv4/ping.c b/net/ipv4/ping.c index cbc85f660d54..876c6ca2d8f9 100644 --- a/net/ipv4/ping.c +++ b/net/ipv4/ping.c @@ -830,8 +830,6 @@ int ping_recvmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg, { struct inet_sock *isk = inet_sk(sk); int family = sk->sk_family; - struct sockaddr_in *sin; - struct sockaddr_in6 *sin6; struct sk_buff *skb; int copied, err; @@ -841,13 +839,6 @@ int ping_recvmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg, if (flags & MSG_OOB) goto out; - if (addr_len) { - if (family == AF_INET) - *addr_len = sizeof(*sin); - else if (family == AF_INET6 && addr_len) - *addr_len = sizeof(*sin6); - } - if (flags & MSG_ERRQUEUE) { if (family == AF_INET) { return ip_recv_error(sk, msg, len); @@ -877,11 +868,15 @@ int ping_recvmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg, /* Copy the address and add cmsg data. */ if (family == AF_INET) { - sin = (struct sockaddr_in *) msg->msg_name; - sin->sin_family = AF_INET; - sin->sin_port = 0 /* skb->h.uh->source */; - sin->sin_addr.s_addr = ip_hdr(skb)->saddr; - memset(sin->sin_zero, 0, sizeof(sin->sin_zero)); + struct sockaddr_in *sin = (struct sockaddr_in *)msg->msg_name; + + if (sin) { + sin->sin_family = AF_INET; + sin->sin_port = 0 /* skb->h.uh->source */; + sin->sin_addr.s_addr = ip_hdr(skb)->saddr; + memset(sin->sin_zero, 0, sizeof(sin->sin_zero)); + *addr_len = sizeof(*sin); + } if (isk->cmsg_flags) ip_cmsg_recv(msg, skb); @@ -890,17 +885,21 @@ int ping_recvmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg, } else if (family == AF_INET6) { struct ipv6_pinfo *np = inet6_sk(sk); struct ipv6hdr *ip6 = ipv6_hdr(skb); - sin6 = (struct sockaddr_in6 *) msg->msg_name; - sin6->sin6_family = AF_INET6; - sin6->sin6_port = 0; - sin6->sin6_addr = ip6->saddr; - - sin6->sin6_flowinfo = 0; - if (np->sndflow) - sin6->sin6_flowinfo = ip6_flowinfo(ip6); - - sin6->sin6_scope_id = ipv6_iface_scope_id(&sin6->sin6_addr, - IP6CB(skb)->iif); + struct sockaddr_in6 *sin6 = + (struct sockaddr_in6 *)msg->msg_name; + + if (sin6) { + sin6->sin6_family = AF_INET6; + sin6->sin6_port = 0; + sin6->sin6_addr = ip6->saddr; + sin6->sin6_flowinfo = 0; + if (np->sndflow) + sin6->sin6_flowinfo = ip6_flowinfo(ip6); + sin6->sin6_scope_id = + ipv6_iface_scope_id(&sin6->sin6_addr, + IP6CB(skb)->iif); + *addr_len = sizeof(*sin6); + } if (inet6_sk(sk)->rxopt.all) pingv6_ops.ip6_datagram_recv_ctl(sk, msg, skb); diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c index 41e1d2845c8f..5cb8ddb505ee 100644 --- a/net/ipv4/raw.c +++ b/net/ipv4/raw.c @@ -696,9 +696,6 @@ static int raw_recvmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg, if (flags & MSG_OOB) goto out; - if (addr_len) - *addr_len = sizeof(*sin); - if (flags & MSG_ERRQUEUE) { err = ip_recv_error(sk, msg, len); goto out; @@ -726,6 +723,7 @@ static int raw_recvmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg, sin->sin_addr.s_addr = ip_hdr(skb)->saddr; sin->sin_port = 0; memset(&sin->sin_zero, 0, sizeof(sin->sin_zero)); + *addr_len = sizeof(*sin); } if (inet->cmsg_flags) ip_cmsg_recv(msg, skb); diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index 8e8529d3c8c9..3dc0c6cf02a8 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -808,12 +808,6 @@ static unsigned int tcp_xmit_size_goal(struct sock *sk, u32 mss_now, xmit_size_goal = min_t(u32, gso_size, sk->sk_gso_max_size - 1 - hlen); - /* TSQ : try to have at least two segments in flight - * (one in NIC TX ring, another in Qdisc) - */ - xmit_size_goal = min_t(u32, xmit_size_goal, - sysctl_tcp_limit_output_bytes >> 1); - xmit_size_goal = tcp_bound_to_half_wnd(tp, xmit_size_goal); /* We try hard to avoid divides here */ diff --git a/net/ipv4/tcp_metrics.c b/net/ipv4/tcp_metrics.c index 2ab09cbae74d..06493736fbc8 100644 --- a/net/ipv4/tcp_metrics.c +++ b/net/ipv4/tcp_metrics.c @@ -663,10 +663,13 @@ void tcp_fastopen_cache_get(struct sock *sk, u16 *mss, void tcp_fastopen_cache_set(struct sock *sk, u16 mss, struct tcp_fastopen_cookie *cookie, bool syn_lost) { + struct dst_entry *dst = __sk_dst_get(sk); struct tcp_metrics_block *tm; + if (!dst) + return; rcu_read_lock(); - tm = tcp_get_metrics(sk, __sk_dst_get(sk), true); + tm = tcp_get_metrics(sk, dst, true); if (tm) { struct tcp_fastopen_metrics *tfom = &tm->tcpm_fastopen; @@ -988,7 +991,7 @@ static int tcp_metrics_nl_cmd_del(struct sk_buff *skb, struct genl_info *info) return 0; } -static struct genl_ops tcp_metrics_nl_ops[] = { +static const struct genl_ops tcp_metrics_nl_ops[] = { { .cmd = TCP_METRICS_CMD_GET, .doit = tcp_metrics_nl_cmd_get, @@ -1079,8 +1082,7 @@ void __init tcp_metrics_init(void) if (ret < 0) goto cleanup; ret = genl_register_family_with_ops(&tcp_metrics_nl_family, - tcp_metrics_nl_ops, - ARRAY_SIZE(tcp_metrics_nl_ops)); + tcp_metrics_nl_ops); if (ret < 0) goto cleanup_subsys; return; diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index 672854664ff5..7820f3a7dd70 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -1875,8 +1875,12 @@ static bool tcp_write_xmit(struct sock *sk, unsigned int mss_now, int nonagle, * - better RTT estimation and ACK scheduling * - faster recovery * - high rates + * Alas, some drivers / subsystems require a fair amount + * of queued bytes to ensure line rate. + * One example is wifi aggregation (802.11 AMPDU) */ - limit = max(skb->truesize, sk->sk_pacing_rate >> 10); + limit = max_t(unsigned int, sysctl_tcp_limit_output_bytes, + sk->sk_pacing_rate >> 10); if (atomic_read(&sk->sk_wmem_alloc) > limit) { set_bit(TSQ_THROTTLED, &tp->tsq_flags); @@ -3093,7 +3097,6 @@ void tcp_send_window_probe(struct sock *sk) { if (sk->sk_state == TCP_ESTABLISHED) { tcp_sk(sk)->snd_wl1 = tcp_sk(sk)->rcv_nxt - 1; - tcp_sk(sk)->snd_nxt = tcp_sk(sk)->write_seq; tcp_xmit_probe_skb(sk, 0); } } diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c index de86e5bc4462..5944d7d668dd 100644 --- a/net/ipv4/udp.c +++ b/net/ipv4/udp.c @@ -1235,12 +1235,6 @@ int udp_recvmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg, int is_udplite = IS_UDPLITE(sk); bool slow; - /* - * Check any passed addresses - */ - if (addr_len) - *addr_len = sizeof(*sin); - if (flags & MSG_ERRQUEUE) return ip_recv_error(sk, msg, len); @@ -1302,6 +1296,7 @@ try_again: sin->sin_port = udp_hdr(skb)->source; sin->sin_addr.s_addr = ip_hdr(skb)->saddr; memset(sin->sin_zero, 0, sizeof(sin->sin_zero)); + *addr_len = sizeof(*sin); } if (inet->cmsg_flags) ip_cmsg_recv(msg, skb); |