author | Willem de Bruijn <willemb@google.com> | 2017-08-03 16:29:41 -0400 |
---|---|---|
committer | David S. Miller <davem@davemloft.net> | 2017-08-03 21:37:30 -0700 |
commit | 1f8b977ab32dc5d148f103326e80d9097f1cefb5 (patch) | |
tree | 1c2c09ca72dba5bd43f8b3ead791d271f075c24b /drivers/vhost | |
parent | 76851d1212c11365362525e1e2c0a18c97478e6b (diff) | |
sock: enable MSG_ZEROCOPY
Prepare the datapath for refcounted ubuf_info. Clone ubuf_info with
skb_zerocopy_clone() wherever needed due to skb split, merge, resize
or clone.
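
The reference counting described above can be illustrated with a small userspace model. This is only a sketch, not kernel code: struct ubuf_model and the ubuf_model_* helpers are hypothetical names, and C11 atomics stand in for the kernel's atomic_t. It shows why a producer such as handle_tx() starts the count at one, and that the completion callback should fire only when the last holder drops its reference.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical userspace stand-in for the kernel's struct ubuf_info. */
struct ubuf_model {
	atomic_int refcnt;   /* number of skb-like holders of this buffer */
	int completions;     /* how many times the completion callback ran */
};

/* Stand-in for the zerocopy completion callback. */
static void ubuf_model_complete(struct ubuf_model *u)
{
	u->completions++;
}

/* The producer owns the first reference, so the count starts at one. */
static void ubuf_model_init(struct ubuf_model *u)
{
	atomic_init(&u->refcnt, 1);
	u->completions = 0;
}

/* Models a clone or split that shares the buffer with a second holder. */
static void ubuf_model_get(struct ubuf_model *u)
{
	atomic_fetch_add(&u->refcnt, 1);
}

/* Drops one reference; only the last drop notifies the owner. */
static void ubuf_model_put(struct ubuf_model *u)
{
	int prev = atomic_fetch_sub(&u->refcnt, 1);

	assert(prev > 0);            /* a put without a matching get is a bug */
	if (prev == 1)
		ubuf_model_complete(u);
}
```

In this model, calling ubuf_model_init(), then ubuf_model_get() for a clone, then ubuf_model_put() twice runs the callback exactly once, on the second put.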
Split skb_orphan_frags into two variants. The split, merge, and similar
paths support reference counted zerocopy buffers, so do not do a deep
copy there. Add skb_orphan_frags_rx for paths that may loop packets to
receive sockets. Queuing zerocopy buffers on a receive socket is not
allowed, as it may cause unbounded latency. Deep copy all zerocopy
buffers, ref-counted or not, in this path.
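
The difference between the two variants can be sketched in a userspace model. This is an illustration under stated assumptions, not the kernel implementation: struct skb_model, its fields and the model_* functions are hypothetical, and the "refcounted" flag stands in for whatever test the kernel uses to recognize a reference counted ubuf_info.

```c
#include <stdbool.h>

/* Hypothetical stand-in for an sk_buff carrying zerocopy fragments. */
struct skb_model {
	bool zerocopy;     /* frags still point at user pages */
	bool refcounted;   /* ubuf_info can be shared via a refcount */
	int deep_copies;   /* how many deep copies were performed */
};

/* Stand-in for skb_copy_ubufs(): copy user pages into private pages. */
static int model_copy_ubufs(struct skb_model *skb)
{
	skb->deep_copies++;
	skb->zerocopy = false;   /* frags no longer reference user memory */
	return 0;
}

/* Transmit-side variant: refcounted buffers may stay shared. */
static int model_orphan_frags(struct skb_model *skb)
{
	if (!skb->zerocopy)
		return 0;
	if (skb->refcounted)
		return 0;            /* no deep copy needed */
	return model_copy_ubufs(skb);
}

/* Receive-side variant: always copy, ref-counted or not. */
static int model_orphan_frags_rx(struct skb_model *skb)
{
	if (!skb->zerocopy)
		return 0;
	return model_copy_ubufs(skb);
}
```

With a refcounted zerocopy buffer, model_orphan_frags() leaves the fragments untouched while model_orphan_frags_rx() copies them. A legacy buffer without reference counting is copied by both.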
The exact locations to modify were chosen by exhaustively searching
through all code that might modify skb_frag references and/or the
SKBTX_DEV_ZEROCOPY tx_flags bit.
The changes err on the safe side, in two ways.
(1) legacy ubuf_info paths virtio and tap are not modified. They keep
a 1:1 ubuf_info to sk_buff relationship. Calls to skb_orphan_frags
still call skb_copy_ubufs and thus copy frags in this case.
(2) not all copies deep in the stack are addressed yet. skb_shift,
skb_split and skb_try_coalesce can be refined to avoid copying.
These are not in the hot path and this patch is hairy enough as
is, so that is left for future refinement.
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Diffstat (limited to 'drivers/vhost')
-rw-r--r-- | drivers/vhost/net.c | 1 |
1 files changed, 1 insertions, 0 deletions
diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 06d044862e58..ba08b78ed630 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -533,6 +533,7 @@ static void handle_tx(struct vhost_net *net)
 			ubuf->callback = vhost_zerocopy_callback;
 			ubuf->ctx = nvq->ubufs;
 			ubuf->desc = nvq->upend_idx;
+			atomic_set(&ubuf->refcnt, 1);
 			msg.msg_control = ubuf;
 			msg.msg_controllen = sizeof(ubuf);
 			ubufs = nvq->ubufs;