<feed xmlns='http://www.w3.org/2005/Atom'>
<title>lwn.git/net/tipc/bcast.c, branch docs-4.16</title>
<subtitle>Linux kernel documentation tree maintained by Jonathan Corbet</subtitle>
<id>http://mirrors.hust.edu.cn/git/lwn.git/atom?h=docs-4.16</id>
<link rel='self' href='http://mirrors.hust.edu.cn/git/lwn.git/atom?h=docs-4.16'/>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/'/>
<updated>2017-10-13T15:46:00+00:00</updated>
<entry>
<title>tipc: improve destination linked list</title>
<updated>2017-10-13T15:46:00+00:00</updated>
<author>
<name>Jon Maloy</name>
<email>jon.maloy@ericsson.com</email>
</author>
<published>2017-10-13T09:04:22+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=a80ae5306a7346d4e52f59462878beb8362f4bbd'/>
<id>urn:sha1:a80ae5306a7346d4e52f59462878beb8362f4bbd</id>
<content type='text'>
We often see a need for a linked list of destination identities,
sometimes containing a port number, sometimes a node identity, and
sometimes both. The currently defined struct u32_list is not generic
enough to cover all cases, so we extend it to contain two u32 integers
and rename it to struct tipc_dest_list.

Signed-off-by: Jon Maloy &lt;jon.maloy@ericsson.com&gt;
Acked-by: Ying Xue &lt;ying.xue@windriver.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>tipc: correct initialization of skb list</title>
<updated>2017-10-09T04:13:23+00:00</updated>
<author>
<name>Jon Maloy</name>
<email>jon.maloy@ericsson.com</email>
</author>
<published>2017-10-07T12:32:49+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=3382605fd8db1ed1fb03f3f1529490133fe3ab08'/>
<id>urn:sha1:3382605fd8db1ed1fb03f3f1529490133fe3ab08</id>
<content type='text'>
We change the initialization of the skb transmit buffer queues
in the functions tipc_bcast_xmit() and tipc_rcast_xmit() to also
initialize their spinlocks. This is needed because we may, during
error conditions, need to call skb_queue_purge() on those queues
further down the stack.

Signed-off-by: Jon Maloy &lt;jon.maloy@ericsson.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>tipc: make replicast a user selectable option</title>
<updated>2017-01-20T17:10:17+00:00</updated>
<author>
<name>Jon Paul Maloy</name>
<email>jon.maloy@ericsson.com</email>
</author>
<published>2017-01-18T18:50:53+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=01fd12bb189a0772301dd37e9b31e53761269a1b'/>
<id>urn:sha1:01fd12bb189a0772301dd37e9b31e53761269a1b</id>
<content type='text'>
If the bearer carrying multicast messages supports broadcast, those
messages will be sent to all cluster nodes, irrespective of whether
these nodes host any actual destinations socket or not. This is clearly
wasteful if the cluster is large and there are only a few real
destinations for the message being sent.

In this commit we extend the eligibility of the newly introduced
"replicast" transmit option. We now make it possible for a user to
select which method he wants to be used, either as a mandatory setting
via setsockopt(), or as a relative setting where we let the broadcast
layer decide which method to use based on the ratio between cluster
size and the message's actual number of destination nodes.

In the latter case, a sending socket must stick to a previously
selected method until it enters an idle period of at least 5 seconds.
This eliminates the risk of message reordering caused by method change,
i.e., when changes to cluster size or number of destinations would
otherwise mandate a new method to be used.

Reviewed-by: Parthasarathy Bhuvaragan &lt;parthasarathy.bhuvaragan@ericsson.com&gt;
Acked-by: Ying Xue &lt;ying.xue@windriver.com&gt;
Signed-off-by: Jon Maloy &lt;jon.maloy@ericsson.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>tipc: introduce replicast as transport option for multicast</title>
<updated>2017-01-20T17:10:17+00:00</updated>
<author>
<name>Jon Paul Maloy</name>
<email>jon.maloy@ericsson.com</email>
</author>
<published>2017-01-18T18:50:52+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=a853e4c6d0843729e1f25a7a7beff168e1dd7420'/>
<id>urn:sha1:a853e4c6d0843729e1f25a7a7beff168e1dd7420</id>
<content type='text'>
TIPC multicast messages are currently carried over a reliable
'broadcast link', making use of the underlying media's ability to
transport packets as L2 broadcast or IP multicast to all nodes in
the cluster.

When the used bearer is lacking that ability, we can instead emulate
the broadcast service by replicating and sending the packets over as
many unicast links as needed to reach all identified destinations.
We now introduce a new TIPC link-level 'replicast' service that does
this.

Reviewed-by: Parthasarathy Bhuvaragan &lt;parthasarathy.bhuvaragan@ericsson.com&gt;
Acked-by: Ying Xue &lt;ying.xue@windriver.com&gt;
Signed-off-by: Jon Maloy &lt;jon.maloy@ericsson.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>tipc: add functionality to lookup multicast destination nodes</title>
<updated>2017-01-20T17:10:16+00:00</updated>
<author>
<name>Jon Paul Maloy</name>
<email>jon.maloy@ericsson.com</email>
</author>
<published>2017-01-18T18:50:51+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=2ae0b8af1fe35ddaa2e46704ae31a2f9cac0349d'/>
<id>urn:sha1:2ae0b8af1fe35ddaa2e46704ae31a2f9cac0349d</id>
<content type='text'>
As a further preparation for the upcoming 'replicast' functionality,
we add some necessary structs and functions for looking up and returning
a list of all nodes that host destinations for a given multicast message.

Reviewed-by: Parthasarathy Bhuvaragan &lt;parthasarathy.bhuvaragan@ericsson.com&gt;
Acked-by: Ying Xue &lt;ying.xue@windriver.com&gt;
Signed-off-by: Jon Maloy &lt;jon.maloy@ericsson.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>tipc: add function for checking broadcast support in bearer</title>
<updated>2017-01-20T17:10:15+00:00</updated>
<author>
<name>Jon Paul Maloy</name>
<email>jon.maloy@ericsson.com</email>
</author>
<published>2017-01-18T18:50:50+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=9999974a8318b605ebae08a87e86232659e56a52'/>
<id>urn:sha1:9999974a8318b605ebae08a87e86232659e56a52</id>
<content type='text'>
As a preparation for the 'replicast' functionality we are going to
introduce in the next commits, we need the broadcast base structure to
store whether bearer broadcast is available at all from the currently
used bearer or bearers.

We do this by adding a new function tipc_bearer_bcast_support() to
the bearer layer, and letting the bearer selection function in
bcast.c use this to give a new boolean field, 'bcast_support' the
appropriate value.

Reviewed-by: Parthasarathy Bhuvaragan &lt;parthasarathy.bhuvaragan@ericsson.com&gt;
Acked-by: Ying Xue &lt;ying.xue@windriver.com&gt;
Signed-off-by: Jon Maloy &lt;jon.maloy@ericsson.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>tipc: reduce risk of user starvation during link congestion</title>
<updated>2017-01-03T16:13:05+00:00</updated>
<author>
<name>Jon Paul Maloy</name>
<email>jon.maloy@ericsson.com</email>
</author>
<published>2017-01-03T15:55:11+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=365ad353c2564bba8835290061308ba825166b3a'/>
<id>urn:sha1:365ad353c2564bba8835290061308ba825166b3a</id>
<content type='text'>
The socket code currently handles link congestion by either blocking
and trying to send again when the congestion has abated, or just
returning to the user with -EAGAIN and let him re-try later.

This mechanism is prone to starvation, because the wakeup algorithm is
non-atomic. During the time the link issues a wakeup signal, until the
socket wakes up and re-attempts sending, other senders may have come
in between and occupied the free buffer space in the link. This in turn
may lead to a socket having to make many send attempts before it is
successful. In extremely loaded systems we have observed latency times
of several seconds before a low-priority socket is able to send out a
message.

In this commit, we simplify this mechanism and reduce the risk of the
described scenario happening. When a message is attempted sent via a
congested link, we now let it be added to the link's backlog queue
anyway, thus permitting an oversubscription of one message per source
socket. We still create a wakeup item and return an error code, hence
instructing the sender to block or stop sending. Only when enough space
has been freed up in the link's backlog queue do we issue a wakeup event
that allows the sender to continue with the next message, if any.

The fact that a socket now can consider a message sent even when the
link returns a congestion code means that the sending socket code can
be simplified. Also, since this is a good opportunity to get rid of the
obsolete 'mtu change' condition in the three socket send functions, we
now choose to refactor those functions completely.

Signed-off-by: Parthasarathy Bhuvaragan &lt;parthasarathy.bhuvaragan@ericsson.com&gt;
Acked-by: Ying Xue &lt;ying.xue@windriver.com&gt;
Signed-off-by: Jon Maloy &lt;jon.maloy@ericsson.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>tipc: fix broadcast link synchronization problem</title>
<updated>2016-10-29T21:21:09+00:00</updated>
<author>
<name>Jon Paul Maloy</name>
<email>jon.maloy@ericsson.com</email>
</author>
<published>2016-10-27T22:51:55+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=06bd2b1ed04ca9fdbc767859885944a1e8b86b40'/>
<id>urn:sha1:06bd2b1ed04ca9fdbc767859885944a1e8b86b40</id>
<content type='text'>
In commit 2d18ac4ba745 ("tipc: extend broadcast link initialization
criteria") we tried to fix a problem with the initial synchronization
of broadcast link acknowledge values. Unfortunately that solution is
not sufficient to solve the issue.

We have seen it happen that LINK_PROTOCOL/STATE packets with a valid
non-zero unicast acknowledge number may bypass BCAST_PROTOCOL
initialization, NAME_DISTRIBUTOR and other STATE packets with invalid
broadcast acknowledge numbers, leading to premature opening of the
broadcast link. When the bypassed packets finally arrive, they are
inadvertently accepted, and the already correctly initialized
acknowledge number in the broadcast receive link is overwritten by
the invalid (zero) value of the said packets. After this the broadcast
link goes stale.

We now fix this by marking the packets where we know the acknowledge
value is or may be invalid, and then ignoring the acks from those.

To this purpose, we claim an unused bit in the header to indicate that
the value is invalid. We set the bit to 1 in the initial BCAST_PROTOCOL
synchronization packet and all initial ("bulk") NAME_DISTRIBUTOR
packets, plus those LINK_PROTOCOL packets sent out before the broadcast
links are fully synchronized.

This minor protocol update is fully backwards compatible.

Reported-by: John Thompson &lt;thompa.atl@gmail.com&gt;
Tested-by: John Thompson &lt;thompa.atl@gmail.com&gt;
Signed-off-by: Jon Maloy &lt;jon.maloy@ericsson.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>tipc: transfer broadcast nacks in link state messages</title>
<updated>2016-09-03T00:10:24+00:00</updated>
<author>
<name>Jon Paul Maloy</name>
<email>jon.maloy@ericsson.com</email>
</author>
<published>2016-09-01T17:52:49+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=02d11ca20091fcef904f05defda80c53e5b4e793'/>
<id>urn:sha1:02d11ca20091fcef904f05defda80c53e5b4e793</id>
<content type='text'>
When we send broadcasts in clusters of more 70-80 nodes, we sometimes
see the broadcast link resetting because of an excessive number of
retransmissions. This is caused by a combination of two factors:

1) A 'NACK crunch", where loss of broadcast packets is discovered
   and NACK'ed by several nodes simultaneously, leading to multiple
   redundant broadcast retransmissions.

2) The fact that the NACKS as such also are sent as broadcast, leading
   to excessive load and packet loss on the transmitting switch/bridge.

This commit deals with the latter problem, by moving sending of
broadcast nacks from the dedicated BCAST_PROTOCOL/NACK message type
to regular unicast LINK_PROTOCOL/STATE messages. We allocate 10 unused
bits in word 8 of the said message for this purpose, and introduce a
new capability bit, TIPC_BCAST_STATE_NACK in order to keep the change
backwards compatible.

Reviewed-by: Ying Xue &lt;ying.xue@windriver.com&gt;
Signed-off-by: Jon Maloy &lt;jon.maloy@ericsson.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>tipc: remove pre-allocated message header in link struct</title>
<updated>2016-03-07T04:01:20+00:00</updated>
<author>
<name>Jon Paul Maloy</name>
<email>jon.maloy@ericsson.com</email>
</author>
<published>2016-03-03T19:23:21+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=e74a386d70c793ef44678b44b021c99094b28c47'/>
<id>urn:sha1:e74a386d70c793ef44678b44b021c99094b28c47</id>
<content type='text'>
Until now, we have kept a pre-allocated protocol message header
aggregated into struct tipc_link. Apart from adding unnecessary
footprint to the link instances, this requires extra code both to
initialize and re-initialize it.

We now remove this sub-optimization. This change also makes it
possible to clean up the function tipc_build_proto_msg() and remove
a couple of small functions that were accessing the mentioned header.
In particular, we can replace all occurrences of the local function
call link_own_addr(link) with the generic tipc_own_addr(net).

Acked-by: Ying Xue &lt;ying.xue@windriver.com&gt;
Signed-off-by: Jon Maloy &lt;jon.maloy@ericsson.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
</feed>
