summaryrefslogtreecommitdiff
path: root/Documentation
diff options
context:
space:
mode:
authorYuchung Cheng <ycheng@google.com>2015-10-16 21:57:42 -0700
committerDavid S. Miller <davem@davemloft.net>2015-10-21 07:00:43 -0700
commitf672258391b42a5c7cc2732c9c063e56a85c8dbe (patch)
tree7b94f91a3b04fd478a4ea08eaa52edbc38492a99 /Documentation
parent9e45a3e36b363cc4c79c70f2b4f994e66543a219 (diff)
downloadlwn-f672258391b42a5c7cc2732c9c063e56a85c8dbe.tar.gz
lwn-f672258391b42a5c7cc2732c9c063e56a85c8dbe.zip
tcp: track min RTT using windowed min-filter
Kathleen Nichols' algorithm for tracking the minimum RTT of a data stream over some measurement window. It uses constant space and constant time per update. Yet it almost always delivers the same minimum as an implementation that has to keep all the data in the window. The measurement window is tunable via sysctl.net.ipv4.tcp_min_rtt_wlen with a default value of 5 minutes. The algorithm keeps track of the best, 2nd best & 3rd best min values, maintaining an invariant that the measurement time of the n'th best >= n-1'th best. It also makes sure that the three values are widely separated in the time window since that bounds the worse case error when that data is monotonically increasing over the window. Upon getting a new min, we can forget everything earlier because it has no value - the new min is less than everything else in the window by definition and it's the most recent. So we restart fresh on every new min and overwrites the 2nd & 3rd choices. The same property holds for the 2nd & 3rd best. Therefore we have to maintain two invariants to maximize the information in the samples, one on values (1st.v <= 2nd.v <= 3rd.v) and the other on times (now-win <=1st.t <= 2nd.t <= 3rd.t <= now). These invariants determine the structure of the code The RTT input to the windowed filter is the minimum RTT measured from ACK or SACK, or as the last resort from TCP timestamps. The accessor tcp_min_rtt() returns the minimum RTT seen in the window. ~0U indicates it is not available. The minimum is 1usec even if the true RTT is below that. Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Diffstat (limited to 'Documentation')
-rw-r--r--Documentation/networking/ip-sysctl.txt8
1 files changed, 8 insertions, 0 deletions
diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt
index ebe94f2cab98..502d6a572b4f 100644
--- a/Documentation/networking/ip-sysctl.txt
+++ b/Documentation/networking/ip-sysctl.txt
@@ -384,6 +384,14 @@ tcp_mem - vector of 3 INTEGERs: min, pressure, max
Defaults are calculated at boot time from amount of available
memory.
+tcp_min_rtt_wlen - INTEGER
+ The window length of the windowed min filter to track the minimum RTT.
+ A shorter window lets a flow more quickly pick up new (higher)
+ minimum RTT when it is moved to a longer path (e.g., due to traffic
+ engineering). A longer window makes the filter more resistant to RTT
+ inflations such as transient congestion. The unit is seconds.
+ Default: 300
+
tcp_moderate_rcvbuf - BOOLEAN
If set, TCP performs receive buffer auto-tuning, attempting to
automatically size the buffer (no greater than tcp_rmem[2]) to