author: Sage Weil <sage@inktank.com> 2012-07-10 11:53:34 -0700
committer: Sage Weil <sage@inktank.com> 2012-07-17 19:35:59 -0700
commit: 5bdca4e0768d3e0f4efa43d9a2cc8210aeb91ab9
tree: de2a46ca2bc95e84737f3fe65e715d602b3b9356 /include
parent: a018540141a931f5299a866907b27886916b4374
libceph: fix messenger retry

In ancient times, the messenger could both initiate and accept connections. An artifact of that was data structures to store/process an incoming ceph_msg_connect request and send an outgoing ceph_msg_connect_reply. Sadly, the negotiation code was referencing those structures and ignoring important information (like the peer's connect_seq) from the correct ones.

Among other things, this fixes tight reconnect loops where the server sends RETRY_SESSION and we (the client) retry with the same connect_seq as last time. This bug is pretty easily triggered by injecting socket failures on the MDS and running some fs workload like workunits/direct_io/test_sync_io.

Signed-off-by: Sage Weil <sage@inktank.com>
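The retry behaviour described above can be illustrated with a minimal sketch. It is not the kernel's negotiation code, which lives outside include/ and is not part of this diff. The struct layouts are reduced to a single host-endian field, and `prepare_connect`, `retry_reusing_sent_seq`, and `retry_adopting_peer_seq` are hypothetical names chosen for illustration. The sketch assumes that a RETRY_SESSION reply carries the peer's connect_seq and that the client should adopt that value before reconnecting.

```c
#include <stdint.h>

/* Simplified stand-ins for the wire structs (assumption: the real
 * structs carry more fields and use little-endian types). */
struct msg_connect       { uint32_t connect_seq; };
struct msg_connect_reply { uint32_t connect_seq; };

struct conn {
	uint32_t connect_seq;              /* value to use on the next attempt */
	struct msg_connect out_connect;    /* request we sent to the peer */
	struct msg_connect_reply in_reply; /* reply received from the peer */
};

/* Hypothetical helper: fill in the outgoing connect request. */
void prepare_connect(struct conn *c)
{
	c->out_connect.connect_seq = c->connect_seq;
}

/* Buggy pattern: the next attempt reuses the value we already sent,
 * so the server sees the same connect_seq and asks for a retry again. */
void retry_reusing_sent_seq(struct conn *c)
{
	c->connect_seq = c->out_connect.connect_seq;
	prepare_connect(c);
}

/* Intended pattern: adopt the connect_seq the peer reported. */
void retry_adopting_peer_seq(struct conn *c)
{
	c->connect_seq = c->in_reply.connect_seq;
	prepare_connect(c);
}
```

With a sent value of 1 and a peer value of 5, the first function sends 1 again while the second sends 5, which is what breaks the tight reconnect loop in this simplified model.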
Diffstat (limited to 'include')
-rw-r--r-- include/linux/ceph/messenger.h | 12
1 files changed, 2 insertions, 10 deletions
diff --git a/include/linux/ceph/messenger.h b/include/linux/ceph/messenger.h
index 2521a95fa6d9..44c87e731e9d 100644
--- a/include/linux/ceph/messenger.h
+++ b/include/linux/ceph/messenger.h
@@ -163,16 +163,8 @@ struct ceph_connection {
 	/* connection negotiation temps */
 	char in_banner[CEPH_BANNER_MAX_LEN];
-	union {
-		struct {  /* outgoing connection */
-			struct ceph_msg_connect out_connect;
-			struct ceph_msg_connect_reply in_reply;
-		};
-		struct {  /* incoming */
-			struct ceph_msg_connect in_connect;
-			struct ceph_msg_connect_reply out_reply;
-		};
-	};
+	struct ceph_msg_connect out_connect;
+	struct ceph_msg_connect_reply in_reply;
 	struct ceph_entity_addr actual_peer_addr;
 
 	/* message out temps */
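As a rough illustration of why the removed union was hazardous, the sketch below rebuilds its shape with simplified one-field structs (an assumption; the real ceph_msg_connect and ceph_msg_connect_reply have more fields). `struct old_temps`, `in_connect_aliases_out_connect`, and `seq_seen_via_in_connect` are hypothetical names. Because both anonymous structs start at the same offset, `in_connect` occupies the same storage as `out_connect`, so a read through `in_connect` returns what the client sent rather than what the peer replied. Whether the stale references in the negotiation code took exactly this form is implied by the commit message but not shown in this header-only diff.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the wire structs. */
struct msg_connect       { uint32_t connect_seq; };
struct msg_connect_reply { uint32_t connect_seq; };

/* Same shape as the union removed by this patch (C11 anonymous members). */
struct old_temps {
	union {
		struct {	/* outgoing connection */
			struct msg_connect out_connect;
			struct msg_connect_reply in_reply;
		};
		struct {	/* incoming */
			struct msg_connect in_connect;
			struct msg_connect_reply out_reply;
		};
	};
};

/* Returns 1 when in_connect shares storage with out_connect. */
int in_connect_aliases_out_connect(void)
{
	return offsetof(struct old_temps, in_connect) ==
	       offsetof(struct old_temps, out_connect);
}

/* Store our own seq and the peer's seq, then read through in_connect. */
uint32_t seq_seen_via_in_connect(uint32_t sent, uint32_t peer)
{
	struct old_temps t;

	t.out_connect.connect_seq = sent;
	t.in_reply.connect_seq = peer;
	return t.in_connect.connect_seq;	/* same bytes as out_connect */
}
```

With only `out_connect` and `in_reply` left in the struct, a stale reference to `in_connect` or `out_reply` no longer compiles, so the compiler catches any remaining misuse.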