On Wed, 2025-09-10 at 12:50 +0200, Ilya Dryomov wrote: > There is a place where generic code in messenger.c is reading and > another place where it is writing to con->v1 union member without > checking that the union member is active (i.e. msgr1 is in use). > > On 64-bit systems, con->v1.auth_retry overlaps with con->v2.out_iter, > so such a read is almost guaranteed to return a bogus value instead of > 0 when msgr2 is in use. This ends up being fairly benign because the > side effect is just the invalidation of the authorizer and successive > fetching of new tickets. > > con->v1.connect_seq overlaps with con->v2.conn_bufs and the fact that > it's being written to can cause more serious consequences, but luckily > it's not something that happens often. > > Cc: stable@xxxxxxxxxxxxxxx > Fixes: cd1a677cad99 ("libceph, ceph: implement msgr2.1 protocol (crc and secure modes)") > Signed-off-by: Ilya Dryomov <idryomov@xxxxxxxxx> > --- > net/ceph/messenger.c | 7 ++++--- > 1 file changed, 4 insertions(+), 3 deletions(-) > > diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c > index d1b5705dc0c6..9f6d860411cb 100644 > --- a/net/ceph/messenger.c > +++ b/net/ceph/messenger.c > @@ -1524,7 +1524,7 @@ static void con_fault_finish(struct ceph_connection *con) > * in case we faulted due to authentication, invalidate our > * current tickets so that we can get new ones. > */ > - if (con->v1.auth_retry) { > + if (!ceph_msgr2(from_msgr(con->msgr)) && con->v1.auth_retry) { Frankly speaking, this check implementation looks pretty not obvious :). static inline bool ceph_msgr2(struct ceph_client *client) { return client->options->con_modes[0] != CEPH_CON_MODE_UNKNOWN; } It's strange that struct ceph_connection_v1_info and struct ceph_connection_v2_info don't start with version field. Because, it will be much cleaner if these structures simply start like this: struct ceph_connection_v1_info { u8 version; <skipped> }; struct ceph_connection_v2_info { u8 version; <skipped> }; But now it's too complicated to change this. > dout("auth_retry %d, invalidating\n", con->v1.auth_retry); > if (con->ops->invalidate_authorizer) > con->ops->invalidate_authorizer(con); > @@ -1714,9 +1714,10 @@ static void clear_standby(struct ceph_connection *con) > { > /* come back from STANDBY? */ > if (con->state == CEPH_CON_S_STANDBY) { > - dout("clear_standby %p and ++connect_seq\n", con); This comment "++connect_seq" makes sense to delete or to rework even without the fix. :) > + dout("clear_standby %p\n", con); > con->state = CEPH_CON_S_PREOPEN; > - con->v1.connect_seq++; > + if (!ceph_msgr2(from_msgr(con->msgr))) > + con->v1.connect_seq++; By the way, we have connect_seq field in struct ceph_connection_v1_info and in struct ceph_connection_v2_info. Why do we increment this field only for version 1? Version 2 doesn't need this increment? > WARN_ON(ceph_con_flag_test(con, CEPH_CON_F_WRITE_PENDING)); > WARN_ON(ceph_con_flag_test(con, CEPH_CON_F_KEEPALIVE_PENDING)); > } Looks good. Reviewed-by: Viacheslav Dubeyko <Slava.Dubeyko@xxxxxxx> Thanks, Slava.