213 lines
		
	
	
		
			9.6 KiB
		
	
	
	
		
			Diff
		
	
	
	
	
	
			
		
		
	
	
			213 lines
		
	
	
		
			9.6 KiB
		
	
	
	
		
			Diff
		
	
	
	
	
	
| Bugzilla: 1123696
 | |
| Upstream-status: Queued for 3.16
 | |
| 
 | |
| From patchwork Tue Jul 22 13:22:45 2014
 | |
| Content-Type: text/plain; charset="utf-8"
 | |
| MIME-Version: 1.0
 | |
| Content-Transfer-Encoding: 7bit
 | |
| Subject: [net,v2] net: sctp: inherit auth_capable on INIT collisions
 | |
| From: Daniel Borkmann <dborkman@redhat.com>
 | |
| X-Patchwork-Id: 372475
 | |
| Message-Id: <1406035365-1154-1-git-send-email-dborkman@redhat.com>
 | |
| To: davem@davemloft.net
 | |
| Cc: jgunthorpe@obsidianresearch.com, netdev@vger.kernel.org,
 | |
|  linux-sctp@vger.kernel.org, Vlad Yasevich <vyasevich@gmail.com>
 | |
| Date: Tue, 22 Jul 2014 15:22:45 +0200
 | |
| 
 | |
| Jason reported an oops caused by SCTP on his ARM machine with
 | |
| SCTP authentication enabled:
 | |
| 
 | |
| Internal error: Oops: 17 [#1] ARM
 | |
| CPU: 0 PID: 104 Comm: sctp-test Not tainted 3.13.0-68744-g3632f30c9b20-dirty #1
 | |
| task: c6eefa40 ti: c6f52000 task.ti: c6f52000
 | |
| PC is at sctp_auth_calculate_hmac+0xc4/0x10c
 | |
| LR is at sg_init_table+0x20/0x38
 | |
| pc : [<c024bb80>]    lr : [<c00f32dc>]    psr: 40000013
 | |
| sp : c6f538e8  ip : 00000000  fp : c6f53924
 | |
| r10: c6f50d80  r9 : 00000000  r8 : 00010000
 | |
| r7 : 00000000  r6 : c7be4000  r5 : 00000000  r4 : c6f56254
 | |
| r3 : c00c8170  r2 : 00000001  r1 : 00000008  r0 : c6f1e660
 | |
| Flags: nZcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
 | |
| Control: 0005397f  Table: 06f28000  DAC: 00000015
 | |
| Process sctp-test (pid: 104, stack limit = 0xc6f521c0)
 | |
| Stack: (0xc6f538e8 to 0xc6f54000)
 | |
| [...]
 | |
| Backtrace:
 | |
| [<c024babc>] (sctp_auth_calculate_hmac+0x0/0x10c) from [<c0249af8>] (sctp_packet_transmit+0x33c/0x5c8)
 | |
| [<c02497bc>] (sctp_packet_transmit+0x0/0x5c8) from [<c023e96c>] (sctp_outq_flush+0x7fc/0x844)
 | |
| [<c023e170>] (sctp_outq_flush+0x0/0x844) from [<c023ef78>] (sctp_outq_uncork+0x24/0x28)
 | |
| [<c023ef54>] (sctp_outq_uncork+0x0/0x28) from [<c0234364>] (sctp_side_effects+0x1134/0x1220)
 | |
| [<c0233230>] (sctp_side_effects+0x0/0x1220) from [<c02330b0>] (sctp_do_sm+0xac/0xd4)
 | |
| [<c0233004>] (sctp_do_sm+0x0/0xd4) from [<c023675c>] (sctp_assoc_bh_rcv+0x118/0x160)
 | |
| [<c0236644>] (sctp_assoc_bh_rcv+0x0/0x160) from [<c023d5bc>] (sctp_inq_push+0x6c/0x74)
 | |
| [<c023d550>] (sctp_inq_push+0x0/0x74) from [<c024a6b0>] (sctp_rcv+0x7d8/0x888)
 | |
| 
 | |
| While we already had various kind of bugs in that area
 | |
| ec0223ec48a9 ("net: sctp: fix sctp_sf_do_5_1D_ce to verify if
 | |
| we/peer is AUTH capable") and b14878ccb7fa ("net: sctp: cache
 | |
| auth_enable per endpoint"), this one is a bit of a different
 | |
| kind.
 | |
| 
 | |
| Giving a bit more background on why SCTP authentication is
 | |
| needed can be found in RFC4895:
 | |
| 
 | |
|   SCTP uses 32-bit verification tags to protect itself against
 | |
|   blind attackers. These values are not changed during the
 | |
|   lifetime of an SCTP association.
 | |
| 
 | |
|   Looking at new SCTP extensions, there is the need to have a
 | |
|   method of proving that an SCTP chunk(s) was really sent by
 | |
|   the original peer that started the association and not by a
 | |
|   malicious attacker.
 | |
| 
 | |
| To cause this bug, we're triggering an INIT collision between
 | |
| peers; normal SCTP handshake where both sides intent to
 | |
| authenticate packets contains RANDOM; CHUNKS; HMAC-ALGO
 | |
| parameters that are being negotiated among peers:
 | |
| 
 | |
|   ---------- INIT[RANDOM; CHUNKS; HMAC-ALGO] ---------->
 | |
|   <------- INIT-ACK[RANDOM; CHUNKS; HMAC-ALGO] ---------
 | |
|   -------------------- COOKIE-ECHO -------------------->
 | |
|   <-------------------- COOKIE-ACK ---------------------
 | |
| 
 | |
| RFC4895 says that each endpoint therefore knows its own random
 | |
| number and the peer's random number *after* the association
 | |
| has been established. The local and peer's random number along
 | |
| with the shared key are then part of the secret used for
 | |
| calculating the HMAC in the AUTH chunk.
 | |
| 
 | |
| Now, in our scenario, we have 2 threads with 1 non-blocking
 | |
| SEQ_PACKET socket each, setting up common shared SCTP_AUTH_KEY
 | |
| and SCTP_AUTH_ACTIVE_KEY properly, and each of them calling
 | |
| sctp_bindx(3), listen(2) and connect(2) against each other,
 | |
| thus the handshake looks similar to this, e.g.:
 | |
| 
 | |
|   ---------- INIT[RANDOM; CHUNKS; HMAC-ALGO] ---------->
 | |
|   <------- INIT-ACK[RANDOM; CHUNKS; HMAC-ALGO] ---------
 | |
|   <--------- INIT[RANDOM; CHUNKS; HMAC-ALGO] -----------
 | |
|   -------- INIT-ACK[RANDOM; CHUNKS; HMAC-ALGO] -------->
 | |
|   ...
 | |
| 
 | |
| Since such collisions can also happen with verification tags,
 | |
| the RFC4895 for AUTH rather vaguely says under section 6.1:
 | |
| 
 | |
|   In case of INIT collision, the rules governing the handling
 | |
|   of this Random Number follow the same pattern as those for
 | |
|   the Verification Tag, as explained in Section 5.2.4 of
 | |
|   RFC 2960 [5]. Therefore, each endpoint knows its own Random
 | |
|   Number and the peer's Random Number after the association
 | |
|   has been established.
 | |
| 
 | |
| In RFC2960, section 5.2.4, we're eventually hitting Action B:
 | |
| 
 | |
|   B) In this case, both sides may be attempting to start an
 | |
|      association at about the same time but the peer endpoint
 | |
|      started its INIT after responding to the local endpoint's
 | |
|      INIT. Thus it may have picked a new Verification Tag not
 | |
|      being aware of the previous Tag it had sent this endpoint.
 | |
|      The endpoint should stay in or enter the ESTABLISHED
 | |
|      state but it MUST update its peer's Verification Tag from
 | |
|      the State Cookie, stop any init or cookie timers that may
 | |
|      running and send a COOKIE ACK.
 | |
| 
 | |
| In other words, the handling of the Random parameter is the
 | |
| same as behavior for the Verification Tag as described in
 | |
| Action B of section 5.2.4.
 | |
| 
 | |
| Looking at the code, we exactly hit the sctp_sf_do_dupcook_b()
 | |
| case which triggers an SCTP_CMD_UPDATE_ASSOC command to the
 | |
| side effect interpreter, and in fact it properly copies over
 | |
| peer_{random, hmacs, chunks} parameters from the newly created
 | |
| association to update the existing one.
 | |
| 
 | |
| Also, the old asoc_shared_key is being released and based on
 | |
| the new params, sctp_auth_asoc_init_active_key() updated.
 | |
| However, the issue observed in this case is that the previous
 | |
| asoc->peer.auth_capable was 0, and has *not* been updated, so
 | |
| that instead of creating a new secret, we're doing an early
 | |
| return from the function sctp_auth_asoc_init_active_key()
 | |
| leaving asoc->asoc_shared_key as NULL. However, we now have to
 | |
| authenticate chunks from the updated chunk list (e.g. COOKIE-ACK).
 | |
| 
 | |
| That in fact causes the server side when responding with ...
 | |
| 
 | |
|   <------------------ AUTH; COOKIE-ACK -----------------
 | |
| 
 | |
| ... to trigger a NULL pointer dereference, since in
 | |
| sctp_packet_transmit(), it discovers that an AUTH chunk is
 | |
| being queued for xmit, and thus it calls sctp_auth_calculate_hmac().
 | |
| 
 | |
| Since the asoc->active_key_id is still inherited from the
 | |
| endpoint, and the same as encoded into the chunk, it uses
 | |
| asoc->asoc_shared_key, which is still NULL, as an asoc_key
 | |
| and dereferences it in ...
 | |
| 
 | |
|   crypto_hash_setkey(desc.tfm, &asoc_key->data[0], asoc_key->len)
 | |
| 
 | |
| ... causing an oops. All this happens because sctp_make_cookie_ack()
 | |
| called with the *new* association has the peer.auth_capable=1
 | |
| and therefore marks the chunk with auth=1 after checking
 | |
| sctp_auth_send_cid(), but it is *actually* sent later on over
 | |
| the then *updated* association's transport that didn't initialize
 | |
| its shared key due to peer.auth_capable=0. Since control chunks
 | |
| in that case are not sent by the temporary association which
 | |
| are scheduled for deletion, they are issued for xmit via
 | |
| SCTP_CMD_REPLY in the interpreter with the context of the
 | |
| *updated* association. peer.auth_capable was 0 in the updated
 | |
| association (which went from COOKIE_WAIT into ESTABLISHED state),
 | |
| since all previous processing that performed sctp_process_init()
 | |
| was being done on temporary associations, that we eventually
 | |
| throw away each time.
 | |
| 
 | |
| The correct fix is to update to the new peer.auth_capable
 | |
| value as well in the collision case via sctp_assoc_update(),
 | |
| so that in case the collision migrated from 0 -> 1,
 | |
| sctp_auth_asoc_init_active_key() can properly recalculate
 | |
| the secret. This therefore fixes the observed server panic.
 | |
| 
 | |
| Fixes: 730fc3d05cd4 ("[SCTP]: Implete SCTP-AUTH parameter processing")
 | |
| Reported-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
 | |
| Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
 | |
| Tested-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
 | |
| Cc: Vlad Yasevich <vyasevich@gmail.com>
 | |
| Acked-by: Vlad Yasevich <vyasevich@gmail.com>
 | |
| ---
 | |
|  v1 -> v2, more notes:
 | |
| 
 | |
|   I've only updated the commit description for now, this bug seems
 | |
|   clear to me that we would need to fix it; since RFC4895 mentions
 | |
|   it explicitly that on collisions, we need to *update* these params
 | |
|   accordingly as we would do so in RFC2960. So in other words, this
 | |
|   can be explained by having an *inconsistency* when doing the update
 | |
|   as auth_capable is *tightly coupled* with peer_random, peer_chunks,
 | |
|   peer_hmacs and eventually the asoc_shared_key creation.
 | |
| 
 | |
|   For the rest, I went through the code and currently could not
 | |
|   find where we could oops if we don't have the others for now. It
 | |
|   needs more time and testing however. It's also not too clear from
 | |
|   RFC2960/RFC4960 what needs to be carried over in addition: so we
 | |
|   know "The endpoint should stay in or enter the ESTABLISHED state
 | |
|   but it MUST update its peer's Verification Tag from the State
 | |
|   Cookie, stop any init or cookie timers that may running and send
 | |
|   a COOKIE ACK." and we know that we need to update all AUTH related
 | |
|   members, which we do *now*.
 | |
| 
 | |
|   In addition, we also need to fix AUTH + COOKIE_ECHO collisions,
 | |
|   as they currently cannot be resolved properly into a handshake.
 | |
| 
 | |
|  net/sctp/associola.c | 1 +
 | |
|  1 file changed, 1 insertion(+)
 | |
| 
 | |
| diff --git a/net/sctp/associola.c b/net/sctp/associola.c
 | |
| index 9de23a2..06a9ee6 100644
 | |
| --- a/net/sctp/associola.c
 | |
| +++ b/net/sctp/associola.c
 | |
| @@ -1097,6 +1097,7 @@ void sctp_assoc_update(struct sctp_association *asoc,
 | |
|  	asoc->c = new->c;
 | |
|  	asoc->peer.rwnd = new->peer.rwnd;
 | |
|  	asoc->peer.sack_needed = new->peer.sack_needed;
 | |
| +	asoc->peer.auth_capable = new->peer.auth_capable;
 | |
|  	asoc->peer.i = new->peer.i;
 | |
|  	sctp_tsnmap_init(&asoc->peer.tsn_map, SCTP_TSN_MAP_INITIAL,
 | |
|  			 asoc->peer.i.initial_tsn, GFP_ATOMIC);
 |