File spinlock_deadlock_dev_al_lock.patch of Package drbd.17219

commit 7ce7cac6a1901988caec429f8fb42874d44d7b44
Author: Lars Ellenberg <lars.ellenberg@linbit.com>
Date:   Mon May 13 17:13:47 2019 +0200

    drbd: fix potential spinlock deadlock on device->al_lock
    
    kernel:  [<ffffffffbab6b6e7>] _raw_spin_lock_irqsave+0x37/0x40
    kernel:  [<ffffffffc0a2a90f>] drbd_rs_complete_io+0x3f/0x160 [drbd]
    
    kernel:  [<ffffffffba67fc87>] bio_endio+0x67/0xb0
    
    kernel:  [<ffffffffba74fda7>] blk_mq_complete_request+0x27/0x30
    kernel:  [<ffffffffc05a2372>] nvme_process_cq+0xf2/0x1e0 [nvme]
    kernel:  [<ffffffffc05a2933>] nvme_irq+0x23/0x50 [nvme]
    
    kernel:  [<ffffffffba42e554>] handle_irq+0xe4/0x1a0
    kernel:  [<ffffffffbab7a59d>] do_IRQ+0x4d/0xf0
    kernel:  [<ffffffffbab6c362>] common_interrupt+0x162/0x162
    
    kernel:  [<ffffffffc0a208d9>] drbd_receiver+0x479/0x780 [drbd]
    
    The trace shows drbd_receiver in receive_Data() -> prepare_activity_log(),
    holding device->al_lock via plain spin_lock(), i.e. with IRQs still enabled.
    It is interrupted by an NVMe completion, which finds its way into
    drbd_rs_complete_io() and tries to take the same device->al_lock again.
    
    Introduced while fixing a distributed resource starvation deadlock with:
    2018-07-19 3d98754e drbd: protocol 114: fix distributed deadlock on secondary activity log
    (released with 9.0.15)
    
    Fix: use spin_lock_irq() in prepare_activity_log().

diff --git a/drbd/drbd_receiver.c b/drbd/drbd_receiver.c
index 3b4c6263..654aab61 100644
--- a/drbd/drbd_receiver.c
+++ b/drbd/drbd_receiver.c
@@ -2742,11 +2742,11 @@ prepare_activity_log(struct drbd_peer_request *peer_req)
 	 * See also drbd_request_prepare() for the "request" entry point. */
 	ecnt = atomic_add_return(nr_al_extents, &device->wait_for_actlog_ecnt);
 
-	spin_lock(&device->al_lock);
+	spin_lock_irq(&device->al_lock);
 	al = device->act_log;
 	nr = al->nr_elements;
 	used = al->used;
-	spin_unlock(&device->al_lock);
+	spin_unlock_irq(&device->al_lock);
 
 	/* note: due to the slight delay between being accounted in "used" after
 	 * being committed to the activity log with drbd_al_begin_io_commit(),