From e47373ec1c9aab9ee134f4e2b8249957e9f4c7ef Mon Sep 17 00:00:00 2001 From: Alan Stern Date: Wed, 30 Mar 2005 15:05:45 -0500 Subject: [SCSI] return success after retries in scsi_eh_tur The problem lies in the way the error handler uses TEST UNIT READY to tell whether error recovery has succeeded. The scsi_eh_tur function gives up after one round of retrying; after that it decides that more error recovery is needed. However TUR is liable to report sense data indicating a retry is needed when in fact error recovery has succeeded. A typical example might be SK=2, ASC=4, ASCQ=1 (Logical unit in process of becoming ready). The mere fact that we were able to get a sensible reply to the TUR should indicate that the device is working well enough to stop error recovery. I ran across a case back in January where this happened. A CD-ROM drive timed out the INQUIRY command, and a device reset fixed the blockage. But then the drive kept responding with 2/4/1 -- because it was spinning up I suppose -- until the error handler gave up and placed it offline. If the initial INQUIRY had received the 2/4/1 instead, everything would have worked okay. It doesn't seem reasonable for things to fail just because the error handler had started running. Signed-off-by: Alan Stern Signed-off-by: James Bottomley --- drivers/scsi/scsi_error.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) (limited to 'drivers/scsi/scsi_error.c') diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c index e9c451b..688bce7 100644 --- a/drivers/scsi/scsi_error.c +++ b/drivers/scsi/scsi_error.c @@ -776,9 +776,11 @@ retry_tur: __FUNCTION__, scmd, rtn)); if (rtn == SUCCESS) return 0; - else if (rtn == NEEDS_RETRY) + else if (rtn == NEEDS_RETRY) { if (retry_cnt--) goto retry_tur; + return 0; + } return 1; } -- cgit v1.1 From c5478def7a3a2dba9ceda452c2aa3539514d30a9 Mon Sep 17 00:00:00 2001 From: Christoph Hellwig Date: Tue, 6 Sep 2005 14:04:26 +0200 Subject: [SCSI] switch EH thread startup to the kthread API Signed-off-by: James Bottomley --- drivers/scsi/scsi_error.c | 29 ++--------------------------- 1 file changed, 2 insertions(+), 27 deletions(-) (limited to 'drivers/scsi/scsi_error.c') diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c index 688bce7..ebe74cc 100644 --- a/drivers/scsi/scsi_error.c +++ b/drivers/scsi/scsi_error.c @@ -20,6 +20,7 @@ #include #include #include +#include #include #include #include @@ -1585,16 +1586,8 @@ int scsi_error_handler(void *data) int rtn; DECLARE_MUTEX_LOCKED(sem); - /* - * Flush resources - */ - - daemonize("scsi_eh_%d", shost->host_no); - current->flags |= PF_NOFREEZE; - shost->eh_wait = &sem; - shost->ehandler = current; /* * Wake up the thread that created us. @@ -1602,8 +1595,6 @@ int scsi_error_handler(void *data) SCSI_LOG_ERROR_RECOVERY(3, printk("Wake up parent of" " scsi_eh_%d\n",shost->host_no)); - complete(shost->eh_notify); - while (1) { /* * If we get a signal, it means we are supposed to go @@ -1624,7 +1615,7 @@ int scsi_error_handler(void *data) * semaphores isn't unreasonable. */ down_interruptible(&sem); - if (shost->eh_kill) + if (kthread_should_stop()) break; SCSI_LOG_ERROR_RECOVERY(1, printk("Error handler" @@ -1663,22 +1654,6 @@ int scsi_error_handler(void *data) * Make sure that nobody tries to wake us up again. */ shost->eh_wait = NULL; - - /* - * Knock this down too. From this point on, the host is flying - * without a pilot. If this is because the module is being unloaded, - * that's fine. If the user sent a signal to this thing, we are - * potentially in real danger. - */ - shost->eh_active = 0; - shost->ehandler = NULL; - - /* - * If anyone is waiting for us to exit (i.e. someone trying to unload - * a driver), then wake up that process to let them know we are on - * the way out the door. - */ - complete_and_exit(shost->eh_notify, 0); return 0; } -- cgit v1.1 From fe1b2d544d71300f8e2d151c3c77a130d13a58be Mon Sep 17 00:00:00 2001 From: Christoph Hellwig Date: Tue, 6 Sep 2005 14:15:37 +0200 Subject: [SCSI] unexport scsi_add_timer/scsi_delete_timer Signed-off-by: James Bottomley --- drivers/scsi/scsi_error.c | 2 -- 1 file changed, 2 deletions(-) (limited to 'drivers/scsi/scsi_error.c') diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c index ebe74cc..ae28bcb 100644 --- a/drivers/scsi/scsi_error.c +++ b/drivers/scsi/scsi_error.c @@ -116,7 +116,6 @@ void scsi_add_timer(struct scsi_cmnd *scmd, int timeout, add_timer(&scmd->eh_timeout); } -EXPORT_SYMBOL(scsi_add_timer); /** * scsi_delete_timer - Delete/cancel timer for a given function. @@ -144,7 +143,6 @@ int scsi_delete_timer(struct scsi_cmnd *scmd) return rtn; } -EXPORT_SYMBOL(scsi_delete_timer); /** * scsi_times_out - Timeout function for normal scsi commands. -- cgit v1.1