On Fri, Jun 24, 2005 at 04:19:28PM +0530, Suparna Bhattacharya wrote:
> On Mon, Jun 20, 2005 at 05:31:54PM +0530, Suparna Bhattacharya wrote:
> > (2) Buffered filesystem AIO read/write (me/Ben)

Filesystem AIO write

AIO support for O_SYNC buffered writes, built over O_SYNC-speedup.
It uses tagged radix tree lookups to write out just the pages
pertaining to this request, and retries instead of blocking for
writeback to complete on the same range.  All the writeout is issued
at the time of io submission, and there is a check to make sure that
retries skip straight to wait_on_page_writeback_range.

Signed-off-by: Suparna Bhattacharya
Signed-off-by: Benjamin LaHaise

diff -purN --exclude=description 76_aio-wait_page/fs/aio.c 77_fs_aio_read/fs/aio.c
--- 76_aio-wait_page/fs/aio.c	2005-08-08 17:15:58.000000000 -0400
+++ 77_fs_aio_read/fs/aio.c	2005-08-08 17:16:03.000000000 -0400
@@ -1487,7 +1487,16 @@ int aio_wake_function(wait_queue_t *wait
 
 	list_del_init(&wait->task_list);
 	kick_iocb(iocb);
-	return 1;
+	/*
+	 * Avoid exclusive wakeups with retries since an exclusive wakeup
+	 * may involve implicit expectations of waking up the next waiter
+	 * and there is no guarantee that the retry will take a path that
+	 * would do so. For example if a page has become up-to-date, then
+	 * a retried read may end up straightaway performing a copyout
+	 * and not go through a lock_page - unlock_page that would have
+	 * passed the baton to the next waiter.
+	 */
+	return 0;
 }
 
 int fastcall io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb,
diff -purN --exclude=description 76_aio-wait_page/mm/filemap.c 77_fs_aio_read/mm/filemap.c
--- 76_aio-wait_page/mm/filemap.c	2005-08-08 17:16:00.000000000 -0400
+++ 77_fs_aio_read/mm/filemap.c	2005-08-08 17:16:03.000000000 -0400
@@ -745,6 +745,11 @@ void do_generic_mapping_read(struct addr
 	if (!isize)
 		goto out;
 
+	if (in_aio()) {
+		/* Avoid repeat readahead */
+		if (is_retried_kiocb(io_wait_to_kiocb(current->io_wait)))
+			next_index = last_index;
+	}
 	end_index = (isize - 1) >> PAGE_CACHE_SHIFT;
 	for (;;) {
 		struct page *page;
@@ -814,7 +819,11 @@ page_ok:
 
 page_not_up_to_date:
 		/* Get exclusive access to the page ... */
-		lock_page(page);
+
+		if ((error = __lock_page(page, current->io_wait))) {
+			pr_debug("queued lock page \n");
+			goto readpage_error;
+		}
 
 		/* Did it get unhashed before we got the lock? */
 		if (!page->mapping) {
@@ -837,7 +846,8 @@ readpage:
 			goto readpage_error;
 
 		if (!PageUptodate(page)) {
-			lock_page(page);
+			if ((error = __lock_page(page, current->io_wait)))
+				goto readpage_error;
 			if (!PageUptodate(page)) {
 				if (page->mapping == NULL) {
 					/*
@@ -882,7 +892,11 @@ readpage:
 		goto page_ok;
 
 readpage_error:
-		/* UHHUH! A synchronous read error occurred. Report it */
+		/* We don't have uptodate data in the page yet */
+		/* Could be due to an error or because we need to
+		 * retry when we get an async i/o notification.
+		 * Report the reason.
+		 */
 		desc->error = error;
 		page_cache_release(page);
 		goto out;
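
As a reading aid for the description above, here is a rough sketch of
the write-side flow it describes: writeout issued once at submission,
retries skipping straight to the wait on the same range.  This is not
part of the patch; the helper name aio_osync_wait_range() is invented
for illustration, filemap_fdatawrite_range() and
wait_on_page_writeback_range() are assumed to behave as in this
kernel's mm/filemap.c, and is_retried_kiocb() comes from the AIO retry
infrastructure referenced in the diff.

/*
 * Sketch only, not from the patch: one way the retry-aware O_SYNC
 * wait described above could be structured.
 */
static int aio_osync_wait_range(struct kiocb *iocb,
				struct address_space *mapping,
				loff_t pos, size_t count)
{
	pgoff_t start = pos >> PAGE_CACHE_SHIFT;
	pgoff_t end = (pos + count - 1) >> PAGE_CACHE_SHIFT;
	int ret = 0;

	/*
	 * Writeout for just this range is issued once, at submission
	 * time (tagged radix tree lookups keep it to the pages of
	 * this request).
	 */
	if (!is_retried_kiocb(iocb))
		ret = filemap_fdatawrite_range(mapping, pos,
					       pos + count - 1);

	/*
	 * Retries skip straight to waiting on writeback for the same
	 * range; in AIO context the wait queues the iocb and returns
	 * -EIOCBRETRY instead of blocking.
	 */
	if (ret == 0)
		ret = wait_on_page_writeback_range(mapping, start, end);

	return ret;
}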