Created attachment 578932 [details]
kernel panic

Description of problem:
Our NFS server runs fine on the latest kernel, 2.6.18-308.4.1.el5 x86_64. As soon as we boot the clients to a 5.8 kernel, the NFS server crashes. When the NFS clients run a 5.7 kernel, all is well. We have seen this since the initial 5.8 release.

Version-Release number of selected component (if applicable):
$ uname -a
Linux ng-bak1.xxx 2.6.18-308.4.1.el5 #1 SMP Wed Mar 28 01:54:56 EDT 2012 x86_64 x86_64 x86_64 GNU/Linux

How reproducible:
Always; boot the client with:
$ uname -a
Linux ng-bak1.xxx 2.6.18-308.4.1.el5 #1 SMP Wed Mar 28 01:54:56 EDT 2012 x86_64 x86_64 x86_64 GNU/Linux

Steps to Reproduce:
1. exports file on server:
   /srv/backup/orabak 192.168.100.0/24(rw,async,all_squash,anonuid=501,anongid=501)
2. fstab on clients:
   192.168.100.10:/srv/backup/orabak /srv/orabak nfs intr 0 0
3. Boot the clients with any 5.8 kernel and watch the NFS server crash.

Actual results:
Kernel panic, screenshot attached.

Expected results:
No kernel panic.

Additional info:
I attached a DRAC screenshot of the server; all hardware is Dell PE2950.
Tried reproducing it here, but to no avail:

1) On the server, add this to /etc/exports:
   /orabak 192.168.100.0/24(rw,async,all_squash,anonuid=501,anongid=501)

2) Mount the export on one RHEL 5.8 client:
   mount -t nfs -o intr,vers=3 nfsserver:/orabak /mnt

Nothing else to report, works fine. Anything I may have missed?

Vincent
@vincent: No, that's exactly how I can reproduce it. It's Dell PE2950 server hardware. Sadly, the Dell firmware repository is missing the latest Broadcom firmware for the NICs. I will update the firmware by hand tonight and see if this makes a difference.
Tried mounting by IP and via fstab as well; no panic. The server is a Dell Precision workstation, the client is a VM (VMware Workstation 8).
@Rainer: if I can reproduce the issue here, then I'll open a support ticket. Your issue looks bad enough that I should make sure it's being looked into.
The NFS server has now been up for 9 hours since I updated to Dell's latest firmware, 6.4.5, for the Broadcom cards. Until then it was running 6.2.6, which is the latest in the firmware repository. So I guess the problem is solved, because previously we saw the crashes instantly.

For the record, this is the problematic NIC:

07:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5708 Gigabit Ethernet (rev 12)
        Subsystem: Dell Device 01b2
        Flags: bus master, 66MHz, medium devsel, latency 64, IRQ 114
        Memory at d6000000 (64-bit, non-prefetchable) [size=32M]
        Capabilities: [40] PCI-X non-bridge device
        Capabilities: [48] Power Management version 2
        Capabilities: [50] Vital Product Data
        Capabilities: [58] MSI: Enable+ Count=1/1 Maskable- 64bit+
        Kernel driver in use: bnx2
        Kernel modules: bnx2
Interesting, on my T5400 workstation acting as a server, I also had a Broadcom card as eth0 (but a different type):

[root@palanthas ~]# lspci | grep -i broad
09:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5754 Gigabit Ethernet PCI Express (rev 02)
[root@palanthas ~]# ethtool -i eth0
driver: tg3
version: 3.119
firmware-version: 5754-v3.24
bus-info: 0000:09:00.0

It would be interesting to investigate this issue a little further. Would you, by chance, have a vmcore of the previous crashes? The fact that a firmware issue can cause a kernel crash doesn't feel good. At most, it should (IMHO) cause timeouts and cut the server off from the network, but it shouldn't panic.

Regards,
As I mentioned on the rhel5 mailing list, I also found the workaround of using UDP instead of TCP to mount the NFS share. Using UDP showed exactly the behaviour you describe (on the client):

Apr 24 23:10:15 ng-db1 kernel: nfs: server 192.168.100.10 not responding, still trying
Apr 24 23:11:06 ng-db1 last message repeated 2 times
Apr 24 23:11:06 ng-db1 kernel: nfs: server 192.168.100.10 OK
Apr 24 23:11:06 ng-db1 last message repeated 2 times
Apr 25 00:21:03 ng-db1 kernel: nfs: server 192.168.100.10 not responding, still trying
Apr 25 00:21:03 ng-db1 kernel: nfs: server 192.168.100.10 OK

And in dmesg there were many "link up" and "link down" messages on the server. I asked our Oracle admin if rman was working without problems (despite these messages), which he affirmed. These messages made me look for the new firmware, which obviously cured the kernel panics. But I guess the real problem is that the kernel panics when TCP is used to mount NFS and the link goes away on the server. Sadly this is our production environment and I do not have a testing lab.
Well, it died down in the bowels of the TCP layer code (tcp_sendpage). It seems possible that there's a crashable bug in there, and adding the correct firmware somehow papered over it. I suppose it's also possible that there's a driver or firmware bug that just happened to cause a crash in this layer (memory corruption, maybe?).

Either way, it's doubtful we'd be able to make much progress without a vmcore or at least the entire oops message. Without those, we'll probably have to close this with a resolution of INSUFFICIENT_DATA. Any chance you have either of those things, or have a way to get them?
I will try to get this information over the weekend.
Either way, it would also help expedite things to open an RH support case.
I would open a support ticket under our contract (large bank), but alas I don't have Dell hardware, only HP ProLiants. If Rainer can get a vmcore and a sosreport of the crashed server, I could open a ticket for him.

Thanks,
Vincent
I have done so, but we only have basic support. Case # is 00633163
Good enough -- thanks.
Unable to access https://access.redhat.com/knowledge/solutions/109263, even with my support login. Is there really a solution?

Vincent
Ok, I have a vmcore. Maybe important: I only got it with this fstab line:

192.168.100.10:/srv/backup/orabak /srv/orabak nfs intr 0 0

This one was working (no panic):

192.168.100.10:/srv/backup/orabak /srv/orabak nfs rsize=32768,wsize=32768,timeo=14,hard,intr 0 0

I'm on vacation from Thursday on and will be back on May 21.
Created attachment 581493 [details]
kernel panic with old broadcom firmware
or just the link: http://awaro.com/de/download/vmcore
Ok, some notes...

Oops occurred in the compound_head() call here:

	if (unlikely(PageTail(page)))

...so, some disassembly around the crash area:

0xffffffff8025dbd2 <tcp_sendpage+0x388>:	jmp    0xffffffff8025dc29 <tcp_sendpage+0x3df>
0xffffffff8025dbd4 <tcp_sendpage+0x38a>:	mov    0x28(%rsp),%rdx
0xffffffff8025dbd9 <tcp_sendpage+0x38f>:	mov    (%rdx),%rax      <<< CRASH HERE
0xffffffff8025dbdc <tcp_sendpage+0x392>:	and    $0x24000,%eax
0xffffffff8025dbe1 <tcp_sendpage+0x397>:	cmp    $0x24000,%rax
0xffffffff8025dbe7 <tcp_sendpage+0x39d>:	jne    0xffffffff8025dbed <tcp_sendpage+0x3a3>
0xffffffff8025dbe9 <tcp_sendpage+0x39f>:	mov    0x10(%rdx),%rdx

So this appears to be in the midst of the PageTail macro, but it does some "funny stuff" with the stack here: it's expecting to pull the page pointer off the stack. tcp_sendpage is loaded with heavily inlined functions, but in any case we should be ending up with %rdx holding the address of the page. Unfortunately, that address is clearly bogus:

0x007808000001022c

...so apparently the page array that got passed into do_tcp_sendpages was corrupt.
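For context, here is a minimal sketch of what that inlined check looks like in kernels of this vintage (reconstructed from memory, so treat the exact definitions as assumptions; the flag bits do match the disassembly: PG_compound is bit 14 and PG_reclaim is bit 17, giving exactly the 0x24000 mask, and tail pages set both):

/*
 * Sketch of the 2.6.18-era PageTail/compound_head pair (assumed
 * definitions consistent with the disassembly above).
 */
struct page {
	unsigned long flags;		/* offset 0: the faulting load */
	/* ... */
	struct page *first_page;	/* on tail pages: the head page */
};

#define PG_compound		14
#define PG_reclaim		17
#define PG_head_tail_mask	((1UL << PG_compound) | (1UL << PG_reclaim))

static inline int PageTail(struct page *page)
{
	/* The very first access dereferences the page pointer to read
	 * page->flags -- the "mov (%rdx),%rax" that oopsed, since %rdx
	 * held the bogus address 0x007808000001022c. */
	return (page->flags & PG_head_tail_mask) == PG_head_tail_mask;
}

static inline struct page *compound_head(struct page *page)
{
	if (unlikely(PageTail(page)))
		return page->first_page;	/* the "mov 0x10(%rdx),%rdx" */
	return page;
}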
The kernel_sendpage request was this one in svc_sendto:

0x7e2b is in svc_sendto (net/sunrpc/svcsock.c:423).

	/* send head */
	if (slen == xdr->head[0].iov_len)
		flags = 0;
	len = kernel_sendpage(sock, rqstp->rq_respages[0], 0,    <<<<
			      xdr->head[0].iov_len, flags);
	if (len != xdr->head[0].iov_len)
		goto out;

Poking around on the stack tells me that the rqstp is at 0xffff810221dda000. That gives me:

crash> struct svc_rqst.rq_respages ffff810221dda000
  rq_respages = 0xffff810101b22840

...which shows:

crash> struct page.flags 0xffff810101b22840
  flags = 0x7808000001022c

...so either we have a bug where we passed in a pointer where it should have been a double pointer, or I've misinterpreted the assembly above...
Ooops, nm -- that is wrong... rq_respages is a **page, so that should be a pointer to an array of page pointers, and the first page pointer in that array is bad. Now to see if we can determine how it got there in the first place...
Well...no... rq_respages should contain a **page, but it seems to hold a *page instead:

crash> kmem 0xffff810101b22840
      PAGE       PHYSICAL      MAPPING       INDEX CNT FLAGS
ffff810101b22840  7c0b8000 ffff8101ff009220    100  3 7808000001022c

I'll go back through the code and see if we have a single/double pointer confusion somewhere...
Dump of the svc_rqst in memory

Here's the last bit of the rq_pages array:

ffff810221ddaa40:  ffff810106e39310 ffff810106d51270   ........p.......
ffff810221ddaa50:  ffff810106ea9c78 ffff810106d69798   x...............
ffff810221ddaa60:  ffff810106e50760 ffff810106e543c0   `........C......
ffff810221ddaa70:  ffff810104c06cd8 ffff810104c07250   .l......Pr......
ffff810221ddaa80:  ffff810104c06e98 ffff810104c07cd0   .n.......|......
ffff810221ddaa90:  ffff810104c26b30 ffff810104c26a88   0k.......j......

...and here's the rq_respages pointer, followed by the kvec array:

ffff810221ddaaa0:  ffff810101b22840 ffff81009553c000   @(........S.....
ffff810221ddaab0:  0000000000001000 ffff8101f246c000   ..........F.....

Interestingly, the "index" field in each page is increasing as it goes:

crash> kmem ffff810104c26b30
      PAGE       PHYSICAL      MAPPING       INDEX CNT FLAGS
ffff810104c26b30 15c1ea000 ffff8101ff009220     fe  3 15810000001020c

crash> kmem ffff810104c26a88
      PAGE       PHYSICAL      MAPPING       INDEX CNT FLAGS
ffff810104c26a88 15c1e7000 ffff8101ff009220     ff  3 15810000001020c

crash> kmem ffff810101b22840
      PAGE       PHYSICAL      MAPPING       INDEX CNT FLAGS
ffff810101b22840  7c0b8000 ffff8101ff009220    100  3 7808000001022c

...and it certainly seems like the page pointer in the rq_respages field fits that pattern. I think this might be an overrun of the rq_pages array.
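To make the suspected overrun concrete, here's a simplified sketch of the field layout the dump implies (surrounding fields omitted and the exact declarations are an assumption, but the dump above does show rq_respages sitting directly after rq_pages, with the kvec array following):

/*
 * Simplified svc_rqst layout consistent with the memory dump above.
 * With RPCSVC_MAXPAGES slots but RPCSVC_MAXPAGES + 1 pages allocated,
 * the final page pointer lands on top of rq_respages -- leaving it
 * holding a struct page * where a struct page ** belongs, which is
 * exactly the corruption seen in the dump.
 */
#define RPCSVC_MAXPAGES	258	/* 256 data pages + 2 extra */

struct svc_rqst {
	/* ... */
	struct page	*rq_pages[RPCSVC_MAXPAGES];
	struct page	**rq_respages;	/* clobbered by the overrun */
	struct kvec	rq_vec[RPCSVC_MAXPAGES];
	/* ... */
};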
Ok, playing with a little debug patch here...

diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index a0edff9..f477ba7 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -1247,6 +1247,7 @@ svc_recv(struct svc_serv *serv, struct svc_rqst *rqstp, long timeout)
 	/* now allocate needed pages. If we get a failure, sleep briefly */
 	pages = 2 + (serv->sv_bufsz + PAGE_SIZE -1) / PAGE_SIZE;
+	dprintk("%s: allocating %d pages\n", __func__, pages);
 	for (i=0; i < pages ; i++)
 		while (rqstp->rq_pages[i] == NULL) {
 			struct page *p = alloc_page(GFP_KERNEL);

...when I bump nfsd_max_blksize to 1M (which is what it would be when you have >4GB of RAM), I see this on starting knfsd:

svc_recv: allocating 259 pages

...which is 1 more than it should be. So we probably have a situation where rq_respages doesn't get overwritten in certain cases, and in those cases, if we try to send the reply...kaboom. In cases where that pointer does get overwritten, we are still leaking a page, so this is a pretty nasty bug regardless...
The upstream and rhel6 code seems to be ok, since it has this sanity check:

	BUG_ON(pages >= RPCSVC_MAXPAGES);

It seems like that should be checked *before* we allocate all of the pages, but it should catch this regardless. So this seems to be a bug only in the backport of the code that allows larger r/wsizes.
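For illustration, the reordering suggested above would look something like this (a hypothetical sketch, not the actual upstream code):

	/*
	 * Sketch: validate the computed page count before the allocation
	 * loop, so a miscalculated count can't scribble past the end of
	 * rq_pages before the sanity check fires.
	 */
	pages = 2 + (serv->sv_bufsz + PAGE_SIZE - 1) / PAGE_SIZE;
	BUG_ON(pages > RPCSVC_MAXPAGES);	/* check first... */
	for (i = 0; i < pages; i++)
		while (rqstp->rq_pages[i] == NULL) {
			struct page *p = alloc_page(GFP_KERNEL);
			if (p == NULL)
				schedule_timeout_uninterruptible(msecs_to_jiffies(500));
			rqstp->rq_pages[i] = p;	/* retries while NULL */
		}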
The buffer size calculations are pretty convoluted, but here we go:

The second arg here is what becomes sv_bufsz:

	nfsd_serv = svc_create(&nfsd_program, NFSD_BUFSIZE - NFSSVC_MAXBLKSIZE +
				nfsd_max_blksize);

nfsd_max_blksize == NFSSVC_MAXBLKSIZE (when mem > 4G)
NFSSVC_MAXBLKSIZE == RPCSVC_MAXPAYLOAD == (1*1024*1024u)

...but in any case, those two should cancel each other out, so sv_bufsz should be NFSD_BUFSIZE, which is:

#define NFSD_BUFSIZE	((RPC_MAX_HEADER_WITH_AUTH+26)*XDR_UNIT + NFSSVC_MAXBLKSIZE)

...upstream has the same definition, but there it's only used to set pc_xdrressize for the NFSv4 compound RPC.
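To put numbers on it, here's a standalone back-of-the-envelope check of the arithmetic (the exact RPC_MAX_HEADER_WITH_AUTH value is an assumption; any nonzero header allowance produces the same overrun):

/* Standalone userspace check of the page-count overrun. */
#include <stdio.h>

#define PAGE_SIZE		4096
#define XDR_UNIT		4
#define RPC_MAX_HEADER_WITH_AUTH 103	/* assumed value, for illustration */
#define NFSSVC_MAXBLKSIZE	(1*1024*1024u)
#define NFSD_BUFSIZE	((RPC_MAX_HEADER_WITH_AUTH+26)*XDR_UNIT + NFSSVC_MAXBLKSIZE)
#define RPCSVC_MAXPAGES	(NFSSVC_MAXBLKSIZE/PAGE_SIZE + 2)	/* 258 slots */

int main(void)
{
	/* with mem > 4G, nfsd_max_blksize == NFSSVC_MAXBLKSIZE, so the
	 * svc_create() argument reduces to plain NFSD_BUFSIZE */
	unsigned sv_bufsz = NFSD_BUFSIZE;

	/* the svc_recv() calculation from the debug patch above */
	unsigned pages = 2 + (sv_bufsz + PAGE_SIZE - 1) / PAGE_SIZE;

	/* the header bytes push the rounded-up count to 257, +2 = 259 */
	printf("allocates %u pages into a %u-slot array\n",
	       pages, (unsigned)RPCSVC_MAXPAGES);	/* 259 vs 258 */
	return 0;
}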
Ok, reassigning to Bruce since he understands this code better than I do... Thanks, Bruce! Let me know if you need me to help out with testing or anything!
Very interesting read, thank you. I'll keep watching this thread.

Vincent
Looking carefully at rhel5 and upstream (assuming TCP, 4096-byte pages, NFSv3/v4, and >4 GB of RAM throughout):

In both cases, the maximum read/write size is 1024*1024 bytes = 1M. The number of pages required to hold that much data is 1M/PAGE_SIZE = 256. The rq_pages array has 256 + 2 elements (allowing an extra page to hold the reply (in the case of a write) or request (in the case of a read), and another extra page for headers, padding, etc.).

In rhel5, we call svc_create with a buffer size of 1MB + some extra (the NFSD_BUFSIZE Jeff mentions in comment 27). After rounding up, that adds an extra page. svc_recv then adds two more pages--so we're adding 3 extra pages, whereas rq_pages only got two extra elements.

Upstream doesn't have this problem: it removes the extra from the svc_create argument, and instead has svc_create add an extra page to account for headers, and then svc_recv add an extra page to account for the request or reply, for a total of 2 extra pages, as it should be.

So, where'd I screw up the backport? It appears that upstream actually had the same bug at some point, but c6b0a9f87b82 "knfsd: tidy up meaning of 'buffer size' in nfsd/sunrpc", while appearing to be pure cleanup, actually fixed the bug. What I'm unclear about is where exactly the bug was introduced upstream--possibly with 44524359484 "knfsd: Replace two page lists in struct svc_rqst with one".

In any case, c6b0a9f87b82 is probably more than we want to backport. (In theory I think it may change kabi, though I doubt it's kabi anyone uses.) So in rhel5 we'll probably want to do something simpler: maybe throw out the extra NFSD_BUFSIZE in the svc_create() argument? I'll take a look Monday.
(In reply to comment #30)
> In any case, c6b0a9f87b82 is probably more than we want to backport. (In
> theory I think it may change kabi, though I doubt it's kabi anyone uses.) So
> in rhel5 we'll probably want to do something simpler: maybe throw out the
> extra NFSD_BUFSIZE in the svc_create() argument?

The problem with doing that is that sv_bufsz is also used to estimate the sizes of socket buffers, for example, and for that purpose we want the size of the full rpc request, not just the read or write payload. So I think backporting that whole patch, with some kabi fixups, is probably the right thing to do.
Created attachment 582792 [details]
fix oops due to overrunning server's page array

Here's a backport of c6b0a9f87b82f25fa35206ec04b5160372eabab4.
I'm having the exact same issue on a Dell R610. I've opened case #639436. I will attach a vmcore to that ticket now.
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux release for currently deployed products. This request is not yet committed for inclusion in a release.
If I limit memory on the server to 3 gigs, I start receiving

RPC: bad TCP reclen 0x001000a4 (large)

on the console, but the system doesn't appear to kernel panic.
(In reply to comment #38)
> If I limit memory on the server to 3 gigs, I start receiving
>
> RPC: bad TCP reclen 0x001000a4 (large)
>
> on the console, but the system doesn't appear to kernel panic.

This is using which kernel exactly?

If you're able to reproduce this crash reliably, and if it were possible to apply the attached patch and retest, the results would be useful.
2.6.18-308.4.1.el5

I can do the testing, but I need to get my Oracle backups completed first, so it may be 24-48 hours before I can test. I can reproduce this fairly reliably, however. I'm poking my DBAs now to get things caught up so I can break the server again.
(In reply to comment #38)
> If I limit memory on the server to 3 gigs, I start receiving
>
> RPC: bad TCP reclen 0x001000a4 (large)
>
> on the console, but the system doesn't appear to kernel panic.

Thinking about that some more, I think that is all expected behavior:

The server by default sets a maximum IO size based on memory. After lowering the amount of memory, the server lowered that maximum IO size to something less than 1 megabyte, which prevents us from overrunning this array, preventing the panic.

However, I'm guessing you had at least one client with existing mounts when you restarted the server. That client was still using the old maximum IO size, and hence was sending write requests larger than the newly rebooted server expected.

If the client unmounts and remounts, you probably won't see those messages any more.
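For reference, the message comes from the record-length sanity check in the server's TCP receive path, which looks roughly like this in kernels of this era (a sketch from memory; names are approximate):

	/*
	 * Sketch of the check in net/sunrpc/svcsock.c that emits the
	 * message. A client that negotiated a 1MB wsize against the old
	 * server sends a record of 0x001000a4 bytes (1MB of data plus
	 * RPC overhead), which now exceeds the rebooted server's smaller
	 * sv_bufsz, so the connection is dropped instead of the request
	 * being processed (and overrunning the page array).
	 */
	if (svsk->sk_reclen > serv->sv_bufsz) {
		printk(KERN_NOTICE "RPC: bad TCP reclen 0x%08lx (large)\n",
		       (unsigned long)svsk->sk_reclen);
		goto err_delete;
	}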
Created attachment 583282 [details]
fix oops due to overrunning server's page array

Apologies--on review I found an error in the backported patch. If you're able to test, please test this version rather than the previous one. (The previous patch will probably fix the bug as well, though there's a small chance it could cause some other problems.)
(In reply to comment #41)
> (In reply to comment #38)
> > If I limit memory on the server to 3 gigs, I start receiving
> >
> > RPC: bad TCP reclen 0x001000a4 (large)
> >
> > on the console, but the system doesn't appear to kernel panic.
>
> Thinking about that some more, I think that is all expected behavior:
>
> The server by default sets a maximum IO size based on memory.
>
> After lowering the amount of memory, the server lowered that maximum IO size
> to something less than 1 megabyte, which prevents us from overrunning this
> array, preventing the panic.
>
> However, I'm guessing you had at least one client with existing mounts when
> you restarted the server. That client was still using the old maximum IO
> size, hence was sending write requests larger than the newly rebooted server
> expected.
>
> If the client unmounts and remounts, you probably won't see those messages
> any more.

Confirmed. When the server crashed originally, we went to CIFS mounts as temporary replacements. I unmounted the stale NFS filesystems with umount -l. It looks like one server was pinging continuously, trying to finish whatever it was doing.

tcpdump (10.150.50.104 is the client, SHSNS2 is the server):

10:26:58.709743 IP 10.150.50.104.0 > SHSNS2.nfs: 1448 null
10:26:58.709746 IP SHSNS2.nfs > 10.150.50.104.987: . ack 1233525619 win 159 <nop,nop,timestamp 8786454 3405911169>
10:26:58.709748 IP 10.150.50.104.0 > SHSNS2.nfs: 1448 null
10:26:58.709751 IP SHSNS2.nfs > 10.150.50.104.987: . ack 1233527067 win 181 <nop,nop,timestamp 8786454 3405911169>

A hard reboot of the client cleared the RPC messages, and forcing the max memory to 3 gigabytes is at least allowing us to serve via NFS until this is resolved.
Note: as a workaround, you can decrease the maximum IO size to something less than 1MB by writing to /proc/fs/nfsd/max_block_size, for example:

echo 524288 > /proc/fs/nfsd/max_block_size

Note this has to be done after mounting /proc/fs/nfsd, but before starting nfsd.
Created attachment 584796 [details]
fix oops due to overrunning server's page array

Further testing on ia64 saw another overrun of the array. I believe this may be caused by an off-by-one error in the read path, fixed upstream by 250f3915183d377d36e012bac9caa7345ce465b8 "[PATCH] knfsd: fix an NFSD bug with full sized, non-page-aligned reads", so I've added a backport of that patch to the attached. Not yet tested.
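A quick illustration of why a full-sized, non-page-aligned read touches one extra page (standalone arithmetic; the offset value is arbitrary):

/* Pages spanned by a full-sized read at a non-page-aligned offset. */
#include <stdio.h>

#define PAGE_SIZE 4096

int main(void)
{
	unsigned long count = 1024 * 1024;	/* full-sized 1MB read */
	unsigned long offset = 512;		/* any non-page-aligned start */

	/* pages covering the byte range [offset, offset + count) */
	unsigned long first = offset / PAGE_SIZE;
	unsigned long last = (offset + count - 1) / PAGE_SIZE;

	/* 257, not the 256 an aligned read would need */
	printf("pages spanned: %lu\n", last - first + 1);
	return 0;
}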
Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
The kernel version 2.6.18-308.4.1.el5 contained several bugs which led to an overrun of the NFS server page array. Consequently, any attempt to connect an NFS client running on Red Hat Enterprise Linux 5.8 to an NFS server running this kernel caused the NFS server to terminate unexpectedly and the kernel to panic. This update corrects the bugs causing the NFS page array overruns, and the kernel no longer crashes in this scenario.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-0006.html