So far, MLDonkey has always been the unfortunate process to trigger this. It happens with both 2.6.7 and 2.6.8.1, which makes a hardware issue seem likely. A partial memtest86 run has revealed nothing. Next I am going to try reseating the CPUs and RAM, since the machine appears to have been dropped during shipping to me. (The HSFs had fallen off the CPUs.) After that, I’ll try each CPU by itself with a UP kernel. Finally, I’ll remove the board and try running it on an antistatic bag. It would be quite helpful if I could find something less intensive to trigger the Oops, though, since that last test might not be possible: connecting all the cards to their hard disks will be quite a stretch with the mainboard out of the case.
I can still downgrade to my old L7S7A2, but I’d seriously rather not, given the performance increase from the Supermicro Super 370DLE.
It’s always CPU 0. Maybe that means something:
nebula:~# egrep 'CPU: [0-9]' /var/log/syslog.0
Sep 9 09:31:38 nebula kernel: CPU: 0
Sep 9 17:26:35 nebula kernel: CPU: 0
Sep 9 19:16:42 nebula kernel: CPU: 0
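If more Oopses pile up, the same check can be tallied per CPU instead of eyeballed. This is only a rough sketch, assuming the syslog line format shown in the egrep output; the function name `count_oops_cpus` and the sample list are made up for illustration.

```python
import re
from collections import Counter

# Matches the "kernel: CPU: N" line that the Oops report writes to syslog.
# Assumption: the line format is the one shown in the egrep output.
CPU_LINE = re.compile(r"kernel: CPU:\s+(\d+)")

def count_oops_cpus(lines):
    """Return a Counter mapping CPU number to the number of Oops reports."""
    counts = Counter()
    for line in lines:
        match = CPU_LINE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts

# Sample lines copied from the egrep output.
sample = [
    "Sep 9 09:31:38 nebula kernel: CPU: 0",
    "Sep 9 17:26:35 nebula kernel: CPU: 0",
    "Sep 9 19:16:42 nebula kernel: CPU: 0",
]

for cpu, hits in sorted(count_oops_cpus(sample).items()):
    print(f"CPU {cpu}: {hits} Oops report(s)")
```

To run it against the real log, pass an open file object (for example `open("/var/log/syslog.0")`) in place of `sample`.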
The full trace, in all its glory:
nebula kernel: mlnet: page allocation failure. order:5, mode:0xd0
nebula kernel: [__alloc_pages+761/880] __alloc_pages+0x2f9/0x370
nebula kernel: [__get_free_pages+31/64] __get_free_pages+0x1f/0x40
nebula kernel: [kmem_getpages+32/224] kmem_getpages+0x20/0xe0
nebula kernel: [cache_grow+160/528] cache_grow+0xa0/0x210
nebula kernel: [cache_alloc_refill+381/560] cache_alloc_refill+0x17d/0x230
nebula kernel: [__kmalloc+136/160] __kmalloc+0x88/0xa0
nebula kernel: [__crc_generic_read_dir+1429139/7856069] xfs_iread_extents+0x70/0x180 [xfs]
nebula kernel: [__crc_generic_read_dir+1260038/7856069] xfs_bmapi+0x213/0x1480 [xfs]
nebula kernel: [autoremove_wake_function+0/96] autoremove_wake_function+0x0/0x60
nebula kernel: [mempool_alloc+115/320] mempool_alloc+0x73/0x140
nebula kernel: [as_update_arq+46/128] as_update_arq+0x2e/0x80
nebula kernel: [as_add_request+434/544] as_add_request+0x1b2/0x220
nebula kernel: [__crc_generic_read_dir+1446652/7856069] xfs_imap_to_bmap+0x39/0x240 [xfs]
nebula kernel: [__crc_generic_read_dir+1447581/7856069] xfs_iomap+0x19a/0x540 [xfs]
nebula kernel: [as_update_arq+46/128] as_update_arq+0x2e/0x80
nebula kernel: [__crc_generic_read_dir+1588792/7856069] linvfs_get_block_core+0xa5/0x2c0 [xfs]
nebula kernel: [__make_request+712/1408] __make_request+0x2c8/0x580
nebula kernel: [__crc_generic_read_dir+1589391/7856069] linvfs_get_block+0x3c/0x40 [xfs]
nebula kernel: [do_mpage_readpage+943/960] do_mpage_readpage+0x3af/0x3c0
nebula kernel: [radix_tree_node_alloc+31/112] radix_tree_node_alloc+0x1f/0x70
nebula kernel: [radix_tree_insert+237/272] radix_tree_insert+0xed/0x110
nebula kernel: [add_to_page_cache+96/192] add_to_page_cache+0x60/0xc0
nebula kernel: [mpage_readpages+315/368] mpage_readpages+0x13b/0x170
nebula kernel: [__crc_generic_read_dir+1589331/7856069] linvfs_get_block+0x0/0x40 [xfs]
nebula kernel: [read_pages+312/336] read_pages+0x138/0x150
nebula kernel: [__crc_generic_read_dir+1589331/7856069] linvfs_get_block+0x0/0x40 [xfs]
nebula kernel: [__alloc_pages+784/880] __alloc_pages+0x310/0x370
nebula kernel: [do_page_cache_readahead+275/400] do_page_cache_readahead+0x113/0x190
nebula kernel: [page_cache_readahead+256/512] page_cache_readahead+0x100/0x200
nebula kernel: [do_generic_mapping_read+266/1120] do_generic_mapping_read+0x10a/0x460
nebula kernel: [__generic_file_aio_read+523/576] __generic_file_aio_read+0x20b/0x240
nebula kernel: [file_read_actor+0/240] file_read_actor+0x0/0xf0
nebula kernel: [__crc_generic_read_dir+1616642/7856069] xfs_read+0x18f/0x2c0 [xfs]
nebula kernel: [__crc_generic_read_dir+1600798/7856069] linvfs_read+0x8b/0xa0 [xfs]
nebula kernel: [do_sync_read+137/192] do_sync_read+0x89/0xc0
nebula kernel: [link_path_walk+1550/2304] link_path_walk+0x60e/0x900
nebula kernel: [permission+47/80] permission+0x2f/0x50
nebula kernel: [__crc_generic_read_dir+1602973/7856069] linvfs_open+0x8a/0x90 [xfs]
nebula kernel: [dentry_open+206/400] dentry_open+0xce/0x190
nebula kernel: [filp_open+104/112] filp_open+0x68/0x70
nebula kernel: [vfs_read+184/304] vfs_read+0xb8/0x130
nebula kernel: [sys_read+66/112] sys_read+0x42/0x70
nebula kernel: [syscall_call+7/11] syscall_call+0x7/0xb
nebula kernel:
nebula kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000000
nebula kernel: printing eip:
nebula kernel: d0bdb945
nebula kernel: *pde = 00000000
nebula kernel: Oops: 0002 [#1]
nebula kernel: SMP
nebula kernel: Modules linked in: nfsd exportfs lockd sunrpc 8250 serial_core xfs reiserfs lp parport_pc parport e100 mii rtc unix ext3 jbd 3w_xxxx megaraid sd_mod scsi_mod ide_disk serverworks ide_core
nebula kernel: CPU: 0
nebula kernel: EIP: 0060:[__crc_generic_read_dir+1259032/7856069] Not tainted
nebula kernel: EFLAGS: 00010202 (2.6.7)
nebula kernel: EIP is at xfs_bmap_read_extents+0x2f5/0x4d0 [xfs]
nebula kernel: eax: dc010000 ebx: 00000001 ecx: 00000000 edx: 0300603a
nebula kernel: esi: c308c018 edi: 00000000 ebp: 00000000 esp: ce7168c0
nebula kernel: ds: 007b es: 007b ss: 0068
nebula kernel: Process mlnet (pid: 731, threadinfo=ce716000 task=ce734cf0)
nebula kernel: Stack: cec63400 005015e0 00000001 00000000 ce716914 00000002 00000000 839c0b00
nebula kernel: 000000fe 000000fe 000000fe 005015e0 00000000 c15815a0 00001d69 c2dd7808
nebula kernel: cec63400 00000000 c4d13360 00000001 c308c000 c15815a0 c4d13360 0001d690
nebula kernel: Call Trace:
nebula kernel: [__crc_generic_read_dir+1429178/7856069] xfs_iread_extents+0x97/0x180 [xfs]
nebula kernel: [__crc_generic_read_dir+1260038/7856069] xfs_bmapi+0x213/0x1480 [xfs]
nebula kernel: [autoremove_wake_function+0/96] autoremove_wake_function+0x0/0x60
nebula kernel: [mempool_alloc+115/320] mempool_alloc+0x73/0x140
nebula kernel: [as_update_arq+46/128] as_update_arq+0x2e/0x80
nebula kernel: [as_add_request+434/544] as_add_request+0x1b2/0x220
nebula kernel: [__crc_generic_read_dir+1446652/7856069] xfs_imap_to_bmap+0x39/0x240 [xfs]
nebula kernel: [__crc_generic_read_dir+1447581/7856069] xfs_iomap+0x19a/0x540 [xfs]
nebula kernel: [as_update_arq+46/128] as_update_arq+0x2e/0x80
nebula kernel: [__crc_generic_read_dir+1588792/7856069] linvfs_get_block_core+0xa5/0x2c0 [xfs]
nebula kernel: [__make_request+712/1408] __make_request+0x2c8/0x580
nebula kernel: [__crc_generic_read_dir+1589391/7856069] linvfs_get_block+0x3c/0x40 [xfs]
nebula kernel: [do_mpage_readpage+943/960] do_mpage_readpage+0x3af/0x3c0
nebula kernel: [radix_tree_node_alloc+31/112] radix_tree_node_alloc+0x1f/0x70
nebula kernel: [radix_tree_insert+237/272] radix_tree_insert+0xed/0x110
nebula kernel: [add_to_page_cache+96/192] add_to_page_cache+0x60/0xc0
nebula kernel: [mpage_readpages+315/368] mpage_readpages+0x13b/0x170
nebula kernel: [__crc_generic_read_dir+1589331/7856069] linvfs_get_block+0x0/0x40 [xfs]
nebula kernel: [read_pages+312/336] read_pages+0x138/0x150
nebula kernel: [__crc_generic_read_dir+1589331/7856069] linvfs_get_block+0x0/0x40 [xfs]
nebula kernel: [__alloc_pages+784/880] __alloc_pages+0x310/0x370
nebula kernel: [do_page_cache_readahead+275/400] do_page_cache_readahead+0x113/0x190
nebula kernel: [page_cache_readahead+256/512] page_cache_readahead+0x100/0x200
nebula kernel: [do_generic_mapping_read+266/1120] do_generic_mapping_read+0x10a/0x460
nebula kernel: [__generic_file_aio_read+523/576] __generic_file_aio_read+0x20b/0x240
nebula kernel: [file_read_actor+0/240] file_read_actor+0x0/0xf0
nebula kernel: [__crc_generic_read_dir+1616642/7856069] xfs_read+0x18f/0x2c0 [xfs]
nebula kernel: [__crc_generic_read_dir+1600798/7856069] linvfs_read+0x8b/0xa0 [xfs]
nebula kernel: [do_sync_read+137/192] do_sync_read+0x89/0xc0
nebula kernel: [link_path_walk+1550/2304] link_path_walk+0x60e/0x900
nebula kernel: [permission+47/80] permission+0x2f/0x50
nebula kernel: [__crc_generic_read_dir+1602973/7856069] linvfs_open+0x8a/0x90 [xfs]
nebula kernel: [dentry_open+206/400] dentry_open+0xce/0x190
nebula kernel: [filp_open+104/112] filp_open+0x68/0x70
nebula kernel: [vfs_read+184/304] vfs_read+0xb8/0x130
nebula kernel: [sys_read+66/112] sys_read+0x42/0x70
nebula kernel: [syscall_call+7/11] syscall_call+0x7/0xb
nebula kernel:
nebula kernel: Code: 89 4d 00 83 c6 10 89 c1 89 7d 04 89 d7 0f c9 0f cf 87 cf 89
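Two numbers in the trace are easier to read once decoded: the `order:5` in the page allocation failure and the `0002` after `Oops:`. The sketch below is only an aid for reading them, assuming 4 KiB pages (the usual i386 page size) and the i386 page-fault error-code bit layout. The function names are made up for illustration, and decoding the numbers does not by itself say whether hardware or software is at fault.

```python
PAGE_SIZE = 4096  # bytes; assumption: 4 KiB pages, the usual i386 page size

def allocation_bytes(order, page_size=PAGE_SIZE):
    """Size of an order-N request: 2**N physically contiguous pages."""
    return (1 << order) * page_size

def decode_oops_code(code):
    """Decode the low three bits of an i386 page-fault error code."""
    return {
        # bit 0: 0 = no page found, 1 = protection fault
        "protection_fault": bool(code & 0x1),
        # bit 1: 0 = read access, 1 = write access
        "write": bool(code & 0x2),
        # bit 2: 0 = kernel mode, 1 = user mode
        "user_mode": bool(code & 0x4),
    }

# Values taken from the trace: "order:5" and "Oops: 0002".
print(allocation_bytes(5) // 1024, "KiB")
print(decode_oops_code(0x0002))
```

Read this way, the failed allocation asked for 32 contiguous pages (128 KiB), and the Oops was a kernel-mode write to an address with no page mapped, which matches the "NULL pointer dereference at virtual address 00000000" line.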