LEDE Project

  • Status Unconfirmed
  • Percent Complete
    0%
  • Task Type Bug Report
  • Category Base system
  • Assigned To No-one
  • Operating System All
  • Severity High
  • Priority Very Low
  • Reported Version Trunk
  • Due in Version Undecided
  • Due Date Undecided
  • Votes
  • Private
Attached to Project: LEDE Project
Opened by Russell Senior - 25.04.2017

FS#737 - Kernel panic on Buffalo WZR600DHP when LAN plugged at boot time, unaligned access

Supply the following if possible:
- Device problem occurs on

Buffalo WZR600DHP (multiple different devices)

- Software versions of LEDE release, packages, etc.

All recent versions tested, including “Reboot (SNAPSHOT, r3939+1-de5b8e5d2f)” ... currently building HEAD to try there

- Steps to reproduce

 
* Remove power (warm boots seem more likely to boot successfully)
* Plug a powered-on ethernet-connected device into one of the LAN ports (I used port 2)
* Apply power to the Buffalo, watch as the red diagnostic light goes on steadily, blinks fast, blinks slower, then goes off without the WLAN leds turning on normally, then repeat indefinitely.  Eventually, one of the reboots may succeed.

A panic looks like this:

[   20.142060] eth0: link up (1000Mbps/Full duplex)
[   20.148486] Unhandled kernel unaligned access[#1]:
[   20.153285] CPU: 0 PID: 960 Comm: basename Not tainted 4.4.59 #0
[   20.159280] task: 87cf1e30 ti: 86c9c000 task.ti: 86c9c000
[   20.164661] $ 0   : 00000000 00000001 00000000 00000001
[   20.169907] $ 4   : 871193ff 87311068 800f3538 87311068
[   20.175154] $ 8   : 87311068 80065058 7701b000 00002000
[   20.180402] $12   : 00000000 00003000 fffffffc 00000000
[   20.185649] $16   : 87119980 871193ff 87cf1e30 76f870d0
[   20.190896] $20   : 000a1f10 7701bf6f 7701cdcc 86c6d1e0
[   20.196143] $24   : 00000000 800f3538                  
[   20.201391] $28   : 86c9c000 86c9dda8 86c9ddf0 8010ba2c
[   20.206638] Hi    : 00000000
[   20.209505] Lo    : 00000008
[   20.212393] epc   : 80099ad8 commit_creds+0x3c/0x20c
[   20.217356] ra    : 8010ba2c install_exec_creds+0x18/0x38
[   20.222735] Status: 1000fc03 KERNEL EXL IE 
[   20.226936] Cause : 00800010 (ExcCode 04)
[   20.230935] BadVA : 871193ff
[   20.233802] PrId  : 00019374 (MIPS 24Kc)
[   20.237714] Modules linked in: ath9k ath9k_common ath9k_hw ath nf_conntrack_ipv6 mac80211 iptable_nat ipt_REJECT ipt_MASQUERADE cfg80211 xt_time xt_tcpudp xt_tcpmss xt_string xt_statistic xt_state xt_recent xt_quota xt_pkttype xt_physdev xt_owner xt_nat xt_multiport xt_mark xt_mac xt_limit xt_length xt_hl xt_helper xt_ecn xt_dscp xt_conntrack xt_connmark xt_connlimit xt_connbytes xt_condition xt_comment xt_addrtype xt_TCPMSS xt_REDIRECT xt_NETMAP xt_LOG xt_HL xt_DSCP xt_CLASSIFY ts_kmp ts_fsm ts_bm nf_reject_ipv4 nf_nat_redirect nf_nat_masquerade_ipv4 nf_conntrack_ipv4 nf_nat_ipv4 nf_nat nf_log_ipv4 nf_defrag_ipv6 nf_defrag_ipv4 nf_conntrack_rtcache nf_conntrack iptable_mangle iptable_filter ipt_ECN ip_tables compat br_netfilter ledtrig_usbport ip6t_REJECT nf_reject_ipv6 nf_log_ipv6 nf_log_common ip6table_mangle ip6table_filter ip6_tables x_tables tun ehci_platform ehci_hcd gpio_button_hotplug usbcore nls_base usb_common
[   20.319746] Process basename (pid: 960, threadinfo=86c9c000, task=87cf1e30, tls=778c5d48)
[   20.327896] Stack : 86c8d634 7701bf6f 7701cdcc 86c6d1e0 86c9ddf0 800746d0 86c9ddf0 800f5f14
          86c6c400 87119a80 86c6c400 8010ba2c fffffff2 87119a80 86c6c400 76f870d0
          00000000 8014596c 00000000 800f6d60 00000000 800763c0 00002012 00000091
          55555554 80410000 0045a000 00000008 0045a535 0045a000 004497c0 00400000
          76f78000 00400000 86c6c500 87119f00 86c6c300 00000000 00000003 00000003
          ...
[   20.363640] Call Trace:
[   20.366082] [<80099ad8>] commit_creds+0x3c/0x20c
[   20.370700] [<8010ba2c>] install_exec_creds+0x18/0x38
[   20.375752] [<8014596c>] load_elf_binary+0xae0/0x112c
[   20.380794] [<8010bad8>] search_binary_handler+0x8c/0x1cc
[   20.386189] [<8010d09c>] do_execveat_common+0x400/0x5ac
[   20.391400] 
[   20.392880] 
Code: 8c820000  28420001  00020336 <c2220000> 24420001  e2220000  1040fffc  00000000  8e260014 
[   20.402936] ---[ end trace f860d77d4ae98434 ]---
[   20.409602] Fatal exception: panic in 5 seconds
[   20.414551] CPU 0 Unable to handle kernel paging request at virtual address 00000200, epc == 87fe0254, ra == 87fe1ce8
[   20.425372] Oops[#2]:
[   20.427647] CPU: 0 PID: 3 Comm: ksoftirqd/0 Tainted: G      D         4.4.59 #0
[   20.434942] task: 87c2ca10 ti: 87c44000 task.ti: 87c44000
[   20.440323] $ 0   : 00000000 80480000 80000001 00000200
[   20.445570] $ 4   : 86c9e000 00000001 00000000 dcbc842b
[   20.450817] $ 8   : 86c9e058 86c9e058 00000000 00000000
[   20.456064] $12   : 00440002 ffffffff 00000000 ffffffff
[   20.461311] $16   : 86c9e000 000006ec 00000377 804275d8
[   20.466559] $20   : 000006ec 00000004 804275d8 00000000
[   20.471805] $24   : 00000000 87fe7890                  
[   20.477053] $28   : 87c44000 87c45b20 86c9e040 87fe1ce8
[   20.482300] Hi    : 00000377
[   20.485167] Lo    : 3456f000
[   20.488079] epc   : 87fe0254 nf_ct_untracked_status_or+0x78/0x98 [nf_conntrack]
[   20.495383] ra    : 87fe1ce8 __nf_conntrack_confirm+0xbc/0x364 [nf_conntrack]
[   20.502493] Status: 1000fc03 KERNEL EXL IE 
[   20.506694] Cause : 0080000c (ExcCode 03)
[   20.510693] BadVA : 00000200
[   20.513560] PrId  : 00019374 (MIPS 24Kc)
[   20.517472] Modules linked in: ath9k ath9k_common ath9k_hw ath nf_conntrack_ipv6 mac80211 iptable_nat ipt_REJECT ipt_MASQUERADE cfg80211 xt_time xt_tcpudp xt_tcpmss xt_string xt_statistic xt_state xt_recent xt_quota xt_pkttype xt_physdev xt_owner xt_nat xt_multiport xt_mark xt_mac xt_limit xt_length xt_hl xt_helper xt_ecn xt_dscp xt_conntrack xt_connmark xt_connlimit xt_connbytes xt_condition xt_comment xt_addrtype xt_TCPMSS xt_REDIRECT xt_NETMAP xt_LOG xt_HL xt_DSCP xt_CLASSIFY ts_kmp ts_fsm ts_bm nf_reject_ipv4 nf_nat_redirect nf_nat_masquerade_ipv4 nf_conntrack_ipv4 nf_nat_ipv4 nf_nat nf_log_ipv4 nf_defrag_ipv6 nf_defrag_ipv4 nf_conntrack_rtcache nf_conntrack iptable_mangle iptable_filter ipt_ECN ip_tables compat br_netfilter ledtrig_usbport ip6t_REJECT nf_reject_ipv6 nf_log_ipv6 nf_log_common ip6table_mangle ip6table_filter ip6_tables x_tables tun ehci_platform ehci_hcd gpio_button_hotplug usbcore nls_base usb_common
[   20.599503] Process ksoftirqd/0 (pid: 3, threadinfo=87c44000, task=87c2ca10, tls=00000000)
[   20.607740] Stack : 00000002 87fd97c0 00000001 02000000 87c45b9c 87c45bd4 804279ec 87118cc0
          00000001 00000004 fffffffd 00000000 02080020 80294f60 00000300 71a45669
          00000000 00000000 87118cc0 87118cc0 87118cc0 87c45bd4 fffffffb 00000001
          00000003 80295004 804279e4 71a45669 87c45c04 87c45c3c 00000000 8741d284
          87118cc0 80420000 00000001 87118cc0 02080020 00000800 8720b05c 8029b5f0
          ...
[   20.643491] Call Trace:
[   20.645944] [<87fe0254>] nf_ct_untracked_status_or+0x78/0x98 [nf_conntrack]
[   20.652919] [<87fe1ce8>] __nf_conntrack_confirm+0xbc/0x364 [nf_conntrack]
[   20.659709] [<80294f60>] nf_iterate+0x7c/0xc8
[   20.664055] [<80295004>] nf_hook_slow+0x58/0xf8
[   20.668582] [<8029b5f0>] ip_local_deliver+0xb8/0xe8
[   20.673451] [<8029b8e8>] ip_rcv+0x2c8/0x334
[   20.677629] [<80268c5c>] __netif_receive_skb_core+0x614/0x6d0
[   20.683373] [<8032f74c>] br_pass_frame_up+0x94/0x10c
[   20.688328] [<8032ffd0>] br_handle_frame+0x2cc/0x418
[   20.693279] [<802689dc>] __netif_receive_skb_core+0x394/0x6d0
[   20.699028] [<8023703c>] ag71xx_poll+0x430/0x5c0
[   20.703637] [<80269ab4>] net_rx_action+0x10c/0x2c8
[   20.708420] [<80084134>] __do_softirq+0x250/0x298
[   20.713116] [<800841a4>] run_ksoftirqd+0x28/0x60
[   20.717725] [<8009abf8>] smpboot_thread_fn+0x158/0x188
[   20.722851] [<80098628>] kthread+0xd8/0xec
[   20.726936] [<80060878>] ret_from_kernel_thread+0x14/0x1c
[   20.732315] 
[   20.733795] 
Code: 00020336  8c820008  30450001 <ac620000> 14a00002  00000000  ac430004  24020200  ac82000c 
[   20.743784] ---[ end trace f860d77d4ae98435 ]---
[   20.749725] Kernel panic - not syncing: Fatal exception in interrupt
[   20.757339] Rebooting in 3 seconds..

Or another one:

[   20.121647] eth0: link up (1000Mbps/Full duplex)
[   20.131943] Unhandled kernel unaligned access[#1]:
[   20.136738] CPU: 0 PID: 937 Comm: snmpd Not tainted 4.4.59 #0
[   20.142474] task: 87d92840 ti: 86c7c000 task.ti: 86c7c000
[   20.147854] $ 0   : 00000000 7fe40058 8010de54 00000000
[   20.153101] $ 4   : 873b557f 86c7de7c 86c7de74 00000001
[   20.158348] $ 8   : 00000020 80065050 80107538 00000000
[   20.163595] $12   : 7fe40040 00000000 00000000 77d382c0
[   20.168842] $16   : 00000000 873b557f 86c7df08 00000080
[   20.174090] $20   : 00000080 77d2e000 86c831e0 77d32d8c
[   20.179336] $24   : 00000fa3 77c9d000                  
[   20.184584] $28   : 86c7c000 86c7de08 00000000 8010de9c
[   20.189832] Hi    : 00000004
[   20.192707] Lo    : 00000000
[   20.195596] epc   : 800685a8 mutex_lock+0x0/0x30
[   20.200203] ra    : 8010de9c pipe_read+0x48/0x284
[   20.204889] Status: 1000fc03 KERNEL EXL IE 
[   20.209090] Cause : 00800010 (ExcCode 04)
[   20.213089] BadVA : 873b557f
[   20.215957] PrId  : 00019374 (MIPS 24Kc)
[   20.219869] Modules linked in: ath9k ath9k_common ath9k_hw ath nf_conntrack_ipv6 mac80211 iptable_nat ipt_REJECT ipt_MASQUERADE cfg80211 xt_time xt_tcpudp xt_tcpmss xt_string xt_statistic xt_state xt_recent xt_quota xt_pkttype xt_physdev xt_owner xt_nat xt_multiport xt_mark xt_mac xt_limit xt_length xt_hl xt_helper xt_ecn xt_dscp xt_conntrack xt_connmark xt_connlimit xt_connbytes xt_condition xt_comment xt_addrtype xt_TCPMSS xt_REDIRECT xt_NETMAP xt_LOG xt_HL xt_DSCP xt_CLASSIFY ts_kmp ts_fsm ts_bm nf_reject_ipv4 nf_nat_redirect nf_nat_masquerade_ipv4 nf_conntrack_ipv4 nf_nat_ipv4 nf_nat nf_log_ipv4 nf_defrag_ipv6 nf_defrag_ipv4 nf_conntrack_rtcache nf_conntrack iptable_mangle iptable_filter ipt_ECN ip_tables compat br_netfilter ledtrig_usbport ip6t_REJECT nf_reject_ipv6 nf_log_ipv6 nf_log_common ip6table_mangle ip6table_filter ip6_tables x_tables tun ehci_platform ehci_hcd gpio_button_hotplug usbcore nls_base usb_common
[   20.301899] Process snmpd (pid: 937, threadinfo=86c7c000, task=87d92840, tls=77d39d48)
[   20.309790] Stack : 8730d004 068b46ad 86c84e70 00000012 00000000 00000000 00000000 873b0780
          87cf2efc 86c7de7c 00000000 86c831e0 86c7df08 00000080 77d2e000 77d2e000
          77d31518 77d32d8c 00000000 8010647c 0045acd4 80076b6c 80410000 8008a31c
          00000080 00000000 77d31518 7fe4015c 00000080 00000000 00000000 00000080
          86c7de74 00000001 86c831e0 00000000 00000000 00000000 00000000 00000000
          ...
[   20.345535] Call Trace:
[   20.347973] [<800685a8>] mutex_lock+0x0/0x30
[   20.352235] [<8010de9c>] pipe_read+0x48/0x284
[   20.356586] [<8010647c>] __vfs_read+0xc4/0x104
[   20.361023] [<80106bfc>] vfs_read+0x84/0x130
[   20.365284] [<80107590>] SyS_read+0x58/0xc4
[   20.369462] [<80062c2c>] syscall_common+0x30/0x54
[   20.374153] 
[   20.375640] 
Code: 8fb00024  03e00008  27bd0040 <c0820000> 2443ffff  e0830000  1060fffc  00000000  2442ffff 
[   20.385691] ---[ end trace 7089734b36069ee3 ]---
[   20.392349] Fatal exception: panic in 5 seconds
[   20.400076] CPU 0 Unable to handle kernel paging request at virtual address 00000200, epc == 87ff0254, ra == 87ff1ce8
[   20.410891] Oops[#2]:
[   20.413160] CPU: 0 PID: 3 Comm: ksoftirqd/0 Tainted: G      D         4.4.59 #0
[   20.420456] task: 87c2ca10 ti: 87c44000 task.ti: 87c44000
[   20.425836] $ 0   : 00000000 80480000 80000001 00000200
[   20.431083] $ 4   : 8730e000 00000001 00000000 409e9f5a
[   20.436331] $ 8   : 8730e058 8730e058 00000000 00000000
[   20.441578] $12   : 00440002 ffffffff 00000000 ffffffff
[   20.446825] $16   : 8730e000 000002e0 00000655 804275d8
[   20.452073] $20   : 000002e0 00000004 804275d8 00000000
[   20.457320] $24   : 00000000 87ff7890                  
[   20.462567] $28   : 87c44000 87c45b20 8730e040 87ff1ce8
[   20.467814] Hi    : 00000655
[   20.470680] Lo    : 17458800
[   20.473587] epc   : 87ff0254 nf_ct_untracked_status_or+0x78/0x98 [nf_conntrack]
[   20.480887] ra    : 87ff1ce8 __nf_conntrack_confirm+0xbc/0x364 [nf_conntrack]
[   20.487999] Status: 1000fc03 KERNEL EXL IE 
[   20.492199] Cause : 0080000c (ExcCode 03)
[   20.496199] BadVA : 00000200
[   20.499075] PrId  : 00019374 (MIPS 24Kc)
[   20.502986] Modules linked in: ath9k ath9k_common ath9k_hw ath nf_conntrack_ipv6 mac80211 iptable_nat ipt_REJECT ipt_MASQUERADE cfg80211 xt_time xt_tcpudp xt_tcpmss xt_string xt_statistic xt_state xt_recent xt_quota xt_pkttype xt_physdev xt_owner xt_nat xt_multiport xt_mark xt_mac xt_limit xt_length xt_hl xt_helper xt_ecn xt_dscp xt_conntrack xt_connmark xt_connlimit xt_connbytes xt_condition xt_comment xt_addrtype xt_TCPMSS xt_REDIRECT xt_NETMAP xt_LOG xt_HL xt_DSCP xt_CLASSIFY ts_kmp ts_fsm ts_bm nf_reject_ipv4 nf_nat_redirect nf_nat_masquerade_ipv4 nf_conntrack_ipv4 nf_nat_ipv4 nf_nat nf_log_ipv4 nf_defrag_ipv6 nf_defrag_ipv4 nf_conntrack_rtcache nf_conntrack iptable_mangle iptable_filter ipt_ECN ip_tables compat br_netfilter ledtrig_usbport ip6t_REJECT nf_reject_ipv6 nf_log_ipv6 nf_log_common ip6table_mangle ip6table_filter ip6_tables x_tables tun ehci_platform ehci_hcd gpio_button_hotplug usbcore nls_base usb_common
[   20.585017] Process ksoftirqd/0 (pid: 3, threadinfo=87c44000, task=87c2ca10, tls=00000000)
[   20.593254] Stack : 00000002 874187c0 00000001 02000000 87c45b9c 87c45bd4 804279ec 87312180
          00000001 00000004 fffffffd 00000000 02080020 80294f60 00000300 d41771e8
          00000000 00000000 87312180 87312180 87312180 87c45bd4 fffffffb 00000001
          00000003 80295004 804279e4 d41771e8 87c45c04 87c45c3c 00000000 87413604
          87312180 80420000 00000001 87312180 02080020 00000800 8721105c 8029b5f0
          ...
[   20.629004] Call Trace:
[   20.631457] [<87ff0254>] nf_ct_untracked_status_or+0x78/0x98 [nf_conntrack]
[   20.638422] [<87ff1ce8>] __nf_conntrack_confirm+0xbc/0x364 [nf_conntrack]
[   20.645214] [<80294f60>] nf_iterate+0x7c/0xc8
[   20.649561] [<80295004>] nf_hook_slow+0x58/0xf8
[   20.654087] [<8029b5f0>] ip_local_deliver+0xb8/0xe8
[   20.658957] [<8029b8e8>] ip_rcv+0x2c8/0x334
[   20.663134] [<80268c5c>] __netif_receive_skb_core+0x614/0x6d0
[   20.668878] [<8032f74c>] br_pass_frame_up+0x94/0x10c
[   20.673833] [<8032ffd0>] br_handle_frame+0x2cc/0x418
[   20.678785] [<802689dc>] __netif_receive_skb_core+0x394/0x6d0
[   20.684534] [<8023703c>] ag71xx_poll+0x430/0x5c0
[   20.689142] [<80269ab4>] net_rx_action+0x10c/0x2c8
[   20.693926] [<80084134>] __do_softirq+0x250/0x298
[   20.698622] [<800841a4>] run_ksoftirqd+0x28/0x60
[   20.703231] [<8009abf8>] smpboot_thread_fn+0x158/0x188
[   20.708365] [<80098628>] kthread+0xd8/0xec
[   20.712450] [<80060878>] ret_from_kernel_thread+0x14/0x1c
[   20.717829] 
[   20.719309] 
Code: 00020336  8c820008  30450001 <ac620000> 14a00002  00000000  ac430004  24020200  ac82000c 
[   20.729304] ---[ end trace 7089734b36069ee4 ]---
[   20.735253] Kernel panic - not syncing: Fatal exception in interrupt
[   20.742878] Rebooting in 3 seconds..

I have more panics, available on request.

Russell Senior commented on 25.04.2017 23:56

Duplicated on r4027-725238f698 (current HEAD). A series of full bootlogs with panics are attached.

Russell Senior commented on 27.04.2017 05:34

bisection is not generating clear results, will update when clarity achieved.

Russell Senior commented on 27.04.2017 13:00

It appears the problem manifests if br_netfilter (from kmod-br-netfilter) or xt_physdev (from kmod-ipt-extra) are loaded at boot time and a link is present on the LAN ports (eth0).

Russell Senior commented on 27.04.2017 14:30

I am bisecting with the following .config stub:

CONFIG_TARGET_ar71xx=y
CONFIG_TARGET_ar71xx_generic=y
CONFIG_TARGET_ar71xx_generic_DEVICE_WZR600DHP=y
CONFIG_DEVEL=y
CONFIG_BUILD_LOG=y
# CONFIG_PACKAGE_firewall is not set
CONFIG_PACKAGE_kmod-br-netfilter=y
CONFIG_PACKAGE_kmod-ipt-extra=y
# CONFIG_PACKAGE_uboot-envtools is not set
CONFIG_PACKAGE_kmod-ipt-nat=y
CONFIG_PACKAGE_kmod-nf-nat=y
Russell Senior commented on 27.04.2017 14:59

reboot-1238-ga5c32a1f19 = no panic
reboot-1239-gfa85ee1d4e = panic

In the panic case, lsmod shows both the problematic modules: br_netfilter and xt_physdev, the no-panic case, those modules don't appear.

Russell Senior commented on 29.04.2017 09:59

I tried to replicate the problem on a Netgear WNDR3800, but didn't see the panic there.

Loading...

Available keyboard shortcuts

Tasklist

Task Details

Task Editing