WRT1200AC V2 reboot in a loop and crash when hot


#1

Guys, I have a big problem with the WRT1200AC V2... :cry:

My company and I bought a lot of Linksys WRT1200AC V2 routers, and approximately 50% of them (out of 20 from our last 2 orders) hit a kernel panic after between 30 minutes and 10 hours of use...

"Cant handle request at virtual memory 0x00000000" or something like that (not always the same error).
and
CPU#0 STOP
CPU#1 STOP
reboot...

After that, sometimes it boots (and crashes again after a few moments), and sometimes U-Boot crashes and stops booting with "BootROM: Image checksum verification FAILED".

I strongly suspect the temperature and the NAND flash.
Once the router is hot, even the OEM firmware crashes... and reboots in a loop (or not...).

Has anyone had the same experience?


#2

Perhaps you should contact Linksys or your distributor.


#3

Yeah, that's what we did... we are waiting for answers... I just wanted to know if anyone has any information or similar problems...

Thx


#4

So the WRT1200AC has heat issues? I know it has no fan inside, whereas the 1900 and 3200 do.


#5

The WRT3200ACM does not have an internal fan (at least, mine doesn't have one).


#6

I bought a WRT1200AC V2 and flashed LEDE to boot_part 2 only.
It is unstable, rebooting anywhere from one to 10 times a day. So far I was just guessing why; I thought it might be because I run OpenWrt on boot_part 2 only.
How do you check the logs when it crashes? After a reboot the logs don't show these errors.
I guess I need to send it back and get a better, more stable one.

Cheers, Frood


#7

I have the same issue of random reboots with the WRT1200AC V2. I have already shipped one back and got a replacement unit, and it has the same issue. I don't think it's a temperature problem, because I monitor the sensor readings and CPU usage is always just around 1-2%.
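For anyone who wants to watch the sensors the same way, here is a minimal sketch. It assumes the kernel exposes temperatures as sysfs files holding millidegrees Celsius (the usual hwmon convention); the function name `read_temp_c` is made up for illustration, and the exact sensor paths on this router are an assumption.

```shell
#!/bin/sh
# Print a temperature in whole degrees Celsius.
# $1: a sysfs temperature file that holds millidegrees Celsius
#     (which file belongs to which sensor is an assumption, check your device)
read_temp_c() {
    raw=$(cat "$1") || return 1
    echo $(( raw / 1000 ))
}
```

Possible usage, if the hwmon files exist on your build: `for f in /sys/class/hwmon/hwmon*/temp*_input; do echo "$f: $(read_temp_c "$f") C"; done`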


#8

Hi mfka8,

After a long discussion and negotiation (approx. 6 months) with Linksys (Belkin), they admitted that they have a problem with the DDR3 on the board. They took all our broken routers to run some tests and replace them.

They will replace the DDR3 by changing the brand on the next board (maybe with a V3 version).


#9

@keulu This is a pure joke, right? I honestly was already suspecting the RAM, because I had a few log hints for memory corruption like this:

[177170.508736] BUG: Bad page map in process tinyproxy  pte:1b06d7dd pmd:1cbe3831
[177170.516009] page:dff59da0 count:0 mapcount:-1 mapping:  (null) index:0x0
[177170.522837] flags: 0x10(dirty)
[177170.526052] page dumped because: bad pte
[177170.530081] addr:01421000 vm_flags:00100073 anon_vma:dcbb2540 mapping:  (null) index:1421
[177170.538390] file:  (null) fault:  (null) mmap:  (null) readpage:  (null)
[177170.545209] CPU: 1 PID: 6474 Comm: tinyproxy Not tainted 4.9.65 #0
[177170.551501] Hardware name: Marvell Armada 380/385 (Device Tree)
[177170.557549] [<c0016010>] (unwind_backtrace) from [<c0012220>] (show_stack+0x10/0x14)
[177170.565420] [<c0012220>] (show_stack) from [<c0218580>] (dump_stack+0x7c/0x9c)
[177170.572769] [<c0218580>] (dump_stack) from [<c00b5140>] (print_bad_pte+0x154/0x18c)
[177170.580549] [<c00b5140>] (print_bad_pte) from [<c00b7394>] (unmap_page_range+0x4fc/0x554)
[177170.588851] [<c00b7394>] (unmap_page_range) from [<c00b781c>] (zap_page_range+0xd0/0x174)
[177170.597156] [<c00b781c>] (zap_page_range) from [<c00c468c>] (SyS_madvise+0x58c/0x7e8)
[177170.605111] [<c00c468c>] (SyS_madvise) from [<c000ed40>] (ret_fast_syscall+0x0/0x3c)
[177170.612984] Disabling lock debugging due to kernel taint
[177170.618830] BUG: Bad rss-counter state mm:dc174700 idx:0 val:-1
[177170.624899] BUG: Bad rss-counter state mm:dc174700 idx:1 val:1
[178472.320439] BUG: Bad page state in process swapper/0  pfn:1b06d
[178472.326478] page:dff59da0 count:-1 mapcount:-1 mapping:  (null) index:0x0
[178472.333393] flags: 0x10(dirty)
[178472.336545] page dumped because: PAGE_FLAGS_CHECK_AT_PREP flag set
[178472.342837] bad because of flags: 0x10(dirty)
[178472.347296] Modules linked in: pppoe ppp_async pppox ppp_generic nf_nat_pptp nf_conntrack_pptp nf_conntrack_ipv6 iptable_nat ipt_REJECT ipt_MASQUERADE xt_time xt_tcpudp xt_tcpmss xt_statistic xt_state xt_recent xt_quota xt_policy xt_pkttype xt_physdev xt_owner xt_nat xt_multiport xt_mark xt_mac xt_limit xt_length xt_hl xt_helper xt_esp xt_ecn xt_dscp xt_conntrack xt_connmark xt_connlimit xt_connbytes xt_comment xt_addrtype xt_TCPMSS xt_REDIRECT xt_LOG xt_HL xt_DSCP xt_CT xt_CLASSIFY usblp ums_usbat ums_sddr55 ums_sddr09 ums_karma ums_jumpshot ums_isd200 ums_freecom ums_datafab ums_cypress ums_alauda ts_fsm ts_bm slhc rfcomm nf_reject_ipv4 nf_nat_tftp nf_nat_snmp_basic nf_nat_sip nf_nat_redirect nf_nat_proto_gre nf_nat_masquerade_ipv4 nf_nat_irc nf_conntrack_ipv4 nf_nat_ipv4 nf_nat_h323 nf_nat_amanda
[178472.419066]  nf_nat nf_log_ipv4 nf_defrag_ipv6 nf_defrag_ipv4 nf_conntrack_tftp nf_conntrack_snmp nf_conntrack_sip nf_conntrack_rtcache nf_conntrack_proto_gre nf_conntrack_irc nf_conntrack_h323 nf_conntrack_broadcast ts_kmp nf_conntrack_amanda iptable_mangle iptable_filter ipt_ah ipt_ECN ip_tables hidp hci_uart crc_ccitt btusb btmrvl_sdio btmrvl btintel br_netfilter bnep bluetooth fuse sch_cake em_nbyte cls_basic sch_dsmark sch_pie sch_gred sch_teql act_ipt em_text em_meta sch_codel sch_sfq sch_fq act_police sch_prio em_cmp sch_red act_connmark nf_conntrack act_skbedit act_mirred em_u32 cls_u32 cls_tcindex cls_flow cls_route cls_fw sch_tbf sch_htb sch_hfsc sch_ingress hid evdev input_core mwlwifi mac80211 cfg80211 compat cryptodev xt_set ip_set_list_set ip_set_hash_netiface ip_set_hash_netport ip_set_hash_netnet
[178472.490579]  ip_set_hash_net ip_set_hash_netportnet ip_set_hash_mac ip_set_hash_ipportnet ip_set_hash_ipportip ip_set_hash_ipport ip_set_hash_ipmark ip_set_hash_ip ip_set_bitmap_port ip_set_bitmap_ipmac ip_set_bitmap_ip ip_set nfnetlink ip6t_REJECT nf_reject_ipv6 nf_log_ipv6 nf_log_common ip6table_mangle ip6table_filter ip6_tables x_tables msdos bonding ifb tun vfat fat ntfs nls_utf8 nls_iso8859_1 nls_cp437 regmap_mmio sha512_generic sha256_generic seqiv jitterentropy_rng drbg md5 hmac ghash_generic gf128mul gcm ecb ctr cmac cbc authenc ohci_pci uhci_hcd ohci_platform ohci_hcd gpio_button_hotplug
[178472.542862] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G    B           4.9.65 #0
[178472.550114] Hardware name: Marvell Armada 380/385 (Device Tree)
[178472.556162] [<c0016010>] (unwind_backtrace) from [<c0012220>] (show_stack+0x10/0x14)
[178472.564034] [<c0012220>] (show_stack) from [<c0218580>] (dump_stack+0x7c/0x9c)
[178472.571383] [<c0218580>] (dump_stack) from [<c009c000>] (bad_page+0x100/0x138)
[178472.578727] [<c009c000>] (bad_page) from [<c009df98>] (get_page_from_freelist+0x638/0x658)
[178472.587115] [<c009df98>] (get_page_from_freelist) from [<c009e3f8>] (__alloc_pages_nodemask+0xe8/0xa10)
[178472.596635] [<c009e3f8>] (__alloc_pages_nodemask) from [<c009edb8>] (__alloc_page_frag+0x34/0x14c)
[178472.605725] [<c009edb8>] (__alloc_page_frag) from [<c0399c80>] (netdev_alloc_frag+0x24/0x34)
[178472.614291] [<c0399c80>] (netdev_alloc_frag) from [<c03c6868>] (hwbm_pool_refill+0x18/0x68)
[178472.622771] [<c03c6868>] (hwbm_pool_refill) from [<c031e3a0>] (mvneta_poll+0x470/0x970)
[178472.630903] [<c031e3a0>] (mvneta_poll) from [<c03a76f4>] (net_rx_action+0xe8/0x2ac)
[178472.638684] [<c03a76f4>] (net_rx_action) from [<c002d364>] (__do_softirq+0xd0/0x204)
[178472.646550] [<c002d364>] (__do_softirq) from [<c002d71c>] (irq_exit+0x94/0xb8)
[178472.653887] [<c002d71c>] (irq_exit) from [<c0062154>] (__handle_domain_irq+0x90/0xb4)
[178472.661840] [<c0062154>] (__handle_domain_irq) from [<c0009428>] (gic_handle_irq+0x50/0x94)
[178472.670316] [<c0009428>] (gic_handle_irq) from [<c0012c8c>] (__irq_svc+0x6c/0x90)
[178472.677916] Exception stack(0xc062ff60 to 0xc062ffa8)
[178472.683077] ff60: 00000001 00000000 00000000 c001b1a0 00000000 c062e000 c0630fe4 00000001
[178472.691377] ff80: c062c168 00000000 c062ffb8 00000001 00000000 c062ffb0 c000f808 c000f80c
[178472.699675] ffa0: 60000013 ffffffff
[178472.703268] [<c0012c8c>] (__irq_svc) from [<c000f80c>] (arch_cpu_idle+0x2c/0x38)
[178472.710791] [<c000f80c>] (arch_cpu_idle) from [<c005b4a4>] (cpu_startup_entry+0xf0/0x19c)
[178472.719096] [<c005b4a4>] (cpu_startup_entry) from [<c05e8c54>] (start_kernel+0x39c/0x420)
[187564.266647] swap_free: Bad swap file entry 00000c00
[187564.271683] BUG: Bad page map in process grep  pte:00060000 pmd:1cbe3831
[187564.278525] addr:00021000 vm_flags:00000875 anon_vma:  (null) mapping:df068b94 index:11
[187564.286663] file:busybox fault:filemap_fault mmap:generic_file_readonly_mmap readpage:squashfs_readpage
[187564.296199] CPU: 0 PID: 4974 Comm: grep Tainted: G    B           4.9.65 #0
[187564.303276] Hardware name: Marvell Armada 380/385 (Device Tree)
[187564.309321] [<c0016010>] (unwind_backtrace) from [<c0012220>] (show_stack+0x10/0x14)
[187564.317193] [<c0012220>] (show_stack) from [<c0218580>] (dump_stack+0x7c/0x9c)
[187564.324539] [<c0218580>] (dump_stack) from [<c00b5140>] (print_bad_pte+0x154/0x18c)
[187564.332319] [<c00b5140>] (print_bad_pte) from [<c00b718c>] (unmap_page_range+0x2f4/0x554)
[187564.340621] [<c00b718c>] (unmap_page_range) from [<c00b773c>] (unmap_vmas+0x44/0x54)
[187564.348487] [<c00b773c>] (unmap_vmas) from [<c00bc0fc>] (exit_mmap+0xc0/0x1bc)
[187564.355829] [<c00bc0fc>] (exit_mmap) from [<c0026ee8>] (mmput+0x38/0xf4)
[187564.362649] [<c0026ee8>] (mmput) from [<c002b650>] (do_exit+0x354/0x838)
[187564.369468] [<c002b650>] (do_exit) from [<c002cc64>] (do_group_exit+0x48/0xd0)
[187564.376810] [<c002cc64>] (do_group_exit) from [<c002ccfc>] (__wake_up_parent+0x0/0x18)
[187564.385112] BUG: Bad rss-counter state mm:ddb41880 idx:2 val:-1
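If you want to check a saved log for the same kind of messages, something like this could work. It is only a sketch: the function name `count_mem_bugs` is made up for illustration, and the pattern list covers just the messages seen in the excerpt above, so it is not exhaustive.

```shell
#!/bin/sh
# Count kernel log lines that look like the memory-corruption reports above.
# $1: path to a saved kernel log (for example the output of dmesg)
count_mem_bugs() {
    # Patterns are copied from the excerpt above; other corruption messages exist.
    grep -c -E 'BUG: Bad (page map|page state|rss-counter state)|Bad swap file entry' "$1"
}
```

Possible usage: `dmesg > /tmp/dmesg.txt; count_mem_bugs /tmp/dmesg.txt`. A non-zero count on an otherwise idle router would be a reason to look closer.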

The router may stay stable for days, then randomly just reboot when using it.

This is unbelievable. Can you name someone at Linksys I can get in contact with about this? I won't tolerate something like this. I actually also made a post about this on the Linksys forum and my post was deleted. Yes. Deleted. So they want to keep it hush-hush, I guess. Disgusting.

I already got a replacement WRT1200AC V2 last week from Amazon. Sadly, I had two reboots again within a few days with that unit too. This one is revision C00, the other was A00.


#10

Yeah, my routers crashed every time I wrote something to /tmp.

After a discussion with Linksys, they admitted they have a problem, but they told us to go through our reseller for an RMA. They did nothing for us. My bad: I thought they would send new routers, but no, they sent the broken routers back to us.

Sorry for Linksys, but I think you have to change brands. Check with Amazon whether you can send it back and choose another router, a TP-Link Archer C7 for example.


#11

@keulu

TP-Link Archer C7? You're joking, right? (= That's like 1% of the CPU power the WRT1200 has. I need a strong ARM or even x86 based router for max OpenVPN bandwidth, no MIPS child toys.

Could you please quote/copy/scan the source where Linksys says they have a production issue and know about it? I need a reliable source and proof of it, so I can link/send it to Amazon.

edit:

I actually let a:

stress -m 1 --vm-bytes 128M

and

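#!/bin/sh
# Repeatedly download a 100 MB file into /tmp and delete it again.
# /tmp is RAM-backed (tmpfs) on OpenWrt/LEDE, so this exercises memory.

while true; do
    sleep 2
    wget http://server/100mb.zip -P /tmp/
    rm /tmp/100mb.zip
done

exit 0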

run for a few hours and it worked normally, no crash.
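A variant that would also notice silent corruption, not just crashes, is to copy random data inside /tmp and compare checksums. This is only a sketch under assumptions: `verify_copy` is a made-up name, it relies on `md5sum` being available, and a mismatch only hints at bad RAM when the scratch file lives on a RAM-backed filesystem.

```shell
#!/bin/sh
# Write random data, copy it, and compare the checksums of both files.
# $1: scratch file path (the copy goes to "$1.copy"), $2: size in KiB
# Returns 0 if the checksums match, 1 on a mismatch, 2 if writing failed.
verify_copy() {
    src="$1"
    dst="$1.copy"
    dd if=/dev/urandom of="$src" bs=1024 count="$2" 2>/dev/null || return 2
    cp "$src" "$dst" || return 2
    sum_src=$(md5sum < "$src")
    sum_dst=$(md5sum < "$dst")
    rm -f "$src" "$dst"
    [ "$sum_src" = "$sum_dst" ]
}
```

Possible usage: `while verify_copy /tmp/ramtest 65536; do sleep 2; done; echo "stopped: mismatch or write error"`. The loop ends at the first failed pass.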


#12

I think they have a temperature issue. When I stack 5 routers on top of each other, it only takes between 30 minutes and 2 hours on the original firmware before a crash. We have routers in production: those in a rack with air conditioning have no problems, but if we take them out of the rack -> crash...


#13

Maybe the RAM is overheating?


#14

yep seems to be overheating.


#15

I've been hit by this problem too. @keulu, do you have a link to an official comment from Linksys about this issue? I'll have to RMA my routers too.


#16

Check with your reseller... :confused: Unfortunately, they don't do RMAs if you didn't buy directly from them...

But do post on their forum. @mfka8 did the same, and his post was deleted.

Just insist!!!


#17

This seems to be the relevant thread on the OEM website: http://community.linksys.com/t5/Wireless-Routers/WRT1900ac-Intemitant-Reboot-Fix/m-p/887961

There's an interesting post that reads:

I did and after my own investigation and liaising with my ISP it was my conclusion that the WRT1900ac cable router was responsible for spamming the DHCP which was causing my router to reboot which in tern cause my cable modem to reboot constantly

I seem to notice that there are other threads on this site that appear abandoned or closed with just 1-2 messages about the WRT1200AC rebooting...


#18

I tried again to post about it on the unofficial forums and had to contact a mod to make it public. It is here: https://community.linksys.com/t5/Wireless-Routers/WRT1200-Lede-random-reboots/td-p/1295447

I doubt I'll hear anything back there. You guys could maybe post in it too, saying you have the same issue and sharing the feedback you got from Linksys officially.

@lleachii What you linked is not connected with this issue, also this is about the WRT1200v2, not the WRT1900(x).

I also have my doubts that it is heat related, because my crashes, reboots and memory corruptions happened at a room temperature of just 19°C, and the router was mostly idling, with a load of about 1-2%.

It might be, though, that Linksys did a bad job of connecting the heat sink to the RAM, and maybe it is only connected to the CPU. Though... this is very unlikely: RAM mostly doesn't need cooling, especially not the DDR3 RAM used in routers at very low clock speeds.


#19

I replied to your post there. Let's hope someone from Linksys will answer, but I highly doubt it.


#20

There is supposedly no "electrical" difference between the v1 and v2 models, so if there were a real design problem, would we not expect all OpenWrt 1200AC users to have a similar problem over time? Yet I and other 1200AC users seem fine and never had any issue at all, across multiple OpenWrt/LEDE/DD-WRT versions.
So a general overheating problem seems illogical to me. Maybe it's a "bad" production batch with some obscure, hard-to-diagnose problem.