The AVR54 bug in 2023

This is, strictly speaking, not OpenWrt-specific, so I will understand if moderators reprimand me, and I apologize in advance if I am unwittingly breaking the rules.

Back in 2015 or thereabouts, Intel discovered a flaw in Atom C2xxx processors (stepping B0). Over time (months; occasionally, years), the LPC clock outputs on affected processors would degrade to the point where the system could no longer boot or keep running. The issue was duly cataloged as erratum AVR54. Following the discovery, Intel put out a new stepping, C0, in which the flaw was fixed.

But that was then. And now I wonder: how much of an ongoing problem does AVR54 present? My reasoning is, there were affected processors and unaffected processors. Affected processors went on to die relatively quickly. So, unless you have an affected unit that sat in storage for years, chances of happening upon an affected unit ought to be small.

Would anyone care to comment or share recent experience?

Haven't owned any affected device, I think, but they're fixable, at least some of them.

They are (generally, you need to solder a resistor onto the motherboard in some magic place), but that's not really what I'm after. I am trying to solicit evidence for and against the idea that units susceptible to AVR54 have all but died out. Ideally, someone will say, "I've managed 12 of these at work, two died between 2015 and 2017, and it's been quiet since", or, conversely, "I've managed 12 of these at work, and they've been steadily dying at a rate of about one per year ever since 2015".

I wouldn't expect any affected device to be 'safe' by now, as you never know why it hasn't blown up already (maybe it was just sitting on a shelf as a cold spare; e.g. I got a J1900-based system with less than one hour of runtime on its SSD). Unless the board has already been fixed in hardware (as in, resistor soldered on), the chances of the issue hitting you are just as high as they were back then. There are just fewer devices to choose from, and what you get is used and without any remaining warranty, which is why there is so little ongoing noise about it.

Fair point, but how can we tell if the device is affected? The affected devices are stepping B0, but unaffected stepping C0 also exists, and the two are virtually indistinguishable (same model numbers, different ordering codes, but ordering codes are for OEMs; the owner of a device doesn't know the ordering code for the processor inside that device). Short of system discovery, there's no way to determine the stepping.

Incidentally, how would I find out the stepping if I had to? Are there any utilities for that? Ideally on OpenWrt, but failing that, anything that can run from live USB media would work.

My Atom C2550 died 6 years after purchase.

CPU-Z?

There's lscpu in OpenWrt.

There is, but I can't make heads or tails of its output in the context of AVR54. Here's the lscpu output from a potentially affected device running on an Atom C2358:

root@SomeRandomRouter:~# lscpu
Architecture:            x86_64
  CPU op-mode(s):        32-bit, 64-bit
  Address sizes:         36 bits physical, 48 bits virtual
  Byte Order:            Little Endian
CPU(s):                  2
  On-line CPU(s) list:   0,1
Vendor ID:               GenuineIntel
  BIOS Vendor ID:        Intel(R) Corporation
  Model name:            Intel(R) Atom(TM) CPU  C2358  @ 1.74GHz
    BIOS Model name:     Intel(R) Atom(TM) CPU  C2358  @ 1.74GHz
    CPU family:          6
    Model:               77
    Thread(s) per core:  1
    Core(s) per socket:  2
    Socket(s):           1
    Stepping:            8
    BogoMIPS:            3500.14
    Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx
                         fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good
                         nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2
                         ssse3 cx16 xtpr pdcm sse4_1 sse4_2 movbe popcnt tsc_deadline_timer aes rdrand lahf_lm
                         3dnowprefetch cpuid_fault epb pti tpr_shadow vnmi flexpriority ept vpid tsc_adjust smep erms
                         dtherm arat
Virtualization features:
  Virtualization:        VT-x
Caches (sum of all):
  L1d:                   48 KiB (2 instances)
  L1i:                   64 KiB (2 instances)
  L2:                    1 MiB (1 instance)
Vulnerabilities:
  Itlb multihit:         Not affected
  L1tf:                  Not affected
  Mds:                   Vulnerable: Clear CPU buffers attempted, no microcode; SMT disabled
  Meltdown:              Mitigation; PTI
  Mmio stale data:       Unknown: No mitigations
  Retbleed:              Not affected
  Spec store bypass:     Not affected
  Spectre v1:            Mitigation; usercopy/swapgs barriers and __user pointer sanitization
  Spectre v2:            Mitigation; Retpolines, STIBP disabled, RSB filling, PBRSB-eIBRS Not affected
  Srbds:                 Not affected
  Tsx async abort:       Not affected

Note that the stepping is shown as 8, rather than the B0 or C0 I am trying to extract. What am I missing?
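For context, the Stepping value that lscpu prints is the numeric stepping field reported by the CPUID instruction, not Intel's letter-and-digit stepping name, so a string like B0 or C0 will never appear there. Below is a minimal sketch of how the three numeric fields from the output above pack into the CPUID signature. The function name is made up for illustration, it only handles family 6, and whether this value differs between B0 and C0 parts is not something the lscpu output can settle.

```python
def cpuid_signature(family: int, model: int, stepping: int) -> int:
    """Pack lscpu's family/model/stepping into the CPUID leaf 1 EAX layout.

    Sketch only: handles family 6, where the displayed model is
    (extended_model << 4) + base_model and no extended family is used.
    """
    if family != 6:
        raise ValueError("this sketch only handles CPU family 6")
    if not 0 <= model <= 0xFF:
        raise ValueError("model out of range")
    if not 0 <= stepping <= 0xF:
        raise ValueError("stepping out of range")

    extended_model = model >> 4   # goes into bits 19:16
    base_model = model & 0xF      # goes into bits 7:4
    return (extended_model << 16) | (family << 8) | (base_model << 4) | stepping


# Values taken from the lscpu output above: family 6, model 77, stepping 8.
print(hex(cpuid_signature(6, 77, 8)))  # prints 0x406d8
```

The signature is handy when searching Intel's specification updates, which tend to list steppings next to their CPUID values.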

Poked the Googlebeast, and it burped out a couple of alternative suggestions (not sure if they are valid, but I've got nothing better to go on):

  1. setpci -s 00:00.0 8.b would return 02 if stepping is B0 and 03 if stepping is C0.
  2. The same two-digit combination (02 or 03) would show as revision number in the output of lspci -v | grep 00:00.0 (example: 00:00.0 Host bridge: Intel Corporation Atom processor C2000 SoC Transaction Router (rev 02))

Does this make any sense?
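For anyone who would rather script the first suggestion than type setpci by hand, here is a rough sketch. It assumes a Linux system that exposes PCI config space through sysfs and has Python available (not a given on a stock OpenWrt install, but easy from live USB media). Offset 0x08 is the Revision ID register in the standard PCI header, which is the same byte that `setpci -s 00:00.0 8.b` reads. The 02/03 to B0/C0 mapping is only what the search results above claim and is not verified here; all function names are hypothetical.

```python
from pathlib import Path

# Mapping as claimed by the search results quoted above; NOT verified.
CLAIMED_STEPPINGS = {0x02: "B0", 0x03: "C0"}

# Revision ID register in the standard PCI configuration header.
REVISION_ID_OFFSET = 0x08

# sysfs path of device 00:00.0 in PCI domain 0000 (assumed layout).
HOST_BRIDGE_CONFIG = Path("/sys/bus/pci/devices/0000:00:00.0/config")


def revision_from_config(config: bytes) -> int:
    """Return the Revision ID byte from a raw PCI config space dump."""
    if len(config) <= REVISION_ID_OFFSET:
        raise ValueError("config space dump is too short")
    return config[REVISION_ID_OFFSET]


def claimed_stepping(revision: int) -> str:
    """Translate a revision ID using the unverified mapping above."""
    return CLAIMED_STEPPINGS.get(revision, "unknown")


def read_host_bridge_revision(path: Path = HOST_BRIDGE_CONFIG) -> int:
    """Read the first 64 config bytes of the host bridge from sysfs."""
    with open(path, "rb") as handle:
        return revision_from_config(handle.read(64))
```

On the router itself, `claimed_stepping(read_host_bridge_revision())` would then return "B0", "C0", or "unknown", with the caveat that the answer is only as good as the mapping it relies on.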

I have a VMware Edge 500-n with an Atom C2358 that I can hook up tomorrow to compare.

That would be very helpful, thank you!

Seems to be the same as yours:

root@OpenWrt:/# lscpu
Architecture:            x86_64
  CPU op-mode(s):        32-bit, 64-bit
  Address sizes:         36 bits physical, 48 bits virtual
  Byte Order:            Little Endian
CPU(s):                  2
  On-line CPU(s) list:   0,1
Vendor ID:               GenuineIntel
  BIOS Vendor ID:        Intel(R) Corporation
  Model name:            Intel(R) Atom(TM) CPU  C2358  @ 1.74GHz
    BIOS Model name:     Intel(R) Atom(TM) CPU  C2358  @ 1.74GHz
    CPU family:          6
    Model:               77
    Thread(s) per core:  1
    Core(s) per socket:  2
    Socket(s):           1
    Stepping:            8
    BogoMIPS:            3500.14
    Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx
                         fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good
                         nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2
                         ssse3 cx16 xtpr pdcm sse4_1 sse4_2 movbe popcnt tsc_deadline_timer aes rdrand lahf_lm
                         3dnowprefetch cpuid_fault epb pti tpr_shadow vnmi flexpriority ept vpid tsc_adjust smep erms
                         dtherm arat
Virtualization features:
  Virtualization:        VT-x
Caches (sum of all):
  L1d:                   48 KiB (2 instances)
  L1i:                   64 KiB (2 instances)
  L2:                    1 MiB (1 instance)
Vulnerabilities:
  Itlb multihit:         Not affected
  L1tf:                  Not affected
  Mds:                   Vulnerable: Clear CPU buffers attempted, no microcode; SMT disabled
  Meltdown:              Mitigation; PTI
  Mmio stale data:       Unknown: No mitigations
  Retbleed:              Not affected
  Spec store bypass:     Not affected
  Spectre v1:            Mitigation; usercopy/swapgs barriers and __user pointer sanitization
  Spectre v2:            Mitigation; Retpolines, STIBP disabled, RSB filling, PBRSB-eIBRS Not affected
  Srbds:                 Not affected
  Tsx async abort:       Not affected

root@OpenWrt:/# setpci -s 00:00.0 8.b
02
root@OpenWrt:/# lspci -v | grep 00:00.0
00:00.0 Host bridge: Intel Corporation Atom processor C2000 SoC Transaction Router (rev 02)
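
If there are several boxes to check, the revision can be pulled out of saved lspci output rather than read by eye. This is a small sketch that assumes the line format shown above, with a two-digit hexadecimal revision in a trailing "(rev NN)"; the function name is made up for illustration.

```python
import re
from typing import Optional

# Matches the "(rev NN)" suffix that lspci prints, NN being two hex digits.
REV_PATTERN = re.compile(r"\(rev ([0-9a-fA-F]{2})\)")


def revision_from_lspci_line(line: str) -> Optional[int]:
    """Return the revision as an integer, or None if the line has no rev."""
    match = REV_PATTERN.search(line)
    return int(match.group(1), 16) if match else None


# The exact line from the output above.
line = ("00:00.0 Host bridge: Intel Corporation Atom processor "
        "C2000 SoC Transaction Router (rev 02)")
print(revision_from_lspci_line(line))  # prints 2
```

Both commands report revision 02 on this unit. By the unverified mapping quoted earlier in the thread, that would correspond to stepping B0.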

Thank you!