Chadster766 wrote:gufus wrote:Chadster766 wrote:It also looks more like this is related to the "/sbin/fan_ctrl.sh" more than wireless at this point.
I'm going to change the schedule to once every 15 minutes for /sbin/fan_ctrl.sh crontab and see what happens.
Maybe eh...
Hope that's the prob
When the "rcu_sched self-detected stall on CPU { 0}" occurs isn't it supposed to clear up the stall instead of allowing CPU 0 to be stalled forever?
There isn't really that much to do, is there? The CPU is running buggy code. If you knew how to automatically fix that code, then you would probably have run that automated fixup on the source _before_ building the kernel/driver that is failing, wouldn't you?
But the kernel will try its best, using the available means. Which is basically limited to forcing the scheduler to run whatever is waiting in line. This code follows the warning and stacktrace dump:
/*
 * Attempt to revive the RCU machinery by forcing a context switch.
 *
 * A context switch would normally allow the RCU state machine to make
 * progress and it could be we're stuck in kernel space without context
 * switches for an entirely unreasonable amount of time.
 */
resched_cpu(smp_processor_id());
But that won't work very well if the only task waiting for CPU time is the buggy one, which likely is the case. It will just lock up again and again.
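To make that point concrete, here is a tiny userspace toy model, not kernel code. Every name in it (toy_task, toy_resched, toy_count_stalls) is made up for illustration, and the round-robin pick is an assumption that is far simpler than the real scheduler. It only shows that a forced reschedule lands on the same buggy task again when nothing else is runnable.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: none of these names exist in the kernel. */
struct toy_task {
    const char *name;
    int runnable;   /* 1 if the task is waiting for CPU time */
    int buggy;      /* 1 if the task hogs the CPU once it gets it */
};

/*
 * Simulate a forced context switch: pick the next runnable task after
 * `current`, round-robin. If nothing else is runnable, `current` is
 * returned again.
 */
static size_t toy_resched(const struct toy_task *tasks, size_t n, size_t current)
{
    assert(n > 0 && current < n);
    for (size_t step = 1; step <= n; step++) {
        size_t candidate = (current + step) % n;
        if (tasks[candidate].runnable)
            return candidate;
    }
    return current;
}

/*
 * Force `max_resched` context switches in a row and count how many of
 * them handed the CPU to a buggy task.
 */
static int toy_count_stalls(const struct toy_task *tasks, size_t n,
                            size_t current, int max_resched)
{
    int stalls = 0;
    for (int i = 0; i < max_resched; i++) {
        current = toy_resched(tasks, n, current);
        if (tasks[current].buggy)
            stalls++;
    }
    return stalls;
}
```

With a task list where only the buggy task is runnable, every forced switch in this model ends up back on the buggy task. Add a second runnable task and the switches alternate between the two.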
There are actually great docs available on how to interpret this warning. See
https://www.kernel.org/doc/Documentatio … llwarn.txt
The bug is probably a locking error in one of the functions listed in the stack trace (or in another place using the same locks). I believe I've seen a few mwlwifi functions there, which are definitely high on the list of suspects. I assume the Marvell developers are looking into it. You'd have to get a good grasp of the driver's locking design before you can try to figure out the problem.