Skip to content

Commit 2d08e26

Browse files
Valentin SchneiderDhineshCool
authored andcommitted
[BACKPORT]sched/topology: Don't try to build empty sched domains
[ Upstream commit cd1cb3350561d2bf544ddfef76fbf0b1c9c7178f ] Turns out hotplugging CPUs that are in exclusive cpusets can lead to the cpuset code feeding empty cpumasks to the sched domain rebuild machinery. This leads to the following splat: Internal error: Oops: 96000004 [#1] PREEMPT SMP Modules linked in: CPU: 0 PID: 235 Comm: kworker/5:2 Not tainted 5.4.0-rc1-00005-g8d495477d62e #23 Hardware name: ARM Juno development board (r0) (DT) Workqueue: events cpuset_hotplug_workfn pstate: 60000005 (nZCv daif -PAN -UAO) pc : build_sched_domains (./include/linux/arch_topology.h:23 kernel/sched/topology.c:1898 kernel/sched/topology.c:1969) lr : build_sched_domains (kernel/sched/topology.c:1966) Call trace: build_sched_domains (./include/linux/arch_topology.h:23 kernel/sched/topology.c:1898 kernel/sched/topology.c:1969) partition_sched_domains_locked (kernel/sched/topology.c:2250) rebuild_sched_domains_locked (./include/linux/bitmap.h:370 ./include/linux/cpumask.h:538 kernel/cgroup/cpuset.c:955 kernel/cgroup/cpuset.c:978 kernel/cgroup/cpuset.c:1019) rebuild_sched_domains (kernel/cgroup/cpuset.c:1032) cpuset_hotplug_workfn (kernel/cgroup/cpuset.c:3205 (discriminator 2)) process_one_work (./arch/arm64/include/asm/jump_label.h:21 ./include/linux/jump_label.h:200 ./include/trace/events/workqueue.h:114 kernel/workqueue.c:2274) worker_thread (./include/linux/compiler.h:199 ./include/linux/list.h:268 kernel/workqueue.c:2416) kthread (kernel/kthread.c:255) ret_from_fork (arch/arm64/kernel/entry.S:1167) Code: f860dae2 912802d6 aa1603e1 12800000 (f8616853) The faulty line in question is: cap = arch_scale_cpu_capacity(cpumask_first(cpu_map)); and we're not checking the return value against nr_cpu_ids (we shouldn't have to!), which leads to the above. Prevent generate_sched_domains() from returning empty cpumasks, and add some assertion in build_sched_domains() to scream bloody murder if it happens again. The above splat was obtained on my Juno r0 with the following reproducer: $ cgcreate -g cpuset:asym $ cgset -r cpuset.cpus=0-3 asym $ cgset -r cpuset.mems=0 asym $ cgset -r cpuset.cpu_exclusive=1 asym $ cgcreate -g cpuset:smp $ cgset -r cpuset.cpus=4-5 smp $ cgset -r cpuset.mems=0 smp $ cgset -r cpuset.cpu_exclusive=1 smp $ cgset -r cpuset.sched_load_balance=0 . $ echo 0 > /sys/devices/system/cpu/cpu4/online $ echo 0 > /sys/devices/system/cpu/cpu5/online Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Dietmar.Eggemann@arm.com Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: hannes@cmpxchg.org Cc: lizefan@huawei.com Cc: morten.rasmussen@arm.com Cc: qperret@google.com Cc: tj@kernel.org Cc: vincent.guittot@linaro.org Fixes: 05484e098448 ("sched/topology: Add SD_ASYM_CPUCAPACITY flag detection") Link: https://lkml.kernel.org/r/20191023153745.19515-2-valentin.schneider@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
1 parent ea6dc0a commit 2d08e26

File tree

2 files changed

+6
-2
lines changed

2 files changed

+6
-2
lines changed

kernel/cpuset.c

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -698,7 +698,8 @@ static int generate_sched_domains(cpumask_var_t **domains,
698698
cpumask_intersects(cp->cpus_allowed, non_isolated_cpus)))
699699
continue;
700700

701-
if (is_sched_load_balance(cp))
701+
if (is_sched_load_balance(cp) &&
702+
!cpumask_empty(cp->effective_cpus))
702703
csa[csn++] = cp;
703704

704705
/* skip @cp's subtree */

kernel/sched/core.c

Lines changed: 4 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -7682,11 +7682,14 @@ struct sched_domain *build_sched_domain(struct sched_domain_topology_level *tl,
76827682
static int build_sched_domains(const struct cpumask *cpu_map,
76837683
struct sched_domain_attr *attr)
76847684
{
7685-
enum s_alloc alloc_state;
7685+
enum s_alloc alloc_state = sa_none;
76867686
struct sched_domain *sd;
76877687
struct s_data d;
76887688
int i, ret = -ENOMEM;
76897689

7690+
if (WARN_ON(cpumask_empty(cpu_map)))
7691+
goto error;
7692+
76907693
alloc_state = __visit_domain_allocation_hell(&d, cpu_map);
76917694
if (alloc_state != sa_rootdomain)
76927695
goto error;

0 commit comments

Comments
 (0)