aboutsummaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* sched: Merge select_task_rq_fair() and sched_balance_self()Peter Zijlstra2009-09-1510-237/+84
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The problem with wake_idle() is that is doesn't respect things like cpu_power, which means it doesn't deal well with SMT nor the recent RT interaction. To cure this, it needs to do what sched_balance_self() does, which leads to the possibility of merging select_task_rq_fair() and sched_balance_self(). Modify sched_balance_self() to: - update_shares() when walking up the domain tree, (it only called it for the top domain, but it should have done this anyway), which allows us to remove this ugly bit from try_to_wake_up(). - do wake_affine() on the smallest domain that contains both this (the waking) and the prev (the wakee) cpu for WAKE invocations. Then use the top-down balance steps it had to replace wake_idle(). This leads to the dissapearance of SD_WAKE_BALANCE and SD_WAKE_IDLE_FAR, with SD_WAKE_IDLE replaced with SD_BALANCE_WAKE. SD_WAKE_AFFINE needs SD_BALANCE_WAKE to be effective. Touch all topology bits to replace the old with new SD flags -- platforms might need re-tuning, enabling SD_BALANCE_WAKE conditionally on a NUMA distance seems like a good additional feature, magny-core and small nehalem systems would want this enabled, systems with slow interconnects would not. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* sched: Add TASK_WAKINGPeter Zijlstra2009-09-152-16/+16
| | | | | | | | | | | | | | | | | We're going to want to drop rq->lock in try_to_wake_up() for a longer period of time, however we also want to deal with concurrent waking of the same task, which is currently handled by holding rq->lock. So introduce a new TASK state, namely TASK_WAKING, which indicates someone is already waking the task (other wakers will fail p->state & state). We also keep preemption disabled over the whole ttwu(). Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* sched: Hook sched_balance_self() into sched_class::select_task_rq()Peter Zijlstra2009-09-155-11/+20
| | | | | | | | | Rather ugly patch to fully place the sched_balance_self() code inside the fair class. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* sched: Move sched_balance_self() into sched_fair.cPeter Zijlstra2009-09-152-146/+145
| | | | | | | | | | | Move the sched_balance_self() code into sched_fair.c This facilitates the merger of sched_balance_self() and sched_fair::select_task_rq(). Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* sched: Move code aroundPeter Zijlstra2009-09-151-42/+39
| | | | | | | | | In preparation to other code movement, move weighted_cpuload(), source_load() and target_load() before the class includes. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* sched: Add come comments to the sched featuresPeter Zijlstra2009-09-151-8/+85
| | | | | | | | Add text... Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* sched: Complete buddy switchesMike Galbraith2009-09-152-1/+3
| | | | | | | | | Add a NEXT_BUDDY feature flag to aid in debugging. Signed-off-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* sched: Split WAKEUP_OVERLAPPeter Zijlstra2009-09-152-3/+5
| | | | | | | | | It consists of two conditions, split them out in separate toggles so we can test them independently. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* sched: Fix double_rq_lock() compile warningPeter Zijlstra2009-09-151-2/+2
| | | | | | Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* Merge branch 'x86-setup-for-linus' of ↵Linus Torvalds2009-09-145-40/+4
|\ | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-setup-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, e820: Guard against array overflowed in __e820_add_region() x86, setup: remove obsolete pre-Kconfig CONFIG_VIDEO_ variables
| * x86, e820: Guard against array overflowed in __e820_add_region()Cyrill Gorcunov2009-08-261-1/+1
| | | | | | | | | | | | | | | | Better to be paranoid against unpredicted nr_map modifications. Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> LKML-Reference: <20090824175551.146070377@openvz.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * x86, setup: remove obsolete pre-Kconfig CONFIG_VIDEO_ variablesH. Peter Anvin2009-06-264-39/+3
| | | | | | | | | | | | | | | | | | | | | | There were a set of pre-Kconfig configuration variables defined in the video code. There is absolutely no evidence that they have been tweaked by anybody in modern history, so just get rid of them and hope nobody notices. If someone does complain, these should be made real Kconfig variables. Reported-by: Robert P. J. Day <rpjday@crashcourse.ca> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
* | Merge branch 'x86-percpu-for-linus' of ↵Linus Torvalds2009-09-147-22/+40
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-percpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, percpu: Collect hot percpu variables into one cacheline x86, percpu: Fix DECLARE/DEFINE_PER_CPU_PAGE_ALIGNED() x86, percpu: Add 'percpu_read_stable()' interface for cacheable accesses
| * | x86, percpu: Collect hot percpu variables into one cachelineTejun Heo2009-08-043-8/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On x86_64, percpu variables current_task and kernel_stack are used for get_current() and current_thread_info() respectively and thus are often used close to each other. Move definition of current_task to kernel/cpu/common.c right above kernel_stack definition and align it to cacheline so that they always fall into the same cacheline. Two percpu variables defined there together - irq_stack_ptr and irq_count - are also pretty hot and will benefit from sharing the cacheline. For consistency, current_task definition for x86_32 is also moved to kernel/cpu/common.c. Putting current_task and kernel_stack into the same cacheline was suggested by Linus Torvalds. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
| * | x86, percpu: Fix DECLARE/DEFINE_PER_CPU_PAGE_ALIGNED()Tejun Heo2009-08-042-5/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | DECLARE/DEFINE_PER_CPU_PAGE_ALIGNED() put percpu variables in .page_aligned section without adding any alignment restrictions. Currently, this doesn't cause any problem because all users of the macros have explicit page alignment and page-sized but it's much safer to enforce page alignment from the macros. After all, it's what they claim to do. Add __aligned(PAGE_SIZE) to DECLARE/DEFINE_PER_CPU_PAGE_ALIGNED() and drop explicit alignment from it users. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
| * | x86, percpu: Add 'percpu_read_stable()' interface for cacheable accessesLinus Torvalds2009-08-043-9/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is very useful for some common things like 'get_current()' and 'get_thread_info()', which can be used multiple times in a function, and where the result is cacheable. tj: Added the magical undocumented "P" modifier to UP __percpu_arg() to force gcc to dereference the pointer value passed in via the "p" input constraint. Without this, percpu_read_stable() returns the address of the percpu variable. Also added comment explaining the difference between percpu_read() and percpu_read_stable(). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
* | | Merge branch 'x86-mm-for-linus' of ↵Linus Torvalds2009-09-143-39/+30
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, highmem_32.c: Clean up comment x86, pgtable.h: Clean up types x86: Clean up dump_pagetable()
| * | | x86, highmem_32.c: Clean up commentFigo.zhang2009-06-291-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Figo.zhang <figo1802@gmail.com> Cc: Andrew Morton <akpm@osdl.org> LKML-Reference: <1246248175.5759.12.camel@myhost> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | x86, pgtable.h: Clean up typesFigo.zhang2009-06-291-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use "unsigned long" consistently, not "unsigned". Signed-off-by: Figo.zhang <figo1802@gmail.com> LKML-Reference: <1246183659.2530.4.camel@myhost> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | x86: Clean up dump_pagetable()Akinobu Mita2009-06-292-35/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use pgtable access helpers for 32-bit version dump_pagetable() and get rid of __typeof__() operators. This needs to make pmd_pfn() available for 2-level pgtable. Also, remove some casts for 64-bit version dump_pagetable(). Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> LKML-Reference: <20090627063514.GA2834@localhost.localdomain> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | | | Merge branch 'x86-kbuild-for-linus' of ↵Linus Torvalds2009-09-141-2/+2
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-kbuild-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: Simplify the Makefile in a minor way through use of cc-ifversion
| * | | | x86: Simplify the Makefile in a minor way through use of cc-ifversionFrans Pop2009-08-041-2/+2
| | |/ / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Frans Pop <elendil@planet.nl> Acked-by: Sam Ravnborg <sam@ravnborg.org> Reviewed-by: WANG Cong <xiyou.wangcong@gmail.com> LKML-Reference: <200907232056.28635.elendil@planet.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | | | Merge branch 'x86-fpu-for-linus' of ↵Linus Torvalds2009-09-144-33/+61
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86-64: move clts into batch cpu state updates when preloading fpu x86-64: move unlazy_fpu() into lazy cpu state part of context switch x86-32: make sure clts is batched during context switch x86: split out core __math_state_restore
| * | | | x86-64: move clts into batch cpu state updates when preloading fpuJeremy Fitzhardinge2009-06-171-9/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a task is likely to be using the fpu, we preload its state during the context switch, rather than waiting for it to run an fpu instruction. Make sure the clts() happens while we're doing batched fpu state updates to optimise paravirtualized context switches. [ Impact: optimise paravirtual FPU context switch ] Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Alok Kataria <akataria@vmware.com> Cc: Rusty Russell <rusty@rustcorp.com.au>
| * | | | x86-64: move unlazy_fpu() into lazy cpu state part of context switchJeremy Fitzhardinge2009-06-171-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Make sure that unlazy_fpu()'s stts gets batched along with the other cpu state changes during context switch. (32-bit already does this.) This makes sure it gets batched when running paravirtualized. [ Impact: optimise paravirtual FPU context switch ] Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Alok Kataria <akataria@vmware.com> Cc: Rusty Russell <rusty@rustcorp.com.au>
| * | | | x86-32: make sure clts is batched during context switchJeremy Fitzhardinge2009-06-171-11/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If we're preloading the fpu state during context switch, make sure the clts happens while we're batching the cpu context update, then do the actual __math_state_restore once the updates are flushed. This allows more efficient context switches when running paravirtualized, as all the hypercalls can be folded together into one. [ Impact: optimise paravirtual FPU context switch ] Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Alok Kataria <akataria@vmware.com> Cc: Rusty Russell <rusty@rustcorp.com.au>
| * | | | x86: split out core __math_state_restoreJeremy Fitzhardinge2009-06-172-10/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Split the core fpu state restoration out into __math_state_restore, which assumes that cr0.TS is clear and that the fpu context has been initialized. This will be used during context switch. There are two reasons this is desireable: - There's a small clarification. When __switch_to() calls math_state_restore, it relies on the fact that tsk_used_math() returns true, and so will never do a blocking init_fpu(). __math_state_restore() does not have (or need) that logic, so the question never arises. - It allows the clts() to be moved earler in __switch_to() so it can be performed while cpu context updates are batched (will be done in a later patch). [ Impact: refactor code to make reuse cleaner; no functional change ] Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Alok Kataria <akataria@vmware.com> Cc: Rusty Russell <rusty@rustcorp.com.au>
* | | | | Merge branch 'x86-debug-for-linus' of ↵Linus Torvalds2009-09-141-2/+2
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: Decrease the level of some NUMA messages to KERN_DEBUG
| * | | | | x86: Decrease the level of some NUMA messages to KERN_DEBUGRafael J. Wysocki2009-09-061-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some NUMA messages in srat_32.c are confusing to users, because they seem to indicate errors, while in fact they reflect normal behaviour. Decrease the level of these messages to KERN_DEBUG so that they don't show up unnecessarily. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> LKML-Reference: <200909050107.45175.rjw@sisk.pl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | | | | | Merge branch 'x86-cpu-for-linus' of ↵Linus Torvalds2009-09-1427-786/+1203
|\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (22 commits) x86: Fix code patching for paravirt-alternatives on 486 x86, msr: change msr-reg.o to obj-y, and export its symbols x86: Use hard_smp_processor_id() to get apic id for AMD K8 cpus x86, sched: Workaround broken sched domain creation for AMD Magny-Cours x86, mcheck: Use correct cpumask for shared bank4 x86, cacheinfo: Fixup L3 cache information for AMD multi-node processors x86: Fix CPU llc_shared_map information for AMD Magny-Cours x86, msr: Fix msr-reg.S compilation with gas 2.16.1, on 32-bit too x86: Move kernel_fpu_using to irq_fpu_usable in asm/i387.h x86, msr: fix msr-reg.S compilation with gas 2.16.1 x86, msr: Export the register-setting MSR functions via /dev/*/msr x86, msr: Create _on_cpu helpers for {rw,wr}msr_safe_regs() x86, msr: Have the _safe MSR functions return -EIO, not -EFAULT x86, msr: CFI annotations, cleanups for msr-reg.S x86, asm: Make _ASM_EXTABLE() usable from assembly code x86, asm: Add 32-bit versions of the combined CFI macros x86, AMD: Disable wrongly set X86_FEATURE_LAHF_LM CPUID bit x86, msr: Rewrite AMD rd/wrmsr variants x86, msr: Add rd/wrmsr interfaces with preset registers x86: add specific support for Intel Atom architecture ...
| * | | | | | x86: Fix code patching for paravirt-alternatives on 486Ben Hutchings2009-09-102-4/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As reported in <http://bugs.debian.org/511703> and <http://bugs.debian.org/515982>, kernels with paravirt-alternatives enabled crash in text_poke_early() on at least some 486-class processors. The problem is that text_poke_early() itself uses inline functions affected by paravirt-alternatives and so will modify instructions that have already been prefetched. Pentium and later processors will invalidate the prefetched instructions in this case, but 486-class processors do not. Change sync_core() to limit prefetching on 486-class (and 386-class) processors, and move the call to sync_core() above the call to the modifiable local_irq_restore(). Signed-off-by: Ben Hutchings <ben@decadent.org.uk> LKML-Reference: <1252547631.3423.134.camel@localhost> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
| * | | | | | x86, msr: change msr-reg.o to obj-y, and export its symbolsH. Peter Anvin2009-09-042-1/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Change msr-reg.o to obj-y (it will be included in virtually every kernel since it is used by the initialization code for AMD processors) and add a separate C file to export its symbols to modules, so that msr.ko can use them; on uniprocessors we bypass the helper functions in msr.o and use the accessor functions directly via inlines. Signed-off-by: H. Peter Anvin <hpa@zytor.com> LKML-Reference: <20090904140834.GA15789@elte.hu> Cc: Borislav Petkov <petkovbb@googlemail.com>
| * | | | | | x86: Use hard_smp_processor_id() to get apic id for AMD K8 cpusYinghai Lu2009-09-041-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Otherwise, system with apci id lifting will have wrong apicid in /proc/cpuinfo. and use that in srat_detect_node(). Signed-off-by: Yinghai Lu <yinghai@kernel.org> Cc: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Cyrill Gorcunov <gorcunov@openvz.org> LKML-Reference: <4A998CCA.1040407@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | | x86, sched: Workaround broken sched domain creation for AMD Magny-CoursAndreas Herrmann2009-09-031-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Current sched domain creation code can't handle multi-node processors. When switching to power_savings scheduling errors show up and system might hang later on (due to broken sched domain hierarchy): # echo 0 >> /sys/devices/system/cpu/sched_mc_power_savings CPU0 attaching sched-domain: domain 0: span 0-5 level MC groups: 0 1 2 3 4 5 domain 1: span 0-23 level NODE groups: 0-5 6-11 18-23 12-17 ... # echo 1 >> /sys/devices/system/cpu/sched_mc_power_savings CPU0 attaching sched-domain: domain 0: span 0-11 level MC groups: 0 1 2 3 4 5 6 7 8 9 10 11 ERROR: parent span is not a superset of domain->span domain 1: span 0-5 level CPU ERROR: domain->groups does not contain CPU0 groups: 6-11 (__cpu_power = 12288) ERROR: groups don't span domain->span domain 2: span 0-23 level NODE groups: ERROR: domain->cpu_power not set ERROR: groups don't span domain->span ... Fixing all aspects of power-savings scheduling for Magny-Cours needs some larger changes in the sched domain creation code. As a short-term and temporary workaround avoid the problems by extending "the worst possible hack" ;-( and always use llc_shared_map on AMD Magny-Cours when MC domain span is calculated. With this I get: # echo 1 >> /sys/devices/system/cpu/sched_mc_power_savings CPU0 attaching sched-domain: domain 0: span 0-5 level MC groups: 0 1 2 3 4 5 domain 1: span 0-5 level CPU groups: 0-5 (__cpu_power = 6144) domain 2: span 0-23 level NODE groups: 0-5 (__cpu_power = 6144) 6-11 (__cpu_power = 6144) 18-23 (__cpu_power = 6144) 12-17 (__cpu_power = 6144) ... I.e. no errors during sched domain creation, no system hangs, and also mc_power_savings scheduling works to a certain extend. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
| * | | | | | x86, mcheck: Use correct cpumask for shared bank4Andreas Herrmann2009-09-031-3/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This fixes threshold_bank4 support on multi-node processors. The correct mask to use is llc_shared_map, representing an internal node on Magny-Cours. We need to create 2 sets of symlinks for sibling shared banks -- one set for each internal node, symlinks of each set should target the first core on same internal node. Currently only one set is created where all symlinks are targeting the first core of the entire socket. Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
| * | | | | | x86, cacheinfo: Fixup L3 cache information for AMD multi-node processorsAndreas Herrmann2009-09-031-10/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | L3 cache size, associativity and shared_cpu information need to be adapted to show information for an internal node instead of the entire physical package. Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
| * | | | | | x86: Fix CPU llc_shared_map information for AMD Magny-CoursAndreas Herrmann2009-09-032-1/+64
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Construct entire NodeID and use it as cpu_llc_id. Thus internal node siblings are stored in llc_shared_map. Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
| * | | | | | x86, msr: Fix msr-reg.S compilation with gas 2.16.1, on 32-bit tooIngo Molnar2009-09-031-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The macro was defined in the 32-bit path as well - breaking the build on 32-bit platforms: arch/x86/lib/msr-reg.S: Assembler messages: arch/x86/lib/msr-reg.S:53: Error: Bad macro parameter list arch/x86/lib/msr-reg.S:100: Error: invalid character '_' in mnemonic arch/x86/lib/msr-reg.S:101: Error: invalid character '_' in mnemonic Cc: Borislav Petkov <petkovbb@googlemail.com> Cc: H. Peter Anvin <hpa@zytor.com> LKML-Reference: <tip-f6909f394c2d4a0a71320797df72d54c49c5927e@git.kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | | x86: Move kernel_fpu_using to irq_fpu_usable in asm/i387.hHuang Ying2009-09-012-12/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This function measures whether the FPU/SSE state can be touched in interrupt context. If the interrupted code is in user space or has no valid FPU/SSE context (CR0.TS == 1), FPU/SSE state can be used in IRQ or soft_irq context too. This is used by AES-NI accelerated AES implementation and PCLMULQDQ accelerated GHASH implementation. v3: - Renamed to irq_fpu_usable to reflect the purpose of the function. v2: - Renamed to irq_is_fpu_using to reflect the real situation. Signed-off-by: Huang Ying <ying.huang@intel.com> CC: H. Peter Anvin <hpa@zytor.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
| * | | | | | x86, msr: fix msr-reg.S compilation with gas 2.16.1H. Peter Anvin2009-09-011-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | msr-reg.S used the :req option on a macro argument, which wasn't supported by gas 2.16.1 (but apparently by some earlier versions of gas, just to be confusing.) It isn't necessary, so just remove it. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: Borislav Petkov <petkovbb@googlemail.com>
| * | | | | | Merge branch 'x86/paravirt' into x86/cpuIngo Molnar2009-09-012-711/+722
| |\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: arch/x86/include/asm/paravirt.h Manual merge: arch/x86/include/asm/paravirt_types.h Merge reason: x86/paravirt conflicts non-trivially with x86/cpu, resolve it. Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | | | x86/paravirt: split paravirt definitions into paravirt_types.hJeremy Fitzhardinge2009-06-172-710/+721
| | | |/ / / / | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Split the monolithic asm/paravirt.h into separate paravirt.h (inlines and other "active" definitions), and paravirt_types.h (types, constants and other "passive" definitions). This makes it easier to use the type/constant definitions without pulling in everything else and causing circular dependency problems. [ Impact: cleanup ] Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
| * | | | | | x86, msr: Export the register-setting MSR functions via /dev/*/msrH. Peter Anvin2009-08-313-2/+60
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Make it possible to access the all-register-setting/getting MSR functions via the MSR driver. This is implemented as an ioctl() on the standard MSR device node. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: Borislav Petkov <petkovbb@gmail.com>
| * | | | | | x86, msr: Create _on_cpu helpers for {rw,wr}msr_safe_regs()H. Peter Anvin2009-08-312-4/+63
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Create _on_cpu helpers for {rw,wr}msr_safe_regs() analogously with the other MSR functions. This will be necessary to add support for these to the MSR driver. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: Borislav Petkov <petkovbb@gmail.com>
| * | | | | | x86, msr: Have the _safe MSR functions return -EIO, not -EFAULTH. Peter Anvin2009-08-313-11/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For some reason, the _safe MSR functions returned -EFAULT, not -EIO. However, the only user which cares about the return code as anything other than a boolean is the MSR driver, which wants -EIO. Change it to -EIO across the board. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Chris Wright <chrisw@sous-sol.org> Cc: Alok Kataria <akataria@vmware.com> Cc: Rusty Russell <rusty@rustcorp.com.au>
| * | | | | | x86, msr: CFI annotations, cleanups for msr-reg.SH. Peter Anvin2009-08-311-38/+42
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add CFI annotations for native_{rd,wr}msr_safe_regs(). Simplify the 64-bit implementation: we don't allow the upper half registers to be set, and so we can use them to carry state across the operation. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: Borislav Petkov <petkovbb@gmail.com> LKML-Reference: <1251705011-18636-1-git-send-email-petkovbb@gmail.com>
| * | | | | | x86, asm: Make _ASM_EXTABLE() usable from assembly codeH. Peter Anvin2009-08-311-1/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We have had this convenient macro _ASM_EXTABLE() to generate exception table entry in inline assembly. Make it also usable for pure assembly. Signed-off-by: H. Peter Anvin <hpa@zytor.com>
| * | | | | | x86, asm: Add 32-bit versions of the combined CFI macrosH. Peter Anvin2009-08-311-1/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add 32-bit versions of the combined CFI macros, equivalent to the 64-bit ones except, obviously, operating on 32-bit stack words. Signed-off-by: H. Peter Anvin <hpa@zytor.com>
| * | | | | | x86, AMD: Disable wrongly set X86_FEATURE_LAHF_LM CPUID bitBorislav Petkov2009-08-311-1/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | fbd8b1819e80ac5a176d085fdddc3a34d1499318 turns off the bit for /proc/cpuinfo. However, a proper/full fix would be to additionally turn off the bit in the CPUID output so that future callers get correct CPU features info. Do that by basically reversing what the BIOS wrongfully does at boot. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> LKML-Reference: <1251705011-18636-3-git-send-email-petkovbb@gmail.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
| * | | | | | x86, msr: Rewrite AMD rd/wrmsr variantsBorislav Petkov2009-08-313-24/+41
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Switch them to native_{rd,wr}msr_safe_regs and remove pv_cpu_ops.read_msr_amd. Signed-off-by: Borislav Petkov <petkovbb@gmail.com> LKML-Reference: <1251705011-18636-2-git-send-email-petkovbb@gmail.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>