aboutsummaryrefslogtreecommitdiffstats
path: root/arch/x86
Commit message (Collapse)AuthorAgeFilesLines
* [CPUFREQ] powernow-k8: Use a common exit path.Dave Jones2009-02-241-12/+10
| | | | | | | | a0abd520fd69295f4a3735e29a9448a32e101d47 introduced a slew of extra kfree/return -ENODEV pairs. This replaces them all with gotos. Signed-off-by: Dave Jones <davej@redhat.com>
* [CPUFREQ] Change link order of x86 cpufreq modulesMatthew Garrett2009-02-241-2/+6
| | | | | | | | Change the link order of the cpufreq modules to ensure that they're probed in the preferred order when statically linked in. Signed-off-by: Matthew Garrett <mjg@redhat.com> Signed-off-by: Dave Jones <davej@redhat.com>
* [CPUFREQ] Use swap() in longhaul.cDave Jones2009-02-241-7/+4
| | | | | | Remove hand-coded implementation of swap() Signed-off-by: Dave Jones <davej@redhat.com>
* [CPUFREQ] checkpatch cleanups for acpi-cpufreqDave Jones2009-02-241-18/+18
| | | | Signed-off-by: Dave Jones <davej@redhat.com>
* [CPUFREQ] powernow-k8: Only print error message once, not per core.Thomas Renninger2009-02-241-6/+15
| | | | | | | | | This is the typical message you get if you plug in a CPU which is newer than your BIOS. It's annoying seeing this message for each core. Signed-off-by: Thomas Renninger <trenn@suse.de> Signed-off-by: Dave Jones <davej@redhat.com>
* [CPUFREQ] powernow-k8: Always compile powernow-k8 driver with ACPI supportThomas Renninger2009-02-243-49/+3
| | | | | | | | | | | | | | | | | powernow-k8 driver should always try to get cpufreq info from ACPI. Otherwise it will not be able to detect the transition latency correctly which results in ondemand governor taking a wrong sampling rate which will then result in sever performance loss. Let the user not shoot himself in the foot and always compile in ACPI support for powernow-k8. This also fixes a wrong message if ACPI_PROCESSOR is compiled as a module and #ifndef CONFIG_ACPI_PROCESSOR path is chosen. Signed-off-by: Thomas Renninger <trenn@suse.de> Signed-off-by: Dave Jones <davej@redhat.com>
* [CPUFREQ] checkpatch cleanups for powernow-k8Dave Jones2009-02-241-125/+226
| | | | | | | This driver has so many long function names, and deep nested if's The remaining warnings will need some code restructuring to clean up. Signed-off-by: Dave Jones <davej@redhat.com>
* [CPUFREQ] checkpatch cleanups for powernow-k7Dave Jones2009-02-241-100/+137
| | | | | | | The asm/timer.h warning can be ignored, it's needed for recalibrate_cpu_khz() Signed-off-by: Dave Jones <davej@redhat.com>
* [CPUFREQ] checkpatch cleanups for speedstep related drivers.Dave Jones2009-02-245-185/+258
| | | | Signed-off-by: Dave Jones <davej@redhat.com>
* [CPUFREQ] checkpatch cleanups for sc520Dave Jones2009-02-241-13/+17
| | | | Signed-off-by: Dave Jones <davej@redhat.com>
* [CPUFREQ] checkpatch cleanups for powernow-k6Dave Jones2009-02-241-18/+26
| | | | Signed-off-by: Dave Jones <davej@redhat.com>
* [CPUFREQ] checkpatch cleanups for longrunDave Jones2009-02-241-11/+14
| | | | Signed-off-by: Dave Jones <davej@redhat.com>
* [CPUFREQ] checkpatch cleanups for longhaulDave Jones2009-02-242-94/+100
| | | | Signed-off-by: Dave Jones <davej@redhat.com>
* [CPUFREQ] checkpatch cleanups for gx-suspmodDave Jones2009-02-241-40/+65
| | | | Signed-off-by: Dave Jones <davej@redhat.com>
* [CPUFREQ] checkpatch cleanups for e_powersaverDave Jones2009-02-241-9/+12
| | | | Signed-off-by: Dave Jones <davej@redhat.com>
* [CPUFREQ] checkpatch cleanups for elanfreqDave Jones2009-02-241-2/+4
| | | | | | | The remaining warning about the simple_strtoul conversion to strict_strtoul seems kind of pointless to me. Signed-off-by: Dave Jones <davej@redhat.com>
* [CPUFREQ] nforce2: Use driver prefix, not cpufreq prefix.Dave Jones2009-02-241-11/+11
| | | | Signed-off-by: Dave Jones <davej@redhat.com>
* [CPUFREQ] checkpatch cleanups for cpufreq-nforce2Dave Jones2009-02-241-17/+21
| | | | Signed-off-by: Dave Jones <davej@redhat.com>
* [CPUFREQ] Stupidly trivial CodingStyle fixDave Jones2009-02-241-1/+1
| | | | | | | | GNU indent complains about this being ambiguous, because it's dumb. One of my automated tests relies on the output of indent, so this shuts it up. Signed-off-by: Dave Jones <davej@redhat.com>
* PM: Split up sysdev_[suspend|resume] from device_power_[down|up]Rafael J. Wysocki2009-02-221-0/+4
| | | | | | | | | | | Move the sysdev_suspend/resume from the callee to the callers, with no real change in semantics, so that we can rework the disabling of interrupts during suspend/hibernation. This is based on an earlier patch from Linus. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* x86: Add IRQF_TIMER to legacy x86 timer interrupt descriptorsLinus Torvalds2009-02-224-4/+4
| | | | | | | | | | | | | | Right now nobody cares, but the suspend/resume code will eventually want to suspend device interrupts without suspending the timer, and will depend on this flag to know. The modern x86 timer infrastructure uses the local APIC timers and never shows up as a device interrupt at all, so it isn't affected and doesn't need any of this. Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* x86_64: Fix S3 fail pathJiri Slaby2009-02-211-1/+3
| | | | | | | | | | | | | | | As acpi_enter_sleep_state can fail, take this into account in do_suspend_lowlevel and don't return to the do_suspend_lowlevel's caller. This would break (currently) fpu status and preempt count. Technically, this means use `call' instead of `jmp' and `jmp' to the `resume_point' after the `call' (i.e. if acpi_enter_sleep_state returns=fails). `resume_point' will handle the restore of fpu and preempt count gracefully. Signed-off-by: Jiri Slaby <jirislaby@gmail.com> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Len Brown <len.brown@intel.com>
* x86_64: acpi/wakeup_64 cleanupJiri Slaby2009-02-211-19/+7
| | | | | | | | | | | | | | | | - remove %ds re-set, it's already set in wakeup_long64 - remove double labels and alignment (ENTRY already adds both) - use meaningful resume point labelname - skip alignment while jumping from wakeup_long64 to the resume point - remove .size, .type and unused labels [v2] - added ENDPROCs Signed-off-by: Jiri Slaby <jirislaby@gmail.com> Acked-by: Cyrill Gorcunov <gorcunov@openvz.org> Acked-by: Pavel Machek <pavel@suse.cz> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Len Brown <len.brown@intel.com>
* x86, mce: remove incorrect __cpuinit for mce_cpu_features()H. Peter Anvin2009-02-203-4/+4
| | | | | | | | | | | | | | | | | | | | Impact: Bug fix on UP Checkin 6ec68bff3c81e776a455f6aca95c8c5f1d630198: x86, mce: reinitialize per cpu features on resume introduced a call to mce_cpu_features() in the resume path, in order for the MCE machinery to get properly reinitialized after a resume. However, this function (and its successors) was flagged __cpuinit, which becomes __init on UP configurations (on SMP suspend/resume requires CPU hotplug and so this would not be seen.) Remove the offending __cpuinit annotations for mce_cpu_features() and its successor functions. Cc: Andi Kleen <ak@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
* x86: use the right protections for split-up pagetablesIngo Molnar2009-02-201-10/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Steven Rostedt found a bug in where in his modified kernel ftrace was unable to modify the kernel text, due to the PMD itself having been marked read-only as well in split_large_page(). The fix, suggested by Linus, is to not try to 'clone' the reference protection of a huge-page, but to use the standard (and permissive) page protection bits of KERNPG_TABLE. The 'cloning' makes sense for the ptes but it's a confused and incorrect concept at the page table level - because the pagetable entry is a set of all ptes and hence cannot 'clone' any single protection attribute - the ptes can be any mixture of protections. With the permissive KERNPG_TABLE, even if the pte protections get changed after this point (due to ftrace doing code-patching or other similar activities like kprobes), the resulting combined protections will still be correct and the pte's restrictive (or permissive) protections will control it. Also update the comment. This bug was there for a long time but has not caused visible problems before as it needs a rather large read-only area to trigger. Steve possibly hacked his kernel with some really large arrays or so. Anyway, the bug is definitely worth fixing. [ Huang Ying also experienced problems in this area when writing the EFI code, but the real bug in split_large_page() was not realized back then. ] Reported-by: Steven Rostedt <rostedt@goodmis.org> Reported-by: Huang Ying <ying.huang@intel.com> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* x86, vmi: TSC going backwards check in vmi clocksourceAlok N Kataria2009-02-201-1/+4
| | | | | | | | | | | | Impact: fix time warps under vmware Similar to the check for TSC going backwards in the TSC clocksource, we also need this check for VMI clocksource. Signed-off-by: Alok N Kataria <akataria@vmware.com> Cc: Zachary Amsden <zach@vmware.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: stable@kernel.org
* Merge branch 'x86-fixes-for-linus' of ↵Linus Torvalds2009-02-193-6/+4
|\ | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, mce: fix ifdef for 64bit thermal apic vector clear on shutdown x86, mce: use force_sig_info to kill process in machine check x86, mce: reinitialize per cpu features on resume x86, rcu: fix strange load average and ksoftirqd behavior
| * x86, mce: fix ifdef for 64bit thermal apic vector clear on shutdownAndi Kleen2009-02-171-1/+1
| | | | | | | | | | | | | | | | | | | | | | Impact: Bugfix The ifdef for the apic clear on shutdown for the 64bit intel thermal vector was incorrect and never triggered. Fix that. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
| * x86, mce: use force_sig_info to kill process in machine checkAndi Kleen2009-02-171-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Impact: bug fix (with tolerant == 3) do_exit cannot be called directly from the exception handler because it can sleep and the exception handler runs on the exception stack. Use force_sig() instead. Based on a earlier patch by Ying Huang who debugged the problem. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
| * x86, mce: reinitialize per cpu features on resumeAndi Kleen2009-02-171-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Impact: Bug fix This fixes a long standing bug in the machine check code. On resume the boot CPU wouldn't get its vendor specific state like thermal handling reinitialized. This means the boot cpu wouldn't ever get any thermal events reported again. Call the respective initialization functions on resume v2: Remove ancient init because they don't have a resume device anyways. Pointed out by Thomas Gleixner. v3: Now fix the Subject too to reflect v2 change Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
| * x86, rcu: fix strange load average and ksoftirqd behaviorPaul E. McKenney2009-02-171-3/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Damien Wyart reported high ksoftirqd CPU usage (20%) on an otherwise idle system. The function-graph trace Damien provided: > 799.521187 | 1) <idle>-0 | | rcu_check_callbacks() { > 799.521371 | 1) <idle>-0 | | rcu_check_callbacks() { > 799.521555 | 1) <idle>-0 | | rcu_check_callbacks() { > 799.521738 | 1) <idle>-0 | | rcu_check_callbacks() { > 799.521934 | 1) <idle>-0 | | rcu_check_callbacks() { > 799.522068 | 1) ksoftir-2324 | | rcu_check_callbacks() { > 799.522208 | 1) <idle>-0 | | rcu_check_callbacks() { > 799.522392 | 1) <idle>-0 | | rcu_check_callbacks() { > 799.522575 | 1) <idle>-0 | | rcu_check_callbacks() { > 799.522759 | 1) <idle>-0 | | rcu_check_callbacks() { > 799.522956 | 1) <idle>-0 | | rcu_check_callbacks() { > 799.523074 | 1) ksoftir-2324 | | rcu_check_callbacks() { > 799.523214 | 1) <idle>-0 | | rcu_check_callbacks() { > 799.523397 | 1) <idle>-0 | | rcu_check_callbacks() { > 799.523579 | 1) <idle>-0 | | rcu_check_callbacks() { > 799.523762 | 1) <idle>-0 | | rcu_check_callbacks() { > 799.523960 | 1) <idle>-0 | | rcu_check_callbacks() { > 799.524079 | 1) ksoftir-2324 | | rcu_check_callbacks() { > 799.524220 | 1) <idle>-0 | | rcu_check_callbacks() { > 799.524403 | 1) <idle>-0 | | rcu_check_callbacks() { > 799.524587 | 1) <idle>-0 | | rcu_check_callbacks() { > 799.524770 | 1) <idle>-0 | | rcu_check_callbacks() { > [ . . . ] Shows rcu_check_callbacks() being invoked way too often. It should be called once per jiffy, and here it is called no less than 22 times in about 3.5 milliseconds, meaning one call every 160 microseconds or so. Why do we need to call rcu_pending() and rcu_check_callbacks() from the idle loop of 32-bit x86, especially given that no other architecture does this? The following patch removes the call to rcu_pending() and rcu_check_callbacks() from the x86 32-bit idle loop in order to reduce the softirq load on idle systems. Reported-by: Damien Wyart <damien.wyart@free.fr> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | mm: clean up for early_pfn_to_nid()KAMEZAWA Hiroyuki2009-02-183-5/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | What's happening is that the assertion in mm/page_alloc.c:move_freepages() is triggering: BUG_ON(page_zone(start_page) != page_zone(end_page)); Once I knew this is what was happening, I added some annotations: if (unlikely(page_zone(start_page) != page_zone(end_page))) { printk(KERN_ERR "move_freepages: Bogus zones: " "start_page[%p] end_page[%p] zone[%p]\n", start_page, end_page, zone); printk(KERN_ERR "move_freepages: " "start_zone[%p] end_zone[%p]\n", page_zone(start_page), page_zone(end_page)); printk(KERN_ERR "move_freepages: " "start_pfn[0x%lx] end_pfn[0x%lx]\n", page_to_pfn(start_page), page_to_pfn(end_page)); printk(KERN_ERR "move_freepages: " "start_nid[%d] end_nid[%d]\n", page_to_nid(start_page), page_to_nid(end_page)); ... And here's what I got: move_freepages: Bogus zones: start_page[2207d0000] end_page[2207dffc0] zone[fffff8103effcb00] move_freepages: start_zone[fffff8103effcb00] end_zone[fffff8003fffeb00] move_freepages: start_pfn[0x81f600] end_pfn[0x81f7ff] move_freepages: start_nid[1] end_nid[0] My memory layout on this box is: [ 0.000000] Zone PFN ranges: [ 0.000000] Normal 0x00000000 -> 0x0081ff5d [ 0.000000] Movable zone start PFN for each node [ 0.000000] early_node_map[8] active PFN ranges [ 0.000000] 0: 0x00000000 -> 0x00020000 [ 0.000000] 1: 0x00800000 -> 0x0081f7ff [ 0.000000] 1: 0x0081f800 -> 0x0081fe50 [ 0.000000] 1: 0x0081fed1 -> 0x0081fed8 [ 0.000000] 1: 0x0081feda -> 0x0081fedb [ 0.000000] 1: 0x0081fedd -> 0x0081fee5 [ 0.000000] 1: 0x0081fee7 -> 0x0081ff51 [ 0.000000] 1: 0x0081ff59 -> 0x0081ff5d So it's a block move in that 0x81f600-->0x81f7ff region which triggers the problem. This patch: Declaration of early_pfn_to_nid() is scattered over per-arch include files, and it seems it's complicated to know when the declaration is used. I think it makes fix-for-memmap-init not easy. This patch moves all declaration to include/linux/mm.h After this, if !CONFIG_NODES_POPULATES_NODE_MAP && !CONFIG_HAVE_ARCH_EARLY_PFN_TO_NID -> Use static definition in include/linux/mm.h else if !CONFIG_HAVE_ARCH_EARLY_PFN_TO_NID -> Use generic definition in mm/page_alloc.c else -> per-arch back end function will be called. Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Tested-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Reported-by: David Miller <davem@davemlloft.net> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: <stable@kernel.org> [2.6.25.x, 2.6.26.x, 2.6.27.x, 2.6.28.x] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | Merge branch 'tracing-fixes-for-linus' of ↵Linus Torvalds2009-02-171-22/+2
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: doc: mmiotrace.txt, buffer size control change trace: mmiotrace to the tracer menu in Kconfig mmiotrace: count events lost due to not recording
| * | trace: mmiotrace to the tracer menu in KconfigPekka Paalanen2009-02-151-22/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Impact: cosmetic change in Kconfig menu layout This patch was originally suggested by Peter Zijlstra, but seems it was forgotten. CONFIG_MMIOTRACE and CONFIG_MMIOTRACE_TEST were selectable directly under the Kernel hacking / debugging menu in the kernel configuration system. They were present only for x86 and x86_64. Other tracers that use the ftrace tracing framework are in their own sub-menu. This patch moves the mmiotrace configuration options there. Since the Kconfig file, where the tracer menu is, is not architecture specific, HAVE_MMIOTRACE_SUPPORT is introduced and provided only by x86/x86_64. CONFIG_MMIOTRACE now depends on it. Signed-off-by: Pekka Paalanen <pq@iki.fi> Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | | Merge branch 'x86-fixes-for-linus' of ↵Linus Torvalds2009-02-1710-82/+109
|\ \ \ | | |/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, vm86: fix preemption bug x86, olpc: fix model detection without OFW x86, hpet: fix for LS21 + HPET = boot hang x86: CPA avoid repeated lazy mmu flush x86: warn if arch_flush_lazy_mmu_cpu is called in preemptible context x86/paravirt: make arch_flush_lazy_mmu/cpu disable preemption x86, pat: fix warn_on_once() while mapping 0-1MB range with /dev/mem x86/cpa: make sure cpa is safe to call in lazy mmu mode x86, ptrace, mm: fix double-free on race
| * | x86, vm86: fix preemption bugThomas Gleixner2009-02-151-1/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 3d2a71a596bd9c761c8487a2178e95f8a61da083 ("x86, traps: converge do_debug handlers") changed the preemption disable logic of do_debug() so vm86_handle_trap() is called with preemption disabled resulting in: BUG: sleeping function called from invalid context at include/linux/kernel.h:155 in_atomic(): 1, irqs_disabled(): 0, pid: 3005, name: dosemu.bin Pid: 3005, comm: dosemu.bin Tainted: G W 2.6.29-rc1 #51 Call Trace: [<c050d669>] copy_to_user+0x33/0x108 [<c04181f4>] save_v86_state+0x65/0x149 [<c0418531>] handle_vm86_trap+0x20/0x8f [<c064e345>] do_debug+0x15b/0x1a4 [<c064df1f>] debug_stack_correct+0x27/0x2c [<c040365b>] sysenter_do_call+0x12/0x2f BUG: scheduling while atomic: dosemu.bin/3005/0x10000001 Restore the original calling convention and reenable preemption before calling handle_vm86_trap(). Reported-by: Michal Suchanek <hramrach@centrum.cz> Cc: stable@kernel.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | x86, olpc: fix model detection without OFWChris Ball2009-02-141-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Impact: fix "garbled display, laptop is unusable" bug Commit e51a1ac2dfca9ad869471e88f828281db7e810c0 ("x86, olpc: fix endian bug in openfirmware workaround") breaks model comparison on OLPC; the value 0xc2 needs to be scaled up by olpc_board(). The pre-patch version was wrong, but accidentally worked anyway (big-endian 0xc2 is big enough to satisfy all other board revisions, but little endian 0xc2 is not). Signed-off-by: Chris Ball <cjb@laptop.org> Cc: Andrew Morton <akpm@linux-foundation.org> Acked-by: Andres Salomon <dilinger@queued.net> Cc: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | x86, hpet: fix for LS21 + HPET = boot hangjohn stultz2009-02-131-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Between 2.6.23 and 2.6.24-rc1 a change was made that broke IBM LS21 systems that had the HPET enabled in the BIOS, resulting in boot hangs for x86_64. Specifically commit b8ce33590687888ebb900d09557b8807c4539022, which merges the i386 and x86_64 HPET code. Prior to this commit, when we setup the HPET timers in x86_64, we did the following: hpet_writel(HPET_TN_ENABLE | HPET_TN_PERIODIC | HPET_TN_SETVAL | HPET_TN_32BIT, HPET_T0_CFG); However after the i386/x86_64 HPET merge, we do the following: cfg = hpet_readl(HPET_Tn_CFG(timer)); cfg |= HPET_TN_ENABLE | HPET_TN_PERIODIC | HPET_TN_SETVAL | HPET_TN_32BIT; hpet_writel(cfg, HPET_Tn_CFG(timer)); However on LS21s with HPET enabled in the BIOS, the HPET_T0_CFG register boots with Level triggered interrupts (HPET_TN_LEVEL) enabled. This causes the periodic interrupt to be not so periodic, and that results in the boot time hang I reported earlier in the delay calibration. My fix: Always disable HPET_TN_LEVEL when setting up periodic mode. Signed-off-by: John Stultz <johnstul@us.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | x86: CPA avoid repeated lazy mmu flushThomas Gleixner2009-02-121-8/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | Impact: Flush the lazy MMU only once Pending mmu updates only need to be flushed once to bring the in-memory pagetable state up to date. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | x86: warn if arch_flush_lazy_mmu_cpu is called in preemptible contextThomas Gleixner2009-02-121-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Impact: Catch cases where lazy MMU state is active in a preemtible context arch_flush_lazy_mmu_cpu() has been changed to disable preemption so the checks in enter/leave will never trigger. Put the preemtible() check into arch_flush_lazy_mmu_cpu() to catch such cases. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | x86/paravirt: make arch_flush_lazy_mmu/cpu disable preemptionJeremy Fitzhardinge2009-02-122-15/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Impact: avoid access to percpu vars in preempible context They are intended to be used whenever there's the possibility that there's some stale state which is going to be overwritten with a queued update, or to force a state change when we may be in lazy mode. Either way, we could end up calling it with preemption enabled, so wrap the functions in their own little preempt-disable section so they can be safely called in any context (though preemption should never be enabled if we're actually in a lazy state). (Move out of line to avoid #include dependencies.) Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | x86, pat: fix warn_on_once() while mapping 0-1MB range with /dev/memSuresh Siddha2009-02-123-58/+45
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Jeff Mahoney reported: > With Suse's hwinfo tool, on -tip: > WARNING: at arch/x86/mm/pat.c:637 reserve_pfn_range+0x5b/0x26d() reserve_pfn_range() is not tracking the memory range below 1MB as non-RAM and as such is inconsistent with similar checks in reserve_memtype() and free_memtype() Rename the pagerange_is_ram() to pat_pagerange_is_ram() and add the "track legacy 1MB region as non RAM" condition. And also, fix reserve_pfn_range() to return -EINVAL, when the pfn range is RAM. This is to be consistent with this API design. Reported-and-tested-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | x86/cpa: make sure cpa is safe to call in lazy mmu modeJeremy Fitzhardinge2009-02-121-0/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Impact: fix race leading to crash under KVM and Xen The CPA code may be called while we're in lazy mmu update mode - for example, when using DEBUG_PAGE_ALLOC and doing a slab allocation in an interrupt handler which interrupted a lazy mmu update. In this case, the in-memory pagetable state may be out of date due to pending queued updates. We need to flush any pending updates before inspecting the page table. Similarly, we must explicitly flush any modifications CPA may have made (which comes down to flushing queued operations when flushing the TLB). Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Acked-by: Marcelo Tosatti <mtosatti@redhat.com> Cc: Stable Kernel <stable@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | x86, ptrace, mm: fix double-free on raceMarkus Metzger2009-02-111-6/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Ptrace_detach() races with __ptrace_unlink() if the traced task is reaped while detaching. This might cause a double-free of the BTS buffer. Change the ptrace_detach() path to only do the memory accounting in ptrace_bts_detach() and leave the buffer free to ptrace_bts_untrace() which will be called from __ptrace_unlink(). The fix follows a proposal from Oleg Nesterov. Reported-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Markus Metzger <markus.t.metzger@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | | Merge branch 'kvm-updates/2.6.29' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds2009-02-1710-70/+40
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'kvm-updates/2.6.29' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: VMX: Flush volatile msrs before emulating rdmsr KVM: Fix assigned devices circular locking dependency KVM: x86: fix LAPIC pending count calculation KVM: Fix INTx for device assignment KVM: MMU: Map device MMIO as UC in EPT KVM: x86: disable kvmclock on non constant TSC hosts KVM: PIT: fix i8254 pending count read KVM: Fix racy in kvm_free_assigned_irq KVM: Add kvm_arch_sync_events to sync with asynchronize events KVM: mmu_notifiers release method KVM: Avoid using CONFIG_ in userspace visible headers KVM: ia64: fix fp fault/trap handler
| * | | KVM: VMX: Flush volatile msrs before emulating rdmsrAvi Kivity2009-02-151-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some msrs (notable MSR_KERNEL_GS_BASE) are held in the processor registers and need to be flushed to the vcpu struture before they can be read. This fixes cygwin longjmp() failure on Windows x64. Signed-off-by: Avi Kivity <avi@redhat.com>
| * | | KVM: x86: fix LAPIC pending count calculationMarcelo Tosatti2009-02-156-63/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Simplify LAPIC TMCCT calculation by using hrtimer provided function to query remaining time until expiration. Fixes host hang with nested ESX. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
| * | | KVM: MMU: Map device MMIO as UC in EPTSheng Yang2009-02-152-4/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Software are not allow to access device MMIO using cacheable memory type, the patch limit MMIO region with UC and WC(guest can select WC using PAT and PCD/PWT). Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
| * | | KVM: x86: disable kvmclock on non constant TSC hostsMarcelo Tosatti2009-02-151-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is better. Currently, this code path is posing us big troubles, and we won't have a decent patch in time. So, temporarily disable it. Signed-off-by: Glauber Costa <glommer@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
| * | | KVM: PIT: fix i8254 pending count readMarcelo Tosatti2009-02-151-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | count_load_time assignment is bogus: its supposed to contain what it means, not the expiration time. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>