aboutsummaryrefslogtreecommitdiffstats
path: root/arch/x86/kernel/cpu/mcheck/mce.c
Commit message (Collapse)AuthorAgeFilesLines
* MCE: Fix vm86 handling for 32bit mce handlerAndi Kleen2012-10-021-1/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | commit a129a7c84582629741e5fa6f40026efcd7a65bd4 upstream. When running on 32bit the mce handler could misinterpret vm86 mode as ring 0. This can affect whether it does recovery or not; it was possible to panic when recovery was actually possible. Fix this by always forcing vm86 to look like ring 3. [ Backport to 3.0 notes: Things changed there slightly: - move mce_get_rip() up. It fills up m->cs and m->ip values which are evaluated in mce_severity(). Therefore move it up right before the mce_severity call. This seem to be another bug in 3.0? - Place the backport (fix m->cs in V86 case) to where m->cs gets filled which is mce_get_rip() in 3.0 ] Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Thomas Renninger <trenn@suse.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* x86: Simplify code by removing a !SMP #ifdefs from 'struct cpuinfo_x86'Kevin Winchester2012-08-151-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 141168c36cdee3ff23d9c7700b0edc47cb65479f and commit 3f806e50981825fa56a7f1938f24c0680816be45 upstream. Several fields in struct cpuinfo_x86 were not defined for the !SMP case, likely to save space. However, those fields still have some meaning for UP, and keeping them allows some #ifdef removal from other files. The additional size of the UP kernel from this change is not significant enough to worry about keeping up the distinction: text data bss dec hex filename 4737168 506459 972040 6215667 5ed7f3 vmlinux.o.before 4737444 506459 972040 6215943 5ed907 vmlinux.o.after for a difference of 276 bytes for an example UP config. If someone wants those 276 bytes back badly then it should be implemented in a cleaner way. Signed-off-by: Kevin Winchester <kjwinchester@gmail.com> Cc: Steffen Persvold <sp@numascale.com> Link: http://lkml.kernel.org/r/1324428742-12498-1-git-send-email-kjwinchester@gmail.com Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* x86, mce: Drop the default decoding notifierBorislav Petkov2011-04-211-17/+7
| | | | | | | | | | | | | | | | | | | The default notifier doesn't make a lot of sense to call in the correctable errors case. Drop it and emit the mcelog decoding hint only in the uncorrectable errors case and when no notifier is registered. Also, limit issuing the "mcelog --ascii" message in the rare case when we dump unreported CEs before panicking. While at it, remove unused old x86_mce_decode_callback from the header. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Signed-off-by: Prarit Bhargava <prarit@redhat.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Nagananda Chumbalkar <Nagananda.Chumbalkar@hp.com> Cc: Russ Anderson <rja@sgi.com> Link: http://lkml.kernel.org/r/20110420102349.GB1361@aftab Signed-off-by: Ingo Molnar <mingo@elte.hu>
* x86, MCE: Do not taint when handling correctable errorsBorislav Petkov2011-04-191-1/+0
| | | | | | | | | | | | | | | | | | | | | Correctable errors are considered something rather normal on modern hardware these days. Even more importantly, correctable errors mean exactly that - they've been corrected by the hardware - and there's no need to taint the kernel since execution hasn't been compromised so far. Also, drop tainting in the thermal throttling code for a similar reason: crossing a thermal threshold does not mean corruption. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Acked-by: Tony Luck <tony.luck@intel.com> Acked-by: Nagananda Chumbalkar <Nagananda.Chumbalkar@hp.com> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Russ Anderson <rja@sgi.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/1303135222-17118-1-git-send-email-bp@amd64.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
* rcu: create new rcu_access_index() and use in mcePaul E. McKenney2011-04-011-1/+1
| | | | | | | | | | | The MCE subsystem needs to sample an RCU-protected index outside of any protection for that index. If this was a pointer, we would use rcu_access_pointer(), but there is no corresponding rcu_access_index(). This commit therefore creates an rcu_access_index() and applies it to MCE. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Tested-by: Zdenek Kabelac <zkabelac@redhat.com>
* x86: Use syscore_ops instead of sysdev classes and sysdevsRafael J. Wysocki2011-03-231-9/+12
| | | | | | | | | | | | | | | | | | | | | | | | | Some subsystems in the x86 tree need to carry out suspend/resume and shutdown operations with one CPU on-line and interrupts disabled and they define sysdev classes and sysdevs or sysdev drivers for this purpose. This leads to unnecessarily complicated code and excessive memory usage, so switch them to using struct syscore_ops objects for this purpose instead. Generally, there are three categories of subsystems that use sysdevs for implementing PM operations: (1) subsystems whose suspend/resume callbacks ignore their arguments entirely (the majority), (2) subsystems whose suspend/resume callbacks use their struct sys_device argument, but don't really need to do that, because they can be implemented differently in an arguably simpler way (io_apic.c), and (3) subsystems whose suspend/resume callbacks use their struct sys_device argument, but the value of that argument is always the same and could be ignored (microcode_core.c). In all of these cases the subsystems in question may be readily converted to using struct syscore_ops objects for power management and shutdown. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Ingo Molnar <mingo@elte.hu>
* x86: Fix common misspellingsLucas De Marchi2011-03-181-1/+1
| | | | | | | | | They were generated by 'codespell' and then manually reviewed. Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi> Cc: trivial@kernel.org LKML-Reference: <1300389856-1099-3-git-send-email-lucas.demarchi@profusion.mobi> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* x86: Replace uses of current_cpu_data with this_cpu opsTejun Heo2010-12-301-7/+7
| | | | | | | | | | | | | | | | | | | Replace all uses of current_cpu_data with this_cpu operations on the per cpu structure cpu_info. The scala accesses are replaced with the matching this_cpu ops which results in smaller and more efficient code. In the long run, it might be a good idea to remove cpu_data() macro too and use per_cpu macro directly. tj: updated description Cc: Yinghai Lu <yinghai@kernel.org> Cc: Ingo Molnar <mingo@elte.hu> Acked-by: H. Peter Anvin <hpa@zytor.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Christoph Lameter <cl@linux.com> Signed-off-by: Tejun Heo <tj@kernel.org>
* x86: Use this_cpu_ops to optimize codeTejun Heo2010-12-301-3/+3
| | | | | | | | | | | | | Go through x86 code and replace __get_cpu_var and get_cpu_var instances that refer to a scalar and are not used for address determinations. Cc: Yinghai Lu <yinghai@kernel.org> Cc: Ingo Molnar <mingo@elte.hu> Acked-by: Tejun Heo <tj@kernel.org> Acked-by: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Christoph Lameter <cl@linux.com> Signed-off-by: Tejun Heo <tj@kernel.org>
* llseek: automatically add .llseek fopArnd Bergmann2010-10-151-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | All file_operations should get a .llseek operation so we can make nonseekable_open the default for future file operations without a .llseek pointer. The three cases that we can automatically detect are no_llseek, seq_lseek and default_llseek. For cases where we can we can automatically prove that the file offset is always ignored, we use noop_llseek, which maintains the current behavior of not returning an error from a seek. New drivers should normally not use noop_llseek but instead use no_llseek and call nonseekable_open at open time. Existing drivers can be converted to do the same when the maintainer knows for certain that no user code relies on calling seek on the device file. The generated code is often incorrectly indented and right now contains comments that clarify for each added line why a specific variant was chosen. In the version that gets submitted upstream, the comments will be gone and I will manually fix the indentation, because there does not seem to be a way to do that using coccinelle. Some amount of new code is currently sitting in linux-next that should get the same modifications, which I will do at the end of the merge window. Many thanks to Julia Lawall for helping me learn to write a semantic patch that does all this. ===== begin semantic patch ===== // This adds an llseek= method to all file operations, // as a preparation for making no_llseek the default. // // The rules are // - use no_llseek explicitly if we do nonseekable_open // - use seq_lseek for sequential files // - use default_llseek if we know we access f_pos // - use noop_llseek if we know we don't access f_pos, // but we still want to allow users to call lseek // @ open1 exists @ identifier nested_open; @@ nested_open(...) { <+... nonseekable_open(...) ...+> } @ open exists@ identifier open_f; identifier i, f; identifier open1.nested_open; @@ int open_f(struct inode *i, struct file *f) { <+... ( nonseekable_open(...) | nested_open(...) ) ...+> } @ read disable optional_qualifier exists @ identifier read_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; expression E; identifier func; @@ ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off) { <+... ( *off = E | *off += E | func(..., off, ...) | E = *off ) ...+> } @ read_no_fpos disable optional_qualifier exists @ identifier read_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; @@ ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off) { ... when != off } @ write @ identifier write_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; expression E; identifier func; @@ ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off) { <+... ( *off = E | *off += E | func(..., off, ...) | E = *off ) ...+> } @ write_no_fpos @ identifier write_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; @@ ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off) { ... when != off } @ fops0 @ identifier fops; @@ struct file_operations fops = { ... }; @ has_llseek depends on fops0 @ identifier fops0.fops; identifier llseek_f; @@ struct file_operations fops = { ... .llseek = llseek_f, ... }; @ has_read depends on fops0 @ identifier fops0.fops; identifier read_f; @@ struct file_operations fops = { ... .read = read_f, ... }; @ has_write depends on fops0 @ identifier fops0.fops; identifier write_f; @@ struct file_operations fops = { ... .write = write_f, ... }; @ has_open depends on fops0 @ identifier fops0.fops; identifier open_f; @@ struct file_operations fops = { ... .open = open_f, ... }; // use no_llseek if we call nonseekable_open //////////////////////////////////////////// @ nonseekable1 depends on !has_llseek && has_open @ identifier fops0.fops; identifier nso ~= "nonseekable_open"; @@ struct file_operations fops = { ... .open = nso, ... +.llseek = no_llseek, /* nonseekable */ }; @ nonseekable2 depends on !has_llseek @ identifier fops0.fops; identifier open.open_f; @@ struct file_operations fops = { ... .open = open_f, ... +.llseek = no_llseek, /* open uses nonseekable */ }; // use seq_lseek for sequential files ///////////////////////////////////// @ seq depends on !has_llseek @ identifier fops0.fops; identifier sr ~= "seq_read"; @@ struct file_operations fops = { ... .read = sr, ... +.llseek = seq_lseek, /* we have seq_read */ }; // use default_llseek if there is a readdir /////////////////////////////////////////// @ fops1 depends on !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier readdir_e; @@ // any other fop is used that changes pos struct file_operations fops = { ... .readdir = readdir_e, ... +.llseek = default_llseek, /* readdir is present */ }; // use default_llseek if at least one of read/write touches f_pos ///////////////////////////////////////////////////////////////// @ fops2 depends on !fops1 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier read.read_f; @@ // read fops use offset struct file_operations fops = { ... .read = read_f, ... +.llseek = default_llseek, /* read accesses f_pos */ }; @ fops3 depends on !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier write.write_f; @@ // write fops use offset struct file_operations fops = { ... .write = write_f, ... + .llseek = default_llseek, /* write accesses f_pos */ }; // Use noop_llseek if neither read nor write accesses f_pos /////////////////////////////////////////////////////////// @ fops4 depends on !fops1 && !fops2 && !fops3 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier read_no_fpos.read_f; identifier write_no_fpos.write_f; @@ // write fops use offset struct file_operations fops = { ... .write = write_f, .read = read_f, ... +.llseek = noop_llseek, /* read and write both use no f_pos */ }; @ depends on has_write && !has_read && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier write_no_fpos.write_f; @@ struct file_operations fops = { ... .write = write_f, ... +.llseek = noop_llseek, /* write uses no f_pos */ }; @ depends on has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier read_no_fpos.read_f; @@ struct file_operations fops = { ... .read = read_f, ... +.llseek = noop_llseek, /* read uses no f_pos */ }; @ depends on !has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; @@ struct file_operations fops = { ... +.llseek = noop_llseek, /* no read or write fn */ }; ===== End semantic patch ===== Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Julia Lawall <julia@diku.dk> Cc: Christoph Hellwig <hch@infradead.org>
* Merge branch 'x86-mce-for-linus' of ↵Linus Torvalds2010-08-061-22/+10
|\ | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-mce-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, mce: Use HW_ERR in MCE handler x86, mce: Add HW_ERR printk prefix for hardware error logging x86, mce: Fix MSR_IA32_MCI_CTL2 CMCI threshold setup x86, mce: Rename MSR_IA32_MCx_CTL2 value
| * x86, mce: Use HW_ERR in MCE handlerHuang Ying2010-06-101-22/+10
| | | | | | | | | | | | | | | | | | | | Use HW_ERR printk prefix in MCE handler. To make it more explicit that this is hardware error instead of software error. Signed-off-by: Huang Ying <ying.huang@intel.com> LKML-Reference: <1275978939.3444.668.camel@yhuang-dev.sh.intel.com> Reviewed-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
* | Merge branch 'core-rcu-for-linus' of ↵Linus Torvalds2010-08-061-1/+1
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: Revert "net: Make accesses to ->br_port safe for sparse RCU" mce: convert to rcu_dereference_index_check() net: Make accesses to ->br_port safe for sparse RCU vfs: add fs.h to define struct file lockdep: Add an in_workqueue_context() lockdep-based test function rcu: add __rcu API for later sparse checking rcu: add an rcu_dereference_index_check() tree/tiny rcu: Add debug RCU head objects mm: remove all rcu head initializations fs: remove all rcu head initializations, except on_stack initializations powerpc: remove all rcu head initializations
| * | mce: convert to rcu_dereference_index_check()Paul E. McKenney2010-06-141-1/+1
| |/ | | | | | | | | | | | | | | | | | | | | | | | | The mce processing applies rcu_dereference_check() to integers used as array indices. This patch therefore moves mce to the new RCU API rcu_dereference_index_check() that avoids the sparse processing that would otherwise result in compiler errors. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com>
* | x86, mce: Notify about corrected events tooBorislav Petkov2010-08-031-0/+1
|/ | | | | | | | | Notify all parties registered on the mce decoder chain about logged correctable MCEs. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Acked-by: Doug Thompson <dougthompson@xmission.com> Acked-by: Ingo Molnar <mingo@elte.hu>
* Merge branch 'linux_next' of ↵Linus Torvalds2010-06-041-0/+10
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/i7core * 'linux_next' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/i7core: (83 commits) i7core_edac: Better describe the supported devices Add support for Westmere to i7core_edac driver i7core_edac: don't free on success i7core_edac: Add support for X5670 Always call i7core_[ur]dimm_check_mc_ecc_err i7core_edac: fix memory leak of i7core_dev EDAC: add __init to i7core_xeon_pci_fixup i7core_edac: Fix wrong device id for channel 1 devices i7core: add support for Lynnfield alternate address i7core_edac: Add initial support for Lynnfield i7core_edac: do not export static functions edac: fix i7core build edac: i7core_edac produces undefined behaviour on 32bit i7core_edac: Use a more generic approach for probing PCI devices i7core_edac: PCI device is called NONCORE, instead of NOCORE i7core_edac: Fix ringbuffer maxsize i7core_edac: First store, then increment i7core_edac: Better parse "any" addrmask i7core_edac: Use a lockless ringbuffer edac: Create an unique instance for each kobj ...
| * edac_mce: Add an interface driver to report mce errors via edacMauro Carvalho Chehab2010-05-101-0/+10
| | | | | | | | | | | | | | | | edac_mce module is an interface module that gets mcelog data and forwards to any registered edac module that expects to receive data via mce. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
* | Merge branch 'release' of ↵Linus Torvalds2010-05-281-11/+68
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6 * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: (27 commits) ACPI: Don't let acpi_pad needlessly mark TSC unstable drivers/acpi/sleep.h: Checkpatch cleanup ACPI: Minor cleanup eliminating redundant PMTIMER_TICKS to NS conversion ACPI: delete unused c-state promotion/demotion data strucutures ACPI: video: fix acpi_backlight=video ACPI: EC: Use kmemdup drivers/acpi: use kasprintf ACPI, APEI, EINJ injection parameters support Add x64 support to debugfs ACPI, APEI, Use ERST for persistent storage of MCE ACPI, APEI, Error Record Serialization Table (ERST) support ACPI, APEI, Generic Hardware Error Source memory error support ACPI, APEI, UEFI Common Platform Error Record (CPER) header Unified UUID/GUID definition ACPI Hardware Error Device (PNP0C33) support ACPI, APEI, PCIE AER, use general HEST table parsing in AER firmware_first setup ACPI, APEI, Document for APEI ACPI, APEI, EINJ support ACPI, APEI, HEST table parsing ACPI, APEI, APEI supporting infrastructure ...
| * | ACPI, APEI, Use ERST for persistent storage of MCEHuang Ying2010-05-191-11/+68
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Traditionally, fatal MCE will cause Linux print error log to console then reboot. Because MCE registers will preserve their content after warm reboot, the hardware error can be logged to disk or network after reboot. But system may fail to warm reboot, then you may lose the hardware error log. ERST can help here. Through saving the hardware error log into flash via ERST before go panic, the hardware error log can be gotten from the flash after system boot successful again. The fatal MCE processing procedure with ERST involved is as follow: - Hardware detect error, MCE raised - MCE read MCE registers, check error severity (fatal), prepare error record - Write MCE error record into flash via ERST - Go panic, then trigger system reboot - System reboot, /sbin/mcelog run, it reads /dev/mcelog to check flash for error record of previous boot via ERST, and output and clear them if available - /sbin/mcelog logs error records into disk or network ERST only accepts CPER record format, but there is no pre-defined CPER section can accommodate all information in struct mce, so a customized section type is defined to hold struct mce inside a CPER record as an error section. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
* | x86, asm: Introduce and use percpu_inc()Jan Beulich2010-04-281-2/+2
|/ | | | | | | | ... generating slightly smaller code. Signed-off-by: Jan Beulich <jbeulich@novell.com> LKML-Reference: <4BCF261F020000780003B33C@vpn.id2.novell.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
* include cleanup: Update gfp.h and slab.h includes to prepare for breaking ↵Tejun Heo2010-03-301-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
* x86/mce: Fix build bug with CONFIG_PROVE_LOCKING=y && CONFIG_X86_MCE_INTEL=yIngo Molnar2010-03-141-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit f56e8a076 "x86/mce: Fix RCU lockdep splats" introduced the following build bug: arch/x86/kernel/cpu/mcheck/mce.c: In function 'mce_log': arch/x86/kernel/cpu/mcheck/mce.c:166: error: 'mce_read_mutex' undeclared (first use in this function) arch/x86/kernel/cpu/mcheck/mce.c:166: error: (Each undeclared identifier is reported only once arch/x86/kernel/cpu/mcheck/mce.c:166: error: for each function it appears in.) Move the in-the-middle-of-file lock variable up to the variable definition section, the top of the .c file. Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: x86@kernel.org Cc: laijs@cn.fujitsu.com Cc: dipankar@in.ibm.com Cc: mathieu.desnoyers@polymtl.ca Cc: josh@joshtriplett.org Cc: dvhltc@us.ibm.com Cc: niv@us.ibm.com Cc: peterz@infradead.org Cc: rostedt@goodmis.org Cc: Valdis.Kletnieks@vt.edu Cc: dhowells@redhat.com LKML-Reference: <1267830207-9474-3-git-send-email-paulmck@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* Merge branch 'core-fixes-for-linus' of ↵Linus Torvalds2010-03-131-3/+8
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: locking: Make sparse work with inline spinlocks and rwlocks x86/mce: Fix RCU lockdep splats rcu: Increase RCU CPU stall timeouts if PROVE_RCU ftrace: Replace read_barrier_depends() with rcu_dereference_raw() rcu: Suppress RCU lockdep warnings during early boot rcu, ftrace: Fix RCU lockdep splat in ftrace_perf_buf_prepare() rcu: Suppress __mpol_dup() false positive from RCU lockdep rcu: Make rcu_read_lock_sched_held() handle !PREEMPT rcu: Add control variables to lockdep_rcu_dereference() diagnostics rcu, cgroup: Relax the check in task_subsys_state() as early boot is now handled by lockdep-RCU rcu: Use wrapper function instead of exporting tasklist_lock sched, rcu: Fix rcu_dereference() for RCU-lockdep rcu: Make task_subsys_state() RCU-lockdep checks handle boot-time use rcu: Fix holdoff for accelerated GPs for last non-dynticked CPU x86/gart: Unexport gart_iommu_aperture Fix trivial conflicts in kernel/trace/ftrace.c
| * x86/mce: Fix RCU lockdep splatsPaul E. McKenney2010-03-111-3/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Create an rcu_dereference_check_mce() that checks for RCU-sched read side and mce_read_mutex being held on update side. Replace uses of rcu_dereference() in arch/x86/kernel/cpu/mcheck/mce.c with this new macro. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: x86@kernel.org Cc: laijs@cn.fujitsu.com Cc: dipankar@in.ibm.com Cc: mathieu.desnoyers@polymtl.ca Cc: josh@joshtriplett.org Cc: dvhltc@us.ibm.com Cc: niv@us.ibm.com Cc: peterz@infradead.org Cc: rostedt@goodmis.org Cc: Valdis.Kletnieks@vt.edu Cc: dhowells@redhat.com LKML-Reference: <1267830207-9474-3-git-send-email-paulmck@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | sysfs: Use sysfs_attr_init and sysfs_bin_attr_init on dynamic attributesEric W. Biederman2010-03-071-0/+1
|/ | | | | | | | | | | | | | These are the non-static sysfs attributes that exist on my test machine. Fix them to use sysfs_attr_init or sysfs_bin_attr_init as appropriate. It simply requires making a sysfs attribute present to see this. So this is a little bit tedious but otherwise not too bad. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: WANG Cong <xiyou.wangcong@gmail.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* x86, mce: fix confusion between bank attributes and mce attributesHidetoshi Seto2009-12-081-1/+1
| | | | | | | | | | | Commit cebe182033f156b430952370fb0f9dbe6e89b081 had an unnecessary, wrong change: &mce_banks[i].attr is equivalent to the former bank_attrs[i], not to mce_attrs[i]. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Acked-by: Andi Kleen <andi@firstfloor.org> LKML-Reference: <4B1E05CC.4040703@jp.fujitsu.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
* x86/mce: Set up timer unconditionallyJan Beulich2009-12-081-1/+2
| | | | | | | | | | | | | | mce_timer must be passed to setup_timer() in all cases, no matter whether it is going to be actually used. Otherwise, when the CPU gets brought down, its call to del_timer_sync() will never return, as the timer won't have a base associated, and hence lock_timer_base() will loop infinitely. Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Cc: <stable@kernel.org> LKML-Reference: <4B1DB831.2030801@jp.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* Merge branch 'linus' into x86/urgentIngo Molnar2009-12-071-39/+64
|\ | | | | | | | | | | Merge reason: we want to queue up a dependent fix. Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * x86, mce: Add __cpuinit to hotplug callback functionsHidetoshi Seto2009-11-261-2/+3
| | | | | | | | | | | | | | | | | | | | | | The mce_disable_cpu() and mce_reenable_cpu() are called only from mce_cpu_callback() which is marked as __cpuinit. So these functions can be __cpuinit too. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Cc: Andi Kleen <ak@linux.intel.com> LKML-Reference: <4B0E3C4E.4090809@jp.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * x86, mce: Fix __init annotationsHidetoshi Seto2009-11-121-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The intel_init_thermal() is called from resume path, so it cannot be marked as __init. OTOH mce_banks_init() is only called from __mcheck_cpu_cap_init() which is marked as __cpuinit, so it can be also marked as __cpuinit. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Acked-by: Yong Wang <yong.y.wang@linux.intel.com> LKML-Reference: <4AFBB0B8.2070501@jp.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * x86: Under BIOS control, restore AP's APIC_LVTTHMR to the BSP valueYong Wang2009-11-101-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On platforms where the BIOS handles the thermal monitor interrupt, APIC_LVTTHMR on each logical CPU is programmed to generate a SMI and OS must not touch it. Unfortunately AP bringup sequence using INIT-SIPI-SIPI clears all the LVT entries except the mask bit. Essentially this results in all LVT entries including the thermal monitoring interrupt set to masked (clearing the bios programmed value for APIC_LVTTHMR). And this leads to kernel take over the thermal monitoring interrupt on AP's but not on BSP (leaving the bios programmed value only on BSP). As a result of this, we have seen system hangs when the thermal monitoring interrupt is generated. Fix this by reading the initial value of thermal LVT entry on BSP and if bios has taken over the control, then program the same value on all AP's and leave the thermal monitoring interrupt control on all the logical cpu's to the bios. Signed-off-by: Yong Wang <yong.y.wang@intel.com> Reviewed-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Borislav Petkov <borislav.petkov@amd.com> Cc: Arjan van de Ven <arjan@infradead.org> LKML-Reference: <20091110013824.GA24940@ywang-moblin2.bj.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: stable@kernel.org
| * x86, mce: Add a global MCE init helperBorislav Petkov2009-10-161-2/+8
| | | | | | | | | | | | | | | | | | | | Add an early initcall (pre SMP) which sets up global MCE functionality. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Cc: Andi Kleen <andi@firstfloor.org> LKML-Reference: <1255689093-26921-2-git-send-email-borislav.petkov@amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * x86, mce: Fix up MCE naming nomenclatureBorislav Petkov2009-10-161-26/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Prefix global/setup routines with "mcheck_" thus differentiating from the internal facilities prefixed with "mce_". Also, prefix the per cpu calls with mcheck_cpu and rename them to reflect the MCE setup hierarchy of calls better. There should be no functionality change resulting from this patch. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Cc: Andi Kleen <andi@firstfloor.org> LKML-Reference: <1255689093-26921-1-git-send-email-borislav.petkov@amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| *-. Merge branches 'x86/mce' and 'x86/urgent' into perf/mceIngo Molnar2009-10-161-10/+22
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | Merge reason: Put all MCE changes into this branch, we are queueing up a dependent patch. Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | mce, edac: Use an atomic notifier for MCEs decodingBorislav Petkov2009-10-121-9/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add an atomic notifier which ensures proper locking when conveying MCE info to EDAC for decoding. The actual notifier call overrides a default, negative priority notifier. Note: make sure we register the default decoder only once since mcheck_init() runs on each CPU. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> LKML-Reference: <20091003065752.GA8935@liondog.tnic> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | perf_event, x86, mce: Use TRACE_EVENT() for MCE loggingHidetoshi Seto2009-10-131-0/+6
| |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This approach is the first baby step towards solving many of the structural problems the x86 MCE logging code is having today: - It has a private ring-buffer implementation that has a number of limitations and has been historically fragile and buggy. - It is using a quirky /dev/mcelog ioctl driven ABI that is MCE specific. /dev/mcelog is not part of any larger logging framework and hence has remained on the fringes for many years. - The MCE logging code is still very unclean partly due to its ABI limitations. Fields are being reused for multiple purposes, and the whole message structure is limited and x86 specific to begin with. All in one, the x86 tree would like to move away from this private implementation of an event logging facility to a broader framework. By using perf events we gain the following advantages: - Multiple user-space agents can access MCE events. We can have an mcelog daemon running but also a system-wide tracer capturing important events in flight-recorder mode. - Sampling support: the kernel and the user-space call-chain of MCE events can be stored and analyzed as well. This way actual patterns of bad behavior can be matched to precisely what kind of activity happened in the kernel (and/or in the app) around that moment in time. - Coupling with other hardware and software events: the PMU can track a number of other anomalies - monitoring software might chose to monitor those plus the MCE events as well - in one coherent stream of events. - Discovery of MCE sources - tracepoints are enumerated and tools can act upon the existence (or non-existence) of various channels of MCE information. - Filtering support: we just subscribe to and act upon the events we are interested in. Then even on a per event source basis there's in-kernel filter expressions available that can restrict the amount of data that hits the event channel. - Arbitrary deep per cpu buffering of events - we can buffer 32 entries or we can buffer as much as we want, as long as we have the RAM. - An NMI-safe ring-buffer implementation - mappable to user-space. - Built-in support for timestamping of events, PID markers, CPU markers, etc. - A rich ABI accessible over system call interface. Per cpu, per task and per workload monitoring of MCE events can be done this way. The ABI itself has a nice, meaningful structure. - Extensible ABI: new fields can be added without breaking tooling. New tracepoints can be added as the hardware side evolves. There's various parsers that can be used. - Lots of scheduling/buffering/batching modes of operandi for MCE events. poll() support. mmap() support. read() support. You name it. - Rich tooling support: even without any MCE specific extensions added the 'perf' tool today offers various views of MCE data: perf report, perf stat, perf trace can all be used to view logged MCE events and perhaps correlate them to certain user-space usage patterns. But it can be used directly as well, for user-space agents and policy action in mcelog, etc. With this we hope to achieve significant code cleanup and feature improvements in the MCE code, and we hope to be able to drop the /dev/mcelog facility in the end. This patch is just a plain dumb dump of mce_log() records to the tracepoints / perf events framework - a first proof of concept step. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Cc: Huang Ying <ying.huang@intel.com> Cc: Andi Kleen <ak@linux.intel.com> LKML-Reference: <4AD42A0D.7050104@jp.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | | x86, mce: don't restart timer if disabledHidetoshi Seto2009-12-021-2/+4
| |/ |/| | | | | | | | | | | | | | | | | | | | | Even it is in error path unlikely taken, add_timer_on() at CPU_DOWN_FAILED* needs to be skipped if mce_timer is disabled. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Huang Ying <ying.huang@intel.com> Cc: Jan Beulich <jbeulich@novell.com> Cc: <stable@kernel.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
* | x86: Don't print number of MCE banks for every CPURoland Dreier2009-10-161-1/+2
|/ | | | | | | | | | | | | | | The MCE initialization code explicitly says it doesn't handle asymmetric configurations where different CPUs support different numbers of MCE banks, and it prints a big warning in that case. Therefore, printing the "mce: CPU supports <x> MCE banks" message into the kernel log for every CPU is pure redundancy that clutters the log significantly for systems with lots of CPUs. Signed-off-by: Roland Dreier <rolandd@cisco.com> LKML-Reference: <adaeip473qt.fsf@cisco.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* x86: EDAC: MCE: Fix MCE decoding callback logicIngo Molnar2009-10-021-22/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Make decoding of MCEs happen only on AMD hardware by registering a non-default callback only on CPU families which support it. While looking at the interaction of decode_mce() with the other MCE code i also noticed a few other things and made the following cleanups/fixes: - Fixed the mce_decode() weak alias - a weak alias is really not good here, it should be a proper callback. A weak alias will be overriden if a piece of code is built into the kernel - not good, obviously. - The patch initializes the callback on AMD family 10h and 11h. - Added the more correct fallback printk of: No support for human readable MCE decoding on this CPU type. Transcribe the message and run it through 'mcelog --ascii' to decode. On CPUs that dont have a decoder. - Made the surrounding code more readable. Note that the callback allows us to have a default fallback - without having to check the CPU versions during the printout itself. When an EDAC module registers itself, it can install the decode-print function. (there's no unregister needed as this is core code.) version -v2 by Borislav Petkov: - add K8 to the set of supported CPUs - always build in edac_mce_amd since we use an early_initcall now - fix checkpatch warnings Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andi Kleen <andi@firstfloor.org> LKML-Reference: <20091001141432.GA11410@aftab> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* Revert "x86, mce: do not compile mcelog message on AMD"Linus Torvalds2009-09-301-4/+1
| | | | | | | | | | | | This reverts commit 22223c9b417be5fd0ab2cf9ad17eb7bd1e19f7b9, as requested by Andi Kleen: "Obviously kernels compiled with AMD support can still run on non AMD systems, so messages like this can never be removed at compile time." Requsted-by: Andi Kleen <andi@firstfloor.org> Cc: Borislav Petkov <borislav.petkov@amd.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* x86: mce: Use safer ways to access MCE registersIngo Molnar2009-09-231-2/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use rdmsrl_safe() when accessing MCE registers. While in theory we always 'know' which ones are safe to access from the capability bits, there's a lot of hardware variations and reality might differ from theory, as it did in this case: http://bugzilla.kernel.org/show_bug.cgi?id=14204 [ 0.010016] mce: CPU supports 5 MCE banks [ 0.011029] general protection fault: 0000 [#1] [ 0.011998] last sysfs file: [ 0.011998] Modules linked in: [ 0.011998] [ 0.011998] Pid: 0, comm: swapper Not tainted (2.6.31_router #1) HP Vectra [ 0.011998] EIP: 0060:[<c100d9b9>] EFLAGS: 00010246 CPU: 0 [ 0.011998] EIP is at mce_rdmsrl+0x19/0x60 [ 0.011998] EAX: 00000000 EBX: 00000001 ECX: 00000407 EDX: 08000000 [ 0.011998] ESI: 00000000 EDI: 8c000000 EBP: 00000405 ESP: c17d5eac So WARN_ONCE() instead of crashing the box. ( also fix a number of stylistic inconsistencies in the code. ) Note, we might still crash in wrmsrl() if we get that far, but we shouldnt if the registers are truly inaccessible. Reported-by: GNUtoo <GNUtoo@no-log.org> Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Cc: Huang Ying <ying.huang@intel.com> Cc: Andi Kleen <ak@linux.intel.com> LKML-Reference: <bug-14204-5438@http.bugzilla.kernel.org/> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* Merge branch 'x86-mce-for-linus' of ↵Linus Torvalds2009-09-171-143/+161
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-mce-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (21 commits) x86, mce: Fix compilation with !CONFIG_DEBUG_FS in mce-severity.c x86, mce: CE in last bank prevents panic by unknown MCE x86, mce: Fake panic support for MCE testing x86, mce: Move debugfs mce dir creating to mce.c x86, mce: Support specifying raise mode for software MCE injection x86, mce: Support specifying context for software mce injection x86, mce: fix reporting of Thermal Monitoring mechanism enabled x86, mce: remove never executed code x86, mce: add missing __cpuinit tags x86, mce: fix "mce" boot option handling for CONFIG_X86_NEW_MCE x86, mce: don't log boot MCEs on Pentium M (model == 13) CPUs x86: mce: Lower maximum number of banks to architecture limit x86: mce: macros to compute banks MSRs x86: mce: Move per bank data in a single datastructure x86: mce: Move code in mce.c x86: mce: Rename CONFIG_X86_NEW_MCE to CONFIG_X86_MCE x86: mce: Remove old i386 machine check code x86: mce: Update X86_MCE description in x86/Kconfig x86: mce: Make CONFIG_X86_ANCIENT_MCE dependent on CONFIG_X86_MCE x86, mce: use atomic_inc_return() instead of add by 1 ... Manually fixed up trivial conflicts: Documentation/feature-removal-schedule.txt arch/x86/kernel/cpu/mcheck/mce.c
| * x86, mce: CE in last bank prevents panic by unknown MCEHidetoshi Seto2009-08-261-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If MCE handler is called but none of mces_seen have machine check event which might signal the MCE (i.e. event higher than MCE_KEEP_SEVERITY), panic with "Machine check from unknown source" will be taken since the MCE is assumed to be signaled from external agent or so. Usually mces_seen never point MCE_KEEP_SEVERITY event such as CE. But it can happen because initial value of mces_seen is accidentally modified by mce_no_way_out() - in case if mce_no_way_out() run through all banks and the last bank has the CE, mces_seen points the CE and the "panic by unknown" will not be taken. This patch fixes this undesired behavior, and clarifies the logic. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Dongming <jin.dongming@np.css.fujitsu.com> LKML-Reference: <4A94E244.3020301@jp.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Reported-by: Jin Dongming <jin.dongming@np.css.fujitsu.com>
| * x86, mce: Fake panic support for MCE testingHuang Ying2009-08-101-11/+64
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If "fake panic" mode is turned on, just log panic message instead of go real panic. This is used for testing only, so that the test suite can check for the correct panic message and do regression testing for MCE would go panic. This patch is based on x86-tip.git/mce. ChangeLog: v5: - Rebased on x86-tip.git/mce v4: - Move config file from sysfs to debugfs Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
| * x86, mce: Move debugfs mce dir creating to mce.cHuang Ying2009-08-101-0/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | Because more debugfs files under mce dir will be create in mce.c. ChangeLog: v5: - Rebased on x86-tip.git/mce Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
| * x86, mce: add missing __cpuinit tagsBartlomiej Zolnierkiewicz2009-07-291-2/+2
| | | | | | | | | | | | | | | | mce_cap_init() and mce_cpu_quirks() can be tagged with __cpuinit. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> Reviewed-by: Andi Kleen <andi@firstfloor.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
| * x86, mce: fix "mce" boot option handling for CONFIG_X86_NEW_MCEBartlomiej Zolnierkiewicz2009-07-291-1/+3
| | | | | | | | | | | | | | | | | | "mce argument mce ignored. Please use /sys" message shouldn't be printed when using "mce" boot option. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> Reviewed-by: Andi Kleen <andi@firstfloor.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
| * x86, mce: don't log boot MCEs on Pentium M (model == 13) CPUsBartlomiej Zolnierkiewicz2009-07-291-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On my legacy Pentium M laptop (Acer Extensa 2900) I get bogus MCE on a cold boot with CONFIG_X86_NEW_MCE enabled, i.e. (after decoding it with mcelog): MCE 0 HARDWARE ERROR. This is *NOT* a software problem! Please contact your hardware vendor CPU 0 BANK 1 MCG status: MCi status: Error overflow Uncorrected error Error enabled Processor context corrupt MCA: Data CACHE Level-1 UNKNOWN Error STATUS f200000000000195 MCGSTATUS 0 [ The other STATUS values observed: f2000000000001b5 (... UNKNOWN error) and f200000000000115 (... READ Error). To verify that this is not a CONFIG_X86_NEW_MCE bug I also modified the CONFIG_X86_OLD_MCE code (which doesn't log any MCEs) to dump content of STATUS MSR before it is cleared during initialization. ] Since the bogus MCE results in a kernel taint (which in turn disables lockdep support) don't log boot MCEs on Pentium M (model == 13) CPUs by default ("mce=bootlog" boot parameter can be be used to get the old behavior). Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> Reviewed-by: Andi Kleen <andi@firstfloor.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
| * x86: mce: macros to compute banks MSRsAndi Kleen2009-07-091-17/+17
| | | | | | | | | | | | | | | | | | | | Instead of open coded calculations for bank MSRs hide the indexing of higher banks MCE register MSRs in new macros. No semantic changes. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
| * x86: mce: Move per bank data in a single datastructureAndi Kleen2009-07-091-56/+53
| | | | | | | | | | | | | | | | | | | | | | | | | | This addresses one of the leftover review comments. Move the per bank data into a single structure. This avoids several separate variables and also separate allocation of sysfs objects. I didn't move the CMCI ownership information so far because that would have needed some non trivial changes in the algorithms. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>