aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/edac/sb_edac.c
Commit message (Collapse)AuthorAgeFilesLines
* sb_edac: Fix erroneous bytes->gigabytes conversionJim Snow2015-08-071-19/+19
| | | | | | | | | | | | | | commit 8c009100295597f23978c224aec5751a365bc965 upstream. Signed-off-by: Jim Snow <jim.snow@intel.com> Signed-off-by: Lukasz Anaczkowski <lukasz.anaczkowski@intel.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: Vinson Lee <vlee@twopensource.com> [lizf: Backported to 3.4: - adjust context - use debugf0() instead of edac_dbg()] Signed-off-by: Zefan Li <lizefan@huawei.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* Fix sb_edac compilation with 32 bits kernelsMauro Carvalho Chehab2015-08-071-19/+29
| | | | | | | | | | | | | | | | | | | | commit 5b889e379f340f43b1252abde5d37a7084c846b9 upstream. As reported by Josh Boyer <jwboyer@redhat.com>: > drivers/edac/sb_edac.c: In function 'get_memory_error_data': > drivers/edac/sb_edac.c:861:2: warning: left shift count >= width of type > [enabled by default] > <snip> > ERROR: "__udivdi3" [drivers/edac/sb_edac.ko] undefined! > make[1]: *** [__modpost] Error 1 > make: *** [modules] Error 2 PS.: compile-tested only Reported-by: Josh Boyer <jwboyer@redhat.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com> [bwh: Prerequisite for "sb_edac: Fix erroneous bytes->gigabytes conversion"] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* sb_edac: Avoid overflow errors at memory size calculationMauro Carvalho Chehab2012-10-101-3/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit deb09ddaff1435f72dd598d38f9b58354c68a5ec upstream. Sandy bridge EDAC is calculating the memory size with overflow. Basically, the size field and the integer calculation is using 32 bits. More bits are needed, when the DIMM memories have high density. The net result is that memories are improperly reported there, when high-density DIMMs are used: EDAC DEBUG: in drivers/edac/sb_edac.c, line at 591: mc#0: channel 0, dimm 0, -16384 Mb (-4194304 pages) bank: 8, rank: 2, row: 0x10000, col: 0x800 EDAC DEBUG: in drivers/edac/sb_edac.c, line at 591: mc#0: channel 1, dimm 0, -16384 Mb (-4194304 pages) bank: 8, rank: 2, row: 0x10000, col: 0x800 As the number of pages value is handled at the EDAC core as unsigned ints, the driver shows the 16 GB memories at sysfs interface as 16760832 MB! The fix is simple: calculate the number of pages as unsigned 64-bits integer. After the patch, the memory size (16 GB) is properly detected: EDAC DEBUG: in drivers/edac/sb_edac.c, line at 592: mc#0: channel 0, dimm 0, 16384 Mb (4194304 pages) bank: 8, rank: 2, row: 0x10000, col: 0x800 EDAC DEBUG: in drivers/edac/sb_edac.c, line at 592: mc#0: channel 1, dimm 0, 16384 Mb (4194304 pages) bank: 8, rank: 2, row: 0x10000, col: 0x800 Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com> [bwh: Backported to 3.2: - Adjust context - Debug log function is debugf0(), not edac_dbg()] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* x86: Simplify code by removing a !SMP #ifdefs from 'struct cpuinfo_x86'Kevin Winchester2012-08-041-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 141168c36cdee3ff23d9c7700b0edc47cb65479f and commit 3f806e50981825fa56a7f1938f24c0680816be45 upstream. Several fields in struct cpuinfo_x86 were not defined for the !SMP case, likely to save space. However, those fields still have some meaning for UP, and keeping them allows some #ifdef removal from other files. The additional size of the UP kernel from this change is not significant enough to worry about keeping up the distinction: text data bss dec hex filename 4737168 506459 972040 6215667 5ed7f3 vmlinux.o.before 4737444 506459 972040 6215943 5ed907 vmlinux.o.after for a difference of 276 bytes for an example UP config. If someone wants those 276 bytes back badly then it should be implemented in a cleaner way. Signed-off-by: Kevin Winchester <kjwinchester@gmail.com> Cc: Steffen Persvold <sp@numascale.com> Link: http://lkml.kernel.org/r/1324428742-12498-1-git-send-email-kjwinchester@gmail.com Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* edac: avoid mce decoding crash after edac driver unloadedChen Gong2012-07-041-6/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit e35fca4791fcdd43dc1fd769797df40c562ab491 upstream. Some edac drivers register themselves as mce decoders via notifier_chain. But in current notifier_chain implementation logic, it doesn't accept same notifier registered twice. If so, it will be wrong when adding/removing the element from the list. For example, on one SandyBridge platform, remove module sb_edac and then trigger one error, it will hit oops because it has no mce decoder registered but related notifier_chain still points to an invalid callback function. Here is an example: Call Trace: [<ffffffff8150ef6a>] atomic_notifier_call_chain+0x1a/0x20 [<ffffffff8102b936>] mce_log+0x46/0x180 [<ffffffff8102eaea>] apei_mce_report_mem_error+0x4a/0x60 [<ffffffff812e19d2>] ghes_do_proc+0x192/0x210 [<ffffffff812e2066>] ghes_proc+0x46/0x70 [<ffffffff812e20d8>] ghes_notify_sci+0x48/0x80 [<ffffffff8150ef05>] notifier_call_chain+0x55/0x80 [<ffffffff81076f1a>] __blocking_notifier_call_chain+0x5a/0x80 [<ffffffff812aea11>] ? acpi_os_wait_events_complete+0x23/0x23 [<ffffffff81076f56>] blocking_notifier_call_chain+0x16/0x20 [<ffffffff812ddc4d>] acpi_hed_notify+0x19/0x1b [<ffffffff812b16bd>] acpi_device_notify+0x19/0x1b [<ffffffff812beb38>] acpi_ev_notify_dispatch+0x67/0x7f [<ffffffff812aea3a>] acpi_os_execute_deferred+0x29/0x36 [<ffffffff81069dc2>] process_one_work+0x132/0x450 [<ffffffff8106bbcb>] worker_thread+0x17b/0x3c0 [<ffffffff8106ba50>] ? manage_workers+0x120/0x120 [<ffffffff81070aee>] kthread+0x9e/0xb0 [<ffffffff81514724>] kernel_thread_helper+0x4/0x10 [<ffffffff81070a50>] ? kthread_freezable_should_stop+0x70/0x70 [<ffffffff81514720>] ? gs_change+0x13/0x13 Code: f3 49 89 d4 45 85 ed 4d 89 c6 48 8b 0f 74 48 48 85 c9 75 17 eb 41 0f 1f 80 00 00 00 00 41 83 ed 01 4c 89 f9 74 22 4d 85 ff 74 1d <4c> 8b 79 08 4c 89 e2 48 89 de 48 89 cf ff 11 4d 85 f6 74 04 41 RIP [<ffffffff8150eef6>] notifier_call_chain+0x46/0x80 RSP <ffff88042868fb20> CR2: ffffffffa01af838 ---[ end trace 0100930068e73e6f ]--- BUG: unable to handle kernel paging request at fffffffffffffff8 IP: [<ffffffff810705b0>] kthread_data+0x10/0x20 PGD 1a0d067 PUD 1a0e067 PMD 0 Oops: 0000 [#2] SMP Only i7core_edac and sb_edac have such issues because they have more than one memory controller which means they have to register mce decoder many times. Signed-off-by: Chen Gong <gong.chen@linux.intel.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com> [bwh: Backported to 3.2: drivers call atomic_notifier_chain_{,un}register() directly] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* EDAC: Fix incorrect edac mode reporting in sb_edacMark A. Grondona2011-11-011-3/+4
| | | | | | | | | | | | | | | | | | | | | | | The edac driver for Sandy Bridge was found to be reporting "FPM" for edac_mode, which clearly doesn't make sense. It was found that sb_edac.c:get_dimm_config was reusing a variable for both mem_type and edac_type, and thus was overwriting the value after setting it correctly. This patch fixes that issue. Before the patch: /sys/devices/system/edac/mc/mc0/csrow0/edac_mode:FPM /sys/devices/system/edac/mc/mc0/csrow1/edac_mode:FPM /sys/devices/system/edac/mc/mc0/csrow2/edac_mode:FPM /sys/devices/system/edac/mc/mc0/csrow3/edac_mode:FPM After: /sys/devices/system/edac/mc/mc0/csrow0/edac_mode:S4ECD4ED /sys/devices/system/edac/mc/mc0/csrow1/edac_mode:S4ECD4ED /sys/devices/system/edac/mc/mc0/csrow2/edac_mode:S4ECD4ED /sys/devices/system/edac/mc/mc0/csrow3/edac_mode:S4ECD4ED Signed-off-by: Mark A. Grondona <mgrondona@llnl.gov> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
* edac: sb_edac: Add it to the building systemMauro Carvalho Chehab2011-11-011-25/+23
| | | | | | | Some changes on it were required due to changeset cd90cc84c6bf0, that changed the glue with the MCE logic. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
* edac: Add an experimental new driver to support Sandy Bridge CPU'sMauro Carvalho Chehab2011-11-011-0/+1894
This driver is known to work on mine and Tony's test environments, using software error injection, and a partial hardware/software error injection tool. There's no broader range test yet to double check if the error decoding logic will actually point to the right DIMM, so use it with care. More tests are required to be sure that the driver will work on all different types of memory configurations. If you're willing to risk using it, I suggest you to enable EDAC debugs for your test machines, as the debug logs helps to track what's going inside the driver. Please feed me with bug reports, if you notice that the driver is miss-behaving. Tested-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>