aboutsummaryrefslogtreecommitdiffstats
path: root/include/trace
Commit message (Collapse)AuthorAgeFilesLines
* jbd2: issue cache flush after checkpointing even with internal journalJan Kara2015-08-121-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 79feb521a44705262d15cc819a4117a447b11ea7 upstream. When we reach jbd2_cleanup_journal_tail(), there is no guarantee that checkpointed buffers are on a stable storage - especially if buffers were written out by jbd2_log_do_checkpoint(), they are likely to be only in disk's caches. Thus when we update journal superblock effectively removing old transaction from journal, this write of superblock can get to stable storage before those checkpointed buffers which can result in filesystem corruption after a crash. Thus we must unconditionally issue a cache flush before we update journal superblock in these cases. A similar problem can also occur if journal superblock is written only in disk's caches, other transaction starts reusing space of the transaction cleaned from the log and power failure happens. Subsequent journal replay would still try to replay the old transaction but some of it's blocks may be already overwritten by the new transaction. For this reason we must use WRITE_FUA when updating log tail and we must first write new log tail to disk and update in-memory information only after that. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> [bwh: Prerequisite for "jbd2: fix ocfs2 corrupt when updating journal superblock fails". Backported to 3.2: - Adjust context - Drop changes to jbd2_journal_update_sb_log_tail trace event] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* tracing: Fix syscall_*regfunc() vs copy_process() raceOleg Nesterov2014-07-111-0/+15
| | | | | | | | | | | | | | | | | | | | | commit 4af4206be2bd1933cae20c2b6fb2058dbc887f7c upstream. syscall_regfunc() and syscall_unregfunc() should set/clear TIF_SYSCALL_TRACEPOINT system-wide, but do_each_thread() can race with copy_process() and miss the new child which was not added to the process/thread lists yet. Change copy_process() to update the child's TIF_SYSCALL_TRACEPOINT under tasklist. Link: http://lkml.kernel.org/p/20140413185854.GB20668@redhat.com Fixes: a871bd33a6c0 "tracing: Add syscall tracepoints" Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* trace: module: Maintain a valid user countRomain Izard2014-06-091-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | commit 098507ae3ec2331476fb52e85d4040c1cc6d0ef4 upstream. The replacement of the 'count' variable by two variables 'incs' and 'decs' to resolve some race conditions during module unloading was done in parallel with some cleanup in the trace subsystem, and was integrated as a merge. Unfortunately, the formula for this replacement was wrong in the tracing code, and the refcount in the traces was not usable as a result. Use 'count = incs - decs' to compute the user count. Link: http://lkml.kernel.org/p/1393924179-9147-1-git-send-email-romain.izard.pro@gmail.com Acked-by: Ingo Molnar <mingo@kernel.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Frederic Weisbecker <fweisbec@gmail.com> Fixes: c1ab9cab7509 "merge conflict resolution" Signed-off-by: Romain Izard <romain.izard.pro@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* blktrace: fix accounting of partially completed requestsRoman Pen2014-04-301-3/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit af5040da01ef980670b3741b3e10733ee3e33566 upstream. trace_block_rq_complete does not take into account that request can be partially completed, so we can get the following incorrect output of blkparser: C R 232 + 240 [0] C R 240 + 232 [0] C R 248 + 224 [0] C R 256 + 216 [0] but should be: C R 232 + 8 [0] C R 240 + 8 [0] C R 248 + 8 [0] C R 256 + 8 [0] Also, the whole output summary statistics of completed requests and final throughput will be incorrect. This patch takes into account real completion size of the request and fixes wrong completion accounting. Signed-off-by: Roman Pen <r.peniaev@gmail.com> CC: Steven Rostedt <rostedt@goodmis.org> CC: Frederic Weisbecker <fweisbec@gmail.com> CC: Ingo Molnar <mingo@redhat.com> CC: linux-kernel@vger.kernel.org Signed-off-by: Jens Axboe <axboe@fb.com> [bwh: Backported to 3.2: drop change in blk-mq.c] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* tracing: Allow events to have NULL stringsSteven Rostedt (Red Hat)2014-01-031-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 4e58e54754dc1fec21c3a9e824bc108b05fdf46e upstream. If an TRACE_EVENT() uses __assign_str() or __get_str on a NULL pointer then the following oops will happen: BUG: unable to handle kernel NULL pointer dereference at (null) IP: [<c127a17b>] strlen+0x10/0x1a *pde = 00000000 ^M Oops: 0000 [#1] PREEMPT SMP Modules linked in: CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.13.0-rc1-test+ #2 Hardware name: /DG965MQ, BIOS MQ96510J.86A.0372.2006.0605.1717 06/05/2006^M task: f5cde9f0 ti: f5e5e000 task.ti: f5e5e000 EIP: 0060:[<c127a17b>] EFLAGS: 00210046 CPU: 1 EIP is at strlen+0x10/0x1a EAX: 00000000 EBX: c2472da8 ECX: ffffffff EDX: c2472da8 ESI: c1c5e5fc EDI: 00000000 EBP: f5e5fe84 ESP: f5e5fe80 DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 CR0: 8005003b CR2: 00000000 CR3: 01f32000 CR4: 000007d0 Stack: f5f18b90 f5e5feb8 c10687a8 0759004f 00000005 00000005 00000005 00200046 00000002 00000000 c1082a93 f56c7e28 c2472da8 c1082a93 f5e5fee4 c106bc61^M 00000000 c1082a93 00000000 00000000 00000001 00200046 00200082 00000000 Call Trace: [<c10687a8>] ftrace_raw_event_lock+0x39/0xc0 [<c1082a93>] ? ktime_get+0x29/0x69 [<c1082a93>] ? ktime_get+0x29/0x69 [<c106bc61>] lock_release+0x57/0x1a5 [<c1082a93>] ? ktime_get+0x29/0x69 [<c10824dd>] read_seqcount_begin.constprop.7+0x4d/0x75 [<c1082a93>] ? ktime_get+0x29/0x69^M [<c1082a93>] ktime_get+0x29/0x69 [<c108a46a>] __tick_nohz_idle_enter+0x1e/0x426 [<c10690e8>] ? lock_release_holdtime.part.19+0x48/0x4d [<c10bc184>] ? time_hardirqs_off+0xe/0x28 [<c1068c82>] ? trace_hardirqs_off_caller+0x3f/0xaf [<c108a8cb>] tick_nohz_idle_enter+0x59/0x62 [<c1079242>] cpu_startup_entry+0x64/0x192 [<c102299c>] start_secondary+0x277/0x27c Code: 90 89 c6 89 d0 88 c4 ac 38 e0 74 09 84 c0 75 f7 be 01 00 00 00 89 f0 48 5e 5d c3 55 89 e5 57 66 66 66 66 90 83 c9 ff 89 c7 31 c0 <f2> ae f7 d1 8d 41 ff 5f 5d c3 55 89 e5 57 66 66 66 66 90 31 ff EIP: [<c127a17b>] strlen+0x10/0x1a SS:ESP 0068:f5e5fe80 CR2: 0000000000000000 ---[ end trace 01bc47bf519ec1b2 ]--- New tracepoints have been added that have allowed for NULL pointers being assigned to strings. To fix this, change the TRACE_EVENT() code to check for NULL and if it is, it will assign "(null)" to it instead (similar to what glibc printf does). Reported-by: Shuah Khan <shuah.kh@samsung.com> Reported-by: Jovi Zhangwei <jovi.zhangwei@gmail.com> Link: http://lkml.kernel.org/r/CAGdX0WFeEuy+DtpsJzyzn0343qEEjLX97+o1VREFkUEhndC+5Q@mail.gmail.com Link: http://lkml.kernel.org/r/528D6972.9010702@samsung.com Fixes: 9cbf117662e2 ("tracing/events: provide string with undefined size support") Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* xen/mmu: Use Xen specific TLB flush instead of the generic one.Konrad Rzeszutek Wilk2012-11-161-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | commit 95a7d76897c1e7243d4137037c66d15cbf2cce76 upstream. As Mukesh explained it, the MMUEXT_TLB_FLUSH_ALL allows the hypervisor to do a TLB flush on all active vCPUs. If instead we were using the generic one (which ends up being xen_flush_tlb) we end up making the MMUEXT_TLB_FLUSH_LOCAL hypercall. But before we make that hypercall the kernel will IPI all of the vCPUs (even those that were asleep from the hypervisor perspective). The end result is that we needlessly wake them up and do a TLB flush when we can just let the hypervisor do it correctly. This patch gives around 50% speed improvement when migrating idle guest's from one host to another. Oracle-bug: 14630170 Tested-by: Jingjie Jiang <jingjie.jiang@oracle.com> Suggested-by: Mukesh Rathor <mukesh.rathor@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* tracing: Don't call page_to_pfn() if page is NULLWen Congyang2012-10-101-2/+2
| | | | | | | | | | | | | | | | | | commit 85f2a2ef1d0ab99523e0b947a2b723f5650ed6aa upstream. When allocating memory fails, page is NULL. page_to_pfn() will cause the kernel panicked if we don't use sparsemem vmemmap. Link: http://lkml.kernel.org/r/505AB1FF.8020104@cn.fujitsu.com Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Acked-by: Mel Gorman <mel@csn.ul.ie> Reviewed-by: Minchan Kim <minchan@kernel.org> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* writeback: fix dereferencing NULL bdi->dev on trace_writeback_queueWu Fengguang2012-02-201-1/+4
| | | | | | | | | | | | | | | | | | | commit 977b7e3a52a7421ad33a393a38ece59f3d41c2fa upstream. When a SD card is hot removed without umount, del_gendisk() will call bdi_unregister() without destroying/freeing it. This leaves the bdi in the bdi->dev = NULL, bdi->wb.task = NULL, bdi->bdi_list removed state. When sync(2) gets the bdi before bdi_unregister() and calls bdi_queue_work() after the unregister, trace_writeback_queue will be dereferencing the NULL bdi->dev. Fix it with a simple test for NULL. LKML-reference: http://lkml.org/lkml/2012/1/18/346 Reported-by: Rabin Vincent <rabin@rab.in> Tested-by: Namjae Jeon <linkinjeon@gmail.com> Signed-off-by: Wu Fengguang <fengguang.wu@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* writeback: fix NULL bdi->dev in trace writeback_single_inodeWu Fengguang2012-02-201-1/+1
| | | | | | | | | | | | | | | commit 15eb77a07c714ac80201abd0a9568888bcee6276 upstream. bdi_prune_sb() resets sb->s_bdi to default_backing_dev_info when the tearing down the original bdi. Fix trace_writeback_single_inode to use sb->s_bdi=default_backing_dev_info rather than bdi->dev=NULL for a teared down bdi. Reported-by: Rabin Vincent <rabin@rab.in> Tested-by: Rabin Vincent <rabin@rab.in> Signed-off-by: Wu Fengguang <fengguang.wu@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* writeback: show writeback reason with __print_symbolicWu Fengguang2011-12-181-2/+13
| | | | | | | | | This makes the binary trace understandable by trace-cmd. CC: Dave Chinner <david@fromorbit.com> CC: Curt Wohlgemuth <curtw@google.com> CC: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
* Merge branch 'modsplit-Oct31_2011' of ↵Linus Torvalds2011-11-062-11/+1
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux * 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux: (230 commits) Revert "tracing: Include module.h in define_trace.h" irq: don't put module.h into irq.h for tracking irqgen modules. bluetooth: macroize two small inlines to avoid module.h ip_vs.h: fix implicit use of module_get/module_put from module.h nf_conntrack.h: fix up fallout from implicit moduleparam.h presence include: replace linux/module.h with "struct module" wherever possible include: convert various register fcns to macros to avoid include chaining crypto.h: remove unused crypto_tfm_alg_modname() inline uwb.h: fix implicit use of asm/page.h for PAGE_SIZE pm_runtime.h: explicitly requires notifier.h linux/dmaengine.h: fix implicit use of bitmap.h and asm/page.h miscdevice.h: fix up implicit use of lists and types stop_machine.h: fix implicit use of smp.h for smp_processor_id of: fix implicit use of errno.h in include/linux/of.h of_platform.h: delete needless include <linux/module.h> acpi: remove module.h include from platform/aclinux.h miscdevice.h: delete unnecessary inclusion of module.h device_cgroup.h: delete needless include <linux/module.h> net: sch_generic remove redundant use of <linux/module.h> net: inet_timewait_sock doesnt need <linux/module.h> ... Fix up trivial conflicts (other header files, and removal of the ab3550 mfd driver) in - drivers/media/dvb/frontends/dibx000_common.c - drivers/media/video/{mt9m111.c,ov6650.c} - drivers/mfd/ab3550-core.c - include/linux/dmaengine.h
| * Revert "tracing: Include module.h in define_trace.h"Paul Gortmaker2011-10-311-10/+0
| | | | | | | | | | | | | | | | | | | | | | This reverts commit 3a9f987b3141f086de27832514aad9f50a53f754. With all the files that are real modules now having module.h explicitly called out for inclusion, and no reliance on any implicit presence of module.h assumed, we should no longer need this workaround. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
| * include: replace linux/module.h with "struct module" wherever possiblePaul Gortmaker2011-10-311-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The <linux/module.h> pretty much brings in the kitchen sink along with it, so it should be avoided wherever reasonably possible in terms of being included from other commonly used <linux/something.h> files, as it results in a measureable increase on compile times. The worst culprit was probably device.h since it is used everywhere. This file also had an implicit dependency/usage of mutex.h which was masked by module.h, and is also fixed here at the same time. There are over a dozen other headers that simply declare the struct instead of pulling in the whole file, so follow their lead and simply make it a few more. Most of the implicit dependencies on module.h being present by these headers pulling it in have been now weeded out, so we can finally make this change with hopefully minimal breakage. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
* | Merge branch 'writeback-for-linus' of ↵Linus Torvalds2011-11-061-30/+131
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linux * 'writeback-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linux: writeback: Add a 'reason' to wb_writeback_work writeback: send work item to queue_io, move_expired_inodes writeback: trace event balance_dirty_pages writeback: trace event bdi_dirty_ratelimit writeback: fix ppc compile warnings on do_div(long long, unsigned long) writeback: per-bdi background threshold writeback: dirty position control - bdi reserve area writeback: control dirty pause time writeback: limit max dirty pause time writeback: IO-less balance_dirty_pages() writeback: per task dirty rate limit writeback: stabilize bdi->dirty_ratelimit writeback: dirty rate control writeback: add bg_threshold parameter to __bdi_update_bandwidth() writeback: dirty position control writeback: account per-bdi accumulated dirtied pages
| * | writeback: Add a 'reason' to wb_writeback_workCurt Wohlgemuth2011-10-311-4/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This creates a new 'reason' field in a wb_writeback_work structure, which unambiguously identifies who initiates writeback activity. A 'wb_reason' enumeration has been added to writeback.h, to enumerate the possible reasons. The 'writeback_work_class' and tracepoint event class and 'writeback_queue_io' tracepoints are updated to include the symbolic 'reason' in all trace events. And the 'writeback_inodes_sbXXX' family of routines has had a wb_stats parameter added to them, so callers can specify why writeback is being started. Acked-by: Jan Kara <jack@suse.cz> Signed-off-by: Curt Wohlgemuth <curtw@google.com> Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
| * | writeback: send work item to queue_io, move_expired_inodesCurt Wohlgemuth2011-10-311-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of sending ->older_than_this to queue_io() and move_expired_inodes(), send the entire wb_writeback_work structure. There are other fields of a work item that are useful in these routines and in tracepoints. Acked-by: Jan Kara <jack@suse.cz> Signed-off-by: Curt Wohlgemuth <curtw@google.com> Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
| * | writeback: trace event balance_dirty_pagesWu Fengguang2011-10-311-0/+73
| | | | | | | | | | | | | | | | | | | | | Useful for analyzing the dynamics of the throttling algorithms and debugging user reported problems. Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
| * | writeback: trace event bdi_dirty_ratelimitWu Fengguang2011-10-311-0/+45
| | | | | | | | | | | | | | | | | | It helps understand how various throttle bandwidths are updated. Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
| * | writeback: IO-less balance_dirty_pages()Wu Fengguang2011-10-031-24/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As proposed by Chris, Dave and Jan, don't start foreground writeback IO inside balance_dirty_pages(). Instead, simply let it idle sleep for some time to throttle the dirtying task. In the mean while, kick off the per-bdi flusher thread to do background writeback IO. RATIONALS ========= - disk seeks on concurrent writeback of multiple inodes (Dave Chinner) If every thread doing writes and being throttled start foreground writeback, it leads to N IO submitters from at least N different inodes at the same time, end up with N different sets of IO being issued with potentially zero locality to each other, resulting in much lower elevator sort/merge efficiency and hence we seek the disk all over the place to service the different sets of IO. OTOH, if there is only one submission thread, it doesn't jump between inodes in the same way when congestion clears - it keeps writing to the same inode, resulting in large related chunks of sequential IOs being issued to the disk. This is more efficient than the above foreground writeback because the elevator works better and the disk seeks less. - lock contention and cache bouncing on concurrent IO submitters (Dave Chinner) With this patchset, the fs_mark benchmark on a 12-drive software RAID0 goes from CPU bound to IO bound, freeing "3-4 CPUs worth of spinlock contention". * "CPU usage has dropped by ~55%", "it certainly appears that most of the CPU time saving comes from the removal of contention on the inode_wb_list_lock" (IMHO at least 10% comes from the reduction of cacheline bouncing, because the new code is able to call much less frequently into balance_dirty_pages() and hence access the global page states) * the user space "App overhead" is reduced by 20%, by avoiding the cacheline pollution by the complex writeback code path * "for a ~5% throughput reduction", "the number of write IOs have dropped by ~25%", and the elapsed time reduced from 41:42.17 to 40:53.23. * On a simple test of 100 dd, it reduces the CPU %system time from 30% to 3%, and improves IO throughput from 38MB/s to 42MB/s. - IO size too small for fast arrays and too large for slow USB sticks The write_chunk used by current balance_dirty_pages() cannot be directly set to some large value (eg. 128MB) for better IO efficiency. Because it could lead to more than 1 second user perceivable stalls. Even the current 4MB write size may be too large for slow USB sticks. The fact that balance_dirty_pages() starts IO on itself couples the IO size to wait time, which makes it hard to do suitable IO size while keeping the wait time under control. Now it's possible to increase writeback chunk size proportional to the disk bandwidth. In a simple test of 50 dd's on XFS, 1-HDD, 3GB ram, the larger writeback size dramatically reduces the seek count to 1/10 (far beyond my expectation) and improves the write throughput by 24%. - long block time in balance_dirty_pages() hurts desktop responsiveness Many of us may have the experience: it often takes a couple of seconds or even long time to stop a heavy writing dd/cp/tar command with Ctrl-C or "kill -9". - IO pipeline broken by bumpy write() progress There are a broad class of "loop {read(buf); write(buf);}" applications whose read() pipeline will be under-utilized or even come to a stop if the write()s have long latencies _or_ don't progress in a constant rate. The current threshold based throttling inherently transfers the large low level IO completion fluctuations to bumpy application write()s, and further deteriorates with increasing number of dirtiers and/or bdi's. For example, when doing 50 dd's + 1 remote rsync to an XFS partition, the rsync progresses very bumpy in legacy kernel, and throughput is improved by 67% by this patchset. (plus the larger write chunk size, it will be 93% speedup). The new rate based throttling can support 1000+ dd's with excellent smoothness, low latency and low overheads. For the above reasons, it's much better to do IO-less and low latency pauses in balance_dirty_pages(). Jan Kara, Dave Chinner and me explored the scheme to let balance_dirty_pages() wait for enough writeback IO completions to safeguard the dirty limit. However it's found to have two problems: - in large NUMA systems, the per-cpu counters may have big accounting errors, leading to big throttle wait time and jitters. - NFS may kill large amount of unstable pages with one single COMMIT. Because NFS server serves COMMIT with expensive fsync() IOs, it is desirable to delay and reduce the number of COMMITs. So it's not likely to optimize away such kind of bursty IO completions, and the resulted large (and tiny) stall times in IO completion based throttling. So here is a pause time oriented approach, which tries to control the pause time in each balance_dirty_pages() invocations, by controlling the number of pages dirtied before calling balance_dirty_pages(), for smooth and efficient dirty throttling: - avoid useless (eg. zero pause time) balance_dirty_pages() calls - avoid too small pause time (less than 4ms, which burns CPU power) - avoid too large pause time (more than 200ms, which hurts responsiveness) - avoid big fluctuations of pause times It can control pause times at will. The default policy (in a followup patch) will be to do ~10ms pauses in 1-dd case, and increase to ~100ms in 1000-dd case. BEHAVIOR CHANGE =============== (1) dirty threshold Users will notice that the applications will get throttled once crossing the global (background + dirty)/2=15% threshold, and then balanced around 17.5%. Before patch, the behavior is to just throttle it at 20% dirtyable memory in 1-dd case. Since the task will be soft throttled earlier than before, it may be perceived by end users as performance "slow down" if his application happens to dirty more than 15% dirtyable memory. (2) smoothness/responsiveness Users will notice a more responsive system during heavy writeback. "killall dd" will take effect instantly. Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
* | | Merge branch 'for_linus' of ↵Linus Torvalds2011-11-021-7/+473
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (97 commits) jbd2: Unify log messages in jbd2 code jbd/jbd2: validate sb->s_first in journal_get_superblock() ext4: let ext4_ext_rm_leaf work with EXT_DEBUG defined ext4: fix a syntax error in ext4_ext_insert_extent when debugging enabled ext4: fix a typo in struct ext4_allocation_context ext4: Don't normalize an falloc request if it can fit in 1 extent. ext4: remove comments about extent mount option in ext4_new_inode() ext4: let ext4_discard_partial_buffers handle unaligned range correctly ext4: return ENOMEM if find_or_create_pages fails ext4: move vars to local scope in ext4_discard_partial_page_buffers_no_lock() ext4: Create helper function for EXT4_IO_END_UNWRITTEN and i_aiodio_unwritten ext4: optimize locking for end_io extent conversion ext4: remove unnecessary call to waitqueue_active() ext4: Use correct locking for ext4_end_io_nolock() ext4: fix race in xattr block allocation path ext4: trace punch_hole correctly in ext4_ext_map_blocks ext4: clean up AGGRESSIVE_TEST code ext4: move variables to their scope ext4: fix quota accounting during migration ext4: migrate cleanup ...
| * | | ext4: optimize ext4_ext_convert_to_initialized()Eric Gouriou2011-10-271-0/+82
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch introduces a fast path in ext4_ext_convert_to_initialized() for the case when the conversion can be performed by transferring the newly initialized blocks from the uninitialized extent into an adjacent initialized extent. Doing so removes the expensive invocations of memmove() which occur during extent insertion and the subsequent merge. In practice this should be the common case for clients performing append writes into files pre-allocated via fallocate(FALLOC_FL_KEEP_SIZE). In such a workload performed via direct IO and when using a suboptimal implementation of memmove() (x86_64 prior to the 2.6.39 rewrite), this patch reduces kernel CPU consumption by 32%. Two new trace points are added to ext4_ext_convert_to_initialized() to offer visibility into its operations. No exit trace point has been added due to the multiplicity of return points. This can be revisited once the upstream cleanup is backported. Signed-off-by: Eric Gouriou <egouriou@google.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
| * | | ext4: add some tracepoints in ext4/extents.cAditya Kali2011-09-091-7/+391
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds some tracepoints in ext4/extents.c and updates a tracepoint in ext4/inode.c. Tested: Built and ran the kernel and verified that these tracepoints work. Also ran xfstests. Signed-off-by: Aditya Kali <adityakali@google.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* | | | mm: change isolate mode from #define to bitwise typeMinchan Kim2011-10-311-4/+4
| |_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Change ISOLATE_XXX macro with bitwise isolate_mode_t type. Normally, macro isn't recommended as it's type-unsafe and making debugging harder as symbol cannot be passed throught to the debugger. Quote from Johannes " Hmm, it would probably be cleaner to fully convert the isolation mode into independent flags. INACTIVE, ACTIVE, BOTH is currently a tri-state among flags, which is a bit ugly." This patch moves isolate mode from swap.h to mmzone.h by memcontrol.h Signed-off-by: Minchan Kim <minchan.kim@gmail.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Rik van Riel <riel@redhat.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | Merge branch 'for-linus' of ↵Linus Torvalds2011-10-281-0/+25
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (549 commits) ALSA: hda - Fix ADC input-amp handling for Cx20549 codec ALSA: hda - Keep EAPD turned on for old Conexant chips ALSA: hda/realtek - Fix missing volume controls with ALC260 ASoC: wm8940: Properly set codec->dapm.bias_level ALSA: hda - Fix pin-config for ASUS W90V ALSA: hda - Fix surround/CLFE headphone and speaker pins order ALSA: hda - Fix typo ALSA: Update the sound git tree URL ALSA: HDA: Add new revision for ALC662 ASoC: max98095: Convert codec->hw_write to snd_soc_write ASoC: keep pointer to resource so it can be freed ASoC: sgtl5000: Fix wrong mask in some snd_soc_update_bits calls ASoC: wm8996: Fix wrong mask for setting WM8996_AIF_CLOCKING_2 ASoC: da7210: Add support for line out and DAC ASoC: da7210: Add support for DAPM ALSA: hda/realtek - Fix DAC assignments of multiple speakers ASoC: Use SGTL5000_LINREG_VDDD_MASK instead of hardcoded mask value ASoC: Set sgtl5000->ldo in ldo_regulator_register ASoC: wm8996: Use SND_SOC_DAPM_AIF_OUT for AIF2 Capture ASoC: wm8994: Use SND_SOC_DAPM_AIF_OUT for AIF3 Capture ...
| * | | ASoC: Add another DAPM stat for neighbour checksMark Brown2011-09-221-2/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The number of times we look at a potentially connected neighbour is just as important as the number of times we actually recurse into looking at that neighbour so also collect that statistic. Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
| * | | ASoC: Trace and collect statistics for DAPM graph walkingMark Brown2011-09-211-0/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | One of the longest standing areas for improvement in ASoC has been the DAPM algorithm - it repeats the same checks many times whenever it is run and makes no effort to limit the areas of the graph it checks meaning we do an awful lot of walks over the full graph. This has never mattered too much as the size of the graph has generally been small in relation to the size of the devices supported and the speed of CPUs but it is annoying. In preparation for work on improving this insert a trace point after the graph walk has been done. This gives us specific timing information for the walk, and in order to give quantifiable (non-benchmark) numbers also count every time we check a link or check the power for a widget and report those numbers. Substantial changes in the algorithm may require tweaks to the stats but they should be useful for simpler things. Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
* | | | Merge branch 'sched-core-for-linus' of ↵Linus Torvalds2011-10-261-4/+5
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (46 commits) llist: Add back llist_add_batch() and llist_del_first() prototypes sched: Don't use tasklist_lock for debug prints sched: Warn on rt throttling sched: Unify the ->cpus_allowed mask copy sched: Wrap scheduler p->cpus_allowed access sched: Request for idle balance during nohz idle load balance sched: Use resched IPI to kick off the nohz idle balance sched: Fix idle_cpu() llist: Remove cpu_relax() usage in cmpxchg loops sched: Convert to struct llist llist: Add llist_next() irq_work: Use llist in the struct irq_work logic llist: Return whether list is empty before adding in llist_add() llist: Move cpu_relax() to after the cmpxchg() llist: Remove the platform-dependent NMI checks llist: Make some llist functions inline sched, tracing: Show PREEMPT_ACTIVE state in trace_sched_switch sched: Remove redundant test in check_preempt_tick() sched: Add documentation for bandwidth control sched: Return unused runtime on group dequeue ...
| * \ \ \ Merge branch 'linus' into sched/coreIngo Molnar2011-10-041-5/+5
| |\ \ \ \ | | | |_|/ | | |/| | | | | | | | | | | | | | | | | Merge reason: pick up the latest fixes. Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | sched, tracing: Show PREEMPT_ACTIVE state in trace_sched_switchPeter Zijlstra2011-09-261-4/+5
| | |_|/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We had need to see the difference between scheduling a runnable task and a runnable task being involuntarily preempted. No app should rely on the old string output (the binary trace event record format is not changed). Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/1316164603.10174.11.camel@twins Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | | | Merge branch 'perf-core-for-linus' of ↵Linus Torvalds2011-10-261-0/+3
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (121 commits) perf symbols: Increase symbol KSYM_NAME_LEN size perf hists browser: Refuse 'a' hotkey on non symbolic views perf ui browser: Use libslang to read keys perf tools: Fix tracing info recording perf hists browser: Elide DSO column when it is set to just one DSO, ditto for threads perf hists: Don't consider filtered entries when calculating column widths perf hists: Don't decay total_period for filtered entries perf hists browser: Honour symbol_conf.show_{nr_samples,total_period} perf hists browser: Do not exit on tab key with single event perf annotate browser: Don't change selection line when returning from callq perf tools: handle endianness of feature bitmap perf tools: Add prelink suggestion to dso update message perf script: Fix unknown feature comment perf hists browser: Apply the dso and thread filters when merging new batches perf hists: Move the dso and thread filters from hist_browser perf ui browser: Honour the xterm colors perf top tui: Give color hints just on the percentage, like on --stdio perf ui browser: Make the colors configurable and change the defaults perf tui: Remove unneeded call to newtCls on startup perf hists: Don't format the percentage on hist_entry__snprintf ... Fix up conflicts in arch/x86/kernel/kprobes.c manually. Ingo's tree did the insane "add volatile to const array", which just doesn't make sense ("volatile const"?). But we could remove the const *and* make the array volatile to make doubly sure that gcc doesn't optimize it away.. Also fix up kernel/trace/ring_buffer.c non-data-conflicts manually: the reader_lock has been turned into a raw lock by the core locking merge, and there was a new user of it introduced in this perf core merge. Make sure that new use also uses the raw accessor functions.
| * \ \ \ Merge commit 'v3.1-rc9' into perf/coreIngo Molnar2011-10-061-5/+5
| |\ \ \ \ | | | |/ / | | |/| | | | | | | | | | | | | | | | | Merge reason: pick up latest fixes. Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | perf: Fix counter of ftrace eventsAndrew Vagin2011-10-041-0/+3
| | |/ / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Each event adds some points to its counters. By default it adds 1, and a number of points may be transmited in event's parameters. E.g. sched:sched_stat_runtime adds how long process has been running. But this functionality was broken by v2.6.31-rc5-392-gf413cdb and now the event's parameters doesn't affect on a number of points. TP_perf_assign isn't defined, so __perf_count(c) isn't executed and __count is always equal to 1. Signed-off-by: Andrew Vagin <avagin@openvz.org> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1317052535-1765247-2-git-send-email-avagin@openvz.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | | | Merge branch 'core-rcu-for-linus' of ↵Linus Torvalds2011-10-261-0/+459
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (45 commits) rcu: Move propagation of ->completed from rcu_start_gp() to rcu_report_qs_rsp() rcu: Remove rcu_needs_cpu_flush() to avoid false quiescent states rcu: Wire up RCU_BOOST_PRIO for rcutree rcu: Make rcu_torture_boost() exit loops at end of test rcu: Make rcu_torture_fqs() exit loops at end of test rcu: Permit rt_mutex_unlock() with irqs disabled rcu: Avoid having just-onlined CPU resched itself when RCU is idle rcu: Suppress NMI backtraces when stall ends before dump rcu: Prohibit grace periods during early boot rcu: Simplify unboosting checks rcu: Prevent early boot set_need_resched() from __rcu_pending() rcu: Dump local stack if cannot dump all CPUs' stacks rcu: Move __rcu_read_unlock()'s barrier() within if-statement rcu: Improve rcu_assign_pointer() and RCU_INIT_POINTER() documentation rcu: Make rcu_assign_pointer() unconditionally insert a memory barrier rcu: Make rcu_implicit_dynticks_qs() locals be correct size rcu: Eliminate in_irq() checks in rcu_enter_nohz() nohz: Remove nohz_cpu_mask rcu: Document interpretation of RCU-lockdep splats rcu: Allow rcutorture's stat_interval parameter to be changed at runtime ...
| * \ \ \ Merge branch 'rcu/next' of git://github.com/paulmckrcu/linux into core/rcuIngo Molnar2011-10-011-0/+459
| |\ \ \ \ | | |_|/ / | |/| | |
| | * | | rcu: Add grace-period, quiescent-state, and call_rcu trace eventsPaul E. McKenney2011-09-281-14/+331
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add trace events to record grace-period start and end, quiescent states, CPUs noticing grace-period start and end, grace-period initialization, call_rcu() invocation, tasks blocking in RCU read-side critical sections, tasks exiting those same critical sections, force_quiescent_state() detection of dyntick-idle and offline CPUs, CPUs entering and leaving dyntick-idle mode (except from NMIs), CPUs coming online and going offline, and CPUs being kicked for staying in dyntick-idle mode for too long (as in many weeks, even on 32-bit systems). Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> rcu: Add the rcu flavor to callback trace events The earlier trace events for registering RCU callbacks and for invoking them did not include the RCU flavor (rcu_bh, rcu_preempt, or rcu_sched). This commit adds the RCU flavor to those trace events. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
| | * | | rcu: Add event-trace markers to TREE_RCU kthreadsPaul E. McKenney2011-09-281-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add event-trace markers to TREE_RCU kthreads to allow including these kthread's CPU time in the utilization calculations. Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
| | * | | rcu: Add RCU type to callback-invocation tracingPaul E. McKenney2011-09-281-10/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a string to the rcu_batch_start() and rcu_batch_end() trace messages that indicates the RCU type ("rcu_sched", "rcu_bh", or "rcu_preempt"). The trace messages for the actual invocations themselves are not marked, as it should be clear from the rcu_batch_start() and rcu_batch_end() events before and after. Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
| | * | | rcu: Event-trace markers for computing RCU CPU utilizationPaul E. McKenney2011-09-281-20/+53
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit adds the trace_rcu_utilization() marker that is to be used to allow postprocessing scripts compute RCU's CPU utilization, give or take event-trace overhead. Note that we do not include RCU's dyntick-idle interface because event tracing requires RCU protection, which is not available in dyntick-idle mode. Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
| | * | | rcu: Add event-tracing for RCU callback invocationPaul E. McKenney2011-09-281-0/+98
| | |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There was recently some controversy about the overhead of invoking RCU callbacks. Add TRACE_EVENT()s to obtain fine-grained timings for the start and stop of a batch of callbacks and also for each callback invoked. Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
* | | | Merge branch 'for-linus' of git://github.com/ericvh/linuxLinus Torvalds2011-10-261-0/+176
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'for-linus' of git://github.com/ericvh/linux: 9p: fix 9p.txt to advertise msize instead of maxdata net/9p: Convert net/9p protocol dumps to tracepoints fs/9p: change an int to unsigned int fs/9p: Cleanup option parsing in 9p 9p: move dereference after NULL check fs/9p: inode file operation is properly initialized init_special_inode fs/9p: Update zero-copy implementation in 9p
| * | | | net/9p: Convert net/9p protocol dumps to tracepointsAneesh Kumar K.V2011-10-241-0/+176
| |/ / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This helps in more control over debugging. root@qemu-img-64:~# ls /pass/123 ls: cannot access /pass/123: No such file or directory root@qemu-img-64:~# cat /sys/kernel/debug/tracing/trace # tracer: nop # # TASK-PID CPU# TIMESTAMP FUNCTION # | | | | | ls-1536 [001] 70.928584: 9p_protocol_dump: clnt 18446612132784021504 P9_TWALK(tag = 1) 000: 16 00 00 00 6e 01 00 01 00 00 00 02 00 00 00 01 010: 00 03 00 31 32 33 00 00 00 ff ff ff ff 00 00 00 ls-1536 [001] 70.928587: <stack trace> => trace_9p_protocol_dump => p9pdu_finalize => p9_client_rpc => p9_client_walk => v9fs_vfs_lookup => d_alloc_and_lookup => walk_component => path_lookupat ls-1536 [000] 70.929696: 9p_protocol_dump: clnt 18446612132784021504 P9_RLERROR(tag = 1) 000: 0b 00 00 00 07 01 00 02 00 00 00 4e 03 00 02 00 010: 00 00 00 00 03 00 02 00 00 00 00 00 ff 43 00 00 ls-1536 [000] 70.929697: <stack trace> => trace_9p_protocol_dump => p9_client_rpc => p9_client_walk => v9fs_vfs_lookup => d_alloc_and_lookup => walk_component => path_lookupat => do_path_lookup Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
* | | | Merge branch 'pm-for-linus' of ↵Linus Torvalds2011-10-251-0/+99
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm * 'pm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (63 commits) PM / Clocks: Remove redundant NULL checks before kfree() PM / Documentation: Update docs about suspend and CPU hotplug ACPI / PM: Add Sony VGN-FW21E to nonvs blacklist. ARM: mach-shmobile: sh7372 A4R support (v4) ARM: mach-shmobile: sh7372 A3SP support (v4) PM / Sleep: Mark devices involved in wakeup signaling during suspend PM / Hibernate: Improve performance of LZO/plain hibernation, checksum image PM / Hibernate: Do not initialize static and extern variables to 0 PM / Freezer: Make fake_signal_wake_up() wake TASK_KILLABLE tasks too PM / Hibernate: Add resumedelay kernel param in addition to resumewait MAINTAINERS: Update linux-pm list address PM / ACPI: Blacklist Vaio VGN-FW520F machine known to require acpi_sleep=nonvs PM / ACPI: Blacklist Sony Vaio known to require acpi_sleep=nonvs PM / Hibernate: Add resumewait param to support MMC-like devices as resume file PM / Hibernate: Fix typo in a kerneldoc comment PM / Hibernate: Freeze kernel threads after preallocating memory PM: Update the policy on default wakeup settings PM / VT: Cleanup #if defined uglyness and fix compile error PM / Suspend: Off by one in pm_suspend() PM / Hibernate: Include storage keys in hibernation image on s390 ...
| * \ \ \ Merge branch 'pm-runtime' into pm-for-linusRafael J. Wysocki2011-10-071-0/+99
| |\ \ \ \ | | |/ / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * pm-runtime: PM / Tracing: build rpm-traces.c only if CONFIG_PM_RUNTIME is set PM / Runtime: Replace dev_dbg() with trace_rpm_*() PM / Runtime: Introduce trace points for tracing rpm_* functions PM / Runtime: Don't run callbacks under lock for power.irq_safe set USB: Add wakeup info to debugging messages PM / Runtime: pm_runtime_idle() can be called in atomic context PM / Runtime: Add macro to test for runtime PM events PM / Runtime: Add might_sleep() to runtime PM functions
| | * | | PM / Runtime: Introduce trace points for tracing rpm_* functionsMing Lei2011-09-271-0/+99
| | | |/ | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch introduces 3 trace points to prepare for tracing rpm_idle/rpm_suspend/rpm_resume functions, so we can use these trace points to replace the current dev_dbg(). Signed-off-by: Ming Lei <ming.lei@canonical.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
* | | | Merge branch 'for-linus' of git://opensource.wolfsonmicro.com/regmapLinus Torvalds2011-10-251-0/+136
|\ \ \ \ | |/ / / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'for-linus' of git://opensource.wolfsonmicro.com/regmap: (62 commits) mfd: Enable rbtree cache for wm831x devices regmap: Support some block operations on cached devices regmap: Allow caches for devices with no defaults regmap: Ensure rbtree syncs registers set to zero properly regmap: Allow rbtree to cache zero default values regmap: Warn on raw I/O as well as bulk reads that bypass cache regmap: Return a sensible error code if we fail to read the cache regmap: Use bsearch() to search the register defaults regmap: Fix doc comment regmap: Optimize the lookup path to use binary search regmap: Ensure we scream if we enable cache bypass/only at the same time regmap: Implement regcache_cache_bypass helper function regmap: Save/restore the bypass state upon syncing regmap: Lock the sync path, ensure we use the lockless _regmap_write() regmap: Fix apostrophe usage regmap: Make _regmap_write() global regmap: Fix lock used for regcache_cache_only() regmap: Grab the lock in regcache_cache_only() regmap: Modify map->cache_bypass directly regmap: Fix regcache_sync generic implementation ...
| * | | regmap: Add the regcache_sync trace eventDimitris Papastamos2011-09-191-0/+24
| | | | | | | | | | | | | | | | | | | | Signed-off-by: Dimitris Papastamos <dp@opensource.wolfsonmicro.com> Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
| * | | regmap: Use int rather than size_t for lengths when logging blocksMark Brown2011-08-101-7/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | x86_64 warns as size_t is not an int. Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
| * | | regmap: Add basic tracepointsMark Brown2011-08-081-0/+112
| |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Trace single register reads and writes, plus start/stop tracepoints for the actual I/O to see where we're spending time. This makes it easy to have always on logging without overwhelming the logs and also lets us take advantage of all the context and time information that the trace subsystem collects for us. We don't currently trace register values for bulk operations as this would add complexity and overhead parsing the cooked data that's being worked with. Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
* | | writeback: show raw dirtied_when in trace writeback_single_inodeWu Fengguang2011-08-311-5/+5
| |/ |/| | | | | | | | | | | | | | | | | Save inode->dirtied_when in the raw trace output for reliable scripting, and to also show in formatted output the relative age in seconds for easy human reading. CC: Jan Kara <jack@suse.cz> Acked-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
* | Merge branch 'for-linus' of git://git.kernel.dk/linux-blockLinus Torvalds2011-08-191-9/+11
|\ \ | |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'for-linus' of git://git.kernel.dk/linux-block: (23 commits) Revert "cfq: Remove special treatment for metadata rqs." block: fix flush machinery for stacking drivers with differring flush flags block: improve rq_affinity placement blktrace: add FLUSH/FUA support Move some REQ flags to the common bio/request area allow blk_flush_policy to return REQ_FSEQ_DATA independent of *FLUSH xen/blkback: Make description more obvious. cfq-iosched: Add documentation about idling block: Make rq_affinity = 1 work as expected block: swim3: fix unterminated of_device_id table block/genhd.c: remove useless cast in diskstats_show() drivers/cdrom/cdrom.c: relax check on dvd manufacturer value drivers/block/drbd/drbd_nl.c: use bitmap_parse instead of __bitmap_parse bsg-lib: add module.h include cfq-iosched: Reduce linked group count upon group destruction blk-throttle: correctly determine sync bio loop: fix deadlock when sysfs and LOOP_CLR_FD race against each other loop: add BLK_DEV_LOOP_MIN_COUNT=%i to allow distros 0 pre-allocated loop devices loop: add management interface for on-demand device allocation loop: replace linked list of allocated devices with an idr index ...