aboutsummaryrefslogtreecommitdiffstats
path: root/Documentation
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation')
-rw-r--r--Documentation/DMA-API-HOWTO.txt (renamed from Documentation/PCI/PCI-DMA-mapping.txt)0
-rw-r--r--Documentation/cgroups/memory.txt2
-rw-r--r--Documentation/circular-buffers.txt234
-rw-r--r--Documentation/connector/cn_test.c1
-rw-r--r--Documentation/filesystems/00-INDEX2
-rw-r--r--Documentation/filesystems/9p.txt18
-rw-r--r--Documentation/filesystems/ceph.txt11
-rw-r--r--Documentation/filesystems/tmpfs.txt6
-rw-r--r--Documentation/memory-barriers.txt20
-rw-r--r--Documentation/powerpc/dts-bindings/fsl/cpm_qe/qe.txt54
-rw-r--r--Documentation/volatile-considered-harmful.txt6
11 files changed, 342 insertions, 12 deletions
diff --git a/Documentation/PCI/PCI-DMA-mapping.txt b/Documentation/DMA-API-HOWTO.txt
index 52618ab..52618ab 100644
--- a/Documentation/PCI/PCI-DMA-mapping.txt
+++ b/Documentation/DMA-API-HOWTO.txt
diff --git a/Documentation/cgroups/memory.txt b/Documentation/cgroups/memory.txt
index f8bc802..3a6aecd 100644
--- a/Documentation/cgroups/memory.txt
+++ b/Documentation/cgroups/memory.txt
@@ -340,7 +340,7 @@ Note:
5.3 swappiness
Similar to /proc/sys/vm/swappiness, but affecting a hierarchy of groups only.
- Following cgroups' swapiness can't be changed.
+ Following cgroups' swappiness can't be changed.
- root cgroup (uses /proc/sys/vm/swappiness).
- a cgroup which uses hierarchy and it has child cgroup.
- a cgroup which uses hierarchy and not the root of hierarchy.
diff --git a/Documentation/circular-buffers.txt b/Documentation/circular-buffers.txt
new file mode 100644
index 0000000..8117e5b
--- /dev/null
+++ b/Documentation/circular-buffers.txt
@@ -0,0 +1,234 @@
+ ================
+ CIRCULAR BUFFERS
+ ================
+
+By: David Howells <dhowells@redhat.com>
+ Paul E. McKenney <paulmck@linux.vnet.ibm.com>
+
+
+Linux provides a number of features that can be used to implement circular
+buffering. There are two sets of such features:
+
+ (1) Convenience functions for determining information about power-of-2 sized
+ buffers.
+
+ (2) Memory barriers for when the producer and the consumer of objects in the
+ buffer don't want to share a lock.
+
+To use these facilities, as discussed below, there needs to be just one
+producer and just one consumer. It is possible to handle multiple producers by
+serialising them, and to handle multiple consumers by serialising them.
+
+
+Contents:
+
+ (*) What is a circular buffer?
+
+ (*) Measuring power-of-2 buffers.
+
+ (*) Using memory barriers with circular buffers.
+ - The producer.
+ - The consumer.
+
+
+==========================
+WHAT IS A CIRCULAR BUFFER?
+==========================
+
+First of all, what is a circular buffer? A circular buffer is a buffer of
+fixed, finite size into which there are two indices:
+
+ (1) A 'head' index - the point at which the producer inserts items into the
+ buffer.
+
+ (2) A 'tail' index - the point at which the consumer finds the next item in
+ the buffer.
+
+Typically when the tail pointer is equal to the head pointer, the buffer is
+empty; and the buffer is full when the head pointer is one less than the tail
+pointer.
+
+The head index is incremented when items are added, and the tail index when
+items are removed. The tail index should never jump the head index, and both
+indices should be wrapped to 0 when they reach the end of the buffer, thus
+allowing an infinite amount of data to flow through the buffer.
+
+Typically, items will all be of the same unit size, but this isn't strictly
+required to use the techniques below. The indices can be increased by more
+than 1 if multiple items or variable-sized items are to be included in the
+buffer, provided that neither index overtakes the other. The implementer must
+be careful, however, as a region more than one unit in size may wrap the end of
+the buffer and be broken into two segments.
+
+
+============================
+MEASURING POWER-OF-2 BUFFERS
+============================
+
+Calculation of the occupancy or the remaining capacity of an arbitrarily sized
+circular buffer would normally be a slow operation, requiring the use of a
+modulus (divide) instruction. However, if the buffer is of a power-of-2 size,
+then a much quicker bitwise-AND instruction can be used instead.
+
+Linux provides a set of macros for handling power-of-2 circular buffers. These
+can be made use of by:
+
+ #include <linux/circ_buf.h>
+
+The macros are:
+
+ (*) Measure the remaining capacity of a buffer:
+
+ CIRC_SPACE(head_index, tail_index, buffer_size);
+
+ This returns the amount of space left in the buffer[1] into which items
+ can be inserted.
+
+
+ (*) Measure the maximum consecutive immediate space in a buffer:
+
+ CIRC_SPACE_TO_END(head_index, tail_index, buffer_size);
+
+ This returns the amount of consecutive space left in the buffer[1] into
+ which items can be immediately inserted without having to wrap back to the
+ beginning of the buffer.
+
+
+ (*) Measure the occupancy of a buffer:
+
+ CIRC_CNT(head_index, tail_index, buffer_size);
+
+ This returns the number of items currently occupying a buffer[2].
+
+
+ (*) Measure the non-wrapping occupancy of a buffer:
+
+ CIRC_CNT_TO_END(head_index, tail_index, buffer_size);
+
+ This returns the number of consecutive items[2] that can be extracted from
+ the buffer without having to wrap back to the beginning of the buffer.
+
+
+Each of these macros will nominally return a value between 0 and buffer_size-1,
+however:
+
+ [1] CIRC_SPACE*() are intended to be used in the producer. To the producer
+ they will return a lower bound as the producer controls the head index,
+ but the consumer may still be depleting the buffer on another CPU and
+ moving the tail index.
+
+ To the consumer it will show an upper bound as the producer may be busy
+ depleting the space.
+
+ [2] CIRC_CNT*() are intended to be used in the consumer. To the consumer they
+ will return a lower bound as the consumer controls the tail index, but the
+ producer may still be filling the buffer on another CPU and moving the
+ head index.
+
+ To the producer it will show an upper bound as the consumer may be busy
+ emptying the buffer.
+
+ [3] To a third party, the order in which the writes to the indices by the
+ producer and consumer become visible cannot be guaranteed as they are
+ independent and may be made on different CPUs - so the result in such a
+ situation will merely be a guess, and may even be negative.
+
+
+===========================================
+USING MEMORY BARRIERS WITH CIRCULAR BUFFERS
+===========================================
+
+By using memory barriers in conjunction with circular buffers, you can avoid
+the need to:
+
+ (1) use a single lock to govern access to both ends of the buffer, thus
+ allowing the buffer to be filled and emptied at the same time; and
+
+ (2) use atomic counter operations.
+
+There are two sides to this: the producer that fills the buffer, and the
+consumer that empties it. Only one thing should be filling a buffer at any one
+time, and only one thing should be emptying a buffer at any one time, but the
+two sides can operate simultaneously.
+
+
+THE PRODUCER
+------------
+
+The producer will look something like this:
+
+ spin_lock(&producer_lock);
+
+ unsigned long head = buffer->head;
+ unsigned long tail = ACCESS_ONCE(buffer->tail);
+
+ if (CIRC_SPACE(head, tail, buffer->size) >= 1) {
+ /* insert one item into the buffer */
+ struct item *item = buffer[head];
+
+ produce_item(item);
+
+ smp_wmb(); /* commit the item before incrementing the head */
+
+ buffer->head = (head + 1) & (buffer->size - 1);
+
+ /* wake_up() will make sure that the head is committed before
+ * waking anyone up */
+ wake_up(consumer);
+ }
+
+ spin_unlock(&producer_lock);
+
+This will instruct the CPU that the contents of the new item must be written
+before the head index makes it available to the consumer and then instructs the
+CPU that the revised head index must be written before the consumer is woken.
+
+Note that wake_up() doesn't have to be the exact mechanism used, but whatever
+is used must guarantee a (write) memory barrier between the update of the head
+index and the change of state of the consumer, if a change of state occurs.
+
+
+THE CONSUMER
+------------
+
+The consumer will look something like this:
+
+ spin_lock(&consumer_lock);
+
+ unsigned long head = ACCESS_ONCE(buffer->head);
+ unsigned long tail = buffer->tail;
+
+ if (CIRC_CNT(head, tail, buffer->size) >= 1) {
+ /* read index before reading contents at that index */
+ smp_read_barrier_depends();
+
+ /* extract one item from the buffer */
+ struct item *item = buffer[tail];
+
+ consume_item(item);
+
+ smp_mb(); /* finish reading descriptor before incrementing tail */
+
+ buffer->tail = (tail + 1) & (buffer->size - 1);
+ }
+
+ spin_unlock(&consumer_lock);
+
+This will instruct the CPU to make sure the index is up to date before reading
+the new item, and then it shall make sure the CPU has finished reading the item
+before it writes the new tail pointer, which will erase the item.
+
+
+Note the use of ACCESS_ONCE() in both algorithms to read the opposition index.
+This prevents the compiler from discarding and reloading its cached value -
+which some compilers will do across smp_read_barrier_depends(). This isn't
+strictly needed if you can be sure that the opposition index will _only_ be
+used the once.
+
+
+===============
+FURTHER READING
+===============
+
+See also Documentation/memory-barriers.txt for a description of Linux's memory
+barrier facilities.
diff --git a/Documentation/connector/cn_test.c b/Documentation/connector/cn_test.c
index b07add3..7764594 100644
--- a/Documentation/connector/cn_test.c
+++ b/Documentation/connector/cn_test.c
@@ -25,6 +25,7 @@
#include <linux/module.h>
#include <linux/moduleparam.h>
#include <linux/skbuff.h>
+#include <linux/slab.h>
#include <linux/timer.h>
#include <linux/connector.h>
diff --git a/Documentation/filesystems/00-INDEX b/Documentation/filesystems/00-INDEX
index 3bae418..4303614 100644
--- a/Documentation/filesystems/00-INDEX
+++ b/Documentation/filesystems/00-INDEX
@@ -16,6 +16,8 @@ befs.txt
- information about the BeOS filesystem for Linux.
bfs.txt
- info for the SCO UnixWare Boot Filesystem (BFS).
+ceph.txt
+ - info for the Ceph Distributed File System
cifs.txt
- description of the CIFS filesystem.
coda.txt
diff --git a/Documentation/filesystems/9p.txt b/Documentation/filesystems/9p.txt
index 57e0b80..c0236e7 100644
--- a/Documentation/filesystems/9p.txt
+++ b/Documentation/filesystems/9p.txt
@@ -37,6 +37,15 @@ For Plan 9 From User Space applications (http://swtch.com/plan9)
mount -t 9p `namespace`/acme /mnt/9 -o trans=unix,uname=$USER
+For server running on QEMU host with virtio transport:
+
+ mount -t 9p -o trans=virtio <mount_tag> /mnt/9
+
+where mount_tag is the tag associated by the server to each of the exported
+mount points. Each 9P export is seen by the client as a virtio device with an
+associated "mount_tag" property. Available mount tags can be
+seen by reading /sys/bus/virtio/drivers/9pnet_virtio/virtio<n>/mount_tag files.
+
OPTIONS
=======
@@ -47,7 +56,7 @@ OPTIONS
fd - used passed file descriptors for connection
(see rfdno and wfdno)
virtio - connect to the next virtio channel available
- (from lguest or KVM with trans_virtio module)
+ (from QEMU with trans_virtio module)
rdma - connect to a specified RDMA channel
uname=name user name to attempt mount as on the remote server. The
@@ -85,7 +94,12 @@ OPTIONS
port=n port to connect to on the remote server
- noextend force legacy mode (no 9p2000.u semantics)
+ noextend force legacy mode (no 9p2000.u or 9p2000.L semantics)
+
+ version=name Select 9P protocol version. Valid options are:
+ 9p2000 - Legacy mode (same as noextend)
+ 9p2000.u - Use 9P2000.u protocol
+ 9p2000.L - Use 9P2000.L protocol
dfltuid attempt to mount as a particular uid
diff --git a/Documentation/filesystems/ceph.txt b/Documentation/filesystems/ceph.txt
index 6e03917..0660c9f 100644
--- a/Documentation/filesystems/ceph.txt
+++ b/Documentation/filesystems/ceph.txt
@@ -8,7 +8,7 @@ Basic features include:
* POSIX semantics
* Seamless scaling from 1 to many thousands of nodes
- * High availability and reliability. No single points of failure.
+ * High availability and reliability. No single point of failure.
* N-way replication of data across storage nodes
* Fast recovery from node failures
* Automatic rebalancing of data on node addition/removal
@@ -94,7 +94,7 @@ Mount Options
wsize=X
Specify the maximum write size in bytes. By default there is no
- maximu. Ceph will normally size writes based on the file stripe
+ maximum. Ceph will normally size writes based on the file stripe
size.
rsize=X
@@ -115,7 +115,7 @@ Mount Options
number of entries in that directory.
nocrc
- Disable CRC32C calculation for data writes. If set, the OSD
+ Disable CRC32C calculation for data writes. If set, the storage node
must rely on TCP's error correction to detect data corruption
in the data payload.
@@ -133,7 +133,8 @@ For more information on Ceph, see the home page at
http://ceph.newdream.net/
The Linux kernel client source tree is available at
- git://ceph.newdream.net/linux-ceph-client.git
+ git://ceph.newdream.net/git/ceph-client.git
+ git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client.git
and the source for the full system is at
- git://ceph.newdream.net/ceph.git
+ git://ceph.newdream.net/git/ceph.git
diff --git a/Documentation/filesystems/tmpfs.txt b/Documentation/filesystems/tmpfs.txt
index 3015da0..fe09a2c 100644
--- a/Documentation/filesystems/tmpfs.txt
+++ b/Documentation/filesystems/tmpfs.txt
@@ -82,11 +82,13 @@ tmpfs has a mount option to set the NUMA memory allocation policy for
all files in that instance (if CONFIG_NUMA is enabled) - which can be
adjusted on the fly via 'mount -o remount ...'
-mpol=default prefers to allocate memory from the local node
+mpol=default use the process allocation policy
+ (see set_mempolicy(2))
mpol=prefer:Node prefers to allocate memory from the given Node
mpol=bind:NodeList allocates memory only from nodes in NodeList
mpol=interleave prefers to allocate from each node in turn
mpol=interleave:NodeList allocates from each node of NodeList in turn
+mpol=local prefers to allocate memory from the local node
NodeList format is a comma-separated list of decimal numbers and ranges,
a range being two hyphen-separated decimal numbers, the smallest and
@@ -134,3 +136,5 @@ Author:
Christoph Rohland <cr@sap.com>, 1.12.01
Updated:
Hugh Dickins, 4 June 2007
+Updated:
+ KOSAKI Motohiro, 16 Mar 2010
diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
index 7f5809e..631ad2f 100644
--- a/Documentation/memory-barriers.txt
+++ b/Documentation/memory-barriers.txt
@@ -3,6 +3,7 @@
============================
By: David Howells <dhowells@redhat.com>
+ Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Contents:
@@ -60,6 +61,10 @@ Contents:
- And then there's the Alpha.
+ (*) Example uses.
+
+ - Circular buffers.
+
(*) References.
@@ -2226,6 +2231,21 @@ The Alpha defines the Linux kernel's memory barrier model.
See the subsection on "Cache Coherency" above.
+============
+EXAMPLE USES
+============
+
+CIRCULAR BUFFERS
+----------------
+
+Memory barriers can be used to implement circular buffering without the need
+of a lock to serialise the producer with the consumer. See:
+
+ Documentation/circular-buffers.txt
+
+for details.
+
+
==========
REFERENCES
==========
diff --git a/Documentation/powerpc/dts-bindings/fsl/cpm_qe/qe.txt b/Documentation/powerpc/dts-bindings/fsl/cpm_qe/qe.txt
index 6e37be1..4f89302 100644
--- a/Documentation/powerpc/dts-bindings/fsl/cpm_qe/qe.txt
+++ b/Documentation/powerpc/dts-bindings/fsl/cpm_qe/qe.txt
@@ -21,6 +21,15 @@ Required properties:
- fsl,qe-num-snums: define how many serial number(SNUM) the QE can use for the
threads.
+Optional properties:
+- fsl,firmware-phandle:
+ Usage: required only if there is no fsl,qe-firmware child node
+ Value type: <phandle>
+ Definition: Points to a firmware node (see "QE Firmware Node" below)
+ that contains the firmware that should be uploaded for this QE.
+ The compatible property for the firmware node should say,
+ "fsl,qe-firmware".
+
Recommended properties
- brg-frequency : the internal clock source frequency for baud-rate
generators in Hz.
@@ -59,3 +68,48 @@ Example:
reg = <0 c000>;
};
};
+
+* QE Firmware Node
+
+This node defines a firmware binary that is embedded in the device tree, for
+the purpose of passing the firmware from bootloader to the kernel, or from
+the hypervisor to the guest.
+
+The firmware node itself contains the firmware binary contents, a compatible
+property, and any firmware-specific properties. The node should be placed
+inside a QE node that needs it. Doing so eliminates the need for a
+fsl,firmware-phandle property. Other QE nodes that need the same firmware
+should define an fsl,firmware-phandle property that points to the firmware node
+in the first QE node.
+
+The fsl,firmware property can be specified in the DTS (possibly using incbin)
+or can be inserted by the boot loader at boot time.
+
+Required properties:
+ - compatible
+ Usage: required
+ Value type: <string>
+ Definition: A standard property. Specify a string that indicates what
+ kind of firmware it is. For QE, this should be "fsl,qe-firmware".
+
+ - fsl,firmware
+ Usage: required
+ Value type: <prop-encoded-array>, encoded as an array of bytes
+ Definition: A standard property. This property contains the firmware
+ binary "blob".
+
+Example:
+ qe1@e0080000 {
+ compatible = "fsl,qe";
+ qe_firmware:qe-firmware {
+ compatible = "fsl,qe-firmware";
+ fsl,firmware = [0x70 0xcd 0x00 0x00 0x01 0x46 0x45 ...];
+ };
+ ...
+ };
+
+ qe2@e0090000 {
+ compatible = "fsl,qe";
+ fsl,firmware-phandle = <&qe_firmware>;
+ ...
+ };
diff --git a/Documentation/volatile-considered-harmful.txt b/Documentation/volatile-considered-harmful.txt
index 991c26a..db0cb22 100644
--- a/Documentation/volatile-considered-harmful.txt
+++ b/Documentation/volatile-considered-harmful.txt
@@ -63,9 +63,9 @@ way to perform a busy wait is:
cpu_relax();
The cpu_relax() call can lower CPU power consumption or yield to a
-hyperthreaded twin processor; it also happens to serve as a memory barrier,
-so, once again, volatile is unnecessary. Of course, busy-waiting is
-generally an anti-social act to begin with.
+hyperthreaded twin processor; it also happens to serve as a compiler
+barrier, so, once again, volatile is unnecessary. Of course, busy-
+waiting is generally an anti-social act to begin with.
There are still a few rare situations where volatile makes sense in the
kernel: