author: Stephen Hines <srhines@google.com> 2014-07-21 00:45:20 -0700
committer: Stephen Hines <srhines@google.com> 2014-07-21 00:45:20 -0700
commit: c6a4f5e819217e1e12c458aed8e7b122e23a3a58
tree: 81b7dd2bb4370a392f31d332a566c903b5744764 /docs
parent: 19c6fbb3e8aaf74093afa08013134b61fa08f245
Update LLVM for rebase to r212749.

Includes a cherry-pick of:
r212948 - fixes a small issue with atomic calls

Change-Id: Ib97bd980b59f18142a69506400911a6009d9df18

Diffstat (limited to 'docs')
-rw-r--r-- docs/Atomics.rst | 10
-rw-r--r-- docs/CMake.rst | 8
-rw-r--r-- docs/CodeGenerator.rst | 6
-rw-r--r-- docs/CodingStandards.rst | 5
-rw-r--r-- docs/CommandGuide/bugpoint.rst | 4
-rw-r--r-- docs/CommandGuide/opt.rst | 4
-rw-r--r-- docs/Extensions.rst | 49
-rw-r--r-- docs/GarbageCollection.rst | 2
-rw-r--r-- docs/GettingStarted.rst | 23
-rw-r--r-- docs/GettingStartedVS.rst | 3
-rw-r--r-- docs/LangRef.rst | 535
-rw-r--r-- docs/Lexicon.rst | 2
-rw-r--r-- docs/Passes.rst | 41
-rw-r--r-- docs/Phabricator.rst | 29
-rw-r--r-- docs/ProgrammersManual.rst | 97
-rw-r--r-- docs/ReleaseNotes.rst | 10
-rw-r--r-- docs/SourceLevelDebugging.rst | 35
-rw-r--r-- docs/TestingGuide.rst | 3
-rw-r--r-- docs/Vectorizers.rst | 83
-rw-r--r-- docs/WritingAnLLVMPass.rst | 1
20 files changed, 595 insertions, 355 deletions
diff --git a/docs/Atomics.rst b/docs/Atomics.rst
index 1243f34..5f17c61 100644
--- a/docs/Atomics.rst
+++ b/docs/Atomics.rst
@@ -110,8 +110,7 @@ where threads and signals are involved.
``cmpxchg`` and ``atomicrmw`` are essentially like an atomic load followed by an
atomic store (where the store is conditional for ``cmpxchg``), but no other
-memory operation can happen on any thread between the load and store. Note that
-LLVM's cmpxchg does not provide quite as many options as the C++0x version.
+memory operation can happen on any thread between the load and store.
A ``fence`` provides Acquire and/or Release ordering which is not part of
another operation; it is normally used along with Monotonic memory operations.
@@ -430,10 +429,9 @@ other ``atomicrmw`` operations generate a loop with ``LOCK CMPXCHG``. Depending
on the users of the result, some ``atomicrmw`` operations can be translated into
operations like ``LOCK AND``, but that does not work in general.
-On ARM, MIPS, and many other RISC architectures, Acquire, Release, and
-SequentiallyConsistent semantics require barrier instructions for every such
+On ARM (before v8), MIPS, and many other RISC architectures, Acquire, Release,
+and SequentiallyConsistent semantics require barrier instructions for every such
operation. Loads and stores generate normal instructions. ``cmpxchg`` and
``atomicrmw`` can be represented using a loop with LL/SC-style instructions
which take some sort of exclusive lock on a cache line (``LDREX`` and ``STREX``
-on ARM, etc.). At the moment, the IR does not provide any way to represent a
-weak ``cmpxchg`` which would not require a loop.
+on ARM, etc.).
diff --git a/docs/CMake.rst b/docs/CMake.rst
index fed283d..bfc9cb9 100644
--- a/docs/CMake.rst
+++ b/docs/CMake.rst
@@ -132,7 +132,7 @@ write the variable and the type on the CMake command line:
Frequently-used CMake variables
-------------------------------
-Here are listed some of the CMake variables that are used often, along with a
+Here are some of the CMake variables that are used often, along with a
brief explanation and LLVM-specific notes. For full documentation, check the
CMake docs or execute ``cmake --help-variable VARIABLE_NAME``.
@@ -157,8 +157,8 @@ CMake docs or execute ``cmake --help-variable VARIABLE_NAME``.
Extra flags to use when compiling C++ source files.
**BUILD_SHARED_LIBS**:BOOL
- Flag indicating is shared libraries will be built. Its default value is
- OFF. Shared libraries are not supported on Windows and not recommended in the
+ Flag indicating if shared libraries will be built. Its default value is
+ OFF. Shared libraries are not supported on Windows and not recommended on the
other OSes.
.. _LLVM-specific variables:
@@ -487,7 +487,7 @@ into LLVM source tree. You can achieve it in two easy steps:
#. Adding ``add_subdirectory(<pass name>)`` line into
``<LLVM root>/lib/Transform/CMakeLists.txt``.
-Compiler/Platform specific topics
+Compiler/Platform-specific topics
=================================
Notes for specific compilers and/or platforms.
diff --git a/docs/CodeGenerator.rst b/docs/CodeGenerator.rst
index cc09946..5736e43 100644
--- a/docs/CodeGenerator.rst
+++ b/docs/CodeGenerator.rst
@@ -1228,7 +1228,7 @@ used. Each virtual register can only be mapped to physical registers of a
particular class. For instance, in the X86 architecture, some virtuals can only
be allocated to 8 bit registers. A register class is described by
``TargetRegisterClass`` objects. To discover if a virtual register is
-compatible with a given physical, this code can be used:</p>
+compatible with a given physical, this code can be used:
.. code-block:: c++
@@ -1683,7 +1683,7 @@ ones supported by the matcher), through a Requires clause:
def : MnemonicAlias<"pushf", "pushfq">, Requires<[In64BitMode]>;
def : MnemonicAlias<"pushf", "pushfl">, Requires<[In32BitMode]>;
-In this example, the mnemonic gets mapped into different a new one depending on
+In this example, the mnemonic gets mapped into a different one depending on
the current instruction set.
Instruction Aliases
@@ -2027,7 +2027,7 @@ supported on x86/x86-64 and PowerPC. It is performed if:
* Option ``-tailcallopt`` is enabled.
-* Platform specific constraints are met.
+* Platform-specific constraints are met.
x86/x86-64 constraints:
diff --git a/docs/CodingStandards.rst b/docs/CodingStandards.rst
index edbef3a..3cfa1f6 100644
--- a/docs/CodingStandards.rst
+++ b/docs/CodingStandards.rst
@@ -107,10 +107,7 @@ unlikely to be supported by our host compilers.
* Trailing return types: N2541_
* Lambdas: N2927_
- * But *not* ``std::function``, until Clang implements `MSVC-compatible RTTI`_.
- In many cases, you may be able to use ``llvm::function_ref`` instead, and it
- is a superior choice in those cases.
- * And *not* lambdas with default arguments.
+ * But *not* lambdas with default arguments.
* ``decltype``: N2343_
* Nested closing right angle brackets: N1757_
diff --git a/docs/CommandGuide/bugpoint.rst b/docs/CommandGuide/bugpoint.rst
index e4663e5..f11585d 100644
--- a/docs/CommandGuide/bugpoint.rst
+++ b/docs/CommandGuide/bugpoint.rst
@@ -124,10 +124,6 @@ OPTIONS
do not use this option, **bugpoint** will attempt to generate a reference output
by compiling the program with the "safe" backend and running it.
-**--profile-info-file** *filename*
-
- Profile file loaded by **--profile-loader**.
-
**--run-{int,jit,llc,custom}**
Whenever the test program is compiled, **bugpoint** should generate code for it
diff --git a/docs/CommandGuide/opt.rst b/docs/CommandGuide/opt.rst
index 3fed684..ad5b62c 100644
--- a/docs/CommandGuide/opt.rst
+++ b/docs/CommandGuide/opt.rst
@@ -99,10 +99,6 @@ OPTIONS
:option:`-std-compile-opts` and :option:`-verify-each` can quickly track down
this kind of problem.
-.. option:: -profile-info-file <filename>
-
- Specify the name of the file loaded by the ``-profile-loader`` option.
-
.. option:: -stats
Print statistics.
diff --git a/docs/Extensions.rst b/docs/Extensions.rst
index a49485c..271c085 100644
--- a/docs/Extensions.rst
+++ b/docs/Extensions.rst
@@ -76,7 +76,7 @@ the target. It corresponds to the COFF relocation types
Syntax:
- ``.linkonce [ comdat type [ section identifier ] ]``
+ ``.linkonce [ comdat type ]``
Supported COMDAT types:
@@ -95,16 +95,6 @@ Supported COMDAT types:
Duplicates are discarded, but the linker issues an error if any duplicates
do not have exactly the same content.
-``associative``
- Links the section if a certain other COMDAT section is linked. This other
- section is indicated by its section identifier following the comdat type.
- The following restrictions apply to the associated section:
-
- 1. It must be the name of a section already defined.
- 2. It must differ from the current section.
- 3. It must be a COMDAT section.
- 4. It cannot be another associative COMDAT section.
-
``largest``
Links the largest section from among the duplicates.
@@ -118,10 +108,6 @@ Supported COMDAT types:
.linkonce
...
- .section .xdata$foo
- .linkonce associative .text$foo
- ...
-
``.section`` Directive
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -160,6 +146,25 @@ different COMDATs:
Symbol2:
.long 1
+In addition to the types allowed with ``.linkonce``, ``.section`` also accepts
+``associative``. The meaning is that the section is linked if a certain other
+COMDAT section is linked. This other section is indicated by the comdat symbol
+in this directive. It can be any symbol defined in the associated section, but
+is usually the associated section's comdat.
+
+ The following restrictions apply to the associated section:
+
+ 1. It must be a COMDAT section.
+ 2. It cannot be another associative COMDAT section.
+
+In the following example the symbol ``sym`` is the comdat symbol of ``.foo``
+and ``.bar`` is associated to ``.foo``.
+
+.. code-block:: gas
+
+ .section .foo,"bw",discard, "sym"
+ .section .bar,"rd",associative, "sym"
+
Target Specific Behaviour
=========================
@@ -190,3 +195,17 @@ range via a slight deviation. It will generate an indirect jump as follows:
blx r12
sub.w sp, sp, r4
+Variable Length Arrays
+^^^^^^^^^^^^^^^^^^^^^^
+
+The reference implementation (Microsoft Visual Studio 2012) does not permit the
+emission of Variable Length Arrays (VLAs).
+
+The Windows ARM Itanium ABI extends the base ABI by adding support for emitting
+a dynamic stack allocation. When emitting a variable stack allocation, a call
+to ``__chkstk`` is emitted unconditionally to ensure that guard pages are set up
+properly. The emission of this stack probe is handled similarly to the
+standard stack probe emission.
+
+The MSVC environment does not emit code for VLAs currently.
+
diff --git a/docs/GarbageCollection.rst b/docs/GarbageCollection.rst
index 323a6ea..dc6dab1 100644
--- a/docs/GarbageCollection.rst
+++ b/docs/GarbageCollection.rst
@@ -633,7 +633,7 @@ Threaded
Denotes a multithreaded mutator; the collector must still stop the mutator
("stop the world") before beginning reachability analysis. Stopping a
multithreaded mutator is a complicated problem. It generally requires highly
- platform specific code in the runtime, and the production of carefully
+ platform-specific code in the runtime, and the production of carefully
designed machine code at safe points.
Concurrent
diff --git a/docs/GettingStarted.rst b/docs/GettingStarted.rst
index 3d2ec1e..6de9b90 100644
--- a/docs/GettingStarted.rst
+++ b/docs/GettingStarted.rst
@@ -87,9 +87,10 @@ Here's the short story for getting up and running quickly with LLVM:
* ``make check-all`` --- This run the regression tests to ensure everything
is in working order.
- * It is also possible to use CMake instead of the makefiles. With CMake it is
- possible to generate project files for several IDEs: Xcode, Eclipse CDT4,
- CodeBlocks, Qt-Creator (use the CodeBlocks generator), KDevelop3.
+ * It is also possible to use `CMake <CMake.html>`_ instead of the makefiles.
+ With CMake it is possible to generate project files for several IDEs:
+ Xcode, Eclipse CDT4, CodeBlocks, Qt-Creator (use the CodeBlocks
+ generator), KDevelop3.
* If you get an "internal compiler error (ICE)" or test failures, see
`below`.
@@ -680,7 +681,7 @@ The following options can be used to set or enable LLVM specific options:
Enables optimized compilation (debugging symbols are removed and GCC
optimization flags are enabled). Note that this is the default setting if you
- are using the LLVM distribution. The default behavior of an Subversion
+ are using the LLVM distribution. The default behavior of a Subversion
checkout is to use an unoptimized build (also known as a debug build).
``--enable-debug-runtime``
@@ -698,14 +699,12 @@ The following options can be used to set or enable LLVM specific options:
Controls which targets will be built and linked into llc. The default value
for ``target_options`` is "all" which builds and links all available targets.
- The value "host-only" can be specified to build only a native compiler (no
- cross-compiler targets available). The "native" target is selected as the
- target of the build host. You can also specify a comma separated list of
- target names that you want available in llc. The target names use all lower
- case. The current set of targets is:
+ The "host" target is selected as the target of the build host. You can also
+ specify a comma separated list of target names that you want available in llc.
+ The target names use all lower case. The current set of targets is:
- ``arm, cpp, hexagon, mips, mipsel, msp430, powerpc, ptx, sparc, spu,
- systemz, x86, x86_64, xcore``.
+ ``aarch64, arm, arm64, cpp, hexagon, mips, mipsel, mips64, mips64el, msp430,
+ powerpc, nvptx, r600, sparc, systemz, x86, x86_64, xcore``.
``--enable-doxygen``
@@ -743,7 +742,7 @@ builds:
Debug Builds
- These builds are the default when one is using an Subversion checkout and
+ These builds are the default when one is using a Subversion checkout and
types ``gmake`` (unless the ``--enable-optimized`` option was used during
configuration). The build system will compile the tools and libraries with
debugging information. To get a Debug Build using the LLVM distribution the
diff --git a/docs/GettingStartedVS.rst b/docs/GettingStartedVS.rst
index aa980d2..d914cc1 100644
--- a/docs/GettingStartedVS.rst
+++ b/docs/GettingStartedVS.rst
@@ -99,6 +99,9 @@ Here's the short story for getting up and running quickly with LLVM:
build.
* See the :doc:`LLVM CMake guide <CMake>` for detailed information about
how to configure the LLVM build.
+ * CMake generates project files for all build types. To select a specific
+ build type, use the Configuration manager from the VS IDE or the
+ ``/property:Configuration`` command line option when using MSBuild.
6. Start Visual Studio
diff --git a/docs/LangRef.rst b/docs/LangRef.rst
index fa40363..cc9656a 100644
--- a/docs/LangRef.rst
+++ b/docs/LangRef.rst
@@ -117,8 +117,8 @@ And the hard way:
.. code-block:: llvm
- %0 = add i32 %X, %X ; yields {i32}:%0
- %1 = add i32 %0, %0 ; yields {i32}:%1
+ %0 = add i32 %X, %X ; yields i32:%0
+ %1 = add i32 %0, %0 ; yields i32:%1
%result = add i32 %1, %1
This last way of multiplying ``%X`` by 8 illustrates several important
@@ -464,6 +464,34 @@ DLL storage class:
exists for defining a dll interface, the compiler, assembler and linker know
it is externally referenced and must refrain from deleting the symbol.
+.. _tls_model:
+
+Thread Local Storage Models
+---------------------------
+
+A variable may be defined as ``thread_local``, which means that it will
+not be shared by threads (each thread will have a separated copy of the
+variable). Not all targets support thread-local variables. Optionally, a
+TLS model may be specified:
+
+``localdynamic``
+ For variables that are only used within the current shared library.
+``initialexec``
+ For variables in modules that will not be loaded dynamically.
+``localexec``
+ For variables defined in the executable and only used within it.
+
+If no explicit model is given, the "general dynamic" model is used.
+
+The models correspond to the ELF TLS models; see `ELF Handling For
+Thread-Local Storage <http://people.redhat.com/drepper/tls.pdf>`_ for
+more information on under which circumstances the different models may
+be used. The target may choose a different TLS model if the specified
+model is not supported, or if a better choice of model can be made.
+
+A model can also be specified in an alias, but then it only governs how
+the alias is accessed. It will not have any effect on the aliasee.
+
.. _namedtypes:
Structure Types
@@ -491,29 +519,13 @@ Global Variables
Global variables define regions of memory allocated at compilation time
instead of run-time.
-Global variables definitions must be initialized, may have an explicit section
-to be placed in, and may have an optional explicit alignment specified.
+Global variables definitions must be initialized.
Global variables in other translation units can also be declared, in which
case they don't have an initializer.
-A variable may be defined as ``thread_local``, which means that it will
-not be shared by threads (each thread will have a separated copy of the
-variable). Not all targets support thread-local variables. Optionally, a
-TLS model may be specified:
-
-``localdynamic``
- For variables that are only used within the current shared library.
-``initialexec``
- For variables in modules that will not be loaded dynamically.
-``localexec``
- For variables defined in the executable and only used within it.
-
-The models correspond to the ELF TLS models; see `ELF Handling For
-Thread-Local Storage <http://people.redhat.com/drepper/tls.pdf>`_ for
-more information on under which circumstances the different models may
-be used. The target may choose a different TLS model if the specified
-model is not supported, or if a better choice of model can be made.
+Either global variable definitions or declarations may have an explicit section
+to be placed in and may have an optional explicit alignment specified.
A variable may be defined as a global ``constant``, which indicates that
the contents of the variable will **never** be modified (enabling better
@@ -550,6 +562,8 @@ is zero. The address space qualifier must precede any other attributes.
LLVM allows an explicit section to be specified for globals. If the
target supports it, it will emit globals to the section specified.
+Additionally, the global can be placed in a comdat if the target has the necessary
+support.
By default, global initializers are optimized by assuming that global
variables defined within the module are not modified from their
@@ -572,11 +586,14 @@ iteration.
Globals can also have a :ref:`DLL storage class <dllstorageclass>`.
+Variables and aliases can have a
+:ref:`Thread Local Storage Model <tls_model>`.
+
Syntax::
[@<GlobalVarName> =] [Linkage] [Visibility] [DLLStorageClass] [ThreadLocal]
- [AddrSpace] [unnamed_addr] [ExternallyInitialized]
- <global | constant> <Type>
+ [unnamed_addr] [AddrSpace] [ExternallyInitialized]
+ <global | constant> <Type> [<InitializerConstant>]
[, section "name"] [, align <Alignment>]
For example, the following defines a global in a numbered address space
@@ -612,8 +629,9 @@ an optional ``unnamed_addr`` attribute, a return type, an optional
:ref:`parameter attribute <paramattrs>` for the return type, a function
name, a (possibly empty) argument list (each with optional :ref:`parameter
attributes <paramattrs>`), optional :ref:`function attributes <fnattrs>`,
-an optional section, an optional alignment, an optional :ref:`garbage
-collector name <gc>`, an optional :ref:`prefix <prefixdata>`, an opening
+an optional section, an optional alignment,
+an optional :ref:`comdat <langref_comdats>`,
+an optional :ref:`garbage collector name <gc>`, an optional :ref:`prefix <prefixdata>`, an opening
curly brace, a list of basic blocks, and a closing curly brace.
LLVM function declarations consist of the "``declare``" keyword, an
@@ -643,6 +661,7 @@ predecessors, it also cannot have any :ref:`PHI nodes <i_phi>`.
LLVM allows an explicit section to be specified for functions. If the
target supports it, it will emit functions to the section specified.
+Additionally, the function can be placed in a COMDAT.
An explicit alignment may be specified for a function. If not present,
or if the alignment is set to zero, the alignment of the function is set
@@ -658,37 +677,131 @@ Syntax::
define [linkage] [visibility] [DLLStorageClass]
[cconv] [ret attrs]
<ResultType> @<FunctionName> ([argument list])
- [unnamed_addr] [fn Attrs] [section "name"] [align N]
- [gc] [prefix Constant] { ... }
+ [unnamed_addr] [fn Attrs] [section "name"] [comdat $<ComdatName>]
+ [align N] [gc] [prefix Constant] { ... }
.. _langref_aliases:
Aliases
-------
-Aliases act as "second name" for the aliasee value (which can be either
-function, global variable, another alias or bitcast of global value).
+Aliases, unlike functions or variables, don't create any new data. They
+are just a new symbol and metadata for an existing position.
+
+Aliases have a name and an aliasee that is either a global value or a
+constant expression.
+
Aliases may have an optional :ref:`linkage type <linkage>`, an optional
-:ref:`visibility style <visibility>`, and an optional :ref:`DLL storage class
-<dllstorageclass>`.
+:ref:`visibility style <visibility>`, an optional :ref:`DLL storage class
+<dllstorageclass>` and an optional :ref:`tls model <tls_model>`.
Syntax::
- @<Name> = [Visibility] [DLLStorageClass] alias [Linkage] <AliaseeTy> @<Aliasee>
+ @<Name> = [Visibility] [DLLStorageClass] [ThreadLocal] [unnamed_addr] alias [Linkage] <AliaseeTy> @<Aliasee>
The linkage must be one of ``private``, ``internal``, ``linkonce``, ``weak``,
``linkonce_odr``, ``weak_odr``, ``external``. Note that some system linkers
-might not correctly handle dropping a weak symbol that is aliased by a non-weak
-alias.
+might not correctly handle dropping a weak symbol that is aliased.
Alias that are not ``unnamed_addr`` are guaranteed to have the same address as
-the aliasee.
+the aliasee expression. ``unnamed_addr`` ones are only guaranteed to point
+to the same content.
+
+Since aliases are only a second name, some restrictions apply, of which
+some can only be checked when producing an object file:
+
+* The expression defining the aliasee must be computable at assembly
+ time. Since it is just a name, no relocations can be used.
+
+* No alias in the expression can be weak as the possibility of the
+ intermediate alias being overridden cannot be represented in an
+ object file.
-The aliasee must be a definition.
+* No global value in the expression can be a declaration, since that
+ would require a relocation, which is not possible.
-Aliases are not allowed to point to aliases with linkages that can be
-overridden. Since they are only a second name, the possibility of the
-intermediate alias being overridden cannot be represented in an object file.
+.. _langref_comdats:
+
+Comdats
+-------
+
+Comdat IR provides access to COFF and ELF object file COMDAT functionality.
+
+Comdats have a name which represents the COMDAT key. All global objects which
+specify this key will only end up in the final object file if the linker chooses
+that key over some other key. Aliases are placed in the same COMDAT that their
+aliasee computes to, if any.
+
+Comdats have a selection kind to provide input on how the linker should
+choose between keys in two different object files.
+
+Syntax::
+
+ $<Name> = comdat SelectionKind
+
+The selection kind must be one of the following:
+
+``any``
+ The linker may choose any COMDAT key; the choice is arbitrary.
+``exactmatch``
+ The linker may choose any COMDAT key but the sections must contain the
+ same data.
+``largest``
+ The linker will choose the section containing the largest COMDAT key.
+``noduplicates``
+ The linker requires that only one section with this COMDAT key exists.
+``samesize``
+ The linker may choose any COMDAT key but the sections must contain the
+ same amount of data.
+
+Note that the Mach-O platform doesn't support COMDATs and ELF only supports
+``any`` as a selection kind.
+
+Here is an example of a COMDAT group where a function will only be selected if
+the COMDAT key's section is the largest:
+
+.. code-block:: llvm
+
+ $foo = comdat largest
+ @foo = global i32 2, comdat $foo
+
+ define void @bar() comdat $foo {
+ ret void
+ }
+
+In a COFF object file, this will create a COMDAT section with selection kind
+``IMAGE_COMDAT_SELECT_LARGEST`` containing the contents of the ``@foo`` symbol
+and another COMDAT section with selection kind
+``IMAGE_COMDAT_SELECT_ASSOCIATIVE`` which is associated with the first COMDAT
+section and contains the contents of the ``@bar`` symbol.
+
+There are some restrictions on the properties of the global object.
+It, or an alias to it, must have the same name as the COMDAT group when
+targeting COFF.
+The contents and size of this object may be used during link-time to determine
+which COMDAT groups get selected depending on the selection kind.
+Because the name of the object must match the name of the COMDAT group, the
+linkage of the global object must not be local; local symbols can get renamed
+if a collision occurs in the symbol table.
+
+The combined use of COMDATs and section attributes may yield surprising results.
+For example:
+
+.. code-block:: llvm
+
+ $foo = comdat any
+ $bar = comdat any
+ @g1 = global i32 42, section "sec", comdat $foo
+ @g2 = global i32 42, section "sec", comdat $bar
+
+From the object file perspective, this requires the creation of two sections
+with the same name. This is necessary because both globals belong to different
+COMDAT groups and COMDATs, at the object file level, are represented by
+sections.
+
+Note that certain IR constructs like global variables and functions may create
+COMDATs in the object file in addition to any which are specified using COMDAT
+IR. This arises, for example, when a global variable has linkonce_odr linkage.
.. _namedmetadatastructure:
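Reviewer note on the Comdats section added in this hunk: the five selection kinds can be read as rules a linker applies to the duplicate sections that share one COMDAT key. The Python sketch below is a toy model of those rules as worded in the new text, not LLVM or linker code. The function name ``select_comdat`` and the use of ``bytes`` objects to stand in for section contents are assumptions made for illustration.

```python
def select_comdat(kind, candidates):
    """Pick one section among duplicates that share a COMDAT key.

    `candidates` is a non-empty list of bytes objects, each a hypothetical
    stand-in for the contents of one duplicate section.
    """
    if not candidates:
        raise ValueError("no candidate sections")
    first = candidates[0]
    if kind == "any":
        # Arbitrary choice; this model simply takes the first one seen.
        return first
    if kind == "exactmatch":
        if any(c != first for c in candidates):
            raise ValueError("exactmatch: section contents differ")
        return first
    if kind == "largest":
        return max(candidates, key=len)
    if kind == "noduplicates":
        if len(candidates) != 1:
            raise ValueError("noduplicates: more than one section with this key")
        return first
    if kind == "samesize":
        if any(len(c) != len(first) for c in candidates):
            raise ValueError("samesize: section sizes differ")
        return first
    raise ValueError("unknown selection kind: " + kind)
```

For example, ``select_comdat("largest", [b"ab", b"abcd"])`` returns ``b"abcd"``, while ``select_comdat("noduplicates", [b"a", b"a"])`` raises an error. As the added text notes, ELF only supports ``any``, so the other branches model COFF behaviour only.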
@@ -997,6 +1110,14 @@ example:
inlining this function is desirable (such as the "inline" keyword in
C/C++). It is just a hint; it imposes no requirements on the
inliner.
+``jumptable``
+ This attribute indicates that the function should be added to a
+ jump-instruction table at code-generation time, and that all address-taken
+ references to this function should be replaced with a reference to the
+ appropriate jump-instruction-table function pointer. Note that this creates
+ a new pointer for the original function, which means that code that depends
+ on function-pointer identity can break. So, any function annotated with
+ ``jumptable`` must also be ``unnamed_addr``.
``minsize``
This attribute suggests that optimization passes and code generator
passes make choices that keep the code size of this function as small
@@ -2715,11 +2836,12 @@ number representing the maximum relative error, for example:
'``range``' Metadata
^^^^^^^^^^^^^^^^^^^^
-``range`` metadata may be attached only to loads of integer types. It
-expresses the possible ranges the loaded value is in. The ranges are
-represented with a flattened list of integers. The loaded value is known
-to be in the union of the ranges defined by each consecutive pair. Each
-pair has the following properties:
+``range`` metadata may be attached only to ``load``, ``call`` and ``invoke`` of
+integer types. It expresses the possible ranges the loaded value or the value
+returned by the called function at this call site is in. The ranges are
+represented with a flattened list of integers. The loaded value or the value
+returned is known to be in the union of the ranges defined by each consecutive
+pair. Each pair has the following properties:
- The type must match the type loaded by the instruction.
- The pair ``a,b`` represents the range ``[a,b)``.
@@ -2737,8 +2859,9 @@ Examples:
%a = load i8* %x, align 1, !range !0 ; Can only be 0 or 1
%b = load i8* %y, align 1, !range !1 ; Can only be 255 (-1), 0 or 1
- %c = load i8* %z, align 1, !range !2 ; Can only be 0, 1, 3, 4 or 5
- %d = load i8* %z, align 1, !range !3 ; Can only be -2, -1, 3, 4 or 5
+ %c = call i8 @foo(), !range !2 ; Can only be 0, 1, 3, 4 or 5
+ %d = invoke i8 @bar() to label %cont
+ unwind label %lpad, !range !3 ; Can only be -2, -1, 3, 4 or 5
...
!0 = metadata !{ i8 0, i8 2 }
!1 = metadata !{ i8 255, i8 2 }
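Reviewer note on the ``range`` metadata hunks above: the flattened operand list is easier to follow when expanded into the set of values it permits. The Python sketch below does that for unsigned values of a given bit width, treating each pair ``a,b`` as the half-open interval ``[a,b)`` and letting it wrap, which is what the ``i8 255, i8 2`` example implies. The function name ``allowed_values`` is hypothetical, and rejecting ``a == b`` is an assumption of this sketch.

```python
def allowed_values(flat_range, bits=8):
    """Expand a flattened !range operand list into the set of permitted values.

    Values are treated as unsigned integers of width `bits`. Each pair (a, b)
    denotes the half-open interval [a, b) and may wrap past the maximum value.
    """
    if len(flat_range) % 2 != 0:
        raise ValueError("operands must come in (a, b) pairs")
    modulus = 1 << bits
    allowed = set()
    for a, b in zip(flat_range[0::2], flat_range[1::2]):
        a %= modulus
        b %= modulus
        if a == b:
            # Assumption: an empty or full range is rejected, not guessed at.
            raise ValueError("a pair must not have a == b")
        value = a
        while value != b:
            allowed.add(value)
            value = (value + 1) % modulus
    return allowed
```

With this model, ``allowed_values([0, 2])`` gives ``{0, 1}`` and ``allowed_values([255, 2])`` gives ``{0, 1, 255}``, matching the comments on ``%a`` and ``%b``. A list such as ``[0, 2, 3, 6]`` (assumed here, since the defining node is not shown in this hunk) gives ``{0, 1, 3, 4, 5}``, the set stated for ``%c``.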
@@ -2768,7 +2891,7 @@ constructs:
The loop identifier metadata can be used to specify additional per-loop
metadata. Any operands after the first operand can be treated as user-defined
-metadata. For example the ``llvm.vectorizer.unroll`` metadata is understood
+metadata. For example the ``llvm.loop.vectorize.unroll`` metadata is understood
by the loop vectorizer to indicate how many times to unroll the loop:
.. code-block:: llvm
@@ -2776,7 +2899,7 @@ by the loop vectorizer to indicate how many times to unroll the loop:
br i1 %exitcond, label %._crit_edge, label %.lr.ph, !llvm.loop !0
...
!0 = metadata !{ metadata !0, metadata !1 }
- !1 = metadata !{ metadata !"llvm.vectorizer.unroll", i32 2 }
+ !1 = metadata !{ metadata !"llvm.loop.vectorize.unroll", i32 2 }
'``llvm.mem``'
^^^^^^^^^^^^^^^
@@ -2796,7 +2919,7 @@ with the same loop identifier.
Precisely, given two instructions ``m1`` and ``m2`` that both have the
``llvm.mem.parallel_loop_access`` metadata, with ``L1`` and ``L2`` being the
set of loops associated with that metadata, respectively, then there is no loop
-carried dependence between ``m1`` and ``m2`` for loops ``L1`` or
+carried dependence between ``m1`` and ``m2`` for loops in both ``L1`` and
``L2``.
As a special case, if all memory accessing instructions in a loop have
@@ -2861,54 +2984,54 @@ the loop identifier metadata node directly:
!1 = metadata !{ metadata !1 } ; an identifier for the inner loop
!2 = metadata !{ metadata !2 } ; an identifier for the outer loop
-'``llvm.vectorizer``'
-^^^^^^^^^^^^^^^^^^^^^
+'``llvm.loop.vectorize``'
+^^^^^^^^^^^^^^^^^^^^^^^^^
-Metadata prefixed with ``llvm.vectorizer`` is used to control per-loop
+Metadata prefixed with ``llvm.loop.vectorize`` is used to control per-loop
vectorization parameters such as vectorization factor and unroll factor.
-``llvm.vectorizer`` metadata should be used in conjunction with ``llvm.loop``
-loop identification metadata.
+``llvm.loop.vectorize`` metadata should be used in conjunction with
+``llvm.loop`` loop identification metadata.
-'``llvm.vectorizer.unroll``' Metadata
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+'``llvm.loop.vectorize.unroll``' Metadata
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
This metadata instructs the loop vectorizer to unroll the specified
loop exactly ``N`` times.
-The first operand is the string ``llvm.vectorizer.unroll`` and the second
+The first operand is the string ``llvm.loop.vectorize.unroll`` and the second
operand is an integer specifying the unroll factor. For example:
.. code-block:: llvm
- !0 = metadata !{ metadata !"llvm.vectorizer.unroll", i32 4 }
+ !0 = metadata !{ metadata !"llvm.loop.vectorize.unroll", i32 4 }
-Note that setting ``llvm.vectorizer.unroll`` to 1 disables unrolling of the
-loop.
+Note that setting ``llvm.loop.vectorize.unroll`` to 1 disables
+unrolling of the loop.
-If ``llvm.vectorizer.unroll`` is set to 0 then the amount of unrolling will be
-determined automatically.
+If ``llvm.loop.vectorize.unroll`` is set to 0 then the amount of
+unrolling will be determined automatically.
-'``llvm.vectorizer.width``' Metadata
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+'``llvm.loop.vectorize.width``' Metadata
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
This metadata sets the target width of the vectorizer to ``N``. Without
this metadata, the vectorizer will choose a width automatically.
Regardless of this metadata, the vectorizer will only vectorize loops if
it believes it is valid to do so.
-The first operand is the string ``llvm.vectorizer.width`` and the second
-operand is an integer specifying the width. For example:
+The first operand is the string ``llvm.loop.vectorize.width`` and the
+second operand is an integer specifying the width. For example:
.. code-block:: llvm
- !0 = metadata !{ metadata !"llvm.vectorizer.width", i32 4 }
+ !0 = metadata !{ metadata !"llvm.loop.vectorize.width", i32 4 }
-Note that setting ``llvm.vectorizer.width`` to 1 disables vectorization of the
-loop.
+Note that setting ``llvm.loop.vectorize.width`` to 1 disables
+vectorization of the loop.
-If ``llvm.vectorizer.width`` is set to 0 then the width will be determined
-automatically.
+If ``llvm.loop.vectorize.width`` is set to 0 then the width will be
+determined automatically.
Module Flags Metadata
=====================
@@ -3110,6 +3233,42 @@ Each individual option is required to be either a valid option for the target's
linker, or an option that is reserved by the target specific assembly writer or
object file emitter. No other aspect of these options is defined by the IR.
+C type width Module Flags Metadata
+----------------------------------
+
+The ARM backend emits a section into each generated object file describing the
+options that it was compiled with (in a compiler-independent way) to prevent
+linking incompatible objects, and to allow automatic library selection. Some
+of these options are not visible at the IR level, namely wchar_t width and enum
+width.
+
+To pass this information to the backend, these options are encoded in module
+flags metadata, using the following key-value pairs:
+
+.. list-table::
+ :header-rows: 1
+ :widths: 30 70
+
+ * - Key
+ - Value
+
+ * - short_wchar
+ - * 0 --- sizeof(wchar_t) == 4
+ * 1 --- sizeof(wchar_t) == 2
+
+ * - short_enum
+ - * 0 --- Enums are at least as large as an ``int``.
+ * 1 --- Enums are stored in the smallest integer type which can
+ represent all of their values.
+
+For example, the following metadata section specifies that the module was
+compiled with a ``wchar_t`` width of 2 bytes, and that enums are at least as
+large as an ``int``::
+
+ !llvm.module.flags = !{!0, !1}
+ !0 = metadata !{i32 1, metadata !"short_wchar", i32 1}
+ !1 = metadata !{i32 1, metadata !"short_enum", i32 0}
+
.. _intrinsicglobalvariables:
Intrinsic Global Variables
@@ -3543,9 +3702,9 @@ Example:
.. code-block:: llvm
%retval = invoke i32 @Test(i32 15) to label %Continue
- unwind label %TestCleanup ; {i32}:retval set
+ unwind label %TestCleanup ; i32:retval set
%retval = invoke coldcc i32 %Testfnptr(i32 15) to label %Continue
- unwind label %TestCleanup ; {i32}:retval set
+ unwind label %TestCleanup ; i32:retval set
.. _i_resume:
@@ -3634,10 +3793,10 @@ Syntax:
::
- <result> = add <ty> <op1>, <op2> ; yields {ty}:result
- <result> = add nuw <ty> <op1>, <op2> ; yields {ty}:result
- <result> = add nsw <ty> <op1>, <op2> ; yields {ty}:result
- <result> = add nuw nsw <ty> <op1>, <op2> ; yields {ty}:result
+ <result> = add <ty> <op1>, <op2> ; yields ty:result
+ <result> = add nuw <ty> <op1>, <op2> ; yields ty:result
+ <result> = add nsw <ty> <op1>, <op2> ; yields ty:result
+ <result> = add nuw nsw <ty> <op1>, <op2> ; yields ty:result
Overview:
"""""""""
@@ -3673,7 +3832,7 @@ Example:
.. code-block:: llvm
- <result> = add i32 4, %var ; yields {i32}:result = 4 + %var
+ <result> = add i32 4, %var ; yields i32:result = 4 + %var
.. _i_fadd:
@@ -3685,7 +3844,7 @@ Syntax:
::
- <result> = fadd [fast-math flags]* <ty> <op1>, <op2> ; yields {ty}:result
+ <result> = fadd [fast-math flags]* <ty> <op1>, <op2> ; yields ty:result
Overview:
"""""""""
@@ -3712,7 +3871,7 @@ Example:
.. code-block:: llvm
- <result> = fadd float 4.0, %var ; yields {float}:result = 4.0 + %var
+ <result> = fadd float 4.0, %var ; yields float:result = 4.0 + %var
'``sub``' Instruction
^^^^^^^^^^^^^^^^^^^^^
@@ -3722,10 +3881,10 @@ Syntax:
::
- <result> = sub <ty> <op1>, <op2> ; yields {ty}:result
- <result> = sub nuw <ty> <op1>, <op2> ; yields {ty}:result
- <result> = sub nsw <ty> <op1>, <op2> ; yields {ty}:result
- <result> = sub nuw nsw <ty> <op1>, <op2> ; yields {ty}:result
+ <result> = sub <ty> <op1>, <op2> ; yields ty:result
+ <result> = sub nuw <ty> <op1>, <op2> ; yields ty:result
+ <result> = sub nsw <ty> <op1>, <op2> ; yields ty:result
+ <result> = sub nuw nsw <ty> <op1>, <op2> ; yields ty:result
Overview:
"""""""""
@@ -3764,8 +3923,8 @@ Example:
.. code-block:: llvm
- <result> = sub i32 4, %var ; yields {i32}:result = 4 - %var
- <result> = sub i32 0, %val ; yields {i32}:result = -%var
+ <result> = sub i32 4, %var ; yields i32:result = 4 - %var
+ <result> = sub i32 0, %var ; yields i32:result = -%var
.. _i_fsub:
@@ -3777,7 +3936,7 @@ Syntax:
::
- <result> = fsub [fast-math flags]* <ty> <op1>, <op2> ; yields {ty}:result
+ <result> = fsub [fast-math flags]* <ty> <op1>, <op2> ; yields ty:result
Overview:
"""""""""
@@ -3807,8 +3966,8 @@ Example:
.. code-block:: llvm
- <result> = fsub float 4.0, %var ; yields {float}:result = 4.0 - %var
- <result> = fsub float -0.0, %val ; yields {float}:result = -%var
+ <result> = fsub float 4.0, %var ; yields float:result = 4.0 - %var
+ <result> = fsub float -0.0, %var ; yields float:result = -%var
'``mul``' Instruction
^^^^^^^^^^^^^^^^^^^^^
@@ -3818,10 +3977,10 @@ Syntax:
::
- <result> = mul <ty> <op1>, <op2> ; yields {ty}:result
- <result> = mul nuw <ty> <op1>, <op2> ; yields {ty}:result
- <result> = mul nsw <ty> <op1>, <op2> ; yields {ty}:result
- <result> = mul nuw nsw <ty> <op1>, <op2> ; yields {ty}:result
+ <result> = mul <ty> <op1>, <op2> ; yields ty:result
+ <result> = mul nuw <ty> <op1>, <op2> ; yields ty:result
+ <result> = mul nsw <ty> <op1>, <op2> ; yields ty:result
+ <result> = mul nuw nsw <ty> <op1>, <op2> ; yields ty:result
Overview:
"""""""""
@@ -3861,7 +4020,7 @@ Example:
.. code-block:: llvm
- <result> = mul i32 4, %var ; yields {i32}:result = 4 * %var
+ <result> = mul i32 4, %var ; yields i32:result = 4 * %var
.. _i_fmul:
@@ -3873,7 +4032,7 @@ Syntax:
::
- <result> = fmul [fast-math flags]* <ty> <op1>, <op2> ; yields {ty}:result
+ <result> = fmul [fast-math flags]* <ty> <op1>, <op2> ; yields ty:result
Overview:
"""""""""
@@ -3900,7 +4059,7 @@ Example:
.. code-block:: llvm
- <result> = fmul float 4.0, %var ; yields {float}:result = 4.0 * %var
+ <result> = fmul float 4.0, %var ; yields float:result = 4.0 * %var
'``udiv``' Instruction
^^^^^^^^^^^^^^^^^^^^^^
@@ -3910,8 +4069,8 @@ Syntax:
::
- <result> = udiv <ty> <op1>, <op2> ; yields {ty}:result
- <result> = udiv exact <ty> <op1>, <op2> ; yields {ty}:result
+ <result> = udiv <ty> <op1>, <op2> ; yields ty:result
+ <result> = udiv exact <ty> <op1>, <op2> ; yields ty:result
Overview:
"""""""""
@@ -3944,7 +4103,7 @@ Example:
.. code-block:: llvm
- <result> = udiv i32 4, %var ; yields {i32}:result = 4 / %var
+ <result> = udiv i32 4, %var ; yields i32:result = 4 / %var
'``sdiv``' Instruction
^^^^^^^^^^^^^^^^^^^^^^
@@ -3954,8 +4113,8 @@ Syntax:
::
- <result> = sdiv <ty> <op1>, <op2> ; yields {ty}:result
- <result> = sdiv exact <ty> <op1>, <op2> ; yields {ty}:result
+ <result> = sdiv <ty> <op1>, <op2> ; yields ty:result
+ <result> = sdiv exact <ty> <op1>, <op2> ; yields ty:result
Overview:
"""""""""
@@ -3990,7 +4149,7 @@ Example:
.. code-block:: llvm
- <result> = sdiv i32 4, %var ; yields {i32}:result = 4 / %var
+ <result> = sdiv i32 4, %var ; yields i32:result = 4 / %var
.. _i_fdiv:
@@ -4002,7 +4161,7 @@ Syntax:
::
- <result> = fdiv [fast-math flags]* <ty> <op1>, <op2> ; yields {ty}:result
+ <result> = fdiv [fast-math flags]* <ty> <op1>, <op2> ; yields ty:result
Overview:
"""""""""
@@ -4029,7 +4188,7 @@ Example:
.. code-block:: llvm
- <result> = fdiv float 4.0, %var ; yields {float}:result = 4.0 / %var
+ <result> = fdiv float 4.0, %var ; yields float:result = 4.0 / %var
'``urem``' Instruction
^^^^^^^^^^^^^^^^^^^^^^
@@ -4039,7 +4198,7 @@ Syntax:
::
- <result> = urem <ty> <op1>, <op2> ; yields {ty}:result
+ <result> = urem <ty> <op1>, <op2> ; yields ty:result
Overview:
"""""""""
@@ -4071,7 +4230,7 @@ Example:
.. code-block:: llvm
- <result> = urem i32 4, %var ; yields {i32}:result = 4 % %var
+ <result> = urem i32 4, %var ; yields i32:result = 4 % %var
'``srem``' Instruction
^^^^^^^^^^^^^^^^^^^^^^
@@ -4081,7 +4240,7 @@ Syntax:
::
- <result> = srem <ty> <op1>, <op2> ; yields {ty}:result
+ <result> = srem <ty> <op1>, <op2> ; yields ty:result
Overview:
"""""""""
@@ -4126,7 +4285,7 @@ Example:
.. code-block:: llvm
- <result> = srem i32 4, %var ; yields {i32}:result = 4 % %var
+ <result> = srem i32 4, %var ; yields i32:result = 4 % %var
.. _i_frem:
@@ -4138,7 +4297,7 @@ Syntax:
::
- <result> = frem [fast-math flags]* <ty> <op1>, <op2> ; yields {ty}:result
+ <result> = frem [fast-math flags]* <ty> <op1>, <op2> ; yields ty:result
Overview:
"""""""""
@@ -4166,7 +4325,7 @@ Example:
.. code-block:: llvm
- <result> = frem float 4.0, %var ; yields {float}:result = 4.0 % %var
+ <result> = frem float 4.0, %var ; yields float:result = 4.0 % %var
.. _bitwiseops:
@@ -4187,10 +4346,10 @@ Syntax:
::
- <result> = shl <ty> <op1>, <op2> ; yields {ty}:result
- <result> = shl nuw <ty> <op1>, <op2> ; yields {ty}:result
- <result> = shl nsw <ty> <op1>, <op2> ; yields {ty}:result
- <result> = shl nuw nsw <ty> <op1>, <op2> ; yields {ty}:result
+ <result> = shl <ty> <op1>, <op2> ; yields ty:result
+ <result> = shl nuw <ty> <op1>, <op2> ; yields ty:result
+ <result> = shl nsw <ty> <op1>, <op2> ; yields ty:result
+ <result> = shl nuw nsw <ty> <op1>, <op2> ; yields ty:result
Overview:
"""""""""
@@ -4228,9 +4387,9 @@ Example:
.. code-block:: llvm
- <result> = shl i32 4, %var ; yields {i32}: 4 << %var
- <result> = shl i32 4, 2 ; yields {i32}: 16
- <result> = shl i32 1, 10 ; yields {i32}: 1024
+ <result> = shl i32 4, %var ; yields i32: 4 << %var
+ <result> = shl i32 4, 2 ; yields i32: 16
+ <result> = shl i32 1, 10 ; yields i32: 1024
<result> = shl i32 1, 32 ; undefined
<result> = shl <2 x i32> < i32 1, i32 1>, < i32 1, i32 2> ; yields: result=<2 x i32> < i32 2, i32 4>
@@ -4242,8 +4401,8 @@ Syntax:
::
- <result> = lshr <ty> <op1>, <op2> ; yields {ty}:result
- <result> = lshr exact <ty> <op1>, <op2> ; yields {ty}:result
+ <result> = lshr <ty> <op1>, <op2> ; yields ty:result
+ <result> = lshr exact <ty> <op1>, <op2> ; yields ty:result
Overview:
"""""""""
@@ -4277,10 +4436,10 @@ Example:
.. code-block:: llvm
- <result> = lshr i32 4, 1 ; yields {i32}:result = 2
- <result> = lshr i32 4, 2 ; yields {i32}:result = 1
- <result> = lshr i8 4, 3 ; yields {i8}:result = 0
- <result> = lshr i8 -2, 1 ; yields {i8}:result = 0x7F
+ <result> = lshr i32 4, 1 ; yields i32:result = 2
+ <result> = lshr i32 4, 2 ; yields i32:result = 1
+ <result> = lshr i8 4, 3 ; yields i8:result = 0
+ <result> = lshr i8 -2, 1 ; yields i8:result = 0x7F
<result> = lshr i32 1, 32 ; undefined
<result> = lshr <2 x i32> < i32 -2, i32 4>, < i32 1, i32 2> ; yields: result=<2 x i32> < i32 0x7FFFFFFF, i32 1>
@@ -4292,8 +4451,8 @@ Syntax:
::
- <result> = ashr <ty> <op1>, <op2> ; yields {ty}:result
- <result> = ashr exact <ty> <op1>, <op2> ; yields {ty}:result
+ <result> = ashr <ty> <op1>, <op2> ; yields ty:result
+ <result> = ashr exact <ty> <op1>, <op2> ; yields ty:result
Overview:
"""""""""
@@ -4328,10 +4487,10 @@ Example:
.. code-block:: llvm
- <result> = ashr i32 4, 1 ; yields {i32}:result = 2
- <result> = ashr i32 4, 2 ; yields {i32}:result = 1
- <result> = ashr i8 4, 3 ; yields {i8}:result = 0
- <result> = ashr i8 -2, 1 ; yields {i8}:result = -1
+ <result> = ashr i32 4, 1 ; yields i32:result = 2
+ <result> = ashr i32 4, 2 ; yields i32:result = 1
+ <result> = ashr i8 4, 3 ; yields i8:result = 0
+ <result> = ashr i8 -2, 1 ; yields i8:result = -1
<result> = ashr i32 1, 32 ; undefined
<result> = ashr <2 x i32> < i32 -2, i32 4>, < i32 1, i32 3> ; yields: result=<2 x i32> < i32 -1, i32 0>
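The ``lshr`` and ``ashr`` examples above differ only in how the vacated high bits are filled: with zeros for ``lshr``, with copies of the sign bit for ``ashr``. The following is a minimal C++ sketch of that difference for 8-bit operands. The helper names ``lshr8`` and ``ashr8`` are hypothetical and exist only for this illustration; they are not part of LLVM.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers for illustration only; not an LLVM API.

// Logical shift right: vacated high bits are filled with zeros.
// As in the IR, the shift amount must be smaller than the bit width.
std::uint8_t lshr8(std::uint8_t value, unsigned amount) {
  assert(amount < 8 && "shift amount must be less than the bit width");
  return static_cast<std::uint8_t>(value >> amount);
}

// Arithmetic shift right: vacated high bits are filled with the sign bit.
// Implemented on the unsigned bit pattern so that it does not depend on
// how the compiler treats >> on negative signed values.
std::int8_t ashr8(std::int8_t value, unsigned amount) {
  assert(amount < 8 && "shift amount must be less than the bit width");
  std::uint8_t bits = static_cast<std::uint8_t>(value);
  std::uint8_t shifted = static_cast<std::uint8_t>(bits >> amount);
  if (value < 0) {
    // Replicate the sign bit into the top `amount` bit positions.
    shifted |= static_cast<std::uint8_t>(~(0xFFu >> amount));
  }
  // Assumes the usual two's complement conversion back to a signed type.
  return static_cast<std::int8_t>(shifted);
}
```

With these helpers, ``lshr8(0xFE, 1)`` evaluates to ``0x7F`` and ``ashr8(-2, 1)`` evaluates to ``-1``, matching the two ``i8 -2, 1`` rows in the examples.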
@@ -4343,7 +4502,7 @@ Syntax:
::
- <result> = and <ty> <op1>, <op2> ; yields {ty}:result
+ <result> = and <ty> <op1>, <op2> ; yields ty:result
Overview:
"""""""""
@@ -4380,9 +4539,9 @@ Example:
.. code-block:: llvm
- <result> = and i32 4, %var ; yields {i32}:result = 4 & %var
- <result> = and i32 15, 40 ; yields {i32}:result = 8
- <result> = and i32 4, 8 ; yields {i32}:result = 0
+ <result> = and i32 4, %var ; yields i32:result = 4 & %var
+ <result> = and i32 15, 40 ; yields i32:result = 8
+ <result> = and i32 4, 8 ; yields i32:result = 0
'``or``' Instruction
^^^^^^^^^^^^^^^^^^^^
@@ -4392,7 +4551,7 @@ Syntax:
::
- <result> = or <ty> <op1>, <op2> ; yields {ty}:result
+ <result> = or <ty> <op1>, <op2> ; yields ty:result
Overview:
"""""""""
@@ -4429,9 +4588,9 @@ Example:
::
- <result> = or i32 4, %var ; yields {i32}:result = 4 | %var
- <result> = or i32 15, 40 ; yields {i32}:result = 47
- <result> = or i32 4, 8 ; yields {i32}:result = 12
+ <result> = or i32 4, %var ; yields i32:result = 4 | %var
+ <result> = or i32 15, 40 ; yields i32:result = 47
+ <result> = or i32 4, 8 ; yields i32:result = 12
'``xor``' Instruction
^^^^^^^^^^^^^^^^^^^^^
@@ -4441,7 +4600,7 @@ Syntax:
::
- <result> = xor <ty> <op1>, <op2> ; yields {ty}:result
+ <result> = xor <ty> <op1>, <op2> ; yields ty:result
Overview:
"""""""""
@@ -4479,10 +4638,10 @@ Example:
.. code-block:: llvm
- <result> = xor i32 4, %var ; yields {i32}:result = 4 ^ %var
- <result> = xor i32 15, 40 ; yields {i32}:result = 39
- <result> = xor i32 4, 8 ; yields {i32}:result = 12
- <result> = xor i32 %V, -1 ; yields {i32}:result = ~%V
+ <result> = xor i32 4, %var ; yields i32:result = 4 ^ %var
+ <result> = xor i32 15, 40 ; yields i32:result = 39
+ <result> = xor i32 4, 8 ; yields i32:result = 12
+ <result> = xor i32 %V, -1 ; yields i32:result = ~%V
Vector Operations
-----------------
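The constant results quoted in the ``and``, ``or`` and ``xor`` examples above can be checked with ordinary integer arithmetic. The sketch below does so at compile time in C++; the function names are hypothetical and chosen only to mirror the instruction names.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers mirroring the IR instructions on i32 operands.
constexpr std::uint32_t and32(std::uint32_t a, std::uint32_t b) { return a & b; }
constexpr std::uint32_t or32(std::uint32_t a, std::uint32_t b) { return a | b; }
constexpr std::uint32_t xor32(std::uint32_t a, std::uint32_t b) { return a ^ b; }

// xor with -1 (all bits set) flips every bit, which is bitwise NOT.
constexpr std::uint32_t not32(std::uint32_t v) { return xor32(v, 0xFFFFFFFFu); }

// 15 = 0b001111 and 40 = 0b101000.
static_assert(and32(15, 40) == 8, "only bit 3 is set in both operands");
static_assert(or32(15, 40) == 47, "0b101111");
static_assert(xor32(15, 40) == 39, "0b100111");
static_assert(and32(4, 8) == 0, "no common bits");
static_assert(or32(4, 8) == 12, "disjoint bits combine");
static_assert(xor32(4, 8) == 12, "disjoint bits combine");
static_assert(not32(0) == 0xFFFFFFFFu, "xor with -1 inverts all bits");
```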
@@ -4748,7 +4907,7 @@ Syntax:
::
- <result> = alloca [inalloca] <type> [, <ty> <NumElements>] [, align <alignment>] ; yields {type*}:result
+ <result> = alloca [inalloca] <type> [, <ty> <NumElements>] [, align <alignment>] ; yields type*:result
Overview:
"""""""""
@@ -4790,10 +4949,10 @@ Example:
.. code-block:: llvm
- %ptr = alloca i32 ; yields {i32*}:ptr
- %ptr = alloca i32, i32 4 ; yields {i32*}:ptr
- %ptr = alloca i32, i32 4, align 1024 ; yields {i32*}:ptr
- %ptr = alloca i32, align 1024 ; yields {i32*}:ptr
+ %ptr = alloca i32 ; yields i32*:ptr
+ %ptr = alloca i32, i32 4 ; yields i32*:ptr
+ %ptr = alloca i32, i32 4, align 1024 ; yields i32*:ptr
+ %ptr = alloca i32, align 1024 ; yields i32*:ptr
.. _i_load:
@@ -4876,9 +5035,9 @@ Examples:
.. code-block:: llvm
- %ptr = alloca i32 ; yields {i32*}:ptr
- store i32 3, i32* %ptr ; yields {void}
- %val = load i32* %ptr ; yields {i32}:val = i32 3
+ %ptr = alloca i32 ; yields i32*:ptr
+ store i32 3, i32* %ptr ; yields void
+ %val = load i32* %ptr ; yields i32:val = i32 3
.. _i_store:
@@ -4890,8 +5049,8 @@ Syntax:
::
- store [volatile] <ty> <value>, <ty>* <pointer>[, align <alignment>][, !nontemporal !<index>] ; yields {void}
- store atomic [volatile] <ty> <value>, <ty>* <pointer> [singlethread] <ordering>, align <alignment> ; yields {void}
+ store [volatile] <ty> <value>, <ty>* <pointer>[, align <alignment>][, !nontemporal !<index>] ; yields void
+ store atomic [volatile] <ty> <value>, <ty>* <pointer> [singlethread] <ordering>, align <alignment> ; yields void
Overview:
"""""""""
@@ -4955,9 +5114,9 @@ Example:
.. code-block:: llvm
- %ptr = alloca i32 ; yields {i32*}:ptr
- store i32 3, i32* %ptr ; yields {void}
- %val = load i32* %ptr ; yields {i32}:val = i32 3
+ %ptr = alloca i32 ; yields i32*:ptr
+ store i32 3, i32* %ptr ; yields void
+ %val = load i32* %ptr ; yields i32:val = i32 3
.. _i_fence:
@@ -4969,7 +5128,7 @@ Syntax:
::
- fence [singlethread] <ordering> ; yields {void}
+ fence [singlethread] <ordering> ; yields void
Overview:
"""""""""
@@ -5012,8 +5171,8 @@ Example:
.. code-block:: llvm
- fence acquire ; yields {void}
- fence singlethread seq_cst ; yields {void}
+ fence acquire ; yields void
+ fence singlethread seq_cst ; yields void
.. _i_cmpxchg:
@@ -5025,14 +5184,14 @@ Syntax:
::
- cmpxchg [volatile] <ty>* <pointer>, <ty> <cmp>, <ty> <new> [singlethread] <success ordering> <failure ordering> ; yields {ty}
+ cmpxchg [weak] [volatile] <ty>* <pointer>, <ty> <cmp>, <ty> <new> [singlethread] <success ordering> <failure ordering> ; yields { ty, i1 }
Overview:
"""""""""
The '``cmpxchg``' instruction is used to atomically modify memory. It
loads a value in memory and compares it to a given value. If they are
-equal, it stores a new value into the memory.
+equal, it tries to store a new value into the memory.
Arguments:
""""""""""
@@ -5049,10 +5208,10 @@ to modify the number or order of execution of this ``cmpxchg`` with
other :ref:`volatile operations <volatile>`.
The success and failure :ref:`ordering <ordering>` arguments specify how this
-``cmpxchg`` synchronizes with other atomic operations. The both ordering
-parameters must be at least ``monotonic``, the ordering constraint on failure
-must be no stronger than that on success, and the failure ordering cannot be
-either ``release`` or ``acq_rel``.
+``cmpxchg`` synchronizes with other atomic operations. Both ordering parameters
+must be at least ``monotonic``, the ordering constraint on failure must be no
+stronger than that on success, and the failure ordering cannot be either
+``release`` or ``acq_rel``.
The optional "``singlethread``" argument declares that the ``cmpxchg``
is only atomic with respect to code (usually signal handlers) running in
@@ -5065,10 +5224,17 @@ equal to the size in memory of the operand.
Semantics:
""""""""""
-The contents of memory at the location specified by the '``<pointer>``'
-operand is read and compared to '``<cmp>``'; if the read value is the
-equal, '``<new>``' is written. The original value at the location is
-returned.
+The contents of memory at the location specified by the '``<pointer>``' operand
+is read and compared to '``<cmp>``'; if the read value is equal,
+'``<new>``' is written. The original value at the location is returned, together
+with a flag indicating success (true) or failure (false).
+
+If the cmpxchg operation is marked as ``weak`` then a spurious failure is
+permitted: the operation may not write ``<new>`` even if the comparison
+matched.
+
+If the cmpxchg operation is strong (the default), the i1 value is 1 if and only
+if the value loaded equals ``cmp``.
A successful ``cmpxchg`` is a read-modify-write instruction for the purpose of
identifying release sequences. A failed ``cmpxchg`` is equivalent to an atomic
@@ -5080,14 +5246,15 @@ Example:
.. code-block:: llvm
entry:
- %orig = atomic load i32* %ptr unordered ; yields {i32}
+ %orig = atomic load i32* %ptr unordered ; yields i32
br label %loop
loop:
 %cmp = phi i32 [ %orig, %entry ], [%value_loaded, %loop]
%squared = mul i32 %cmp, %cmp
- %old = cmpxchg i32* %ptr, i32 %cmp, i32 %squared acq_rel monotonic ; yields {i32}
- %success = icmp eq i32 %cmp, %old
+ %val_success = cmpxchg i32* %ptr, i32 %cmp, i32 %squared acq_rel monotonic ; yields { i32, i1 }
+ %value_loaded = extractvalue { i32, i1 } %val_success, 0
+ %success = extractvalue { i32, i1 } %val_success, 1
br i1 %success, label %done, label %loop
done:
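The retry loop in the example above has a close source-level analogue in C++'s ``std::atomic``. The sketch below is an illustration under stated assumptions, not how any frontend is required to lower such code: ``atomic_square`` is a hypothetical helper, and ``memory_order_relaxed`` stands in for both ``unordered`` and ``monotonic`` because C++ has no ``unordered`` ordering. ``compare_exchange_strong`` returns the success flag (the ``i1``) and, on failure, stores the value it observed into ``expected`` (the first element of the result pair).

```cpp
#include <atomic>
#include <cassert>  // only needed by callers that want to assert on the result
#include <cstdint>

// Hypothetical helper: atomically replaces the stored value with its square
// and returns the value that was squared.
std::uint32_t atomic_square(std::atomic<std::uint32_t> &cell) {
  // Initial guess, corresponding to the load in the entry block.
  std::uint32_t expected = cell.load(std::memory_order_relaxed);

  // Strong compare-and-exchange: fails only if the stored value differs
  // from `expected`. On failure `expected` is updated to the observed
  // value, so the next iteration squares the fresh value.
  while (!cell.compare_exchange_strong(expected, expected * expected,
                                       std::memory_order_acq_rel,    // success
                                       std::memory_order_relaxed)) { // failure
  }
  return expected;
}
```

Starting from a cell holding 7, ``atomic_square`` returns 7 and leaves 49 in the cell. Replacing the call with ``compare_exchange_weak`` corresponds to a ``weak`` operation: it may fail spuriously, which the loop already tolerates.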
@@ -5103,7 +5270,7 @@ Syntax:
::
- atomicrmw [volatile] <operation> <ty>* <pointer>, <ty> <value> [singlethread] <ordering> ; yields {ty}
+ atomicrmw [volatile] <operation> <ty>* <pointer>, <ty> <value> [singlethread] <ordering> ; yields ty
Overview:
"""""""""
@@ -5164,7 +5331,7 @@ Example:
.. code-block:: llvm
- %old = atomicrmw add i32* %ptr, i32 1 acquire ; yields {i32}
+ %old = atomicrmw add i32* %ptr, i32 1 acquire ; yields i32
.. _i_getelementptr:
@@ -5898,7 +6065,7 @@ Syntax:
::
- <result> = icmp <cond> <ty> <op1>, <op2> ; yields {i1} or {<N x i1>}:result
+ <result> = icmp <cond> <ty> <op1>, <op2> ; yields i1 or <N x i1>:result
Overview:
"""""""""
@@ -5989,7 +6156,7 @@ Syntax:
::
- <result> = fcmp <cond> <ty> <op1>, <op2> ; yields {i1} or {<N x i1>}:result
+ <result> = fcmp <cond> <ty> <op1>, <op2> ; yields i1 or <N x i1>:result
Overview:
"""""""""
@@ -6241,7 +6408,7 @@ This instruction requires several arguments:
uses value of call or is void).
- Option ``-tailcallopt`` is enabled, or
``llvm::GuaranteedTailCallOpt`` is ``true``.
- - `Platform specific constraints are
+ - `Platform-specific constraints are
met. <CodeGenerator.html#tailcallopt>`_
#. The optional "cconv" marker indicates which :ref:`calling
@@ -6294,7 +6461,7 @@ Example:
call void %foo(i8 97 signext)
%struct.A = type { i32, i8 }
- %r = call %struct.A @foo() ; yields { 32, i8 }
+ %r = call %struct.A @foo() ; yields { i32, i8 }
%gr = extractvalue %struct.A %r, 0 ; yields i32
%gr1 = extractvalue %struct.A %r, 1 ; yields i8
%Z = call void @foo() noreturn ; indicates that %foo never returns normally
@@ -8456,7 +8623,7 @@ Examples:
.. code-block:: llvm
- %r2 = call float @llvm.fmuladd.f32(float %a, float %b, float %c) ; yields {float}:r2 = (a * b) + c
+ %r2 = call float @llvm.fmuladd.f32(float %a, float %b, float %c) ; yields float:r2 = (a * b) + c
Half Precision Floating Point Intrinsics
----------------------------------------
@@ -8484,7 +8651,7 @@ Syntax:
::
- declare i16 @llvm.convert.to.fp16(f32 %a)
+ declare i16 @llvm.convert.to.fp16(float %a)
Overview:
"""""""""
@@ -8512,7 +8679,7 @@ Examples:
.. code-block:: llvm
- %res = call i16 @llvm.convert.to.fp16(f32 %a)
+ %res = call i16 @llvm.convert.to.fp16(float %a)
store i16 %res, i16* @x, align 2
.. _int_convert_from_fp16:
@@ -8525,7 +8692,7 @@ Syntax:
::
- declare f32 @llvm.convert.from.fp16(i16 %a)
+ declare float @llvm.convert.from.fp16(i16 %a)
Overview:
"""""""""
@@ -8554,7 +8721,7 @@ Examples:
.. code-block:: llvm
%a = load i16* @x, align 2
- %res = call f32 @llvm.convert.from.fp16(i16 %a)
+ %res = call float @llvm.convert.from.fp16(i16 %a)
Debugger Intrinsics
-------------------
@@ -8675,7 +8842,7 @@ Semantics:
""""""""""
On some architectures the address of the code to be executed needs to be
-different to the address where the trampoline is actually stored. This
+different than the address where the trampoline is actually stored. This
intrinsic returns the executable address corresponding to ``tramp``
after performing the required machine specific adjustments. The pointer
returned can then be :ref:`bitcast and executed <int_trampoline>`.
@@ -8683,7 +8850,7 @@ returned can then be :ref:`bitcast and executed <int_trampoline>`.
Memory Use Markers
------------------
-This class of intrinsics exists to information about the lifetime of
+This class of intrinsics provides information about the lifetime of
memory objects and ranges where variables are immutable.
.. _int_lifestart:
diff --git a/docs/Lexicon.rst b/docs/Lexicon.rst
index 11f1341..fccfd5f 100644
--- a/docs/Lexicon.rst
+++ b/docs/Lexicon.rst
@@ -50,7 +50,7 @@ C
Common Subexpression Elimination. An optimization that removes common
 subexpression computation. For example ``(a+b)*(a+b)`` has two subexpressions
that are the same: ``(a+b)``. This optimization would perform the addition
- only once and then perform the multiply (but only if it's compulationally
+ only once and then perform the multiply (but only if it's computationally
correct/safe).
D
diff --git a/docs/Passes.rst b/docs/Passes.rst
index b51829d..9f40092 100644
--- a/docs/Passes.rst
+++ b/docs/Passes.rst
@@ -261,12 +261,6 @@ returns "I don't know" for alias queries. NoAA is unlike other alias analysis
implementations, in that it does not chain to a previous analysis. As such it
doesn't follow many of the rules that other alias analyses must.
-``-no-profile``: No Profile Information
----------------------------------------
-
-The default "no profile" implementation of the abstract ``ProfileInfo``
-interface.
-
``-postdomfrontier``: Post-Dominance Frontier Construction
----------------------------------------------------------
@@ -336,23 +330,6 @@ This pass is used to seek out all of the types in use by the program. Note
that this analysis explicitly does not include types only used by the symbol
table.
-``-profile-estimator``: Estimate profiling information
-------------------------------------------------------
-
-Profiling information that estimates the profiling information in a very crude
-and unimaginative way.
-
-``-profile-loader``: Load profile information from ``llvmprof.out``
--------------------------------------------------------------------
-
-A concrete implementation of profiling information that loads the information
-from a profile dump file.
-
-``-profile-verifier``: Verify profiling information
----------------------------------------------------
-
-Pass that checks profiling information for plausibility.
-
``-regions``: Detect single entry single exit regions
-----------------------------------------------------
@@ -626,24 +603,6 @@ where it is profitable, the loop could be transformed to count down to zero
Bottom-up inlining of functions into callees.
-``-insert-edge-profiling``: Insert instrumentation for edge profiling
----------------------------------------------------------------------
-
-This pass instruments the specified program with counters for edge profiling.
-Edge profiling can give a reasonable approximation of the hot paths through a
-program, and is used for a wide variety of program transformations.
-
-Note that this implementation is very naïve. It inserts a counter for *every*
-edge in the program, instead of using control flow information to prune the
-number of counters inserted.
-
-``-insert-optimal-edge-profiling``: Insert optimal instrumentation for edge profiling
--------------------------------------------------------------------------------------
-
-This pass instruments the specified program with counters for edge profiling.
-Edge profiling can give a reasonable approximation of the hot paths through a
-program, and is used for a wide variety of program transformations.
-
.. _passes-instcombine:
``-instcombine``: Combine redundant instructions
diff --git a/docs/Phabricator.rst b/docs/Phabricator.rst
index 18b2817..8ac9afe 100644
--- a/docs/Phabricator.rst
+++ b/docs/Phabricator.rst
@@ -5,18 +5,29 @@ Code Reviews with Phabricator
.. contents::
:local:
-If you prefer to use a web user interface for code reviews,
-you can now submit your patches for Clang and LLVM at
-`LLVM's Phabricator`_.
+If you prefer to use a web user interface for code reviews, you can now submit
+your patches for Clang and LLVM at `LLVM's Phabricator`_ instance.
+
+While Phabricator is a useful tool for some, the relevant -commits mailing list
+is the system of record for all LLVM code review. The mailing list should be
+added as a subscriber on all reviews, and Phabricator users should be prepared
+to respond to free-form comments in mail sent to the commits list.
Sign up
-------
+To get started with Phabricator, navigate to `http://reviews.llvm.org`_ and
+click the power icon in the top right. You can register with a GitHub account,
+a Google account, or you can create your own profile.
+
+Make *sure* that the email address registered with Phabricator is subscribed
+to the relevant -commits mailing list. If you are not subscribed to the commit
+list, all mail sent by Phabricator on your behalf will be held for moderation.
+
Note that if you use your Subversion user name as Phabricator user name,
Phabricator will automatically connect your submits to your Phabricator user in
the `Code Repository Browser`_.
-
Requesting a review via the command line
----------------------------------------
@@ -90,6 +101,15 @@ a change from Phabricator.
Committing a change
-------------------
+Arcanist can manage the commit transparently. It will retrieve the description,
+reviewers, the ``Differential Revision``, etc. from the review and commit it to the repository.
+
+::
+
+ arc patch D<Revision>
+ arc commit --revision D<Revision>
+
+
When committing an LLVM change that has been reviewed using
Phabricator, the convention is for the commit message to end with the
line:
@@ -113,6 +133,7 @@ Status
Please let us know whether you like it and what could be improved!
.. _LLVM's Phabricator: http://reviews.llvm.org
+.. _`http://reviews.llvm.org`: http://reviews.llvm.org
.. _Code Repository Browser: http://reviews.llvm.org/diffusion/
.. _Arcanist Quick Start: http://www.phabricator.com/docs/phabricator/article/Arcanist_Quick_Start.html
.. _Arcanist User Guide: http://www.phabricator.com/docs/phabricator/article/Arcanist_User_Guide.html
diff --git a/docs/ProgrammersManual.rst b/docs/ProgrammersManual.rst
index 7e46ac4..a7b28b3 100644
--- a/docs/ProgrammersManual.rst
+++ b/docs/ProgrammersManual.rst
@@ -387,7 +387,8 @@ Fine grained debug info with ``DEBUG_TYPE`` and the ``-debug-only`` option
Sometimes you may find yourself in a situation where enabling ``-debug`` just
turns on **too much** information (such as when working on the code generator).
If you want to enable debug information with more fine-grained control, you
-define the ``DEBUG_TYPE`` macro and the ``-debug`` only option as follows:
+can define the ``DEBUG_TYPE`` macro and use the ``-debug-only`` option as
+follows:
.. code-block:: c++
@@ -545,14 +546,15 @@ methods. Within GDB, for example, you can usually use something like ``call
DAG.viewGraph()`` to pop up a window. Alternatively, you can sprinkle calls to
these functions in your code in places you want to debug.
-Getting this to work requires a small amount of configuration. On Unix systems
+Getting this to work requires a small amount of setup. On Unix systems
with X11, install the `graphviz <http://www.graphviz.org>`_ toolkit, and make
sure 'dot' and 'gv' are in your path. If you are running on Mac OS X, download
and install the Mac OS X `Graphviz program
<http://www.pixelglow.com/graphviz/>`_ and add
``/Applications/Graphviz.app/Contents/MacOS/`` (or wherever you install it) to
-your path. Once in your system and path are set up, rerun the LLVM configure
-script and rebuild LLVM to enable this functionality.
+your path. The programs need not be present when configuring, building or
+running LLVM and can simply be installed when needed during an active debug
+session.
``SelectionDAG`` has been extended to make it easier to locate *interesting*
nodes in large complex graphs. From gdb, if you ``call DAG.setGraphColor(node,
@@ -1916,7 +1918,7 @@ which is a pointer to an integer on the run time stack.
*Inserting instructions*
-There are essentially two ways to insert an ``Instruction`` into an existing
+There are essentially three ways to insert an ``Instruction`` into an existing
sequence of instructions that form a ``BasicBlock``:
* Insertion into an explicit instruction list
@@ -1986,6 +1988,41 @@ sequence of instructions that form a ``BasicBlock``:
which is much cleaner, especially if you're creating a lot of instructions and
adding them to ``BasicBlock``\ s.
+* Insertion using an instance of ``IRBuilder``
+
+ Inserting several ``Instruction``\ s can be quite laborious using the previous
+ methods. The ``IRBuilder`` is a convenience class that can be used to add
+ several instructions to the end of a ``BasicBlock`` or before a particular
+ ``Instruction``. It also supports constant folding and renaming named
+ registers (see ``IRBuilder``'s template arguments).
+
+ The example below demonstrates a very simple use of the ``IRBuilder`` where
+ three instructions are inserted before the instruction ``pi``. The first two
+ instructions are Call instructions and the third instruction multiplies the
+ return values of the two calls.
+
+ .. code-block:: c++
+
+ Instruction *pi = ...;
+ IRBuilder<> Builder(pi);
+ CallInst* callOne = Builder.CreateCall(...);
+ CallInst* callTwo = Builder.CreateCall(...);
+ Value* result = Builder.CreateMul(callOne, callTwo);
+
+ The example below is similar to the above example except that the created
+ ``IRBuilder`` inserts instructions at the end of the ``BasicBlock`` ``pb``.
+
+ .. code-block:: c++
+
+ BasicBlock *pb = ...;
+ IRBuilder<> Builder(pb);
+ CallInst* callOne = Builder.CreateCall(...);
+ CallInst* callTwo = Builder.CreateCall(...);
+ Value* result = Builder.CreateMul(callOne, callTwo);
+
+ See :doc:`tutorial/LangImpl3` for a practical use of the ``IRBuilder``.
+
+
.. _schanges_deleting:
Deleting Instructions
@@ -2133,46 +2170,13 @@ compiler, consider compiling LLVM and LLVM-GCC in single-threaded mode, and
using the resultant compiler to build a copy of LLVM with multithreading
support.
-.. _startmultithreaded:
-
-Entering and Exiting Multithreaded Mode
----------------------------------------
-
-In order to properly protect its internal data structures while avoiding
-excessive locking overhead in the single-threaded case, the LLVM must intialize
-certain data structures necessary to provide guards around its internals. To do
-so, the client program must invoke ``llvm_start_multithreaded()`` before making
-any concurrent LLVM API calls. To subsequently tear down these structures, use
-the ``llvm_stop_multithreaded()`` call. You can also use the
-``llvm_is_multithreaded()`` call to check the status of multithreaded mode.
-
-Note that both of these calls must be made *in isolation*. That is to say that
-no other LLVM API calls may be executing at any time during the execution of
-``llvm_start_multithreaded()`` or ``llvm_stop_multithreaded``. It is the
-client's responsibility to enforce this isolation.
-
-The return value of ``llvm_start_multithreaded()`` indicates the success or
-failure of the initialization. Failure typically indicates that your copy of
-LLVM was built without multithreading support, typically because GCC atomic
-intrinsics were not found in your system compiler. In this case, the LLVM API
-will not be safe for concurrent calls. However, it *will* be safe for hosting
-threaded applications in the JIT, though :ref:`care must be taken
-<jitthreading>` to ensure that side exits and the like do not accidentally
-result in concurrent LLVM API calls.
-
.. _shutdown:
Ending Execution with ``llvm_shutdown()``
-----------------------------------------
When you are done using the LLVM APIs, you should call ``llvm_shutdown()`` to
-deallocate memory used for internal structures. This will also invoke
-``llvm_stop_multithreaded()`` if LLVM is operating in multithreaded mode. As
-such, ``llvm_shutdown()`` requires the same isolation guarantees as
-``llvm_stop_multithreaded()``.
-
-Note that, if you use scope-based shutdown, you can use the
-``llvm_shutdown_obj`` class, which calls ``llvm_shutdown()`` in its destructor.
+deallocate memory used for internal structures.
.. _managedstatic:
@@ -2180,20 +2184,11 @@ Lazy Initialization with ``ManagedStatic``
------------------------------------------
``ManagedStatic`` is a utility class in LLVM used to implement static
-initialization of static resources, such as the global type tables. Before the
-invocation of ``llvm_shutdown()``, it implements a simple lazy initialization
-scheme. Once ``llvm_start_multithreaded()`` returns, however, it uses
+initialization of static resources, such as the global type tables. In a
+single-threaded environment, it implements a simple lazy initialization scheme.
+When LLVM is compiled with support for multi-threading, however, it uses
double-checked locking to implement thread-safe lazy initialization.
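
As a rough illustration of the double-checked locking idea mentioned above, the following is a minimal sketch built only on the C++ standard library. It is not LLVM's ``ManagedStatic`` implementation; the class name ``LazyStatic`` and its ``get()`` member are hypothetical and exist only for this example.

```cpp
#include <atomic>
#include <mutex>

// Hypothetical illustration of double-checked locking.
// This is NOT LLVM's ManagedStatic; all names here are assumptions.
template <typename T>
class LazyStatic {
public:
  T &get() {
    // First check: lock-free fast path once the object exists.
    T *p = ptr_.load(std::memory_order_acquire);
    if (!p) {
      std::lock_guard<std::mutex> guard(lock_);
      // Second check: another thread may have constructed the object
      // while this thread was waiting for the lock.
      p = ptr_.load(std::memory_order_relaxed);
      if (!p) {
        p = new T();
        ptr_.store(p, std::memory_order_release);
      }
    }
    return *p;
  }

  // Frees the object if it was ever constructed.
  ~LazyStatic() { delete ptr_.load(std::memory_order_relaxed); }

private:
  std::atomic<T *> ptr_{nullptr};
  std::mutex lock_;
};
```

The object is constructed on the first call to ``get()``, and later calls return the same instance without taking the lock.
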
-Note that, because no other threads are allowed to issue LLVM API calls before
-``llvm_start_multithreaded()`` returns, it is possible to have
-``ManagedStatic``\ s of ``llvm::sys::Mutex``\ s.
-
-The ``llvm_acquire_global_lock()`` and ``llvm_release_global_lock`` APIs provide
-access to the global lock used to implement the double-checked locking for lazy
-initialization. These should only be used internally to LLVM, and only if you
-know what you're doing!
-
.. _llvmcontext:
Achieving Isolation with ``LLVMContext``
diff --git a/docs/ReleaseNotes.rst b/docs/ReleaseNotes.rst
index 8dc1681..fb2e248 100644
--- a/docs/ReleaseNotes.rst
+++ b/docs/ReleaseNotes.rst
@@ -50,11 +50,19 @@ Non-comprehensive list of changes in this release
the ``-no-integrated-as`` option,
* llvm-ar now handles IR files like regular object files. In particular, a
- regular symbol table is created for symbols defined in IR files.
+ regular symbol table is created for symbols defined in IR files, including
+ those in file scope inline assembly.
* LLVM now always uses cfi directives for producing most stack
unwinding information.
+* The prefix for loop vectorizer hint metadata has been changed from
+ ``llvm.vectorizer`` to ``llvm.loop.vectorize``.
+
+* Some backends previously implemented Atomic NAND(x,y) as ``x & ~y``. Now
+ all backends implement it as ``~(x & y)``, matching the semantics of GCC 4.4
+ and later.
+
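
To make the difference between the two NAND definitions above concrete, here is a small non-atomic sketch of the arithmetic only. The helper names ``nand_current`` and ``nand_legacy`` are hypothetical and are not LLVM or GCC APIs.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration; they model only the arithmetic,
// not the atomicity of the real operation.

// Semantics now used by all backends: ~(x & y).
constexpr std::uint8_t nand_current(std::uint8_t x, std::uint8_t y) {
  return static_cast<std::uint8_t>(~(x & y));
}

// Semantics some backends used previously: x & ~y.
constexpr std::uint8_t nand_legacy(std::uint8_t x, std::uint8_t y) {
  return static_cast<std::uint8_t>(x & ~y);
}
```

For ``x = 0x0C`` and ``y = 0x0A``, the current definition yields ``0xF7`` while the older one yields ``0x04``.
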
.. NOTE
For small 1-3 sentence descriptions, just add an entry at the end of
this list. If your description won't fit comfortably in one bullet
diff --git a/docs/SourceLevelDebugging.rst b/docs/SourceLevelDebugging.rst
index f957a7d..869d3a3 100644
--- a/docs/SourceLevelDebugging.rst
+++ b/docs/SourceLevelDebugging.rst
@@ -235,8 +235,8 @@ File descriptors
.. code-block:: llvm
!0 = metadata !{
- i32, ;; Tag = 41 (DW_TAG_file_type)
- metadata, ;; Source directory (including trailing slash) & file pair
+ i32, ;; Tag = 41 (DW_TAG_file_type)
+ metadata, ;; Source directory (including trailing slash) & file pair
}
These descriptors contain information for a file. Global variables and top
@@ -269,7 +269,7 @@ Global variable descriptors
metadata, ;; The static member declaration, if any
}
-These descriptors provide debug information about globals variables. They
+These descriptors provide debug information about global variables. They
provide details such as name, type and where the variable is defined. All
global variables are collected inside the named metadata ``!llvm.dbg.cu``.
@@ -297,7 +297,7 @@ Subprogram descriptors
;; derived class
i32, ;; Flags - Artificial, Private, Protected, Explicit, Prototyped.
i1, ;; isOptimized
- Function * , ;; Pointer to LLVM function
+ {}*, ;; Reference to the LLVM function
metadata, ;; Lists function template parameters
metadata, ;; Function declaration descriptor
metadata, ;; List of function variables
@@ -314,13 +314,13 @@ Block descriptors
.. code-block:: llvm
!3 = metadata !{
- i32, ;; Tag = 11 (DW_TAG_lexical_block)
- metadata,;; Source directory (including trailing slash) & file pair
- metadata,;; Reference to context descriptor
- i32, ;; Line number
- i32, ;; Column number
- i32, ;; DWARF path discriminator value
- i32 ;; Unique ID to identify blocks from a template function
+ i32, ;; Tag = 11 (DW_TAG_lexical_block)
+ metadata, ;; Source directory (including trailing slash) & file pair
+ metadata, ;; Reference to context descriptor
+ i32, ;; Line number
+ i32, ;; Column number
+ i32, ;; DWARF path discriminator value
+ i32 ;; Unique ID to identify blocks from a template function
}
This descriptor provides debug information about nested blocks within a
@@ -330,9 +330,9 @@ lexical blocks at same depth.
.. code-block:: llvm
!3 = metadata !{
- i32, ;; Tag = 11 (DW_TAG_lexical_block)
- metadata,;; Source directory (including trailing slash) & file pair
- metadata ;; Reference to the scope we're annotating with a file change
+ i32, ;; Tag = 11 (DW_TAG_lexical_block)
+ metadata, ;; Source directory (including trailing slash) & file pair
+ metadata ;; Reference to the scope we're annotating with a file change
}
This descriptor provides a wrapper around a lexical scope to handle file
@@ -528,9 +528,9 @@ Subrange descriptors
.. code-block:: llvm
!42 = metadata !{
- i32, ;; Tag = 33 (DW_TAG_subrange_type)
- i64, ;; Low value
- i64 ;; High value
+ i32, ;; Tag = 33 (DW_TAG_subrange_type)
+ i64, ;; Low value
+ i64 ;; High value
}
These descriptors are used to define ranges of array subscripts for an array
@@ -570,6 +570,7 @@ Local variables
metadata, ;; Reference to the type descriptor
i32, ;; flags
metadata ;; (optional) Reference to inline location
+ metadata ;; (optional) Reference to a complex expression (see below)
}
These descriptors are used to define variables local to a sub program. The
diff --git a/docs/TestingGuide.rst b/docs/TestingGuide.rst
index f9222372..481be55 100644
--- a/docs/TestingGuide.rst
+++ b/docs/TestingGuide.rst
@@ -304,8 +304,7 @@ For instance, on ``test/CodeGen/ARM``, the ``lit.local.cfg`` is:
.. code-block:: python
config.suffixes = ['.ll', '.c', '.cpp', '.test']
- targets = set(config.root.targets_to_build.split())
- if not 'ARM' in targets:
+ if not 'ARM' in config.root.targets:
config.unsupported = True
Other platform-specific tests are those that depend on a specific feature
diff --git a/docs/Vectorizers.rst b/docs/Vectorizers.rst
index 887ccaa..2b70217 100644
--- a/docs/Vectorizers.rst
+++ b/docs/Vectorizers.rst
@@ -51,6 +51,89 @@ Users can control the unroll factor using the command line flag "-force-vector-u
$ clang -mllvm -force-vector-unroll=2 ...
$ opt -loop-vectorize -force-vector-unroll=2 ...
+Pragma loop hint directives
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The ``#pragma clang loop`` directive allows loop vectorization hints to be
+specified for the subsequent for, while, do-while, or C++11 range-based for
+loop. The directive allows vectorization and interleaving to be enabled or
+disabled. Vector width as well as interleave count can also be manually
+specified. The following example explicitly enables vectorization and
+interleaving:
+
+.. code-block:: c++
+
+ #pragma clang loop vectorize(enable) interleave(enable)
+ while(...) {
+ ...
+ }
+
+The following example implicitly enables vectorization and interleaving by
+specifying a vector width and interleaving count:
+
+.. code-block:: c++
+
+ #pragma clang loop vectorize_width(2) interleave_count(2)
+ for(...) {
+ ...
+ }
+
+See the Clang
+`language extensions
+<http://clang.llvm.org/docs/LanguageExtensions.html#extensions-for-loop-hint-optimizations>`_
+for details.
+
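
The fragments above elide the loop itself. A complete function using the same hint might look like the sketch below; the function ``scale_by_two`` is a hypothetical example and not part of LLVM or Clang. The pragma is only a hint, and compilers other than Clang are expected to ignore it, possibly with a warning.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical example function: doubles every element of the input.
std::vector<int> scale_by_two(const std::vector<int> &in) {
  std::vector<int> out(in.size());
  // Request vectorization and interleaving for the loop that follows.
#pragma clang loop vectorize(enable) interleave(enable)
  for (std::size_t i = 0; i < in.size(); ++i) {
    out[i] = in[i] * 2;
  }
  return out;
}
```

Whether the loop is actually vectorized still depends on the target and optimization level. The result is the same either way.
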
+Diagnostics
+-----------
+
+Many loops cannot be vectorized, including loops with complicated control flow,
+unvectorizable types, and unvectorizable calls. The loop vectorizer generates
+optimization remarks which can be queried using command line options to identify
+and diagnose loops that are skipped by the loop-vectorizer.
+
+Optimization remarks are enabled using:
+
+``-Rpass=loop-vectorize`` identifies loops that were successfully vectorized.
+
+``-Rpass-missed=loop-vectorize`` identifies loops that failed vectorization and
+indicates if vectorization was specified.
+
+``-Rpass-analysis=loop-vectorize`` identifies the statements that caused
+vectorization to fail.
+
+Consider the following loop:
+
+.. code-block:: c++
+
+ #pragma clang loop vectorize(enable)
+ for (int i = 0; i < Length; i++) {
+ switch(A[i]) {
+ case 0: A[i] = i*2; break;
+ case 1: A[i] = i; break;
+ default: A[i] = 0;
+ }
+ }
+
+The command line ``-Rpass-missed=loop-vectorize`` prints the remark:
+
+.. code-block:: console
+
+ no_switch.cpp:4:5: remark: loop not vectorized: vectorization is explicitly enabled [-Rpass-missed=loop-vectorize]
+
+And the command line ``-Rpass-analysis=loop-vectorize`` indicates that the
+switch statement cannot be vectorized.
+
+.. code-block:: console
+
+ no_switch.cpp:4:5: remark: loop not vectorized: loop contains a switch statement [-Rpass-analysis=loop-vectorize]
+ switch(A[i]) {
+ ^
+
+To ensure line and column numbers are produced, include the command line options
+``-gline-tables-only`` and ``-gcolumn-info``. See the Clang `user manual
+<http://clang.llvm.org/docs/UsersManual.html#options-to-emit-optimization-reports>`_
+for details.
+
Features
--------
diff --git a/docs/WritingAnLLVMPass.rst b/docs/WritingAnLLVMPass.rst
index f9cb4fe..cfbda04 100644
--- a/docs/WritingAnLLVMPass.rst
+++ b/docs/WritingAnLLVMPass.rst
@@ -259,7 +259,6 @@ To see what happened to the other string you registered, try running
-hello - Hello World Pass
-indvars - Induction Variable Simplification
-inline - Function Integration/Inlining
- -insert-edge-profiling - Insert instrumentation for edge profiling
...
The pass name gets added as the information string for your pass, giving some