aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/pci/intel-iommu.c
Commit message (Collapse)AuthorAgeFilesLines
* intel-iommu: Fix leaks in pagetable freeingAlex Williamson2013-10-131-37/+35
| | | | | | | | | | | | | | | | | | | | commit 3269ee0bd6686baf86630300d528500ac5b516d7 upstream. At best the current code only seems to free the leaf pagetables and the root. If you're unlucky enough to have a large gap (like any QEMU guest with more than 3G of memory), only the first chunk of leaf pagetables are freed (plus the root). This is a massive memory leak. This patch re-writes the pagetable freeing function to use a recursive algorithm and manages to not only free all the pagetables, but does it without any apparent performance loss versus the current broken version. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Cc: stable@vger.kernel.org Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Joerg Roedel <joro@8bytes.org> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* intel-iommu: Prevent devices with RMRRs from being placed into SI DomainTom Mingarelli2013-01-211-0/+31
| | | | | | | | | | | | | | | | | commit ea2447f700cab264019b52e2b417d689e052dcfd upstream. This patch is to prevent non-USB devices that have RMRRs associated with them from being placed into the SI Domain during init. This fixes the issue where the RMRR info for devices being placed in and out of the SI Domain gets lost. Signed-off-by: Thomas Mingarelli <thomas.mingarelli@hp.com> Tested-by: Shuah Khan <shuah.khan@hp.com> Reviewed-by: Donald Dutile <ddutile@redhat.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Joerg Roedel <joro@8bytes.org> Signed-off-by: CAI Qian <caiqian@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* intel-iommu: Free old page tables before creating superpageWoodhouse, David2013-01-171-2/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 6491d4d02893d9787ba67279595990217177b351 upstream. The dma_pte_free_pagetable() function will only free a page table page if it is asked to free the *entire* 2MiB range that it covers. So if a page table page was used for one or more small mappings, it's likely to end up still present in the page tables... but with no valid PTEs. This was fine when we'd only be repopulating it with 4KiB PTEs anyway but the same virtual address range can end up being reused for a *large-page* mapping. And in that case were were trying to insert the large page into the second-level page table, and getting a complaint from the sanity check in __domain_mapping() because there was already a corresponding entry. This was *relatively* harmless; it led to a memory leak of the old page table page, but no other ill-effects. Fix it by calling dma_pte_clear_range (hopefully redundant) and dma_pte_free_pagetable() before setting up the new large page. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Tested-by: Ravi Murty <Ravi.Murty@intel.com> Tested-by: Sudeep Dutt <sudeep.dutt@intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: CAI Qian <caiqian@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* intel-iommu: Fix AB-BA lockdep reportRoland Dreier2012-11-171-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 3e7abe2556b583e87dabda3e0e6178a67b20d06f upstream. When unbinding a device so that I could pass it through to a KVM VM, I got the lockdep report below. It looks like a legitimate lock ordering problem: - domain_context_mapping_one() takes iommu->lock and calls iommu_support_dev_iotlb(), which takes device_domain_lock (inside iommu->lock). - domain_remove_one_dev_info() starts by taking device_domain_lock then takes iommu->lock inside it (near the end of the function). So this is the classic AB-BA deadlock. It looks like a safe fix is to simply release device_domain_lock a bit earlier, since as far as I can tell, it doesn't protect any of the stuff accessed at the end of domain_remove_one_dev_info() anyway. BTW, the use of device_domain_lock looks a bit unsafe to me... it's at least not obvious to me why we aren't vulnerable to the race below: iommu_support_dev_iotlb() domain_remove_dev_info() lock device_domain_lock find info unlock device_domain_lock lock device_domain_lock find same info unlock device_domain_lock free_devinfo_mem(info) do stuff with info after it's free However I don't understand the locking here well enough to know if this is a real problem, let alone what the best fix is. Anyway here's the full lockdep output that prompted all of this: ======================================================= [ INFO: possible circular locking dependency detected ] 2.6.39.1+ #1 ------------------------------------------------------- bash/13954 is trying to acquire lock: (&(&iommu->lock)->rlock){......}, at: [<ffffffff812f6421>] domain_remove_one_dev_info+0x121/0x230 but task is already holding lock: (device_domain_lock){-.-...}, at: [<ffffffff812f6508>] domain_remove_one_dev_info+0x208/0x230 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (device_domain_lock){-.-...}: [<ffffffff8109ca9d>] lock_acquire+0x9d/0x130 [<ffffffff81571475>] _raw_spin_lock_irqsave+0x55/0xa0 [<ffffffff812f8350>] domain_context_mapping_one+0x600/0x750 [<ffffffff812f84df>] domain_context_mapping+0x3f/0x120 [<ffffffff812f9175>] iommu_prepare_identity_map+0x1c5/0x1e0 [<ffffffff81ccf1ca>] intel_iommu_init+0x88e/0xb5e [<ffffffff81cab204>] pci_iommu_init+0x16/0x41 [<ffffffff81002165>] do_one_initcall+0x45/0x190 [<ffffffff81ca3d3f>] kernel_init+0xe3/0x168 [<ffffffff8157ac24>] kernel_thread_helper+0x4/0x10 -> #0 (&(&iommu->lock)->rlock){......}: [<ffffffff8109bf3e>] __lock_acquire+0x195e/0x1e10 [<ffffffff8109ca9d>] lock_acquire+0x9d/0x130 [<ffffffff81571475>] _raw_spin_lock_irqsave+0x55/0xa0 [<ffffffff812f6421>] domain_remove_one_dev_info+0x121/0x230 [<ffffffff812f8b42>] device_notifier+0x72/0x90 [<ffffffff8157555c>] notifier_call_chain+0x8c/0xc0 [<ffffffff81089768>] __blocking_notifier_call_chain+0x78/0xb0 [<ffffffff810897b6>] blocking_notifier_call_chain+0x16/0x20 [<ffffffff81373a5c>] __device_release_driver+0xbc/0xe0 [<ffffffff81373ccf>] device_release_driver+0x2f/0x50 [<ffffffff81372ee3>] driver_unbind+0xa3/0xc0 [<ffffffff813724ac>] drv_attr_store+0x2c/0x30 [<ffffffff811e4506>] sysfs_write_file+0xe6/0x170 [<ffffffff8117569e>] vfs_write+0xce/0x190 [<ffffffff811759e4>] sys_write+0x54/0xa0 [<ffffffff81579a82>] system_call_fastpath+0x16/0x1b other info that might help us debug this: 6 locks held by bash/13954: #0: (&buffer->mutex){+.+.+.}, at: [<ffffffff811e4464>] sysfs_write_file+0x44/0x170 #1: (s_active#3){++++.+}, at: [<ffffffff811e44ed>] sysfs_write_file+0xcd/0x170 #2: (&__lockdep_no_validate__){+.+.+.}, at: [<ffffffff81372edb>] driver_unbind+0x9b/0xc0 #3: (&__lockdep_no_validate__){+.+.+.}, at: [<ffffffff81373cc7>] device_release_driver+0x27/0x50 #4: (&(&priv->bus_notifier)->rwsem){.+.+.+}, at: [<ffffffff8108974f>] __blocking_notifier_call_chain+0x5f/0xb0 #5: (device_domain_lock){-.-...}, at: [<ffffffff812f6508>] domain_remove_one_dev_info+0x208/0x230 stack backtrace: Pid: 13954, comm: bash Not tainted 2.6.39.1+ #1 Call Trace: [<ffffffff810993a7>] print_circular_bug+0xf7/0x100 [<ffffffff8109bf3e>] __lock_acquire+0x195e/0x1e10 [<ffffffff810972bd>] ? trace_hardirqs_off+0xd/0x10 [<ffffffff8109d57d>] ? trace_hardirqs_on_caller+0x13d/0x180 [<ffffffff8109ca9d>] lock_acquire+0x9d/0x130 [<ffffffff812f6421>] ? domain_remove_one_dev_info+0x121/0x230 [<ffffffff81571475>] _raw_spin_lock_irqsave+0x55/0xa0 [<ffffffff812f6421>] ? domain_remove_one_dev_info+0x121/0x230 [<ffffffff810972bd>] ? trace_hardirqs_off+0xd/0x10 [<ffffffff812f6421>] domain_remove_one_dev_info+0x121/0x230 [<ffffffff812f8b42>] device_notifier+0x72/0x90 [<ffffffff8157555c>] notifier_call_chain+0x8c/0xc0 [<ffffffff81089768>] __blocking_notifier_call_chain+0x78/0xb0 [<ffffffff810897b6>] blocking_notifier_call_chain+0x16/0x20 [<ffffffff81373a5c>] __device_release_driver+0xbc/0xe0 [<ffffffff81373ccf>] device_release_driver+0x2f/0x50 [<ffffffff81372ee3>] driver_unbind+0xa3/0xc0 [<ffffffff813724ac>] drv_attr_store+0x2c/0x30 [<ffffffff811e4506>] sysfs_write_file+0xe6/0x170 [<ffffffff8117569e>] vfs_write+0xce/0x190 [<ffffffff811759e4>] sys_write+0x54/0xa0 [<ffffffff81579a82>] system_call_fastpath+0x16/0x1b Signed-off-by: Roland Dreier <roland@purestorage.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* intel-iommu: fix superpage support in pfn_to_dma_pte()Allen Kay2011-12-211-9/+8
| | | | | | | | | | | | | | | commit 4399c8bf2b9093696fa8160d79712e7346989c46 upstream. If target_level == 0, current code breaks out of the while-loop if SUPERPAGE bit is set. We should also break out if PTE is not present. If we don't do this, KVM calls to iommu_iova_to_phys() will cause pfn_to_dma_pte() to create mapping for 4KiB pages. Signed-off-by: Allen Kay <allen.m.kay@intel.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Signed-off-by: Youquan Song <youquan.song@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* intel-iommu: set iommu_superpage on VM domains to lowest common denominatorAllen Kay2011-12-211-5/+7
| | | | | | | | | | | | | | | | commit 8140a95d228efbcd64d84150e794761a32463947 upstream. set dmar->iommu_superpage field to the smallest common denominator of super page sizes supported by all active VT-d engines. Initialize this field in intel_iommu_domain_init() API so intel_iommu_map() API will be able to use iommu_superpage field to determine the appropriate super page size to use. Signed-off-by: Allen Kay <allen.m.kay@intel.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Signed-off-by: Youquan Song <youquan.song@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* intel-iommu: fix return value of iommu_unmap() APIAllen Kay2011-12-211-3/+8
| | | | | | | | | | | | | | commit 292827cb164ad00cc7689a21283b1261c0b6daed upstream. iommu_unmap() API expects IOMMU drivers to return the actual page order of the address being unmapped. Previous code was just returning page order passed in from the caller. This patch fixes this problem. Signed-off-by: Allen Kay <allen.m.kay@intel.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Signed-off-by: Youquan Song <youquan.song@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* PM / Intel IOMMU: Fix init_iommu_pm_ops() for CONFIG_PM unsetRafael J. Wysocki2011-06-071-1/+1
| | | | | | | | | | If CONFIG_PM is not set, init_iommu_pm_ops() introduced by commit 134fac3f457f3dd753ecdb25e6da3e5f6629f696 (PCI / Intel IOMMU: Use syscore_ops instead of sysdev class and sysdev) is not defined appropriately. Fix this issue. Reported-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
* Merge git://git.infradead.org/iommu-2.6Linus Torvalds2011-06-021-39/+201
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * git://git.infradead.org/iommu-2.6: intel-iommu: Fix off-by-one in RMRR setup intel-iommu: Add domain check in domain_remove_one_dev_info intel-iommu: Remove Host Bridge devices from identity mapping intel-iommu: Use coherent DMA mask when requested intel-iommu: Dont cache iova above 32bit intel-iommu: Speed up processing of the identity_mapping function intel-iommu: Check for identity mapping candidate using system dma mask intel-iommu: Only unlink device domains from iommu intel-iommu: Enable super page (2MiB, 1GiB, etc.) support intel-iommu: Flush unmaps at domain_exit intel-iommu: Remove obsolete comment from detect_intel_iommu intel-iommu: fix VT-d PMR disable for TXT on S3 resume
| * intel-iommu: Fix off-by-one in RMRR setupDavid Woodhouse2011-06-011-2/+2
| | | | | | | | | | | | | | | | We were mapping an extra byte (and hence usually an extra page): iommu_prepare_identity_map() expects to be given an 'end' argument which is the last byte to be mapped; not the first byte *not* to be mapped. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
| * intel-iommu: Add domain check in domain_remove_one_dev_infoMike Habeck2011-06-011-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | The comment in domain_remove_one_dev_info() states "No need to compare PCI domain; it has to be the same". But for the si_domain that isn't going to be true, as it consists of all the PCI devices that are identity mapped thus multiple PCI domains can be in si_domain. The code needs to validate the PCI domain too. Signed-off-by: Mike Habeck <habeck@sgi.com> Signed-off-by: Mike Travis <travis@sgi.com> Cc: stable@kernel.org Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
| * intel-iommu: Remove Host Bridge devices from identity mappingMike Travis2011-06-011-0/+5
| | | | | | | | | | | | | | | | | | | | | | When using the 1:1 (identity) PCI DMA remapping, PCI Host Bridge devices that do not use the IOMMU causes a kernel panic. Fix that by not inserting those devices into the si_domain. Signed-off-by: Mike Travis <travis@sgi.com> Reviewed-by: Mike Habeck <habeck@sgi.com> Cc: stable@kernel.org Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
| * intel-iommu: Use coherent DMA mask when requestedMike Travis2011-06-011-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | The __intel_map_single function is not honoring the passed in DMA mask. This results in not using the coherent DMA mask when called from intel_alloc_coherent(). Signed-off-by: Mike Travis <travis@sgi.com> Acked-by: Chris Wright <chrisw@sous-sol.org> Reviewed-by: Mike Habeck <habeck@sgi.com> Cc: stable@kernel.org Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
| * intel-iommu: Speed up processing of the identity_mapping functionMike Travis2011-06-011-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When there are a large count of PCI devices, and the pass through option for iommu is set, much time is spent in the identity_mapping function hunting though the iommu domains to check if a specific device is "identity mapped". Speed up the function by checking the cached info to see if it's mapped to the static identity domain. Signed-off-by: Mike Travis <travis@sgi.com> Reviewed-by: Mike Habeck <habeck@sgi.com> Cc: stable@kernel.org Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
| * intel-iommu: Check for identity mapping candidate using system dma maskChris Wright2011-06-011-2/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The identity mapping code appears to make the assumption that if the devices dma_mask is greater than 32bits the device can use identity mapping. But that is not true: take the case where we have a 40bit device in a 44bit architecture. The device can potentially receive a physical address that it will truncate and cause incorrect addresses to be used. Instead check to see if the device's dma_mask is large enough to address the system's dma_mask. Signed-off-by: Mike Travis <travis@sgi.com> Reviewed-by: Mike Habeck <habeck@sgi.com> Cc: stable@kernel.org Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
| * intel-iommu: Only unlink device domains from iommuAlex Williamson2011-06-011-4/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit a97590e5 added unlinking domains from iommus to reciprocate the iommu from domains unlinking that was already done. We actually want to only do this for device domains and never for the static identity map domain or VM domains. The SI domain is special and never freed, while VM domain->id lives in their own special address space, separate from iommu->domain_ids. In the current code, a VM can get domain->id zero, then mark that domain unused when unbound from pci-stub. This leads to DMAR write faults when the device is re-bound to the host driver. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Cc: stable@kernel.org Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
| * intel-iommu: Enable super page (2MiB, 1GiB, etc.) supportYouquan Song2011-06-011-18/+139
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are no externally-visible changes with this. In the loop in the internal __domain_mapping() function, we simply detect if we are mapping: - size >= 2MiB, and - virtual address aligned to 2MiB, and - physical address aligned to 2MiB, and - on hardware that supports superpages. (and likewise for larger superpages). We automatically use a superpage for such mappings. We never have to worry about *breaking* superpages, since we trust that we will always *unmap* the same range that was mapped. So all we need to do is ensure that dma_pte_clear_range() will also cope with superpages. Adjust pfn_to_dma_pte() to take a superpage 'level' as an argument, so it can return a PTE at the appropriate level rather than always extending the page tables all the way down to level 1. Again, this is simplified by the fact that we should never encounter existing small pages when we're creating a mapping; any old mapping that used the same virtual range will have been entirely removed and its obsolete page tables freed. Provide an 'intel_iommu=sp_off' argument on the command line as a chicken bit. Not that it should ever be required. == The original commit seen in the iommu-2.6.git was Youquan's implementation (and completion) of my own half-baked code which I'd typed into an email. Followed by half a dozen subsequent 'fixes'. I've taken the unusual step of rewriting history and collapsing the original commits in order to keep the main history simpler, and make life easier for the people who are going to have to backport this to older kernels. And also so I can give it a more coherent commit comment which (hopefully) gives a better explanation of what's going on. The original sequence of commits leading to identical code was: Youquan Song (3): intel-iommu: super page support intel-iommu: Fix superpage alignment calculation error intel-iommu: Fix superpage level calculation error in dma_pfn_level_pte() David Woodhouse (4): intel-iommu: Precalculate superpage support for dmar_domain intel-iommu: Fix hardware_largepage_caps() intel-iommu: Fix inappropriate use of superpages in __domain_mapping() intel-iommu: Fix phys_pfn in __domain_mapping for sglist pages Signed-off-by: Youquan Song <youquan.song@intel.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
| * intel-iommu: Flush unmaps at domain_exitAlex Williamson2011-05-241-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | We typically batch unmaps to be lazily flushed out at regular intervals. When we destroy a domain, we need to force a flush of these lazy unmaps to be sure none reference the domain we're about to free. Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=35062 Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Cc: stable@kernel.org
| * intel-iommu: fix VT-d PMR disable for TXT on S3 resumeJoseph Cihula2011-05-241-6/+25
| | | | | | | | | | | | | | | | | | | | This patch is a follow on to https://lkml.org/lkml/2011/3/21/239, which was merged as commit 51a63e67da6056c13b5b597dcc9e1b3bd7ceaa55. This patch adds support for S3, as pointed out by Chris Wright. Signed-off-by: Joseph Cihula <joseph.cihula@intel.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
| |
| \
*-. \ Merge branches 'dma-debug/next', 'amd-iommu/command-cleanups', ↵Joerg Roedel2011-05-101-0/+1
|\ \ \ | |_|/ |/| | | | | | | | | | | | | | | | | | | | 'amd-iommu/ats' and 'amd-iommu/extended-features' into iommu/2.6.40 Conflicts: arch/x86/include/asm/amd_iommu_types.h arch/x86/kernel/amd_iommu.c arch/x86/kernel/amd_iommu_init.c
| | * PCI: Move ATS declarations in seperate header fileJoerg Roedel2011-04-111-0/+1
| |/ | | | | | | | | | | | | | | | | | | | | This patch moves the relevant declarations from the local header file in drivers/pci to a more accessible locations so that it can be used by the AMD IOMMU driver too. The file is named pci-ats.h because support for the PCI PRI capability will also be added there in a later patch-set. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
* | Merge git://git.infradead.org/iommu-2.6Linus Torvalds2011-04-211-12/+43
|\ \ | | | | | | | | | | | | | | | | | | | | | * git://git.infradead.org/iommu-2.6: intel_iommu: disable all VT-d PMRs when TXT launched intel-iommu: Fix get_domain_for_dev() error path intel-iommu: Unlink domain from iommu intel-iommu: Fix use after release during device attach
| * | intel_iommu: disable all VT-d PMRs when TXT launchedJoseph Cihula2011-04-211-9/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Intel VT-d Protected Memory Regions (PMRs) are supposed to be disabled, on each VT-d engine, after DMA remapping is enabled on the engines. This is because the behavior of having both enabled is not deterministic and because, if TXT has been used to launch the kernel, the PMRs may be programmed to cover memory regions that will be used for DMA. Under some circumstances (certain quirks detected, lack of multiple devices, etc.), the current code does not set up DMA remapping on some VT-d engines. In such cases it also skips disabling the PMRs. This causes failures when the kernel is launched with TXT (most often this occurs on the graphics engine and results in colored vertical bars on the display). This patch detects when the kernel has been launched with TXT and then disables the PMRs on all VT-d engines. In some cases where the reason that remapping is not being enabled is due to possible ACPI DMAR table errors, the VT-d engine addresses may not be correct and thus not able to be safely programmed even to disable PMRs. Because part of the TXT launch process is the verification of these addresses, it will always be safe to disable PMRs if the TXT launch has succeeded and hence only doing this in such cases. Signed-off-by: Joseph Cihula <joseph.cihula@intel.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
| * | intel-iommu: Fix get_domain_for_dev() error pathAlex Williamson2011-03-121-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If we run out of domain_ids and fail iommu_attach_domain(), we fall into domain_exit() without having setup enough of the domain structure for this to do anything useful. In fact, it typically runs off into the weeds walking the bogus domain->devices list. Just free the domain. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Acked-by: Donald Dutile <ddutile@redhat.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Cc: stable@kernel.org
| * | intel-iommu: Unlink domain from iommuAlex Williamson2011-03-121-1/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When we remove a device, we unlink the iommu from the domain, but we never do the reverse unlinking of the domain from the iommu. This means that we never clear iommu->domain_ids, eventually leading to resource exhaustion if we repeatedly bind and unbind a device to a driver. Also free empty domains to avoid a resource leak. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Acked-by: Donald Dutile <ddutile@redhat.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Cc: stable@kernel.org
| * | intel-iommu: Fix use after release during device attachJan Kiszka2011-01-171-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Obtain the new pgd pointer before releasing the page containing this value. Cc: stable@kernel.org Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Reviewed-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
* | | Fix common misspellingsLucas De Marchi2011-03-311-1/+1
| |/ |/| | | | | | | | | Fixes generated by 'codespell' and manually reviewed. Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
* | drivers: Final irq namespace conversionThomas Gleixner2011-03-291-1/+1
| | | | | | | | | | | | Scripted with coccinelle. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* | PCI / Intel IOMMU: Use syscore_ops instead of sysdev class and sysdevRafael J. Wysocki2011-03-231-29/+9
|/ | | | | | | | | | | | | The Intel IOMMU subsystem uses a sysdev class and a sysdev for executing iommu_suspend() after interrupts have been turned off on the boot CPU (during system suspend) and for executing iommu_resume() before turning on interrupts on the boot CPU (during system resume). However, since both of these functions ignore their arguments, the entire mechanism may be replaced with a struct syscore_ops object which is simpler. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Joerg Roedel <joerg.roedel@amd.com>
* Merge git://git.infradead.org/iommu-2.6Linus Torvalds2010-09-271-0/+27
|\ | | | | | | | | | | * git://git.infradead.org/iommu-2.6: intel-iommu: Use symbolic values instead of magic numbers in Lenovo w/a intel-iommu: Abort IOMMU setup for igfx if BIOS gave no shadow GTT space
| * intel-iommu: Use symbolic values instead of magic numbers in Lenovo w/aAdam Jackson2010-09-211-2/+12
| | | | | | | | | | | | | | | | | | Commit 9eecabcb9a924f1e11ba670365fd4babe423045c ("intel-iommu: Abort IOMMU setup for igfx if BIOS gave no shadow GTT space") uses a bunch of magic numbers. Provide #defines for those to make it look slightly saner. Signed-off-by: Adam Jackson <ajax@redhat.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
| * intel-iommu: Abort IOMMU setup for igfx if BIOS gave no shadow GTT spaceDavid Woodhouse2010-09-211-0/+17
| | | | | | | | | | | | Yet another BIOS bug; Lenovo this time (X201). Red Hat bug #593516. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
* | drivers/pci/intel-iommu.c: fix build with older gcc'sAndrew Morton2010-09-221-47/+43
|/ | | | | | | | | | | | | | | | | | drivers/pci/intel-iommu.c: In function `__iommu_calculate_agaw': drivers/pci/intel-iommu.c:437: sorry, unimplemented: inlining failed in call to 'width_to_agaw': function body not available drivers/pci/intel-iommu.c:445: sorry, unimplemented: called from here Move the offending function (and its siblings) to top-of-file, remove the forward declaration. Addresses https://bugzilla.kernel.org/show_bug.cgi?id=17441 Reported-by: Martin Mokrejs <mmokrejs@ribosome.natur.cuni.cz> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge git://git.infradead.org/iommu-2.6Linus Torvalds2010-08-151-1/+1
|\ | | | | | | | | | | * git://git.infradead.org/iommu-2.6: intel-iommu: Fix 32-bit build warning with __cmpxchg() intr-remap: allow disabling source id checking
| * intel-iommu: Fix 32-bit build warning with __cmpxchg()David Woodhouse2010-08-101-1/+1
| | | | | | | | | | | | | | | | | | | | | | drivers/pci/intel-iommu.c: In function 'dma_pte_addr': drivers/pci/intel-iommu.c:239: warning: passing argument 1 of '__cmpxchg64' from incompatible pointer type It seems that __cmpxchg64() now cares about the type of its pointer argument, so give it a (uint64_t *) instead of a pointer to a structure which contains only that. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
* | Merge branch 'next' of ↵Linus Torvalds2010-08-091-0/+28
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx: (30 commits) DMAENGINE: at_hdmac: locking fixlet DMAENGINE: pch_dma: kill another usage of __raw_{read|write}l dma: dmatest: fix potential sign bug ioat2: catch and recover from broken vtd configurations v6 DMAENGINE: add runtime slave control to COH 901 318 v3 DMAENGINE: add runtime slave config to DMA40 v3 DMAENGINE: generic slave channel control v3 dmaengine: Driver for Topcliff PCH DMA controller intel_mid: Add Mrst & Mfld DMA Drivers drivers/dma: Eliminate a NULL pointer dereference dma/timb_dma: compile warning on 32 bit DMAENGINE: ste_dma40: support older silicon DMAENGINE: ste_dma40: support disabling physical channels DMAENGINE: ste_dma40: no disabled phy channels on ux500 DMAENGINE: ste_dma40: fix suspend bug DMAENGINE: ste_dma40: add DB8500 memcpy channels DMAENGINE: ste_dma40: no flow control on memcpy DMAENGINE: ste_dma40: arch updates for LCLA and LCPA DMAENGINE: ste_dma40: allocate LCLA dynamically DMAENGINE: ste_dma40: no premature stop ... Fix up trivial conflicts in arch/arm/mach-ux500/devices-db8500.c
| * | ioat2: catch and recover from broken vtd configurations v6Dan Williams2010-08-041-0/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On some platforms (MacPro3,1) the BIOS assigns the ioatdma device to the incorrect iommu causing faults when the driver initializes. Add a quirk to catch this misconfiguration and try falling back to untranslated operation (which works in the MacPro3,1 case). Assuming there are other platforms with misconfigured iommus teach the ioatdma driver to treat initialization failures as non-fatal (just fail the driver load and emit a warning instead of triggering a BUG_ON). This can be classified as a boot regression since 2.6.32 on affected platforms since the ioatdma module did not autoload prior to that kernel. Cc: <stable@kernel.org> Acked-by: David Woodhouse <David.Woodhouse@intel.com> Reported-by: Chris Li <lkml@chrisli.org> Tested-by: Chris Li <lkml@chrisli.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* | | iommu-api: Extension to check for interrupt remappingTom Lyon2010-07-191-0/+2
| |/ |/| | | | | | | | | | | | | | | | | This patch allows IOMMU users to determine whether the hardware and software support safe, isolated interrupt remapping. Not all Intel IOMMUs have the hardware, and the software for AMD is not there yet. Signed-off-by: Tom Lyon <pugs@cisco.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
* | intel-iommu: Force-disable IOMMU for iGFX on broken Cantiga revisions.David Woodhouse2010-06-151-1/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Certain revisions of this chipset appear to be broken. There is a shadow GTT which mirrors the real GTT but contains pre-translated physical addresses, for performance reasons. When a GTT update happens, the translations are done once and the resulting physical addresses written back to the shadow GTT. Except sometimes, the physical address is actually written back to the _real_ GTT, not the shadow GTT. Thus we start to see faults when that physical address is fed through translation again. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
* | intel-iommu: Fix double lock in get_domain_for_dev()Jiri Slaby2010-06-151-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | stanse found the following double lock. In get_domain_for_dev: spin_lock_irqsave(&device_domain_lock, flags); domain_exit(domain); domain_remove_dev_info(domain); spin_lock_irqsave(&device_domain_lock, flags); spin_unlock_irqrestore(&device_domain_lock, flags); spin_unlock_irqrestore(&device_domain_lock, flags); This happens when the domain is created by another CPU at the same time as this function is creating one, and the other CPU wins the race to attach it to the device in question, so we have to destroy our own newly-created one. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
* | intel-iommu: Fix reference by physical address in intel_iommu_attach_device()Sheng Yang2010-06-151-1/+2
|/ | | | | | | | | | | | | | Commit a99c47a2 "intel-iommu: errors with smaller iommu widths" replace the dmar_domain->pgd with the first entry of page table when iommu's supported width is smaller than dmar_domain's. But it use physical address directly for new dmar_domain->pgd... This result in KVM oops with VT-d on some machines. Reported-by: Allen Kay <allen.m.kay@intel.com> Cc: Tom Lyon <pugs@cisco.com> Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
* Merge git://git.infradead.org/iommu-2.6Linus Torvalds2010-05-211-70/+59
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * git://git.infradead.org/iommu-2.6: intel-iommu: Set a more specific taint flag for invalid BIOS DMAR tables intel-iommu: Combine the BIOS DMAR table warning messages panic: Add taint flag TAINT_FIRMWARE_WORKAROUND ('I') panic: Allow warnings to set different taint flags intel-iommu: intel_iommu_map_range failed at very end of address space intel-iommu: errors with smaller iommu widths intel-iommu: Fix boot inside 64bit virtualbox with io-apic disabled intel-iommu: use physfn to search drhd for VF intel-iommu: Print out iommu seq_id intel-iommu: Don't complain that ACPI_DMAR_SCOPE_TYPE_IOAPIC is not supported intel-iommu: Avoid global flushes with caching mode. intel-iommu: Use correct domain ID when caching mode is enabled intel-iommu mistakenly uses offset_pfn when caching mode is enabled intel-iommu: use for_each_set_bit() intel-iommu: Fix section mismatch dmar_ir_support() uses dmar_tbl.
| * intel-iommu: intel_iommu_map_range failed at very end of address spaceTom Lyon2010-05-171-8/+3
| | | | | | | | | | | | | | | | | | intel_iommu_map_range() doesn't allow allocation at the very end of the address space; that code has been simplified and corrected. Signed-off-by: Tom Lyon <pugs@cisco.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
| * intel-iommu: errors with smaller iommu widthsTom Lyon2010-05-171-19/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When using iommu_domain_alloc with the Intel iommu, the domain address width is always initialized to 48 bits (agaw 2). This domain->agaw value is then used by pfn_to_dma_pte to (always) build a 4 level page table. However, not all systems support iommu width of 48 or 4 level page tables. In particular, the Core i5-660 and i5-670 support an address width of 36 bits (not 39!), an agaw of only 1, and only 3 level page tables. This version of the patch simply lops off extra levels of the page tables if the agaw value of the iommu is less than what is currently allocated for the domain (in intel_iommu_attach_device). If there were already allocated addresses above what the new iommu can handle, EFAULT is returned. Signed-off-by: Tom Lyon <pugs@cisco.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
| * intel-iommu: Print out iommu seq_idYinghai Lu2010-04-091-3/+6
| | | | | | | | | | | | | | more info on system with more than one IOMMU Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
| * intel-iommu: Avoid global flushes with caching mode.Nadav Amit2010-04-091-5/+14
| | | | | | | | | | | | | | | | | | | | | | While it may be efficient on real hardware, emulation of global invalidations is very expensive as all shadow entries must be examined. This patch changes the behaviour when caching mode is enabled (which is the case when IOMMU emulation takes place). In this case, page specific invalidation is used instead. Signed-off-by: Nadav Amit <nadav.amit@gmail.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
| * intel-iommu: Use correct domain ID when caching mode is enabledNadav Amit2010-04-091-9/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In caching-mode mappings of pages (changes from non-present to present) require invalidation. Currently, this IOTLB flush is performed with domain ID of zero. This is not according to the VT-d spec and causes big problems for emulating software. This patch uses the correct domain ID in IOTLB flushes. Device IOTLB invalidation is performed only on present to non-present changes. This decision is now based on explicit parameter instead of zero domain-ID. Signed-off-by: Nadav Amit <nadav.amit@gmail.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
| * intel-iommu mistakenly uses offset_pfn when caching mode is enabledNadav Amit2010-04-091-2/+1
| | | | | | | | | | | | | | | | intel_map_sg used offset_pfn which was set to zero when invalidating the IOTLB. intel_map_sg now uses size variable for this matter. Signed-off-by: Nadav Amit <nadav.amit@gmail.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
| * intel-iommu: use for_each_set_bit()Akinobu Mita2010-04-091-26/+7
| | | | | | | | | | | | | | | | Replace open-coded loop with for_each_set_bit(). Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
* | VT-d: Change {un}map_range functions to implement {un}map interfaceJoerg Roedel2010-03-071-10/+12
| | | | | | | | | | | | | | | | | | This patch changes the iommu-api functions for mapping and unmapping page ranges to use the new page-size based interface. This allows to remove the range based functions later. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>