Changes in v5:
- Move all register definitions (except those associated with the translation
  tables) into each SMMU's private file.

Changes in v4:
- Add a device_remove hook so that hisi-smmu and smmu-v3 can reclaim other
  resources, such as dynamically allocated memory. The s1cbt and s2cbt memory
  is now allocated in the driver (previously I expected the BIOS to do this).
- Fix bugs according to review comments. CB_FAR_LOW and CB_FAR_HIGH are
  (n) << 3.
- Change context_map in struct arm_smmu_device to dynamically allocated
  memory.
- Merge original patches 3 and 4 into one patch.

Changes in v3:
- Split arm-smmu.c into three files: arm-smmu.h, arm-smmu-base.c and
  arm-smmu.c. To build the standard arm-smmu driver, use these three files.
  To build the Hisilicon SMMU driver, replace arm-smmu.c with hisi-smmu.c.
  The hisi-smmu driver no longer depends on the arm-smmu driver. They can
  exist separately or coexist, both at build time and at run time.
- Give up the Hisilicon private properties.
- Move the hooks from global variables into struct arm_smmu_device, and
  delete three hooks: tlb_sync, flush_pgtable and dt_cfg_probe.
- Share the code that limits the size of the SMMU ias, oas and ubs.
- Add two small patches about code style, variable types, etc.

Changes in v2:
- Split the Hisilicon SMMU implementation into a separate file, hisi-smmu.c.
- Refactor arm-smmu.c. Some direct calls to hardware-dependent functions are
  replaced with hooks. Move common struct and macro definitions into
  arm-smmu.h.
- Merge the description of the Hisilicon private properties into
  arm,smmu.txt.
I tried to merge the hisi-smmu driver into arm-smmu.c, but it looks impossible. The biggest problem is that too many registers are different: the base address, the field definitions, or registers present on only one side. If I used #if, hisi-smmu and arm-smmu could not coexist in one binary, and almost 20 #if blocks would be needed.
In addition, SMMUv3 is not compatible with v2 either, and it resembles hisi-smmu in this respect: the register definitions and the fault handler differ from v2, but the fdt configuration and memory map can be reused. Hence, arm-smmu-base.c and arm-smmu.h should be shared by all SMMUs (v2, v3 and hisi), and each SMMU will own a private file: arm-smmu.c (for v1 and v2), arm-smmu-v3.c, hisi-smmu.c.
All macros that are not used in arm-smmu-base.c and not shared by all SMMUs have been placed into each private file; some are duplicated. I don't think this will bring any maintenance headaches, except when the macros need to be renamed. After all, they are hardware dependent.
Zhen Lei (5):
  iommu/arm: change some structure member types in arm_smmu_device
  iommu/arm: eliminate errors reported by checkpatch
  iommu/arm: apart arm-smmu.c to share code with other SMMUs
  iommu/hisilicon: Add support for Hisilicon Ltd. System MMU architecture
  documentation/iommu: Add description of Hisilicon SMMU private binding
 .../devicetree/bindings/iommu/arm,smmu.txt |    2 +
 drivers/iommu/Kconfig                      |   14 +
 drivers/iommu/Makefile                     |    2 +
 drivers/iommu/arm-smmu-base.c              | 1085 +++++++++++++++++
 drivers/iommu/arm-smmu.c                   | 1247 +-------------------
 drivers/iommu/arm-smmu.h                   |  258 ++++
 drivers/iommu/hisi-smmu.c                  |  575 +++++++++
 7 files changed, 1985 insertions(+), 1198 deletions(-)
 create mode 100644 drivers/iommu/arm-smmu-base.c
 create mode 100644 drivers/iommu/arm-smmu.h
 create mode 100644 drivers/iommu/hisi-smmu.c
Some structure members, such as s1_output_size, can never hold a value of 4G or more. Changing them from unsigned long to u32 saves a little memory on ARM64.
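To illustrate the saving, here is a minimal user-space sketch, assuming an LP64 ABI (such as arm64 Linux) where unsigned long is 8 bytes. The structs sizes_ulong and sizes_u32 and the helper bytes_saved() are hypothetical stand-ins that hold only the five members touched by this patch; they are not the real struct arm_smmu_device, and uint32_t stands in for the kernel's u32.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in: the five members as declared before the patch. */
struct sizes_ulong {
	unsigned long size;
	unsigned long pagesize;
	unsigned long input_size;
	unsigned long s1_output_size;
	unsigned long s2_output_size;
};

/* Hypothetical stand-in: the same members after the patch. */
struct sizes_u32 {
	uint32_t size;
	uint32_t pagesize;
	uint32_t input_size;
	uint32_t s1_output_size;
	uint32_t s2_output_size;
};

/* Bytes saved per device instance for these five members. */
static size_t bytes_saved(void)
{
	/* unsigned long is at least 32 bits, so this cannot underflow. */
	assert(sizeof(struct sizes_ulong) >= sizeof(struct sizes_u32));
	return sizeof(struct sizes_ulong) - sizeof(struct sizes_u32);
}
```

On an LP64 target the first struct occupies 40 bytes and the second 20, so the saving for these members is 20 bytes per SMMU instance. On an ILP32 target both structs have the same size and nothing is saved.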
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
---
 drivers/iommu/arm-smmu.c | 26 +++++++++++++-------------
 1 file changed, 13 insertions(+), 13 deletions(-)
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 1599354..8c95727 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -352,8 +352,8 @@ struct arm_smmu_device {
 	struct device_node		*parent_of_node;
 
 	void __iomem			*base;
-	unsigned long			size;
-	unsigned long			pagesize;
+	u32				size;
+	u32				pagesize;
 
 #define ARM_SMMU_FEAT_COHERENT_WALK	(1 << 0)
 #define ARM_SMMU_FEAT_STREAM_MATCH	(1 << 1)
@@ -374,9 +374,9 @@ struct arm_smmu_device {
 	u32				num_mapping_groups;
 	DECLARE_BITMAP(smr_map, ARM_SMMU_MAX_SMRS);
 
-	unsigned long			input_size;
-	unsigned long			s1_output_size;
-	unsigned long			s2_output_size;
+	u32				input_size;
+	u32				s1_output_size;
+	u32				s2_output_size;
 
 	u32				num_global_irqs;
 	u32				num_context_irqs;
@@ -1676,7 +1676,7 @@ static void arm_smmu_device_reset(struct arm_smmu_device *smmu)
 	writel(reg, ARM_SMMU_GR0_NS(smmu) + ARM_SMMU_GR0_sCR0);
 }
 
-static int arm_smmu_id_size_to_bits(int size)
+static u32 arm_smmu_id_size_to_bits(u32 size)
 {
 	switch (size) {
 	case 0:
@@ -1697,7 +1697,7 @@ static int arm_smmu_id_size_to_bits(int size)
 
 static int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu)
 {
-	unsigned long size;
+	u32 size;
 	void __iomem *gr0_base = ARM_SMMU_GR0(smmu);
 	u32 id;
 
@@ -1782,8 +1782,8 @@ static int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu)
 	size = 1 << (((id >> ID1_NUMPAGENDXB_SHIFT) & ID1_NUMPAGENDXB_MASK) + 1);
 	size *= (smmu->pagesize << 1);
 	if (smmu->size != size)
-		dev_warn(smmu->dev, "SMMU address space size (0x%lx) differs "
-			"from mapped region size (0x%lx)!\n", size, smmu->size);
+		dev_warn(smmu->dev, "SMMU address space size (0x%x) differs "
+			"from mapped region size (0x%x)!\n", size, smmu->size);
 
 	smmu->num_s2_context_banks = (id >> ID1_NUMS2CB_SHIFT) &
 				      ID1_NUMS2CB_MASK;
@@ -1804,14 +1804,14 @@ static int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu)
 	 * allocation (PTRS_PER_PGD).
 	 */
 #ifdef CONFIG_64BIT
-	smmu->s1_output_size = min((unsigned long)VA_BITS, size);
+	smmu->s1_output_size = min((u32)VA_BITS, size);
 #else
-	smmu->s1_output_size = min(32UL, size);
+	smmu->s1_output_size = min(32U, size);
 #endif
 
 	/* The stage-2 output mask is also applied for bypass */
 	size = arm_smmu_id_size_to_bits((id >> ID2_OAS_SHIFT) & ID2_OAS_MASK);
-	smmu->s2_output_size = min((unsigned long)PHYS_MASK_SHIFT, size);
+	smmu->s2_output_size = min((u32)PHYS_MASK_SHIFT, size);
 
 	if (smmu->version == 1) {
 		smmu->input_size = 32;
@@ -1834,7 +1834,7 @@ static int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu)
 	}
 
 	dev_notice(smmu->dev,
-		   "\t%lu-bit VA, %lu-bit IPA, %lu-bit PA\n",
+		   "\t%u-bit VA, %u-bit IPA, %u-bit PA\n",
 		   smmu->input_size, smmu->s1_output_size, smmu->s2_output_size);
 	return 0;
 }
--
1.8.0
Running scripts/checkpatch.pl on arm-smmu.c reports some errors, mostly about coding style. Fix them.
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
---
 drivers/iommu/arm-smmu.c | 11 ++++-------
 1 file changed, 4 insertions(+), 7 deletions(-)
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 8c95727..e93f2dc 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -10,10 +10,6 @@
  * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
  * GNU General Public License for more details.
  *
- * You should have received a copy of the GNU General Public License
- * along with this program; if not, write to the Free Software
- * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
- *
  * Copyright (C) 2013 ARM Limited
  *
  * Author: Will Deacon <will.deacon@arm.com>
@@ -419,7 +415,7 @@ struct arm_smmu_option_prop {
 	const char *prop;
 };
 
-static struct arm_smmu_option_prop arm_smmu_options [] = {
+static struct arm_smmu_option_prop arm_smmu_options[] = {
 	{ ARM_SMMU_OPT_SECURE_CFG_ACCESS, "calxeda,smmu-secure-config-access" },
 	{ 0, NULL},
 };
@@ -1913,7 +1909,8 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev)
 	}
 	dev_notice(dev, "registered %d master devices\n", i);
 
-	if ((dev_node = of_parse_phandle(dev->of_node, "smmu-parent", 0)))
+	dev_node = of_parse_phandle(dev->of_node, "smmu-parent", 0);
+	if (dev_node)
 		smmu->parent_of_node = dev_node;
 
 	err = arm_smmu_device_cfg_probe(smmu);
@@ -2006,7 +2003,7 @@ static int arm_smmu_device_remove(struct platform_device *pdev)
 		free_irq(smmu->irqs[i], smmu);
 
 	/* Turn the thing off */
-	writel(sCR0_CLIENTPD,ARM_SMMU_GR0_NS(smmu) + ARM_SMMU_GR0_sCR0);
+	writel(sCR0_CLIENTPD, ARM_SMMU_GR0_NS(smmu) + ARM_SMMU_GR0_sCR0);
 	return 0;
 }
 
--
1.8.0
Other SMMUs (including SMMUv3) use register definitions that are incompatible with SMMUv1-2, but they still use the ARMv8 translation system. To reuse as much of the current arm-smmu (SMMUv1-2) code as possible, split arm-smmu.c. Both arm-smmu-base.c and arm-smmu.h are shared by all SMMUs.
So that all SMMU drivers can run at the same time, place the hwdep_ops hooks in struct arm_smmu_device. Each SMMU then invokes the appropriate hooks at run time.
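The per-device hook idea can be sketched in user space as follows. This is a simplified illustration, not the driver code: struct smmu_device is a cut-down stand-in for struct arm_smmu_device, only one hook (device_cfg_probe) is shown, and the demo_* probe functions and the context-bank counts they set are invented placeholders.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct smmu_device;

/* Simplified subset of the hook table; the real table has more entries. */
struct smmu_hwdep_ops {
	int (*device_cfg_probe)(struct smmu_device *smmu);
};

/* Cut-down stand-in for struct arm_smmu_device. */
struct smmu_device {
	uint32_t num_context_banks;
	const struct smmu_hwdep_ops *hwdep_ops;	/* per device, not global */
};

/* Placeholder probes: the bank counts are invented for this example. */
static int demo_arm_cfg_probe(struct smmu_device *smmu)
{
	smmu->num_context_banks = 8;
	return 0;
}

static int demo_hisi_cfg_probe(struct smmu_device *smmu)
{
	smmu->num_context_banks = 256;
	return 0;
}

static const struct smmu_hwdep_ops demo_arm_ops = {
	.device_cfg_probe = demo_arm_cfg_probe,
};

static const struct smmu_hwdep_ops demo_hisi_ops = {
	.device_cfg_probe = demo_hisi_cfg_probe,
};

/* Shared probe path: record the hooks, then dispatch through them. */
static int smmu_common_probe(struct smmu_device *smmu,
			     const struct smmu_hwdep_ops *ops)
{
	assert(ops && ops->device_cfg_probe);
	smmu->hwdep_ops = ops;
	return smmu->hwdep_ops->device_cfg_probe(smmu);
}
```

Because the table pointer lives in each device rather than in a global variable, two devices probed with different tables can coexist in one binary, which a compile-time #if cannot provide.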
After the split, adjust the code as follows:
1. Limiting the SMMU ias, oas and uas sizes is a common operation, so it is
   shared.
2. In some SMMUs, the relationship between StreamID and context bank is
   fixed.
3. Add the macro definition MAIR0_STAGE1.
4. Change the array context_map in struct arm_smmu_device to dynamically
   allocated memory.
5. Move the bus_set_iommu operation into arm_smmu_ops_init, whose init phase
   is subsys_initcall_sync. This ensures that device_probe has finished for
   all SMMUs before it runs.
6. Rename member "cbar" in struct arm_smmu_cfg to "type", because some SMMUs
   may not contain the CBAR register. The CBAR_ prefix is also trimmed from
   the associated macros.
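As an illustration of a dynamically allocated context_map bitmap, here is a minimal user-space sketch. It assumes calloc in place of devm_kzalloc and a plain, non-atomic loop in place of the kernel's find_next_zero_bit and test_and_set_bit; the helper names context_map_alloc, context_map_get and context_map_put are hypothetical. The point it shows is that the allocation is sized in bytes, that is, the number of longs needed for the bitmap multiplied by sizeof(unsigned long).

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/* User-space equivalents of the kernel macros of the same names. */
#define BITS_PER_LONG		(sizeof(unsigned long) * CHAR_BIT)
#define BITS_TO_LONGS(n)	(((n) + BITS_PER_LONG - 1) / BITS_PER_LONG)

/* Allocate a zeroed bitmap large enough to hold nbits bits. */
static unsigned long *context_map_alloc(unsigned int nbits)
{
	return calloc(BITS_TO_LONGS(nbits), sizeof(unsigned long));
}

/*
 * Claim the first free index in [start, end).
 * Returns the index, or -1 if every bit in the range is already set.
 * Not atomic, unlike the kernel's test_and_set_bit().
 */
static int context_map_get(unsigned long *map, unsigned int start,
			   unsigned int end)
{
	unsigned int idx;

	assert(map != NULL);
	for (idx = start; idx < end; idx++) {
		unsigned long mask = 1UL << (idx % BITS_PER_LONG);
		unsigned long *word = &map[idx / BITS_PER_LONG];

		if (!(*word & mask)) {
			*word |= mask;
			return (int)idx;
		}
	}
	return -1;
}

/* Release a previously claimed index. */
static void context_map_put(unsigned long *map, unsigned int idx)
{
	map[idx / BITS_PER_LONG] &= ~(1UL << (idx % BITS_PER_LONG));
}
```

The start argument mirrors how the shared code passes either 0 or num_s2_context_banks as the first candidate context bank.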
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
---
 drivers/iommu/Kconfig         |    4 +
 drivers/iommu/Makefile        |    1 +
 drivers/iommu/arm-smmu-base.c | 1082 ++++++++++++++++++++++++++++++++++++
 drivers/iommu/arm-smmu.c      | 1236 ++---------------------------------------
 drivers/iommu/arm-smmu.h      |  258 +++++++++
 5 files changed, 1390 insertions(+), 1191 deletions(-)
 create mode 100644 drivers/iommu/arm-smmu-base.c
 create mode 100644 drivers/iommu/arm-smmu.h
diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index d260605..fad5e38 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -292,10 +292,14 @@ config SPAPR_TCE_IOMMU
 	  Enables bits of IOMMU API required by VFIO. The iommu_ops
 	  is not implemented as it is not necessary for VFIO.
 
+config ARM_SMMU_BASE
+	bool
+
 config ARM_SMMU
 	bool "ARM Ltd. System MMU (SMMU) Support"
 	depends on ARM64 || (ARM_LPAE && OF)
 	select IOMMU_API
+	select ARM_SMMU_BASE
 	select ARM_DMA_USE_IOMMU if ARM
 	help
 	  Support for implementations of the ARM System MMU architecture
diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
index 8893bad..717cfa3 100644
--- a/drivers/iommu/Makefile
+++ b/drivers/iommu/Makefile
@@ -4,6 +4,7 @@ obj-$(CONFIG_OF_IOMMU) += of_iommu.o
 obj-$(CONFIG_MSM_IOMMU) += msm_iommu.o msm_iommu_dev.o
 obj-$(CONFIG_AMD_IOMMU) += amd_iommu.o amd_iommu_init.o
 obj-$(CONFIG_AMD_IOMMU_V2) += amd_iommu_v2.o
+obj-$(CONFIG_ARM_SMMU_BASE) += arm-smmu-base.o
 obj-$(CONFIG_ARM_SMMU) += arm-smmu.o
 obj-$(CONFIG_DMAR_TABLE) += dmar.o
 obj-$(CONFIG_INTEL_IOMMU) += iova.o intel-iommu.o
diff --git a/drivers/iommu/arm-smmu-base.c b/drivers/iommu/arm-smmu-base.c
new file mode 100644
index 0000000..ca0e3db
--- /dev/null
+++ b/drivers/iommu/arm-smmu-base.c
@@ -0,0 +1,1082 @@
+/*
+ * IOMMU API for ARM architected SMMU implementations.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * Copyright (C) 2013 ARM Limited
+ *
+ * Author: Will Deacon <will.deacon@arm.com>
+ *
+ * This driver currently supports:
+ *	- SMMUv1 and v2 implementations
+ *	- Stream-matching and stream-indexing
+ *	- v7/v8 long-descriptor format
+ *	- Non-secure access to the SMMU
+ *	- 4k and 64k pages, with contiguous pte hints.
+ * - Up to 42-bit addressing (dependent on VA_BITS) + * - Context fault reporting + */ + +#define pr_fmt(fmt) "arm-smmu: " fmt + +#include <linux/delay.h> +#include <linux/dma-mapping.h> +#include <linux/err.h> +#include <linux/interrupt.h> +#include <linux/io.h> +#include <linux/iommu.h> +#include <linux/mm.h> +#include <linux/of.h> +#include <linux/platform_device.h> +#include <linux/slab.h> +#include <linux/spinlock.h> + +#include <linux/amba/bus.h> + +#include <asm/pgalloc.h> +#include "arm-smmu.h" + +static DEFINE_SPINLOCK(arm_smmu_devices_lock); +static LIST_HEAD(arm_smmu_devices); + +struct arm_smmu_option_prop { + u32 opt; + const char *prop; +}; + +static struct arm_smmu_option_prop arm_smmu_options[] = { + { ARM_SMMU_OPT_SECURE_CFG_ACCESS, "calxeda,smmu-secure-config-access" }, + { 0, NULL}, +}; + +static void parse_driver_options(struct arm_smmu_device *smmu) +{ + int i = 0; + do { + if (of_property_read_bool(smmu->dev->of_node, + arm_smmu_options[i].prop)) { + smmu->options |= arm_smmu_options[i].opt; + dev_notice(smmu->dev, "option %s\n", + arm_smmu_options[i].prop); + } + } while (arm_smmu_options[++i].opt); +} + +static struct arm_smmu_master *find_smmu_master(struct arm_smmu_device *smmu, + struct device_node *dev_node) +{ + struct rb_node *node = smmu->masters.rb_node; + + while (node) { + struct arm_smmu_master *master; + master = container_of(node, struct arm_smmu_master, node); + + if (dev_node < master->of_node) + node = node->rb_left; + else if (dev_node > master->of_node) + node = node->rb_right; + else + return master; + } + + return NULL; +} + +static int insert_smmu_master(struct arm_smmu_device *smmu, + struct arm_smmu_master *master) +{ + struct rb_node **new, *parent; + + new = &smmu->masters.rb_node; + parent = NULL; + while (*new) { + struct arm_smmu_master *this; + this = container_of(*new, struct arm_smmu_master, node); + + parent = *new; + if (master->of_node < this->of_node) + new = &((*new)->rb_left); + else if (master->of_node 
> this->of_node) + new = &((*new)->rb_right); + else + return -EEXIST; + } + + rb_link_node(&master->node, parent, new); + rb_insert_color(&master->node, &smmu->masters); + return 0; +} + +static int register_smmu_master(struct arm_smmu_device *smmu, + struct device *dev, + struct of_phandle_args *masterspec) +{ + int i; + struct arm_smmu_master *master; + + master = find_smmu_master(smmu, masterspec->np); + if (master) { + dev_err(dev, + "rejecting multiple registrations for master device %s\n", + masterspec->np->name); + return -EBUSY; + } + + if (masterspec->args_count > MAX_MASTER_STREAMIDS) { + dev_err(dev, + "reached maximum number (%d) of stream IDs for master device %s\n", + MAX_MASTER_STREAMIDS, masterspec->np->name); + return -ENOSPC; + } + + master = devm_kzalloc(dev, sizeof(*master), GFP_KERNEL); + if (!master) + return -ENOMEM; + + master->of_node = masterspec->np; + master->num_streamids = masterspec->args_count; + + for (i = 0; i < master->num_streamids; ++i) + master->streamids[i] = masterspec->args[i]; + + return insert_smmu_master(smmu, master); +} + +struct arm_smmu_device *find_parent_smmu(struct arm_smmu_device *smmu) +{ + struct arm_smmu_device *parent; + + if (!smmu->parent_of_node) + return NULL; + + spin_lock(&arm_smmu_devices_lock); + list_for_each_entry(parent, &arm_smmu_devices, list) + if (parent->dev->of_node == smmu->parent_of_node) + goto out_unlock; + + parent = NULL; + dev_warn(smmu->dev, + "Failed to find SMMU parent despite parent in DT\n"); +out_unlock: + spin_unlock(&arm_smmu_devices_lock); + return parent; +} + +int __arm_smmu_alloc_bitmap(unsigned long *map, int start, int end) +{ + int idx; + + do { + idx = find_next_zero_bit(map, end, start); + if (idx == end) + return -ENOSPC; + } while (test_and_set_bit(idx, map)); + + return idx; +} + +void __arm_smmu_free_bitmap(unsigned long *map, int idx) +{ + clear_bit(idx, map); +} + +void arm_smmu_tlb_sync_wait(struct arm_smmu_device *smmu) +{ + int count = 0; + + while 
(!smmu->hwdep_ops->tlb_sync_finished(smmu)) { + cpu_relax(); + if (++count == TLB_LOOP_TIMEOUT) { + dev_err_ratelimited(smmu->dev, + "TLB sync timed out -- SMMU may be deadlocked\n"); + return; + } + udelay(1); + } +} + +void arm_smmu_flush_pgtable(struct arm_smmu_device *smmu, void *addr, + size_t size) +{ + unsigned long offset = (unsigned long)addr & ~PAGE_MASK; + + + /* Ensure new page tables are visible to the hardware walker */ + if (smmu->features & ARM_SMMU_FEAT_COHERENT_WALK) { + dsb(ishst); + } else { + /* + * If the SMMU can't walk tables in the CPU caches, treat them + * like non-coherent DMA since we need to flush the new entries + * all the way out to memory. There's no possibility of + * recursion here as the SMMU table walker will not be wired + * through another SMMU. + */ + dma_map_page(smmu->dev, virt_to_page(addr), offset, size, + DMA_TO_DEVICE); + } +} + +static int arm_smmu_init_domain_context(struct iommu_domain *domain, + struct device *dev) +{ + int irq, ret, start; + struct arm_smmu_domain *smmu_domain = domain->priv; + struct arm_smmu_cfg *root_cfg = &smmu_domain->root_cfg; + struct arm_smmu_device *smmu, *parent; + struct arm_smmu_master *master; + + /* + * Walk the SMMU chain to find the root device for this chain. + * We assume that no masters have translations which terminate + * early, and therefore check that the root SMMU does indeed have + * a StreamID for the master in question. + */ + parent = dev->archdata.iommu; + smmu_domain->output_mask = -1; + do { + smmu = parent; + smmu_domain->output_mask &= (1ULL << smmu->s2_output_size) - 1; + } while ((parent = find_parent_smmu(smmu))); + + master = find_smmu_master(smmu, dev->of_node); + if (!master) { + dev_err(dev, "unable to find root SMMU for device\n"); + return -ENODEV; + } + + if (smmu->features & ARM_SMMU_FEAT_TRANS_NESTED) { + /* + * We will likely want to change this if/when KVM gets + * involved. 
+ */ + root_cfg->type = TYPE_S1_TRANS_S2_BYPASS; + start = smmu->num_s2_context_banks; + } else if (smmu->features & ARM_SMMU_FEAT_TRANS_S2) { + root_cfg->type = TYPE_S2_TRANS; + start = 0; + } else { + root_cfg->type = TYPE_S1_TRANS_S2_BYPASS; + start = smmu->num_s2_context_banks; + } + + ret = smmu->hwdep_ops->alloc_context(smmu, start, + smmu->num_context_banks, master); + if (IS_ERR_VALUE(ret)) + return ret; + + root_cfg->cbndx = ret; + if (smmu->version == 1) { + root_cfg->irptndx = atomic_inc_return(&smmu->irptndx); + root_cfg->irptndx %= smmu->num_context_irqs; + } else { + root_cfg->irptndx = root_cfg->cbndx; + } + + irq = smmu->irqs[smmu->num_global_irqs + root_cfg->irptndx]; + ret = request_irq(irq, smmu->hwdep_ops->context_fault, IRQF_SHARED, + "arm-smmu-context-fault", domain); + if (IS_ERR_VALUE(ret)) { + dev_err(smmu->dev, "failed to request context IRQ %d (%u)\n", + root_cfg->irptndx, irq); + root_cfg->irptndx = INVALID_IRPTNDX; + goto out_free_context; + } + + root_cfg->smmu = smmu; + smmu->hwdep_ops->init_context_bank(smmu_domain); + return ret; + +out_free_context: + __arm_smmu_free_bitmap(smmu->context_map, root_cfg->cbndx); + return ret; +} + +static void arm_smmu_destroy_domain_context(struct iommu_domain *domain) +{ + struct arm_smmu_domain *smmu_domain = domain->priv; + struct arm_smmu_cfg *root_cfg = &smmu_domain->root_cfg; + struct arm_smmu_device *smmu = root_cfg->smmu; + int irq; + + if (!smmu) + return; + + smmu->hwdep_ops->destroy_context_bank(smmu_domain); + + if (root_cfg->irptndx != INVALID_IRPTNDX) { + irq = smmu->irqs[smmu->num_global_irqs + root_cfg->irptndx]; + free_irq(irq, domain); + } + + __arm_smmu_free_bitmap(smmu->context_map, root_cfg->cbndx); +} + +static int arm_smmu_domain_init(struct iommu_domain *domain) +{ + struct arm_smmu_domain *smmu_domain; + pgd_t *pgd; + + /* + * Allocate the domain and initialise some of its data structures. + * We can't really do anything meaningful until we've added a + * master. 
+ */ + smmu_domain = kzalloc(sizeof(*smmu_domain), GFP_KERNEL); + if (!smmu_domain) + return -ENOMEM; + + pgd = kzalloc(PTRS_PER_PGD * sizeof(pgd_t), GFP_KERNEL); + if (!pgd) + goto out_free_domain; + smmu_domain->root_cfg.pgd = pgd; + + spin_lock_init(&smmu_domain->lock); + domain->priv = smmu_domain; + return 0; + +out_free_domain: + kfree(smmu_domain); + return -ENOMEM; +} + +static void arm_smmu_free_ptes(pmd_t *pmd) +{ + pgtable_t table = pmd_pgtable(*pmd); + pgtable_page_dtor(table); + __free_page(table); +} + +static void arm_smmu_free_pmds(pud_t *pud) +{ + int i; + pmd_t *pmd, *pmd_base = pmd_offset(pud, 0); + + pmd = pmd_base; + for (i = 0; i < PTRS_PER_PMD; ++i) { + if (pmd_none(*pmd)) + continue; + + arm_smmu_free_ptes(pmd); + pmd++; + } + + pmd_free(NULL, pmd_base); +} + +static void arm_smmu_free_puds(pgd_t *pgd) +{ + int i; + pud_t *pud, *pud_base = pud_offset(pgd, 0); + + pud = pud_base; + for (i = 0; i < PTRS_PER_PUD; ++i) { + if (pud_none(*pud)) + continue; + + arm_smmu_free_pmds(pud); + pud++; + } + + pud_free(NULL, pud_base); +} + +static void arm_smmu_free_pgtables(struct arm_smmu_domain *smmu_domain) +{ + int i; + struct arm_smmu_cfg *root_cfg = &smmu_domain->root_cfg; + pgd_t *pgd, *pgd_base = root_cfg->pgd; + + /* + * Recursively free the page tables for this domain. We don't + * care about speculative TLB filling because the tables should + * not be active in any context bank at this point (SCTLR.M is 0). + */ + pgd = pgd_base; + for (i = 0; i < PTRS_PER_PGD; ++i) { + if (pgd_none(*pgd)) + continue; + arm_smmu_free_puds(pgd); + pgd++; + } + + kfree(pgd_base); +} + +static void arm_smmu_domain_destroy(struct iommu_domain *domain) +{ + struct arm_smmu_domain *smmu_domain = domain->priv; + + /* + * Free the domain resources. We assume that all devices have + * already been detached. 
+ */ + arm_smmu_destroy_domain_context(domain); + arm_smmu_free_pgtables(smmu_domain); + kfree(smmu_domain); +} + +static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev) +{ + int ret = -EINVAL; + struct arm_smmu_domain *smmu_domain = domain->priv; + struct arm_smmu_device *device_smmu = dev->archdata.iommu; + struct arm_smmu_master *master; + unsigned long flags; + + if (!device_smmu) { + dev_err(dev, "cannot attach to SMMU, is it on the same bus?\n"); + return -ENXIO; + } + + /* + * Sanity check the domain. We don't currently support domains + * that cross between different SMMU chains. + */ + spin_lock_irqsave(&smmu_domain->lock, flags); + if (!smmu_domain->leaf_smmu) { + /* Now that we have a master, we can finalise the domain */ + ret = arm_smmu_init_domain_context(domain, dev); + if (IS_ERR_VALUE(ret)) + goto err_unlock; + + smmu_domain->leaf_smmu = device_smmu; + } else if (smmu_domain->leaf_smmu != device_smmu) { + dev_err(dev, + "cannot attach to SMMU %s whilst already attached to domain on SMMU %s\n", + dev_name(smmu_domain->leaf_smmu->dev), + dev_name(device_smmu->dev)); + goto err_unlock; + } + spin_unlock_irqrestore(&smmu_domain->lock, flags); + + /* Looks ok, so add the device to the domain */ + master = find_smmu_master(smmu_domain->leaf_smmu, dev->of_node); + if (!master) + return -ENODEV; + + return device_smmu->hwdep_ops->domain_add_master(smmu_domain, master); + +err_unlock: + spin_unlock_irqrestore(&smmu_domain->lock, flags); + return ret; +} + +static void arm_smmu_detach_dev(struct iommu_domain *domain, struct device *dev) +{ + struct arm_smmu_domain *smmu_domain = domain->priv; + struct arm_smmu_master *master; + struct arm_smmu_device *smmu = smmu_domain->root_cfg.smmu; + + master = find_smmu_master(smmu_domain->leaf_smmu, dev->of_node); + if (master) + smmu->hwdep_ops->domain_remove_master(smmu_domain, master); +} + +static bool arm_smmu_pte_is_contiguous_range(unsigned long addr, + unsigned long end) +{ + return 
!(addr & ~ARM_SMMU_PTE_CONT_MASK) && + (addr + ARM_SMMU_PTE_CONT_SIZE <= end); +} + +static int arm_smmu_alloc_init_pte(struct arm_smmu_device *smmu, pmd_t *pmd, + unsigned long addr, unsigned long end, + unsigned long pfn, int prot, int stage) +{ + pte_t *pte, *start; + pteval_t pteval = ARM_SMMU_PTE_PAGE | ARM_SMMU_PTE_AF | ARM_SMMU_PTE_XN; + + if (pmd_none(*pmd)) { + /* Allocate a new set of tables */ + pgtable_t table = alloc_page(GFP_ATOMIC|__GFP_ZERO); + if (!table) + return -ENOMEM; + + arm_smmu_flush_pgtable(smmu, page_address(table), PAGE_SIZE); + if (!pgtable_page_ctor(table)) { + __free_page(table); + return -ENOMEM; + } + pmd_populate(NULL, pmd, table); + arm_smmu_flush_pgtable(smmu, pmd, sizeof(*pmd)); + } + + if (stage == 1) { + pteval |= ARM_SMMU_PTE_AP_UNPRIV | ARM_SMMU_PTE_nG; + if (!(prot & IOMMU_WRITE) && (prot & IOMMU_READ)) + pteval |= ARM_SMMU_PTE_AP_RDONLY; + + if (prot & IOMMU_CACHE) + pteval |= (MAIR_ATTR_IDX_CACHE << + ARM_SMMU_PTE_ATTRINDX_SHIFT); + } else { + pteval |= ARM_SMMU_PTE_HAP_FAULT; + if (prot & IOMMU_READ) + pteval |= ARM_SMMU_PTE_HAP_READ; + if (prot & IOMMU_WRITE) + pteval |= ARM_SMMU_PTE_HAP_WRITE; + if (prot & IOMMU_CACHE) + pteval |= ARM_SMMU_PTE_MEMATTR_OIWB; + else + pteval |= ARM_SMMU_PTE_MEMATTR_NC; + } + + /* If no access, create a faulting entry to avoid TLB fills */ + if (prot & IOMMU_EXEC) + pteval &= ~ARM_SMMU_PTE_XN; + else if (!(prot & (IOMMU_READ | IOMMU_WRITE))) + pteval &= ~ARM_SMMU_PTE_PAGE; + + pteval |= ARM_SMMU_PTE_SH_IS; + start = pmd_page_vaddr(*pmd) + pte_index(addr); + pte = start; + + /* + * Install the page table entries. This is fairly complicated + * since we attempt to make use of the contiguous hint in the + * ptes where possible. 
The contiguous hint indicates a series + * of ARM_SMMU_PTE_CONT_ENTRIES ptes mapping a physically + * contiguous region with the following constraints: + * + * - The region start is aligned to ARM_SMMU_PTE_CONT_SIZE + * - Each pte in the region has the contiguous hint bit set + * + * This complicates unmapping (also handled by this code, when + * neither IOMMU_READ or IOMMU_WRITE are set) because it is + * possible, yet highly unlikely, that a client may unmap only + * part of a contiguous range. This requires clearing of the + * contiguous hint bits in the range before installing the new + * faulting entries. + * + * Note that re-mapping an address range without first unmapping + * it is not supported, so TLB invalidation is not required here + * and is instead performed at unmap and domain-init time. + */ + do { + int i = 1; + pteval &= ~ARM_SMMU_PTE_CONT; + + if (arm_smmu_pte_is_contiguous_range(addr, end)) { + i = ARM_SMMU_PTE_CONT_ENTRIES; + pteval |= ARM_SMMU_PTE_CONT; + } else if (pte_val(*pte) & + (ARM_SMMU_PTE_CONT | ARM_SMMU_PTE_PAGE)) { + int j; + pte_t *cont_start; + unsigned long idx = pte_index(addr); + + idx &= ~(ARM_SMMU_PTE_CONT_ENTRIES - 1); + cont_start = pmd_page_vaddr(*pmd) + idx; + for (j = 0; j < ARM_SMMU_PTE_CONT_ENTRIES; ++j) + pte_val(*(cont_start + j)) &= ~ARM_SMMU_PTE_CONT; + + arm_smmu_flush_pgtable(smmu, cont_start, + sizeof(*pte) * + ARM_SMMU_PTE_CONT_ENTRIES); + } + + do { + *pte = pfn_pte(pfn, __pgprot(pteval)); + } while (pte++, pfn++, addr += PAGE_SIZE, --i); + } while (addr != end); + + arm_smmu_flush_pgtable(smmu, start, sizeof(*pte) * (pte - start)); + return 0; +} + +static int arm_smmu_alloc_init_pmd(struct arm_smmu_device *smmu, pud_t *pud, + unsigned long addr, unsigned long end, + phys_addr_t phys, int prot, int stage) +{ + int ret; + pmd_t *pmd; + unsigned long next, pfn = __phys_to_pfn(phys); + +#ifndef __PAGETABLE_PMD_FOLDED + if (pud_none(*pud)) { + pmd = (pmd_t *)get_zeroed_page(GFP_ATOMIC); + if (!pmd) + return 
-ENOMEM; + + arm_smmu_flush_pgtable(smmu, pmd, PAGE_SIZE); + pud_populate(NULL, pud, pmd); + arm_smmu_flush_pgtable(smmu, pud, sizeof(*pud)); + + pmd += pmd_index(addr); + } else +#endif + pmd = pmd_offset(pud, addr); + + do { + next = pmd_addr_end(addr, end); + ret = arm_smmu_alloc_init_pte(smmu, pmd, addr, next, pfn, + prot, stage); + phys += next - addr; + } while (pmd++, addr = next, addr < end); + + return ret; +} + +static int arm_smmu_alloc_init_pud(struct arm_smmu_device *smmu, pgd_t *pgd, + unsigned long addr, unsigned long end, + phys_addr_t phys, int prot, int stage) +{ + int ret = 0; + pud_t *pud; + unsigned long next; + +#ifndef __PAGETABLE_PUD_FOLDED + if (pgd_none(*pgd)) { + pud = (pud_t *)get_zeroed_page(GFP_ATOMIC); + if (!pud) + return -ENOMEM; + + arm_smmu_flush_pgtable(smmu, pud, PAGE_SIZE); + pgd_populate(NULL, pgd, pud); + arm_smmu_flush_pgtable(smmu, pgd, sizeof(*pgd)); + + pud += pud_index(addr); + } else +#endif + pud = pud_offset(pgd, addr); + + do { + next = pud_addr_end(addr, end); + ret = arm_smmu_alloc_init_pmd(smmu, pud, addr, next, phys, + prot, stage); + phys += next - addr; + } while (pud++, addr = next, addr < end); + + return ret; +} + +static int arm_smmu_handle_mapping(struct arm_smmu_domain *smmu_domain, + unsigned long iova, phys_addr_t paddr, + size_t size, int prot) +{ + int ret, stage; + unsigned long end; + phys_addr_t input_mask, output_mask; + struct arm_smmu_cfg *root_cfg = &smmu_domain->root_cfg; + pgd_t *pgd = root_cfg->pgd; + struct arm_smmu_device *smmu = root_cfg->smmu; + unsigned long flags; + + if (root_cfg->type == TYPE_S2_TRANS) { + stage = 2; + output_mask = (1ULL << smmu->s2_output_size) - 1; + } else { + stage = 1; + output_mask = (1ULL << smmu->s1_output_size) - 1; + } + + if (!pgd) + return -EINVAL; + + if (size & ~PAGE_MASK) + return -EINVAL; + + input_mask = (1ULL << smmu->input_size) - 1; + if ((phys_addr_t)iova & ~input_mask) + return -ERANGE; + + if (paddr & ~output_mask) + return -ERANGE; + + 
spin_lock_irqsave(&smmu_domain->lock, flags); + pgd += pgd_index(iova); + end = iova + size; + do { + unsigned long next = pgd_addr_end(iova, end); + + ret = arm_smmu_alloc_init_pud(smmu, pgd, iova, next, paddr, + prot, stage); + if (ret) + goto out_unlock; + + paddr += next - iova; + iova = next; + } while (pgd++, iova != end); + +out_unlock: + spin_unlock_irqrestore(&smmu_domain->lock, flags); + + return ret; +} + +static int arm_smmu_map(struct iommu_domain *domain, unsigned long iova, + phys_addr_t paddr, size_t size, int prot) +{ + struct arm_smmu_domain *smmu_domain = domain->priv; + + if (!smmu_domain) + return -ENODEV; + + /* Check for silent address truncation up the SMMU chain. */ + if ((phys_addr_t)iova & ~smmu_domain->output_mask) + return -ERANGE; + + return arm_smmu_handle_mapping(smmu_domain, iova, paddr, size, prot); +} + +static size_t arm_smmu_unmap(struct iommu_domain *domain, unsigned long iova, + size_t size) +{ + int ret; + struct arm_smmu_domain *smmu_domain = domain->priv; + struct arm_smmu_device *smmu = smmu_domain->root_cfg.smmu; + + ret = arm_smmu_handle_mapping(smmu_domain, iova, 0, size, 0); + smmu->hwdep_ops->tlb_inv_context(&smmu_domain->root_cfg); + return ret ? 
0 : size; +} + +static phys_addr_t arm_smmu_iova_to_phys(struct iommu_domain *domain, + dma_addr_t iova) +{ + pgd_t *pgdp, pgd; + pud_t pud; + pmd_t pmd; + pte_t pte; + struct arm_smmu_domain *smmu_domain = domain->priv; + struct arm_smmu_cfg *root_cfg = &smmu_domain->root_cfg; + + pgdp = root_cfg->pgd; + if (!pgdp) + return 0; + + pgd = *(pgdp + pgd_index(iova)); + if (pgd_none(pgd)) + return 0; + + pud = *pud_offset(&pgd, iova); + if (pud_none(pud)) + return 0; + + pmd = *pmd_offset(&pud, iova); + if (pmd_none(pmd)) + return 0; + + pte = *(pmd_page_vaddr(pmd) + pte_index(iova)); + if (pte_none(pte)) + return 0; + + return __pfn_to_phys(pte_pfn(pte)) | (iova & ~PAGE_MASK); +} + +static int arm_smmu_domain_has_cap(struct iommu_domain *domain, + unsigned long cap) +{ + unsigned long caps = 0; + struct arm_smmu_domain *smmu_domain = domain->priv; + + if (smmu_domain->root_cfg.smmu->features & ARM_SMMU_FEAT_COHERENT_WALK) + caps |= IOMMU_CAP_CACHE_COHERENCY; + + return !!(cap & caps); +} + +static int arm_smmu_add_device(struct device *dev) +{ + struct arm_smmu_device *child, *parent, *smmu; + struct arm_smmu_master *master = NULL; + struct iommu_group *group; + int ret; + + if (dev->archdata.iommu) { + dev_warn(dev, "IOMMU driver already assigned to device\n"); + return -EINVAL; + } + + spin_lock(&arm_smmu_devices_lock); + list_for_each_entry(parent, &arm_smmu_devices, list) { + smmu = parent; + + /* Try to find a child of the current SMMU. */ + list_for_each_entry(child, &arm_smmu_devices, list) { + if (child->parent_of_node == parent->dev->of_node) { + /* Does the child sit above our master? */ + master = find_smmu_master(child, dev->of_node); + if (master) { + smmu = NULL; + break; + } + } + } + + /* We found some children, so keep searching. 
*/ + if (!smmu) { + master = NULL; + continue; + } + + master = find_smmu_master(smmu, dev->of_node); + if (master) + break; + } + spin_unlock(&arm_smmu_devices_lock); + + if (!master) + return -ENODEV; + + group = iommu_group_alloc(); + if (IS_ERR(group)) { + dev_err(dev, "Failed to allocate IOMMU group\n"); + return PTR_ERR(group); + } + + ret = iommu_group_add_device(group, dev); + iommu_group_put(group); + dev->archdata.iommu = smmu; + + return ret; +} + +static void arm_smmu_remove_device(struct device *dev) +{ + dev->archdata.iommu = NULL; + iommu_group_remove_device(dev); +} + +static struct iommu_ops arm_smmu_ops = { + .domain_init = arm_smmu_domain_init, + .domain_destroy = arm_smmu_domain_destroy, + .attach_dev = arm_smmu_attach_dev, + .detach_dev = arm_smmu_detach_dev, + .map = arm_smmu_map, + .unmap = arm_smmu_unmap, + .iova_to_phys = arm_smmu_iova_to_phys, + .domain_has_cap = arm_smmu_domain_has_cap, + .add_device = arm_smmu_add_device, + .remove_device = arm_smmu_remove_device, + .pgsize_bitmap = (SECTION_SIZE | + ARM_SMMU_PTE_CONT_SIZE | + PAGE_SIZE), +}; + +int arm_smmu_device_dt_probe(struct platform_device *pdev, + struct smmu_hwdep_ops *ops) +{ + struct resource *res; + struct arm_smmu_device *smmu; + struct device_node *dev_node; + struct device *dev = &pdev->dev; + struct rb_node *node; + struct of_phandle_args masterspec; + int num_irqs, i, err; + + smmu = devm_kzalloc(dev, sizeof(*smmu), GFP_KERNEL); + if (!smmu) { + dev_err(dev, "failed to allocate arm_smmu_device\n"); + return -ENOMEM; + } + smmu->dev = dev; + + res = platform_get_resource(pdev, IORESOURCE_MEM, 0); + smmu->base = devm_ioremap_resource(dev, res); + if (IS_ERR(smmu->base)) + return PTR_ERR(smmu->base); + smmu->size = resource_size(res); + + if (of_property_read_u32(dev->of_node, "#global-interrupts", + &smmu->num_global_irqs)) { + dev_err(dev, "missing #global-interrupts property\n"); + return -ENODEV; + } + + num_irqs = 0; + while ((res = platform_get_resource(pdev, 
IORESOURCE_IRQ, num_irqs))) { + num_irqs++; + if (num_irqs > smmu->num_global_irqs) + smmu->num_context_irqs++; + } + + if (!smmu->num_context_irqs) { + dev_err(dev, "found %d interrupts but expected at least %d\n", + num_irqs, smmu->num_global_irqs + 1); + return -ENODEV; + } + + smmu->irqs = devm_kzalloc(dev, sizeof(*smmu->irqs) * num_irqs, + GFP_KERNEL); + if (!smmu->irqs) { + dev_err(dev, "failed to allocate %d irqs\n", num_irqs); + return -ENOMEM; + } + + for (i = 0; i < num_irqs; ++i) { + int irq = platform_get_irq(pdev, i); + if (irq < 0) { + dev_err(dev, "failed to get irq index %d\n", i); + return -ENODEV; + } + smmu->irqs[i] = irq; + } + + i = 0; + smmu->masters = RB_ROOT; + while (!of_parse_phandle_with_args(dev->of_node, "mmu-masters", + "#stream-id-cells", i, + &masterspec)) { + err = register_smmu_master(smmu, dev, &masterspec); + if (err) { + dev_err(dev, "failed to add master %s\n", + masterspec.np->name); + goto out_put_masters; + } + + i++; + } + dev_notice(dev, "registered %d master devices\n", i); + + dev_node = of_parse_phandle(dev->of_node, "smmu-parent", 0); + if (dev_node) + smmu->parent_of_node = dev_node; + + smmu->hwdep_ops = ops; + + err = smmu->hwdep_ops->device_cfg_probe(smmu); + if (err) + goto out_put_parent; + + smmu->context_map = devm_kzalloc(dev, + BITS_TO_LONGS(smmu->num_context_banks) * sizeof(long), GFP_KERNEL); + if (!smmu->context_map) { + dev_err(dev, "failed to allocate context map\n"); + return -ENOMEM; + } + + /* + * Stage-1 output limited by stage-2 input size due to pgd + * allocation (PTRS_PER_PGD). 
+ */ +#ifdef CONFIG_64BIT + smmu->s1_output_size = min((u32)VA_BITS, smmu->s1_output_size); + smmu->input_size = min((u32)VA_BITS, smmu->input_size); +#else + smmu->s1_output_size = min(32UL, smmu->s1_output_size); + smmu->input_size = 32; +#endif + + /* The stage-2 output mask is also applied for bypass */ + smmu->s2_output_size = min((u32)PHYS_MASK_SHIFT, smmu->s2_output_size); + + dev_notice(smmu->dev, + "\t%u-bit VA, %u-bit IPA, %u-bit PA\n", + smmu->input_size, + smmu->s1_output_size, smmu->s2_output_size); + + parse_driver_options(smmu); + + if (smmu->version > 1 && + smmu->num_context_banks != smmu->num_context_irqs) { + dev_err(dev, + "found only %d context interrupt(s) but %d required\n", + smmu->num_context_irqs, smmu->num_context_banks); + err = -ENODEV; + goto out_put_parent; + } + + for (i = 0; i < smmu->num_global_irqs; ++i) { + err = request_irq(smmu->irqs[i], + smmu->hwdep_ops->global_fault, + IRQF_SHARED, + "arm-smmu global fault", + smmu); + if (err) { + dev_err(dev, "failed to request global IRQ %d (%u)\n", + i, smmu->irqs[i]); + goto out_free_irqs; + } + } + + INIT_LIST_HEAD(&smmu->list); + spin_lock(&arm_smmu_devices_lock); + list_add(&smmu->list, &arm_smmu_devices); + spin_unlock(&arm_smmu_devices_lock); + + err = smmu->hwdep_ops->device_reset(smmu); + if (err) + goto out_free_irqs; + + return 0; + +out_free_irqs: + while (i--) + free_irq(smmu->irqs[i], smmu); + +out_put_parent: + if (smmu->parent_of_node) + of_node_put(smmu->parent_of_node); + +out_put_masters: + for (node = rb_first(&smmu->masters); node; node = rb_next(node)) { + struct arm_smmu_master *master; + master = container_of(node, struct arm_smmu_master, node); + of_node_put(master->of_node); + } + + return err; +} + +int arm_smmu_device_remove(struct platform_device *pdev) +{ + int i; + struct device *dev = &pdev->dev; + struct arm_smmu_device *curr, *smmu = NULL; + struct rb_node *node; + + spin_lock(&arm_smmu_devices_lock); + list_for_each_entry(curr, &arm_smmu_devices, list) { 
+ if (curr->dev == dev) { + smmu = curr; + list_del(&smmu->list); + break; + } + } + spin_unlock(&arm_smmu_devices_lock); + + if (!smmu) + return -ENODEV; + + if (smmu->parent_of_node) + of_node_put(smmu->parent_of_node); + + for (node = rb_first(&smmu->masters); node; node = rb_next(node)) { + struct arm_smmu_master *master; + master = container_of(node, struct arm_smmu_master, node); + of_node_put(master->of_node); + } + + if (!bitmap_empty(smmu->context_map, smmu->num_context_banks)) + dev_err(dev, "removing device with active domains!\n"); + + for (i = 0; i < smmu->num_global_irqs; ++i) + free_irq(smmu->irqs[i], smmu); + + /* Turn the thing off */ + return smmu->hwdep_ops->device_remove(smmu); +} + +static int __init arm_smmu_ops_init(void) +{ + /* Oh, for a proper bus abstraction */ + if (!iommu_present(&platform_bus_type)) + bus_set_iommu(&platform_bus_type, &arm_smmu_ops); + +#ifdef CONFIG_ARM_AMBA + if (!iommu_present(&amba_bustype)) + bus_set_iommu(&amba_bustype, &arm_smmu_ops); +#endif + + return 0; +} +subsys_initcall_sync(arm_smmu_ops_init); diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c index e93f2dc..868e9ac 100644 --- a/drivers/iommu/arm-smmu.c +++ b/drivers/iommu/arm-smmu.c @@ -27,30 +27,16 @@ #define pr_fmt(fmt) "arm-smmu: " fmt
 #include <linux/delay.h>
-#include <linux/dma-mapping.h>
 #include <linux/err.h>
 #include <linux/interrupt.h>
 #include <linux/io.h>
 #include <linux/iommu.h>
-#include <linux/mm.h>
 #include <linux/module.h>
 #include <linux/of.h>
-#include <linux/platform_device.h>
 #include <linux/slab.h>
 #include <linux/spinlock.h>
-#include <linux/amba/bus.h>
-
-#include <asm/pgalloc.h>
-
-/* Maximum number of stream IDs assigned to a single device */
-#define MAX_MASTER_STREAMIDS	MAX_PHANDLE_ARGS
-
-/* Maximum number of context banks per SMMU */
-#define ARM_SMMU_MAX_CBS	128
-
-/* Maximum number of mapping groups per SMMU */
-#define ARM_SMMU_MAX_SMRS	128
+#include "arm-smmu.h"
 /* SMMU global address space */
 #define ARM_SMMU_GR0(smmu)	((smmu)->base)
@@ -66,40 +52,6 @@
 		((smmu->options & ARM_SMMU_OPT_SECURE_CFG_ACCESS)	\
 			? 0x400 : 0))
-/* Page table bits */
-#define ARM_SMMU_PTE_XN	(((pteval_t)3) << 53)
-#define ARM_SMMU_PTE_CONT	(((pteval_t)1) << 52)
-#define ARM_SMMU_PTE_AF	(((pteval_t)1) << 10)
-#define ARM_SMMU_PTE_SH_NS	(((pteval_t)0) << 8)
-#define ARM_SMMU_PTE_SH_OS	(((pteval_t)2) << 8)
-#define ARM_SMMU_PTE_SH_IS	(((pteval_t)3) << 8)
-#define ARM_SMMU_PTE_PAGE	(((pteval_t)3) << 0)
-
-#if PAGE_SIZE == SZ_4K
-#define ARM_SMMU_PTE_CONT_ENTRIES	16
-#elif PAGE_SIZE == SZ_64K
-#define ARM_SMMU_PTE_CONT_ENTRIES	32
-#else
-#define ARM_SMMU_PTE_CONT_ENTRIES	1
-#endif
-
-#define ARM_SMMU_PTE_CONT_SIZE	(PAGE_SIZE * ARM_SMMU_PTE_CONT_ENTRIES)
-#define ARM_SMMU_PTE_CONT_MASK	(~(ARM_SMMU_PTE_CONT_SIZE - 1))
-
-/* Stage-1 PTE */
-#define ARM_SMMU_PTE_AP_UNPRIV	(((pteval_t)1) << 6)
-#define ARM_SMMU_PTE_AP_RDONLY	(((pteval_t)2) << 6)
-#define ARM_SMMU_PTE_ATTRINDX_SHIFT	2
-#define ARM_SMMU_PTE_nG	(((pteval_t)1) << 11)
-
-/* Stage-2 PTE */
-#define ARM_SMMU_PTE_HAP_FAULT	(((pteval_t)0) << 6)
-#define ARM_SMMU_PTE_HAP_READ	(((pteval_t)1) << 6)
-#define ARM_SMMU_PTE_HAP_WRITE	(((pteval_t)2) << 6)
-#define ARM_SMMU_PTE_MEMATTR_OIWB	(((pteval_t)0xf) << 2)
-#define ARM_SMMU_PTE_MEMATTR_NC	(((pteval_t)0x5) << 2)
-#define ARM_SMMU_PTE_MEMATTR_DEV	(((pteval_t)0x1) << 2)
-
 /* Configuration registers */
 #define ARM_SMMU_GR0_sCR0	0x0
 #define sCR0_CLIENTPD	(1 << 0)
@@ -173,7 +125,6 @@
 #define ARM_SMMU_GR0_sTLBGSYNC	0x70
 #define ARM_SMMU_GR0_sTLBGSTATUS	0x74
 #define sTLBGSTATUS_GSACTIVE	(1 << 0)
-#define TLB_LOOP_TIMEOUT	1000000	/* 1s! */
 /* Stream mapping registers */
 #define ARM_SMMU_GR0_SMR(n)	(0x800 + ((n) << 2))
@@ -204,10 +155,6 @@
 #define CBAR_S1_MEMATTR_WB	0xf
 #define CBAR_TYPE_SHIFT	16
 #define CBAR_TYPE_MASK	0x3
-#define CBAR_TYPE_S2_TRANS	(0 << CBAR_TYPE_SHIFT)
-#define CBAR_TYPE_S1_TRANS_S2_BYPASS	(1 << CBAR_TYPE_SHIFT)
-#define CBAR_TYPE_S1_TRANS_S2_FAULT	(2 << CBAR_TYPE_SHIFT)
-#define CBAR_TYPE_S1_TRANS_S2_TRANS	(3 << CBAR_TYPE_SHIFT)
 #define CBAR_IRPTNDX_SHIFT	24
 #define CBAR_IRPTNDX_MASK	0xff
@@ -242,40 +189,6 @@
 #define SCTLR_M	(1 << 0)
 #define SCTLR_EAE_SBOP	(SCTLR_AFE | SCTLR_TRE)
-#define RESUME_RETRY	(0 << 0)
-#define RESUME_TERMINATE	(1 << 0)
-
-#define TTBCR_EAE	(1 << 31)
-
-#define TTBCR_PASIZE_SHIFT	16
-#define TTBCR_PASIZE_MASK	0x7
-
-#define TTBCR_TG0_4K	(0 << 14)
-#define TTBCR_TG0_64K	(1 << 14)
-
-#define TTBCR_SH0_SHIFT	12
-#define TTBCR_SH0_MASK	0x3
-#define TTBCR_SH_NS	0
-#define TTBCR_SH_OS	2
-#define TTBCR_SH_IS	3
-
-#define TTBCR_ORGN0_SHIFT	10
-#define TTBCR_IRGN0_SHIFT	8
-#define TTBCR_RGN_MASK	0x3
-#define TTBCR_RGN_NC	0
-#define TTBCR_RGN_WBWA	1
-#define TTBCR_RGN_WT	2
-#define TTBCR_RGN_WB	3
-
-#define TTBCR_SL0_SHIFT	6
-#define TTBCR_SL0_MASK	0x3
-#define TTBCR_SL0_LVL_2	0
-#define TTBCR_SL0_LVL_1	1
-
-#define TTBCR_T1SZ_SHIFT	16
-#define TTBCR_T0SZ_SHIFT	0
-#define TTBCR_SZ_MASK	0xf
-
 #define TTBCR2_SEP_SHIFT	15
 #define TTBCR2_SEP_MASK	0x7
@@ -292,15 +205,6 @@
#define TTBRn_HI_ASID_SHIFT 16
-#define MAIR_ATTR_SHIFT(n)	((n) << 3)
-#define MAIR_ATTR_MASK	0xff
-#define MAIR_ATTR_DEVICE	0x04
-#define MAIR_ATTR_NC	0x44
-#define MAIR_ATTR_WBRWA	0xff
-#define MAIR_ATTR_IDX_NC	0
-#define MAIR_ATTR_IDX_CACHE	1
-#define MAIR_ATTR_IDX_DEV	2
-
 #define FSR_MULTI	(1 << 31)
 #define FSR_SS	(1 << 30)
 #define FSR_UUT	(1 << 8)
@@ -319,262 +223,37 @@
#define FSYNR0_WNR (1 << 4)
-struct arm_smmu_smr {
-	u8 idx;
-	u16 mask;
-	u16 id;
-};
-
-struct arm_smmu_master {
-	struct device_node *of_node;
-
-	/*
-	 * The following is specific to the master's position in the
-	 * SMMU chain.
-	 */
-	struct rb_node node;
-	int num_streamids;
-	u16 streamids[MAX_MASTER_STREAMIDS];
-
-	/*
-	 * We only need to allocate these on the root SMMU, as we
-	 * configure unmatched streams to bypass translation.
-	 */
-	struct arm_smmu_smr *smrs;
-};
-
-struct arm_smmu_device {
-	struct device *dev;
-	struct device_node *parent_of_node;
-
-	void __iomem *base;
-	u32 size;
-	u32 pagesize;
-
-#define ARM_SMMU_FEAT_COHERENT_WALK	(1 << 0)
-#define ARM_SMMU_FEAT_STREAM_MATCH	(1 << 1)
-#define ARM_SMMU_FEAT_TRANS_S1	(1 << 2)
-#define ARM_SMMU_FEAT_TRANS_S2	(1 << 3)
-#define ARM_SMMU_FEAT_TRANS_NESTED	(1 << 4)
-	u32 features;
-
-#define ARM_SMMU_OPT_SECURE_CFG_ACCESS	(1 << 0)
-	u32 options;
-	int version;
-
-	u32 num_context_banks;
-	u32 num_s2_context_banks;
-	DECLARE_BITMAP(context_map, ARM_SMMU_MAX_CBS);
-	atomic_t irptndx;
-
-	u32 num_mapping_groups;
-	DECLARE_BITMAP(smr_map, ARM_SMMU_MAX_SMRS);
-
-	u32 input_size;
-	u32 s1_output_size;
-	u32 s2_output_size;
-
-	u32 num_global_irqs;
-	u32 num_context_irqs;
-	unsigned int *irqs;
-
-	struct list_head list;
-	struct rb_root masters;
-};
-struct arm_smmu_cfg {
-	struct arm_smmu_device *smmu;
-	u8 cbndx;
-	u8 irptndx;
-	u32 cbar;
-	pgd_t *pgd;
-};
-#define INVALID_IRPTNDX	0xff
-
-#define ARM_SMMU_CB_ASID(cfg)	((cfg)->cbndx)
-#define ARM_SMMU_CB_VMID(cfg)	((cfg)->cbndx + 1)
-
-struct arm_smmu_domain {
-	/*
-	 * A domain can span across multiple, chained SMMUs and requires
-	 * all devices within the domain to follow the same translation
-	 * path.
-	 */
-	struct arm_smmu_device *leaf_smmu;
-	struct arm_smmu_cfg root_cfg;
-	phys_addr_t output_mask;
-
-	spinlock_t lock;
-};
-
-static DEFINE_SPINLOCK(arm_smmu_devices_lock);
-static LIST_HEAD(arm_smmu_devices);
-
-struct arm_smmu_option_prop {
-	u32 opt;
-	const char *prop;
-};
-
-static struct arm_smmu_option_prop arm_smmu_options[] = {
-	{ ARM_SMMU_OPT_SECURE_CFG_ACCESS, "calxeda,smmu-secure-config-access" },
-	{ 0, NULL},
-};
-
-static void parse_driver_options(struct arm_smmu_device *smmu)
+static int arm_smmu_alloc_context(struct arm_smmu_device *smmu,
+			int start, int end, struct arm_smmu_master *master)
 {
-	int i = 0;
-	do {
-		if (of_property_read_bool(smmu->dev->of_node,
-					  arm_smmu_options[i].prop)) {
-			smmu->options |= arm_smmu_options[i].opt;
-			dev_notice(smmu->dev, "option %s\n",
-				   arm_smmu_options[i].prop);
-		}
-	} while (arm_smmu_options[++i].opt);
+	return __arm_smmu_alloc_bitmap(smmu->context_map, start, end);
 }
-static struct arm_smmu_master *find_smmu_master(struct arm_smmu_device *smmu, - struct device_node *dev_node) +static int arm_smmu_tlb_sync_finished(struct arm_smmu_device *smmu) { - struct rb_node *node = smmu->masters.rb_node; - - while (node) { - struct arm_smmu_master *master; - master = container_of(node, struct arm_smmu_master, node); - - if (dev_node < master->of_node) - node = node->rb_left; - else if (dev_node > master->of_node) - node = node->rb_right; - else - return master; - } - - return NULL; -} - -static int insert_smmu_master(struct arm_smmu_device *smmu, - struct arm_smmu_master *master) -{ - struct rb_node **new, *parent; - - new = &smmu->masters.rb_node; - parent = NULL; - while (*new) { - struct arm_smmu_master *this; - this = container_of(*new, struct arm_smmu_master, node); - - parent = *new; - if (master->of_node < this->of_node) - new = &((*new)->rb_left); - else if (master->of_node > this->of_node) - new = &((*new)->rb_right); - else - return -EEXIST; - } - - rb_link_node(&master->node, parent, new); - rb_insert_color(&master->node, &smmu->masters); - return 0; -} - -static int register_smmu_master(struct arm_smmu_device *smmu, - struct device *dev, - struct of_phandle_args *masterspec) -{ - int i; - struct arm_smmu_master *master; - - master = find_smmu_master(smmu, masterspec->np); - if (master) { - dev_err(dev, - "rejecting multiple registrations for master device %s\n", - masterspec->np->name); - return -EBUSY; - } - - if (masterspec->args_count > MAX_MASTER_STREAMIDS) { - dev_err(dev, - "reached maximum number (%d) of stream IDs for master device %s\n", - MAX_MASTER_STREAMIDS, masterspec->np->name); - return -ENOSPC; - } - - master = devm_kzalloc(dev, sizeof(*master), GFP_KERNEL); - if (!master) - return -ENOMEM; - - master->of_node = masterspec->np; - master->num_streamids = masterspec->args_count; - - for (i = 0; i < master->num_streamids; ++i) - master->streamids[i] = masterspec->args[i]; - - return insert_smmu_master(smmu, 
master); -} - -static struct arm_smmu_device *find_parent_smmu(struct arm_smmu_device *smmu) -{ - struct arm_smmu_device *parent; - - if (!smmu->parent_of_node) - return NULL; - - spin_lock(&arm_smmu_devices_lock); - list_for_each_entry(parent, &arm_smmu_devices, list) - if (parent->dev->of_node == smmu->parent_of_node) - goto out_unlock; - - parent = NULL; - dev_warn(smmu->dev, - "Failed to find SMMU parent despite parent in DT\n"); -out_unlock: - spin_unlock(&arm_smmu_devices_lock); - return parent; -} - -static int __arm_smmu_alloc_bitmap(unsigned long *map, int start, int end) -{ - int idx; - - do { - idx = find_next_zero_bit(map, end, start); - if (idx == end) - return -ENOSPC; - } while (test_and_set_bit(idx, map)); + u32 reg; + void __iomem *gr0_base = ARM_SMMU_GR0(smmu);
-	return idx;
-}
+	reg = readl_relaxed(gr0_base + ARM_SMMU_GR0_sTLBGSTATUS);
-static void __arm_smmu_free_bitmap(unsigned long *map, int idx)
-{
-	clear_bit(idx, map);
+	return !(reg & sTLBGSTATUS_GSACTIVE);
 }
 /* Wait for any pending TLB invalidations to complete */
 static void arm_smmu_tlb_sync(struct arm_smmu_device *smmu)
 {
-	int count = 0;
 	void __iomem *gr0_base = ARM_SMMU_GR0(smmu);
 	writel_relaxed(0, gr0_base + ARM_SMMU_GR0_sTLBGSYNC);
-	while (readl_relaxed(gr0_base + ARM_SMMU_GR0_sTLBGSTATUS)
-	       & sTLBGSTATUS_GSACTIVE) {
-		cpu_relax();
-		if (++count == TLB_LOOP_TIMEOUT) {
-			dev_err_ratelimited(smmu->dev,
-			"TLB sync timed out -- SMMU may be deadlocked\n");
-			return;
-		}
-		udelay(1);
-	}
+	arm_smmu_tlb_sync_wait(smmu);
 }
 static void arm_smmu_tlb_inv_context(struct arm_smmu_cfg *cfg)
 {
 	struct arm_smmu_device *smmu = cfg->smmu;
 	void __iomem *base = ARM_SMMU_GR0(smmu);
-	bool stage1 = cfg->cbar != CBAR_TYPE_S2_TRANS;
+	bool stage1 = cfg->type != TYPE_S2_TRANS;
 	if (stage1) {
 		base = ARM_SMMU_CB_BASE(smmu) + ARM_SMMU_CB(smmu, cfg->cbndx);
@@ -666,28 +345,6 @@ static irqreturn_t arm_smmu_global_fault(int irq, void *dev)
 	return IRQ_HANDLED;
 }
-static void arm_smmu_flush_pgtable(struct arm_smmu_device *smmu, void *addr,
-				   size_t size)
-{
-	unsigned long offset = (unsigned long)addr & ~PAGE_MASK;
-
-
-	/* Ensure new page tables are visible to the hardware walker */
-	if (smmu->features & ARM_SMMU_FEAT_COHERENT_WALK) {
-		dsb(ishst);
-	} else {
-		/*
-		 * If the SMMU can't walk tables in the CPU caches, treat them
-		 * like non-coherent DMA since we need to flush the new entries
-		 * all the way out to memory. There's no possibility of
-		 * recursion here as the SMMU table walker will not be wired
-		 * through another SMMU.
-		 */
-		dma_map_page(smmu->dev, virt_to_page(addr), offset, size,
-				DMA_TO_DEVICE);
-	}
-}
-
 static void arm_smmu_init_context_bank(struct arm_smmu_domain *smmu_domain)
 {
 	u32 reg;
@@ -698,11 +355,11 @@ static void arm_smmu_init_context_bank(struct arm_smmu_domain *smmu_domain)
 	gr0_base = ARM_SMMU_GR0(smmu);
 	gr1_base = ARM_SMMU_GR1(smmu);
-	stage1 = root_cfg->cbar != CBAR_TYPE_S2_TRANS;
+	stage1 = root_cfg->type != TYPE_S2_TRANS;
 	cb_base = ARM_SMMU_CB_BASE(smmu) + ARM_SMMU_CB(smmu, root_cfg->cbndx);
 	/* CBAR */
-	reg = root_cfg->cbar;
+	reg = root_cfg->type << CBAR_TYPE_SHIFT;
 	if (smmu->version == 1)
 		reg |= root_cfg->irptndx << CBAR_IRPTNDX_SHIFT;
@@ -832,9 +489,7 @@ static void arm_smmu_init_context_bank(struct arm_smmu_domain *smmu_domain)
 	/* MAIR0 (stage-1 only) */
 	if (stage1) {
-		reg = (MAIR_ATTR_NC << MAIR_ATTR_SHIFT(MAIR_ATTR_IDX_NC)) |
-		      (MAIR_ATTR_WBRWA << MAIR_ATTR_SHIFT(MAIR_ATTR_IDX_CACHE)) |
-		      (MAIR_ATTR_DEVICE << MAIR_ATTR_SHIFT(MAIR_ATTR_IDX_DEV));
+		reg = MAIR0_STAGE1;
 		writel_relaxed(reg, cb_base + ARM_SMMU_CB_S1_MAIR0);
 	}
@@ -848,205 +503,16 @@ static void arm_smmu_init_context_bank(struct arm_smmu_domain *smmu_domain)
 	writel_relaxed(reg, cb_base + ARM_SMMU_CB_SCTLR);
 }
-static int arm_smmu_init_domain_context(struct iommu_domain *domain, - struct device *dev) +static void arm_smmu_destroy_context_bank(struct arm_smmu_domain *smmu_domain) { - int irq, ret, start; - struct arm_smmu_domain *smmu_domain = domain->priv; - struct arm_smmu_cfg *root_cfg = &smmu_domain->root_cfg; - struct arm_smmu_device *smmu, *parent; - - /* - * Walk the SMMU chain to find the root device for this chain. - * We assume that no masters have translations which terminate - * early, and therefore check that the root SMMU does indeed have - * a StreamID for the master in question. - */ - parent = dev->archdata.iommu; - smmu_domain->output_mask = -1; - do { - smmu = parent; - smmu_domain->output_mask &= (1ULL << smmu->s2_output_size) - 1; - } while ((parent = find_parent_smmu(smmu))); - - if (!find_smmu_master(smmu, dev->of_node)) { - dev_err(dev, "unable to find root SMMU for device\n"); - return -ENODEV; - } - - if (smmu->features & ARM_SMMU_FEAT_TRANS_NESTED) { - /* - * We will likely want to change this if/when KVM gets - * involved. 
- */ - root_cfg->cbar = CBAR_TYPE_S1_TRANS_S2_BYPASS; - start = smmu->num_s2_context_banks; - } else if (smmu->features & ARM_SMMU_FEAT_TRANS_S2) { - root_cfg->cbar = CBAR_TYPE_S2_TRANS; - start = 0; - } else { - root_cfg->cbar = CBAR_TYPE_S1_TRANS_S2_BYPASS; - start = smmu->num_s2_context_banks; - } - - ret = __arm_smmu_alloc_bitmap(smmu->context_map, start, - smmu->num_context_banks); - if (IS_ERR_VALUE(ret)) - return ret; - - root_cfg->cbndx = ret; - if (smmu->version == 1) { - root_cfg->irptndx = atomic_inc_return(&smmu->irptndx); - root_cfg->irptndx %= smmu->num_context_irqs; - } else { - root_cfg->irptndx = root_cfg->cbndx; - } - - irq = smmu->irqs[smmu->num_global_irqs + root_cfg->irptndx]; - ret = request_irq(irq, arm_smmu_context_fault, IRQF_SHARED, - "arm-smmu-context-fault", domain); - if (IS_ERR_VALUE(ret)) { - dev_err(smmu->dev, "failed to request context IRQ %d (%u)\n", - root_cfg->irptndx, irq); - root_cfg->irptndx = INVALID_IRPTNDX; - goto out_free_context; - } - - root_cfg->smmu = smmu; - arm_smmu_init_context_bank(smmu_domain); - return ret; - -out_free_context: - __arm_smmu_free_bitmap(smmu->context_map, root_cfg->cbndx); - return ret; -} - -static void arm_smmu_destroy_domain_context(struct iommu_domain *domain) -{ - struct arm_smmu_domain *smmu_domain = domain->priv; struct arm_smmu_cfg *root_cfg = &smmu_domain->root_cfg; struct arm_smmu_device *smmu = root_cfg->smmu; void __iomem *cb_base; - int irq; - - if (!smmu) - return;
/* Disable the context bank and nuke the TLB before freeing it. */ cb_base = ARM_SMMU_CB_BASE(smmu) + ARM_SMMU_CB(smmu, root_cfg->cbndx); writel_relaxed(0, cb_base + ARM_SMMU_CB_SCTLR); arm_smmu_tlb_inv_context(root_cfg); - - if (root_cfg->irptndx != INVALID_IRPTNDX) { - irq = smmu->irqs[smmu->num_global_irqs + root_cfg->irptndx]; - free_irq(irq, domain); - } - - __arm_smmu_free_bitmap(smmu->context_map, root_cfg->cbndx); -} - -static int arm_smmu_domain_init(struct iommu_domain *domain) -{ - struct arm_smmu_domain *smmu_domain; - pgd_t *pgd; - - /* - * Allocate the domain and initialise some of its data structures. - * We can't really do anything meaningful until we've added a - * master. - */ - smmu_domain = kzalloc(sizeof(*smmu_domain), GFP_KERNEL); - if (!smmu_domain) - return -ENOMEM; - - pgd = kzalloc(PTRS_PER_PGD * sizeof(pgd_t), GFP_KERNEL); - if (!pgd) - goto out_free_domain; - smmu_domain->root_cfg.pgd = pgd; - - spin_lock_init(&smmu_domain->lock); - domain->priv = smmu_domain; - return 0; - -out_free_domain: - kfree(smmu_domain); - return -ENOMEM; -} - -static void arm_smmu_free_ptes(pmd_t *pmd) -{ - pgtable_t table = pmd_pgtable(*pmd); - pgtable_page_dtor(table); - __free_page(table); -} - -static void arm_smmu_free_pmds(pud_t *pud) -{ - int i; - pmd_t *pmd, *pmd_base = pmd_offset(pud, 0); - - pmd = pmd_base; - for (i = 0; i < PTRS_PER_PMD; ++i) { - if (pmd_none(*pmd)) - continue; - - arm_smmu_free_ptes(pmd); - pmd++; - } - - pmd_free(NULL, pmd_base); -} - -static void arm_smmu_free_puds(pgd_t *pgd) -{ - int i; - pud_t *pud, *pud_base = pud_offset(pgd, 0); - - pud = pud_base; - for (i = 0; i < PTRS_PER_PUD; ++i) { - if (pud_none(*pud)) - continue; - - arm_smmu_free_pmds(pud); - pud++; - } - - pud_free(NULL, pud_base); -} - -static void arm_smmu_free_pgtables(struct arm_smmu_domain *smmu_domain) -{ - int i; - struct arm_smmu_cfg *root_cfg = &smmu_domain->root_cfg; - pgd_t *pgd, *pgd_base = root_cfg->pgd; - - /* - * Recursively free the page tables for 
this domain. We don't - * care about speculative TLB filling because the tables should - * not be active in any context bank at this point (SCTLR.M is 0). - */ - pgd = pgd_base; - for (i = 0; i < PTRS_PER_PGD; ++i) { - if (pgd_none(*pgd)) - continue; - arm_smmu_free_puds(pgd); - pgd++; - } - - kfree(pgd_base); -} - -static void arm_smmu_domain_destroy(struct iommu_domain *domain) -{ - struct arm_smmu_domain *smmu_domain = domain->priv; - - /* - * Free the domain resources. We assume that all devices have - * already been detached. - */ - arm_smmu_destroy_domain_context(domain); - arm_smmu_free_pgtables(smmu_domain); - kfree(smmu_domain); }
static int arm_smmu_master_configure_smrs(struct arm_smmu_device *smmu, @@ -1184,444 +650,7 @@ static void arm_smmu_domain_remove_master(struct arm_smmu_domain *smmu_domain, arm_smmu_master_free_smrs(smmu, master); }
-static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev) -{ - int ret = -EINVAL; - struct arm_smmu_domain *smmu_domain = domain->priv; - struct arm_smmu_device *device_smmu = dev->archdata.iommu; - struct arm_smmu_master *master; - unsigned long flags; - - if (!device_smmu) { - dev_err(dev, "cannot attach to SMMU, is it on the same bus?\n"); - return -ENXIO; - } - - /* - * Sanity check the domain. We don't currently support domains - * that cross between different SMMU chains. - */ - spin_lock_irqsave(&smmu_domain->lock, flags); - if (!smmu_domain->leaf_smmu) { - /* Now that we have a master, we can finalise the domain */ - ret = arm_smmu_init_domain_context(domain, dev); - if (IS_ERR_VALUE(ret)) - goto err_unlock; - - smmu_domain->leaf_smmu = device_smmu; - } else if (smmu_domain->leaf_smmu != device_smmu) { - dev_err(dev, - "cannot attach to SMMU %s whilst already attached to domain on SMMU %s\n", - dev_name(smmu_domain->leaf_smmu->dev), - dev_name(device_smmu->dev)); - goto err_unlock; - } - spin_unlock_irqrestore(&smmu_domain->lock, flags); - - /* Looks ok, so add the device to the domain */ - master = find_smmu_master(smmu_domain->leaf_smmu, dev->of_node); - if (!master) - return -ENODEV; - - return arm_smmu_domain_add_master(smmu_domain, master); - -err_unlock: - spin_unlock_irqrestore(&smmu_domain->lock, flags); - return ret; -} - -static void arm_smmu_detach_dev(struct iommu_domain *domain, struct device *dev) -{ - struct arm_smmu_domain *smmu_domain = domain->priv; - struct arm_smmu_master *master; - - master = find_smmu_master(smmu_domain->leaf_smmu, dev->of_node); - if (master) - arm_smmu_domain_remove_master(smmu_domain, master); -} - -static bool arm_smmu_pte_is_contiguous_range(unsigned long addr, - unsigned long end) -{ - return !(addr & ~ARM_SMMU_PTE_CONT_MASK) && - (addr + ARM_SMMU_PTE_CONT_SIZE <= end); -} - -static int arm_smmu_alloc_init_pte(struct arm_smmu_device *smmu, pmd_t *pmd, - unsigned long addr, unsigned long 
end, - unsigned long pfn, int prot, int stage) -{ - pte_t *pte, *start; - pteval_t pteval = ARM_SMMU_PTE_PAGE | ARM_SMMU_PTE_AF | ARM_SMMU_PTE_XN; - - if (pmd_none(*pmd)) { - /* Allocate a new set of tables */ - pgtable_t table = alloc_page(GFP_ATOMIC|__GFP_ZERO); - if (!table) - return -ENOMEM; - - arm_smmu_flush_pgtable(smmu, page_address(table), PAGE_SIZE); - if (!pgtable_page_ctor(table)) { - __free_page(table); - return -ENOMEM; - } - pmd_populate(NULL, pmd, table); - arm_smmu_flush_pgtable(smmu, pmd, sizeof(*pmd)); - } - - if (stage == 1) { - pteval |= ARM_SMMU_PTE_AP_UNPRIV | ARM_SMMU_PTE_nG; - if (!(prot & IOMMU_WRITE) && (prot & IOMMU_READ)) - pteval |= ARM_SMMU_PTE_AP_RDONLY; - - if (prot & IOMMU_CACHE) - pteval |= (MAIR_ATTR_IDX_CACHE << - ARM_SMMU_PTE_ATTRINDX_SHIFT); - } else { - pteval |= ARM_SMMU_PTE_HAP_FAULT; - if (prot & IOMMU_READ) - pteval |= ARM_SMMU_PTE_HAP_READ; - if (prot & IOMMU_WRITE) - pteval |= ARM_SMMU_PTE_HAP_WRITE; - if (prot & IOMMU_CACHE) - pteval |= ARM_SMMU_PTE_MEMATTR_OIWB; - else - pteval |= ARM_SMMU_PTE_MEMATTR_NC; - } - - /* If no access, create a faulting entry to avoid TLB fills */ - if (prot & IOMMU_EXEC) - pteval &= ~ARM_SMMU_PTE_XN; - else if (!(prot & (IOMMU_READ | IOMMU_WRITE))) - pteval &= ~ARM_SMMU_PTE_PAGE; - - pteval |= ARM_SMMU_PTE_SH_IS; - start = pmd_page_vaddr(*pmd) + pte_index(addr); - pte = start; - - /* - * Install the page table entries. This is fairly complicated - * since we attempt to make use of the contiguous hint in the - * ptes where possible. 
The contiguous hint indicates a series - * of ARM_SMMU_PTE_CONT_ENTRIES ptes mapping a physically - * contiguous region with the following constraints: - * - * - The region start is aligned to ARM_SMMU_PTE_CONT_SIZE - * - Each pte in the region has the contiguous hint bit set - * - * This complicates unmapping (also handled by this code, when - * neither IOMMU_READ or IOMMU_WRITE are set) because it is - * possible, yet highly unlikely, that a client may unmap only - * part of a contiguous range. This requires clearing of the - * contiguous hint bits in the range before installing the new - * faulting entries. - * - * Note that re-mapping an address range without first unmapping - * it is not supported, so TLB invalidation is not required here - * and is instead performed at unmap and domain-init time. - */ - do { - int i = 1; - pteval &= ~ARM_SMMU_PTE_CONT; - - if (arm_smmu_pte_is_contiguous_range(addr, end)) { - i = ARM_SMMU_PTE_CONT_ENTRIES; - pteval |= ARM_SMMU_PTE_CONT; - } else if (pte_val(*pte) & - (ARM_SMMU_PTE_CONT | ARM_SMMU_PTE_PAGE)) { - int j; - pte_t *cont_start; - unsigned long idx = pte_index(addr); - - idx &= ~(ARM_SMMU_PTE_CONT_ENTRIES - 1); - cont_start = pmd_page_vaddr(*pmd) + idx; - for (j = 0; j < ARM_SMMU_PTE_CONT_ENTRIES; ++j) - pte_val(*(cont_start + j)) &= ~ARM_SMMU_PTE_CONT; - - arm_smmu_flush_pgtable(smmu, cont_start, - sizeof(*pte) * - ARM_SMMU_PTE_CONT_ENTRIES); - } - - do { - *pte = pfn_pte(pfn, __pgprot(pteval)); - } while (pte++, pfn++, addr += PAGE_SIZE, --i); - } while (addr != end); - - arm_smmu_flush_pgtable(smmu, start, sizeof(*pte) * (pte - start)); - return 0; -} - -static int arm_smmu_alloc_init_pmd(struct arm_smmu_device *smmu, pud_t *pud, - unsigned long addr, unsigned long end, - phys_addr_t phys, int prot, int stage) -{ - int ret; - pmd_t *pmd; - unsigned long next, pfn = __phys_to_pfn(phys); - -#ifndef __PAGETABLE_PMD_FOLDED - if (pud_none(*pud)) { - pmd = (pmd_t *)get_zeroed_page(GFP_ATOMIC); - if (!pmd) - return 
-ENOMEM; - - arm_smmu_flush_pgtable(smmu, pmd, PAGE_SIZE); - pud_populate(NULL, pud, pmd); - arm_smmu_flush_pgtable(smmu, pud, sizeof(*pud)); - - pmd += pmd_index(addr); - } else -#endif - pmd = pmd_offset(pud, addr); - - do { - next = pmd_addr_end(addr, end); - ret = arm_smmu_alloc_init_pte(smmu, pmd, addr, next, pfn, - prot, stage); - phys += next - addr; - } while (pmd++, addr = next, addr < end); - - return ret; -} - -static int arm_smmu_alloc_init_pud(struct arm_smmu_device *smmu, pgd_t *pgd, - unsigned long addr, unsigned long end, - phys_addr_t phys, int prot, int stage) -{ - int ret = 0; - pud_t *pud; - unsigned long next; - -#ifndef __PAGETABLE_PUD_FOLDED - if (pgd_none(*pgd)) { - pud = (pud_t *)get_zeroed_page(GFP_ATOMIC); - if (!pud) - return -ENOMEM; - - arm_smmu_flush_pgtable(smmu, pud, PAGE_SIZE); - pgd_populate(NULL, pgd, pud); - arm_smmu_flush_pgtable(smmu, pgd, sizeof(*pgd)); - - pud += pud_index(addr); - } else -#endif - pud = pud_offset(pgd, addr); - - do { - next = pud_addr_end(addr, end); - ret = arm_smmu_alloc_init_pmd(smmu, pud, addr, next, phys, - prot, stage); - phys += next - addr; - } while (pud++, addr = next, addr < end); - - return ret; -} - -static int arm_smmu_handle_mapping(struct arm_smmu_domain *smmu_domain, - unsigned long iova, phys_addr_t paddr, - size_t size, int prot) -{ - int ret, stage; - unsigned long end; - phys_addr_t input_mask, output_mask; - struct arm_smmu_cfg *root_cfg = &smmu_domain->root_cfg; - pgd_t *pgd = root_cfg->pgd; - struct arm_smmu_device *smmu = root_cfg->smmu; - unsigned long flags; - - if (root_cfg->cbar == CBAR_TYPE_S2_TRANS) { - stage = 2; - output_mask = (1ULL << smmu->s2_output_size) - 1; - } else { - stage = 1; - output_mask = (1ULL << smmu->s1_output_size) - 1; - } - - if (!pgd) - return -EINVAL; - - if (size & ~PAGE_MASK) - return -EINVAL; - - input_mask = (1ULL << smmu->input_size) - 1; - if ((phys_addr_t)iova & ~input_mask) - return -ERANGE; - - if (paddr & ~output_mask) - return -ERANGE; - - 
spin_lock_irqsave(&smmu_domain->lock, flags); - pgd += pgd_index(iova); - end = iova + size; - do { - unsigned long next = pgd_addr_end(iova, end); - - ret = arm_smmu_alloc_init_pud(smmu, pgd, iova, next, paddr, - prot, stage); - if (ret) - goto out_unlock; - - paddr += next - iova; - iova = next; - } while (pgd++, iova != end); - -out_unlock: - spin_unlock_irqrestore(&smmu_domain->lock, flags); - - return ret; -} - -static int arm_smmu_map(struct iommu_domain *domain, unsigned long iova, - phys_addr_t paddr, size_t size, int prot) -{ - struct arm_smmu_domain *smmu_domain = domain->priv; - - if (!smmu_domain) - return -ENODEV; - - /* Check for silent address truncation up the SMMU chain. */ - if ((phys_addr_t)iova & ~smmu_domain->output_mask) - return -ERANGE; - - return arm_smmu_handle_mapping(smmu_domain, iova, paddr, size, prot); -} - -static size_t arm_smmu_unmap(struct iommu_domain *domain, unsigned long iova, - size_t size) -{ - int ret; - struct arm_smmu_domain *smmu_domain = domain->priv; - - ret = arm_smmu_handle_mapping(smmu_domain, iova, 0, size, 0); - arm_smmu_tlb_inv_context(&smmu_domain->root_cfg); - return ret ? 
0 : size; -} - -static phys_addr_t arm_smmu_iova_to_phys(struct iommu_domain *domain, - dma_addr_t iova) -{ - pgd_t *pgdp, pgd; - pud_t pud; - pmd_t pmd; - pte_t pte; - struct arm_smmu_domain *smmu_domain = domain->priv; - struct arm_smmu_cfg *root_cfg = &smmu_domain->root_cfg; - - pgdp = root_cfg->pgd; - if (!pgdp) - return 0; - - pgd = *(pgdp + pgd_index(iova)); - if (pgd_none(pgd)) - return 0; - - pud = *pud_offset(&pgd, iova); - if (pud_none(pud)) - return 0; - - pmd = *pmd_offset(&pud, iova); - if (pmd_none(pmd)) - return 0; - - pte = *(pmd_page_vaddr(pmd) + pte_index(iova)); - if (pte_none(pte)) - return 0; - - return __pfn_to_phys(pte_pfn(pte)) | (iova & ~PAGE_MASK); -} - -static int arm_smmu_domain_has_cap(struct iommu_domain *domain, - unsigned long cap) -{ - unsigned long caps = 0; - struct arm_smmu_domain *smmu_domain = domain->priv; - - if (smmu_domain->root_cfg.smmu->features & ARM_SMMU_FEAT_COHERENT_WALK) - caps |= IOMMU_CAP_CACHE_COHERENCY; - - return !!(cap & caps); -} - -static int arm_smmu_add_device(struct device *dev) -{ - struct arm_smmu_device *child, *parent, *smmu; - struct arm_smmu_master *master = NULL; - struct iommu_group *group; - int ret; - - if (dev->archdata.iommu) { - dev_warn(dev, "IOMMU driver already assigned to device\n"); - return -EINVAL; - } - - spin_lock(&arm_smmu_devices_lock); - list_for_each_entry(parent, &arm_smmu_devices, list) { - smmu = parent; - - /* Try to find a child of the current SMMU. */ - list_for_each_entry(child, &arm_smmu_devices, list) { - if (child->parent_of_node == parent->dev->of_node) { - /* Does the child sit above our master? */ - master = find_smmu_master(child, dev->of_node); - if (master) { - smmu = NULL; - break; - } - } - } - - /* We found some children, so keep searching. 
*/ - if (!smmu) { - master = NULL; - continue; - } - - master = find_smmu_master(smmu, dev->of_node); - if (master) - break; - } - spin_unlock(&arm_smmu_devices_lock); - - if (!master) - return -ENODEV; - - group = iommu_group_alloc(); - if (IS_ERR(group)) { - dev_err(dev, "Failed to allocate IOMMU group\n"); - return PTR_ERR(group); - } - - ret = iommu_group_add_device(group, dev); - iommu_group_put(group); - dev->archdata.iommu = smmu; - - return ret; -} - -static void arm_smmu_remove_device(struct device *dev) -{ - dev->archdata.iommu = NULL; - iommu_group_remove_device(dev); -} - -static struct iommu_ops arm_smmu_ops = { - .domain_init = arm_smmu_domain_init, - .domain_destroy = arm_smmu_domain_destroy, - .attach_dev = arm_smmu_attach_dev, - .detach_dev = arm_smmu_detach_dev, - .map = arm_smmu_map, - .unmap = arm_smmu_unmap, - .iova_to_phys = arm_smmu_iova_to_phys, - .domain_has_cap = arm_smmu_domain_has_cap, - .add_device = arm_smmu_add_device, - .remove_device = arm_smmu_remove_device, - .pgsize_bitmap = (SECTION_SIZE | - ARM_SMMU_PTE_CONT_SIZE | - PAGE_SIZE), -}; - -static void arm_smmu_device_reset(struct arm_smmu_device *smmu) +static int arm_smmu_device_reset(struct arm_smmu_device *smmu) { void __iomem *gr0_base = ARM_SMMU_GR0(smmu); void __iomem *cb_base; @@ -1670,6 +699,8 @@ static void arm_smmu_device_reset(struct arm_smmu_device *smmu) /* Push the button */ arm_smmu_tlb_sync(smmu); writel(reg, ARM_SMMU_GR0_NS(smmu) + ARM_SMMU_GR0_sCR0); + + return 0; }
 static u32 arm_smmu_id_size_to_bits(u32 size)
@@ -1697,7 +728,7 @@ static int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu)
 	void __iomem *gr0_base = ARM_SMMU_GR0(smmu);
 	u32 id;
 
-	dev_notice(smmu->dev, "probing hardware configuration...\n");
+	dev_notice(smmu->dev, "probing arm-smmu hardware configuration...\n");
 
 	/* Primecell ID */
 	id = readl_relaxed(gr0_base + ARM_SMMU_GR0_PIDR2);
@@ -1795,30 +826,16 @@ static int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu)
 	id = readl_relaxed(gr0_base + ARM_SMMU_GR0_ID2);
 	size = arm_smmu_id_size_to_bits((id >> ID2_IAS_SHIFT) & ID2_IAS_MASK);
 
-	/*
-	 * Stage-1 output limited by stage-2 input size due to pgd
-	 * allocation (PTRS_PER_PGD).
-	 */
-#ifdef CONFIG_64BIT
-	smmu->s1_output_size = min((u32)VA_BITS, size);
-#else
-	smmu->s1_output_size = min(32U, size);
-#endif
+	smmu->s1_output_size = size;
 
-	/* The stage-2 output mask is also applied for bypass */
 	size = arm_smmu_id_size_to_bits((id >> ID2_OAS_SHIFT) & ID2_OAS_MASK);
-	smmu->s2_output_size = min((u32)PHYS_MASK_SHIFT, size);
+	smmu->s2_output_size = size;
 
 	if (smmu->version == 1) {
 		smmu->input_size = 32;
 	} else {
-#ifdef CONFIG_64BIT
 		size = (id >> ID2_UBS_SHIFT) & ID2_UBS_MASK;
-		size = min(VA_BITS, arm_smmu_id_size_to_bits(size));
-#else
-		size = 32;
-#endif
-		smmu->input_size = size;
+		smmu->input_size = arm_smmu_id_size_to_bits(size);
 
 		if ((PAGE_SIZE == SZ_4K && !(id & ID2_PTFS_4K)) ||
 		    (PAGE_SIZE == SZ_64K && !(id & ID2_PTFS_64K)) ||
@@ -1829,183 +846,30 @@ static int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu)
 		}
 	}
 
-	dev_notice(smmu->dev,
-		   "\t%u-bit VA, %u-bit IPA, %u-bit PA\n",
-		   smmu->input_size, smmu->s1_output_size, smmu->s2_output_size);
 	return 0;
 }
 
-static int arm_smmu_device_dt_probe(struct platform_device *pdev) +static int arm_smmu_device_unload(struct arm_smmu_device *smmu) { - struct resource *res; - struct arm_smmu_device *smmu; - struct device_node *dev_node; - struct device *dev = &pdev->dev; - struct rb_node *node; - struct of_phandle_args masterspec; - int num_irqs, i, err; - - smmu = devm_kzalloc(dev, sizeof(*smmu), GFP_KERNEL); - if (!smmu) { - dev_err(dev, "failed to allocate arm_smmu_device\n"); - return -ENOMEM; - } - smmu->dev = dev; - - res = platform_get_resource(pdev, IORESOURCE_MEM, 0); - smmu->base = devm_ioremap_resource(dev, res); - if (IS_ERR(smmu->base)) - return PTR_ERR(smmu->base); - smmu->size = resource_size(res); - - if (of_property_read_u32(dev->of_node, "#global-interrupts", - &smmu->num_global_irqs)) { - dev_err(dev, "missing #global-interrupts property\n"); - return -ENODEV; - } - - num_irqs = 0; - while ((res = platform_get_resource(pdev, IORESOURCE_IRQ, num_irqs))) { - num_irqs++; - if (num_irqs > smmu->num_global_irqs) - smmu->num_context_irqs++; - } - - if (!smmu->num_context_irqs) { - dev_err(dev, "found %d interrupts but expected at least %d\n", - num_irqs, smmu->num_global_irqs + 1); - return -ENODEV; - } - - smmu->irqs = devm_kzalloc(dev, sizeof(*smmu->irqs) * num_irqs, - GFP_KERNEL); - if (!smmu->irqs) { - dev_err(dev, "failed to allocate %d irqs\n", num_irqs); - return -ENOMEM; - } - - for (i = 0; i < num_irqs; ++i) { - int irq = platform_get_irq(pdev, i); - if (irq < 0) { - dev_err(dev, "failed to get irq index %d\n", i); - return -ENODEV; - } - smmu->irqs[i] = irq; - } - - i = 0; - smmu->masters = RB_ROOT; - while (!of_parse_phandle_with_args(dev->of_node, "mmu-masters", - "#stream-id-cells", i, - &masterspec)) { - err = register_smmu_master(smmu, dev, &masterspec); - if (err) { - dev_err(dev, "failed to add master %s\n", - masterspec.np->name); - goto out_put_masters; - } - - i++; - } - dev_notice(dev, "registered %d master devices\n", i); - - dev_node = 
of_parse_phandle(dev->of_node, "smmu-parent", 0); - if (dev_node) - smmu->parent_of_node = dev_node; - - err = arm_smmu_device_cfg_probe(smmu); - if (err) - goto out_put_parent; - - parse_driver_options(smmu); - - if (smmu->version > 1 && - smmu->num_context_banks != smmu->num_context_irqs) { - dev_err(dev, - "found only %d context interrupt(s) but %d required\n", - smmu->num_context_irqs, smmu->num_context_banks); - err = -ENODEV; - goto out_put_parent; - } - - for (i = 0; i < smmu->num_global_irqs; ++i) { - err = request_irq(smmu->irqs[i], - arm_smmu_global_fault, - IRQF_SHARED, - "arm-smmu global fault", - smmu); - if (err) { - dev_err(dev, "failed to request global IRQ %d (%u)\n", - i, smmu->irqs[i]); - goto out_free_irqs; - } - } - - INIT_LIST_HEAD(&smmu->list); - spin_lock(&arm_smmu_devices_lock); - list_add(&smmu->list, &arm_smmu_devices); - spin_unlock(&arm_smmu_devices_lock); + writel(sCR0_CLIENTPD, ARM_SMMU_GR0_NS(smmu) + ARM_SMMU_GR0_sCR0);
- arm_smmu_device_reset(smmu); return 0; - -out_free_irqs: - while (i--) - free_irq(smmu->irqs[i], smmu); - -out_put_parent: - if (smmu->parent_of_node) - of_node_put(smmu->parent_of_node); - -out_put_masters: - for (node = rb_first(&smmu->masters); node; node = rb_next(node)) { - struct arm_smmu_master *master; - master = container_of(node, struct arm_smmu_master, node); - of_node_put(master->of_node); - } - - return err; }
-static int arm_smmu_device_remove(struct platform_device *pdev) -{ - int i; - struct device *dev = &pdev->dev; - struct arm_smmu_device *curr, *smmu = NULL; - struct rb_node *node; - - spin_lock(&arm_smmu_devices_lock); - list_for_each_entry(curr, &arm_smmu_devices, list) { - if (curr->dev == dev) { - smmu = curr; - list_del(&smmu->list); - break; - } - } - spin_unlock(&arm_smmu_devices_lock); - - if (!smmu) - return -ENODEV; - - if (smmu->parent_of_node) - of_node_put(smmu->parent_of_node); - - for (node = rb_first(&smmu->masters); node; node = rb_next(node)) { - struct arm_smmu_master *master; - master = container_of(node, struct arm_smmu_master, node); - of_node_put(master->of_node); - } - - if (!bitmap_empty(smmu->context_map, ARM_SMMU_MAX_CBS)) - dev_err(dev, "removing device with active domains!\n"); - - for (i = 0; i < smmu->num_global_irqs; ++i) - free_irq(smmu->irqs[i], smmu); - - /* Turn the thing off */ - writel(sCR0_CLIENTPD, ARM_SMMU_GR0_NS(smmu) + ARM_SMMU_GR0_sCR0); - return 0; -} +static struct smmu_hwdep_ops arm_smmu_hwdep_ops = { + .alloc_context = arm_smmu_alloc_context, + .tlb_sync_finished = arm_smmu_tlb_sync_finished, + .tlb_inv_context = arm_smmu_tlb_inv_context, + .context_fault = arm_smmu_context_fault, + .global_fault = arm_smmu_global_fault, + .init_context_bank = arm_smmu_init_context_bank, + .destroy_context_bank = arm_smmu_destroy_context_bank, + .domain_add_master = arm_smmu_domain_add_master, + .domain_remove_master = arm_smmu_domain_remove_master, + .device_reset = arm_smmu_device_reset, + .device_cfg_probe = arm_smmu_device_cfg_probe, + .device_remove = arm_smmu_device_unload, +};
 #ifdef CONFIG_OF
 static struct of_device_id arm_smmu_of_match[] = {
@@ -2018,34 +882,24 @@ static struct of_device_id arm_smmu_of_match[] = {
 MODULE_DEVICE_TABLE(of, arm_smmu_of_match);
 #endif
 
+static int arm_smmu_device_probe(struct platform_device *pdev)
+{
+	return arm_smmu_device_dt_probe(pdev, &arm_smmu_hwdep_ops);
+}
+
 static struct platform_driver arm_smmu_driver = {
 	.driver	= {
 		.owner		= THIS_MODULE,
 		.name		= "arm-smmu",
 		.of_match_table	= of_match_ptr(arm_smmu_of_match),
 	},
-	.probe	= arm_smmu_device_dt_probe,
+	.probe	= arm_smmu_device_probe,
 	.remove	= arm_smmu_device_remove,
 };
 
 static int __init arm_smmu_init(void)
 {
-	int ret;
-
-	ret = platform_driver_register(&arm_smmu_driver);
-	if (ret)
-		return ret;
-
-	/* Oh, for a proper bus abstraction */
-	if (!iommu_present(&platform_bus_type))
-		bus_set_iommu(&platform_bus_type, &arm_smmu_ops);
-
-#ifdef CONFIG_ARM_AMBA
-	if (!iommu_present(&amba_bustype))
-		bus_set_iommu(&amba_bustype, &arm_smmu_ops);
-#endif
-
-	return 0;
+	return platform_driver_register(&arm_smmu_driver);
 }
 
static void __exit arm_smmu_exit(void) diff --git a/drivers/iommu/arm-smmu.h b/drivers/iommu/arm-smmu.h new file mode 100644 index 0000000..b88b5de --- /dev/null +++ b/drivers/iommu/arm-smmu.h @@ -0,0 +1,258 @@ +/* + * IOMMU API for ARM architected SMMU implementations. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * Copyright (C) 2013 ARM Limited + * + * Author: Will Deacon will.deacon@arm.com + * + */ + +#ifndef ARM_SMMU_H +#define ARM_SMMU_H + +#include <linux/iommu.h> +#include <linux/of.h> +#include <linux/platform_device.h> + +/* Maximum number of stream IDs assigned to a single device */ +#define MAX_MASTER_STREAMIDS MAX_PHANDLE_ARGS + +/* Maximum number of mapping groups per SMMU */ +#define ARM_SMMU_MAX_SMRS 128 + +/* Page table bits */ +#define ARM_SMMU_PTE_XN (((pteval_t)3) << 53) +#define ARM_SMMU_PTE_CONT (((pteval_t)1) << 52) +#define ARM_SMMU_PTE_AF (((pteval_t)1) << 10) +#define ARM_SMMU_PTE_SH_NS (((pteval_t)0) << 8) +#define ARM_SMMU_PTE_SH_OS (((pteval_t)2) << 8) +#define ARM_SMMU_PTE_SH_IS (((pteval_t)3) << 8) +#define ARM_SMMU_PTE_PAGE (((pteval_t)3) << 0) + +#if PAGE_SIZE == SZ_4K +#define ARM_SMMU_PTE_CONT_ENTRIES 16 +#elif PAGE_SIZE == SZ_64K +#define ARM_SMMU_PTE_CONT_ENTRIES 32 +#else +#define ARM_SMMU_PTE_CONT_ENTRIES 1 +#endif + +#define ARM_SMMU_PTE_CONT_SIZE (PAGE_SIZE * ARM_SMMU_PTE_CONT_ENTRIES) +#define ARM_SMMU_PTE_CONT_MASK (~(ARM_SMMU_PTE_CONT_SIZE - 1)) + +/* Stage-1 PTE */ +#define ARM_SMMU_PTE_AP_UNPRIV (((pteval_t)1) << 6) +#define ARM_SMMU_PTE_AP_RDONLY (((pteval_t)2) << 6) +#define ARM_SMMU_PTE_ATTRINDX_SHIFT 2 +#define 
ARM_SMMU_PTE_nG (((pteval_t)1) << 11) + +/* Stage-2 PTE */ +#define ARM_SMMU_PTE_HAP_FAULT (((pteval_t)0) << 6) +#define ARM_SMMU_PTE_HAP_READ (((pteval_t)1) << 6) +#define ARM_SMMU_PTE_HAP_WRITE (((pteval_t)2) << 6) +#define ARM_SMMU_PTE_MEMATTR_OIWB (((pteval_t)0xf) << 2) +#define ARM_SMMU_PTE_MEMATTR_NC (((pteval_t)0x5) << 2) +#define ARM_SMMU_PTE_MEMATTR_DEV (((pteval_t)0x1) << 2) + +#define TLB_LOOP_TIMEOUT 1000000 /* 1s! */ + +#define TYPE_S2_TRANS 0 +#define TYPE_S1_TRANS_S2_BYPASS 1 +#define TYPE_S1_TRANS_S2_TRANS 3 + +#define RESUME_RETRY (0 << 0) +#define RESUME_TERMINATE (1 << 0) + +/* In SMMUv2, this register is named SMMU_CBn_TCR */ +#define TTBCR_EAE (1 << 31) + +#define TTBCR_PASIZE_SHIFT 16 +#define TTBCR_PASIZE_MASK 0x7 + +#define TTBCR_TG0_4K (0 << 14) +#define TTBCR_TG0_64K (1 << 14) + +#define TTBCR_SH0_SHIFT 12 +#define TTBCR_SH0_MASK 0x3 +#define TTBCR_SH_NS 0 +#define TTBCR_SH_OS 2 +#define TTBCR_SH_IS 3 + +#define TTBCR_ORGN0_SHIFT 10 +#define TTBCR_IRGN0_SHIFT 8 +#define TTBCR_RGN_MASK 0x3 +#define TTBCR_RGN_NC 0 +#define TTBCR_RGN_WBWA 1 +#define TTBCR_RGN_WT 2 +#define TTBCR_RGN_WB 3 + +#define TTBCR_SL0_SHIFT 6 +#define TTBCR_SL0_MASK 0x3 +#define TTBCR_SL0_LVL_2 0 +#define TTBCR_SL0_LVL_1 1 + +#define TTBCR_T1SZ_SHIFT 16 +#define TTBCR_T0SZ_SHIFT 0 +#define TTBCR_SZ_MASK 0xf + +#define MAIR_ATTR_SHIFT(n) ((n) << 3) +#define MAIR_ATTR_MASK 0xff +#define MAIR_ATTR_DEVICE 0x04 +#define MAIR_ATTR_NC 0x44 +#define MAIR_ATTR_WBRWA 0xff +#define MAIR_ATTR_IDX_NC 0 +#define MAIR_ATTR_IDX_CACHE 1 +#define MAIR_ATTR_IDX_DEV 2 + +#define MAIR0_STAGE1 \ + ((MAIR_ATTR_NC << MAIR_ATTR_SHIFT(MAIR_ATTR_IDX_NC)) | \ + (MAIR_ATTR_WBRWA << MAIR_ATTR_SHIFT(MAIR_ATTR_IDX_CACHE)) | \ + (MAIR_ATTR_DEVICE << MAIR_ATTR_SHIFT(MAIR_ATTR_IDX_DEV))) + + +struct arm_smmu_smr { + u8 idx; + u16 mask; + u16 id; +}; + +struct arm_smmu_master { + struct device_node *of_node; + + /* + * The following is specific to the master's position in the + * SMMU chain. 
+ */ + struct rb_node node; + int num_streamids; + u16 streamids[MAX_MASTER_STREAMIDS]; + + /* + * We only need to allocate these on the root SMMU, as we + * configure unmatched streams to bypass translation. + */ + struct arm_smmu_smr *smrs; +}; + +struct smmu_hwdep_ops; + +struct arm_smmu_device { + struct device *dev; + struct device_node *parent_of_node; + + struct smmu_hwdep_ops *hwdep_ops; + + void __iomem *s1cbt; + void __iomem *s2cbt; + void __iomem *base; + u32 size; + u32 pagesize; + +#define ARM_SMMU_FEAT_COHERENT_WALK (1 << 0) +#define ARM_SMMU_FEAT_STREAM_MATCH (1 << 1) +#define ARM_SMMU_FEAT_TRANS_S1 (1 << 2) +#define ARM_SMMU_FEAT_TRANS_S2 (1 << 3) +#define ARM_SMMU_FEAT_TRANS_NESTED (1 << 4) + u32 features; + +#define ARM_SMMU_OPT_SECURE_CFG_ACCESS (1 << 0) + u32 options; + int version; + + u32 num_context_banks; + u32 num_s2_context_banks; + unsigned long *context_map; + atomic_t irptndx; + + u32 num_mapping_groups; + DECLARE_BITMAP(smr_map, ARM_SMMU_MAX_SMRS); + + u32 input_size; + u32 s1_output_size; + u32 s2_output_size; + + u32 num_global_irqs; + u32 num_context_irqs; + unsigned int *irqs; + + struct list_head list; + struct rb_root masters; +}; + +struct arm_smmu_cfg { + struct arm_smmu_device *smmu; + u8 cbndx; + u8 irptndx; + u32 type; + pgd_t *pgd; +}; +#define INVALID_IRPTNDX 0xff + +#define ARM_SMMU_CB_ASID(cfg) ((cfg)->cbndx) +#define ARM_SMMU_CB_VMID(cfg) ((cfg)->cbndx + 1) + +struct arm_smmu_domain { + /* + * A domain can span across multiple, chained SMMUs and requires + * all devices within the domain to follow the same translation + * path. 
+ */ + struct arm_smmu_device *leaf_smmu; + struct arm_smmu_cfg root_cfg; + phys_addr_t output_mask; + + spinlock_t lock; +}; + +/** + * struct smmu_hwdep_ops - smmu hardware dependent ops + * @alloc_context: alloc a free context bank + * @tlb_sync_finished: check whether tlb sync operation is finished + * @tlb_inv_context: invalid smmu context bank tlb + * @context_fault: context fault handler + * @global_fault: global fault handler + * @init_context_bank: init a context bank + * @destroy_context_bank: disable a context bank and invalid the TLBs + * @domain_add_master: add a master into a domain + * @domain_remove_master: remove a master from a domain + * @device_reset: initialize a smmu + * @device_cfg_probe: probe hardware configuration + * @device_remove: turn off a smmu and reclaim associated resources + */ +struct smmu_hwdep_ops { + int (*alloc_context)(struct arm_smmu_device *smmu, + int start, int end, struct arm_smmu_master *master); + int (*tlb_sync_finished)(struct arm_smmu_device *smmu); + void (*tlb_inv_context)(struct arm_smmu_cfg *cfg); + irqreturn_t (*context_fault)(int irq, void *dev); + irqreturn_t (*global_fault)(int irq, void *dev); + void (*init_context_bank)(struct arm_smmu_domain *smmu_domain); + void (*destroy_context_bank)(struct arm_smmu_domain *smmu_domain); + int (*domain_add_master)(struct arm_smmu_domain *smmu_domain, + struct arm_smmu_master *master); + void (*domain_remove_master)(struct arm_smmu_domain *smmu_domain, + struct arm_smmu_master *master); + int (*device_reset)(struct arm_smmu_device *smmu); + int (*device_cfg_probe)(struct arm_smmu_device *smmu); + int (*device_remove)(struct arm_smmu_device *smmu); +}; + +extern int __arm_smmu_alloc_bitmap(unsigned long *map, int start, int end); +extern void __arm_smmu_free_bitmap(unsigned long *map, int idx); +extern void arm_smmu_tlb_sync_wait(struct arm_smmu_device *smmu); +extern struct arm_smmu_device *find_parent_smmu(struct arm_smmu_device *smmu); +extern void 
arm_smmu_flush_pgtable(struct arm_smmu_device *smmu, void *addr, + size_t size); +extern int arm_smmu_device_dt_probe(struct platform_device *pdev, + struct smmu_hwdep_ops *ops); +extern int arm_smmu_device_remove(struct platform_device *pdev); +#endif -- 1.8.0
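The hook table declared in arm-smmu.h can be pictured with a small standalone model. This is only a sketch under stated assumptions: every sketch_* name is hypothetical, only two of the twelve hooks are modelled, and the probe-then-reset ordering is inferred from the removed arm_smmu_device_dt_probe() rather than taken from arm-smmu-base.c, which is not shown here.

```c
#include <stddef.h>

struct sketch_smmu;

/* Hypothetical, reduced stand-in for struct smmu_hwdep_ops. */
struct sketch_hwdep_ops {
	int (*device_cfg_probe)(struct sketch_smmu *smmu);
	int (*device_reset)(struct sketch_smmu *smmu);
};

/* Hypothetical, reduced stand-in for struct arm_smmu_device. */
struct sketch_smmu {
	const struct sketch_hwdep_ops *hwdep_ops;
	int version;
	int enabled;
};

/* One "private file" supplies its own implementations of the hooks. */
static int sketch_v1_cfg_probe(struct sketch_smmu *smmu)
{
	smmu->version = 1;
	return 0;
}

static int sketch_v1_reset(struct sketch_smmu *smmu)
{
	smmu->enabled = 1;
	return 0;
}

static const struct sketch_hwdep_ops sketch_v1_ops = {
	.device_cfg_probe	= sketch_v1_cfg_probe,
	.device_reset		= sketch_v1_reset,
};

/*
 * Shared probe path: it never touches hardware registers itself,
 * it only calls through the hooks handed in by the private driver.
 */
static int sketch_common_probe(struct sketch_smmu *smmu,
			       const struct sketch_hwdep_ops *ops)
{
	int err;

	if (!ops || !ops->device_cfg_probe || !ops->device_reset)
		return -1;

	smmu->hwdep_ops = ops;

	err = ops->device_cfg_probe(smmu);
	if (err)
		return err;

	return ops->device_reset(smmu);
}
```

A second driver would only need to define its own ops instance and pass it to the same shared probe, which mirrors how arm_smmu_device_probe() passes &arm_smmu_hwdep_ops to arm_smmu_device_dt_probe().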
Some Hisilicon SMMU features are listed below:

1. The StreamID is 16 bits wide: the highest 8 bits are the VMID and the
   lowest 8 bits are the ASID. StreamID matching is not supported, so the
   VMID and ASID are used directly to index the context banks. The VMID
   first indexes the stage2 context bank, then the ASID indexes the stage1
   context bank. There are at most 256 stage2 context banks, and each
   stage2 context bank relates to 256 stage1 context banks.

   |-----------------|            |-----------------|
   |stage2 CB VMID0  |----------->|stage1 CB ASID0  |
   |-----------------|            |-----------------|
   |     ......      |            |     ......      |
   |-----------------|            |-----------------|
   |stage2 CB VMID255|-----|      |stage1 CB ASID255|
   |-----------------|     |      |-----------------|
                           |
                           |
                           |
                           |----->|-----------------|
                                  |stage1 CB ASID0  |
                                  |-----------------|
                                  |     ......      |
                                  |-----------------|
                                  |stage1 CB ASID255|
                                  |-----------------|

2. The base address of the stage2 context bank table is stored in
   SMMU_CFG_S2CTBAR, and the base address of each stage1 context bank
   table is stored in S2_S1CTBAR (located in the stage2 context bank).

3. All context bank faults share 8 groups of context fault registers.
   That is, at most 8 context faults can be recorded. The fault syndrome
   register records the StreamID to help software determine which context
   bank raised the fault.
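The indexing scheme in points 1 to 3 can be sketched in plain C as follows. This is an illustrative model, not code from the driver: all sketch_* names are hypothetical. The 32-byte entry stride and the syndrome field positions are assumptions read off the SMMU_CB_S2CR(n), SMMU_CB(smmu, n) and FSYNR0_* macros in hisi-smmu.c.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed from the (n) << 5 stride used by SMMU_CB_S2CR(n) and SMMU_CB(). */
#define SKETCH_CB_STRIDE_SHIFT	5
/* Assumed from SMMU_CB_NUMIRPT: 8 shared groups of fault registers. */
#define SKETCH_NUM_FAULT_SLOTS	8
/* Assumed from FSYNR0_CF: bit 0 marks a context fault record. */
#define SKETCH_FSYNR_CF		(1u << 0)

/* Upper 8 bits of the StreamID select the stage2 context bank. */
static inline uint8_t sketch_sid_vmid(uint16_t sid)
{
	return (uint8_t)(sid >> 8);
}

/* Lower 8 bits of the StreamID select the stage1 context bank. */
static inline uint8_t sketch_sid_asid(uint16_t sid)
{
	return (uint8_t)(sid & 0xff);
}

static inline uint16_t sketch_make_sid(uint8_t vmid, uint8_t asid)
{
	return (uint16_t)(((uint16_t)vmid << 8) | asid);
}

/* Byte offset of entry 'index' inside a context bank table. */
static inline size_t sketch_cb_offset(uint8_t index)
{
	return (size_t)index << SKETCH_CB_STRIDE_SHIFT;
}

/*
 * Scan the shared fault records for one that belongs to 'sid'.
 * Field layout assumed from FSYNR0_ASID()/FSYNR0_VMID():
 * ASID in bits 31:24, VMID in bits 23:16.
 * Returns the slot index, or -1 if no record matches.
 */
static int sketch_find_fault_slot(const uint32_t *fsynr, uint16_t sid)
{
	int i;

	for (i = 0; i < SKETCH_NUM_FAULT_SLOTS; i++) {
		uint32_t v = fsynr[i];

		if (!(v & SKETCH_FSYNR_CF))
			continue;
		if (((v >> 16) & 0xff) == sketch_sid_vmid(sid) &&
		    ((v >> 24) & 0xff) == sketch_sid_asid(sid))
			return i;
	}

	return -1;
}
```

Because the fault registers are shared, a handler has to search for its own record instead of reading a per-bank register, which is the same shape as the loop at the top of hisi_smmu_context_fault().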
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
---
 drivers/iommu/Kconfig     |  10 +
 drivers/iommu/Makefile    |   1 +
 drivers/iommu/hisi-smmu.c | 575 ++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 586 insertions(+)
 create mode 100644 drivers/iommu/hisi-smmu.c

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig index fad5e38..716b0ab 100644 --- a/drivers/iommu/Kconfig +++ b/drivers/iommu/Kconfig @@ -309,4 +309,14 @@ config ARM_SMMU Say Y here if your SoC includes an IOMMU device implementing the ARM SMMU architecture.
+config HISI_SMMU + bool "Hisilicon Ltd. System MMU (SMMU) Support" + depends on ARM64 || (ARM_LPAE && OF) + select IOMMU_API + select ARM_SMMU_BASE + select ARM_DMA_USE_IOMMU if ARM + help + Say Y here if your SoC includes an IOMMU device implementing + the Hisilicon SMMU architecture. + endif # IOMMU_SUPPORT diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile index 717cfa3..ef932f2 100644 --- a/drivers/iommu/Makefile +++ b/drivers/iommu/Makefile @@ -6,6 +6,7 @@ obj-$(CONFIG_AMD_IOMMU) += amd_iommu.o amd_iommu_init.o obj-$(CONFIG_AMD_IOMMU_V2) += amd_iommu_v2.o obj-$(CONFIG_ARM_SMMU_BASE) += arm-smmu-base.o obj-$(CONFIG_ARM_SMMU) += arm-smmu.o +obj-$(CONFIG_HISI_SMMU) += hisi-smmu.o obj-$(CONFIG_DMAR_TABLE) += dmar.o obj-$(CONFIG_INTEL_IOMMU) += iova.o intel-iommu.o obj-$(CONFIG_IPMMU_VMSA) += ipmmu-vmsa.o diff --git a/drivers/iommu/hisi-smmu.c b/drivers/iommu/hisi-smmu.c new file mode 100644 index 0000000..7191d5c --- /dev/null +++ b/drivers/iommu/hisi-smmu.c @@ -0,0 +1,575 @@ +/* + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. 
+ * + * Copyright (C) 2014 Hisilicon Limited + * + * Author: Zhen Lei thunder.leizhen@huawei.com + * + * Hisilicon smmu-v1 implemention + * + */ + +#define pr_fmt(fmt) "hisi-smmu: " fmt + +#include <linux/delay.h> +#include <linux/err.h> +#include <linux/interrupt.h> +#include <linux/io.h> +#include <linux/iommu.h> +#include <linux/module.h> +#include <linux/of.h> +#include <linux/spinlock.h> + +#include "arm-smmu.h" + +/* Maximum number of context banks per VMID */ +#define HISI_SMMU_MAX_CBS 256 + +#define SMMU_OS_VMID 0 +#define SMMU_CB_NUMIRPT 8 +#define SMMU_S1CBT_SIZE 0x10000 +#define SMMU_S2CBT_SIZE 0x2000 +#define SMMU_S1CBT_SHIFT 16 +#define SMMU_S2CBT_SHIFT 12 + +/* SMMU global address space */ +#define SMMU_GR0(smmu) ((smmu)->base) + +#define SMMU_CTRL_CR0 0x0 +#define SMMU_CTRL_ACR 0x8 +#define SMMU_CFG_S2CTBAR 0xc +#define SMMU_IDR0 0x10 +#define SMMU_IDR1 0x14 +#define SMMU_IDR2 0x18 +#define SMMU_HIS_GFAR_LOW 0x20 +#define SMMU_HIS_GFAR_HIGH 0x24 +#define SMMU_RINT_GFSR 0x28 +#define SMMU_RINT_GFSYNR 0x2c +#define SMMU_CFG_GFIM 0x30 +#define SMMU_CFG_CBF 0x34 +#define SMMU_TLBIALL 0x40 +#define SMMU_TLBIVMID 0x44 +#define SMMU_TLBISID 0x48 +#define SMMU_TLBIVA_LOW 0x4c +#define SMMU_TLBIVA_HIGH 0x50 +#define SMMU_TLBGSYNC 0x54 +#define SMMU_TLBGSTATUS 0x58 +#define SMMU_CXTIALL 0x60 +#define SMMU_CXTIVMID 0x64 +#define SMMU_CXTISID 0x68 +#define SMMU_CXTGSYNC 0x6c +#define SMMU_CXTGSTATUS 0x70 +#define SMMU_RINT_CB_FSR(n) (0x100 + ((n) << 2)) +#define SMMU_RINT_CB_FSYNR(n) (0x120 + ((n) << 2)) +#define SMMU_HIS_CB_FAR_LOW(n) (0x140 + ((n) << 3)) +#define SMMU_HIS_CB_FAR_HIGH(n) (0x144 + ((n) << 3)) +#define SMMU_CTRL_CB_RESUME(n) (0x180 + ((n) << 2)) + +#define SMMU_CB_S2CR(n) (0x0 + ((n) << 5)) +#define SMMU_CB_CBAR(n) (0x4 + ((n) << 5)) +#define SMMU_CB_S1CTBAR(n) (0x18 + ((n) << 5)) + +/* SMMU stage1 context bank and StreamID */ +#define SMMU_CB_BASE(smmu) ((smmu)->s1cbt) +#define SMMU_CB(smmu, n) ((n) << 5) +#define SMMU_CB_SID(cfg) 
(((u16)SMMU_OS_VMID << 8) | \ + ((cfg)->cbndx)) + +#define SMMU_S1_MAIR0 0x0 +#define SMMU_S1_MAIR1 0x4 +#define SMMU_S1_TTBR0_L 0x8 +#define SMMU_S1_TTBR0_H 0xc +#define SMMU_S1_TTBR1_L 0x10 +#define SMMU_S1_TTBR1_H 0x14 +#define SMMU_S1_TTBCR 0x18 +#define SMMU_S1_SCTLR 0x1c + +#define CFG_CBF_S1_ORGN_WA (1 << 12) +#define CFG_CBF_S1_IRGN_WA (1 << 10) +#define CFG_CBF_S1_SHCFG_IS (3 << 8) +#define CFG_CBF_S2_ORGN_WA (1 << 4) +#define CFG_CBF_S2_IRGN_WA (1 << 2) +#define CFG_CBF_S2_SHCFG_IS (3 << 0) + +#if (PAGE_SIZE == SZ_4K) +#define sACR_WC_EN (7 << 0) +#elif (PAGE_SIZE == SZ_64K) +#define sACR_WC_EN (3 << 5) +#else +#define sACR_WC_EN 0 +#endif + +/* Configuration registers */ +#define sCR0_CLIENTPD (1 << 0) +#define sCR0_GFRE (1 << 1) +#define sCR0_GFIE (1 << 2) +#define sCR0_GCFGFRE (1 << 4) +#define sCR0_GCFGFIE (1 << 5) + +#define ID0_S1TS (1 << 30) +#define ID0_NTS (1 << 28) +#define ID0_CTTW (1 << 14) + +#define ID2_IAS_GET(id2) (((id2) >> 0) & 0xff) +#define ID2_OAS_GET(id2) (((id2) >> 8) & 0xff) +#define ID2_IPA_SIZE 48 + +#define CBAR_TYPE_S1_TRANS_S2_BYPASS (0x1 << 16) +#define CBAR_S1_BPSHCFG_NSH (0x3 << 8) +#define CBAR_S1_MEMATTR_WB (0xf << 12) +#define CBAR_MTSH_WEAKEST (CBAR_S1_BPSHCFG_NSH | \ + CBAR_S1_MEMATTR_WB) + +#define S2CR_TYPE_SHIFT 16 +#define S2CR_TYPE_TRANS (0 << S2CR_TYPE_SHIFT) +#define S2CR_TYPE_BYPASS (1 << S2CR_TYPE_SHIFT) +#define S2CR_SHCFG_NS (3 << 8) +#define S2CR_MTCFG (1 << 11) +#define S2CR_MEMATTR_OIWB (0xf << 12) +#define S2CR_MTSH_WEAKEST (S2CR_SHCFG_NS | \ + S2CR_MTCFG | S2CR_MEMATTR_OIWB) + +#define SCTLR_CFCFG (1 << 7) +#define SCTLR_CFIE (1 << 6) +#define SCTLR_CFRE (1 << 5) +#define SCTLR_E (1 << 4) +#define SCTLR_AFED (1 << 3) +#define SCTLR_M (1 << 0) + +#define sTLBGSTATUS_GSACTIVE (1 << 0) + +#define HISI_TTBCR_TG0_64K (3 << 14) + +#define FSR_MULTI (1 << 31) +#define FSR_EF (1 << 4) +#define FSR_PF (1 << 3) +#define FSR_AFF (1 << 2) +#define FSR_TF (1 << 1) +#define FSR_IGN (FSR_AFF) +#define FSR_FAULT 
(FSR_MULTI | FSR_EF | \ + FSR_PF | FSR_TF | FSR_IGN) + +#define FSYNR0_ASID(n) (0xff & ((n) >> 24)) +#define FSYNR0_VMID(n) (0xff & ((n) >> 16)) +#define FSYNR0_WNR (1 << 4) +#define FSYNR0_SS (1 << 2) +#define FSYNR0_CF (1 << 0) + +static int hisi_smmu_alloc_context(struct arm_smmu_device *smmu, + int start, int end, struct arm_smmu_master *master) +{ + if (!master) + return -ENOSPC; + + start = master->streamids[0]; + + return __arm_smmu_alloc_bitmap(smmu->context_map, start, start + 1); +} + +static int hisi_smmu_tlb_sync_finished(struct arm_smmu_device *smmu) +{ + u32 reg; + + reg = readl_relaxed(SMMU_GR0(smmu) + SMMU_TLBGSTATUS); + + return !(reg & sTLBGSTATUS_GSACTIVE); +} + +static void hisi_smmu_tlb_sync(struct arm_smmu_device *smmu) +{ + writel_relaxed(0, SMMU_GR0(smmu) + SMMU_TLBGSYNC); + arm_smmu_tlb_sync_wait(smmu); +} + +static void hisi_smmu_tlb_inv_context(struct arm_smmu_cfg *cfg) +{ + struct arm_smmu_device *smmu = cfg->smmu; + + writel_relaxed(SMMU_CB_SID(cfg), SMMU_GR0(smmu) + SMMU_CXTISID); + hisi_smmu_tlb_sync(smmu); +} + +static irqreturn_t hisi_smmu_context_fault(int irq, void *dev) +{ + int i, flags, ret = IRQ_NONE; + u32 fsr, far, fsynr, resume; + unsigned long iova; + struct iommu_domain *domain = dev; + struct arm_smmu_domain *smmu_domain = domain->priv; + struct arm_smmu_cfg *root_cfg = &smmu_domain->root_cfg; + struct arm_smmu_device *smmu = root_cfg->smmu; + void __iomem *gr0_base = SMMU_GR0(smmu); + + for (i = 0; i < SMMU_CB_NUMIRPT; i++) { + fsynr = readl_relaxed(gr0_base + SMMU_RINT_CB_FSYNR(i)); + + if ((fsynr & FSYNR0_CF) && + (FSYNR0_VMID(fsynr) == SMMU_OS_VMID) && + (root_cfg->cbndx == FSYNR0_ASID(fsynr))) + break; + } + + if (i >= SMMU_CB_NUMIRPT) + return IRQ_NONE; + + fsr = readl_relaxed(gr0_base + SMMU_RINT_CB_FSR(i)); + if (fsr & FSR_IGN) + dev_err_ratelimited(smmu->dev, + "Unexpected context fault (fsr 0x%x)\n", + fsr); + + flags = fsynr & FSYNR0_WNR ? 
IOMMU_FAULT_WRITE : IOMMU_FAULT_READ;
+
+	far = readl_relaxed(gr0_base + SMMU_HIS_CB_FAR_LOW(i));
+	iova = far;
+#ifdef CONFIG_64BIT
+	far = readl_relaxed(gr0_base + SMMU_HIS_CB_FAR_HIGH(i));
+	iova |= ((unsigned long)far << 32);
+#endif
+
+	if (!report_iommu_fault(domain, smmu->dev, iova, flags)) {
+		ret = IRQ_HANDLED;
+		resume = RESUME_RETRY;
+	} else {
+		dev_err_ratelimited(smmu->dev,
+		    "Unhandled context fault: iova=0x%08lx, fsynr=0x%x, cb=%d\n",
+		    iova, fsynr, root_cfg->cbndx);
+		ret = IRQ_NONE;
+		resume = RESUME_TERMINATE;
+	}
+
+	/* Clear the faulting FSR */
+	writel(fsr, gr0_base + SMMU_RINT_CB_FSR(i));
+
+	/* Retry or terminate any stalled transactions */
+	if (fsynr & FSYNR0_SS)
+		writel_relaxed(resume, gr0_base + SMMU_CTRL_CB_RESUME(i));
+
+	return ret;
+}
+
+static irqreturn_t hisi_smmu_global_fault(int irq, void *dev)
+{
+	u32 gfsr, gfsynr0;
+	struct arm_smmu_device *smmu = dev;
+	void __iomem *gr0_base = SMMU_GR0(smmu);
+
+	gfsr = readl_relaxed(gr0_base + SMMU_RINT_GFSR);
+	if (!gfsr)
+		return IRQ_NONE;
+
+	gfsynr0 = readl_relaxed(gr0_base + SMMU_RINT_GFSYNR);
+
+	dev_err_ratelimited(smmu->dev,
+		"Unexpected global fault, this could be serious\n");
+	dev_err_ratelimited(smmu->dev,
+		"\tGFSR 0x%08x, GFSYNR0 0x%08x\n", gfsr, gfsynr0);
+
+	writel(gfsr, gr0_base + SMMU_RINT_GFSR);
+	return IRQ_HANDLED;
+}
+
+static void hisi_smmu_init_context_bank(struct arm_smmu_domain *smmu_domain)
+{
+	u32 reg;
+	struct arm_smmu_cfg *root_cfg = &smmu_domain->root_cfg;
+	struct arm_smmu_device *smmu = root_cfg->smmu;
+	void __iomem *cb_base;
+
+	cb_base = SMMU_CB_BASE(smmu) + SMMU_CB(smmu, root_cfg->cbndx);
+
+	/* TTBR0 */
+	arm_smmu_flush_pgtable(smmu, root_cfg->pgd,
+			       PTRS_PER_PGD * sizeof(pgd_t));
+	reg = __pa(root_cfg->pgd);
+	writel_relaxed(reg, cb_base + SMMU_S1_TTBR0_L);
+	reg = (phys_addr_t)__pa(root_cfg->pgd) >> 32;
+	writel_relaxed(reg, cb_base + SMMU_S1_TTBR0_H);
+
+	/*
+	 * TTBCR
+	 * We use long descriptor, with inner-shareable WBWA tables in TTBR0.
+	 */
+	if (PAGE_SIZE == SZ_4K)
+		reg = TTBCR_TG0_4K;
+	else
+		reg = HISI_TTBCR_TG0_64K;
+
+	reg |= (64 - smmu->s1_output_size) << TTBCR_T0SZ_SHIFT;
+
+	reg |= (TTBCR_SH_IS << TTBCR_SH0_SHIFT) |
+	       (TTBCR_RGN_WBWA << TTBCR_ORGN0_SHIFT) |
+	       (TTBCR_RGN_WBWA << TTBCR_IRGN0_SHIFT);
+	writel_relaxed(reg, cb_base + SMMU_S1_TTBCR);
+
+	reg = MAIR0_STAGE1;
+	writel_relaxed(reg, cb_base + SMMU_S1_MAIR0);
+
+	/* SCTLR */
+	reg = SCTLR_CFCFG | SCTLR_CFIE | SCTLR_CFRE | SCTLR_M | SCTLR_AFED;
+#ifdef __BIG_ENDIAN
+	reg |= SCTLR_E;
+#endif
+	writel_relaxed(reg, cb_base + SMMU_S1_SCTLR);
+}
+
+static void hisi_smmu_destroy_context_bank(struct arm_smmu_domain *smmu_domain)
+{
+	struct arm_smmu_cfg *root_cfg = &smmu_domain->root_cfg;
+	struct arm_smmu_device *smmu = root_cfg->smmu;
+	void __iomem *cb_base;
+
+	/* Disable the context bank and nuke the TLB before freeing it. */
+	cb_base = SMMU_CB_BASE(smmu) + SMMU_CB(smmu, root_cfg->cbndx);
+	writel_relaxed(0, cb_base + SMMU_S1_SCTLR);
+	hisi_smmu_tlb_inv_context(root_cfg);
+}
+
+static int hisi_smmu_domain_add_master(struct arm_smmu_domain *smmu_domain,
+				       struct arm_smmu_master *master)
+{
+	if (SMMU_CB_SID(&smmu_domain->root_cfg) != master->streamids[0]) {
+		dev_err(smmu_domain->leaf_smmu->dev, "Too many sid attached\n");
+		return -ENODEV;
+	}
+
+	return 0;
+}
+
+static void hisi_smmu_domain_remove_master(struct arm_smmu_domain *smmu_domain,
+					   struct arm_smmu_master *master)
+{
+}
+
+static int hisi_smmu_device_reset(struct arm_smmu_device *smmu)
+{
+	void __iomem *gr0_base = SMMU_GR0(smmu);
+	void __iomem *cb_base;
+	struct page *cbt_page;
+	int i = 0;
+	u32 reg;
+
+	/* Clear Global FSR */
+	reg = readl_relaxed(gr0_base + SMMU_RINT_GFSR);
+	writel(reg, gr0_base + SMMU_RINT_GFSR);
+
+	/* unmask all global interrupt */
+	writel_relaxed(0, gr0_base + SMMU_CFG_GFIM);
+
+	reg = CFG_CBF_S1_ORGN_WA | CFG_CBF_S1_IRGN_WA | CFG_CBF_S1_SHCFG_IS;
+	reg |= CFG_CBF_S2_ORGN_WA | CFG_CBF_S2_IRGN_WA | CFG_CBF_S2_SHCFG_IS;
+	writel_relaxed(reg, gr0_base + SMMU_CFG_CBF);
+
+	/* stage 2 context banks table */
+	reg = readl_relaxed(gr0_base + SMMU_CFG_S2CTBAR);
+	if (!reg) {
+		cbt_page = alloc_pages(GFP_DMA32, get_order(SMMU_S2CBT_SIZE));
+		if (!cbt_page) {
+			pr_err("Failed to allocate stage2 CB table\n");
+			return -ENOMEM;
+		}
+
+		reg = (u32)(page_to_phys(cbt_page) >> SMMU_S2CBT_SHIFT);
+		writel_relaxed(reg, gr0_base + SMMU_CFG_S2CTBAR);
+		smmu->s2cbt = page_address(cbt_page);
+
+		for (i = 0; i < HISI_SMMU_MAX_CBS; i++) {
+			writel_relaxed(0, smmu->s2cbt + SMMU_CB_S1CTBAR(i));
+			writel_relaxed(S2CR_TYPE_BYPASS,
+				       smmu->s2cbt + SMMU_CB_S2CR(i));
+		}
+
+		/* Invalidate all TLB */
+		writel_relaxed(0, gr0_base + SMMU_TLBIALL);
+		hisi_smmu_tlb_sync(smmu);
+	} else {
+		smmu->s2cbt = ioremap_cache(
+			(phys_addr_t)reg << SMMU_S2CBT_SHIFT, SMMU_S2CBT_SIZE);
+	}
+
+	/* stage 1 context banks table */
+	cbt_page = alloc_pages(GFP_DMA32, get_order(SMMU_S1CBT_SIZE));
+	if (!cbt_page) {
+		pr_err("Failed to allocate stage1 CB table\n");
+		return -ENOMEM;
+	}
+
+	reg = (u32)(page_to_phys(cbt_page) >> SMMU_S1CBT_SHIFT);
+	writel_relaxed(reg, smmu->s2cbt + SMMU_CB_S1CTBAR(SMMU_OS_VMID));
+	smmu->s1cbt = page_address(cbt_page);
+
+	/* Make sure all context banks are disabled */
+	for (i = 0; i < smmu->num_context_banks; i++) {
+		cb_base = SMMU_CB_BASE(smmu) + SMMU_CB(smmu, i);
+
+		writel_relaxed(0, cb_base + SMMU_S1_SCTLR);
+	}
+
+	/* Clear CB_FSR */
+	for (i = 0; i < SMMU_CB_NUMIRPT; i++)
+		writel_relaxed(FSR_FAULT, gr0_base + SMMU_RINT_CB_FSR(i));
+
+	/*
+	 * Use the weakest attribute, so no impact stage 1 output attribute.
+	 */
+	reg = CBAR_TYPE_S1_TRANS_S2_BYPASS | CBAR_MTSH_WEAKEST;
+	writel_relaxed(reg, smmu->s2cbt + SMMU_CB_CBAR(SMMU_OS_VMID));
+
+	/* Bypass need use another S2CR */
+	reg = S2CR_TYPE_BYPASS | S2CR_MTSH_WEAKEST;
+	writel_relaxed(reg, smmu->s2cbt + SMMU_CB_S2CR(0xff));
+
+	/* Mark S2CR as translation */
+	reg = S2CR_TYPE_TRANS | S2CR_MTSH_WEAKEST;
+	writel_relaxed(reg, smmu->s2cbt + SMMU_CB_S2CR(SMMU_OS_VMID));
+
+	/* Invalidate host OS TLB */
+	writel_relaxed(SMMU_OS_VMID, gr0_base + SMMU_TLBIVMID);
+	hisi_smmu_tlb_sync(smmu);
+
+	writel_relaxed(sACR_WC_EN, gr0_base + SMMU_CTRL_ACR);
+
+	/* Enable fault report */
+	reg = readl_relaxed(SMMU_GR0(smmu) + SMMU_CTRL_CR0);
+	reg |= (sCR0_GFRE | sCR0_GFIE | sCR0_GCFGFRE | sCR0_GCFGFIE);
+	reg &= ~sCR0_CLIENTPD;
+
+	writel_relaxed(reg, gr0_base + SMMU_CTRL_CR0);
+
+	return 0;
+}
+
+static u32 hisi_smmu_id_size_to_bits(u32 size)
+{
+	int i;
+
+	for (i = 7; i >= 0; i--)
+		if ((size >> i) & 0x1)
+			break;
+
+	return 32 + 4 * (i + 1);
+}
+
+static int hisi_smmu_device_cfg_probe(struct arm_smmu_device *smmu)
+{
+	void __iomem *gr0_base = SMMU_GR0(smmu);
+	u32 id;
+
+	dev_notice(smmu->dev, "probing hisi-smmu hardware configuration...\n");
+
+	smmu->version = 1;
+
+	/* ID0 */
+	id = readl_relaxed(gr0_base + SMMU_IDR0);
+
+	if (id & ID0_NTS) {
+		smmu->features |= ARM_SMMU_FEAT_TRANS_NESTED;
+		smmu->features |= ARM_SMMU_FEAT_TRANS_S1;
+		smmu->features |= ARM_SMMU_FEAT_TRANS_S2;
+		dev_notice(smmu->dev, "\tnested translation\n");
+	} else if (id & ID0_S1TS) {
+		smmu->features |= ARM_SMMU_FEAT_TRANS_S1;
+		dev_notice(smmu->dev, "\tstage 1 translation\n");
+	}
+
+	if (!(smmu->features & ARM_SMMU_FEAT_TRANS_S1)) {
+		dev_err(smmu->dev, "\tstage 1 translation not support!\n");
+		return -ENODEV;
+	}
+
+	if (id & ID0_CTTW) {
+		smmu->features |= ARM_SMMU_FEAT_COHERENT_WALK;
+		dev_notice(smmu->dev, "\tcoherent table walk\n");
+	}
+
+	smmu->num_context_banks = HISI_SMMU_MAX_CBS;
+
+	/* ID2 */
+	id = readl_relaxed(gr0_base + SMMU_IDR2);
+	smmu->input_size = hisi_smmu_id_size_to_bits(ID2_IAS_GET(id));
+	smmu->s1_output_size = ID2_IPA_SIZE;
+	smmu->s2_output_size = hisi_smmu_id_size_to_bits(ID2_OAS_GET(id));
+
+	return 0;
+}
+
+static int hisi_smmu_device_remove(struct arm_smmu_device *smmu)
+{
+	u32 reg;
+
+	/*
+	 * Here, we only free s1cbt.
+	 * The s2cbt may be shared with hypervisor or other smmu devices.
+	 */
+	free_pages((unsigned long)smmu->s1cbt, get_order(SMMU_S1CBT_SIZE));
+
+	/* Disable fault report */
+	reg = readl_relaxed(SMMU_GR0(smmu) + SMMU_CTRL_CR0);
+	reg &= ~(sCR0_GFRE | sCR0_GFIE | sCR0_GCFGFRE | sCR0_GCFGFIE);
+	reg |= sCR0_CLIENTPD;
+	writel(reg, SMMU_GR0(smmu) + SMMU_CTRL_CR0);
+
+	return 0;
+}
+
+static struct smmu_hwdep_ops hisi_smmu_hwdep_ops = {
+	.alloc_context = hisi_smmu_alloc_context,
+	.tlb_sync_finished = hisi_smmu_tlb_sync_finished,
+	.tlb_inv_context = hisi_smmu_tlb_inv_context,
+	.context_fault = hisi_smmu_context_fault,
+	.global_fault = hisi_smmu_global_fault,
+	.init_context_bank = hisi_smmu_init_context_bank,
+	.destroy_context_bank = hisi_smmu_destroy_context_bank,
+	.domain_add_master = hisi_smmu_domain_add_master,
+	.domain_remove_master = hisi_smmu_domain_remove_master,
+	.device_reset = hisi_smmu_device_reset,
+	.device_cfg_probe = hisi_smmu_device_cfg_probe,
+	.device_remove = hisi_smmu_device_remove,
+};
+
+#ifdef CONFIG_OF
+static struct of_device_id hisi_smmu_of_match[] = {
+	{ .compatible = "hisilicon,smmu-v1", },
+	{ },
+};
+MODULE_DEVICE_TABLE(of, hisi_smmu_of_match);
+#endif
+
+static int arm_smmu_device_probe(struct platform_device *pdev)
+{
+	return arm_smmu_device_dt_probe(pdev, &hisi_smmu_hwdep_ops);
+}
+
+static struct platform_driver hisi_smmu_driver = {
+	.driver = {
+		.owner = THIS_MODULE,
+		.name = "hisi-smmu",
+		.of_match_table = of_match_ptr(hisi_smmu_of_match),
+	},
+	.probe = arm_smmu_device_probe,
+	.remove = arm_smmu_device_remove,
+};
+
+static int __init hisi_smmu_init(void)
+{
+	return platform_driver_register(&hisi_smmu_driver);
+}
+
+static void __exit hisi_smmu_exit(void)
+{
+	return platform_driver_unregister(&hisi_smmu_driver);
+}
+
+subsys_initcall(hisi_smmu_init);
+module_exit(hisi_smmu_exit);
+
+MODULE_DESCRIPTION("IOMMU API for Hisilicon architected SMMU implementations");
+MODULE_AUTHOR("Zhen Lei <thunder.leizhen@huawei.com>");
+MODULE_LICENSE("GPL v2");
-- 
1.8.0
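For anyone who wants to check the address-size decoding in hisi_smmu_id_size_to_bits() without hardware, here is a minimal userspace sketch. It copies the loop from the patch unchanged; the name id_size_to_bits and the use of uint32_t are stand-ins for illustration only, and what each field value means on real hardware depends on the ID2 register encoding, which this patch does not show.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Userspace copy of the decoding in hisi_smmu_id_size_to_bits().
 * The name and the uint32_t types are illustrative assumptions.
 *
 * The loop looks for the highest set bit within the low 8 bits of the
 * field. Bit position i maps to 32 + 4 * (i + 1) address bits. If no
 * bit is set, the loop ends with i == -1 and the result is 32.
 */
static uint32_t id_size_to_bits(uint32_t size)
{
	int i;

	for (i = 7; i >= 0; i--)
		if ((size >> i) & 0x1)
			break;

	return 32 + 4 * (i + 1);
}
```

With this logic a field value of 0 decodes to 32 bits, 1 to 36 bits, 2 or 3 to 40 bits, and 0x80 to 64 bits. Only the highest set bit matters, and bits above bit 7 are ignored.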
This patch documents the Hisilicon System MMU implementation: it adds the "hisilicon,smmu-v1" compatible string to the arm,smmu binding and notes the additional support in arm-smmu-base.c.
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
---
 Documentation/devicetree/bindings/iommu/arm,smmu.txt | 2 ++
 drivers/iommu/arm-smmu-base.c                        | 3 +++
 2 files changed, 5 insertions(+)

diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.txt b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
index f284b99..23035ce 100644
--- a/Documentation/devicetree/bindings/iommu/arm,smmu.txt
+++ b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
@@ -1,4 +1,5 @@
 * ARM System MMU Architecture Implementation
+* Hisilicon System MMU Architecture Implementation
 
 ARM SoCs may contain an implementation of the ARM System Memory
 Management Unit Architecture, which can be used to provide 1 or 2 stages
@@ -15,6 +16,7 @@ conditions.
                         "arm,smmu-v2"
                         "arm,mmu-400"
                         "arm,mmu-500"
+                        "hisilicon,smmu-v1"
 
                   depending on the particular implementation and/or the
                   version of the architecture implemented.
diff --git a/drivers/iommu/arm-smmu-base.c b/drivers/iommu/arm-smmu-base.c
index ca0e3db..d87122b 100644
--- a/drivers/iommu/arm-smmu-base.c
+++ b/drivers/iommu/arm-smmu-base.c
@@ -22,6 +22,9 @@
  *	- 4k and 64k pages, with contiguous pte hints.
  *	- Up to 42-bit addressing (dependent on VA_BITS)
  *	- Context fault reporting
+ *
+ * Additional supports:
+ *	- Hisilicon smmu-v1 implementation
  */
 
 #define pr_fmt(fmt) "arm-smmu: " fmt
-- 
1.8.0
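As a rough illustration of what the new compatible string is for, the sketch below shows a sentinel-terminated lookup that maps a compatible string to a driver name. This is a userspace simplification, not the kernel API: struct compat_entry and smmu_driver_for() are hypothetical names, the single combined table is an assumption made for brevity (in the kernel each driver carries its own of_device_id table and the OF core does the matching), and the "arm-smmu" label is only a placeholder for the standard driver.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's struct of_device_id. */
struct compat_entry {
	const char *compatible;	/* string from the device tree node */
	const char *driver;	/* illustrative driver label */
};

/* Combined table for illustration; terminated by an empty sentinel. */
static const struct compat_entry smmu_compat_table[] = {
	{ "arm,smmu-v2",       "arm-smmu"  },
	{ "arm,mmu-400",       "arm-smmu"  },
	{ "arm,mmu-500",       "arm-smmu"  },
	{ "hisilicon,smmu-v1", "hisi-smmu" },
	{ NULL, NULL },
};

/* Return the driver label for a compatible string, or NULL if unknown. */
static const char *smmu_driver_for(const char *compatible)
{
	const struct compat_entry *e;

	for (e = smmu_compat_table; e->compatible; e++)
		if (strcmp(e->compatible, compatible) == 0)
			return e->driver;

	return NULL;
}
```

A node with compatible = "hisilicon,smmu-v1" would resolve to "hisi-smmu" here, while an unlisted string returns NULL, which mirrors a device that no driver binds to.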
On 2014/7/9 11:00, leizhen wrote:
> On 2014/7/9 8:51, Li Zefan wrote:
>> Remove RFC from the subject?
>>
>> On 2014/7/8 17:38, Zhen Lei wrote:
>>> Changes in v5:
>>> - Put all registers(exclude translation table associated) definition into each smmu private file.
>> .
> OK, tomorrow I will send this patch to the community, and use "PATCH v1".

Just remove RFC and stick with v5.
On 2014/7/9 11:05, Li Zefan wrote:
> On 2014/7/9 11:00, leizhen wrote:
>> On 2014/7/9 8:51, Li Zefan wrote:
>>> Remove RFC from the subject?
>>>
>>> On 2014/7/8 17:38, Zhen Lei wrote:
>>>> Changes in v5:
>>>> - Put all registers(exclude translation table associated) definition into each smmu private file.
>>> .
>> OK, tomorrow I will send this patch to the community, and use "PATCH v1".
> Just remove RFC and stick with v5.
> .

OK. Internally this is v5, but externally it is still v3.
linaro-kernel@lists.linaro.org