From aa24886e379d2b641c5117e178b15ce1d5d366ba Mon Sep 17 00:00:00 2001 From: David Brownell Date: Fri, 10 Aug 2007 13:10:27 -0700 Subject: dma_free_coherent() needs irqs enabled (sigh) On at least ARM (and I'm told MIPS too) dma_free_coherent() has a newish call context requirement: unlike its dma_alloc_coherent() sibling, it may not be called with IRQs disabled. (This was new behavior on ARM as of late 2005, caused by ARM SMP updates.) This little surprise can be annoyingly driver-visible. Since it looks like that restriction won't be removed, this patch changes the definition of the API to include that requirement. Also, to help catch nonportable drivers, it updates the x86 and swiotlb versions to include the relevant warnings. (I already observed that it trips on the bus_reset_tasklet of the new firewire_ohci driver.) Signed-off-by: David Brownell Cc: David Miller Acked-by: Russell King Cc: Andi Kleen Signed-off-by: Andrew Morton Signed-off-by: Greg Kroah-Hartman --- Documentation/DMA-API.txt | 3 +++ 1 file changed, 3 insertions(+) (limited to 'Documentation') diff --git a/Documentation/DMA-API.txt b/Documentation/DMA-API.txt index cc7a8c39fb6f..b939ebb62871 100644 --- a/Documentation/DMA-API.txt +++ b/Documentation/DMA-API.txt @@ -68,6 +68,9 @@ size and dma_handle must all be the same as those passed into the consistent allocate. cpu_addr must be the virtual address returned by the consistent allocate. +Note that unlike their sibling allocation calls, these routines +may only be called with IRQs enabled. + Part Ib - Using small dma-coherent buffers ------------------------------------------ -- cgit v1.2.3 From 4904e23b6b37c748a35bb2bd14ea325d0592e671 Mon Sep 17 00:00:00 2001 From: Michael Ellerman Date: Thu, 20 Sep 2007 12:48:23 +1000 Subject: PCI: Remove no longer correct documentation regarding MSI vector assignment MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The MSI vector reservation system described in Documentation/MSI-HOWTO.txt was removed by Eric in 92db6d10bc1bc43330a4c540fa5b64c83d9d865f. Remove the references to it in the documentation. While we're here ยง 5.5.1 refers to x86 hardware requirements, so make that clear. Signed-off-by: Michael Ellerman Cc: Eric W. Biederman Signed-off-by: Greg Kroah-Hartman --- Documentation/MSI-HOWTO.txt | 69 +++------------------------------------------ 1 file changed, 4 insertions(+), 65 deletions(-) (limited to 'Documentation') diff --git a/Documentation/MSI-HOWTO.txt b/Documentation/MSI-HOWTO.txt index 0d8240774fca..a51f693c1541 100644 --- a/Documentation/MSI-HOWTO.txt +++ b/Documentation/MSI-HOWTO.txt @@ -241,68 +241,7 @@ address space of the MSI-X table/MSI-X PBA. Otherwise, the PCI subsystem will fail enabling MSI-X on its hardware device when it calls the function pci_enable_msix(). -5.3.2 Handling MSI-X allocation - -Determining the number of MSI-X vectors allocated to a function is -dependent on the number of MSI capable devices and MSI-X capable -devices populated in the system. The policy of allocating MSI-X -vectors to a function is defined as the following: - -#of MSI-X vectors allocated to a function = (x - y)/z where - -x = The number of available PCI vector resources by the time - the device driver calls pci_enable_msix(). The PCI vector - resources is the sum of the number of unassigned vectors - (new) and the number of released vectors when any MSI/MSI-X - device driver switches its hardware device back to a legacy - mode or is hot-removed. The number of unassigned vectors - may exclude some vectors reserved, as defined in parameter - NR_HP_RESERVED_VECTORS, for the case where the system is - capable of supporting hot-add/hot-remove operations. Users - may change the value defined in NR_HR_RESERVED_VECTORS to - meet their specific needs. - -y = The number of MSI capable devices populated in the system. - This policy ensures that each MSI capable device has its - vector reserved to avoid the case where some MSI-X capable - drivers may attempt to claim all available vector resources. - -z = The number of MSI-X capable devices populated in the system. - This policy ensures that maximum (x - y) is distributed - evenly among MSI-X capable devices. - -Note that the PCI subsystem scans y and z during a bus enumeration. -When the PCI subsystem completes configuring MSI/MSI-X capability -structure of a device as requested by its device driver, y/z is -decremented accordingly. - -5.3.3 Handling MSI-X shortages - -For the case where fewer MSI-X vectors are allocated to a function -than requested, the function pci_enable_msix() will return the -maximum number of MSI-X vectors available to the caller. A device -driver may re-send its request with fewer or equal vectors indicated -in the return. For example, if a device driver requests 5 vectors, but -the number of available vectors is 3 vectors, a value of 3 will be -returned as a result of pci_enable_msix() call. A function could be -designed for its driver to use only 3 MSI-X table entries as -different combinations as ABC--, A-B-C, A--CB, etc. Note that this -patch does not support multiple entries with the same vector. Such -attempt by a device driver to use 5 MSI-X table entries with 3 vectors -as ABBCC, AABCC, BCCBA, etc will result as a failure by the function -pci_enable_msix(). Below are the reasons why supporting multiple -entries with the same vector is an undesirable solution. - - - The PCI subsystem cannot determine the entry that - generated the message to mask/unmask MSI while handling - software driver ISR. Attempting to walk through all MSI-X - table entries (2048 max) to mask/unmask any match vector - is an undesirable solution. - - - Walking through all MSI-X table entries (2048 max) to handle - SMP affinity of any match vector is an undesirable solution. - -5.3.4 API pci_enable_msix +5.3.2 API pci_enable_msix int pci_enable_msix(struct pci_dev *dev, struct msix_entry *entries, int nvec) @@ -339,7 +278,7 @@ a failure. This failure may be a result of duplicate entries specified in second argument, or a result of no available vector, or a result of failing to initialize MSI-X table entries. -5.3.5 API pci_disable_msix +5.3.3 API pci_disable_msix void pci_disable_msix(struct pci_dev *dev) @@ -349,7 +288,7 @@ always call free_irq() on all MSI-X vectors it has done request_irq() on before calling this API. Failure to do so results in a BUG_ON() and a device will be left with MSI-X enabled and leaks its vectors. -5.3.6 MSI-X mode vs. legacy mode diagram +5.3.4 MSI-X mode vs. legacy mode diagram The below diagram shows the events which switch the interrupt mode on the MSI-X capable device function between MSI-X mode and @@ -407,7 +346,7 @@ between MSI mod MSI-X mode during a run-time. MSI/MSI-X support requires support from both system hardware and individual hardware device functions. -5.5.1 System hardware support +5.5.1 Required x86 hardware support Since the target of MSI address is the local APIC CPU, enabling MSI/MSI-X support in the Linux kernel is dependent on whether existing -- cgit v1.2.3 From 7f785763660e75c9eddaddea3d618696af4ae3a2 Mon Sep 17 00:00:00 2001 From: Randy Dunlap Date: Fri, 5 Oct 2007 13:17:58 -0700 Subject: pci: implement "pci=noaer" For cases in which CONFIG_PCIEAER=y (such as distro kernels), allow users to disable PCIE Advanced Error Reporting by using "pci=noaer" on the kernel command line. This can be used to work around hardware or (kernel) software problems. Signed-off-by: Randy Dunlap Signed-off-by: Greg Kroah-Hartman --- Documentation/kernel-parameters.txt | 4 ++++ drivers/pci/pci.c | 2 ++ drivers/pci/pci.h | 6 ++++++ drivers/pci/pcie/aer/aerdrv.c | 9 +++++++++ 4 files changed, 21 insertions(+) (limited to 'Documentation') diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index a57c1f216b21..3f0173f45019 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -68,6 +68,7 @@ parameter is applicable: PARIDE The ParIDE (parallel port IDE) subsystem is enabled. PARISC The PA-RISC architecture is enabled. PCI PCI bus support is enabled. + PCIE PCI Express support is enabled. PCMCIA The PCMCIA subsystem is enabled. PNP Plug & Play support is enabled. PPC PowerPC architecture is enabled. @@ -1270,6 +1271,9 @@ and is between 256 and 4096 characters. It is defined in the file Mechanism 1. conf2 [X86-32] Force use of PCI Configuration Mechanism 2. + noaer [PCIE] If the PCIEAER kernel config parameter is + enabled, this kernel boot option can be used to + disable the use of PCIE advanced error reporting. nommconf [X86-32,X86_64] Disable use of MMCONFIG for PCI Configuration nomsi [MSI] If the PCI_MSI kernel config parameter is diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index 19a64a36ecab..2dd5c282fabe 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -1586,6 +1586,8 @@ static int __devinit pci_setup(char *str) if (*str && (str = pcibios_setup(str)) && *str) { if (!strcmp(str, "nomsi")) { pci_no_msi(); + } else if (!strcmp(str, "noaer")) { + pci_no_aer(); } else if (!strncmp(str, "cbiosize=", 9)) { pci_cardbus_io_size = memparse(str + 9, &str); } else if (!strncmp(str, "cbmemsize=", 10)) { diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h index 4c36e80f6d26..5360d73d4941 100644 --- a/drivers/pci/pci.h +++ b/drivers/pci/pci.h @@ -52,6 +52,12 @@ void pci_restore_msi_state(struct pci_dev *dev); static inline void pci_restore_msi_state(struct pci_dev *dev) {} #endif +#ifdef CONFIG_PCIEAER +void pci_no_aer(void); +#else +static inline void pci_no_aer(void) { } +#endif + static inline int pci_no_d1d2(struct pci_dev *dev) { unsigned int parent_dstates = 0; diff --git a/drivers/pci/pcie/aer/aerdrv.c b/drivers/pci/pcie/aer/aerdrv.c index ad90a01b0dfc..7a62f7dd9009 100644 --- a/drivers/pci/pcie/aer/aerdrv.c +++ b/drivers/pci/pcie/aer/aerdrv.c @@ -81,6 +81,13 @@ static struct pcie_port_service_driver aerdriver = { .reset_link = aer_root_reset, }; +static int pcie_aer_disable; + +void pci_no_aer(void) +{ + pcie_aer_disable = 1; /* has priority over 'forceload' */ +} + /** * aer_irq - Root Port's ISR * @irq: IRQ assigned to Root Port @@ -327,6 +334,8 @@ static void aer_error_resume(struct pci_dev *dev) **/ static int __init aer_service_init(void) { + if (pcie_aer_disable) + return -ENXIO; return pcie_port_service_register(&aerdriver); } -- cgit v1.2.3 From 62f420f828249f686aaae949ac3439d1304a759a Mon Sep 17 00:00:00 2001 From: Gary Hade Date: Wed, 3 Oct 2007 15:56:51 -0700 Subject: PCI: use _CRS for PCI resource allocation Use _CRS for PCI resource allocation This patch resolves an issue where incorrect PCI memory and i/o ranges are being assigned to hotplugged PCI devices on some IBM systems. The resource mis-allocation not only makes the PCI device unuseable but often makes the entire system unuseable due to resulting machine checks. The hotplug capable PCI slots on the affected systems are not located under a standard P2P bridge but are instead located under PCI root bridges or subtractive decode P2P bridges. For example, the IBM x3850 contains 2 hotplug capable PCI-X slots and 4 hotplug capable PCIe slots with the PCI-X slots each located under a PCI root bridge and the PCIe slots each located under a subtractive decode P2P bridge. The current i386/x86_64 PCI resource allocation code does not use _CRS returned resource information. No other resource information source is available for slots that are not below a standard P2P bridge so incorrect ranges are being allocated from e820 hole causing the bad result. This patch causes the kernel to use _CRS returned resource info. It is roughly based on a change provided by Matthew Wilcox for the ia64 kernel in 2005. Due to possible buggy BIOS factor and possible yet to be discovered kernel issues the function is disabled by default and can be enabled with pci=use_crs. Signed-off-by: Gary Hade Signed-off-by: Greg Kroah-Hartman --- Documentation/kernel-parameters.txt | 2 + arch/x86/pci/acpi.c | 139 ++++++++++++++++++++++++++++++++++++ arch/x86/pci/common.c | 3 + arch/x86/pci/pci.h | 1 + 4 files changed, 145 insertions(+) (limited to 'Documentation') diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index 3f0173f45019..e9acd5540d29 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -1318,6 +1318,8 @@ and is between 256 and 4096 characters. It is defined in the file IRQ routing is enabled. noacpi [X86-32] Do not use ACPI for IRQ routing or for PCI scanning. + use_crs [X86-32] Use _CRS for PCI resource + allocation. routeirq Do IRQ routing for all PCI devices. This is normally done in pci_enable_device(), so this option is a temporary workaround diff --git a/arch/x86/pci/acpi.c b/arch/x86/pci/acpi.c index 1dd6f3fc077d..c6fd3a6afa42 100644 --- a/arch/x86/pci/acpi.c +++ b/arch/x86/pci/acpi.c @@ -45,6 +45,142 @@ static struct dmi_system_id acpi_pciprobe_dmi_table[] = { {} }; +struct pci_root_info { + char *name; + unsigned int res_num; + struct resource *res; + struct pci_bus *bus; + int busnum; +}; + +static acpi_status +resource_to_addr(struct acpi_resource *resource, + struct acpi_resource_address64 *addr) +{ + acpi_status status; + + status = acpi_resource_to_address64(resource, addr); + if (ACPI_SUCCESS(status) && + (addr->resource_type == ACPI_MEMORY_RANGE || + addr->resource_type == ACPI_IO_RANGE) && + addr->address_length > 0 && + addr->producer_consumer == ACPI_PRODUCER) { + return AE_OK; + } + return AE_ERROR; +} + +static acpi_status +count_resource(struct acpi_resource *acpi_res, void *data) +{ + struct pci_root_info *info = data; + struct acpi_resource_address64 addr; + acpi_status status; + + status = resource_to_addr(acpi_res, &addr); + if (ACPI_SUCCESS(status)) + info->res_num++; + return AE_OK; +} + +static acpi_status +setup_resource(struct acpi_resource *acpi_res, void *data) +{ + struct pci_root_info *info = data; + struct resource *res; + struct acpi_resource_address64 addr; + acpi_status status; + unsigned long flags; + struct resource *root; + + status = resource_to_addr(acpi_res, &addr); + if (!ACPI_SUCCESS(status)) + return AE_OK; + + if (addr.resource_type == ACPI_MEMORY_RANGE) { + root = &iomem_resource; + flags = IORESOURCE_MEM; + if (addr.info.mem.caching == ACPI_PREFETCHABLE_MEMORY) + flags |= IORESOURCE_PREFETCH; + } else if (addr.resource_type == ACPI_IO_RANGE) { + root = &ioport_resource; + flags = IORESOURCE_IO; + } else + return AE_OK; + + res = &info->res[info->res_num]; + res->name = info->name; + res->flags = flags; + res->start = addr.minimum + addr.translation_offset; + res->end = res->start + addr.address_length - 1; + res->child = NULL; + + if (insert_resource(root, res)) { + printk(KERN_ERR "PCI: Failed to allocate 0x%lx-0x%lx " + "from %s for %s\n", (unsigned long) res->start, + (unsigned long) res->end, root->name, info->name); + } else { + info->bus->resource[info->res_num] = res; + info->res_num++; + } + return AE_OK; +} + +static void +adjust_transparent_bridge_resources(struct pci_bus *bus) +{ + struct pci_dev *dev; + + list_for_each_entry(dev, &bus->devices, bus_list) { + int i; + u16 class = dev->class >> 8; + + if (class == PCI_CLASS_BRIDGE_PCI && dev->transparent) { + for(i = 3; i < PCI_BUS_NUM_RESOURCES; i++) + dev->subordinate->resource[i] = + dev->bus->resource[i - 3]; + } + } +} + +static void +get_current_resources(struct acpi_device *device, int busnum, + struct pci_bus *bus) +{ + struct pci_root_info info; + size_t size; + + info.bus = bus; + info.res_num = 0; + acpi_walk_resources(device->handle, METHOD_NAME__CRS, count_resource, + &info); + if (!info.res_num) + return; + + size = sizeof(*info.res) * info.res_num; + info.res = kmalloc(size, GFP_KERNEL); + if (!info.res) + goto res_alloc_fail; + + info.name = kmalloc(12, GFP_KERNEL); + if (!info.name) + goto name_alloc_fail; + sprintf(info.name, "PCI Bus #%02x", busnum); + + info.res_num = 0; + acpi_walk_resources(device->handle, METHOD_NAME__CRS, setup_resource, + &info); + if (info.res_num) + adjust_transparent_bridge_resources(bus); + + return; + +name_alloc_fail: + kfree(info.res); +res_alloc_fail: + return; +} + struct pci_bus * __devinit pci_acpi_scan_root(struct acpi_device *device, int domain, int busnum) { struct pci_bus *bus; @@ -89,6 +225,9 @@ struct pci_bus * __devinit pci_acpi_scan_root(struct acpi_device *device, int do } } #endif + + if (bus && (pci_probe & PCI_USE__CRS)) + get_current_resources(device, busnum, bus); return bus; } diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c index 7c4d75e3c5f3..7d6a9a5aa7cd 100644 --- a/arch/x86/pci/common.c +++ b/arch/x86/pci/common.c @@ -436,6 +436,9 @@ char * __devinit pcibios_setup(char *str) } else if (!strcmp(str, "assign-busses")) { pci_probe |= PCI_ASSIGN_ALL_BUSSES; return NULL; + } else if (!strcmp(str, "use_crs")) { + pci_probe |= PCI_USE__CRS; + return NULL; } else if (!strcmp(str, "routeirq")) { pci_routeirq = 1; return NULL; diff --git a/arch/x86/pci/pci.h b/arch/x86/pci/pci.h index 057f335fa3f6..ac56d3916c50 100644 --- a/arch/x86/pci/pci.h +++ b/arch/x86/pci/pci.h @@ -27,6 +27,7 @@ #define PCI_BIOS_IRQ_SCAN 0x2000 #define PCI_ASSIGN_ALL_BUSSES 0x4000 #define PCI_CAN_SKIP_ISA_ALIGN 0x8000 +#define PCI_USE__CRS 0x10000 extern unsigned int pci_probe; extern unsigned long pirq_table_addr; -- cgit v1.2.3 From 32a2eea795643929a43cbbba00d8c4a176b309bf Mon Sep 17 00:00:00 2001 From: Jeff Garzik Date: Thu, 11 Oct 2007 16:57:27 -0400 Subject: PCI: Add 'nodomains' boot option, and pci_domains_supported global * Introduce pci_domains_supported global, hardcoded to zero if !CONFIG_PCI_DOMAINS. * Introduce 'nodomains' boot option, which clears pci_domains_supported on platforms that enable it by default (x86, x86-64, and others when they are converted to use this). Signed-off-by: Jeff Garzik Cc: Andi Kleen Signed-off-by: Greg Kroah-Hartman --- Documentation/kernel-parameters.txt | 2 ++ drivers/pci/pci.c | 13 +++++++++++++ include/linux/pci.h | 7 +++++-- 3 files changed, 20 insertions(+), 2 deletions(-) (limited to 'Documentation') diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index e9acd5540d29..d006e8b66ffa 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -1274,6 +1274,8 @@ and is between 256 and 4096 characters. It is defined in the file noaer [PCIE] If the PCIEAER kernel config parameter is enabled, this kernel boot option can be used to disable the use of PCIE advanced error reporting. + nodomains [PCI] Disable support for multiple PCI + root domains (aka PCI segments, in ACPI-speak). nommconf [X86-32,X86_64] Disable use of MMCONFIG for PCI Configuration nomsi [MSI] If the PCI_MSI kernel config parameter is diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index 2dd5c282fabe..728b3c863d87 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -23,6 +23,10 @@ unsigned int pci_pm_d3_delay = 10; +#ifdef CONFIG_PCI_DOMAINS +int pci_domains_supported = 1; +#endif + #define DEFAULT_CARDBUS_IO_SIZE (256) #define DEFAULT_CARDBUS_MEM_SIZE (64*1024*1024) /* pci=cbmemsize=nnM,cbiosize=nn can override this */ @@ -1567,6 +1571,13 @@ int pci_select_bars(struct pci_dev *dev, unsigned long flags) return bars; } +static void __devinit pci_no_domains(void) +{ +#ifdef CONFIG_PCI_DOMAINS + pci_domains_supported = 0; +#endif +} + static int __devinit pci_init(void) { struct pci_dev *dev = NULL; @@ -1588,6 +1599,8 @@ static int __devinit pci_setup(char *str) pci_no_msi(); } else if (!strcmp(str, "noaer")) { pci_no_aer(); + } else if (!strcmp(str, "nodomains")) { + pci_no_domains(); } else if (!strncmp(str, "cbiosize=", 9)) { pci_cardbus_io_size = memparse(str + 9, &str); } else if (!strncmp(str, "cbmemsize=", 10)) { diff --git a/include/linux/pci.h b/include/linux/pci.h index 038a0dc7273a..768b93359f90 100644 --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -685,13 +685,16 @@ extern void pci_unblock_user_cfg_access(struct pci_dev *dev); * a PCI domain is defined to be a set of PCI busses which share * configuration space. */ -#ifndef CONFIG_PCI_DOMAINS +#ifdef CONFIG_PCI_DOMAINS +extern int pci_domains_supported; +#else +enum { pci_domains_supported = 0 }; static inline int pci_domain_nr(struct pci_bus *bus) { return 0; } static inline int pci_proc_domain(struct pci_bus *bus) { return 0; } -#endif +#endif /* CONFIG_PCI_DOMAINS */ #else /* CONFIG_PCI is not enabled */ -- cgit v1.2.3