Jiang Liu | 27 Apr 17:16 2012
Picon

[PATCH v2 00/19] Introduce a global lock to serialize all PCI hotplug

From: Jiang Liu <liuj97 <at> gmail.com>

There are multiple methods to trigger PCI hotplug requests/operations
concurrently, such as:
1. Sysfs interfaces exported by the PCI core subsystem
	/sys/devices/pcissss:bb/ssss:bb:dd.f/.../remove
	/sys/devices/pcissss:bb/ssss:bb:dd.f/.../rescan
	/sys/devices/pcissss:bb/ssss:bb:dd.f/.../pci_bus/ssss:bb/rescan
	/sys/bus/pci/rescan
2. Sysfs interfaces exported by the PCI hotplug subsystem
	/sys/bus/pci/slots/xx/power
3. PCI hotplug events triggered by PCI Hotplug Controllers
4. ACPI hotplug events for PCI host bridges
5. Driver binding/unbinding events
	binding/unbinding pci drivers with SR-IOV support

With current implementation, the PCI core subsystem doesn't support
concurrent hotplug operations yet. The existing pci_bus_sem lock only
protects several lists in struct pci_bus, such as children list,
devices list, but it doesn't protect the pci_bus or pci_dev structure
themselves.

Let's take pci_remove_bus_device() as an example, which are used by
PCI hotplug drivers to hot-remove PCI devices.  Currently all these
are free running without any protection, so it can't support reentrance.
pci_remove_bus_device()
    ->pci_stop_bus_device()
        ->pci_stop_bus_device()
            ->pci_stop_bus_devices()
        ->pci_stop_dev()
(Continue reading)

Jiang Liu | 27 Apr 17:16 2012
Picon

[PATCH v2 01/19] PCI: introduce pci_bus_get()/pci_bus_put() to hide PCI implementation details

From: Jiang Liu <jiang.liu <at> huawei.com>

Introduce pci_bus_get()/pci_bus_put() to hide PCI implementation details.

Signed-off-by: Jiang Liu <liuj97 <at> gmail.com>
---
 drivers/pci/bus.c   |   15 +++++++++++++++
 include/linux/pci.h |    2 ++
 2 files changed, 17 insertions(+), 0 deletions(-)

diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c
index 4ce5ef2..50f9c5d 100644
--- a/drivers/pci/bus.c
+++ b/drivers/pci/bus.c
 <at>  <at>  -35,6 +35,21  <at>  <at>  void pci_add_resource_offset(struct list_head *resources, struct resource *res,
 }
 EXPORT_SYMBOL(pci_add_resource_offset);

+struct pci_bus *pci_bus_get(struct pci_bus *bus)
+{
+	if (bus)
+		get_device(&bus->dev);
+	return bus;
+}
+EXPORT_SYMBOL(pci_bus_get);
+
+void pci_bus_put(struct pci_bus *bus)
+{
+	if (bus)
+		put_device(&bus->dev);
(Continue reading)

Jiang Liu | 27 Apr 17:16 2012
Picon

[PATCH v2 03/19] PCI: replace pci_remove_rescan_mutex with the PCI hotplug lock

From: Jiang Liu <jiang.liu <at> huawei.com>

Replace pci_remove_rescan_mutex with the PCI hotplug lock, which is used to
globally serialize all PCI hotplug operations.

Signed-off-by: Jiang Liu <liuj97 <at> gmail.com>
---
 drivers/pci/pci-sysfs.c |  100 +++++++++++++++++++++++++++--------------------
 drivers/pci/remove.c    |    1 +
 2 files changed, 58 insertions(+), 43 deletions(-)

diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index a55e248..ecf197d 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
 <at>  <at>  -283,7 +283,31  <at>  <at>  msi_bus_store(struct device *dev, struct device_attribute *attr,
 }

 #ifdef CONFIG_HOTPLUG
-static DEFINE_MUTEX(pci_remove_rescan_mutex);
+/*
+ * Schedule a device callback to avoid deadlock in case of
+ * 1) An attribute method tries to deregister the sysfs file itself
+ * 2) Another thread is trying to remove the sysfs file, which have
+ *    already held the PCI hotplug lock by invoking pci_hotplug_enter().
+ */
+static int schedule_hp_callback(struct device *dev,
+				const char *buf, size_t count,
+				void (*func)(struct device *))
+{
(Continue reading)

Jiang Liu | 27 Apr 17:16 2012
Picon

[PATCH v2 04/19] PCI: serialize hotplug operations triggered by PCI hotplug sysfs interfaces

From: Jiang Liu <jiang.liu <at> huawei.com>

Use PCI hotplug lock to globally serialize hotplug operations triggered by
PCI hotplug sysfs interfaces.

Signed-off-by: Jiang Liu <liuj97 <at> gmail.com>
---
 drivers/pci/hotplug/pci_hotplug_core.c |   18 ++++++++++++++++--
 1 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/hotplug/pci_hotplug_core.c b/drivers/pci/hotplug/pci_hotplug_core.c
index 1572665..9bbbe3e 100644
--- a/drivers/pci/hotplug/pci_hotplug_core.c
+++ b/drivers/pci/hotplug/pci_hotplug_core.c
 <at>  <at>  -39,6 +39,7  <at>  <at> 
 #include <linux/mutex.h>
 #include <linux/pci.h>
 #include <linux/pci_hotplug.h>
+#include <linux/delay.h>
 #include <asm/uaccess.h>
 #include "../pci.h"

 <at>  <at>  -121,6 +122,17  <at>  <at>  static ssize_t power_write_file(struct pci_slot *pci_slot, const char *buf,
 		retval = -ENODEV;
 		goto exit;
 	}
+
+	/* Avoid deadlock with pci_hp_deregister() */
+	while (!pci_hotplug_try_enter()) {
+		/* Check whether the slot has been deregistered. */
(Continue reading)

Greg KH | 2 May 07:06 2012

Re: [PATCH v2 04/19] PCI: serialize hotplug operations triggered by PCI hotplug sysfs interfaces

On Fri, Apr 27, 2012 at 11:16:45PM +0800, Jiang Liu wrote:
> +	/* Avoid deadlock with pci_hp_deregister() */
> +	while (!pci_hotplug_try_enter()) {
> +		/* Check whether the slot has been deregistered. */
> +		if (list_empty(&slot->slot_list)) {
> +			retval = -ENODEV;
> +			goto exit_put;
> +		}
> +		msleep(1);
> +	}

Oh my.

Wow.

{sigh}

ick.

My eyes hurt.

And your cpu load just went crazy.

You can now handle all of the nasty emails from sysadmins asking why
their systems look like their load is high for no good reason.

Not to mention all of the other issues here.

My statement about not inventing new lock types has now been proven
true.
(Continue reading)

Jiang Liu | 2 May 09:20 2012

Re: [PATCH v2 04/19] PCI: serialize hotplug operations triggered by PCI hotplug sysfs interfaces

On 2012-5-2 13:06, Greg KH wrote:
> On Fri, Apr 27, 2012 at 11:16:45PM +0800, Jiang Liu wrote:
>> +	/* Avoid deadlock with pci_hp_deregister() */
>> +	while (!pci_hotplug_try_enter()) {
>> +		/* Check whether the slot has been deregistered. */
>> +		if (list_empty(&slot->slot_list)) {
>> +			retval = -ENODEV;
>> +			goto exit_put;
>> +		}
>> +		msleep(1);
>> +	}
>
> Oh my.
>
> Wow.
>
> {sigh}
>
> ick.
>
> My eyes hurt.
>
> And your cpu load just went crazy.
>
> You can now handle all of the nasty emails from sysadmins asking why
> their systems look like their load is high for no good reason.
My bad, should use schedule_timeout_interruptible() instead of
msleep(1) here to avoid busy waiting.

>
(Continue reading)

Greg KH | 2 May 23:48 2012

Re: [PATCH v2 04/19] PCI: serialize hotplug operations triggered by PCI hotplug sysfs interfaces

On Wed, May 02, 2012 at 03:20:15PM +0800, Jiang Liu wrote:
> On 2012-5-2 13:06, Greg KH wrote:
> >On Fri, Apr 27, 2012 at 11:16:45PM +0800, Jiang Liu wrote:
> >>+	/* Avoid deadlock with pci_hp_deregister() */
> >>+	while (!pci_hotplug_try_enter()) {
> >>+		/* Check whether the slot has been deregistered. */
> >>+		if (list_empty(&slot->slot_list)) {
> >>+			retval = -ENODEV;
> >>+			goto exit_put;
> >>+		}
> >>+		msleep(1);
> >>+	}
> >
> >Oh my.
> >
> >Wow.
> >
> >{sigh}
> >
> >ick.
> >
> >My eyes hurt.
> >
> >And your cpu load just went crazy.
> >
> >You can now handle all of the nasty emails from sysadmins asking why
> >their systems look like their load is high for no good reason.
> My bad, should use schedule_timeout_interruptible() instead of
> msleep(1) here to avoid busy waiting.

(Continue reading)

Jiang Liu | 27 Apr 17:16 2012
Picon

[PATCH v2 02/19] PCI: introduce recursive rwsem to serialize PCI hotplug operations

From: Jiang Liu <jiang.liu <at> huawei.com>

There are multiple ways to trigger PCI hotplug requests concurrently,
such as:
1. Sysfs interfaces exported by the PCI core subsystem
2. Sysfs interfaces exported by the PCI hotplug subsystem
3. PCI hotplug events triggered by PCI Hotplug Controllers
4. ACPI hotplug events for PCI host bridges
5. Driver binding/unbinding events

The PCI core subsystem doesn't support concurrent hotplug operations yet,
so all PCI hotplug requests should be globally serialized. This patch
introduces several new interfaces to serialize PCI hotplug operations.

pci_hotplug_try_enter(): try to acquire write lock
pci_hotplug_enter(): acquire write lock
pci_hotplug_exit(): release write lock
pci_hotplug_disable(): acquire read lock
pci_hotplug_enable(): release read lock

Today we have reproduced the issue on a real platform by using
acpiphp driver. It's an IA64 platform running Suse 11SP1 (official
2.6.32.12 kernel). The test script is:

This issue could be reproduced on an IA64 platform with Suse 11SP1
(official 2.6.32.12 kernel) and acpiphp driver.
---------------------------------------------------------------------
#!/bin/bash

for ((i=0;i<100;i++))
(Continue reading)

Greg KH | 2 May 07:00 2012

Re: [PATCH v2 02/19] PCI: introduce recursive rwsem to serialize PCI hotplug operations

On Fri, Apr 27, 2012 at 11:16:43PM +0800, Jiang Liu wrote:
> From: Jiang Liu <jiang.liu <at> huawei.com>
> 
> There are multiple ways to trigger PCI hotplug requests concurrently,
> such as:
> 1. Sysfs interfaces exported by the PCI core subsystem
> 2. Sysfs interfaces exported by the PCI hotplug subsystem
> 3. PCI hotplug events triggered by PCI Hotplug Controllers
> 4. ACPI hotplug events for PCI host bridges
> 5. Driver binding/unbinding events
> 
> The PCI core subsystem doesn't support concurrent hotplug operations yet,
> so all PCI hotplug requests should be globally serialized. This patch
> introduces several new interfaces to serialize PCI hotplug operations.
> 
> pci_hotplug_try_enter(): try to acquire write lock

Ick, no, why would you ever want to do that?

> pci_hotplug_enter(): acquire write lock
> pci_hotplug_exit(): release write lock
> pci_hotplug_disable(): acquire read lock
> pci_hotplug_enable(): release read lock

No, the pci hotplug core should not need a rwsem, just a simple lock, if
that:
	pci_hotplug_lock()
	pci_hotplug_unlock()
and that's it.

(Continue reading)

Jiang Liu | 2 May 09:25 2012

Re: [PATCH v2 02/19] PCI: introduce recursive rwsem to serialize PCI hotplug operations

On 2012-5-2 13:00, Greg KH wrote:
> On Fri, Apr 27, 2012 at 11:16:43PM +0800, Jiang Liu wrote:
>> From: Jiang Liu<jiang.liu <at> huawei.com>
>>
>> There are multiple ways to trigger PCI hotplug requests concurrently,
>> such as:
>> 1. Sysfs interfaces exported by the PCI core subsystem
>> 2. Sysfs interfaces exported by the PCI hotplug subsystem
>> 3. PCI hotplug events triggered by PCI Hotplug Controllers
>> 4. ACPI hotplug events for PCI host bridges
>> 5. Driver binding/unbinding events
>>
>> The PCI core subsystem doesn't support concurrent hotplug operations yet,
>> so all PCI hotplug requests should be globally serialized. This patch
>> introduces several new interfaces to serialize PCI hotplug operations.
>>
>> pci_hotplug_try_enter(): try to acquire write lock
>
> Ick, no, why would you ever want to do that?
>
>> pci_hotplug_enter(): acquire write lock
>> pci_hotplug_exit(): release write lock
>> pci_hotplug_disable(): acquire read lock
>> pci_hotplug_enable(): release read lock
>
> No, the pci hotplug core should not need a rwsem, just a simple lock, if
> that:
> 	pci_hotplug_lock()
> 	pci_hotplug_unlock()
> and that's it.
(Continue reading)

Greg KH | 2 May 23:46 2012

Re: [PATCH v2 02/19] PCI: introduce recursive rwsem to serialize PCI hotplug operations

On Wed, May 02, 2012 at 03:25:21PM +0800, Jiang Liu wrote:
> On 2012-5-2 13:00, Greg KH wrote:
> >On Fri, Apr 27, 2012 at 11:16:43PM +0800, Jiang Liu wrote:
> >>From: Jiang Liu<jiang.liu <at> huawei.com>
> >>
> >>There are multiple ways to trigger PCI hotplug requests concurrently,
> >>such as:
> >>1. Sysfs interfaces exported by the PCI core subsystem
> >>2. Sysfs interfaces exported by the PCI hotplug subsystem
> >>3. PCI hotplug events triggered by PCI Hotplug Controllers
> >>4. ACPI hotplug events for PCI host bridges
> >>5. Driver binding/unbinding events
> >>
> >>The PCI core subsystem doesn't support concurrent hotplug operations yet,
> >>so all PCI hotplug requests should be globally serialized. This patch
> >>introduces several new interfaces to serialize PCI hotplug operations.
> >>
> >>pci_hotplug_try_enter(): try to acquire write lock
> >
> >Ick, no, why would you ever want to do that?
> >
> >>pci_hotplug_enter(): acquire write lock
> >>pci_hotplug_exit(): release write lock
> >>pci_hotplug_disable(): acquire read lock
> >>pci_hotplug_enable(): release read lock
> >
> >No, the pci hotplug core should not need a rwsem, just a simple lock, if
> >that:
> >	pci_hotplug_lock()
> >	pci_hotplug_unlock()
(Continue reading)

Jiang Liu | 27 Apr 17:16 2012
Picon

[PATCH v2 05/19] PCI: correctly flush workqueue when destroy pcie hotplug controller

From: Jiang Liu <jiang.liu <at> huawei.com>

When destroying a PCIe hotplug controller, all work items associated with
that controller should be flushed.  Function pcie_cleanup_slot() calls
cancel_delayed_work() and flush_workqueue() to achieve that.
Function flush_workqueue() will flush all work items already submitted,
but new work items submitted by those already submitted work items may
still be in live state when returning from flush_workqueue().

For the extreme case, pciehp driver may expierence following calling path:
1) pcie_isr() -> pciehp_handle_xxx() -> queue_interrupt_event()->queue_work()
2) interrupt_event_handler() -> handle_button_press_event() ->
   queue_delayed_work()
3) pciehp_queue_pushbutton_work() -> queue_work()

So enhance pcie_cleanup_slot() to correctly flush workqueue when destroying
PCIe hotplug controller.

Signed-off-by: Jiang Liu <liuj97 <at> gmail.com>
---
 drivers/pci/hotplug/pciehp_hpc.c |   14 +++++++++++++-
 1 files changed, 13 insertions(+), 1 deletions(-)

diff --git a/drivers/pci/hotplug/pciehp_hpc.c b/drivers/pci/hotplug/pciehp_hpc.c
index a960fae..98b775f 100644
--- a/drivers/pci/hotplug/pciehp_hpc.c
+++ b/drivers/pci/hotplug/pciehp_hpc.c
 <at>  <at>  -889,8 +889,20  <at>  <at>  static int pcie_init_slot(struct controller *ctrl)
 static void pcie_cleanup_slot(struct controller *ctrl)
 {
(Continue reading)

Greg KH | 2 May 07:08 2012

Re: [PATCH v2 05/19] PCI: correctly flush workqueue when destroy pcie hotplug controller

On Fri, Apr 27, 2012 at 11:16:46PM +0800, Jiang Liu wrote:
> From: Jiang Liu <jiang.liu <at> huawei.com>
> 
> When destroying a PCIe hotplug controller, all work items associated with
> that controller should be flushed.  Function pcie_cleanup_slot() calls
> cancel_delayed_work() and flush_workqueue() to achieve that.
> Function flush_workqueue() will flush all work items already submitted,
> but new work items submitted by those already submitted work items may
> still be in live state when returning from flush_workqueue().
> 
> For the extreme case, pciehp driver may expierence following calling path:
> 1) pcie_isr() -> pciehp_handle_xxx() -> queue_interrupt_event()->queue_work()
> 2) interrupt_event_handler() -> handle_button_press_event() ->
>    queue_delayed_work()
> 3) pciehp_queue_pushbutton_work() -> queue_work()
> 
> So enhance pcie_cleanup_slot() to correctly flush workqueue when destroying
> PCIe hotplug controller.
> 
> Signed-off-by: Jiang Liu <liuj97 <at> gmail.com>
> ---
>  drivers/pci/hotplug/pciehp_hpc.c |   14 +++++++++++++-
>  1 files changed, 13 insertions(+), 1 deletions(-)
> 
> diff --git a/drivers/pci/hotplug/pciehp_hpc.c b/drivers/pci/hotplug/pciehp_hpc.c
> index a960fae..98b775f 100644
> --- a/drivers/pci/hotplug/pciehp_hpc.c
> +++ b/drivers/pci/hotplug/pciehp_hpc.c
>  <at>  <at>  -889,8 +889,20  <at>  <at>  static int pcie_init_slot(struct controller *ctrl)
>  static void pcie_cleanup_slot(struct controller *ctrl)
(Continue reading)

Jiang Liu | 27 Apr 17:16 2012
Picon

[PATCH v2 06/19] PCI: prepare for serializing hotplug operations triggered by pciehp driver

From: Jiang Liu <jiang.liu <at> huawei.com>

Split pciehp_wq into two workqueues to avoid possible deadlock issues
when using PCI hotplug lock to serialize hotplug operations triggered
by pciehp driver.

Signed-off-by: Jiang Liu <liuj97 <at> gmail.com>
---
 drivers/pci/hotplug/pciehp.h      |    3 ++-
 drivers/pci/hotplug/pciehp_core.c |   18 +++++++++++++-----
 drivers/pci/hotplug/pciehp_ctrl.c |    8 ++++----
 drivers/pci/hotplug/pciehp_hpc.c  |   11 ++++++-----
 4 files changed, 25 insertions(+), 15 deletions(-)

diff --git a/drivers/pci/hotplug/pciehp.h b/drivers/pci/hotplug/pciehp.h
index 4b7cce1..c8a1a27 100644
--- a/drivers/pci/hotplug/pciehp.h
+++ b/drivers/pci/hotplug/pciehp.h
 <at>  <at>  -44,7 +44,8  <at>  <at>  extern bool pciehp_poll_mode;
 extern int pciehp_poll_time;
 extern bool pciehp_debug;
 extern bool pciehp_force;
-extern struct workqueue_struct *pciehp_wq;
+extern struct workqueue_struct *pciehp_wq_event;
+extern struct workqueue_struct *pciehp_wq_power;

 #define dbg(format, arg...)						\
 do {									\
diff --git a/drivers/pci/hotplug/pciehp_core.c b/drivers/pci/hotplug/pciehp_core.c
index 365c6b9..4ceefe3 100644
(Continue reading)

Greg KH | 2 May 07:10 2012

Re: [PATCH v2 06/19] PCI: prepare for serializing hotplug operations triggered by pciehp driver

On Fri, Apr 27, 2012 at 11:16:47PM +0800, Jiang Liu wrote:
> From: Jiang Liu <jiang.liu <at> huawei.com>
> 
> Split pciehp_wq into two workqueues to avoid possible deadlock issues
> when using PCI hotplug lock to serialize hotplug operations triggered
> by pciehp driver.

Why two workqueues?  What is deadlocking?  And why name one "power" and
one "event"?  What do they now do differently?  How are you serializing
events across the workqueues as they can be doing the same thing at the
same time to the same hardware now, right?

What am I missing?

Ick, I said I wasn't going to read anymore, I'm really stopping now.
Sorry, it's like watching a train-wreck, you just can't turn away...

greg k-h
Jiang Liu | 27 Apr 17:16 2012
Picon

[PATCH v2 07/19] PCI: serialize hotplug operaitons triggered by the pciehp driver

From: Jiang Liu <jiang.liu <at> huawei.com>

Use PCI hotplug lock to serialize hotplug operations triggered by the
pciehp driver. It solves following crash issues.

test scripts:
[root]cat offline.sh
#!/bin/bash

for((i=0;i<100000;i++))
do
	`echo 0 > /sys/bus/pci/slots/16/power`
	`echo 1 > /sys/bus/pci/slots/16/power`
done

[root]cat remove.sh
#!/bin/bash

for((i=0;i<100000;i++))
do
	`echo 1 > /sys/bus/pci/devices/0000:0f:00.0/remove`
	`echo 1 > /sys/bus/pci/rescan`
done

--------------------------------------------
CPU 11
Modules linked in: pciehp pci_hotplug ipv6 cpufreq_conservative cpufreq_userspac

Pid: 8675, comm: offline.sh Not tainted 3.4.0-rc3-yijing+ #7 Huawei Technologies
CPU 11
(Continue reading)

Jiang Liu | 27 Apr 17:16 2012
Picon

[PATCH v2 08/19] PCI: fix two race windows when probing/removing SHPC controller

From: Jiang Liu <jiang.liu <at> huawei.com>

With current shpchp implementation, interrupt is enabled before corresponding
slot data structures are created. If interrupt happens immediately after
enabling the hotplug interrupt, shpchp_find_slot() may return NULL, thus
causes NULL pointer dereference. So create slot data structures before
enabling interrupt.

And cleanup_slots() is called to destroy slot data structures before disabling
interrupt, which may cause invalid memory access in shpc_isr().  So call
cleanup_slots() after disabling interrupt/removing the poll timer.

Signed-off-by: Jiang Liu <liuj97 <at> gmail.com>
---
 drivers/pci/hotplug/shpchp.h      |    1 +
 drivers/pci/hotplug/shpchp_core.c |    6 +++---
 drivers/pci/hotplug/shpchp_hpc.c  |   36 +++++++++++++++++++++---------------
 3 files changed, 25 insertions(+), 18 deletions(-)

diff --git a/drivers/pci/hotplug/shpchp.h b/drivers/pci/hotplug/shpchp.h
index ca64932..6691dc4 100644
--- a/drivers/pci/hotplug/shpchp.h
+++ b/drivers/pci/hotplug/shpchp.h
 <at>  <at>  -182,6 +182,7  <at>  <at>  extern int shpchp_unconfigure_device(struct slot *p_slot);
 extern void cleanup_slots(struct controller *ctrl);
 extern void shpchp_queue_pushbutton_work(struct work_struct *work);
 extern int shpc_init( struct controller *ctrl, struct pci_dev *pdev);
+extern void shpc_enable_intr(struct controller *ctrl);

 static inline const char *slot_name(struct slot *slot)
(Continue reading)

Jiang Liu | 27 Apr 17:16 2012
Picon

[PATCH v2 11/19] PCI: release IO resource in error handling path in cpcihp_generic_init()

From: Jiang Liu <jiang.liu <at> huawei.com>

Release IO resource in error handling path in function cpcihp_generic_init().

Signed-off-by: Jiang Liu <liuj97 <at> gmail.com>
---
 drivers/pci/hotplug/cpcihp_generic.c |   25 ++++++++++++++++---------
 1 files changed, 16 insertions(+), 9 deletions(-)

diff --git a/drivers/pci/hotplug/cpcihp_generic.c b/drivers/pci/hotplug/cpcihp_generic.c
index 81af764..518f387 100644
--- a/drivers/pci/hotplug/cpcihp_generic.c
+++ b/drivers/pci/hotplug/cpcihp_generic.c
 <at>  <at>  -157,16 +157,19  <at>  <at>  static int __init cpcihp_generic_init(void)
 	bus = pci_find_bus(0, bridge_busnr);
 	if (!bus) {
 		err("Invalid bus number %d", bridge_busnr);
-		return -EINVAL;
-	}
-	dev = pci_get_slot(bus, PCI_DEVFN(bridge_slot, 0));
-	if(!dev || dev->hdr_type != PCI_HEADER_TYPE_BRIDGE) {
-		err("Invalid bridge device %s", bridge);
+	} else {
+		dev = pci_get_slot(bus, PCI_DEVFN(bridge_slot, 0));
+		if (!dev || dev->hdr_type != PCI_HEADER_TYPE_BRIDGE) {
+			err("Invalid bridge device %s", bridge);
+			bus = NULL;
+		} else
+			bus = dev->subordinate;
 		pci_dev_put(dev);
(Continue reading)

Jiang Liu | 27 Apr 17:16 2012
Picon

[PATCH v2 09/19] PCI: correctly flush workqueues and timer when destroy SHPC controller

From: Jiang Liu <jiang.liu <at> huawei.com>

del_timer() only deactivates a timer but doesn't wait for the handler
to finish, so use del_timer_sync() to deactivate a timer and wait for
the handler to finish in hpc_release_ctrl().

This patch also tune the workqueue flush logic to correctly flush all
work items.

Signed-off-by: Jiang Liu <liuj97 <at> gmail.com>
---
 drivers/pci/hotplug/shpchp_core.c |    2 +-
 drivers/pci/hotplug/shpchp_hpc.c  |    2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/hotplug/shpchp_core.c b/drivers/pci/hotplug/shpchp_core.c
index aaa66be..19762b3 100644
--- a/drivers/pci/hotplug/shpchp_core.c
+++ b/drivers/pci/hotplug/shpchp_core.c
 <at>  <at>  -173,8 +173,8  <at>  <at>  void cleanup_slots(struct controller *ctrl)
 	list_for_each_safe(tmp, next, &ctrl->slot_list) {
 		slot = list_entry(tmp, struct slot, slot_list);
 		list_del(&slot->slot_list);
-		cancel_delayed_work(&slot->work);
 		flush_workqueue(shpchp_wq);
+		cancel_delayed_work_sync(&slot->work);
 		flush_workqueue(shpchp_ordered_wq);
 		pci_hp_deregister(slot->hotplug_slot);
 	}
diff --git a/drivers/pci/hotplug/shpchp_hpc.c b/drivers/pci/hotplug/shpchp_hpc.c
(Continue reading)

Jiang Liu | 27 Apr 17:16 2012
Picon

[PATCH v2 10/19] PCI: serialize hotplug operaitons triggered by the shpchp driver

From: Jiang Liu <jiang.liu <at> huawei.com>

Use PCI hotplug lock to serialize hotplug operations triggered by the
shpchp driver.

Signed-off-by: Jiang Liu <liuj97 <at> gmail.com>
---
 drivers/pci/hotplug/shpchp.h      |    2 ++
 drivers/pci/hotplug/shpchp_core.c |    3 ++-
 drivers/pci/hotplug/shpchp_ctrl.c |   32 ++++++++++++++++++++++++++++++++
 3 files changed, 36 insertions(+), 1 deletions(-)

diff --git a/drivers/pci/hotplug/shpchp.h b/drivers/pci/hotplug/shpchp.h
index 6691dc4..07f3b2d 100644
--- a/drivers/pci/hotplug/shpchp.h
+++ b/drivers/pci/hotplug/shpchp.h
 <at>  <at>  -157,6 +157,7  <at>  <at>  struct controller {
 #define BLINKINGOFF_STATE		2
 #define POWERON_STATE			3
 #define POWEROFF_STATE			4
+#define SHUTDOWN_STATE			5

 /* Error messages */
 #define INTERLOCK_OPEN			0x00000002
 <at>  <at>  -179,6 +180,7  <at>  <at>  extern u8 shpchp_handle_presence_change(u8 hp_slot, struct controller *ctrl);
 extern u8 shpchp_handle_power_fault(u8 hp_slot, struct controller *ctrl);
 extern int shpchp_configure_device(struct slot *p_slot);
 extern int shpchp_unconfigure_device(struct slot *p_slot);
+extern void shpchp_shutdown_slot(struct slot *slot);
 extern void cleanup_slots(struct controller *ctrl);
(Continue reading)

Jiang Liu | 27 Apr 17:16 2012
Picon

[PATCH v2 12/19] PCI: clean up all resources in error handling path in zt5550_hc_init_one()

From: Jiang Liu <jiang.liu <at> huawei.com>

Clean up all resources in error handling path in function zt5550_hc_init_one().

Signed-off-by: Jiang Liu <liuj97 <at> gmail.com>
---
 drivers/pci/hotplug/cpcihp_zt5550.c |    4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/drivers/pci/hotplug/cpcihp_zt5550.c b/drivers/pci/hotplug/cpcihp_zt5550.c
index 6bf8d2a..8a6f968 100644
--- a/drivers/pci/hotplug/cpcihp_zt5550.c
+++ b/drivers/pci/hotplug/cpcihp_zt5550.c
 <at>  <at>  -257,11 +257,13  <at>  <at>  static int zt5550_hc_init_one (struct pci_dev *pdev, const struct pci_device_id
 	if(status != 0) {
 		err("could not started cPCI hotplug system");
 		cpci_hp_unregister_bus(bus0);
-		goto init_register_error;
+		goto init_start_error;
 	}
 	dbg("started cpci hp system");

 	return 0;
+init_start_error:
+	cpci_hp_unregister_bus(bus0);
 init_register_error:
 	cpci_hp_unregister_controller(&zt5550_hpc);
 init_hc_error:
--

-- 
1.7.5.4
(Continue reading)

Jiang Liu | 27 Apr 17:16 2012
Picon

[PATCH v2 14/19] PCI: fix race windows when shutting down cpcihp controller

From: Jiang Liu <jiang.liu <at> huawei.com>

When cpci_hp_stop() is called to disabled cpcihp controller, it will disable
interrupt for that controller. But there's small window for event_thread()
to reenable the interrupt again. So stop the worker thread before disabling
the interrupt.

If check_slots() returns error, ther working thread (cpci_thread) will exit.
Later when cpci_stop_thread() or cpci_hp_intr() tries to access cpci_thread,
it may have already been destroyed. So hold a reference count to cpci_thread
to avoid invalid memory access.

Signed-off-by: Jiang Liu <liuj97 <at> gmail.com>
---
 drivers/pci/hotplug/cpci_hotplug_core.c |   27 +++++++++++++++++----------
 1 files changed, 17 insertions(+), 10 deletions(-)

diff --git a/drivers/pci/hotplug/cpci_hotplug_core.c b/drivers/pci/hotplug/cpci_hotplug_core.c
index 7898023..68e43c7 100644
--- a/drivers/pci/hotplug/cpci_hotplug_core.c
+++ b/drivers/pci/hotplug/cpci_hotplug_core.c
 <at>  <at>  -60,7 +60,6  <at>  <at>  static atomic_t extracting;
 int cpci_debug;
 static struct cpci_hp_controller *controller;
 static struct task_struct *cpci_thread;
-static int thread_finished;

 static int enable_slot(struct hotplug_slot *slot);
 static int disable_slot(struct hotplug_slot *slot);
 <at>  <at>  -341,7 +340,8  <at>  <at>  cpci_hp_intr(int irq, void *data)
(Continue reading)

Jiang Liu | 27 Apr 17:16 2012
Picon

[PATCH v2 13/19] PCI: trivial code clean up in cpci_hotplug_core.c

From: Jiang Liu <jiang.liu <at> huawei.com>

1) get rid of redundant lock operations in cpcihp_core.
2) return suitable error code instead of -1.

Signed-off-by: Jiang Liu <liuj97 <at> gmail.com>
---
 drivers/pci/hotplug/cpci_hotplug_core.c |   16 +++-------------
 1 files changed, 3 insertions(+), 13 deletions(-)

diff --git a/drivers/pci/hotplug/cpci_hotplug_core.c b/drivers/pci/hotplug/cpci_hotplug_core.c
index 3fadf2f..7898023 100644
--- a/drivers/pci/hotplug/cpci_hotplug_core.c
+++ b/drivers/pci/hotplug/cpci_hotplug_core.c
 <at>  <at>  -304,7 +304,7  <at>  <at>  cpci_hp_unregister_bus(struct pci_bus *bus)
 	down_write(&list_rwsem);
 	if (!slots) {
 		up_write(&list_rwsem);
-		return -1;
+		return -ENODEV;
 	}
 	list_for_each_entry_safe(slot, tmp, &slot_list, slot_list) {
 		if (slot->bus == bus) {
 <at>  <at>  -357,11 +357,6  <at>  <at>  init_slots(int clear_ins)
 	struct pci_dev* dev;

 	dbg("%s - enter", __func__);
-	down_read(&list_rwsem);
-	if (!slots) {
-		up_read(&list_rwsem);
(Continue reading)

Jiang Liu | 27 Apr 17:16 2012
Picon

[PATCH v2 15/19] PCI: hold a reference count to the PCI bus used by cpcihp drivers

From: Jiang Liu <jiang.liu <at> huawei.com>

The PCI bus used by cpcihp drivers may be removed at runtime through
/sys/devices/pcissss:bb/ssss\:bb\:dd.f/.../remove interface, so hold
a reference count to the PCI bus to avoid invalid memory access.

Signed-off-by: Jiang Liu <liuj97 <at> gmail.com>
---
 drivers/pci/hotplug/cpcihp_generic.c |    7 +++++--
 drivers/pci/hotplug/cpcihp_zt5550.c  |    7 +++++--
 2 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/drivers/pci/hotplug/cpcihp_generic.c b/drivers/pci/hotplug/cpcihp_generic.c
index 518f387..f002be5 100644
--- a/drivers/pci/hotplug/cpcihp_generic.c
+++ b/drivers/pci/hotplug/cpcihp_generic.c
 <at>  <at>  -163,7 +163,7  <at>  <at>  static int __init cpcihp_generic_init(void)
 			err("Invalid bridge device %s", bridge);
 			bus = NULL;
 		} else
-			bus = dev->subordinate;
+			bus = pci_bus_get(dev->subordinate);
 		pci_dev_put(dev);
 	}
 	if (!bus) {
 <at>  <at>  -179,7 +179,7  <at>  <at>  static int __init cpcihp_generic_init(void)
 	if(status != 0) {
 		err("Could not register cPCI hotplug controller");
 		status = -ENODEV;
-		goto init_find_bus_error;
(Continue reading)

Jiang Liu | 27 Apr 17:16 2012
Picon

[PATCH v2 16/19] PCI: serialize PCI hotplug operations triggered by cpcihp drivers

From: Jiang Liu <jiang.liu <at> huawei.com>

Use PCI hotplug lock to globally serialize hotplug operations triggered
by cpcihp_xxx drivers.

Signed-off-by: Jiang Liu <liuj97 <at> gmail.com>
---
 drivers/pci/hotplug/cpci_hotplug_core.c |   10 ++++++++++
 drivers/pci/hotplug/cpcihp_generic.c    |    2 ++
 drivers/pci/hotplug/cpcihp_zt5550.c     |   12 ++++++++----
 3 files changed, 20 insertions(+), 4 deletions(-)

diff --git a/drivers/pci/hotplug/cpci_hotplug_core.c b/drivers/pci/hotplug/cpci_hotplug_core.c
index 68e43c7..bbac305 100644
--- a/drivers/pci/hotplug/cpci_hotplug_core.c
+++ b/drivers/pci/hotplug/cpci_hotplug_core.c
 <at>  <at>  -502,6 +502,10  <at>  <at>  event_thread(void *data)
 		if (kthread_should_stop())
 			break;
 		do {
+			if (!pci_hotplug_try_enter()) {
+				msleep(1);
+				continue;
+			}
 			rc = check_slots();
 			if (rc > 0) {
 				/* Give userspace a chance to handle extraction */
 <at>  <at>  -510,6 +514,7  <at>  <at>  event_thread(void *data)
 				dbg("%s - error checking slots", __func__);
 				goto out;
(Continue reading)

Jiang Liu | 27 Apr 17:17 2012
Picon

[PATCH v2 19/19] PCI: hide sys interface 'remove' and 'rescan' for SR-IOV virtual devices

From: Jiang Liu <liuj97 <at> gmail.com>

From: Jiang Liu <liuj97 <at> gmail.com>

All SR-IOV virtual PCI devices should be managed by corresponding physical
device drivers. And the PCI core shouldn't create or destroy virtual PCI
devices directly without cordination with physical device drivers.
Otherwise it may cause system crashes like below.  So hide the remove and
rescan sys interfaces for SR-IOV virtual PCI devices.

Running following two scripts may trigger system dump on a system with
Intel 82576 NIC.

[root <at> localhost tests]# cat mod.sh
#!/bin/bash
while true; do
        modprobe igb max_vfs=2
        sleep 0.01
        rmmod igb
done
[root <at> localhost tests]# cat remove_virt.sh
#!/bin/bash
while true; do
        echo 1 > /sys/devices/pci0000:40/0000:40:03.0/0000:41:00.0/0000:42:02.0/0000:44:10.0/remove
        echo 1 > /sys/devices/pci0000:40/0000:40:03.0/0000:41:00.0/0000:42:02.0/0000:44:10.1/remove
        echo 1 > /sys/devices/pci0000:40/0000:40:03.0/0000:41:00.0/0000:42:02.0/0000:44:10.2/remove
        echo 1 > /sys/devices/pci0000:40/0000:40:03.0/0000:41:00.0/0000:42:02.0/0000:43:10.3/rescan
        sleep 0.01
done

(Continue reading)

Jiang Liu | 27 Apr 17:16 2012
Picon

[PATCH v2 18/19] PCI, sysfs: Use device_type and attr_groups with pci dev

From: Yinghai Lu <yinghai <at> kernel.org>

From: Yinghai Lu <yinghai <at> kernel.org>

We want to create rescan in sys only for pci bridge instead of all pci dev.

We could use attribute_groups/is_visible method to do that.

Now pci dev does not use device type yet. So add pci_dev_type to take
attr_groups with it.

Add pci_dev_bridge_attrs_are_visible() to control attr_bridge_group only create
attr for bridge.

This is the framework related change, later could attr to bridge_attr_group,
to make those attr only show up on pci bridge device.

Also We could add more attr groups with is_visible to reduce messness in
	pci_create_sysfs_dev_files. ( at least for boot_vga one )

Signed-off-by: Yinghai Lu <yinghai <at> kernel.org>
Signed-off-by: Jiang Liu <liuj97 <at> gmail.com>
---
 drivers/pci/pci-sysfs.c |   30 ++++++++++++++++++++++++++++++
 drivers/pci/pci.h       |    1 +
 drivers/pci/probe.c     |    1 +
 3 files changed, 32 insertions(+), 0 deletions(-)

diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index ecf197d..bc3c422 100644
(Continue reading)

Jiang Liu | 27 Apr 17:16 2012
Picon

[PATCH v2 17/19] PCI: serialize PCI hotplug operations triggered by fakephp drivers

From: Jiang Liu <jiang.liu <at> huawei.com>

Use PCI hotplug lock to globally serialize hotplug operations triggered
by fakephp driver. This patch solves following crash.

[ 1426.145264] IP: [<ffffffff812f811b>] __pci_remove_bus_device+0x4b/0xc0
[ 1426.145264] PGD 30463067 PUD 38f9e067 PMD 0
[ 1426.145264] Oops: 0002 [#1] SMP
[ 1426.145264] CPU 0
[ 1426.145264] Modules linked in: fakephp shpchp r8169 [last unloaded: fakephp]
[ 1426.145264]
[ 1426.145264] Pid: 2086, comm: kworker/u:0 Tainted: G        W    3.4.0-rc2+ #19 To Be Filled By O.E.M. To Be Filled .
[ 1426.145264] RIP: 0010:[<ffffffff812f811b>]  [<ffffffff812f811b>] __pci_remove_bus_device+0x4b/0xc0
[ 1426.145264] RSP: 0018:ffff88002e851d10  EFLAGS: 00010282
[ 1426.145264] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000001880
[ 1426.145264] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff81c4fec0
[ 1426.145264] RBP: ffff88002e851d20 R08: 0000000000000000 R09: 0000000000000000
[ 1426.145264] R10: 00000000000003c7 R11: 0001f630d1b3ac30 R12: ffff880030db3800
[ 1426.145264] R13: ffff880030443400 R14: ffffffff81fa8840 R15: ffffffff811a5220
[ 1426.145264] FS:  0000000000000000(0000) GS:ffff88003d600000(0000) knlGS:0000000000000000
[ 1426.145264] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 1426.145264] CR2: 0000000000000008 CR3: 0000000030ff8000 CR4: 00000000000007f0
[ 1426.145264] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1426.145264] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 1426.145264] Process kworker/u:0 (pid: 2086, threadinfo ffff88002e850000, task ffff880037b38000)
[ 1426.145264] Stack:
[ 1426.145264]  ffff880030db3800 ffff88002aa1c530 ffff88002e851d40 ffffffff812f81a9
[ 1426.145264]  0000000000000000 ffff88002a81b900 ffff88002e851d60 ffffffffa17ec0a4
[ 1426.145264]  ffffffff81fa8840 ffff88002aa1c530 ffff88002e851d80 ffffffff811a5233
[ 1426.145264] Call Trace:
(Continue reading)


Gmane