summaryrefslogtreecommitdiff
path: root/drivers
diff options
context:
space:
mode:
authorLinus Torvalds <torvalds@linux-foundation.org>2021-11-02 16:04:28 -0700
committerLinus Torvalds <torvalds@linux-foundation.org>2021-11-02 16:04:28 -0700
commit833db72142b93a89211c1e43ca0a1e2e16457756 (patch)
treeac074094a3ff4054478273023f0f8b05dd6318fa /drivers
parentc0d6586afa3546a3d148cf4b9d9a407b4f79d0bb (diff)
parentbf56b90797c4a03c94b702c50e62296edea9fad9 (diff)
downloadlwn-833db72142b93a89211c1e43ca0a1e2e16457756.tar.gz
lwn-833db72142b93a89211c1e43ca0a1e2e16457756.zip
Merge tag 'pm-5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael Wysocki: "These make the power management of PCI devices with ACPI companions more straightforwad, add support for inefficient operating performance points to the Energy model and make cpufreq handle them as appropriate, rearrange the handling of cpuidle during system PM transitions, update a few cpufreq drivers and intel_idle, fix assorded issues and clean up code in multiple places. Specifics: - Add support for inefficient operating performance points to the Energy Model and modify cpufreq to use them properly (Vincent Donnefort). - Rearrange the DTPM framework code to simplify it and make it easier to follow (Daniel Lezcano). - Fix power intialization in DTPM (Daniel Lezcano). - Add CPU load consideration when estimating the instaneous power consumption in DTPM (Daniel Lezcano). - Fix cpu->pstate.turbo_freq initialization in intel_pstate (Zhang Rui). - Make intel_pstate process HWP Guaranteed change notifications from the processor (Srinivas Pandruvada). - Fix typo in cpufreq.h (Rafael Wysocki). - Fix tegra driver to handle BPMP errors properly (Mikko Perttunen). - Fix the parameter usage of the newly added perf-domain API (Hector Yuan). - Minor cleanups to cppc, vexpress and s3c244x drivers (Han Wang, Guenter Roeck, and Arnd Bergmann). - Fix kobject memory leaks in cpuidle error paths (Anel Orazgaliyeva). - Make intel_idle enable interrupts before entering C1 on some Xeon processor models (Artem Bityutskiy). - Clean up hib_wait_io() (Falla Coulibaly). - Fix sparse warnings in hibernation-related code (Anders Roxell). - Use vzalloc() and kzalloc() instead of their open-coded equivalents in hibernation-related code (Cai Huoqing). - Prevent user space from crashing the kernel by attempting to restore the system state from a swap partition in use (Ye Bin). - Do not let "syscore" devices runtime-suspend during system PM transitions (Rafael Wysocki). - Do not pause cpuidle in the suspend-to-idle path (Rafael Wysocki). - Pause cpuidle later and resume it earlier during system PM transitions (Rafael Wysocki). - Make system suspend code use valid_state() consistently (Rafael Wysocki). - Add support for enabling wakeup IRQs after invoking the ->runtime_suspend() callback and make two drivers use it (Chunfeng Yun). - Make the association of ACPI device objects with PCI devices more straightforward and simplify the code doing that for all devices in general (Rafael Wysocki). - Eliminate struct pci_platform_pm_ops and handle the both of its users (PCI and Intel MID) directly in the PCI bus code (Rafael Wysocki). - Simplify and clarify ACPI PCI device PM helpers (Rafael Wysocki). - Fix ordering of operations in pci_back_from_sleep() (Rafael Wysocki). - Make exynos-ppmu use hyphens in DT properties (Krzysztof Kozlowski). - Simplify parsing event-type from DT in exynos-ppmu (Krzysztof Kozlowski). - Strengthen check for freq_table in devfreq (Samuel Holland)" * tag 'pm-5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (49 commits) cpufreq: Fix parameter in parse_perf_domain() usb: mtu3: enable wake-up interrupt after runtime_suspend called usb: xhci-mtk: enable wake-up interrupt after runtime_suspend called PM / wakeirq: support enabling wake-up irq after runtime_suspend called PM / devfreq: Strengthen check for freq_table devfreq: exynos-ppmu: simplify parsing event-type from DT devfreq: exynos-ppmu: use node names with hyphens cpufreq: intel_pstate: Fix cpu->pstate.turbo_freq initialization PM: suspend: Use valid_state() consistently PM: sleep: Pause cpuidle later and resume it earlier during system transitions PM: suspend: Do not pause cpuidle in the suspend-to-idle path PM: sleep: Do not let "syscore" devices runtime-suspend during system transitions PM: hibernate: Get block device exclusively in swsusp_check() powercap/drivers/dtpm: Fix power limit initialization powercap/drivers/dtpm: Scale the power with the load powercap/drivers/dtpm: Use container_of instead of a private data field powercap/drivers/dtpm: Simplify the dtpm table powercap/drivers/dtpm: Encapsulate even more the code PM: hibernate: swap: Use vzalloc() and kzalloc() PM: hibernate: fix sparse warnings ...
Diffstat (limited to 'drivers')
-rw-r--r--drivers/base/power/main.c14
-rw-r--r--drivers/base/power/power.h7
-rw-r--r--drivers/base/power/runtime.c6
-rw-r--r--drivers/base/power/wakeirq.c101
-rw-r--r--drivers/cpufreq/acpi-cpufreq.c3
-rw-r--r--drivers/cpufreq/amd_freq_sensitivity.c3
-rw-r--r--drivers/cpufreq/cppc_cpufreq.c2
-rw-r--r--drivers/cpufreq/cpufreq.c19
-rw-r--r--drivers/cpufreq/cpufreq_conservative.c6
-rw-r--r--drivers/cpufreq/cpufreq_ondemand.c16
-rw-r--r--drivers/cpufreq/intel_pstate.c120
-rw-r--r--drivers/cpufreq/mediatek-cpufreq-hw.c2
-rw-r--r--drivers/cpufreq/powernv-cpufreq.c4
-rw-r--r--drivers/cpufreq/s3c2440-cpufreq.c2
-rw-r--r--drivers/cpufreq/s5pv210-cpufreq.c2
-rw-r--r--drivers/cpufreq/tegra186-cpufreq.c4
-rw-r--r--drivers/cpufreq/tegra194-cpufreq.c8
-rw-r--r--drivers/cpuidle/sysfs.c5
-rw-r--r--drivers/devfreq/devfreq.c2
-rw-r--r--drivers/devfreq/event/exynos-ppmu.c12
-rw-r--r--drivers/idle/intel_idle.c13
-rw-r--r--drivers/pci/pci-acpi.c43
-rw-r--r--drivers/pci/pci-mid.c37
-rw-r--r--drivers/pci/pci.c154
-rw-r--r--drivers/pci/pci.h96
-rw-r--r--drivers/powercap/dtpm.c78
-rw-r--r--drivers/powercap/dtpm_cpu.c228
-rw-r--r--drivers/usb/host/xhci-mtk.c2
-rw-r--r--drivers/usb/mtu3/mtu3_plat.c2
29 files changed, 591 insertions, 400 deletions
diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c
index cbea78e79f3d..ac4dde8fdb8b 100644
--- a/drivers/base/power/main.c
+++ b/drivers/base/power/main.c
@@ -32,7 +32,6 @@
#include <linux/suspend.h>
#include <trace/events/power.h>
#include <linux/cpufreq.h>
-#include <linux/cpuidle.h>
#include <linux/devfreq.h>
#include <linux/timer.h>
@@ -747,8 +746,6 @@ void dpm_resume_noirq(pm_message_t state)
resume_device_irqs();
device_wakeup_disarm_wake_irqs();
-
- cpuidle_resume();
}
/**
@@ -1051,7 +1048,7 @@ static void device_complete(struct device *dev, pm_message_t state)
const char *info = NULL;
if (dev->power.syscore)
- return;
+ goto out;
device_lock(dev);
@@ -1081,6 +1078,7 @@ static void device_complete(struct device *dev, pm_message_t state)
device_unlock(dev);
+out:
pm_runtime_put(dev);
}
@@ -1336,8 +1334,6 @@ int dpm_suspend_noirq(pm_message_t state)
{
int ret;
- cpuidle_pause();
-
device_wakeup_arm_wake_irqs();
suspend_device_irqs();
@@ -1794,9 +1790,6 @@ static int device_prepare(struct device *dev, pm_message_t state)
int (*callback)(struct device *) = NULL;
int ret = 0;
- if (dev->power.syscore)
- return 0;
-
/*
* If a device's parent goes into runtime suspend at the wrong time,
* it won't be possible to resume the device. To prevent this we
@@ -1805,6 +1798,9 @@ static int device_prepare(struct device *dev, pm_message_t state)
*/
pm_runtime_get_noresume(dev);
+ if (dev->power.syscore)
+ return 0;
+
device_lock(dev);
dev->power.wakeup_path = false;
diff --git a/drivers/base/power/power.h b/drivers/base/power/power.h
index 54292cdd7808..0eb7f02b3ad5 100644
--- a/drivers/base/power/power.h
+++ b/drivers/base/power/power.h
@@ -25,8 +25,10 @@ extern u64 pm_runtime_active_time(struct device *dev);
#define WAKE_IRQ_DEDICATED_ALLOCATED BIT(0)
#define WAKE_IRQ_DEDICATED_MANAGED BIT(1)
+#define WAKE_IRQ_DEDICATED_REVERSE BIT(2)
#define WAKE_IRQ_DEDICATED_MASK (WAKE_IRQ_DEDICATED_ALLOCATED | \
- WAKE_IRQ_DEDICATED_MANAGED)
+ WAKE_IRQ_DEDICATED_MANAGED | \
+ WAKE_IRQ_DEDICATED_REVERSE)
struct wake_irq {
struct device *dev;
@@ -39,7 +41,8 @@ extern void dev_pm_arm_wake_irq(struct wake_irq *wirq);
extern void dev_pm_disarm_wake_irq(struct wake_irq *wirq);
extern void dev_pm_enable_wake_irq_check(struct device *dev,
bool can_change_status);
-extern void dev_pm_disable_wake_irq_check(struct device *dev);
+extern void dev_pm_disable_wake_irq_check(struct device *dev, bool cond_disable);
+extern void dev_pm_enable_wake_irq_complete(struct device *dev);
#ifdef CONFIG_PM_SLEEP
diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
index ec94049442b9..d504cd4ab3cb 100644
--- a/drivers/base/power/runtime.c
+++ b/drivers/base/power/runtime.c
@@ -645,6 +645,8 @@ static int rpm_suspend(struct device *dev, int rpmflags)
if (retval)
goto fail;
+ dev_pm_enable_wake_irq_complete(dev);
+
no_callback:
__update_runtime_status(dev, RPM_SUSPENDED);
pm_runtime_deactivate_timer(dev);
@@ -690,7 +692,7 @@ static int rpm_suspend(struct device *dev, int rpmflags)
return retval;
fail:
- dev_pm_disable_wake_irq_check(dev);
+ dev_pm_disable_wake_irq_check(dev, true);
__update_runtime_status(dev, RPM_ACTIVE);
dev->power.deferred_resume = false;
wake_up_all(&dev->power.wait_queue);
@@ -873,7 +875,7 @@ static int rpm_resume(struct device *dev, int rpmflags)
callback = RPM_GET_CALLBACK(dev, runtime_resume);
- dev_pm_disable_wake_irq_check(dev);
+ dev_pm_disable_wake_irq_check(dev, false);
retval = rpm_callback(callback, dev);
if (retval) {
__update_runtime_status(dev, RPM_SUSPENDED);
diff --git a/drivers/base/power/wakeirq.c b/drivers/base/power/wakeirq.c
index b91a3a9bf9f6..0004db4a9d3b 100644
--- a/drivers/base/power/wakeirq.c
+++ b/drivers/base/power/wakeirq.c
@@ -142,24 +142,7 @@ static irqreturn_t handle_threaded_wake_irq(int irq, void *_wirq)
return IRQ_HANDLED;
}
-/**
- * dev_pm_set_dedicated_wake_irq - Request a dedicated wake-up interrupt
- * @dev: Device entry
- * @irq: Device wake-up interrupt
- *
- * Unless your hardware has separate wake-up interrupts in addition
- * to the device IO interrupts, you don't need this.
- *
- * Sets up a threaded interrupt handler for a device that has
- * a dedicated wake-up interrupt in addition to the device IO
- * interrupt.
- *
- * The interrupt starts disabled, and needs to be managed for
- * the device by the bus code or the device driver using
- * dev_pm_enable_wake_irq() and dev_pm_disable_wake_irq()
- * functions.
- */
-int dev_pm_set_dedicated_wake_irq(struct device *dev, int irq)
+static int __dev_pm_set_dedicated_wake_irq(struct device *dev, int irq, unsigned int flag)
{
struct wake_irq *wirq;
int err;
@@ -197,7 +180,7 @@ int dev_pm_set_dedicated_wake_irq(struct device *dev, int irq)
if (err)
goto err_free_irq;
- wirq->status = WAKE_IRQ_DEDICATED_ALLOCATED;
+ wirq->status = WAKE_IRQ_DEDICATED_ALLOCATED | flag;
return err;
@@ -210,9 +193,58 @@ err_free:
return err;
}
+
+
+/**
+ * dev_pm_set_dedicated_wake_irq - Request a dedicated wake-up interrupt
+ * @dev: Device entry
+ * @irq: Device wake-up interrupt
+ *
+ * Unless your hardware has separate wake-up interrupts in addition
+ * to the device IO interrupts, you don't need this.
+ *
+ * Sets up a threaded interrupt handler for a device that has
+ * a dedicated wake-up interrupt in addition to the device IO
+ * interrupt.
+ *
+ * The interrupt starts disabled, and needs to be managed for
+ * the device by the bus code or the device driver using
+ * dev_pm_enable_wake_irq*() and dev_pm_disable_wake_irq*()
+ * functions.
+ */
+int dev_pm_set_dedicated_wake_irq(struct device *dev, int irq)
+{
+ return __dev_pm_set_dedicated_wake_irq(dev, irq, 0);
+}
EXPORT_SYMBOL_GPL(dev_pm_set_dedicated_wake_irq);
/**
+ * dev_pm_set_dedicated_wake_irq_reverse - Request a dedicated wake-up interrupt
+ * with reverse enable ordering
+ * @dev: Device entry
+ * @irq: Device wake-up interrupt
+ *
+ * Unless your hardware has separate wake-up interrupts in addition
+ * to the device IO interrupts, you don't need this.
+ *
+ * Sets up a threaded interrupt handler for a device that has a dedicated
+ * wake-up interrupt in addition to the device IO interrupt. It sets
+ * the status of WAKE_IRQ_DEDICATED_REVERSE to tell rpm_suspend()
+ * to enable dedicated wake-up interrupt after running the runtime suspend
+ * callback for @dev.
+ *
+ * The interrupt starts disabled, and needs to be managed for
+ * the device by the bus code or the device driver using
+ * dev_pm_enable_wake_irq*() and dev_pm_disable_wake_irq*()
+ * functions.
+ */
+int dev_pm_set_dedicated_wake_irq_reverse(struct device *dev, int irq)
+{
+ return __dev_pm_set_dedicated_wake_irq(dev, irq, WAKE_IRQ_DEDICATED_REVERSE);
+}
+EXPORT_SYMBOL_GPL(dev_pm_set_dedicated_wake_irq_reverse);
+
+/**
* dev_pm_enable_wake_irq - Enable device wake-up interrupt
* @dev: Device
*
@@ -282,28 +314,55 @@ void dev_pm_enable_wake_irq_check(struct device *dev,
return;
enable:
- enable_irq(wirq->irq);
+ if (!can_change_status || !(wirq->status & WAKE_IRQ_DEDICATED_REVERSE))
+ enable_irq(wirq->irq);
}
/**
* dev_pm_disable_wake_irq_check - Checks and disables wake-up interrupt
* @dev: Device
+ * @cond_disable: if set, also check WAKE_IRQ_DEDICATED_REVERSE
*
* Disables wake-up interrupt conditionally based on status.
* Should be only called from rpm_suspend() and rpm_resume() path.
*/
-void dev_pm_disable_wake_irq_check(struct device *dev)
+void dev_pm_disable_wake_irq_check(struct device *dev, bool cond_disable)
{
struct wake_irq *wirq = dev->power.wakeirq;
if (!wirq || !(wirq->status & WAKE_IRQ_DEDICATED_MASK))
return;
+ if (cond_disable && (wirq->status & WAKE_IRQ_DEDICATED_REVERSE))
+ return;
+
if (wirq->status & WAKE_IRQ_DEDICATED_MANAGED)
disable_irq_nosync(wirq->irq);
}
/**
+ * dev_pm_enable_wake_irq_complete - enable wake IRQ not enabled before
+ * @dev: Device using the wake IRQ
+ *
+ * Enable wake IRQ conditionally based on status, mainly used if want to
+ * enable wake IRQ after running ->runtime_suspend() which depends on
+ * WAKE_IRQ_DEDICATED_REVERSE.
+ *
+ * Should be only called from rpm_suspend() path.
+ */
+void dev_pm_enable_wake_irq_complete(struct device *dev)
+{
+ struct wake_irq *wirq = dev->power.wakeirq;
+
+ if (!wirq || !(wirq->status & WAKE_IRQ_DEDICATED_MASK))
+ return;
+
+ if (wirq->status & WAKE_IRQ_DEDICATED_MANAGED &&
+ wirq->status & WAKE_IRQ_DEDICATED_REVERSE)
+ enable_irq(wirq->irq);
+}
+
+/**
* dev_pm_arm_wake_irq - Arm device wake-up
* @wirq: Device wake-up interrupt
*
diff --git a/drivers/cpufreq/acpi-cpufreq.c b/drivers/cpufreq/acpi-cpufreq.c
index 28467d83c745..3d514b82d055 100644
--- a/drivers/cpufreq/acpi-cpufreq.c
+++ b/drivers/cpufreq/acpi-cpufreq.c
@@ -470,7 +470,8 @@ static unsigned int acpi_cpufreq_fast_switch(struct cpufreq_policy *policy,
if (policy->cached_target_freq == target_freq)
index = policy->cached_resolved_idx;
else
- index = cpufreq_table_find_index_dl(policy, target_freq);
+ index = cpufreq_table_find_index_dl(policy, target_freq,
+ false);
entry = &policy->freq_table[index];
next_freq = entry->frequency;
diff --git a/drivers/cpufreq/amd_freq_sensitivity.c b/drivers/cpufreq/amd_freq_sensitivity.c
index d0b10baf039a..6448e03bcf48 100644
--- a/drivers/cpufreq/amd_freq_sensitivity.c
+++ b/drivers/cpufreq/amd_freq_sensitivity.c
@@ -91,7 +91,8 @@ static unsigned int amd_powersave_bias_target(struct cpufreq_policy *policy,
unsigned int index;
index = cpufreq_table_find_index_h(policy,
- policy->cur - 1);
+ policy->cur - 1,
+ relation & CPUFREQ_RELATION_E);
freq_next = policy->freq_table[index].frequency;
}
diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
index d4c27022b9c9..db17196266e4 100644
--- a/drivers/cpufreq/cppc_cpufreq.c
+++ b/drivers/cpufreq/cppc_cpufreq.c
@@ -741,8 +741,6 @@ static int __init cppc_cpufreq_init(void)
if ((acpi_disabled) || !acpi_cpc_valid())
return -ENODEV;
- INIT_LIST_HEAD(&cpu_data_list);
-
cppc_check_hisi_workaround();
cppc_freq_invariance_init();
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 5782b15a8caa..e338d2f010fe 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -554,7 +554,7 @@ static unsigned int __resolve_freq(struct cpufreq_policy *policy,
unsigned int cpufreq_driver_resolve_freq(struct cpufreq_policy *policy,
unsigned int target_freq)
{
- return __resolve_freq(policy, target_freq, CPUFREQ_RELATION_L);
+ return __resolve_freq(policy, target_freq, CPUFREQ_RELATION_LE);
}
EXPORT_SYMBOL_GPL(cpufreq_driver_resolve_freq);
@@ -2260,8 +2260,16 @@ int __cpufreq_driver_target(struct cpufreq_policy *policy,
!(cpufreq_driver->flags & CPUFREQ_NEED_UPDATE_LIMITS))
return 0;
- if (cpufreq_driver->target)
+ if (cpufreq_driver->target) {
+ /*
+ * If the driver hasn't setup a single inefficient frequency,
+ * it's unlikely it knows how to decode CPUFREQ_RELATION_E.
+ */
+ if (!policy->efficiencies_available)
+ relation &= ~CPUFREQ_RELATION_E;
+
return cpufreq_driver->target(policy, target_freq, relation);
+ }
if (!cpufreq_driver->target_index)
return -EINVAL;
@@ -2523,8 +2531,15 @@ static int cpufreq_set_policy(struct cpufreq_policy *policy,
if (ret)
return ret;
+ /*
+ * Resolve policy min/max to available frequencies. It ensures
+ * no frequency resolution will neither overshoot the requested maximum
+ * nor undershoot the requested minimum.
+ */
policy->min = new_data.min;
policy->max = new_data.max;
+ policy->min = __resolve_freq(policy, policy->min, CPUFREQ_RELATION_L);
+ policy->max = __resolve_freq(policy, policy->max, CPUFREQ_RELATION_H);
trace_cpu_frequency_limits(policy);
policy->cached_target_freq = UINT_MAX;
diff --git a/drivers/cpufreq/cpufreq_conservative.c b/drivers/cpufreq/cpufreq_conservative.c
index aa39ff31ec9f..0879ec3c170c 100644
--- a/drivers/cpufreq/cpufreq_conservative.c
+++ b/drivers/cpufreq/cpufreq_conservative.c
@@ -111,7 +111,8 @@ static unsigned int cs_dbs_update(struct cpufreq_policy *policy)
if (requested_freq > policy->max)
requested_freq = policy->max;
- __cpufreq_driver_target(policy, requested_freq, CPUFREQ_RELATION_H);
+ __cpufreq_driver_target(policy, requested_freq,
+ CPUFREQ_RELATION_HE);
dbs_info->requested_freq = requested_freq;
goto out;
}
@@ -134,7 +135,8 @@ static unsigned int cs_dbs_update(struct cpufreq_policy *policy)
else
requested_freq = policy->min;
- __cpufreq_driver_target(policy, requested_freq, CPUFREQ_RELATION_L);
+ __cpufreq_driver_target(policy, requested_freq,
+ CPUFREQ_RELATION_LE);
dbs_info->requested_freq = requested_freq;
}
diff --git a/drivers/cpufreq/cpufreq_ondemand.c b/drivers/cpufreq/cpufreq_ondemand.c
index eb4320b619c9..3b8f924771b4 100644
--- a/drivers/cpufreq/cpufreq_ondemand.c
+++ b/drivers/cpufreq/cpufreq_ondemand.c
@@ -83,9 +83,11 @@ static unsigned int generic_powersave_bias_target(struct cpufreq_policy *policy,
freq_avg = freq_req - freq_reduc;
/* Find freq bounds for freq_avg in freq_table */
- index = cpufreq_table_find_index_h(policy, freq_avg);
+ index = cpufreq_table_find_index_h(policy, freq_avg,
+ relation & CPUFREQ_RELATION_E);
freq_lo = freq_table[index].frequency;
- index = cpufreq_table_find_index_l(policy, freq_avg);
+ index = cpufreq_table_find_index_l(policy, freq_avg,
+ relation & CPUFREQ_RELATION_E);
freq_hi = freq_table[index].frequency;
/* Find out how long we have to be in hi and lo freqs */
@@ -118,12 +120,12 @@ static void dbs_freq_increase(struct cpufreq_policy *policy, unsigned int freq)
if (od_tuners->powersave_bias)
freq = od_ops.powersave_bias_target(policy, freq,
- CPUFREQ_RELATION_H);
+ CPUFREQ_RELATION_HE);
else if (policy->cur == policy->max)
return;
__cpufreq_driver_target(policy, freq, od_tuners->powersave_bias ?
- CPUFREQ_RELATION_L : CPUFREQ_RELATION_H);
+ CPUFREQ_RELATION_LE : CPUFREQ_RELATION_HE);
}
/*
@@ -161,9 +163,9 @@ static void od_update(struct cpufreq_policy *policy)
if (od_tuners->powersave_bias)
freq_next = od_ops.powersave_bias_target(policy,
freq_next,
- CPUFREQ_RELATION_L);
+ CPUFREQ_RELATION_LE);
- __cpufreq_driver_target(policy, freq_next, CPUFREQ_RELATION_C);
+ __cpufreq_driver_target(policy, freq_next, CPUFREQ_RELATION_CE);
}
}
@@ -182,7 +184,7 @@ static unsigned int od_dbs_update(struct cpufreq_policy *policy)
*/
if (sample_type == OD_SUB_SAMPLE && policy_dbs->sample_delay_ns > 0) {
__cpufreq_driver_target(policy, dbs_info->freq_lo,
- CPUFREQ_RELATION_H);
+ CPUFREQ_RELATION_HE);
return dbs_info->freq_lo_delay_us;
}
diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
index 8c176b7dae41..349ddbaef796 100644
--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -32,6 +32,7 @@
#include <asm/cpu_device_id.h>
#include <asm/cpufeature.h>
#include <asm/intel-family.h>
+#include "../drivers/thermal/intel/thermal_interrupt.h"
#define INTEL_PSTATE_SAMPLING_INTERVAL (10 * NSEC_PER_MSEC)
@@ -219,6 +220,7 @@ struct global_params {
* @sched_flags: Store scheduler flags for possible cross CPU update
* @hwp_boost_min: Last HWP boosted min performance
* @suspended: Whether or not the driver has been suspended.
+ * @hwp_notify_work: workqueue for HWP notifications.
*
* This structure stores per CPU instance data for all CPUs.
*/
@@ -257,6 +259,7 @@ struct cpudata {
unsigned int sched_flags;
u32 hwp_boost_min;
bool suspended;
+ struct delayed_work hwp_notify_work;
};
static struct cpudata **all_cpu_data;
@@ -537,7 +540,8 @@ static void intel_pstate_hybrid_hwp_adjust(struct cpudata *cpu)
* scaling factor is too high, so recompute it to make the HWP_CAP
* highest performance correspond to the maximum turbo frequency.
*/
- if (turbo_freq < cpu->pstate.turbo_pstate * scaling) {
+ cpu->pstate.turbo_freq = cpu->pstate.turbo_pstate * scaling;
+ if (turbo_freq < cpu->pstate.turbo_freq) {
cpu->pstate.turbo_freq = turbo_freq;
scaling = DIV_ROUND_UP(turbo_freq, cpu->pstate.turbo_pstate);
cpu->pstate.scaling = scaling;
@@ -985,11 +989,15 @@ skip_epp:
wrmsrl_on_cpu(cpu, MSR_HWP_REQUEST, value);
}
+static void intel_pstate_disable_hwp_interrupt(struct cpudata *cpudata);
+
static void intel_pstate_hwp_offline(struct cpudata *cpu)
{
u64 value = READ_ONCE(cpu->hwp_req_cached);
int min_perf;
+ intel_pstate_disable_hwp_interrupt(cpu);
+
if (boot_cpu_has(X86_FEATURE_HWP_EPP)) {
/*
* In case the EPP has been set to "performance" by the
@@ -1053,6 +1061,9 @@ static int intel_pstate_suspend(struct cpufreq_policy *policy)
cpu->suspended = true;
+ /* disable HWP interrupt and cancel any pending work */
+ intel_pstate_disable_hwp_interrupt(cpu);
+
return 0;
}
@@ -1546,15 +1557,105 @@ static void intel_pstate_sysfs_hide_hwp_dynamic_boost(void)
/************************** sysfs end ************************/
+static void intel_pstate_notify_work(struct work_struct *work)
+{
+ struct cpudata *cpudata =
+ container_of(to_delayed_work(work), struct cpudata, hwp_notify_work);
+
+ cpufreq_update_policy(cpudata->cpu);
+ wrmsrl_on_cpu(cpudata->cpu, MSR_HWP_STATUS, 0);
+}
+
+static DEFINE_SPINLOCK(hwp_notify_lock);
+static cpumask_t hwp_intr_enable_mask;
+
+void notify_hwp_interrupt(void)
+{
+ unsigned int this_cpu = smp_processor_id();
+ struct cpudata *cpudata;
+ unsigned long flags;
+ u64 value;
+
+ if (!READ_ONCE(hwp_active) || !boot_cpu_has(X86_FEATURE_HWP_NOTIFY))
+ return;
+
+ rdmsrl_safe(MSR_HWP_STATUS, &value);
+ if (!(value & 0x01))
+ return;
+
+ spin_lock_irqsave(&hwp_notify_lock, flags);
+
+ if (!cpumask_test_cpu(this_cpu, &hwp_intr_enable_mask))
+ goto ack_intr;
+
+ /*
+ * Currently we never free all_cpu_data. And we can't reach here
+ * without this allocated. But for safety for future changes, added
+ * check.
+ */
+ if (unlikely(!READ_ONCE(all_cpu_data)))
+ goto ack_intr;
+
+ /*
+ * The free is done during cleanup, when cpufreq registry is failed.
+ * We wouldn't be here if it fails on init or switch status. But for
+ * future changes, added check.
+ */
+ cpudata = READ_ONCE(all_cpu_data[this_cpu]);
+ if (unlikely(!cpudata))
+ goto ack_intr;
+
+ schedule_delayed_work(&cpudata->hwp_notify_work, msecs_to_jiffies(10));
+
+ spin_unlock_irqrestore(&hwp_notify_lock, flags);
+
+ return;
+
+ack_intr:
+ wrmsrl_safe(MSR_HWP_STATUS, 0);
+ spin_unlock_irqrestore(&hwp_notify_lock, flags);
+}
+
+static void intel_pstate_disable_hwp_interrupt(struct cpudata *cpudata)
+{
+ unsigned long flags;
+
+ /* wrmsrl_on_cpu has to be outside spinlock as this can result in IPC */
+ wrmsrl_on_cpu(cpudata->cpu, MSR_HWP_INTERRUPT, 0x00);
+
+ spin_lock_irqsave(&hwp_notify_lock, flags);
+ if (cpumask_test_and_clear_cpu(cpudata->cpu, &hwp_intr_enable_mask))
+ cancel_delayed_work(&cpudata->hwp_notify_work);
+ spin_unlock_irqrestore(&hwp_notify_lock, flags);
+}
+
+static void intel_pstate_enable_hwp_interrupt(struct cpudata *cpudata)
+{
+ /* Enable HWP notification interrupt for guaranteed performance change */
+ if (boot_cpu_has(X86_FEATURE_HWP_NOTIFY)) {
+ unsigned long flags;
+
+ spin_lock_irqsave(&hwp_notify_lock, flags);
+ INIT_DELAYED_WORK(&cpudata->hwp_notify_work, intel_pstate_notify_work);
+ cpumask_set_cpu(cpudata->cpu, &hwp_intr_enable_mask);
+ spin_unlock_irqrestore(&hwp_notify_lock, flags);
+
+ /* wrmsrl_on_cpu has to be outside spinlock as this can result in IPC */
+ wrmsrl_on_cpu(cpudata->cpu, MSR_HWP_INTERRUPT, 0x01);
+ }
+}
+
static void intel_pstate_hwp_enable(struct cpudata *cpudata)
{
- /* First disable HWP notification interrupt as we don't process them */
+ /* First disable HWP notification interrupt till we activate again */
if (boot_cpu_has(X86_FEATURE_HWP_NOTIFY))
wrmsrl_on_cpu(cpudata->cpu, MSR_HWP_INTERRUPT, 0x00);
wrmsrl_on_cpu(cpudata->cpu, MSR_PM_ENABLE, 0x1);
if (cpudata->epp_default == -EINVAL)
cpudata->epp_default = intel_pstate_get_epp(cpudata, 0);
+
+ intel_pstate_enable_hwp_interrupt(cpudata);
}
static int atom_get_min_pstate(void)
@@ -2266,7 +2367,7 @@ static int intel_pstate_init_cpu(unsigned int cpunum)
if (!cpu)
return -ENOMEM;
- all_cpu_data[cpunum] = cpu;
+ WRITE_ONCE(all_cpu_data[cpunum], cpu);
cpu->cpu = cpunum;
@@ -2929,8 +3030,10 @@ static void intel_pstate_driver_cleanup(void)
if (intel_pstate_driver == &intel_pstate)
intel_pstate_clear_update_util_hook(cpu);
+ spin_lock(&hwp_notify_lock);
kfree(all_cpu_data[cpu]);
- all_cpu_data[cpu] = NULL;
+ WRITE_ONCE(all_cpu_data[cpu], NULL);
+ spin_unlock(&hwp_notify_lock);
}
}
cpus_read_unlock();
@@ -3199,6 +3302,7 @@ static bool intel_pstate_hwp_is_enabled(void)
static int __init intel_pstate_init(void)
{
+ static struct cpudata **_all_cpu_data;
const struct x86_cpu_id *id;
int rc;
@@ -3224,7 +3328,7 @@ static int __init intel_pstate_init(void)
* deal with it.
*/
if ((!no_hwp && boot_cpu_has(X86_FEATURE_HWP_EPP)) || hwp_forced) {
- hwp_active++;
+ WRITE_ONCE(hwp_active, 1);
hwp_mode_bdw = id->driver_data;
intel_pstate.attr = hwp_cpufreq_attrs;
intel_cpufreq.attr = hwp_cpufreq_attrs;
@@ -3275,10 +3379,12 @@ hwp_cpu_matched:
pr_info("Intel P-state driver initializing\n");
- all_cpu_data = vzalloc(array_size(sizeof(void *), num_possible_cpus()));
- if (!all_cpu_data)
+ _all_cpu_data = vzalloc(array_size(sizeof(void *), num_possible_cpus()));
+ if (!_all_cpu_data)
return -ENOMEM;
+ WRITE_ONCE(all_cpu_data, _all_cpu_data);
+
intel_pstate_request_control_from_smm();
intel_pstate_sysfs_expose_params();
diff --git a/drivers/cpufreq/mediatek-cpufreq-hw.c b/drivers/cpufreq/mediatek-cpufreq-hw.c
index 0cf18dd46b92..8ddbd0c5ce37 100644
--- a/drivers/cpufreq/mediatek-cpufreq-hw.c
+++ b/drivers/cpufreq/mediatek-cpufreq-hw.c
@@ -109,7 +109,7 @@ static unsigned int mtk_cpufreq_hw_fast_switch(struct cpufreq_policy *policy,
struct mtk_cpufreq_data *data = policy->driver_data;
unsigned int index;
- index = cpufreq_table_find_index_dl(policy, target_freq);
+ index = cpufreq_table_find_index_dl(policy, target_freq, false);
writel_relaxed(index, data->reg_bases[REG_FREQ_PERF_STATE]);
diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
index 5a2cf5f91ccb..fddbd1ea1635 100644
--- a/drivers/cpufreq/powernv-cpufreq.c
+++ b/drivers/cpufreq/powernv-cpufreq.c
@@ -934,7 +934,7 @@ static void powernv_cpufreq_work_fn(struct work_struct *work)
policy = cpufreq_cpu_get(cpu);
if (!policy)
continue;
- index = cpufreq_table_find_index_c(policy, policy->cur);
+ index = cpufreq_table_find_index_c(policy, policy->cur, false);
powernv_cpufreq_target_index(policy, index);
cpumask_andnot(&mask, &mask, policy->cpus);
cpufreq_cpu_put(policy);
@@ -1022,7 +1022,7 @@ static unsigned int powernv_fast_switch(struct cpufreq_policy *policy,
int index;
struct powernv_smp_call_data freq_data;
- index = cpufreq_table_find_index_dl(policy, target_freq);
+ index = cpufreq_table_find_index_dl(policy, target_freq, false);
freq_data.pstate_id = powernv_freqs[index].driver_data;
freq_data.gpstate_id = powernv_freqs[index].driver_data;
set_pstate(&freq_data);
diff --git a/drivers/cpufreq/s3c2440-cpufreq.c b/drivers/cpufreq/s3c2440-cpufreq.c
index 148e8aedefa9..2011fb9c03a4 100644
--- a/drivers/cpufreq/s3c2440-cpufreq.c
+++ b/drivers/cpufreq/s3c2440-cpufreq.c
@@ -173,12 +173,14 @@ static void s3c2440_cpufreq_setdivs(struct s3c_cpufreq_config *cfg)
case 6:
camdiv |= S3C2440_CAMDIVN_HCLK3_HALF;
+ fallthrough;
case 3:
clkdiv |= S3C2440_CLKDIVN_HDIVN_3_6;
break;
case 8:
camdiv |= S3C2440_CAMDIVN_HCLK4_HALF;
+ fallthrough;
case 4:
clkdiv |= S3C2440_CLKDIVN_HDIVN_4_8;
break;
diff --git a/drivers/cpufreq/s5pv210-cpufreq.c b/drivers/cpufreq/s5pv210-cpufreq.c
index ad7d4f272ddc..76c888ed8d16 100644
--- a/drivers/cpufreq/s5pv210-cpufreq.c
+++ b/drivers/cpufreq/s5pv210-cpufreq.c
@@ -243,7 +243,7 @@ static int s5pv210_target(struct cpufreq_policy *policy, unsigned int index)
new_freq = s5pv210_freq_table[index].frequency;
/* Finding current running level index */
- priv_index = cpufreq_table_find_index_h(policy, old_freq);
+ priv_index = cpufreq_table_find_index_h(policy, old_freq, false);
arm_volt = dvs_conf[index].arm_volt;
int_volt = dvs_conf[index].int_volt;
diff --git a/drivers/cpufreq/tegra186-cpufreq.c b/drivers/cpufreq/tegra186-cpufreq.c
index 5d1943e787b0..6c88827f4e62 100644
--- a/drivers/cpufreq/tegra186-cpufreq.c
+++ b/drivers/cpufreq/tegra186-cpufreq.c
@@ -159,6 +159,10 @@ static struct cpufreq_frequency_table *init_vhint_table(
table = ERR_PTR(err);
goto free;
}
+ if (msg.rx.ret) {
+ table = ERR_PTR(-EINVAL);
+ goto free;
+ }
for (i = data->vfloor; i <= data->vceil; i++) {
u16 ndiv = data->ndiv[i];
diff --git a/drivers/cpufreq/tegra194-cpufreq.c b/drivers/cpufreq/tegra194-cpufreq.c
index a9620e4489ae..ac381db25dbe 100644
--- a/drivers/cpufreq/tegra194-cpufreq.c
+++ b/drivers/cpufreq/tegra194-cpufreq.c
@@ -242,7 +242,7 @@ static int tegra194_cpufreq_init(struct cpufreq_policy *policy)
smp_call_function_single(policy->cpu, get_cpu_cluster, &cl, true);
- if (cl >= data->num_clusters)
+ if (cl >= data->num_clusters || !data->tables[cl])
return -EINVAL;
/* set same policy for all cpus in a cluster */
@@ -310,6 +310,12 @@ init_freq_table(struct platform_device *pdev, struct tegra_bpmp *bpmp,
err = tegra_bpmp_transfer(bpmp, &msg);
if (err)
return ERR_PTR(err);
+ if (msg.rx.ret == -BPMP_EINVAL) {
+ /* Cluster not available */
+ return NULL;
+ }
+ if (msg.rx.ret)
+ return ERR_PTR(-EINVAL);
/*
* Make sure frequency table step is a multiple of mdiv to match
diff --git a/drivers/cpuidle/sysfs.c b/drivers/cpuidle/sysfs.c
index 53ec9585ccd4..469e18547d06 100644
--- a/drivers/cpuidle/sysfs.c
+++ b/drivers/cpuidle/sysfs.c
@@ -488,6 +488,7 @@ static int cpuidle_add_state_sysfs(struct cpuidle_device *device)
&kdev->kobj, "state%d", i);
if (ret) {
kobject_put(&kobj->kobj);
+ kfree(kobj);
goto error_state;
}
cpuidle_add_s2idle_attr_group(kobj);
@@ -619,6 +620,7 @@ static int cpuidle_add_driver_sysfs(struct cpuidle_device *dev)
&kdev->kobj, "driver");
if (ret) {
kobject_put(&kdrv->kobj);
+ kfree(kdrv);
return ret;
}
@@ -705,7 +707,6 @@ int cpuidle_add_sysfs(struct cpuidle_device *dev)
if (!kdev)
return -ENOMEM;
kdev->dev = dev;
- dev->kobj_dev = kdev;
init_completion(&kdev->kobj_unregister);
@@ -713,9 +714,11 @@ int cpuidle_add_sysfs(struct cpuidle_device *dev)
"cpuidle");
if (error) {
kobject_put(&kdev->kobj);
+ kfree(kdev);
return error;
}
+ dev->kobj_dev = kdev;
kobject_uevent(&kdev->kobj, KOBJ_ADD);
return 0;
diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
index 85faa7a5c7d1..06333d430382 100644
--- a/drivers/devfreq/devfreq.c
+++ b/drivers/devfreq/devfreq.c
@@ -827,7 +827,7 @@ struct devfreq *devfreq_add_device(struct device *dev,
goto err_dev;
}
- if (!devfreq->profile->max_state && !devfreq->profile->freq_table) {
+ if (!devfreq->profile->max_state || !devfreq->profile->freq_table) {
mutex_unlock(&devfreq->lock);
err = set_freq_table(devfreq);
if (err < 0)
diff --git a/drivers/devfreq/event/exynos-ppmu.c b/drivers/devfreq/event/exynos-ppmu.c
index 17ed980d9099..9b849d781116 100644
--- a/drivers/devfreq/event/exynos-ppmu.c
+++ b/drivers/devfreq/event/exynos-ppmu.c
@@ -94,11 +94,16 @@ static struct __exynos_ppmu_events {
PPMU_EVENT(d1-general),
PPMU_EVENT(d1-rt),
- /* For Exynos5422 SoC */
+ /* For Exynos5422 SoC, deprecated (backwards compatible) */
PPMU_EVENT(dmc0_0),
PPMU_EVENT(dmc0_1),
PPMU_EVENT(dmc1_0),
PPMU_EVENT(dmc1_1),
+ /* For Exynos5422 SoC */
+ PPMU_EVENT(dmc0-0),
+ PPMU_EVENT(dmc0-1),
+ PPMU_EVENT(dmc1-0),
+ PPMU_EVENT(dmc1-1),
};
static int __exynos_ppmu_find_ppmu_id(const char *edev_name)
@@ -561,13 +566,10 @@ static int of_get_devfreq_events(struct device_node *np,
* use default if not.
*/
if (info->ppmu_type == EXYNOS_TYPE_PPMU_V2) {
- int id;
/* Not all registers take the same value for
* read+write data count.
*/
- id = __exynos_ppmu_find_ppmu_id(desc[j].name);
-
- switch (id) {
+ switch (ppmu_events[i].id) {
case PPMU_PMNCNT0:
case PPMU_PMNCNT1:
case PPMU_PMNCNT2:
diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
index e6c543b5ee1d..0b66e25c0e2d 100644
--- a/drivers/idle/intel_idle.c
+++ b/drivers/idle/intel_idle.c
@@ -89,6 +89,12 @@ static struct cpuidle_state *cpuidle_state_table __initdata;
static unsigned int mwait_substates __initdata;
/*
+ * Enable interrupts before entering the C-state. On some platforms and for
+ * some C-states, this may measurably decrease interrupt latency.
+ */
+#define CPUIDLE_FLAG_IRQ_ENABLE BIT(14)
+
+/*
* Enable this state by default even if the ACPI _CST does not list it.
*/
#define CPUIDLE_FLAG_ALWAYS_ENABLE BIT(15)
@@ -127,6 +133,9 @@ static __cpuidle int intel_idle(struct cpuidle_device *dev,
unsigned long eax = flg2MWAIT(state->flags);
unsigned long ecx = 1; /* break on interrupt flag */
+ if (state->flags & CPUIDLE_FLAG_IRQ_ENABLE)
+ local_irq_enable();
+
mwait_idle_with_hints(eax, ecx);
return index;
@@ -698,7 +707,7 @@ static struct cpuidle_state skx_cstates[] __initdata = {
{
.name = "C1",
.desc = "MWAIT 0x00",
- .flags = MWAIT2flg(0x00),
+ .flags = MWAIT2flg(0x00) | CPUIDLE_FLAG_IRQ_ENABLE,
.exit_latency = 2,
.target_residency = 2,
.enter = &intel_idle,
@@ -727,7 +736,7 @@ static struct cpuidle_state icx_cstates[] __initdata = {
{
.name = "C1",
.desc = "MWAIT 0x00",
- .flags = MWAIT2flg(0x00),
+ .flags = MWAIT2flg(0x00) | CPUIDLE_FLAG_IRQ_ENABLE,
.exit_latency = 1,
.target_residency = 1,
.enter = &intel_idle,
diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c
index b00d80370379..a42dbf448860 100644
--- a/drivers/pci/pci-acpi.c
+++ b/drivers/pci/pci-acpi.c
@@ -906,7 +906,7 @@ acpi_status pci_acpi_add_pm_notifier(struct acpi_device *dev,
* choose highest power _SxD or any lower power
*/
-static pci_power_t acpi_pci_choose_state(struct pci_dev *pdev)
+pci_power_t acpi_pci_choose_state(struct pci_dev *pdev)
{
int acpi_state, d_max;
@@ -965,22 +965,20 @@ int pci_dev_acpi_reset(struct pci_dev *dev, bool probe)
return 0;
}
-static bool acpi_pci_power_manageable(struct pci_dev *dev)
+bool acpi_pci_power_manageable(struct pci_dev *dev)
{
struct acpi_device *adev = ACPI_COMPANION(&dev->dev);
- if (!adev)
- return false;
- return acpi_device_power_manageable(adev);
+ return adev && acpi_device_power_manageable(adev);
}
-static bool acpi_pci_bridge_d3(struct pci_dev *dev)
+bool acpi_pci_bridge_d3(struct pci_dev *dev)
{
const union acpi_object *obj;
struct acpi_device *adev;
struct pci_dev *rpdev;
- if (!dev->is_hotplug_bridge)
+ if (acpi_pci_disabled || !dev->is_hotplug_bridge)
return false;
/* Assume D3 support if the bridge is power-manageable by ACPI. */
@@ -1008,7 +1006,7 @@ static bool acpi_pci_bridge_d3(struct pci_dev *dev)
return obj->integer.value == 1;
}
-static int acpi_pci_set_power_state(struct pci_dev *dev, pci_power_t state)
+int acpi_pci_set_power_state(struct pci_dev *dev, pci_power_t state)
{
struct acpi_device *adev = ACPI_COMPANION(&dev->dev);
static const u8 state_conv[] = {
@@ -1046,7 +1044,7 @@ static int acpi_pci_set_power_state(struct pci_dev *dev, pci_power_t state)
return error;
}
-static pci_power_t acpi_pci_get_power_state(struct pci_dev *dev)
+pci_power_t acpi_pci_get_power_state(struct pci_dev *dev)
{
struct acpi_device *adev = ACPI_COMPANION(&dev->dev);
static const pci_power_t state_conv[] = {
@@ -1068,7 +1066,7 @@ static pci_power_t acpi_pci_get_power_state(struct pci_dev *dev)
return state_conv[state];
}
-static void acpi_pci_refresh_power_state(struct pci_dev *dev)
+void acpi_pci_refresh_power_state(struct pci_dev *dev)
{
struct acpi_device *adev = ACPI_COMPANION(&dev->dev);
@@ -1093,17 +1091,23 @@ static int acpi_pci_propagate_wakeup(struct pci_bus *bus, bool enable)
return 0;
}
-static int acpi_pci_wakeup(struct pci_dev *dev, bool enable)
+int acpi_pci_wakeup(struct pci_dev *dev, bool enable)
{
+ if (acpi_pci_disabled)
+ return 0;
+
if (acpi_pm_device_can_wakeup(&dev->dev))
return acpi_pm_set_device_wakeup(&dev->dev, enable);
return acpi_pci_propagate_wakeup(dev->bus, enable);
}
-static bool acpi_pci_need_resume(struct pci_dev *dev)
+bool acpi_pci_need_resume(struct pci_dev *dev)
{
- struct acpi_device *adev = ACPI_COMPANION(&dev->dev);
+ struct acpi_device *adev;
+
+ if (acpi_pci_disabled)
+ return false;
/*
* In some cases (eg. Samsung 305V4A) leaving a bridge in suspend over
@@ -1115,6 +1119,7 @@ static bool acpi_pci_need_resume(struct pci_dev *dev)
if (pci_is_bridge(dev) && acpi_target_system_state() != ACPI_STATE_S0)
return true;
+ adev = ACPI_COMPANION(&dev->dev);
if (!adev || !acpi_device_power_manageable(adev))
return false;
@@ -1128,17 +1133,6 @@ static bool acpi_pci_need_resume(struct pci_dev *dev)
return !!adev->power.flags.dsw_present;
}
-static const struct pci_platform_pm_ops acpi_pci_platform_pm = {
- .bridge_d3 = acpi_pci_bridge_d3,
- .is_manageable = acpi_pci_power_manageable,
- .set_state = acpi_pci_set_power_state,
- .get_state = acpi_pci_get_power_state,
- .refresh_state = acpi_pci_refresh_power_state,
- .choose_state = acpi_pci_choose_state,
- .set_wakeup = acpi_pci_wakeup,
- .need_resume = acpi_pci_need_resume,
-};
-
void acpi_pci_add_bus(struct pci_bus *bus)
{
union acpi_object *obj;
@@ -1451,7 +1445,6 @@ static int __init acpi_pci_init(void)
if (acpi_pci_disabled)
return 0;
- pci_set_platform_pm(&acpi_pci_platform_pm);
acpi_pci_slot_init();
acpiphp_init();
diff --git a/drivers/pci/pci-mid.c b/drivers/pci/pci-mid.c
index aafd58da3a89..fbfd78127123 100644
--- a/drivers/pci/pci-mid.c
+++ b/drivers/pci/pci-mid.c
@@ -16,45 +16,23 @@
#include "pci.h"
-static bool mid_pci_power_manageable(struct pci_dev *dev)
+static bool pci_mid_pm_enabled __read_mostly;
+
+bool pci_use_mid_pm(void)
{
- return true;
+ return pci_mid_pm_enabled;
}
-static int mid_pci_set_power_state(struct pci_dev *pdev, pci_power_t state)
+int mid_pci_set_power_state(struct pci_dev *pdev, pci_power_t state)
{
return intel_mid_pci_set_power_state(pdev, state);
}
-static pci_power_t mid_pci_get_power_state(struct pci_dev *pdev)
+pci_power_t mid_pci_get_power_state(struct pci_dev *pdev)
{
return intel_mid_pci_get_power_state(pdev);
}
-static pci_power_t mid_pci_choose_state(struct pci_dev *pdev)
-{
- return PCI_D3hot;
-}
-
-static int mid_pci_wakeup(struct pci_dev *dev, bool enable)
-{
- return 0;
-}
-
-static bool mid_pci_need_resume(struct pci_dev *dev)
-{
- return false;
-}
-
-static const struct pci_platform_pm_ops mid_pci_platform_pm = {
- .is_manageable = mid_pci_power_manageable,
- .set_state = mid_pci_set_power_state,
- .get_state = mid_pci_get_power_state,
- .choose_state = mid_pci_choose_state,
- .set_wakeup = mid_pci_wakeup,
- .need_resume = mid_pci_need_resume,
-};
-
/*
* This table should be in sync with the one in
* arch/x86/platform/intel-mid/pwr.c.
@@ -71,7 +49,8 @@ static int __init mid_pci_init(void)
id = x86_match_cpu(lpss_cpu_ids);
if (id)
- pci_set_platform_pm(&mid_pci_platform_pm);
+ pci_mid_pm_enabled = true;
+
return 0;
}
arch_initcall(mid_pci_init);
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index ce2ab62b64cf..11c88ffb51d9 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -972,61 +972,67 @@ static void pci_restore_bars(struct pci_dev *dev)
pci_update_resource(dev, i);
}
-static const struct pci_platform_pm_ops *pci_platform_pm;
-
-int pci_set_platform_pm(const struct pci_platform_pm_ops *ops)
-{
- if (!ops->is_manageable || !ops->set_state || !ops->get_state ||
- !ops->choose_state || !ops->set_wakeup || !ops->need_resume)
- return -EINVAL;
- pci_platform_pm = ops;
- return 0;
-}
-
static inline bool platform_pci_power_manageable(struct pci_dev *dev)
{
- return pci_platform_pm ? pci_platform_pm->is_manageable(dev) : false;
+ if (pci_use_mid_pm())
+ return true;
+
+ return acpi_pci_power_manageable(dev);
}
static inline int platform_pci_set_power_state(struct pci_dev *dev,
pci_power_t t)
{
- return pci_platform_pm ? pci_platform_pm->set_state(dev, t) : -ENOSYS;
+ if (pci_use_mid_pm())
+ return mid_pci_set_power_state(dev, t);
+
+ return acpi_pci_set_power_state(dev, t);
}
static inline pci_power_t platform_pci_get_power_state(struct pci_dev *dev)
{
- return pci_platform_pm ? pci_platform_pm->get_state(dev) : PCI_UNKNOWN;
+ if (pci_use_mid_pm())
+ return mid_pci_get_power_state(dev);
+
+ return acpi_pci_get_power_state(dev);
}
static inline void platform_pci_refresh_power_state(struct pci_dev *dev)
{
- if (pci_platform_pm && pci_platform_pm->refresh_state)
- pci_platform_pm->refresh_state(dev);
+ if (!pci_use_mid_pm())
+ acpi_pci_refresh_power_state(dev);
}
static inline pci_power_t platform_pci_choose_state(struct pci_dev *dev)
{
- return pci_platform_pm ?
- pci_platform_pm->choose_state(dev) : PCI_POWER_ERROR;
+ if (pci_use_mid_pm())
+ return PCI_POWER_ERROR;
+
+ return acpi_pci_choose_state(dev);
}
static inline int platform_pci_set_wakeup(struct pci_dev *dev, bool enable)
{
- return pci_platform_pm ?
- pci_platform_pm->set_wakeup(dev, enable) : -ENODEV;
+ if (pci_use_mid_pm())
+ return PCI_POWER_ERROR;
+
+ return acpi_pci_wakeup(dev, enable);
}
static inline bool platform_pci_need_resume(struct pci_dev *dev)
{
- return pci_platform_pm ? pci_platform_pm->need_resume(dev) : false;
+ if (pci_use_mid_pm())
+ return false;
+
+ return acpi_pci_need_resume(dev);
}
static inline bool platform_pci_bridge_d3(struct pci_dev *dev)
{
- if (pci_platform_pm && pci_platform_pm->bridge_d3)
- return pci_platform_pm->bridge_d3(dev);
- return false;
+ if (pci_use_mid_pm())
+ return false;
+
+ return acpi_pci_bridge_d3(dev);
}
/**
@@ -1185,9 +1191,7 @@ void pci_update_current_state(struct pci_dev *dev, pci_power_t state)
*/
void pci_refresh_power_state(struct pci_dev *dev)
{
- if (platform_pci_power_manageable(dev))
- platform_pci_refresh_power_state(dev);
-
+ platform_pci_refresh_power_state(dev);
pci_update_current_state(dev, dev->current_state);
}
@@ -1200,14 +1204,10 @@ int pci_platform_power_transition(struct pci_dev *dev, pci_power_t state)
{
int error;
- if (platform_pci_power_manageable(dev)) {
- error = platform_pci_set_power_state(dev, state);
- if (!error)
- pci_update_current_state(dev, state);
- } else
- error = -ENODEV;
-
- if (error && !dev->pm_cap) /* Fall back to PCI_D0 */
+ error = platform_pci_set_power_state(dev, state);
+ if (!error)
+ pci_update_current_state(dev, state);
+ else if (!dev->pm_cap) /* Fall back to PCI_D0 */
dev->current_state = PCI_D0;
return error;
@@ -1388,44 +1388,6 @@ int pci_set_power_state(struct pci_dev *dev, pci_power_t state)
}
EXPORT_SYMBOL(pci_set_power_state);
-/**
- * pci_choose_state - Choose the power state of a PCI device
- * @dev: PCI device to be suspended
- * @state: target sleep state for the whole system. This is the value
- * that is passed to suspend() function.
- *
- * Returns PCI power state suitable for given device and given system
- * message.
- */
-pci_power_t pci_choose_state(struct pci_dev *dev, pm_message_t state)
-{
- pci_power_t ret;
-
- if (!dev->pm_cap)
- return PCI_D0;
-
- ret = platform_pci_choose_state(dev);
- if (ret != PCI_POWER_ERROR)
- return ret;
-
- switch (state.event) {
- case PM_EVENT_ON:
- return PCI_D0;
- case PM_EVENT_FREEZE:
- case PM_EVENT_PRETHAW:
- /* REVISIT both freeze and pre-thaw "should" use D0 */
- case PM_EVENT_SUSPEND:
- case PM_EVENT_HIBERNATE:
- return PCI_D3hot;
- default:
- pci_info(dev, "unrecognized suspend event %d\n",
- state.event);
- BUG();
- }
- return PCI_D0;
-}
-EXPORT_SYMBOL(pci_choose_state);
-
#define PCI_EXP_SAVE_REGS 7
static struct pci_cap_saved_state *_pci_find_saved_cap(struct pci_dev *pci_dev,
@@ -2577,8 +2539,6 @@ EXPORT_SYMBOL(pci_wake_from_d3);
*/
static pci_power_t pci_target_state(struct pci_dev *dev, bool wakeup)
{
- pci_power_t target_state = PCI_D3hot;
-
if (platform_pci_power_manageable(dev)) {
/*
* Call the platform to find the target state for the device.
@@ -2588,32 +2548,29 @@ static pci_power_t pci_target_state(struct pci_dev *dev, bool wakeup)
switch (state) {
case PCI_POWER_ERROR:
case PCI_UNKNOWN:
- break;
+ return PCI_D3hot;
+
case PCI_D1:
case PCI_D2:
if (pci_no_d1d2(dev))
- break;
- fallthrough;
- default:
- target_state = state;
+ return PCI_D3hot;
}
- return target_state;
+ return state;
}
- if (!dev->pm_cap)
- target_state = PCI_D0;
-
/*
* If the device is in D3cold even though it's not power-manageable by
* the platform, it may have been powered down by non-standard means.
* Best to let it slumber.
*/
if (dev->current_state == PCI_D3cold)
- target_state = PCI_D3cold;
+ return PCI_D3cold;
+ else if (!dev->pm_cap)
+ return PCI_D0;
if (wakeup && dev->pme_support) {
- pci_power_t state = target_state;
+ pci_power_t state = PCI_D3hot;
/*
* Find the deepest state from which the device can generate
@@ -2628,7 +2585,7 @@ static pci_power_t pci_target_state(struct pci_dev *dev, bool wakeup)
return PCI_D0;
}
- return target_state;
+ return PCI_D3hot;
}
/**
@@ -2681,8 +2638,13 @@ EXPORT_SYMBOL(pci_prepare_to_sleep);
*/
int pci_back_from_sleep(struct pci_dev *dev)
{
+ int ret = pci_set_power_state(dev, PCI_D0);
+
+ if (ret)
+ return ret;
+
pci_enable_wake(dev, PCI_D0, false);
- return pci_set_power_state(dev, PCI_D0);
+ return 0;
}
EXPORT_SYMBOL(pci_back_from_sleep);
@@ -2842,6 +2804,22 @@ void pci_dev_complete_resume(struct pci_dev *pci_dev)
spin_unlock_irq(&dev->power.lock);
}
+/**
+ * pci_choose_state - Choose the power state of a PCI device.
+ * @dev: Target PCI device.
+ * @state: Target state for the whole system.
+ *
+ * Returns PCI power state suitable for @dev and @state.
+ */
+pci_power_t pci_choose_state(struct pci_dev *dev, pm_message_t state)
+{
+ if (state.event == PM_EVENT_ON)
+ return PCI_D0;
+
+ return pci_target_state(dev, false);
+}
+EXPORT_SYMBOL(pci_choose_state);
+
void pci_config_pm_runtime_get(struct pci_dev *pdev)
{
struct device *dev = &pdev->dev;
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 1cce56c2aea0..bd39098c57da 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -63,45 +63,6 @@ struct pci_cap_saved_state *pci_find_saved_ext_cap(struct pci_dev *dev,
#define PCI_PM_D3HOT_WAIT 10 /* msec */
#define PCI_PM_D3COLD_WAIT 100 /* msec */
-/**
- * struct pci_platform_pm_ops - Firmware PM callbacks
- *
- * @bridge_d3: Does the bridge allow entering into D3
- *
- * @is_manageable: returns 'true' if given device is power manageable by the
- * platform firmware
- *
- * @set_state: invokes the platform firmware to set the device's power state
- *
- * @get_state: queries the platform firmware for a device's current power state
- *
- * @refresh_state: asks the platform to refresh the device's power state data
- *
- * @choose_state: returns PCI power state of given device preferred by the
- * platform; to be used during system-wide transitions from a
- * sleeping state to the working state and vice versa
- *
- * @set_wakeup: enables/disables wakeup capability for the device
- *
- * @need_resume: returns 'true' if the given device (which is currently
- * suspended) needs to be resumed to be configured for system
- * wakeup.
- *
- * If given platform is generally capable of power managing PCI devices, all of
- * these callbacks are mandatory.
- */
-struct pci_platform_pm_ops {
- bool (*bridge_d3)(struct pci_dev *dev);
- bool (*is_manageable)(struct pci_dev *dev);
- int (*set_state)(struct pci_dev *dev, pci_power_t state);
- pci_power_t (*get_state)(struct pci_dev *dev);
- void (*refresh_state)(struct pci_dev *dev);
- pci_power_t (*choose_state)(struct pci_dev *dev);
- int (*set_wakeup)(struct pci_dev *dev, bool enable);
- bool (*need_resume)(struct pci_dev *dev);
-};
-
-int pci_set_platform_pm(const struct pci_platform_pm_ops *ops);
void pci_update_current_state(struct pci_dev *dev, pci_power_t state);
void pci_refresh_power_state(struct pci_dev *dev);
int pci_power_up(struct pci_dev *dev);
@@ -725,17 +686,53 @@ int pci_acpi_program_hp_params(struct pci_dev *dev);
extern const struct attribute_group pci_dev_acpi_attr_group;
void pci_set_acpi_fwnode(struct pci_dev *dev);
int pci_dev_acpi_reset(struct pci_dev *dev, bool probe);
+bool acpi_pci_power_manageable(struct pci_dev *dev);
+bool acpi_pci_bridge_d3(struct pci_dev *dev);
+int acpi_pci_set_power_state(struct pci_dev *dev, pci_power_t state);
+pci_power_t acpi_pci_get_power_state(struct pci_dev *dev);
+void acpi_pci_refresh_power_state(struct pci_dev *dev);
+int acpi_pci_wakeup(struct pci_dev *dev, bool enable);
+bool acpi_pci_need_resume(struct pci_dev *dev);
+pci_power_t acpi_pci_choose_state(struct pci_dev *pdev);
#else
static inline int pci_dev_acpi_reset(struct pci_dev *dev, bool probe)
{
return -ENOTTY;
}
-
static inline void pci_set_acpi_fwnode(struct pci_dev *dev) {}
static inline int pci_acpi_program_hp_params(struct pci_dev *dev)
{
return -ENODEV;
}
+static inline bool acpi_pci_power_manageable(struct pci_dev *dev)
+{
+ return false;
+}
+static inline bool acpi_pci_bridge_d3(struct pci_dev *dev)
+{
+ return false;
+}
+static inline int acpi_pci_set_power_state(struct pci_dev *dev, pci_power_t state)
+{
+ return -ENODEV;
+}
+static inline pci_power_t acpi_pci_get_power_state(struct pci_dev *dev)
+{
+ return PCI_UNKNOWN;
+}
+static inline void acpi_pci_refresh_power_state(struct pci_dev *dev) {}
+static inline int acpi_pci_wakeup(struct pci_dev *dev, bool enable)
+{
+ return -ENODEV;
+}
+static inline bool acpi_pci_need_resume(struct pci_dev *dev)
+{
+ return false;
+}
+static inline pci_power_t acpi_pci_choose_state(struct pci_dev *pdev)
+{
+ return PCI_POWER_ERROR;
+}
#endif
#ifdef CONFIG_PCIEASPM
@@ -744,4 +741,23 @@ extern const struct attribute_group aspm_ctrl_attr_group;
extern const struct attribute_group pci_dev_reset_method_attr_group;
+#ifdef CONFIG_X86_INTEL_MID
+bool pci_use_mid_pm(void);
+int mid_pci_set_power_state(struct pci_dev *pdev, pci_power_t state);
+pci_power_t mid_pci_get_power_state(struct pci_dev *pdev);
+#else
+static inline bool pci_use_mid_pm(void)
+{
+ return false;
+}
+static inline int mid_pci_set_power_state(struct pci_dev *pdev, pci_power_t state)
+{
+ return -ENODEV;
+}
+static inline pci_power_t mid_pci_get_power_state(struct pci_dev *pdev)
+{
+ return PCI_UNKNOWN;
+}
+#endif
+
#endif /* DRIVERS_PCI_H */
diff --git a/drivers/powercap/dtpm.c b/drivers/powercap/dtpm.c
index c2185ec5f887..b9fac786246a 100644
--- a/drivers/powercap/dtpm.c
+++ b/drivers/powercap/dtpm.c
@@ -116,8 +116,6 @@ static void __dtpm_sub_power(struct dtpm *dtpm)
parent->power_limit -= dtpm->power_limit;
parent = parent->parent;
}
-
- __dtpm_rebalance_weight(root);
}
static void __dtpm_add_power(struct dtpm *dtpm)
@@ -130,45 +128,45 @@ static void __dtpm_add_power(struct dtpm *dtpm)
parent->power_limit += dtpm->power_limit;
parent = parent->parent;
}
+}
+
+static int __dtpm_update_power(struct dtpm *dtpm)
+{
+ int ret;
+
+ __dtpm_sub_power(dtpm);
+
+ ret = dtpm->ops->update_power_uw(dtpm);
+ if (ret)
+ pr_err("Failed to update power for '%s': %d\n",
+ dtpm->zone.name, ret);
- __dtpm_rebalance_weight(root);
+ if (!test_bit(DTPM_POWER_LIMIT_FLAG, &dtpm->flags))
+ dtpm->power_limit = dtpm->power_max;
+
+ __dtpm_add_power(dtpm);
+
+ if (root)
+ __dtpm_rebalance_weight(root);
+
+ return ret;
}
/**
* dtpm_update_power - Update the power on the dtpm
* @dtpm: a pointer to a dtpm structure to update
- * @power_min: a u64 representing the new power_min value
- * @power_max: a u64 representing the new power_max value
*
* Function to update the power values of the dtpm node specified in
* parameter. These new values will be propagated to the tree.
*
* Return: zero on success, -EINVAL if the values are inconsistent
*/
-int dtpm_update_power(struct dtpm *dtpm, u64 power_min, u64 power_max)
+int dtpm_update_power(struct dtpm *dtpm)
{
- int ret = 0;
+ int ret;
mutex_lock(&dtpm_lock);
-
- if (power_min == dtpm->power_min && power_max == dtpm->power_max)
- goto unlock;
-
- if (power_max < power_min) {
- ret = -EINVAL;
- goto unlock;
- }
-
- __dtpm_sub_power(dtpm);
-
- dtpm->power_min = power_min;
- dtpm->power_max = power_max;
- if (!test_bit(DTPM_POWER_LIMIT_FLAG, &dtpm->flags))
- dtpm->power_limit = power_max;
-
- __dtpm_add_power(dtpm);
-
-unlock:
+ ret = __dtpm_update_power(dtpm);
mutex_unlock(&dtpm_lock);
return ret;
@@ -359,24 +357,18 @@ static struct powercap_zone_ops zone_ops = {
};
/**
- * dtpm_alloc - Allocate and initialize a dtpm struct
- * @name: a string specifying the name of the node
- *
- * Return: a struct dtpm pointer, NULL in case of error
+ * dtpm_init - Allocate and initialize a dtpm struct
+ * @dtpm: The dtpm struct pointer to be initialized
+ * @ops: The dtpm device specific ops, NULL for a virtual node
*/
-struct dtpm *dtpm_alloc(struct dtpm_ops *ops)
+void dtpm_init(struct dtpm *dtpm, struct dtpm_ops *ops)
{
- struct dtpm *dtpm;
-
- dtpm = kzalloc(sizeof(*dtpm), GFP_KERNEL);
if (dtpm) {
INIT_LIST_HEAD(&dtpm->children);
INIT_LIST_HEAD(&dtpm->sibling);
dtpm->weight = 1024;
dtpm->ops = ops;
}
-
- return dtpm;
}
/**
@@ -436,6 +428,7 @@ int dtpm_register(const char *name, struct dtpm *dtpm, struct dtpm *parent)
if (dtpm->ops && !(dtpm->ops->set_power_uw &&
dtpm->ops->get_power_uw &&
+ dtpm->ops->update_power_uw &&
dtpm->ops->release))
return -EINVAL;
@@ -455,7 +448,10 @@ int dtpm_register(const char *name, struct dtpm *dtpm, struct dtpm *parent)
root = dtpm;
}
- __dtpm_add_power(dtpm);
+ if (dtpm->ops && !dtpm->ops->update_power_uw(dtpm)) {
+ __dtpm_add_power(dtpm);
+ dtpm->power_limit = dtpm->power_max;
+ }
pr_info("Registered dtpm node '%s' / %llu-%llu uW, \n",
dtpm->zone.name, dtpm->power_min, dtpm->power_max);
@@ -465,9 +461,9 @@ int dtpm_register(const char *name, struct dtpm *dtpm, struct dtpm *parent)
return 0;
}
-static int __init dtpm_init(void)
+static int __init init_dtpm(void)
{
- struct dtpm_descr **dtpm_descr;
+ struct dtpm_descr *dtpm_descr;
pct = powercap_register_control_type(NULL, "dtpm", NULL);
if (IS_ERR(pct)) {
@@ -476,8 +472,8 @@ static int __init dtpm_init(void)
}
for_each_dtpm_table(dtpm_descr)
- (*dtpm_descr)->init(*dtpm_descr);
+ dtpm_descr->init();
return 0;
}
-late_initcall(dtpm_init);
+late_initcall(init_dtpm);
diff --git a/drivers/powercap/dtpm_cpu.c b/drivers/powercap/dtpm_cpu.c
index 51c366938acd..44faa3a74db6 100644
--- a/drivers/powercap/dtpm_cpu.c
+++ b/drivers/powercap/dtpm_cpu.c
@@ -14,6 +14,8 @@
* The CPU hotplug is supported and the power numbers will be updated
* if a CPU is hot plugged / unplugged.
*/
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
#include <linux/cpumask.h>
#include <linux/cpufreq.h>
#include <linux/cpuhotplug.h>
@@ -23,66 +25,29 @@
#include <linux/slab.h>
#include <linux/units.h>
-static struct dtpm *__parent;
-
-static DEFINE_PER_CPU(struct dtpm *, dtpm_per_cpu);
-
struct dtpm_cpu {
+ struct dtpm dtpm;
struct freq_qos_request qos_req;
int cpu;
};
-/*
- * When a new CPU is inserted at hotplug or boot time, add the power
- * contribution and update the dtpm tree.
- */
-static int power_add(struct dtpm *dtpm, struct em_perf_domain *em)
-{
- u64 power_min, power_max;
-
- power_min = em->table[0].power;
- power_min *= MICROWATT_PER_MILLIWATT;
- power_min += dtpm->power_min;
+static DEFINE_PER_CPU(struct dtpm_cpu *, dtpm_per_cpu);
- power_max = em->table[em->nr_perf_states - 1].power;
- power_max *= MICROWATT_PER_MILLIWATT;
- power_max += dtpm->power_max;
-
- return dtpm_update_power(dtpm, power_min, power_max);
-}
-
-/*
- * When a CPU is unplugged, remove its power contribution from the
- * dtpm tree.
- */
-static int power_sub(struct dtpm *dtpm, struct em_perf_domain *em)
+static struct dtpm_cpu *to_dtpm_cpu(struct dtpm *dtpm)
{
- u64 power_min, power_max;
-
- power_min = em->table[0].power;
- power_min *= MICROWATT_PER_MILLIWATT;
- power_min = dtpm->power_min - power_min;
-
- power_max = em->table[em->nr_perf_states - 1].power;
- power_max *= MICROWATT_PER_MILLIWATT;
- power_max = dtpm->power_max - power_max;
-
- return dtpm_update_power(dtpm, power_min, power_max);
+ return container_of(dtpm, struct dtpm_cpu, dtpm);
}
static u64 set_pd_power_limit(struct dtpm *dtpm, u64 power_limit)
{
- struct dtpm_cpu *dtpm_cpu = dtpm->private;
- struct em_perf_domain *pd;
+ struct dtpm_cpu *dtpm_cpu = to_dtpm_cpu(dtpm);
+ struct em_perf_domain *pd = em_cpu_get(dtpm_cpu->cpu);
struct cpumask cpus;
unsigned long freq;
u64 power;
int i, nr_cpus;
- pd = em_cpu_get(dtpm_cpu->cpu);
-
cpumask_and(&cpus, cpu_online_mask, to_cpumask(pd->cpus));
-
nr_cpus = cpumask_weight(&cpus);
for (i = 0; i < pd->nr_perf_states; i++) {
@@ -103,34 +68,88 @@ static u64 set_pd_power_limit(struct dtpm *dtpm, u64 power_limit)
return power_limit;
}
+static u64 scale_pd_power_uw(struct cpumask *pd_mask, u64 power)
+{
+ unsigned long max = 0, sum_util = 0;
+ int cpu;
+
+ for_each_cpu_and(cpu, pd_mask, cpu_online_mask) {
+
+ /*
+ * The capacity is the same for all CPUs belonging to
+ * the same perf domain, so a single call to
+ * arch_scale_cpu_capacity() is enough. However, we
+ * need the CPU parameter to be initialized by the
+ * loop, so the call ends up in this block.
+ *
+ * We can initialize 'max' with a cpumask_first() call
+ * before the loop but the bits computation is not
+ * worth given the arch_scale_cpu_capacity() just
+ * returns a value where the resulting assembly code
+ * will be optimized by the compiler.
+ */
+ max = arch_scale_cpu_capacity(cpu);
+ sum_util += sched_cpu_util(cpu, max);
+ }
+
+ /*
+ * In the improbable case where all the CPUs of the perf
+ * domain are offline, 'max' will be zero and will lead to an
+ * illegal operation with a zero division.
+ */
+ return max ? (power * ((sum_util << 10) / max)) >> 10 : 0;
+}
+
static u64 get_pd_power_uw(struct dtpm *dtpm)
{
- struct dtpm_cpu *dtpm_cpu = dtpm->private;
+ struct dtpm_cpu *dtpm_cpu = to_dtpm_cpu(dtpm);
struct em_perf_domain *pd;
- struct cpumask cpus;
+ struct cpumask *pd_mask;
unsigned long freq;
- int i, nr_cpus;
+ int i;
pd = em_cpu_get(dtpm_cpu->cpu);
+
+ pd_mask = em_span_cpus(pd);
+
freq = cpufreq_quick_get(dtpm_cpu->cpu);
- cpumask_and(&cpus, cpu_online_mask, to_cpumask(pd->cpus));
- nr_cpus = cpumask_weight(&cpus);
for (i = 0; i < pd->nr_perf_states; i++) {
if (pd->table[i].frequency < freq)
continue;
- return pd->table[i].power *
- MICROWATT_PER_MILLIWATT * nr_cpus;
+ return scale_pd_power_uw(pd_mask, pd->table[i].power *
+ MICROWATT_PER_MILLIWATT);
}
return 0;
}
+static int update_pd_power_uw(struct dtpm *dtpm)
+{
+ struct dtpm_cpu *dtpm_cpu = to_dtpm_cpu(dtpm);
+ struct em_perf_domain *em = em_cpu_get(dtpm_cpu->cpu);
+ struct cpumask cpus;
+ int nr_cpus;
+
+ cpumask_and(&cpus, cpu_online_mask, to_cpumask(em->cpus));
+ nr_cpus = cpumask_weight(&cpus);
+
+ dtpm->power_min = em->table[0].power;
+ dtpm->power_min *= MICROWATT_PER_MILLIWATT;
+ dtpm->power_min *= nr_cpus;
+
+ dtpm->power_max = em->table[em->nr_perf_states - 1].power;
+ dtpm->power_max *= MICROWATT_PER_MILLIWATT;
+ dtpm->power_max *= nr_cpus;
+
+ return 0;
+}
+
static void pd_release(struct dtpm *dtpm)
{
- struct dtpm_cpu *dtpm_cpu = dtpm->private;
+ struct dtpm_cpu *dtpm_cpu = to_dtpm_cpu(dtpm);
if (freq_qos_request_active(&dtpm_cpu->qos_req))
freq_qos_remove_request(&dtpm_cpu->qos_req);
@@ -139,44 +158,28 @@ static void pd_release(struct dtpm *dtpm)
}
static struct dtpm_ops dtpm_ops = {
- .set_power_uw = set_pd_power_limit,
- .get_power_uw = get_pd_power_uw,
- .release = pd_release,
+ .set_power_uw = set_pd_power_limit,
+ .get_power_uw = get_pd_power_uw,
+ .update_power_uw = update_pd_power_uw,
+ .release = pd_release,
};
static int cpuhp_dtpm_cpu_offline(unsigned int cpu)
{
- struct cpufreq_policy *policy;
struct em_perf_domain *pd;
- struct dtpm *dtpm;
-
- policy = cpufreq_cpu_get(cpu);
-
- if (!policy)
- return 0;
+ struct dtpm_cpu *dtpm_cpu;
pd = em_cpu_get(cpu);
if (!pd)
return -EINVAL;
- dtpm = per_cpu(dtpm_per_cpu, cpu);
+ dtpm_cpu = per_cpu(dtpm_per_cpu, cpu);
- power_sub(dtpm, pd);
-
- if (cpumask_weight(policy->cpus) != 1)
- return 0;
-
- for_each_cpu(cpu, policy->related_cpus)
- per_cpu(dtpm_per_cpu, cpu) = NULL;
-
- dtpm_unregister(dtpm);
-
- return 0;
+ return dtpm_update_power(&dtpm_cpu->dtpm);
}
static int cpuhp_dtpm_cpu_online(unsigned int cpu)
{
- struct dtpm *dtpm;
struct dtpm_cpu *dtpm_cpu;
struct cpufreq_policy *policy;
struct em_perf_domain *pd;
@@ -184,7 +187,6 @@ static int cpuhp_dtpm_cpu_online(unsigned int cpu)
int ret = -ENOMEM;
policy = cpufreq_cpu_get(cpu);
-
if (!policy)
return 0;
@@ -192,66 +194,82 @@ static int cpuhp_dtpm_cpu_online(unsigned int cpu)
if (!pd)
return -EINVAL;
- dtpm = per_cpu(dtpm_per_cpu, cpu);
- if (dtpm)
- return power_add(dtpm, pd);
-
- dtpm = dtpm_alloc(&dtpm_ops);
- if (!dtpm)
- return -EINVAL;
+ dtpm_cpu = per_cpu(dtpm_per_cpu, cpu);
+ if (dtpm_cpu)
+ return dtpm_update_power(&dtpm_cpu->dtpm);
dtpm_cpu = kzalloc(sizeof(*dtpm_cpu), GFP_KERNEL);
if (!dtpm_cpu)
- goto out_kfree_dtpm;
+ return -ENOMEM;
- dtpm->private = dtpm_cpu;
+ dtpm_init(&dtpm_cpu->dtpm, &dtpm_ops);
dtpm_cpu->cpu = cpu;
for_each_cpu(cpu, policy->related_cpus)
- per_cpu(dtpm_per_cpu, cpu) = dtpm;
+ per_cpu(dtpm_per_cpu, cpu) = dtpm_cpu;
- sprintf(name, "cpu%d", dtpm_cpu->cpu);
+ snprintf(name, sizeof(name), "cpu%d-cpufreq", dtpm_cpu->cpu);
- ret = dtpm_register(name, dtpm, __parent);
+ ret = dtpm_register(name, &dtpm_cpu->dtpm, NULL);
if (ret)
goto out_kfree_dtpm_cpu;
- ret = power_add(dtpm, pd);
- if (ret)
- goto out_dtpm_unregister;
-
ret = freq_qos_add_request(&policy->constraints,
&dtpm_cpu->qos_req, FREQ_QOS_MAX,
pd->table[pd->nr_perf_states - 1].frequency);
if (ret)
- goto out_power_sub;
+ goto out_dtpm_unregister;
return 0;
-out_power_sub:
- power_sub(dtpm, pd);
-
out_dtpm_unregister:
- dtpm_unregister(dtpm);
+ dtpm_unregister(&dtpm_cpu->dtpm);
dtpm_cpu = NULL;
- dtpm = NULL;
out_kfree_dtpm_cpu:
for_each_cpu(cpu, policy->related_cpus)
per_cpu(dtpm_per_cpu, cpu) = NULL;
kfree(dtpm_cpu);
-out_kfree_dtpm:
- kfree(dtpm);
return ret;
}
-int dtpm_register_cpu(struct dtpm *parent)
+static int __init dtpm_cpu_init(void)
{
- __parent = parent;
+ int ret;
+
+ /*
+ * The callbacks at CPU hotplug time are calling
+ * dtpm_update_power() which in turns calls update_pd_power().
+ *
+ * The function update_pd_power() uses the online mask to
+ * figure out the power consumption limits.
+ *
+ * At CPUHP_AP_ONLINE_DYN, the CPU is present in the CPU
+ * online mask when the cpuhp_dtpm_cpu_online function is
+ * called, but the CPU is still in the online mask for the
+ * tear down callback. So the power can not be updated when
+ * the CPU is unplugged.
+ *
+ * At CPUHP_AP_DTPM_CPU_DEAD, the situation is the opposite as
+ * above. The CPU online mask is not up to date when the CPU
+ * is plugged in.
+ *
+ * For this reason, we need to call the online and offline
+ * callbacks at different moments when the CPU online mask is
+ * consistent with the power numbers we want to update.
+ */
+ ret = cpuhp_setup_state(CPUHP_AP_DTPM_CPU_DEAD, "dtpm_cpu:offline",
+ NULL, cpuhp_dtpm_cpu_offline);
+ if (ret < 0)
+ return ret;
+
+ ret = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "dtpm_cpu:online",
+ cpuhp_dtpm_cpu_online, NULL);
+ if (ret < 0)
+ return ret;
- return cpuhp_setup_state(CPUHP_AP_DTPM_CPU_ONLINE,
- "dtpm_cpu:online",
- cpuhp_dtpm_cpu_online,
- cpuhp_dtpm_cpu_offline);
+ return 0;
}
+
+DTPM_DECLARE(dtpm_cpu, dtpm_cpu_init);
diff --git a/drivers/usb/host/xhci-mtk.c b/drivers/usb/host/xhci-mtk.c
index c53f6f276d5c..58a0eae4f41b 100644
--- a/drivers/usb/host/xhci-mtk.c
+++ b/drivers/usb/host/xhci-mtk.c
@@ -602,7 +602,7 @@ static int xhci_mtk_probe(struct platform_device *pdev)
goto dealloc_usb2_hcd;
if (wakeup_irq > 0) {
- ret = dev_pm_set_dedicated_wake_irq(dev, wakeup_irq);
+ ret = dev_pm_set_dedicated_wake_irq_reverse(dev, wakeup_irq);
if (ret) {
dev_err(dev, "set wakeup irq %d failed\n", wakeup_irq);
goto dealloc_usb3_hcd;
diff --git a/drivers/usb/mtu3/mtu3_plat.c b/drivers/usb/mtu3/mtu3_plat.c
index f13531022f4a..4309ed939178 100644
--- a/drivers/usb/mtu3/mtu3_plat.c
+++ b/drivers/usb/mtu3/mtu3_plat.c
@@ -337,7 +337,7 @@ static int mtu3_probe(struct platform_device *pdev)
goto comm_init_err;
if (ssusb->wakeup_irq > 0) {
- ret = dev_pm_set_dedicated_wake_irq(dev, ssusb->wakeup_irq);
+ ret = dev_pm_set_dedicated_wake_irq_reverse(dev, ssusb->wakeup_irq);
if (ret) {
dev_err(dev, "failed to set wakeup irq %d\n", ssusb->wakeup_irq);
goto comm_exit;