lwn.git - Linux kernel documentation tree maintained by Jonathan Corbet

Age	Commit message (Collapse)	Author
2015-07-10	intel_pstate: set BYT MSR with wrmsrl_on_cpu()	Joe Konno
	commit 0dd23f94251f49da99a6cbfb22418b2d757d77d6 upstream. Commit 007bea098b86 (intel_pstate: Add setting voltage value for baytrail P states.) introduced byt_set_pstate() with the assumption that it would always be run by the CPU whose MSR is to be written by it. It turns out, however, that is not always the case in practice, so modify byt_set_pstate() to enforce the MSR write done by it to always happen on the right CPU. Fixes: 007bea098b86 (intel_pstate: Add setting voltage value for baytrail P states.) Signed-off-by: Joe Konno <joe.konno@intel.com> Acked-by: Kristen Carlson Accardi <kristen@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-06	cpufreq: s3c: remove incorrect __init annotations	Arnd Bergmann
	commit 61882b63171736571e1139ab5aa929e3bb336016 upstream. The two functions s3c2416_cpufreq_driver_init and s3c_cpufreq_register are marked init but are called from a context that might be run after the __init sections are discarded, as the compiler points out: WARNING: vmlinux.o(.data+0x1ad9dc): Section mismatch in reference from the variable s3c2416_cpufreq_driver to the function .init.text:s3c2416_cpufreq_driver_init() WARNING: drivers/built-in.o(.text+0x35b5dc): Section mismatch in reference from the function s3c2410a_cpufreq_add() to the function .init.text:s3c_cpufreq_register() This removes the __init markings. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-06	cpufreq: speedstep-smi: enable interrupts when waiting	Mikulas Patocka
	commit d4d4eda23794c701442e55129dd4f8f2fefd5e4d upstream. On Dell Latitude C600 laptop with Pentium 3 850MHz processor, the speedstep-smi driver sometimes loads and sometimes doesn't load with "change to state X failed" message. The hardware sometimes refuses to change frequency and in this case, we need to retry later. I found out that we need to enable interrupts while waiting. When we enable interrupts, the hardware blockage that prevents frequency transition resolves and the transition is possible. With disabled interrupts, the blockage doesn't resolve (no matter how long do we wait). The exact reasons for this hardware behavior are unknown. This patch enables interrupts in the function speedstep_set_state that can be called with disabled interrupts. However, this function is called with disabled interrupts only from speedstep_get_freqs, so it shouldn't cause any problem. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-06	cpufreq: Set cpufreq_cpu_data to NULL before putting kobject	Viresh Kumar
	commit 6ffae8c06fab058d6c3f8ecb7f921327721034e7 upstream. In __cpufreq_remove_dev_finish(), per-cpu 'cpufreq_cpu_data' needs to be cleared before calling kobject_put(&policy->kobj) and under cpufreq_driver_lock. Otherwise, if someone else calls cpufreq_cpu_get() in parallel with it, they can obtain a non-NULL policy from that after kobject_put(&policy->kobj) was executed. Consider this case: Thread A Thread B cpufreq_cpu_get() acquire cpufreq_driver_lock read-per-cpu cpufreq_cpu_data kobject_put(&policy->kobj); kobject_get(&policy->kobj); ... per_cpu(&cpufreq_cpu_data, cpu) = NULL And this will result in a warning like this one: ------------[ cut here ]------------ WARNING: CPU: 0 PID: 4 at include/linux/kref.h:47 kobject_get+0x41/0x50() Modules linked in: acpi_cpufreq(+) nfsd auth_rpcgss nfs_acl lockd grace sunrpc xfs libcrc32c sd_mod ixgbe igb mdio ahci hwmon ... Call Trace: [<ffffffff81661b14>] dump_stack+0x46/0x58 [<ffffffff81072b61>] warn_slowpath_common+0x81/0xa0 [<ffffffff81072c7a>] warn_slowpath_null+0x1a/0x20 [<ffffffff812e16d1>] kobject_get+0x41/0x50 [<ffffffff815262a5>] cpufreq_cpu_get+0x75/0xc0 [<ffffffff81527c3e>] cpufreq_update_policy+0x2e/0x1f0 [<ffffffff810b8cb2>] ? up+0x32/0x50 [<ffffffff81381aa9>] ? acpi_ns_get_node+0xcb/0xf2 [<ffffffff81381efd>] ? acpi_evaluate_object+0x22c/0x252 [<ffffffff813824f6>] ? acpi_get_handle+0x95/0xc0 [<ffffffff81360967>] ? acpi_has_method+0x25/0x40 [<ffffffff81391e08>] acpi_processor_ppc_has_changed+0x77/0x82 [<ffffffff81089566>] ? move_linked_works+0x66/0x90 [<ffffffff8138e8ed>] acpi_processor_notify+0x58/0xe7 [<ffffffff8137410c>] acpi_ev_notify_dispatch+0x44/0x5c [<ffffffff8135f293>] acpi_os_execute_deferred+0x15/0x22 [<ffffffff8108c910>] process_one_work+0x160/0x410 [<ffffffff8108d05b>] worker_thread+0x11b/0x520 [<ffffffff8108cf40>] ? rescuer_thread+0x380/0x380 [<ffffffff81092421>] kthread+0xe1/0x100 [<ffffffff81092340>] ? kthread_create_on_node+0x1b0/0x1b0 [<ffffffff81669ebc>] ret_from_fork+0x7c/0xb0 [<ffffffff81092340>] ? kthread_create_on_node+0x1b0/0x1b0 ---[ end trace 89e66eb9795efdf7 ]--- The actual code flow is as follows: Thread A: Workqueue: kacpi_notify acpi_processor_notify() acpi_processor_ppc_has_changed() cpufreq_update_policy() cpufreq_cpu_get() kobject_get() Thread B: xenbus_thread() xenbus_thread() msg->u.watch.handle->callback() handle_vcpu_hotplug_event() vcpu_hotplug() cpu_down() __cpu_notify(CPU_POST_DEAD..) cpufreq_cpu_callback() __cpufreq_remove_dev_finish() cpufreq_policy_put_kobj() kobject_put() cpufreq_cpu_get() gets the policy from per-cpu variable cpufreq_cpu_data under cpufreq_driver_lock, and once it gets a valid policy it expects it to not be freed until cpufreq_cpu_put() is called. But the race happens when another thread puts the kobject first and updates cpufreq_cpu_data before or later. And so the first thread gets a valid policy structure and before it does kobject_get() on it, the second one has already done kobject_put(). Fix this by setting cpufreq_cpu_data to NULL before putting the kobject and that too under locks. Reported-by: Ethan Zhao <ethan.zhao@oracle.com> Reported-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-11-14	intel_pstate: Correct BYT VID values.	Dirk Brandewie
	commit d022a65ed2473fac4a600e3424503dc571160a3e upstream. Using a VID value that is not high enough for the requested P state can cause machine checks. Add a ceiling function to ensure calulated VIDs with fractional values are set to the next highest integer VID value. The algorythm for calculating the non-trubo VID from the BIOS writers guide is: vid_ratio = (vid_max - vid_min) / (max_pstate - min_pstate) vid = ceiling(vid_min + (req_pstate - min_pstate) * vid_ratio) Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-11-14	intel_pstate: Fix BYT frequency reporting	Dirk Brandewie
	commit b27580b05e6f5253228debc60b8ff4a786ff573a upstream. BYT has a different conversion from P state to frequency than the core processors. This causes the min/max and current frequency to be misreported on some BYT SKUs. Tested on BYT N2820, Ivybridge and Haswell processors. Link: https://bugzilla.yoctoproject.org/show_bug.cgi?id=6663 Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-11-14	cpufreq: intel_pstate: Add CPU ID for Braswell processor	Mika Westerberg
	commit 16405f98bca8eb39a23b3ce03e241ca19e7af370 upstream. This is pretty much the same as Intel Baytrail, only the CPU ID is different. Add the new ID to the supported CPU list. Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Dirk Brandewie <dirk.j.brandewie@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Chang Rebecca Swee Fun <rebecca.swee.fun.chang@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-11-14	intel_pstate: Add CPU IDs for Broadwell processors	Dirk Brandewie
	commit c7e241df5970171e3e86a516f91ca8a30ca516e8 upstream. Add support for Broadwell processors. Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Chang Rebecca Swee Fun <rebecca.swee.fun.chang@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-11-14	cpufreq: intel_pstate: Fix setting max_perf_pct in performance policy	Pali Rohár
	commit 36b4bed5cd8f6e17019fa7d380e0836872c7b367 upstream. Code which changes policy to powersave changes also max_policy_pct based on max_freq. Code which change max_perf_pct has upper limit base on value max_policy_pct. When policy is changing from powersave back to performance then max_policy_pct is not changed. Which means that changing max_perf_pct is not possible to high values if max_freq was too low in powersave policy. Test case: $ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_min_freq 800000 $ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq 3300000 $ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor performance $ cat /sys/devices/system/cpu/intel_pstate/max_perf_pct 100 $ echo powersave > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor $ echo 800000 > /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq $ echo 20 > /sys/devices/system/cpu/intel_pstate/max_perf_pct $ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor powersave $ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq 800000 $ cat /sys/devices/system/cpu/intel_pstate/max_perf_pct 20 $ echo performance > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor $ echo 3300000 > /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq $ echo 100 > /sys/devices/system/cpu/intel_pstate/max_perf_pct $ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor performance $ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq 3300000 $ cat /sys/devices/system/cpu/intel_pstate/max_perf_pct 24 And now intel_pstate driver allows to set maximal value for max_perf_pct based on max_policy_pct which is 24 for previous powersave max_freq 800000. This patch will set default value for max_policy_pct when setting policy to performance so it will allow to set also max value for max_perf_pct. Signed-off-by: Pali Rohár <pali.rohar@gmail.com> Acked-by: Dirk Brandewie <dirk.j.brandewie@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-11-14	cpufreq: expose scaling_cur_freq sysfs file for set_policy() drivers	Dirk Brandewie
	commit c034b02e213d271b98c45c4a7b54af8f69aaac1e upstream. Currently the core does not expose scaling_cur_freq for set_policy() drivers this breaks some userspace monitoring tools. Change the core to expose this file for all drivers and if the set_policy() driver supports the get() callback use it to retrieve the current frequency. Link: https://bugzilla.kernel.org/show_bug.cgi?id=73741 Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-09	cpufreq: integrator: fix integrator_cpufreq_remove return type	Arnd Bergmann
	commit d62dbf77f7dfaa6fb455b4b9828069a11965929c upstream. When building this driver as a module, we get a helpful warning about the return type: drivers/cpufreq/integrator-cpufreq.c:232:2: warning: initialization from incompatible pointer type .remove = __exit_p(integrator_cpufreq_remove), If the remove callback returns void, the caller gets an undefined value as it expects an integer to be returned. This fixes the problem by passing down the value from cpufreq_unregister_driver. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-05	cpufreq: release policy->rwsem on error	Prarit Bhargava
	commit 7106e02baed4a72fb23de56b02ad4d31daa74d95 upstream. While debugging a cpufreq-related hardware failure on a system I saw the following lockdep warning: ========================= [ BUG: held lock freed! ] 3.17.0-rc4+ #1 Tainted: G E ------------------------- insmod/2247 is freeing memory ffff88006e1b1400-ffff88006e1b17ff, with a lock still held there! (&policy->rwsem){+.+...}, at: [<ffffffff8156d37d>] __cpufreq_add_dev.isra.21+0x47d/0xb80 3 locks held by insmod/2247: #0: (subsys mutex#5){+.+.+.}, at: [<ffffffff81485579>] subsys_interface_register+0x69/0x120 #1: (cpufreq_rwsem){.+.+.+}, at: [<ffffffff8156cf73>] __cpufreq_add_dev.isra.21+0x73/0xb80 #2: (&policy->rwsem){+.+...}, at: [<ffffffff8156d37d>] __cpufreq_add_dev.isra.21+0x47d/0xb80 stack backtrace: CPU: 0 PID: 2247 Comm: insmod Tainted: G E 3.17.0-rc4+ #1 Hardware name: HP ProLiant MicroServer Gen8, BIOS J06 08/24/2013 0000000000000000 000000008f3063c4 ffff88006f87bb30 ffffffff8171b358 ffff88006bcf3750 ffff88006f87bb68 ffffffff810e09e1 ffff88006e1b1400 ffffea0001b86c00 ffffffff8156d327 ffff880073003500 0000000000000246 Call Trace: [<ffffffff8171b358>] dump_stack+0x4d/0x66 [<ffffffff810e09e1>] debug_check_no_locks_freed+0x171/0x180 [<ffffffff8156d327>] ? __cpufreq_add_dev.isra.21+0x427/0xb80 [<ffffffff8121412b>] kfree+0xab/0x2b0 [<ffffffff8156d327>] __cpufreq_add_dev.isra.21+0x427/0xb80 [<ffffffff81724cf7>] ? _raw_spin_unlock+0x27/0x40 [<ffffffffa003517f>] ? pcc_cpufreq_do_osc+0x17f/0x17f [pcc_cpufreq] [<ffffffff8156da8e>] cpufreq_add_dev+0xe/0x10 [<ffffffff814855d1>] subsys_interface_register+0xc1/0x120 [<ffffffff8156bcf2>] cpufreq_register_driver+0x112/0x340 [<ffffffff8121415a>] ? kfree+0xda/0x2b0 [<ffffffffa003517f>] ? pcc_cpufreq_do_osc+0x17f/0x17f [pcc_cpufreq] [<ffffffffa003562e>] pcc_cpufreq_init+0x4af/0xe81 [pcc_cpufreq] [<ffffffffa003517f>] ? pcc_cpufreq_do_osc+0x17f/0x17f [pcc_cpufreq] [<ffffffff81002144>] do_one_initcall+0xd4/0x210 [<ffffffff811f7472>] ? __vunmap+0xd2/0x120 [<ffffffff81127155>] load_module+0x1315/0x1b70 [<ffffffff811222a0>] ? store_uevent+0x70/0x70 [<ffffffff811229d9>] ? copy_module_from_fd.isra.44+0x129/0x180 [<ffffffff81127b86>] SyS_finit_module+0xa6/0xd0 [<ffffffff81725b69>] system_call_fastpath+0x16/0x1b cpufreq: __cpufreq_add_dev: ->get() failed insmod: ERROR: could not insert module pcc-cpufreq.ko: No such device The warning occurs in the __cpufreq_add_dev() code which does down_write(&policy->rwsem); ... if (cpufreq_driver->get && !cpufreq_driver->setpolicy) { policy->cur = cpufreq_driver->get(policy->cpu); if (!policy->cur) { pr_err("%s: ->get() failed\n", __func__); goto err_get_freq; } If cpufreq_driver->get(policy->cpu) returns an error we execute the code at err_get_freq, which does not up the policy->rwsem. This causes the lockdep warning. Trivial patch to up the policy->rwsem in the error path. After the patch has been applied, and an error occurs in the cpufreq_driver->get(policy->cpu) call we will now see cpufreq: __cpufreq_add_dev: ->get() failed cpufreq: __cpufreq_add_dev: ->get() failed modprobe: ERROR: could not insert 'pcc_cpufreq': No such device Fixes: 4e97b631f24c (cpufreq: Initialize governor for a new policy under policy->rwsem) Signed-off-by: Prarit Bhargava <prarit@redhat.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-08-07	cpufreq: move policy kobj to policy->cpu at resume	Viresh Kumar
	commit 92c14bd9477a20a83144f08c0ca25b0308bf0730 upstream. This is only relevant to implementations with multiple clusters, where clusters have separate clock lines but all CPUs within a cluster share it. Consider a dual cluster platform with 2 cores per cluster. During suspend we start hot unplugging CPUs in order 1 to 3. When CPU2 is removed, policy->kobj would be moved to CPU3 and when CPU3 goes down we wouldn't free policy or its kobj as we want to retain permissions/values/etc. Now on resume, we will get CPU2 before CPU3 and will call __cpufreq_add_dev(). We will recover the old policy and update policy->cpu from 3 to 2 from update_policy_cpu(). But the kobj is still tied to CPU3 and isn't moved to CPU2. We wouldn't create a link for CPU2, but would try that for CPU3 while bringing it online. Which will report errors as CPU3 already has kobj assigned to it. This bug got introduced with commit 42f921a, which overlooked this scenario. To fix this, lets move kobj to the new policy->cpu while bringing first CPU of a cluster back. Also do a WARN_ON() if kobject_move failed, as we would reach here only for the first CPU of a non-boot cluster. And we can't recover from this situation, if kobject_move() fails. Fixes: 42f921a6f10c (cpufreq: remove sysfs files for CPUs which failed to come back after resume) Cc: 3.13+ <stable@vger.kernel.org> # 3.13+ Reported-and-tested-by: Bu Yitian <ybu@qti.qualcomm.com> Reported-by: Saravana Kannan <skannan@codeaurora.org> Reviewed-by: Srivatsa S. Bhat <srivatsa@mit.edu> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-07-17	intel_pstate: Set CPU number before accessing MSRs	Vincent Minet
	commit 179e8471673ce0249cd4ecda796008f7757e5bad upstream. Ensure that cpu->cpu is set before writing MSR_IA32_PERF_CTL during CPU initialization. Otherwise only cpu0 has its P-state set and all other cores are left with their values unchanged. In most cases, this is not too serious because the P-states will be set correctly when the timer function is run. But when the default governor is set to performance, the per-CPU current_pstate stays the same forever and no attempts are made to write the MSRs again. Signed-off-by: Vincent Minet <vincent@vincent-minet.net> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-07-17	intel_pstate: don't touch turbo bit if turbo disabled or unavailable.	Dirk Brandewie
	commit dd5fbf70f96dbfd7ee432096a1f979b2b3267856 upstream. If turbo is disabled in the BIOS bit 38 should be set in MSR_IA32_MISC_ENABLE register per section 14.3.2.1 of the SDM Vol 3 document 325384-050US Feb 2014. If this bit is set do not attempt to disable trubo via the MSR_IA32_PERF_CTL register. On some systems trying to disable turbo via MSR_IA32_PERF_CTL will cause subsequent writes to MSR_IA32_PERF_CTL not take affect, in fact reading MSR_IA32_PERF_CTL will not show the IDA/Turbo DISENGAGE bit(32) as set. A write of bit 32 to zero returns to normal operation. Also deal with the case where the processor does not support turbo and the BIOS does not report the fact in MSR_IA32_MISC_ENABLE but does report the max and turbo P states as the same value. Link: https://bugzilla.kernel.org/show_bug.cgi?id=64251 Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-07-17	intel_pstate: Fix setting VID	Dirk Brandewie
	commit c16ed06024a6e699c332831dd50d8276744e3de8 upstream. Commit 21855ff5 (intel_pstate: Set turbo VID for BayTrail) introduced setting the turbo VID which is required to prevent a machine check on some Baytrail SKUs under heavy graphics based workloads. The docmumentation update that brought the requirement to light also changed the bit mask used for enumerating P state and VID values from 0x7f to 0x3f. This change returns the mask value to 0x7f. Tested with the Intel NUC DN2820FYK, BIOS version FYBYT10H.86A.0034.2014.0513.1413 with v3.16-rc1 and v3.14.8 kernel versions. Fixes: 21855ff5 (intel_pstate: Set turbo VID for BayTrail) Link: https://bugzilla.kernel.org/show_bug.cgi?id=77951 Reported-and-tested-by: Rune Reterson <rune@megahurts.dk> Reported-and-tested-by: Eric Eickmeyer <erich@ericheickmeyer.com> Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-07-17	cpufreq: Makefile: fix compilation for davinci platform	Prabhakar Lad
	commit 5a90af67c2126fe1d04ebccc1f8177e6ca70d3a9 upstream. Since commtit 8a7b1227e303 (cpufreq: davinci: move cpufreq driver to drivers/cpufreq) this added dependancy only for CONFIG_ARCH_DAVINCI_DA850 where as davinci_cpufreq_init() call is used by all davinci platform. This patch fixes following build error: arch/arm/mach-davinci/built-in.o: In function `davinci_init_late': :(.init.text+0x928): undefined reference to `davinci_cpufreq_init' make: *** [vmlinux] Error 1 Fixes: 8a7b1227e303 (cpufreq: davinci: move cpufreq driver to drivers/cpufreq) Signed-off-by: Lad, Prabhakar <prabhakar.csengg@gmail.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-07-09	intel_pstate: Correct rounding in busy calculation	Doug Smythies
	commit 51d211e9c334b9eca3505f4052afa660c3e0606b upstream. There was a mistake in the actual rounding portion this previous patch: f0fe3cd7e12d (intel_pstate: Correct rounding in busy calculation) such that the rounding was asymetric and incorrect. Severity: Not very serious, but can increase target pstate by one extra value. For real world work flows the issue should self correct (but I have no proof). It is the equivalent of different PID gains for positive and negative numbers. Examples: -3.000000 used to round to -4, rounds to -3 with this patch. -3.503906 used to round to -5, rounds to -4 with this patch. Fixes: f0fe3cd7e12d (intel_pstate: Correct rounding in busy calculation) Signed-off-by: Doug Smythies <dsmythies@telus.net> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-11	intel_pstate: Improve initial busy calculation	Doug Smythies
	commit bf8102228a8bf053051f311e5486042fe0542894 upstream. This change makes the busy calculation using 64 bit math which prevents overflow for large values of aperf/mperf. Signed-off-by: Doug Smythies <dsmythies@telus.net> Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-11	intel_pstate: add sample time scaling	Dirk Brandewie
	commit c4ee841f602e5eef8eab673295c49c5b49d7732b upstream. The PID assumes that samples are of equal time, which for a deferable timers this is not true when the system goes idle. This causes the PID to take a long time to converge to the min P state and depending on the pattern of the idle load can make the P state appear stuck. The hold-off value of three sample times before using the scaling is to give a grace period for applications that have high performance requirements and spend a lot of time idle, The poster child for this behavior is the ffmpeg benchmark in the Phoronix test suite. Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-11	intel_pstate: Correct rounding in busy calculation	Dirk Brandewie
	commit f0fe3cd7e12d8290c82284b5c8aee723cbd0371a upstream. Changing to fixed point math throughout the busy calculation in commit e66c1768 (Change busy calculation to use fixed point math.) Introduced some inaccuracies by rounding the busy value at two points in the calculation. This change removes roundings and moves the rounding to the output of the PID where the calculations are complete and the value returned as an integer. Fixes: e66c17683746 (intel_pstate: Change busy calculation to use fixed point math.) Reported-by: Doug Smythies <dsmythies@telus.net> Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-11	intel_pstate: Remove C0 tracking	Dirk Brandewie
	commit adacdf3f2b8e65aa441613cf61c4f598e9042690 upstream. Commit fcb6a15c (intel_pstate: Take core C0 time into account for core busy calculation) introduced a regression referenced below. The issue with "lockup" after suspend that this commit was addressing is now dealt with in the suspend path. Fixes: fcb6a15c2e7e (intel_pstate: Take core C0 time into account for core busy calculation) Link: https://bugzilla.kernel.org/show_bug.cgi?id=66581 Link: https://bugzilla.kernel.org/show_bug.cgi?id=75121 Reported-by: Doug Smythies <dsmythies@telus.net> Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-11	intel_pstate: remove unneeded sample buffers	Dirk Brandewie
	commit d37e2b764499e092ebc493d6f980827feb952e23 upstream. Remove unneeded sample buffers, intel_pstate operates on the most recent sample only. This save some memory and make the code more readable. Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-11	cpufreq: remove race while accessing cur_policy	Bibek Basu
	commit c5450db85b828d0c46ac8fc570fb8a51bf07ac40 upstream. While accessing cur_policy during executing events CPUFREQ_GOV_START, CPUFREQ_GOV_STOP, CPUFREQ_GOV_LIMITS, same mutex lock is not taken, dbs_data->mutex, which leads to race and data corruption while running continious suspend resume test. This is seen with ondemand governor with suspend resume test using rtcwake. Unable to handle kernel NULL pointer dereference at virtual address 00000028 pgd = ed610000 [00000028] pgd=adf11831, pte=00000000, *ppte=00000000 Internal error: Oops: 17 [#1] PREEMPT SMP ARM Modules linked in: nvhost_vi CPU: 1 PID: 3243 Comm: rtcwake Not tainted 3.10.24-gf5cf9e5 #1 task: ee708040 ti: ed61c000 task.ti: ed61c000 PC is at cpufreq_governor_dbs+0x400/0x634 LR is at cpufreq_governor_dbs+0x3f8/0x634 pc : [<c05652b8>] lr : [<c05652b0>] psr: 600f0013 sp : ed61dcb0 ip : 000493e0 fp : c1cc14f0 r10: 00000000 r9 : 00000000 r8 : 00000000 r7 : eb725280 r6 : c1cc1560 r5 : eb575200 r4 : ebad7740 r3 : ee708040 r2 : ed61dca8 r1 : 001ebd24 r0 : 00000000 Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user Control: 10c5387d Table: ad61006a DAC: 00000015 [<c05652b8>] (cpufreq_governor_dbs+0x400/0x634) from [<c055f700>] (__cpufreq_governor+0x98/0x1b4) [<c055f700>] (__cpufreq_governor+0x98/0x1b4) from [<c0560770>] (__cpufreq_set_policy+0x250/0x320) [<c0560770>] (__cpufreq_set_policy+0x250/0x320) from [<c0561dcc>] (cpufreq_update_policy+0xcc/0x168) [<c0561dcc>] (cpufreq_update_policy+0xcc/0x168) from [<c0561ed0>] (cpu_freq_notify+0x68/0xdc) [<c0561ed0>] (cpu_freq_notify+0x68/0xdc) from [<c008eff8>] (notifier_call_chain+0x4c/0x8c) [<c008eff8>] (notifier_call_chain+0x4c/0x8c) from [<c008f3d4>] (__blocking_notifier_call_chain+0x50/0x68) [<c008f3d4>] (__blocking_notifier_call_chain+0x50/0x68) from [<c008f40c>] (blocking_notifier_call_chain+0x20/0x28) [<c008f40c>] (blocking_notifier_call_chain+0x20/0x28) from [<c00aac6c>] (pm_qos_update_bounded_target+0xd8/0x310) [<c00aac6c>] (pm_qos_update_bounded_target+0xd8/0x310) from [<c00ab3b0>] (__pm_qos_update_request+0x64/0x70) [<c00ab3b0>] (__pm_qos_update_request+0x64/0x70) from [<c004b4b8>] (tegra_pm_notify+0x114/0x134) [<c004b4b8>] (tegra_pm_notify+0x114/0x134) from [<c008eff8>] (notifier_call_chain+0x4c/0x8c) [<c008eff8>] (notifier_call_chain+0x4c/0x8c) from [<c008f3d4>] (__blocking_notifier_call_chain+0x50/0x68) [<c008f3d4>] (__blocking_notifier_call_chain+0x50/0x68) from [<c008f40c>] (blocking_notifier_call_chain+0x20/0x28) [<c008f40c>] (blocking_notifier_call_chain+0x20/0x28) from [<c00ac228>] (pm_notifier_call_chain+0x1c/0x34) [<c00ac228>] (pm_notifier_call_chain+0x1c/0x34) from [<c00ad38c>] (enter_state+0xec/0x128) [<c00ad38c>] (enter_state+0xec/0x128) from [<c00ad400>] (pm_suspend+0x38/0xa4) [<c00ad400>] (pm_suspend+0x38/0xa4) from [<c00ac114>] (state_store+0x70/0xc0) [<c00ac114>] (state_store+0x70/0xc0) from [<c027b1e8>] (kobj_attr_store+0x14/0x20) [<c027b1e8>] (kobj_attr_store+0x14/0x20) from [<c019cd9c>] (sysfs_write_file+0x104/0x184) [<c019cd9c>] (sysfs_write_file+0x104/0x184) from [<c0143038>] (vfs_write+0xd0/0x19c) [<c0143038>] (vfs_write+0xd0/0x19c) from [<c0143414>] (SyS_write+0x4c/0x78) [<c0143414>] (SyS_write+0x4c/0x78) from [<c000f080>] (ret_fast_syscall+0x0/0x30) Code: e1a00006 eb084346 e59b0020 e5951024 (e5903028) ---[ end trace 0488523c8f6b0f9d ]--- Signed-off-by: Bibek Basu <bbasu@nvidia.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-11	cpufreq: cpu0: drop wrong devm usage	Lucas Stach
	commit e3beb0ac521d50d158a9d253373eae8421ac3998 upstream. This driver is using devres managed calls incorrectly, giving the cpu0 device as first parameter instead of the cpufreq platform device. This results in resources not being freed if the cpufreq platform device is unbound, for example if probing has to be deferred for a missing regulator. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-07	intel_pstate: remove setting P state to MAX on init	Dirk Brandewie
	commit d40a63c45b506b0681918d7c62a15cc9d48c8681 upstream. Setting the P state of the core to max at init time is a hold over from early implementation of intel_pstate where intel_pstate disabled cpufreq and loaded VERY early in the boot sequence. This was to ensure that intel_pstate did not affect boot time. This in not needed now that intel_pstate is a cpufreq driver. Removing this covers the case where a CPU has gone through a manual CPU offline/online cycle and the P state is set to MAX on init and the CPU immediately goes idle. Due to HW coordination the P state request on the idle CPU will drag all cores to MAX P state until the load is reevaluated when to core goes non-idle. Reported-by: Patrick Marlier <patrick.marlier@gmail.com> Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-07	intel_pstate: Set turbo VID for BayTrail	Dirk Brandewie
	commit 21855ff5bcbdd075e1c99772827a84912ab083dd upstream. A documentation update exposed that there is a separate set of VID values that must be used in the turbo/boost P state range. Add enumerating and setting the correct VID for P states in the turbo range. Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-07	MIPS/loongson2_cpufreq: Fix CPU clock rate setting	Aaro Koskinen
	commit 8e8acb32960f42c81b1d50deac56a2c07bb6a18a upstream. Loongson2 has been using (incorrectly) kHz for cpu_clk rate. This has been unnoticed, as loongson2_cpufreq was the only place where the rate was set/get. After commit 652ed95d5fa6074b3c4ea245deb0691f1acb6656 (cpufreq: introduce cpufreq_generic_get() routine) things however broke, and now loops_per_jiffy adjustments are incorrect (1000 times too long). The patch fixes this by changing cpu_clk rate to Hz. Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi> Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Cc: Viresh Kumar <viresh.kumar@linaro.org> Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org Cc: cpufreq@vger.kernel.org Cc: Aaro Koskinen <aaro.koskinen@iki.fi> Patchwork: https://patchwork.linux-mips.org/patch/6678/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-05-13	cpufreq: unicore32: fix typo issue for 'clk'	Chen Gang
	commit b4ddad95020e65cfbbf9aee63d3bcdf682794ade upstream. Need use 'clk' instead of 'mclk', which is the original removed local variable. The related original commit: "652ed95 cpufreq: introduce cpufreq_generic_get() routine" The related error with allmodconfig for unicore32: CC drivers/cpufreq/unicore2-cpufreq.o drivers/cpufreq/unicore2-cpufreq.c: In function ‘ucv2_target’: drivers/cpufreq/unicore2-cpufreq.c:48: error: ‘struct cpufreq_policy’ has no member named ‘mclk’ make[2]: * [drivers/cpufreq/unicore2-cpufreq.o] Error 1 make[1]: * [drivers/cpufreq] Error 2 make: *** [drivers] Error 2 Fixes: 652ed95d5fa6 (cpufreq: introduce cpufreq_generic_get() routine) Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-05-13	cpufreq: at32ap: don't declare local variable as static	Viresh Kumar
	commit 0ca97886fece9e1acd71ade4ca3a250945c8fc8b upstream. Earlier commit: commit 652ed95d5fa6074b3c4ea245deb0691f1acb6656 Author: Viresh Kumar <viresh.kumar@linaro.org> Date: Thu Jan 9 20:38:43 2014 +0530 cpufreq: introduce cpufreq_generic_get() routine did some changes to driver and by mistake made cpuclk as a 'static' local variable, which wasn't actually required. Fix it. Fixes: 652ed95d5fa6 (cpufreq: introduce cpufreq_generic_get() routine) Reported-by: Alexandre Oliva <lxoliva@fsfla.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Hans-Christian Egtvedt <egtvedt@samfundet.no> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-05-13	cpufreq: loongson2_cpufreq: don't declare local variable as static	Viresh Kumar
	commit 05ed672292dc9d37db4fafdd49baa782158cd795 upstream. Earlier commit: commit 652ed95d5fa6074b3c4ea245deb0691f1acb6656 Author: Viresh Kumar <viresh.kumar@linaro.org> Date: Thu Jan 9 20:38:43 2014 +0530 cpufreq: introduce cpufreq_generic_get() routine did some changes to driver and by mistake made cpuclk as a 'static' local variable, which wasn't actually required. Fix it. Fixes: 652ed95d5fa6 (cpufreq: introduce cpufreq_generic_get() routine) Reported-by: Alexandre Oliva <lxoliva@fsfla.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-13	cpufreq: Skip current frequency initialization for ->setpolicy drivers	Rafael J. Wysocki
	After commit da60ce9f2fac (cpufreq: call cpufreq_driver->get() after calling ->init()) __cpufreq_add_dev() sometimes fails for CPUs handled by intel_pstate, because that driver may return 0 from its ->get() callback if it has not run long enough to collect enough samples on the given CPU. That didn't happen before commit da60ce9f2fac which added policy->cur initialization to __cpufreq_add_dev() to help reduce code duplication in other cpufreq drivers. However, the code added by commit da60ce9f2fac need not be executed for cpufreq drivers having the ->setpolicy callback defined, because the subsequent invocation of cpufreq_set_policy() will use that callback to initialize the policy anyway and it doesn't need policy->cur to be initialized upfront. The analogous code in cpufreq_update_policy() is also unnecessary for cpufreq drivers having ->setpolicy set and may be skipped for them as well. Since intel_pstate provides ->setpolicy, skipping the upfront policy->cur initialization for cpufreq drivers with that callback set will cover intel_pstate and the problem it's been having after commit da60ce9f2fac will be addressed. Fixes: da60ce9f2fac (cpufreq: call cpufreq_driver->get() after calling ->init()) References: https://bugzilla.kernel.org/show_bug.cgi?id=71931 Reported-and-tested-by: Patrik Lundquist <patrik.lundquist@gmail.com> Acked-by: Dirk Brandewie <dirk.j.brandewie@intel.com> Cc: 3.13+ <stable@vger.kernel.org> # 3.13+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-03-06	cpufreq: Initialize governor for a new policy under policy->rwsem	Viresh Kumar
	policy->rwsem is used to lock access to all parts of code modifying struct cpufreq_policy, but it's not used on a new policy created by __cpufreq_add_dev(). Because of that, if cpufreq_update_policy() is called in a tight loop on one CPU in parallel with offline/online of another CPU, then the following crash can be triggered: Unable to handle kernel NULL pointer dereference at virtual address 00000020 pgd = c0003000 [00000020] pgd=80000000004003, pmd=00000000 Internal error: Oops: 206 [#1] PREEMPT SMP ARM PC is at __cpufreq_governor+0x10/0x1ac LR is at cpufreq_update_policy+0x114/0x150 ---[ end trace f23a8defea6cd706 ]--- Kernel panic - not syncing: Fatal exception CPU0: stopping CPU: 0 PID: 7136 Comm: mpdecision Tainted: G D W 3.10.0-gd727407-00074-g979ede8 #396 [<c0afe180>] (notifier_call_chain+0x40/0x68) from [<c02a23ac>] (__blocking_notifier_call_chain+0x40/0x58) [<c02a23ac>] (__blocking_notifier_call_chain+0x40/0x58) from [<c02a23d8>] (blocking_notifier_call_chain+0x14/0x1c) [<c02a23d8>] (blocking_notifier_call_chain+0x14/0x1c) from [<c0803c68>] (cpufreq_set_policy+0xd4/0x2b8) [<c0803c68>] (cpufreq_set_policy+0xd4/0x2b8) from [<c0803e7c>] (cpufreq_init_policy+0x30/0x98) [<c0803e7c>] (cpufreq_init_policy+0x30/0x98) from [<c0805a18>] (__cpufreq_add_dev.isra.17+0x4dc/0x7a4) [<c0805a18>] (__cpufreq_add_dev.isra.17+0x4dc/0x7a4) from [<c0805d38>] (cpufreq_cpu_callback+0x58/0x84) [<c0805d38>] (cpufreq_cpu_callback+0x58/0x84) from [<c0afe180>] (notifier_call_chain+0x40/0x68) [<c0afe180>] (notifier_call_chain+0x40/0x68) from [<c02812dc>] (__cpu_notify+0x28/0x44) [<c02812dc>] (__cpu_notify+0x28/0x44) from [<c0aeed90>] (_cpu_up+0xf4/0x1dc) [<c0aeed90>] (_cpu_up+0xf4/0x1dc) from [<c0aeeed4>] (cpu_up+0x5c/0x78) [<c0aeeed4>] (cpu_up+0x5c/0x78) from [<c0aec808>] (store_online+0x44/0x74) [<c0aec808>] (store_online+0x44/0x74) from [<c03a40f4>] (sysfs_write_file+0x108/0x14c) [<c03a40f4>] (sysfs_write_file+0x108/0x14c) from [<c03517d4>] (vfs_write+0xd0/0x180) [<c03517d4>] (vfs_write+0xd0/0x180) from [<c0351ca8>] (SyS_write+0x38/0x68) [<c0351ca8>] (SyS_write+0x38/0x68) from [<c0205de0>] (ret_fast_syscall+0x0/0x30) Fix that by taking locks at appropriate places in __cpufreq_add_dev() as well. Reported-by: Saravana Kannan <skannan@codeaurora.org> Suggested-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> [rjw: Changelog] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-03-06	cpufreq: Initialize policy before making it available for others to use	Viresh Kumar
	Policy must be fully initialized before it is being made available for use by others. Otherwise cpufreq_cpu_get() would be able to grab a half initialized policy structure that might not have affected_cpus (for example) populated. Then, anybody accessing those fields will get a wrong value and that will lead to unpredictable results. In order to fix this, do all the necessary initialization before we make the policy structure available via cpufreq_cpu_get(). That will guarantee that any code accessing fields of the policy will get correct data from them. Reported-by: Saravana Kannan <skannan@codeaurora.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> [rjw: Changelog] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-03-06	cpufreq: use cpufreq_cpu_get() to avoid cpufreq_get() race conditions	Aaron Plattner
	If a module calls cpufreq_get while cpufreq is initializing, it's possible for it to be called after cpufreq_driver is set but before cpufreq_cpu_data is written during subsys_interface_register. This happens because cpufreq_get doesn't take the cpufreq_driver_lock around its use of cpufreq_cpu_data. Fix this by using cpufreq_cpu_get(cpu) to look up the policy rather than reading it out of cpufreq_cpu_data directly. cpufreq_cpu_get() takes the appropriate locks to prevent this race from happening. Since it's possible for policy to be NULL if the caller passes in an invalid CPU number or calls the function before cpufreq is initialized, delete the BUG_ON(!policy) and simply return 0. Don't try to return -ENOENT because that's negative and the function returns an unsigned integer. References: https://bbs.archlinux.org/viewtopic.php?id=177934 Signed-off-by: Aaron Plattner <aplattner@nvidia.com> Cc: 3.13+ <stable@vger.kernel.org> # 3.13+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-02-26	intel_pstate: Change busy calculation to use fixed point math.	Dirk Brandewie
	Commit fcb6a15c2e (intel_pstate: Take core C0 time into account for core busy calculation) introduced a regression on some processor SKUs supported by intel_pstate. This was due to the truncation caused by using integer math to calculate core busy and C0 percentages. On a i7-4770K processor operating at 800Mhz going to 100% utilization the percent busy of the CPU using integer math is 22%, but it actually is 22.85%. This value scaled to the current frequency returned 97 which the PID interpreted as no error and did not adjust the P state. Tested on i7-4770K, i7-2600, i5-3230M. Fixes: fcb6a15c2e7e (intel_pstate: Take core C0 time into account for core busy calculation) References: https://lkml.org/lkml/2014/2/19/626 References: https://bugzilla.kernel.org/show_bug.cgi?id=70941 Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-02-21	intel_pstate: Add support for Baytrail turbo P states	Dirk Brandewie
	A documentation update exposed the existance of the turbo ratio register. Update baytrail support to use the turbo range. Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com> Cc: 3.13+ <stable@vger.kernel.org> # 3.13+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-02-21	intel_pstate: Use LFM bus ratio as min ratio/P state	Dirk Brandewie
	LFM (max efficiency ratio) is the max frequency at minimum voltage supported by the processor. Using LFM as the minimum P state increases performmance without affecting power. By not using P states below LFM we avoid using P states that are less power efficient. Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com> Cc: 3.13+ <stable@vger.kernel.org> # 3.13+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-02-19	cpufreq: powernow-k8: Initialize per-cpu data-structures properly	Srivatsa S. Bhat
	The powernow-k8 driver maintains a per-cpu data-structure called powernow_data that is used to perform the frequency transitions. It initializes this data structure only for the policy->cpu. So, accesses to this data structure by other CPUs results in various problems because they would have been uninitialized. Specifically, if a cpu (!= policy->cpu) invokes the drivers' ->get() function, it returns 0 as the KHz value, since its per-cpu memory doesn't point to anything valid. This causes problems during suspend/resume since cpufreq_update_policy() tries to enforce this (0 KHz) as the current frequency of the CPU, and this madness gets propagated to adjust_jiffies() as well. Eventually, lots of things start breaking down, including the r8169 ethernet card, in one particularly interesting case reported by Pierre Ossman. Fix this by initializing the per-cpu data-structures of all the CPUs in the policy appropriately. References: https://bugzilla.kernel.org/show_bug.cgi?id=70311 Reported-by: Pierre Ossman <pierre@ossman.eu> Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Cc: All applicable <stable@vger.kernel.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-02-19	cpufreq: remove sysfs link when a cpu != policy->cpu, is removed	viresh kumar
	Commit 42f921a (cpufreq: remove sysfs files for CPUs which failed to come back after resume) tried to do this but missed this piece of code to fix. Currently we are getting this on suspend/resume: ------------[ cut here ]------------ WARNING: CPU: 0 PID: 877 at fs/sysfs/dir.c:52 sysfs_warn_dup+0x68/0x84() sysfs: cannot create duplicate filename '/devices/system/cpu/cpu1/cpufreq' Modules linked in: brcmfmac brcmutil CPU: 0 PID: 877 Comm: test-rtc-resume Not tainted 3.14.0-rc2-00259-g9398a10cd964 #12 [<c0015bac>] (unwind_backtrace) from [<c0011850>] (show_stack+0x10/0x14) [<c0011850>] (show_stack) from [<c056e018>] (dump_stack+0x80/0xcc) [<c056e018>] (dump_stack) from [<c0025e44>] (warn_slowpath_common+0x64/0x88) [<c0025e44>] (warn_slowpath_common) from [<c0025efc>] (warn_slowpath_fmt+0x30/0x40) [<c0025efc>] (warn_slowpath_fmt) from [<c012776c>] (sysfs_warn_dup+0x68/0x84) [<c012776c>] (sysfs_warn_dup) from [<c0127a54>] (sysfs_do_create_link_sd+0xb0/0xb8) [<c0127a54>] (sysfs_do_create_link_sd) from [<c038ef64>] (__cpufreq_add_dev.isra.27+0x2a8/0x814) [<c038ef64>] (__cpufreq_add_dev.isra.27) from [<c038f548>] (cpufreq_cpu_callback+0x70/0x8c) [<c038f548>] (cpufreq_cpu_callback) from [<c0043864>] (notifier_call_chain+0x44/0x84) [<c0043864>] (notifier_call_chain) from [<c0025f60>] (__cpu_notify+0x28/0x44) [<c0025f60>] (__cpu_notify) from [<c00261e8>] (_cpu_up+0xf0/0x140) [<c00261e8>] (_cpu_up) from [<c0569eb8>] (enable_nonboot_cpus+0x68/0xb0) [<c0569eb8>] (enable_nonboot_cpus) from [<c006339c>] (suspend_devices_and_enter+0x198/0x2dc) [<c006339c>] (suspend_devices_and_enter) from [<c0063654>] (pm_suspend+0x174/0x1e8) [<c0063654>] (pm_suspend) from [<c00624e0>] (state_store+0x6c/0xbc) [<c00624e0>] (state_store) from [<c01fc200>] (kobj_attr_store+0x14/0x20) [<c01fc200>] (kobj_attr_store) from [<c0126e50>] (sysfs_kf_write+0x44/0x48) [<c0126e50>] (sysfs_kf_write) from [<c012a274>] (kernfs_fop_write+0xb4/0x14c) [<c012a274>] (kernfs_fop_write) from [<c00d4818>] (vfs_write+0xa8/0x180) [<c00d4818>] (vfs_write) from [<c00d4bb8>] (SyS_write+0x3c/0x70) [<c00d4bb8>] (SyS_write) from [<c000e620>] (ret_fast_syscall+0x0/0x30) ---[ end trace 76969904b614c18f ]--- Fix this by removing sysfs link for cpufreq directory when cpu removed isn't policy->cpu. Revamps: 42f921a (cpufreq: remove sysfs files for CPUs which failed to come back after resume) Reported-and-tested-by: Stephen Warren <swarren@nvidia.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-02-13	intel_pstate: Remove energy reporting from pstate_sample tracepoint	Dirk Brandewie
	Remove the reporting of energy since it does not provide any useful information about the state of the driver and will be a maintainance headache going forward since the RAPL energy units register is not architectural and subject to change between micro-architectures References: https://bugzilla.kernel.org/show_bug.cgi?id=69831 Fixes: b69880f9ccf7 (intel_pstate: Add trace point to report internal state.) Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-02-04	intel_pstate: Take core C0 time into account for core busy calculation	Dirk Brandewie
	Take non-idle time into account when calculating core busy time. This ensures that intel_pstate will notice a decrease in load. References: https://bugzilla.kernel.org/show_bug.cgi?id=66581 Cc: 3.10+ <stable@vger.kernel.org> # 3.10+ Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-01-31	Merge tag 'pm+acpi-3.14-rc1-2' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI and power management fixes and cleanups from Rafael Wysocki: - ACPI device hotplug fix preventing ACPI drivers from binding to device objects that acpi_bus_trim() has been called for and the devices represented by them may not be operational. - Recent cpufreq changes related to the "boost" (turbo) feature broke the acpi-cpufreq error code path causing a NULL pointer dereference to occur on some systems. Fix from Konrad Rzeszutek Wilk. - The log level of a CPU initialization error message added recently needs to be reduced, because the particular BIOS issue indicated by it turns out to be widespread and doesn't really matter for the majority of systems having it. From Jiang Liu. - The regulator API needs to be told to stay away from things on systems with ACPI BIOSes or it may conflict with the BIOS' own handling of voltage regulators. Fix from Mark Brown that works around a 3.13 regression in lm90 on PCs occuring if the regulator API is enabled. - Prevent the Exynos4 devfreq driver from being built on multiplatform, because it depends on things that aren't available during such builds. From Sachin Kamat. - Upstream ACPICA doesn't use the bool type as defined in the kernel, so modify the kernel's ACPICA code to follow the upstream in that respect (only one variable definition is affected) to reduce divergences between the two. From Lv Zheng. - Make the ACPI device PM code use ACPI_COMPANION() instead of its own routine doing the same thing (and invokng ACPI_COMPANION() in the process). - Modify some routines in the ACPI processor driver to follow the common convention and return negative integers on errors. From Hanjun Guo. * tag 'pm+acpi-3.14-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI / scan: Clear match_driver flag in acpi_bus_trim() ACPI / init: Flag use of ACPI and ACPI idioms for power supplies to regulator API acpi-cpufreq: De-register CPU notifier and free struct msr on error. ACPICA: Remove bool usage from ACPICA. PM / devfreq: Disable Exynos4 driver build on multiplatform ACPI / PM: Use ACPI_COMPANION() to get ACPI companions of devices ACPI / scan: reduce log level of "ACPI: \_PR_.CPU4: failed to get CPU APIC ID" ACPI / processor: Return specific error value when mapping lapic id
2014-01-29	Merge branches 'pm-cpufreq' and 'pm-devfreq'	Rafael J. Wysocki
	* pm-cpufreq: acpi-cpufreq: De-register CPU notifier and free struct msr on error. * pm-devfreq: PM / devfreq: Disable Exynos4 driver build on multiplatform
2014-01-28	acpi-cpufreq: De-register CPU notifier and free struct msr on error.	Konrad Rzeszutek Wilk
	If cpufreq_register_driver() fails we would free the acpi driver related structures but not free the ones allocated by acpi_cpufreq_boost_init() function. This meant that as the driver error-ed out and a CPU online/offline event came we would crash and burn as one of the CPU notifiers would point to garbage. Fixes: cfc9c8ed03e4 (acpi-cpufreq: Adjust the code to use the common boost attribute) Acked-by: Lukasz Majewski <l.majewski@samsung.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-01-24	Merge branch 'next' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux Pull thermal management updates from Zhang Rui: "This time, the biggest change is the work of representing hardware thermal properties in device tree infrastructure. This work includes the introduction of a device tree bindings for describing the hardware thermal behavior and limits, and also a parser to read and interpret the data, and build thermal zones and thermal binding parameters. It also contains three examples on how to use the new representation on sensor devices, using three different drivers to accomplish it. One driver is in thermal subsystem, the TI SoC thermal, and the other two drivers are in hwmon subsystem. Actually, this would be the first step of the complete work because we still need to check other potential drivers to be converted and then validate the proposed API. But the reason why I include it in this pull request is that, first, this change does not hurt any others without using this approach, second, the principle and concept of this change would not break after converting the remaining drivers. BTW, as you can see, there are several points in this change that do not belong to thermal subsystem. Because it has been suggested by Guenter R that in such cases, it is recommended to send the complete series via one single subsystem. Specifics: - representing hardware thermal properties in device tree infrastructure - fix a regression that the imx thermal driver breaks system suspend. - introduce ACPI INT3403 thermal driver to retrieve temperature data from the INT3403 ACPI device object present on some systems. - introduce debug statement for thermal core and step_wise governor. - assorted fixes and cleanups for thermal core, cpu cooling, exynos thrmal, intel powerclamp and imx thermal driver" * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux: (34 commits) thermal: remove const flag from .ops of imx thermal Thermal: update thermal zone device after setting emul_temp intel_powerclamp: Fix cstate counter detection. thermal: imx: add necessary clk operation Thermal cpu cooling: return error if no valid cpu frequency entry thermal: fix cpu_cooling max_level behavior thermal: rcar-thermal: Enable driver compilation with COMPILE_TEST thermal: debug: add debug statement for core and step_wise thermal: imx_thermal: add module device table drivers: thermal: Mark function as static in x86_pkg_temp_thermal.c thermal:samsung: fix compilation warning thermal: imx: correct suspend/resume flow thermal: exynos: fix error return code Thermal: ACPI INT3403 thermal driver MAINTAINERS: add thermal bindings entry in thermal domain arm: dts: make OMAP4460 bandgap node to belong to OCP arm: dts: make OMAP443x bandgap node to belong to OCP arm: dts: add cooling properties on omap5 cpu node arm: dts: add omap5 thermal data arm: dts: add omap5 CORE thermal data ...
2014-01-24	Merge tag 'pm+acpi-3.14-rc1' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI and power management updates from Rafael Wysocki: "As far as the number of commits goes, the top spot belongs to ACPI this time with cpufreq in the second position and a handful of PM core, PNP and cpuidle updates. They are fixes and cleanups mostly, as usual, with a couple of new features in the mix. The most visible change is probably that we will create struct acpi_device objects (visible in sysfs) for all devices represented in the ACPI tables regardless of their status and there will be a new sysfs attribute under those objects allowing user space to check that status via _STA. Consequently, ACPI device eject or generally hot-removal will not delete those objects, unless the table containing the corresponding namespace nodes is unloaded, which is extremely rare. Also ACPI container hotplug will be handled quite a bit differently and cpufreq will support CPU boost ("turbo") generically and not only in the acpi-cpufreq driver. Specifics: - ACPI core changes to make it create a struct acpi_device object for every device represented in the ACPI tables during all namespace scans regardless of the current status of that device. In accordance with this, ACPI hotplug operations will not delete those objects, unless the underlying ACPI tables go away. - On top of the above, new sysfs attribute for ACPI device objects allowing user space to check device status by triggering the execution of _STA for its ACPI object. From Srinivas Pandruvada. - ACPI core hotplug changes reducing code duplication, integrating the PCI root hotplug with the core and reworking container hotplug. - ACPI core simplifications making it use ACPI_COMPANION() in the code "glueing" ACPI device objects to "physical" devices. - ACPICA update to upstream version 20131218. This adds support for the DBG2 and PCCT tables to ACPICA, fixes some bugs and improves debug facilities. From Bob Moore, Lv Zheng and Betty Dall. - Init code change to carry out the early ACPI initialization earlier. That should allow us to use ACPI during the timekeeping initialization and possibly to simplify the EFI initialization too. From Chun-Yi Lee. - Clenups of the inclusions of ACPI headers in many places all over from Lv Zheng and Rashika Kheria (work in progress). - New helper for ACPI _DSM execution and rework of the code in drivers that uses _DSM to execute it via the new helper. From Jiang Liu. - New Win8 OSI blacklist entries from Takashi Iwai. - Assorted ACPI fixes and cleanups from Al Stone, Emil Goode, Hanjun Guo, Lan Tianyu, Masanari Iida, Oliver Neukum, Prarit Bhargava, Rashika Kheria, Tang Chen, Zhang Rui. - intel_pstate driver updates, including proper Baytrail support, from Dirk Brandewie and intel_pstate documentation from Ramkumar Ramachandra. - Generic CPU boost ("turbo") support for cpufreq from Lukasz Majewski. - powernow-k6 cpufreq driver fixes from Mikulas Patocka. - cpufreq core fixes and cleanups from Viresh Kumar, Jane Li, Mark Brown. - Assorted cpufreq drivers fixes and cleanups from Anson Huang, John Tobias, Paul Bolle, Paul Walmsley, Sachin Kamat, Shawn Guo, Viresh Kumar. - cpuidle cleanups from Bartlomiej Zolnierkiewicz. - Support for hibernation APM events from Bin Shi. - Hibernation fix to avoid bringing up nonboot CPUs with ACPI EC disabled during thaw transitions from Bjørn Mork. - PM core fixes and cleanups from Ben Dooks, Leonardo Potenza, Ulf Hansson. - PNP subsystem fixes and cleanups from Dmitry Torokhov, Levente Kurusa, Rashika Kheria. - New tool for profiling system suspend from Todd E Brandt and a cpupower tool cleanup from One Thousand Gnomes" * tag 'pm+acpi-3.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (153 commits) thermal: exynos: boost: Automatic enable/disable of BOOST feature (at Exynos4412) cpufreq: exynos4x12: Change L0 driver data to CPUFREQ_BOOST_FREQ Documentation: cpufreq / boost: Update BOOST documentation cpufreq: exynos: Extend Exynos cpufreq driver to support boost cpufreq / boost: Kconfig: Support for software-managed BOOST acpi-cpufreq: Adjust the code to use the common boost attribute cpufreq: Add boost frequency support in core intel_pstate: Add trace point to report internal state. cpufreq: introduce cpufreq_generic_get() routine ARM: SA1100: Create dummy clk_get_rate() to avoid build failures cpufreq: stats: create sysfs entries when cpufreq_stats is a module cpufreq: stats: free table and remove sysfs entry in a single routine cpufreq: stats: remove hotplug notifiers cpufreq: stats: handle cpufreq_unregister_driver() and suspend/resume properly cpufreq: speedstep: remove unused speedstep_get_state platform: introduce OF style 'modalias' support for platform bus PM / tools: new tool for suspend/resume performance optimization ACPI: fix module autoloading for ACPI enumerated devices ACPI: add module autoloading support for ACPI enumerated devices ACPI: fix create_modalias() return value handling ...
2014-01-23	Merge tag 'cleanup-for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC cleanups from Olof Johansson: "This is the branch where we usually queue up cleanup efforts, moving drivers out of the architecture directory, header file restructuring, etc. Sometimes they tangle with new development so it's hard to keep it strictly to cleanups. Some of the things included in this branch are: * Atmel SAMA5 conversion to common clock * Reset framework conversion for tegra platforms - Some of this depends on tegra clock driver reworks that are shared with Mike Turquette's clk tree. * Tegra DMA refactoring, which are shared branches with the DMA tree. * Removal of some header files on exynos to prepare for multiplatform" * tag 'cleanup-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (169 commits) ARM: mvebu: move Armada 370/XP specific definitions to armada-370-xp.h ARM: mvebu: remove prototypes of non-existing functions from common.h ARM: mvebu: move ARMADA_XP_MAX_CPUS to armada-370-xp.h serial: sh-sci: Rework baud rate calculation serial: sh-sci: Compute overrun_bit without using baud rate algo serial: sh-sci: Remove unused GPIO request code serial: sh-sci: Move overrun_bit and error_mask fields out of pdata serial: sh-sci: Support resources passed through platform resources serial: sh-sci: Don't check IRQ in verify port operation serial: sh-sci: Set the UPF_FIXED_PORT flag serial: sh-sci: Remove duplicate interrupt check in verify port op serial: sh-sci: Simplify baud rate calculation algorithms serial: sh-sci: Remove baud rate calculation algorithm 5 serial: sh-sci: Sort headers alphabetically ARM: EXYNOS: Kill exynos_pm_late_initcall() ARM: EXYNOS: Consolidate selection of PM_GENERIC_DOMAINS for Exynos4 ARM: at91: switch Calao QIL-A9260 board to DT clk: at91: fix pmc_clk_ids data type attriubte PM / devfreq: use inclusion <mach/map.h> instead of <plat/map-s5p.h> ARM: EXYNOS: remove <mach/regs-clock.h> for exynos ...
2014-01-17	cpufreq: exynos4x12: Change L0 driver data to CPUFREQ_BOOST_FREQ	Lukasz Majewski
	Add a special driver data flag (CPUFREQ_BOOST_FREQ) to indicate a frequency that can be only enabled for BOOST mode. This frequency will be used only for limited time, since running with it for too long may cause the target device to overheat. Signed-off-by: Lukasz Majewski <l.majewski@samsung.com> Signed-off-by: Myungjoo Ham <myungjoo.ham@samsung.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> [rjw: Changelog] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-01-17	cpufreq: exynos: Extend Exynos cpufreq driver to support boost	Lukasz Majewski
	The cpufreq_driver's boost_supported flag is true only when boost support is explicitly enabled. Boost related attributes are exported only under the same condition. Signed-off-by: Lukasz Majewski <l.majewski@samsung.com> Signed-off-by: Myungjoo Ham <myungjoo.ham@samsung.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>