path: root/xen/arch/x86/acpi
Commit message | Author | Age | Files | Lines
* x86: use {rd,wr}{fs,gs}base when available | Jan Beulich | 2013-10-11 | 1 | -4/+4
  ... as being intended to be faster than MSR reads/writes. In the case of emulate_privileged_op() also use these in favor of the cached (but possibly stale) addresses from arch.pv_vcpu. This allows entirely removing the code that was the subject of XSA-67.

  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
  Acked-by: Keir Fraser <keir@xen.org>
* x86: Introduce and use GLOBAL() in asm code | Andrew Cooper | 2013-09-09 | 1 | -5/+5
  Also clean up some cases of misused/open-coded ENTRY().

  Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
* x86/Intel: add support for Haswell CPU models | Jan Beulich | 2013-08-27 | 1 | -1/+3
  ... according to their most recent public documentation.

  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Acked-by: Keir Fraser <keir@xen.org>
* ACPI: fix acpi_os_map_memory() | Jan Beulich | 2013-08-21 | 1 | -1/+1
  Its use of map_domain_page() was entirely wrong. Use __acpi_map_table() instead for the time being, with locking added as the mappings it produces get replaced with subsequent invocations. Using locking in this way is acceptable here since the only two runtime callers are acpi_os_{read,write}_memory(), which don't leave mappings pending upon returning to their callers.

  Also fix __acpi_map_table()'s first parameter's type - while benign for unstable, backports to pre-4.3 trees will need this.

  Signed-off-by: Jan Beulich <jbeulich@suse.com>
* x86/cpuidle: Change logging for unknown APIC IDs | Andrew Cooper | 2013-07-17 | 1 | -1/+4
  Dom0 uses this hypercall to pass ACPI information to Xen. It is not uncommon for more cpus to be listed in the ACPI tables than are present on the system, particularly on systems with a common BIOS for 2- and 4-socket server variants.

  As Dom0 does not control the number of entries in the ACPI tables, and is required to pass everything it finds to Xen, change the logging.

  There is now a single unconditional warning for the first unknown ID, and further warnings if "cpuinfo" is requested by the user on the command line.

  Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
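The logging policy described in this message can be sketched in a few lines of C. This is an illustration only, not the code from the commit: `opt_cpu_info` stands in for the "cpuinfo" command line option, and `report_unknown_apic_id()` and the message text are hypothetical.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the "cpuinfo" command line option. */
static bool opt_cpu_info = false;

/* Remembers whether the one unconditional warning was already printed. */
static bool warned_unknown_apic_id = false;

/*
 * Warn about an APIC ID that matches no known CPU.
 * The first unknown ID is always reported; later ones are reported only
 * when verbose CPU information was requested.
 * Returns true if a message was emitted.
 */
static bool report_unknown_apic_id(unsigned int apic_id)
{
    if (!warned_unknown_apic_id || opt_cpu_info) {
        fprintf(stderr, "warning: no CPU for APIC ID %#x\n", apic_id);
        warned_unknown_apic_id = true;
        return true;
    }
    return false;
}
```

With the option off, only the first call prints; turning the option on makes every unknown ID visible again.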
* x86/S3: Restore broken vcpu affinity on resume | Ben Guthro | 2013-04-02 | 1 | -0/+3
  When in SYS_STATE_suspend, and going through the cpu_disable_scheduler path, save a copy of the current cpu affinity, and mark a flag to restore it later.

  Later, in the resume process, when enabling nonboot cpus restore these affinities.

  Signed-off-by: Ben Guthro <benjamin.guthro@citrix.com>
  Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
  Acked-by: Keir Fraser <keir@xen.org>
* powernow: add fixups for AMD P-state figures | Konrad Rzeszutek Wilk | 2013-03-12 | 1 | -6/+50
  In the Linux kernel, these two git commits:

   - f594065faf4f9067c2283a34619fc0714e79a98d ACPI: Add fixups for AMD P-state figures
   - 9855d8ce41a7801548a05d844db2f46c3e810166 ACPI: Check MSR valid bit before using P-state frequencies

  try to fix the issue that "some AMD systems may round the frequencies in ACPI tables to 100MHz boundaries. We can obtain the real frequencies from MSRs, so add a quirk to fix these frequencies up on AMD systems." (from f594065..)

  In discussion (around 9855d8..) "it turned out that indeed real HW/BIOSes may choose to not set the valid bit and thus mark the P-state as invalid. So this could be considered a fix for broken BIOSes." (from 9855d8..)

  which is great for Linux. Unfortunately the Linux kernel, when it tries to do the RDMSR under Xen, fails to get the right value (it gets zero) as Xen traps it and returns zero. Hence when dom0 uploads the P-states they will be unmodified and we should take care of updating the frequencies with the right values.

  I've tested it under Dell Inc. PowerEdge T105 /0RR825, BIOS 1.3.2 08/20/2008 where this quirk can be observed (x86 == 0x10, model == 2). Also on other AMD (x86 == 0x12, A8-3850; x86 = 0x14, AMD E-350) to make sure the quirk is not applied there.

  Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
  Acked-by: stefan.bader@canonical.com

  Do the MSR access here (and while at it, also the one reading MSR_PSTATE_CUR_LIMIT) on the target CPU, and bound the loop over amd_fixup_frequency() by max_hw_pstate (matching the one in powernow_cpufreq_cpu_init()).

  Signed-off-by: Jan Beulich <jbeulich@suse.com>
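As a rough illustration of the quirk, the following C sketch decodes a raw P-state MSR value into a core frequency. It is not the code from this commit: the function name is hypothetical, and the field layout (valid bit 63, frequency ID in bits 5:0, divisor ID in bits 8:6) and the per-family formulas are assumptions taken from the Linux commits cited above. Reading the MSR itself is left out so the decoding can be shown as a pure function.

```c
#include <stdint.h>

/*
 * Decode a raw P-state MSR value into a core frequency in MHz.
 * Returns 0 when the valid bit is clear or the family is not one the
 * quirk applies to, meaning "leave the ACPI-provided value alone".
 */
static unsigned int amd_pstate_mhz(unsigned int family, uint64_t msr)
{
    unsigned int fid, did;

    if (!(msr >> 63))           /* valid bit not set by the BIOS */
        return 0;

    fid = msr & 0x3f;           /* frequency ID, bits 5:0 (assumed) */
    did = (msr >> 6) & 0x7;     /* divisor ID, bits 8:6 (assumed) */

    if (family == 0x10)
        return (100 * (fid + 0x10)) >> did;
    if (family == 0x11)
        return (100 * (fid + 0x8)) >> did;

    return 0;
}
```

For example, on family 0x10 a valid entry with frequency ID 3 and divisor ID 0 decodes to 1900 MHz, a value that a table rounded to 100MHz boundaries could also show, while a divisor ID of 1 yields 950 MHz, which such a table cannot represent.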
* ACPI: support v5 (reduced HW) sleep interface | Jan Beulich | 2013-02-22 | 2 | -22/+106
  Note that this also fixes a broken input check in acpi_enter_sleep() (previously validating the sleep->pm1[ab]_cnt_val relationship based on acpi_sinfo.pm1b_cnt_val, which however gets set only subsequently).

  Also adjust a few minor issues with the pre-v5 handling in acpi_fadt_parse_sleep_info().

  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Acked-by: Keir Fraser <keir@xen.org>
* ACPI 5.0: Implement hardware-reduced option | Bob Moore | 2013-02-22 | 1 | -0/+5
  If HW-reduced flag is set in the FADT, do not attempt to access or initialize any ACPI hardware, including SCI and global lock. No FACS will be present.

  Signed-off-by: Bob Moore <robert.moore@intel.com>

  Also adjust acpi_fadt_parse_sleep_info().

  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Acked-by: Keir Fraser <keir@xen.org>
* xen: use XSM instead of IS_PRIV where duplicated | Daniel De Graaf | 2013-01-11 | 1 | -1/+1
  The Xen hypervisor has two basic access control function calls: IS_PRIV and the xsm_* functions. Most privileged operations currently require that both checks succeed, and many times the checks are at different locations in the code. This patch eliminates the explicit and implicit IS_PRIV checks that are duplicated in XSM hooks.

  When XSM_ENABLE is not defined or when the dummy XSM module is used, this patch should not change any functionality. Because the locations of privilege checks have sometimes moved below argument validation, error returns of some functions may change from EPERM to EINVAL or ESRCH if called with invalid arguments and from a domain without permission to perform the operation.

  Some checks are removed due to non-obvious duplicates in their callers:

   * acpi_enter_sleep is checked in XENPF_enter_acpi_sleep
   * map_domain_pirq has IS_PRIV_FOR checked in its callers:
     * physdev_map_pirq checks when acquiring the RCU lock
     * ioapic_guest_write is checked in PHYSDEVOP_apic_write
   * PHYSDEVOP_{manage_pci_add,manage_pci_add_ext,pci_device_add} are checked by xsm_resource_plug_pci in pci_add_device
   * PHYSDEVOP_manage_pci_remove is checked by xsm_resource_unplug_pci in pci_remove_device
   * PHYSDEVOP_{restore_msi,restore_msi_ext} are checked by xsm_resource_setup_pci in pci_restore_msi_state
   * do_console_io has changed to IS_PRIV from an explicit domid==0

  Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
  Acked-by: Jan Beulich <jbeulich@suse.com>
  Committed-by: Keir Fraser <keir@xen.org>
* VT-d: adjust IOMMU interrupt affinities when all CPUs are online | Jan Beulich | 2012-11-28 | 1 | -0/+1
  Since these interrupts get setup before APs get brought online, their affinities naturally could only ever point to CPU 0 alone so far. Adjust this to include potentially multiple CPUs in the target mask (when running in one of the cluster modes), and take into account NUMA information (to handle the interrupts on a CPU on the node where the respective IOMMU is).

  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Acked-by: Keir Fraser <keir@xen.org>
* x86/ACPI: invalidate BGRT if necessary | Jan Beulich | 2012-11-08 | 1 | -0/+24
  Since the image pointed to may live in boot services memory (which we add to the global memory pool long before ACPI tables get looked at), we should prevent Dom0 from trying to retrieve the image data in that case.

  The alternatives would be to
   - not add boot services memory to the global pool at all, or
   - defer adding boot services memory until Dom0 indicates it is safe to do so, or
   - find and parse the BGRT table in xen/arch/x86/efi/boot.c, and avoid adding that specific region to the E820 table.

  None of these are really attractive, and as Xen commonly prints to the video console anyway (without trying to avoid any regions on the screen), the invalidation would need to be done conditionally anyway.

  (xen/include/acpi/actbl3.h is a verbatim copy from Linux 3.7-rc4)

  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Acked-by: Keir Fraser <keir@xen.org>
* ACPI/cpuidle: remove unused "power" field from Cx state data | Jan Beulich | 2012-11-02 | 1 | -1/+0
  It has never been used for anything, and Linux 3.7 doesn't propagate this information anymore.

  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Acked-by: Keir Fraser <keir@xen.org>
* x86/HPET: allow use for broadcast when interrupt remapping is in effect | Jan Beulich | 2012-10-18 | 1 | -0/+1
  This requires some additions to the VT-d side; AMD IOMMUs use the "normal" MSI message format even when interrupt remapping is enabled, thus making adjustments here unnecessary.

  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Acked-by: Keir Fraser <keir@xen.org>
  Acked-by: Xiantao Zhang <xiantao.zhang@intel.com>
* x86/Intel: add further support for Ivy Bridge CPU models | Jan Beulich | 2012-10-02 | 1 | -2/+6
  And some initial Haswell ones at once.

  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Acked-by: "Nakajima, Jun" <jun.nakajima@intel.com>
* x86/S3: add cache flush on secondary CPUs before going to sleep | Ben Guthro | 2012-09-25 | 1 | -2/+1
  Secondary CPUs, between doing their final memory writes (particularly updating cpu_initialized) and getting a subsequent INIT, may not write back all modified data. The INIT itself then causes those modifications to be lost, so in the cpu_initialized case the CPU would find itself already initialized, (intentionally) entering an infinite loop instead of actually coming online.

  Signed-off-by: Ben Guthro <ben@guthro.net>

  Make acpi_dead_idle() call default_dead_idle() rather than duplicating the logic there.

  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Acked-by: Keir Fraser <keir@xen.org>
  Committed-by: Jan Beulich <jbeulich@suse.com>
* x86: fix MWAIT-based idle driver for CPUs without ARAT | Jan Beulich | 2012-09-25 | 1 | -18/+24
  lapic_timer_{on,off} need to get initialized in this case. This in turn requires getting HPET broadcast setup to be carried out earlier (and hence preventing double initialization there).

  Signed-off-by: Jan Beulich <jbeulich@suse.com>
* x86: enable VIA CPU support | Jan Beulich | 2012-09-21 | 1 | -2/+4
  Newer VIA CPUs have both 64-bit and VMX support. Enable them to be recognized for these purposes, at once stripping off any 32-bit CPU only bits from the respective CPU support file, and adding 64-bit ones found in recent Linux.

  This particularly implies untying the VMX == Intel assumption in a few places.

  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Acked-by: Keir Fraser <keir@xen.org>
* printk: prefer %#x et al. over 0x%x | Jan Beulich | 2012-09-21 | 2 | -7/+7
  Performance is not an issue with printk(), so let the function do minimally more work and instead save a byte per affected format specifier.

  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Acked-by: Keir Fraser <keir@xen.org>
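The byte saved is the difference between the four-character "0x%x" and the three-character "%#x". The sketch below compares the two forms using the standard C library's snprintf(); Xen's printk() has its own formatter, so treat this as an illustration of the standard behaviour rather than of Xen's implementation. The helper name is hypothetical.

```c
#include <stdio.h>
#include <string.h>

/*
 * Format v with both specifiers and report whether the results match.
 * In standard C the '#' flag adds the "0x" prefix only to nonzero values,
 * so the two forms agree for every value except zero.
 */
static int same_hex_output(unsigned int v)
{
    char explicit_prefix[32], alternate_form[32];

    snprintf(explicit_prefix, sizeof(explicit_prefix), "0x%x", v);
    snprintf(alternate_form, sizeof(alternate_form), "%#x", v);

    return strcmp(explicit_prefix, alternate_form) == 0;
}
```

The zero case is the one visible difference in standard C: "0x%x" prints "0x0" while "%#x" prints "0".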
* x86: introduce MWAIT-based, ACPI-less CPU idle driver | Jan Beulich | 2012-09-21 | 1 | -35/+31
  This is a port of Linux's intel-idle driver serving the same purpose.

  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Acked-by: Keir Fraser <keir@xen.org>
* cpuidle: remove unused latency_ticks member | Jan Beulich | 2012-09-21 | 1 | -4/+0
  ... and code used only for initializing it.

  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Acked-by: Keir Fraser <keir@xen.org>
* x86/ACPI: fix error indication from acpi_parse_madt_lapic_entries() | Jan Beulich | 2012-09-19 | 1 | -2/+2
  If the legacy APIC invocation of acpi_table_parse_madt() succeeds but the x2APIC counterpart fails, this is regarded as failure by the function, yet its return value would indicate success.

  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Acked-by: Keir Fraser <keir@xen.org>
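The class of bug described here can be shown with a small C sketch. The two functions below are hypothetical and only model how two parse results are combined; they assume the usual convention that a negative value is an error code and a non-negative value is a count of parsed entries.

```c
/*
 * Buggy combination: detects that one of the two parses failed, but always
 * hands back the legacy result, which is non-negative (success) when only
 * the x2APIC parse failed.
 */
static int combine_results_buggy(int lapic_count, int x2apic_count)
{
    if (lapic_count < 0 || x2apic_count < 0)
        return lapic_count;
    return 0;
}

/* Fixed combination: returns whichever of the two results is the error. */
static int combine_results_fixed(int lapic_count, int x2apic_count)
{
    if (lapic_count < 0 || x2apic_count < 0)
        return lapic_count < 0 ? lapic_count : x2apic_count;
    return 0;
}
```

With a legacy count of 3 and an x2APIC error of -22, the buggy variant returns 3, which a caller testing for a negative value reads as success, while the fixed variant returns -22.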
* xen: Remove x86_32 build target. | Keir Fraser | 2012-09-12 | 4 | -288/+0
  Signed-off-by: Keir Fraser <keir@xen.org>
* powernow: Update P-state directly when _PSD's CoordType is DOMAIN_COORD_TYPE_HW_ALL | Boris Ostrovsky | 2012-09-11 | 1 | -37/+25
  When _PSD's CoordType is DOMAIN_COORD_TYPE_HW_ALL (i.e. shared_type is CPUFREQ_SHARED_TYPE_HW), which most often is the case on servers, there is no reason to go into on_selected_cpus() code; we can call transition_pstate() directly.

  Signed-off-by: Boris Ostrovsky <boris.ostrovsky@amd.com>
  Committed-by: Jan Beulich <jbeulich@suse.com>
* x86/cpuidle: clean up statistics reporting to user mode | Jan Beulich | 2012-08-10 | 1 | -30/+15
  First of all, when no ACPI Cx data was reported, make sure the usage count passed back to user mode is not random. Besides that, fold a lot of redundant code.

  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Acked-by: Keir Fraser <keir@xen.org>
* x86, cpufreq: Change powernow's CPB status immediately | Boris Ostrovsky | 2012-06-18 | 1 | -4/+26
  When a command to modify turbo mode (CPB on AMD processors) comes in, the actual change happens later, when a P-state transition is requested. There is no time limit on when this transition will occur, and therefore a change in CPB state may take a long time from the moment the command to toggle it is issued.

  This patch makes the CPB mode change happen immediately when the request is made.

  Signed-off-by: Boris Ostrovsky <boris.ostrovsky@amd.com>
  Committed-by: Keir Fraser <keir@xen.org>
* xenpm, x86: Fix reporting of idle state average residency times | Boris Ostrovsky | 2012-06-06 | 1 | -36/+61
  If a CPU stays in the same idle state for the full duration of a xenpm sample then the average residency may not be reported correctly, since the usage counter will not be incremented.

  In addition, in order to calculate averages correctly, the residency time and the usage counter should be read and written atomically.

  Signed-off-by: Boris Ostrovsky <boris.ostrovsky@amd.com>
  Committed-by: Keir Fraser <keir@xen.org>
* x86/cpuidle: do not flush cache unless entering C3 | Wei Wang | 2012-04-16 | 1 | -2/+5
  Nor is there a need to disable bus master arbitration in that case.

  Signed-off-by: Wei Wang <wei.wang2@amd.com>
  Modified-by: Zhang, Yang Z <yang.z.zhang@intel.com>
  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Committed-by: Jan Beulich <jbeulich@suse.com>
* Introduce system_state variable. | Keir Fraser | 2012-03-22 | 1 | -0/+10
  Use it to replace x86-specific early_boot boolean variable.

  Also use it to detect suspend/resume case during cpu offline/online to avoid unnecessarily breaking vcpu and cpupool affinities.

  Signed-off-by: Keir Fraser <keir@xen.org>
  Acked-by: Juergen Gross <juergen.gross@ts.fujitsu.com>
* XENPF_set_processor_pminfo XEN_PM_CX overflows states array | Eric Chanudet | 2012-03-08 | 1 | -40/+38
  Calling XENPF_set_processor_pminfo with XEN_PM_CX could cause states array in "struct acpi_processor_power" to exceed its limit.

  The array used to be reset (by function cpuidle_init_cpu()) for each hypercall. The patch puts it back that way and adds an assertion to make it clear in case that happens again.

  Signed-off-by: Eric Chanudet <eric.chanudet@eu.citrix.com>

   - convert assertion to printk() & bail
   - eliminate struct acpi_processor_cx's valid member (not read anymore)
   - further adjustments to one-time-only vs each-time operations in cpuidle_init_cpu()
   - don't use ACPI_STATE_Cn as array index anymore

  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Acked-by: Keir Fraser <keir@xen.org>
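A minimal C sketch of the pattern described in this message: reset the per-CPU state table on every update and refuse to write past its end. The types, the array size and the function names are hypothetical and much simpler than Xen's "struct acpi_processor_power".

```c
#include <stdio.h>
#include <string.h>

#define MAX_CX_STATES 8  /* hypothetical array size */

struct cx_state {
    unsigned int type;
    unsigned int latency;
};

struct cpu_power {
    unsigned int count;                     /* entries currently in use */
    struct cx_state states[MAX_CX_STATES];
};

/* Called at the start of every update so old entries do not accumulate. */
static void power_reset(struct cpu_power *p)
{
    memset(p, 0, sizeof(*p));
}

/*
 * Append one state, indexed by a running count rather than by state type.
 * Logs and bails out instead of overflowing. Returns 0 on success, -1 if
 * the table is full.
 */
static int power_add_state(struct cpu_power *p, struct cx_state s)
{
    if (p->count >= MAX_CX_STATES) {
        fprintf(stderr, "too many Cx states, ignoring the rest\n");
        return -1;
    }
    p->states[p->count++] = s;
    return 0;
}
```

Without the reset, repeated updates would keep appending to the same table until the bound check triggers; with it, each update starts from an empty table.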
* Revert 24973:50a70b652b43 "x86: Use deep C states for off-lined CPUs" | Keir Fraser | 2012-03-07 | 1 | -18/+0
  applied already

  Signed-off-by: Keir Fraser <keir@xen.org>
* x86: Use deep C states for off-lined CPUs | Boris Ostrovsky | 2012-03-07 | 1 | -0/+18
  Currently when a core is taken off-line it is placed in C1 state (unless MONITOR/MWAIT is used). This patch allows a core to go to deeper C states resulting in significantly higher power savings.

  Signed-off-by: Boris Ostrovsky <boris.ostrovsky@amd.com>
  Committed-by: Keir Fraser <keir@xen.org>
* x86/cpuidle: deny access to the I/O port used for EM_SYSIO | Jan Beulich | 2012-03-06 | 1 | -0/+4
  Nothing, not even Dom0, should fiddle with this.

  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Acked-by: Keir Fraser <keir@xen.org>
* x86/cpuidle: restrict scope of mwait_ptr in acpi_dead_idle() | Jan Beulich | 2012-03-06 | 1 | -3/+2
  ... just to make sure it doesn't get used improperly (resulting from the discussion around what became c/s 24968:8964c223836c).

  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Acked-by: Keir Fraser <keir@xen.org>
* x86: Use deep C states for off-lined CPUs | Boris Ostrovsky | 2012-03-06 | 1 | -0/+17
  Currently when a core is taken off-line it is placed in C1 state (unless MONITOR/MWAIT is used). This patch allows a core to go to deeper C states resulting in significantly higher power savings.

  Signed-off-by: Boris Ostrovsky <boris.ostrovsky@amd.com>
  Committed-by: Jan Beulich <jbeulich@suse.com>
* x86: add Ivy Bridge model numbers to model specific MSR handling | Jan Beulich | 2012-02-09 | 1 | -0/+2
  This is model 0x3a (decimal 58) as per the most recent SDM.

  In vPMU code, also add a forgotten earlier model.

  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Acked-by: Xiantao Zhang <xiantao.zhang@intel.com>
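For readers unfamiliar with where a model number such as 0x3a comes from, the C sketch below decodes the family and model from the EAX value returned by CPUID leaf 1, following the documented x86 convention of combining the base and extended fields. The function names are hypothetical, and the sample signature used in the note after the block (0x000306A9, an Ivy Bridge part) is an example value, not something taken from this commit.

```c
#include <stdint.h>

/* Display family: the extended family is added only when base family is 0xf. */
static unsigned int cpuid_family(uint32_t eax)
{
    unsigned int family = (eax >> 8) & 0xf;

    if (family == 0xf)
        family += (eax >> 20) & 0xff;
    return family;
}

/*
 * Display model: for base family 0x6 or 0xf, the extended model supplies
 * the upper four bits.
 */
static unsigned int cpuid_model(uint32_t eax)
{
    unsigned int family = (eax >> 8) & 0xf;
    unsigned int model = (eax >> 4) & 0xf;

    if (family == 0x6 || family == 0xf)
        model |= ((eax >> 16) & 0xf) << 4;
    return model;
}
```

Decoding 0x000306A9 this way gives family 6 and model 0x3a, i.e. decimal 58.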
* x86: Make asmlinkage explicitly a no-op, and avoid usage in arch/x86 | Keir Fraser | 2012-01-15 | 1 | -1/+1
  Signed-off-by: Keir Fraser <keir@xen.org>
* remove inclusion of asm/config.h | Jan Beulich | 2012-01-13 | 2 | -2/+0
  This was always bogus (xen/config.h should have been used instead) and is superfluous now that xen/config.h gets included through the compiler command line.

  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Acked-by: Keir Fraser <keir@xen.org>
* ACPI: eliminate duplicate MADT parsing and unused SBF definitions | Jan Beulich | 2011-12-13 | 1 | -39/+38
  Use their proper counterparts in include/acpi/actbl*.h instead.

  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Acked-by: Keir Fraser <keir@xen.org>
* x86/cpuidle: add Westmere-EX support to hw residencies reading logic | Jan Beulich | 2011-11-30 | 1 | -0/+1
  This is in accordance with http://software.intel.com/en-us/articles/intel-processor-identification-with-cpuid-model-and-family-numbers/

  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Acked-by: Haitao Shan <maillists.shan@gmail.com>
* x86: quiesce cpuidle code | Jan Beulich | 2011-11-11 | 1 | -10/+14
  So far these messages got pointlessly (as the code in other places assumes symmetric configuration) emitted once per CPU. Hide the debug one behind opt_cpu_info, and issue the info one just once (if the code gets adjusted to support asymmetric configurations, this would need to be revisited, but ideally without producing per-CPU messages again).

  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Acked-by: Keir Fraser <keir@xen.org>
* x86/cpuidle: clean up hw residencies reading code | Jan Beulich | 2011-11-08 | 1 | -17/+13
  Fold redundant code and eliminate pointless casts.

  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Acked-by: Keir Fraser <keir@xen.org>
* eliminate first_cpu() etc | Jan Beulich | 2011-11-08 | 3 | -3/+3
  This includes the conversion from for_each_cpu_mask() to for_each_cpu().

  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Acked-by: Keir Fraser <keir@xen.org>
  Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
* eliminate cpu_set() | Jan Beulich | 2011-11-08 | 1 | -2/+2
  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Acked-by: Keir Fraser <keir@xen.org>
  Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
* cpufreq: allocate CPU masks dynamically | Jan Beulich | 2011-11-07 | 2 | -6/+6
  struct cpufreq_policy, including a cpumask_t member, gets copied in cpufreq_limit_change(), cpufreq_add_cpu(), set_cpufreq_gov(), and set_cpufreq_para(). Make the member a cpumask_var_t, thus reducing the amount of data needing copying (particularly with large NR_CPUS).

  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Acked-by: Keir Fraser <keir@xen.org>
* powernow: don't read never initialized structure member | Jan Beulich | 2011-11-07 | 1 | -6/+8
  c/s 20361:51b031b0737e removed the writing of struct processor_performance's shared_cpu_map member, but the powernow driver still has code to read it (though presumably that code path can't be taken on actual hardware supported by the powernow driver). Remove the use of the field along with the field itself.

  Signed-off-by: Jan Beulich <jbeulich@suse.com>
* x86 pm: provide CC7/PC2 residency | Yang Zhang | 2011-10-25 | 1 | -4/+23
  Sandy Bridge introduces new MSRs to get the cc7/pc2 residency (core C-state 7/package C-state 2). Print the cc7/pc2 residency when on a Sandy Bridge platform.

  Signed-off-by: Yang Zhang <yang.z.zhang@intel.com>
  Committed-by: Keir Fraser <keir@xen.org>
* eliminate direct assignments of CPU masks | Jan Beulich | 2011-10-21 | 1 | -1/+1
  Use cpumask_copy() instead of direct variable assignments for copying CPU masks. While direct assignments are not a problem when both sides are variables actually defined as cpumask_t (except for possibly copying *much* more than would actually need to be copied), they must not happen when the original variable is of type cpumask_var_t (which may have less space allocated to it than a full cpumask_t). Eliminate as many of such assignments as possible (in several cases it's even possible to collapse two operations [copy then clear one bit] into one [cpumask_andnot()]), and thus set the way for reducing the allocation size in alloc_cpumask_var().

  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Acked-by: Keir Fraser <keir@xen.org>
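The "copy then clear one bit" versus "and-not" equivalence mentioned above can be illustrated with a toy bitmask in C. The type and function names below are hypothetical; a single 64-bit word stands in for Xen's cpumask_t, which is a multi-word bitmap with its own accessors.

```c
#include <stdint.h>

/* Toy CPU mask: one bit per CPU, at most 64 CPUs (an assumption). */
typedef uint64_t toy_cpumask_t;

/* Two-step form: copy the whole mask, then clear one CPU's bit. */
static toy_cpumask_t toy_copy_then_clear(toy_cpumask_t src, unsigned int cpu)
{
    toy_cpumask_t dst = src;            /* step 1: copy */

    dst &= ~((toy_cpumask_t)1 << cpu);  /* step 2: clear one bit */
    return dst;
}

/* One-step form: dst = src & ~clear, the shape of an and-not operation. */
static toy_cpumask_t toy_andnot(toy_cpumask_t src, toy_cpumask_t clear)
{
    return src & ~clear;
}
```

Both forms produce the same mask when the second argument of the one-step form has only the bit for the CPU being removed set, which is why the two operations can be collapsed into one.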
* eliminate cpumask accessors referencing NR_CPUS | Jan Beulich | 2011-10-21 | 2 | -3/+3
  ... in favor of using the new, nr_cpumask_bits-based ones.

  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Acked-by: Keir Fraser <keir@xen.org>
* introduce and use nr_cpu_ids and nr_cpumask_bits | Jan Beulich | 2011-10-21 | 3 | -3/+3
  The former is the runtime equivalent of NR_CPUS (and users of NR_CPUS, where necessary, get adjusted accordingly), while the latter is for the sole use of determining the allocation size when dynamically allocating CPU masks (done later in this series).

  Adjust accessors to use either of the two to bound their bitmap operations - which one gets used depends on whether accessing the bits in the gap between nr_cpu_ids and nr_cpumask_bits is benign but more efficient.

  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Acked-by: Keir Fraser <keir@xen.org>