path: root/xen/include/asm-x86/mm.h
Commit log, newest first. Each entry: subject (author, date; files changed, lines -deleted/+added).

* x86: correct LDT checks (Jan Beulich, 2013-10-11; 1 file, -1/+1)

  - MMUEXT_SET_LDT should behave as similarly to the LLDT instruction as
    possible: fail only if the base address is non-canonical.
  - Instead, LDT descriptor accesses should fault if the descriptor address
    ends up being non-canonical (by ensuring this we at once avoid reading
    an entry from the mach-to-phys table and considering it a page table
    entry).
  - Fault propagation on using LDT selectors must distinguish #PF and #GP
    (the latter must be raised for a non-canonical descriptor address, which
    also applies to several other uses of propagate_page_fault(), and hence
    the problem is being fixed there).
  - map_ldt_shadow_page() should properly wrap addresses for 32-bit VMs.

  At once remove the odd invocation of map_ldt_shadow_page() from the
  MMUEXT_SET_LDT handler: there's nothing really telling us that the first
  LDT page is going to be preferred over others.

  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
  Acked-by: Keir Fraser <keir@xen.org>

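  The canonical-address test these checks hinge on can be sketched as
  follows (a minimal standalone sketch; the helper name and exact form are
  illustrative, not necessarily Xen's):

      #include <stdbool.h>
      #include <stdint.h>

      /* x86-64: an address is canonical iff bits 63..47 all equal bit 47,
       * i.e. sign-extending the low 48 bits reproduces the value. */
      static inline bool is_canonical_address(uint64_t addr)
      {
          return ((int64_t)(addr << 16) >> 16) == (int64_t)addr;
      }
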
* x86: don't blindly create L3 tables for the direct map (Jan Beulich,
  2013-09-30; 1 file, -1/+1)

  Now that the direct map area can extend all the way up to almost the end
  of address space, this is wasteful.

  Also fold two almost redundant messages in SRAT parsing into one.

  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Tested-by: Malcolm Crossley <malcolm.crossley@citrix.com>
  Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
  Acked-by: Keir Fraser <keir@xen.org>

* x86: fix page refcount handling in page table pin error path (Jan
  Beulich, 2013-06-26; 1 file, -0/+1)

  In the original patch 7 of the series addressing XSA-45 I mistakenly took
  the addition of the call to get_page_light() in alloc_page_type() to
  cover two decrements that would happen: one for the PGT_partial bit that
  is getting set along with the call, and the other for the page reference
  the caller holds (and would be dropping on its error path). But of course
  the additional page reference is tied to the PGT_partial bit, and hence
  any caller of a function that may leave ->arch.old_guest_table non-NULL
  for error cleanup purposes has to make sure a respective page reference
  gets retained.

  Similar issues were then also spotted elsewhere: in effect all callers of
  get_page_type_preemptible() need to deal with errors in similar ways. To
  make sure error handling can work this way without leaking page
  references, a respective assertion gets added to that function.

  This is CVE-2013-1432 / XSA-58.

  Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Tested-by: Andrew Cooper <andrew.cooper3@citrix.com>
  Reviewed-by: Tim Deegan <tim@xen.org>

* x86: cleanup after making various page table manipulation operations
  preemptible (Jan Beulich, 2013-05-02; 1 file, -7/+2)

  This drops the "preemptible" parameters from various functions where now
  they can't (or shouldn't, validated by assertions) be run in
  non-preemptible mode anymore, to prove that manipulations of at least L3
  and L4 page tables and page table entries are now always preemptible,
  i.e. the earlier patches actually fulfill their purpose of fixing the
  resulting security issue.

  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Acked-by: Tim Deegan <tim@xen.org>

* x86: allow Dom0 read-only access to IO-APICs (Jan Beulich, 2013-05-02;
  1 file, -0/+2)

  There are BIOSes that want to map the IO-APIC MMIO region from some ACPI
  method(s), and there is at least one BIOS flavor that wants to use this
  mapping to clear an RTE's mask bit. While we can't allow the latter, we
  can permit reads and simply drop write attempts, leveraging the already
  existing infrastructure introduced for dealing with AMD IOMMUs'
  representation as PCI devices.

  This fixes an interrupt setup problem on a system where _CRS evaluation
  involved the above described BIOS/ACPI behavior, and is expected to also
  deal with a boot time crash of pv-ops Linux upon encountering the same
  kind of system.

  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Acked-by: Keir Fraser <keir@xen.org>

* x86: make vcpu_reset() preemptible (Jan Beulich, 2013-05-02; 1 file,
  -1/+1)

  ... as dropping the old page tables may take significant amounts of time.

  This is part of CVE-2013-1918 / XSA-45.

  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Acked-by: Tim Deegan <tim@xen.org>

* x86: make vcpu_destroy_pagetables() preemptible (Jan Beulich, 2013-05-02;
  1 file, -0/+1)

  ... as it may take significant amounts of time.

  The function, being moved to mm.c as the better home for it anyway, and
  to avoid having to make a new helper function there non-static, is given
  a "preemptible" parameter temporarily (until, in a subsequent patch, its
  other caller is also made capable of dealing with preemption).

  This is part of CVE-2013-1918 / XSA-45.

  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Acked-by: Tim Deegan <tim@xen.org>

* x86: reserve pages when SandyBridge integrated graphics is present
  (Xudong Hao, 2013-03-26; 1 file, -0/+1)

  SNB graphics devices have a bug that prevents them from accessing certain
  memory ranges, namely anything below 1M and the pages listed in the
  table.

  Xen does not add memory below 1MB to the heap, i.e. pages below 1MB are
  never allocated, so it's unnecessary to reserve the sub-1MB range again.
  Therefore, reserve only the pages listed in the table at Xen boot, if an
  SNB gfx device is detected on the CPU, to avoid GPU hangs.

  Signed-off-by: Xudong Hao <xudong.hao@intel.com>
  Acked-by: Keir Fraser <keir@xen.org>

* x86/mm: avoid undefined behavior in IS_NIL() (Xi Wang, 2013-03-15;
  1 file, -2/+2)

  Since pointer overflow is undefined behavior in C, some compilers such as
  clang optimize away the check !((ptr) + 1) in the macro IS_NIL(). This
  patch fixes the issue by casting the pointer type to uintptr_t, the
  operations of which are well-defined.

  Signed-off-by: Xi Wang <xi@mit.edu>

  With that, we also need to avoid the overflow in NIL(). Note that either
  part of the change results in the respective macros becoming unsuitable
  for use with "void".

  Signed-off-by: Jan Beulich <jbeulich@suse.com>

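  A sketch of the change described (the exact Xen macros may differ
  slightly); uintptr_t arithmetic has defined wraparound, so the compiler
  can no longer discard the check:

      #include <stdint.h>

      /* Before: (ptr) + 1 overflows the pointer, which is undefined
       * behavior, so a compiler may legally drop the whole test. */
      #define NIL_UB(type)    ((type *)-sizeof(type))
      #define IS_NIL_UB(ptr)  (!((ptr) + 1))

      /* After (sketch): wrap around on uintptr_t, where it is defined. */
      #define NIL(type)       ((type *)(0 - (uintptr_t)sizeof(type)))
      #define IS_NIL(ptr)     (!((uintptr_t)(ptr) + sizeof(*(ptr))))

  Note how sizeof(*(ptr)) in IS_NIL() is what makes the macro unusable with
  "void", matching the commit's remark.
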
* x86: rework hypercall argument translation area setup (Jan Beulich,
  2013-02-28; 1 file, -0/+2)

  ... using the new per-domain mapping management functions, adding
  destroy_perdomain_mapping() to the previously introduced pair.

  Rather than using an order-1 Xen heap allocation, use (currently 2)
  individual domain heap pages to populate space in the per-domain mapping
  area.

  Also fix a benign off-by-one mistake in is_compat_arg_xlat_range().

  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Acked-by: Keir Fraser <keir@xen.org>

* x86: introduce create_perdomain_mapping() (Jan Beulich, 2013-02-28;
  1 file, -0/+8)

  ... as well as free_perdomain_mappings(), and use them to carry out the
  existing per-domain mapping setup/teardown. This at once makes the setup
  of the first sub-range PV domain specific (with idle domains also
  excluded), as the GDT/LDT mapping area is needed only for those.

  Also fix an improperly scaled BUILD_BUG_ON() expression in
  mapcache_domain_init().

  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Acked-by: Keir Fraser <keir@xen.org>

* x86: consolidate initialization of PV guest L4 page tables (Jan Beulich,
  2013-01-23; 1 file, -0/+2)

  So far this has been repeated in 3 places, requiring one to remember to
  update all of them whenever a change is made.

  Signed-off-by: Jan Beulich <jbeulich@suse.com>

* miscellaneous cleanup (Jan Beulich, 2013-01-17; 1 file, -1/+0)

  ... noticed while putting together the 16Tb support patches for x86.

  Briefly, this (in order of the changes below)
  - fixes an inefficiency in x86's context switch code (translations
    to/from struct page are more involved than to/from MFNs)
  - drops unnecessary MFN-to-page conversions
  - drops a redundant call to destroy_xen_mappings() (an identical call is
    being made a few lines up)
  - simplifies a VA-to-MFN translation
  - drops dead code (several occurrences)
  - adds a missing __init annotation

  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Acked-by: Keir Fraser <keir@xen.org>

* x86: frame table related improvements (Jan Beulich, 2012-12-11; 1 file,
  -1/+1)

  - fix super page frame table setup for the memory hotplug case (should
    create a full table, or else the hotplug code would need to do the
    necessary table population)
  - simplify super page frame table setup (can re-use frame table setup
    code)
  - slightly streamline frame table setup code
  - fix (tighten) a BUG_ON() and an ASSERT() condition
  - fix spage <-> pdx conversion macros (they had no users so far, and
    hence no-one noticed how broken they were)

  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Acked-by: Keir Fraser <keir@xen.org>

* x86/mm: Comment the definitions of _mfn(), _gfn() &c. (Tim Deegan,
  2012-11-29; 1 file, -0/+5)

  It's not very easy to find them if you don't know to look for the
  TYPE_SAFE() macro.

  Signed-off-by: Tim Deegan <tim@xen.org>
  Committed-by: Tim Deegan <tim@xen.org>

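  For reference, the pattern behind those definitions looks roughly like
  this (simplified sketch; the real macro also has a release-build variant
  that degrades to a plain typedef):

      /* Wrap an integer in a one-member struct so mfn_t, gfn_t etc.
       * become distinct types the compiler refuses to mix up. */
      #define TYPE_SAFE(_type, _name)                                   \
          typedef struct { _type _name; } _name##_t;                    \
          static inline _name##_t _##_name(_type n)                     \
              { return (_name##_t){ n }; }                              \
          static inline _type _name##_x(_name##_t n)                    \
              { return n._name; }

      TYPE_SAFE(unsigned long, mfn);  /* mfn_t, _mfn(), mfn_x() */
      TYPE_SAFE(unsigned long, gfn);  /* gfn_t, _gfn(), gfn_x() */
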
* xen: replace XEN_GUEST_HANDLE with XEN_GUEST_HANDLE_PARAM when
  appropriate (Stefano Stabellini, 2012-10-17; 1 file, -4/+4)

  Note: these changes don't make any difference on x86.

  Replace XEN_GUEST_HANDLE with XEN_GUEST_HANDLE_PARAM when it is used as a
  hypercall argument.

  Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
  Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
  Acked-by: Keir Fraser <keir@xen.org>
  Committed-by: Ian Campbell <ian.campbell@citrix.com>

* x86: Remove CONFIG_COMPAT ifdef'ery from arch/x86 -- it is always defined
  (Keir Fraser, 2012-09-12; 1 file, -15/+0)

  Signed-off-by: Keir Fraser <keir@xen.org>

* x86: We can assume CONFIG_PAGING_LEVELS==4 (Keir Fraser, 2012-09-12;
  1 file, -10/+0)

  Signed-off-by: Keir Fraser <keir@xen.org>

* xen: Remove x86_32 build target (Keir Fraser, 2012-09-12; 1 file,
  -46/+0)

  Signed-off-by: Keir Fraser <keir@xen.org>

* x86: comment opaque expression in __page_to_virt() (Jan Beulich,
  2012-09-03; 1 file, -0/+6)

  mm.h's __page_to_virt() has a rather opaque expression. Comment it.

  Reported-by: Ian Campbell <ian.campbell@citrix.com>
  Suggested-by: Ian Jackson <ian.jackson@eu.citrix.com>
  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Acked-by: Keir Fraser <keir@xen.org>

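  The shape of the expression being commented is roughly the following (a
  sketch with simplified names, not the verbatim mm.h code): convert a
  frame-table pointer into a direct-map address while letting the compiler
  turn the divide/multiply pair into shifts even when sizeof(struct
  page_info) is not a power of two.

      /* Largest power-of-two divisor of x (x & -x isolates the LSB). */
      #define POW2_FACTOR(x)  ((x) & -(x))

      /* Byte offset into the frame table is index * sizeof(*pg); the
       * rearranged divide/multiply stays exact for valid pointers. */
      #define page_to_virt_sketch(pg)                                   \
          ((void *)(DIRECTMAP_VIRT_START +                              \
                    ((unsigned long)(pg) - FRAMETABLE_VIRT_START) /     \
                    (sizeof(*(pg)) / POW2_FACTOR(sizeof(*(pg)))) *      \
                    (PAGE_SIZE / POW2_FACTOR(sizeof(*(pg))))))
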
* AMD IOMMU: add mechanism to protect their PCI devices' config spaces
  (Jan Beulich, 2012-06-22; 1 file, -0/+2)

  Recent Dom0 kernels want to disable PCI MSI on all devices, yet doing so
  on AMD IOMMUs (which get represented by a PCI device) disables part of
  the functionality set up by the hypervisor.

  Add a mechanism to mark certain PCI devices as having write protected
  config spaces (both through port based [method 1] accesses and, for
  x86-64, mmconfig), and use that for AMD's IOMMUs.

  Note that due to ptwr_do_page_fault() being run first, there'll be a
  MEM_LOG() issued for each such mmconfig based write attempt. If that's
  undesirable, the order of the calls in fixup_page_fault() would need to
  be swapped.

  Signed-off-by: Jan Beulich <jbeulich@suse.com>
  Tested-by: Wei Wang <wei.wang2@amd.com>
  Acked-by: Keir Fraser <keir@xen.org>

* x86/mm: make p2m lock into an rwlock (Tim Deegan, 2012-05-17; 1 file,
  -0/+8)

  Because the p2m lock was already recursive, we need to add a new mm-lock
  class of recursive rwlocks.

  Signed-off-by: Tim Deegan <tim@xen.org>

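  A minimal sketch of such a write-side-recursive rwlock (field and
  function names are illustrative; rwlock_t and smp_processor_id() are
  assumed hypervisor primitives, and the real mm-locks code additionally
  records lock levels for order checking):

      typedef struct {
          rwlock_t lock;
          int locker;                 /* CPU holding write lock, -1 if none */
          unsigned int recurse_count; /* write-side recursion depth */
      } mm_rwlock_t;

      static void mm_write_lock(mm_rwlock_t *l)
      {
          if ( l->locker != smp_processor_id() )  /* first acquisition */
          {
              write_lock(&l->lock);
              l->locker = smp_processor_id();
          }
          l->recurse_count++;      /* nested acquisitions just count */
      }

      static void mm_write_unlock(mm_rwlock_t *l)
      {
          if ( --l->recurse_count )
              return;
          l->locker = -1;
          write_unlock(&l->lock);
      }
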
* x86/mm: Eliminate _shadow_mode_refcounts (Andres Lagar-Cavilla,
  2012-05-03; 1 file, -1/+0)

  Replace its only user with paging_mode_refcounts().

  Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>
  Acked-by: Tim Deegan <tim@xen.org>
  Committed-by: Tim Deegan <tim@xen.org>

* x86/mm: Clean up mem event structures on domain destruction (Tim Deegan,
  2012-03-08; 1 file, -0/+6)

  Otherwise we wind up with zombie domains, still holding onto refs to the
  mem event ring pages.

  Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>
  Acked-by: Tim Deegan <tim@xen.org>
  Committed-by: Tim Deegan <tim@xen.org>

* x86/mm: Make p2m lookups fully synchronized wrt modifications (Andres
  Lagar-Cavilla, 2012-02-10; 1 file, -3/+3)

  We achieve this by locking/unlocking the global p2m_lock in get/put_gfn.
  The lock is always taken recursively, as there are many paths that call
  get_gfn and later make another attempt at grabbing the p2m_lock.

  The lock is not taken for shadow lookups. We believe there are no
  problems remaining for synchronized p2m+shadow paging, but we are not
  enabling this combination due to lack of testing. Unlocked shadow p2m
  accesses are tolerable as long as shadows do not gain support for paging
  or sharing.

  HAP (EPT) lookups and all modifications do take the lock.

  Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>
  Acked-by: Tim Deegan <tim@xen.org>
  Committed-by: Tim Deegan <tim@xen.org>

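  The resulting caller pattern, sketched (signatures approximate, not the
  exact Xen prototypes):

      static void example(struct domain *d, unsigned long gfn)
      {
          p2m_type_t t;
          mfn_t mfn = get_gfn(d, gfn, &t); /* takes p2m lock, recursively */

          /* ... use mfn; the translation cannot change underneath us ... */

          put_gfn(d, gfn);                 /* drops the p2m lock */
      }
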
* x86/mm: Sharing overhaul style improvements (Andres Lagar-Cavilla,
  2012-01-26; 1 file, -1/+1)

  The name 'shared_info' for the list of shared pages backed by a shared
  frame collided with the identifier also used for a domain's shared info
  page. To avoid grep/cscope/etc aliasing, rename the shared memory token
  to 'sharing'.

  This patch only addresses style, and performs no functional changes. To
  ease reviewing, the patch was left as a stand-alone last-slot addition to
  the queue, to avoid propagating changes throughout the whole series.

  Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>
  Acked-by: Tim Deegan <tim@xen.org>
  Committed-by: Tim Deegan <tim@xen.org>

* x86/mm: Enforce lock ordering for sharing page locks (Andres
  Lagar-Cavilla, 2012-01-26; 1 file, -1/+2)

  Use the ordering constructs in mm-locks.h to enforce an order for the p2m
  and page locks in the sharing code. Applies to either the global sharing
  lock (in audit mode) or the per-page locks.

  Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>
  Signed-off-by: Adin Scannell <adin@scannell.ca>
  Acked-by: Tim Deegan <tim@xen.org>
  Committed-by: Tim Deegan <tim@xen.org>

* x86/mm: Add per-page locking for memory sharing, when audits are disabled
  (Andres Lagar-Cavilla, 2012-01-26; 1 file, -4/+23)

  With the removal of the hash table, all that is needed now is locking of
  individual shared pages, as new (gfn,domain) pairs are removed or added
  from the list of mappings. We recycle PGT_locked and use it to lock
  individual pages. We ensure deadlock is averted by locking pages in
  increasing order.

  The global lock remains for the benefit of the auditing code, and is thus
  enabled only as a compile-time option.

  Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>
  Signed-off-by: Adin Scannell <adin@scannell.ca>
  Acked-by: Tim Deegan <tim@xen.org>
  Committed-by: Tim Deegan <tim@xen.org>

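  The deadlock-avoidance rule reads roughly like this in code (illustrative
  helper only; page_lock() is assumed to set PGT_locked under the type-info
  word):

      /* Lock two shared pages without risking an AB/BA deadlock:
       * always take the lower-MFN page first. */
      static void lock_page_pair(struct page_info *a, struct page_info *b)
      {
          if ( page_to_mfn(a) > page_to_mfn(b) )
          {
              struct page_info *t = a;
              a = b;
              b = t;
          }
          page_lock(a);
          if ( b != a )
              page_lock(b);
      }
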
* x86/mm: Eliminate hash table in sharing code as index of shared mfns
  (Andres Lagar-Cavilla, 2012-01-26; 1 file, -2/+9)

  Eliminate the sharing hash table mechanism by storing a list head
  directly in the page_info for the case when the page is shared. This does
  not add any extra space to the page_info and serves to remove significant
  complexity from sharing.

  Signed-off-by: Adin Scannell <adin@scannell.ca>
  Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>
  Acked-by: Tim Deegan <tim@xen.org>
  Committed-by: Tim Deegan <tim@xen.org>

* x86: Disable set_gpfn_from_mfn until m2p table is allocated (Keir Fraser,
  2011-06-10; 1 file, -2/+13)

  This is a prerequisite for calling set_gpfn_from_mfn() unconditionally
  from free_heap_pages().

  Signed-off-by: Keir Fraser <keir@xen.org>

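  A sketch of the guard this adds (simplified; machine_to_phys_mapping is
  the M2P array, and the flag is set once that table exists):

      /* Set once the machine-to-phys table has been allocated/mapped. */
      static bool_t machine_to_phys_mapping_valid;

      static inline void set_gpfn_from_mfn(unsigned long mfn,
                                           unsigned long pfn)
      {
          if ( !machine_to_phys_mapping_valid )
              return;                     /* too early in boot: no-op */
          machine_to_phys_mapping[mfn] = pfn;
      }
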
* x86/mm: dedup the various copies of the shadow lock functions (Tim
  Deegan, 2011-06-02; 1 file, -0/+9)

  Define the lock and unlock functions once, and list all the locks in one
  place so (a) it's obvious what the locking discipline is and (b) none of
  the locks are visible to non-mm code. Automatically enforce that these
  locks never get taken in the wrong order.

  Signed-off-by: Tim Deegan <Tim.Deegan@citrix.com>

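  The automatic ordering enforcement can be sketched like this (simplified;
  per-CPU primitives are assumed, and the real code also restores the
  previous level on unlock):

      /* Each mm lock is declared with a level; on any one CPU, levels
       * must only increase, so taking a lower-level lock while holding
       * a higher-level one is caught immediately. */
      static DEFINE_PER_CPU(int, mm_lock_level);

      static void mm_lock(int level, spinlock_t *l)
      {
          BUG_ON(this_cpu(mm_lock_level) > level); /* ordering violation */
          spin_lock(l);
          this_cpu(mm_lock_level) = level;
      }
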
* x86: run-time callers of map_pages_to_xen() must check for errors (Jan
  Beulich, 2011-03-09; 1 file, -2/+0)

  Again, (out-of-memory) errors must not cause hypervisor crashes, and
  hence ought to be propagated.

  This also adjusts the cache attribute changing loop in
  get_page_from_l1e() to not go through an unnecessary iteration. While
  this could be considered mere cleanup, it is actually a requirement for
  the subsequent, now necessary, error recovery path.

  Also make a few functions static, easing the check for potential callers
  needing adjustment.

  Signed-off-by: Jan Beulich <jbeulich@novell.com>

* Use bool_t for various boolean variables (Keir Fraser, 2010-12-24;
  1 file, -2/+2)

  ... decreasing cache footprint. As a prerequisite this requires making
  cmdline_parse() a little more flexible.

  Also remove a few variables altogether, and adjust section annotations
  for several others.

  Signed-off-by: Jan Beulich <jbeulich@novell.com>
  Signed-off-by: Keir Fraser <keir@xen.org>

* x86 shadow: allocate all shadow memory in single pages (Tim Deegan,
  2010-09-01; 1 file, -5/+2)

  ... now that multi-page shadows need not be contiguous.

  Signed-off-by: Tim Deegan <Tim.Deegan@citrix.com>

* x86 shadow: explicitly link the pages of multipage shadows (Tim Deegan,
  2010-09-01; 1 file, -5/+10)

  ... together using their list headers. Update the users of the
  pinned-shadows list to expect l2_32 shadows to have four entries in the
  list, which must be kept together during updates.

  Signed-off-by: Tim Deegan <Tim.Deegan@citrix.com>

* x86 shadow: for multi-page shadows, explicitly track the first page (Tim
  Deegan, 2010-09-01; 1 file, -1/+2)

  ... (where the refcounts are) and check that none of the routines that do
  refcounting ever see the second, third or fourth page. This is just
  stating and enforcing an existing implicit requirement.

  Signed-off-by: Tim Deegan <Tim.Deegan@citrix.com>

* x86: put_superpage() must also work for !opt_allow_superpage (Keir
  Fraser, 2010-06-15; 1 file, -0/+1)

  This is because the P2M table, when placed at a kernel specified
  location, gets populated with large pages, which the domain must have a
  way to unmap/recycle.

  Additionally, when allowing Dom0 to use superpages, they ought to be
  tracked accordingly in the superpage frame table.

  Signed-off-by: Jan Beulich <jbeulich@novell.com>

* x86: Speed up PV-guest superpage mapping (Keir Fraser, 2010-05-27;
  1 file, -1/+23)

  The current version of superpage mapping takes a PGT_writable reference
  to every page in a superpage each time it is mapped. This is extremely
  slow, so slow that applications become unusable.

  My solution for this is to introduce a superpage table in the hypervisor,
  similar to the frametable structure for pages. Currently this table only
  has a type_info element. There are three types a superpage can have:
  SGT_mark, SGT_dynamic, or SGT_none.

  In normal operation, the first time a superpage is mapped, a PGT_writable
  reference is taken to each page in the superpage, the superpage is set to
  type SGT_dynamic, and the superpage typecount is incremented. On
  subsequent mappings and unmappings, only the superpage typecount changes.
  On the last unmap, the PGT_writable reference on each page is removed.

  The SGT_mark type is set and cleared through two new MMUEXT hypercalls,
  mark_super and unmark_super. When the hypercall is made, the superpage's
  type is set to SGT_mark and a PGT_writable reference is taken to its
  pages. On unmark, the type is cleared and the reference removed. If a
  page is already set to SGT_dynamic when mark_super is called, the type is
  changed to SGT_mark and no additional PGT_writable reference is taken. If
  there are still outstanding mappings of this superpage when unmark_super
  is called, the type is set to SGT_dynamic and the PGT_writable reference
  is not removed.

  Fast superpage mapping is only supported on 64-bit hypervisors. For
  32-bit hypervisors, superpage mapping is supported but will be extremely
  slow.

  Signed-off-by: Dave McCracken <dave.mccracken@oracle.com>

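  A compressed sketch of that fast path (the spage_info layout, SGT_*
  masks, and the per-page refcount helpers here are all illustrative, not
  the actual Xen code; the count is assumed to live in the low bits of
  type_info and the SGT_* type in the high bits):

      /* One spage_info per potential superpage, indexed like frame_table. */
      struct spage_info { unsigned long type_info; /* SGT_* | count */ };

      static int map_superpage(struct spage_info *sp)
      {
          if ( (sp->type_info & SGT_type_mask) == SGT_none )
          {
              /* First mapping: take a PGT_writable reference on every
               * constituent page, then switch to the dynamic type. */
              if ( get_writable_ref_on_each_page(sp) )
                  return -EINVAL;
              sp->type_info = SGT_dynamic;
          }
          sp->type_info++;   /* later maps/unmaps only touch the count */
          return 0;
      }

      static void unmap_superpage(struct spage_info *sp)
      {
          sp->type_info--;                          /* drop one mapping */
          if ( (sp->type_info & SGT_count_mask) == 0 &&
               (sp->type_info & SGT_type_mask) == SGT_dynamic )
          {
              /* Last unmap of a dynamically typed superpage: drop the
               * per-page PGT_writable references. */
              put_writable_ref_on_each_page(sp);
              sp->type_info = SGT_none;
          }
      }
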
* x86: Pull dynamic memory allocation out of do_boot_cpu() (Keir Fraser,
  2010-05-18; 1 file, -0/+1)

  This has two advantages: (a) we can move the allocations to a context
  where we can handle failure; (b) we can implement matching deallocations
  on CPU offline.

  Only the idle vcpu structure is now not freed on CPU offline. This
  probably does not really matter.

  Signed-off-by: Keir Fraser <keir.fraser@citrix.com>

* x86: Rename __sync_lazy_execstate() to __sync_local_execstate() (Keir
  Fraser, 2010-04-19; 1 file, -1/+1)

  This naming scheme is more rational. Also use the non-x86-specific
  function sync_local_execstate() where possible.

  Signed-off-by: Keir Fraser <keir.fraser@citrix.com>

* Establish a new abstraction of sharing handles (Keir Fraser, 2009-12-17;
  1 file, -0/+2)

  This patch establishes a new abstraction of sharing handles (encoded as a
  64-bit int), each corresponding to a single sharable page. Externally,
  all sharing related operations (e.g. nominate/share) will use sharing
  handles, thus solving a lot of consistency problems (like: is this
  sharable page still the same sharable page as before?). Internally,
  sharing handles can be translated to MFNs (using a newly created
  hashtable), and then for each MFN a doubly linked list of the GFNs
  translating to that MFN is maintained. Finally, the sharing handle is
  stored in the page_info structure for each sharable MFN. All this allows
  pages to be shared and unshared efficiently.

  However, at the moment a single lock is used to protect the sharing
  handle hash table. For scalability reasons, the locking needs to be made
  more granular.

  Signed-off-by: Grzegorz Milos <Grzegorz.Milos@citrix.com>

* Handle M2P translation for shared MFNs (Keir Fraser, 2009-12-17; 1 file,
  -4/+16)

  M2P translation cannot be handled through a flat table with only one slot
  per MFN when an MFN is shared. However, all existing callers can either
  infer the GFN (for example, the p2m table destructor) or will not need to
  know the GFN for shared pages. This patch identifies and fixes all the
  M2P accessors, either by removing the translation altogether or by making
  the relevant modifications. Shared MFNs have the special value
  SHARED_M2P_ENTRY stored in their M2P table slot.

  Signed-off-by: Grzegorz Milos <Grzegorz.Milos@citrix.com>

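  Sketched accessor-side handling (the sentinel value shown is
  illustrative; gfn_of() and its fallback parameter are hypothetical):

      /* A shared MFN has no unique GFN; its M2P slot holds a sentinel. */
      #define SHARED_M2P_ENTRY  (~0UL - 1UL)      /* illustrative value */
      #define SHARED_M2P(gpfn)  ((gpfn) == SHARED_M2P_ENTRY)

      static unsigned long gfn_of(unsigned long mfn,
                                  unsigned long fallback_gfn)
      {
          unsigned long gpfn = get_gpfn_from_mfn(mfn);

          /* Callers either infer the GFN from context (fallback_gfn) or
           * must not rely on a unique reverse translation. */
          return SHARED_M2P(gpfn) ? fallback_gfn : gpfn;
      }
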
* Define PGT_shared_page and the synthetic 'dom_cow' domain (Keir Fraser,
  2009-12-17; 1 file, -20/+25)

  This patch defines a new PGT type called PGT_shared_page and a new
  synthetic domain called 'dom_cow'. In order to share a page, its type
  needs to be changed to PGT_shared_page and its owner to dom_cow. Only
  pages with type PGT_none and no type count are allowed to become
  sharable. Conversely, sharable pages can only be made 'private' again if
  the type count equals one. page_make_sharable() and page_make_private()
  handle these transitions.

  Signed-off-by: Grzegorz Milos <Grzegorz.Milos@citrix.com>

* x86_32: Fix build after RDTSCP and memory hotplug changes (Keir Fraser,
  2009-12-14; 1 file, -3/+8)

  Signed-off-by: Yunhong Jiang <yunhong.jiang@intel.com>
  Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>

* memory hotadd 7/7: hypercall support (Keir Fraser, 2009-12-11; 1 file,
  -0/+6)

  The basic work flow to handle memory hotadd is:
  - update node information
  - map the new pages into Xen's 1:1 mapping
  - set up the frame table for the new memory range
  - set up the m2p table for the new memory range
  - put the new pages into the domheap

  Signed-off-by: Jiang, Yunhong <yunhong.jiang@intel.com>

* memory hotadd 5/7: sync mapping changes caused by memory hotplug in the
  page fault handler (Keir Fraser, 2009-12-11; 1 file, -0/+15)

  For compat guests the compat m2p table is copied, not directly mapped in
  L3, so we have to sync it. The direct mapping range may change as well,
  and we need to sync it with the guest's tables.

  Signed-off-by: Jiang, Yunhong <yunhong.jiang@intel.com>

* SRAT memory hotplug 2/2: Support overlapped and sparse node memory
  arrangement (Keir Fraser, 2009-12-09; 1 file, -0/+1)

  Currently the Xen hypervisor uses a nodes array to keep the start/end
  address of each node. It assumes memory among nodes has no overlap, which
  is not always true, especially if we have memory hotplug support in the
  system. This patch backports the Linux kernel's memblks to support
  overlap among nodes. The memblks are used both for checking conflicts and
  for calculating memnode_shift.

  Also, currently if a node has no memory populated when the system boots,
  the node is dropped later and the corresponding CPUs' NUMA information is
  removed as well. This patch keeps the CPU information.

  One thing to note is that we currently calculate memnode_shift with all
  memory, including un-populated ranges. This should work as long as the
  smallest chunk is not too small. Another option could be flags in the
  page_info structure, etc.

  The memnodemap is changed from paddr to pdx, both to save space and
  because most accesses are from pfn anyway. A mem_hotplug flag is added,
  set when there is a hotplug memory range.

  Signed-off-by: Jiang, Yunhong <yunhong.jiang@intel.com>

* x86: reduce the uses of CONFIG_COMPAT (Keir Fraser, 2009-10-12; 1 file,
  -4/+0)

  ... to where it really is needed and meaningful (i.e. in some places it
  seems to make more sense to use __x86_64__ instead).

  Signed-off-by: Jan Beulich <jbeulich@novell.com>

* x86: map frame table sparsely (Keir Fraser, 2009-09-22; 1 file, -0/+4)

  Avoid backing frame table holes with memory, when those holes are large
  enough to cover an exact multiple of large pages. This is based on the
  introduction of a bit map, where each bit represents one such range, thus
  allowing mfn_valid() checks to easily filter out those MFNs that now
  shouldn't be used to index the frame table.

  This allows for saving a couple of 2M pages even on "normal" systems.

  Signed-off-by: Jan Beulich <jbeulich@novell.com>

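  Sketch of the filtering described (names roughly follow Xen's pdx-group
  scheme but should be treated as illustrative; pfn_to_pdx() is the index
  compression from the following entry):

      /* One bit per PDX_GROUP_COUNT-sized stretch of the frame table;
       * only stretches actually backed by memory have their bit set. */
      static unsigned long pdx_group_valid[BITS_TO_LONGS(MAX_PDX_GROUPS)];

      static inline bool_t mfn_valid(unsigned long mfn)
      {
          unsigned long pdx = pfn_to_pdx(mfn);

          return pdx < max_pdx &&
                 test_bit(pdx / PDX_GROUP_COUNT, pdx_group_valid);
      }
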
* x86-64: reduce range spanned by 1:1 mapping and frame table indexes (Keir
  Fraser, 2009-09-22; 1 file, -10/+36)

  Introduce a virtual-space-conserving transformation on the MFN thus far
  used to index the 1:1 mapping and frame table: remove from the MFN
  representation the largest range of contiguous bits (below the most
  significant one) which are zero for all valid MFNs, and use the result to
  index into those arrays, thereby cutting the virtual range these tables
  must cover approximately in half with each bit removed.

  Since this should account for hotpluggable memory (in order to not
  require a re-write when that gets supported), the determination of which
  bits are candidates for removal must not be based on the E820
  information, but instead has to use the SRAT. That in turn requires a
  change to the ordering of steps done during early boot.

  Signed-off-by: Jan Beulich <jbeulich@novell.com>

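  The transformation itself is a pair of mask-and-shift operations once
  boot code has determined the removable bit range (a sketch with
  simplified variable names; the masks and hole width are computed from the
  SRAT at boot):

      static unsigned int  pfn_pdx_hole_shift;   /* width of removed run */
      static unsigned long pfn_pdx_bottom_mask = ~0UL; /* bits below hole */
      static unsigned long pfn_top_mask;               /* bits above hole */

      static inline unsigned long pfn_to_pdx(unsigned long pfn)
      {
          /* Drop the always-zero run: keep the low bits, shift the high
           * bits down over the hole. */
          return (pfn & pfn_pdx_bottom_mask) |
                 ((pfn & pfn_top_mask) >> pfn_pdx_hole_shift);
      }

      static inline unsigned long pdx_to_pfn(unsigned long pdx)
      {
          /* Inverse: shift the high bits back up over the hole. */
          return (pdx & pfn_pdx_bottom_mask) |
                 ((pdx << pfn_pdx_hole_shift) & pfn_top_mask);
      }
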