3 files changed, 133 insertions, 111 deletions
diff --git a/.rootkeys b/.rootkeys
index e1f67d7c3f..99803cc8e7 100644
--- a/.rootkeys
+++ b/.rootkeys
@@ -193,6 +193,7 @@
 3ddb79bcbOVHh38VJzc97-JEGD4dJQ xen/Makefile
 3ddb79bcCa2VbsMp7mWKlhgwLQUQGA xen/README
 3ddb79bcWnTwYsQRWl_PaneJfa6p0w xen/Rules.mk
+3e74d2be6ELqhaY1sW0yyHRKhpOvDQ xen/TODO
 3ddb79bcZbRBzT3elFWSX7u6NtMagQ xen/arch/i386/Makefile
 3ddb79bcBQF85CfLS4i1WGZ4oLLaCA xen/arch/i386/Rules.mk
 3e5636e5FAYZ5_vQnmgwFJfSdmO5Mw xen/arch/i386/acpitable.c
diff --git a/xen/README b/xen/README
index 3518b8254a..ea14e52d86 100644
--- a/xen/README
+++ b/xen/README
@@ -1,110 +1,17 @@
 
 *****************************************************
-   Xeno Hypervisor (18/7/02)
+   Xeno Hypervisor (16/3/03)
 
-1) Tree layout
-Looks rather like a simplified Linux :-)
-Headers are in include/xeno and include asm-<arch>.
-At build time we create symlinks:
- include/linux -> include/xeno
- include/asm   -> include/asm-<arch>
-In this way, Linux device drivers should need less tweaking of
-their #include lines.
-
-For source files, mapping between hypervisor and Linux is:
- Linux                 Hypervisor
- -----                 ----------
- kernel/init/mm/lib -> common
- net/*              -> net/*
- drivers/*          -> drivers/*
- arch/*             -> arch/*
-
-Note that the use of #include <asm/...> and #include <linux/...> can
-lead to confusion, as such files will often exist on the system include
-path, even if a version doesn't exist within the hypervisor tree.
-Unfortunately '-nostdinc' cannot be specified to the compiler, as that
-prevents us using stdarg.h in the compiler's own header directory.
-
-We try to not modify things in driver/* as much as possible, so we can
-easily take updates from Linux. arch/* is basically straight from
-Linux, with fingers in Linux-specific pies hacked off. common/* has
-a lot of Linux code in it, but certain subsystems (task maintenance,
-low-level memory handling) have been replaced. net/* contains enough
-Linux-like gloop to get network drivers to work with little/no
-modification.
-
-2) Building
 'make': Builds ELF executable called 'image' in base directory
-'make install': gzip-compresses 'image' and copies it to TFTP server
 'make clean': removes *all* build and target files
 
 
-*****************************************************
-Random thoughts and stuff from here down...
-
-Todo list
----------
-* Hypervisor need only directly map its own memory pool
-  (maybe 128MB, tops). That would need 0x08000000....
-  This would allow 512MB Linux with plenty room for vmalloc'ed areas.
-* Network device -- port drivers to hypervisor, implement virtual
-  driver for xeno-linux. Looks like Ethernet.
-  -- Hypervisor needs to do (at a minimum):
-       - packet filtering on tx (unicast IP only)
-       - packet demux on rx     (unicast IP only)
-       - provide DHCP [maybedo something simpler?]
-         and ARP [at least for hypervisor IP address]
-
-
-Segment descriptor tables
--------------------------
-We want to allow guest OSes to specify GDT and LDT tables using their
-own pages of memory (just like with page tables). So allow the following:
- * new_table_entry(ptr, val)
-   [Allows insertion of a code, data, or LDT descriptor into given
-    location. Can simply be checked then poked, with no need to look at
-    page type.]
- * new_GDT() -- relevent virtual pages are resolved to frames. Either
-    (i) page not present; or (ii) page is only mapped read-only and checks
-    out okay (then marked as special page). Old table is resolved first,
-    and the pages are unmarked (no longer special type).
- * new_LDT() -- same as for new_GDT(), with same special page type.
-
-Page table updates must be hooked, so we look for updates to virtual page
-addresses in the GDT/LDT range. If map to not present, then old physpage
-has type_count decremented. If map to present, ensure read-only, check the
-page, and set special type.
-
-Merge set_{LDT,GDT} into update_baseptr, by passing four args:
- update_baseptrs(mask, ptab, gdttab, ldttab);
-Update of ptab requires update of gtab (or set to internal default).
-Update of gtab requires update of ltab (or set to internal default).
-
-
-The hypervisor page cache
--------------------------
-This will allow guest OSes to make use of spare pages in the system, but
-allow them to be immediately used for any new domains or memory requests.
-The idea is that, when a page is laundered and falls off Linux's clean_LRU
-list, rather than freeing it it becomes a candidate for passing down into
-the hypervisor. In return, xeno-linux may ask for one of its previously-
-cached pages back:
- (page, new_id) = cache_query(page, old_id);
-If the requested page couldn't be kept, a blank page is returned.
-When would Linux make the query? Whenever it wants a page back without
-the delay or going to disc. Also, whenever a page would otherwise be
-flushed to disc.
-
-To try and add to the cache: (blank_page, new_id) = cache_query(page, NULL);
- [NULL means "give me a blank page"].
-To try and retrieve from the cache: (page, new_id) = cache_query(x_page, id)
- [we may request that x_page just be discarded, and therefore not impinge
-  on this domain's cache quota].
-
 
 Booting secondary processors
 ----------------------------
 
+It's twisty and turny, so this is (roughly) the code path:
+
 start_of_day (i386/setup.c)
 smp_boot_cpus (i386/smpboot.c)
  * initialises boot CPU data
@@ -128,18 +35,3 @@ On other processor:
        * barrier, then write bitmasks to signal back to boot cpu
        * then barrel into...
          cpu_idle (i386/process.c)
-         [THIS IS PROBABLY REASONABLE -- BOOT CPU SHOULD KICK
-          SECONDARIES TO GET WORK DONE]
-
-
-SMP capabilities
-----------------
-
-Current intention is to allow hypervisor to schedule on all processors in
-SMP boxen, but to tie each domain to a single processor. This simplifies
-many SMP intricacies both in terms of correctness and efficiency (eg.
-TLB flushing, network packet delivery, ...).
-
-Clients can still make use of SMP by installing multiple domains on a single
-machine, and treating it as a fast cluster (at the very least, the
-hypervisor will have fast routing of locally-destined packets).
diff --git a/xen/TODO b/xen/TODO
new file mode 100644
index 0000000000..e81023e995
--- /dev/null
+++ b/xen/TODO
@@ -0,0 +1,129 @@
+
+This is stuff we probably want to implement in the near future. I
+think I have them in a sensible priority order -- the first few would
+be nice to fix before a code release. The later ones can be
+longer-term goals.
+
+ -- Keir (16/3/03)
+
+
+1. ASSIGNING DOMAINS TO PROCESSORS
+----------------------------------
+More intelligent assignment of domains to processors. In
+particular, we don't play well with hyperthreading: we will assign
+domains to virtual processors on the same package, rather than
+spreading them across processor packages.
+
+What we need to do is port code from Linux which stores information on
+relationships between processors in the system (eg. which ones are
+siblings in the same package). We then use this to balance domains
+across packages, and across virtual processors within a package.
+
+2. PROPER DESTRUCTION OF DOMAINS
+--------------------------------
+Currently we do not free resources when destroying a domain. This is
+because they may be tied up in subsystems, and there is no way of
+pulling them back in a safe manner.
+
+The fix is probably to reference count resources and automatically
+free them when the count reaches zero. We may get away with one count
+per domain (for all its resources). When this reaches zero we know it
+is safe to free everything: block-device rings, network rings, and all
+the rest.
+
+3. FIX HANDLING OF NETWORK RINGS
+--------------------------------
+Handling of the transmit rings is currently very broken (for example,
+sending an inter-domain packet will wedge the hypervisor). This is
+because we may handle packets out of order (eg. inter-domain packets
+are handled eagerly, while packets for real interfaces are queued),
+but our current ring design really assumes in-order handling.
+
+A neat fix will be to allow responses to be queued in a different
+order to requests, just as we already do with block-device
+rings. We'll need to add an opaque identifier to ring entries,
+allowing matching of requests and responses, but that's about it.
+
+4. GDT AND LDT VIRTUALISATION
+-----------------------------
+We do not yet allow modification of the GDT, or any use of the LDT.
+Both are necessary for support of unmodified applications (eg. Linux
+uses LDT in threaded applications, while Windows needs to update GDT
+entries).
+
+I have some text on how to do this:
+/usr/groups/xeno/discussion-docs/memory_management/segment_tables.txt
+It's already half implemented, but the rest is still to do.
+
+5. DOMAIN 0 MANAGEMENT DAEMON
+-----------------------------
+A better control daemon is required for domain 0, which keeps proper
+track of machine resources and can make sensible policy choices. This
+may require support in Xen; for example, notifications (eg. DOMn is
+killed), and requests (eg. can DOMn allocate x frames of memory?).
+
+6. ACCURATE TIMERS AND WALL-CLOCK TIME
+--------------------------------------
+Currently our long-term timebase free runs on CPU0, with no external
+calibration. We should run ntpd on domain 0 and allow this to warp
+Xen's timebase. Once this is done, we can have a timebase per CPU and
+not worry about relative drift (since they'll all get sync'ed
+periodically by ntp).
+
+7. NEW DESIGN FEATURES
+----------------------
+This includes the last-chance page cache, and the unified buffer cache.
+
+
+
+Graveyard
+*********
+
+Following is some description of how some of the above might be
+implemented. Some of it is superseded and/or out of date, so follow
+with caution.
+
+Segment descriptor tables
+-------------------------
+We want to allow guest OSes to specify GDT and LDT tables using their
+own pages of memory (just like with page tables). So allow the following:
+ * new_table_entry(ptr, val)
+   [Allows insertion of a code, data, or LDT descriptor into given
+    location. Can simply be checked then poked, with no need to look at
+    page type.]
+ * new_GDT() -- relevant virtual pages are resolved to frames. Either
+    (i) page not present; or (ii) page is only mapped read-only and checks
+    out okay (then marked as special page). Old table is resolved first,
+    and the pages are unmarked (no longer special type).
+ * new_LDT() -- same as for new_GDT(), with same special page type.
+
+Page table updates must be hooked, so we look for updates to virtual page
+addresses in the GDT/LDT range. If map to not present, then old physpage
+has type_count decremented. If map to present, ensure read-only, check the
+page, and set special type.
+
+Merge set_{LDT,GDT} into update_baseptrs, by passing four args:
+ update_baseptrs(mask, ptab, gdttab, ldttab);
+Update of ptab requires update of gdttab (or set to internal default).
+Update of gdttab requires update of ldttab (or set to internal default).
+
+
+The hypervisor page cache
+-------------------------
+This will allow guest OSes to make use of spare pages in the system, but
+allow them to be immediately used for any new domains or memory requests.
+The idea is that, when a page is laundered and falls off Linux's clean_LRU
+list, rather than being freed it becomes a candidate for passing down into
+the hypervisor. In return, xeno-linux may ask for one of its previously-
+cached pages back:
+ (page, new_id) = cache_query(page, old_id);
+If the requested page couldn't be kept, a blank page is returned.
+When would Linux make the query? Whenever it wants a page back without
+the delay of going to disc. Also, whenever a page would otherwise be
+flushed to disc.
+
+To try and add to the cache: (blank_page, new_id) = cache_query(page, NULL);
+ [NULL means "give me a blank page"].
+To try and retrieve from the cache: (page, new_id) = cache_query(x_page, id)
+ [we may request that x_page just be discarded, and therefore not impinge
+  on this domain's cache quota].