Diffstat (limited to 'docs/src/interface/devices.tex')

 docs/src/interface/devices.tex | 178 ----------------------------------------
 1 file changed, 0 insertions(+), 178 deletions(-)
diff --git a/docs/src/interface/devices.tex b/docs/src/interface/devices.tex
deleted file mode 100644
index ddd0cd1d2b..0000000000
--- a/docs/src/interface/devices.tex
+++ /dev/null
@@ -1,178 +0,0 @@
\chapter{Devices}
\label{c:devices}

Devices such as network and disk are exported to guests using a split
device driver. The device driver domain, which accesses the physical
device directly, also runs a \emph{backend} driver, serving requests to
that device from guests. Each guest uses a simple \emph{frontend}
driver to access the backend. Communication between these domains has
two parts: first, data is placed onto a shared memory page between the
domains; second, an event channel between the two domains is used to
pass notification that data is outstanding. This separation of
notification from data transfer allows message batching, and results
in very efficient device access.

Event channels are used extensively in device virtualization; each
domain has a number of end-points or \emph{ports}, each of which may be
bound to one of the following \emph{event sources}:
\begin{itemize}
  \item a physical interrupt from a real device,
  \item a virtual interrupt (callback) from Xen, or
  \item a signal from another domain.
\end{itemize}

Events are lightweight and carry little information beyond the source
of the notification. Hence, when performing bulk data transfer, events
are typically used as synchronization primitives over a shared-memory
transport. Event channels are managed via the {\tt
  event\_channel\_op()} hypercall; for more details see
Section~\ref{s:idc}.

This chapter focuses on some individual device interfaces available to
Xen guests.


\section{Network I/O}

Virtual network device services are provided by shared-memory
communication with a backend domain.
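The split between data transfer and notification described above can be modelled in a few lines of Python. This is an illustrative sketch only (the class and method names are hypothetical, not Xen structures): several messages are placed on the shared transport but cost only a single event, which is what makes batching cheap.

```python
# Illustrative model (not Xen code): data is queued on a "shared page"
# and a single event notifies the peer, so many messages cost one event.

class SharedRing:
    """Toy shared-memory transport with decoupled notification."""

    def __init__(self):
        self.messages = []          # stands in for the shared memory page
        self.event_pending = False  # stands in for the event channel

    def send(self, msg, notify=False):
        self.messages.append(msg)      # place data on the shared page
        if notify:
            self.event_pending = True  # raise the event channel

    def receive_all(self):
        # On notification, the peer drains every outstanding message;
        # the notification itself carries no data.
        if not self.event_pending:
            return []
        self.event_pending = False
        batch, self.messages = self.messages, []
        return batch

ring = SharedRing()
ring.send("pkt1")                # queued, no event yet
ring.send("pkt2")
ring.send("pkt3", notify=True)   # one event covers all three messages
assert ring.receive_all() == ["pkt1", "pkt2", "pkt3"]
assert ring.receive_all() == []  # no event pending, nothing delivered
```

Note how the receiver learns only that *something* is outstanding; it must inspect the shared transport to find out what.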
From the point of view of other domains, the backend may be viewed as
a virtual ethernet switch element, with each domain having one or more
virtual network interfaces connected to it.

\subsection{Backend Packet Handling}

The backend driver is responsible for a variety of actions relating to
the transmission and reception of packets from the physical device.
With regard to transmission, the backend performs these key actions:

\begin{itemize}
\item {\bf Validation:} To ensure that domains do not attempt to
  generate invalid (e.g.\ spoofed) traffic, the backend driver may
  validate headers, ensuring that the source MAC and IP addresses
  match the interface from which they were sent.

  Validation functions can be configured using standard firewall rules
  ({\small{\tt iptables}} in the case of Linux).

\item {\bf Scheduling:} Since a number of domains can share a single
  physical network interface, the backend must mediate access when
  several domains each have packets queued for transmission. This
  general scheduling function subsumes basic shaping and rate-limiting
  schemes.

\item {\bf Logging and Accounting:} The backend domain can be
  configured with classifier rules that control how packets are
  accounted and logged. For example, log messages might be generated
  whenever a domain attempts to send a TCP packet containing a SYN.
\end{itemize}

On receipt of incoming packets, the backend acts as a simple
demultiplexer: packets are passed to the appropriate virtual interface
after any necessary logging and accounting have been carried out.

\subsection{Data Transfer}

Each virtual interface uses two ``descriptor rings'', one for
transmit, the other for receive. Each descriptor identifies a block
of contiguous physical memory allocated to the domain.

The transmit ring carries packets to transmit from the guest to the
backend domain.
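A descriptor of this kind, and the request/response halves of the transmit ring, can be sketched as follows. The structure and field names here are hypothetical illustrations, not the real Xen layouts: the point is that a descriptor names a contiguous region of guest memory, and the return path tells the guest when the backend is finished with those pages.

```python
# Illustrative sketch (hypothetical names, not the real Xen structures):
# each transmit descriptor identifies a contiguous block of guest memory
# holding a packet; the return path reports that the backend has
# finished with those pages.

from collections import namedtuple

TxDescriptor = namedtuple("TxDescriptor", ["page_addr", "offset", "length"])

class TransmitRing:
    def __init__(self):
        self.requests = []    # guest -> backend: packets to send
        self.responses = []   # backend -> guest: pages now reusable

    def guest_queue(self, desc):
        self.requests.append(desc)

    def backend_transmit_all(self):
        # The backend "transmits" each packet, then returns the
        # descriptor so the guest knows the pages are free to reuse.
        while self.requests:
            self.responses.append(self.requests.pop(0))

ring = TransmitRing()
ring.guest_queue(TxDescriptor(page_addr=0x1000, offset=0, length=1514))
ring.backend_transmit_all()
assert ring.responses[0].page_addr == 0x1000  # page may be reused now
```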
The return path of the transmit ring carries messages indicating that
the contents have been physically transmitted and that the backend no
longer requires the associated pages of memory.

To receive packets, the guest places descriptors for unused pages on
the receive ring. The backend returns received packets by exchanging
these pages in the domain's memory with new pages containing the
received data, and passing back descriptors for the new packets on
the ring. This zero-copy approach allows the backend to maintain a
pool of free pages to receive packets into, and then deliver them to
the appropriate domains after examining their headers.

% Real physical addresses are used throughout, with the domain
% performing translation from pseudo-physical addresses if that is
% necessary.

If a domain does not keep its receive ring stocked with empty buffers
then packets destined for it may be dropped. This provides some
defence against receive-livelock problems, because an overloaded
domain will cease to receive further data. Similarly, on the transmit
path, it provides the application with feedback on the rate at which
packets are able to leave the system.

Flow control on the rings is achieved by including a pair of producer
indices on the shared ring page. Each side maintains a private
consumer index indicating the next outstanding message. In this
manner, the domains cooperate to divide the ring into two message
lists, one in each direction. Notification is decoupled from the
immediate placement of new messages on the ring; the event channel is
used to generate notification when {\em either} a certain number of
outstanding messages are queued, {\em or} a specified number of
nanoseconds have elapsed since the oldest message was placed on the
ring.
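The index arithmetic behind this flow-control scheme can be modelled concretely. In the sketch below (hypothetical names; the timeout trigger is omitted for brevity), the two producer indices live on the shared page, each side keeps its consumer index private, and an event is raised only once enough messages are outstanding:

```python
# Minimal model of ring flow control: shared producer indices, private
# consumer indices, and notification only past a message-count
# threshold. (Hypothetical sketch; the nanosecond timeout is omitted.)

RING_SIZE = 8

class Ring:
    def __init__(self, notify_threshold=2):
        self.slots = [None] * RING_SIZE
        self.req_prod = 0   # on the shared page: written by frontend
        self.rsp_prod = 0   # on the shared page: written by backend
        self.req_cons = 0   # private to the backend
        self.rsp_cons = 0   # private to the frontend
        self.notify_threshold = notify_threshold

    def frontend_put(self, msg):
        """Place a request; return True if an event should be sent."""
        self.slots[self.req_prod % RING_SIZE] = msg
        self.req_prod += 1
        outstanding = self.req_prod - self.req_cons
        return outstanding >= self.notify_threshold

    def backend_consume(self):
        """Drain requests, writing one response per consumed slot."""
        msgs = []
        while self.req_cons < self.req_prod:
            msgs.append(self.slots[self.req_cons % RING_SIZE])
            self.req_cons += 1
            # Response reuses the slot just freed by the request.
            self.slots[self.rsp_prod % RING_SIZE] = "done"
            self.rsp_prod += 1
        return msgs

ring = Ring(notify_threshold=2)
assert ring.frontend_put("read A") is False  # 1 outstanding: no event
assert ring.frontend_put("read B") is True   # threshold hit: notify
assert ring.backend_consume() == ["read A", "read B"]
```

Because responses are produced into the slots that consumed requests vacate, the two message lists share one ring without colliding.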
%% Not sure if my version is any better -- here is what was here
%% before: Synchronization between the backend domain and the guest is
%% achieved using counters held in shared memory that is accessible to
%% both. Each ring has associated producer and consumer indices
%% indicating the area in the ring that holds descriptors that contain
%% data. After receiving {\it n} packets or {\t nanoseconds} after
%% receiving the first packet, the hypervisor sends an event to the
%% domain.


\section{Block I/O}

All guest OS disk access goes through the virtual block device (VBD)
interface. This interface allows domains to access portions of the
block storage devices visible to the block backend device. The VBD
interface is a split driver, similar to the network interface
described above. A single shared-memory ring is used between the
frontend and backend drivers, across which read and write messages
are sent.

Any block device accessible to the backend domain, including
network-based block devices (iSCSI, *NBD, etc.), loopback and LVM/MD
devices, can be exported as a VBD. Each VBD is mapped to a device node
in the guest, specified in the guest's startup configuration.

Old (Xen 1.2) virtual disks are not supported under Xen 2.0, since
similar functionality can be achieved using the more complete LVM
system, which is already in widespread use.

\subsection{Data Transfer}

The single ring between the guest and the block backend supports
three messages:

\begin{description}
\item [{\small {\tt PROBE}}:] Return a list of the VBDs available to
  this guest from the backend. The request includes a descriptor of a
  free page into which the reply will be written by the backend.

\item [{\small {\tt READ}}:] Read data from the specified block
  device. The frontend identifies the device and location to read
  from, and attaches pages for the data to be copied to (typically
  via DMA from the device).
The backend acknowledges completed read
  requests as they finish.

\item [{\small {\tt WRITE}}:] Write data to the specified block
  device. This functions essentially as {\small {\tt READ}}, except
  that the data moves to the device instead of from it.
\end{description}

%% um... some old text: In overview, the same style of descriptor-ring
%% that is used for network packets is used here. Each domain has one
%% ring that carries operation requests to the hypervisor and carries
%% the results back again.

%% Rather than copying data, the backend simply maps the domain's
%% buffers in order to enable direct DMA to them. The act of mapping
%% the buffers also increases the reference counts of the underlying
%% pages, so that the unprivileged domain cannot try to return them to
%% the hypervisor, install them as page tables, or any other unsafe
%% behaviour.
%%
%% % block API here
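The three block-ring messages can be exercised with a toy dispatcher. Everything here is a hypothetical illustration (the device name, the in-memory byte array, and the message dictionaries are invented for this sketch): real requests carry page descriptors for DMA, not Python byte strings.

```python
# Toy dispatcher for the three block-ring messages (PROBE, READ, WRITE).
# Names and the in-memory "disk" are hypothetical; real requests attach
# page descriptors for DMA rather than passing data by value.

SECTOR = 512

class BlockBackend:
    def __init__(self, vbds):
        # vbds: {device_name: bytearray acting as the backing store}
        self.vbds = vbds

    def handle(self, msg):
        op = msg["op"]
        if op == "PROBE":
            # In Xen the reply is written into a free page supplied by
            # the guest; here we simply return the available VBDs.
            return sorted(self.vbds)
        if op == "READ":
            disk = self.vbds[msg["device"]]
            start = msg["sector"] * SECTOR
            return bytes(disk[start:start + msg["nsectors"] * SECTOR])
        if op == "WRITE":
            disk = self.vbds[msg["device"]]
            start = msg["sector"] * SECTOR
            disk[start:start + len(msg["data"])] = msg["data"]
            return len(msg["data"])
        raise ValueError("unknown operation: %r" % op)

backend = BlockBackend({"xvda": bytearray(4096)})
assert backend.handle({"op": "PROBE"}) == ["xvda"]
backend.handle({"op": "WRITE", "device": "xvda", "sector": 1,
                "data": b"hello"})
out = backend.handle({"op": "READ", "device": "xvda",
                      "sector": 1, "nsectors": 1})
assert out[:5] == b"hello"
```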