Implement vector callback for PVHVM and unify event channel implementations

Re-structure Xen HVM support so that:
    - Xen is detected and hypercalls can be performed very early in
      system startup.
    - Xen interrupt services are implemented using FreeBSD's native
      interrupt delivery infrastructure.
    - the Xen interrupt service implementation is shared between PV
      and HVM guests.
    - Xen interrupt handlers can optionally use a filter handler in
      order to avoid the overhead of dispatch to an interrupt thread.
    - interrupt load can be distributed among all available CPUs.
    - the overhead of accessing the emulated local and I/O APICs on
      HVM is removed for event channel port events.
    - a similar optimization can eventually, and fairly easily, be
      used to optimize MSI.

Early Xen detection, HVM refactoring, PVHVM interrupt infrastructure,
and misc Xen cleanups:

Sponsored by: Spectra Logic Corporation

Unification of PV & HVM interrupt infrastructure, bug fixes,
and misc Xen cleanups:

Submitted by: Roger Pau Monné

sys/x86/x86/local_apic.c:
sys/amd64/include/apicvar.h:
sys/i386/include/apicvar.h:
sys/amd64/amd64/apic_vector.S:
sys/i386/i386/apic_vector.s:
sys/amd64/amd64/machdep.c:
sys/i386/i386/machdep.c:
sys/i386/xen/exception.s:
sys/x86/include/segments.h:
    Reserve IDT vector 0x93 for the Xen event channel upcall
    interrupt handler.  On Hypervisors that support the direct
    vector callback feature, we can request that this vector be
    called directly by an injected HVM interrupt event, instead
    of a simulated PCI interrupt on the Xen platform PCI device.
    This avoids all of the overhead of dealing with the emulated
    I/O APIC and local APIC.  It also means that the Hypervisor
    can inject these events on any CPU, allowing upcalls for
    different ports to be handled in parallel.

sys/amd64/amd64/mp_machdep.c:
sys/i386/i386/mp_machdep.c:
    Map Xen per-vcpu area during AP startup.

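For illustration only: a minimal sketch of how a direct vector callback
request can be encoded, assuming the HVM_PARAM_CALLBACK_IRQ layout
documented in Xen's public hvm/params.h (bits 63:56 select the delivery
type, and type 2 means bits 7:0 carry an IDT vector).  Only the vector
number 0x93 comes from the description above; the helper names
hvm_callback_vector(), hvm_callback_type() and hvm_callback_selfcheck()
are hypothetical and are not taken from this change.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumed encoding, mirrored from the comment in Xen's public
 * hvm/params.h: val[63:56] == 2 means val[7:0] is a vector number.
 */
#define HVM_CB_TYPE_VECTOR	2
#define HVM_CB_TYPE_SHIFT	56

/* IDT vector reserved for the event channel upcall by this change. */
#define IDT_EVTCHN		0x93

/* Hypothetical helper: build the parameter value for a vector callback. */
static inline uint64_t
hvm_callback_vector(uint8_t vector)
{

	return (((uint64_t)HVM_CB_TYPE_VECTOR << HVM_CB_TYPE_SHIFT) | vector);
}

/* Hypothetical helper: extract the delivery type from a parameter value. */
static inline unsigned int
hvm_callback_type(uint64_t val)
{

	return ((unsigned int)((val >> HVM_CB_TYPE_SHIFT) & 0xff));
}

/* Round-trip check: the encoded value carries both the type and vector. */
static inline void
hvm_callback_selfcheck(void)
{
	uint64_t val = hvm_callback_vector(IDT_EVTCHN);

	assert(hvm_callback_type(val) == HVM_CB_TYPE_VECTOR);
	assert((val & 0xff) == IDT_EVTCHN);
}
```

A guest would hand the encoded value to the hypervisor with the
HVMOP_set_param hypercall; that step is left out of the sketch because
it can only run inside a Xen HVM guest.
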
sys/amd64/include/intr_machdep.h:
sys/i386/include/intr_machdep.h:
    Increase the FreeBSD IRQ vector table to include space
    for event channel interrupt sources.

sys/amd64/include/pcpu.h:
sys/i386/include/pcpu.h:
    Remove Xen HVM per-cpu variable data.  These fields are now
    allocated via the dynamic per-cpu scheme.  See xen_intr.c
    for details.

sys/amd64/include/xen/hypercall.h:
sys/dev/xen/blkback/blkback.c:
sys/i386/include/xen/xenvar.h:
sys/i386/xen/clock.c:
sys/i386/xen/xen_machdep.c:
sys/xen/gnttab.c:
    Prefer FreeBSD primitives to Linux ones in Xen support code.

sys/amd64/include/xen/xen-os.h:
sys/i386/include/xen/xen-os.h:
sys/xen/xen-os.h:
sys/dev/xen/balloon/balloon.c:
sys/dev/xen/blkback/blkback.c:
sys/dev/xen/blkfront/blkfront.c:
sys/dev/xen/console/xencons_ring.c:
sys/dev/xen/control/control.c:
sys/dev/xen/netback/netback.c:
sys/dev/xen/netfront/netfront.c:
sys/dev/xen/xenpci/xenpci.c:
sys/i386/i386/machdep.c:
sys/i386/include/pmap.h:
sys/i386/include/xen/xenfunc.h:
sys/i386/isa/npx.c:
sys/i386/xen/clock.c:
sys/i386/xen/mp_machdep.c:
sys/i386/xen/mptable.c:
sys/i386/xen/xen_clock_util.c:
sys/i386/xen/xen_machdep.c:
sys/i386/xen/xen_rtc.c:
sys/xen/evtchn/evtchn_dev.c:
sys/xen/features.c:
sys/xen/gnttab.c:
sys/xen/gnttab.h:
sys/xen/hvm.h:
sys/xen/xenbus/xenbus.c:
sys/xen/xenbus/xenbus_if.m:
sys/xen/xenbus/xenbusb_front.c:
sys/xen/xenbus/xenbusvar.h:
sys/xen/xenstore/xenstore.c:
sys/xen/xenstore/xenstore_dev.c:
sys/xen/xenstore/xenstorevar.h:
    Pull common Xen OS support functions/settings into xen/xen-os.h.

sys/amd64/include/xen/xen-os.h:
sys/i386/include/xen/xen-os.h:
sys/xen/xen-os.h:
    Remove constants, macros, and functions unused in FreeBSD's Xen
    support.

sys/xen/xen-os.h:
sys/i386/xen/xen_machdep.c:
sys/x86/xen/hvm.c:
    Introduce new functions xen_domain(), xen_pv_domain(), and
    xen_hvm_domain().  These are used in favor of #ifdefs so that
    FreeBSD can dynamically detect and adapt to the presence of
    a hypervisor.
    The goal is to have an HVM-optimized GENERIC, but more is
    necessary before this is possible.

sys/amd64/amd64/machdep.c:
sys/dev/xen/xenpci/xenpcivar.h:
sys/dev/xen/xenpci/xenpci.c:
sys/x86/xen/hvm.c:
sys/sys/kernel.h:
    Refactor magic ioport, Hypercall table and Hypervisor shared
    information page setup, and move it to a dedicated HVM support
    module.  HVM mode initialization is now triggered during the
    SI_SUB_HYPERVISOR phase of system startup.  This currently
    occurs just after the kernel VM is fully set up, which is just
    enough infrastructure to allow the hypercall table and shared
    info page to be properly mapped.

sys/xen/hvm.h:
sys/x86/xen/hvm.c:
    Add definitions and a method for configuring Hypervisor event
    delivery via a direct vector callback.

sys/amd64/include/xen/xen-os.h:
sys/x86/xen/hvm.c:
sys/conf/files:
sys/conf/files.amd64:
sys/conf/files.i386:
    Adjust kernel build to reflect the refactoring of early Xen
    startup code and Xen interrupt services.

sys/dev/xen/blkback/blkback.c:
sys/dev/xen/blkfront/blkfront.c:
sys/dev/xen/blkfront/block.h:
sys/dev/xen/control/control.c:
sys/dev/xen/evtchn/evtchn_dev.c:
sys/dev/xen/netback/netback.c:
sys/dev/xen/netfront/netfront.c:
sys/xen/xenstore/xenstore.c:
sys/xen/evtchn/evtchn_dev.c:
sys/dev/xen/console/console.c:
sys/dev/xen/console/xencons_ring.c:
    Adjust drivers to use new xen_intr_*() API.

sys/dev/xen/blkback/blkback.c:
    Since blkback defers all event handling to a taskqueue,
    convert this task queue to a "fast" taskqueue, and schedule
    it via an interrupt filter.  This avoids an unnecessary
    ithread context switch.

sys/xen/xenstore/xenstore.c:
    The xenstore driver is MPSAFE.  Indicate as much when
    registering its interrupt handler.

sys/xen/xenbus/xenbus.c:
sys/xen/xenbus/xenbusvar.h:
    Remove unused event channel APIs.

sys/xen/evtchn.h:
    Remove all kernel Xen interrupt service API definitions
    from this file.  It is now only used for structure and
    ioctl definitions related to the event channel userland
    device driver.
    Update the definitions in this file to match those from
    NetBSD.  Implementing this interface will be necessary for
    Dom0 support.

sys/xen/evtchn/evtchnvar.h:
    Add a header file for implementation-internal APIs related
    to managing event channel event delivery.  This is used to
    allow, for example, the event channel userland device driver
    to access low-level routines that typical kernel consumers
    of event channel services should never access.

sys/xen/interface/event_channel.h:
sys/xen/xen_intr.h:
    Standardize on the evtchn_port_t type for referring to
    an event channel port id.  In order to prevent low-level
    event channel APIs from leaking to kernel consumers who
    should not have access to this data, the type is defined
    twice: once in the Xen-provided event_channel.h, and again
    in xen/xen_intr.h.  The double declaration is protected by
    __XEN_EVTCHN_PORT_DEFINED__ to ensure it is never declared
    twice within a given compilation unit.

sys/xen/xen_intr.h:
sys/xen/evtchn/evtchn.c:
sys/x86/xen/xen_intr.c:
sys/dev/xen/xenpci/evtchn.c:
sys/dev/xen/xenpci/xenpcivar.h:
    New implementation of Xen interrupt services.  This is
    similar in many respects to the i386 PV implementation, with
    the exception that events bound to event channel ports (i.e.
    not IPI, virtual IRQ, or physical IRQ) are further optimized
    to avoid mask/unmask operations that aren't necessary for
    these edge-triggered events.

    Stubs exist for supporting physical IRQ binding, but will
    need additional work before this implementation can be
    fully shared between PV and HVM.

sys/amd64/amd64/mp_machdep.c:
sys/i386/i386/mp_machdep.c:
sys/i386/xen/mp_machdep.c:
sys/x86/xen/hvm.c:
    Add support for placing vcpu_info into an arbitrary memory
    page instead of using HYPERVISOR_shared_info->vcpu_info.
    This allows the creation of domains with more than 32 vcpus.

sys/i386/i386/machdep.c:
sys/i386/xen/clock.c:
sys/i386/xen/xen_machdep.c:
sys/i386/xen/exception.s:
    Add support for new event channel implementation.

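For illustration only: a minimal sketch of the guarded double
declaration of evtchn_port_t described above for event_channel.h and
xen/xen_intr.h.  Both "headers" are simulated inside one file.  The
underlying uint32_t type is an assumption based on Xen's public
headers, and evtchn_port_in_range() and evtchn_port_selfcheck() are
hypothetical helpers that exist only to show the type in use.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the declaration in the Xen-provided event_channel.h. */
#ifndef __XEN_EVTCHN_PORT_DEFINED__
typedef uint32_t evtchn_port_t;		/* assumed underlying type */
#define __XEN_EVTCHN_PORT_DEFINED__ 1
#endif

/*
 * Stand-in for the declaration in xen/xen_intr.h.  The guard macro is
 * already set by the block above, so the preprocessor skips this copy
 * and the typedef appears only once in the compilation unit.
 */
#ifndef __XEN_EVTCHN_PORT_DEFINED__
typedef uint32_t evtchn_port_t;
#define __XEN_EVTCHN_PORT_DEFINED__ 1
#endif

/* Hypothetical consumer: is the port below the number of usable ports? */
static inline int
evtchn_port_in_range(evtchn_port_t port, evtchn_port_t nports)
{

	return (port < nports);
}

/* Sanity check that whichever declaration won yields the same type. */
static inline void
evtchn_port_selfcheck(void)
{

	assert(sizeof(evtchn_port_t) == sizeof(uint32_t));
}
```

Whichever header is included first supplies the type, so a consumer
that includes only xen/xen_intr.h gets evtchn_port_t without pulling in
the rest of the low-level event channel interface.
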
Index: sys/amd64/amd64/apic_vector.S =================================================================== --- sys/amd64/amd64/apic_vector.S (revision 255014) +++ sys/amd64/amd64/apic_vector.S (working copy) @@ -128,6 +128,22 @@ IDTVEC(errorint) MEXITCOUNT jmp doreti +#ifdef XENHVM +/* + * Xen event channel upcall interrupt handler. + * Only used when the hypervisor supports direct vector callbacks. + */ + .text + SUPERALIGN_TEXT +IDTVEC(xen_intr_upcall) + PUSH_FRAME + FAKE_MCOUNT(TF_RIP(%rsp)) + movq %rsp, %rdi + call xen_intr_handle_upcall + MEXITCOUNT + jmp doreti +#endif + #ifdef SMP /* * Global address space TLB shootdown. Index: sys/amd64/amd64/machdep.c =================================================================== --- sys/amd64/amd64/machdep.c (revision 255014) +++ sys/amd64/amd64/machdep.c (working copy) @@ -1204,6 +1204,9 @@ extern inthand_t #ifdef KDTRACE_HOOKS IDTVEC(dtrace_ret), #endif +#ifdef XENHVM + IDTVEC(xen_intr_upcall), +#endif IDTVEC(fast_syscall), IDTVEC(fast_syscall32); #ifdef DDB @@ -1787,6 +1790,9 @@ hammer_time(u_int64_t modulep, u_int64_t physfree) #ifdef KDTRACE_HOOKS setidt(IDT_DTRACE_RET, &IDTVEC(dtrace_ret), SDT_SYSIGT, SEL_UPL, 0); #endif +#ifdef XENHVM + setidt(IDT_EVTCHN, &IDTVEC(xen_intr_upcall), SDT_SYSIGT, SEL_UPL, 0); +#endif r_idt.rd_limit = sizeof(idt0) - 1; r_idt.rd_base = (long) idt; @@ -1910,14 +1916,6 @@ hammer_time(u_int64_t modulep, u_int64_t physfree) if (env != NULL) strlcpy(kernelname, env, sizeof(kernelname)); -#ifdef XENHVM - if (inw(0x10) == 0x49d2) { - if (bootverbose) - printf("Xen detected: disabling emulated block and network devices\n"); - outw(0x10, 3); - } -#endif - cpu_probe_amdc1e(); #ifdef FDT Index: sys/amd64/amd64/mp_machdep.c =================================================================== --- sys/amd64/amd64/mp_machdep.c (revision 255014) +++ sys/amd64/amd64/mp_machdep.c (working copy) @@ -70,6 +70,10 @@ __FBSDID("$FreeBSD$"); #include #include +#ifdef XENHVM +#include +#endif + #define 
WARMBOOT_TARGET 0 #define WARMBOOT_OFF (KERNBASE + 0x0467) #define WARMBOOT_SEG (KERNBASE + 0x0469) @@ -711,6 +715,11 @@ init_secondary(void) /* set up FPU state on the AP */ fpuinit(); +#ifdef XENHVM + /* register vcpu_info area */ + xen_hvm_init_cpu(); +#endif + /* A quick check from sanity claus */ cpuid = PCPU_GET(cpuid); if (PCPU_GET(apic_id) != lapic_id()) { Index: sys/amd64/include/apicvar.h =================================================================== --- sys/amd64/include/apicvar.h (revision 255014) +++ sys/amd64/include/apicvar.h (working copy) @@ -227,6 +227,7 @@ int lapic_set_lvt_triggermode(u_int apic_id, u_int enum intr_trigger trigger); void lapic_set_tpr(u_int vector); void lapic_setup(int boot); +void xen_intr_handle_upcall(struct trapframe *frame); #endif /* !LOCORE */ #endif /* _MACHINE_APICVAR_H_ */ Index: sys/amd64/include/intr_machdep.h =================================================================== --- sys/amd64/include/intr_machdep.h (revision 255014) +++ sys/amd64/include/intr_machdep.h (working copy) @@ -44,12 +44,24 @@ * allocate IDT vectors. * * The first 255 IRQs (0 - 254) are reserved for ISA IRQs and PCI intline IRQs. - * IRQ values beyond 256 are used by MSI. We leave 255 unused to avoid - * confusion since 255 is used in PCI to indicate an invalid IRQ. + * IRQ values from 256 to 767 are used by MSI. When running under the Xen + * Hypervisor, IRQ values from 768 to 4863 are available for binding to + * event channel events. We leave 255 unused to avoid confusion since 255 is + * used in PCI to indicate an invalid IRQ. 
*/ #define NUM_MSI_INTS 512 #define FIRST_MSI_INT 256 -#define NUM_IO_INTS (FIRST_MSI_INT + NUM_MSI_INTS) +#ifdef XENHVM +#include +#define NUM_EVTCHN_INTS NR_EVENT_CHANNELS +#define FIRST_EVTCHN_INT \ + (FIRST_MSI_INT + NUM_MSI_INTS) +#define LAST_EVTCHN_INT \ + (FIRST_EVTCHN_INT + NUM_EVTCHN_INTS - 1) +#else +#define NUM_EVTCHN_INTS 0 +#endif +#define NUM_IO_INTS (FIRST_MSI_INT + NUM_MSI_INTS + NUM_EVTCHN_INTS) /* * Default base address for MSI messages on x86 platforms. Index: sys/amd64/include/pcpu.h =================================================================== --- sys/amd64/include/pcpu.h (revision 255014) +++ sys/amd64/include/pcpu.h (working copy) @@ -42,15 +42,6 @@ #endif #endif -#ifdef XENHVM -#define PCPU_XEN_FIELDS \ - ; \ - unsigned int pc_last_processed_l1i; \ - unsigned int pc_last_processed_l2i -#else -#define PCPU_XEN_FIELDS -#endif - /* * The SMP parts are setup in pmap.c and locore.s for the BSP, and * mp_machdep.c sets up the data for the AP's to "see" when they awake. @@ -76,8 +67,7 @@ struct system_segment_descriptor *pc_ldt; \ /* Pointer to the CPU TSS descriptor */ \ struct system_segment_descriptor *pc_tss; \ - u_int pc_cmci_mask /* MCx banks for CMCI */ \ - PCPU_XEN_FIELDS; \ + u_int pc_cmci_mask; /* MCx banks for CMCI */ \ uint64_t pc_dbreg[16]; /* ddb debugging regs */ \ int pc_dbreg_cmd; /* ddb debugging reg cmd */ \ char __pad[161] /* be divisor of PAGE_SIZE \ Index: sys/amd64/include/xen/hypercall.h =================================================================== --- sys/amd64/include/xen/hypercall.h (revision 255014) +++ sys/amd64/include/xen/hypercall.h (working copy) @@ -1,7 +1,7 @@ /****************************************************************************** * hypercall.h * - * Linux-specific hypervisor handling. + * FreeBSD-specific hypervisor handling. 
* * Copyright (c) 2002-2004, K A Fraser * @@ -270,7 +270,7 @@ HYPERVISOR_event_channel_op( int rc = _hypercall2(int, event_channel_op, cmd, arg); #if CONFIG_XEN_COMPAT <= 0x030002 - if (unlikely(rc == -ENOXENSYS)) { + if (__predict_false(rc == -ENOXENSYS)) { struct evtchn_op op; op.cmd = cmd; memcpy(&op.u, arg, sizeof(op.u)); @@ -303,7 +303,7 @@ HYPERVISOR_physdev_op( int rc = _hypercall2(int, physdev_op, cmd, arg); #if CONFIG_XEN_COMPAT <= 0x030002 - if (unlikely(rc == -ENOXENSYS)) { + if (__predict_false(rc == -ENOXENSYS)) { struct physdev_op op; op.cmd = cmd; memcpy(&op.u, arg, sizeof(op.u)); Index: sys/amd64/include/xen/xen-os.h =================================================================== --- sys/amd64/include/xen/xen-os.h (revision 255014) +++ sys/amd64/include/xen/xen-os.h (working copy) @@ -1,40 +1,42 @@ /****************************************************************************** - * os.h + * amd64/xen/xen-os.h * - * random collection of macros and definition + * Random collection of macros and definition * + * Copyright (c) 2003, 2004 Keir Fraser (on behalf of the Xen team) + * All rights reserved. + * + * Permission is hereby granted, free of charge, to any person obtaining a copy + * of this software and associated documentation files (the "Software"), to + * deal in the Software without restriction, including without limitation the + * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or + * sell copies of the Software, and to permit persons to whom the Software is + * furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER + * DEALINGS IN THE SOFTWARE. + * * $FreeBSD$ */ -#ifndef _XEN_OS_H_ -#define _XEN_OS_H_ +#ifndef _MACHINE_XEN_XEN_OS_H_ +#define _MACHINE_XEN_XEN_OS_H_ #ifdef PAE #define CONFIG_X86_PAE #endif -#ifdef LOCORE -#define __ASSEMBLY__ -#endif - -#if !defined(__XEN_INTERFACE_VERSION__) -#define __XEN_INTERFACE_VERSION__ 0x00030208 -#endif - -#define GRANT_REF_INVALID 0xffffffff - -#include - /* Everything below this point is not included by assembler (.S) files. */ #ifndef __ASSEMBLY__ -/* Force a proper event-channel callback from Xen. */ -void force_evtchn_callback(void); - -extern int gdtset; - -extern shared_info_t *HYPERVISOR_shared_info; - /* REP NOP (PAUSE) is a good thing to insert into busy-wait loops. */ static inline void rep_nop(void) { @@ -42,162 +44,13 @@ static inline void rep_nop(void) } #define cpu_relax() rep_nop() -/* crude memory allocator for memory allocation early in - * boot - */ -void *bootmem_alloc(unsigned int size); -void bootmem_free(void *ptr, unsigned int size); - -void printk(const char *fmt, ...); - -/* some function prototypes */ -void trap_init(void); - -#define likely(x) __builtin_expect((x),1) -#define unlikely(x) __builtin_expect((x),0) - -#ifndef XENHVM - -/* - * STI/CLI equivalents. These basically set and clear the virtual - * event_enable flag in the shared_info structure. Note that when - * the enable bit is set, there may be pending events to be handled. - * We may therefore call into do_hypervisor_callback() directly. 
- */ - -#define __cli() \ -do { \ - vcpu_info_t *_vcpu; \ - _vcpu = &HYPERVISOR_shared_info->vcpu_info[PCPU_GET(cpuid)]; \ - _vcpu->evtchn_upcall_mask = 1; \ - barrier(); \ -} while (0) - -#define __sti() \ -do { \ - vcpu_info_t *_vcpu; \ - barrier(); \ - _vcpu = &HYPERVISOR_shared_info->vcpu_info[PCPU_GET(cpuid)]; \ - _vcpu->evtchn_upcall_mask = 0; \ - barrier(); /* unmask then check (avoid races) */ \ - if ( unlikely(_vcpu->evtchn_upcall_pending) ) \ - force_evtchn_callback(); \ -} while (0) - -#define __restore_flags(x) \ -do { \ - vcpu_info_t *_vcpu; \ - barrier(); \ - _vcpu = &HYPERVISOR_shared_info->vcpu_info[PCPU_GET(cpuid)]; \ - if ((_vcpu->evtchn_upcall_mask = (x)) == 0) { \ - barrier(); /* unmask then check (avoid races) */ \ - if ( unlikely(_vcpu->evtchn_upcall_pending) ) \ - force_evtchn_callback(); \ - } \ -} while (0) - -/* - * Add critical_{enter, exit}? - * - */ -#define __save_and_cli(x) \ -do { \ - vcpu_info_t *_vcpu; \ - _vcpu = &HYPERVISOR_shared_info->vcpu_info[PCPU_GET(cpuid)]; \ - (x) = _vcpu->evtchn_upcall_mask; \ - _vcpu->evtchn_upcall_mask = 1; \ - barrier(); \ -} while (0) - - -#define cli() __cli() -#define sti() __sti() -#define save_flags(x) __save_flags(x) -#define restore_flags(x) __restore_flags(x) -#define save_and_cli(x) __save_and_cli(x) - -#define local_irq_save(x) __save_and_cli(x) -#define local_irq_restore(x) __restore_flags(x) -#define local_irq_disable() __cli() -#define local_irq_enable() __sti() - -#define mtx_lock_irqsave(lock, x) {local_irq_save((x)); mtx_lock_spin((lock));} -#define mtx_unlock_irqrestore(lock, x) {mtx_unlock_spin((lock)); local_irq_restore((x)); } -#define spin_lock_irqsave mtx_lock_irqsave -#define spin_unlock_irqrestore mtx_unlock_irqrestore - -#else -#endif - -#ifndef xen_mb -#define xen_mb() mb() -#endif -#ifndef xen_rmb -#define xen_rmb() rmb() -#endif -#ifndef xen_wmb -#define xen_wmb() wmb() -#endif -#ifdef SMP -#define smp_mb() mb() -#define smp_rmb() rmb() -#define smp_wmb() wmb() -#define 
smp_read_barrier_depends() read_barrier_depends() -#define set_mb(var, value) do { xchg(&var, value); } while (0) -#else -#define smp_mb() barrier() -#define smp_rmb() barrier() -#define smp_wmb() barrier() -#define smp_read_barrier_depends() do { } while(0) -#define set_mb(var, value) do { var = value; barrier(); } while (0) -#endif - - /* This is a barrier for the compiler only, NOT the processor! */ #define barrier() __asm__ __volatile__("": : :"memory") #define LOCK_PREFIX "" #define LOCK "" #define ADDR (*(volatile long *) addr) -/* - * Make sure gcc doesn't try to be clever and move things around - * on us. We need to use _exactly_ the address the user gave us, - * not some alias that contains the same information. - */ -typedef struct { volatile int counter; } atomic_t; - - -#define xen_xchg(ptr,v) \ - ((__typeof__(*(ptr)))__xchg((unsigned long)(v),(ptr),sizeof(*(ptr)))) -struct __xchg_dummy { unsigned long a[100]; }; -#define __xg(x) ((volatile struct __xchg_dummy *)(x)) -static __inline unsigned long __xchg(unsigned long x, volatile void * ptr, - int size) -{ - switch (size) { - case 1: - __asm__ __volatile__("xchgb %b0,%1" - :"=q" (x) - :"m" (*__xg(ptr)), "0" (x) - :"memory"); - break; - case 2: - __asm__ __volatile__("xchgw %w0,%1" - :"=r" (x) - :"m" (*__xg(ptr)), "0" (x) - :"memory"); - break; - case 4: - __asm__ __volatile__("xchgl %0,%1" - :"=r" (x) - :"m" (*__xg(ptr)), "0" (x) - :"memory"); - break; - } - return x; -} - /** * test_and_clear_bit - Clear a bit and return its old value * @nr: Bit to set @@ -238,7 +91,6 @@ static __inline int variable_test_bit(int nr, vola constant_test_bit((nr),(addr)) : \ variable_test_bit((nr),(addr))) - /** * set_bit - Atomically set a bit in memory * @nr: the bit to set @@ -275,25 +127,6 @@ static __inline__ void clear_bit(int nr, volatile :"Ir" (nr)); } -/** - * atomic_inc - increment atomic variable - * @v: pointer of type atomic_t - * - * Atomically increments @v by 1. 
Note that the guaranteed - * useful range of an atomic_t is only 24 bits. - */ -static __inline__ void atomic_inc(atomic_t *v) -{ - __asm__ __volatile__( - LOCK "incl %0" - :"=m" (v->counter) - :"m" (v->counter)); -} - - -#define rdtscll(val) \ - __asm__ __volatile__("rdtsc" : "=A" (val)) - #endif /* !__ASSEMBLY__ */ -#endif /* _OS_H_ */ +#endif /* _MACHINE_XEN_XEN_OS_H_ */ Index: sys/conf/files =================================================================== --- sys/conf/files (revision 255014) +++ sys/conf/files (working copy) @@ -2499,7 +2499,6 @@ dev/xen/control/control.c optional xen | xenhvm dev/xen/netback/netback.c optional xen | xenhvm dev/xen/netfront/netfront.c optional xen | xenhvm dev/xen/xenpci/xenpci.c optional xenpci -dev/xen/xenpci/evtchn.c optional xenpci dev/xl/if_xl.c optional xl pci dev/xl/xlphy.c optional xl pci fs/deadfs/dead_vnops.c standard @@ -3815,7 +3814,6 @@ vm/vm_zeroidle.c standard vm/vnode_pager.c standard xen/gnttab.c optional xen | xenhvm xen/features.c optional xen | xenhvm -xen/evtchn/evtchn.c optional xen xen/evtchn/evtchn_dev.c optional xen | xenhvm xen/xenbus/xenbus_if.m optional xen | xenhvm xen/xenbus/xenbus.c optional xen | xenhvm Index: sys/conf/files.amd64 =================================================================== --- sys/conf/files.amd64 (revision 255014) +++ sys/conf/files.amd64 (working copy) @@ -531,3 +531,5 @@ x86/x86/mptable_pci.c optional mptable pci x86/x86/msi.c optional pci x86/x86/nexus.c standard x86/x86/tsc.c standard +x86/xen/hvm.c optional xenhvm +x86/xen/xen_intr.c optional xen | xenhvm Index: sys/conf/files.i386 =================================================================== --- sys/conf/files.i386 (revision 255014) +++ sys/conf/files.i386 (working copy) @@ -568,3 +568,5 @@ x86/x86/mptable_pci.c optional apic native pci x86/x86/msi.c optional apic pci x86/x86/nexus.c standard x86/x86/tsc.c standard +x86/xen/hvm.c optional xenhvm +x86/xen/xen_intr.c optional xen | xenhvm Index: 
sys/dev/xen/balloon/balloon.c =================================================================== --- sys/dev/xen/balloon/balloon.c (revision 255014) +++ sys/dev/xen/balloon/balloon.c (working copy) @@ -40,14 +40,15 @@ __FBSDID("$FreeBSD$"); #include #include -#include -#include -#include +#include +#include + +#include #include +#include #include -#include -#include +#include static MALLOC_DEFINE(M_BALLOON, "Balloon", "Xen Balloon Driver"); Index: sys/dev/xen/blkback/blkback.c =================================================================== --- sys/dev/xen/blkback/blkback.c (revision 255014) +++ sys/dev/xen/blkback/blkback.c (working copy) @@ -70,14 +70,13 @@ __FBSDID("$FreeBSD$"); #include #include -#include #include #include #include +#include #include -#include #include #include @@ -682,7 +681,7 @@ struct xbb_softc { blkif_back_rings_t rings; /** IRQ mapping for the communication ring event channel. */ - int irq; + xen_intr_handle_t xen_intr_handle; /** * \brief Backend access mode flags (e.g. write, or read-only). @@ -1347,7 +1346,7 @@ xbb_send_response(struct xbb_softc *xbb, struct xb taskqueue_enqueue(xbb->io_taskqueue, &xbb->io_task); if (notify) - notify_remote_via_irq(xbb->irq); + xen_intr_signal(xbb->xen_intr_handle); } /** @@ -1616,8 +1615,8 @@ xbb_dispatch_io(struct xbb_softc *xbb, struct xbb_ sg = NULL; /* Check that number of segments is sane. 
*/ - if (unlikely(nseg == 0) - || unlikely(nseg > xbb->max_request_segments)) { + if (__predict_false(nseg == 0) + || __predict_false(nseg > xbb->max_request_segments)) { DPRINTF("Bad number of segments in request (%d)\n", nseg); reqlist->status = BLKIF_RSP_ERROR; @@ -1734,7 +1733,7 @@ xbb_dispatch_io(struct xbb_softc *xbb, struct xbb_ for (seg_idx = 0, map = xbb->maps; seg_idx < reqlist->nr_segments; seg_idx++, map++){ - if (unlikely(map->status != 0)) { + if (__predict_false(map->status != 0)) { DPRINTF("invalid buffer -- could not remap " "it (%d)\n", map->status); DPRINTF("Mapping(%d): Host Addr 0x%lx, flags " @@ -2026,14 +2025,16 @@ xbb_run_queue(void *context, int pending) * \param arg Callback argument registerd during event channel * binding - the xbb_softc for this instance. */ -static void -xbb_intr(void *arg) +static int +xbb_filter(void *arg) { struct xbb_softc *xbb; - /* Defer to kernel thread. */ + /* Defer to taskqueue thread. */ xbb = (struct xbb_softc *)arg; taskqueue_enqueue(xbb->io_taskqueue, &xbb->io_task); + + return (FILTER_HANDLED); } SDT_PROVIDER_DEFINE(xbb); @@ -2081,7 +2082,7 @@ xbb_dispatch_dev(struct xbb_softc *xbb, struct xbb if (operation == BIO_FLUSH) { nreq = STAILQ_FIRST(&reqlist->contig_req_list); bio = g_new_bio(); - if (unlikely(bio == NULL)) { + if (__predict_false(bio == NULL)) { DPRINTF("Unable to allocate bio for BIO_FLUSH\n"); error = ENOMEM; return (error); @@ -2143,7 +2144,7 @@ xbb_dispatch_dev(struct xbb_softc *xbb, struct xbb } bio = bios[nbio++] = g_new_bio(); - if (unlikely(bio == NULL)) { + if (__predict_false(bio == NULL)) { error = ENOMEM; goto fail_free_bios; } @@ -2811,10 +2812,7 @@ xbb_disconnect(struct xbb_softc *xbb) if ((xbb->flags & XBBF_RING_CONNECTED) == 0) return (0); - if (xbb->irq != 0) { - unbind_from_irqhandler(xbb->irq); - xbb->irq = 0; - } + xen_intr_unbind(&xbb->xen_intr_handle); mtx_unlock(&xbb->lock); taskqueue_drain(xbb->io_taskqueue, &xbb->io_task); @@ -2966,13 +2964,14 @@ 
xbb_connect_ring(struct xbb_softc *xbb) xbb->flags |= XBBF_RING_CONNECTED; - error = - bind_interdomain_evtchn_to_irqhandler(xbb->otherend_id, - xbb->ring_config.evtchn, - device_get_nameunit(xbb->dev), - xbb_intr, /*arg*/xbb, - INTR_TYPE_BIO | INTR_MPSAFE, - &xbb->irq); + error = xen_intr_bind_remote_port(xbb->dev, + xbb->otherend_id, + xbb->ring_config.evtchn, + xbb_filter, + /*ithread_handler*/NULL, + /*arg*/xbb, + INTR_TYPE_BIO | INTR_MPSAFE, + &xbb->xen_intr_handle); if (error) { (void)xbb_disconnect(xbb); xenbus_dev_fatal(xbb->dev, error, "binding event channel"); @@ -3791,9 +3790,10 @@ xbb_attach(device_t dev) * Create a taskqueue for doing work that must occur from a * thread context. */ - xbb->io_taskqueue = taskqueue_create(device_get_nameunit(dev), M_NOWAIT, - taskqueue_thread_enqueue, - /*context*/&xbb->io_taskqueue); + xbb->io_taskqueue = taskqueue_create_fast(device_get_nameunit(dev), + M_NOWAIT, + taskqueue_thread_enqueue, + /*contxt*/&xbb->io_taskqueue); if (xbb->io_taskqueue == NULL) { xbb_attach_failed(xbb, error, "Unable to create taskqueue"); return (ENOMEM); Index: sys/dev/xen/blkfront/blkfront.c =================================================================== --- sys/dev/xen/blkfront/blkfront.c (revision 255014) +++ sys/dev/xen/blkfront/blkfront.c (working copy) @@ -51,19 +51,17 @@ __FBSDID("$FreeBSD$"); #include #include -#include -#include -#include -#include - +#include #include #include -#include #include #include #include #include +#include +#include + #include #include @@ -139,7 +137,7 @@ xbd_flush_requests(struct xbd_softc *sc) RING_PUSH_REQUESTS_AND_CHECK_NOTIFY(&sc->xbd_ring, notify); if (notify) - notify_remote_via_irq(sc->xbd_irq); + xen_intr_signal(sc->xen_intr_handle); } static void @@ -310,7 +308,7 @@ xbd_bio_command(struct xbd_softc *sc) struct xbd_command *cm; struct bio *bp; - if (unlikely(sc->xbd_state != XBD_STATE_CONNECTED)) + if (__predict_false(sc->xbd_state != XBD_STATE_CONNECTED)) return (NULL); bp = 
xbd_dequeue_bio(sc); @@ -437,7 +435,7 @@ xbd_bio_complete(struct xbd_softc *sc, struct xbd_ bp = cm->cm_bp; - if (unlikely(cm->cm_status != BLKIF_RSP_OKAY)) { + if (__predict_false(cm->cm_status != BLKIF_RSP_OKAY)) { disk_err(bp, "disk error" , -1, 0); printf(" status: %x\n", cm->cm_status); bp->bio_flags |= BIO_ERROR; @@ -470,7 +468,7 @@ xbd_int(void *xsc) mtx_lock(&sc->xbd_io_lock); - if (unlikely(sc->xbd_state == XBD_STATE_DISCONNECTED)) { + if (__predict_false(sc->xbd_state == XBD_STATE_DISCONNECTED)) { mtx_unlock(&sc->xbd_io_lock); return; } @@ -531,7 +529,7 @@ xbd_int(void *xsc) xbd_startio(sc); - if (unlikely(sc->xbd_state == XBD_STATE_SUSPENDED)) + if (__predict_false(sc->xbd_state == XBD_STATE_SUSPENDED)) wakeup(&sc->xbd_cm_q[XBD_Q_BUSY]); mtx_unlock(&sc->xbd_io_lock); @@ -782,13 +780,12 @@ xbd_alloc_ring(struct xbd_softc *sc) } } - error = bind_listening_port_to_irqhandler( - xenbus_get_otherend_id(sc->xbd_dev), - "xbd", (driver_intr_t *)xbd_int, sc, - INTR_TYPE_BIO | INTR_MPSAFE, &sc->xbd_irq); + error = xen_intr_alloc_and_bind_local_port(sc->xbd_dev, + xenbus_get_otherend_id(sc->xbd_dev), NULL, xbd_int, sc, + INTR_TYPE_BIO | INTR_MPSAFE, &sc->xen_intr_handle); if (error) { xenbus_dev_fatal(sc->xbd_dev, error, - "bind_evtchn_to_irqhandler failed"); + "xen_intr_alloc_and_bind_local_port failed"); return (error); } @@ -1042,10 +1039,8 @@ xbd_free(struct xbd_softc *sc) xbd_initq_cm(sc, XBD_Q_COMPLETE); } - if (sc->xbd_irq) { - unbind_from_irqhandler(sc->xbd_irq); - sc->xbd_irq = 0; - } + xen_intr_unbind(&sc->xen_intr_handle); + } /*--------------------------- State Change Handlers --------------------------*/ @@ -1277,7 +1272,7 @@ xbd_initialize(struct xbd_softc *sc) } error = xs_printf(XST_NIL, node_path, "event-channel", - "%u", irq_to_evtchn_port(sc->xbd_irq)); + "%u", xen_intr_port(sc->xen_intr_handle)); if (error) { xenbus_dev_fatal(sc->xbd_dev, error, "writing %s/event-channel", Index: sys/dev/xen/blkfront/block.h 
=================================================================== --- sys/dev/xen/blkfront/block.h (revision 255014) +++ sys/dev/xen/blkfront/block.h (working copy) @@ -179,7 +179,7 @@ struct xbd_softc { uint32_t xbd_max_request_size; grant_ref_t xbd_ring_ref[XBD_MAX_RING_PAGES]; blkif_front_ring_t xbd_ring; - unsigned int xbd_irq; + xen_intr_handle_t xen_intr_handle; struct gnttab_free_callback xbd_callback; xbd_cm_q_t xbd_cm_q[XBD_Q_COUNT]; bus_dma_tag_t xbd_io_dmat; Index: sys/dev/xen/console/console.c =================================================================== --- sys/dev/xen/console/console.c (revision 255014) +++ sys/dev/xen/console/console.c (working copy) @@ -15,7 +15,7 @@ __FBSDID("$FreeBSD$"); #include #include #include -#include +#include #include #include #include @@ -71,6 +71,8 @@ static char rbuf[RBUF_SIZE]; static int rc, rp; static unsigned int cnsl_evt_reg; static unsigned int wc, wp; /* write_cons, write_prod */ +xen_intr_handle_t xen_intr_handle; +device_t xencons_dev; #ifdef KDB static int xc_altbrk; @@ -232,6 +234,7 @@ xc_attach(device_t dev) { int error; + xencons_dev = dev; xccons = tty_alloc(&xc_ttydevsw, NULL); tty_makedev(xccons, NULL, "xc%r", 0); @@ -243,15 +246,10 @@ xc_attach(device_t dev) callout_reset(&xc_callout, XC_POLLTIME, xc_timeout, xccons); if (xen_start_info->flags & SIF_INITDOMAIN) { - error = bind_virq_to_irqhandler( - VIRQ_CONSOLE, - 0, - "console", - NULL, - xencons_priv_interrupt, NULL, - INTR_TYPE_TTY, NULL); - - KASSERT(error >= 0, ("can't register console interrupt")); + error = xen_intr_bind_virq(dev, VIRQ_CONSOLE, 0, NULL, + xencons_priv_interrupt, NULL, + INTR_TYPE_TTY, &xen_intr_handle); + KASSERT(error >= 0, ("can't register console interrupt")); } /* register handler to flush console on shutdown */ Index: sys/dev/xen/console/xencons_ring.c =================================================================== --- sys/dev/xen/console/xencons_ring.c (revision 255014) +++ sys/dev/xen/console/xencons_ring.c 
(working copy) @@ -16,7 +16,8 @@ __FBSDID("$FreeBSD$"); #include #include -#include + +#include #include #include #include @@ -30,9 +31,10 @@ __FBSDID("$FreeBSD$"); #include #define console_evtchn console.domU.evtchn -static unsigned int console_irq; +xen_intr_handle_t console_handle; extern char *console_page; extern struct mtx cn_mtx; +extern device_t xencons_dev; static inline struct xencons_interface * xencons_interface(void) @@ -74,7 +76,7 @@ xencons_ring_send(const char *data, unsigned len) wmb(); intf->out_prod = prod; - notify_remote_via_evtchn(xen_start_info->console_evtchn); + xen_intr_signal(console_handle); return sent; @@ -106,7 +108,7 @@ xencons_handle_input(void *unused) intf->in_cons = cons; CN_LOCK(cn_mtx); - notify_remote_via_evtchn(xen_start_info->console_evtchn); + xen_intr_signal(console_handle); xencons_tx(); CN_UNLOCK(cn_mtx); @@ -126,9 +128,9 @@ xencons_ring_init(void) if (!xen_start_info->console_evtchn) return 0; - err = bind_caller_port_to_irqhandler(xen_start_info->console_evtchn, - "xencons", xencons_handle_input, NULL, - INTR_TYPE_MISC | INTR_MPSAFE, &console_irq); + err = xen_intr_bind_local_port(xencons_dev, + xen_start_info->console_evtchn, NULL, xencons_handle_input, NULL, + INTR_TYPE_MISC | INTR_MPSAFE, &console_handle); if (err) { return err; } @@ -146,7 +148,7 @@ xencons_suspend(void) if (!xen_start_info->console_evtchn) return; - unbind_from_irqhandler(console_irq); + xen_intr_unbind(&console_handle); } void Index: sys/dev/xen/control/control.c =================================================================== --- sys/dev/xen/control/control.c (revision 255014) +++ sys/dev/xen/control/control.c (working copy) @@ -128,12 +128,13 @@ __FBSDID("$FreeBSD$"); #include #include -#include +#include #include #include #include +#include #include #include #include @@ -144,6 +145,9 @@ __FBSDID("$FreeBSD$"); #include +#include +#include + /*--------------------------- Forward Declarations --------------------------*/ /** Function signature 
for shutdown event handlers. */ typedef void (xctrl_shutdown_handler_t)(void); @@ -242,6 +246,7 @@ xctrl_suspend() xencons_suspend(); gnttab_suspend(); + intr_suspend(); max_pfn = HYPERVISOR_shared_info->arch.max_pfn; @@ -282,7 +287,7 @@ xctrl_suspend() HYPERVISOR_shared_info->arch.max_pfn = max_pfn; gnttab_resume(); - irq_resume(); + intr_resume(); local_irq_enable(); xencons_resume(); @@ -352,14 +357,12 @@ xctrl_suspend() * Prevent any races with evtchn_interrupt() handler. */ disable_intr(); - irq_suspend(); + intr_suspend(); suspend_cancelled = HYPERVISOR_suspend(0); - if (suspend_cancelled) - irq_resume(); - else - xenpci_resume(); + intr_resume(); + /* * Re-enable interrupts and put the scheduler back to normal. */ Index: sys/dev/xen/netback/netback.c =================================================================== --- sys/dev/xen/netback/netback.c (revision 255014) +++ sys/dev/xen/netback/netback.c (working copy) @@ -79,14 +79,15 @@ __FBSDID("$FreeBSD$"); #include #include -#include -#include -#include +#include +#include #include #include #include +#include + /*--------------------------- Compile-time Tunables --------------------------*/ /*---------------------------------- Macros ----------------------------------*/ @@ -433,8 +434,8 @@ struct xnb_softc { /** Xen device handle.*/ long handle; - /** IRQ mapping for the communication ring event channel. */ - int irq; + /** Handle to the communication ring event channel. */ + xen_intr_handle_t xen_intr_handle; /** * \brief Cached value of the front-end's domain id. @@ -647,10 +648,7 @@ xnb_disconnect(struct xnb_softc *xnb) int error; int i; - if (xnb->irq != 0) { - unbind_from_irqhandler(xnb->irq); - xnb->irq = 0; - } + xen_intr_unbind(xnb->xen_intr_handle); /* * We may still have another thread currently processing requests. 
We @@ -773,13 +771,13 @@ xnb_connect_comms(struct xnb_softc *xnb) xnb->flags |= XNBF_RING_CONNECTED; - error = - bind_interdomain_evtchn_to_irqhandler(xnb->otherend_id, - xnb->evtchn, - device_get_nameunit(xnb->dev), - xnb_intr, /*arg*/xnb, - INTR_TYPE_BIO | INTR_MPSAFE, - &xnb->irq); + error = xen_intr_bind_remote_port(xnb->dev, + xnb->otherend_id, + xnb->evtchn, + /*filter*/NULL, + xnb_intr, /*arg*/xnb, + INTR_TYPE_BIO | INTR_MPSAFE, + &xnb->xen_intr_handle); if (error != 0) { (void)xnb_disconnect(xnb); xenbus_dev_fatal(xnb->dev, error, "binding event channel"); @@ -1448,7 +1446,7 @@ xnb_intr(void *arg) RING_PUSH_RESPONSES_AND_CHECK_NOTIFY(txb, notify); if (notify != 0) - notify_remote_via_irq(xnb->irq); + xen_intr_signal(xnb->xen_intr_handle); txb->sring->req_event = txb->req_cons + 1; xen_mb(); @@ -2361,7 +2359,7 @@ xnb_start_locked(struct ifnet *ifp) RING_PUSH_RESPONSES_AND_CHECK_NOTIFY(rxb, notify); if ((notify != 0) || (out_of_space != 0)) - notify_remote_via_irq(xnb->irq); + xen_intr_signal(xnb->xen_intr_handle); rxb->sring->req_event = req_prod_local + 1; xen_mb(); } while (rxb->sring->req_prod != req_prod_local) ; Index: sys/dev/xen/netfront/netfront.c =================================================================== --- sys/dev/xen/netfront/netfront.c (revision 255014) +++ sys/dev/xen/netfront/netfront.c (working copy) @@ -76,17 +76,16 @@ __FBSDID("$FreeBSD$"); #include -#include -#include -#include +#include #include #include -#include #include #include #include #include +#include + #include #include "xenbus_if.h" @@ -257,8 +256,7 @@ struct netfront_info { struct mtx rx_lock; struct mtx sc_lock; - u_int handle; - u_int irq; + xen_intr_handle_t xen_intr_handle; u_int copying_receiver; u_int carrier; u_int maxfrags; @@ -547,7 +545,8 @@ talk_to_backend(device_t dev, struct netfront_info goto abort_transaction; } err = xs_printf(xst, node, - "event-channel", "%u", irq_to_evtchn_port(info->irq)); + "event-channel", "%u", + 
xen_intr_port(info->xen_intr_handle)); if (err) { message = "writing event-channel"; goto abort_transaction; @@ -609,7 +608,6 @@ setup_device(device_t dev, struct netfront_info *i info->rx_ring_ref = GRANT_REF_INVALID; info->rx.sring = NULL; info->tx.sring = NULL; - info->irq = 0; txs = (netif_tx_sring_t *)malloc(PAGE_SIZE, M_DEVBUF, M_NOWAIT|M_ZERO); if (!txs) { @@ -636,12 +634,13 @@ setup_device(device_t dev, struct netfront_info *i if (error) goto fail; - error = bind_listening_port_to_irqhandler(xenbus_get_otherend_id(dev), - "xn", xn_intr, info, INTR_TYPE_NET | INTR_MPSAFE, &info->irq); + error = xen_intr_alloc_and_bind_local_port(dev, + xenbus_get_otherend_id(dev), /*filter*/NULL, xn_intr, info, + INTR_TYPE_NET | INTR_MPSAFE | INTR_ENTROPY, &info->xen_intr_handle); if (error) { xenbus_dev_fatal(dev, error, - "bind_evtchn_to_irqhandler failed"); + "xen_intr_alloc_and_bind_local_port failed"); goto fail; } @@ -806,7 +805,7 @@ network_alloc_rx_buffers(struct netfront_info *sc) req_prod = sc->rx.req_prod_pvt; - if (unlikely(sc->carrier == 0)) + if (__predict_false(sc->carrier == 0)) return; /* @@ -946,7 +945,7 @@ refill: /* Zap PTEs and give away pages in one big multicall. 
*/ (void)HYPERVISOR_multicall(sc->rx_mcl, i+1); - if (unlikely(sc->rx_mcl[i].result != i || + if (__predict_false(sc->rx_mcl[i].result != i || HYPERVISOR_memory_op(XENMEM_decrease_reservation, &reservation) != i)) panic("%s: unable to reduce memory " @@ -961,7 +960,7 @@ refill: push: RING_PUSH_REQUESTS_AND_CHECK_NOTIFY(&sc->rx, notify); if (notify) - notify_remote_via_irq(sc->irq); + xen_intr_signal(sc->xen_intr_handle); } static void @@ -1003,7 +1002,7 @@ xn_rxeof(struct netfront_info *np) err = xennet_get_responses(np, &rinfo, rp, &i, &m, &pages_flipped); - if (unlikely(err)) { + if (__predict_false(err)) { if (m) mbufq_tail(&errq, m); np->stats.rx_errors++; @@ -1151,7 +1150,7 @@ xn_txeof(struct netfront_info *np) */ if (!m->m_next) ifp->if_opackets++; - if (unlikely(gnttab_query_foreign_access( + if (__predict_false(gnttab_query_foreign_access( np->grant_tx_ref[id]) != 0)) { panic("%s: grant id %u still in use by the " "backend", __func__, id); @@ -1249,7 +1248,7 @@ xennet_get_extras(struct netfront_info *np, struct mbuf *m; grant_ref_t ref; - if (unlikely(*cons + 1 == rp)) { + if (__predict_false(*cons + 1 == rp)) { #if 0 if (net_ratelimit()) WPRINTK("Missing extra info\n"); @@ -1261,7 +1260,7 @@ xennet_get_extras(struct netfront_info *np, extra = (struct netif_extra_info *) RING_GET_RESPONSE(&np->rx, ++(*cons)); - if (unlikely(!extra->type || + if (__predict_false(!extra->type || extra->type >= XEN_NETIF_EXTRA_TYPE_MAX)) { #if 0 if (net_ratelimit()) @@ -1317,7 +1316,7 @@ xennet_get_responses(struct netfront_info *np, DPRINTK("rx->status=%hd rx->offset=%hu frags=%u\n", rx->status, rx->offset, frags); #endif - if (unlikely(rx->status < 0 || + if (__predict_false(rx->status < 0 || rx->offset + rx->status > PAGE_SIZE)) { #if 0 @@ -1679,7 +1678,7 @@ xn_start_locked(struct ifnet *ifp) RING_PUSH_REQUESTS_AND_CHECK_NOTIFY(&sc->tx, notify); if (notify) - notify_remote_via_irq(sc->irq); + xen_intr_signal(sc->xen_intr_handle); if (RING_FULL(&sc->tx)) { sc->tx_full = 1; 
@@ -1961,7 +1960,7 @@ network_connect(struct netfront_info *np) * packets. */ netfront_carrier_on(np); - notify_remote_via_irq(np->irq); + xen_intr_signal(np->xen_intr_handle); XN_TX_LOCK(np); xn_txeof(np); XN_TX_UNLOCK(np); @@ -2050,8 +2049,9 @@ xn_configure_features(struct netfront_info *np) return (err); } -/** Create a network device. - * @param handle device handle +/** + * Create a network device. + * @param dev Newbus device representing this virtual NIC. */ int create_netdev(device_t dev) @@ -2198,10 +2198,7 @@ netif_disconnect_backend(struct netfront_info *inf free_ring(&info->tx_ring_ref, &info->tx.sring); free_ring(&info->rx_ring_ref, &info->rx.sring); - if (info->irq) - unbind_from_irqhandler(info->irq); - - info->irq = 0; + xen_intr_unbind(&info->xen_intr_handle); } static void Index: sys/dev/xen/xenpci/evtchn.c =================================================================== --- sys/dev/xen/xenpci/evtchn.c (revision 255014) +++ sys/dev/xen/xenpci/evtchn.c (working copy) @@ -1,467 +0,0 @@ -/****************************************************************************** - * evtchn.c - * - * A simplified event channel for para-drivers in unmodified linux - * - * Copyright (c) 2002-2005, K A Fraser - * Copyright (c) 2005, Intel Corporation - * - * This file may be distributed separately from the Linux kernel, or - * incorporated into other software packages, subject to the following license: - * - * Permission is hereby granted, free of charge, to any person obtaining a copy - * of this source file (the "Software"), to deal in the Software without - * restriction, including without limitation the rights to use, copy, modify, - * merge, publish, distribute, sublicense, and/or sell copies of the Software, - * and to permit persons to whom the Software is furnished to do so, subject to - * the following conditions: - * - * The above copyright notice and this permission notice shall be included in - * all copies or substantial portions of the Software. 
- * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR - * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, - * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE - * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER - * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING - * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS - * IN THE SOFTWARE. - */ - -#include -__FBSDID("$FreeBSD$"); - -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include - -#include -#include -#include -#include -#include -#include - -#include - -#if defined(__i386__) -#define __ffs(word) (ffs(word) - 1) -#elif defined(__amd64__) -static inline unsigned long __ffs(unsigned long word) -{ - __asm__("bsfq %1,%0" - :"=r" (word) - :"rm" (word)); /* XXXRW: why no "cc"? */ - return word; -} -#else -#error "evtchn: unsupported architecture" -#endif - -#define is_valid_evtchn(x) ((x) != 0) -#define evtchn_from_irq(x) (irq_evtchn[irq].evtchn) - -static struct { - struct mtx lock; - driver_intr_t *handler; - void *arg; - int evtchn; - int close:1; /* close on unbind_from_irqhandler()? */ - int inuse:1; - int in_handler:1; - int mpsafe:1; -} irq_evtchn[256]; -static int evtchn_to_irq[NR_EVENT_CHANNELS] = { - [0 ... 
NR_EVENT_CHANNELS-1] = -1 }; - -static struct mtx irq_alloc_lock; -static device_t xenpci_device; - -#define ARRAY_SIZE(a) (sizeof(a) / sizeof(a[0])) - -static unsigned int -alloc_xen_irq(void) -{ - static int warned; - unsigned int irq; - - mtx_lock(&irq_alloc_lock); - - for (irq = 1; irq < ARRAY_SIZE(irq_evtchn); irq++) { - if (irq_evtchn[irq].inuse) - continue; - irq_evtchn[irq].inuse = 1; - mtx_unlock(&irq_alloc_lock); - return irq; - } - - if (!warned) { - warned = 1; - printf("alloc_xen_irq: No available IRQ to bind to: " - "increase irq_evtchn[] size in evtchn.c.\n"); - } - - mtx_unlock(&irq_alloc_lock); - - return -ENOSPC; -} - -static void -free_xen_irq(int irq) -{ - - mtx_lock(&irq_alloc_lock); - irq_evtchn[irq].inuse = 0; - mtx_unlock(&irq_alloc_lock); -} - -int -irq_to_evtchn_port(int irq) -{ - - return irq_evtchn[irq].evtchn; -} - -void -mask_evtchn(int port) -{ - shared_info_t *s = HYPERVISOR_shared_info; - - synch_set_bit(port, &s->evtchn_mask[0]); -} - -void -unmask_evtchn(int port) -{ - evtchn_unmask_t op = { .port = port }; - - HYPERVISOR_event_channel_op(EVTCHNOP_unmask, &op); -} - -int -bind_listening_port_to_irqhandler(unsigned int remote_domain, - const char *devname, driver_intr_t handler, void *arg, - unsigned long irqflags, unsigned int *irqp) -{ - struct evtchn_alloc_unbound alloc_unbound; - unsigned int irq; - int error; - - irq = alloc_xen_irq(); - if (irq < 0) - return irq; - - mtx_lock(&irq_evtchn[irq].lock); - - alloc_unbound.dom = DOMID_SELF; - alloc_unbound.remote_dom = remote_domain; - error = HYPERVISOR_event_channel_op(EVTCHNOP_alloc_unbound, - &alloc_unbound); - if (error) { - mtx_unlock(&irq_evtchn[irq].lock); - free_xen_irq(irq); - return (-error); - } - - irq_evtchn[irq].handler = handler; - irq_evtchn[irq].arg = arg; - irq_evtchn[irq].evtchn = alloc_unbound.port; - irq_evtchn[irq].close = 1; - irq_evtchn[irq].mpsafe = (irqflags & INTR_MPSAFE) != 0; - - evtchn_to_irq[alloc_unbound.port] = irq; - - 
unmask_evtchn(alloc_unbound.port); - - mtx_unlock(&irq_evtchn[irq].lock); - - if (irqp) - *irqp = irq; - return (0); -} - -int -bind_interdomain_evtchn_to_irqhandler(unsigned int remote_domain, - unsigned int remote_port, const char *devname, driver_intr_t handler, - void *arg, unsigned long irqflags, unsigned int *irqp) -{ - struct evtchn_bind_interdomain bind_interdomain; - unsigned int irq; - int error; - - irq = alloc_xen_irq(); - if (irq < 0) - return irq; - - mtx_lock(&irq_evtchn[irq].lock); - - bind_interdomain.remote_dom = remote_domain; - bind_interdomain.remote_port = remote_port; - error = HYPERVISOR_event_channel_op(EVTCHNOP_bind_interdomain, - &bind_interdomain); - if (error) { - mtx_unlock(&irq_evtchn[irq].lock); - free_xen_irq(irq); - return (-error); - } - - irq_evtchn[irq].handler = handler; - irq_evtchn[irq].arg = arg; - irq_evtchn[irq].evtchn = bind_interdomain.local_port; - irq_evtchn[irq].close = 1; - irq_evtchn[irq].mpsafe = (irqflags & INTR_MPSAFE) != 0; - - evtchn_to_irq[bind_interdomain.local_port] = irq; - - unmask_evtchn(bind_interdomain.local_port); - - mtx_unlock(&irq_evtchn[irq].lock); - - if (irqp) - *irqp = irq; - return (0); -} - - -int -bind_caller_port_to_irqhandler(unsigned int caller_port, - const char *devname, driver_intr_t handler, void *arg, - unsigned long irqflags, unsigned int *irqp) -{ - unsigned int irq; - - irq = alloc_xen_irq(); - if (irq < 0) - return irq; - - mtx_lock(&irq_evtchn[irq].lock); - - irq_evtchn[irq].handler = handler; - irq_evtchn[irq].arg = arg; - irq_evtchn[irq].evtchn = caller_port; - irq_evtchn[irq].close = 0; - irq_evtchn[irq].mpsafe = (irqflags & INTR_MPSAFE) != 0; - - evtchn_to_irq[caller_port] = irq; - - unmask_evtchn(caller_port); - - mtx_unlock(&irq_evtchn[irq].lock); - - if (irqp) - *irqp = irq; - return (0); -} - -void -unbind_from_irqhandler(unsigned int irq) -{ - int evtchn; - - mtx_lock(&irq_evtchn[irq].lock); - - evtchn = evtchn_from_irq(irq); - - if (is_valid_evtchn(evtchn)) { - 
evtchn_to_irq[evtchn] = -1; - mask_evtchn(evtchn); - if (irq_evtchn[irq].close) { - struct evtchn_close close = { .port = evtchn }; - if (HYPERVISOR_event_channel_op(EVTCHNOP_close, &close)) - panic("EVTCHNOP_close failed"); - } - } - - irq_evtchn[irq].handler = NULL; - irq_evtchn[irq].evtchn = 0; - - mtx_unlock(&irq_evtchn[irq].lock); - - while (irq_evtchn[irq].in_handler) - cpu_relax(); - - free_xen_irq(irq); -} - -void notify_remote_via_irq(int irq) -{ - int evtchn; - - evtchn = evtchn_from_irq(irq); - if (is_valid_evtchn(evtchn)) - notify_remote_via_evtchn(evtchn); -} - -static inline unsigned long active_evtchns(unsigned int cpu, shared_info_t *sh, - unsigned int idx) -{ - return (sh->evtchn_pending[idx] & ~sh->evtchn_mask[idx]); -} - -static void -evtchn_interrupt(void *arg) -{ - unsigned int l1i, l2i, port; - unsigned long masked_l1, masked_l2; - /* XXX: All events are bound to vcpu0 but irq may be redirected. */ - int cpu = 0; /*smp_processor_id();*/ - driver_intr_t *handler; - void *handler_arg; - int irq, handler_mpsafe; - shared_info_t *s = HYPERVISOR_shared_info; - vcpu_info_t *v = &s->vcpu_info[cpu]; - struct pcpu *pc = pcpu_find(cpu); - unsigned long l1, l2; - - v->evtchn_upcall_pending = 0; - -#if 0 -#ifndef CONFIG_X86 /* No need for a barrier -- XCHG is a barrier on x86. */ - /* Clear master flag /before/ clearing selector flag. 
*/ - wmb(); -#endif -#endif - - l1 = atomic_readandclear_long(&v->evtchn_pending_sel); - - l1i = pc->pc_last_processed_l1i; - l2i = pc->pc_last_processed_l2i; - - while (l1 != 0) { - - l1i = (l1i + 1) % LONG_BIT; - masked_l1 = l1 & ((~0UL) << l1i); - - if (masked_l1 == 0) { /* if we masked out all events, wrap around to the beginning */ - l1i = LONG_BIT - 1; - l2i = LONG_BIT - 1; - continue; - } - l1i = __ffs(masked_l1); - - do { - l2 = active_evtchns(cpu, s, l1i); - - l2i = (l2i + 1) % LONG_BIT; - masked_l2 = l2 & ((~0UL) << l2i); - - if (masked_l2 == 0) { /* if we masked out all events, move on */ - l2i = LONG_BIT - 1; - break; - } - l2i = __ffs(masked_l2); - - /* process port */ - port = (l1i * LONG_BIT) + l2i; - synch_clear_bit(port, &s->evtchn_pending[0]); - - irq = evtchn_to_irq[port]; - if (irq < 0) - continue; - - mtx_lock(&irq_evtchn[irq].lock); - handler = irq_evtchn[irq].handler; - handler_arg = irq_evtchn[irq].arg; - handler_mpsafe = irq_evtchn[irq].mpsafe; - if (unlikely(handler == NULL)) { - printf("Xen IRQ%d (port %d) has no handler!\n", - irq, port); - mtx_unlock(&irq_evtchn[irq].lock); - continue; - } - irq_evtchn[irq].in_handler = 1; - mtx_unlock(&irq_evtchn[irq].lock); - - //local_irq_enable(); - if (!handler_mpsafe) - mtx_lock(&Giant); - handler(handler_arg); - if (!handler_mpsafe) - mtx_unlock(&Giant); - //local_irq_disable(); - - mtx_lock(&irq_evtchn[irq].lock); - irq_evtchn[irq].in_handler = 0; - mtx_unlock(&irq_evtchn[irq].lock); - - /* if this is the final port processed, we'll pick up here+1 next time */ - pc->pc_last_processed_l1i = l1i; - pc->pc_last_processed_l2i = l2i; - - } while (l2i != LONG_BIT - 1); - - l2 = active_evtchns(cpu, s, l1i); - if (l2 == 0) /* we handled all ports, so we can clear the selector bit */ - l1 &= ~(1UL << l1i); - } -} - -void -irq_suspend(void) -{ - struct xenpci_softc *scp = device_get_softc(xenpci_device); - - /* - * Take our interrupt handler out of the list of handlers - * that can handle this irq. 
- */ - if (scp->intr_cookie != NULL) { - if (BUS_TEARDOWN_INTR(device_get_parent(xenpci_device), - xenpci_device, scp->res_irq, scp->intr_cookie) != 0) - printf("intr teardown failed.. continuing\n"); - scp->intr_cookie = NULL; - } -} - -void -irq_resume(void) -{ - struct xenpci_softc *scp = device_get_softc(xenpci_device); - int evtchn, irq; - - for (evtchn = 0; evtchn < NR_EVENT_CHANNELS; evtchn++) { - mask_evtchn(evtchn); - evtchn_to_irq[evtchn] = -1; - } - - for (irq = 0; irq < ARRAY_SIZE(irq_evtchn); irq++) - irq_evtchn[irq].evtchn = 0; - - BUS_SETUP_INTR(device_get_parent(xenpci_device), - xenpci_device, scp->res_irq, INTR_TYPE_MISC, - NULL, evtchn_interrupt, NULL, &scp->intr_cookie); -} - -int -xenpci_irq_init(device_t device, struct xenpci_softc *scp) -{ - int irq, cpu; - int error; - - mtx_init(&irq_alloc_lock, "xen-irq-lock", NULL, MTX_DEF); - - for (irq = 0; irq < ARRAY_SIZE(irq_evtchn); irq++) - mtx_init(&irq_evtchn[irq].lock, "irq-evtchn", NULL, MTX_DEF); - - for (cpu = 0; cpu < mp_ncpus; cpu++) { - pcpu_find(cpu)->pc_last_processed_l1i = LONG_BIT - 1; - pcpu_find(cpu)->pc_last_processed_l2i = LONG_BIT - 1; - } - - error = BUS_SETUP_INTR(device_get_parent(device), device, - scp->res_irq, INTR_MPSAFE|INTR_TYPE_MISC, NULL, evtchn_interrupt, - NULL, &scp->intr_cookie); - if (error) - return (error); - - xenpci_device = device; - - return (0); -} Index: sys/dev/xen/xenpci/xenpci.c =================================================================== --- sys/dev/xen/xenpci/xenpci.c (revision 255014) +++ sys/dev/xen/xenpci/xenpci.c (working copy) @@ -32,40 +32,25 @@ __FBSDID("$FreeBSD$"); #include #include #include -#include -#include -#include #include #include #include #include -#include + +#include #include #include -#include -#include -#include -#include +#include #include #include -#include -#include -#include -#include - #include -/* - * These variables are used by the rest of the kernel to access the - * hypervisor. 
- */ -char *hypercall_stubs; -shared_info_t *HYPERVISOR_shared_info; -static vm_paddr_t shared_info_pa; +extern void xen_intr_handle_upcall(struct trapframe *trap_frame); + static device_t nexus; /* @@ -73,103 +58,42 @@ static device_t nexus; */ static devclass_t xenpci_devclass; -/* - * Return the CPUID base address for Xen functions. - */ -static uint32_t -xenpci_cpuid_base(void) +static int +xenpci_intr_filter(void *trap_frame) { - uint32_t base, regs[4]; - - for (base = 0x40000000; base < 0x40010000; base += 0x100) { - do_cpuid(base, regs); - if (!memcmp("XenVMMXenVMM", &regs[1], 12) - && (regs[0] - base) >= 2) - return (base); - } - return (0); + xen_intr_handle_upcall(trap_frame); + return (FILTER_HANDLED); } -/* - * Allocate and fill in the hypcall page. - */ static int -xenpci_init_hypercall_stubs(device_t dev, struct xenpci_softc * scp) +xenpci_irq_init(device_t device, struct xenpci_softc *scp) { - uint32_t base, regs[4]; - int i; + int error; - base = xenpci_cpuid_base(); - if (!base) { - device_printf(dev, "Xen platform device but not Xen VMM\n"); - return (EINVAL); - } + error = BUS_SETUP_INTR(device_get_parent(device), device, + scp->res_irq, INTR_MPSAFE|INTR_TYPE_MISC, + xenpci_intr_filter, NULL, /*trap_frame*/NULL, + &scp->intr_cookie); + if (error) + return error; - if (bootverbose) { - do_cpuid(base + 1, regs); - device_printf(dev, "Xen version %d.%d.\n", - regs[0] >> 16, regs[0] & 0xffff); - } - /* - * Find the hypercall pages. + * When using the PCI event delivery callback we cannot assign + * events to specific vCPUs, so all events are delivered to vCPU#0 by + * Xen. Since the PCI interrupt can fire on any CPU by default, we + * need to bind it to vCPU#0 in order to ensure that + * xen_intr_handle_upcall always gets called on vCPU#0.
*/ - do_cpuid(base + 2, regs); - - hypercall_stubs = malloc(regs[0] * PAGE_SIZE, M_TEMP, M_WAITOK); + error = BUS_BIND_INTR(device_get_parent(device), device, + scp->res_irq, 0); + if (error) + return error; - for (i = 0; i < regs[0]; i++) { - wrmsr(regs[1], vtophys(hypercall_stubs + i * PAGE_SIZE) + i); - } - + xen_hvm_set_callback(device); return (0); } /* - * After a resume, re-initialise the hypercall page. - */ -static void -xenpci_resume_hypercall_stubs(device_t dev, struct xenpci_softc * scp) -{ - uint32_t base, regs[4]; - int i; - - base = xenpci_cpuid_base(); - - do_cpuid(base + 2, regs); - for (i = 0; i < regs[0]; i++) { - wrmsr(regs[1], vtophys(hypercall_stubs + i * PAGE_SIZE) + i); - } -} - -/* - * Tell the hypervisor how to contact us for event channel callbacks. - */ -static void -xenpci_set_callback(device_t dev) -{ - int irq; - uint64_t callback; - struct xen_hvm_param xhp; - - irq = pci_get_irq(dev); - if (irq < 16) { - callback = irq; - } else { - callback = (pci_get_intpin(dev) - 1) & 3; - callback |= pci_get_slot(dev) << 11; - callback |= 1ull << 56; - } - - xhp.domid = DOMID_SELF; - xhp.index = HVM_PARAM_CALLBACK_IRQ; - xhp.value = callback; - if (HYPERVISOR_hvm_op(HVMOP_set_param, &xhp)) - panic("Can't set evtchn callback"); -} - - -/* * Deallocate anything allocated by xenpci_allocate_resources. */ static int @@ -293,35 +217,6 @@ xenpci_deactivate_resource(device_t dev, device_t } /* - * Called very early in the resume sequence - reinitialise the various - * bits of Xen machinery including the hypercall page and the shared - * info page. 
- */ -void -xenpci_resume() -{ - device_t dev = devclass_get_device(xenpci_devclass, 0); - struct xenpci_softc *scp = device_get_softc(dev); - struct xen_add_to_physmap xatp; - - xenpci_resume_hypercall_stubs(dev, scp); - - xatp.domid = DOMID_SELF; - xatp.idx = 0; - xatp.space = XENMAPSPACE_shared_info; - xatp.gpfn = shared_info_pa >> PAGE_SHIFT; - if (HYPERVISOR_memory_op(XENMEM_add_to_physmap, &xatp)) - panic("HYPERVISOR_memory_op failed"); - - pmap_kenter((vm_offset_t) HYPERVISOR_shared_info, shared_info_pa); - - xenpci_set_callback(dev); - - gnttab_resume(); - irq_resume(); -} - -/* * Probe - just check device ID. */ static int @@ -341,11 +236,9 @@ xenpci_probe(device_t dev) static int xenpci_attach(device_t dev) { - int error; struct xenpci_softc *scp = device_get_softc(dev); - struct xen_add_to_physmap xatp; - vm_offset_t shared_va; devclass_t dc; + int error; /* * Find and record nexus0. Since we are not really on the @@ -365,34 +258,16 @@ xenpci_attach(device_t dev) goto errexit; } - error = xenpci_init_hypercall_stubs(dev, scp); + /* + * Hook the irq up to evtchn + */ + error = xenpci_irq_init(dev, scp); if (error) { - device_printf(dev, "xenpci_init_hypercall_stubs failed(%d).\n", - error); + device_printf(dev, "xenpci_irq_init failed(%d).\n", + error); goto errexit; } - setup_xen_features(); - - xenpci_alloc_space_int(scp, PAGE_SIZE, &shared_info_pa); - - xatp.domid = DOMID_SELF; - xatp.idx = 0; - xatp.space = XENMAPSPACE_shared_info; - xatp.gpfn = shared_info_pa >> PAGE_SHIFT; - if (HYPERVISOR_memory_op(XENMEM_add_to_physmap, &xatp)) - panic("HYPERVISOR_memory_op failed"); - - shared_va = kva_alloc(PAGE_SIZE); - pmap_kenter(shared_va, shared_info_pa); - HYPERVISOR_shared_info = (void *) shared_va; - - /* - * Hook the irq up to evtchn - */ - xenpci_irq_init(dev, scp); - xenpci_set_callback(dev); - return (bus_generic_attach(dev)); errexit: @@ -431,13 +306,42 @@ xenpci_detach(device_t dev) return (xenpci_deallocate_resources(dev)); } +static int 
+xenpci_suspend(device_t dev) +{ + struct xenpci_softc *scp = device_get_softc(dev); + device_t parent = device_get_parent(dev); + + if (scp->intr_cookie != NULL) { + if (BUS_TEARDOWN_INTR(parent, dev, scp->res_irq, + scp->intr_cookie) != 0) + printf("intr teardown failed.. continuing\n"); + scp->intr_cookie = NULL; + } + + return (bus_generic_suspend(dev)); +} + +static int +xenpci_resume(device_t dev) +{ + struct xenpci_softc *scp = device_get_softc(dev); + device_t parent = device_get_parent(dev); + + BUS_SETUP_INTR(parent, dev, scp->res_irq, + INTR_MPSAFE|INTR_TYPE_MISC, xenpci_intr_filter, NULL, + /*trap_frame*/NULL, &scp->intr_cookie); + xen_hvm_set_callback(dev); + return (bus_generic_resume(dev)); +} + static device_method_t xenpci_methods[] = { /* Device interface */ DEVMETHOD(device_probe, xenpci_probe), DEVMETHOD(device_attach, xenpci_attach), DEVMETHOD(device_detach, xenpci_detach), - DEVMETHOD(device_suspend, bus_generic_suspend), - DEVMETHOD(device_resume, bus_generic_resume), + DEVMETHOD(device_suspend, xenpci_suspend), + DEVMETHOD(device_resume, xenpci_resume), /* Bus interface */ DEVMETHOD(bus_add_child, bus_generic_add_child), Index: sys/dev/xen/xenpci/xenpcivar.h =================================================================== --- sys/dev/xen/xenpci/xenpcivar.h (revision 255014) +++ sys/dev/xen/xenpci/xenpcivar.h (working copy) @@ -38,7 +38,4 @@ struct xenpci_softc { vm_paddr_t phys_next; /* next page from mem range */ }; -extern int xenpci_irq_init(device_t device, struct xenpci_softc *scp); extern int xenpci_alloc_space(size_t sz, vm_paddr_t *pa); -extern void xenpci_resume(void); -extern void xen_suspend(void); Index: sys/i386/i386/apic_vector.s =================================================================== --- sys/i386/i386/apic_vector.s (revision 255014) +++ sys/i386/i386/apic_vector.s (working copy) @@ -138,6 +138,25 @@ IDTVEC(errorint) MEXITCOUNT jmp doreti +#ifdef XENHVM +/* + * Xen event channel upcall interrupt handler. 
+ * Only used when the hypervisor supports direct vector callbacks. + */ + .text + SUPERALIGN_TEXT +IDTVEC(xen_intr_upcall) + PUSH_FRAME + SET_KERNEL_SREGS + cld + FAKE_MCOUNT(TF_EIP(%esp)) + pushl %esp + call xen_intr_handle_upcall + add $4, %esp + MEXITCOUNT + jmp doreti +#endif + #ifdef SMP /* * Global address space TLB shootdown. Index: sys/i386/i386/machdep.c =================================================================== --- sys/i386/i386/machdep.c (revision 255014) +++ sys/i386/i386/machdep.c (working copy) @@ -160,9 +160,8 @@ uint32_t arch_i386_xbox_memsize = 0; #ifdef XEN /* XEN includes */ -#include +#include #include -#include #include #include #include @@ -1216,6 +1215,13 @@ cpu_est_clockrate(int cpu_id, uint64_t *rate) #ifdef XEN +static void +idle_block(void) +{ + + HYPERVISOR_sched_op(SCHEDOP_block, 0); +} + void cpu_halt(void) { @@ -1960,6 +1966,9 @@ extern inthand_t #ifdef KDTRACE_HOOKS IDTVEC(dtrace_ret), #endif +#ifdef XENHVM + IDTVEC(xen_intr_upcall), +#endif IDTVEC(lcall_syscall), IDTVEC(int0x80_syscall); #ifdef DDB @@ -2948,6 +2957,10 @@ init386(first) setidt(IDT_DTRACE_RET, &IDTVEC(dtrace_ret), SDT_SYS386TGT, SEL_UPL, GSEL(GCODE_SEL, SEL_KPL)); #endif +#ifdef XENHVM + setidt(IDT_EVTCHN, &IDTVEC(xen_intr_upcall), SDT_SYS386IGT, SEL_UPL, + GSEL(GCODE_SEL, SEL_KPL)); +#endif r_idt.rd_limit = sizeof(idt0) - 1; r_idt.rd_base = (int) idt; Index: sys/i386/i386/mp_machdep.c =================================================================== --- sys/i386/i386/mp_machdep.c (revision 255014) +++ sys/i386/i386/mp_machdep.c (working copy) @@ -82,6 +82,10 @@ __FBSDID("$FreeBSD$"); #include #include +#ifdef XENHVM +#include +#endif + #define WARMBOOT_TARGET 0 #define WARMBOOT_OFF (KERNBASE + 0x0467) #define WARMBOOT_SEG (KERNBASE + 0x0469) @@ -747,6 +751,11 @@ init_secondary(void) /* set up SSE registers */ enable_sse(); +#ifdef XENHVM + /* register vcpu_info area */ + xen_hvm_init_cpu(); +#endif + #ifdef PAE /* Enable the PTE no-execute bit. 
*/ if ((amd_feature & AMDID_NX) != 0) { Index: sys/i386/include/apicvar.h =================================================================== --- sys/i386/include/apicvar.h (revision 255014) +++ sys/i386/include/apicvar.h (working copy) @@ -226,6 +226,7 @@ int lapic_set_lvt_triggermode(u_int apic_id, u_int enum intr_trigger trigger); void lapic_set_tpr(u_int vector); void lapic_setup(int boot); +void xen_intr_handle_upcall(struct trapframe *frame); #endif /* !LOCORE */ #endif /* _MACHINE_APICVAR_H_ */ Index: sys/i386/include/intr_machdep.h =================================================================== --- sys/i386/include/intr_machdep.h (revision 255014) +++ sys/i386/include/intr_machdep.h (working copy) @@ -44,12 +44,30 @@ * allocate IDT vectors. * * The first 255 IRQs (0 - 254) are reserved for ISA IRQs and PCI intline IRQs. - * IRQ values beyond 256 are used by MSI. We leave 255 unused to avoid - * confusion since 255 is used in PCI to indicate an invalid IRQ. + * IRQ values from 256 to 767 are used by MSI. When running under the Xen + * Hypervisor, IRQ values from 768 to 4863 are available for binding to + * event channel events. We leave 255 unused to avoid confusion since 255 is + * used in PCI to indicate an invalid IRQ. */ #define NUM_MSI_INTS 512 #define FIRST_MSI_INT 256 -#define NUM_IO_INTS (FIRST_MSI_INT + NUM_MSI_INTS) +#ifdef XENHVM +#include +#define NUM_EVTCHN_INTS NR_EVENT_CHANNELS +#define FIRST_EVTCHN_INT \ + (FIRST_MSI_INT + NUM_MSI_INTS) +#define LAST_EVTCHN_INT \ + (FIRST_EVTCHN_INT + NUM_EVTCHN_INTS - 1) +#elif defined(XEN) +#include +#define NUM_EVTCHN_INTS NR_EVENT_CHANNELS +#define FIRST_EVTCHN_INT 0 +#define LAST_EVTCHN_INT \ + (FIRST_EVTCHN_INT + NUM_EVTCHN_INTS - 1) +#else /* !XEN && !XENHVM */ +#define NUM_EVTCHN_INTS 0 +#endif +#define NUM_IO_INTS (FIRST_MSI_INT + NUM_MSI_INTS + NUM_EVTCHN_INTS) /* * Default base address for MSI messages on x86 platforms. 
Index: sys/i386/include/pcpu.h =================================================================== --- sys/i386/include/pcpu.h (revision 255014) +++ sys/i386/include/pcpu.h (working copy) @@ -71,22 +71,10 @@ struct shadow_time_info { vm_paddr_t *pc_pdir_shadow; \ uint64_t pc_processed_system_time; \ struct shadow_time_info pc_shadow_time; \ - int pc_resched_irq; \ - int pc_callfunc_irq; \ - int pc_virq_to_irq[NR_VIRQS]; \ - int pc_ipi_to_irq[NR_IPIS]; \ - char __pad[77] + char __pad[189] -#elif defined(XENHVM) +#else /* !XEN */ -#define PCPU_XEN_FIELDS \ - ; \ - unsigned int pc_last_processed_l1i; \ - unsigned int pc_last_processed_l2i; \ - char __pad[229] - -#else /* !XEN && !XENHVM */ - #define PCPU_XEN_FIELDS \ ; \ char __pad[237] Index: sys/i386/include/pmap.h =================================================================== --- sys/i386/include/pmap.h (revision 255014) +++ sys/i386/include/pmap.h (working copy) @@ -213,7 +213,9 @@ extern pd_entry_t *IdlePTD; /* physical address of #if defined(XEN) #include -#include + +#include + #include #include Index: sys/i386/include/xen/xen-os.h =================================================================== --- sys/i386/include/xen/xen-os.h (revision 255014) +++ sys/i386/include/xen/xen-os.h (working copy) @@ -1,52 +1,64 @@ -/****************************************************************************** - * os.h +/***************************************************************************** + * i386/xen/xen-os.h * - * random collection of macros and definition + * Random collection of macros and definition + * + * Copyright (c) 2003, 2004 Keir Fraser (on behalf of the Xen team) + * All rights reserved. 
+ * + * Permission is hereby granted, free of charge, to any person obtaining a copy + * of this software and associated documentation files (the "Software"), to + * deal in the Software without restriction, including without limitation the + * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or + * sell copies of the Software, and to permit persons to whom the Software is + * furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER + * DEALINGS IN THE SOFTWARE. + * + * $FreeBSD$ */ -#ifndef _XEN_OS_H_ -#define _XEN_OS_H_ -#include +#ifndef _MACHINE_XEN_XEN_OS_H_ +#define _MACHINE_XEN_XEN_OS_H_ #ifdef PAE #define CONFIG_X86_PAE #endif -#ifdef LOCORE -#define __ASSEMBLY__ -#endif - -#if !defined(__XEN_INTERFACE_VERSION__) -#define __XEN_INTERFACE_VERSION__ 0x00030208 -#endif - -#define GRANT_REF_INVALID 0xffffffff - -#include - /* Everything below this point is not included by assembler (.S) files. */ #ifndef __ASSEMBLY__ /* Force a proper event-channel callback from Xen. */ void force_evtchn_callback(void); -#define likely(x) __builtin_expect((x),1) -#define unlikely(x) __builtin_expect((x),0) +/* REP NOP (PAUSE) is a good thing to insert into busy-wait loops. 
*/ +static inline void rep_nop(void) +{ + __asm__ __volatile__ ( "rep;nop" : : : "memory" ); +} +#define cpu_relax() rep_nop() -#ifndef vtophys -#include -#include -#include -#endif +#ifndef XENHVM +void xc_printf(const char *fmt, ...); +#ifdef SMP extern int gdtset; -#ifdef SMP + #include /* XXX for pcpu.h */ #include /* XXX for PCPU_GET */ static inline int smp_processor_id(void) { - if (likely(gdtset)) + if (__predict_true(gdtset)) return PCPU_GET(cpuid); return 0; } @@ -55,50 +67,16 @@ smp_processor_id(void) #define smp_processor_id() 0 #endif -#ifndef NULL -#define NULL (void *)0 -#endif - #ifndef PANIC_IF -#define PANIC_IF(exp) if (unlikely(exp)) {printk("panic - %s: %s:%d\n",#exp, __FILE__, __LINE__); panic("%s: %s:%d", #exp, __FILE__, __LINE__);} +#define PANIC_IF(exp) if (__predict_false(exp)) {printf("panic - %s: %s:%d\n",#exp, __FILE__, __LINE__); panic("%s: %s:%d", #exp, __FILE__, __LINE__);} #endif -extern shared_info_t *HYPERVISOR_shared_info; - -/* Somewhere in the middle of the GCC 2.96 development cycle, we implemented - a mechanism by which the user can annotate likely branch directions and - expect the blocks to be reordered appropriately. Define __builtin_expect - to nothing for earlier compilers. */ - -/* REP NOP (PAUSE) is a good thing to insert into busy-wait loops. */ -static inline void rep_nop(void) -{ - __asm__ __volatile__ ( "rep;nop" : : : "memory" ); -} -#define cpu_relax() rep_nop() - - -#if __GNUC__ == 2 && __GNUC_MINOR__ < 96 -#define __builtin_expect(x, expected_value) (x) -#endif - -#define per_cpu(var, cpu) (pcpu_find((cpu))->pc_ ## var) - -/* crude memory allocator for memory allocation early in - * boot +/* + * Crude memory allocator for memory allocation early in boot. */ void *bootmem_alloc(unsigned int size); void bootmem_free(void *ptr, unsigned int size); -#include - -void printk(const char *fmt, ...); - -/* some function prototypes */ -void trap_init(void); - -#ifndef XENHVM - /* * STI/CLI equivalents. 
These basically set and clear the virtual * event_enable flag in the shared_info structure. Note that when @@ -106,7 +84,6 @@ void bootmem_free(void *ptr, unsigned int size); * We may therefore call into do_hypervisor_callback() directly. */ - #define __cli() \ do { \ vcpu_info_t *_vcpu; \ @@ -122,7 +99,7 @@ do { _vcpu = &HYPERVISOR_shared_info->vcpu_info[smp_processor_id()]; \ _vcpu->evtchn_upcall_mask = 0; \ barrier(); /* unmask then check (avoid races) */ \ - if ( unlikely(_vcpu->evtchn_upcall_pending) ) \ + if (__predict_false(_vcpu->evtchn_upcall_pending)) \ force_evtchn_callback(); \ } while (0) @@ -133,7 +110,7 @@ do { _vcpu = &HYPERVISOR_shared_info->vcpu_info[smp_processor_id()]; \ if ((_vcpu->evtchn_upcall_mask = (x)) == 0) { \ barrier(); /* unmask then check (avoid races) */ \ - if ( unlikely(_vcpu->evtchn_upcall_pending) ) \ + if (__predict_false(_vcpu->evtchn_upcall_pending)) \ force_evtchn_callback(); \ } \ } while (0) @@ -168,32 +145,8 @@ do { #define spin_lock_irqsave mtx_lock_irqsave #define spin_unlock_irqrestore mtx_unlock_irqrestore -#endif +#endif /* !XENHVM */ -#ifndef xen_mb -#define xen_mb() mb() -#endif -#ifndef xen_rmb -#define xen_rmb() rmb() -#endif -#ifndef xen_wmb -#define xen_wmb() wmb() -#endif -#ifdef SMP -#define smp_mb() mb() -#define smp_rmb() rmb() -#define smp_wmb() wmb() -#define smp_read_barrier_depends() read_barrier_depends() -#define set_mb(var, value) do { xchg(&var, value); } while (0) -#else -#define smp_mb() barrier() -#define smp_rmb() barrier() -#define smp_wmb() barrier() -#define smp_read_barrier_depends() do { } while(0) -#define set_mb(var, value) do { var = value; barrier(); } while (0) -#endif - - /* This is a barrier for the compiler only, NOT the processor! 
*/ #define barrier() __asm__ __volatile__("": : :"memory") @@ -207,8 +160,6 @@ do { */ typedef struct { volatile int counter; } atomic_t; - - #define xen_xchg(ptr,v) \ ((__typeof__(*(ptr)))__xchg((unsigned long)(v),(ptr),sizeof(*(ptr)))) struct __xchg_dummy { unsigned long a[100]; }; @@ -335,33 +286,6 @@ static __inline__ void atomic_inc(atomic_t *v) #define rdtscll(val) \ __asm__ __volatile__("rdtsc" : "=A" (val)) - - -/* - * Kernel pointers have redundant information, so we can use a - * scheme where we can return either an error code or a dentry - * pointer with the same return value. - * - * This should be a per-architecture thing, to allow different - * error and pointer decisions. - */ -#define IS_ERR_VALUE(x) unlikely((x) > (unsigned long)-1000L) - -static inline void *ERR_PTR(long error) -{ - return (void *) error; -} - -static inline long PTR_ERR(const void *ptr) -{ - return (long) ptr; -} - -static inline long IS_ERR(const void *ptr) -{ - return IS_ERR_VALUE((unsigned long)ptr); -} - #endif /* !__ASSEMBLY__ */ -#endif /* _OS_H_ */ +#endif /* _MACHINE_XEN_XEN_OS_H_ */ Index: sys/i386/include/xen/xenfunc.h =================================================================== --- sys/i386/include/xen/xenfunc.h (revision 255014) +++ sys/i386/include/xen/xenfunc.h (working copy) @@ -29,10 +29,14 @@ #ifndef _XEN_XENFUNC_H_ #define _XEN_XENFUNC_H_ -#include +#include #include + +#include + #include #include + #include #define BKPT __asm__("int3"); #define XPQ_CALL_DEPTH 5 Index: sys/i386/include/xen/xenvar.h =================================================================== --- sys/i386/include/xen/xenvar.h (revision 255014) +++ sys/i386/include/xen/xenvar.h (working copy) @@ -37,7 +37,8 @@ #define XPMAP 0x2 extern int xendebug_flags; #ifndef NOXENDEBUG -#define XENPRINTF printk +/* Print directly to the Xen console during debugging. 
*/ +#define XENPRINTF xc_printf #else #define XENPRINTF printf #endif Index: sys/i386/isa/npx.c =================================================================== --- sys/i386/isa/npx.c (revision 255014) +++ sys/i386/isa/npx.c (working copy) @@ -69,7 +69,7 @@ __FBSDID("$FreeBSD$"); #include #ifdef XEN -#include +#include #include #endif Index: sys/i386/xen/clock.c =================================================================== --- sys/i386/xen/clock.c (revision 255014) +++ sys/i386/xen/clock.c (working copy) @@ -84,7 +84,7 @@ __FBSDID("$FreeBSD$"); #include #include #include -#include +#include #include #include #include @@ -133,6 +133,8 @@ static uint64_t processed_system_time; /* stime (n static const u_char daysinmonth[] = {31,28,31,30,31,30,31,31,30,31,30,31}; +int ap_cpu_initclocks(int cpu); + SYSCTL_INT(_machdep, OID_AUTO, independent_wallclock, CTLFLAG_RW, &independent_wallclock, 0, ""); SYSCTL_INT(_machdep, OID_AUTO, xen_disable_rtc_set, @@ -257,9 +259,11 @@ static void __get_time_values_from_xen(void) struct vcpu_time_info *src; struct shadow_time_info *dst; uint32_t pre_version, post_version; + struct pcpu *pc; + pc = pcpu_find(smp_processor_id()); src = &s->vcpu_info[smp_processor_id()].time; - dst = &per_cpu(shadow_time, smp_processor_id()); + dst = &pc->pc_shadow_time; spinlock_enter(); do { @@ -283,9 +287,11 @@ static inline int time_values_up_to_date(int cpu) { struct vcpu_time_info *src; struct shadow_time_info *dst; + struct pcpu *pc; src = &HYPERVISOR_shared_info->vcpu_info[cpu].time; - dst = &per_cpu(shadow_time, cpu); + pc = pcpu_find(cpu); + dst = &pc->pc_shadow_time; rmb(); return (dst->version == src->version); @@ -320,7 +326,8 @@ clkintr(void *arg) { int64_t now; int cpu = smp_processor_id(); - struct shadow_time_info *shadow = &per_cpu(shadow_time, cpu); + struct pcpu *pc = pcpu_find(cpu); + struct shadow_time_info *shadow = &pc->pc_shadow_time; struct xen_et_state *state = DPCPU_PTR(et_state); do { @@ -364,8 +371,10 @@ getit(void) 
struct shadow_time_info *shadow; uint64_t time; uint32_t local_time_version; + struct pcpu *pc; - shadow = &per_cpu(shadow_time, smp_processor_id()); + pc = pcpu_find(smp_processor_id()); + shadow = &pc->pc_shadow_time; do { local_time_version = shadow->version; @@ -492,12 +501,14 @@ void timer_restore(void) { struct xen_et_state *state = DPCPU_PTR(et_state); + struct pcpu *pc; /* Get timebases for new environment. */ __get_time_values_from_xen(); /* Reset our own concept of passage of system time. */ - processed_system_time = per_cpu(shadow_time, 0).system_timestamp; + pc = pcpu_find(0); + processed_system_time = pc->pc_shadow_time.system_timestamp; state->next = processed_system_time; } @@ -508,10 +519,13 @@ startrtclock() uint64_t __cpu_khz; uint32_t cpu_khz; struct vcpu_time_info *info; + struct pcpu *pc; + pc = pcpu_find(0); + /* initialize xen values */ __get_time_values_from_xen(); - processed_system_time = per_cpu(shadow_time, 0).system_timestamp; + processed_system_time = pc->pc_shadow_time.system_timestamp; __cpu_khz = 1000000ULL << 32; info = &HYPERVISOR_shared_info->vcpu_info[0].time; @@ -594,8 +608,10 @@ domu_resettodr(void) int s; dom0_op_t op; struct shadow_time_info *shadow; + struct pcpu *pc; - shadow = &per_cpu(shadow_time, smp_processor_id()); + pc = pcpu_find(smp_processor_id()); + shadow = &pc->pc_shadow_time; if (xen_disable_rtc_set) return; @@ -773,6 +789,7 @@ xen_et_start(struct eventtimer *et, sbintime_t fir struct xen_et_state *state = DPCPU_PTR(et_state); struct shadow_time_info *shadow; int64_t fperiod; + struct pcpu *pc; __get_time_values_from_xen(); @@ -788,7 +805,8 @@ xen_et_start(struct eventtimer *et, sbintime_t fir else fperiod = state->period; - shadow = &per_cpu(shadow_time, smp_processor_id()); + pc = pcpu_find(smp_processor_id()); + shadow = &pc->pc_shadow_time; state->next = shadow->system_timestamp + get_nsec_offset(shadow); state->next += fperiod; HYPERVISOR_set_timer_op(state->next + 50000); @@ -811,11 +829,11 @@ 
xen_et_stop(struct eventtimer *et) void cpu_initclocks(void) { - unsigned int time_irq; + xen_intr_handle_t time_irq; int error; HYPERVISOR_vcpu_op(VCPUOP_stop_periodic_timer, 0, NULL); - error = bind_virq_to_irqhandler(VIRQ_TIMER, 0, "cpu0:timer", + error = xen_intr_bind_virq(root_bus, VIRQ_TIMER, 0, clkintr, NULL, NULL, INTR_TYPE_CLK, &time_irq); if (error) panic("failed to register clock interrupt\n"); @@ -840,13 +858,11 @@ cpu_initclocks(void) int ap_cpu_initclocks(int cpu) { - char buf[MAXCOMLEN + 1]; - unsigned int time_irq; + xen_intr_handle_t time_irq; int error; HYPERVISOR_vcpu_op(VCPUOP_stop_periodic_timer, cpu, NULL); - snprintf(buf, sizeof(buf), "cpu%d:timer", cpu); - error = bind_virq_to_irqhandler(VIRQ_TIMER, cpu, buf, + error = xen_intr_bind_virq(root_bus, VIRQ_TIMER, cpu, clkintr, NULL, NULL, INTR_TYPE_CLK, &time_irq); if (error) panic("failed to register clock interrupt\n"); @@ -859,8 +875,11 @@ xen_get_timecount(struct timecounter *tc) { uint64_t clk; struct shadow_time_info *shadow; - shadow = &per_cpu(shadow_time, smp_processor_id()); + struct pcpu *pc; + pc = pcpu_find(smp_processor_id()); + shadow = &pc->pc_shadow_time; + __get_time_values_from_xen(); clk = shadow->system_timestamp + get_nsec_offset(shadow); @@ -876,13 +895,6 @@ get_system_time(int ticks) return processed_system_time + (ticks * NS_PER_TICK); } -void -idle_block(void) -{ - - HYPERVISOR_sched_op(SCHEDOP_block, 0); -} - int timer_spkr_acquire(void) { Index: sys/i386/xen/exception.s =================================================================== --- sys/i386/xen/exception.s (revision 255014) +++ sys/i386/xen/exception.s (working copy) @@ -168,7 +168,7 @@ call_evtchn_upcall: jb critical_region_fixup 10: pushl %esp - call evtchn_do_upcall + call xen_intr_handle_upcall addl $4,%esp /* Index: sys/i386/xen/mp_machdep.c =================================================================== --- sys/i386/xen/mp_machdep.c (revision 255014) +++ sys/i386/xen/mp_machdep.c (working copy) @@ 
-87,7 +87,7 @@ __FBSDID("$FreeBSD$"); -#include +#include #include #include #include @@ -102,9 +102,6 @@ extern struct pcpu __pcpu[]; static int bootAP; static union descriptor *bootAPgdt; -static char resched_name[NR_CPUS][15]; -static char callfunc_name[NR_CPUS][15]; - /* Free these after use */ void *bootstacks[MAXCPU]; @@ -156,7 +153,11 @@ static cpuset_t hyperthreading_cpus_mask; extern void Xhypervisor_callback(void); extern void failsafe_callback(void); extern void pmap_lazyfix_action(void); +extern int ap_cpu_initclocks(int cpu); +DPCPU_DEFINE(xen_intr_handle_t, ipi_port[NR_IPIS]); +DPCPU_DEFINE(struct vcpu_info *, vcpu_info); + struct cpu_group * cpu_topo(void) { @@ -461,49 +462,49 @@ cpu_mp_announce(void) } static int -xen_smp_intr_init(unsigned int cpu) +xen_smp_cpu_init(unsigned int cpu) { int rc; - unsigned int irq; - - per_cpu(resched_irq, cpu) = per_cpu(callfunc_irq, cpu) = -1; + xen_intr_handle_t irq_handle; - sprintf(resched_name[cpu], "resched%u", cpu); - rc = bind_ipi_to_irqhandler(RESCHEDULE_VECTOR, - cpu, - resched_name[cpu], - smp_reschedule_interrupt, - INTR_TYPE_TTY, &irq); + DPCPU_ID_SET(cpu, ipi_port[RESCHEDULE_VECTOR], NULL); + DPCPU_ID_SET(cpu, ipi_port[CALL_FUNCTION_VECTOR], NULL); - printf("[XEN] IPI cpu=%d irq=%d vector=RESCHEDULE_VECTOR (%d)\n", - cpu, irq, RESCHEDULE_VECTOR); - - per_cpu(resched_irq, cpu) = irq; + /* + * The PCPU variable pc_device is not initialized on i386 PV, + * so we have to use the root_bus device in order to setup + * the IPIs. 
+ */ + rc = xen_intr_bind_ipi(root_bus, RESCHEDULE_VECTOR, + cpu, smp_reschedule_interrupt, INTR_TYPE_TTY, &irq_handle); + if (rc < 0) + goto fail; + xen_intr_describe(irq_handle, "resched%u", cpu); + DPCPU_ID_SET(cpu, ipi_port[RESCHEDULE_VECTOR], irq_handle); - sprintf(callfunc_name[cpu], "callfunc%u", cpu); - rc = bind_ipi_to_irqhandler(CALL_FUNCTION_VECTOR, - cpu, - callfunc_name[cpu], - smp_call_function_interrupt, - INTR_TYPE_TTY, &irq); + printf("[XEN] IPI cpu=%d port=%d vector=RESCHEDULE_VECTOR (%d)\n", + cpu, xen_intr_port(irq_handle), RESCHEDULE_VECTOR); + + rc = xen_intr_bind_ipi(root_bus, CALL_FUNCTION_VECTOR, + cpu, smp_call_function_interrupt, INTR_TYPE_TTY, &irq_handle); if (rc < 0) goto fail; - per_cpu(callfunc_irq, cpu) = irq; + xen_intr_describe(irq_handle, "callfunc%u", cpu); + DPCPU_ID_SET(cpu, ipi_port[CALL_FUNCTION_VECTOR], irq_handle); - printf("[XEN] IPI cpu=%d irq=%d vector=CALL_FUNCTION_VECTOR (%d)\n", - cpu, irq, CALL_FUNCTION_VECTOR); + printf("[XEN] IPI cpu=%d port=%d vector=CALL_FUNCTION_VECTOR (%d)\n", + cpu, xen_intr_port(irq_handle), CALL_FUNCTION_VECTOR); - if ((cpu != 0) && ((rc = ap_cpu_initclocks(cpu)) != 0)) goto fail; return 0; fail: - if (per_cpu(resched_irq, cpu) >= 0) - unbind_from_irqhandler(per_cpu(resched_irq, cpu)); - if (per_cpu(callfunc_irq, cpu) >= 0) - unbind_from_irqhandler(per_cpu(callfunc_irq, cpu)); + xen_intr_unbind(DPCPU_ID_GET(cpu, ipi_port[RESCHEDULE_VECTOR])); + DPCPU_ID_SET(cpu, ipi_port[RESCHEDULE_VECTOR], NULL); + xen_intr_unbind(DPCPU_ID_GET(cpu, ipi_port[CALL_FUNCTION_VECTOR])); + DPCPU_ID_SET(cpu, ipi_port[CALL_FUNCTION_VECTOR], NULL); return rc; } @@ -513,9 +514,19 @@ xen_smp_intr_init_cpus(void *unused) int i; for (i = 0; i < mp_ncpus; i++) - xen_smp_intr_init(i); + xen_smp_cpu_init(i); } +static void +xen_smp_intr_setup_cpus(void *unused) +{ + int i; + + for (i = 0; i < mp_ncpus; i++) + DPCPU_ID_SET(i, vcpu_info, + &HYPERVISOR_shared_info->vcpu_info[i]); +} + #define MTOPSIZE (1<<(14 + PAGE_SHIFT)) 
/* @@ -959,6 +970,13 @@ start_ap(int apic_id) return 0; /* return FAILURE */ } +static void +ipi_pcpu(int cpu, u_int ipi) +{ + KASSERT((ipi <= NR_IPIS), ("invalid IPI")); + xen_intr_signal(DPCPU_ID_GET(cpu, ipi_port[ipi])); +} + /* * send an IPI to a specific CPU. */ @@ -1246,5 +1264,6 @@ release_aps(void *dummy __unused) ia32_pause(); } SYSINIT(start_aps, SI_SUB_SMP, SI_ORDER_FIRST, release_aps, NULL); -SYSINIT(start_ipis, SI_SUB_INTR, SI_ORDER_ANY, xen_smp_intr_init_cpus, NULL); +SYSINIT(start_ipis, SI_SUB_SMP, SI_ORDER_ANY, xen_smp_intr_init_cpus, NULL); +SYSINIT(start_cpu, SI_SUB_INTR, SI_ORDER_ANY, xen_smp_intr_setup_cpus, NULL); Index: sys/i386/xen/mptable.c =================================================================== --- sys/i386/xen/mptable.c (revision 255014) +++ sys/i386/xen/mptable.c (working copy) @@ -40,7 +40,7 @@ __FBSDID("$FreeBSD$"); #include #include -#include +#include #include #include Index: sys/i386/xen/xen_clock_util.c =================================================================== --- sys/i386/xen/xen_clock_util.c (revision 255014) +++ sys/i386/xen/xen_clock_util.c (working copy) @@ -39,12 +39,13 @@ __FBSDID("$FreeBSD$"); #include #include +#include #include + #include #include #include #include -#include #include #include #include Index: sys/i386/xen/xen_machdep.c =================================================================== --- sys/i386/xen/xen_machdep.c (revision 255014) +++ sys/i386/xen/xen_machdep.c (working copy) @@ -47,7 +47,7 @@ __FBSDID("$FreeBSD$"); #include #include -#include +#include #include #include @@ -96,6 +96,8 @@ xen_pfn_t *xen_pfn_to_mfn_frame_list[16]; xen_pfn_t *xen_pfn_to_mfn_frame_list_list; int preemptable, init_first; extern unsigned int avail_space; +int xen_vector_callback_enabled = 0; +enum xen_domain_type xen_domain_type = XEN_PV_DOMAIN; void ni_cli(void); void ni_sti(void); @@ -129,6 +131,12 @@ ni_sti(void) ); } +void +force_evtchn_callback(void) +{ + (void)HYPERVISOR_xen_version(0, NULL); +} + 
/* * Modify the cmd_line by converting ',' to NULLs so that it is in a format * suitable for the static env vars. @@ -141,7 +149,7 @@ xen_setbootenv(char *cmd_line) /* Skip leading spaces */ for (; *cmd_line == ' '; cmd_line++); - printk("xen_setbootenv(): cmd_line='%s'\n", cmd_line); + xc_printf("xen_setbootenv(): cmd_line='%s'\n", cmd_line); for (cmd_line_next = cmd_line; strsep(&cmd_line_next, ",") != NULL;); return cmd_line; @@ -177,16 +185,16 @@ xen_boothowto(char *envp) return howto; } -#define PRINTK_BUFSIZE 1024 +#define XC_PRINTF_BUFSIZE 1024 void -printk(const char *fmt, ...) +xc_printf(const char *fmt, ...) { __va_list ap; int retval; - static char buf[PRINTK_BUFSIZE]; + static char buf[XC_PRINTF_BUFSIZE]; va_start(ap, fmt); - retval = vsnprintf(buf, PRINTK_BUFSIZE - 1, fmt, ap); + retval = vsnprintf(buf, XC_PRINTF_BUFSIZE - 1, fmt, ap); va_end(ap); buf[retval] = 0; (void)HYPERVISOR_console_write(buf, retval); @@ -239,9 +247,10 @@ xen_dump_queue(void) if (_xpq_idx <= 1) return; - printk("xen_dump_queue(): %u entries\n", _xpq_idx); + xc_printf("xen_dump_queue(): %u entries\n", _xpq_idx); for (i = 0; i < _xpq_idx; i++) { - printk(" val: %llx ptr: %llx\n", XPQ_QUEUE[i].val, XPQ_QUEUE[i].ptr); + xc_printf(" val: %llx ptr: %llx\n", XPQ_QUEUE[i].val, + XPQ_QUEUE[i].ptr); } } #endif @@ -955,9 +964,10 @@ initvalues(start_info_t *startinfo) cur_space = xen_start_info->pt_base + (l3_pages + l2_pages + l1_pages + 1)*PAGE_SIZE; - printk("initvalues(): wooh - availmem=%x,%x\n", avail_space, cur_space); + xc_printf("initvalues(): wooh - availmem=%x,%x\n", avail_space, + cur_space); - printk("KERNBASE=%x,pt_base=%x, VTOPFN(base)=%x, nr_pt_frames=%x\n", + xc_printf("KERNBASE=%x,pt_base=%x, VTOPFN(base)=%x, nr_pt_frames=%x\n", KERNBASE,xen_start_info->pt_base, VTOPFN(xen_start_info->pt_base), xen_start_info->nr_pt_frames); xendebug_flags = 0; /* 0xffffffff; */ @@ -1007,7 +1017,7 @@ initvalues(start_info_t *startinfo) /* Map proc0's KSTACK */ proc0kstack = cur_space; 
cur_space += (KSTACK_PAGES * PAGE_SIZE); - printk("proc0kstack=%u\n", proc0kstack); + xc_printf("proc0kstack=%u\n", proc0kstack); /* vm86/bios stack */ cur_space += PAGE_SIZE; @@ -1106,18 +1116,18 @@ initvalues(start_info_t *startinfo) shinfo = xen_start_info->shared_info; PT_SET_MA(HYPERVISOR_shared_info, shinfo | PG_KERNEL); - printk("#4\n"); + xc_printf("#4\n"); xen_store_ma = (((vm_paddr_t)xen_start_info->store_mfn) << PAGE_SHIFT); PT_SET_MA(xen_store, xen_store_ma | PG_KERNEL); console_page_ma = (((vm_paddr_t)xen_start_info->console.domU.mfn) << PAGE_SHIFT); PT_SET_MA(console_page, console_page_ma | PG_KERNEL); - printk("#5\n"); + xc_printf("#5\n"); set_iopl.iopl = 1; PANIC_IF(HYPERVISOR_physdev_op(PHYSDEVOP_SET_IOPL, &set_iopl)); - printk("#6\n"); + xc_printf("#6\n"); #if 0 /* add page table for KERNBASE */ xen_queue_pt_update(IdlePTDma + KPTDI*sizeof(vm_paddr_t), @@ -1132,7 +1142,7 @@ initvalues(start_info_t *startinfo) #endif xen_flush_queue(); cur_space += PAGE_SIZE; - printk("#6\n"); + xc_printf("#6\n"); #endif /* 0 */ #ifdef notyet if (xen_start_info->flags & SIF_INITDOMAIN) { @@ -1150,13 +1160,13 @@ initvalues(start_info_t *startinfo) i < (((vm_offset_t)&etext) & ~PAGE_MASK); i += PAGE_SIZE) PT_SET_MA(i, VTOM(i) | PG_V | PG_A); - printk("#7\n"); + xc_printf("#7\n"); physfree = VTOP(cur_space); init_first = physfree >> PAGE_SHIFT; IdlePTD = (pd_entry_t *)VTOP(IdlePTD); IdlePDPT = (pd_entry_t *)VTOP(IdlePDPT); setup_xen_features(); - printk("#8, proc0kstack=%u\n", proc0kstack); + xc_printf("#8, proc0kstack=%u\n", proc0kstack); } @@ -1200,9 +1210,9 @@ HYPERVISOR_multicall(struct multicall_entry * call /* Check the results of individual hypercalls. 
*/ for (i = 0; i < nr_calls; i++) - if (unlikely(call_list[i].result < 0)) + if (__predict_false(call_list[i].result < 0)) ret++; - if (unlikely(ret > 0)) + if (__predict_false(ret > 0)) panic("%d multicall(s) failed: cpu %d\n", ret, smp_processor_id()); Index: sys/i386/xen/xen_rtc.c =================================================================== --- sys/i386/xen/xen_rtc.c (revision 255014) +++ sys/i386/xen/xen_rtc.c (working copy) @@ -39,15 +39,17 @@ __FBSDID("$FreeBSD$"); #include #include +#include #include +#include +#include +#include + #include #include + #include -#include -#include #include -#include -#include #include #include Index: sys/sys/kernel.h =================================================================== --- sys/sys/kernel.h (revision 255014) +++ sys/sys/kernel.h (working copy) @@ -96,6 +96,11 @@ enum sysinit_sub_id { SI_SUB_VM = 0x1000000, /* virtual memory system init*/ SI_SUB_KMEM = 0x1800000, /* kernel memory*/ SI_SUB_KVM_RSRC = 0x1A00000, /* kvm operational limits*/ + SI_SUB_HYPERVISOR = 0x1A40000, /* + * Hypervisor detection and + * virtualization support + * setup. + */ SI_SUB_WITNESS = 0x1A80000, /* witness initialization */ SI_SUB_MTX_POOL_DYNAMIC = 0x1AC0000, /* dynamic mutex pool */ SI_SUB_LOCK = 0x1B00000, /* various locks */ Index: sys/x86/include/segments.h =================================================================== --- sys/x86/include/segments.h (revision 255014) +++ sys/x86/include/segments.h (working copy) @@ -217,6 +217,7 @@ union descriptor { #define IDT_IO_INTS NRSVIDT /* Base of IDT entries for I/O interrupts. 
*/ #define IDT_SYSCALL 0x80 /* System Call Interrupt Vector */ #define IDT_DTRACE_RET 0x92 /* DTrace pid provider Interrupt Vector */ +#define IDT_EVTCHN 0x93 /* Xen HVM Event Channel Interrupt Vector */ #if defined(__i386__) || defined(__ia64__) /* Index: sys/x86/x86/local_apic.c =================================================================== --- sys/x86/x86/local_apic.c (revision 255014) +++ sys/x86/x86/local_apic.c (working copy) @@ -91,6 +91,7 @@ CTASSERT(IPI_STOP < APIC_SPURIOUS_INT); #define IRQ_TIMER (NUM_IO_INTS + 1) #define IRQ_SYSCALL (NUM_IO_INTS + 2) #define IRQ_DTRACE_RET (NUM_IO_INTS + 3) +#define IRQ_EVTCHN (NUM_IO_INTS + 4) /* * Support for local APICs. Local APICs manage interrupts on each @@ -313,6 +314,9 @@ lapic_create(u_int apic_id, int boot_cpu) lapics[apic_id].la_ioint_irqs[IDT_DTRACE_RET - APIC_IO_INTS] = IRQ_DTRACE_RET; #endif +#ifdef XENHVM + lapics[apic_id].la_ioint_irqs[IDT_EVTCHN - APIC_IO_INTS] = IRQ_EVTCHN; +#endif #ifdef SMP @@ -1137,6 +1141,10 @@ DB_SHOW_COMMAND(apic, db_show_apic) if (irq == IRQ_DTRACE_RET) continue; #endif +#ifdef XENHVM + if (irq == IRQ_EVTCHN) + continue; +#endif db_printf("vec 0x%2x -> ", i + APIC_IO_INTS); if (irq == IRQ_TIMER) db_printf("lapic timer\n"); Index: sys/x86/xen/hvm.c =================================================================== --- sys/x86/xen/hvm.c (revision 0) +++ sys/x86/xen/hvm.c (working copy) @@ -0,0 +1,256 @@ +/* + * Copyright (c) 2008 Citrix Systems, Inc. + * Copyright (c) 2012 Spectra Logic Corporation + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. 
Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include +__FBSDID("$FreeBSD$"); + +#include +#include +#include +#include +#include + +#include +#include + +#include +#include +#include +#include +#include +#include + +#include +#include + +#include +#include + +static MALLOC_DEFINE(M_XENHVM, "xen_hvm", "Xen HVM PV Support"); + +DPCPU_DEFINE(struct vcpu_info, vcpu_local_info); +DPCPU_DEFINE(struct vcpu_info *, vcpu_info); + +/*-------------------------------- Global Data -------------------------------*/ +/** + * If non-zero, the hypervisor has been configured to use a direct + * IDT event callback to the + */ +int xen_vector_callback_enabled; + +/*------------------ Hypervisor Access Shared Memory Regions -----------------*/ +/** Hypercall table accessed via HYPERVISOR_*_op() methods.
*/ +char *hypercall_stubs; +shared_info_t *HYPERVISOR_shared_info; +enum xen_domain_type xen_domain_type = XEN_NATIVE; + +static uint32_t +xen_hvm_cpuid_base(void) +{ + uint32_t base, regs[4]; + + for (base = 0x40000000; base < 0x40010000; base += 0x100) { + do_cpuid(base, regs); + if (!memcmp("XenVMMXenVMM", &regs[1], 12) + && (regs[0] - base) >= 2) + return (base); + } + return (0); +} + +/* + * Allocate and fill in the hypercall page. + */ +static int +xen_hvm_init_hypercall_stubs(void) +{ + uint32_t base, regs[4]; + int i; + + base = xen_hvm_cpuid_base(); + if (!base) + return (ENXIO); + + if (hypercall_stubs == NULL) { + do_cpuid(base + 1, regs); + printf("XEN: Hypervisor version %d.%d detected.\n", + regs[0] >> 16, regs[0] & 0xffff); + } + + /* + * Find the hypercall pages. + */ + do_cpuid(base + 2, regs); + + if (hypercall_stubs == NULL) { + size_t call_region_size; + + call_region_size = regs[0] * PAGE_SIZE; + hypercall_stubs = malloc(call_region_size, M_XENHVM, M_NOWAIT); + if (hypercall_stubs == NULL) + panic("Unable to allocate Xen hypercall region"); + } + + for (i = 0; i < regs[0]; i++) + wrmsr(regs[1], vtophys(hypercall_stubs + i * PAGE_SIZE) + i); + + return (0); +} + +static void +xen_hvm_init_shared_info_page(void) +{ + struct xen_add_to_physmap xatp; + + if (HYPERVISOR_shared_info == NULL) { + HYPERVISOR_shared_info = malloc(PAGE_SIZE, M_XENHVM, M_NOWAIT); + if (HYPERVISOR_shared_info == NULL) + panic("Unable to allocate Xen shared info page"); + } + + xatp.domid = DOMID_SELF; + xatp.idx = 0; + xatp.space = XENMAPSPACE_shared_info; + xatp.gpfn = vtophys(HYPERVISOR_shared_info) >> PAGE_SHIFT; + if (HYPERVISOR_memory_op(XENMEM_add_to_physmap, &xatp)) + panic("HYPERVISOR_memory_op failed"); +} + +/* + * Tell the hypervisor how to contact us for event channel callbacks.
+ */ +void +xen_hvm_set_callback(device_t dev) +{ + struct xen_hvm_param xhp; + int irq; + + xhp.domid = DOMID_SELF; + xhp.index = HVM_PARAM_CALLBACK_IRQ; + if (xen_feature(XENFEAT_hvm_callback_vector)) { + int error; + + xhp.value = HVM_CALLBACK_VECTOR(IDT_EVTCHN); + error = HYPERVISOR_hvm_op(HVMOP_set_param, &xhp); + if (error == 0) { + xen_vector_callback_enabled = 1; + return; + } + printf("Xen HVM callback vector registration failed (%d). " + "Falling back to emulated device interrupt\n", + error); + } + xen_vector_callback_enabled = 0; + if (dev == NULL) { + /* + * Called from early boot or resume. + * xenpci will invoke us again later. + */ + return; + } + + irq = pci_get_irq(dev); + if (irq < 16) { + xhp.value = HVM_CALLBACK_GSI(irq); + } else { + u_int slot; + u_int pin; + + slot = pci_get_slot(dev); + pin = pci_get_intpin(dev) - 1; + xhp.value = HVM_CALLBACK_PCI_INTX(slot, pin); + } + + if (HYPERVISOR_hvm_op(HVMOP_set_param, &xhp)) + panic("Can't set evtchn callback"); +} + +#define XEN_MAGIC_IOPORT 0x10 +enum { + XMI_MAGIC = 0x49d2, + XMI_UNPLUG_IDE_DISKS = 0x01, + XMI_UNPLUG_NICS = 0x02, + XMI_UNPLUG_IDE_EXCEPT_PRI_MASTER = 0x04 +}; + +static void +xen_hvm_disable_emulated_devices(void) +{ + if (inw(XEN_MAGIC_IOPORT) != XMI_MAGIC) + return; + + if (bootverbose) + printf("XEN: Disabling emulated block and network devices\n"); + outw(XEN_MAGIC_IOPORT, XMI_UNPLUG_IDE_DISKS|XMI_UNPLUG_NICS); +} + +void +xen_hvm_suspend(void) +{ +} + +void +xen_hvm_resume(void) +{ + xen_hvm_init_hypercall_stubs(); + xen_hvm_init_shared_info_page(); +} + +static void +xen_hvm_init(void *dummy __unused) +{ + if (xen_hvm_init_hypercall_stubs() != 0) + return; + + xen_domain_type = XEN_HVM_DOMAIN; + setup_xen_features(); + xen_hvm_init_shared_info_page(); + xen_hvm_set_callback(NULL); + xen_hvm_disable_emulated_devices(); +} + +void xen_hvm_init_cpu(void) +{ + int cpu = PCPU_GET(acpi_id); + struct vcpu_info *vcpu_info; + struct vcpu_register_vcpu_info info; + int rc; + + 
vcpu_info = DPCPU_PTR(vcpu_local_info); + info.mfn = vtophys(vcpu_info) >> PAGE_SHIFT; + info.offset = vtophys(vcpu_info) - trunc_page(vtophys(vcpu_info)); + + rc = HYPERVISOR_vcpu_op(VCPUOP_register_vcpu_info, cpu, &info); + if (rc) { + DPCPU_SET(vcpu_info, &HYPERVISOR_shared_info->vcpu_info[cpu]); + } else { + DPCPU_SET(vcpu_info, vcpu_info); + } +} + +SYSINIT(xen_hvm_init, SI_SUB_HYPERVISOR, SI_ORDER_FIRST, xen_hvm_init, NULL); +SYSINIT(xen_hvm_init_cpu, SI_SUB_INTR, SI_ORDER_FIRST, xen_hvm_init_cpu, NULL); Property changes on: sys/x86/xen/hvm.c ___________________________________________________________________ Added: svn:mime-type ## -0,0 +1 ## +text/plain \ No newline at end of property Added: svn:keywords ## -0,0 +1 ## +FreeBSD=%H \ No newline at end of property Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Index: sys/x86/xen/xen_intr.c =================================================================== --- sys/x86/xen/xen_intr.c (revision 0) +++ sys/x86/xen/xen_intr.c (working copy) @@ -0,0 +1,1126 @@ +/****************************************************************************** + * xen_intr.c + * + * Xen event and interrupt services for x86 PV and HVM guests. 
+ * + * Copyright (c) 2002-2005, K A Fraser + * Copyright (c) 2005, Intel Corporation + * Copyright (c) 2012, Spectra Logic Corporation + * + * This file may be distributed separately from the Linux kernel, or + * incorporated into other software packages, subject to the following license: + * + * Permission is hereby granted, free of charge, to any person obtaining a copy + * of this source file (the "Software"), to deal in the Software without + * restriction, including without limitation the rights to use, copy, modify, + * merge, publish, distribute, sublicense, and/or sell copies of the Software, + * and to permit persons to whom the Software is furnished to do so, subject to + * the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS + * IN THE SOFTWARE. + */ + +#include +__FBSDID("$FreeBSD$"); + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include + +#include +#include +#include +#include + +#include +#include +#include + +#include +#include +#include + +#include + +static MALLOC_DEFINE(M_XENINTR, "xen_intr", "Xen Interrupt Services"); + +/** + * Per-cpu event channel processing state. + */ +struct xen_intr_pcpu_data { + /** + * The last event channel bitmap section (level one bit) processed. + * This is used to ensure we scan all ports before + * servicing an already serviced port again. 
+ */ + u_int last_processed_l1i; + + /** + * The last event channel processed within the event channel + * bitmap being scanned. + */ + u_int last_processed_l2i; + + /** Pointer to this CPU's interrupt statistic counter. */ + u_long *evtchn_intrcnt; + + /** + * A bitmap of ports that can be serviced from this CPU. + * A set bit means interrupt handling is enabled. + */ + u_long evtchn_enabled[sizeof(u_long) * 8]; +}; + +/* + * Start the scan at port 0 by initializing the last scanned + * location as the highest numbered event channel port. + */ +DPCPU_DEFINE(struct xen_intr_pcpu_data, xen_intr_pcpu) = { + .last_processed_l1i = LONG_BIT - 1, + .last_processed_l2i = LONG_BIT - 1 +}; + +DPCPU_DECLARE(struct vcpu_info *, vcpu_info); + +#define is_valid_evtchn(x) ((x) != 0) + +struct xenisrc { + struct intsrc xi_intsrc; + enum evtchn_type xi_type; + int xi_cpu; /* VCPU for delivery. */ + int xi_vector; /* Global isrc vector number. */ + evtchn_port_t xi_port; + int xi_pirq; + int xi_virq; + u_int xi_close:1; /* close on unbind? */ + u_int xi_needs_eoi:1; + u_int xi_shared:1; /* Shared with other domains. 
*/ +}; + +#define ARRAY_SIZE(a) (sizeof(a) / sizeof(a[0])) + +static void xen_intr_suspend(struct pic *); +static void xen_intr_resume(struct pic *); +static void xen_intr_enable_source(struct intsrc *isrc); +static void xen_intr_disable_source(struct intsrc *isrc, int eoi); +static void xen_intr_eoi_source(struct intsrc *isrc); +static void xen_intr_enable_intr(struct intsrc *isrc); +static void xen_intr_disable_intr(struct intsrc *isrc); +static int xen_intr_vector(struct intsrc *isrc); +static int xen_intr_source_pending(struct intsrc *isrc); +static int xen_intr_config_intr(struct intsrc *isrc, + enum intr_trigger trig, enum intr_polarity pol); +static int xen_intr_assign_cpu(struct intsrc *isrc, u_int apic_id); + +static void xen_intr_pirq_enable_source(struct intsrc *isrc); +static void xen_intr_pirq_disable_source(struct intsrc *isrc, int eoi); +static void xen_intr_pirq_eoi_source(struct intsrc *isrc); +static void xen_intr_pirq_enable_intr(struct intsrc *isrc); + +/** + * PIC interface for all event channel port types except physical IRQs. + */ +struct pic xen_intr_pic = { + .pic_enable_source = xen_intr_enable_source, + .pic_disable_source = xen_intr_disable_source, + .pic_eoi_source = xen_intr_eoi_source, + .pic_enable_intr = xen_intr_enable_intr, + .pic_disable_intr = xen_intr_disable_intr, + .pic_vector = xen_intr_vector, + .pic_source_pending = xen_intr_source_pending, + .pic_suspend = xen_intr_suspend, + .pic_resume = xen_intr_resume, + .pic_config_intr = xen_intr_config_intr, + .pic_assign_cpu = xen_intr_assign_cpu +}; + +/** + * PIC interface for all event channel representing + * physical interrupt sources. 
+ */ +struct pic xen_intr_pirq_pic = { + .pic_enable_source = xen_intr_pirq_enable_source, + .pic_disable_source = xen_intr_pirq_disable_source, + .pic_eoi_source = xen_intr_pirq_eoi_source, + .pic_enable_intr = xen_intr_pirq_enable_intr, + .pic_disable_intr = xen_intr_disable_intr, + .pic_vector = xen_intr_vector, + .pic_source_pending = xen_intr_source_pending, + .pic_suspend = xen_intr_suspend, + .pic_resume = xen_intr_resume, + .pic_config_intr = xen_intr_config_intr, + .pic_assign_cpu = xen_intr_assign_cpu +}; + +static struct mtx xen_intr_isrc_lock; +static int xen_intr_isrc_count; +static struct xenisrc *xen_intr_port_to_isrc[NR_EVENT_CHANNELS]; + +/*------------------------- Private Functions --------------------------------*/ +/** + * Disable signal delivery for an event channel port on the + * specified CPU. + * + * \param port The event channel port to mask. + * + * This API is used to manage the port<=>CPU binding of event + * channel handlers. + * + * \note This operation does not preclude reception of an event + * for this event channel on another CPU. To mask the + * event channel globally, use evtchn_mask(). + */ +static inline void +evtchn_cpu_mask_port(u_int cpu, evtchn_port_t port) +{ + struct xen_intr_pcpu_data *pcpu; + + pcpu = DPCPU_ID_PTR(cpu, xen_intr_pcpu); + clear_bit(port, pcpu->evtchn_enabled); +} + +/** + * Enable signal delivery for an event channel port on the + * specified CPU. + * + * \param port The event channel port to unmask. + * + * This API is used to manage the port<=>CPU binding of event + * channel handlers. + * + * \note This operation does not guarantee that event delivery + * is enabled for this event channel port. The port must + * also be globally enabled. See evtchn_unmask(). 
+ */ +static inline void +evtchn_cpu_unmask_port(u_int cpu, evtchn_port_t port) +{ + struct xen_intr_pcpu_data *pcpu; + + pcpu = DPCPU_ID_PTR(cpu, xen_intr_pcpu); + set_bit(port, pcpu->evtchn_enabled); +} + +/** + * Allocate and register a per-cpu Xen upcall interrupt counter. + * + * \param cpu The cpu for which to register this interrupt count. + */ +static void +xen_intr_intrcnt_add(u_int cpu) +{ + char buf[MAXCOMLEN + 1]; + struct xen_intr_pcpu_data *pcpu; + + pcpu = DPCPU_ID_PTR(cpu, xen_intr_pcpu); + if (pcpu->evtchn_intrcnt != NULL) + return; + + snprintf(buf, sizeof(buf), "cpu%d:xen", cpu); + intrcnt_add(buf, &pcpu->evtchn_intrcnt); +} + +/** + * Search for an already allocated but currently unused Xen interrupt + * source object. + * + * \param type Restrict the search to interrupt sources of the given + * type. + * + * \return A pointer to a free Xen interrupt source object or NULL. + */ +static struct xenisrc * +xen_intr_find_unused_isrc(enum evtchn_type type) +{ + int isrc_idx; + + KASSERT(mtx_owned(&xen_intr_isrc_lock), ("Evtchn isrc lock not held")); + + for (isrc_idx = 0; isrc_idx < xen_intr_isrc_count; isrc_idx ++) { + struct xenisrc *isrc; + u_int vector; + + vector = FIRST_EVTCHN_INT + isrc_idx; + isrc = (struct xenisrc *)intr_lookup_source(vector); + if (isrc != NULL + && isrc->xi_type == EVTCHN_TYPE_UNBOUND) { + KASSERT(isrc->xi_intsrc.is_handlers == 0, + ("Free evtchn still has handlers")); + isrc->xi_type = type; + return (isrc); + } + } + return (NULL); +} + +/** + * Allocate a Xen interrupt source object. + * + * \param type The type of interrupt source to create. + * + * \return A pointer to a newly allocated Xen interrupt source + * object or NULL. 
+ */ +static struct xenisrc * +xen_intr_alloc_isrc(enum evtchn_type type) +{ + static int warned; + struct xenisrc *isrc; + int vector; + + KASSERT(mtx_owned(&xen_intr_isrc_lock), ("Evtchn alloc lock not held")); + + if (xen_intr_isrc_count > NR_EVENT_CHANNELS) { + if (!warned) { + warned = 1; + printf("xen_intr_alloc: Event channels exhausted.\n"); + } + return (NULL); + } + vector = FIRST_EVTCHN_INT + xen_intr_isrc_count; + xen_intr_isrc_count++; + + mtx_unlock(&xen_intr_isrc_lock); + isrc = malloc(sizeof(*isrc), M_XENINTR, M_WAITOK | M_ZERO); + isrc->xi_intsrc.is_pic = &xen_intr_pic; + isrc->xi_vector = vector; + isrc->xi_type = type; + intr_register_source(&isrc->xi_intsrc); + mtx_lock(&xen_intr_isrc_lock); + + return (isrc); +} + +/** + * Attempt to free an active Xen interrupt source object. + * + * \param isrc The interrupt source object to release. + * + * \returns EBUSY if the source is still in use, otherwise 0. + */ +static int +xen_intr_release_isrc(struct xenisrc *isrc) +{ + + mtx_lock(&xen_intr_isrc_lock); + if (isrc->xi_intsrc.is_handlers != 0) { + mtx_unlock(&xen_intr_isrc_lock); + return (EBUSY); + } + evtchn_mask_port(isrc->xi_port); + evtchn_clear_port(isrc->xi_port); + + /* Rebind port to CPU 0. */ + evtchn_cpu_mask_port(isrc->xi_cpu, isrc->xi_port); + evtchn_cpu_unmask_port(0, isrc->xi_port); + + if (isrc->xi_close != 0) { + struct evtchn_close close = { .port = isrc->xi_port }; + if (HYPERVISOR_event_channel_op(EVTCHNOP_close, &close)) + panic("EVTCHNOP_close failed"); + } + + xen_intr_port_to_isrc[isrc->xi_port] = NULL; + isrc->xi_cpu = 0; + isrc->xi_type = EVTCHN_TYPE_UNBOUND; + isrc->xi_port = 0; + mtx_unlock(&xen_intr_isrc_lock); + return (0); +} + +/** + * Associate an interrupt handler with an already allocated local Xen + * event channel port. + * + * \param isrcp The returned Xen interrupt object associated with + * the specified local port. + * \param local_port The event channel to bind. 
+ * \param type The event channel type of local_port. + * \param intr_owner The device making this bind request. + * \param filter An interrupt filter handler. Specify NULL + * to always dispatch to the ithread handler. + * \param handler An interrupt ithread handler. Optional (can + * specify NULL) if all necessary event actions + * are performed by filter. + * \param arg Argument to present to both filter and handler. + * \param flags Interrupt handler flags. See sys/bus.h. + * \param port_handlep Pointer to an opaque handle used to manage this + * registration. + * + * \returns 0 on success, otherwise an errno. + */ +static int +xen_intr_bind_isrc(struct xenisrc **isrcp, evtchn_port_t local_port, + enum evtchn_type type, device_t intr_owner, driver_filter_t filter, + driver_intr_t handler, void *arg, enum intr_type flags, + xen_intr_handle_t *port_handlep) +{ + struct xenisrc *isrc; + int error; + + *isrcp = NULL; + if (port_handlep == NULL) { + device_printf(intr_owner, + "xen_intr_bind_isrc: Bad event handle\n"); + return (EINVAL); + } + + mtx_lock(&xen_intr_isrc_lock); + isrc = xen_intr_find_unused_isrc(type); + if (isrc == NULL) { + isrc = xen_intr_alloc_isrc(type); + if (isrc == NULL) { + mtx_unlock(&xen_intr_isrc_lock); + return (ENOSPC); + } + } + isrc->xi_port = local_port; + xen_intr_port_to_isrc[local_port] = isrc; + mtx_unlock(&xen_intr_isrc_lock); + + error = intr_add_handler(device_get_nameunit(intr_owner), + isrc->xi_vector, filter, handler, arg, + flags|INTR_EXCL, port_handlep); + if (error != 0) { + device_printf(intr_owner, + "xen_intr_bind_isrc: intr_add_handler failed\n"); + xen_intr_release_isrc(isrc); + return (error); + } + *isrcp = isrc; + return (0); +} + +/** + * Look up a Xen interrupt source object given an interrupt binding handle. + * + * \param handle A handle initialized by a previous call to + * xen_intr_bind_isrc(). + * + * \returns A pointer to the Xen interrupt source object associated + * with the given interrupt handle. 
NULL if no association + * currently exists. + */ +static struct xenisrc * +xen_intr_isrc(xen_intr_handle_t handle) +{ + struct intr_handler *ih; + + ih = handle; + if (ih == NULL || ih->ih_event == NULL) + return (NULL); + + return (ih->ih_event->ie_source); +} + +/** + * Determine the event channel ports at the given section of the + * event port bitmap which have pending events for the given cpu. + * + * \param pcpu The Xen interrupt pcpu data for the cpu being queried. + * \param sh The Xen shared info area. + * \param idx The index of the section of the event channel bitmap to + * inspect. + * + * \returns A u_long with bits set for every event channel with pending + * events. + */ +static inline u_long +xen_intr_active_ports(struct xen_intr_pcpu_data *pcpu, shared_info_t *sh, + u_int idx) +{ + return (sh->evtchn_pending[idx] + & ~sh->evtchn_mask[idx] + & pcpu->evtchn_enabled[idx]); +} + +/** + * Interrupt handler for processing all Xen event channel events. + * + * \param trap_frame The trap frame context for the current interrupt. + */ +void +xen_intr_handle_upcall(struct trapframe *trap_frame) +{ + u_int l1i, l2i, port, cpu; + u_long masked_l1, masked_l2; + struct xenisrc *isrc; + shared_info_t *s; + vcpu_info_t *v; + struct xen_intr_pcpu_data *pc; + u_long l1, l2; + + /* + * Disable preemption in order to always check and fire events + * on the right vCPU + */ + critical_enter(); + + cpu = PCPU_GET(cpuid); + pc = DPCPU_PTR(xen_intr_pcpu); + s = HYPERVISOR_shared_info; + v = DPCPU_GET(vcpu_info); + + if (xen_hvm_domain() && !xen_vector_callback_enabled) { + KASSERT((cpu == 0), ("Fired PCI event callback on wrong CPU")); + } + + v->evtchn_upcall_pending = 0; + +#if 0 +#ifndef CONFIG_X86 /* No need for a barrier -- XCHG is a barrier on x86. */ + /* Clear master flag /before/ clearing selector flag. 
*/ + wmb(); +#endif +#endif + + l1 = atomic_readandclear_long(&v->evtchn_pending_sel); + + l1i = pc->last_processed_l1i; + l2i = pc->last_processed_l2i; + (*pc->evtchn_intrcnt)++; + + while (l1 != 0) { + + l1i = (l1i + 1) % LONG_BIT; + masked_l1 = l1 & ((~0UL) << l1i); + + if (masked_l1 == 0) { + /* + * if we masked out all events, wrap around + * to the beginning. + */ + l1i = LONG_BIT - 1; + l2i = LONG_BIT - 1; + continue; + } + l1i = ffsl(masked_l1) - 1; + + do { + l2 = xen_intr_active_ports(pc, s, l1i); + + l2i = (l2i + 1) % LONG_BIT; + masked_l2 = l2 & ((~0UL) << l2i); + + if (masked_l2 == 0) { + /* if we masked out all events, move on */ + l2i = LONG_BIT - 1; + break; + } + l2i = ffsl(masked_l2) - 1; + + /* process port */ + port = (l1i * LONG_BIT) + l2i; + synch_clear_bit(port, &s->evtchn_pending[0]); + + isrc = xen_intr_port_to_isrc[port]; + if (__predict_false(isrc == NULL)) + continue; + + /* Make sure we are firing on the right vCPU */ + KASSERT((isrc->xi_cpu == PCPU_GET(cpuid)), + ("Received unexpected event on vCPU#%d, event bound to vCPU#%d", + PCPU_GET(cpuid), isrc->xi_cpu)); + + intr_execute_handlers(&isrc->xi_intsrc, trap_frame); + + /* + * If this is the final port processed, + * we'll pick up here+1 next time. + */ + pc->last_processed_l1i = l1i; + pc->last_processed_l2i = l2i; + + } while (l2i != LONG_BIT - 1); + + l2 = xen_intr_active_ports(pc, s, l1i); + if (l2 == 0) { + /* + * We handled all ports, so we can clear the + * selector bit. + */ + l1 &= ~(1UL << l1i); + } + } + critical_exit(); +} + +static int +xen_intr_init(void *dummy __unused) +{ + struct xen_intr_pcpu_data *pcpu; + int i; + + mtx_init(&xen_intr_isrc_lock, "xen-irq-lock", NULL, MTX_DEF); + + /* + * Register interrupt count manually as we aren't + * guaranteed to see a call to xen_intr_assign_cpu() + * before our first interrupt. Also set the per-cpu + * mask of CPU#0 to enable all, since by default + * all event channels are bound to CPU#0. 
+ */ + CPU_FOREACH(i) { + pcpu = DPCPU_ID_PTR(i, xen_intr_pcpu); + memset(pcpu->evtchn_enabled, i == 0 ? ~0 : 0, + sizeof(pcpu->evtchn_enabled)); + xen_intr_intrcnt_add(i); + } + + intr_register_pic(&xen_intr_pic); + + return (0); +} +SYSINIT(xen_intr_init, SI_SUB_INTR, SI_ORDER_MIDDLE, xen_intr_init, NULL); + +/*--------------------------- Common PIC Functions ---------------------------*/ +/** + * Prepare this PIC for system suspension. + */ +static void +xen_intr_suspend(struct pic *unused) +{ +} + +/** + * Return this PIC to service after being suspended. + */ +static void +xen_intr_resume(struct pic *unused) +{ + u_int port; + + /* + * Mask events for all ports. They will be unmasked after + * drivers have re-registered their handlers. + */ + for (port = 0; port < NR_EVENT_CHANNELS; port++) + evtchn_mask_port(port); +} + +/** + * Disable a Xen interrupt source. + * + * \param isrc The interrupt source to disable. + */ +static void +xen_intr_disable_intr(struct intsrc *base_isrc) +{ + struct xenisrc *isrc = (struct xenisrc *)base_isrc; + + evtchn_mask_port(isrc->xi_port); +} + +/** + * Determine the global interrupt vector number for + * a Xen interrupt source. + * + * \param isrc The interrupt source to query. + * + * \return The vector number corresponding to the given interrupt source. + */ +static int +xen_intr_vector(struct intsrc *base_isrc) +{ + struct xenisrc *isrc = (struct xenisrc *)base_isrc; + + return (isrc->xi_vector); +} + +/** + * Determine whether or not interrupt events are pending on + * the given interrupt source. + * + * \param isrc The interrupt source to query. + * + * \returns 0 if no events are pending, otherwise non-zero. + */ +static int +xen_intr_source_pending(struct intsrc *isrc) +{ + /* + * EventChannels are edge triggered and never masked. + * There can be no pending events. + */ + return (0); +} + +/** + * Perform configuration of an interrupt source. + * + * \param isrc The interrupt source to configure. 
+ * \param trig Edge or level. + * \param pol Active high or low. + * + * \returns 0 if successful, otherwise an errno. + */ +static int +xen_intr_config_intr(struct intsrc *isrc, enum intr_trigger trig, + enum intr_polarity pol) +{ + /* Configuration is only possible via the evtchn apis. */ + return (ENODEV); +} + +/** + * Configure CPU affinity for interrupt source event delivery. + * + * \param isrc The interrupt source to configure. + * \param apic_id The apic id of the CPU for handling future events. + * + * \returns 0 if successful, otherwise an errno. + */ +static int +xen_intr_assign_cpu(struct intsrc *base_isrc, u_int apic_id) +{ + struct evtchn_bind_vcpu bind_vcpu; + struct xenisrc *isrc; + u_int to_cpu, acpi_id; + int error; + +#ifdef XENHVM + if (xen_vector_callback_enabled == 0) + return (EOPNOTSUPP); +#endif + + to_cpu = apic_cpuid(apic_id); + acpi_id = pcpu_find(to_cpu)->pc_acpi_id; + xen_intr_intrcnt_add(to_cpu); + + mtx_lock(&xen_intr_isrc_lock); + isrc = (struct xenisrc *)base_isrc; + if (!is_valid_evtchn(isrc->xi_port)) { + mtx_unlock(&xen_intr_isrc_lock); + return (EINVAL); + } + + if ((isrc->xi_type == EVTCHN_TYPE_VIRQ) || + (isrc->xi_type == EVTCHN_TYPE_IPI)) { + /* + * Virtual IRQs are associated with a cpu by + * the Hypervisor at evtchn_bind_virq time, so + * all we need to do is update the per-CPU masks. + */ + evtchn_cpu_mask_port(isrc->xi_cpu, isrc->xi_port); + isrc->xi_cpu = to_cpu; + evtchn_cpu_unmask_port(isrc->xi_cpu, isrc->xi_port); + mtx_unlock(&xen_intr_isrc_lock); + return (0); + } + + bind_vcpu.port = isrc->xi_port; + bind_vcpu.vcpu = acpi_id; + + /* + * Allow interrupts to be fielded on the new VCPU before + * we ask the hypervisor to deliver them there. + */ + evtchn_cpu_unmask_port(to_cpu, isrc->xi_port); + error = HYPERVISOR_event_channel_op(EVTCHNOP_bind_vcpu, &bind_vcpu); + if (isrc->xi_cpu != to_cpu) { + if (error == 0) { + /* Commit to new binding by removing the old one. 
*/ + evtchn_cpu_mask_port(isrc->xi_cpu, isrc->xi_port); + isrc->xi_cpu = to_cpu; + } else { + /* Roll-back to previous binding. */ + evtchn_cpu_mask_port(to_cpu, isrc->xi_port); + } + } + mtx_unlock(&xen_intr_isrc_lock); + return (0); +} + +/*------------------- Virtual Interrupt Source PIC Functions -----------------*/ +/* + * Mask a level triggered interrupt source. + * + * \param isrc The interrupt source to mask (if necessary). + * \param eoi If non-zero, perform any necessary end-of-interrupt + * acknowledgements. + */ +static void +xen_intr_disable_source(struct intsrc *isrc, int eoi) +{ +} + +/* + * Unmask a level triggered interrupt source. + * + * \param isrc The interrupt source to unmask (if necessary). + */ +static void +xen_intr_enable_source(struct intsrc *isrc) +{ +} + +/* + * Perform any necessary end-of-interrupt acknowledgements. + * + * \param isrc The interrupt source to EOI. + */ +static void +xen_intr_eoi_source(struct intsrc *isrc) +{ +} + +/* + * Enable and unmask the interrupt source. + * + * \param isrc The interrupt source to enable. + */ +static void +xen_intr_enable_intr(struct intsrc *base_isrc) +{ + struct xenisrc *isrc = (struct xenisrc *)base_isrc; + + evtchn_unmask_port(isrc->xi_port); +} + +/*------------------ Physical Interrupt Source PIC Functions -----------------*/ +/* + * Mask a level triggered interrupt source. + * + * \param isrc The interrupt source to mask (if necessary). + * \param eoi If non-zero, perform any necessary end-of-interrupt + * acknowledgements. + */ +static void +xen_intr_pirq_disable_source(struct intsrc *base_isrc, int eoi) +{ + struct xenisrc *isrc; + + isrc = (struct xenisrc *)base_isrc; + evtchn_mask_port(isrc->xi_port); +} + +/* + * Unmask a level triggered interrupt source. + * + * \param isrc The interrupt source to unmask (if necessary). 
+ */ +static void +xen_intr_pirq_enable_source(struct intsrc *base_isrc) +{ + struct xenisrc *isrc; + + isrc = (struct xenisrc *)base_isrc; + evtchn_unmask_port(isrc->xi_port); +} + +/* + * Perform any necessary end-of-interrupt acknowledgements. + * + * \param isrc The interrupt source to EOI. + */ +static void +xen_intr_pirq_eoi_source(struct intsrc *base_isrc) +{ + struct xenisrc *isrc; + + /* XXX Use shared page of flags for this. */ + isrc = (struct xenisrc *)base_isrc; + if (isrc->xi_needs_eoi != 0) { + struct physdev_eoi eoi = { .irq = isrc->xi_pirq }; + + (void)HYPERVISOR_physdev_op(PHYSDEVOP_eoi, &eoi); + } +} + +/* + * Enable and unmask the interrupt source. + * + * \param isrc The interrupt source to enable. + */ +static void +xen_intr_pirq_enable_intr(struct intsrc *isrc) +{ +} + +/*--------------------------- Public Functions -------------------------------*/ +/*------- API comments for these methods can be found in xen/xenintr.h -------*/ +int +xen_intr_bind_local_port(device_t dev, evtchn_port_t local_port, + driver_filter_t filter, driver_intr_t handler, void *arg, + enum intr_type flags, xen_intr_handle_t *port_handlep) +{ + struct xenisrc *isrc; + int error; + + error = xen_intr_bind_isrc(&isrc, local_port, EVTCHN_TYPE_PORT, dev, + filter, handler, arg, flags, port_handlep); + if (error != 0) + return (error); + + /* + * The Event Channel API didn't open this port, so it is not + * responsible for closing it automatically on unbind. 
+ */ + isrc->xi_close = 0; + return (0); +} + +int +xen_intr_alloc_and_bind_local_port(device_t dev, u_int remote_domain, + driver_filter_t filter, driver_intr_t handler, void *arg, + enum intr_type flags, xen_intr_handle_t *port_handlep) +{ + struct xenisrc *isrc; + struct evtchn_alloc_unbound alloc_unbound; + int error; + + alloc_unbound.dom = DOMID_SELF; + alloc_unbound.remote_dom = remote_domain; + error = HYPERVISOR_event_channel_op(EVTCHNOP_alloc_unbound, + &alloc_unbound); + if (error != 0) { + /* + * XXX Trap Hypercall error code Linuxisms in + * the HYPERCALL layer. + */ + return (-error); + } + + error = xen_intr_bind_isrc(&isrc, alloc_unbound.port, EVTCHN_TYPE_PORT, + dev, filter, handler, arg, flags, + port_handlep); + if (error != 0) { + evtchn_close_t close = { .port = alloc_unbound.port }; + if (HYPERVISOR_event_channel_op(EVTCHNOP_close, &close)) + panic("EVTCHNOP_close failed"); + return (error); + } + + isrc->xi_close = 1; + return (0); +} + +int +xen_intr_bind_remote_port(device_t dev, u_int remote_domain, + u_int remote_port, driver_filter_t filter, driver_intr_t handler, + void *arg, enum intr_type flags, xen_intr_handle_t *port_handlep) +{ + struct xenisrc *isrc; + struct evtchn_bind_interdomain bind_interdomain; + int error; + + bind_interdomain.remote_dom = remote_domain; + bind_interdomain.remote_port = remote_port; + error = HYPERVISOR_event_channel_op(EVTCHNOP_bind_interdomain, + &bind_interdomain); + if (error != 0) { + /* + * XXX Trap Hypercall error code Linuxisms in + * the HYPERCALL layer. 
+ */ + return (-error); + } + + error = xen_intr_bind_isrc(&isrc, bind_interdomain.local_port, + EVTCHN_TYPE_PORT, dev, filter, handler, + arg, flags, port_handlep); + if (error) { + evtchn_close_t close = { .port = bind_interdomain.local_port }; + if (HYPERVISOR_event_channel_op(EVTCHNOP_close, &close)) + panic("EVTCHNOP_close failed"); + return (error); + } + + /* + * The Event Channel API opened this port, so it is + * responsible for closing it automatically on unbind. + */ + isrc->xi_close = 1; + return (0); +} + +int +xen_intr_bind_virq(device_t dev, u_int virq, u_int cpu, + driver_filter_t filter, driver_intr_t handler, void *arg, + enum intr_type flags, xen_intr_handle_t *port_handlep) +{ + int acpi_id = pcpu_find(cpu)->pc_acpi_id; + struct xenisrc *isrc; + struct evtchn_bind_virq bind_virq = { .virq = virq, .vcpu = acpi_id }; + int error; + + /* Ensure the target CPU is ready to handle evtchn interrupts. */ + xen_intr_intrcnt_add(cpu); + + isrc = NULL; + error = HYPERVISOR_event_channel_op(EVTCHNOP_bind_virq, &bind_virq); + if (error != 0) { + /* + * XXX Trap Hypercall error code Linuxisms in + * the HYPERCALL layer. + */ + return (-error); + } + + error = xen_intr_bind_isrc(&isrc, bind_virq.port, EVTCHN_TYPE_VIRQ, dev, + filter, handler, arg, flags, port_handlep); + if (error == 0) + error = intr_event_bind(isrc->xi_intsrc.is_event, cpu); + + if (error != 0) { + evtchn_close_t close = { .port = bind_virq.port }; + + xen_intr_unbind(*port_handlep); + if (HYPERVISOR_event_channel_op(EVTCHNOP_close, &close)) + panic("EVTCHNOP_close failed"); + return (error); + } + + if (isrc->xi_cpu != cpu) { + /* + * Too early in the boot process for the generic interrupt + * code to perform the binding. Update our event channel + * masks manually so events can't fire on the wrong cpu + * during AP startup. 
+ */ + xen_intr_assign_cpu(&isrc->xi_intsrc, cpu_apic_ids[cpu]); + } + + /* + * The Event Channel API opened this port, so it is + * responsible for closing it automatically on unbind. + */ + isrc->xi_close = 1; + return (0); +} + +int +xen_intr_bind_ipi(device_t dev, u_int ipi, u_int cpu, + driver_filter_t filter, enum intr_type flags, + xen_intr_handle_t *port_handlep) +{ + int acpi_id = pcpu_find(cpu)->pc_acpi_id; + struct xenisrc *isrc; + struct evtchn_bind_ipi bind_ipi = { .vcpu = acpi_id }; + int error; + + /* Ensure the target CPU is ready to handle evtchn interrupts. */ + xen_intr_intrcnt_add(cpu); + + isrc = NULL; + error = HYPERVISOR_event_channel_op(EVTCHNOP_bind_ipi, &bind_ipi); + if (error != 0) { + /* + * XXX Trap Hypercall error code Linuxisms in + * the HYPERCALL layer. + */ + return (-error); + } + + error = xen_intr_bind_isrc(&isrc, bind_ipi.port, EVTCHN_TYPE_IPI, + dev, filter, NULL, NULL, flags, + port_handlep); + if (error == 0) + error = intr_event_bind(isrc->xi_intsrc.is_event, cpu); + + if (error != 0) { + evtchn_close_t close = { .port = bind_ipi.port }; + + xen_intr_unbind(*port_handlep); + if (HYPERVISOR_event_channel_op(EVTCHNOP_close, &close)) + panic("EVTCHNOP_close failed"); + return (error); + } + + if (isrc->xi_cpu != cpu) { + /* + * Too early in the boot process for the generic interrupt + * code to perform the binding. Update our event channel + * masks manually so events can't fire on the wrong cpu + * during AP startup. + */ + xen_intr_assign_cpu(&isrc->xi_intsrc, cpu_apic_ids[cpu]); + } + + /* + * The Event Channel API opened this port, so it is + * responsible for closing it automatically on unbind. + */ + isrc->xi_close = 1; + return (0); +} + +int +xen_intr_describe(xen_intr_handle_t port_handle, const char *fmt, ...) 
+{ + char descr[MAXCOMLEN + 1]; + struct xenisrc *isrc; + va_list ap; + + isrc = xen_intr_isrc(port_handle); + if (isrc == NULL) + return (EINVAL); + + va_start(ap, fmt); + vsnprintf(descr, sizeof(descr), fmt, ap); + va_end(ap); + return (intr_describe(isrc->xi_vector, port_handle, descr)); +} + +void +xen_intr_unbind(xen_intr_handle_t *port_handlep) +{ + struct intr_handler *handler; + struct xenisrc *isrc; + + handler = *port_handlep; + *port_handlep = NULL; + isrc = xen_intr_isrc(handler); + if (isrc == NULL) + return; + + intr_remove_handler(handler); + xen_intr_release_isrc(isrc); +} + +void +xen_intr_signal(xen_intr_handle_t handle) +{ + struct xenisrc *isrc; + + isrc = xen_intr_isrc(handle); + if (isrc != NULL) { + KASSERT(isrc->xi_type == EVTCHN_TYPE_PORT || + isrc->xi_type == EVTCHN_TYPE_IPI, + ("evtchn_signal on something other than a local port")); + struct evtchn_send send = { .port = isrc->xi_port }; + (void)HYPERVISOR_event_channel_op(EVTCHNOP_send, &send); + } +} + +evtchn_port_t +xen_intr_port(xen_intr_handle_t handle) +{ + struct xenisrc *isrc; + + isrc = xen_intr_isrc(handle); + if (isrc == NULL) + return (0); + + return (isrc->xi_port); +} Property changes on: sys/x86/xen/xen_intr.c ___________________________________________________________________ Added: svn:mime-type ## -0,0 +1 ## +text/plain \ No newline at end of property Added: svn:keywords ## -0,0 +1 ## +FreeBSD=%H \ No newline at end of property Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Index: sys/xen/evtchn/evtchn.c =================================================================== --- sys/xen/evtchn/evtchn.c (revision 255014) +++ sys/xen/evtchn/evtchn.c (working copy) @@ -1,1141 +0,0 @@ -/****************************************************************************** - * evtchn.c - * - * Communication via Xen event channels. 
- * - * Copyright (c) 2002-2005, K A Fraser - * Copyright (c) 2005-2006 Kip Macy - */ - -#include -__FBSDID("$FreeBSD$"); - -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include - -#include -#include - -#include -#include -#include -#include -#include -#include -#include - -#include -#include - -static inline unsigned long __ffs(unsigned long word) -{ - __asm__("bsfl %1,%0" - :"=r" (word) - :"rm" (word)); - return word; -} - -/* - * irq_mapping_update_lock: in order to allow an interrupt to occur in a critical - * section, to set pcpu->ipending (etc...) properly, we - * must be able to get the icu lock, so it can't be - * under witness. - */ -static struct mtx irq_mapping_update_lock; -MTX_SYSINIT(irq_mapping_update_lock, &irq_mapping_update_lock, "xp", MTX_SPIN); - -static struct xenpic *xp; -struct xenpic_intsrc { - struct intsrc xp_intsrc; - void *xp_cookie; - uint8_t xp_vector; - boolean_t xp_masked; -}; - -struct xenpic { - struct pic *xp_dynirq_pic; - struct pic *xp_pirq_pic; - uint16_t xp_numintr; - struct xenpic_intsrc xp_pins[0]; -}; - -#define TODO printf("%s: not implemented!\n", __func__) - -/* IRQ <-> event-channel mappings. */ -static int evtchn_to_irq[NR_EVENT_CHANNELS]; - -/* Packed IRQ information: binding type, sub-type index, and event channel. */ -static uint32_t irq_info[NR_IRQS]; -/* Binding types. */ -enum { - IRQT_UNBOUND, - IRQT_PIRQ, - IRQT_VIRQ, - IRQT_IPI, - IRQT_LOCAL_PORT, - IRQT_CALLER_PORT, - _IRQT_COUNT - -}; - - -#define _IRQT_BITS 4 -#define _EVTCHN_BITS 12 -#define _INDEX_BITS (32 - _IRQT_BITS - _EVTCHN_BITS) - -/* Constructor for packed IRQ information. */ -static inline uint32_t -mk_irq_info(uint32_t type, uint32_t index, uint32_t evtchn) -{ - - return ((type << (32 - _IRQT_BITS)) | (index << _EVTCHN_BITS) | evtchn); -} - -/* Constructor for packed IRQ information. */ - -/* Convenient shorthand for packed representation of an unbound IRQ. 
*/ -#define IRQ_UNBOUND mk_irq_info(IRQT_UNBOUND, 0, 0) - -/* - * Accessors for packed IRQ information. - */ - -static inline unsigned int evtchn_from_irq(int irq) -{ - return irq_info[irq] & ((1U << _EVTCHN_BITS) - 1); -} - -static inline unsigned int index_from_irq(int irq) -{ - return (irq_info[irq] >> _EVTCHN_BITS) & ((1U << _INDEX_BITS) - 1); -} - -static inline unsigned int type_from_irq(int irq) -{ - return irq_info[irq] >> (32 - _IRQT_BITS); -} - - -/* IRQ <-> VIRQ mapping. */ - -/* IRQ <-> IPI mapping. */ -#ifndef NR_IPIS -#ifdef SMP -#error "NR_IPIS not defined" -#endif -#define NR_IPIS 1 -#endif - -/* Bitmap indicating which PIRQs require Xen to be notified on unmask. */ -static unsigned long pirq_needs_unmask_notify[NR_PIRQS/sizeof(unsigned long)]; - -/* Reference counts for bindings to IRQs. */ -static int irq_bindcount[NR_IRQS]; - -#define VALID_EVTCHN(_chn) ((_chn) != 0) - -#ifdef SMP - -static uint8_t cpu_evtchn[NR_EVENT_CHANNELS]; -static unsigned long cpu_evtchn_mask[XEN_LEGACY_MAX_VCPUS][NR_EVENT_CHANNELS/LONG_BIT]; - -#define active_evtchns(cpu,sh,idx) \ - ((sh)->evtchn_pending[idx] & \ - cpu_evtchn_mask[cpu][idx] & \ - ~(sh)->evtchn_mask[idx]) - -static void bind_evtchn_to_cpu(unsigned int chn, unsigned int cpu) -{ - clear_bit(chn, (unsigned long *)cpu_evtchn_mask[cpu_evtchn[chn]]); - set_bit(chn, (unsigned long *)cpu_evtchn_mask[cpu]); - cpu_evtchn[chn] = cpu; -} - -static void init_evtchn_cpu_bindings(void) -{ - /* By default all event channels notify CPU#0. 
*/ - memset(cpu_evtchn, 0, sizeof(cpu_evtchn)); - memset(cpu_evtchn_mask[0], ~0, sizeof(cpu_evtchn_mask[0])); -} - -#define cpu_from_evtchn(evtchn) (cpu_evtchn[evtchn]) - -#else - -#define active_evtchns(cpu,sh,idx) \ - ((sh)->evtchn_pending[idx] & \ - ~(sh)->evtchn_mask[idx]) -#define bind_evtchn_to_cpu(chn,cpu) ((void)0) -#define init_evtchn_cpu_bindings() ((void)0) -#define cpu_from_evtchn(evtchn) (0) - -#endif - - -/* - * Force a proper event-channel callback from Xen after clearing the - * callback mask. We do this in a very simple manner, by making a call - * down into Xen. The pending flag will be checked by Xen on return. - */ -void force_evtchn_callback(void) -{ - (void)HYPERVISOR_xen_version(0, NULL); -} - -void -evtchn_do_upcall(struct trapframe *frame) -{ - unsigned long l1, l2; - unsigned int l1i, l2i, port; - int irq, cpu; - shared_info_t *s; - vcpu_info_t *vcpu_info; - - cpu = PCPU_GET(cpuid); - s = HYPERVISOR_shared_info; - vcpu_info = &s->vcpu_info[cpu]; - - vcpu_info->evtchn_upcall_pending = 0; - - /* NB. No need for a barrier here -- XCHG is a barrier on x86. */ - l1 = xen_xchg(&vcpu_info->evtchn_pending_sel, 0); - - while (l1 != 0) { - l1i = __ffs(l1); - l1 &= ~(1 << l1i); - - while ((l2 = active_evtchns(cpu, s, l1i)) != 0) { - l2i = __ffs(l2); - - port = (l1i * LONG_BIT) + l2i; - if ((irq = evtchn_to_irq[port]) != -1) { - struct intsrc *isrc = intr_lookup_source(irq); - /* - * ack - */ - mask_evtchn(port); - clear_evtchn(port); - - intr_execute_handlers(isrc, frame); - } else { - evtchn_device_upcall(port); - } - } - } -} - -/* - * Send an IPI from the current CPU to the destination CPU. 
- */ -void -ipi_pcpu(unsigned int cpu, int vector) -{ - int irq; - - irq = pcpu_find(cpu)->pc_ipi_to_irq[vector]; - - notify_remote_via_irq(irq); -} - -static int -find_unbound_irq(void) -{ - int dynirq, irq; - - for (dynirq = 0; dynirq < NR_IRQS; dynirq++) { - irq = dynirq_to_irq(dynirq); - if (irq_bindcount[irq] == 0) - break; - } - - if (irq == NR_IRQS) - panic("No available IRQ to bind to: increase NR_IRQS!\n"); - - return (irq); -} - -static int -bind_caller_port_to_irq(unsigned int caller_port, int * port) -{ - int irq; - - mtx_lock_spin(&irq_mapping_update_lock); - - if ((irq = evtchn_to_irq[caller_port]) == -1) { - if ((irq = find_unbound_irq()) < 0) - goto out; - - evtchn_to_irq[caller_port] = irq; - irq_info[irq] = mk_irq_info(IRQT_CALLER_PORT, 0, caller_port); - } - - irq_bindcount[irq]++; - *port = caller_port; - - out: - mtx_unlock_spin(&irq_mapping_update_lock); - return irq; -} - -static int -bind_local_port_to_irq(unsigned int local_port, int * port) -{ - int irq; - - mtx_lock_spin(&irq_mapping_update_lock); - - KASSERT(evtchn_to_irq[local_port] == -1, - ("evtchn_to_irq inconsistent")); - - if ((irq = find_unbound_irq()) < 0) { - struct evtchn_close close = { .port = local_port }; - HYPERVISOR_event_channel_op(EVTCHNOP_close, &close); - - goto out; - } - - evtchn_to_irq[local_port] = irq; - irq_info[irq] = mk_irq_info(IRQT_LOCAL_PORT, 0, local_port); - irq_bindcount[irq]++; - *port = local_port; - - out: - mtx_unlock_spin(&irq_mapping_update_lock); - return irq; -} - -static int -bind_listening_port_to_irq(unsigned int remote_domain, int * port) -{ - struct evtchn_alloc_unbound alloc_unbound; - int err; - - alloc_unbound.dom = DOMID_SELF; - alloc_unbound.remote_dom = remote_domain; - - err = HYPERVISOR_event_channel_op(EVTCHNOP_alloc_unbound, - &alloc_unbound); - - return err ? 
: bind_local_port_to_irq(alloc_unbound.port, port); -} - -static int -bind_interdomain_evtchn_to_irq(unsigned int remote_domain, - unsigned int remote_port, int * port) -{ - struct evtchn_bind_interdomain bind_interdomain; - int err; - - bind_interdomain.remote_dom = remote_domain; - bind_interdomain.remote_port = remote_port; - - err = HYPERVISOR_event_channel_op(EVTCHNOP_bind_interdomain, - &bind_interdomain); - - return err ? : bind_local_port_to_irq(bind_interdomain.local_port, port); -} - -static int -bind_virq_to_irq(unsigned int virq, unsigned int cpu, int * port) -{ - struct evtchn_bind_virq bind_virq; - int evtchn = 0, irq; - - mtx_lock_spin(&irq_mapping_update_lock); - - if ((irq = pcpu_find(cpu)->pc_virq_to_irq[virq]) == -1) { - if ((irq = find_unbound_irq()) < 0) - goto out; - - bind_virq.virq = virq; - bind_virq.vcpu = cpu; - HYPERVISOR_event_channel_op(EVTCHNOP_bind_virq, &bind_virq); - - evtchn = bind_virq.port; - - evtchn_to_irq[evtchn] = irq; - irq_info[irq] = mk_irq_info(IRQT_VIRQ, virq, evtchn); - - pcpu_find(cpu)->pc_virq_to_irq[virq] = irq; - - bind_evtchn_to_cpu(evtchn, cpu); - } - - irq_bindcount[irq]++; - *port = evtchn; -out: - mtx_unlock_spin(&irq_mapping_update_lock); - - return irq; -} - - -static int -bind_ipi_to_irq(unsigned int ipi, unsigned int cpu, int * port) -{ - struct evtchn_bind_ipi bind_ipi; - int irq; - int evtchn = 0; - - mtx_lock_spin(&irq_mapping_update_lock); - - if ((irq = pcpu_find(cpu)->pc_ipi_to_irq[ipi]) == -1) { - if ((irq = find_unbound_irq()) < 0) - goto out; - - bind_ipi.vcpu = cpu; - HYPERVISOR_event_channel_op(EVTCHNOP_bind_ipi, &bind_ipi); - evtchn = bind_ipi.port; - - evtchn_to_irq[evtchn] = irq; - irq_info[irq] = mk_irq_info(IRQT_IPI, ipi, evtchn); - - pcpu_find(cpu)->pc_ipi_to_irq[ipi] = irq; - - bind_evtchn_to_cpu(evtchn, cpu); - } - irq_bindcount[irq]++; - *port = evtchn; -out: - - mtx_unlock_spin(&irq_mapping_update_lock); - - return irq; -} - - -static void -unbind_from_irq(int irq) -{ - struct 
evtchn_close close; - int evtchn = evtchn_from_irq(irq); - int cpu; - - mtx_lock_spin(&irq_mapping_update_lock); - - if ((--irq_bindcount[irq] == 0) && VALID_EVTCHN(evtchn)) { - close.port = evtchn; - HYPERVISOR_event_channel_op(EVTCHNOP_close, &close); - - switch (type_from_irq(irq)) { - case IRQT_VIRQ: - cpu = cpu_from_evtchn(evtchn); - pcpu_find(cpu)->pc_virq_to_irq[index_from_irq(irq)] = -1; - break; - case IRQT_IPI: - cpu = cpu_from_evtchn(evtchn); - pcpu_find(cpu)->pc_ipi_to_irq[index_from_irq(irq)] = -1; - break; - default: - break; - } - - /* Closed ports are implicitly re-bound to VCPU0. */ - bind_evtchn_to_cpu(evtchn, 0); - - evtchn_to_irq[evtchn] = -1; - irq_info[irq] = IRQ_UNBOUND; - } - - mtx_unlock_spin(&irq_mapping_update_lock); -} - -int -bind_caller_port_to_irqhandler(unsigned int caller_port, - const char *devname, driver_intr_t handler, void *arg, - unsigned long irqflags, unsigned int *irqp) -{ - unsigned int irq; - int port = -1; - int error; - - irq = bind_caller_port_to_irq(caller_port, &port); - intr_register_source(&xp->xp_pins[irq].xp_intsrc); - error = intr_add_handler(devname, irq, NULL, handler, arg, irqflags, - &xp->xp_pins[irq].xp_cookie); - - if (error) { - unbind_from_irq(irq); - return (error); - } - if (port != -1) - unmask_evtchn(port); - - if (irqp) - *irqp = irq; - - return (0); -} - -int -bind_listening_port_to_irqhandler(unsigned int remote_domain, - const char *devname, driver_intr_t handler, void *arg, - unsigned long irqflags, unsigned int *irqp) -{ - unsigned int irq; - int port = -1; - int error; - - irq = bind_listening_port_to_irq(remote_domain, &port); - intr_register_source(&xp->xp_pins[irq].xp_intsrc); - error = intr_add_handler(devname, irq, NULL, handler, arg, irqflags, - &xp->xp_pins[irq].xp_cookie); - if (error) { - unbind_from_irq(irq); - return (error); - } - if (port != -1) - unmask_evtchn(port); - if (irqp) - *irqp = irq; - - return (0); -} - -int -bind_interdomain_evtchn_to_irqhandler(unsigned int 
remote_domain, - unsigned int remote_port, const char *devname, - driver_intr_t handler, void *arg, unsigned long irqflags, - unsigned int *irqp) -{ - unsigned int irq; - int port = -1; - int error; - - irq = bind_interdomain_evtchn_to_irq(remote_domain, remote_port, &port); - intr_register_source(&xp->xp_pins[irq].xp_intsrc); - error = intr_add_handler(devname, irq, NULL, handler, arg, - irqflags, &xp->xp_pins[irq].xp_cookie); - if (error) { - unbind_from_irq(irq); - return (error); - } - if (port != -1) - unmask_evtchn(port); - - if (irqp) - *irqp = irq; - return (0); -} - -int -bind_virq_to_irqhandler(unsigned int virq, unsigned int cpu, - const char *devname, driver_filter_t filter, driver_intr_t handler, - void *arg, unsigned long irqflags, unsigned int *irqp) -{ - unsigned int irq; - int port = -1; - int error; - - irq = bind_virq_to_irq(virq, cpu, &port); - intr_register_source(&xp->xp_pins[irq].xp_intsrc); - error = intr_add_handler(devname, irq, filter, handler, - arg, irqflags, &xp->xp_pins[irq].xp_cookie); - if (error) { - unbind_from_irq(irq); - return (error); - } - if (port != -1) - unmask_evtchn(port); - - if (irqp) - *irqp = irq; - return (0); -} - -int -bind_ipi_to_irqhandler(unsigned int ipi, unsigned int cpu, - const char *devname, driver_filter_t filter, - unsigned long irqflags, unsigned int *irqp) -{ - unsigned int irq; - int port = -1; - int error; - - irq = bind_ipi_to_irq(ipi, cpu, &port); - intr_register_source(&xp->xp_pins[irq].xp_intsrc); - error = intr_add_handler(devname, irq, filter, NULL, - NULL, irqflags, &xp->xp_pins[irq].xp_cookie); - if (error) { - unbind_from_irq(irq); - return (error); - } - if (port != -1) - unmask_evtchn(port); - - if (irqp) - *irqp = irq; - return (0); -} - -void -unbind_from_irqhandler(unsigned int irq) -{ - intr_remove_handler(xp->xp_pins[irq].xp_cookie); - unbind_from_irq(irq); -} - -#if 0 -/* Rebind an evtchn so that it gets delivered to a specific cpu */ -static void -rebind_irq_to_cpu(unsigned irq, 
unsigned tcpu) -{ - evtchn_op_t op = { .cmd = EVTCHNOP_bind_vcpu }; - int evtchn; - - mtx_lock_spin(&irq_mapping_update_lock); - - evtchn = evtchn_from_irq(irq); - if (!VALID_EVTCHN(evtchn)) { - mtx_unlock_spin(&irq_mapping_update_lock); - return; - } - - /* Send future instances of this interrupt to other vcpu. */ - bind_vcpu.port = evtchn; - bind_vcpu.vcpu = tcpu; - - /* - * If this fails, it usually just indicates that we're dealing with a - * virq or IPI channel, which don't actually need to be rebound. Ignore - * it, but don't do the xenlinux-level rebind in that case. - */ - if (HYPERVISOR_event_channel_op(&op) >= 0) - bind_evtchn_to_cpu(evtchn, tcpu); - - mtx_unlock_spin(&irq_mapping_update_lock); - -} - -static void set_affinity_irq(unsigned irq, cpumask_t dest) -{ - unsigned tcpu = ffs(dest) - 1; - rebind_irq_to_cpu(irq, tcpu); -} -#endif - -/* - * Interface to generic handling in intr_machdep.c - */ - - -/*------------ interrupt handling --------------------------------------*/ -#define TODO printf("%s: not implemented!\n", __func__) - - -static void xenpic_dynirq_enable_source(struct intsrc *isrc); -static void xenpic_dynirq_disable_source(struct intsrc *isrc, int); -static void xenpic_dynirq_eoi_source(struct intsrc *isrc); -static void xenpic_dynirq_enable_intr(struct intsrc *isrc); -static void xenpic_dynirq_disable_intr(struct intsrc *isrc); - -static void xenpic_pirq_enable_source(struct intsrc *isrc); -static void xenpic_pirq_disable_source(struct intsrc *isrc, int); -static void xenpic_pirq_eoi_source(struct intsrc *isrc); -static void xenpic_pirq_enable_intr(struct intsrc *isrc); - - -static int xenpic_vector(struct intsrc *isrc); -static int xenpic_source_pending(struct intsrc *isrc); -static void xenpic_suspend(struct pic* pic); -static void xenpic_resume(struct pic* pic); -static int xenpic_assign_cpu(struct intsrc *, u_int apic_id); - - -struct pic xenpic_dynirq_template = { - .pic_enable_source = xenpic_dynirq_enable_source, - 
.pic_disable_source = xenpic_dynirq_disable_source, - .pic_eoi_source = xenpic_dynirq_eoi_source, - .pic_enable_intr = xenpic_dynirq_enable_intr, - .pic_disable_intr = xenpic_dynirq_disable_intr, - .pic_vector = xenpic_vector, - .pic_source_pending = xenpic_source_pending, - .pic_suspend = xenpic_suspend, - .pic_resume = xenpic_resume -}; - -struct pic xenpic_pirq_template = { - .pic_enable_source = xenpic_pirq_enable_source, - .pic_disable_source = xenpic_pirq_disable_source, - .pic_eoi_source = xenpic_pirq_eoi_source, - .pic_enable_intr = xenpic_pirq_enable_intr, - .pic_vector = xenpic_vector, - .pic_source_pending = xenpic_source_pending, - .pic_suspend = xenpic_suspend, - .pic_resume = xenpic_resume, - .pic_assign_cpu = xenpic_assign_cpu -}; - - - -void -xenpic_dynirq_enable_source(struct intsrc *isrc) -{ - unsigned int irq; - struct xenpic_intsrc *xp; - - xp = (struct xenpic_intsrc *)isrc; - - mtx_lock_spin(&irq_mapping_update_lock); - if (xp->xp_masked) { - irq = xenpic_vector(isrc); - unmask_evtchn(evtchn_from_irq(irq)); - xp->xp_masked = FALSE; - } - mtx_unlock_spin(&irq_mapping_update_lock); -} - -static void -xenpic_dynirq_disable_source(struct intsrc *isrc, int foo) -{ - unsigned int irq; - struct xenpic_intsrc *xp; - - xp = (struct xenpic_intsrc *)isrc; - - mtx_lock_spin(&irq_mapping_update_lock); - if (!xp->xp_masked) { - irq = xenpic_vector(isrc); - mask_evtchn(evtchn_from_irq(irq)); - xp->xp_masked = TRUE; - } - mtx_unlock_spin(&irq_mapping_update_lock); -} - -static void -xenpic_dynirq_enable_intr(struct intsrc *isrc) -{ - unsigned int irq; - struct xenpic_intsrc *xp; - - xp = (struct xenpic_intsrc *)isrc; - mtx_lock_spin(&irq_mapping_update_lock); - xp->xp_masked = 0; - irq = xenpic_vector(isrc); - unmask_evtchn(evtchn_from_irq(irq)); - mtx_unlock_spin(&irq_mapping_update_lock); -} - -static void -xenpic_dynirq_disable_intr(struct intsrc *isrc) -{ - unsigned int irq; - struct xenpic_intsrc *xp; - - xp = (struct xenpic_intsrc *)isrc; - 
mtx_lock_spin(&irq_mapping_update_lock); - irq = xenpic_vector(isrc); - mask_evtchn(evtchn_from_irq(irq)); - xp->xp_masked = 1; - mtx_unlock_spin(&irq_mapping_update_lock); -} - -static void -xenpic_dynirq_eoi_source(struct intsrc *isrc) -{ - unsigned int irq; - struct xenpic_intsrc *xp; - - xp = (struct xenpic_intsrc *)isrc; - mtx_lock_spin(&irq_mapping_update_lock); - xp->xp_masked = 0; - irq = xenpic_vector(isrc); - unmask_evtchn(evtchn_from_irq(irq)); - mtx_unlock_spin(&irq_mapping_update_lock); -} - -static int -xenpic_vector(struct intsrc *isrc) -{ - struct xenpic_intsrc *pin; - - pin = (struct xenpic_intsrc *)isrc; - //printf("xenpic_vector(): isrc=%p,vector=%u\n", pin, pin->xp_vector); - - return (pin->xp_vector); -} - -static int -xenpic_source_pending(struct intsrc *isrc) -{ - struct xenpic_intsrc *pin = (struct xenpic_intsrc *)isrc; - - /* XXXEN: TODO */ - printf("xenpic_source_pending(): vector=%x,masked=%x\n", - pin->xp_vector, pin->xp_masked); - -/* notify_remote_via_evtchn(pin->xp_vector); // XXX RS: Is this correct? 
*/ - return 0; -} - -static void -xenpic_suspend(struct pic* pic) -{ - TODO; -} - -static void -xenpic_resume(struct pic* pic) -{ - TODO; -} - -static int -xenpic_assign_cpu(struct intsrc *isrc, u_int apic_id) -{ - TODO; - return (EOPNOTSUPP); -} - -void -notify_remote_via_irq(int irq) -{ - int evtchn = evtchn_from_irq(irq); - - if (VALID_EVTCHN(evtchn)) - notify_remote_via_evtchn(evtchn); - else - panic("invalid evtchn %d", irq); -} - -/* required for support of physical devices */ -static inline void -pirq_unmask_notify(int pirq) -{ - struct physdev_eoi eoi = { .irq = pirq }; - - if (unlikely(test_bit(pirq, &pirq_needs_unmask_notify[0]))) { - (void)HYPERVISOR_physdev_op(PHYSDEVOP_eoi, &eoi); - } -} - -static inline void -pirq_query_unmask(int pirq) -{ - struct physdev_irq_status_query irq_status_query; - - irq_status_query.irq = pirq; - (void)HYPERVISOR_physdev_op(PHYSDEVOP_IRQ_STATUS_QUERY, &irq_status_query); - clear_bit(pirq, &pirq_needs_unmask_notify[0]); - if ( irq_status_query.flags & PHYSDEVOP_IRQ_NEEDS_UNMASK_NOTIFY ) - set_bit(pirq, &pirq_needs_unmask_notify[0]); -} - -/* - * On startup, if there is no action associated with the IRQ then we are - * probing. In this case we should not share with others as it will confuse us. - */ -#define probing_irq(_irq) (intr_lookup_source(irq) == NULL) - -static void -xenpic_pirq_enable_intr(struct intsrc *isrc) -{ - struct evtchn_bind_pirq bind_pirq; - int evtchn; - unsigned int irq; - - mtx_lock_spin(&irq_mapping_update_lock); - irq = xenpic_vector(isrc); - evtchn = evtchn_from_irq(irq); - - if (VALID_EVTCHN(evtchn)) - goto out; - - bind_pirq.pirq = irq; - /* NB. We are happy to share unless we are probing. */ - bind_pirq.flags = probing_irq(irq) ? 0 : BIND_PIRQ__WILL_SHARE; - - if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_pirq, &bind_pirq) != 0) { -#ifndef XEN_PRIVILEGED_GUEST - panic("unexpected pirq call"); -#endif - if (!probing_irq(irq)) /* Some failures are expected when probing. 
*/ - printf("Failed to obtain physical IRQ %d\n", irq); - mtx_unlock_spin(&irq_mapping_update_lock); - return; - } - evtchn = bind_pirq.port; - - pirq_query_unmask(irq_to_pirq(irq)); - - bind_evtchn_to_cpu(evtchn, 0); - evtchn_to_irq[evtchn] = irq; - irq_info[irq] = mk_irq_info(IRQT_PIRQ, irq, evtchn); - - out: - unmask_evtchn(evtchn); - pirq_unmask_notify(irq_to_pirq(irq)); - mtx_unlock_spin(&irq_mapping_update_lock); -} - -static void -xenpic_pirq_enable_source(struct intsrc *isrc) -{ - int evtchn; - unsigned int irq; - - mtx_lock_spin(&irq_mapping_update_lock); - irq = xenpic_vector(isrc); - evtchn = evtchn_from_irq(irq); - - if (!VALID_EVTCHN(evtchn)) - goto done; - - unmask_evtchn(evtchn); - pirq_unmask_notify(irq_to_pirq(irq)); - done: - mtx_unlock_spin(&irq_mapping_update_lock); -} - -static void -xenpic_pirq_disable_source(struct intsrc *isrc, int eoi) -{ - int evtchn; - unsigned int irq; - - mtx_lock_spin(&irq_mapping_update_lock); - irq = xenpic_vector(isrc); - evtchn = evtchn_from_irq(irq); - - if (!VALID_EVTCHN(evtchn)) - goto done; - - mask_evtchn(evtchn); - done: - mtx_unlock_spin(&irq_mapping_update_lock); -} - - -static void -xenpic_pirq_eoi_source(struct intsrc *isrc) -{ - int evtchn; - unsigned int irq; - - mtx_lock_spin(&irq_mapping_update_lock); - irq = xenpic_vector(isrc); - evtchn = evtchn_from_irq(irq); - - if (!VALID_EVTCHN(evtchn)) - goto done; - - unmask_evtchn(evtchn); - pirq_unmask_notify(irq_to_pirq(irq)); - done: - mtx_unlock_spin(&irq_mapping_update_lock); -} - -int -irq_to_evtchn_port(int irq) -{ - return evtchn_from_irq(irq); -} - -void -mask_evtchn(int port) -{ - shared_info_t *s = HYPERVISOR_shared_info; - synch_set_bit(port, &s->evtchn_mask[0]); -} - -void -unmask_evtchn(int port) -{ - shared_info_t *s = HYPERVISOR_shared_info; - unsigned int cpu = PCPU_GET(cpuid); - vcpu_info_t *vcpu_info = &s->vcpu_info[cpu]; - - /* Slow path (hypercall) if this is a non-local port. 
*/ - if (unlikely(cpu != cpu_from_evtchn(port))) { - struct evtchn_unmask unmask = { .port = port }; - (void)HYPERVISOR_event_channel_op(EVTCHNOP_unmask, &unmask); - return; - } - - synch_clear_bit(port, &s->evtchn_mask); - - /* - * The following is basically the equivalent of 'hw_resend_irq'. Just - * like a real IO-APIC we 'lose the interrupt edge' if the channel is - * masked. - */ - if (synch_test_bit(port, &s->evtchn_pending) && - !synch_test_and_set_bit(port / LONG_BIT, - &vcpu_info->evtchn_pending_sel)) { - vcpu_info->evtchn_upcall_pending = 1; - if (!vcpu_info->evtchn_upcall_mask) - force_evtchn_callback(); - } -} - -void irq_resume(void) -{ - evtchn_op_t op; - int cpu, pirq, virq, ipi, irq, evtchn; - - struct evtchn_bind_virq bind_virq; - struct evtchn_bind_ipi bind_ipi; - - init_evtchn_cpu_bindings(); - - /* New event-channel space is not 'live' yet. */ - for (evtchn = 0; evtchn < NR_EVENT_CHANNELS; evtchn++) - mask_evtchn(evtchn); - - /* Check that no PIRQs are still bound. */ - for (pirq = 0; pirq < NR_PIRQS; pirq++) { - KASSERT(irq_info[pirq_to_irq(pirq)] == IRQ_UNBOUND, - ("pirq_to_irq inconsistent")); - } - - /* Secondary CPUs must have no VIRQ or IPI bindings. */ - for (cpu = 1; cpu < XEN_LEGACY_MAX_VCPUS; cpu++) { - for (virq = 0; virq < NR_VIRQS; virq++) { - KASSERT(pcpu_find(cpu)->pc_virq_to_irq[virq] == -1, - ("virq_to_irq inconsistent")); - } - for (ipi = 0; ipi < NR_IPIS; ipi++) { - KASSERT(pcpu_find(cpu)->pc_ipi_to_irq[ipi] == -1, - ("ipi_to_irq inconsistent")); - } - } - - /* No IRQ <-> event-channel mappings. */ - for (irq = 0; irq < NR_IRQS; irq++) - irq_info[irq] &= ~0xFFFF; /* zap event-channel binding */ - for (evtchn = 0; evtchn < NR_EVENT_CHANNELS; evtchn++) - evtchn_to_irq[evtchn] = -1; - - /* Primary CPU: rebind VIRQs automatically. 
*/ - for (virq = 0; virq < NR_VIRQS; virq++) { - if ((irq = pcpu_find(0)->pc_virq_to_irq[virq]) == -1) - continue; - - KASSERT(irq_info[irq] == mk_irq_info(IRQT_VIRQ, virq, 0), - ("irq_info inconsistent")); - - /* Get a new binding from Xen. */ - bind_virq.virq = virq; - bind_virq.vcpu = 0; - HYPERVISOR_event_channel_op(EVTCHNOP_bind_virq, &bind_virq); - evtchn = bind_virq.port; - - /* Record the new mapping. */ - evtchn_to_irq[evtchn] = irq; - irq_info[irq] = mk_irq_info(IRQT_VIRQ, virq, evtchn); - - /* Ready for use. */ - unmask_evtchn(evtchn); - } - - /* Primary CPU: rebind IPIs automatically. */ - for (ipi = 0; ipi < NR_IPIS; ipi++) { - if ((irq = pcpu_find(0)->pc_ipi_to_irq[ipi]) == -1) - continue; - - KASSERT(irq_info[irq] == mk_irq_info(IRQT_IPI, ipi, 0), - ("irq_info inconsistent")); - - /* Get a new binding from Xen. */ - memset(&op, 0, sizeof(op)); - bind_ipi.vcpu = 0; - HYPERVISOR_event_channel_op(EVTCHNOP_bind_ipi, &bind_ipi); - evtchn = bind_ipi.port; - - /* Record the new mapping. */ - evtchn_to_irq[evtchn] = irq; - irq_info[irq] = mk_irq_info(IRQT_IPI, ipi, evtchn); - - /* Ready for use. */ - unmask_evtchn(evtchn); - } -} - -static void -evtchn_init(void *dummy __unused) -{ - int i, cpu; - struct xenpic_intsrc *pin, *tpin; - - - init_evtchn_cpu_bindings(); - - /* No VIRQ or IPI bindings. */ - for (cpu = 0; cpu < mp_ncpus; cpu++) { - for (i = 0; i < NR_VIRQS; i++) - pcpu_find(cpu)->pc_virq_to_irq[i] = -1; - for (i = 0; i < NR_IPIS; i++) - pcpu_find(cpu)->pc_ipi_to_irq[i] = -1; - } - - /* No event-channel -> IRQ mappings. */ - for (i = 0; i < NR_EVENT_CHANNELS; i++) { - evtchn_to_irq[i] = -1; - mask_evtchn(i); /* No event channels are 'live' right now. */ - } - - /* No IRQ -> event-channel mappings. 
*/ - for (i = 0; i < NR_IRQS; i++) - irq_info[i] = IRQ_UNBOUND; - - xp = malloc(sizeof(struct xenpic) + NR_IRQS*sizeof(struct xenpic_intsrc), - M_DEVBUF, M_WAITOK); - - xp->xp_dynirq_pic = &xenpic_dynirq_template; - xp->xp_pirq_pic = &xenpic_pirq_template; - xp->xp_numintr = NR_IRQS; - bzero(xp->xp_pins, sizeof(struct xenpic_intsrc) * NR_IRQS); - - - /* We need to register our PIC's beforehand */ - if (intr_register_pic(&xenpic_pirq_template)) - panic("XEN: intr_register_pic() failure"); - if (intr_register_pic(&xenpic_dynirq_template)) - panic("XEN: intr_register_pic() failure"); - - /* - * Initialize the dynamic IRQ's - we initialize the structures, but - * we do not bind them (bind_evtchn_to_irqhandle() does this) - */ - pin = xp->xp_pins; - for (i = 0; i < NR_DYNIRQS; i++) { - /* Dynamic IRQ space is currently unbound. Zero the refcnts. */ - irq_bindcount[dynirq_to_irq(i)] = 0; - - tpin = &pin[dynirq_to_irq(i)]; - tpin->xp_intsrc.is_pic = xp->xp_dynirq_pic; - tpin->xp_vector = dynirq_to_irq(i); - - } - /* - * Now, we go ahead and claim every PIRQ there is. - */ - pin = xp->xp_pins; - for (i = 0; i < NR_PIRQS; i++) { - /* Dynamic IRQ space is currently unbound. Zero the refcnts. */ - irq_bindcount[pirq_to_irq(i)] = 0; - -#ifdef RTC_IRQ - /* If not domain 0, force our RTC driver to fail its probe. 
*/ - if ((i == RTC_IRQ) && - !(xen_start_info->flags & SIF_INITDOMAIN)) - continue; -#endif - tpin = &pin[pirq_to_irq(i)]; - tpin->xp_intsrc.is_pic = xp->xp_pirq_pic; - tpin->xp_vector = pirq_to_irq(i); - - } -} - -SYSINIT(evtchn_init, SI_SUB_INTR, SI_ORDER_MIDDLE, evtchn_init, NULL); - Index: sys/xen/evtchn/evtchn_dev.c =================================================================== --- sys/xen/evtchn/evtchn_dev.c (revision 255014) +++ sys/xen/evtchn/evtchn_dev.c (working copy) @@ -22,28 +22,23 @@ __FBSDID("$FreeBSD$"); #include #include #include +#include -#include +#include +#include #include + #include -#include #include #include -#include -#include +#include typedef struct evtchn_sotfc { struct selinfo ev_rsel; } evtchn_softc_t; - -#ifdef linuxcrap -/* NB. This must be shared amongst drivers if more things go in /dev/xen */ -static devfs_handle_t xen_dev_dir; -#endif - /* Only one process may open /dev/xen/evtchn at any time. */ static unsigned long evtchn_dev_inuse; @@ -72,12 +67,12 @@ static d_close_t evtchn_close; void -evtchn_device_upcall(int port) +evtchn_device_upcall(evtchn_port_t port) { mtx_lock(&upcall_lock); - mask_evtchn(port); - clear_evtchn(port); + evtchn_mask_port(port); + evtchn_clear_port(port); if ( ring != NULL ) { if ( (ring_prod - ring_cons) < EVTCHN_RING_SIZE ) { @@ -208,7 +203,7 @@ evtchn_write(struct cdev *dev, struct uio *uio, in mtx_lock_spin(&lock); for ( i = 0; i < (count/2); i++ ) if ( test_bit(kbuf[i], &bound_ports[0]) ) - unmask_evtchn(kbuf[i]); + evtchn_unmask_port(kbuf[i]); mtx_unlock_spin(&lock); rc = count; @@ -224,6 +219,7 @@ evtchn_ioctl(struct cdev *dev, unsigned long cmd, { int rc = 0; +#ifdef NOTYET mtx_lock_spin(&lock); switch ( cmd ) @@ -249,6 +245,7 @@ evtchn_ioctl(struct cdev *dev, unsigned long cmd, } mtx_unlock_spin(&lock); +#endif return rc; } @@ -309,7 +306,7 @@ evtchn_close(struct cdev *dev, int flag, int otyp, mtx_lock_spin(&lock); for ( i = 0; i < NR_EVENT_CHANNELS; i++ ) if ( synch_test_and_clear_bit(i, 
&bound_ports[0]) ) - mask_evtchn(i); + evtchn_mask_port(i); mtx_unlock_spin(&lock); evtchn_dev_inuse = 0; @@ -352,34 +349,6 @@ evtchn_dev_init(void *dummy __unused) evtchn_dev->si_drv1 = malloc(sizeof(evtchn_softc_t), M_DEVBUF, M_WAITOK); bzero(evtchn_dev->si_drv1, sizeof(evtchn_softc_t)); - /* XXX I don't think we need any of this rubbish */ -#if 0 - if ( err != 0 ) - { - printk(KERN_ALERT "Could not register /dev/misc/evtchn\n"); - return err; - } - - /* (DEVFS) create directory '/dev/xen'. */ - xen_dev_dir = devfs_mk_dir(NULL, "xen", NULL); - - /* (DEVFS) &link_dest[pos] == '../misc/evtchn'. */ - pos = devfs_generate_path(evtchn_miscdev.devfs_handle, - &link_dest[3], - sizeof(link_dest) - 3); - if ( pos >= 0 ) - strncpy(&link_dest[pos], "../", 3); - /* (DEVFS) symlink '/dev/xen/evtchn' -> '../misc/evtchn'. */ - (void)devfs_mk_symlink(xen_dev_dir, - "evtchn", - DEVFS_FL_DEFAULT, - &link_dest[pos], - &symlink_handle, - NULL); - - /* (DEVFS) automatically destroy the symlink with its destination. */ - devfs_auto_unregister(evtchn_miscdev.devfs_handle, symlink_handle); -#endif if (bootverbose) printf("Event-channel device installed.\n"); @@ -387,5 +356,3 @@ evtchn_dev_init(void *dummy __unused) } SYSINIT(evtchn_dev_init, SI_SUB_DRIVERS, SI_ORDER_FIRST, evtchn_dev_init, NULL); - - Index: sys/xen/evtchn/evtchnvar.h =================================================================== --- sys/xen/evtchn/evtchnvar.h (revision 0) +++ sys/xen/evtchn/evtchnvar.h (working copy) @@ -0,0 +1,105 @@ +/****************************************************************************** + * evtchn.h + * + * Data structures and definitions private to the FreeBSD implementation + * of the Xen event channel API. 
+ * + * Copyright (c) 2004, K A Fraser + * Copyright (c) 2012, Spectra Logic Corporation + * + * This file may be distributed separately from the Linux kernel, or + * incorporated into other software packages, subject to the following license: + * + * Permission is hereby granted, free of charge, to any person obtaining a copy + * of this source file (the "Software"), to deal in the Software without + * restriction, including without limitation the rights to use, copy, modify, + * merge, publish, distribute, sublicense, and/or sell copies of the Software, + * and to permit persons to whom the Software is furnished to do so, subject to + * the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS + * IN THE SOFTWARE. + * + * $FreeBSD$ + */ + +#ifndef __XEN_EVTCHN_EVTCHNVAR_H__ +#define __XEN_EVTCHN_EVTCHNVAR_H__ + +#include +#include + +enum evtchn_type { + EVTCHN_TYPE_UNBOUND, + EVTCHN_TYPE_PIRQ, + EVTCHN_TYPE_VIRQ, + EVTCHN_TYPE_IPI, + EVTCHN_TYPE_PORT, + EVTCHN_TYPE_COUNT +}; + +/** Submit a port notification for delivery to a userland evtchn consumer */ +void evtchn_device_upcall(evtchn_port_t port); + +/** + * Disable signal delivery for an event channel port, returning its + * previous mask state. + * + * \param port The event channel port to query and mask. + * + * \returns 1 if event delivery was previously disabled. Otherwise 0. 
+ */ +static inline int +evtchn_test_and_set_mask(evtchn_port_t port) +{ + shared_info_t *s = HYPERVISOR_shared_info; + return synch_test_and_set_bit(port, s->evtchn_mask); +} + +/** + * Clear any pending event for the given event channel port. + * + * \param port The event channel port to clear. + */ +static inline void +evtchn_clear_port(evtchn_port_t port) +{ + shared_info_t *s = HYPERVISOR_shared_info; + synch_clear_bit(port, &s->evtchn_pending[0]); +} + +/** + * Disable signal delivery for an event channel port. + * + * \param port The event channel port to mask. + */ +static inline void +evtchn_mask_port(evtchn_port_t port) +{ + shared_info_t *s = HYPERVISOR_shared_info; + + synch_set_bit(port, &s->evtchn_mask[0]); +} + +/** + * Enable signal delivery for an event channel port. + * + * \param port The event channel port to enable. + */ +static inline void +evtchn_unmask_port(evtchn_port_t port) +{ + evtchn_unmask_t op = { .port = port }; + + HYPERVISOR_event_channel_op(EVTCHNOP_unmask, &op); +} + +#endif /* __XEN_EVTCHN_EVTCHNVAR_H__ */ Property changes on: sys/xen/evtchn/evtchnvar.h ___________________________________________________________________ Added: svn:mime-type ## -0,0 +1 ## +text/plain \ No newline at end of property Added: svn:keywords ## -0,0 +1 ## +FreeBSD=%H \ No newline at end of property Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Index: sys/xen/evtchn.h =================================================================== --- sys/xen/evtchn.h (revision 255014) +++ sys/xen/evtchn.h (working copy) @@ -1,94 +1,87 @@ /****************************************************************************** * evtchn.h * - * Communication via Xen event channels. - * Also definitions for the device that demuxes notifications to userspace. + * Interface to /dev/xen/evtchn. 
* - * Copyright (c) 2004, K A Fraser + * Copyright (c) 2003-2005, K A Fraser + * + * This file may be distributed separately from the Linux kernel, or + * incorporated into other software packages, subject to the following license: + * + * Permission is hereby granted, free of charge, to any person obtaining a copy + * of this source file (the "Software"), to deal in the Software without + * restriction, including without limitation the rights to use, copy, modify, + * merge, publish, distribute, sublicense, and/or sell copies of the Software, + * and to permit persons to whom the Software is furnished to do so, subject to + * the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS + * IN THE SOFTWARE. * * $FreeBSD$ */ -#ifndef __ASM_EVTCHN_H__ -#define __ASM_EVTCHN_H__ -#include -#include -#include -#include +#ifndef __XEN_EVTCHN_H__ +#define __XEN_EVTCHN_H__ /* - * LOW-LEVEL DEFINITIONS + * Bind a fresh port to VIRQ @virq. */ +#define IOCTL_EVTCHN_BIND_VIRQ \ + _IOWR('E', 4, struct ioctl_evtchn_bind_virq) +struct ioctl_evtchn_bind_virq { + unsigned int virq; + unsigned int port; +}; /* - * Unlike notify_remote_via_evtchn(), this is safe to use across - * save/restore. Notifications on a broken connection are silently dropped. + * Bind a fresh port to remote <@remote_domain, @remote_port>. 
*/ -void notify_remote_via_irq(int irq); +#define IOCTL_EVTCHN_BIND_INTERDOMAIN \ + _IOWR('E', 5, struct ioctl_evtchn_bind_interdomain) +struct ioctl_evtchn_bind_interdomain { + unsigned int remote_domain, remote_port; + unsigned int port; +}; +/* + * Allocate a fresh port for binding to @remote_domain. + */ +#define IOCTL_EVTCHN_BIND_UNBOUND_PORT \ + _IOWR('E', 6, struct ioctl_evtchn_bind_unbound_port) +struct ioctl_evtchn_bind_unbound_port { + unsigned int remote_domain; + unsigned int port; +}; -/* Entry point for notifications into Linux subsystems. */ -void evtchn_do_upcall(struct trapframe *frame); - -/* Entry point for notifications into the userland character device. */ -void evtchn_device_upcall(int port); - -void mask_evtchn(int port); - -void unmask_evtchn(int port); - -#ifdef SMP -void rebind_evtchn_to_cpu(int port, unsigned int cpu); -#else -#define rebind_evtchn_to_cpu(port, cpu) ((void)0) -#endif - -static inline -int test_and_set_evtchn_mask(int port) -{ - shared_info_t *s = HYPERVISOR_shared_info; - return synch_test_and_set_bit(port, s->evtchn_mask); -} - -static inline void -clear_evtchn(int port) -{ - shared_info_t *s = HYPERVISOR_shared_info; - synch_clear_bit(port, &s->evtchn_pending[0]); -} - -static inline void -notify_remote_via_evtchn(int port) -{ - struct evtchn_send send = { .port = port }; - (void)HYPERVISOR_event_channel_op(EVTCHNOP_send, &send); -} - /* - * Use these to access the event channel underlying the IRQ handle returned - * by bind_*_to_irqhandler(). + * Unbind previously allocated @port. */ -int irq_to_evtchn_port(int irq); +#define IOCTL_EVTCHN_UNBIND \ + _IOW('E', 7, struct ioctl_evtchn_unbind) +struct ioctl_evtchn_unbind { + unsigned int port; +}; -void ipi_pcpu(unsigned int cpu, int vector); - /* - * CHARACTER-DEVICE DEFINITIONS + * Send event to previously allocated @port. 
*/ +#define IOCTL_EVTCHN_NOTIFY \ + _IOW('E', 8, struct ioctl_evtchn_notify) +struct ioctl_evtchn_notify { + unsigned int port; +}; -#define PORT_NORMAL 0x0000 -#define PORT_EXCEPTION 0x8000 -#define PORTIDX_MASK 0x7fff +/* Clear and reinitialise the event buffer. Clear error condition. */ +#define IOCTL_EVTCHN_RESET \ + _IO('E', 9) -/* /dev/xen/evtchn resides at device number major=10, minor=200 */ -#define EVTCHN_MINOR 200 - -/* /dev/xen/evtchn ioctls: */ -/* EVTCHN_RESET: Clear and reinit the event buffer. Clear error condition. */ -#define EVTCHN_RESET _IO('E', 1) -/* EVTCHN_BIND: Bind to the specified event-channel port. */ -#define EVTCHN_BIND _IO('E', 2) -/* EVTCHN_UNBIND: Unbind from the specified event-channel port. */ -#define EVTCHN_UNBIND _IO('E', 3) - -#endif /* __ASM_EVTCHN_H__ */ +#endif /* __XEN_EVTCHN_H__ */ Index: sys/xen/features.c =================================================================== --- sys/xen/features.c (revision 255014) +++ sys/xen/features.c (working copy) @@ -4,7 +4,7 @@ __FBSDID("$FreeBSD$"); #include #include -#include +#include #include #include Index: sys/xen/gnttab.c =================================================================== --- sys/xen/gnttab.c (revision 255014) +++ sys/xen/gnttab.c (working copy) @@ -26,7 +26,7 @@ __FBSDID("$FreeBSD$"); #include #include -#include +#include #include #include @@ -108,7 +108,7 @@ do_free_callbacks(void) static inline void check_free_callbacks(void) { - if (unlikely(gnttab_free_callback_list != NULL)) + if (__predict_false(gnttab_free_callback_list != NULL)) do_free_callbacks(); } @@ -136,7 +136,7 @@ gnttab_grant_foreign_access(domid_t domid, unsigne error = get_free_entries(1, &ref); - if (unlikely(error)) + if (__predict_false(error)) return (error); shared[ref].frame = frame; @@ -248,7 +248,7 @@ gnttab_grant_foreign_transfer(domid_t domid, unsig int error, ref; error = get_free_entries(1, &ref); - if (unlikely(error)) + if (__predict_false(error)) return (error); 
gnttab_grant_foreign_transfer_ref(ref, domid, pfn); @@ -341,7 +341,7 @@ gnttab_alloc_grant_references(uint16_t count, gran int ref, error; error = get_free_entries(count, &ref); - if (unlikely(error)) + if (__predict_false(error)) return (error); *head = ref; @@ -360,7 +360,7 @@ gnttab_claim_grant_reference(grant_ref_t *private_ { grant_ref_t g = *private_head; - if (unlikely(g == GNTTAB_LIST_END)) + if (__predict_false(g == GNTTAB_LIST_END)) return (g); *private_head = gnttab_entry(g); return (g); Index: sys/xen/gnttab.h =================================================================== --- sys/xen/gnttab.h (revision 255014) +++ sys/xen/gnttab.h (working copy) @@ -36,13 +36,12 @@ #ifndef __ASM_GNTTAB_H__ -#include - +#include #include -#include -#include #include +#include + #define GNTTAB_LIST_END GRANT_REF_INVALID struct gnttab_free_callback { Index: sys/xen/hvm.h =================================================================== --- sys/xen/hvm.h (revision 255014) +++ sys/xen/hvm.h (working copy) @@ -23,6 +23,9 @@ #ifndef __XEN_HVM_H__ #define __XEN_HVM_H__ +#include +#include + #include /** @@ -91,4 +94,5 @@ enum { void xen_hvm_set_callback(device_t); void xen_hvm_suspend(void); void xen_hvm_resume(void); +void xen_hvm_init_cpu(void); #endif /* __XEN_HVM_H__ */ Index: sys/xen/interface/event_channel.h =================================================================== --- sys/xen/interface/event_channel.h (revision 255014) +++ sys/xen/interface/event_channel.h (working copy) @@ -73,8 +73,11 @@ #define EVTCHNOP_reset 10 /* ` } */ +#ifndef __XEN_EVTCHN_PORT_DEFINED__ typedef uint32_t evtchn_port_t; DEFINE_XEN_GUEST_HANDLE(evtchn_port_t); +#define __XEN_EVTCHN_PORT_DEFINED__ 1 +#endif /* * EVTCHNOP_alloc_unbound: Allocate a port in domain and mark as Index: sys/xen/xen-os.h =================================================================== --- sys/xen/xen-os.h (revision 0) +++ sys/xen/xen-os.h (working copy) @@ -0,0 +1,95 @@ 
+/****************************************************************************** + * xen/xen-os.h + * + * Random collection of macros and definitions + * + * Copyright (c) 2003, 2004 Keir Fraser (on behalf of the Xen team) + * All rights reserved. + * + * Permission is hereby granted, free of charge, to any person obtaining a copy + * of this software and associated documentation files (the "Software"), to + * deal in the Software without restriction, including without limitation the + * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or + * sell copies of the Software, and to permit persons to whom the Software is + * furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER + * DEALINGS IN THE SOFTWARE. + * + * $FreeBSD$ + */ + +#ifndef _XEN_XEN_OS_H_ +#define _XEN_XEN_OS_H_ + +#if !defined(__XEN_INTERFACE_VERSION__) +#define __XEN_INTERFACE_VERSION__ 0x00030208 +#endif + +#define GRANT_REF_INVALID 0xffffffff + +#ifdef LOCORE +#define __ASSEMBLY__ +#endif + +#include + +#include + +/* Everything below this point is not included by assembler (.S) files. */ +#ifndef __ASSEMBLY__ + +/* Force a proper event-channel callback from Xen.
*/ +void force_evtchn_callback(void); + +extern int gdtset; + +extern shared_info_t *HYPERVISOR_shared_info; + +enum xen_domain_type { + XEN_NATIVE, /* running on bare hardware */ + XEN_PV_DOMAIN, /* running in a PV domain */ + XEN_HVM_DOMAIN, /* running in a Xen hvm domain */ +}; + +extern enum xen_domain_type xen_domain_type; + +static inline int +xen_domain(void) +{ + return (xen_domain_type != XEN_NATIVE); +} + +static inline int +xen_pv_domain(void) +{ + return (xen_domain_type == XEN_PV_DOMAIN); +} + +static inline int +xen_hvm_domain(void) +{ + return (xen_domain_type == XEN_HVM_DOMAIN); +} + +#ifndef xen_mb +#define xen_mb() mb() +#endif +#ifndef xen_rmb +#define xen_rmb() rmb() +#endif +#ifndef xen_wmb +#define xen_wmb() wmb() +#endif + +#endif /* !__ASSEMBLY__ */ + +#endif /* _XEN_XEN_OS_H_ */ Property changes on: sys/xen/xen-os.h ___________________________________________________________________ Added: svn:mime-type ## -0,0 +1 ## +text/plain \ No newline at end of property Added: svn:keywords ## -0,0 +1 ## +FreeBSD=%H \ No newline at end of property Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Index: sys/xen/xen_intr.h =================================================================== --- sys/xen/xen_intr.h (revision 255014) +++ sys/xen/xen_intr.h (working copy) @@ -1,103 +1,216 @@ -/* -*- Mode:C; c-basic-offset:4; tab-width:4 -*- */ +/****************************************************************************** + * xen_intr.h + * + * APIs for managing Xen event channel, virtual IRQ, and physical IRQ + * notifications. 
+ * + * Copyright (c) 2004, K A Fraser + * Copyright (c) 2012, Spectra Logic Corporation + * + * This file may be distributed separately from the Linux kernel, or + * incorporated into other software packages, subject to the following license: + * + * Permission is hereby granted, free of charge, to any person obtaining a copy + * of this source file (the "Software"), to deal in the Software without + * restriction, including without limitation the rights to use, copy, modify, + * merge, publish, distribute, sublicense, and/or sell copies of the Software, + * and to permit persons to whom the Software is furnished to do so, subject to + * the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS + * IN THE SOFTWARE. + * + * $FreeBSD$ + */ #ifndef _XEN_INTR_H_ #define _XEN_INTR_H_ -/* -* The flat IRQ space is divided into two regions: -* 1. A one-to-one mapping of real physical IRQs. This space is only used -* if we have physical device-access privilege. This region is at the -* start of the IRQ space so that existing device drivers do not need -* to be modified to translate physical IRQ numbers into our IRQ space. -* 3. A dynamic mapping of inter-domain and Xen-sourced virtual IRQs. These -* are bound using the provided bind/unbind functions. 
-* -* $FreeBSD$ -*/ +#ifndef __XEN_EVTCHN_PORT_DEFINED__ +typedef uint32_t evtchn_port_t; +DEFINE_XEN_GUEST_HANDLE(evtchn_port_t); +#define __XEN_EVTCHN_PORT_DEFINED__ 1 +#endif -#define PIRQ_BASE 0 -#define NR_PIRQS 128 +/** Registered Xen interrupt callback handle. */ +typedef void * xen_intr_handle_t; -#define DYNIRQ_BASE (PIRQ_BASE + NR_PIRQS) -#define NR_DYNIRQS 128 +/** If non-zero, the hypervisor has been configured to use a direct vector */ +extern int xen_vector_callback_enabled; -#define NR_IRQS (NR_PIRQS + NR_DYNIRQS) +/** + * Associate an already allocated local event channel port with an interrupt + * handler. + * + * \param dev The device making this bind request. + * \param local_port The event channel to bind. + * \param filter An interrupt filter handler. Specify NULL + * to always dispatch to the ithread handler. + * \param handler An interrupt ithread handler. Optional (can + * specify NULL) if all necessary event actions + * are performed by filter. + * \param arg Argument to present to both filter and handler. + * \param irqflags Interrupt handler flags. See sys/bus.h. + * \param handlep Pointer to an opaque handle used to manage this + * registration. + * + * \returns 0 on success, otherwise an errno. + */ +int xen_intr_bind_local_port(device_t dev, evtchn_port_t local_port, + driver_filter_t filter, driver_intr_t handler, void *arg, + enum intr_type irqflags, xen_intr_handle_t *handlep); -#define pirq_to_irq(_x) ((_x) + PIRQ_BASE) -#define irq_to_pirq(_x) ((_x) - PIRQ_BASE) +/** + * Allocate a local event channel port, accessible by the specified + * remote/foreign domain and, if successful, associate the port with + * the specified interrupt handler. + * + * \param dev The device making this bind request. + * \param remote_domain Remote domain granted permission to signal the + * newly allocated local port. + * \param filter An interrupt filter handler. Specify NULL + * to always dispatch to the ithread handler.
+ * \param handler An interrupt ithread handler. Optional (can + * specify NULL) if all necessary event actions + * are performed by filter. + * \param arg Argument to present to both filter and handler. + * \param irqflags Interrupt handler flags. See sys/bus.h. + * \param handlep Pointer to an opaque handle used to manage this + * registration. + * + * \returns 0 on success, otherwise an errno. + */ +int xen_intr_alloc_and_bind_local_port(device_t dev, + u_int remote_domain, driver_filter_t filter, driver_intr_t handler, + void *arg, enum intr_type irqflags, xen_intr_handle_t *handlep); -#define dynirq_to_irq(_x) ((_x) + DYNIRQ_BASE) -#define irq_to_dynirq(_x) ((_x) - DYNIRQ_BASE) - -/* - * Dynamic binding of event channels and VIRQ sources to guest IRQ space. +/** + * Associate the specified interrupt handler with the remote event + * channel port specified by remote_domain and remote_port. + * + * \param dev The device making this bind request. + * \param remote_domain The domain peer for this event channel connection. + * \param remote_port Remote domain's local port number for this event + * channel port. + * \param filter An interrupt filter handler. Specify NULL + * to always dispatch to the ithread handler. + * \param handler An interrupt ithread handler. Optional (can + * specify NULL) if all necessary event actions + * are performed by filter. + * \param arg Argument to present to both filter and handler. + * \param irqflags Interrupt handler flags. See sys/bus.h. + * \param handlep Pointer to an opaque handle used to manage this + * registration. + * + * \returns 0 on success, otherwise an errno. */ +int xen_intr_bind_remote_port(device_t dev, u_int remote_domain, + evtchn_port_t remote_port, driver_filter_t filter, + driver_intr_t handler, void *arg, enum intr_type irqflags, + xen_intr_handle_t *handlep); -/* - * Bind a caller port event channel to an interrupt handler. If - * successful, the guest IRQ number is returned in *irqp. 
Return zero - * on success or errno otherwise. +/** + * Associate the specified interrupt handler with the specified Xen + * virtual interrupt source. + * + * \param dev The device making this bind request. + * \param virq The Xen virtual IRQ number for the Xen interrupt + * source being hooked. + * \param cpu The cpu on which interrupt events should be delivered. + * \param filter An interrupt filter handler. Specify NULL + * to always dispatch to the ithread handler. + * \param handler An interrupt ithread handler. Optional (can + * specify NULL) if all necessary event actions + * are performed by filter. + * \param arg Argument to present to both filter and handler. + * \param irqflags Interrupt handler flags. See sys/bus.h. + * \param handlep Pointer to an opaque handle used to manage this + * registration. + * + * \returns 0 on success, otherwise an errno. */ -extern int bind_caller_port_to_irqhandler(unsigned int caller_port, - const char *devname, driver_intr_t handler, void *arg, - unsigned long irqflags, unsigned int *irqp); +int xen_intr_bind_virq(device_t dev, u_int virq, u_int cpu, + driver_filter_t filter, driver_intr_t handler, + void *arg, enum intr_type irqflags, xen_intr_handle_t *handlep); -/* - * Bind a listening port to an interrupt handler. If successful, the - * guest IRQ number is returned in *irqp. Return zero on success or - * errno otherwise. +/** + * Associate an interprocessor interrupt vector with an interrupt handler. + * + * \param dev The device making this bind request. + * \param ipi The interprocessor interrupt vector number of the + * interrupt source being hooked. + * \param cpu The cpu receiving the IPI. + * \param filter An interrupt filter handler. Specify NULL + * to always dispatch to the ithread handler. + * \param irqflags Interrupt handler flags. See sys/bus.h. + * \param handlep Pointer to an opaque handle used to manage this + * registration. + * + * \returns 0 on success, otherwise an errno. 
 */ -extern int bind_listening_port_to_irqhandler(unsigned int remote_domain, - const char *devname, driver_intr_t handler, void *arg, - unsigned long irqflags, unsigned int *irqp); +int xen_intr_bind_ipi(device_t dev, u_int ipi, u_int cpu, + driver_filter_t filter, enum intr_type irqflags, + xen_intr_handle_t *handlep); -/* - * Bind a VIRQ to an interrupt handler. If successful, the guest IRQ - * number is returned in *irqp. Return zero on success or errno - * otherwise. +/** + * Unbind an interrupt handler from its interrupt source. + * + * \param handlep A pointer to the opaque handle that was initialized + * at the time the interrupt source was bound. + * + * \returns 0 on success, otherwise an errno. + * + * \note The event channel, if any, that was allocated at bind time is + * closed upon successful return of this method. + * + * \note It is always safe to call xen_intr_unbind() on a handle that + * has been initialized to NULL. */ -extern int bind_virq_to_irqhandler(unsigned int virq, unsigned int cpu, - const char *devname, driver_filter_t filter, driver_intr_t handler, - void *arg, unsigned long irqflags, unsigned int *irqp); +void xen_intr_unbind(xen_intr_handle_t *handle); -/* - * Bind an IPI to an interrupt handler. If successful, the guest - * IRQ number is returned in *irqp. Return zero on success or errno - * otherwise. +/** + * Add a description to an interrupt handler. + * + * \param handle The opaque handle that was initialized at the time + * the interrupt source was bound. + * + * \param fmt The sprintf compatible format string for the description, + * followed by optional sprintf arguments. + * + * \returns 0 on success, otherwise an errno. */ -extern int bind_ipi_to_irqhandler(unsigned int ipi, unsigned int cpu, - const char *devname, driver_filter_t filter, - unsigned long irqflags, unsigned int *irqp); +int +xen_intr_describe(xen_intr_handle_t port_handle, const char *fmt, ...)
+ __attribute__((format(printf, 2, 3))); -/* - * Bind an interdomain event channel to an interrupt handler. If - * successful, the guest IRQ number is returned in *irqp. Return zero - * on success or errno otherwise. +/** + * Signal the remote peer of an interrupt source associated with an + * event channel port. + * + * \param handle The opaque handle that was initialized at the time + * the interrupt source was bound. + * + * \note For xen interrupt sources other than event channel ports, + * this method takes no action. */ -extern int bind_interdomain_evtchn_to_irqhandler(unsigned int remote_domain, - unsigned int remote_port, const char *devname, - driver_intr_t handler, void *arg, - unsigned long irqflags, unsigned int *irqp); +void xen_intr_signal(xen_intr_handle_t handle); -/* - * Unbind an interrupt handler using the guest IRQ number returned - * when it was bound. +/** + * Get the local event channel port number associated with this interrupt + * source. + * + * \param handle The opaque handle that was initialized at the time + * the interrupt source was bound. + * + * \returns 0 if the handle is invalid, otherwise positive port number. */ -extern void unbind_from_irqhandler(unsigned int irq); +evtchn_port_t xen_intr_port(xen_intr_handle_t handle); -static __inline__ int irq_cannonicalize(unsigned int irq) -{ - return (irq == 2) ? 
9 : irq; -} - -extern void disable_irq(unsigned int); -extern void disable_irq_nosync(unsigned int); -extern void enable_irq(unsigned int); - -extern void irq_suspend(void); -extern void irq_resume(void); - -extern void idle_block(void); -extern int ap_cpu_initclocks(int cpu); - #endif /* _XEN_INTR_H_ */ Index: sys/xen/xenbus/xenbus.c =================================================================== --- sys/xen/xenbus/xenbus.c (revision 255014) +++ sys/xen/xenbus/xenbus.c (working copy) @@ -50,11 +50,12 @@ __FBSDID("$FreeBSD$"); #include #include -#include +#include #include #include #include #include + #include MALLOC_DEFINE(M_XENBUS, "xenbus", "XenBus Support"); @@ -222,42 +223,6 @@ xenbus_grant_ring(device_t dev, unsigned long ring return (0); } -int -xenbus_alloc_evtchn(device_t dev, evtchn_port_t *port) -{ - struct evtchn_alloc_unbound alloc_unbound; - int err; - - alloc_unbound.dom = DOMID_SELF; - alloc_unbound.remote_dom = xenbus_get_otherend_id(dev); - - err = HYPERVISOR_event_channel_op(EVTCHNOP_alloc_unbound, - &alloc_unbound); - - if (err) { - xenbus_dev_fatal(dev, -err, "allocating event channel"); - return (-err); - } - *port = alloc_unbound.port; - return (0); -} - -int -xenbus_free_evtchn(device_t dev, evtchn_port_t port) -{ - struct evtchn_close close; - int err; - - close.port = port; - - err = HYPERVISOR_event_channel_op(EVTCHNOP_close, &close); - if (err) { - xenbus_dev_error(dev, -err, "freeing event channel %d", port); - return (-err); - } - return (0); -} - XenbusState xenbus_read_driver_state(const char *path) { Index: sys/xen/xenbus/xenbus_if.m =================================================================== --- sys/xen/xenbus/xenbus_if.m (revision 255014) +++ sys/xen/xenbus/xenbus_if.m (working copy) @@ -29,7 +29,8 @@ #include #include -#include + +#include #include #include Index: sys/xen/xenbus/xenbusb_front.c =================================================================== --- sys/xen/xenbus/xenbusb_front.c (revision 255014) +++ 
sys/xen/xenbus/xenbusb_front.c (working copy) @@ -51,9 +51,9 @@ __FBSDID("$FreeBSD$"); #include #include -#include #include +#include #include #include #include Index: sys/xen/xenbus/xenbusvar.h =================================================================== --- sys/xen/xenbus/xenbusvar.h (revision 255014) +++ sys/xen/xenbus/xenbusvar.h (working copy) @@ -43,8 +43,8 @@ #include #include -#include +#include #include #include #include @@ -195,39 +195,6 @@ int xenbus_watch_path2(device_t dev, const char *p int xenbus_grant_ring(device_t dev, unsigned long ring_mfn, grant_ref_t *refp); /** - * Allocate an event channel for the given XenBus device. - * - * \param dev The device for which to allocate the event channel. - * \param port[out] The port identifier for the allocated event channel. - * - * \return On success, 0. Otherwise an errno value indicating the - * type of failure. - * - * A successfully allocated event channel should be free'd using - * xenbus_free_evtchn(). - * - * \note On error, \a dev will be switched to the XenbusStateClosing - * state and the returned error is saved in the per-device error node - * for \a dev in the XenStore. - */ -int xenbus_alloc_evtchn(device_t dev, evtchn_port_t *port); - -/** - * Free an existing event channel. - * - * \param dev The device which allocated this event channel. - * \param port The port identifier for the event channel to free. - * - * \return On success, 0. Otherwise an errno value indicating the - * type of failure. - * - * \note On error, \a dev will be switched to the XenbusStateClosing - * state and the returned error is saved in the per-device error node - * for \a dev in the XenStore. - */ -int xenbus_free_evtchn(device_t dev, evtchn_port_t port); - -/** * Record the given errno, along with the given, printf-style, formatted * message in dev's device specific error node in the XenStore. 
* Index: sys/xen/xenstore/xenstore.c =================================================================== --- sys/xen/xenstore/xenstore.c (revision 255014) +++ sys/xen/xenstore/xenstore.c (working copy) @@ -49,10 +49,9 @@ __FBSDID("$FreeBSD$"); #include #include -#include #include -#include +#include #include #include #include @@ -244,8 +243,8 @@ struct xs_softc { */ int evtchn; - /** Interrupt number for our event channel. */ - u_int irq; + /** Handle for XenStore interrupts. */ + xen_intr_handle_t xen_intr_handle; /** * Interrupt driven config hook allowing us to defer @@ -505,11 +504,10 @@ xs_write_store(const void *tdata, unsigned len) xen_store->req_prod += avail; /* - * notify_remote_via_evtchn implies mb(). The other side - * will see the change to req_prod at the time of the - * interrupt. + * xen_intr_signal() implies mb(). The other side will see + * the change to req_prod at the time of the interrupt. */ - notify_remote_via_evtchn(xs.evtchn); + xen_intr_signal(xs.xen_intr_handle); } return (0); @@ -597,11 +595,10 @@ xs_read_store(void *tdata, unsigned len) xen_store->rsp_cons += avail; /* - * notify_remote_via_evtchn implies mb(). The producer - * will see the updated consumer index when the event - * is delivered. + * xen_intr_signal() implies mb(). The producer will see + * the updated consumer index when the event is delivered. 
*/ - notify_remote_via_evtchn(xs.evtchn); + xen_intr_signal(xs.xen_intr_handle); } return (0); @@ -1068,11 +1065,11 @@ xs_init_comms(void) xen_store->rsp_cons = xen_store->rsp_prod; } - if (xs.irq) - unbind_from_irqhandler(xs.irq); + xen_intr_unbind(&xs.xen_intr_handle); - error = bind_caller_port_to_irqhandler(xs.evtchn, "xenstore", - xs_intr, NULL, INTR_TYPE_NET, &xs.irq); + error = xen_intr_bind_local_port(xs.xs_dev, xs.evtchn, + /*filter*/NULL, xs_intr, /*arg*/NULL, INTR_TYPE_NET|INTR_MPSAFE, + &xs.xen_intr_handle); if (error) { log(LOG_WARNING, "XENSTORE request irq failed %i\n", error); return (error); @@ -1168,7 +1165,6 @@ xs_attach(device_t dev) sx_init(&xs.suspend_mutex, "xenstore suspend"); mtx_init(&xs.registered_watches_lock, "watches", NULL, MTX_DEF); mtx_init(&xs.watch_events_lock, "watch events", NULL, MTX_DEF); - xs.irq = 0; /* Initialize the shared memory rings to talk to xenstored */ error = xs_init_comms(); Index: sys/xen/xenstore/xenstore_dev.c =================================================================== --- sys/xen/xenstore/xenstore_dev.c (revision 255014) +++ sys/xen/xenstore/xenstore_dev.c (working copy) @@ -44,7 +44,7 @@ __FBSDID("$FreeBSD$"); #include #include -#include +#include #include #include Index: sys/xen/xenstore/xenstorevar.h =================================================================== --- sys/xen/xenstore/xenstorevar.h (revision 255014) +++ sys/xen/xenstore/xenstorevar.h (working copy) @@ -41,8 +41,8 @@ #include #include -#include +#include #include #include #include