From: David Woodhouse When creating a VM, userspace sets pre-init distributor registers such as GICD_IIDR before the VGIC is initialized. Strictly speaking there's no reason this couldn't be done at VM creation time before any vCPUs exist at all, but the design of the userspace API does require vCPU0 to have been created, as the system-wide registers are effectively accessed via vCPU0. So a VMM can't just set the IIDR at startup before spawning vCPU threads; it has to be done once the vCPUs are being created. However, kvm_trylock_all_vcpus() makes it unreliable by causing spurious -EBUSY failures if any vCPU cannot be locked. So userspace is forced to create the vCPUs (well, at least vCPU0), and is then forced to have a full synchronization point and quiesce them all in order to reliably set the IIDR. To slightly reduce the pain of all this, skip the trylock when the VGIC is not yet initialized. There is no need to lock the vCPUs if they can't be accessing it anyway. Signed-off-by: David Woodhouse --- Other options would include making it possible to set the IIDR before creating any vCPUs, either by creating a new device-level attribute or special-casing it not to require vCPU0 for DIST_REGS that aren't really associated to a vCPU. Deprecating kvm_trylock_all_vcpus() and killing every user of it with fire would also be a good option, perhaps... :) arch/arm64/kvm/vgic/vgic-kvm-device.c | 38 ++++++++++++++++++++------- 1 file changed, 29 insertions(+), 9 deletions(-) diff --git a/arch/arm64/kvm/vgic/vgic-kvm-device.c b/arch/arm64/kvm/vgic/vgic-kvm-device.c index a96c77dccf35..e17ea9f07f5f 100644 --- a/arch/arm64/kvm/vgic/vgic-kvm-device.c +++ b/arch/arm64/kvm/vgic/vgic-kvm-device.c @@ -540,7 +540,7 @@ static int vgic_v3_attr_regs_access(struct kvm_device *dev, struct vgic_reg_attr reg_attr; gpa_t addr; struct kvm_vcpu *vcpu; - bool uaccess; + bool uaccess, vcpus_locked = false; u32 val; int ret; @@ -566,18 +566,37 @@ static int vgic_v3_attr_regs_access(struct kvm_device *dev, return -EFAULT; } + if (!vgic_initialized(dev->kvm) && !reg_allowed_pre_init(attr)) + return -EBUSY; + mutex_lock(&dev->kvm->lock); - if (kvm_trylock_all_vcpus(dev->kvm)) { - mutex_unlock(&dev->kvm->lock); - return -EBUSY; + /* + * Pre-init registers (e.g. GICD_IIDR) don't need vCPU quiescence + * since the VGIC isn't live yet. Skip the trylock to avoid spurious + * -EBUSY when vCPU threads happen to be running. + */ + if (vgic_initialized(dev->kvm)) { + if (kvm_trylock_all_vcpus(dev->kvm)) { + mutex_unlock(&dev->kvm->lock); + return -EBUSY; + } + vcpus_locked = true; } - mutex_lock(&dev->kvm->arch.config_lock); - if (!(vgic_initialized(dev->kvm) || reg_allowed_pre_init(attr))) { - ret = -EBUSY; - goto out; + /* + * If the VGIC becomes initialized between the above check and taking + * config_lock, drop config_lock to lock the VCPUS. + */ + if (vgic_initialized(dev->kvm) && !vcpus_locked) { + mutex_unlock(&dev->kvm->arch.config_lock); + if (kvm_trylock_all_vcpus(dev->kvm)) { + mutex_unlock(&dev->kvm->lock); + return -EBUSY; + } + mutex_lock(&dev->kvm->arch.config_lock); + vcpus_locked = true; } switch (attr->group) { @@ -612,7 +631,8 @@ static int vgic_v3_attr_regs_access(struct kvm_device *dev, out: mutex_unlock(&dev->kvm->arch.config_lock); - kvm_unlock_all_vcpus(dev->kvm); + if (vcpus_locked) + kvm_unlock_all_vcpus(dev->kvm); mutex_unlock(&dev->kvm->lock); if (!ret && uaccess && !is_write) { -- 2.43.0