Commit Graph

211 Commits

Author SHA1 Message Date
53493c1f83 hw/nvme: cap MDTS value for internal limitation
The emulated device had let the user set whatever max transfers size
they wanted, including no limit. However the device does have an
internal limit of 1024 segments. NVMe doesn't report max segments,
though. This is implicitly inferred based on the MDTS and MPSMIN values.

IOV_MAX is currently 1024 which 4k PRPs can exceed with 2MB transfers.
Don't allow MDTS values that can exceed this, otherwise users risk
seeing "internal error" status to their otherwise protocol compliant
commands.

Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2025-08-11 00:17:38 -07:00
bc0c24fdb1 hw/nvme: revert CMIC behavior
Commit cd59f50ab0 ("hw/nvme: always initialize a subsystem") causes
the controller to always set the CMIC.MCTRS ("Multiple Controllers")
bit. While spec-compliant, this is a deviation from the previous
behavior where this was only set if an nvme-subsys device was explicitly
created (to configure a subsystem with multiple controllers/namespaces).

Revert the behavior to only set CMIC.MCTRS if an nvme-subsys device is
created explicitly.

Reported-by: Alan Adamson <alan.adamson@oracle.com>
Fixes: cd59f50ab0 ("hw/nvme: always initialize a subsystem")
Reviewed-by: Alan Adamson <alan.adamson@oracle.com>
Tested-by: Alan Adamson <alan.adamson@oracle.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2025-08-11 00:17:38 -07:00
31b737b19d hw/nvme: fix namespace attachment
Commit 6ccca4b6bb ("hw/nvme: rework csi handling") introduced a bug in
Namespace Attachment, causing it to

  a) not allow a controller to attach namespaces to other controllers
  b) assert if a valid non-attached namespace is detached

This fixes both issues.

Fixes: 6ccca4b6bb ("hw/nvme: rework csi handling")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2976
Reviewed-by: Jesper Wendel Devantier <foss@defmacro.it>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2025-08-11 00:17:38 -07:00
c95b08106b treewide: fix paths for relocated files in comments
After the docs directory restructuring, several comments
refer to paths that no longer exist.

Replace these references to the current file locations
so readers can find the correct files.

Related commits
---------------

  189c099f75 (Jul 2021)
    docs: collect the disparate device emulation docs into one section
    Rename  docs/system/{ => devices}/nvme.rst

  5f4c96b779 (Feb 2023)
    docs/system/loongarch: update loongson3.rst and rename it to virt.rst
    Rename  docs/system/loongarch/{loongson3.rst => virt.rst}

  fe0007f3c1 (Sep 2023)
    exec: Rename cpu.c -> cpu-target.c
    Rename  cpus-common.c => cpu-common.c

  42fa9665e5 (Apr 2025)
    exec: Restrict 'cpu_ldst.h' to accel/tcg/
    Rename  include/{exec/cpu_ldst.h => accel/tcg/cpu-ldst.h}

Signed-off-by: Sean Wei <me@sean.taipei>
Message-ID: <20250616.qemu.relocated.06@sean.taipei>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2025-07-02 18:26:27 +02:00
7eeb1d3acc hw/nvme/ctrl: skip automatic zero-init of large arrays
The 'nvme_map_sgl' method has a 256 element array used for copying
data from the device. Skip the automatic zero-init of this array
to eliminate the performance overhead in the I/O hot path.

The 'segment' array will be fully initialized when reading data from
the device.

The 'nme_changed_nslist' method has a 4k byte array that is manually
initialized with memset(). The compiler ought to be intelligent
enough to turn the memset() into a static initialization operation,
and thus not duplicate the automatic zero-init. Replacing memset()
with '{}' makes it unambiguous that the array is statically initialized.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Message-id: 20250610123709.835102-24-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2025-06-12 13:40:16 -04:00
0b1c23a582 hw/nvme: fix nvme hotplugging
Commit cd59f50ab0 caused a regression on nvme hotplugging for devices
with an implicit nvm subsystem.

The nvme-subsys device was incorrectly left with being marked as
non-hotpluggable. Fix this.

Cc: qemu-stable@nongnu.org
Reported-by: Stéphane Graber <stgraber@stgraber.org>
Tested-by: Stéphane Graber <stgraber@stgraber.org>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2950
Fixes: cd59f50ab0 ("hw/nvme: always initialize a subsystem")
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2025-05-15 12:18:06 +02:00
2cd09e47aa qom: Make InterfaceInfo[] uses const
Mechanical change using:

  $ sed -i -E 's/\(InterfaceInfo.?\[/\(const InterfaceInfo\[/g' \
              $(git grep -lE '\(InterfaceInfo.?\[\]\)')

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20250424194905.82506-7-philmd@linaro.org>
2025-04-25 17:00:41 +02:00
12d1a768bd qom: Have class_init() take a const data argument
Mechanical change using gsed, then style manually adapted
to pass checkpatch.pl script.

Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20250424194905.82506-4-philmd@linaro.org>
2025-04-25 17:00:41 +02:00
8c996e3271 hw/nvme: fix attachment of private namespaces
Fix regression when attaching private namespaces that gets attached to
the wrong controller.

Keep track of the original controller "owner" of private namespaces, and
only attach if this matches on controller enablement.

Fixes: 6ccca4b6bb ("hw/nvme: rework csi handling")
Reported-by: Alan Adamson <alan.adamson@oracle.com>
Suggested-by: Alan Adamson <alan.adamson@oracle.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Tested-by: Alan Adamson <alan.adamson@oracle.com>
Reviewed-by: Alan Adamson <alan.adamson@oracle.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Message-ID: <20250408-fix-private-ns-v1-1-28e169b6b60b@samsung.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2025-04-08 20:46:10 +02:00
2400fad572 Merge tag 'pull-vfio-20250306' of https://github.com/legoater/qemu into staging
vfio queue:

* Added property documentation
* Added Minor fixes
* Implemented basic PCI PM capability backing
* Promoted new IGD maintainer
* Deprecated vfio-plaform
* Extended VFIO migration with multifd support

# -----BEGIN PGP SIGNATURE-----
#
# iQIyBAABCAAdFiEEoPZlSPBIlev+awtgUaNDx8/77KEFAmfJrZoACgkQUaNDx8/7
# 7KFE2A/0Dmief9u/dDJIKGIDa+iawcf4hu8iX4v5pB0DlGniT3rgK8WMGnhDpPxq
# Q4wsKfo+JJ2q6msInrT7Ckqyydu9nQztI3vwmfMuWxLhTMyH28K96ptwPqIZBjOx
# rPTEXfnVX4W3tpn1+48S+vefWVa/gkBkIvv7RpK18rMBXv1kDeyOvc/d2dbAt7ft
# zJc4f8gH3jfQzGwmnYVZU1yPrZN7p6zhYR/AD3RQOY97swgZIEyYxXhOuTPiCuEC
# zC+2AMKi9nmnCG6x/mnk7l2yJXSlv7lJdqcjYZhJ9EOIYfiUGTREYIgQbARcafE/
# 4KSg2QR35BoUd4YrmEWxXJCRf3XnyWXDY36dDKVhC0OHng1F/U44HuL4QxwoTIay
# s1SP/DHcvDiPAewVTvdgt7Iwfn9xGhcQO2pkrxBoNLB5JYwW+R6mG7WXeDv1o3GT
# QosTu1fXZezQqFd4v6+q5iRNS2KtBZLTspwAmVdywEFUs+ZLBRlC+bodYlinZw6B
# Yl/z0LfAEh4J55QmX2espbp8MH1+mALuW2H2tgSGSrTBX1nwxZFI5veFzPepgF2S
# eTx69BMjiNMwzIjq1T7e9NpDCceiW0fXDu7IK1MzYhqg1nM9lX9AidhFTeiF2DB2
# EPb3ljy/8fyxcPKa1T9X47hQaSjbMwofaO8Snoh0q0jokY246Q==
# =hIBw
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 06 Mar 2025 22:13:46 HKT
# gpg:                using RSA key A0F66548F04895EBFE6B0B6051A343C7CFFBECA1
# gpg: Good signature from "Cédric Le Goater <clg@redhat.com>" [full]
# gpg:                 aka "Cédric Le Goater <clg@kaod.org>" [full]
# Primary key fingerprint: A0F6 6548 F048 95EB FE6B  0B60 51A3 43C7 CFFB ECA1

* tag 'pull-vfio-20250306' of https://github.com/legoater/qemu: (42 commits)
  hw/core/machine: Add compat for x-migration-multifd-transfer VFIO property
  vfio/migration: Make x-migration-multifd-transfer VFIO property mutable
  vfio/migration: Add x-migration-multifd-transfer VFIO property
  vfio/migration: Multifd device state transfer support - send side
  vfio/migration: Multifd device state transfer support - config loading support
  migration/qemu-file: Define g_autoptr() cleanup function for QEMUFile
  vfio/migration: Multifd device state transfer support - load thread
  vfio/migration: Multifd device state transfer support - received buffers queuing
  vfio/migration: Setup and cleanup multifd transfer in these general methods
  vfio/migration: Multifd setup/cleanup functions and associated VFIOMultifd
  vfio/migration: Multifd device state transfer - add support checking function
  vfio/migration: Multifd device state transfer support - basic types
  vfio/migration: Move migration channel flags to vfio-common.h header file
  vfio/migration: Add vfio_add_bytes_transferred()
  vfio/migration: Convert bytes_transferred counter to atomic
  vfio/migration: Add load_device_config_state_start trace event
  migration: Add save_live_complete_precopy_thread handler
  migration/multifd: Add multifd_device_state_supported()
  migration/multifd: Make MultiFDSendData a struct
  migration/multifd: Device state transfer support - send side
  ...

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2025-03-07 07:39:21 +08:00
c98dac169e qdev: Rename PropertyInfo member @name to @type
PropertyInfo member @name becomes ObjectProperty member @type, while
Property member @name becomes ObjectProperty member @name.  Rename the
former.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20250227085601.4140852-4-armbru@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
[One missed instance of @type fixed]
2025-03-06 10:30:58 +01:00
0681ec2531 pci: Use PCI PM capability initializer
Switch callers directly initializing the PCI PM capability with
pci_add_capability() to use pci_pm_init().

Cc: Dmitry Fleytman <dmitry.fleytman@gmail.com>
Cc: Akihiko Odaki <akihiko.odaki@daynix.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Stefan Weil <sw@weilnetz.de>
Cc: Sriram Yagnaraman <sriram.yagnaraman@ericsson.com>
Cc: Keith Busch <kbusch@kernel.org>
Cc: Klaus Jensen <its@irrelevant.dk>
Cc: Jesper Devantier <foss@defmacro.it>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Cc: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://lore.kernel.org/qemu-devel/20250225215237.3314011-3-alex.williamson@redhat.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2025-03-06 06:47:33 +01:00
cad58ada8f hw/nvme: remove nvme_aio_err()
nvme_rw_complete_cb() is the only remaining user of nvme_aio_err(), so
open code the status code setting instead.

Reviewed-by: Jesper Wendel Devantier <foss@defmacro.it>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2025-02-26 12:40:35 +01:00
6fc39228ff hw/nvme: set error status code explicitly for misc commands
The nvme_aio_err() does not handle Verify, Compare, Copy and other misc
commands and defaults to setting the error status code to Internal
Device Error. For some of these commands, we know better, so set it
explicitly.

For the commands using the nvme_misc_cb() callback (Copy, Flush, ...),
if no status code has explicitly been set by the lower handlers, default
to Internal Device Error as previously.

Reviewed-by: Jesper Wendel Devantier <foss@defmacro.it>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2025-02-26 12:40:35 +01:00
304babd940 hw/nvme: only set command abort requested when cancelled due to Abort
The Command Abort Requested status code should only be set if the
command was explicitly cancelled due to an Abort command. Or, in the
case the cancel was due to Submission Queue deletion, set the status
code to Command Aborted due to SQ Deletion.

Reviewed-by: Jesper Wendel Devantier <foss@defmacro.it>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2025-02-26 12:40:35 +01:00
6ccca4b6bb hw/nvme: rework csi handling
The controller incorrectly allows a zoned namespace to be attached even
if CS.CSS is configured to only support the NVM command set for I/O
queues.

Rework handling of namespace command sets in general by attaching
supported namespaces when the controller is started instead of, like
now, statically when realized.

Reviewed-by: Jesper Wendel Devantier <foss@defmacro.it>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2025-02-26 12:40:32 +01:00
d96a32de3f hw/nvme: be compliant wrt. dsm processing limits
The specification states that,

> The controller shall set all three processing limit fields (i.e., the
> DMRL, DMRSL and DMSL fields) to non-zero values or shall clear all
> three processing limit fields to 0h.

So, set the DMRL and DMSL fields in addition to DMRSL.

Reviewed-by: Jesper Wendel Devantier <foss@defmacro.it>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2025-02-25 12:55:21 +01:00
b202fb549d nvme: fix iocs status code values
The status codes related to I/O Command Sets are in the wrong group.

Reviewed-by: Jesper Wendel Devantier <foss@defmacro.it>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2025-02-25 12:55:21 +01:00
9cf6ec0659 hw/nvme: add knob for doorbell buffer config support
Add a 'dbcs' knob to allow Doorbell Buffer Config command to be
disabled.

Reviewed-by: Jesper Wendel Devantier <foss@defmacro.it>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2025-02-25 12:55:21 +01:00
e7047adf1e hw/nvme: make oacs dynamic
Virtualization Management needs sriov-related parameters. Only report
support for the command when that conditions are true.

Reviewed-by: Jesper Wendel Devantier <foss@defmacro.it>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2025-02-25 12:55:21 +01:00
cd59f50ab0 hw/nvme: always initialize a subsystem
If no nvme-subsys is explicitly configured, instantiate one.

Reviewed-by: Jesper Wendel Devantier <foss@defmacro.it>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2025-02-25 12:55:21 +01:00
23a4b3ebc7 hw/nvme: Add OCP SMART / Health Information Extended Log Page
The Open Compute Project [1] includes a Datacenter NVMe
SSD Specification [2]. The most recent version of this specification
(as of November 2024) is 2.6.1. This specification layers on top of
the NVM Express specifications [3] to provide additional
functionality. A key part of of this is the 512 Byte OCP SMART / Health
Information Extended log page that is defined in Section 4.8.6 of the
specification.

We add a controller argument (ocp) that toggles on/off the SMART log
extended structure.  To accommodate different vendor specific specifications
like OCP, we add a multiplexing function (nvme_vendor_specific_log) which
will route to the different log functions based on arguments and log ids.
We only return the OCP extended SMART log when the command is 0xC0 and ocp
has been turned on in the nvme argumants.

Though we add the whole nvme SMART log extended structure, we only populate
the physical_media_units_{read,written}, log_page_version and
log_page_uuid.

This patch is based on work done by Joel but has been modified enough
that he requested a co-developed-by tag rather than a signed-off-by.

[1]: https://www.opencompute.org/
[2]: https://www.opencompute.org/documents/datacenter-nvme-ssd-specification-v2-6-1-pdf
[3]: https://nvmexpress.org/specifications/

Signed-off-by: Stephen Bates <sbates@raithlin.com>
Co-developed-by: Joel Granados <j.granados@samsung.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2025-02-25 12:27:21 +01:00
3391d68e90 pcie_sriov: Ensure VF addr does not overflow
pci_new() aborts when creating a VF with addr >= PCI_DEVFN_MAX.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20250116-reuse-v20-7-7cb370606368@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2025-02-20 18:23:19 -05:00
65cb7129f4 Merge tag 'exec-20241220' of https://github.com/philmd/qemu into staging
Accel & Exec patch queue

- Ignore writes to CNTP_CTL_EL0 on HVF ARM (Alexander)
- Add '-d invalid_mem' logging option (Zoltan)
- Create QOM containers explicitly (Peter)
- Rename sysemu/ -> system/ (Philippe)
- Re-orderning of include/exec/ headers (Philippe)
  Move a lot of declarations from these legacy mixed bag headers:
    . "exec/cpu-all.h"
    . "exec/cpu-common.h"
    . "exec/cpu-defs.h"
    . "exec/exec-all.h"
    . "exec/translate-all"
  to these more specific ones:
    . "exec/page-protection.h"
    . "exec/translation-block.h"
    . "user/cpu_loop.h"
    . "user/guest-host.h"
    . "user/page-protection.h"

 # -----BEGIN PGP SIGNATURE-----
 #
 # iQIzBAABCAAdFiEE+qvnXhKRciHc/Wuy4+MsLN6twN4FAmdlnyAACgkQ4+MsLN6t
 # wN6mBw//QFWi7CrU+bb8KMM53kOU9C507tjn99LLGFb5or73/umDsw6eo/b8DHBt
 # KIwGLgATel42oojKfNKavtAzLK5rOrywpboPDpa3SNeF1onW+99NGJ52LQUqIX6K
 # A6bS0fPdGG9ZzEuPpbjDXlp++0yhDcdSgZsS42fEsT7Dyj5gzJYlqpqhiXGqpsn8
 # 4Y0UMxSL21K3HEexlzw2hsoOBFA3tUm2ujNDhNkt8QASr85yQVLCypABJnuoe///
 # 5Ojl5wTBeDwhANET0rhwHK8eIYaNboiM9fHopJYhvyw1bz6yAu9jQwzF/MrL3s/r
 # xa4OBHBy5mq2hQV9Shcl3UfCQdk/vDaYaWpgzJGX8stgMGYfnfej1SIl8haJIfcl
 # VMX8/jEFdYbjhO4AeGRYcBzWjEJymkDJZoiSWp2NuEDi6jqIW+7yW1q0Rnlg9lay
 # ShAqLK5Pv4zUw3t0Jy3qv9KSW8sbs6PQxtzXjk8p97rTf76BJ2pF8sv1tVzmsidP
 # 9L92Hv5O34IqzBu2oATOUZYJk89YGmTIUSLkpT7asJZpBLwNM2qLp5jO00WVU0Sd
 # +kAn324guYPkko/TVnjC/AY7CMu55EOtD9NU35k3mUAnxXT9oDUeL4NlYtfgrJx6
 # x1Nzr2FkS68+wlPAFKNSSU5lTjsjNaFM0bIJ4LCNtenJVP+SnRo=
 # =cjz8
 # -----END PGP SIGNATURE-----
 # gpg: Signature made Fri 20 Dec 2024 11:45:20 EST
 # gpg:                using RSA key FAABE75E12917221DCFD6BB2E3E32C2CDEADC0DE
 # gpg: Good signature from "Philippe Mathieu-Daudé (F4BUG) <f4bug@amsat.org>" [unknown]
 # gpg: WARNING: This key is not certified with a trusted signature!
 # gpg:          There is no indication that the signature belongs to the owner.
 # Primary key fingerprint: FAAB E75E 1291 7221 DCFD  6BB2 E3E3 2C2C DEAD C0DE

* tag 'exec-20241220' of https://github.com/philmd/qemu: (59 commits)
  util/qemu-timer: fix indentation
  meson: Do not define CONFIG_DEVICES on user emulation
  system/accel-ops: Remove unnecessary 'exec/cpu-common.h' header
  system/numa: Remove unnecessary 'exec/cpu-common.h' header
  hw/xen: Remove unnecessary 'exec/cpu-common.h' header
  target/mips: Drop left-over comment about Jazz machine
  target/mips: Remove tswap() calls in semihosting uhi_fstat_cb()
  target/xtensa: Remove tswap() calls in semihosting simcall() helper
  accel/tcg: Un-inline translator_is_same_page()
  accel/tcg: Include missing 'exec/translation-block.h' header
  accel/tcg: Move tcg_cflags_has/set() to 'exec/translation-block.h'
  accel/tcg: Restrict curr_cflags() declaration to 'internal-common.h'
  qemu/coroutine: Include missing 'qemu/atomic.h' header
  exec/translation-block: Include missing 'qemu/atomic.h' header
  accel/tcg: Declare cpu_loop_exit_requested() in 'exec/cpu-common.h'
  exec/cpu-all: Include 'cpu.h' earlier so MMU_USER_IDX is always defined
  target/sparc: Move sparc_restore_state_to_opc() to cpu.c
  target/sparc: Uninline cpu_get_tb_cpu_state()
  target/loongarch: Declare loongarch_cpu_dump_state() locally
  user: Move various declarations out of 'exec/exec-all.h'
  ...

Conflicts:
	hw/char/riscv_htif.c
	hw/intc/riscv_aplic.c
	target/s390x/cpu.c

	Apply sysemu header path changes to not in the pull request.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2024-12-21 11:07:00 -05:00
32cad1ffb8 include: Rename sysemu/ -> system/
Headers in include/sysemu/ are not only related to system
*emulation*, they are also used by virtualization. Rename
as system/ which is clearer.

Files renamed manually then mechanical change using sed tool.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Lei Yang <leiyang@redhat.com>
Message-Id: <20241203172445.28576-1-philmd@linaro.org>
2024-12-20 17:44:56 +01:00
b1987a2547 Constify all opaque Property pointers
Via sed "s/  Property [*]/  const Property */".

The opaque pointers passed to ObjectProperty callbacks are
the last instances of non-const Property pointers in the tree.
For the most part, these callbacks only use object_field_prop_ptr,
which now takes a const pointer itself.

This logically should have accompanied d36f165d95 which
allowed const Property to be registered.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Lei Yang <leiyang@redhat.com>
Link: https://lore.kernel.org/r/20241218134251.4724-25-richard.henderson@linaro.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-12-19 19:36:37 +01:00
5fcabe628b include/hw/qdev-properties: Remove DEFINE_PROP_END_OF_LIST
Now that all of the Property arrays are counted, we can remove
the terminator object from each array.  Update the assertions
in device_class_set_props to match.

With struct Property being 88 bytes, this was a rather large
form of terminator.  Saves 30k from qemu-system-aarch64.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Lei Yang <leiyang@redhat.com>
Link: https://lore.kernel.org/r/20241218134251.4724-21-richard.henderson@linaro.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-12-19 19:36:37 +01:00
b32c22bf31 hw/nvme: Constify all Property
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-12-15 12:55:46 -06:00
6651f8f2e5 hw/nvme: take a reference on the subsystem on vf realization
Make sure we grab a reference on the subsystem when a VF is realized.
Otherwise, the subsytem will be unrealized automatically when the VFs
are unregistered and unreffed.

This fixes a latent bug but was not exposed until commit 08f6328480
("pcie: Release references of virtual functions"). This was then fixed
(or rather, hidden) by commit c613ad2512 ("pcie_sriov: Do not manually
unrealize"), but that was then reverted (due to other issues) in commit
b0fdaee5d1, exposing the bug yet again.

Cc: qemu-stable@nongnu.org
Fixes: 08f6328480 ("pcie: Release references of virtual functions")
Reviewed-by: Jesper Wendel Devantier <foss@defmacro.it>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2024-12-03 07:28:27 +01:00
e85987786d hw/nvme: SR-IOV VFs must hardwire pci interrupt pin register to zero
The PCI Interrupt Pin Register does not apply to VFs and MUST be
hardwired to zero.

Fixes: 44c2c09488 ("hw/nvme: Add support for SR-IOV")
Reviewed-by: Jesper Wendel Devantier <foss@defmacro.it>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2024-12-03 07:28:27 +01:00
149f6e90b5 hw/nvme: fix use/unuse of msix vectors
Only call msix_{un,}use_vector() when interrupts are actually enabled
for a completion queue.

Reviewed-by: Jesper Wendel Devantier <foss@defmacro.it>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2024-12-03 07:28:27 +01:00
9162f10125 hw/nvme: fix msix_uninit with exclusive bar
Commit fa905f65c5 introduced a machine compatibility parameter to
enable an exclusive bar for msix. It failed to account for this when
cleaning up. Make sure that if an exclusive bar is enabled, we use the
proper cleanup routine.

Cc: qemu-stable@nongnu.org
Fixes: fa905f65c5 ("hw/nvme: add machine compatibility parameter to enable msix exclusive bar")
Reviewed-by: Jesper Wendel Devantier <foss@defmacro.it>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2024-12-03 07:28:27 +01:00
9529aa6bb4 hw/nvme: fix handling of over-committed queues
If a host chooses to use the SQHD "hint" in the CQE to know if there is
room in the submission queue for additional commands, it may result in a
situation where there are not enough internal resources (struct
NvmeRequest) available to process the command. For a lack of a better
term, the host may "over-commit" the device (i.e., it may have more
inflight commands than the queue size).

For example, assume a queue with N entries. The host submits N commands
and all are picked up for processing, advancing the head and emptying
the queue. Regardless of which of these N commands complete first, the
SQHD field of that CQE will indicate to the host that the queue is
empty, which allows the host to issue N commands again. However, if the
device has not posted CQEs for all the previous commands yet, the device
will have less than N resources available to process the commands, so
queue processing is suspended.

And here lies an 11 year latent bug. In the absense of any additional
tail updates on the submission queue, we never schedule the processing
bottom-half again unless we observe a head update on an associated full
completion queue. This has been sufficient to handle N-to-1 SQ/CQ setups
(in the absense of over-commit of course). Incidentially, that "kick all
associated SQs" mechanism can now be killed since we now just schedule
queue processing when we return a processing resource to a non-empty
submission queue, which happens to cover both edge cases. However, we
must retain kicking the CQ if it was previously full.

So, apparently, no previous driver tested with hw/nvme has ever used
SQHD (e.g., neither the Linux NVMe driver or SPDK uses it). But then OSv
shows up with the driver that actually does. I salute you.

Fixes: f3c507adcd ("NVMe: Initial commit for new storage interface")
Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2388
Reported-by: Waldemar Kozaczuk <jwkozaczuk@gmail.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2024-11-08 09:14:30 +01:00
8f472a0e7a hw/nvme: remove dead code
Remove dead code which always returns success, since PRCHK will have a
value of zero.

Signed-off-by: Arun Kumar <arun.kka@samsung.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Link: https://lore.kernel.org/r/20241022222105.3609223-1-arun.kka@samsung.com
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2024-11-04 19:09:45 +01:00
dbaa2936b3 hw/nvme: add NPDAL/NPDGL
Add the NPDGL and NPDAL fields to support large alignment and
granularities.

Signed-off-by: Ayush Mishra <ayush.m55@samsung.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Link: https://lore.kernel.org/r/20241001012833.3551820-1-ayush.m55@samsung.com
[k.jensen: renamed the enum values]
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2024-11-04 19:09:45 +01:00
79e490058f hw/nvme: i/o cmd set independent namespace data structure
Add support for the I/O Command Set Independent Namespace Data
Structure (CNS 8h and 1fh).

Signed-off-by: Arun Kumar <arun.kka@samsung.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Link: https://lore.kernel.org/r/20240925004407.3521406-1-arun.kka@samsung.com
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2024-11-04 19:09:45 +01:00
ebd1568fc7 hw/nvme: add atomic write support
Adds support for the controller atomic parameters: AWUN and AWUPF. Atomic
Compare and Write Unit (ACWU) is not currently supported.

Writes that adhere to the ACWU and AWUPF parameters are guaranteed to be atomic.

New NVMe QEMU Parameters (See NVMe Specification for details):
       atomic.dn (default off) - Set the value of Disable Normal.
       atomic.awun=UINT16 (default: 0)
       atomic.awupf=UINT16 (default: 0)

By default (Disable Normal set to zero), the maximum atomic write size is
set to the AWUN value.  If Disable Normal is set, the maximum atomic write
size is set to AWUPF.

Signed-off-by: Alan Adamson <alan.adamson@oracle.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2024-10-01 09:02:05 +02:00
e4bcb5865c hw/nvme: add knob for CTRATT.MEM
Add a boolean prop (ctratt.mem) for setting CTRATT.MEM and default it to
unset (false) to keep existing behavior of the device intact.

Reviewed-by: Minwoo Im <minwoo.im@samsung.com>
Reviewed-by: Arun Kumar <arun.kka@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2024-10-01 09:01:46 +02:00
a1ab67883d hw/nvme: support CTRATT.MEM
Indicate that 'MDTS and Size Limits Exclude Metadata (MEM)' in the
Controller Attributes (CTRATT) I/O Command Set Independent Identify
Controller Data Structure.

Signed-off-by: Arun Kumar <arun.kka@samsung.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
[k.jensen: updated commit message]
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2024-09-30 12:45:17 +02:00
16eb2ea8ff hw/nvme: clear masked events from the aer queue
Clear masked events from the aer queue when get log page is issued with
RAE 0 without checking for the presence of outstanding aer requests.

Signed-off-by: Arun Kumar <arun.kka@samsung.com>
[k.jensen: remove unnecessary QTAILQ_EMPTY check]
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2024-09-30 12:45:17 +02:00
78ca36df42 hw/nvme: report id controller metadata sgl support
The controller already supports this decoding, so just set the
ID_CTRL.SGLS field accordingly.

Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2024-09-30 12:45:17 +02:00
7f2acdfbe0 hw/nvme: replace assert(false) with g_assert_not_reached()
This patch is part of a series that moves towards a consistent use of
g_assert_not_reached() rather than an ad hoc mix of different
assertion mechanisms.

Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Message-ID: <20240919044641.386068-11-pierrick.bouvier@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-09-24 13:53:35 +02:00
e3d0814368 hw: Use device_class_set_legacy_reset() instead of opencoding
Use device_class_set_legacy_reset() instead of opencoding an
assignment to DeviceClass::reset. This change was produced
with:
 spatch --macro-file scripts/cocci-macro-file.h \
    --sp-file scripts/coccinelle/device-reset.cocci \
    --keep-comments --smpl-spacing --in-place --dir hw

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240830145812.1967042-8-peter.maydell@linaro.org
2024-09-13 15:31:44 +01:00
6a22121c4f hw/nvme: fix leak of uninitialized memory in io_mgmt_recv
Yutaro Shimizu from the Cyber Defense Institute discovered a bug in the
NVMe emulation that leaks contents of an uninitialized heap buffer if
subsystem and FDP emulation are enabled.

Cc: qemu-stable@nongnu.org
Reported-by: Yutaro Shimizu <shimizu@cyberdefense.jp>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2024-08-20 06:16:48 +02:00
19c45c00dc Revert "pcie_sriov: Ensure VF function number does not overflow"
This reverts commit 7771870115.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-08-01 04:32:00 -04:00
5885bcef3d Merge tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu into staging
virtio,pci,pc: features,fixes

pci: Initial support for SPDM Responders
cxl: Add support for scan media, feature commands, device patrol scrub
    control, DDR5 ECS control, firmware updates
virtio: in-order support
virtio-net: support for SR-IOV emulation (note: known issues on s390,
                                          might get reverted if not fixed)
smbios: memory device size is now configurable per Machine
cpu: architecture agnostic code to support vCPU Hotplug

Fixes, cleanups all over the place.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# -----BEGIN PGP SIGNATURE-----
#
# iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmae9l8PHG1zdEByZWRo
# YXQuY29tAAoJECgfDbjSjVRp8fYH/impBH9nViO/WK48io4mLSkl0EUL8Y/xrMvH
# zKFCKaXq8D96VTt1Z4EGKYgwG0voBKZaCEKYU/0ARGnSlSwxINQ8ROCnBWMfn2sx
# yQt08EXVMznNLtXjc6U5zCoCi6SaV85GH40No3MUFXBQt29ZSlFqO/fuHGZHYBwS
# wuVKvTjjNF4EsGt3rS4Qsv6BwZWMM+dE6yXpKWk68kR8IGp+6QGxkMbWt9uEX2Md
# VuemKVnFYw0XGCGy5K+ZkvoA2DGpEw0QxVSOMs8CI55Oc9SkTKz5fUSzXXGo1if+
# M1CTjOPJu6pMym6gy6XpFa8/QioDA/jE2vBQvfJ64TwhJDV159s=
# =k8e9
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 23 Jul 2024 10:16:31 AM AEST
# gpg:                using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg:                issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [undefined]
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>" [undefined]
# gpg: WARNING: The key's User ID is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17  0970 C350 3912 AFBE 8E67
#      Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA  8A0D 281F 0DB8 D28D 5469

* tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (61 commits)
  hw/nvme: Add SPDM over DOE support
  backends: Initial support for SPDM socket support
  hw/pci: Add all Data Object Types defined in PCIe r6.0
  tests/acpi: Add expected ACPI AML files for RISC-V
  tests/qtest/bios-tables-test.c: Enable basic testing for RISC-V
  tests/acpi: Add empty ACPI data files for RISC-V
  tests/qtest/bios-tables-test.c: Remove the fall back path
  tests/acpi: update expected DSDT blob for aarch64 and microvm
  acpi/gpex: Create PCI link devices outside PCI root bridge
  tests/acpi: Allow DSDT acpi table changes for aarch64
  hw/riscv/virt-acpi-build.c: Update the HID of RISC-V UART
  hw/riscv/virt-acpi-build.c: Add namespace devices for PLIC and APLIC
  virtio-iommu: Add trace point on virtio_iommu_detach_endpoint_from_domain
  hw/vfio/common: Add vfio_listener_region_del_iommu trace event
  virtio-iommu: Remove the end point on detach
  virtio-iommu: Free [host_]resv_ranges on unset_iommu_devices
  virtio-iommu: Remove probe_done
  Revert "virtio-iommu: Clear IOMMUDevice when VFIO device is unplugged"
  gdbstub: Add helper function to unregister GDB register space
  physmem: Add helper function to destroy CPU AddressSpace
  ...

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-07-24 09:32:04 +10:00
4f947b10d5 hw/nvme: Add SPDM over DOE support
Setup Data Object Exchange (DOE) as an extended capability for the NVME
controller and connect SPDM to it (CMA) to it.

Signed-off-by: Wilfred Mallawa <wilfred.mallawa@wdc.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Acked-by: Klaus Jensen <k.jensen@samsung.com>
Message-Id: <20240703092027.644758-4-alistair.francis@wdc.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-22 20:15:42 -04:00
4ea3de93a3 hw/nvme: remove useless type cast
The type of req->cmd is NvmeCmd, cast the pointer of this type to
NvmeCmd* is useless.

Signed-off-by: Yao Xingtao <yaoxt.fnst@fujitsu.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2024-07-22 14:43:17 +02:00
75209c071a hw/nvme: actually implement abort
Abort was not implemented previously, but we can implement it for AERs
and asynchrnously for I/O.

Signed-off-by: Ayush Mishra <ayush.m55@samsung.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2024-07-22 14:36:15 +02:00
d522aef88d hw/nvme: add cross namespace copy support
Extend copy command to copy user data across different namespaces via
support for specifying a namespace for each source range

Signed-off-by: Arun Kumar <arun.kka@samsung.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
2024-07-22 14:36:15 +02:00