The various logical migration channels don't have a
standardized way of advertising themselves and their
connections may be seen out of order by the migration
destination. When a new connection arrives, the incoming
migration currently make use of heuristics to determine
which channel it belongs to.
The next few patches will need to change how the multifd
and postcopy capabilities interact and that affects the
channel discovery heuristic.
Refactor the channel discovery heuristic to make it less
opaque and simplify the subsequent patches.
Signed-off-by: Prasad Pandit <pjp@fedoraproject.org>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Message-ID: <20250411114534.3370816-3-ppandit@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
control_save_page() is for RDMA only, unfold it to make the code more
clear.
In addition:
- Similar to other branches style in ram_save_target_page(), involve RDMA
only if the condition 'migrate_rdma()' is true.
- Further simplify the code by removing the RAM_SAVE_CONTROL_NOT_SUPP.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
Message-ID: <20250305062825.772629-6-lizhijian@fujitsu.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Depending on the order of starting RDMA and setting capability,
they can be categorized into the following scenarios:
Source:
S1: [set capabilities] -> [Start RDMA outgoing]
Destination:
D1: [set capabilities] -> [Start RDMA incoming]
D2: [Start RDMA incoming] -> [set capabilities]
Previously, compatibility between RDMA and capabilities was verified only
in scenario D1, potentially causing migration failures in other situations.
For scenarios S1 and D1, we can seamlessly incorporate
migration_transport_compatible() to address compatibility between
channels and capabilities vs transport.
For scenario D2, ensure compatibility within migrate_caps_check().
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
Message-ID: <20250305062825.772629-3-lizhijian@fujitsu.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Various patches loosely related to single binary work:
- Replace cpu_list() definition by CPUClass::list_cpus() callback
- Remove few MO_TE definitions on Hexagon / X86 targets
- Remove target_ulong uses in ARMMMUFaultInfo and ARM CPUWatchpoint
- Remove DEVICE_HOST_ENDIAN definition
- Evaluate TARGET_BIG_ENDIAN at compile time and use target_needs_bswap() more
- Rename target_words_bigendian() as target_big_endian()
- Convert target_name() and target_cpu_type() to TargetInfo API
- Constify QOM TypeInfo class_data/interfaces fields
- Get default_cpu_type calling machine_class_default_cpu_type()
- Correct various uses of GLibCompareDataFunc prototype
- Simplify ARM/Aarch64 gdb_get_core_xml_file() handling a bit
- Move device tree files in their own pc-bios/dtb/ subdir
- Correctly check strchrnul() symbol availability on macOS SDK
- Move target-agnostic methods out of cpu-target.c and accel-target.c
- Unmap canceled USB XHCI packet
- Use deposit/extract API in designware model
- Fix MIPS16e translation
- Few missing header fixes
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEE+qvnXhKRciHc/Wuy4+MsLN6twN4FAmgLqb8ACgkQ4+MsLN6t
# wN6nCQ//cmv1M+NsndhO5TAK8T1eUSXKlTZh932uro6ZgxKwN4p+j1Qo7bq3O9gu
# qUMHNbcfQl8sHSytiXBoxCjLMCXC3u38iyz75WGXuPay06rs4wqmahqxL4tyno3l
# 1RviFts9xlLn+tJqqrAR6+pRdALld0TY+yXUjXgr4aK5pIRpLz9U/sIEoh7qbA5U
# x0MTaceDG3A91OYo0TgrNbcMe1b9GqQZ+a4tbaP+oE37wbiKdyQ68LjrEbV08Y1O
# qrFF4oxquV31QJcUiuII1W7hC6psGrMsUA1f1qDu7QvmybAZWNZNsR9T66X9jH5J
# wXMShJmmXwxugohmuPPFnDshzJy90aFL6Jy2shrfqcG2v0W66ARY1ZnbJLCcfczt
# 073bnE2dnOVhd/ny37RrIJNJLLmYM0yFDeKuYtNNAzpK9fpA7Q2PI8QiqNacQ3Pa
# TdEYrGlMk7OeNck8xJmJMY5rATthi1D4dIBv3rjQbUolQvPJe2Y9or0R2WL1jK5v
# hhr6DY01iSPES3CravmUs/aB1HRMPi/nX45OmFR6frAB7xqWMreh81heBVuoTTK8
# PuXtRQgRMRKwDeTxlc6p+zba4mIEYG8rqJtPFRgViNCJ1KsgSIowup3BNU05YuFn
# NoPoRayMDVMgejVgJin3Mg2DCYvt/+MBmO4IoggWlFsXj59uUgA=
# =DXnZ
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 25 Apr 2025 11:26:55 EDT
# gpg: using RSA key FAABE75E12917221DCFD6BB2E3E32C2CDEADC0DE
# gpg: Good signature from "Philippe Mathieu-Daudé (F4BUG) <f4bug@amsat.org>" [full]
# Primary key fingerprint: FAAB E75E 1291 7221 DCFD 6BB2 E3E3 2C2C DEAD C0DE
* tag 'single-binary-20250425' of https://github.com/philmd/qemu: (58 commits)
qemu: Convert target_name() to TargetInfo API
accel: Move target-agnostic code from accel-target.c -> accel-common.c
accel: Make AccelCPUClass structure target-agnostic
accel: Include missing 'qemu/accel.h' header in accel-internal.h
accel: Implement accel_init_ops_interfaces() for both system/user mode
cpus: Move target-agnostic methods out of cpu-target.c
cpus: Replace CPU_RESOLVING_TYPE -> target_cpu_type()
qemu: Introduce target_cpu_type()
qapi: Rename TargetInfo structure as QemuTargetInfo
hw/microblaze: Evaluate TARGET_BIG_ENDIAN at compile time
hw/mips: Evaluate TARGET_BIG_ENDIAN at compile time
target/xtensa: Evaluate TARGET_BIG_ENDIAN at compile time
target/mips: Check CPU endianness at runtime using env_is_bigendian()
accel/kvm: Use target_needs_bswap()
linux-user/elfload: Use target_needs_bswap()
target/hexagon: Include missing 'accel/tcg/getpc.h'
accel/tcg: Correct list of included headers in tcg-stub.c
system/kvm: make functions accessible from common code
meson: Use osdep_prefix for strchrnul()
meson: Share common C source prefixes
...
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The migration core subsystem makes use of the VFIO migration API to
collect statistics on the number of bytes transferred. These services
are declared in "hw/vfio/vfio-common.h" which also contains VFIO
internal declarations. Move the migration declarations into a new
header file "hw/vfio/vfio-migration.h" to reduce the exposure of VFIO
internals.
While at it, use a 'vfio_migration_' prefix for these services.
To be noted, vfio_migration_add_bytes_transferred() is a VFIO
migration internal service which we will be moved in the subsequent
patches.
Cc: Kirti Wankhede <kwankhede@nvidia.com>
Cc: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Prasad Pandit <pjp@fedoraproject.org>
Reviewed-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Avihai Horon <avihaih@nvidia.com>
Link: https://lore.kernel.org/qemu-devel/20250326075122.1299361-4-clg@redhat.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Rename to migration_legacy_page_bits, to make it clear that
we cannot change the value without causing a migration break.
Move to page-vary.h and page-vary-target.c.
Define via TARGET_PAGE_BITS if not TARGET_PAGE_BITS_VARY.
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Convert the existing includes with
sed -i ,exec/memory.h,system/memory.h,g
Move the include within cpu-all.h into a !CONFIG_USER_ONLY block.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Address an error in RDMA-based migration by ensuring RDMA is prioritized
when saving pages in `ram_save_target_page()`.
Previously, the RDMA protocol's page-saving step was placed after other
protocols due to a refactoring in commit bc38dc2f5f. This led to migration
failures characterized by unknown control messages and state loading errors
destination:
(qemu) qemu-system-x86_64: Unknown control message QEMU FILE
qemu-system-x86_64: error while loading state section id 1(ram)
qemu-system-x86_64: load of migration failed: Operation not permitted
source:
(qemu) qemu-system-x86_64: RDMA is in an error state waiting migration to abort!
qemu-system-x86_64: failed to save SaveStateEntry with id(name): 1(ram): -1
qemu-system-x86_64: rdma migration: recv polling control error!
qemu-system-x86_64: warning: Early error. Sending error.
qemu-system-x86_64: warning: rdma migration: send polling control error
RDMA migration implemented its own protocol/method to send pages to
destination side, hand over to RDMA first to prevent pages being saved by
other protocol.
Fixes: bc38dc2f5f ("migration: refactor ram_save_target_page functions")
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
Message-ID: <20250305062825.772629-2-lizhijian@fujitsu.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Unlike cpr-reboot mode, cpr-transfer mode cannot save volatile ram blocks
in the migration stream file and recreate them later, because the physical
memory for the blocks is pinned and registered for vfio. Add a blocker
for volatile ram blocks.
Also add a blocker for RAM_GUEST_MEMFD. Preserving guest_memfd may be
sufficient for CPR, but it has not been tested yet.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-ID: <1740667681-257312-1-git-send-email-steven.sistare@oracle.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
On the incoming migration side, QEMU uses a coroutine to load all the VM
states. Inside, it may reference MigrationState on global states like
migration capabilities, parameters, error state, shared mutexes and more.
However there's nothing yet to make sure MigrationState won't get
destroyed (e.g. after migration_shutdown()). Meanwhile there's also no API
available to remove the incoming coroutine in migration_shutdown(),
avoiding it to access the freed elements.
There's a bug report showing this can happen and crash dest QEMU when
migration is cancelled on source.
When it happens, the dest main thread is trying to cleanup everything:
#0 qemu_aio_coroutine_enter
#1 aio_dispatch_handler
#2 aio_poll
#3 monitor_cleanup
#4 qemu_cleanup
#5 qemu_default_main
Then it found the migration incoming coroutine, schedule it (even after
migration_shutdown()), causing crash:
#0 __pthread_kill_implementation
#1 __pthread_kill_internal
#2 __GI_raise
#3 __GI_abort
#4 __assert_fail_base
#5 __assert_fail
#6 qemu_mutex_lock_impl
#7 qemu_lockable_mutex_lock
#8 qemu_lockable_lock
#9 qemu_lockable_auto_lock
#10 migrate_set_error
#11 process_incoming_migration_co
#12 coroutine_trampoline
To fix it, take a refcount after an incoming setup is properly done when
qmp_migrate_incoming() succeeded the 1st time. As it's during a QMP
handler which needs BQL, it means the main loop is still alive (without
going into cleanups, which also needs BQL).
Releasing the refcount now only until the incoming migration coroutine
finished or failed. Hence the refcount is valid for both (1) setup phase
of incoming ports, mostly IO watches (e.g. qio_channel_add_watch_full()),
and (2) the incoming coroutine itself (process_incoming_migration_co()).
Note that we can't unref in migration_incoming_state_destroy(), because
both qmp_xen_load_devices_state() and load_snapshot() will use it without
an incoming migration. Those hold BQL so they're not prone to this issue.
PS: I suspect nobody uses Xen's command at all, as it didn't register yank,
hence AFAIU the command should crash on master when trying to unregister
yank in migration_incoming_state_destroy().. but that's another story.
Also note that in some incoming failure cases we may not always unref the
MigrationState refcount, which is a trade-off to keep things simple. We
could make it accurate, but it can be an overkill. Some examples:
- Unlike most of the rest protocols, socket_start_incoming_migration()
may create net listener after incoming port setup sucessfully.
It means we can't unref in migration_channel_process_incoming() as a
generic path because socket protocol might keep using MigrationState.
- For either socket or file, multiple IO watches might be created, it
means logically each IO watch needs to take one refcount for
MigrationState so as to be 100% accurate on ownership of refcount taken.
In general, we at least need per-protocol handling to make it accurate,
which can be an overkill if we know incoming failed after all. Add a short
comment to explain that when taking the refcount in qmp_migrate_incoming().
Bugzilla: https://issues.redhat.com/browse/RHEL-69775
Tested-by: Yan Fu <yafu@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Message-ID: <20250220132459.512610-1-peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
The newly introduced device state buffer can be used for either storing
VFIO's read() raw data, but already also possible to store generic device
states. After noticing that device states may not easily provide a max
buffer size (also the fact that RAM MultiFDPages_t after all also want to
have flexibility on managing offset[] array), it may not be a good idea to
stick with union on MultiFDSendData.. as it won't play well with such
flexibility.
Switch MultiFDSendData to a struct.
It won't consume a lot more space in reality, after all the real buffers
were already dynamically allocated, so it's so far only about the two
structs (pages, device_state) that will be duplicated, but they're small.
With this, we can remove the pretty hard to understand alloc size logic.
Because now we can allocate offset[] together with the SendData, and
properly free it when the SendData is freed.
[MSS: Make sure to clear possible device state payload before freeing
MultiFDSendData, remove placeholders for other patches not included]
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Acked-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/qemu-devel/7b02baba8e6ddb23ef7c349d312b9b631db09d7e.1741124640.git.maciej.szmigiero@oracle.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Add a basic support for receiving device state via multifd channels -
channels that are shared with RAM transfers.
Depending whether MULTIFD_FLAG_DEVICE_STATE flag is present or not in the
packet header either device state (MultiFDPacketDeviceState_t) or RAM
data (existing MultiFDPacket_t) is read.
The received device state data is provided to
qemu_loadvm_load_state_buffer() function for processing in the
device's load_state_buffer handler.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Link: https://lore.kernel.org/qemu-devel/9b86f806c134e7815ecce0eee84f0e0e34aa0146.1741124640.git.maciej.szmigiero@oracle.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
All callers to migration_incoming_state_destroy() other than
postcopy_ram_listen_thread() do this call with BQL held.
Since migration_incoming_state_destroy() ultimately calls "load_cleanup"
SaveVMHandlers and it will soon call BQL-sensitive code it makes sense
to always call that function under BQL rather than to have it deal with
both cases (with BQL and without BQL).
Add the necessary bql_lock() and bql_unlock() to
postcopy_ram_listen_thread().
qemu_loadvm_state_main() in postcopy_ram_listen_thread() could call
"load_state" SaveVMHandlers that are expecting BQL to be held.
In principle, the only devices that should be arriving on migration
channel serviced by postcopy_ram_listen_thread() are those that are
postcopiable and whose load handlers are safe to be called without BQL
being held.
But nothing currently prevents the source from sending data for "unsafe"
devices which would cause trouble there.
Add a TODO comment there so it's clear that it would be good to improve
handling of such (erroneous) case in the future.
Acked-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Link: https://lore.kernel.org/qemu-devel/21bb5ca337b1d5a802e697f553f37faf296b5ff4.1741193259.git.maciej.szmigiero@oracle.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
This QEMU_VM_COMMAND sub-command and its switchover_start SaveVMHandler is
used to mark the switchover point in main migration stream.
It can be used to inform the destination that all pre-switchover main
migration stream data has been sent/received so it can start to process
post-switchover data that it might have received via other migration
channels like the multifd ones.
Add also the relevant MigrationState bit stream compatibility property and
its hw_compat entry.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Zhang Chen <zhangckid@gmail.com> # for the COLO part
Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Link: https://lore.kernel.org/qemu-devel/311be6da85fc7e49a7598684d80aa631778dcbce.1741124640.git.maciej.szmigiero@oracle.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Memory pull request for 10.0
v2 changelog:
- Fix Mac (and possibly some other) build issues for two patches
- os: add an ability to lock memory on_fault
- memory: pass MemTxAttrs to memory_access_is_direct()
List of features:
- William's fix on ram hole punching when with file offset
- Daniil's patchset to introduce mem-lock=on-fault
- William's hugetlb hwpoison fix for size report & remap
- David's series to allow qemu debug writes to MMIOs
# -----BEGIN PGP SIGNATURE-----
#
# iIgEABYKADAWIQS5GE3CDMRX2s990ak7X8zN86vXBgUCZ6zcQBIccGV0ZXJ4QHJl
# ZGhhdC5jb20ACgkQO1/MzfOr1wbL3wEAqx94NpB/tEEBj6WXE3uV9LqQ0GCTYmV+
# MbM51Vep8ksA/35yFn3ltM2yoSnUf9WJW6LXEEKhQlwswI0vChQERgkE
# =++O1
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 13 Feb 2025 01:37:04 HKT
# gpg: using EDDSA key B9184DC20CC457DACF7DD1A93B5FCCCDF3ABD706
# gpg: issuer "peterx@redhat.com"
# gpg: Good signature from "Peter Xu <xzpeter@gmail.com>" [full]
# gpg: aka "Peter Xu <peterx@redhat.com>" [full]
# Primary key fingerprint: B918 4DC2 0CC4 57DA CF7D D1A9 3B5F CCCD F3AB D706
* tag 'mem-next-pull-request' of https://gitlab.com/peterx/qemu:
overcommit: introduce mem-lock=on-fault
system: introduce a new MlockState enum
system/vl: extract overcommit option parsing into a helper
os: add an ability to lock memory on_fault
system/physmem: poisoned memory discard on reboot
system/physmem: handle hugetlb correctly in qemu_ram_remap()
physmem: teach cpu_memory_rw_debug() to write to more memory regions
hmp: use cpu_get_phys_page_debug() in hmp_gva2gpa()
memory: pass MemTxAttrs to memory_access_is_direct()
physmem: disallow direct access to RAM DEVICE in address_space_write_rom()
physmem: factor out direct access check into memory_region_supports_direct_access()
physmem: factor out RAM/ROMD check in memory_access_is_direct()
physmem: factor out memory_region_is_ram_device() check in memory_access_is_direct()
system/physmem: take into account fd_offset for file fallocate
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
qmp_migrate guarantees that cpr_channel is not null for
MIG_MODE_CPR_TRANSFER when cpr_state_save is called:
qmp_migrate()
if (s->parameters.mode == MIG_MODE_CPR_TRANSFER && !cpr_channel) {
return;
}
cpr_state_save(cpr_channel)
but cpr_state_save checks for mode differently before using channel,
and Coverity cannot infer that they are equivalent in outgoing QEMU,
and warns that channel may be NULL:
cpr_state_save(channel)
MigMode mode = migrate_mode();
if (mode == MIG_MODE_CPR_TRANSFER) {
f = cpr_transfer_output(channel, errp);
To make Coverity happy, assert that channel != NULL in cpr_state_save.
Resolves: Coverity CID 1590980
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Message-ID: <1738788841-211843-1-git-send-email-steven.sistare@oracle.com>
[assert instead of using parameters.mode in cpr_state_save]
Signed-off-by: Fabiano Rosas <farosas@suse.de>
The expected outcome from qmp_migrate_cancel() is that the source
migration goes to the terminal state
MIGRATION_STATUS_CANCELLED. Anything different from this is a bug when
cancelling.
Make sure there is never a state transition from an unspecified state
into FAILED. Code that sets FAILED, should always either make sure
that the old state is not CANCELLING or specify the old state.
Note that the destination is allowed to go into FAILED, so there's no
issue there.
(I don't think this is relevant as a backport because cancelling does
work, it just doesn't show the right state at the end)
Fixes: 3dde8fdbad ("migration: Merge precopy/postcopy on switchover start")
Fixes: d0edb8a173 ("migration: Create the postcopy preempt channel asynchronously")
Fixes: 8518278a6a ("migration: implementation of background snapshot thread")
Fixes: bf78a046b9 ("migration: refactor migrate_fd_connect failures")
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-ID: <20250213175927.19642-7-farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
After postcopy has started, it's not possible to recover the source
machine in case a migration error occurs because the destination has
already been changing the state of the machine. For that same reason,
it doesn't make sense to try to cancel the migration after postcopy
has started. Reject the cancel command during postcopy.
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-ID: <20250213175927.19642-6-farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Remove all instances of _fd_ from the migration generic code. These
functions have grown over time and the _fd_ part is now just
confusing.
migration_fd_error() -> migration_error() makes it a little
vague. Since it's only used for migration_connect() failures, change
it to migration_connect_set_error().
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-ID: <20250213175927.19642-4-farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
We're currently only checking the QEMUFile error after
qemu_loadvm_state(). This was causing a TLS termination error from
multifd recv threads to be ignored.
Start checking the migration error as well to avoid missing further
errors.
Regarding compatibility concerning the TLS termination error that was
being ignored, for QEMUs <= 9.2 - if the old QEMU is being used as
migration source - the recently added migration property
multifd-tls-clean-termination needs to be set to OFF in the
*destination* machine.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
We're currently changing the way the source multifd migration handles
the shutdown of the multifd channels when TLS is in use to perform a
clean termination by calling gnutls_bye().
Older src QEMUs will always close the channel without terminating the
TLS session. New dst QEMUs treat an unclean termination as an error.
Add multifd_clean_tls_termination (default true) that can be switched
on the destination whenever a src QEMU <= 9.2 is in use.
(Note that the compat property is only strictly necessary for src
QEMUs older than 9.1. Due to synchronization coincidences, src QEMUs
9.1 and 9.2 can put the destination in a condition where it doesn't
see the unclean termination. Still, make the property more inclusive
to facilitate potential backports.)
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
The multifd recv side has been getting a TLS error of
GNUTLS_E_PREMATURE_TERMINATION at the end of migration when the send
side closes the sockets without ending the TLS session. This has been
masked by the code not checking the migration error after loadvm.
Start ending the TLS session at multifd_send_shutdown() so the recv
side always sees a clean termination (EOF) and we can start to
differentiate that from an actual premature termination that might
possibly happen in the middle of the migration.
There's nothing to be done if a previous migration error has already
broken the connection, so add a comment explaining it and ignore any
errors coming from gnutls_bye().
This doesn't break compat with older recv-side QEMUs because EOF has
always caused the recv thread to exit cleanly.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Locking the memory without MCL_ONFAULT instantly prefaults any mmaped
anonymous memory with a write-fault, which introduces a lot of extra
overhead in terms of memory usage when all you want to do is to prevent
kcompactd from migrating and compacting QEMU pages. Add an option to
only lock pages lazily as they're faulted by the process by using
MCL_ONFAULT if asked.
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Daniil Tatianin <d-tatianin@yandex-team.ru>
Link: https://lore.kernel.org/r/20250212143920.1269754-5-d-tatianin@yandex-team.ru
Signed-off-by: Peter Xu <peterx@redhat.com>