zstd significantly reduces cluster compression time.
It provides better compression performance maintaining
the same level of the compression ratio in comparison with
zlib, which, at the moment, is the only compression
method available.
The performance test results:
Test compresses and decompresses qemu qcow2 image with just
installed rhel-7.6 guest.
Image cluster size: 64K. Image on disk size: 2.2G
The test was conducted with brd disk to reduce the influence
of disk subsystem to the test results.
The results is given in seconds.
compress cmd:
time ./qemu-img convert -O qcow2 -c -o compression_type=[zlib|zstd]
src.img [zlib|zstd]_compressed.img
decompress cmd
time ./qemu-img convert -O qcow2
[zlib|zstd]_compressed.img uncompressed.img
compression decompression
zlib zstd zlib zstd
------------------------------------------------------------
real 65.5 16.3 (-75 %) 1.9 1.6 (-16 %)
user 65.0 15.8 5.3 2.5
sys 3.3 0.2 2.0 2.0
Both ZLIB and ZSTD gave the same compression ratio: 1.57
compressed image size in both cases: 1.4G
Signed-off-by: Denis Plotnikov <dplotnikov@virtuozzo.com>
QAPI part:
Acked-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20200507082521.29210-4-dplotnikov@virtuozzo.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
The patch adds some preparation parts for incompatible compression type
feature to qcow2 allowing the use different compression methods for
image clusters (de)compressing.
It is implied that the compression type is set on the image creation and
can be changed only later by image conversion, thus compression type
defines the only compression algorithm used for the image, and thus,
for all image clusters.
The goal of the feature is to add support of other compression methods
to qcow2. For example, ZSTD which is more effective on compression than ZLIB.
The default compression is ZLIB. Images created with ZLIB compression type
are backward compatible with older qemu versions.
Adding of the compression type breaks a number of tests because now the
compression type is reported on image creation and there are some changes
in the qcow2 header in size and offsets.
The tests are fixed in the following ways:
* filter out compression_type for many tests
* fix header size, feature table size and backing file offset
affected tests: 031, 036, 061, 080
header_size +=8: 1 byte compression type
7 bytes padding
feature_table += 48: incompatible feature compression type
backing_file_offset += 56 (8 + 48 -> header_change + feature_table_change)
* add "compression type" for test output matching when it isn't filtered
affected tests: 049, 060, 061, 065, 082, 085, 144, 182, 185, 198, 206,
242, 255, 274, 280
Signed-off-by: Denis Plotnikov <dplotnikov@virtuozzo.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
QAPI part:
Acked-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20200507082521.29210-2-dplotnikov@virtuozzo.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Since 5.0 QEMU uses hostmem backend for allocating main guest RAM.
The backend however calls mbind() which is typically NOP
in case of default policy/absent host-nodes bitmap.
However when runing in container with black-listed mbind()
syscall, QEMU fails to start with error
"cannot bind memory to host NUMA nodes: Operation not permitted"
even when user hasn't provided host-nodes to pin to explictly
(which is the case with -m option)
To fix issue, call mbind() only in case when user has provided
host-nodes explicitly (i.e. host_nodes bitmap is not empty).
That should allow to run QEMU in containers with black-listed
mbind() without memory pinning. If QEMU provided memory-pinning
is required user still has to white-list mbind() in container
configuration.
Reported-by: Manuel Hohmann <mhohmann@physnet.uni-hamburg.de>
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20200430154606.6421-1-imammedo@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Merge tpm 2020/05/08 v3
# gpg: Signature made Tue 12 May 2020 16:50:34 BST
# gpg: using RSA key B818B9CADF9089C2D5CEC66B75AD65802A0B4211
# gpg: Good signature from "Stefan Berger <stefanb@linux.vnet.ibm.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: B818 B9CA DF90 89C2 D5CE C66B 75AD 6580 2A0B 4211
* remotes/stefanberger/tags/pull-tpm-2020-05-08-1:
hw/tpm: fix usage of bool in tpm-tis.c
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Clean up wrong usage of FALSE and TRUE in places that use "bool" from stdbool.h.
FALSE and TRUE (with capital letters) are the constants defined by glib for
being used with the "gboolean" type of glib. But some parts of the code also use
TRUE and FALSE for variables that are declared as "bool" (the type from <stdbool.h>).
Signed-off-by: Jafar Abdi <cafer.abdi@gmail.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
I can't find proper documentation or datasheet, but it is likely
a MMIO mapped serial device mapped in the 0x80000000..0x8000ffff
range belongs to the SoC address space, thus is always mapped in
the memory bus.
Map the devices on the bus regardless a chardev is attached to it.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Jan Kiszka <jan.kiszka@web.de>
Message-id: 20200505095945.23146-1-f4bug@amsat.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
For contiguous predicated memory operations, we want to
minimize the number of tlb lookups performed. We have
open-coded this for sve_ld1_r, but for correctness with
MTE we will need this for all of the memory operations.
Create a structure that holds the bounds of active elements,
and metadata for two pages. Add routines to find those
active elements, lookup the pages, and run watchpoints
for those pages.
Temporarily mark the functions unused to avoid Werror.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200508154359.7494-10-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Use the "normal" memory access functions, rather than the
softmmu internal helper functions directly.
Since fb901c905d, cpu_mem_index is now a simple extract
from env->hflags and not a large computation. Which means
that it's now more work to pass around this value than it
is to recompute it.
This only adjusts the primitives, and does not clean up
all of the uses within sve_helper.c.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200508154359.7494-8-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
There are minimal differences from Qemu's point of view between the A0
and A1 silicon revisions.
As the A1 exercises different code paths in u-boot it is desirable to
emulate that instead.
Signed-off-by: Joel Stanley <joel@jms.id.au>
Reviewed-by: Andrew Jeffery <andrew@aj.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-id: 20200504093703.261135-1-joel@jms.id.au
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This is a boot stub that is similar to the code u-boot runs, allowing
the kernel to boot the secondary CPU.
u-boot works as follows:
1. Initialises the SMP mailbox area in the SCU at 0x1e6e2180 with default values
2. Copies a stub named 'mailbox_insn' from flash to the SCU, just above the
mailbox area
3. Sets AST_SMP_MBOX_FIELD_READY to a magic value to indicate the
secondary can begin execution from the stub
4. The stub waits until the AST_SMP_MBOX_FIELD_GOSIGN register is set to
a magic value
5. Jumps to the address in AST_SMP_MBOX_FIELD_ENTRY, starting Linux
Linux indicates it is ready by writing the address of its entrypoint
function to AST_SMP_MBOX_FIELD_ENTRY and the 'go' magic number to
AST_SMP_MBOX_FIELD_GOSIGN. The secondary CPU sees this at step 4 and
breaks out of it's loop.
To be compatible, a fixed qemu stub is loaded into the mailbox area. As
qemu can ensure the stub is loaded before execution starts, we do not
need to emulate the AST_SMP_MBOX_FIELD_READY behaviour of u-boot. The
secondary CPU's program counter points to the beginning of the stub,
allowing qemu to start secondaries at step four.
Reboot behaviour is preserved by resetting AST_SMP_MBOX_FIELD_GOSIGN
when the secondaries are reset.
This is only configured when the system is booted with -kernel and qemu
does not execute u-boot first.
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Tested-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Block layer patches:
- qcow2: Fix preallocation on block devices
- backup: Make sure that source and target size match
- vmdk: Fix zero cluster handling
- Follow-up cleanups and fixes for the truncate changes
- iotests: Skip more tests if required drivers are missing
# gpg: Signature made Fri 08 May 2020 13:39:55 BST
# gpg: using RSA key DC3DEB159A9AF95D3D7456FE7F09B272C88F2FD6
# gpg: issuer "kwolf@redhat.com"
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" [full]
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* remotes/kevin/tags/for-upstream: (30 commits)
block: Drop unused .bdrv_has_zero_init_truncate
vhdx: Rework truncation logic
parallels: Rework truncation logic
ssh: Support BDRV_REQ_ZERO_WRITE for truncate
sheepdog: Support BDRV_REQ_ZERO_WRITE for truncate
rbd: Support BDRV_REQ_ZERO_WRITE for truncate
nfs: Support BDRV_REQ_ZERO_WRITE for truncate
file-win32: Support BDRV_REQ_ZERO_WRITE for truncate
gluster: Drop useless has_zero_init callback
qcow2: Fix preallocation on block devices
iotests/055: Use cache.no-flush for vmdk target
iotests: Backup with different source/target size
backup: Make sure that source and target size match
backup: Improve error for bdrv_getlength() failure
iotests/283: Use consistent size for source and target
iotests: vmdk: Enable zeroed_grained=on by default
vmdk: Flush only once in vmdk_L2update()
vmdk: Don't update L2 table for zero write on zero cluster
vmdk: Fix partial overwrite of zero cluster
vmdk: Fix zero cluster allocation
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Now that there are no clients of bdrv_has_zero_init_truncate, none of
the drivers need to worry about providing it.
What's more, this eliminates a source of some confusion: a literal
reading of the documentation as written in ceaca56f and implemented in
commit 1dcaf527 claims that a driver which returns 0 for
bdrv_has_zero_init_truncate() must not return 1 for
bdrv_has_zero_init(); this condition was violated for parallels, qcow,
and sometimes for vdi, although in practice it did not matter since
those drivers also lacked .bdrv_co_truncate.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20200428202905.770727-10-eblake@redhat.com>
Acked-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The vhdx driver uses truncation for image growth, with a special case
for blocks that already read as zero but which are only being
partially written. But with a bit of rearranging, it's just as easy
to defer the decision on whether truncation resulted in zeroes to the
actual allocation attempt, reducing the number of places that still
use bdrv_has_zero_init_truncate.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20200428202905.770727-9-eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The parallels driver tries to use truncation for image growth, but can
only do so when reads are guaranteed as zero. Now that we have a way
to request zero contents from truncation, we can defer the decision to
actual allocation attempts rather than up front, reducing the number
of places that still use bdrv_has_zero_init_truncate.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20200428202905.770727-8-eblake@redhat.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Our .bdrv_has_zero_init_truncate can detect when the remote side
always zero fills; we can reuse that same knowledge to implement
BDRV_REQ_ZERO_WRITE by ignoring it when the server gives it to us for
free.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20200428202905.770727-7-eblake@redhat.com>
Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Our .bdrv_has_zero_init_truncate always returns 1 because sheepdog
always 0-fills; we can use that same knowledge to implement
BDRV_REQ_ZERO_WRITE by ignoring it.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20200428202905.770727-6-eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Our .bdrv_has_zero_init_truncate always returns 1 because rbd always
0-fills; we can use that same knowledge to implement
BDRV_REQ_ZERO_WRITE by ignoring it.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20200428202905.770727-5-eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Our .bdrv_has_zero_init_truncate returns 1 if we detect that the OS
always 0-fills; we can use that same knowledge to implement
BDRV_REQ_ZERO_WRITE by ignoring it when the OS gives it to us for
free.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20200428202905.770727-4-eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
When using bdrv_file, .bdrv_has_zero_init_truncate always returns 1;
therefore, we can behave just like file-posix, and always implement
BDRV_REQ_ZERO_WRITE by ignoring it since the OS gives it to us for
free (note that file-posix.c had to use an 'if' because it shared code
between regular files and block devices, but in file-win32.c,
bdrv_host_device uses a separate .bdrv_file_open).
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20200428202905.770727-3-eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>