Rename to parts$N_sqrt.
Reimplement float128_sqrt with FloatParts128.
Reimplement with the inverse sqrt newton-raphson algorithm from musl.
This is significantly faster than even the berkeley sqrt n-r algorithm,
because it does not use division instructions, only multiplication.
Ordinarily, changing algorithms at the same time as migrating code is
a bad idea, but this is the only way I found that didn't break one of
the routines at the same time.
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Rename to parts$N_compare. Rename all of the intermediate
functions to ftype_do_compare. Rename the hard-float functions
to ftype_hs_compare. Convert float128 to FloatParts128.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The float128 implementation is straight-forward.
Unfortuantely, we don't have any tests we can simply adjust/unlock.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210517142739.38597-24-david@redhat.com>
[rth: Update for changed parts_minmax return value]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Rename to parts$N_minmax. Combine 3 bool arguments to a bitmask.
Introduce ftype_minmax functions as a common optimization point.
Fold bfloat16 expansions into the same macro as the other types.
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Rename to parts$N_float_to_uint. Reimplement
float128_to_uint{32,64}{_round_to_zero} with FloatParts128.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Rename to parts$N_float_to_sint. Reimplement
float128_to_int{32,64}{_round_to_zero} with FloatParts128.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
At the same time, convert to pointers, split out
parts$N_round_to_int_normal, define a macro for
parts_round_to_int using QEMU_GENERIC.
This necessarily meant some rearrangement to the
rount_to_{,u}int_and_pack routines, so go ahead and
convert to parts_round_to_int_normal, which in turn
allows cleaning up of the raised exception handling.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Introduce parts_float_to_float_widen and parts_float_to_float_narrow.
Use them for float128_to_float{32,64} and float{32,64}_to_float128.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Have x86_64 assembly for them, with a fallback.
This avoids shuffling values through %cl in the x86 case.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
In preparation for implementing multiple sizes. Rename to parts_addsub,
split out parts_add/sub_normal for future reuse with muladd.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
At the same time, convert to pointers, renaming to parts$N_uncanon,
and define a macro for parts_uncanon using QEMU_GENERIC.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
At the same time, convert to pointers, rename to parts$N_canonicalize
and define a macro for parts_canonicalize using QEMU_GENERIC.
Rearrange the cases to recognize float_class_normal as
early as possible.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
At the same time, convert to pointers, rename to pick_nan_muladd$N
and define a macro for pick_nan_muladd using QEMU_GENERIC.
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
At the same time, convert to pointers, rename to parts$N_pick_nan
and define a macro for parts_pick_nan using QEMU_GENERIC.
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
At the same time, convert to pointers, rename to return_nan$N
and define a macro for return_nan using QEMU_GENERIC.
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This is the minimal change that also introduces float128_params,
float128_unpack_raw, and float128_pack_raw without running into
unused symbol Werrors.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Shuffle the fraction to the end, otherwise sort by size.
Add frac_hi and frac_lo members to alias frac.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
At the same time, rename to parts64_silence_nan, split out
parts_silence_nan_frac, and define a macro for parts_silence_nan.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
At the same time, rename to parts64_default_nan and add a
macro for parts_default_nan. This will be flushed out once
128-bit support is added.
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Do not call parts_silence_nan when default_nan_mode is in
effect. This will avoid an assert in a later patch.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Testing more than one class at a time is better done with masks.
This reduces the static branch count.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We have been somewhat inconsistent about when to use
float_raise and when to or in the bit by hand.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Rather than point the binary point at msb-1, put it at the msb.
Use uadd64_overflow to detect when addition overflows instead
of DECOMPOSED_OVERFLOW_BIT.
This reduces the number of special cases within the code, such
as shifting an int64_t either left or right during conversion.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This patch implements operations for bfloat16 except conversion and some misc
operations. We also add FloatFmt and pack/unpack interfaces for bfloat16.
As they are both static fields, we can't make a sperate patch for them.
Signed-off-by: LIU Zhiwei <zhiwei_liu@c-sky.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20200813071421.2509-2-zhiwei_liu@c-sky.com>
[rth: Use FloatRelation for comparison operations.]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
With Makefiles that have automatically generated dependencies, you
generated includes are set as dependencies of the Makefile, so that they
are built before everything else and they are available when first
building the .c files.
Alternatively you can use a fine-grained dependency, e.g.
target/arm/translate.o: target/arm/decode-neon-shared.inc.c
With Meson you have only one choice and it is a third option, namely
"build at the beginning of the corresponding target"; the way you
express it is to list the includes in the sources of that target.
The problem is that Meson decides if something is a source vs. a
generated include by looking at the extension: '.c', '.cc', '.m', '.C'
are sources, while everything else is considered an include---including
'.inc.c'.
Use '.c.inc' to avoid this, as it is consistent with our other convention
of using '.rst.inc' for included reStructuredText files. The editorconfig
file is adjusted.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Both x87 and m68k need the low parts of the quotient for their
remainder operations. Arrange for floatx80_modrem to track those bits
and return them via a pointer.
The architectures using float32_rem and float64_rem do not appear to
need this information, so the *_rem interface is left unchanged and
the information returned only from floatx80_modrem. The logic used to
determine the low 7 bits of the quotient for m68k
(target/m68k/fpu_helper.c:make_quotient) appears completely bogus (it
looks at the result of converting the remainder to integer, the
quotient having been discarded by that point); this patch does not
change that, but the m68k maintainers may wish to do so.
Signed-off-by: Joseph Myers <joseph@codesourcery.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <alpine.DEB.2.21.2006081656500.23637@digraph.polyomino.org.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The floatx80 remainder implementation unnecessarily sets the high bit
of bSig explicitly. By that point in the function, arguments that are
invalid, zero, infinity or NaN have already been handled and
subnormals have been through normalizeFloatx80Subnormal, so the high
bit will already be set. Remove the unnecessary code.
Signed-off-by: Joseph Myers <joseph@codesourcery.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <alpine.DEB.2.21.2006081656220.23637@digraph.polyomino.org.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>