We only need to track slot for predicated stores and predicated HVX
instructions.
Add arguments to the probe helper functions to indicate if the slot
is predicated.
Here is a simple example of the differences in the TCG code generated:
IN:
0x00400094: 0xf900c102 { if (P0) R2 = and(R0,R1) }
BEFORE
---- 00400094
mov_i32 slot_cancelled,$0x0
mov_i32 new_r2,r2
and_i32 tmp0,p0,$0x1
brcond_i32 tmp0,$0x0,eq,$L1
and_i32 tmp0,r0,r1
mov_i32 new_r2,tmp0
br $L2
set_label $L1
or_i32 slot_cancelled,slot_cancelled,$0x8
set_label $L2
mov_i32 r2,new_r2
AFTER
---- 00400094
mov_i32 new_r2,r2
and_i32 tmp0,p0,$0x1
brcond_i32 tmp0,$0x0,eq,$L1
and_i32 tmp0,r0,r1
mov_i32 new_r2,tmp0
br $L2
set_label $L1
set_label $L2
mov_i32 r2,new_r2
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Anton Johansson <anjo@rev.ng>
Message-Id: <20230307025828.1612809-14-tsimpson@quicinc.com>
We assign the instruction destination register to hex_new_value[num]
instead of a TCG temp that gets copied back to hex_new_value[num].
We introduce new functions get_result_gpr[_pair] to facilitate getting
the proper destination register.
Since we preload hex_new_value for predicated instructions, we don't
need the check for slot_cancelled. So, we call gen_log_reg_write instead.
We update the helper function generation and gen_tcg.h to maintain the
disable-hexagon-idef-parser configuration.
Here is a simple example of the differences in the TCG code generated:
IN:
0x00400094: 0xf900c102 { if (P0) R2 = and(R0,R1) }
BEFORE
---- 00400094
mov_i32 slot_cancelled,$0x0
mov_i32 new_r2,r2
mov_i32 loc2,$0x0
and_i32 tmp0,p0,$0x1
brcond_i32 tmp0,$0x0,eq,$L1
and_i32 tmp0,r0,r1
mov_i32 loc2,tmp0
br $L2
set_label $L1
or_i32 slot_cancelled,slot_cancelled,$0x8
set_label $L2
and_i32 tmp0,slot_cancelled,$0x8
movcond_i32 new_r2,tmp0,$0x0,loc2,new_r2,eq
mov_i32 r2,new_r2
AFTER
---- 00400094
mov_i32 slot_cancelled,$0x0
mov_i32 new_r2,r2
and_i32 tmp0,p0,$0x1
brcond_i32 tmp0,$0x0,eq,$L1
and_i32 tmp0,r0,r1
mov_i32 new_r2,tmp0
br $L2
set_label $L1
or_i32 slot_cancelled,slot_cancelled,$0x8
set_label $L2
mov_i32 r2,new_r2
We'll remove the unnecessary manipulation of slot_cancelled in a
subsequent patch.
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Anton Johansson <anjo@rev.ng>
Message-Id: <20230307025828.1612809-13-tsimpson@quicinc.com>
The F2_sffms instruction [r0 -= sfmpy(r1, r2)] doesn't properly
handle -0. Previously we would negate the input operand by subtracting
from zero. Instead, we negate by changing the sign bit.
Test case added to tests/tcg/hexagon/fpstuff.c
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Anton Johansson <anjo@rev.ng>
Message-Id: <20230307025828.1612809-12-tsimpson@quicinc.com>
Extend the analyze_<tag> functions for HVX vector and predicate writes
Remove calls to ctx_log_vreg_write[_pair] from gen_tcg_funcs.py
During gen_start_packet, reload the predicated HVX registers into
fugure_VRegs and tmp_VRegs
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Anton Johansson <anjo@rev.ng>
Message-Id: <20230307025828.1612809-8-tsimpson@quicinc.com>
The pkt_has_store_s1 field in CPUHexagonState is only needed in generated
helpers for scalar load instructions. See check_noshuf and mem_load[1248]
in op_helper.c.
We add logic in gen_analyze_funcs.py to set need_pkt_has_store_s1 in
DisasContext when it is needed at runtime.
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Anton Johansson <anjo@rev.ng>
Message-Id: <20230307025828.1612809-7-tsimpson@quicinc.com>
We create a new generator that creates an analyze_<tag> function for
each instruction. Currently, these functions record the writes to
R, P, and C registers by calling ctx_log_reg_write[_pair] or
ctx_log_pred_write.
During gen_start_packet, we invoke the analyze_<tag> function for
each instruction in the packet, and we mark the implicit register
and predicate writes.
Doing the analysis up front has several advantages
- We remove calls to ctx_log_* from gen_tcg_funcs.py and genptr.c
- After the analysis is performed, we can initialize hex_new_value
for each of the predicated assignments rather than during TCG
generation for the instructions
- This is a stepping stone for future work where the analysis will
include the set of registers that are read. In cases where
the packet doesn't have an overlap between the registers that are
written and registers that are read, we can avoid the intermediate
step of writing to hex_new_value. Note that other checks will also
be needed (e.g., no instructions can raise an exception).
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Anton Johansson <anjo@rev.ng>
Message-Id: <20230307025828.1612809-6-tsimpson@quicinc.com>
These instructions perform a deallocframe+return (jumpr r31)
Add overrides for
L4_return
SL2_return
L4_return_t
L4_return_f
L4_return_tnew_pt
L4_return_fnew_pt
L4_return_tnew_pnt
L4_return_fnew_pnt
SL2_return_t
SL2_return_f
SL2_return_tnew
SL2_return_fnew
This patch eliminates the last helper that uses write_new_pc, so we
remove it from op_helper.c
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Anton Johansson <anjo@rev.ng>
Message-Id: <20230307025828.1612809-5-tsimpson@quicinc.com>
The --disable-hexagon-idef-parser configuration was broken by this patch
2feacf60c23ba6 (target/hexagon: Drop tcg_temp_free from C code)
That config is not tested by CI
Fix is simple: Mark a few TCGv variables as unused
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Anton Johansson <anjo@rev.ng>
Message-Id: <20230306172515.346813-1-tsimpson@quicinc.com>
Currently, the max satp mode is set with the only constraint that it must be
implemented in QEMU, i.e. set in valid_vm_1_10_[32|64].
But we actually need to add another level of constraint: what the hw is
actually capable of, because currently, a linux booting on a sifive-u54
boots in sv57 mode which is incompatible with the cpu's sv39 max
capability.
So add a new bitmap to RISCVSATPMap which contains this capability and
initialize it in every XXX_cpu_init.
Finally:
- valid_vm_1_10_[32|64] constrains which satp mode the CPU can use
- the CPU hw capabilities constrains what the user may select
- the user's selection then constrains what's available to the guest
OS.
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Bin Meng <bmeng@tinylab.org>
Reviewed-by: Frank Chang <frank.chang@sifive.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20230303131252.892893-5-alexghiti@rivosinc.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
RISC-V specifies multiple sizes for addressable memory and Linux probes for
the machine's support at startup via the satp CSR register (done in
csr.c:validate_vm).
As per the specification, sv64 must support sv57, which in turn must
support sv48...etc. So we can restrict machine support by simply setting the
"highest" supported mode and the bare mode is always supported.
You can set the satp mode using the new properties "sv32", "sv39", "sv48",
"sv57" and "sv64" as follows:
-cpu rv64,sv57=on # Linux will boot using sv57 scheme
-cpu rv64,sv39=on # Linux will boot using sv39 scheme
-cpu rv64,sv57=off # Linux will boot using sv48 scheme
-cpu rv64 # Linux will boot using sv57 scheme by default
We take the highest level set by the user:
-cpu rv64,sv48=on,sv57=on # Linux will boot using sv57 scheme
We make sure that invalid configurations are rejected:
-cpu rv64,sv39=off,sv48=on # sv39 must be supported if higher modes are
# enabled
We accept "redundant" configurations:
-cpu rv64,sv48=on,sv57=off # Linux will boot using sv48 scheme
And contradictory configurations:
-cpu rv64,sv48=on,sv48=off # Linux will boot using sv39 scheme
Co-Developed-by: Ludovic Henry <ludovic@rivosinc.com>
Signed-off-by: Ludovic Henry <ludovic@rivosinc.com>
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Bin Meng <bmeng@tinylab.org>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Frank Chang <frank.chang@sifive.com>
Message-ID: <20230303131252.892893-4-alexghiti@rivosinc.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Integrate neighboring code from get_phys_addr_lpae which computed
starting level, as it is easier to validate when doing both at the
same time. Mirror the checks at the start of AArch{64,32}.S2Walk,
especially S2InvalidSL and S2InconsistentSL.
This reverts 49ba115bb7, which was incorrect -- there is nothing
in the ARM pseudocode that depends on TxSZ, i.e. outputsize; the
pseudocode is consistent in referencing PAMax.
Fixes: 49ba115bb7 ("target/arm: Pass outputsize down to check_s2_mmu_setup")
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230227225832.816605-5-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The upstream gdb xml only implements {MSP,PSP}{,_NS,S}, but
go ahead and implement the other system registers as well.
Since there is significant overlap between the two, implement
them with common code. The only exception is the systemreg
view of CONTROL, which merges the banked bits as per MRS.
Signed-off-by: David Reiss <dreiss@meta.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230227213329.793795-15-richard.henderson@linaro.org
[rth: Substatial rewrite using enumerator and shared code.]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Replace ifdefs with C, tcg_const_i32 with tcg_constant_i32.
We only need a single temporary for this.
Reviewed-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Use tcg_constant_i64. Adjust in2_mri2_* to allocate a new
temporary for the output, using gen_ri2 for the address.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Compute the eflags write mask separately, leaving one call
to the helper. Use tcg_constant_i32.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We already have a temporary, res, which we can use for the intermediate
shift result. Simplify the constant to -1 instead of 0xf*f.
This was the last use of gen_tmp_value, so remove it.
Reviewed-by: Taylor Simpson <tsimpson@quicinc.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The allocation is immediately followed by either tcg_gen_mov_i32
or gen_read_preg (which contains tcg_gen_mov_i32), so the zero
initialization is immediately discarded.
Reviewed-by: Taylor Simpson <tsimpson@quicinc.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The allocation is immediately followed by tcg_gen_mov_i32,
so the initial assignment of zero is discarded.
Reviewed-by: Taylor Simpson <tsimpson@quicinc.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>