some avx512 to const generics #1042
Amanieu merged 126 commits into rust-lang:master from minybot:avx512
…mm_add,sub,mul,div_round_ss,sd; mm_sqrt_round_ss,sd; mm_scalf_round_ss,sd; mm_fmadd,fmsub,fnmadd,fnmsub_round_ss,sd; mm_cvt_roundss_i32,u32; mm_cvt_roundsd_i32,u32; mm_cvt_roundi32,u32_ss; mm_cvt_roundsd_ss
…s,pd_epi32,epu32; mm_max,min_round_ss,sd; mm_getexp_ss,sd; mm_cvt_roundss_sd; cvt_roundss_si32,i32,u32; mm_cvtt_roundsd_si32,i32,u32
r? @Amanieu (rust-highfive has picked a reviewer for you, use r? to override)
_mm256_srai_epi32 (__m256i a, int imm8)
Solution 1: change stdarch-verify/x86-intel.xml to make u32 to i32
Can you cast the constant to I would prefer not changing the intrinsics XML since it is our source of reference.
For mm_mask_srai_epi32:
                _ => $expand!(15),
            }
        };
    }
My mistake when trying to resolve the merge conflict. I have removed it now.
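For context, the `_ => $expand!(15)` fragment above is the tail of the older match-based pattern that const generics replace. The sketch below illustrates that pattern on a toy 2-bit immediate; `constify_imm2`, `call`, `shift_left_const`, and `shift_left_old` are illustrative names chosen for this example, not the code under review.

```rust
// Older style (sketch): a runtime value is mapped onto a fixed set of
// literal constants by an exhaustive match, and a caller-supplied macro
// is expanded once per arm.
macro_rules! constify_imm2 {
    ($imm:expr, $expand:ident) => {
        match ($imm) & 0b11 {
            0 => $expand!(0),
            1 => $expand!(1),
            2 => $expand!(2),
            _ => $expand!(3),
        }
    };
}

// Const-generic style: the immediate is part of the function's type,
// so no match is needed and the value is known at compile time.
pub fn shift_left_const<const N: u32>(x: u32) -> u32 {
    x << N
}

// Wrapper showing how the older pattern dispatches to a constant.
pub fn shift_left_old(x: u32, imm: i32) -> u32 {
    // `call` is a local helper macro; it captures `x` from this scope.
    macro_rules! call {
        ($n:literal) => {
            shift_left_const::<{ $n }>(x)
        };
    }
    constify_imm2!(imm, call)
}
```

With const generics the caller writes `shift_left_const::<3>(1)` directly, and an out-of-range immediate can be rejected at compile time instead of being folded into the catch-all arm.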
        pub(crate) const VALID: () = {
            let _ = 1 / ((IMM == 4 || IMM == 8 || IMM == 9 || IMM == 10 || IMM == 11) as usize);
        };
    }
Can you reuse the macros from x86/macros.rs here?
Should I use #[macro_export] in x86/macros.rs?
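The `VALID` constant quoted in this thread works by dividing by zero during constant evaluation when the immediate is not an accepted rounding value, which turns a bad argument into a compile error. A minimal, self-contained sketch of the idea follows; `ValidateRounding` and `add_round` are hypothetical names for illustration, and the arithmetic body is a placeholder, not a real intrinsic.

```rust
// Hypothetical helper type: carries the immediate as a const parameter.
pub struct ValidateRounding<const IMM: i32>;

impl<const IMM: i32> ValidateRounding<IMM> {
    // Evaluated at compile time for each IMM that is actually used.
    // An unsupported IMM makes the divisor 0, so evaluation fails and
    // the build is rejected.
    pub const VALID: () = {
        let _ = 1 / ((IMM == 4 || IMM == 8 || IMM == 9 || IMM == 10 || IMM == 11) as usize);
    };
}

// Placeholder function standing in for a rounding intrinsic.
pub fn add_round<const ROUNDING: i32>(a: f32, b: f32) -> f32 {
    // Referencing the constant forces the check for this ROUNDING.
    let _ = ValidateRounding::<ROUNDING>::VALID;
    a + b
}
```

Calling `add_round::<8>(1.0, 2.0)` compiles, while `add_round::<5>(1.0, 2.0)` is expected to fail at compile time because the divisor evaluates to zero.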
By the way, you need to rebase to address the merge conflicts.
The imm8 range for mm512_srai_epi32 is "i32". I tested it with clang and gcc.
According to the intrinsics guide the instruction only reads the low 8 bits of
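To make the low-8-bits remark concrete, here is a hedged scalar model of one 32-bit lane of an arithmetic right shift by immediate. It assumes the behaviour described in the Intel pseudocode: only bits 7:0 of the immediate are used, and a count above 31 fills the lane with the sign bit. `srai_lane` and the `u32` type of `IMM8` are assumptions made for this sketch, not the signature chosen in the PR.

```rust
// Scalar model of one lane (sketch, not the real intrinsic).
pub fn srai_lane<const IMM8: u32>(a: i32) -> i32 {
    // Only the low 8 bits of the immediate are considered.
    let count = IMM8 & 0xff;
    if count > 31 {
        // Counts above 31 leave only copies of the sign bit.
        a >> 31
    } else {
        // `>>` on i32 is an arithmetic (sign-extending) shift in Rust.
        a >> count
    }
}
```

Under this model, an immediate such as `0x101` behaves like `1`, which is why the choice between `i32` and `u32` for the parameter type does not change the result as long as the low byte is the same.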
crates/core_arch/src/x86/macros.rs
    }

    #[allow(unused)]
    macro_rules! static_assert_imm8u {
#1047 just got merged which includes a static_assert_imm_u8 macro.
The LLVM12 upgrade in rustc may be causing issues
cvt_roundps_epi32,epu32
cvt_roundepi32,u32_ps
cvt_roundpd_ps,epi32,epu32
mm_scalef_round_ss,sd
mm_fmadd_round_ss,sd
mm_fmsub_round_ss,sd
mm_fnmadd_round_ss,sd
mm_fnmsub_round_ss,sd
mm_maskz_cvt_roundsd_ss
mm_cvt_roundss_si32,i32,u32; mm_cvt_roundsd_si32,i32,u32
mm_cvt_roundi32_ss
mm_cvt_roundu32_ss
mm_cvt_roundsd_ss
mm_add,sub,mul,div_round_ss,sd
mm_sqrt_round_ss,sd
cvt_roundps_pd,ph
cvt_roundph_ps
cvtps_ph
cvtt_roundps,pd_epi32,epu32
mm_max,min,getexp_round_ss,sd
mm_cvt_roundss_sd
cvt_roundss_si32,i32,u32
mm_cvtt_roundsd_si32
shuffle_epi32
shuffle_i32x4,f32x4,i64x2,f64x2
mm_cvtt_roundss,sd_u64,i64,si64
mm_cvt_roundss,sd_u64,i64,si64
mm_cvt_roundu64,i64,si64_ss,sd
shldi_epi64,epi32,epi16
shrdi_epi64,epi32,epi16
ror_epi32,epi64; rol_epi32,epi64;
srai_epi32