
issue/1124: minicpm-sala model #1125

Open
Ceng23333 wants to merge 5 commits into main from minicpm_sala_patches

Conversation


@Ceng23333 Ceng23333 commented Apr 7, 2026

@Ceng23333 Ceng23333 requested a review from a team April 7, 2026 09:08
@Ceng23333 Ceng23333 changed the title squash for rebase issue/1124: minicpm-sala model Apr 7, 2026
@Ceng23333 Ceng23333 force-pushed the minicpm_sala_patches branch from c99c78a to fb8eba4 Compare April 8, 2026 02:55

@wooway777 wooway777 left a comment


Besides addressing the review comments, the merge conflicts also need to be resolved.


These three tests are better not placed in this folder. Everything under infinicore/ops is written against the test framework and can be run uniformly via run.py.

Standalone tests like these are also easy for other people to overlook when placed here.


Same as above.


Same as above.


Is the difference between these and the same-named interfaces in infinicore/adaptors just a few extra parameters?
If they serve a similar purpose, shouldn't they be placed in the adaptor instead?


Do these need to be registered as graph operators?


namespace {

__device__ __forceinline__ float bf16_to_f32(__nv_bfloat16 x) { return __bfloat162float(x); }

Written this way, this will likely break on Hygon and similar platforms.
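A possible mitigation, sketched below under the assumption that the build distinguishes nvcc from other toolchains via `__CUDACC__` (the portable fallback branch is what compiles as plain host C++ here; the macro split itself is an assumption, not the repo's actual guard):

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

#if defined(__CUDACC__)
// NVIDIA path: keep the intrinsic, but only where nvcc is the compiler.
#include <cuda_bf16.h>
__device__ __forceinline__ float bf16_to_f32(__nv_bfloat16 x) {
    return __bfloat162float(x);
}
#else
// Portable fallback for non-NVIDIA toolchains (e.g. Hygon): a bf16 value is
// simply the high 16 bits of an IEEE-754 binary32 value.
inline float bf16_to_f32(std::uint16_t x) {
    std::uint32_t bits = static_cast<std::uint32_t>(x) << 16;
    float f;
    std::memcpy(&f, &bits, sizeof(f));
    return f;
}
#endif
```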


template <>
struct Convert<__half> {
__device__ static float to_f32(__half x) { return f16_to_f32(x); }

It feels like the repo already has quite a few conversion functions, and many kernels define their own... Not sure whether or when these should be consolidated.
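For illustration, such a consolidation could take the shape of one shared conversion-traits header that every kernel includes. The block below is a host-only sketch using plain float/double; a real header would additionally specialize for `__half` and `__nv_bfloat16`:

```cpp
#include <cassert>

// Primary template left undefined: converting an unsupported type
// becomes a compile-time error instead of a silent per-kernel redefinition.
template <typename T>
struct Convert;

template <>
struct Convert<float> {
    static float to_f32(float x) { return x; }
};

template <>
struct Convert<double> {
    static float to_f32(double x) { return static_cast<float>(x); }
};
```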

infiniopTensorDescriptor_t v_desc,
infiniopTensorDescriptor_t g_gamma_desc) {

#define CREATE_CUDA(CASE, NAMESPACE) \

I defined CREATE_CUDA in swiglu to distinguish the CUDA/CUDA-like backends from the generic implementation.
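As a loose illustration of that split (the device enum and string results below are invented for the sketch, not the repo's actual types): one macro expands per-backend cases, so CUDA and CUDA-like devices share a code path while everything else falls through to the generic implementation.

```cpp
#include <cassert>
#include <string>

// Hypothetical device tags for the sketch.
enum Device { DEV_NVIDIA, DEV_CUDA_LIKE, DEV_CPU };

static std::string create_impl(Device dev) {
#define CREATE_CUDA_CASE(CASE, PATH) \
    case CASE:                       \
        return PATH;
    switch (dev) {
        CREATE_CUDA_CASE(DEV_NVIDIA, "cuda")
        CREATE_CUDA_CASE(DEV_CUDA_LIKE, "cuda") // CUDA-like backend reuses the CUDA path
    default:
        return "generic"; // generic fallback implementation
    }
#undef CREATE_CUDA_CASE
}
```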

.gitmodules Outdated
path = third_party/nlohmann_json
url = https://github.com/nlohmann/json.git
branch = master
[submodule "third_party/infllmv2_cuda_impl"]

If this is NVIDIA-only, shouldn't it be fetched manually like fla and cutlass, instead of being added as a submodule that every platform clones whether it uses it or not?

end

-- InfLLM-V2 direct kernels (requires aten; link against infllmv2_cuda_impl .so)
option("infllmv2")

This should be reflected in the README.

@wooway777 wooway777 requested a review from PanZezhong1725 April 9, 2026 07:55
//
// Returns:
// [total_q, nheads, head_dim]
Tensor infllmv2_varlen(const Tensor &q,

The function name should make clear that this is an attention operator; same below.

}
auto cpu_lens = seqlens_k.to(at::kCPU);
int32_t len0 = cpu_lens.numel() > 0 ? cpu_lens.data_ptr<int32_t>()[0] : -1;
f << "[infinicore][infllmv2][" << op_name << "]"

Why not use spdlog?

#include <stdexcept>
#include <vector>

namespace infinicore::op {

Shouldn't this operator implementation be moved into infiniop?

xmake.lua Outdated
add_files("src/infinicore/pybind11/**.cc")

set_installdir("python/infinicore")
after_build(function (target)

This should be the job of the install step.

Signed-off-by: Ceng23333 <441651826@qq.com>
Signed-off-by: Ceng23333 <441651826@qq.com>
Signed-off-by: Ceng23333 <441651826@qq.com>
@Ceng23333 Ceng23333 force-pushed the minicpm_sala_patches branch from 0435097 to b2c5e1b Compare April 10, 2026 12:40
Signed-off-by: Ceng23333 <441651826@qq.com>
Signed-off-by: Ceng23333 <441651826@qq.com>

3 participants