My GPU is a V100, which does not support flash_attn 2.7. I checked the VLM's config.json and found a use_flash_attn flag; I manually set it to false, but model inference still forces the flash-attn code path. Here is my test code, copied straight from the HF page:
import torch
from internvlu import InternVLUPipeline

prompt = """Cluster of wildflowers, purple lupines and orange California poppies with detailed petal texture, yellow daisy centers, white Queen Anne’s lace intricate patterns, bright midday sun, morning dew"""

pipeline = InternVLUPipeline.from_pretrained(
    "InternVL-U/InternVL-U",
    torch_dtype=torch.bfloat16,
)
pipeline.to("cuda")

with torch.no_grad():
    image = pipeline(
        prompt=prompt,
        generation_mode="image",
        height=576,
        width=1024,
        generator=torch.Generator(device="cuda").manual_seed(42),
    ).images[0]

image.save("internvl_uexample_t2i.png")
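For reference, a minimal sketch of one way to patch the flag everywhere in the cached config.json, in case the top-level use_flash_attn alone is ignored because nested sub-configs (e.g. vision/LLM configs) carry their own copy. The nested layout and the extra flag name attn_implementation are assumptions, not confirmed for this repo:

```python
import json
from pathlib import Path

def disable_flash_attn(config_path):
    """Recursively flip flash-attn related flags in a config.json to off.

    Hypothetical helper: only use_flash_attn appears in the issue above;
    attn_implementation is a guess based on common HF config conventions.
    """
    cfg = json.loads(Path(config_path).read_text())

    def patch(d):
        for key, value in d.items():
            if isinstance(value, dict):
                patch(value)  # descend into nested sub-configs
            elif key == "use_flash_attn":
                d[key] = False
            elif key == "attn_implementation":
                d[key] = "eager"  # fall back to the plain attention path

    patch(cfg)
    Path(config_path).write_text(json.dumps(cfg, indent=2))
    return cfg
```

Run this against the config.json inside the local HF cache snapshot before calling from_pretrained; if inference still takes the flash-attn path afterwards, the flag is likely being overridden in the modeling code itself.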