Fix shape mismatch on the masked_tokens param in decoder masked multi-head attention kernel. by FengDSP · Pull Request #773 · NVIDIA/FasterTransformer

FengDSP · 2023-10-24T17:05:58Z

This PR fixes an inconsistency in the shape of the masked_tokens array used by the decoder's masked multi-head attention kernel. The expected shape of masked_tokens is [batch_size, session_length], but the current implementation in the repo shapes it as [batch_size, memory_length]. The mismatch leads to unexpected behavior whenever memory_length is not configured to be the same as session_length.
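To illustrate why the row length matters, here is a minimal host-side sketch, not the actual kernel code. It assumes masked_tokens is stored as a flat row-major buffer; the helper names `flat_offset`, `make_mask`, and `is_masked` are hypothetical and exist only for this example. Reading a [batch_size, session_length] buffer with memory_length as the row stride lands on a different element for every batch row after the first.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not FasterTransformer code): flat offset of element
// (batch_idx, step) in a row-major buffer whose rows hold `row_length` entries.
inline std::size_t flat_offset(std::size_t batch_idx, std::size_t step, std::size_t row_length)
{
    assert(step < row_length);
    return batch_idx * row_length + step;
}

// Builds a mask of shape [batch_size, session_length] with exactly one
// entry set, at (masked_batch, masked_step).
inline std::vector<unsigned char> make_mask(std::size_t batch_size,
                                            std::size_t session_length,
                                            std::size_t masked_batch,
                                            std::size_t masked_step)
{
    std::vector<unsigned char> mask(batch_size * session_length, 0);
    mask[flat_offset(masked_batch, masked_step, session_length)] = 1;
    return mask;
}

// Reads the mask at (batch_idx, step), assuming each row holds `row_length`
// entries. Passing the wrong row length silently reads the wrong element.
inline bool is_masked(const std::vector<unsigned char>& mask,
                      std::size_t batch_idx,
                      std::size_t step,
                      std::size_t row_length)
{
    return mask[flat_offset(batch_idx, step, row_length)] != 0;
}
```

With batch_size = 2, session_length = 8, and memory_length = 4, a mask set at (1, 2) sits at flat offset 10. Indexing with session_length finds it; indexing with memory_length reads offset 6, which belongs to batch row 0, so the masked token is missed.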

Feng Li added 2 commits October 24, 2023 16:37

masked_tokens uses session_length

cd59efe

masked_tokens uses session length everywhere

72319c6

FengDSP wants to merge 2 commits into NVIDIA:main from FengDSP:fix_circular_cache
