
adding SIMD/TensorPrimitives to dotnet-diag/analyzing-dotnet-performance #330

Merged
danmoseley merged 12 commits into dotnet:main from jeffschwMSFT:jeffschw/add-simd-vectorization-reference on Mar 25, 2026

Conversation

@jeffschwMSFT
Member

Incorporated feedback, trained on 100+ scalar loop examples, moved to analyzing-dotnet-performance reference

Contributor

Copilot AI left a comment


Pull request overview

Adds SIMD/TensorPrimitives guidance and evaluation coverage to the dotnet-diag/analyzing-dotnet-performance skill, expanding it to recognize when scalar loops are good candidates for vectorization (or when SIMD is not applicable).

Changes:

  • Adds new SIMD-focused fixtures and corresponding eval scenarios (TensorPrimitives reductions, SIMD-friendly loops, and a “no SIMD opportunity” case).
  • Introduces a new simd-vectorization.md reference with decision gating (TensorPrimitives-first vs manual intrinsics).
  • Updates the skill description and detection signals to include SIMD/vectorization.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| tests/dotnet-diag/analyzing-dotnet-performance/fixtures/simd-tensor-primitives-product.cs | New fixture for product reduction intended for TensorPrimitives optimization. |
| tests/dotnet-diag/analyzing-dotnet-performance/fixtures/simd-tensor-primitives-minmax.cs | New fixture for min/max reduction intended for TensorPrimitives optimization. |
| tests/dotnet-diag/analyzing-dotnet-performance/fixtures/simd-no-opportunity-catalog.cs | New fixture representing a case where SIMD is not a meaningful optimization. |
| tests/dotnet-diag/analyzing-dotnet-performance/fixtures/simd-conditional-increment.cs | New SIMD-friendly loop fixture (conditional increment). |
| tests/dotnet-diag/analyzing-dotnet-performance/fixtures/simd-bit-reverser.cs | New SIMD-friendly byte processing fixture (bit reversal). |
| tests/dotnet-diag/analyzing-dotnet-performance/eval.yaml | Adds five new scenarios covering TensorPrimitives, SIMD intrinsics, and “no opportunity” detection. |
| plugins/dotnet-diag/skills/analyzing-dotnet-performance/references/simd-vectorization.md | New reference doc defining the SIMD/TensorPrimitives decision gate and implementation patterns. |
| plugins/dotnet-diag/skills/analyzing-dotnet-performance/SKILL.md | Updates description and signal detection to include SIMD vectorization and reference loading. |


Comment thread tests/dotnet-diag/analyzing-dotnet-performance/eval.yaml Outdated
Comment thread tests/dotnet-diag/analyzing-dotnet-performance/eval.yaml Outdated
Copilot AI review requested due to automatic review settings March 11, 2026 17:41
Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.



Comment thread plugins/dotnet-diag/skills/analyzing-dotnet-performance/SKILL.md Outdated
Copilot AI review requested due to automatic review settings March 11, 2026 23:52
Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 1 comment.



@tannergooding
Member

Few nits/feedback, but this looks overall good!

Copilot AI review requested due to automatic review settings March 16, 2026 23:03
Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 7 comments.



Comment thread tests/dotnet-diag/analyzing-dotnet-performance/eval.yaml Outdated
Comment thread tests/dotnet-diag/analyzing-dotnet-performance/eval.yaml Outdated
Comment thread tests/dotnet-diag/analyzing-dotnet-performance/eval.yaml Outdated
Comment thread plugins/dotnet-diag/skills/analyzing-dotnet-performance/SKILL.md Outdated
Comment thread tests/dotnet-diag/analyzing-dotnet-performance/eval.yaml Outdated
Comment thread tests/dotnet-diag/analyzing-dotnet-performance/eval.yaml Outdated
@jeffschwMSFT
Member Author

/evaluate

Member

@tannergooding left a comment


Couple more nits, but LGTM

@jeffschwMSFT
Member Author

We are adding a new experimental branch to this repo, and this may be one of the first skills to give it a try. @artl93 and I discussed it, and although this seems useful, we are not sure how many people it will help, though we are willing to discuss. (FWIW, I am on the list of people who have found it helpful for numeric libraries.)

@github-actions
Contributor

Skill Validation Results

Skill Scenario Quality (Isolated) Quality (Plugin) Skills Loaded Overfit Verdict
directory-build-organization Organize build infrastructure for a multi-project repo 3.0/5 → 5.0/5 🟢 3.0/5 → 4.7/5 🟢 ✅ directory-build-organization; tools: skill, create, edit, bash, task / ✅ msbuild-antipatterns; directory-build-organization; tools: task, bash, skill ✅ 0.15
dotnet-trace-collect High CPU in Kubernetes on Linux (.NET 8) 3.7/5 → 4.7/5 🟢 3.7/5 → 4.3/5 🟢 ✅ dotnet-trace-collect; tools: report_intent, skill, view, glob, bash / ✅ dotnet-trace-collect; tools: skill, report_intent, view ✅ 0.16
dotnet-trace-collect .NET Framework on Windows without admin privileges 2.0/5 → 5.0/5 🟢 2.0/5 → 5.0/5 🟢 ✅ dotnet-trace-collect; tools: skill / ✅ dotnet-trace-collect; tools: skill ✅ 0.16
dotnet-trace-collect .NET 10 on Linux with root access and native call stacks 1.7/5 → 4.0/5 🟢 1.7/5 → 4.0/5 🟢 ✅ dotnet-trace-collect; tools: skill / ✅ dotnet-trace-collect; tools: skill ✅ 0.16
dotnet-trace-collect Memory leak on Linux (.NET 8) 2.3/5 → 3.0/5 🟢 2.3/5 → 3.0/5 🟢 ✅ dotnet-trace-collect; tools: skill, report_intent, view, bash / ✅ dotnet-trace-collect; tools: skill, report_intent, view ✅ 0.16
dotnet-trace-collect Slow requests on Windows with PerfView 3.7/5 → 5.0/5 🟢 3.7/5 → 5.0/5 🟢 ✅ dotnet-trace-collect; tools: skill, report_intent, view, glob, bash / ✅ dotnet-trace-collect; tools: skill, report_intent, view ✅ 0.16
dotnet-trace-collect Excessive GC on Linux (.NET 8) 3.3/5 → 5.0/5 🟢 3.3/5 → 4.7/5 🟢 ✅ dotnet-trace-collect; tools: skill, glob / ✅ dotnet-trace-collect; tools: skill ✅ 0.16
dotnet-trace-collect Hang or deadlock diagnosis on Linux 2.7/5 → 3.7/5 🟢 2.7/5 → 3.0/5 🟢 ✅ dotnet-trace-collect; tools: skill / ✅ dotnet-trace-collect; dump-collect; tools: skill, report_intent, view ✅ 0.16 [1]
dotnet-trace-collect Windows container high CPU with PerfView 1.7/5 → 4.3/5 🟢 1.7/5 → 5.0/5 🟢 ✅ dotnet-trace-collect; tools: skill, glob / ✅ dotnet-trace-collect; tools: skill ✅ 0.16
dotnet-trace-collect Long-running intermittent issue with PerfView triggers 2.3/5 → 5.0/5 🟢 2.3/5 → 4.3/5 🟢 ✅ dotnet-trace-collect; tools: skill, report_intent, view, bash, glob / ✅ dotnet-trace-collect; tools: skill, report_intent, view ✅ 0.16
dotnet-trace-collect Linux pre-.NET 10 needing native call stacks 2.7/5 → 4.7/5 🟢 2.7/5 → 4.3/5 🟢 ✅ dotnet-trace-collect; tools: skill, report_intent, view, bash / ✅ dotnet-trace-collect; tools: skill, report_intent, view ✅ 0.16
dotnet-trace-collect Windows modern .NET with admin high CPU 2.0/5 → 4.7/5 🟢 2.0/5 → 5.0/5 🟢 ✅ dotnet-trace-collect; tools: skill, report_intent, view, bash, glob / ✅ dotnet-trace-collect; tools: skill, report_intent, view ✅ 0.16
dotnet-trace-collect Memory leak on .NET Framework Windows 3.3/5 → 5.0/5 🟢 3.3/5 → 5.0/5 🟢 ✅ dotnet-trace-collect; tools: report_intent, skill, view, glob, bash / ✅ dotnet-trace-collect; tools: report_intent, skill, view ✅ 0.16
dotnet-trace-collect Kubernetes with console access prefers console tools 4.3/5 → 4.7/5 🟢 4.3/5 → 5.0/5 🟢 ✅ dotnet-trace-collect; tools: skill, report_intent, view, bash / ✅ dotnet-trace-collect; tools: skill, report_intent, view ✅ 0.16 [2]
dotnet-trace-collect Container installation without .NET SDK 3.0/5 → 3.3/5 🟢 3.0/5 → 4.7/5 🟢 ✅ dotnet-trace-collect; tools: skill / ✅ dotnet-trace-collect; tools: skill ✅ 0.16 [3]
dotnet-trace-collect HTTP 500s from downstream service on Linux (.NET 8) 4.3/5 → 5.0/5 🟢 4.3/5 → 5.0/5 🟢 ✅ dotnet-trace-collect; tools: skill, report_intent, view, bash, glob / ✅ dotnet-trace-collect; tools: report_intent, skill, view ✅ 0.16
dotnet-trace-collect Networking timeouts on Windows with admin (.NET 8) 2.0/5 → 5.0/5 🟢 2.0/5 → 4.7/5 🟢 ✅ dotnet-trace-collect; tools: report_intent, skill, view, bash / ✅ dotnet-trace-collect; tools: skill, report_intent, view ✅ 0.16
analyzing-dotnet-performance Detects compiled regex startup budget and regex chain allocations 1.0/5 → 1.0/5 1.0/5 → 1.0/5 ⚠️ NOT ACTIVATED / ✅ analyzing-dotnet-performance; tools: skill ✅ 0.14 [4]
analyzing-dotnet-performance Detects CurrentCulture comparer and compiled regex budget in inflection rules 1.0/5 → 1.0/5 1.0/5 → 1.0/5 ⚠️ NOT ACTIVATED / ⚠️ NOT ACTIVATED ✅ 0.14 [5]
analyzing-dotnet-performance Finds per-call Dictionary allocation not hoisted to static 1.0/5 → 1.0/5 1.0/5 → 1.0/5 ⚠️ NOT ACTIVATED / ⚠️ NOT ACTIVATED ✅ 0.14 [6]
analyzing-dotnet-performance Catches compound allocations in recursive number converter with ToLower 1.0/5 → 1.3/5 🟢 1.0/5 → 1.0/5 ⚠️ NOT ACTIVATED / ⚠️ NOT ACTIVATED ✅ 0.14 [7]
analyzing-dotnet-performance Finds StringComparison.Ordinal missing and FrozenDictionary opportunities 1.0/5 → 1.0/5 1.0/5 → 1.0/5 ⚠️ NOT ACTIVATED / ⚠️ NOT ACTIVATED ✅ 0.14 [8]
analyzing-dotnet-performance Detects Aggregate+Replace chain and struct missing IEquatable 1.0/5 → 1.0/5 1.0/5 → 1.0/5 ⚠️ NOT ACTIVATED / ⚠️ NOT ACTIVATED ✅ 0.14 [9]
analyzing-dotnet-performance Finds branched Replace chain in format string manipulation 1.0/5 → 1.0/5 1.0/5 → 1.0/5 ⚠️ NOT ACTIVATED / ⚠️ NOT ACTIVATED ✅ 0.14 [10]
analyzing-dotnet-performance Catches LINQ on hot-path string processing and All(char.IsUpper) 1.0/5 → 1.0/5 1.0/5 → 1.0/5 ✅ analyzing-dotnet-performance; tools: glob, skill / ⚠️ NOT ACTIVATED ✅ 0.14 [11]
analyzing-dotnet-performance Detects LINQ pipeline in TimeSpan formatting and collection processing 1.0/5 → 1.0/5 1.0/5 → 1.0/5 ⚠️ NOT ACTIVATED / ⚠️ NOT ACTIVATED ✅ 0.14 [12]
analyzing-dotnet-performance Flags Span inconsistencies and compound method chains in truncation library 1.3/5 → 1.0/5 🔴 1.3/5 → 1.0/5 🔴 ⚠️ NOT ACTIVATED / ⚠️ NOT ACTIVATED ✅ 0.14
analyzing-dotnet-performance Identifies unsealed leaf classes and locale hierarchy patterns 1.0/5 → 1.0/5 1.0/5 → 1.3/5 🟢 ⚠️ NOT ACTIVATED / ⚠️ NOT ACTIVATED ✅ 0.14 [13]
analyzing-dotnet-performance Optimize manual min/max with TensorPrimitives 1.0/5 ⏰ → 1.0/5 1.0/5 ⏰ → 1.0/5 ⏰ ⚠️ NOT ACTIVATED / ⚠️ NOT ACTIVATED ✅ 0.14 [14]
analyzing-dotnet-performance Optimize manual product with TensorPrimitives 1.0/5 ⏰ → 1.0/5 ⏰ 1.0/5 ⏰ → 2.0/5 ⏰ 🟢 ✅ analyzing-dotnet-performance; tools: skill / ✅ analyzing-dotnet-performance; tools: skill, edit ✅ 0.14
analyzing-dotnet-performance No optimization opportunity — dictionary-based lookup service 1.0/5 → 1.0/5 ⏰ 1.0/5 → 1.0/5 ⏰ ✅ analyzing-dotnet-performance; tools: edit, create, skill / ✅ analyzing-dotnet-performance; tools: create, skill ✅ 0.14 [15]
analyzing-dotnet-performance Optimize int array conditional increment with SIMD 4.7/5 ⏰ → 3.3/5 ⏰ 🔴 4.7/5 ⏰ → 3.7/5 ⏰ 🔴 ⚠️ NOT ACTIVATED / ⚠️ NOT ACTIVATED ✅ 0.14
analyzing-dotnet-performance Optimize byte buffer bit reversal with SIMD 3.7/5 ⏰ → 1.0/5 ⏰ 🔴 3.7/5 ⏰ → 1.0/5 ⏰ 🔴 ⚠️ NOT ACTIVATED / ⚠️ NOT ACTIVATED ✅ 0.14

[1] (Plugin) Quality improved but weighted score is -8.8% due to: judgment, quality
[2] (Isolated) Quality improved but weighted score is -10.0% due to: tokens (11654 → 105134), tool calls (0 → 6), time (8.8s → 39.4s)
[3] (Isolated) Quality improved but weighted score is -46.7% due to: judgment, quality
[4] (Isolated) Quality unchanged but weighted score is -13.2% due to: judgment, tokens (35194 → 40879)
[5] (Plugin) Quality unchanged but weighted score is -1.3% due to: tokens (34971 → 43307)
[6] (Plugin) Quality unchanged but weighted score is -0.8% due to: tokens (34933 → 38931)
[7] (Plugin) Quality unchanged but weighted score is -5.7% due to: tokens (23092 → 39001), tool calls (2 → 3), time (12.2s → 17.1s)
[8] (Isolated) Quality unchanged but weighted score is -0.8% due to: tokens (34917 → 40485)
[9] (Plugin) Quality unchanged but weighted score is -0.1% due to: tokens (34957 → 38958)
[10] (Isolated) Quality unchanged but weighted score is -3.3% due to: tokens (38898 → 52988), tool calls (3 → 4), time (13.4s → 16.7s)
[11] (Isolated) Quality unchanged but weighted score is -7.7% due to: tokens (34909 → 60848), tool calls (3 → 6), time (11.9s → 19.0s)
[12] (Isolated) Quality unchanged but weighted score is -1.4% due to: tokens (42920 → 48986), time (17.5s → 22.0s)
[13] (Isolated) Quality unchanged but weighted score is -0.3% due to: efficiency metrics
[14] (Plugin) Quality unchanged but weighted score is -14.1% due to: errors (0 → 1), tokens (109364 → 204707), time (71.7s → 180.1s), tool calls (9 → 17)
[15] (Isolated) Quality unchanged but weighted score is -15.0% due to: tokens (47249 → 196950), errors (0 → 1), tool calls (4 → 16), time (20.2s → 127.0s)

⏰ timeout: run hit the scenario timeout limit; scoring may be affected because model execution was aborted before it could produce its full output

Model: claude-opus-4.6 | Judge: claude-opus-4.6

Full results

Copilot AI review requested due to automatic review settings March 17, 2026 16:14
Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 8 comments.



Comment thread tests/dotnet-diag/analyzing-dotnet-performance/eval.yaml Outdated
Comment thread tests/dotnet-diag/analyzing-dotnet-performance/eval.yaml Outdated
Comment thread tests/dotnet-diag/analyzing-dotnet-performance/eval.yaml Outdated
Comment thread tests/dotnet-diag/analyzing-dotnet-performance/eval.yaml Outdated
Comment thread tests/dotnet-diag/analyzing-dotnet-performance/eval.yaml Outdated
Comment thread plugins/dotnet-diag/skills/analyzing-dotnet-performance/SKILL.md Outdated
jeffschwMSFT and others added 9 commits March 25, 2026 09:17
…p tests to come to this skill, moved from skill to reference in the performance skill
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…nces/simd-vectorization.md

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…nces/simd-vectorization.md

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…nces/simd-vectorization.md

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…nces/simd-vectorization.md

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
- Check Span<T>/MemoryExtensions before TensorPrimitives (no extra dependency)
- Add all supported types (sbyte, ushort, uint, ulong, nint, nuint, char via ushort)
- TensorPrimitives: add constraint and applicable types columns (not just float/double)
- Add FusedMultiplyAdd, clarify AddMultiply vs MultiplyAdd distinction
- Prefer portable APIs over platform-specific intrinsics (allow when perf justifies)
- Fix dispatch pattern to use if/else if to avoid pessimizing small inputs
- Remove LINQ references

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@jeffschwMSFT jeffschwMSFT force-pushed the jeffschw/add-simd-vectorization-reference branch from b60daa2 to 093d78f Compare March 25, 2026 16:17
Copilot AI review requested due to automatic review settings March 25, 2026 16:25
Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 5 comments.



Comment thread tests/dotnet-diag/analyzing-dotnet-performance/eval.yaml Outdated
Comment thread tests/dotnet-diag/analyzing-dotnet-performance/eval.yaml Outdated
Comment thread tests/dotnet-diag/analyzing-dotnet-performance/eval.yaml Outdated
Comment thread tests/dotnet-diag/analyzing-dotnet-performance/eval.yaml Outdated
Comment thread tests/dotnet-diag/analyzing-dotnet-performance/eval.yaml Outdated
@jeffschwMSFT
Member Author

/evaluate

@github-actions
Contributor

Skill Validation Results

Skill Scenario Quality (Isolated) Quality (Plugin) Skills Loaded Overfit Verdict
exp-simd-vectorization Optimize manual min/max with TensorPrimitives 1.0/5 ⏰ → 5.0/5 🟢 1.0/5 ⏰ → 5.0/5 🟢 ✅ exp-simd-vectorization; tools: skill, glob, create, bash, edit / ✅ exp-simd-vectorization; tools: skill, glob, create, edit, bash 🟡 0.23
exp-simd-vectorization Optimize manual product with TensorPrimitives 1.0/5 → 5.0/5 🟢 1.0/5 → 5.0/5 🟢 ✅ exp-simd-vectorization; tools: skill, glob, create, bash / ✅ exp-simd-vectorization; tools: skill, glob, create, bash 🟡 0.23
exp-simd-vectorization No optimization opportunity — dictionary-based lookup service 4.3/5 → 4.7/5 🟢 4.3/5 → 5.0/5 🟢 ✅ exp-simd-vectorization; tools: skill / ✅ exp-simd-vectorization; tools: skill 🟡 0.23
exp-simd-vectorization Optimize int array conditional increment with SIMD 4.0/5 → 3.7/5 🔴 4.0/5 → 4.0/5 ✅ exp-simd-vectorization; tools: skill / ✅ exp-simd-vectorization; tools: skill, glob 🟡 0.23 [1]
exp-simd-vectorization Optimize byte buffer bit reversal with SIMD 2.0/5 ⏰ → 4.7/5 🟢 2.0/5 ⏰ → 4.0/5 🟢 ✅ exp-simd-vectorization; tools: skill, edit, glob, bash / ✅ exp-simd-vectorization; tools: skill, edit, bash 🟡 0.23
analyzing-dotnet-performance Detects compiled regex startup budget and regex chain allocations 2.7/5 ⏰ → 3.7/5 ⏰ 🟢 2.7/5 ⏰ → 3.0/5 ⏰ 🟢 ✅ analyzing-dotnet-performance; tools: skill, bash / ✅ analyzing-dotnet-performance; tools: skill, bash ✅ 0.16 [2]
analyzing-dotnet-performance Detects CurrentCulture comparer and compiled regex budget in inflection rules 5.0/5 → 4.0/5 ⏰ 🔴 5.0/5 → 5.0/5 ✅ analyzing-dotnet-performance; tools: skill, bash, grep / ✅ analyzing-dotnet-performance; tools: skill, bash ✅ 0.16 [3]
analyzing-dotnet-performance Finds per-call Dictionary allocation not hoisted to static 5.0/5 → 5.0/5 5.0/5 → 3.7/5 ⏰ 🔴 ✅ analyzing-dotnet-performance; tools: skill, bash, grep / ✅ analyzing-dotnet-performance; tools: skill, bash, write_bash, grep ✅ 0.16
analyzing-dotnet-performance Catches compound allocations in recursive number converter with ToLower 3.7/5 → 4.3/5 🟢 3.7/5 → 4.7/5 🟢 ✅ analyzing-dotnet-performance; tools: skill, bash / ✅ analyzing-dotnet-performance; tools: skill, bash ✅ 0.16
analyzing-dotnet-performance Finds StringComparison.Ordinal missing and FrozenDictionary opportunities 4.7/5 → 5.0/5 🟢 4.7/5 → 5.0/5 ⏰ 🟢 ✅ analyzing-dotnet-performance; tools: skill, grep, bash / ✅ analyzing-dotnet-performance; tools: skill, bash ✅ 0.16 [4]
analyzing-dotnet-performance Detects Aggregate+Replace chain and struct missing IEquatable 4.0/5 → 4.3/5 ⏰ 🟢 4.0/5 → 2.7/5 ⏰ 🔴 ✅ analyzing-dotnet-performance; tools: skill, bash / ✅ analyzing-dotnet-performance; tools: skill, bash, write_bash, stop_bash ✅ 0.16
analyzing-dotnet-performance Finds branched Replace chain in format string manipulation 1.3/5 ⏰ → 3.3/5 🟢 1.3/5 ⏰ → 3.7/5 🟢 ✅ analyzing-dotnet-performance; tools: skill, bash, grep / ✅ analyzing-dotnet-performance; tools: skill, bash ✅ 0.16
analyzing-dotnet-performance Catches LINQ on hot-path string processing and All(char.IsUpper) 4.3/5 → 4.7/5 🟢 4.3/5 → 5.0/5 🟢 ✅ analyzing-dotnet-performance; tools: skill, bash / ✅ analyzing-dotnet-performance; tools: skill, bash ✅ 0.16
analyzing-dotnet-performance Detects LINQ pipeline in TimeSpan formatting and collection processing 5.0/5 → 4.0/5 🔴 5.0/5 → 3.7/5 🔴 ✅ analyzing-dotnet-performance; tools: skill, bash / ✅ analyzing-dotnet-performance; tools: skill, bash ✅ 0.16
analyzing-dotnet-performance Flags Span inconsistencies and compound method chains in truncation library 4.3/5 → 4.0/5 🔴 4.3/5 → 4.7/5 🟢 ✅ analyzing-dotnet-performance; tools: skill, bash / ✅ analyzing-dotnet-performance; tools: skill, bash, write_bash, stop_bash, read_bash ✅ 0.16
analyzing-dotnet-performance Identifies unsealed leaf classes and locale hierarchy patterns 3.3/5 ⏰ → 4.3/5 ⏰ 🟢 3.3/5 ⏰ → 3.0/5 ⏰ 🔴 ✅ analyzing-dotnet-performance; tools: skill, bash, read_bash, stop_bash, grep / ✅ analyzing-dotnet-performance; tools: skill, bash, read_bash, stop_bash ✅ 0.16 [5]

[1] (Plugin) Quality unchanged but weighted score is -7.4% due to: quality, tokens (57025 → 83232), tool calls (5 → 6)
[2] (Plugin) Quality improved but weighted score is -62.5% due to: judgment, quality, tool calls (6 → 11), tokens (87694 → 103331)
[3] (Plugin) Quality unchanged but weighted score is -11.2% due to: tokens (28705 → 139554), tool calls (2 → 11), time (40.1s → 91.1s), quality
[4] (Plugin) Quality improved but weighted score is -5.1% due to: tokens (28967 → 132919), tool calls (2 → 11), time (39.7s → 104.5s)
[5] (Isolated) Quality improved but weighted score is -48.4% due to: judgment, quality, errors (0 → 1), tool calls (8 → 14), time (73.3s → 112.1s)

⏰ timeout: run(s) hit the (120s, 180s) scenario timeout limit; scoring may be affected because model execution was aborted before it could produce its full output (increase via timeout in eval.yaml)

Model: claude-opus-4.6 | Judge: claude-opus-4.6

📖 See InvestigatingResults.md for how to diagnose failures. Additional debugging guidance may be provided by your workflow.

Full results

To investigate failures, paste this to your AI coding agent:

Download eval artifacts with gh run download 23552325478 --repo dotnet/skills --dir /tmp/eval-results, then fetch https://raw.githubusercontent.com/dotnet/skills/4ec893eae1c36a42f48b2aa8f61138a4e91041c7/eng/skill-validator/InvestigatingResults.md and follow it to analyze the results.json files. Diagnose each failure, suggest fixes to the eval.yaml and skill content, and tell me what to fix first.

@danmoseley danmoseley merged commit 55a7d80 into dotnet:main Mar 25, 2026
30 checks passed
@jeffschwMSFT jeffschwMSFT deleted the jeffschw/add-simd-vectorization-reference branch April 1, 2026 21:00
4 participants