feat(html): add Go template parser with structural analysis #47

Open
doITmagic wants to merge 1 commit into dev from feat/go-template-parser
@doITmagic
Owner

Description

Add semantic indexing for Go template syntax ({{ }}) in HTML, .tmpl, and .gohtml files.

Previously, ragcode indexed HTML files containing Go templates as plain text — it understood the HTML DOM structure (via goquery) but completely ignored Go template directives like {{ define }}, {{ template }}, {{ range }}, {{ if }}, etc. Additionally, .tmpl and .gohtml files were not recognized at all.

What's included

  • GoTemplateAnalyzer — regex-based parser (similar to BladeAnalyzer) that extracts 10 directive types:
    {{ define }}, {{ block }}, {{ template }}, {{ range }}, {{ if/else }}, {{ with }}, {{ .Variable }}, {{ funcName }}, {{/* comments */}}
  • Adapter — converts GoTemplate structs to parser.Symbol with:
    • RelDependency relations ({{ template "nav" }} → dependency to "nav")
    • Rich metadata: variables, custom_funcs, ranges, blocks, defines, includes
    • Per-define symbols (each {{ define "name" }} gets its own symbol)
  • HTML Analyzer integration — automatic dual-mode analysis:
    • Detects {{ in file content regardless of extension (.html, .tmpl, .gohtml)
    • Runs Go template analysis first, then HTML DOM analysis
    • Pure HTML files (no {{ }}) continue to work as before
  • Stack-based EndLine tracking for nested blocks (define → range → if → end)
  • 3 testdata files + 17 unit/integration tests all passing
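
The stack-based EndLine tracking mentioned in the list above could look roughly like the following. This is a minimal sketch, not the PR's implementation: the `Block` type, the `reToken` pattern, and `TrackEndLines` are hypothetical names introduced for illustration. Each block opener is pushed on a stack, and each `{{ end }}` pops the most recent opener and records its closing line.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Block is a hypothetical record for one block-opening directive.
type Block struct {
	Kind      string // "define", "block", "range", "if", or "with"
	StartLine int    // 1-based line of the opening directive
	EndLine   int    // 1-based line of the matching {{ end }}; 0 if unclosed
}

// reToken (an assumption, not the PR's pattern) matches a block opener or
// {{ end }}, allowing the optional "-" trim marker.
var reToken = regexp.MustCompile(`\{\{-?\s*(define|block|range|if|with|end)\b`)

// TrackEndLines pairs each opener with its {{ end }} using a stack.
func TrackEndLines(src string) []Block {
	var blocks []Block
	var stack []int // indexes into blocks for openers still waiting for {{ end }}
	for i, line := range strings.Split(src, "\n") {
		for _, m := range reToken.FindAllStringSubmatch(line, -1) {
			if m[1] != "end" {
				blocks = append(blocks, Block{Kind: m[1], StartLine: i + 1})
				stack = append(stack, len(blocks)-1)
				continue
			}
			if len(stack) == 0 {
				continue // stray {{ end }} with no opener; ignore it
			}
			top := stack[len(stack)-1]
			stack = stack[:len(stack)-1]
			blocks[top].EndLine = i + 1
		}
	}
	return blocks
}

func main() {
	src := `{{ define "list" }}
{{ range .Items }}
{{ if .Visible }}{{ .Name }}{{ end }}
{{ end }}
{{ end }}`
	for _, b := range TrackEndLines(src) {
		fmt.Printf("%s %d-%d\n", b.Kind, b.StartLine, b.EndLine)
	}
}
```

For the nested sample in `main`, the define spans lines 1 to 5, the range spans lines 2 to 4, and the if opens and closes on line 3. Because `{{ else }}` and `{{ else if }}` do not match the opener pattern, they do not disturb the stack.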

Architecture decision

Implemented as a sub-package pkg/parser/html/gotemplate/ because:

  1. Go templates are an extension of HTML — they co-exist with HTML structure
  2. .tmpl/.gohtml files already contain HTML — both analyses are valuable
  3. Detection is content-based ({{ presence), not extension-based
  4. Follows the same regex pattern as BladeAnalyzer in pkg/parser/php/laravel/
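
Point 3 above (content-based detection) can be illustrated with a small sketch. The `AnalysisMode` type and `DetectMode` function are hypothetical names, not part of the PR; the only behavior taken from the source is the check for `{{` in the file content.

```go
package main

import (
	"bytes"
	"fmt"
)

// AnalysisMode is a hypothetical label for which analyses run on a file.
type AnalysisMode string

const (
	ModeHTMLOnly   AnalysisMode = "html"
	ModeGoTemplate AnalysisMode = "gotemplate+html"
)

// DetectMode chooses the mode from file content, not from the extension.
func DetectMode(content []byte) AnalysisMode {
	if bytes.Contains(content, []byte("{{")) {
		return ModeGoTemplate
	}
	return ModeHTMLOnly
}

func main() {
	fmt.Println(DetectMode([]byte(`<h1>{{ .Title }}</h1>`)))
	fmt.Println(DetectMode([]byte(`<h1>Static</h1>`)))
}
```

One trade-off of this design: other template syntaxes such as Handlebars also use `{{`, so a bare substring check may classify those files as Go templates too.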

Files Changed

| File | Change |
| --- | --- |
| pkg/parser/html/gotemplate/types.go | NEW: 7 type definitions (GoTemplate, DefineDirective, etc.) |
| pkg/parser/html/gotemplate/analyzer.go | NEW: GoTemplateAnalyzer with 10 regex extractors |
| pkg/parser/html/gotemplate/adapter.go | NEW: GoTemplate → parser.Symbol conversion with relations |
| pkg/parser/html/gotemplate/analyzer_test.go | NEW: 8 analyzer + 3 adapter unit tests |
| pkg/parser/html/gotemplate/testdata/ | NEW: layout.html, page.tmpl, partial.gohtml |
| pkg/parser/html/analyzer.go | MODIFIED: integrate gotemplate + support .tmpl/.gohtml |
| pkg/parser/html/analyzer_test.go | MODIFIED: 3 integration tests (CanHandle, HTML+GoTpl, .tmpl) |

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update

Checklist:

  • I have performed a self-review of my own code
  • I have formatted my code with go fmt ./...
  • I have run tests go test ./... and they pass
  • I have verified integration with Ollama/Qdrant (if applicable)
  • I have updated the documentation accordingly

Add GoTemplateAnalyzer that parses Go template syntax ({{ }}) in HTML,
.tmpl, and .gohtml files. Extracts directives (define, block, template,
range, if/else, with), variables, custom functions, and comments.

Key features:
- Regex-based parser similar to BladeAnalyzer
- Converts to parser.Symbol with RelDependency relations
  ({{ template "x" }} creates dependency to template x)
- Dual-mode analysis: Go template + HTML DOM for all file types
- Detects {{ }} syntax automatically regardless of extension
- Stack-based EndLine tracking for nested blocks
- Rich metadata: variables, custom_funcs, ranges, blocks

Files added:
- pkg/parser/html/gotemplate/types.go     - Type definitions
- pkg/parser/html/gotemplate/analyzer.go  - GoTemplateAnalyzer (regex)
- pkg/parser/html/gotemplate/adapter.go   - GoTemplate → Symbol conversion
- pkg/parser/html/gotemplate/testdata/    - Test fixtures

Files modified:
- pkg/parser/html/analyzer.go     - Integrate gotemplate + .tmpl/.gohtml
- pkg/parser/html/analyzer_test.go - Integration tests

17/17 tests passing
Copilot AI review requested due to automatic review settings March 18, 2026 21:02
@doITmagic doITmagic self-assigned this Mar 18, 2026
Copilot AI left a comment

Pull request overview

Adds semantic indexing for Go html/template directives embedded in HTML-like files so ragcode can extract template structure (defines/includes/blocks/etc.) instead of treating {{ ... }} as plain text.

Changes:

  • Introduces pkg/parser/html/gotemplate sub-package (types, regex analyzer, symbol adapter) + unit tests/testdata.
  • Extends the HTML analyzer to recognize .tmpl/.gohtml and to run Go-template analysis when {{ is present.
  • Adds integration tests covering Go-template-in-HTML and new extensions.

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| pkg/parser/html/gotemplate/types.go | Defines parsed directive/container types for Go template extraction. |
| pkg/parser/html/gotemplate/analyzer.go | Regex-based Go template directive extraction + block end-line tracking. |
| pkg/parser/html/gotemplate/adapter.go | Converts parsed templates into parser.Symbols + dependency relations/metadata. |
| pkg/parser/html/gotemplate/analyzer_test.go | Unit tests for directive extraction across sample templates. |
| pkg/parser/html/gotemplate/adapter_test.go | Unit tests for symbol conversion, metadata, and relations. |
| pkg/parser/html/gotemplate/testdata/layout.html | Sample template with define/include/block/range/if/with/comment. |
| pkg/parser/html/gotemplate/testdata/page.tmpl | Sample template with template include + define/range/custom funcs. |
| pkg/parser/html/gotemplate/testdata/partial.gohtml | Sample template with nested if/range and variables. |
| pkg/parser/html/analyzer.go | Adds .tmpl/.gohtml handling and dual Go-template + HTML DOM analysis. |
| pkg/parser/html/analyzer_test.go | Integration tests for Go template detection and new extensions. |

Comment on lines +13 to +14
reBlock = regexp.MustCompile(`\{\{-?\s*block\s+"([^"]+)"\s*(\.[\w.]*)?`)
reTemplate = regexp.MustCompile(`\{\{-?\s*template\s+"([^"]+)"\s*(\.[\w.]*)?`)
Comment on lines +15 to +18
reRange = regexp.MustCompile(`\{\{-?\s*range\s+(\.[\w.]+)`)
reIf = regexp.MustCompile(`\{\{-?\s*if\s+(.+?)\s*-?\}\}`)
reElse = regexp.MustCompile(`\{\{-?\s*else\s*-?\}\}`)
reWith = regexp.MustCompile(`\{\{-?\s*with\s+(\.[\w.]+)`)
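
As a quick check of what the `reRange` pattern quoted above accepts, the sketch below runs it against a few common `range` forms. The pattern is copied from the excerpt; the `RangeTarget` helper and the sample inputs are assumptions added for illustration, and the sketch makes no claim about what the review comment itself said.

```go
package main

import (
	"fmt"
	"regexp"
)

// reRange is copied from the review excerpt.
var reRange = regexp.MustCompile(`\{\{-?\s*range\s+(\.[\w.]+)`)

// RangeTarget (hypothetical helper) returns the captured field path,
// or "" when the pattern does not match the line.
func RangeTarget(line string) string {
	m := reRange.FindStringSubmatch(line)
	if m == nil {
		return ""
	}
	return m[1]
}

func main() {
	for _, line := range []string{
		`{{ range .Items }}`,
		`{{- range .Page.Posts -}}`,
		`{{ range $i, $v := .Items }}`,
		`{{ range $.Items }}`,
	} {
		fmt.Printf("%-32s -> %q\n", line, RangeTarget(line))
	}
}
```

The first two inputs match and capture `.Items` and `.Page.Posts`. The last two yield an empty string, because the pattern requires the token after `range` to start with a dot, so variable declarations (`$i, $v := ...`) and root-context access (`$.Items`) are not captured.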
Comment on lines +137 to +138
// {{ else }}
if reElse.MatchString(line) {
Comment on lines +43 to +47
for _, fp := range filePaths {
	tpl, err := a.analyzeFile(fp)
	if err != nil {
		continue // skip unreadable files
	}
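
One possible alternative to silently skipping unreadable files is to keep going but collect the errors, so the caller can log or surface them. The sketch below is hypothetical and does not reflect the PR's `Analyze` signature; `ReadAll` is an illustrative name, and `errors.Join` requires Go 1.20 or later.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// ReadAll (hypothetical) reads every path it can and returns the
// contents alongside a joined error describing the files it could not read.
func ReadAll(paths []string) (map[string][]byte, error) {
	out := make(map[string][]byte)
	var errs []error
	for _, p := range paths {
		data, err := os.ReadFile(p)
		if err != nil {
			// Record the failure instead of dropping it, then continue.
			errs = append(errs, fmt.Errorf("read %s: %w", p, err))
			continue
		}
		out[p] = data
	}
	// errors.Join returns nil when errs is empty.
	return out, errors.Join(errs...)
}

func main() {
	files, err := ReadAll([]string{"/definitely/not/here.tmpl"})
	fmt.Println(len(files), err != nil)
}
```

This keeps the best-effort behavior of the original loop while making partial failures visible.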
Comment on lines +56 to +75
// For single files: detect Go template syntax and run GoTemplate analysis
info, err := os.Stat(path)
if err != nil {
	return nil, err
}

if !info.IsDir() {
	data, err := os.ReadFile(path)
	if err != nil {
		return nil, err
	}

	// If Go template syntax detected, run Go template analysis first
	if bytes.Contains(data, []byte("{{")) {
		goTplAnalyzer := &gotemplate.GoTemplateAnalyzer{}
		templates := goTplAnalyzer.Analyze([]string{path})
		symbols = append(symbols, gotemplate.ConvertToSymbols(templates)...)
	}
}

// ConvertToSymbols converts parsed GoTemplate results to parser.Symbol entries
// with structural relations (dependency for {{ template }}, inheritance-like for {{ block }}).
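
The dependency relation described in this comment can be sketched as follows. The `Relation` type and `Dependencies` function are hypothetical stand-ins and not the real `parser` package API; the `reTemplate` pattern is the one quoted earlier in the review.

```go
package main

import (
	"fmt"
	"regexp"
)

// Relation is a hypothetical stand-in for a dependency edge;
// it is not the real parser type.
type Relation struct {
	Kind   string
	Target string
}

// reTemplate is the pattern quoted in the review for {{ template "name" }}.
var reTemplate = regexp.MustCompile(`\{\{-?\s*template\s+"([^"]+)"\s*(\.[\w.]*)?`)

// Dependencies returns one relation per distinct template name included
// in src, in order of first appearance.
func Dependencies(src string) []Relation {
	seen := map[string]bool{}
	var rels []Relation
	for _, m := range reTemplate.FindAllStringSubmatch(src, -1) {
		name := m[1]
		if seen[name] {
			continue // avoid duplicate edges for repeated includes
		}
		seen[name] = true
		rels = append(rels, Relation{Kind: "dependency", Target: name})
	}
	return rels
}

func main() {
	src := `<body>{{ template "nav" . }}<main></main>{{ template "footer" }}</body>`
	for _, r := range Dependencies(src) {
		fmt.Printf("%s -> %s\n", r.Kind, r.Target)
	}
}
```

Deduplicating by name is a design choice of this sketch; whether the adapter in the PR emits one relation per include site or one per distinct target is not stated in the source.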