Add an HTML backend that processes the VLM parser's HTML code by inoray · Pull Request #158 · genonai/doc_parser

inoray · 2026-02-02T05:05:36Z

Checklist:

Documentation has been updated, if necessary.
Examples have been added, if necessary.
Tests have been added, if necessary.

gemini-code-assist · 2026-02-02T05:05:52Z

Summary of Changes

Hello @inoray, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

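This pull request introduces a new backend to the Docling system that can process HTML documents. The backend converts the structure and content of an HTML file into a DoclingDocument object, improving the system's ability to collect and structure information from HTML sources. In particular, it maps HTML tags to Docling's internal data model and propagates the class and bounding-box information of div elements to their child elements, so that the document's layout and semantic information is reflected more accurately.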

Highlights

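New HTML backend added: GenosVlmHTMLDocumentBackend, a new HTML document backend for the VLM parser, has been added. It parses HTML files and converts them into DoclingDocument objects.
HTML tag handling logic implemented: Handler functions were implemented that convert HTML tags such as h1-h6, p, pre, code, ul, ol, li, table, figure, img, and details into the corresponding DoclingDocument elements (text, headings, code, lists, tables, pictures, and groups).
DIV context and BoundingBox propagation: Context-management logic was added that propagates the class and data-bbox attributes of div tags to their child elements. This allows DocItemLabel and BoundingBox information to be assigned more accurately.
CSS class-based DocItemLabel mapping: A CLASS_TO_DOCITEM_LABEL dictionary is defined that maps specific CSS class names to DocItemLabel values. It is used to extract semantic information from the HTML structure.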
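The div-context idea in the last two highlights can be illustrated with a minimal sketch. This is not the PR's code: it uses only the standard library's `html.parser`, plain strings instead of `DocItemLabel`, and a hypothetical `CLASS_TO_LABEL` table. The `data-bbox` format (four numbers separated by spaces or commas) is also an assumption, since the summary does not specify it.

```python
from html.parser import HTMLParser

# Hypothetical stand-in for the PR's CLASS_TO_DOCITEM_LABEL; the real
# dictionary maps CSS class names to DocItemLabel values.
CLASS_TO_LABEL = {"title": "title", "caption": "caption", "footnote": "footnote"}

# Elements whose text is collected in this sketch (nesting among them is ignored).
TEXT_TAGS = {"p", "h1", "h2", "h3", "h4", "h5", "h6", "li"}


def parse_bbox(raw):
    """Parse an assumed 'l t r b' / 'l,t,r,b' string into four floats, else None."""
    if not raw:
        return None
    parts = raw.replace(",", " ").split()
    if len(parts) != 4:
        return None
    try:
        return tuple(float(p) for p in parts)
    except ValueError:
        return None


class DivContextParser(HTMLParser):
    """Collects (tag, text, label, bbox), inheriting label/bbox from enclosing divs."""

    def __init__(self):
        super().__init__()
        self.div_stack = []  # one (label, bbox) entry per open <div>
        self.items = []      # collected (tag, text, label, bbox) tuples
        self._open = None    # (tag, [text chunks]) of the text element being read

    def _context(self):
        # The innermost enclosing div that defines a value wins.
        label = next((lb for lb, _ in reversed(self.div_stack) if lb), "text")
        bbox = next((bb for _, bb in reversed(self.div_stack) if bb), None)
        return label, bbox

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "div":
            classes = (attrs.get("class") or "").split()
            label = next((CLASS_TO_LABEL[c] for c in classes if c in CLASS_TO_LABEL), None)
            self.div_stack.append((label, parse_bbox(attrs.get("data-bbox"))))
        elif tag in TEXT_TAGS:
            self._open = (tag, [])

    def handle_endtag(self, tag):
        if tag == "div":
            if self.div_stack:  # tolerate stray </div> in malformed input
                self.div_stack.pop()
        elif self._open and tag == self._open[0]:
            text = "".join(self._open[1]).strip()
            if text:
                self.items.append((tag, text, *self._context()))
            self._open = None

    def handle_data(self, data):
        if self._open:
            self._open[1].append(data)
```

Feeding it `<div class="caption" data-bbox="10 20 30 40"><div><p>Fig. 1</p></div></div>` yields one item whose label and bounding box come from the outer div, even though the paragraph sits inside an attribute-less inner div. Keeping the context on an explicit stack means a closing `</div>` restores the parent context without any bookkeeping on the child elements.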

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist

Code Review

This PR introduces a new GenosVlmHTMLDocumentBackend for HTML parsing, extending functionality to handle div contexts with class and data-bbox attributes. While enhancing HTML parsing, the implementation inherits several Denial of Service (DoS) vulnerabilities related to untrusted input processing. Specifically, unbounded table dimensions can lead to memory exhaustion, and deeply nested HTML can cause crashes via KeyError or RecursionError. Addressing these security concerns by implementing resource limits and robust error handling for malformed or malicious input is crucial. Additionally, the review highlighted opportunities for code quality improvement, including refactoring for better maintainability, removing unused code, and fixing minor bugs.
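The kind of resource limits the review asks for might look like the following sketch. It is illustrative only and not taken from the PR: the limit values, the `ResourceLimitError` class, the helper names, and the nested-dict node shape are all assumptions. It clamps `rowspan`/`colspan` values, rejects oversized tables before a grid is allocated, and walks the tree with an explicit stack so that deep nesting raises a controlled error instead of a `RecursionError`.

```python
# All limits below are assumed values chosen for illustration.
MAX_TABLE_CELLS = 100_000
MAX_SPAN = 1_000
MAX_DEPTH = 200


class ResourceLimitError(ValueError):
    """Raised when untrusted input exceeds a configured limit (hypothetical)."""


def safe_span(raw, default=1):
    """Parse a rowspan/colspan attribute and clamp it to [1, MAX_SPAN]."""
    try:
        value = int(raw)
    except (TypeError, ValueError):
        return default
    return max(1, min(value, MAX_SPAN))


def check_table_size(num_rows, num_cols):
    """Reject a table before allocating a num_rows x num_cols grid for it."""
    if num_rows * num_cols > MAX_TABLE_CELLS:
        raise ResourceLimitError(
            f"table of {num_rows}x{num_cols} cells exceeds {MAX_TABLE_CELLS}"
        )


def walk(root, max_depth=MAX_DEPTH):
    """Yield nodes depth-first (document order) using an explicit stack.

    Nodes are assumed to be dicts with an optional "children" list.
    """
    stack = [(root, 1)]
    while stack:
        node, depth = stack.pop()
        if depth > max_depth:
            raise ResourceLimitError(f"nesting deeper than {max_depth} levels")
        yield node
        # Push children reversed so the first child is visited first.
        for child in reversed(node.get("children", [])):
            stack.append((child, depth + 1))
```

A caller would catch `ResourceLimitError` at the document boundary and fail that one conversion cleanly, rather than letting a single malicious file exhaust memory or crash the process.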

gemini-code-assist · 2026-02-02T05:07:28Z