ExtractVlmTextOperation

Extract text from a document using vision language models (VLMs). VLMs can understand document layout and structure more intelligently than traditional OCR.

Properties

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| llm_spec | LlmSpec | Yes | |
| preprocessing_configuration | Optional[VlmPreprocessingConfig] | No | |
| image_spec | Optional[ImageSpec] | No | |
| output_format | TextOutputFormat | Yes | |
| page_range | Optional[PageRange] | No | |
| type | Literal["extractVlmText"] | Yes | None |
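
As a rough illustration of how the properties above fit together, the sketch below assembles an operation as a plain Python dictionary. It does not use the actual client classes: the helper `build_extract_vlm_text_operation` is hypothetical, and the contents of `llm_spec`, `output_format`, and `page_range` are placeholder values, since this page does not describe the fields of `LlmSpec`, `TextOutputFormat`, or `PageRange`. Only the property names and the `"extractVlmText"` literal come from the table.

```python
import json
from typing import Any, Dict, Optional


def build_extract_vlm_text_operation(
    llm_spec: Dict[str, Any],
    output_format: Any,
    preprocessing_configuration: Optional[Dict[str, Any]] = None,
    image_spec: Optional[Dict[str, Any]] = None,
    page_range: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: build an ExtractVlmTextOperation-shaped dict."""
    # Required properties, including the fixed discriminator value.
    operation: Dict[str, Any] = {
        "type": "extractVlmText",
        "llm_spec": llm_spec,
        "output_format": output_format,
    }
    # Optional properties are included only when a value is supplied.
    optional = {
        "preprocessing_configuration": preprocessing_configuration,
        "image_spec": image_spec,
        "page_range": page_range,
    }
    operation.update({k: v for k, v in optional.items() if v is not None})
    return operation


# All values below are placeholders, not documented field names or enums.
example = build_extract_vlm_text_operation(
    llm_spec={"model": "example-vlm"},
    output_format="markdown",
    page_range={"start": 1, "end": 3},
)
print(json.dumps(example, indent=2))
```

Omitting the optional properties when they are unset keeps the payload minimal. Whether the real API expects omitted keys or explicit nulls is not stated here, so treat that choice as an assumption.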
