CorAsvAnnEval: wrap CLI instead of API #3

@bertsky

https://github.com/kba/ocrmultieval/blob/5de79f3021b48f83f9cb798a484fd472d21ed94b/ocrmultieval/backends/CorAsvAnnEval.py#L18-L20

I can see it would be hard to repeat everything in ocrd_cor_asv_ann.scripts.compare. So why not instead call the CLI and have it write its report where we want it?
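A minimal sketch of what the CLI route could look like, to make the idea concrete. The executable name `cor-asv-ann-compare`, the `--output-file` flag, the JSON report format and the `"gt,ocr"` report key are all assumptions (the latter two inferred from the snippet further down), and the helper names are hypothetical:

```python
import json
import subprocess
import tempfile
from pathlib import Path


def build_compare_command(gt_file, ocr_file, report_path,
                          executable="cor-asv-ann-compare"):
    """Build the argument list for one GT/OCR pair.

    Assumption: the CLI accepts --output-file followed by GT_FILE and OCR_FILE.
    """
    return [executable, "--output-file", str(report_path),
            str(gt_file), str(ocr_file)]


def extract_pair_metrics(report, gt_file, ocr_file):
    """Pick this pair's aggregates from a parsed report, dropping per-line details.

    Assumption: the report is a dict keyed by "<gt_file>,<ocr_file>".
    """
    metrics = dict(report[f"{gt_file},{ocr_file}"])
    metrics.pop("lines", None)
    return metrics


def compare_via_cli(gt_file, ocr_file):
    """Run the CLI, let it write its report to a temp file, and read it back."""
    with tempfile.TemporaryDirectory() as tmpdir:
        report_path = Path(tmpdir) / "report.json"
        command = build_compare_command(gt_file, ocr_file, report_path)
        # check=True raises CalledProcessError if the CLI exits non-zero
        subprocess.run(command, check=True)
        report = json.loads(report_path.read_text(encoding="utf-8"))
    return extract_pair_metrics(report, gt_file, ocr_file)
```

The backend's `compare_files` could then pass the result of `compare_via_cli(gt_file, ocr_file)` on to `self.make_report(...)`, without touching any of the internals of `ocrd_cor_asv_ann.scripts.compare`.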

Alternatively, keep the API as it is and emulate Click's context as in:

import json
from io import StringIO

from ocrd_cor_asv_ann.scripts.compare import cli as standalone_cli

...

def compare_files(self, gt_mediatype, gt_file, ocr_mediatype, ocr_file, pageId):
    outfile = StringIO()
    kwargs = {'output_file': outfile,
              'gt_file': gt_file,
              'ocr_files': [ocr_file],
              # add the remaining options here, e.g. normalization, gt_level, confusion, histogram
              # (calling the callback directly bypasses Click's defaults, so each must be given explicitly)
              # if gt_mediatype is plaintext path list, you need to set file_lists=True
    }
    # call the plain function behind the Click command;
    # standalone_cli(**kwargs) would go through Click's own argument parsing instead
    standalone_cli.callback(**kwargs)
    # assuming the report is written as JSON
    report = json.loads(outfile.getvalue())
    # dive into this file pair's aggregates
    metrics = report[gt_file + ',' + ocr_file]
    # maybe delete the individual line metrics
    del metrics['lines']
    return self.make_report(gt_file, ocr_file, pageId, **metrics)
