Impressive work!
Some datasets listed in train.md, such as webdata, GQA, OCR-VQA, TextVQA and VisualGenome, seem not to be used in training, since share-captioner_coco_lcs_sam_1246k_1107.json only contains coco, llava and sam labels.
So, were webdata, GQA ... actually used in your training process?
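For reference, a minimal sketch of how the label observation above could be checked. It assumes the JSON is a list of entries, each with an `image` field whose first path component names the source dataset (e.g. `coco/...`, `sam/...`). That layout is an assumption, and the helper names `count_sources` and `count_sources_in_file` are hypothetical.

```python
import json
from collections import Counter


def count_sources(entries):
    """Count entries by the first path component of their `image` field.

    Assumes (not confirmed by the repo) that each entry is a dict and that
    the dataset name is the leading directory of the image path.
    """
    counts = Counter()
    for entry in entries:
        image = entry.get("image")
        if not image:
            # Entries without an image path are tallied separately.
            counts["<no image>"] += 1
            continue
        counts[image.split("/", 1)[0]] += 1
    return counts


def count_sources_in_file(path):
    """Load a JSON list of entries from `path` and count their sources."""
    with open(path, encoding="utf-8") as f:
        return count_sources(json.load(f))
```

Calling `count_sources_in_file("share-captioner_coco_lcs_sam_1246k_1107.json")` and printing the result would list which top-level sources appear; a dataset such as GQA that is absent from the counts would not be covered by this file.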
Looking forward to your early reply!