@@ -1,11 +1,11 @@
 {
 "title": "Handwritten Digit Recognition with a Back-Propagation Network",
 "ids": null,
-"venue": null,
+"venue": "Neural Information Processing Systems (NIPS 1989)",
 "publication_date": null,
 "authors": [
 {
-"name": "Le Cun",
+"name": "Y. Le Cun",
 "array_index": 0,
 "affiliation": "AT&T Bell Laboratories"
 },
@@ -43,7 +43,7 @@
 "abstract": "We present an application of back-propagation networks to handwritten digit recognition. Minimal preprocessing of the data was required, but architecture of the network was highly constrained and specifically designed for the task. The input of the network consists of normalized images of isolated digits. The method has 1% error rate and about a 9% reject rate on zipcode digits provided by the U.S. Postal Service.",
 "keywords": [],
 "number_of_pages": 9,
-"publication_type": "conference paper",
+"publication_type": "Conference Paper",
 "citations": [
 "Bottou, L.-Y. and Le Cun, Y. (1989). SN2: A Simulator for Connectionist Models. Neuristique SA, Paris, France.",
 "Denker, J., Schwartz, D., Wittner, B., Solla, S. A., Howard, R., Jackel, L., and Hopfield, J. (1987). Large Automatic Learning, Rule Extraction and Generalization. Complex Systems, 1:877–922.",
@@ -1,6 +1,6 @@
 {
 "ids": "arXiv:2405.06211v3",
-"title": "ASurvey on RAGMeetingLLMs: Towards Retrieval-Augmented Large Language Models",
+"title": "A Survey on RAG Meeting LLMs: Towards Retrieval-Augmented Large Language Models",
 "venue": "arXiv",
 "publication_type": "Preprint",
 "authors": [
@@ -38,7 +38,7 @@
 "name": "Dawei Yin",
 "array_index": 5,
 "email": "yindawei@acm.org",
-"affiliation": "The Hong Kong Polytechnic University, HK SAR"
+"affiliation": "Baidu Inc, China"
 },
 {
 "name": "Tat-Seng Chua",
@@ -63,7 +63,7 @@
 "Prompting."
 ],
 "number_of_pages": 18,
-"publication_date": "Fri Jun 30 2017 23:00:00 GMT+0100 (British Summer Time)",
+"publication_date": "2024-06-18",
 "citations": [
 "Josh Achiam, Steven Adler, Sandhini Agarwal, Lama Ahmad, Ilge Akkaya, Florencia Leoni Aleman, Diogo Almeida, Janko Altenschmidt, Sam Altman, Shyamal Anadkat, et al. 2023. Gpt-4 technical report. arXiv preprint arXiv:2303.08774 (2023).",
 "Sweta Agrawal, Chunting Zhou, Mike Lewis, Luke Zettlemoyer, and Marjan Ghazvininejad. 2023. In-context Examples Selection for Machine Translation. In ACL (Findings). Association for Computational Linguistics, 8857–8873.",
@@ -1,6 +1,6 @@
 {
 "ids": "arXiv:2501.02189v6",
-"title": "ASurvey of State of the Art Large Vision Language Models: Alignment, Benchmark, Evaluations and Challenges",
+"title": "A Survey of State of the Art Large Vision Language Models: Alignment, Benchmark, Evaluations and Challenges",
 "venue": "arXiv",
 "publication_type": "Preprint",
 "authors": [
@@ -15,7 +15,7 @@
 "email": "wuxiyang@umd.edu"
 },
 {
-"name": "HongyangDu",
+"name": "Hongyang Du",
 "array_index": 2,
 "email": "hdu1@umd.edu"
 },
@@ -30,15 +30,15 @@
 "email": "nghiemh@umd.edu"
 },
 {
-"name": "GuangyaoShi",
+"name": "Guangyao Shi",
 "array_index": 5,
 "email": "shig@usc.edu"
 }
 ],
 "abstract": "Multimodal Vision Language Models (VLMs) have emerged as a transformative topic at the intersection of computer vision and natural language processing, enabling machines to perceive and reason about the world through both visual and textual modalities. For example, models such as CLIP [194], Claude [11], and GPT-4V [246] demonstrate strong reasoning and understanding abilities on visual and textual data and beat classical single modality vision models on zero-shot classification [94]. With their rapid advancements in research and growing popularity in various applications, we provide a comprehensive survey of VLMs. Specifically, we provide a systematic overview of VLMs in the following aspects: [1] model information of the major VLMs developed up to 2025; [2] the transition of VLM architectures and the newest VLM alignment methods; [3] summary and categorization of the popular benchmarks and evaluation metrics of VLMs; [4] the challenges and issues faced by current VLMs such as hallucination, alignment, and safety.",
 "keywords": [],
 "number_of_pages": 22,
-"publication_date": "6 Apr 2025",
+"publication_date": "2025-04-06",
 "citations": [
 "Vicuna: An open-source chatbot impressing gpt-4 with 90%* chatgpt quality, 2023. Accessed: 2024-12-23.",
 "Marah Abdin, Jyoti Aneja, Hany Awadalla, Ahmed Awadallah, Ammar Ahmad Awan, Nguyen Bach, Amit Bahree, Arash Bakhtiari, Jianmin Bao, Harkirat Behl, Alon Benhaim, Misha Bilenko, Johan Bjorck, Sébastien Bubeck, Martin Cai, Qin Cai, Vishrav Chaudhary, Dong Chen, Dongdong Chen, Weizhu Chen, Yen-Chun Chen, Yi-Ling Chen, Hao Cheng, Parul Chopra, Xiyang Dai, Matthew Dixon, Ronen Eldan, Victor Fragoso, Jianfeng Gao, Mei Gao, Min Gao, Amit Garg, Allie Del Giorno, Abhishek Goswami, Suriya Gunasekar, Emman Haider, Junheng Hao, Russell J. Hewett, Wenxiang Hu, Jamie Huynh, Dan Iter, Sam Ade Jacobs, Mojan Javaheripi, Xin Jin, Nikos Karampatziakis, Piero Kauffmann, Mahoud Khademi, Dongwoo Kim, Young Jin Kim, Lev Kurilenko, James R. Lee, Yin Tat Lee, Yuanzhi Li, Yunsheng Li, Chen Liang, Lars Liden, Xihui Lin, Zeqi Lin, Ce Liu, Liyuan Liu, Mengchen Liu, Weishung Liu, Xiaodong Liu, Chong Luo, Piyush Madan, Ali Mahmoudzadeh, David Majercak, Matt Mazzola, Caio César Teodoro Mendes, Arindam Mitra, Hardik Modi, Anh Nguyen, Brandon Norick, Barun Patra, Daniel Perez-Becker, Thomas Portet, Reid Pryzant, Heyang Qin, Marko Radmilac, Liliang Ren, Gustavo de Rosa, Corby Rosset, Sambudha Roy, Olatunji Ruwase, Olli Saarikivi, Amin Saied, Adi Salim, Michael Santacroce, Shital Shah, Ning Shang, Hiteshi Sharma, Yelong Shen, Swadheen Shukla, Xia Song, Masahiro Tanaka, Andrea Tupini, Praneetha Vaddamanu, Chunyu Wang, Guanhua Wang, Lijuan Wang, Shuohang Wang, Xin Wang, Yu Wang, Rachel Ward, Wen Wen, Philipp Witte, Haiping Wu, Xiaoxia Wu, Michael Wyatt, Bin Xiao, Can Xu, Jiahang Xu, Weijian Xu, Jilong Xue, Sonali Yadav, Fan Yang, Jianwei Yang, Yifan Yang, Ziyi Yang, Donghan Yu, Lu Yuan, Chenruidong Zhang, Cyril Zhang, Jianwen Zhang, Li Lyna Zhang, Yi Zhang, Yue Zhang, Yunan Zhang, and Xiren Zhou. Phi-3 technical report: A highly capable language model locally on your phone, 2024.",
@@ -103,7 +103,7 @@
 "Capacity Evaluation"
 ],
 "number_of_pages": 144,
-"publication_date": "11 March 2025",
+"publication_date": "2025-03-11",
 "citations": [
 "Y. Bengio, R. Ducharme, P. Vincent, and C. Janvin, “A neural probabilistic language model,” J. Mach. Learn. Res., vol. 3, pp. 1137–1155, 2003.",
 "R. Collobert, J. Weston, L. Bottou, M. Karlen, K. Kavukcuoglu, and P. P. Kuksa, “Natural language processing (almost) from scratch,” J. Mach. Learn. Res., vol. 12, pp. 2493–2537, 2011.",
@@ -46,7 +46,7 @@
 "NA"
 ],
 "number_of_pages": 22,
-"publication_date": "July 16, 2024",
+"publication_date": "2024-07-16",
 "citations": [
 "Ahmad Abdelfattah, Azzam Haidar, Stanimire Tomov, and Jack Dongarra. Performance, design, and autotuning of batched gemm for gpus. ISBN 978-3-319-41320-4. doi: 10.1007/978-3-319-41321-1_2.",
 "AI21. Introducing jamba: Ai21's groundbreaking ssm-transformer model. AI21 blog, 2024.",
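The date corrections in the metadata files above all converge on the ISO layouts the schema asks for ("6 Apr 2025" → "2025-04-06", "July 16, 2024" → "2024-07-16"). Most of them could be produced mechanically; the sketch below shows one way, assuming the listed `strptime` formats cover the variants in the dataset (the `FORMATS` list and the `normalize_date` name are illustrative, not part of this PR). Note the JavaScript-style timestamp fixed in one file ("Fri Jun 30 2017 23:00:00 GMT+0100 …") was wrong outright and needed manual correction, not reformatting.

```python
from datetime import datetime

# Date layouts observed in the uncorrected metadata files; this list is an
# assumption and would need extending as new variants appear.
FORMATS = ["%Y-%m-%d", "%d %b %Y", "%d %B %Y", "%B %d, %Y"]

def normalize_date(raw: str) -> str:
    """Return the date as YYYY-MM-DD, trying each known layout in turn."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue  # wrong layout; try the next one
    raise ValueError(f"unrecognised date format: {raw!r}")
```

Running this over the raw values reproduces the replacements in the hunks above, e.g. `normalize_date("11 March 2025")` gives `"2025-03-11"`.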
52 changes: 35 additions & 17 deletions dataset/academic/research/research-schema.json
@@ -11,20 +11,20 @@
 {"type": "string"},
 {"type": "null"}
 ],
-"description": "The unique identifier for the paper. Example: '10.1000/xyz123' or 'arXiv:1234.5678'.",
+"description": "The unique identifier for the paper, typically a DOI (e.g., '10.1000/xyz123') or arXiv ID (e.g., 'arXiv:1234.5678'). Extract exactly as printed in the document. Return null if no identifier is present.",
 "evaluation_config": "string_exact"
 },
 "title": {
 "type": "string",
-"description": "The full title of the research paper.",
+"description": "The full title of the research paper as it appears on the first page or header. Extract exactly as printed; do not abbreviate, rephrase, or translate.",
 "evaluation_config": "string_semantic"
 },
 "venue": {
 "anyOf": [
 {"type": "string"},
 {"type": "null"}
 ],
-"description": "Where the paper was published (e.g., 'NeurIPS 2024', 'Nature').",
+"description": "Where the paper was published or presented, including the venue name and year if available (e.g., 'NeurIPS 2024', 'Nature', 'IEEE Transactions on Pattern Analysis and Machine Intelligence'). Use the full venue name as stated in the document; do not abbreviate unless the document itself uses an abbreviation. Return null if no venue is specified.",
 "evaluation_config": "string_semantic"
 },
 "authors": {
@@ -35,19 +35,25 @@
 "name"
 ],
 "properties": {
+"array_index": {
+"type": "integer",
+"evaluation_config": "integer_exact"
+},
 "name": {
 "type": "string",
-"evaluation_config": "string_semantic"
+"evaluation_config": "string_semantic",
+"description": "Full name of the author as listed in the paper (e.g., 'John A. Smith', 'Yann LeCun'). Preserve the name format used in the document."
 },
 "email": {
 "type": "string",
 "format": "email",
-"evaluation_config": "string_exact"
+"evaluation_config": "string_exact",
+"description": "Email address of the author, if provided in the paper. Extract exactly as printed."
 },
 "affiliation": {
 "type": "string",
 "nullable": true,
-"description": "The affiliation of the author. Example: 'University of California, Berkeley' or 'Google'. If unknown, leave blank.",
+"description": "Institutional affiliation of the author as stated in the paper (e.g., 'University of California, Berkeley', 'Google DeepMind'). If no affiliation is provided, leave blank. Do not infer affiliations from other sources.",
 "evaluation_config": "string_semantic"
 }
 }
@@ -57,7 +63,7 @@
 },
 "abstract": {
 "type": "string",
-"description": "The summary or abstract of the paper.",
+"description": "The full abstract or summary of the paper as it appears in the document. Extract the complete text verbatim from the abstract section. Do not truncate, paraphrase, or summarise.",
 "evaluation_config": "string_semantic"
 },
 "keywords": {
@@ -66,7 +72,7 @@
 "type": "string",
 "evaluation_config": "string_semantic"
 },
-"description": "List of topics or tags (e.g., 'Machine Learning', 'Genomics').",
+"description": "Author-provided keywords or topics as listed in the paper's keywords section (e.g., 'Machine Learning', 'Genomics', 'Transformer'). Extract only keywords explicitly stated by the authors; do not infer or generate additional keywords.",
 "evaluation_config": "array_llm"
 },
 "citations": {
@@ -78,56 +84,68 @@
 "year"
 ],
 "properties": {
+"array_index": {
+"type": "integer",
+"evaluation_config": "integer_exact"
+},
 "ids": {
 "type": "string",
-"description": "The unique identifier for the paper. Example: '10.1000/xyz123' or 'arXiv:1234.5678'.",
+"description": "The unique identifier for the cited work, typically a DOI (e.g., '10.1000/xyz123') or arXiv ID (e.g., 'arXiv:1234.5678'). Extract if available in the reference entry.",
 "evaluation_config": "string_exact"
 },
 "year": {
 "type": "integer",
 "maximum": 2100,
 "minimum": 1000,
-"evaluation_config": "integer_exact"
+"evaluation_config": "integer_exact",
+"description": "Publication year of the cited work as stated in the reference entry. Return as an integer."
 },
 "title": {
 "type": "string",
-"evaluation_config": "string_semantic"
+"evaluation_config": "string_semantic",
+"description": "Title of the cited work as it appears in the references section. Extract exactly as printed."
 },
 "venue": {
 "type": "string",
-"evaluation_config": "string_semantic"
+"evaluation_config": "string_semantic",
+"description": "Publication venue of the cited work (e.g., journal name, conference name) as stated in the reference entry."
 },
 "authors": {
 "type": "array",
 "items": {
 "type": "object",
 "properties": {
+"array_index": {
+"type": "integer",
+"evaluation_config": "integer_exact"
+},
 "name": {
 "type": "string",
-"evaluation_config": "string_semantic"
+"evaluation_config": "string_semantic",
+"description": "Full name of the cited author as listed in the reference entry."
 }
 }
 },
 "evaluation_config": "array_llm"
 }
 }
 },
-"description": "List of works cited by this paper.",
+"description": "List of works cited in the references or bibliography section of this paper. Extract all cited works with available metadata. Preserve the order as they appear in the references section.",
 "evaluation_config": "array_llm"
 },
 "number_of_pages": {
 "type": "integer",
 "maximum": 1000,
 "minimum": 1,
-"description": "The number of pages in the paper.",
+"description": "Total number of pages in the paper, counting all pages in the PDF including references and appendices.",
 "evaluation_config": "integer_exact"
 },
 "publication_date": {
 "anyOf": [
 {"type": "string"},
 {"type": "null"}
 ],
-"description": "Date of publication in YYYY-MM-DD format or YYYY-MM or YYYY format.",
+"description": "Date of publication. Use the most precise format available: YYYY-MM-DD if a full date is stated, YYYY-MM if only month and year are available, or YYYY if only the year is known. Extract from the document; do not infer dates. Return null if no publication date is provided.",
 "evaluation_config": "string_semantic"
 },
 "publication_type": {
@@ -150,7 +168,7 @@
 "Software"
 ],
 "type": "string",
-"description": "The type of publication (e.g., 'Conference Paper', 'Journal Article', 'Technical Report', 'Blog Post', 'Preprint').",
+"description": "The type of publication, selected from the allowed enum values. Determine based on the venue, format, and context clues in the document (e.g., papers at NeurIPS/ICML are 'Conference Paper', papers in Nature/IEEE Transactions are 'Journal Article', papers on arXiv without a venue are 'Preprint').",
 "evaluation_config": "string_exact"
 }
 }