parallel-data

Here are 7 public repositories matching this topic...

thammegowda / mtdata

A tool that locates, downloads, and extracts machine translation corpora

multilingual natural-language-processing machine-translation dataset natural-language-generation parallel-data

Updated Sep 18, 2025
Python

PartitionedArrays / PartitionedArrays.jl

Large-scale, distributed, sparse linear algebra in Julia.

hpc julia linear-algebra mpi parallel-algorithms parallel-data

Updated Sep 8, 2025
Julia

VinAIResearch / PhoMT

PhoMT: A High-Quality and Large-Scale Benchmark Dataset for Vietnamese-English Machine Translation (EMNLP 2021)

vietnamese machine-translation corpus english dataset vietnamese-nlp parallel-data

Updated Jun 3, 2025

Elbria / xling-SemDiv

Code and data for the EMNLP 2020 paper: "Detecting Fine-Grained Cross-Lingual Semantic Divergences without Supervision by Learning to Rank"

learning-to-rank parallel-data bertology multilingual-bert semantic-divergences corpus-filtering synthetic-supervision cross-lingual-similarity

Updated Feb 10, 2023
Python

lormaechea / wivico

Wikipedia-Vikidia Corpus (WiViCo) - A general-purpose parallel sentence simplification dataset for French

wikipedia general-purpose text-simplification parallel-data french-language vikidia complex-simple

Updated May 20, 2025

Datsede04 / Amharic-corps-collector-bot

A Telegram Bot for Amharic Speech Data Collection

bot language corpus parallel-data

Updated Apr 13, 2022
JavaScript

SuZeAI / DP

This repository focuses on distributed and parallel computing with PyTorch, covering model parallelism, data parallelism, and advanced optimization techniques. It provides resources for scaling AI training and inference efficiently across multiple devices.

parallel parallel-computing distributed ddp parallel-data tensor-parallelism fsdp

Updated Jun 29, 2025
Jupyter Notebook