Solutions for GitLab to GitHub Migration Challenges

This document addresses three main challenges of migrating from GitLab to GitHub:

1. Handling Nested Repositories (Subdirectories)

Problem

GitLab supports nested groups (e.g., group/subgroup/repo), but GitHub has no equivalent. Every GitHub repository sits directly under a user or organization, so the namespace is flat.

Solution

The service implements multiple naming strategies to convert nested GitLab paths to flat GitHub repository names:

Strategy 1: Flatten (Default)

Converts all slashes to hyphens:

  • group/subgroup/repo → group-subgroup-repo
  • company/backend/api-service → company-backend-api-service

Use Case: When you want to preserve the full path information in the repository name.

Strategy 2: Prefix

Uses the first and last segments:

  • group/subgroup/repo → group-repo
  • company/backend/api-service → company-api-service

Use Case: When you want to keep the top-level group and repository name but remove intermediate levels.

Strategy 3: Last Segment

Uses only the repository name:

  • group/subgroup/repo → repo
  • company/backend/api-service → api-service

Use Case: When repository names are unique and you don't need group information.
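The three automatic strategies above can be sketched in a few lines of Go. This is a minimal illustration, not the service's actual code: the function names `Flatten`, `Prefix`, and `LastSegment` and the helper `splitPath` are hypothetical.

```go
package main

import "strings"

// splitPath breaks a GitLab path such as "group/subgroup/repo" into its
// non-empty segments. (Hypothetical helper, not taken from the service.)
func splitPath(gitlabPath string) []string {
	var segments []string
	for _, s := range strings.Split(gitlabPath, "/") {
		if s != "" {
			segments = append(segments, s)
		}
	}
	return segments
}

// Flatten joins every path segment with a hyphen.
func Flatten(gitlabPath string) string {
	return strings.Join(splitPath(gitlabPath), "-")
}

// Prefix keeps only the first and last segments.
func Prefix(gitlabPath string) string {
	segs := splitPath(gitlabPath)
	if len(segs) <= 1 {
		// Zero or one segment: nothing to drop.
		return strings.Join(segs, "")
	}
	return segs[0] + "-" + segs[len(segs)-1]
}

// LastSegment keeps only the repository name.
func LastSegment(gitlabPath string) string {
	segs := splitPath(gitlabPath)
	if len(segs) == 0 {
		return ""
	}
	return segs[len(segs)-1]
}
```

Note that Prefix and Last Segment discard information, so two different GitLab paths can map to the same GitHub name. That is why a conflict check is still needed after mapping.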

Strategy 4: Custom Mappings

Allows you to specify exact mappings for each repository:

{
  "migration": {
    "naming_strategy": "custom",
    "name_mappings": {
      "group/subgroup/complex-name": "simple-name",
      "company/backend/api-service": "api",
      "legacy/old/project": "new-project-name"
    }
  }
}

Use Case: When you need specific names for certain repositories or want to rename them during migration.

Implementation

The NameMapper type in pkg/migration/naming.go handles all naming conversions and validates the results:

  • Validates GitHub repository name rules (1-100 characters, valid characters only)
  • Suggests fixes for invalid names
  • Prevents conflicts with reserved names

2. Choosing New Names for Nested Repositories

Automatic Name Generation

The service automatically generates names based on your selected strategy. The process:

  1. Discovery: Lists all repositories in the GitLab group (including nested groups)
  2. Mapping: Applies the selected naming strategy to each repository path
  3. Validation: Checks if the generated name is valid for GitHub
  4. Suggestion: If invalid, suggests a corrected name
  5. Conflict Check: Verifies the name doesn't already exist in GitHub

Manual Override

You can override automatic naming by using custom mappings in the configuration:

{
  "migration": {
    "naming_strategy": "flatten",
    "name_mappings": {
      "group/subgroup/special-repo": "my-custom-name",
      "another/group/repo": "another-custom-name"
    }
  }
}

Custom mappings take precedence over the selected strategy.
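The precedence rule can be pictured as a lookup followed by a fallback. The sketch below is an assumption about how such a rule might be written; `resolveName` and `flatten` are hypothetical names, not the service's API.

```go
package main

import "strings"

// resolveName returns the GitHub name for a GitLab path. An explicit entry
// in mappings wins; otherwise the fallback strategy is applied.
func resolveName(gitlabPath string, mappings map[string]string, fallback func(string) string) string {
	if name, ok := mappings[gitlabPath]; ok {
		return name
	}
	return fallback(gitlabPath)
}

// flatten is a stand-in for the "flatten" strategy: slashes become hyphens.
func flatten(gitlabPath string) string {
	return strings.ReplaceAll(strings.Trim(gitlabPath, "/"), "/", "-")
}
```

With the configuration shown above, `group/subgroup/special-repo` would resolve to `my-custom-name`, while an unmapped path such as `group/other/repo` would fall through to the strategy and become `group-other-repo`.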

Name Validation

The service validates all generated names against GitHub's rules:

  • Length: 1-100 characters
  • Characters: Alphanumeric, hyphens, underscores, and dots only
  • Cannot start or end with ., -, or _
  • Cannot be reserved names (github, api, www, etc.)

If a name is invalid, the service automatically suggests a corrected version.
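As a rough sketch, the rules listed above translate into a validator plus a best-effort fixer. The names `ValidateName` and `SuggestName` are hypothetical, the reserved list contains only the three names mentioned above, and the fallback values `repo` and `-repo` are assumptions made for illustration.

```go
package main

import (
	"errors"
	"regexp"
	"strings"
)

var (
	allowedChars = regexp.MustCompile(`^[A-Za-z0-9._-]+$`)
	invalidRun   = regexp.MustCompile(`[^A-Za-z0-9._-]+`)
	// Only the reserved names listed in this document; the real list is longer.
	reservedNames = map[string]bool{"github": true, "api": true, "www": true}
)

// edgeChars are the characters a name may not start or end with.
const edgeChars = ".-_"

// ValidateName reports the first rule that name violates, or nil.
func ValidateName(name string) error {
	switch {
	case len(name) < 1 || len(name) > 100:
		return errors.New("name must be 1-100 characters long")
	case !allowedChars.MatchString(name):
		return errors.New("name may contain only letters, digits, '-', '_' and '.'")
	case strings.ContainsRune(edgeChars, rune(name[0])) ||
		strings.ContainsRune(edgeChars, rune(name[len(name)-1])):
		return errors.New("name must not start or end with '.', '-' or '_'")
	case reservedNames[strings.ToLower(name)]:
		return errors.New("name is reserved")
	}
	return nil
}

// SuggestName makes a best-effort correction of an invalid name.
func SuggestName(name string) string {
	s := invalidRun.ReplaceAllString(name, "-") // replace runs of invalid characters
	s = strings.Trim(s, edgeChars)              // drop invalid leading/trailing characters
	if len(s) > 100 {
		s = strings.Trim(s[:100], edgeChars) // s is ASCII here, so byte slicing is safe
	}
	if s == "" {
		return "repo" // assumed placeholder when nothing usable remains
	}
	if reservedNames[strings.ToLower(s)] {
		s += "-repo" // assumed suffix to move away from a reserved name
	}
	return s
}
```

A suggestion is only a candidate: it still has to pass the conflict check against existing GitHub repositories.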

3. Complete Migration Capabilities

What Can Be Migrated

✅ Code Migration

  • Full Git History: All commits, branches, and tags are preserved
  • Branches: All branches are migrated
  • Tags: All tags are migrated
  • Default Branch: The default branch is preserved

Implementation: Uses git clone --mirror to clone the entire repository with all refs, then pushes to GitHub using git push --mirror.
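A minimal sketch of that clone-and-push sequence using `os/exec` is shown below. It assumes `git` is on the PATH and that authentication is handled by the URLs or a credential helper; `mirrorCommands` and `MirrorRepository` are hypothetical names, not the service's functions.

```go
package main

import (
	"fmt"
	"os/exec"
)

// mirrorCommands builds the two git invocations: a mirror clone from GitLab
// into workDir, then a mirror push from workDir to GitHub.
func mirrorCommands(gitlabURL, githubURL, workDir string) []*exec.Cmd {
	return []*exec.Cmd{
		exec.Command("git", "clone", "--mirror", gitlabURL, workDir),
		exec.Command("git", "-C", workDir, "push", "--mirror", githubURL),
	}
}

// MirrorRepository runs the commands in order and stops at the first failure,
// including git's combined output in the returned error.
func MirrorRepository(gitlabURL, githubURL, workDir string) error {
	for _, cmd := range mirrorCommands(gitlabURL, githubURL, workDir) {
		if out, err := cmd.CombinedOutput(); err != nil {
			return fmt.Errorf("%v failed: %w: %s", cmd.Args, err, out)
		}
	}
	return nil
}
```

Separating command construction from execution keeps the sequence easy to inspect, for example to log the commands in dry-run mode without running them.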

✅ Issues Migration

  • All issues with titles, descriptions, and metadata
  • Issue labels (migrated separately)
  • Issue comments (preserved in issue body)
  • Issue creation date and author information

Implementation: Fetches all issues from GitLab API and creates corresponding issues in GitHub with metadata preserved in the issue body.
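Because the original author and creation date are kept in the issue body, the body has to be assembled from several GitLab fields. The sketch below shows one possible layout; the types `GitLabIssue` and `Comment`, the function `BuildIssueBody`, and the exact Markdown format are all assumptions for illustration, not the service's real output.

```go
package main

import (
	"fmt"
	"strings"
)

// Comment is a simplified view of a GitLab issue note.
type Comment struct {
	Author string
	Body   string
}

// GitLabIssue holds only the fields this sketch needs.
type GitLabIssue struct {
	Title       string
	Description string
	Author      string
	CreatedAt   string // timestamp string as received from GitLab
	Comments    []Comment
}

// BuildIssueBody prepends the original author and date to the description
// and appends each comment under a horizontal rule.
func BuildIssueBody(issue GitLabIssue) string {
	var b strings.Builder
	fmt.Fprintf(&b, "_Migrated from GitLab. Originally created by %s on %s._\n\n",
		issue.Author, issue.CreatedAt)
	b.WriteString(issue.Description)
	for _, c := range issue.Comments {
		fmt.Fprintf(&b, "\n\n---\n**Comment by %s:**\n\n%s", c.Author, c.Body)
	}
	return b.String()
}
```

Author names are written as plain text rather than @-mentions so that the migration does not notify unrelated GitHub users who happen to share a username.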

✅ Merge Requests → Pull Requests

  • All merge requests converted to pull requests
  • Source and target branches
  • Descriptions and metadata
  • Author information

Note: A pull request can only be created when both its source and target branches exist in the GitHub repository. Code migration pushes those branches first, so merge requests are migrated afterwards.

✅ Labels Migration

  • All labels with colors and descriptions
  • Label names are preserved
  • Colors are converted from GitLab format to GitHub format

Implementation: Fetches all labels and creates them in GitHub before migrating issues.
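The color conversion is small: GitLab reports hex label colors with a leading `#` (for example `#FF0000`), while the GitHub API expects six hex digits without it. The function below is a hedged sketch with a hypothetical name; it rejects anything that is not a six-digit hex value rather than guessing.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

var sixHexDigits = regexp.MustCompile(`^[0-9a-f]{6}$`)

// GitHubLabelColor converts a GitLab hex color such as "#FF0000" into the
// form GitHub expects ("ff0000"). Other inputs produce an error.
func GitHubLabelColor(gitlabColor string) (string, error) {
	c := strings.ToLower(strings.TrimPrefix(strings.TrimSpace(gitlabColor), "#"))
	if !sixHexDigits.MatchString(c) {
		return "", fmt.Errorf("unsupported label color %q", gitlabColor)
	}
	return c, nil
}
```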

✅ Milestones Migration

  • All milestones with titles and descriptions
  • Due dates (if set)
  • Milestone states

Implementation: Fetches all milestones and creates them in GitHub before migrating issues.

✅ Wikis Migration

  • Wiki pages are detected and listed
  • Full wiki migration requires manual steps (GitHub wikis are separate git repositories)

Current Implementation: Lists all wiki pages and logs them. Full wiki migration would require:

  1. Cloning the GitLab wiki repository
  2. Creating a GitHub wiki repository
  3. Pushing the wiki content

Future Enhancement: Full wiki migration can be implemented by cloning and pushing wiki repositories.
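Both GitLab and GitHub expose a project's wiki as a separate git repository whose URL is the project URL with a `.wiki.git` suffix, so the clone and push targets can be derived from the repository URLs. A minimal sketch, with `WikiCloneURL` as a hypothetical helper name:

```go
package main

import "strings"

// WikiCloneURL derives the wiki repository URL from a project URL by
// replacing an optional ".git" suffix with ".wiki.git".
func WikiCloneURL(repoURL string) string {
	base := strings.TrimSuffix(strings.TrimSuffix(repoURL, "/"), ".git")
	return base + ".wiki.git"
}
```

The resulting URLs could then be passed through the same clone-and-push steps used for the code repository.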

Configuration Options

You can control what gets migrated:

{
  "migration": {
    "migrate_code": true,
    "migrate_issues": true,
    "migrate_merge_requests": true,
    "migrate_wikis": true,
    "migrate_labels": true,
    "migrate_milestones": true,
    "migrate_branches": true,
    "migrate_tags": true,
    "preserve_history": true,
    "dry_run": false
  }
}
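For readers working in Go, the configuration above could be decoded into a struct like the following. The JSON keys come from this document; the Go type and function names (`Config`, `MigrationConfig`, `ParseConfig`) are hypothetical. In this sketch an omitted boolean decodes as false, which may differ from the service's real defaults.

```go
package main

import "encoding/json"

// MigrationConfig mirrors the "migration" object shown in this document.
type MigrationConfig struct {
	NamingStrategy       string            `json:"naming_strategy"`
	NameMappings         map[string]string `json:"name_mappings"`
	MigrateCode          bool              `json:"migrate_code"`
	MigrateIssues        bool              `json:"migrate_issues"`
	MigrateMergeRequests bool              `json:"migrate_merge_requests"`
	MigrateWikis         bool              `json:"migrate_wikis"`
	MigrateLabels        bool              `json:"migrate_labels"`
	MigrateMilestones    bool              `json:"migrate_milestones"`
	MigrateBranches      bool              `json:"migrate_branches"`
	MigrateTags          bool              `json:"migrate_tags"`
	PreserveHistory      bool              `json:"preserve_history"`
	DryRun               bool              `json:"dry_run"`
}

// Config is the top-level document.
type Config struct {
	Migration MigrationConfig `json:"migration"`
}

// ParseConfig decodes a JSON configuration string.
func ParseConfig(data string) (Config, error) {
	var cfg Config
	err := json.Unmarshal([]byte(data), &cfg)
	return cfg, err
}
```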

Migration Process Flow

  1. Discovery Phase

    • Lists all repositories in the GitLab group
    • Identifies nested repositories
    • Maps GitLab paths to GitHub names
  2. Validation Phase

    • Validates repository names
    • Checks for existing repositories in GitHub
    • Suggests fixes for invalid names
  3. Code Migration Phase

    • Clones repository with full history
    • Pushes to GitHub with all branches and tags
  4. Metadata Migration Phase (in order)

    • Migrates labels (required for issues)
    • Migrates milestones (required for issues)
    • Migrates issues (uses labels and milestones)
    • Migrates merge requests (requires branches)
    • Migrates wikis (if enabled)
  5. Verification Phase

    • Checks migration status
    • Reports results and errors
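The ordering inside the metadata phase matters because issues reference labels and milestones. One way to make that order explicit is a fixed list that is walked regardless of how the enabled steps are supplied. The sketch below is illustrative only; `metadataOrder`, `RunMetadataPhase`, and the step names are hypothetical.

```go
package main

import "fmt"

// metadataOrder is the dependency order described in the process flow.
var metadataOrder = []string{"labels", "milestones", "issues", "merge_requests", "wikis"}

// RunMetadataPhase runs each enabled step in dependency order and returns the
// steps that completed. It stops at the first step that fails.
func RunMetadataPhase(enabled map[string]bool, run func(step string) error) ([]string, error) {
	var done []string
	for _, step := range metadataOrder {
		if !enabled[step] {
			continue
		}
		if err := run(step); err != nil {
			return done, fmt.Errorf("%s: %w", step, err)
		}
		done = append(done, step)
	}
	return done, nil
}
```

Because the order comes from the fixed list rather than from the map, enabling `issues` and `labels` in any order still runs labels first.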

Error Handling

The service tracks errors per repository and reports them in several ways:

  • Per-Repository Errors: Each repository migration tracks its own errors
  • Warnings: Non-critical issues (like name changes) are logged as warnings
  • Detailed Logs: All operations are logged for troubleshooting
  • Status Reporting: Migration status is available via API
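A per-repository result record is one way to keep errors and warnings separate while still deriving a single status. The type below is a hypothetical sketch; the field names and status strings are assumptions, not the service's API.

```go
package main

// RepoResult collects the outcome of migrating a single repository.
type RepoResult struct {
	GitLabPath string
	GitHubName string
	Errors     []string
	Warnings   []string
}

// AddError records a failure that affects this repository only.
func (r *RepoResult) AddError(msg string) {
	r.Errors = append(r.Errors, msg)
}

// AddWarning records a non-critical issue, such as a changed name.
func (r *RepoResult) AddWarning(msg string) {
	r.Warnings = append(r.Warnings, msg)
}

// Status summarizes the result: any error means failure, warnings alone do not.
func (r *RepoResult) Status() string {
	switch {
	case len(r.Errors) > 0:
		return "failed"
	case len(r.Warnings) > 0:
		return "succeeded_with_warnings"
	default:
		return "succeeded"
	}
}
```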

Dry Run Mode

Test migrations without actually creating repositories:

{
  "migration": {
    "dry_run": true
  }
}

Or via API:

POST /api/v1/migration/repositories/:path/migrate
{
  "dry_run": true
}
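From Go, the API call above could be built with `net/http` as sketched below. The base URL is a placeholder, and URL-escaping the GitLab path (so that its slashes do not split the route) is an assumption about how the `:path` parameter is expected; `NewDryRunRequest` is a hypothetical helper.

```go
package main

import (
	"bytes"
	"encoding/json"
	"net/http"
	"net/url"
	"strings"
)

// NewDryRunRequest builds the POST request for a dry-run migration of one
// repository. It only constructs the request; it does not send it.
func NewDryRunRequest(baseURL, gitlabPath string) (*http.Request, error) {
	body, err := json.Marshal(map[string]bool{"dry_run": true})
	if err != nil {
		return nil, err
	}
	endpoint := strings.TrimRight(baseURL, "/") +
		"/api/v1/migration/repositories/" + url.PathEscape(gitlabPath) + "/migrate"
	req, err := http.NewRequest(http.MethodPost, endpoint, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}
```

The request can then be sent with `http.DefaultClient.Do(req)` or any configured client.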

Summary

The service provides a complete solution for migrating from GitLab to GitHub:

  1. Nested Repository Handling: Multiple naming strategies to convert nested paths to flat names
  2. Flexible Naming: Automatic name generation with manual override capability
  3. Complete Migration: Code, issues, PRs, labels, milestones, and wikis (with limitations)

All solutions are configurable and can be customized to fit your specific migration needs.