Alternatives Compared: Codeberg, Forgejo, Gogs, Launchpad and More

Part 2 of a three-part series on GitHub alternatives.

Anyone who looks at the hosting location first when switching platforms is searching by the wrong criterion. What matters more is who sets the defaults — and whether AI features are even part of the architecture. This second part sorts the serious alternatives along that line.

Part 1 described the trigger: default-on training use starting April 24, 2026 for individual Copilot accounts. The follow-up question is where to find the counter-model, a platform built in the other direction, ideally by architecture rather than by toggle. The answer depends less on geography and hosting provider than on a simple property: who decides what data flows outward?

Four Tiers by Control Over Defaults

A useful categorization is not primarily by hosting location but by control over processing defaults. The spectrum ranges from “you build it yourself, you determine everything” to “you agree to whatever the provider puts in front of you.”

On one end are the self-hosted OSS platforms: Forgejo, Gitea, Gogs, GitLab CE. The operator is also the one who sets the defaults; AI features either cannot be enabled at all or require deliberately adding external services. Maximum effort, maximum control.

One tier down are the nonprofit SaaS providers — Codeberg, Framagit, Disroot, 0xacab. Low effort, clear ToS position against monetizing user code, but limited feature set and storage. Sourcehut occupies a special position: commercial, but with an explicitly documented anti-AI stance and EU hosting in Amsterdam.

On the other end are the established platforms: GitHub, GitLab.com, Bitbucket, Azure DevOps. They are functionally mature, but their data policy is not under the customer’s control. Outside this ranking are two special cases discussed individually below: Launchpad with its distro focus and AWS CodeCommit as a lesson in platform risk.

Comparison Table: The Key Candidates

| Provider | Model | Operator / Hosting | Default-on for AI features? | CI integrated | GH Actions compatible | Actively developed |
|---|---|---|---|---|---|---|
| Codeberg | SaaS, nonprofit | Codeberg e.V., DE | No, enshrined in ToS | Forgejo Actions | Largely yes | Yes |
| Forgejo self-hosted | OSS, self-host | own infrastructure | Not present | Forgejo Actions | Largely yes | Yes |
| Gitea self-hosted | OSS, self-host | own infrastructure | Not present | Gitea Actions | Largely yes | Yes |
| Gogs self-hosted | OSS, self-host | own infrastructure | Not present | No, external (Drone) | Via third-party solutions | Limited |
| GitLab CE self-hosted | OSS, self-host | own infrastructure | Duo only in EE | GitLab CI | Conversion needed | Yes |
| Sourcehut | SaaS | sr.ht, NL | No, explicit anti-AI | builds.sr.ht | No, own YAML | Yes |
| Framagit | SaaS, nonprofit | Framasoft, FR | No | GitLab CI | No | Yes |
| Launchpad | SaaS | Canonical, UK | No, enshrined in ToS | Build farm (packages) | No | Yes, but distro focus |
| GitLab.com SaaS Europe | SaaS, commercial | GitLab Inc. (US), AWS-FRA | Duo opt-out, not opt-in | GitLab CI | No | Yes |
| GitHub EU | SaaS, commercial | Microsoft (US) | Multiple default-on toggles | GitHub Actions | Native | Yes |
| AWS CodeCommit | SaaS, commercial | Amazon (US) | No AI features | CodeBuild/Pipeline | No | No (no new customers since 07/2024) |

The table doesn’t produce a clear winner. But it does reveal two clusters: a sweet spot around Codeberg, Forgejo and GitLab CE, and a trailing group at the bottom right. In between sit the special cases that may still be the right choice for specific use cases.

Codeberg: The Most Obvious Entry Point Without Your Own Server

Codeberg is the only platform in this overview organized as a nonprofit association — under German law, based in Berlin, operated on servers in Germany. Membership fees fund operations, supplemented by donations. No investor money, no monetization of user data, no AI features in sight. The ToS explicitly exclude commercial use of user-generated content.

What distinguishes Codeberg from a mere hosting claim is a citable measure: since 2023, known AI crawlers — GPTBot, ClaudeBot, CCBot, Google-Extended — have been blocked server-side with HTTP 403. This is technically verifiable and provable in a dispute.
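
If you want to check such a claim yourself, a small script is enough. The following is a minimal sketch, not an official test: it sends one request per crawler name and records whether the server answers with 403. The function names and the bare user-agent strings are assumptions for illustration; real crawlers send longer strings, and Google-Extended in particular is a robots.txt token rather than a request user agent, so a result for it says little.

```python
from urllib import request, error

# User-agent names as listed in the text; real crawlers send longer strings.
AI_CRAWLER_AGENTS = ["GPTBot", "ClaudeBot", "CCBot", "Google-Extended"]


def status_for_agent(url, user_agent, opener=request.urlopen):
    """Return the HTTP status code the server sends for this User-Agent."""
    req = request.Request(url, headers={"User-Agent": user_agent})
    try:
        with opener(req, timeout=10) as resp:
            return resp.status
    except error.HTTPError as exc:
        # urllib raises on 4xx/5xx; the status code is what we want to record.
        return exc.code


def blocked_agents(url, agents=AI_CRAWLER_AGENTS, opener=request.urlopen):
    """Map each agent name to True if the server answered with 403."""
    return {agent: status_for_agent(url, agent, opener) == 403 for agent in agents}
```

Calling `blocked_agents("https://codeberg.org/")` would run the check against the live site; the `opener` parameter exists only so the logic can be exercised without network access.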

The limits are honestly communicated: 2 GB storage quota per repo, and larger projects must request more. Codeberg is not designed for enterprise needs such as SCIM, custom domains or ISO-27001-grade audit logs. Beyond roughly 30 developers or a three-digit repo count, self-hosting becomes more attractive anyway.

Forgejo Self-Hosted: The Technical Optimum

Forgejo emerged in 2022 as a hard fork of Gitea, driven by the Codeberg community. It ships as a single binary written in Go; the database can be SQLite, MySQL, MariaDB or PostgreSQL. On a 4 GB server it runs comfortably for 50 developers.

The real argument for Forgejo is Forgejo Actions: the built-in CI that runs GitHub workflows nearly unchanged. actions/checkout@v4, actions/setup-node@v4 and a large portion of Marketplace Actions work either directly or via drop-in mirrors on code.forgejo.org. For anyone migrating from the GitHub ecosystem, the friction is lowest here.
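
To make the mirror idea concrete, here is a minimal sketch that rewrites short `uses:` references in a workflow file to absolute URLs on code.forgejo.org. It assumes that the instance accepts absolute URLs in `uses:` and that a mirror exists under the same owner/name path; neither is checked by the script. The function name `pin_to_mirror` and the `owners` parameter are made up for this example.

```python
import re

# Assumption: mirrors live under this base URL with the same owner/name path.
MIRROR_BASE = "https://code.forgejo.org"

# Matches unquoted lines such as "      - uses: actions/checkout@v4".
USES_LINE = re.compile(r"^(\s*(?:-\s*)?uses:\s*)([\w.-]+/[\w./-]+@\S+)\s*$")


def pin_to_mirror(workflow_text, owners=("actions",)):
    """Rewrite short `uses:` references of the given owners to mirror URLs."""
    out = []
    for line in workflow_text.splitlines():
        match = USES_LINE.match(line)
        # Only touch references whose owner part is explicitly listed.
        if match and match.group(2).split("/", 1)[0] in owners:
            line = f"{match.group(1)}{MIRROR_BASE}/{match.group(2)}"
        out.append(line)
    result = "\n".join(out)
    # Keep the trailing newline if the input had one.
    return result + "\n" if workflow_text.endswith("\n") else result
```

References that are already absolute URLs, local paths such as `./local`, and actions from owners not listed in `owners` are left untouched, so running the function twice changes nothing further.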

A typical setup with Docker Compose:

services:
  forgejo:
    image: codeberg.org/forgejo/forgejo:8
    restart: unless-stopped
    environment:
      - FORGEJO__database__DB_TYPE=postgres
      - FORGEJO__database__HOST=db:5432
      - FORGEJO__database__NAME=forgejo
      - FORGEJO__database__USER=forgejo
      - FORGEJO__database__PASSWD=${DB_PASSWORD}
      - FORGEJO__server__ROOT_URL=https://git.example.de
      - FORGEJO__service__DISABLE_REGISTRATION=true
    volumes:
      - ./data:/data
    ports:
      - "3000:3000"
      - "2222:22"
    depends_on: [db]

  db:
    image: postgres:16
    restart: unless-stopped
    environment:
      - POSTGRES_DB=forgejo
      - POSTGRES_USER=forgejo
      - POSTGRES_PASSWORD=${DB_PASSWORD}
    volumes:
      - ./pgdata:/var/lib/postgresql/data

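  runner:
    image: code.forgejo.org/forgejo/runner:6
    restart: unless-stopped
    environment:
      # Assumption of this example: the runner image reads these variables
      # to register itself. Depending on the runner version, a manual
      # registration step may be needed instead; check the runner docs.
      - FORGEJO_INSTANCE_URL=https://git.example.de
      - FORGEJO_RUNNER_REGISTRATION_TOKEN=${RUNNER_TOKEN}
    volumes:
      - ./runner-data:/data
      # Gives jobs access to the host's Docker daemon; treat as root-equivalent.
      - /var/run/docker.sock:/var/run/docker.sock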

Three containers, a handful of environment variables — and the platform is running. What comes next belongs in Part 3.

Gogs: The Grandfather of This Lineage

Gogs is the original Go-based Git hosting software from which Gitea forked in 2016 — and Forgejo later forked from Gitea. Anyone looking at Gogs today should know two things.

First: Gogs is still maintained, but at a significantly lower pace of change than Gitea or Forgejo. Maintainer Unknwon (Jiahua Chen) last publicly updated the roadmap in spring 2024; security patches arrive, new features less so. Second: Gogs has no native CI. Anyone running Gogs pairs it with a separate solution — historically often Drone CI (which originally built on Gogs), today also Woodpecker CI as a Drone fork with a clear OSS orientation, or Jenkins.

This separation can be an advantage (loosely coupled architecture, free choice of CI) or a disadvantage (more components, more integration work). For the data protection focus of this series, Gogs is a respectable option for setups that want decoupled operation anyway or need a very resource-light installation. For migrations from GitHub with a large Actions footprint, Forgejo is the more pragmatic choice.

Launchpad: The Special Case from the Ubuntu Universe

Launchpad is Canonical’s platform, originally built for the Ubuntu distribution, hosted in the United Kingdom. It supports Bazaar — Canonical’s own VCS — and Git, has its own bug-tracking logic linked to the Debian BTS, a translations workflow, a mailing list engine, and above all the build farm: a massive cluster that builds packages for all Ubuntu architectures.

For general software projects, Launchpad is rarely the right choice. The workflow is deeply tailored to Ubuntu/Debian packaging, the UI looks like the early 2010s, and the build farm produces .deb packages — not arbitrary containers or test suites. For projects with a distro connection (hardware drivers, kernel modules, system daemons), Launchpad remains sensible, especially as the official bridge to PPAs (Personal Package Archives).

On the data protection front, Canonical sits outside the EU post-Brexit, but the United Kingdom is covered by its own adequacy decision from the EU Commission. No known AI features, no default-on training toggle, stable ToS over the years. In the default-off ranking, that puts it clearly in the upper third, but because of its specialized use case it is not a universal replacement.

AWS CodeCommit: The Lesson in Platform Risk

AWS CodeCommit deserves its own section here, though as a warning. In July 2024, AWS announced it would close the service to new customers. Existing customers can continue to use CodeCommit, but new AWS accounts no longer have access, and the service roadmap has been frozen. The stated reason: focus on “core priorities.” In plain language: AWS lost the market to GitHub and GitLab and is pulling back.

For the data protection discussion, CodeCommit has become irrelevant; for the strategic discussion, it is all the more valuable. It shows what happens when you concentrate code hosting with a provider that doesn’t consider the service core business: after roughly nine years of general availability (the service launched in 2015) and millions of repositories, the provider can say “no, thanks.” The same risk exists latently with other hyperscaler code services: Azure DevOps has been in maintenance mode for years, with clear steering pressure toward GitHub.

Anyone still using CodeCommit today faces migration pressure independent of data protection concerns. That’s a separate migration that should run in parallel with the one discussed here.

GitLab CE Self-Hosted: When the Organization Is Larger

For companies with a hundred or more developers, compliance requirements or complex permission models, GitLab CE becomes attractive. Functionally, the Community Edition offers everything available on gitlab.com in the Free tier — container registry, Pages, Wiki, Issues, MR workflows. Additional features missing from CE are provided by the Enterprise Edition.

This very separation is decisive for the default-off argument: GitLab Duo, the AI feature, is only available in EE and cannot be activated in CE. Anyone running CE has a hard guarantee against accidental AI integration — one that no one can soften with a click.

The resource requirements should not be underestimated: 4 vCPU and 8 GB RAM are the minimum, 16 GB the recommendation. Backup is non-trivial because repository, database, LFS, artifact and registry storage must be backed up consistently. The official gitlab-backup tool covers this — but should be restore-tested regularly.

Sourcehut: When Conviction Matters More Than Convenience

Sourcehut is the ideologically clearest choice. Drew DeVault has publicly called AI crawlers a “plague” and runs aggressive blocking. The platform is located in Amsterdam, costs from two euros per month, and the workflow runs entirely without JavaScript — patches are submitted via mailing list (git send-email), reviews likewise.

That’s also the biggest hurdle. Anyone socialized with pull-request workflows has to relearn. For internal corporate use with familiar tools, Sourcehut is therefore usually not an option; for highly committed OSS projects or solo developers, it’s a logical step.

Implementing a Machine-Readable TDM Reservation

Regardless of the chosen platform, companies should declare the TDM (text and data mining) reservation under § 44b German Copyright Act / Art. 4 DSM Directive in machine-readable form. The legally cleanest approach combines four layers, each serving a different retrieval path.

The first layer is robots.txt in the root of every web host:

User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: anthropic-ai
Disallow: /

User-agent: PerplexityBot
Disallow: /
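
Before deploying such a file, it is worth checking that it says what you think it says. The following minimal sketch feeds the rules above to Python's standard-library `urllib.robotparser` and reports which agents are denied. The function name `disallowed_agents` and the sample URL are assumptions for illustration; the check only covers the file's content, not whether a crawler actually honors it.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt rules from the text, as one string.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: anthropic-ai
Disallow: /

User-agent: PerplexityBot
Disallow: /
"""


def disallowed_agents(robots_txt, agents, url="https://git.example.de/org/repo"):
    """Map each agent name to True if robots.txt forbids it from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: not parser.can_fetch(agent, url) for agent in agents}
```

An agent without a matching entry, such as a regular browser, stays allowed because the file contains no wildcard rule.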

The second layer is ai.txt in the root, a semi-formal standard by Spawning AI:

User-Agent: *
Disallow: /

The third layer is an HTTP header on all repository URLs:

X-Robots-Tag: noai, noimageai

In Caddy as a reverse proxy:

git.example.de {
    reverse_proxy localhost:3000
    header X-Robots-Tag "noai, noimageai"
}
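
Whether the proxy really sends the header can be checked from the response headers. The sketch below only parses header pairs, for example as returned by `http.client.HTTPResponse.getheaders()`, and tests for the two directives used above. The function names are assumptions for this example, and the user-agent-prefixed form of X-Robots-Tag is not handled.

```python
def robots_tag_directives(headers):
    """Collect lower-cased directives from all X-Robots-Tag header values."""
    directives = set()
    for name, value in headers:
        if name.lower() == "x-robots-tag":
            # A single header value may carry several comma-separated directives.
            directives.update(
                part.strip().lower() for part in value.split(",") if part.strip()
            )
    return directives


def carries_ai_reservation(headers, required=("noai", "noimageai")):
    """True if every required directive appears in the X-Robots-Tag headers."""
    return set(required) <= robots_tag_directives(headers)
```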

The fourth layer is a license or README notice, which is the text primarily evaluated in legal terms:

## Reservation under § 44b German Copyright Act / Art. 4 DSM Directive

The rights holders of this repository expressly reserve the use of the
content published here for text and data mining pursuant to § 44b(3)
of the German Copyright Act (UrhG) and Art. 4(3) of Directive (EU)
2019/790. Use for training AI models, including large language models,
is prohibited without express written permission.

The combination of all four layers represents the best currently available standard. Whether AI providers comply with it is another question — but in a dispute, for example under an AI Act lawsuit from August 2026 onward, machine-readable conformity is the decisive evidence.

Which Solution for Which Scale?

The selection follows scale, not ideology. Up to roughly ten developers or for purely OSS projects, Codeberg is the obvious entry point: no self-hosting, nonprofit operator, clear default-off position.

For ten to a hundred developers in a commercial setting, Forgejo self-hosted pays off. Low operational overhead, good migration paths from GitHub and built-in CI with Actions compatibility make it the pragmatic path today. From a hundred developers or with hard compliance requirements, GitLab CE becomes attractive: higher operational overhead, but functionally at enterprise level and cleanly separable from AI features.

For special cases, different rules apply. Solo developers or small OSS projects with ideological clarity land at Sourcehut. Anyone with a distro or packaging connection adds Launchpad — as a complement, not a replacement. Anyone who prefers a decoupled architecture pairs Gogs with Woodpecker CI as a resource-light variant.
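
The scale rule from the previous paragraphs can be written down as a small decision function. This is only a sketch of the rough thresholds named above; the function name, the parameter names and the order in which the conditions are checked (compliance first, then OSS-only) are assumptions, and the special cases Sourcehut, Launchpad and Gogs are deliberately left out.

```python
def recommend_platform(developers, oss_only=False, hard_compliance=False):
    """Return the platform suggested by the rough scale thresholds."""
    # Hard compliance requirements or a hundred-plus developers: GitLab CE.
    if hard_compliance or developers >= 100:
        return "GitLab CE self-hosted"
    # Purely OSS projects or up to roughly ten developers: Codeberg.
    if oss_only or developers <= 10:
        return "Codeberg"
    # Everything in between, in a commercial setting: Forgejo.
    return "Forgejo self-hosted"
```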

From Comparison to Migration

With that, the provider landscape is mapped. Part 3 of this series turns operational: what does the concrete migration look like, especially for the hardest part, the CI pipeline? It includes examples for translating GitHub Actions to Forgejo Actions, a guide for setting up self-hosted runners, and a migration checklist you can tick off.


Translated with the help of Claude.
