`git://` protocol support
I have a project that uses one of your downstreams (antora). Due to current infrastructure limitations (gitolite + cgit), there are three ways to fetch a project:
- Using an ssh key (no usernames / passwords / oauth) - obviously not ok, since mirroring is meant to be allowed.
- “dumb” http (cgit) - something that is incompatible with current isomorphic-git.
- the `git://` protocol - which does not seem to be supported right now
The error when attempting 3 is `error: Content source uses an unsupported transport protocol: git://[...]`.
Please add support for the `git://` protocol, as its absence makes isomorphic-git impossible to use under the circumstances listed above, as well as similar ones (e.g. ssh + git-daemon only setups, etc).
Issue Analytics
- State:
- Created 5 years ago
- Reactions:2
- Comments:18 (11 by maintainers)
Top GitHub Comments
For github, I highly suspect it’s 0%, but there are other git hosts out there besides github/gitlab 😃 For my repo, it doesn’t make a significant difference (based on below calculations). Here’s a (relatively short) analysis of git clone performance:
Abstract
It is suspected that due to its lack of ability to prepare custom packs, the “dumb http” git protocol will perform worse than `git-daemon`, as well as be less reliable. We test the former and discuss the latter. The data indicates that the flat performance hit due to using git is greater than the one for dumb http (though both are negligible), but that git scales significantly better for larger repositories.
Methodology
We take two repositories: Alpine’s user-handbook repository, and the linux kernel.
We then create a directory into which we locally cache both of them:
We then enable “dumb http” support:
Then we run a static http server and git-daemon (note: the following two commands are run in separate terminals):
Each repository will be cloned 3 times into `tmpfs`, and each clone will be timed, using the following commands:
This approach means that we can eliminate variables such as read/write speed (reads are from cache, writes are to RAM), network speed (everything happens over lo) and similar - allowing us to measure specifically protocol overhead.
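A minimal sketch of how such repeated, timed runs might be scripted, assuming Python instead of shell. The helper names `time_command` and `summarize` are hypothetical, and the command in the demo is a placeholder that does no cloning; it only shows the shape of the measurement (three runs, then mean and standard deviation).

```python
import statistics
import subprocess
import sys
import time


def time_command(cmd, runs=3):
    """Run cmd `runs` times; return each run's wall-clock duration in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=True makes a failed run raise instead of being timed silently.
        subprocess.run(
            cmd,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Return (mean, sample standard deviation) of the durations."""
    return statistics.mean(durations), statistics.stdev(durations)


if __name__ == "__main__":
    # Placeholder command so the sketch runs anywhere. For the benchmark this
    # would be a `git clone` of the repository under test into a tmpfs
    # directory, with the target directory removed between runs.
    samples = time_command([sys.executable, "-c", "pass"], runs=3)
    mean, stdev = summarize(samples)
    print(f"mean={mean:.3f}s stdev={stdev:.3f}s")
```

Measuring with `time.perf_counter()` includes process start-up time, so very short commands are dominated by that fixed cost.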
Data
Small Repository with Git
Small Repository with Dumb HTTP
Large Repository with Git
Large Repository with Dumb HTTP
Mean Summary and Comparisons
Analysis and Conclusion
The data shows that “git” has a significant initial overhead cost, but that it scales significantly better than “dumb http”. However, the scaling, while linear, appears to be lower than 1:1 - meaning that as repositories get larger and larger, this becomes less important, though this may be due to io bottlenecking (even on tmpfs). It is also notable that git has reliably smaller standard deviations, suggesting it is more consistent.
It is doubtful that repositories will get much larger than 2gb, so we can consider http overhead over git to be at least 25%. While git is shown as slower for smaller repositories, that difference is mostly negligible. As such, the `git-daemon`-based protocol should be preferred over “dumb” http for read operations.
Additional Notes Regarding Reliability
Observing the behavior of the HTTP server, we can see that each object is downloaded separately using HTTP GET. This would normally not be a problem, but because of the nature of large repositories, it downloads the pack like this - something that will not deal well with packet loss. Whether or not this applies to the git protocol is unknown, and should be investigated separately.
Issues
Unfortunately, it is not possible to calculate initial per-protocol overhead, nor graph the increase in time based on commit-byte. This is because git does not offer a protocol-less cloning mechanism (my understanding is that the file-based one is still greater than cp). If one wanted to make this more rigorous, one would write an extension to git-clone that would only perform cp(1), and use that as the control, as well as making a single-commit single-empty-file repository to get a reliable 0. With that, it would become possible to calculate and plot the actual protocol overhead / commit-byte. There’s also a lack of sample data - this should be repeated with a statistically significant, randomly selected set of repositories (but I’m lazy and time constrained).
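If such a control measurement existed, the overhead per byte could be computed roughly as follows. This is only a sketch: `overhead_per_byte` is a hypothetical helper, and it assumes the control time (a plain `cp`-style copy) and the repository size in bytes have been measured separately.

```python
def overhead_per_byte(clone_seconds, control_seconds, repo_bytes):
    """Protocol overhead per byte: time beyond the control copy, divided by size."""
    if repo_bytes <= 0:
        raise ValueError("repo_bytes must be positive")
    return (clone_seconds - control_seconds) / repo_bytes
```

Plotting this value against repository size for each protocol would show whether the overhead grows, shrinks, or stays flat as repositories get larger.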
Implementation Notes
All of the above tests were run on an i5-5200U laptop with 8GB of RAM that was otherwise idle. If attempting to reproduce, I recommend adjusting the system fd limit and increasing filetree caching aggressiveness, as well as minimizing swappiness, to avoid hitting unnecessary IO. If one happens to have additional RAM, everything should be in tmpfs, and the `rm -rf` step can be skipped (don’t forget to increment indexes in that case).
`git://` has nothing in common with ssh. The git protocol is implemented by `git-daemon(1)`, which listens on a separate TCP port and rewrites the path based on the arguments provided to it (e.g. `--base-path` and co, see the example benchmark above for how to invoke it). The ssh protocol is implemented similarly to local files (e.g. `git clone /srv/something`), with the path being what goes after the : (in `user@host:path` format) or after the first non-protocol / (in `ssh://host/path` format).
The `git://` protocol provides no authentication whatsoever, and intentionally so.
For more details on `git://`, please see https://git-scm.com/docs/git-daemon.
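To illustrate how little the `git://` transport adds on top of a TCP connection, here is a minimal sketch of the first message a client sends, assuming the request format described in Git's pack-protocol documentation: a single pkt-line naming the service, the path, and the host. The function names are hypothetical and this is not isomorphic-git code.

```python
MAX_PKT_LEN = 65520  # maximum total pkt-line length, including the 4-byte prefix


def pkt_line(payload: bytes) -> bytes:
    """Prefix payload with its total length as four lowercase hex digits."""
    length = len(payload) + 4  # the length counts the prefix itself
    if length > MAX_PKT_LEN:
        raise ValueError("payload too long for a single pkt-line")
    return f"{length:04x}".encode("ascii") + payload


def upload_pack_request(path: str, host: str) -> bytes:
    """Build the initial fetch request a client sends to git-daemon."""
    # The path and the host parameter are each terminated by a NUL byte.
    payload = f"git-upload-pack {path}\0host={host}\0".encode("utf-8")
    return pkt_line(payload)


if __name__ == "__main__":
    # The client would write these bytes to a TCP socket connected to the
    # daemon (default port 9418) and then read the ref advertisement.
    print(upload_pack_request("/project.git", "myserver.com"))
```

Nothing in the request carries credentials, which matches the point above that the protocol provides no authentication.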