[PATCH scm.sr.ht] Add API to list all public repos
This is especially useful for crawlers and archivers, and is essential to
adding support for sr.ht to https://www.softwareheritage.org/

The endpoint uses a query limit of 100 per page instead of 11 like the other
APIs, to lower the number of requests required to get the whole list.
---
 scmsrht/blueprints/api.py | 23 ++++++++++++++++++++++-
1 file changed, 22 insertions(+), 1 deletion(-)
diff --git a/scmsrht/blueprints/api.py b/scmsrht/blueprints/api.py
index d24c02e..4228205 100644
--- a/scmsrht/blueprints/api.py
+++ b/scmsrht/blueprints/api.py
@@ -14,7 +14,8 @@ repo_json = lambda r: {
     "description": r.description,
     "created": r.created,
     "updated": r.updated,
-    "visibility": r.visibility.value
+    "visibility": r.visibility.value,
+    "clone": r.owner.canonical_name + "/" + r.name
 }

 wh_json = lambda wh: {
@@ -57,6 +58,26 @@ def repos_POST(oauth_token):
         return valid.response
     return repo_json(repo)

+@api.route("/api/repos/all")
+def repos_all():
+    start = request.args.get('start') or -1
+    Repository = current_app.Repository
+    repos = (Repository.query
+        .filter(Repository.visibility == RepoVisibility.public)
+    )
+    if start != -1:
+        repos = repos.filter(Repository.id <= start)
+    repos = repos.order_by(Repository.id.desc()).limit(100).all()
+    if len(repos) != 100:
+        next_id = -1
+    else:
+        next_id = repos[-1].id
+        repos = repos[:99]
+    return {
+        "next": next_id,
+        "results": [repo_json(r) for r in repos]
+    }
+
 @api.route("/api/repos/~<owner>")
 def repos_username_GET(owner):
     User = current_app.User
--
2.11.0
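
The cursor scheme in the patch above is easy to misread, so here is a minimal
sketch of it, not the sr.ht implementation. It assumes an in-memory list of
integer IDs in place of the Repository table, and the helper names
repos_all_page and crawl are hypothetical. It shows that a query limit of 100
yields 99 results per full page, with the 100th row's ID returned as "next",
and how a client would follow "next" until it is -1.

```python
from typing import Callable, Dict, Iterator, List

PAGE_LIMIT = 100  # rows fetched per query, matching .limit(100) in the patch


def repos_all_page(public_ids: List[int], start: int = -1) -> Dict:
    """Hypothetical in-memory stand-in for the /api/repos/all handler."""
    ids = sorted(public_ids, reverse=True)      # order_by(Repository.id.desc())
    if start != -1:
        ids = [i for i in ids if i <= start]    # filter(Repository.id <= start)
    ids = ids[:PAGE_LIMIT]                      # limit(100)
    if len(ids) != PAGE_LIMIT:
        next_id = -1                            # short page: nothing left
    else:
        next_id = ids[-1]                       # 100th row becomes the cursor
        ids = ids[:PAGE_LIMIT - 1]              # only 99 rows are returned
    return {"next": next_id, "results": ids}


def crawl(fetch_page: Callable[[int], Dict]) -> Iterator[int]:
    """Follow the "next" cursor until the endpoint reports -1."""
    start = -1
    while True:
        page = fetch_page(start)
        yield from page["results"]
        if page["next"] == -1:
            return
        start = page["next"]


if __name__ == "__main__":
    ids = list(range(1, 251))  # made-up IDs, for illustration only
    seen = list(crawl(lambda s: repos_all_page(ids, s)))
    print(len(seen), len(set(seen)))
```

Because the row used as the cursor is held back from the current page and the
next query filters on id <= start, each ID appears exactly once across pages.
A real client would pass the cursor as the start query parameter on
/api/repos/all and would receive repo_json objects rather than bare IDs.
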
Gonna have to reject this patch for the time being. The lack of
discoverability on git.sr.ht is a feature, not a bug. In the future
there will be a central project management hub (at the top-level sr.ht
domain), which will have indexing and browsing. This feature will be
more appropriate there.

For the time being I suggest using external methods (spidering the
broader web) to find repos to archive, or archiving repos on a
by-request basis.