~rjarry/dlrepo

7 3

[PATCH dlrepo 1/2] job: add internal flag

Details
Message ID
<20220510083058.1805-1-julien.floret@6wind.com>
DKIM signature
missing
Download raw message
Patch: +50 -28
When releasing a tag, all of its jobs are published to the remote
server.

A tag may contain some jobs that are only used for internal testing and
that are never meant to be published to a remote server. Today though,
when a tag is released, all of its jobs are published.

To save space on the remote server, we add an "internal" job flag.
A job flagged "internal" is never published to a remote server.

Note that it doesn't have the same meaning as the format's "internal"
flag. Indeed, the latter is purely informative: it does not prevent the
format to be published when the job containing it is published.

Signed-off-by: Julien Floret <julien.floret@6wind.com>
---
 dlrepo-cli                 | 17 +++++++++++++----
 dlrepo/fs/fmt.py           | 13 -------------
 dlrepo/fs/tag.py           |  2 ++
 dlrepo/fs/util.py          | 13 +++++++++++++
 dlrepo/views/job.py        | 17 ++++++++++-------
 docs/dlrepo-api.7.scdoc    |  7 +++++--
 docs/dlrepo-cli.1.scdoc    |  7 +++++--
 docs/dlrepo-layout.7.scdoc |  2 ++
 8 files changed, 50 insertions(+), 28 deletions(-)

diff --git a/dlrepo-cli b/dlrepo-cli
index 476d3350ad29..9be020c07724 100755
--- a/dlrepo-cli
+++ b/dlrepo-cli
@@ -391,7 +391,7 @@ def release(args):
    Arg("branch", metavar="BRANCH", help="the branch name"),
    Arg("tag", metavar="TAG", help="the tag name"),
    Arg("job", metavar="JOB", help="the job name"),
    Arg("format", metavar="FORMAT", help="the artifact format"),
    Arg("format", metavar="FORMAT", nargs="?", help="the artifact format"),
    Arg(
        "-u",
        "--unset",
@@ -401,11 +401,20 @@ def release(args):
)
def internal(args):
    """
    Set or unset the 'internal' status on an artifact format.
    Set or unset the 'internal' status on a job or an artifact format. An
    internal job is never published to a remote server. On the other hand, the
    'internal' flag on an artifact format is purely informative: it does not
    prevent the format to be published.
    """
    client = HttpClient(args.url)
    url = os.path.join("branches", args.branch, args.tag, args.job, args.format) + "/"
    client.put(url, {"artifact_format": {"internal": not args.unset}})
    if args.format:
        url = (
            os.path.join("branches", args.branch, args.tag, args.job, args.format) + "/"
        )
        client.put(url, {"artifact_format": {"internal": not args.unset}})
    else:
        url = os.path.join("branches", args.branch, args.tag, args.job) + "/"
        client.put(url, {"job": {"internal": not args.unset}})


# --------------------------------------------------------------------------------------
diff --git a/dlrepo/fs/fmt.py b/dlrepo/fs/fmt.py
index a58602198d0b..07ac3c8d4816 100644
--- a/dlrepo/fs/fmt.py
+++ b/dlrepo/fs/fmt.py
@@ -126,19 +126,6 @@ class ArtifactFormat(SubDir):
        digests[relpath] = digest
        self._digest_path().write_text(json.dumps(digests))

    def _internal_path(self) -> Path:
        return self._path / ".internal"

    def is_internal(self) -> bool:
        return self._internal_path().is_file()

    def set_internal(self, internal: bool):
        path = self._internal_path()
        if internal:
            path.touch()
        elif path.is_file():
            path.unlink()

    def _dirty_path(self) -> Path:
        return self._path / ".dirty"

diff --git a/dlrepo/fs/tag.py b/dlrepo/fs/tag.py
index 6ca4f88085f4..0638643a6054 100644
--- a/dlrepo/fs/tag.py
+++ b/dlrepo/fs/tag.py
@@ -132,6 +132,8 @@ class Tag(SubDir):
    async def _publish(self, sess: aiohttp.ClientSession, semaphore: asyncio.Semaphore):
        loop = asyncio.get_running_loop()
        for job in self.get_jobs():
            if job.is_internal():
                continue
            self._publish_status_path().write_text(f"uploading {job.name}\n")
            tasks = []
            for fmt in job.get_formats():
diff --git a/dlrepo/fs/util.py b/dlrepo/fs/util.py
index 07cceb9976e8..dc61b083cfc5 100644
--- a/dlrepo/fs/util.py
+++ b/dlrepo/fs/util.py
@@ -67,6 +67,19 @@ class SubDir:
    def path(self) -> Path:
        return self._path

    def _internal_path(self) -> Path:
        return self._path / ".internal"

    def is_internal(self) -> bool:
        return self._internal_path().is_file()

    def set_internal(self, internal: bool):
        path = self._internal_path()
        if internal:
            path.touch()
        elif path.is_file():
            path.unlink()


# --------------------------------------------------------------------------------------
HASH_ALGOS = "|".join(hashlib.algorithms_guaranteed)
diff --git a/dlrepo/views/job.py b/dlrepo/views/job.py
index 49fbbb3bc97f..16923c65e2b0 100644
--- a/dlrepo/views/job.py
+++ b/dlrepo/views/job.py
@@ -67,6 +67,7 @@ class JobView(JobArchiveView):
            raise web.HTTPFound(job.url())
        html = "html" in self.request.headers.get("Accept", "json")
        data = {"job": job.get_metadata()}
        data["job"]["internal"] = job.is_internal()
        formats = []
        for f in job.get_formats():
            fmt_url = f.url()
@@ -95,16 +96,18 @@ class JobView(JobArchiveView):
        return web.json_response(data)

    async def put(self):
        job = self._get_job()
        try:
            locked = (await self.json_body())["job"]["locked"]
            if not isinstance(locked, bool):
                raise TypeError()
            data = (await self.json_body())["job"]
        except (TypeError, KeyError) as e:
            raise web.HTTPBadRequest(reason="Invalid parameters") from e
        try:
            self._get_job().set_locked(locked)
        except FileNotFoundError as e:
            raise web.HTTPNotFound() from e
        if isinstance(data.get("locked"), bool):
            try:
                job.set_locked(data["locked"])
            except FileNotFoundError as e:
                raise web.HTTPNotFound() from e
        if isinstance(data.get("internal"), bool):
            job.set_internal(data["internal"])
        return web.Response()

    async def patch(self):
diff --git a/docs/dlrepo-api.7.scdoc b/docs/dlrepo-api.7.scdoc
index 79db971036e3..5ac9301852bd 100644
--- a/docs/dlrepo-api.7.scdoc
+++ b/docs/dlrepo-api.7.scdoc
@@ -312,6 +312,7 @@ Get metadata and artifact formats for the specified _{job}_.
		"job": {
			"name": "foobaz-x86_64-postgresql",
			"locked": true,
			"internal": false,
			"product": "foobaz",
			"product_variant": "x86_64-postgresql",
			"product_branch": "3.2",
@@ -347,8 +348,9 @@ the archive contents after extraction.
## PUT /branches/{branch}/{tag}/{job}/
## PUT /~{user}/branches/{branch}/{tag}/{job}/

Lock or unlock the specified _{job}_. A locked job can no longer receive new
artifact uploads nor metadata change.
Change the internal and/or locked status of a _{job}_. An internal job can never
be published to another server. A locked job can no longer receive new artifact
uploads nor metadata change.

*Access:*
	_rw_
@@ -356,6 +358,7 @@ artifact uploads nor metadata change.
	```
	{
		"job": {
			"internal": false,
			"locked": true
		}
	}
diff --git a/docs/dlrepo-cli.1.scdoc b/docs/dlrepo-cli.1.scdoc
index dd383a1eba21..5ff7211b2aa6 100644
--- a/docs/dlrepo-cli.1.scdoc
+++ b/docs/dlrepo-cli.1.scdoc
@@ -188,10 +188,13 @@ _TAG_

## dlrepo-cli internal [-u] BRANCH TAG JOB FORMAT

Set or unset the _internal_ status on an artifact format.
Set or unset the _internal_ status on a job or an artifact format. An internal
job is never published to a remote server. On the other hand, the 'internal'
flag on an artifact format is purely informative: it does not prevent the format
to be published.

*-u*, *--unset*
	Unset the _internal_ status from the format instead of setting it.
	Unset the _internal_ status instead of setting it.

_BRANCH_
	The branch name.
diff --git a/docs/dlrepo-layout.7.scdoc b/docs/dlrepo-layout.7.scdoc
index 2e8d993588a3..8f80749f1f1e 100644
--- a/docs/dlrepo-layout.7.scdoc
+++ b/docs/dlrepo-layout.7.scdoc
@@ -120,6 +120,8 @@ _.job_ (optional)
	Symbolic link to a _{version}_ in the _products/_ tree. This is only
	present if the _product_, _version_, _product_branch_ and
	_product_variant_ metadata have been set for this job.
_.internal_ (optional)
	Empty file created when the job is marked as "internal".

## branches/{branch}/{tag}/{job}/{format}/

-- 
2.30.2

[PATCH dlrepo 2/2] tests: check internal jobs are not published

Details
Message ID
<20220510083058.1805-2-julien.floret@6wind.com>
In-Reply-To
<20220510083058.1805-1-julien.floret@6wind.com> (view parent)
DKIM signature
missing
Download raw message
The "internal_job" job is flagged "internal". Test that it is not
published to the remote server.

Signed-off-by: Julien Floret <julien.floret@6wind.com>
---
 .../branches/branch/tag/internal_job/.internal               | 0
 .../branches/branch/tag/internal_job/fmt1/.digests           | 1 +
 .../branches/branch/tag/internal_job/fmt1/a                  | 1 +
 .../branches/branch/tag/internal_job/fmt1/b                  | 1 +
 tests/test_publish.py                                        | 5 +++++
 5 files changed, 8 insertions(+)
 create mode 100644 tests/publish_reference/branches/branch/tag/internal_job/.internal
 create mode 100644 tests/publish_reference/branches/branch/tag/internal_job/fmt1/.digests
 create mode 100644 tests/publish_reference/branches/branch/tag/internal_job/fmt1/a
 create mode 100644 tests/publish_reference/branches/branch/tag/internal_job/fmt1/b

diff --git a/tests/publish_reference/branches/branch/tag/internal_job/.internal b/tests/publish_reference/branches/branch/tag/internal_job/.internal
new file mode 100644
index 000000000000..e69de29bb2d1
diff --git a/tests/publish_reference/branches/branch/tag/internal_job/fmt1/.digests b/tests/publish_reference/branches/branch/tag/internal_job/fmt1/.digests
new file mode 100644
index 000000000000..7ae13f34b9e3
--- /dev/null
+++ b/tests/publish_reference/branches/branch/tag/internal_job/fmt1/.digests
@@ -0,0 +1 @@
+{"b": "sha256:0263829989b6fd954f72baaf2fc64bc2e2f01d692d4de72986ea808f6e99813f", "a": "sha256:87428fc522803d31065e7bce3cf03fe475096631e5e07bbd7a0fde60c4cf25c7"}
\ No newline at end of file
diff --git a/tests/publish_reference/branches/branch/tag/internal_job/fmt1/a b/tests/publish_reference/branches/branch/tag/internal_job/fmt1/a
new file mode 100644
index 000000000000..78981922613b
--- /dev/null
+++ b/tests/publish_reference/branches/branch/tag/internal_job/fmt1/a
@@ -0,0 +1 @@
+a
diff --git a/tests/publish_reference/branches/branch/tag/internal_job/fmt1/b b/tests/publish_reference/branches/branch/tag/internal_job/fmt1/b
new file mode 100644
index 000000000000..61780798228d
--- /dev/null
+++ b/tests/publish_reference/branches/branch/tag/internal_job/fmt1/b
@@ -0,0 +1 @@
+b
diff --git a/tests/test_publish.py b/tests/test_publish.py
index cceb07c97695..7c9f2f572749 100644
--- a/tests/test_publish.py
+++ b/tests/test_publish.py
@@ -83,3 +83,8 @@ async def test_publish(dlrepo_servers):  # pylint: disable=redefined-outer-name
                     assert resp.status == 200
                     data_pub = await resp.text()
                     assert data_pub == data
+            sha_url = "/branches/branch/tag/internal_job/fmt1.sha256"
+            resp = await sess.get(sha_url)
+            assert resp.status == 200
+            resp = await pub_sess.get(sha_url)
+            assert resp.status == 404
-- 
2.30.2
Olivier Matz <zer0@droids-corp.org>
Details
Message ID
<YnqRxpay33+oyeny@platinum>
In-Reply-To
<20220510083058.1805-1-julien.floret@6wind.com> (view parent)
DKIM signature
missing
Download raw message
Hi Julien,

On Tue, May 10, 2022 at 10:30:57AM +0200, Julien Floret wrote:
> When releasing a tag, all of its jobs are published to the remote
> server.
> 
> A tag may contain some jobs that are only used for internal testing and
> that are never meant to be published to a remote server. Today though,
> when a tag is released, all of its jobs are published.
> 
> To save space on the remote server, we add an "internal" job flag.
> A job flagged "internal" is never published to a remote server.
> 
> Note that it doesn't have the same meaning as the format's "internal"
> flag. Indeed, the latter is purely informative: it does not prevent the
> format to be published when the job containing it is published.

Is it a good idea to have two "internal" flags that have different
meaning? If the "internal" flag on format is only informative, what
about replacing it by a sort of tag system? The tag name and its value
would be arbitrary, only useful for the user.


> (...)
>
> --- a/dlrepo/views/job.py
> +++ b/dlrepo/views/job.py
> @@ -67,6 +67,7 @@ class JobView(JobArchiveView):
>              raise web.HTTPFound(job.url())
>          html = "html" in self.request.headers.get("Accept", "json")
>          data = {"job": job.get_metadata()}
> +        data["job"]["internal"] = job.is_internal()
>          formats = []
>          for f in job.get_formats():
>              fmt_url = f.url()
> @@ -95,16 +96,18 @@ class JobView(JobArchiveView):
>          return web.json_response(data)
>  
>      async def put(self):
> +        job = self._get_job()
>          try:
> -            locked = (await self.json_body())["job"]["locked"]
> -            if not isinstance(locked, bool):
> -                raise TypeError()

I think this case is not tested anymore after your patch.

> +            data = (await self.json_body())["job"]
>          except (TypeError, KeyError) as e:
>              raise web.HTTPBadRequest(reason="Invalid parameters") from e
> -        try:
> -            self._get_job().set_locked(locked)
> -        except FileNotFoundError as e:
> -            raise web.HTTPNotFound() from e
> +        if isinstance(data.get("locked"), bool):
> +            try:
> +                job.set_locked(data["locked"])
> +            except FileNotFoundError as e:
> +                raise web.HTTPNotFound() from e
> +        if isinstance(data.get("internal"), bool):
> +            job.set_internal(data["internal"])
>          return web.Response()
>  
>      async def patch(self):
Details
Message ID
<CAHfJR7irs6iZpF3rYFoDv2EZ75wdnYUXqA8s=JFV=s_=0f=rtA@mail.gmail.com>
In-Reply-To
<YnqRxpay33+oyeny@platinum> (view parent)
DKIM signature
pass
Download raw message
Le mar. 10 mai 2022 à 18:24, Olivier Matz <zer0@droids-corp.org> a écrit :
>
> Hi Julien,
>
> On Tue, May 10, 2022 at 10:30:57AM +0200, Julien Floret wrote:
> > When releasing a tag, all of its jobs are published to the remote
> > server.
> >
> > A tag may contain some jobs that are only used for internal testing and
> > that are never meant to be published to a remote server. Today though,
> > when a tag is released, all of its jobs are published.
> >
> > To save space on the remote server, we add an "internal" job flag.
> > A job flagged "internal" is never published to a remote server.
> >
> > Note that it doesn't have the same meaning as the format's "internal"
> > flag. Indeed, the latter is purely informative: it does not prevent the
> > format to be published when the job containing it is published.
>
> Is it a good idea to have two "internal" flags that have different
> meaning? If the "internal" flag on format is only informative, what
> about replacing it by a sort of tag system? The tag name and its value
> would be arbitrary, only useful for the user.
>

Why not, I don't have a strong opinion on this. We already have the
"locked" flag which has a slightly different meaning whether it is on
a job or on a tag (a locked job can be deleted, a locked tag  cannot).

> > (...)
> >
> > --- a/dlrepo/views/job.py
> > +++ b/dlrepo/views/job.py
> > @@ -67,6 +67,7 @@ class JobView(JobArchiveView):
> >              raise web.HTTPFound(job.url())
> >          html = "html" in self.request.headers.get("Accept", "json")
> >          data = {"job": job.get_metadata()}
> > +        data["job"]["internal"] = job.is_internal()
> >          formats = []
> >          for f in job.get_formats():
> >              fmt_url = f.url()
> > @@ -95,16 +96,18 @@ class JobView(JobArchiveView):
> >          return web.json_response(data)
> >
> >      async def put(self):
> > +        job = self._get_job()
> >          try:
> > -            locked = (await self.json_body())["job"]["locked"]
> > -            if not isinstance(locked, bool):
> > -                raise TypeError()
>
> I think this case is not tested anymore after your patch.

Right. I wanted to keep the code as simple as possible, so I did like
in TagView.put(): parameters that don't have the right type are
silently ignored. I can add checks if you prefer.

> > +            data = (await self.json_body())["job"]
> >          except (TypeError, KeyError) as e:
> >              raise web.HTTPBadRequest(reason="Invalid parameters") from e
> > -        try:
> > -            self._get_job().set_locked(locked)
> > -        except FileNotFoundError as e:
> > -            raise web.HTTPNotFound() from e
> > +        if isinstance(data.get("locked"), bool):
> > +            try:
> > +                job.set_locked(data["locked"])
> > +            except FileNotFoundError as e:
> > +                raise web.HTTPNotFound() from e
> > +        if isinstance(data.get("internal"), bool):
> > +            job.set_internal(data["internal"])
> >          return web.Response()
> >
> >      async def patch(self):
Details
Message ID
<CJWBNBTZCAUF.1YE9YBSEG8QJ8@marty>
In-Reply-To
<CAHfJR7irs6iZpF3rYFoDv2EZ75wdnYUXqA8s=JFV=s_=0f=rtA@mail.gmail.com> (view parent)
DKIM signature
missing
Download raw message
Julien Floret, May 10, 2022 at 18:57:
> Le mar. 10 mai 2022 à 18:24, Olivier Matz <zer0@droids-corp.org> a écrit :
> > Is it a good idea to have two "internal" flags that have different
> > meaning? If the "internal" flag on format is only informative, what
> > about replacing it by a sort of tag system? The tag name and its value
> > would be arbitrary, only useful for the user.
>
> Why not, I don't have a strong opinion on this. We already have the
> "locked" flag which has a slightly different meaning whether it is on
> a job or on a tag (a locked job can be deleted, a locked tag  cannot).

I agree that this is confusing. I would not mind a tag/label system. The
"internal" bool is only used in the html UI to display the artifact
format in a different style. This would mean translating these tags as
CSS classes here:

https://git.sr.ht/~rjarry/dlrepo/tree/v0.22/item/dlrepo/templates/job.html#L20-23
https://git.sr.ht/~rjarry/dlrepo/tree/v0.22/item/dlrepo/templates/product_version.html#L17-20

And describing in the docs how to style these classes. Also, this:

  title="internal format (not for release)"

Should be changed to something like:

  title="tags: {{fmt.tags}}"

One thing to keep in mind is the upgrade of an existing repo with this
new tag system. Should this be done automatically? I don't like that
very much.

There could be a procedure described somewhere to remove .internal files
from format dirs and put a .tags file with the "internal" tag instead.

What do you think?
Details
Message ID
<CAHfJR7jThewVUpdOOF=HBAO5OWy8=+7V=6r60PvC-oGbbpHi3w@mail.gmail.com>
In-Reply-To
<CJWBNBTZCAUF.1YE9YBSEG8QJ8@marty> (view parent)
DKIM signature
pass
Download raw message
Le mar. 10 mai 2022 à 21:17, Robin Jarry <robin@jarry.cc> a écrit :
>
> Julien Floret, May 10, 2022 at 18:57:
> > Le mar. 10 mai 2022 à 18:24, Olivier Matz <zer0@droids-corp.org> a écrit :
> > > Is it a good idea to have two "internal" flags that have different
> > > meaning? If the "internal" flag on format is only informative, what
> > > about replacing it by a sort of tag system? The tag name and its value
> > > would be arbitrary, only useful for the user.
> >
> > Why not, I don't have a strong opinion on this. We already have the
> > "locked" flag which has a slightly different meaning whether it is on
> > a job or on a tag (a locked job can be deleted, a locked tag  cannot).
>
> I agree that this is confusing. I would not mind a tag/label system. The
> "internal" bool is only used in the html UI to display the artifact
> format in a different style. This would mean translating these tags as
> CSS classes here:
>
> https://git.sr.ht/~rjarry/dlrepo/tree/v0.22/item/dlrepo/templates/job.html#L20-23
> https://git.sr.ht/~rjarry/dlrepo/tree/v0.22/item/dlrepo/templates/product_version.html#L17-20
>
> And describing in the docs how to style these classes. Also, this:
>
>   title="internal format (not for release)"
>
> Should be changed to something like:
>
>   title="tags: {{fmt.tags}}"
>
> One thing to keep in mind is the upgrade of an existing repo with this
> new tag system. Should this be done automatically? I don't like that
> very much.
>
> There could be a procedure described somewhere to remove .internal files
> from format dirs and put a .tags file with the "internal" tag instead.
>
> What do you think?

After discussion with Olivier, we could not find use cases for setting
labels on formats. What's more, IMHO setting "internal" on a format is
confusing, since it does not prevent the format to be published.
I would be more inclined to remove the "internal" info for formats,
since it is only informative and a different style in the UI can
easily be missed. Such formats should be well known and properly
hidden via ACLs anyway.
Maybe we can add tags on formats later if the need arises?
Details
Message ID
<CJWU66QI37DV.1B3JW8IO83T20@marty>
In-Reply-To
<CAHfJR7jThewVUpdOOF=HBAO5OWy8=+7V=6r60PvC-oGbbpHi3w@mail.gmail.com> (view parent)
DKIM signature
missing
Download raw message
Julien Floret, May 11, 2022 at 11:39:
> After discussion with Olivier, we could not find use cases for setting
> labels on formats. What's more, IMHO setting "internal" on a format is
> confusing, since it does not prevent the format to be published.
> I would be more inclined to remove the "internal" info for formats,
> since it is only informative and a different style in the UI can
> easily be missed. Such formats should be well known and properly
> hidden via ACLs anyway.
> Maybe we can add tags on formats later if the need arises?

I am OK with removing this "internal" flag. ACLs should be more than
enough to hide some formats if need be.

Going back to that locked flag that has different meaning for jobs and
tags:

> Why not, I don't have a strong opinion on this. We already have the
> "locked" flag which has a slightly different meaning whether it is on
> a job or on a tag (a locked job can be deleted, a locked tag  cannot).

Perhaps, we could rename one of these flags to "frozen" instead of
"locked". I am not sure which one would suit best.
Details
Message ID
<CAHfJR7ivwOHd2AyowBtrmobyLrF2_PNyWfw8qzWfwP=SkpEE7Q@mail.gmail.com>
In-Reply-To
<CJWU66QI37DV.1B3JW8IO83T20@marty> (view parent)
DKIM signature
pass
Download raw message
> Going back to that locked flag that has different meaning for jobs and
> tags:
>
> > Why not, I don't have a strong opinion on this. We already have the
> > "locked" flag which has a slightly different meaning whether it is on
> > a job or on a tag (a locked job can be deleted, a locked tag  cannot).
>
> Perhaps, we could rename one of these flags to "frozen" instead of
> "locked". I am not sure which one would suit best.

Could be an idea. To me, "frozen" seems best suited for jobs, if we
keep this term.
A "frozen" job cannot be altered, but can be deleted. A "locked" tag
is untouchable.
"dlrepo-cli freeze job" would make sense to me.
Reply to thread Export thread (mbox)