~technomancy/fennel

6 2

Feedback on Faith, the testing framework

Rudolf Adamkovič <salutis@me.com>
Details
Message ID
<m2mt254oy3.fsf@me.com>
DKIM signature
missing
Download raw message
Howdy, howdy!

So, we have tried Faith at Egghead Games today, hoping to replace
LuaUnit.  The plan for today was:

(1) Try Faith for one test module (a subset with ~200 tests).
(2) If all works well, contribute 'error-match'.
(3) Enjoy Fennel more.

Unfortunately, we had to backpedal for the following reasons:

(1) UNUSABLE SET-UP AND TEAR-DOWN

Unexpectedly, the designated 'setup' and 'teardown' functions are called
once per module and not once per test.

Most mature xUnit test frameworks support both per-test and per-module
variants of set-up and tear-down, but the per-test variant is the first
to learn and the most useful one, mostly because it can "reset the
world" for every test.

[In our case, we execute every test in pristine state, so we need to
nullify all globals, reset all project-related environment variables,
close and delete the database, and also clean up temporary files.]

P.S. It would be perhaps nice to rename these functions to 'set-up' and
'tear-down', as verbs instead of nouns, matching the setUp and tearDown
in other languages.

(2) WRONG FAILURE LOCATIONS

The reported locations of test failures often point to 'faith.fnl' at
'xpcall' instead of the relevant assertion.

Note: LuaUnit cleverly removes itself from each stack trace.

(3) MISSING STACK TRACES

Even though test failures include clickable locations, they do not
include clickable stack traces.

For those who have ever tried this functionality, which is available in
LuaUnit, there is no going back.  While clickable failure locations add
convenience, clickable stack traces are a game changer.

This functionality saves so much time on debugging that losing it is a
no-go from the business point of view, and I say that as an engineer,
not a business person. :)

(4) UNUSABLE TEST SUITE

The library does not seem to be developed with testability in mind, and
its test suite is <CENSORED>, which makes contributing to it difficult.

For a testing library, that is a red flag, for the library guards the
correctness of literally everything else in the project (and thus the
entire business).

(5) UNCLEAR ARGUMENTS

The documentation says that "assertions take the expected value first,
then the actual", but the examples are the other way, such as '(t.= (+ 1
1) 2)'.  The <CENSORED> test suite goes both ways.

Rudy
-- 
"Chop your own wood and it will warm you twice."
-- Henry Ford; Francis Kinloch, 1819; Henry David Thoreau, 1854

Rudolf Adamkovič <salutis@me.com> [he/him]
Studenohorská 25
84103 Bratislava
Slovakia
Details
Message ID
<87ilct1t5q.fsf@hagelb.org>
In-Reply-To
<m2mt254oy3.fsf@me.com> (view parent)
DKIM signature
missing
Download raw message
Rudolf Adamkovič <salutis@me.com> writes:

> (1) UNUSABLE SET-UP AND TEAR-DOWN
>
> Unexpectedly, the designated 'setup' and 'teardown' functions are called
> once per module and not once per test.
>
> Most mature xUnit test frameworks support both per-test and per-module
> variants of set-up and tear-down, but the per-test variant is the first
> to learn and the most useful one, mostly because it can "reset the
> world" for every test.

I left this out because this is very easy to do in userspace with only
two lines, and initially I was trying to only include things which could
only be handled in the test library. For example:

    (local mod {: test-foo
                : test-bar
                : test-haha})

    (collect [k v (pairs mod)]
      k (if (k:find "^test-") (wrap-setup v) v))

I think there is room for introducing these, but I wanted the first
version to leave out all but the necessities in order to be sure things
were needed before including them. (Easier to add features later than
remove them!) I've never found the need for these myself, but I wouldn't
say they're off the table.

> (2) WRONG FAILURE LOCATIONS
>
> The reported locations of test failures often point to 'faith.fnl' at
> 'xpcall' instead of the relevant assertion.

I have only seen this in one particular circumstance: when the last
assertion of the function is TCO'd. I believe this is unavoidable. I've
definitely seen it with tests that use LuaUnit and lunatest in this
situation as well.

If you have a repro case which is not due to TCO, please share it.

I think I should mention this in the readme tho.

> (3) MISSING STACK TRACES
>
> Even though test failures include clickable locations, they do not
> include clickable stack traces.
>
> For those who have ever tried this functionality, which is available in
> LuaUnit, there is no going back.  While clickable failure locations add
> convenience, clickable stack traces are a game changer.

The stack traces are just normal stack traces. If you haven't configured
Emacs to understand how to hyperlink normal Lua stack traces yet, you
should definitely do that! Then you will get the benefit of hyperlinks
whether the stack trace is printed from a test or not.

> (4) UNUSABLE TEST SUITE
>
> The library does not seem to be developed with testability in mind, and
> its test suite is <CENSORED>, which makes contributing to it difficult.
>
> For a testing library, that is a red flag, for the library guards the
> correctness of literally everything else in the project (and thus the
> entire business).

Yes, this is because the library is very new and has only had a day or
two of work put into it yet. There's a note in the readme about how the
self-tests need to be improved.

> (5) UNCLEAR ARGUMENTS
>
> The documentation says that "assertions take the expected value first,
> then the actual", but the examples are the other way, such as '(t.= (+ 1
> 1) 2)'.  The <CENSORED> test suite goes both ways.

Thanks for catching this; I've corrected the examples.

-Phil
Rudolf Adamkovič <salutis@me.com>
Details
Message ID
<m2cz31xlvj.fsf@me.com>
In-Reply-To
<87ilct1t5q.fsf@hagelb.org> (view parent)
DKIM signature
missing
Download raw message
Phil Hagelberg <phil@hagelb.org> writes:

>> (1) UNUSABLE SET-UP AND TEAR-DOWN
> I left this out because this is very easy to do in userspace with only
> two lines, and initially I was trying to only include things which could
> only be handled in the test library. For example:
>
>     (local mod {: test-foo
>                 : test-bar
>                 : test-haha})
>
>     (collect [k v (pairs mod)]
>       k (if (k:find "^test-") (wrap-setup v) v))

What a great hack! The fact that Lua modules are tables is fucking
awesome.  This is where I do not miss Scheme, or any other language for
that matter.  Simply amazing.

> I think there is room for introducing these, but I wanted the first
> version to leave out all but the necessities in order to be sure things
> were needed before including them. (Easier to add features later than
> remove them!) I've never found the need for these myself, but I wouldn't
> say they're off the table.

I think set-up and tear-down should run for every test by default, as
everyone expects that, as per all the existing xUnit set-up/tear-down
style frameworks.

>> (2) WRONG FAILURE LOCATIONS
>
> I have only seen this in one particular circumstance: when the last
> assertion of the function is TCO'd. I believe this is unavoidable. I've
> definitely seen it with tests that use LuaUnit and lunatest in this
> situation as well.

For LuaUnit, I solved the problem with the following macro-module:

{:nil
 (λ [...]
   `(let [luaunit# (require :luaunit)]
      (luaunit#.assert_nil ,...)
      nil))
 :not-nil
 (λ [...]
   `(let [luaunit# (require :luaunit)]
      (luaunit#.assert_not_nil ,...)
      nil))
 :false
 (λ [...]
   `(let [luaunit# (require :luaunit)]
      (luaunit#.assert_false ,...)
      nil))
 :true
 (λ [...]
   `(let [luaunit# (require :luaunit)]
      (luaunit#.assert_true ,...)
      nil))
 :=
 (λ [...]
   `(let [luaunit# (require :luaunit)]
      (luaunit#.assert_equals ,...)
      nil))
 :not=
 (λ [...]
   `(let [luaunit# (require :luaunit)]
      (luaunit#.assert_not_equals ,...)
      nil))
 :almost=
 (λ [...]
   `(let [luaunit# (require :luaunit)]
      (luaunit#.assert_almost_equals ,...)
      nil))
 :error
 (λ [...]
   `(let [luaunit# (require :luaunit)]
      (luaunit#.assert_error_msg_contains ,...)
      nil))}

[I have written about 500 tests with the above.  Zero problems.]

Would this help?

>> (3) MISSING STACK TRACES
>
> The stack traces are just normal stack traces. If you haven't configured
> Emacs to understand how to hyperlink normal Lua stack traces yet, you
> should definitely do that! Then you will get the benefit of hyperlinks
> whether the stack trace is printed from a test or not.

Oh, my bad then!

(I have it configured, just got confused, apparently.)

>> (4) UNUSABLE TEST SUITE
>
> Yes, this is because the library is very new and has only had a day or
> two of work put into it yet. There's a note in the readme about how the
> self-tests need to be improved.

Gotcha.  As I am *fanatically* test-driven, I will hold my horses until
the tests look more normal.  :)

We could perhaps use a couple of pages of with plain old 'assert' calls?
Still better than the current eyeballing or diffing, IMO.

Rudy
-- 
"Chop your own wood and it will warm you twice."
-- Henry Ford; Francis Kinloch, 1819; Henry David Thoreau, 1854

Rudolf Adamkovič <salutis@me.com> [he/him]
Studenohorská 25
84103 Bratislava
Slovakia
Details
Message ID
<87fs7w1d3d.fsf@hagelb.org>
In-Reply-To
<m2cz31xlvj.fsf@me.com> (view parent)
DKIM signature
missing
Download raw message
Rudolf Adamkovič <salutis@me.com> writes:

>> I think there is room for introducing these, but I wanted the first
>> version to leave out all but the necessities in order to be sure things
>> were needed before including them. (Easier to add features later than
>> remove them!) I've never found the need for these myself, but I wouldn't
>> say they're off the table.
>
> I think set-up and tear-down should run for every test by default, as
> everyone expects that, as per all the existing xUnit set-up/tear-down
> style frameworks.

Yeah, this is probably fine. Maybe the existing setup/teardown functions
could be renamed `setup-all` and `teardown-all`. Better to make a change
like that now when we have roughly zero users. =)

If you want to take a shot at this, feel free. Otherwise I can probably
do it before releasing the next version, but I don't know when that
would be.

>> I have only seen this in one particular circumstance: when the last
>> assertion of the function is TCO'd. I believe this is unavoidable. I've
>> definitely seen it with tests that use LuaUnit and lunatest in this
>> situation as well.
>
> For LuaUnit, I solved the problem with the following macro-module:
>
> [...]
>
> [I have written about 500 tests with the above.  Zero problems.]
>
> Would this help?

Yes, if we decide to include macros in Faith, this could be a good
option, but right now it is a single-file library. But it sounds like
what you're saying is that you did have this problem with luaunit too,
until you solved it with this workaround.

There have been a few discussions of how to include macros and runtime
functions in the same file for Fennel in general. So it might be
possible to do this without giving up the single-file approach in the
future.

>>> (4) UNUSABLE TEST SUITE
>>
>> Yes, this is because the library is very new and has only had a day or
>> two of work put into it yet. There's a note in the readme about how the
>> self-tests need to be improved.
>
> Gotcha.  As I am *fanatically* test-driven, I will hold my horses until
> the tests look more normal.  :)

I've added some self-tests here:

https://git.sr.ht/~technomancy/faith/commit/493ced5594b50cf5f0a26754013ce5f26650bfdb

They rely on the specifics of the output formatting a little more than I
would like, but it's a start. =)

-Phil
Rudolf Adamkovič <salutis@me.com>
Details
Message ID
<m28rdfnkj6.fsf@me.com>
In-Reply-To
<87fs7w1d3d.fsf@hagelb.org> (view parent)
DKIM signature
missing
Download raw message
Phil Hagelberg <phil@hagelb.org> writes:

Firstly, thank you for taking the time to work with me, Phil.

> [...] Maybe the existing setup/teardown functions could be renamed
> `setup-all` and `teardown-all`.  [...]  If you want to take a shot at
> this, feel free.

Please see the attached first-cut patch.  WDYT?

It would be nice to allow passing data from 'setup' and 'teardown' down
to the tests, the way 'setup-all' and 'teardown-all' do it, but we can
add that separately.  When we do, per-test 'setup' data, if any, win
over per-module 'setup-all' data, I think.

> Yes, if we decide to include macros in Faith, this could be a good
> option, but right now it is a single-file library.

Perhaps we could offer an optional macro-module that fixes it for now?
It is a pretty serious bug for day-to-day use, IMO.

> I've added some self-tests here: [...]  They rely on the specifics of
> the output formatting a little more than I would like, but it's a
> start. =)

That is a *great* start, and it allowed me to write the patch below.

Rudy
Details
Message ID
<87h6s1yb2z.fsf@hagelb.org>
In-Reply-To
<m28rdfnkj6.fsf@me.com> (view parent)
DKIM signature
missing
Download raw message
Rudolf Adamkovič <salutis@me.com> writes:

> It would be nice to allow passing data from 'setup' and 'teardown' down
> to the tests, the way 'setup-all' and 'teardown-all' do it, but we can
> add that separately.  When we do, per-test 'setup' data, if any, win
> over per-module 'setup-all' data, I think.

Hm. My original assumption would be that it would pass both thru, but
that gets very awkward given that setup and teardown can currently
return an arbitrary number of values. I think maybe it would be better
if we restricted it down to only returning a single value for `setup`
and a single value for `setup-all`. Then the test function could just be
passed two arguments.

>> Yes, if we decide to include macros in Faith, this could be a good
>> option, but right now it is a single-file library.
>
> Perhaps we could offer an optional macro-module that fixes it for now?
> It is a pretty serious bug for day-to-day use, IMO.

Sure; given that it is still completely usable as a single file (other
than this one issue which is common across every testing library),
adding an optional macros file separately seems fine. If you want to
take a shot at writing a macro module, go ahead.

> Please see the attached first-cut patch.  WDYT?

Thanks!

> +        result
> +        (or (case ?setup
> +              setup (case (xpcall setup (err-handler name))
> +                      true nil
> +                      (_ err) err))

The outer `case` call here isn't really doing pattern-matching; this
would be simpler as an `if` since the rebinding isn't really necessary.

The inner `case` might be better as a `case-try` since the goal is to
continue on to a new pattern under certain circumstances, and the same
failure pattern can be used for both, for instance:

>  (let [started-at (now)
>        result (case-try (if ?setup (xpcall ?setup (err-handler name)) true)
>                 true (xpcall #(test (unpack context)) (err-handler name))
>                 true (pass)
>                 (catch
>                  (_ err) err))]
>    (when ?teardown
>      (pcall ?teardown (unpack context)))

However, I'll go ahead and apply your patch anyway since this section
will need to be redone in order to get the `setup` function to convey
context to the individual test functions as discussed above.

-Phil
Rudolf Adamkovič <salutis@me.com>
Details
Message ID
<m25y8ho54i.fsf@me.com>
In-Reply-To
<87h6s1yb2z.fsf@hagelb.org> (view parent)
DKIM signature
missing
Download raw message
Phil Hagelberg <phil@hagelb.org> writes:

> Rudolf Adamkovič <salutis@me.com> writes:
>
>> It would be nice to allow passing data from 'setup' and 'teardown' down
>> to the tests, the way 'setup-all' and 'teardown-all' do it, but we can
>> add that separately.  When we do, per-test 'setup' data, if any, win
>> over per-module 'setup-all' data, I think.
>
> Hm. My original assumption would be that it would pass both thru, but
> that gets very awkward given that setup and teardown can currently
> return an arbitrary number of values. I think maybe it would be better
> if we restricted it down to only returning a single value for `setup`
> and a single value for `setup-all`. Then the test function could just be
> passed two arguments.

I wonder what the value of this is, BTW.  One could say, exactly once,
'(var ...)' on top of the file and then '(set ...)' in the setup,
instead of repeating '(fn [data] ...)' in every test (and not being able
to use 'lambda' or 'λ').

> [...] adding an optional macros file separately seems fine. If you
> want to take a shot at writing a macro module, go ahead.

I plan to do that, as we want to see correct line numbers for test
failures here at Egghead Games.  And who does not?!  I also plan to
contribute the 'error-match' function, as discussed.

> The outer `case` call here isn't really doing pattern-matching; this
> would be simpler as an `if` since the rebinding isn't really necessary.

Well, technically, it does. :) I typically use 'case', as it rebinds the
value to a name without a question mark, which makes it crystal clear to
the reader that the value is not 'nil'.  I even remember reading in the
Fennel reference that 'case' replaces 'if-let' and 'when-let'.  However!
I do see the appeal of not repeating the variable name twice in the
simple cases like ours here.

Patched.

> The inner `case` might be better as a `case-try` [...]

TIL!  Patched.  Thank you!

Please see the attached patch.

Rudy
Reply to thread Export thread (mbox)