Hi Olly,
Many thanks for your reply and your patch. Please apologize my late reply, I was on vacation for during the last two weeks. Your patch works for me. It is indeed much simpler than mine.
Another observation: the function topological_sort() might be to aggressive. Given I have three migrations in this order by ID:
"m1.sql", "m2.sql", "m3.sql" and "m3" depends on "m1"
Currently topological_sort() yields "m1", "m3", "m2". But the initial order "m1", "m2", "m3" already satisfies the dependency condition.
Of the various possible topological sorts we usually want the one, which is most similar to the original sequence of migrations. To define "most similar" one can use the Damerau-Levenshtein-distance or the Levenshtein-distance. See Oleksiy Gapotchenkos blog http://blog.gapotchenko.com/stable-topological-sort for an algorithm. If you are interested, I just coded a patch.
Kind regards
Anselm
--
Anselm Kruis
Senior Solution Architect
MSE OPS HPCS S PS
T +49-89-35 63 86-874
M +49-16 05 82 82 00
anselm.kruis@atos.net
science + computing ag
Ingolstädter Str. 22
80807 München, Germany
atos.net/de/deutschland/sc
science + computing AG; Vorstand: Dr. Martin Matzke (Vorsitzender), Sabine Hohenstein, matthias Schempp; Vorsitzender des Aufsichtsrats: Philippe Robert Jonas Miltin; Sitz der Gesellschaft: Tübingen; Registergericht: Amtsgericht Stuttgart, HRB 382196.
Hi Anslem,
>Your patch works for me. It is indeed much simpler than mine.
>
Since writing that patch I rediscovered this:
https://todo.sr.ht/~olly/yoyo/4
Because my patch breaks that earlier fix, I'd now prefer not to apply
it.
>Another observation: the function topological_sort() might be to
>aggressive. Given I have three migrations in this order by ID:
>"m1.sql", "m2.sql", "m3.sql" and "m3" depends on "m1"
>Currently topological_sort() yields "m1", "m3", "m2". But the initial
>order "m1", "m2", "m3" already satisfies the dependency condition.
>
>Of the various possible topological sorts we usually want the one,
>which is most similar to the original sequence of migrations. To define
>"most similar" one can use the Damerau-Levenshtein-distance or the
>Levenshtein-distance. See Oleksiy Gapotchenkos blog
>http://blog.gapotchenko.com/stable-topological-sort for an algorithm.
>If you are interested, I just coded a patch.
>
Thanks for the patch! I have applied it and taken some time to play
around with it (see the 'gapotchenko' bookmark).
I am hesitant to merge it though. I am worried about the algorithm's
performance, which looks to be roughly quadratic. For small numbers of
migrations (<100) the difference is not noticeable, but beyond a couple
of hundred it starts to slow down a lot. I have never used yoyo with
such numbers of migrations, but it doesn't seem unreasonable that it
should be able to cope with this.
I have tried designing a new algorithm which you can see under the
'new-toposort-algo' bookmark. The general idea is to number the nodes
according to their original input order, then at each iteration output
the lowest numbered node that has all its dependencies satisfied. This
seems to give better orderings than the current algorithm and has good
performance. If you have the time to test it I'd be interested to know
how well it works for your requirements.
Olly.