Hi,
First, many thanks for providing yoyo to the public. A few months ago we retrofitted it into the development process for a huge old internal database application. It helps us improve the quality of our development process.
We use several database instances. For now, think of "DEV-DB" and "PROD-DB". For DEV-DB we use two migration source directories, "test" and "prod". For PROD-DB we use only the source directory "prod".
The migrations in directory "prod" define the database schema and the migrations in directory "test" load additional test data into the "DEV-DB".
By convention, we name the migration files according to a scheme that imposes an ordering:
<migration number>.<sub-number>-<other text>.sql (or .py).
Currently yoyo sorts the migrations for the DEV-DB this way:
1. all migrations from directory "test" sorted by file name and then
2. all migrations from directory "prod" sorted by file name
Now suppose I have the migrations "test/01.sql", "test/02.sql", "prod/03.sql", "prod/04.sql", "test/05.sql" and "prod/06.sql". Currently they would be applied in the following order:
"test/01.sql",
"test/02.sql",
"test/05.sql",
"prod/03.sql",
"prod/04.sql",
"prod/06.sql"
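A minimal sketch of this per-directory ordering, using plain path strings instead of yoyo objects. The dictionary `sources` and the helper `per_source_order` are illustrative names only and are not part of yoyo:

```python
from itertools import chain
from pathlib import PurePosixPath

# Hypothetical stand-in for the two source directories from the example.
sources = {
    "test": ["test/05.sql", "test/01.sql", "test/02.sql"],
    "prod": ["prod/06.sql", "prod/03.sql", "prod/04.sql"],
}


def per_source_order(sources):
    """Sort each directory by file name, then concatenate the directories."""
    return list(
        chain.from_iterable(
            sorted(paths, key=lambda p: PurePosixPath(p).name)
            for paths in sources.values()
        )
    )


current_order = per_source_order(sources)
print(current_order)
```

Because the directories are concatenated after sorting, "test/05.sql" ends up ahead of every migration from "prod".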
Now, if migration "test/05.sql" declares a dependency on "prod/04.sql" I get the following order:
"test/01.sql",
"test/02.sql",
"prod/04.sql",
"test/05.sql",
"prod/03.sql",
"prod/06.sql"
This changes the application order for productive migrations from 03, 04, 06 to 04, 03, 06. That is, I get different ordering of the productive migrations for PROD-DB and DEV-DB. That is not so good.
Therefore I would like to add an option to sort all migrations by their ID. Then I get this order:
"test/01.sql",
"test/02.sql",
"prod/03.sql",
"prod/04.sql",
"test/05.sql",
"prod/06.sql"
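As a rough illustration of the requested behaviour, the following sketch sorts all paths by ID regardless of directory. It assumes that the migration ID is the file name without directory and extension; `migration_id` is a hypothetical helper, not a yoyo function:

```python
from pathlib import PurePosixPath

paths = [
    "test/01.sql", "test/02.sql", "test/05.sql",
    "prod/03.sql", "prod/04.sql", "prod/06.sql",
]


def migration_id(path):
    """Assumed ID: the file name without directory and extension."""
    return PurePosixPath(path).stem


# One global sort across both directories.
sorted_by_id = sorted(paths, key=migration_id)
print(sorted_by_id)
```

The IDs are compared as strings, so this only gives the intended order if the migration numbers are zero-padded to the same width ("10" sorts before "9").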
Now it is no longer necessary to declare dependencies explicitly. I made a minimal proof-of-concept patch:
diff -r 312182a8c03f yoyo/migrations.py
--- a/yoyo/migrations.py Mon Jan 18 15:13:41 2021 +0000
+++ b/yoyo/migrations.py Fri Mar 26 17:03:56 2021 +0100
@@ -510,6 +510,10 @@
     """
     migrations = OrderedDict()  # type: Dict[str, MigrationList]
+    sort_migrations = '!SORT!' in sources
+    if sort_migrations:
+        sources = tuple(s for s in sources if s != '!SORT!')
+
     for source, paths in _expand_sources(sources):
         for path in paths:
             if path.endswith(".rollback.sql"):
@@ -531,7 +535,7 @@
             else:
                 ml.append(migration)
     merged_migrations = MigrationList(
-        chain(*migrations.values()),
+        sorted(chain(*migrations.values()), key=lambda m: m.id) if sort_migrations else chain(*migrations.values()),
         chain(*(m.post_apply for m in migrations.values())),
     )
     return merged_migrations
It works well, but the configuration is a bit of a hack: you have to add the entry "!SORT!" to the sources list. Unfortunately there seems to be no easy way to access the configuration and I didn't want to change the signature of function "read_migrations(*sources)" for a POC.
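For comparison, a keyword-only flag placed after `*sources` would keep existing positional calls working without a sentinel entry. The sketch below shows the idea on plain ID strings; `merge_sources` and `sort_by_id` are hypothetical names and not yoyo's API:

```python
from collections import OrderedDict
from itertools import chain


def merge_sources(*sources, sort_by_id=False):
    """Hypothetical helper. Each source is a (name, [ids]) pair.

    Without the flag, IDs are sorted per source and then concatenated.
    With sort_by_id=True, the merged list is sorted as a whole.
    """
    per_source = OrderedDict((name, sorted(ids)) for name, ids in sources)
    merged = chain(*per_source.values())
    return sorted(merged) if sort_by_id else list(merged)


test = ("test", ["05", "01", "02"])
prod = ("prod", ["03", "04", "06"])
print(merge_sources(test, prod))
print(merge_sources(test, prod, sort_by_id=True))
```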
It would be really cool to get an option to sort the migrations in a future release of yoyo.
Kind regards
Anselm
On 26/03/2021, Kruis, Anselm wrote:
>
>Now, if migration "test/05.sql" declares a dependency on "prod/04.sql" I get the following order:
>
>"test/01.sql",
>"test/02.sql",
>"prod/04.sql",
>"test/05.sql",
>"prod/03.sql",
>"prod/06.sql"
>
>This changes the application order for productive migrations from 03, 04, 06 to 04, 03, 06. That is, I get different ordering of the productive migrations for PROD-DB and DEV-DB. That is not so good.
>
>Therefore I would like to add an option to sort all migrations by their ID. Then I get this order:
>
I think there might be a simpler solution. If we do an initial sort of
the migrations by id then yoyo should use your desired order by default,
and maintain that even if you add dependencies to tweak the ordering::
diff --git yoyo/migrations.py yoyo/migrations.py
--- yoyo/migrations.py
@@ -760,8 +760,8 @@
     migration_list: Iterable[Migration],
 ) -> Iterable[Migration]:
-    # Make a copy of migration_list. It's probably an iterator.
-    migration_list = list(migration_list)
+    # Make a copy of migration_list and do an initial sort by id
+    migration_list = list(sorted(list(migration_list), key=lambda m: m.id))
     # Track graph edges in two parallel data structures.
     # Use OrderedDict so that we can traverse edges in order
Does that work for you?
If so, I think it would be safe to merge this. It shouldn't change
anything for single-source workflows, and multi-source workflows are
likely to either not rely on cross-dependencies or already work around
the same issue you've run into by maintaining explicit dependencies.
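To illustrate why an initial sort by id should survive added dependencies, here is a toy dependency sort on plain ID strings. It is not yoyo's actual algorithm, and `ordered_toposort` is a hypothetical name; it only shows that a migration moves when a dependency forces it to:

```python
def ordered_toposort(ids, depends):
    """Toy sort: emit IDs in sorted order, deferring an ID until all of
    its dependencies have been emitted."""
    pending = sorted(ids)
    done = []
    while pending:
        for mid in pending:
            if depends.get(mid, set()) <= set(done):
                done.append(mid)
                pending.remove(mid)
                break  # restart the scan from the smallest pending ID
        else:
            raise ValueError("dependency cycle or unknown dependency")
    return done


ids = ["01", "02", "05", "03", "04", "06"]

# "05" depends on "04": already satisfied by the sorted order.
with_dependency = ordered_toposort(ids, {"05": {"04"}})

# "03" depends on "04": only these two migrations change places.
with_tweak = ordered_toposort(ids, {"03": {"04"}})
print(with_dependency, with_tweak)
```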
Olly.