Ticket #370 (assigned enhancement)

Opened 5 years ago

Last modified 3 years ago

Extended MultiDB Support

Reported by: andrew
Owned by: andrew
Priority: major
Milestone: The Future
Component: commands
Version: 0.7-pre
Keywords:
Cc:

Description

Implement more sophisticated support for MultiDB.

This will be support for having different sets of migrations - one per database, usually - with different models, etc. in them.

--auto will detect which models go where, by using the database router's allow_syncdb function, which Django already has, and do things roughly correctly.
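As a rough illustration of the idea above, here is a minimal sketch of a router with an allow_syncdb method, plus a helper that asks it which models belong on a given database. The router class, the app labels, the database aliases and the models_for_db helper are all assumptions made up for this example; only the allow_syncdb(db, model) hook itself comes from Django.

```python
class AppLabelRouter(object):
    """Hypothetical router: sends each app's tables to one database alias."""

    # Assumed layout, for illustration only.
    app_to_db = {"app1": "db1", "app2": "db2"}
    default_db = "default"

    def allow_syncdb(self, db, model):
        # Django calls this hook with a database alias and a model class.
        target = self.app_to_db.get(model._meta.app_label, self.default_db)
        return db == target


def models_for_db(router, db, models):
    """Hypothetical helper: the models whose tables the router allows on `db`."""
    return [m for m in models if router.allow_syncdb(db, m)]
```

An autodetector could, in principle, call something like models_for_db once per database alias and write each result into that database's set of migrations.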

Change History

comment:1 Changed 5 years ago by andrew

  • Status changed from new to assigned

comment:2 Changed 4 years ago by andrew

  • Milestone changed from 0.7.1 to The Future

comment:3 Changed 4 years ago by burchik@…

I'd like to see valid support for database routers (allow_syncdb).
Currently, when you pass --database <somedb> it fails with:
django.db.utils.DatabaseError: (1146, "Table 'somedb.south_migrationhistory' doesn't exist")

comment:4 Changed 4 years ago by andrew

Are you sure that's not because you didn't syncdb to that database first? The south migration history table is created using syncdb, not migrate, perhaps confusingly.

comment:5 Changed 4 years ago by anonymous

I did run syncdb. South did not create the migration history table on the secondary database.
Also, it raises NoMigrations if you don't pass an application name to it.

The following scheme was implemented:
app1 -> db1
app2 -> db2
other apps -> main-db

I tried to migrate app1; it was the only app with a migrations folder.

comment:6 Changed 4 years ago by aaugustin

tl;dr It is a hard problem because Django's database routers use a generic approach that is impossible to replicate perfectly in a migration context.


Here's the result of a few hours of research, in the hope that it may help whoever tackles this ticket.

1) Django's database routers can define an "allow_syncdb" method, saying if tables for a given Model (as in django.db.models.Model) should be synchronized in a given database (as in settings.DATABASES.keys())

2) South migrations, recorded in the migrations folder, contain a series of database modification statements, listed here: http://south.aeracode.org/docs/databaseapi.html

These methods are thin wrappers around the corresponding SQL statements; they exist so the migrations are database-agnostic but do not do anything clever. Their first argument is always the table name.

3) South could wrap each database modification statement, or series of such statements, with "if self.allow_migrate(app_label, model_name): ...".

Then there are two alternatives for the implementation of allow_migrate:

a) load the fake model frozen in the migration and call allow_syncdb(db, fake_model). This can fail with complex routers because the fake model implements a subset of a real model's methods.
b) load the real model and call allow_syncdb(db, model). This could be implemented like send_create_signal. But loading the real model can fail: it could have been renamed, removed, etc.

In either implementation:

  • if something fails, it's not clear whether allow_migrate should return True or False;
  • this will not work for existing apps unless someone rewrites all their migrations, which is a big problem with third party apps.
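The guard discussed in 3) could be sketched roughly as follows. This is not South code: the allow_migrate function, its load_model callback and the on_error parameter are hypothetical names, and the loader is assumed to raise LookupError when the model no longer exists. The sketch mainly shows where the "True or False on failure" question ends up: it becomes an explicit choice the caller has to make.

```python
def allow_migrate(router, db, load_model, on_error=False):
    """Hypothetical guard around a block of migration statements.

    `load_model` is an assumed callable returning either the frozen fake
    model (alternative a) or the real model (alternative b).
    `on_error` is the answer given when the router cannot be consulted.
    """
    try:
        model = load_model()
    except LookupError:
        # Alternative b can fail here: the real model was renamed or removed.
        return on_error
    try:
        result = router.allow_syncdb(db, model)
    except AttributeError:
        # Alternative a can fail here: the fake model lacks something
        # the router expects from the real one.
        return on_error
    if result is None:
        # A router returning None has no opinion; Django allows the
        # operation when no router objects.
        return True
    return bool(result)
```

Whichever default is picked for on_error, one failure mode silently skips tables and the other creates them on the wrong database, which is the ambiguity noted in the first bullet above.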

Currently the best way to use South with multiple databases holding different sets of models is to run, for each app and db (depending on your setup, obviously):
./manage.py migrate --database=<db> app
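For setups with more than a couple of apps, the per-app, per-database invocation can be scripted. The following is a minimal sketch; the app names, database aliases and helper names are placeholders invented for the example, and only the command line itself is taken from the comment above.

```python
import subprocess

# Assumed app -> database layout; replace with your own.
APP_DATABASES = [("app1", "db1"), ("app2", "db2")]


def migrate_commands(app_databases, manage_py="./manage.py"):
    """Build one 'migrate --database=<db> app' command per (app, db) pair."""
    return [
        [manage_py, "migrate", "--database=%s" % db, app]
        for app, db in app_databases
    ]


def run_all(app_databases):
    """Run the commands in order, stopping at the first failure."""
    for cmd in migrate_commands(app_databases):
        subprocess.check_call(cmd)
```

Calling run_all(APP_DATABASES) from the project directory would then migrate each app against its own database.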

comment:7 Changed 4 years ago by andrew

The main problem with that approach is that South, as you've seen, deals in tables, not in models, making the use of a router difficult. I'm not so keen on the whole "if allow_migrate" business, as it both won't work with any existing migrations, and requires you to use the autogenerator / do a lot of faffing when writing these things manually.

I'd prefer perhaps either making South a bit more model-oriented (which is a big, long-term plan), or to have a router that routes based on table name (not perfect, but it could probably do a reasonable job).
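A table-name-based router along those lines might look something like the sketch below. Nothing here exists in South; the prefix mapping and both function names are hypothetical. It relies on Django's default table naming of <app_label>_<model name>, so it would misroute any model that sets a custom db_table, which is one reason such a router would not be perfect.

```python
# Hypothetical mapping from table-name prefix to database alias.
TABLE_PREFIX_TO_DB = {"app1_": "db1", "app2_": "db2"}


def db_for_table(table_name, default="default"):
    """Pick a database alias for a table; the longest matching prefix wins."""
    matches = [p for p in TABLE_PREFIX_TO_DB if table_name.startswith(p)]
    if not matches:
        return default
    return TABLE_PREFIX_TO_DB[max(matches, key=len)]


def allow_table(db, table_name):
    """Should statements touching `table_name` run against `db`?"""
    return db == db_for_table(table_name)
```

Because South's database API methods take the table name as their first argument, a check like allow_table could be applied without loading any model.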

Still, either of these is large and complex to get correct, and other things are the priority right now (when I even get the time to write code at all).

comment:8 Changed 3 years ago by jdunck@…

Some discussion from the IRC channel on this topic:

[11:39:32] <jdunck> Is anyone using south with multi-db? I've run into this and some other problems: http://south.aeracode.org/ticket/370
[11:40:14] <jdunck> I'd like to discuss (hopefully today) an approach, because I think I need to fix this for my project and I'd prefer to do something that can be accepted upstream.
[11:40:26] <jdunck> andrewgodwin
...
[12:50:18] <jdunck> I imagine ep.io is taking most of your attention these days. I read the thread on 4 options for south migration here
[12:50:18] <jdunck> http://groups.google.com/group/south-users/browse_thread/thread/40cb65dc51c4af20/b8f531b00897131a
[12:50:47] <jdunck> I was wondering if you have any different feelings since that thread?
...
[13:11:50] <@andrewgodwin> my opinions these days are that you should probably be able to have multiple migration sets and then say which database uses what sets
[13:12:08] <jdunck> so, this is a subdir under the migrations dir?
[13:12:31] <jdunck> do you say --migset both when running (schema|data)migration and migrate?
[13:13:53] <jdunck> andrewgodwin and you haven't started on that, right? no branch in bitbucket.
[13:14:16] <@andrewgodwin> nope, have only thought about it
[13:14:18] <jdunck> if you'd prefer to take this to the mailing list, that's fine, i just am trying to understand the path in your mind so i can follow it.
[13:14:27] <@andrewgodwin> and I was thinking more of a setting with dbalias: migset name
[13:14:40] <@andrewgodwin> with the "default" migset being the default unless otherwise specified
[13:15:06] <@andrewgodwin> the setting would probably deprecate SOUTH_MIGRATION_MODULES, as it'd provide a superset of functionality
[13:16:39] <jdunck> I see. So if I wanted a table in 1 db but not another, rather than respecting the router, we'd rely on the migset for the 2nd db not having the migration at all.
...
[13:17:25] <@andrewgodwin> yes, exactly that
[13:17:35] <@andrewgodwin> though I can imagine something that used the router to make the migsets
[13:18:02] <@andrewgodwin> to be perfectly honest, I've been wanting to write a second, much more declarative, migration system for a while, and this would work better with that idea
[13:18:07] <jdunck> so --auto would look at the router and make migration copies for all syncdb?
[13:18:29] <@andrewgodwin> essentially, though sticking it all under --auto might be a bit much
[13:18:51] <@andrewgodwin> people who use this are going to be power users to start with, so I'm not concerned with a new switch/command or two
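The "dbalias: migset name" setting floated in the log above was never implemented, so the following is only a guess at its shape. The setting name MIGSETS_BY_DATABASE, the migset names and the lookup helper are all invented for illustration.

```python
# Hypothetical setting: database alias -> migration set name.
# The name and format are assumptions; no such setting exists in South.
MIGSETS_BY_DATABASE = {
    "db2": "reporting",
}


def migset_for(db_alias, mapping=MIGSETS_BY_DATABASE):
    """Return the migset for a database, falling back to "default"."""
    return mapping.get(db_alias, "default")
```

Under this scheme a table that should exist in one database but not another is handled by leaving its migration out of the second database's migset, rather than by consulting the router at migrate time.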

comment:9 Changed 3 years ago by dgouldin

Further implementation discussion on the mailing list:

http://groups.google.com/group/south-users/browse_thread/thread/4429dfbf8203e9df
