Modify

Ticket #941 (reopened defect)

Opened 3 years ago

Last modified 3 years ago

add_column does unnecessary (and db-blocking) work

Reported by: Gábor Farkas <gabor.farkas@…> Owned by: andrew
Priority: minor Milestone: The Future
Component: commands Version: 0.7.3
Keywords: Cc:

Description

when south adds a field, it does 2 alter-table commands, the second one is often unnecessary and very slow.

in more detail:

when doing this:

db.add_column('p_thing', 'content', self.gf('django.db.models.fields.CharField')(max_length=30, null=True), keep_default=False)

on postgresql,

the following 2 commands are executed:

ALTER TABLE "p_thing" ADD COLUMN "content" varchar(30) NULL;
ALTER TABLE "p_thing" ALTER COLUMN "content" TYPE varchar(30), ALTER COLUMN "content" DROP NOT NULL, ALTER COLUMN "content" DROP DEFAULT;

the second command is totally unnecessary: the type is already correct, it is not "NOT NULL", and it does not have a DEFAULT.
the bigger problem is that when doing this on a db in production, the first commands finishes very fast, but the second one takes a very long time (i guess it does something to every row). so the migration will block the site's normal traffic for a very long time.

Attachments

Change History

comment:1 Changed 3 years ago by andrew

  • Status changed from new to closed
  • Resolution set to wontfix

This is a deliberate design decision on the part of South - it doesn't know the current state of the column, and so it has to run that command - it has no way of knowing if it should or not. It's feasible that a lot of introspection could be written to test whether to run it, but that will just increase the amount of code and make things more fragile.

Migrations, in general, will block a large database (especially column type alteration, or adding constraints) - if you have a site that large, you should be running using a two-stage plan, where you first switch the site over to a read-only slave, and then run migrations on the main database (or, just put a holding page up in the interim).

comment:2 Changed 3 years ago by Gábor Farkas <gabor.farkas@…>

hi, perhaps i misunderstand something, but i do not agree. south creates the column, so it should know what he created.
in more detail:

if you look at the 2 sql commands generated by south:

  • the first one creates the column, where it specifies that it can hold NULL values, it's type is varchar(30), and it does not specify a default value
  • in the second command, he again sets the column-type, the null-status, and the default-value-status

i do not think any introspection is needed here.

or, if i misunderstand something, could you please explain?

comment:3 Changed 3 years ago by andrew

  • Status changed from closed to reopened
  • Resolution wontfix deleted
  • Milestone set to The Future

Ah, you said add_column, not alter_column. Indeed, it's probably possible, but the way South is written assumes this case (as a lot of the time you do need to add with a default then drop it).

Patches are welcome, but this is not a priority to fix for us - the current code works, and there are many other operations that also block for a long time.

comment:4 Changed 3 years ago by andrew

  • Priority changed from major to minor
View

Add a comment

Modify Ticket

Action
as reopened
Author


E-mail address and user name can be saved in the Preferences.

 
Note: See TracTickets for help on using tickets.