Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed #27452 -- Added serial fields to contrib.postgres. #18123

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

csirmazbendeguz
Copy link

@csirmazbendeguz csirmazbendeguz commented May 3, 2024

Trac ticket number

ticket-27452

Branch description

Added SmallSerialField, SerialField, BigSerialField to contrib.postgres.

Previous PR

Checklist

  • This PR targets the main branch.
  • The commit message is written in past tense, mentions the ticket number, and ends with a period.
  • I have checked the "Has patch" ticket flag in the Trac system.
  • I have added or updated relevant tests.
  • I have added or updated relevant docs, including release notes if applicable.
  • I have attached screenshots in both light and dark modes for any UI changes.

@ngnpope ngnpope self-requested a review May 3, 2024 06:37
@csirmazbendeguz csirmazbendeguz force-pushed the ticket_27452 branch 4 times, most recently from e892da4 to 97cbc23 Compare May 3, 2024 08:31
@csirmazbendeguz csirmazbendeguz marked this pull request as draft May 3, 2024 10:21
@csirmazbendeguz csirmazbendeguz force-pushed the ticket_27452 branch 12 times, most recently from fccac26 to 2b48193 Compare May 4, 2024 13:05
Copy link
Contributor

@LilyFoote LilyFoote left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a great start!

In addition to the specific comments I added, we'll need some migrations tests to cover adding/removing serial fields and converting between serial and auto fields.

django/contrib/postgres/fields/serial.py Outdated Show resolved Hide resolved
django/contrib/postgres/fields/serial.py Outdated Show resolved Hide resolved
django/contrib/postgres/fields/serial.py Outdated Show resolved Hide resolved
django/contrib/postgres/fields/serial.py Outdated Show resolved Hide resolved
django/db/backends/postgresql/operations.py Show resolved Hide resolved
django/db/backends/postgresql/schema.py Outdated Show resolved Hide resolved
django/db/backends/postgresql/schema.py Show resolved Hide resolved
Comment on lines 5742 to 5812
@unittest.skipUnless(connection.vendor == "postgresql", "PostgreSQL specific")
@isolate_apps("schema")
@tag("serial")
def test_serial_to_integer(self):
from django.contrib.postgres.fields import SerialField

self.serial_to_integer_test(SerialField, IntegerField)

@unittest.skipUnless(connection.vendor == "postgresql", "PostgreSQL specific")
@isolate_apps("schema")
@tag("serial")
def test_serial_to_small_integer(self):
from django.contrib.postgres.fields import SerialField

self.serial_to_integer_test(SerialField, SmallIntegerField)

@unittest.skipUnless(connection.vendor == "postgresql", "PostgreSQL specific")
@isolate_apps("schema")
@tag("serial")
def test_serial_to_big_integer(self):
from django.contrib.postgres.fields import SerialField

self.serial_to_integer_test(SerialField, BigIntegerField)

@unittest.skipUnless(connection.vendor == "postgresql", "PostgreSQL specific")
@isolate_apps("schema")
@tag("serial")
def test_small_serial_to_integer(self):
from django.contrib.postgres.fields import SmallSerialField

self.serial_to_integer_test(SmallSerialField, IntegerField)

@unittest.skipUnless(connection.vendor == "postgresql", "PostgreSQL specific")
@isolate_apps("schema")
@tag("serial")
def test_small_serial_to_small_integer(self):
from django.contrib.postgres.fields import SmallSerialField

self.serial_to_integer_test(SmallSerialField, SmallIntegerField)

@unittest.skipUnless(connection.vendor == "postgresql", "PostgreSQL specific")
@isolate_apps("schema")
@tag("serial")
def test_small_serial_to_big_integer(self):
from django.contrib.postgres.fields import SmallSerialField

self.serial_to_integer_test(SmallSerialField, BigIntegerField)

@unittest.skipUnless(connection.vendor == "postgresql", "PostgreSQL specific")
@isolate_apps("schema")
@tag("serial")
def test_big_serial_to_big_integer(self):
from django.contrib.postgres.fields import BigSerialField

self.serial_to_integer_test(BigSerialField, BigIntegerField)

@unittest.skipUnless(connection.vendor == "postgresql", "PostgreSQL specific")
@isolate_apps("schema")
@tag("serial")
def test_big_serial_to_integer(self):
from django.contrib.postgres.fields import BigSerialField

self.serial_to_integer_test(BigSerialField, IntegerField)

@unittest.skipUnless(connection.vendor == "postgresql", "PostgreSQL specific")
@isolate_apps("schema")
@tag("serial")
def test_big_serial_to_small_integer(self):
from django.contrib.postgres.fields import BigSerialField

self.serial_to_integer_test(BigSerialField, SmallIntegerField)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels like a good use case for subTest to me.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the issue is we would either need to generate the test models with a dynamic name, or clean up the test models after the subtests - IMO it's more trouble than worth it.

tests/schema/tests.py Outdated Show resolved Hide resolved
@csirmazbendeguz csirmazbendeguz force-pushed the ticket_27452 branch 2 times, most recently from e7fdd21 to 8052e52 Compare May 5, 2024 10:37
@csirmazbendeguz
Copy link
Author

This looks like a great start!

In addition to the specific comments I added, we'll need some migrations tests to cover adding/removing serial fields and converting between serial and auto fields.

@LilyFoote , thanks for the review! Could you point me to some tests I could copy to get started? I'm not sure how to test the migrations other than with the tests/schema tests.

@LilyFoote
Copy link
Contributor

For database defaults I added tests in tests/migrations/test_autodetector.py and tests/migrations/test_operations.py.

@csirmazbendeguz csirmazbendeguz force-pushed the ticket_27452 branch 5 times, most recently from 569c960 to 0ebc30b Compare May 6, 2024 06:40
@csirmazbendeguz csirmazbendeguz marked this pull request as ready for review May 6, 2024 11:24
@csirmazbendeguz csirmazbendeguz force-pushed the ticket_27452 branch 4 times, most recently from 4c086eb to 51ed545 Compare May 6, 2024 12:03
@csirmazbendeguz
Copy link
Author

csirmazbendeguz commented May 8, 2024

I have managed to test these operations (add, alter, remove serial field) in the schema.tests, I'm not sure if more tests are necessary?

Copy link
Member

@ngnpope ngnpope left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will need to address the backward compatibility with pre-Django 4.1 AutoField on PostgreSQL to avoid breaking many old projects that didn't manually migrate to identity columns.

Edit: This might not be too much of a problem after all, but we should carefully consider all the places where things could break, e.g. migrations. We should likely have a test that starts out with a pre-Django 4.1 AutoField to make sure that it is handled correctly.

django/db/backends/postgresql/introspection.py Outdated Show resolved Hide resolved
django/db/backends/postgresql/operations.py Outdated Show resolved Hide resolved
docs/ref/checks.txt Outdated Show resolved Hide resolved
@csirmazbendeguz
Copy link
Author

@ngnpope , thank you for reviewing my changes ! 🙏 I have pushed a fix for the issues you mentioned.

@csirmazbendeguz
Copy link
Author

@ngnpope , I think all concerns has been addressed, let me know what you think.

Copy link
Contributor

@LilyFoote LilyFoote left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm loving the progress you've made here. I have a few minor suggestions, but overall looks good to me!

django/contrib/postgres/fields/serial.py Outdated Show resolved Hide resolved
django/db/backends/postgresql/schema.py Outdated Show resolved Hide resolved
django/db/backends/postgresql/schema.py Show resolved Hide resolved
django/db/backends/postgresql/schema.py Outdated Show resolved Hide resolved
@csirmazbendeguz
Copy link
Author

Thanks a lot @LilyFoote ! I addressed your suggestions.

Copy link
Contributor

@sarahboyce sarahboyce left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @csirmazbendeguz
I haven't reviewed this in depth but will hope to review again soon, as a first look through, this looks nice

django/contrib/postgres/fields/serial.py Outdated Show resolved Hide resolved
django/contrib/postgres/fields/serial.py Outdated Show resolved Hide resolved
docs/ref/contrib/postgres/fields.txt Outdated Show resolved Hide resolved
docs/releases/5.1.txt Outdated Show resolved Hide resolved
docs/ref/contrib/postgres/fields.txt Outdated Show resolved Hide resolved
docs/ref/contrib/postgres/fields.txt Outdated Show resolved Hide resolved
@csirmazbendeguz csirmazbendeguz force-pushed the ticket_27452 branch 2 times, most recently from 0ec1583 to d04ee2c Compare May 25, 2024 05:20
@csirmazbendeguz
Copy link
Author

Thanks a lot @sarahboyce , I addressed your comments, let me know what you think.

Copy link
Contributor

@sarahboyce sarahboyce left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the updates @csirmazbendeguz
I think there are a few tests/asserts we need to add 👍

django/contrib/postgres/fields/serial.py Outdated Show resolved Hide resolved
django/contrib/postgres/fields/serial.py Outdated Show resolved Hide resolved
django/core/management/commands/inspectdb.py Show resolved Hide resolved
django/db/backends/postgresql/operations.py Show resolved Hide resolved
@csirmazbendeguz
Copy link
Author

Thanks @sarahboyce , fair enough, I made some adjustments.

Copy link
Contributor

@InvalidInterrupt InvalidInterrupt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't really read many of the tests yet; not much has caught my eye that isn't already being discussed.

Comment on lines 41 to 50
def _check_default(self):
if self.default is NOT_PROVIDED:
return []
return [
checks.Error(
f"{self.__class__.__name__} does not accept default values.",
obj=self,
id="fields.E014",
),
]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I would expect most users to operate this way; it feels overly prescriptive to me.

Suppose for some reason I only wanted a subset of objects to be part of the sequence, and the rest to default to 0, I might want to set default on the field and pass DatabaseDefault() only when I want to use the sequence.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @InvalidInterrupt !

I don't think this is an issue, it's possible to create an object without using the sequence:

class Dummy(Model):
    serial = SerialField()

Dummy.objects.create(serial=0).serial  # 0
Dummy.objects.create(serial=0).serial  # 0
Dummy.objects.create(serial=0).serial  # 0
Dummy.objects.create().serial  # 1
Dummy.objects.create().serial  # 2
Dummy.objects.create().serial  # 3
Dummy.objects.create(serial=1).serial  # 1
Dummy.objects.create().serial  # 4

Is this what you're suggesting?

class Dummy(Model):
    serial = SerialField(default=0)

Dummy.objects.create().serial  # 0
Dummy.objects.create().serial  # 0
Dummy.objects.create().serial  # 0
Dummy.objects.create(serial=DatabaseDefault()).serial  # 1
Dummy.objects.create(serial=DatabaseDefault()).serial  # 2
Dummy.objects.create(serial=DatabaseDefault()).serial  # 3
Dummy.objects.create(serial=1).serial  # 1
Dummy.objects.create(serial=DatabaseDefault()).serial  # 4

I think the first one is better because it's consistent with how AutoField works.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been testing on a little project, if we add a SerialField to an existing model I cannot make it nullable or assign a default and so I have to give a temporary default in order to create a migration and add the field.

However, whatever I give here is ignored and the field is populated everywhere with as a serial starting from 1
I feel like it shouldn't prompt us in that case (forcing my input and then ignoring it to me is a bug).
Also considering this, a default option might be nice 🤔 (cc @LilyFoote)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see 🤔 Thanks for catching that @sarahboyce ! I tested it too but I didn't notice the the input was ignored. 😱 I'll investigate.

Yes, default makes sense in the context of migrations.

Copy link
Contributor

@LilyFoote LilyFoote May 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! I think the minimal change feature-wise would be to adapt the migration generation code to know that SerialField can be created without an explicit default.

For a user-chosen default my question would be "why a SerialField and not an IntegerField?".

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for explaining @InvalidInterrupt !

I think the only place this could be useful is in migrations. So when adding a serial field, it could be pre-filled with a default number.

I would find this API confusing otherwise, users may confuse it with the start parameter for example.
I don't think it's too restrictive since users can still opt out of using the sequence by providing an explicit value.

That said, it's always easier to remove a restriction than to add one.

In my opinion, the use-case of migrations is an edge-case and it can be solved by first creating an IntegerField then changing it to a SerialField, so it wouldn't justify enabling this.

If someone can provide a good use-case this restriction could be removed in the future, but for now I have to disagree.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My description of the features potential use may have been confusing; but using default in this manner would be just like (every?) other field type, so I'm sceptical it would be confused for something like start outside of the context of this thread. In regards to API design, as it stands SerialField will be the exception. Eliminating that exception would probably also fix the migrations without adding a kludge to the migration code for this.

I've just finished mocking up the removal of this restriction to confirm that it works as expected and doesn't break anything else. I'm going to finish cleaning up the code and then push it to my fork for reference as a PoC (if anyone is curious: https://github.com/InvalidInterrupt/django/tree/ticket_27452).

That being said, I don't think this is a big issue; and it shouldn't hold up this PR. If I'm not persuading anyone that we should remove this guardrail and trust the Django users, then we should just move along for now (Sarah's migration issue notwithstanding).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, even AutoField allows you to set a default, even though that means that you'll only be able to save a single instance of the model without specifying an explicit value for the PK

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked and AutoField allows a default too, it will cause integrity errors of course but it allows it. So yes you are raising a valid point, this would be an exception, thank you for pointing that out. It may be worth enabling for consistency's sake. @LilyFoote , @sarahboyce what do you think?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, there's a problem with migrating a SerialField with a default value. It's impossible to do this in Postgres, since serial already implies the field has a default sequence. As far as I can tell, there's no trivial way to add a serial field to a table with an initial default value other than the sequence.

So I think the simplest option would be to disable default after all. Or to not use the built-in serial type when creating a field.

docs/ref/contrib/postgres/fields.txt Outdated Show resolved Hide resolved
@csirmazbendeguz
Copy link
Author

Thanks for the review @InvalidInterrupt , it's much appreciated! 🙏

@@ -75,6 +76,7 @@ def handle_inspection(self, options):
"field names."
)
yield "from %s import models" % self.db_module

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@csirmazbendeguz these are minor clean up changes that will happen on the PR either by you or by a merger

  • we want to limit the changed lines to keep the diff really tight to what's important, occasionally you have added a new line or removed a line, I will probably revert some of this (you can too if you like). This is considered formatting changes
  • you have added @tag("serial") for your tests, this is not how we organise tests generally and needs to be removed
  • the commits need to be squashed. This will likely be 1 commit (or if we can think of a nice way of splitting them we will)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added this for informational purposes 👍

I forgot to say that I have tested this with a local project and tried out a number of things to try and see if I can find any issues (had a look in the admin, reversed migrations, used them in GeneratedFields) and this comment is the only thing I'm concerned about. It looks really good ⭐

@@ -1134,7 +1134,7 @@ def _generate_added_field(self, app_label, model_name, field_name):
preserve_default = (
field.null
or field.has_default()
or field.db_default is not models.NOT_PROVIDED
or field.has_db_default()
or field.many_to_many
or (field.blank and field.empty_strings_allowed)
or (isinstance(field, time_fields) and field.auto_now)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be better instead of overwriting has_db_default to True for serial fields, to do an or isinstance(field, serial_fields) check similar to the time fields check here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Read the SQL, it does have a database default, this makes sense now (sorry)

.has_db_default() can be used in more places (if you search for .db_default is you should see quite a few places)
We can do this refactor in a new branch/PR to be merged in before this PR. This will keep this PR small. Are you happy to make a PR for this change? (adding def has_db_default(self) to Field and using has_db_default everywhere) Does that make sense?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I agree @sarahboyce , I'll open a new PR for this refactor tomorrow. I think this makes sense since there's a has_default function too, this would be the same but for db_default.

Thanks for your support!

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still haven't opened a PR. The reason is I don't think it would help. If has_db_default is true, some code would try to use the db_default, which would raise errors.

I'm looking for alternatives.

@csirmazbendeguz
Copy link
Author

csirmazbendeguz commented May 29, 2024

I realized instead of adding new system checks, I could just raise a ValueError in __init__. This is how GeneratedField does it. WDYT?

I think it's better, because:

  1. it's consistent with how it's done elsewhere
  2. it's stricter, the code will fail immediately, doesn't need to wait for checks to run
  3. it doesn't require adding new checks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants