refactor(robot-server): Avoid features that will be removed in SQLAlchemy 2.0 #16926

SyntaxColoring · 2024-11-21T00:36:42Z

Overview

Because of tests added in #16772, we're now seeing some SQLAlchemy deprecation warnings for the first time. They were always there, but now they're being surfaced.

But it turns out that there are very few of them and they're all easy to fix. So let's nip them in the bud now and make our lives easier whenever we get around to updating SQLAlchemy to 2.0.

Test Plan and Hands on Testing

Everything I'm changing should be covered well by automated tests.

Changelog

Avoid executing raw strs as statements. Use higher-level constructs instead, or, if that's not possible, sqlalchemy.text().
Avoid executing statements on a sqlalchemy.engine.Engine. Execute them on an open connection or transaction object instead.
Turn SQLAlchemy "removed in 2.0" warnings into errors, to stop us from adding any more.
Deduplicate the add_column() utility function, which was copied across several migrations.
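
For anyone skimming, here is a minimal sketch of the first two changelog items. It is not code from this PR: the in-memory database and the `demo` table are made up for illustration.

```python
import sqlalchemy

# An in-memory SQLite database, used here only for illustration.
engine = sqlalchemy.create_engine("sqlite://")

# Execute on a transaction opened from the engine, not on the engine itself,
# and wrap raw SQL in sqlalchemy.text() instead of passing a bare str.
with engine.begin() as transaction:
    transaction.execute(
        sqlalchemy.text("CREATE TABLE demo (id INTEGER PRIMARY KEY, name TEXT)")
    )
    # Bound parameters keep values out of the SQL string.
    transaction.execute(
        sqlalchemy.text("INSERT INTO demo (name) VALUES (:name)"),
        {"name": "example"},
    )
    rows = transaction.execute(sqlalchemy.text("SELECT name FROM demo")).all()

names = [row[0] for row in rows]
```

The `engine.begin()` block commits on success and rolls back if an exception escapes it.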

Review requests

None in particular.

This is based on #16697 and will not merge until after that does.

Risk assessment

Low.


sfoster1

This will definitely make our lives easier.

ddcc4 · 2024-11-21T15:52:05Z

robot-server/robot_server/persistence/_migrations/_util.py

+    2. Use SQLAlchemy's `metadata.create_all()` to create an empty table with the new
+       schema, including the new column.
+    3. Copy rows from the old table to the new one, populating the new column
+       however you please.


lol, this is what Alembic is for.

But I wonder if there's a way to use Alembic as a fancy SQL generator for diffs without buying into the full Alembic ecosystem.

> lol, this is what Alembic is for.

That is what we thought, but I looked into it when working on this, and I was underwhelmed. Alembic does implement this dance, and that is appealing. But:

First, like you said, there's a full Alembic ecosystem that we'd have to contend with. In particular, we would probably want to run Alembic "inside" our existing migration system. (It can't completely replace our own migration system, because we need to account for migrating regular files, and Alembic can only help with the schema inside the .db file.) Embedding it like that...seems messy? Like, one hypothetical way for it to work would be if Alembic gave us a standalone util function like alembic.add_all_column_constraints(command_table.c.command_status), and we'd call that from within our existing system, but it doesn't seem like Alembic works like that.

Second, as far as I can tell, Alembic leaves us on our own for the data part of these migrations, e.g. populating the new non-nullable column. They recommend some general patterns in sqlalchemy/alembic#972 (reply in thread) and https://alembic.sqlalchemy.org/en/latest/cookbook.html#data-migrations-general-techniques. Those patterns strike me as their own dances that are only marginally better. Especially if we need to rearchitect to fit into the Alembic ecosystem just for the privilege of using them.

But I could definitely have big misconceptions about all of this. I've never actually used Alembic for real. If you want to make a sketch or proof of concept to show what it would look like, I'd definitely love something better than what we have now.

> But I wonder if there's a way to use Alembic as a fancy SQL generator for diffs without buying into the full Alembic ecosystem.

Yeah. I'm not sure if this is what you're getting at, but there is https://alembic.sqlalchemy.org/en/latest/autogenerate.html, and we might be able to combine that with https://alembic.sqlalchemy.org/en/latest/offline.html. So one option is to run Alembic once on our laptops to autogenerate the ALTER TABLE dance, and then manually integrate that with our data migrations. Is that what you have in mind?
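
For reference, steps 2 and 3 of the docstring quoted at the top of this thread could look roughly like the sketch below in plain SQLAlchemy Core. The table names, columns, and the "unknown" fill value are all hypothetical. This is not the code in _util.py.

```python
import sqlalchemy

# Hypothetical old schema, without the new column.
old_metadata = sqlalchemy.MetaData()
old_table = sqlalchemy.Table(
    "demo_old",
    old_metadata,
    sqlalchemy.Column("id", sqlalchemy.Integer, primary_key=True),
    sqlalchemy.Column("payload", sqlalchemy.String),
)

# Hypothetical new schema, with a non-nullable "status" column added.
new_metadata = sqlalchemy.MetaData()
new_table = sqlalchemy.Table(
    "demo_new",
    new_metadata,
    sqlalchemy.Column("id", sqlalchemy.Integer, primary_key=True),
    sqlalchemy.Column("payload", sqlalchemy.String),
    sqlalchemy.Column("status", sqlalchemy.String, nullable=False),
)

engine = sqlalchemy.create_engine("sqlite://")  # in-memory, for illustration

with engine.begin() as transaction:
    # Set up some pre-migration data.
    old_metadata.create_all(transaction)
    transaction.execute(
        sqlalchemy.insert(old_table),
        [{"id": 1, "payload": "a"}, {"id": 2, "payload": "b"}],
    )

    # Step 2: create an empty table with the new schema.
    new_metadata.create_all(transaction)

    # Step 3: copy rows across, populating the new column as we go.
    old_rows = transaction.execute(sqlalchemy.select(old_table)).all()
    for old_row in old_rows:
        transaction.execute(
            sqlalchemy.insert(new_table).values(
                id=old_row.id, payload=old_row.payload, status="unknown"
            )
        )

    migrated = [
        tuple(row)
        for row in transaction.execute(
            sqlalchemy.select(new_table).order_by(new_table.c.id)
        )
    ]
```

This pulls every row through Python, which is simple but slow for large tables.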

ddcc4 · 2024-11-21T15:52:57Z

robot-server/robot_server/persistence/_migrations/_util.py

+    with engine.begin() as transaction:
+        transaction.execute(
+            sqlalchemy.text(
+                f"ALTER TABLE {table_name} ADD COLUMN {column.key} {column_type}"


Do we use schemas here? Most of the SQLAlchemy code I've seen before passes around a tuple of (schema, table_name) for functions like this. Or is the schema included in the table_name?

I'm not 100% sure this is what you mean by schema in this context, but we represent the SQL schema in terms of Python objects with a sqlalchemy.MetaData. That lives in, e.g., robot_server.persistence.tables.schema_8.

What would passing (schema, table_name) do?

Oh sorry, the word "schema" is way too overloaded in the programming world.

For organizing big projects, databases like PostgreSQL let you create folders/directories/namespaces/whatever-you-want-to-call-them to divide up your data. PostgreSQL calls these folders "schemas" (which have nothing to do with the colloquial use of "schema" to refer to the shape of a table). And tables, enum definitions, server-side functions, etc., all live inside a schema.

So to refer to a table in PostgreSQL, you would need its fully qualified name, like:
SELECT something FROM myschema.mytable WHERE ...

The practical upshot is that in the code I worked on, whenever you pass a table name around, you would also need to pass the schema name around. https://docs.sqlalchemy.org/en/20/core/metadata.html#specifying-the-schema-name

Aaaaah, gotcha, thank you.

No, we don't use that kind of schema. Our .db file has only one, implicit, "main" schema. In SQLite, the myschema.mytable syntax is used for when you're opening multiple .db files in the same connection.

> In SQLite, the myschema.mytable syntax is used for when you're opening multiple .db files in the same connection.

Oh neat! Then as a side note, I think that would let us solve the TODO in your copy_rows_unmodified() (where you wanted to avoid pulling the whole DB into Python/SQLAlchemy and then writing it back out). You could open both the source and destination tables in the same connection, then do INSERT INTO new_table SELECT * FROM old_table, and have the copy be done entirely inside the SQL engine.
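
A rough sketch of that idea, using the standard-library sqlite3 module and SQLite's ATTACH DATABASE. The function name, file names, and `demo` table are hypothetical, and it assumes the table already exists with the same columns in both files. It has not been tried against the real copy_rows_unmodified().

```python
import os
import sqlite3
import tempfile


def copy_table_in_sqlite(source_path: str, dest_path: str, table_name: str) -> int:
    """Copy all rows of table_name from one .db file into another, inside SQLite."""
    connection = sqlite3.connect(dest_path)
    try:
        # Make the source file visible under the schema name "source_db".
        connection.execute("ATTACH DATABASE ? AS source_db", (source_path,))
        # table_name is interpolated into the SQL, so it must come from trusted code.
        cursor = connection.execute(
            f"INSERT INTO main.{table_name} SELECT * FROM source_db.{table_name}"
        )
        connection.commit()
        return cursor.rowcount
    finally:
        connection.close()


def _make_db(path, rows):
    """Create a small demo database file (illustration only)."""
    connection = sqlite3.connect(path)
    try:
        connection.execute("CREATE TABLE demo (id INTEGER PRIMARY KEY, name TEXT)")
        connection.executemany("INSERT INTO demo VALUES (?, ?)", rows)
        connection.commit()
    finally:
        connection.close()


with tempfile.TemporaryDirectory() as temp_dir:
    source_path = os.path.join(temp_dir, "old.db")
    dest_path = os.path.join(temp_dir, "new.db")
    _make_db(source_path, [(1, "a"), (2, "b")])
    _make_db(dest_path, [])

    copied_count = copy_table_in_sqlite(source_path, dest_path, "demo")

    check = sqlite3.connect(dest_path)
    copied_rows = check.execute("SELECT id, name FROM demo ORDER BY id").fetchall()
    check.close()
```

The rows never pass through Python here. SQLite reads from the attached file and writes to the main one directly.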

ddcc4 · 2024-11-21T21:42:20Z

robot-server/robot_server/persistence/_migrations/_util.py

+            sqlalchemy.text(
+                f"ALTER TABLE {table_name} ADD COLUMN {column.key} {column_type}"
+            )
+        )


BTW, if you want to try something cute, folks online seem to recommend constructing the statement like this:

compiler = engine.dialect.ddl_compiler(engine.dialect, None)
column_specification = compiler.get_column_specification(column)
...execute(f"ALTER TABLE {table_name} ADD COLUMN {column_specification}")

to have SQLAlchemy's compiler generate the column definition for you.
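
Fleshed out into something runnable, that suggestion might look like the sketch below. The `demo` table and `status` column are hypothetical. Constructing the DDL compiler with a `None` statement is a recipe from online answers, not a documented public API, so treat it as an assumption that could break between SQLAlchemy versions.

```python
import sqlalchemy

engine = sqlalchemy.create_engine("sqlite://")  # in-memory, for illustration

metadata = sqlalchemy.MetaData()
demo_table = sqlalchemy.Table(
    "demo",
    metadata,
    sqlalchemy.Column("id", sqlalchemy.Integer, primary_key=True),
)

# Hypothetical column to add to the existing table.
new_column = sqlalchemy.Column("status", sqlalchemy.String, nullable=True)

# Ask the dialect's DDL compiler for the "name TYPE [constraints]" fragment.
compiler = engine.dialect.ddl_compiler(engine.dialect, None)
column_specification = compiler.get_column_specification(new_column)

with engine.begin() as transaction:
    metadata.create_all(transaction)
    transaction.execute(
        sqlalchemy.text(
            f"ALTER TABLE {demo_table.name} ADD COLUMN {column_specification}"
        )
    )
    # PRAGMA table_info returns one row per column; index 1 is the column name.
    column_names = [
        row[1]
        for row in transaction.execute(sqlalchemy.text("PRAGMA table_info(demo)"))
    ]
```

The upside over formatting the type by hand is that constraints such as NOT NULL come along with the generated specification.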

SyntaxColoring added 3 commits November 20, 2024 19:28

Treat SQLAlchemy deprecation warnings as errors.

cf88696

Deduplicate add_column() and avoid deprecated features.

afbe61e

- Execute on a transaction or connection, not the raw engine.
- Use sqlalchemy.text(), not a raw string.

Deduplicate index declaration and avoid executing raw string.

f6b61b7

SyntaxColoring requested review from a team November 21, 2024 00:36

SyntaxColoring requested review from a team as code owners November 21, 2024 00:36

sfoster1 approved these changes Nov 21, 2024


ddcc4 approved these changes Nov 21, 2024


We love to lint.

aa720f8

ddcc4 reviewed Nov 21, 2024
