Introduce statement snapshot type for Rdb_transaction #1497
base: fb-mysql-8.0.32
@luqun has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
Could you take a look at the main.subselect_debug 'rocksdb_intrinsic_table' MTR? The following assert failed: storage/rocksdb/ha_rocksdb.cc:4976: std::unique_ptr<rocksdb::Iterator> myrocks::Rdb_transaction::get_iterator(rocksdb::ColumnFamilyHandle &, bool, const rocksdb::Slice &, const rocksdb::Slice &, myrocks::TABLE_TYPE, bool, bool): Assertion `statement_snapshot_type != snapshot_type::NONE || create_snapshot' failed.
ce8893a to ff22bb6
@laurynas-biveinis has updated the pull request. You must reimport the pull request before landing.
@luqun, fixed: the assert did not accept temp tables. Also rebased; ready for review.
@luqun has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
ff22bb6 to 9c9cf12
@laurynas-biveinis has updated the pull request. You must reimport the pull request before landing.
@luqun has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
// This is used by transactions started with "START TRANSACTION WITH
// CONSISTENT [ROCKSDB] SNAPSHOT". The snapshot has to be created via
// DB::GetSnapshot(), not via Transaction API.
READ_ONLY_TRX,
From rocksdb_explicit_snapshot/create_explicit_snapshot, it looks like the EXPLICIT snapshot is also created via DB::GetSnapshot(). Maybe add a comment for EXPLICIT too.
API.
*/
bool is_tx_read_only() const { return m_tx_read_only; }
[[nodiscard]] bool is_tx_read_only() const noexcept {
How about adding _snapshot to the method name?
It is the whole transaction that is read only
@@ -15163,7 +15362,8 @@ int ha_rocksdb::external_lock(THD *const thd, int lock_type) {
DBUG_RETURN(HA_ERR_UNSUPPORTED);
}

if (thd->get_explicit_snapshot()) {
if (unlikely(tx->has_explicit_snapshot() ||
Why do we need to check both (tx and thd)?
The state lifetimes don't fully match but only overlap. For instance,
CREATE EXPLICIT ROCKSDB SNAPSHOT;
INSERT INTO T1 VALUES(); # tx state not explicit, thd flag = true
And the opposite case:
START TRANSACTION WITH SHARED ROCKSDB SNAPSHOT;
INSERT INTO T1 VALUES(); # tx state explicit, thd flag = false
The tx state is set during transaction execution. The thd snapshot is set on
CREATE/ATTACH ROCKSDB SNAPSHOT, when there is no tx object.
I'm adding a clarifying comment.
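The overlap described above can be illustrated with a minimal sketch. It assumes the truncated condition in the diff ORs the transaction-level predicate with the session-level one; `Session_sketch`, `Tx_sketch`, and `blocked_by_explicit_snapshot` are hypothetical stand-ins, not the actual THD or Rdb_transaction code.

```cpp
#include <cassert>

// Hypothetical stand-in for THD: the flag is set by CREATE/ATTACH ... SNAPSHOT,
// possibly before any transaction object exists.
struct Session_sketch {
  bool explicit_snapshot_attached = false;
  bool get_explicit_snapshot() const { return explicit_snapshot_attached; }
};

// Hypothetical stand-in for Rdb_transaction: the state is set while the
// transaction executes, e.g. by START TRANSACTION WITH SHARED ... SNAPSHOT.
struct Tx_sketch {
  bool explicit_state = false;
  bool has_explicit_snapshot() const { return explicit_state; }
};

// Because the two lifetimes only overlap, either source alone must be enough
// to trigger the check (assumed here to guard a write attempt).
bool blocked_by_explicit_snapshot(const Tx_sketch &tx,
                                  const Session_sketch &thd) {
  return tx.has_explicit_snapshot() || thd.get_explicit_snapshot();
}
```

Checking only one of the two sources would miss one of the two statement sequences quoted in the reply.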
9c9cf12 to e507729
@laurynas-biveinis has updated the pull request. You must reimport the pull request before landing.
@luqun, ready for review.
@luqun has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
Overall looks good to me.
storage/rocksdb/ha_rocksdb.cc
Outdated
case snapshot_type::NONE:
  assert(m_explicit_snapshot == nullptr);
  assert(m_read_opts[USER_TABLE].snapshot == nullptr);
  break;
case snapshot_type::CURRENT:
  assert(m_explicit_snapshot == nullptr);
  assert(m_read_opts[USER_TABLE].snapshot != nullptr);
  break;
case snapshot_type::CURRENT_DELAYED:
  assert(m_explicit_snapshot == nullptr);
  assert(m_read_opts[USER_TABLE].snapshot == nullptr);
  break;
Since these are all the same, it might be simpler to combine them. Logically, it means that even though these are slightly different snapshot cases, they follow the same invariants once the snapshot is allocated.
Will do (but only for the 1st and the 3rd: the 2nd has the 2nd condition check inverted)
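A minimal sketch of the merged form, for illustration only: the enumerator names come from the diff, but their order, the `invariant_holds` helper, and the raw-pointer parameters are assumptions. Only the three quoted cases are checked; the invariants for READ_ONLY_TRX and EXPLICIT are not shown in the quoted fragment, so they are left unchecked here.

```cpp
#include <cassert>

// Enumerator names are taken from the diff; their order is an assumption.
enum class snapshot_type { NONE, CURRENT, CURRENT_DELAYED, READ_ONLY_TRX, EXPLICIT };

// Hypothetical helper: explicit_snapshot stands in for m_explicit_snapshot,
// read_snapshot for m_read_opts[USER_TABLE].snapshot.
bool invariant_holds(snapshot_type type, const void *explicit_snapshot,
                     const void *read_snapshot) {
  switch (type) {
    // The 1st and 3rd cases share both conditions, so they can be merged.
    case snapshot_type::NONE:
    case snapshot_type::CURRENT_DELAYED:
      return explicit_snapshot == nullptr && read_snapshot == nullptr;
    // The 2nd case has the second condition inverted and stays separate.
    case snapshot_type::CURRENT:
      return explicit_snapshot == nullptr && read_snapshot != nullptr;
    // Not part of the quoted fragment; no invariant is assumed for these.
    case snapshot_type::READ_ONLY_TRX:
    case snapshot_type::EXPLICIT:
      return true;
  }
  return false;
}
```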
storage/rocksdb/ha_rocksdb.cc
Outdated
switch (statement_snapshot_type) {
  case snapshot_type::CURRENT:
    assert(m_read_opts[USER_TABLE].snapshot == nullptr);
    break;
  case snapshot_type::CURRENT_DELAYED:
    assert(m_read_opts[USER_TABLE].snapshot == nullptr);
    statement_snapshot_type = snapshot_type::CURRENT;
    break;
  case snapshot_type::READ_ONLY_TRX:
    assert(m_read_opts[USER_TABLE].snapshot == nullptr);
    break;
  case snapshot_type::EXPLICIT:
    assert(snapshot ==
           m_explicit_snapshot->get_snapshot()->snapshot());
    break;
  case snapshot_type::NONE:
    assert(false);
    __builtin_unreachable();
    break;
}
Maybe have this whole block be under NDEBUG, and pull out the case for changing the snapshot type from CURRENT_DELAYED to CURRENT?
OK, also merging the case branches like for the previous comment
And removing __builtin_unreachable() because it no longer has effect in debug-only code.
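The outcome of this thread might look roughly like the sketch below: the checks become debug-only (`#ifndef NDEBUG`), the three branches with the same assertion are merged, and the CURRENT_DELAYED to CURRENT transition is pulled out so that it also runs in release builds. `Tx_sketch`, its raw-pointer members, the final assignment, and the choice of `assign_snapshot` as the enclosing method are assumptions made for illustration, not the merged code.

```cpp
#include <cassert>

// Enumerator names are taken from the diff; their order is an assumption.
enum class snapshot_type { NONE, CURRENT, CURRENT_DELAYED, READ_ONLY_TRX, EXPLICIT };

struct Tx_sketch {
  snapshot_type statement_snapshot_type = snapshot_type::NONE;
  const void *read_snapshot = nullptr;      // stands in for m_read_opts[USER_TABLE].snapshot
  const void *explicit_snapshot = nullptr;  // stands in for the explicit snapshot pointer

  void assign_snapshot(const void *snapshot) {
#ifndef NDEBUG
    // Debug-only invariant checks with merged case branches.
    switch (statement_snapshot_type) {
      case snapshot_type::CURRENT:
      case snapshot_type::CURRENT_DELAYED:
      case snapshot_type::READ_ONLY_TRX:
        assert(read_snapshot == nullptr);
        break;
      case snapshot_type::EXPLICIT:
        assert(snapshot == explicit_snapshot);
        break;
      case snapshot_type::NONE:
        assert(false);
        break;
    }
#endif
    // The state transition must not live inside the debug-only block.
    if (statement_snapshot_type == snapshot_type::CURRENT_DELAYED)
      statement_snapshot_type = snapshot_type::CURRENT;
    read_snapshot = snapshot;  // assumed effect of assigning the snapshot
  }
};
```

Keeping the transition outside the preprocessor guard is the point of the suggestion: if it stayed inside the switch, release builds would silently skip it.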
e507729 to fe22e56
@laurynas-biveinis has updated the pull request. You must reimport the pull request before landing.
Ready for review again.
@luqun has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
rocksdb_rpl.rpl_skip_trx_api_binlog_format MTR failed with
@luqun, I am unable to reproduce. Do you have more details?
Previously, the snapshot state was tracked by a combination of flags
(m_is_delayed_snapshot, m_tx_read_only) and by certain snapshot pointers
(m_read_opts[USER_TABLE].snapshot, m_explicit_snapshot) being null or non-null.
The transaction state specified by different combinations of the above was not
immediately obvious, and the purpose of the read-only flag was not the RO-ness
as such but rather using the DB API instead of the Transaction API to acquire
snapshots.

Consolidate the states into an enum class snapshot_type { ... }
statement_snapshot_type field, consuming the two flags above. Introduce and
assert an invariant across the state values and the snapshot pointers being null
or non-null. Add to the invariant of AC-NL-RO-RC transactions. Rename
snapshot_created to assign_snapshot to better reflect what the method does (it
does not, in fact, always take a newly-created snapshot, e.g. in the case of an
existing explicit snapshot). Make share_explicit_snapshot and
create_explicit_snapshot set the m_read_opts snapshot and remove the redundant
code from the callers. For the latter, make it create the snapshot instead of
taking the m_read_opts one.
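As a rough illustration of what "consuming the two flags" could mean, the sketch below derives the former flag queries from the single enum field. The enumerator names and the `is_tx_read_only` signature appear in the diff; the class name, the method bodies, the enumerator order, and the `is_delayed_snapshot` accessor are assumptions, not the merged code.

```cpp
#include <cassert>

// Enumerator names are taken from the diff; their order is an assumption.
enum class snapshot_type { NONE, CURRENT, CURRENT_DELAYED, READ_ONLY_TRX, EXPLICIT };

// Hypothetical reduction of Rdb_transaction to its snapshot state.
class Tx_state_sketch {
 public:
  explicit Tx_state_sketch(snapshot_type type) noexcept
      : statement_snapshot_type(type) {}

  // Replaces a read of the former m_tx_read_only flag (body assumed).
  [[nodiscard]] bool is_tx_read_only() const noexcept {
    return statement_snapshot_type == snapshot_type::READ_ONLY_TRX;
  }

  // Replaces a read of the former m_is_delayed_snapshot flag (hypothetical).
  [[nodiscard]] bool is_delayed_snapshot() const noexcept {
    return statement_snapshot_type == snapshot_type::CURRENT_DELAYED;
  }

 private:
  snapshot_type statement_snapshot_type;
};
```

With a single field, contradictory combinations such as "delayed and read-only" cannot be represented at all, which is the main benefit over two independent booleans.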

Two of the possible states are EXPLICIT and READ_ONLY_TRX, and the existing code
implicitly had another state that was the combination of the two. Avoid
introducing such a new state, and use EXPLICIT for it too. This results in a
minor user-visible change: START TRANSACTION WITH SHARED|EXISTING SNAPSHOT
transactions are now grouped with CREATE|ATTACH EXPLICIT SNAPSHOT ones instead
of START TRANSACTION WITH CONSISTENT SNAPSHOT ones for the purposes of the
INFORMATION_SCHEMA read-only flag and diagnostics on write attempts. Update the
diagnostic messages to reflect this.