MDEV-28730 Remove internal parser usage from InnoDB fts #4443
Thirunarayanan wants to merge 2 commits into
Conversation
In addition to the CI failures needing correcting, does this mean …
Great to see the parser going away.
Force-pushed 04ec1e9 to ff6a64d
dr-m left a comment:
Here are some quick initial comments.
b672350 to
53f237a
Compare
53f237a to
edabb01
Compare
dr-m left a comment:
Here are some more comments. The error propagation is better now, but I would like to see more effort to reduce the number of dict_sys.latch acquisitions. This should be tested as well, in a custom benchmark.
Even though we are adding quite a bit of code, I was pleasantly surprised that the size of an x86-64 CMAKE_BUILD_TYPE=RelWithDebInfo executable would increase by only 20 KiB. I believe that removing the InnoDB SQL parser (once some more code has been refactored) would remove more code than that.
    if (UNIV_LIKELY(error == DB_SUCCESS ||
                    error == DB_RECORD_NOT_FOUND))
    {
      fts_sql_commit(trx);
      if (error == DB_RECORD_NOT_FOUND) error = DB_SUCCESS;
What is the reason for committing and re-starting the transaction after each iteration? Is it one transaction per fetched row?
Here, the second if had better be removed. A blind assignment error= DB_SUCCESS should be shorter and incur less overhead. It is basically just zeroing out a register.
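The reviewer's point can be illustrated with a minimal standalone sketch (toy dberr_t values, not InnoDB's): once control is inside the UNIV_LIKELY branch, error is known to be one of two values, so an unconditional error= DB_SUCCESS is equivalent to the guarded assignment and avoids a branch.

```cpp
#include <cassert>

// Toy stand-ins for InnoDB's error codes (hypothetical values).
enum dberr_t { DB_SUCCESS = 0, DB_RECORD_NOT_FOUND = 1, DB_ERROR = 2 };

// Guarded form, as in the patch under review.
dberr_t normalize_guarded(dberr_t error)
{
  if (error == DB_SUCCESS || error == DB_RECORD_NOT_FOUND)
  {
    if (error == DB_RECORD_NOT_FOUND) error = DB_SUCCESS;
  }
  return error;
}

// Blind-assignment form suggested by the reviewer: inside the branch,
// error can only be DB_SUCCESS or DB_RECORD_NOT_FOUND, so assigning
// DB_SUCCESS unconditionally yields the same result with less code.
dberr_t normalize_blind(dberr_t error)
{
  if (error == DB_SUCCESS || error == DB_RECORD_NOT_FOUND)
    error = DB_SUCCESS;
  return error;
}
```

Both forms agree on every input; the second merely drops the redundant comparison.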
It is not a single word. If no starting word is specified, we fetch all words; otherwise, we fetch words starting from the given word. This supports pagination when memory limits are exceeded. There could be many words across all auxiliary tables, and it does not make sense to keep a transaction open for too long.
I would like to address this issue as a separate one, because this matches the fulltext behaviour from before this patch.
Yes, I agree that btr_cur_t iteration is enough to retrieve all the words from the auxiliary table.
    fts_sql_rollback(trx);
    if (error == DB_LOCK_WAIT_TIMEOUT)
    {
      ib::warn() << "Lock wait timeout reading FTS index. Retrying!";
      trx->error_state = DB_SUCCESS;
    }
    else
    {
      ib::error() << "Error occurred while reading FTS index: " << error;
      break;
Which index and table are we reading? Why are we not disclosing the name of the index or the table?
Please, let’s avoid using ib::logger::logger in any new code, and invoke sql_print_error or sql_print_warning directly.
Is this code reachable? How would a lock wait timeout be possible?
Can this ever be a locking read? When and why would it need to be one? After all, as the code stands now, we are committing the transaction (and releasing any locks) after every successful iteration. Hence, there will be no consistency guarantees on the data that we are reading.
"Auxiliary table" in the function comment is inaccurate. Can we be more specific? Is this always reading entries from a partition of an inverted index? Which functions can write these tables? (What are the potential conflicts?)
Do we even need a transaction object here, or would a loop around btr_cur_t suffice?
Replaced ib:: with sql_print_. Yet to address how lock wait can happen.
Force-pushed 9f34b2c to 166235e
dr-m left a comment:
I tried to review the MVCC logic, but I did not fully understand it.
It would be much more convenient to review this if the tables or indexes on which the new code is expected to be invoked were prominently documented in debug assertions.
    mem_heap_t* version_heap= nullptr;
    mem_heap_t* offsets_heap= nullptr;
    rec_offs* offsets= nullptr;
    rec_offs* version_offsets= nullptr;
Why are we not allocating offsets and version_offsets from the stack, instead of forcing memory to be allocated from the heap every single time?
Could we cope with a single mem_heap_t object here? Or none at all? This function is supposed to be run on some tables whose schema we know in advance, right? Can this ever be run on a user-defined table? Unfortunately, in none of the callers of QueryExecutor::process_record_with_mvcc() did I find any assertion on the table names or the schema.
    dberr_t QueryExecutor::process_record_with_mvcc(
      dict_index_t *clust_index, const rec_t *rec,
      RecordCallback &callback, dict_index_t *sec_index,
      const rec_t *sec_rec) noexcept
    {
      ut_ad(m_mtr.trx);
      ut_ad(srv_read_only_mode || m_mtr.trx->read_view.is_open());
We are missing ut_ad(sec_index->is_normal_btree()) and possibly other assertions.
Is this known to be limited to some specific tables or indexes? For example, if the table is not a FTS_ internal table and not mysql.innodb_index_stats or mysql.innodb_table_stats, would we know that the sec_index is FTS_DOC_ID_INDEX(FTS_DOC_ID)? Documenting the intended usage with debug assertions would make it easier to review this and to suggest possible optimization.
Added an assertion in lookup_clustered_record() saying that the secondary index is only FTS_DOC_ID_IDX.
    if (!rec_get_deleted_flag(result_rec,
                              clust_index->table->not_redundant()))
In row_sel_sec_rec_is_for_clust_rec() this condition would be checked for every record version that row_sel_get_clust_rec() or Row_sel_get_clust_rec_for_mysql::operator()() is processing, but here we do it only after fetching the clustered index record version. I think that this should be OK. But, I would have appreciated a source code comment that refers to the other implementation of this logic.
However, there seems to be a bigger problem that here we are checking at most one earlier version, instead of traversing all versions that are visible in the current read view. This might be OK if this function is only going to be invoked on some specific FTS_ tables. But then there should be debug assertions that would document the assumptions about the tables, as well as source code comments that explain why we only check for one older version.
Using Row_sel_get_clust_rec_for_mysql::operator() directly. I thought it better to use the existing function.
Force-pushed 166235e to 8355e6a
dr-m left a comment:
The QueryExecutor bears some similarity with functions that operate on row_prebuilt_t. Could we unify them better? Would it be feasible to construct row_prebuilt_t objects in the FULLTEXT subsystem, so that even higher-level functions such as row_search_mvcc() can be invoked?
    -static void row_sel_reset_old_vers_heap(row_prebuilt_t *prebuilt)
    +static void row_sel_reset_old_vers_heap(mem_heap_t** old_vers_heap)
     {
    -  if (prebuilt->old_vers_heap)
    -    mem_heap_empty(prebuilt->old_vers_heap);
    +  if (*old_vers_heap)
    +    mem_heap_empty(*old_vers_heap);
       else
    -    prebuilt->old_vers_heap= mem_heap_create(200);
    +    *old_vers_heap= mem_heap_create(200);
     }
This change is related to some memory leaks; see for example https://buildbot.mariadb.org/#/builders/1031/builds/3522
CURRENT_TEST: mariabackup.apply-log-only-incr
…
==230009==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 64000 byte(s) in 200 object(s) allocated from:
#0 0x56452b57e2e4 in malloc (/home/buildbot/bld/sql/mariadbd+0x22a92e4) (BuildId: 4c5cec9ca993f3edf8770e1f50461ef96bb740ef)
#1 0x56452d7a5177 in ut_allocator<unsigned char, true>::allocate(unsigned long, unsigned char const*, unsigned int, bool, bool) /home/buildbot/src/storage/innobase/include/ut0new.h:375:11
#2 0x56452d8fe784 in mem_heap_create_block_func(mem_block_info_t*, unsigned long, unsigned long) /home/buildbot/src/storage/innobase/mem/mem0mem.cc:276:37
#3 0x56452daae96b in mem_heap_create_func(unsigned long, unsigned long) /home/buildbot/src/storage/innobase/include/mem0mem.inl:306:10
#4 0x56452daae96b in row_sel_reset_old_vers_heap(mem_block_info_t**) /home/buildbot/src/storage/innobase/row/row0sel.cc:3268:21
#5 0x56452da93291 in row_sel_build_prev_vers_for_mysql(trx_t*, dict_index_t*, unsigned char const*, unsigned short**, mem_block_info_t**, mem_block_info_t*, unsigned char**, dtuple_t**, mtr_t*) /home/buildbot/src/storage/innobase/row/row0sel.cc:3294:2
#6 0x56452da99dfc in row_search_mvcc(unsigned char*, page_cur_mode_t, row_prebuilt_t*, unsigned long, unsigned long) /home/buildbot/src/storage/innobase/row/row0sel.cc:5355:11
#7 0x56452d749096 in ha_innobase::general_fetch(unsigned char*, unsigned int, unsigned int) /home/buildbot/src/storage/innobase/handler/ha_innodb.cc:9229:24
#8 0x56452c8c146f in handler::ha_rnd_next(unsigned char*) /home/buildbot/src/sql/handler.cc:4046:5
#9 0x56452b700eb4 in rr_sequential(READ_RECORD*) /home/buildbot/src/sql/records.cc:509:35
#10 0x56452bc778f3 in READ_RECORD::read_record() /home/buildbot/src/sql/records.h:77:30
#11 0x56452bc778f3 in sub_select(JOIN*, st_join_table*, bool) /home/buildbot/src/sql/sql_select.cc:24592:18
#12 0x56452bcfa10e in do_select(JOIN*, Procedure*) /home/buildbot/src/sql/sql_select.cc:24086:14
#13 0x56452bcf7ac3 in JOIN::exec_inner() /home/buildbot/src/sql/sql_select.cc:5125:50
#14 0x56452bcf4a1b in JOIN::exec() /home/buildbot/src/sql/sql_select.cc:4913:8
#15 0x56452bc7afa5 in mysql_select(THD*, TABLE_LIST*, List<Item>&, Item*, unsigned int, st_order*, st_order*, Item*, st_order*, unsigned long long, select_result*, st_select_lex_unit*, st_select_lex*) /home/buildbot/src/sql/sql_select.cc:5439:21
#16 0x56452bc79ce1 in handle_select(THD*, LEX*, select_result*, unsigned long long) /home/buildbot/src/sql/sql_select.cc:636:10
#17 0x56452bb6d334 in execute_sqlcom_select(THD*, TABLE_LIST*) /home/buildbot/src/sql/sql_parse.cc:6212:12
#18 0x56452bb50a93 in mysql_execute_command(THD*, bool) /home/buildbot/src/sql/sql_parse.cc:3987:12
#19 0x56452bb341a3 in mysql_parse(THD*, char*, unsigned int, Parser_state*) /home/buildbot/src/sql/sql_parse.cc:7940:18
#20 0x56452bb2b056 in dispatch_command(enum_server_command, THD*, char*, unsigned int, bool) /home/buildbot/src/sql/sql_parse.cc:1896:7
#21 0x56452bb36765 in do_command(THD*, bool) /home/buildbot/src/sql/sql_parse.cc:1432:17
#22 0x56452c16bfcc in do_handle_one_connection(CONNECT*, bool) /home/buildbot/src/sql/sql_connect.cc:1503:11
#23 0x56452c16b805 in handle_one_connection /home/buildbot/src/sql/sql_connect.cc:1415:5
#24 0x56452d4912e8 in pfs_spawn_thread /home/buildbot/src/storage/perfschema/pfs.cc:2198:3
#25 0x56452b57bc66 in asan_thread_start(void*) asan_interceptors.cpp.o
    -  row_sel_reset_old_vers_heap(prebuilt);
    +  row_sel_reset_old_vers_heap(&old_vers_heap);
       return row_vers_build_for_consistent_read(
         rec, mtr, clust_index, offsets,
    -    &prebuilt->trx->read_view, offset_heap,
    -    prebuilt->old_vers_heap, old_vers, vrow);
    +    &trx->read_view, offset_heap,
    +    old_vers_heap, old_vers, vrow);
We are leaking the possible old contents of old_vers_heap here.
                      rec_get_offsets(rec, clust_index) */
      mem_heap_t** offset_heap, /*!< in/out: memory heap from which
                      the offsets are allocated */
      mem_heap_t* old_vers_heap,
The purpose and the lifetime of this parameter are not documented.
    dberr_t
    Row_sel_get_clust_rec_for_mysql::operator()(
    /*============================*/
      row_prebuilt_t* prebuilt,/*!< in: prebuilt struct in the handle */
      dtuple_t*     clust_ref,/*!< in: clustered index search tuple */
      btr_pcur_t*   clust_pcur,/*!< in/out: clustered index cursor */
      lock_mode     select_lock_type,/*!< in: lock mode for selection */
      trx_t*        trx,  /*!< in: transaction */
      mem_heap_t*   old_vers_heap,/*!< in/out: memory heap for old versions */
      dict_index_t* sec_index,/*!< in: secondary index where rec resides */
      const rec_t*  rec,  /*!< in: record in a non-clustered index; if
                          this is a locking read, then rec is not
                          allowed to be delete-marked, and that would
                          not make sense either */
      que_thr_t*    thr,  /*!< in: query thread */
      const rec_t** out_rec,/*!< out: clustered record or an old version of
                          it, NULL if the old version did not exist
                          in the read view, i.e., it was a fresh
                          inserted version */
      rec_offs**    offsets,/*!< in: offsets returned by
                          rec_get_offsets(rec, sec_index);
                          out: offsets returned by
                          rec_get_offsets(out_rec, clust_index) */
      mem_heap_t**  offset_heap,/*!< in/out: memory heap from which
                          the offsets are allocated */
      dtuple_t**    vrow, /*!< out: virtual column to fill */
      mtr_t*        mtr)  /*!< in: mtr used to get access to the
                          non-clustered record; the same mtr is used to
                          access the clustered index */
There are already very many parameters passed to this member function call. Adding 4 more parameters will complicate the call setup; I don’t think that any ABI would support passing this number of parameters in registers. Can we move most of the parameters to data members?
Do we need both trx and thr? Is thr->graph->trx not always accessible?
Instead of changing this interface, could we initialize a row_prebuilt_t in the code path that is missing it?
    enum class RecordCompareAction
    {
      /** Do not process this record, continue traversal */
      SKIP,
      /** Process this record via process_record */
      PROCESS,
      /** Stop traversal immediately */
      STOP
    };
PROCESS is the return value of the default argument to RecordCallback as well as the first value that row0query.cc will be checking the return value against. I’d suggest to make its value 0.
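The suggestion can be sketched in isolation (the enumerator names follow the diff above; the explicit value is the reviewer's proposal, not the patch as posted). Giving the most frequently tested enumerator the value 0 means both the default-constructed value and the common comparison reduce to a test against zero:

```cpp
#include <cassert>

// Sketch of the suggested reordering: PROCESS first, so that
// `action == RecordCompareAction::PROCESS` compares against zero and
// a value-initialized RecordCompareAction defaults to PROCESS.
enum class RecordCompareAction
{
  /** Process this record via process_record (default, most common) */
  PROCESS = 0,
  /** Do not process this record, continue traversal */
  SKIP,
  /** Stop traversal immediately */
  STOP
};
```

A comparison against zero typically compiles to a single test instruction, which is why hot-path enum checks often reserve 0 for the common case.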
    if (action == RecordCompareAction::PROCESS)
    {
      dberr_t proc_err= callback->process_record(rec, clust_index, offsets);
      if (proc_err != DB_SUCCESS)
      {
        err= proc_err;
        goto err_exit;
      }
    }
    if (action == RecordCompareAction::SKIP)
    {
      err= DB_RECORD_NOT_FOUND;
      goto err_exit;
    }
Why do we need a separate variable proc_err? We could just assign to err directly.
Should the if (action == RecordCompareAction::SKIP) actually be else, so that it would also cover the STOP value? If not, then it should be else if and we’d need an assertion and a comment that documents what we are supposed to do with the STOP value.
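One way to read the suggested restructure is the following sketch with toy types (dispatch() and ok_process() are illustrative stand-ins, not code from the patch): the actions are mutually exclusive, so else-if chains them, err is assigned directly without a separate proc_err, and the STOP case is made explicit with an assertion and a comment.

```cpp
#include <cassert>

enum class RecordCompareAction { PROCESS = 0, SKIP, STOP };
enum dberr_t { DB_SUCCESS = 0, DB_RECORD_NOT_FOUND = 1 };

// Toy processor standing in for callback->process_record().
static dberr_t ok_process() { return DB_SUCCESS; }

// Restructured dispatch: else-if over mutually exclusive actions,
// assigning to err directly instead of via a temporary proc_err.
dberr_t dispatch(RecordCompareAction action, dberr_t (*process)())
{
  dberr_t err = DB_SUCCESS;
  if (action == RecordCompareAction::PROCESS)
    err = process();                 // assign to err directly
  else if (action == RecordCompareAction::SKIP)
    err = DB_RECORD_NOT_FOUND;       // record filtered out
  else
  {
    assert(action == RecordCompareAction::STOP);
    // STOP: end traversal; here it is documented as a successful end.
  }
  return err;
}
```

The assertion in the final branch documents that only the three enumerators are possible, which is the kind of self-documenting check the review asks for.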
    /** Constructor with processor function and optional comparator
    @param[in] processor Function to process each record
    @param[in] comparator Optional function to filter records (default: accept all) */
    RecordCallback(
      RecordProcessor processor,
      RecordComparator comparator= [](const dtuple_t*, const rec_t*,
                                      const dict_index_t*)
      { return RecordCompareAction::PROCESS; })
      : process_record(std::move(processor)),
        compare_record(std::move(comparator)) {}
We should not use any [in] or [out] in new code. Why do we need std::move()?
As far as I understand, only fts_load_user_stopword() is relying on this default comparator. It exercises only a small part of the new code. Could we default to nullptr and assert in most places that the comparator function pointer has been set?
    /** Called for each matching record */
    RecordProcessor process_record;

    /** Comparison function for custom filtering */
    RecordComparator compare_record;
Why are these not const? They are not supposed to change after construction, are they?
Force-pushed 8355e6a to 734db1e
    if (error == DB_SUCCESS)
    {
      /* There must be some nodes to read. */
      ut_a(ib_vector_size(optim->words) > 0);
Hitting this assertion:
# 2026-05-14T14:04:17 [56807] | InnoDB: Failing assertion: ib_vector_size(optim->words) > 0
Bug raised: https://jira.mariadb.org/browse/MDEV-39616
    noexcept
    {
      ut_ad(dtuple_check_typed(&tuple));
      ut_ad(page_rec_is_leaf(rec));
Found this assertion:
# 2026-05-14T14:34:41 [1226396] | mariadbd: /data/Server/MDEV-28730_new/storage/innobase/page/page0cur.cc:87: int cmp_dtuple_rec_bytes(const rec_t*, const dict_index_t&, const dtuple_t&, int*, ulint): Assertion `page_rec_is_leaf(rec)' failed.
Bug raised: https://jira.mariadb.org/browse/MDEV-39617
Force-pushed 734db1e to 7b5e60b
                        ULINT_UNDEFINED, &m_heap);

    dberr_t err= DB_SUCCESS;
    ulint cmpl_info= UPD_NODE_NO_ORD_CHANGE | UPD_NODE_NO_SIZE_CHANGE;
Crash found during testing
Thread 3 received signal SIGSEGV, Segmentation fault.
[Switching to Thread 3906580.3917832]
thd_storage_lock_wait (thd=0x0, value=11) at /data/Server/MDEV-28730_new/sql/sql_class.cc:451
451 thd->utime_after_lock+= value;
(rr) bt
#0 thd_storage_lock_wait (thd=0x0, value=11) at /data/Server/MDEV-28730_new/sql/sql_class.cc:451
#1 0x000056d4586cb8e3 in lock_sys_t::wait_resume (this=this@entry=0x56d459ee5c80 <lock_sys>, thd=0x0, start=start@entry=..., now=...) at /data/Server/MDEV-28730_new/storage/innobase/lock/lock0lock.cc:2108
#2 0x000056d4586c9128 in lock_wait (thr=thr@entry=0x1276380563c8) at /data/Server/MDEV-28730_new/storage/innobase/lock/lock0lock.cc:2410
#3 0x000056d45879c93d in row_mysql_handle_errors (new_err=new_err@entry=0x799151a0c4a0, trx=trx@entry=0x55aa7560ff00, thr=thr@entry=0x1276380563c8, savept=savept@entry=0x0) at /data/Server/MDEV-28730_new/storage/innobase/row/row0mysql.cc:698
#4 0x000056d4587d308d in row_search_mvcc<InnoDBPolicy<false, false> > (buf=buf@entry=0x799151a0cd90 "H\203>Y\324V", mode=mode@entry=PAGE_CUR_GE, prebuilt=0x127638056468, match_mode=match_mode@entry=0, direction=direction@entry=0)
at /data/Server/MDEV-28730_new/storage/innobase/row/row0sel.cc:6051
#5 0x000056d45893409c in row_search_mvcc_callback_dml (direction=0, match_mode=0, prebuilt=<optimized out>, mode=PAGE_CUR_GE, callback=0x799151a0cd90) at /data/Server/MDEV-28730_new/storage/innobase/include/row0sel.h:202
#6 QueryExecutor::select_for_update (this=this@entry=0x799151a0ce70, table=table@entry=0x12763803fbf8, search_tuple=search_tuple@entry=0x799151a0cd20, callback=callback@entry=0x799151a0cd90)
at /data/Server/MDEV-28730_new/storage/innobase/row/row0query.cc:331
#7 0x000056d458906f5f in FTSQueryExecutor::read_config_with_lock (this=this@entry=0x799151a0ce70, key=key@entry=0x56d458cb4922 "synced_doc_id", callback=...) at /data/Server/MDEV-28730_new/storage/innobase/fts/fts0exec.cc:535
#8 0x000056d4589044bf in fts_config_get_value (executor=executor@entry=0x799151a0ce70, table=table@entry=0x127638103a68, name=name@entry=0x56d458cb4922 "synced_doc_id", value=value@entry=0x799151a0ce50)
at /data/Server/MDEV-28730_new/storage/innobase/fts/fts0config.cc:46
#9 0x000056d4586a3eb3 in i_s_fts_config_fill (thd=0x127638000d58, tables=<optimized out>) at /data/Server/MDEV-28730_new/storage/innobase/handler/i_s.cc:3161
#10 0x000056d45813ed14 in get_schema_tables_result (join=join@entry=0x1276380178a8, executed_place=executed_place@entry=PROCESSED_BY_JOIN_EXEC) at /data/Server/MDEV-28730_new/sql/sql_show.cc:9924
#11 0x000056d4581286d1 in JOIN::exec_inner (this=this@entry=0x1276380178a8) at /data/Server/MDEV-28730_new/sql/sql_select.cc:5086
#12 0x000056d4581288f1 in JOIN::exec (this=this@entry=0x1276380178a8) at /data/Server/MDEV-28730_new/sql/sql_select.cc:4913
#13 0x000056d4581278df in mysql_select (thd=thd@entry=0x127638000d58, tables=0x127638016628, fields=..., conds=0x0, og_num=0, order=<optimized out>, group=0x0, having=0x0, proc_param=0x0, select_options=2701396736, result=0x127638017880,
unit=0x127638005278, select_lex=0x127638015ea8) at /data/Server/MDEV-28730_new/sql/sql_select.cc:5439
#14 0x000056d458127ab4 in handle_select (thd=thd@entry=0x127638000d58, lex=lex@entry=0x127638005198, result=result@entry=0x127638017880, setup_tables_done_option=setup_tables_done_option@entry=0)
at /data/Server/MDEV-28730_new/sql/sql_select.cc:636
#15 0x000056d45809d35e in execute_sqlcom_select (thd=thd@entry=0x127638000d58, all_tables=0x127638016628) at /data/Server/MDEV-28730_new/sql/sql_parse.cc:6214
#16 0x000056d4580a6dd3 in mysql_execute_command (thd=thd@entry=0x127638000d58, is_called_from_prepared_stmt=is_called_from_prepared_stmt@entry=false) at /data/Server/MDEV-28730_new/sql/sql_parse.cc:3988
#17 0x000056d4580ac979 in mysql_parse (thd=thd@entry=0x127638000d58, rawbuf=<optimized out>, length=<optimized out>, parser_state=parser_state@entry=0x799151a0e3c0) at /data/Server/MDEV-28730_new/sql/sql_parse.cc:7942
#18 0x000056d4580adfcd in dispatch_command (command=command@entry=COM_QUERY, thd=thd@entry=0x127638000d58, packet=packet@entry=0x12763800b669 "\nSELECT COUNT(*) FROM INFORMATION_SCHEMA.INNODB_FT_CONFIG /* E_R Thread6 QNO 79 CON_ID 21 */ ",
packet_length=packet_length@entry=93, blocking=blocking@entry=true) at /data/Server/MDEV-28730_new/sql/sql_parse.cc:1898
#19 0x000056d4580af5d2 in do_command (thd=thd@entry=0x127638000d58, blocking=blocking@entry=true) at /data/Server/MDEV-28730_new/sql/sql_parse.cc:1432
#20 0x000056d45820859d in do_handle_one_connection (connect=<optimized out>, connect@entry=0x56d45b7f35b8, put_in_cache=put_in_cache@entry=true) at /data/Server/MDEV-28730_new/sql/sql_connect.cc:1503
#21 0x000056d4582087ba in handle_one_connection (arg=0x56d45b7f35b8) at /data/Server/MDEV-28730_new/sql/sql_connect.cc:1415
#22 0x0000764474ab2a94 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:447
#23 0x0000764474b3fa34 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:100
Bug raised: https://jira.mariadb.org/browse/MDEV-39621
InnoDB FTS performed all reads and DML on its auxiliary,
common and CONFIG tables through the InnoDB internal SQL
graph parser. Replace that path with direct B-tree access
via a new query-executor layer, and delete the parser-era
helpers.
row0query.h, row/row0query.cc - new QueryExecutor:
- General MVCC-aware record traversal and basic DML on the
clustered index. Sits on a btr_pcur and a transaction-owned
mtr; record processing goes through a RecordCallback that
bundles two std::functions:
compare_record() returns SKIP / PROCESS / STOP for a record
process_record() handles each PROCESS-ed (MVCC-visible) row
Public API:
read() scan a clustered index with a search key
read_all() full clustered scan (optional start tuple)
read_by_index() scan a secondary index, fetch the matching
clustered record, deliver it to the callback
insert_record() insert a tuple into the clustered index
delete_record() delete a row identified by tuple
delete_all() delete every row in the clustered index
select_for_update() position+X-lock the matching clustered row
update_record() update the row select_for_update() locked,
falling back to optimistic/pessimistic and
external storage paths as needed
replace_record() upsert: select_for_update()+update_record(),
else insert_record()
lock_table(), handle_wait(), commit_mtr()
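The RecordCallback wiring described above can be sketched in isolation. Everything below uses toy stand-ins (rec_t, dict_index_t, the scan() loop) rather than the real InnoDB types in row0query.h; only the two-std::function shape of the callback follows the commit message:

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Toy stand-ins for InnoDB types; only the callback shape is taken
// from the description of row0query.h above.
struct rec_t { int doc_id; };
struct dict_index_t {};
struct dtuple_t {};
using rec_offs = unsigned short;
enum class RecordCompareAction { PROCESS = 0, SKIP, STOP };
enum dberr_t { DB_SUCCESS = 0 };

using RecordProcessor =
  std::function<dberr_t(const rec_t*, const dict_index_t*, const rec_offs*)>;
using RecordComparator =
  std::function<RecordCompareAction(const dtuple_t*, const rec_t*,
                                    const dict_index_t*)>;

// Bundles the filter and the per-record processor, mirroring the
// compare_record()/process_record() pair in the commit message.
struct RecordCallback
{
  RecordCallback(RecordProcessor processor,
                 RecordComparator comparator =
                   [](const dtuple_t*, const rec_t*, const dict_index_t*)
                   { return RecordCompareAction::PROCESS; })
    : process_record(std::move(processor)),
      compare_record(std::move(comparator)) {}

  RecordProcessor process_record;
  RecordComparator compare_record;
};

// Minimal traversal in the spirit of read(): compare, then process.
dberr_t scan(const std::vector<rec_t>& recs, const dtuple_t& key,
             const dict_index_t& index, RecordCallback& cb)
{
  for (const rec_t& r : recs)
  {
    RecordCompareAction a = cb.compare_record(&key, &r, &index);
    if (a == RecordCompareAction::STOP) break;
    if (a == RecordCompareAction::SKIP) continue;
    dberr_t e = cb.process_record(&r, &index, nullptr);
    if (e != DB_SUCCESS) return e;
  }
  return DB_SUCCESS;
}
```

A caller would install a comparator that filters by word or doc_id and a processor that accumulates results, which is the pattern the FTS readers below (CommonTableReader, ConfigReader, AuxRecordReader) specialise.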
fts0exec.h, fts/fts0exec.cc - new FTSQueryExecutor:
Thin wrapper over QueryExecutor specialised for FTS tables.
Opens and locks the required tables once and exposes typed
helpers keyed by table family.
Auxiliary INDEX_[1..6]:
open_all_aux_tables()
insert_aux_record(aux_index, fts_aux_data_t)
delete_aux_record(aux_index, fts_aux_data_t)
read_aux() range scan from a given word
read_from_range() paginated read that absorbs
DB_FTS_EXCEED_RESULT_CACHE_LIMIT internally
and resumes from the last word seen
Common deletion tables (DELETED, DELETED_CACHE, BEING_DELETED,
BEING_DELETED_CACHE):
open_all_deletion_tables()
insert_common_record(), delete_common_record(),
delete_all_common_records(), read_all_common()
CONFIG table (<key, value>):
open_config_table() / set_config_table()
insert_config_record(), update_config_record() (upsert),
delete_config_record(), read_config_with_lock()
fts_aux_data_t carries the auxiliary row payload.
RecordCallback specialisations live alongside the executor:
CommonTableReader collects doc_ids from common tables that
share the <doc_id> schema.
ConfigReader extracts <key, value> and provides
compare_config_key() for fast key matching.
AuxRecordReader scans auxiliary indexes with an
AuxCompareMode (GREATER_EQUAL / GREATER / LIKE / EQUAL)
driving the comparator; tracks the last word seen so a
paginated scan can resume.
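A sketch of how an AuxCompareMode-driven comparator might filter words follows. This is illustrative only: it uses plain std::string instead of InnoDB records, models LIKE as a prefix match (as in FTS wildcard queries), and compare_word() is a hypothetical helper, not the AuxRecordReader code itself:

```cpp
#include <string>

enum class RecordCompareAction { PROCESS = 0, SKIP, STOP };
enum class AuxCompareMode { GREATER_EQUAL, GREATER, LIKE, EQUAL };

// Decide whether a stored word matches the search word under the
// given mode, assuming words arrive in ascending lexicographic order
// (so STOP can end the scan once the matching range is passed).
RecordCompareAction compare_word(AuxCompareMode mode,
                                 const std::string& search,
                                 const std::string& word)
{
  switch (mode)
  {
  case AuxCompareMode::GREATER_EQUAL:
    return word >= search ? RecordCompareAction::PROCESS
                          : RecordCompareAction::SKIP;
  case AuxCompareMode::GREATER:
    return word > search ? RecordCompareAction::PROCESS
                         : RecordCompareAction::SKIP;
  case AuxCompareMode::LIKE:
    // Prefix match; once past the prefix range, stop the scan.
    if (word.compare(0, search.size(), search) == 0)
      return RecordCompareAction::PROCESS;
    return word > search ? RecordCompareAction::STOP
                         : RecordCompareAction::SKIP;
  case AuxCompareMode::EQUAL:
    if (word == search) return RecordCompareAction::PROCESS;
    return word > search ? RecordCompareAction::STOP
                         : RecordCompareAction::SKIP;
  }
  return RecordCompareAction::SKIP;
}
```

Because the auxiliary index is sorted by word, EQUAL and LIKE can return STOP as soon as the cursor moves past the matching range, which is what lets a paginated scan terminate early.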
fts_query() walks index and common tables via
QueryExecutor::read_by_index() with RecordCallback;
fts_write_node() writes auxiliary rows through
FTSQueryExecutor::insert_aux_record() / delete_aux_record()
with fts_aux_data_t.
fts_optimize_write_word() now goes through the same
insert/delete path.
fts_select_index{,_by_range,_by_hash} return uint8_t (was
ulint) with a simpler control flow.
fts_optimize_table() binds a thd to its transaction whether
invoked from a user thread or the FTS optimize thread.
fts_optimize_t drops its fts_index_table and fts_common_table
fts_table_t fields; fts_query_t drops fts_common_table.
storage/innobase/fts/fts0sql.cc is deleted along with the
commented-out and unreferenced parser-era helpers it held.
dict_sys.latch is now acquired once per fts_sync_table(),
fts_optimize_table() and fts_query() call to open every
auxiliary and common table in one pass, instead of being
re-acquired per table.
dict_acquire_mdl(): the function now supports both MDL_SHARED (the default) and
MDL_EXCLUSIVE via the exclusive template parameter. FTS optimize now acquires
MDL_EXCLUSIVE first and then downgrades to MDL_SHARED_UPGRADABLE to serialize
concurrent fulltext optimizations.
Force-pushed 7b5e60b to a966595
Description
Remove internal parser/SQL-graph usage and migrate FTS paths to QueryExecutor.
Introduced QueryExecutor (row0query.{h,cc}) and FTSQueryExecutor abstractions for
clustered and secondary index scans and DML.
Refactored the fetch/optimize code to use QueryExecutor::read() and read_by_index()
with RecordCallback, replacing the SQL graph flows.
Added CommonTableReader and ConfigReader callbacks for the common/CONFIG tables.
Implemented fts_index_fetch_nodes(trx, index, word, user_arg, FTSRecordProcessor, compare_mode)
and rewrote fts_optimize_write_word() to delete/insert via the executor with fts_aux_data_t.
Removed fts_doc_fetch_by_doc_id() and the FTS_FETCH_DOC_BY_ID_* macros, updating callers to
fts_query_fetch_document().
Tightened the fts_select_index{,_by_range,_by_hash} return type to uint8_t.
Removed fts0sql.cc and eliminated fts_table_t from fts_query_t and fts_optimize_t.
Release Notes
Removed the SQL parser usage from the fulltext subsystem.
How can this PR be tested?
For QA purposes, run RQG testing involving the fulltext subsystem.
Basing the PR against the correct MariaDB version: main branch.