I was wondering if it would be possible to optimize how PGSync updates Elasticsearch records. I have put two separate ideas into the dropdown menus below. Please let me know what you think.
Only Update When Specified Columns Changed
I notice that when a column that is not specified in schema.json updates, it still triggers a re-index.
Example
Given the following schema for table my_table
# Table: my_table
# ---------------------------------------------------------------------------------------------------------
# Columns:
# id  | integer | PRIMARY KEY GENERATED BY DEFAULT AS IDENTITY
# foo | text    |
# bar | text    |
# ---------------------------------------------------------------------------------------------------------
And a schema.json that indexes my_table into my_index without listing column bar.
What is happening?
my_index is getting updated when column bar updates.
What do I expect to happen instead?
my_index would not be updated when bar updates, because it is not specified in schema.json.
Batch Updates Together
I notice that when a record updates multiple times, it gets re-indexed the same number of times it was updated.
Example
Given the schema for table my_table shown above and a schema.json that indexes it into my_index, I perform one query that modifies a row, followed by another query on the same row (before pgsync runs again):
UPDATE my_table
SET foo = 'world'
WHERE id = 1;
What is happening?
2 insert requests are sent to my_index the next time pgsync runs.
What do I expect to happen instead?
1 insert request is sent to my_index the next time pgsync runs. Because we are performing full document updates in the index, there is no reason to send the same document to Elasticsearch twice.
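To make both requests concrete, here is a rough sketch of the pre-processing I have in mind. It is not PGSync's actual implementation: ChangeEvent, rows_to_reindex, and the tracked mapping are hypothetical names, and it assumes each change event carries the old and new row values, which may not match the payload PGSync really receives.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable, List, Set, Tuple


@dataclass(frozen=True)
class ChangeEvent:
    """Hypothetical change event; not PGSync's real payload format."""
    table: str
    pk: Any
    old: Dict[str, Any]
    new: Dict[str, Any]


def touches_tracked_columns(event: ChangeEvent, columns: Set[str]) -> bool:
    """True if at least one tracked column differs between old and new."""
    return any(event.old.get(c) != event.new.get(c) for c in columns)


def rows_to_reindex(
    events: Iterable[ChangeEvent],
    tracked: Dict[str, Set[str]],
) -> List[Tuple[str, Any]]:
    """Return each (table, pk) at most once, skipping untracked-only changes."""
    pending: Dict[Tuple[str, Any], None] = {}
    for event in events:
        columns = tracked.get(event.table, set())
        if not touches_tracked_columns(event, columns):
            continue  # idea 1: only columns outside schema.json changed
        pending[(event.table, event.pk)] = None  # idea 2: one entry per row
    return list(pending)


# Assumed mapping of table -> columns listed in schema.json (bar is not listed).
tracked = {"my_table": {"id", "foo"}}

events = [
    # Only bar changed on row 1: should not trigger a re-index.
    ChangeEvent("my_table", 1, {"id": 1, "foo": "hello", "bar": "x"},
                {"id": 1, "foo": "hello", "bar": "y"}),
    # Row 2 changed twice before the next run: should be re-indexed once.
    ChangeEvent("my_table", 2, {"id": 2, "foo": "a", "bar": "x"},
                {"id": 2, "foo": "b", "bar": "x"}),
    ChangeEvent("my_table", 2, {"id": 2, "foo": "b", "bar": "x"},
                {"id": 2, "foo": "world", "bar": "x"}),
]

result = rows_to_reindex(events, tracked)
```

With these three events, only row 2 would be queued, and only once. Since the index receives full documents, the sketch keeps just the row key and leaves it to the sync step to read the current row from Postgres.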