
Kafka Connect: Commit coordination #10351

Open · bryanck wants to merge 5 commits into base: main

Conversation

@bryanck (Contributor) commented May 18, 2024:

This PR is the next stage in submitting the Kafka Connect Iceberg sink connector, and is a follow-up to #8701, #9466, and #9641. It includes the commit coordinator and related tests.

The integration tests, distribution build, and docs are still not included for the sink; they will be added in follow-up PRs. For reference, the current sink implementation can be found at https://github.com/tabular-io/iceberg-kafka-connect, and existing docs at https://github.com/tabular-io/iceberg-kafka-connect/tree/main/docs.

@@ -45,7 +45,7 @@ private EventTestUtil() {}
       new Schema(ImmutableList.of(Types.NestedField.required(1, "id", Types.LongType.get())));

   static final PartitionSpec SPEC =
-      PartitionSpec.builderFor(SCHEMA).identity("id").withSpecId(1).build();
+      PartitionSpec.builderFor(SCHEMA).identity("id").withSpecId(0).build();
Member:

nit: default is 0, so no need to explicitly set it.

bryanck (Contributor Author):

Thanks, I changed this.

}

// use reflection here to avoid requiring Hadoop as a dependency
private static Object loadHadoopConfig(IcebergSinkConfig config) {
Member:

Should this be moved to org.apache.iceberg.CatalogUtil so that Java API users can also use it if they have a Hadoop conf directory?

The method could accept a String hadoopConfDir and config.hadoopProps() instead of the sink config; in that case this class wouldn't be needed.
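
For illustration, a rough sketch of the suggested utility shape (hypothetical placement and signature, not part of the PR; it keeps Hadoop an optional dependency by loading Configuration reflectively):

```java
import java.lang.reflect.Method;
import java.net.URL;
import java.nio.file.Paths;
import java.util.Map;

// Hypothetical CatalogUtil-style helper along the lines suggested above, not the actual PR code.
public static Object loadHadoopConfig(String hadoopConfDir, Map<String, String> hadoopProps) {
  try {
    Class<?> confClass = Class.forName("org.apache.hadoop.conf.Configuration");
    Object conf = confClass.getDeclaredConstructor().newInstance();

    if (hadoopConfDir != null) {
      Method addResource = confClass.getMethod("addResource", URL.class);
      addResource.invoke(conf, Paths.get(hadoopConfDir, "core-site.xml").toUri().toURL());
      addResource.invoke(conf, Paths.get(hadoopConfDir, "hdfs-site.xml").toUri().toURL());
    }

    Method set = confClass.getMethod("set", String.class, String.class);
    for (Map.Entry<String, String> entry : hadoopProps.entrySet()) {
      set.invoke(conf, entry.getKey(), entry.getValue());
    }
    return conf;
  } catch (Exception e) {
    throw new RuntimeException("Failed to load Hadoop Configuration via reflection", e);
  }
}
```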

bryanck (Contributor Author):

I agree, that could be useful, though it may require some discussion to get right, so I'd rather start with this here.

import org.apache.iceberg.connect.channel.CommitterImpl;

public class CommitterFactory {
public static Committer createCommitter(IcebergSinkConfig config) {
Member:

Is the config parameter needed?

bryanck (Contributor Author):

It isn't needed now, but possibly in the future, to indicate the type of committer to create. The API is public so I designed it with that in mind.

@@ -80,7 +80,6 @@ public class IcebergSinkConfig extends AbstractConfig {
   private static final String TABLES_SCHEMA_CASE_INSENSITIVE_PROP =
       "iceberg.tables.schema-case-insensitive";
   private static final String CONTROL_TOPIC_PROP = "iceberg.control.topic";
-  private static final String CONTROL_GROUP_ID_PROP = "iceberg.control.group-id";
Member:

Why was this removed?

bryanck (Contributor Author):

We use the connect consumer group now, rather than a separate consumer group that we keep in sync.

public void put(Collection<SinkRecord> sinkRecords) {
  if (committer != null) {
    committer.save(sinkRecords);
  }
Member:

Should this throw an exception when committer is null? Otherwise the producer will assume the records have been put.

bryanck (Contributor Author) commented Jun 9, 2024:

This should never happen, so I changed this to a precondition check. I'll revisit this if needed when I add the integration tests.
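
A minimal sketch of what such a precondition check could look like (illustrative only; it assumes Iceberg's relocated Guava Preconditions and a hypothetical error message):

```java
import org.apache.iceberg.relocated.com.google.common.base.Preconditions;

// Illustrative sketch, not necessarily the exact PR code.
@Override
public void put(Collection<SinkRecord> sinkRecords) {
  Preconditions.checkNotNull(committer, "Committer has not been initialized, open() must be called first");
  committer.save(sinkRecords);
}
```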

.filter(distinctByKey(dataFile -> dataFile.path().toString()))
.collect(Collectors.toList());

List<DeleteFile> deleteFiles =
Member:

Since we are only supporting append, do we need this code?

Member:

Similar comment for the RowDelta block below.

bryanck (Contributor Author):

I wanted to keep the coordinator capable of handling delete files, so when we do add in delta support, it should be fairly straightforward, and won't require changes to the control message data model.

}

@Test
public void testCommitDelta() {
Member:

Maybe the delete file tests can be added when the delete writers feature is added.

bryanck (Contributor Author):

See my comment above, I'd prefer to add this into the coordinator now, rather than later.

@ajantha-bhat added this to the Iceberg 1.6.0 milestone on May 21, 2024
Comment on lines 87 to 90
@Override
public void flush(Map<TopicPartition, OffsetAndMetadata> currentOffsets) {
  committer.save(null);
}
Contributor:

Why are we overriding the flush method when we don't have any flush-specific code path in Committer?

The put method will be called on a regular basis (potentially with an empty collection of sink records) so this feels redundant.

Also, I'm fairly certain that this flush method will never actually be called since we are overriding the preCommit method. The default preCommit implementation is the only place where flush is called by the Kafka Connect runtime. Unless you call flush yourself in the preCommit method you've defined (or anywhere else), flush will never actually be called.

Overall, I would recommend you just omit this flush method definition from this class.
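
For context, the default SinkTask.preCommit in the Kafka Connect runtime does roughly the following (paraphrased, not a verbatim copy), which is why flush stops being invoked once preCommit is overridden without calling it:

```java
// Roughly the default behavior: flush, then hand back the current offsets to be committed.
public Map<TopicPartition, OffsetAndMetadata> preCommit(Map<TopicPartition, OffsetAndMetadata> currentOffsets) {
  flush(currentOffsets);
  return currentOffsets;
}
```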

bryanck (Contributor Author):

This is here in case future committer implementations want to perform an action on flush. The current committer doesn't do anything when the record collection is null.

Comment on lines +57 to +78
@Override
public void close(Collection<TopicPartition> partitions) {
  close();
}

private void close() {
  if (committer != null) {
    committer.stop();
    committer = null;
  }

  if (catalog != null) {
    if (catalog instanceof AutoCloseable) {
      try {
        ((AutoCloseable) catalog).close();
      } catch (Exception e) {
        LOG.warn("An error occurred closing catalog instance, ignoring...", e);
      }
    }
    catalog = null;
  }
}
Contributor:

nit: can you move these close methods so they're after preCommit but before stop? Just so these methods are arranged in life-cycle order.

bryanck (Contributor Author):

I'm not sure I follow your suggestion. When the KC close() or stop() lifecycle methods are called, we close the committer and the catalog.


@Override
public void open(Collection<TopicPartition> partitions) {
catalog = CatalogUtils.loadCatalog(config);
Contributor:

I'm not 100% sure about this, but IIRC from past experience, Kafka Connect doesn't really guarantee that close will be called before open if a SinkTask instance is reused; this can lead to resource leaks (or worse). So I usually defensively call close myself in the open method before opening new resources (the catalog and committer in this case).

Note: I can't find any documentation/issues to explicitly support this right now but looking at how things are implemented in the Kafka Connect runtime using the ConsumerRebalanceListener API I think I can see at least one way where this would be possible (open called without/before close).

bryanck (Contributor Author):

Thanks, I added a precondition check here to be safer.
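
A minimal sketch of the kind of guard being discussed (illustrative only; committer start-up calls are elided and the message text is hypothetical):

```java
// Illustrative sketch, not the actual PR code.
@Override
public void open(Collection<TopicPartition> partitions) {
  // Fail fast if the runtime reuses this task instance without a prior close().
  Preconditions.checkState(catalog == null, "Catalog already open, close() was not called");

  catalog = CatalogUtils.loadCatalog(config);
  // ... create and start the committer here ...
}
```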


@Override
public void open(Collection<TopicPartition> partitions) {
catalog = CatalogUtils.loadCatalog(config);
Contributor:

The open method is called with only the newly assigned partitions. Is there a strong reason to pass just the newly assigned partitions to the Committer.start method when the Committer can just retrieve all partitions assigned to this task via context.assignment anyway?

I'm also worried we might have a bug here. The Committer implementation uses this partitions argument to check if partition 0 of the first topic is assigned to this task and if so, it spawns a Coordinator process. I'm worried that if there was a rebalance where the partition 0 of the first topic doesn't move between tasks, then it would not be included in the partitions argument for any Task and thus we could potentially end up with a Connector that doesn't have any Coordinator process running on any Task. Thoughts?

bryanck (Contributor Author):

That's not my understanding; open() will be called with the new assignment, i.e. all assigned topic partitions. See the Javadoc: "The list of partitions that are now assigned to the task (may include partitions previously assigned to the task)"
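
The distinction under discussion is whether the leader check runs over the partitions passed to open() or over the task's full assignment from the context; a hypothetical sketch of the latter:

```java
import org.apache.kafka.connect.sink.SinkTaskContext;

// Hypothetical illustration only: check leadership against the full assignment
// from SinkTaskContext rather than the partitions argument of open().
private boolean isLeader(SinkTaskContext context, String firstTopic) {
  return context.assignment().stream()
      .anyMatch(tp -> tp.topic().equals(firstTopic) && tp.partition() == 0);
}
```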


@Override
public String version() {
return IcebergSinkConfig.version();
Contributor:

Looks like this method was implemented before this PR but I have a question about this.
This method returns a String that looks something like this:

IcebergBuild.version() + "-kc-" + kcVersion;

where kcVersion = IcebergSinkConfig.class.getPackage().getImplementationVersion().

Won't kcVersion and IcebergBuild.version() be the same value, since AFAIK this connector's releases will be tied to the general Iceberg releases? CMIIW

Note: when I run the existing unit test locally, I currently get 1.6.0-SNAPSHOT-kc-unknown.

bryanck (Contributor Author):

Thanks for catching this, we can just use the Iceberg version now. I left in the config method in case we want to add to this later.
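
Presumably the simplified version method ends up along these lines (a sketch, assuming it just delegates to the Iceberg build version):

```java
import org.apache.iceberg.IcebergBuild;

// Sketch: report the Iceberg build version as the connector version.
public static String version() {
  return IcebergBuild.version();
}
```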

private static final Logger LOG = LoggerFactory.getLogger(IcebergSinkTask.class);

private IcebergSinkConfig config;
private Catalog catalog;
Contributor:

As far as I know Catalog objects are not guaranteed to be thread-safe.
And at least on the "leader" tasks, we can have multiple threads using the same Catalog instance at the same time ("leader" tasks have both a main thread as well as a CoordinatorThread).
I haven't heard users report any issues about this but it would be better to avoid this risk entirely?


private void routeRecordDynamically(SinkRecord record) {
String routeField = config.tablesRouteField();
Preconditions.checkNotNull(routeField, "Route field cannot be null with dynamic routing");
Contributor:

This should really be checked at config parsing time or at SinkWriter construction time, instead of on every record?
To be clear; I'm not worried about this from a performance perspective (I'm confident the JVM will optimize this away) but it just seems awkward.

bryanck (Contributor Author):

We already check this in the config, so I'll just remove this.

new TopicPartition(record.topic(), record.kafkaPartition()),
new Offset(record.kafkaOffset() + 1, timestamp));

if (config.dynamicTablesEnabled()) {
Contributor:

Same here, we really don't need to check this on every record?

Feels like we're missing an abstraction here, something like the concept of a Router which has a StaticRouter implementation and a DynamicRouter implementation and only one of those is constructed for the lifetime of a SinkWriter based on the config.dynamicTablesEnabled() setting.
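
A rough sketch of the suggested shape (hypothetical interface and class names, not part of the PR); the choice would happen once when the SinkWriter is constructed:

```java
import org.apache.kafka.connect.sink.SinkRecord;

// Hypothetical abstraction: pick one router implementation up front instead of
// branching on config.dynamicTablesEnabled() for every record.
interface RecordRouter {
  void route(SinkRecord record);
}

class DynamicRecordRouter implements RecordRouter {
  @Override
  public void route(SinkRecord record) {
    // resolve the target table name from the record's route field, then write
  }
}

class StaticRecordRouter implements RecordRouter {
  @Override
  public void route(SinkRecord record) {
    // match the route value against each configured table's route regex, then write
  }
}

// At SinkWriter construction time (sketch):
// this.router = config.dynamicTablesEnabled() ? new DynamicRecordRouter() : new StaticRecordRouter();
```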

Preconditions.checkNotNull(routeField, "Route field cannot be null with dynamic routing");

String routeValue = extractRouteValue(record.value(), routeField);
if (routeValue != null) {
Contributor:

IMO we should throw an error instead of dropping the record.
Users can filter out messages easily using an SMT to get the same behaviour, if necessary.
In the future, I imagine we can allow users to supply custom exception-handler implementations which could also allow users to drop records on error.

bryanck (Contributor Author):

I feel skipping is better than getting into a state where the sink can no longer progress. When we add DLQ support then we could route to that.

Comment on lines +120 to +124
String routeValue = extractRouteValue(record.value(), routeField);
if (routeValue != null) {
  String tableName = routeValue.toLowerCase();
  writerForTable(tableName, record, true).write(record);
}
@fqaiser94 (Contributor) commented May 30, 2024:

Can we support writing to multiple tables if this is a comma-separated list of table names, like we do in "static" mode?

Unless there are other concerns, I think this is an important building block for more advanced functionality, e.g. we could remove the route-value-regex table routing mode, as it could be implemented entirely as an SMT + dynamic mode.
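
A minimal sketch of what that could look like in dynamic mode, reusing the names from the snippet above (illustrative only):

```java
// Illustrative only: write the record to every table named in a comma-separated route value.
String routeValue = extractRouteValue(record.value(), routeField);
if (routeValue != null) {
  for (String tableName : routeValue.split(",")) {
    writerForTable(tableName.trim().toLowerCase(), record, true).write(record);
  }
}
```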

Comment on lines +101 to +112
String routeValue = extractRouteValue(record.value(), routeField);
if (routeValue != null) {
  config
      .tables()
      .forEach(
          tableName -> {
            Pattern regex = config.tableConfig(tableName).routeRegex();
            if (regex != null && regex.matcher(routeValue).matches()) {
              writerForTable(tableName, record, false).write(record);
            }
          });
}
Contributor:

nit: is this really "static" routing? I guess it's a static list of tables, but the table(s) each message is written to are determined dynamically based on the route value...

More importantly, is this an important enough use-case to support within the connector? I would strongly prefer if we didn't support this within the connector itself (users can easily implement this by writing an SMT + dynamic mode).

bryanck (Contributor Author):

Static in the sense that the list of tables is fixed and doesn't change, rather than deriving it from the record. This feature is in use by some.
