-
Notifications
You must be signed in to change notification settings - Fork 56
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Connect with Kafka not able to form #4
Comments
hello, i had fixed that error successfully after 2 days. Here is my step:
s_conn = SparkSession.builder Hope it effective! |
Hi there, the author has explained the missing packages in this part: https://youtu.be/GqAcTrqKcrY?si=QzgPjlC-RHMULUS0&t=4599 I believe it's still missing the kafka-clients package. You can add the jar file with two missing packages from @VuKhoiGVM's comment. Ref: https://stackoverflow.com/a/71059689/17316050 Also please double-check the Spark, Cassandra, and Scala versions to download the compatible version or else it won't work. |
I get the error even after adding the jar files |
How did you manage to solve that issue? |
Whenever we try to connect to kafka, we get this error:
WARNING:root:kafka dataframe could not be created because: An error occurred while calling o36.load.
: java.lang.NoClassDefFoundError: scala/$less$colon$less
at org.apache.spark.sql.kafka010.KafkaSourceProvider.org$apache$spark$sql$kafka010$KafkaSourceProvider$$validateStreamOptions(KafkaSourceProvider.scala:338)
at org.apache.spark.sql.kafka010.KafkaSourceProvider.sourceSchema(KafkaSourceProvider.scala:71)
at org.apache.spark.sql.execution.datasources.DataSource.sourceSchema(DataSource.scala:233)
at org.apache.spark.sql.execution.datasources.DataSource.sourceInfo$lzycompute(DataSource.scala:118)
at org.apache.spark.sql.execution.datasources.DataSource.sourceInfo(DataSource.scala:118)
at org.apache.spark.sql.execution.streaming.StreamingRelation$.apply(StreamingRelation.scala:35)
at org.apache.spark.sql.streaming.DataStreamReader.loadInternal(DataStreamReader.scala:168)
at org.apache.spark.sql.streaming.DataStreamReader.load(DataStreamReader.scala:144)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:374)
at py4j.Gateway.invoke(Gateway.java:282)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:182)
at py4j.ClientServerConnection.run(ClientServerConnection.java:106)
at java.lang.Thread.run(Thread.java:750)
Caused by: java.lang.ClassNotFoundException: scala.$less$colon$less
at java.net.URLClassLoader.findClass(URLClassLoader.java:387)
at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
... 20 more
Traceback (most recent call last):
File "C:\Users\User\Desktop\Data Engineering\Data Engineering\spark_stream.py", line 143, in
selection_df = create_selection_df_from_kafka(spark_df)
File "C:\Users\User\Desktop\Data Engineering\Data Engineering\spark_stream.py", line 129, in create_selection_df_from_kafka
sel = spark_df.selectExpr("CAST(value AS STRING)")
AttributeError: 'NoneType' object has no attribute 'selectExpr'
The text was updated successfully, but these errors were encountered: