Issue with CsvSchemaGenerator
Hi - I tried to deploy the CSV SpoolDir connector with automatic schema generation in version 2.0.54. Does anybody have an idea what could be going wrong here?
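For context, the configuration visible in the worker log below corresponds to a submission along these lines. This is a reconstruction, not the exact request: the REST endpoint (localhost:8083) is an assumption, and the AvroConverter additionally needs `key.converter.schema.registry.url` / `value.converter.schema.registry.url` settings that do not appear in the log.

```shell
curl -X PUT -H "Content-Type: application/json" \
  http://localhost:8083/connectors/csvimport-ps60/config \
  -d '{
    "connector.class": "com.github.jcustenborder.kafka.connect.spooldir.SpoolDirCsvSourceConnector",
    "tasks.max": "1",
    "topic": "test",
    "input.path": "/tmp/src",
    "finished.path": "/tmp/dest",
    "error.path": "/tmp/fail",
    "input.file.pattern": "(.*?).input",
    "halt.on.error": "true",
    "csv.first.row.as.header": "true",
    "schema.generation.enabled": "true",
    "key.converter": "io.confluent.connect.avro.AvroConverter",
    "value.converter": "io.confluent.connect.avro.AvroConverter"
  }'
```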
[2020-12-09 15:09:05,589] INFO [Worker clientId=connect-1, groupId=connect-cluster] Starting connectors and tasks using config offset 2384 (org.apache.kafka.connect.runtime.distributed.DistributedHerder)
[2020-12-09 15:09:05,589] INFO [Worker clientId=connect-1, groupId=connect-cluster] Starting connector csvimport-ps60 (org.apache.kafka.connect.runtime.distributed.DistributedHerder)
[2020-12-09 15:09:05,590] INFO ConnectorConfig values:
config.action.reload = restart
connector.class = com.github.jcustenborder.kafka.connect.spooldir.SpoolDirCsvSourceConnector
errors.log.enable = false
errors.log.include.messages = false
errors.retry.delay.max.ms = 60000
errors.retry.timeout = 0
errors.tolerance = none
header.converter = null
key.converter = class io.confluent.connect.avro.AvroConverter
name = csvimport-ps60
tasks.max = 1
transforms = []
value.converter = class io.confluent.connect.avro.AvroConverter
(org.apache.kafka.connect.runtime.ConnectorConfig)
[2020-12-09 15:09:05,590] INFO EnrichedConnectorConfig values:
config.action.reload = restart
connector.class = com.github.jcustenborder.kafka.connect.spooldir.SpoolDirCsvSourceConnector
errors.log.enable = false
errors.log.include.messages = false
errors.retry.delay.max.ms = 60000
errors.retry.timeout = 0
errors.tolerance = none
header.converter = null
key.converter = class io.confluent.connect.avro.AvroConverter
name = csvimport-test
tasks.max = 1
transforms = []
value.converter = class io.confluent.connect.avro.AvroConverter
(org.apache.kafka.connect.runtime.ConnectorConfig$EnrichedConnectorConfig)
[2020-12-09 15:09:05,590] INFO Creating connector csvimport-test of type com.github.jcustenborder.kafka.connect.spooldir.SpoolDirCsvSourceConnector (org.apache.kafka.connect.runtime.Worker)
[2020-12-09 15:09:05,591] INFO Instantiated connector csvimport-test with version 0.0.0.0 of type class com.github.jcustenborder.kafka.connect.spooldir.SpoolDirCsvSourceConnector (org.apache.kafka.connect.runtime.Worker)
[2020-12-09 15:09:05,591] INFO SpoolDirCsvSourceConnectorConfig values:
batch.size = 1000
cleanup.policy = MOVE
csv.case.sensitive.field.names = false
csv.escape.char = 92
csv.file.charset = UTF-8
csv.first.row.as.header = true
csv.ignore.leading.whitespace = true
csv.ignore.quotations = false
csv.keep.carriage.return = false
csv.null.field.indicator = NEITHER
csv.quote.char = 34
csv.rfc.4180.parser.enabled = false
csv.separator.char = 44
csv.skip.lines = 0
csv.strict.quotes = false
csv.verify.reader = true
empty.poll.wait.ms = 500
error.path = /tmp/fail
file.buffer.size.bytes = 131072
file.minimum.age.ms = 0
files.sort.attributes = [NameAsc]
finished.path = /tmp/dest
halt.on.error = true
input.file.pattern = (.*?).input
input.path = /tmp/src
key.schema =
metadata.field = metadata
metadata.location = HEADERS
parser.timestamp.date.formats = [yyyy-MM-dd'T'HH:mm:ss, yyyy-MM-dd' 'HH:mm:ss]
parser.timestamp.timezone = UTC
processing.file.extension = .PROCESSING
schema.generation.enabled = true
schema.generation.key.fields = []
schema.generation.key.name = com.github.jcustenborder.kafka.connect.model.Key
schema.generation.value.name = com.github.jcustenborder.kafka.connect.model.Value
task.count = 1
task.index = 0
task.partitioner = ByName
timestamp.field =
timestamp.mode = PROCESS_TIME
topic = test
value.schema =
(com.github.jcustenborder.kafka.connect.spooldir.SpoolDirCsvSourceConnectorConfig)
[2020-12-09 15:09:05,592] INFO SpoolDirCsvSourceConnectorConfig values:
batch.size = 1000
cleanup.policy = MOVE
csv.case.sensitive.field.names = false
csv.escape.char = 92
csv.file.charset = UTF-8
csv.first.row.as.header = true
csv.ignore.leading.whitespace = true
csv.ignore.quotations = false
csv.keep.carriage.return = false
csv.null.field.indicator = NEITHER
csv.quote.char = 34
csv.rfc.4180.parser.enabled = false
csv.separator.char = 44
csv.skip.lines = 0
csv.strict.quotes = false
csv.verify.reader = true
empty.poll.wait.ms = 500
error.path = /tmp/fail
file.buffer.size.bytes = 131072
file.minimum.age.ms = 0
files.sort.attributes = [NameAsc]
finished.path = /tmp/dest
halt.on.error = true
input.file.pattern = (.*?).input
input.path = /tmp/src
key.schema =
metadata.field = metadata
metadata.location = HEADERS
parser.timestamp.date.formats = [yyyy-MM-dd'T'HH:mm:ss, yyyy-MM-dd' 'HH:mm:ss]
parser.timestamp.timezone = UTC
processing.file.extension = .PROCESSING
schema.generation.enabled = true
schema.generation.key.fields = []
schema.generation.key.name = com.github.jcustenborder.kafka.connect.model.Key
schema.generation.value.name = com.github.jcustenborder.kafka.connect.model.Value
task.count = 1
task.index = 0
task.partitioner = ByName
timestamp.field =
timestamp.mode = PROCESS_TIME
topic = test
value.schema =
(com.github.jcustenborder.kafka.connect.spooldir.SpoolDirCsvSourceConnectorConfig)
[2020-12-09 15:09:05,593] INFO Key or Value schema was not defined. Running schema generator. (com.github.jcustenborder.kafka.connect.spooldir.AbstractSpoolDirSourceConnector)
[2020-12-09 15:09:05,593] ERROR WorkerConnector{id=csvimport-test} Error while starting connector (org.apache.kafka.connect.runtime.WorkerConnector)
java.lang.NoClassDefFoundError: Could not initialize class com.github.jcustenborder.kafka.connect.spooldir.CsvSchemaGenerator
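A `NoClassDefFoundError: Could not initialize class ...` usually means the class's static initializer failed on an earlier attempt, often because one of its own dependencies is missing or shadowed on the plugin classpath (the `com/fasterxml/jackson/annotation/JsonKey` error reported in the comments points at jackson-annotations, where `@JsonKey` was introduced in 2.10). A hedged way to check, assuming the default confluent-hub install location (adjust `PLUGIN_DIR` to your worker's `plugin.path`):

```shell
# Hypothetical plugin location; adjust to match your worker's plugin.path.
PLUGIN_DIR=/usr/share/confluent-hub-components/jcustenborder-kafka-connect-spooldir/lib

# List the Jackson jars the plugin bundles; an older jackson-annotations
# (pre-2.10) here would explain the missing JsonKey class.
ls "$PLUGIN_DIR" | grep jackson

# Confirm whether the JsonKey annotation class is actually present in the jar.
unzip -l "$PLUGIN_DIR"/jackson-annotations-*.jar | grep JsonKey
```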
Issue Analytics
- Created 3 years ago
- Comments: 5 (2 by maintainers)
Top GitHub Comments
Yes. By the way, it works fine with 2.0.46 and other versions.
java.lang.NoClassDefFoundError: Could not initialize class com.github.jcustenborder.kafka.connect.spooldir.CsvSchemaGenerator
And also with a defined schema I get this exception: java.lang.NoClassDefFoundError: com/fasterxml/jackson/annotation/JsonKey
Hi, I have the same issue in version 2.0.54. I use the confluentinc/cp-kafka-connect:5.5.1 image and install the connector with confluent-hub:
confluent-hub install --no-prompt jcustenborder/kafka-connect-spooldir:latest
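Since the comment above reports that 2.0.46 works, one workaround is to pin that version instead of installing `latest`:

```shell
# Pin the version reported to work rather than pulling latest.
confluent-hub install --no-prompt jcustenborder/kafka-connect-spooldir:2.0.46
```

After installing, restart the Connect worker so the plugin path is rescanned.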