
Issue with CSVSchemaGenerator

See original GitHub issue

Hi - I tried to deploy the CSV SpoolDir connector with automatic schema generation in version 2.0.54. Does anybody have an idea what might be going wrong here?

[2020-12-09 15:09:05,589] INFO [Worker clientId=connect-1, groupId=connect-cluster] Starting connectors and tasks using config offset 2384 (org.apache.kafka.connect.runtime.distributed.DistributedHerder)
[2020-12-09 15:09:05,589] INFO [Worker clientId=connect-1, groupId=connect-cluster] Starting connector csvimport-ps60 (org.apache.kafka.connect.runtime.distributed.DistributedHerder)
[2020-12-09 15:09:05,590] INFO ConnectorConfig values:
        config.action.reload = restart
        connector.class = com.github.jcustenborder.kafka.connect.spooldir.SpoolDirCsvSourceConnector
        errors.log.enable = false
        errors.log.include.messages = false
        errors.retry.delay.max.ms = 60000
        errors.retry.timeout = 0
        errors.tolerance = none
        header.converter = null
        key.converter = class io.confluent.connect.avro.AvroConverter
        name = csvimport-ps60
        tasks.max = 1
        transforms = []
        value.converter = class io.confluent.connect.avro.AvroConverter
 (org.apache.kafka.connect.runtime.ConnectorConfig)
[2020-12-09 15:09:05,590] INFO EnrichedConnectorConfig values:
        config.action.reload = restart
        connector.class = com.github.jcustenborder.kafka.connect.spooldir.SpoolDirCsvSourceConnector
        errors.log.enable = false
        errors.log.include.messages = false
        errors.retry.delay.max.ms = 60000
        errors.retry.timeout = 0
        errors.tolerance = none
        header.converter = null
        key.converter = class io.confluent.connect.avro.AvroConverter
        name = csvimport-test
        tasks.max = 1
        transforms = []
        value.converter = class io.confluent.connect.avro.AvroConverter
 (org.apache.kafka.connect.runtime.ConnectorConfig$EnrichedConnectorConfig)
[2020-12-09 15:09:05,590] INFO Creating connector csvimport-test of type com.github.jcustenborder.kafka.connect.spooldir.SpoolDirCsvSourceConnector (org.apache.kafka.connect.runtime.Worker)
[2020-12-09 15:09:05,591] INFO Instantiated connector csvimport-test with version 0.0.0.0 of type class com.github.jcustenborder.kafka.connect.spooldir.SpoolDirCsvSourceConnector (org.apache.kafka.connect.runtime.Worker)
[2020-12-09 15:09:05,591] INFO SpoolDirCsvSourceConnectorConfig values:
        batch.size = 1000
        cleanup.policy = MOVE
        csv.case.sensitive.field.names = false
        csv.escape.char = 92
        csv.file.charset = UTF-8
        csv.first.row.as.header = true
        csv.ignore.leading.whitespace = true
        csv.ignore.quotations = false
        csv.keep.carriage.return = false
        csv.null.field.indicator = NEITHER
        csv.quote.char = 34
        csv.rfc.4180.parser.enabled = false
        csv.separator.char = 44
        csv.skip.lines = 0
        csv.strict.quotes = false
        csv.verify.reader = true
        empty.poll.wait.ms = 500
        error.path = /tmp/fail
        file.buffer.size.bytes = 131072
        file.minimum.age.ms = 0
        files.sort.attributes = [NameAsc]
        finished.path = /tmp/dest
        halt.on.error = true
        input.file.pattern = (.*?).input
        input.path = /tmp/src
        key.schema =
        metadata.field = metadata
        metadata.location = HEADERS
        parser.timestamp.date.formats = [yyyy-MM-dd'T'HH:mm:ss, yyyy-MM-dd' 'HH:mm:ss]
        parser.timestamp.timezone = UTC
        processing.file.extension = .PROCESSING
        schema.generation.enabled = true
        schema.generation.key.fields = []
        schema.generation.key.name = com.github.jcustenborder.kafka.connect.model.Key
        schema.generation.value.name = com.github.jcustenborder.kafka.connect.model.Value
        task.count = 1
        task.index = 0
        task.partitioner = ByName
        timestamp.field =
        timestamp.mode = PROCESS_TIME
        topic = test
        value.schema =
 (com.github.jcustenborder.kafka.connect.spooldir.SpoolDirCsvSourceConnectorConfig)
[2020-12-09 15:09:05,592] INFO SpoolDirCsvSourceConnectorConfig values:
        batch.size = 1000
        cleanup.policy = MOVE
        csv.case.sensitive.field.names = false
        csv.escape.char = 92
        csv.file.charset = UTF-8
        csv.first.row.as.header = true
        csv.ignore.leading.whitespace = true
        csv.ignore.quotations = false
        csv.keep.carriage.return = false
        csv.null.field.indicator = NEITHER
        csv.quote.char = 34
        csv.rfc.4180.parser.enabled = false
        csv.separator.char = 44
        csv.skip.lines = 0
        csv.strict.quotes = false
        csv.verify.reader = true
        empty.poll.wait.ms = 500
        error.path = /tmp/fail
        file.buffer.size.bytes = 131072
        file.minimum.age.ms = 0
        files.sort.attributes = [NameAsc]
        finished.path = /tmp/dest
        halt.on.error = true
        input.file.pattern = (.*?).input
        input.path = /tmp/src
        key.schema =
        metadata.field = metadata
        metadata.location = HEADERS
        parser.timestamp.date.formats = [yyyy-MM-dd'T'HH:mm:ss, yyyy-MM-dd' 'HH:mm:ss]
        parser.timestamp.timezone = UTC
        processing.file.extension = .PROCESSING
        schema.generation.enabled = true
        schema.generation.key.fields = []
        schema.generation.key.name = com.github.jcustenborder.kafka.connect.model.Key
        schema.generation.value.name = com.github.jcustenborder.kafka.connect.model.Value
        task.count = 1
        task.index = 0
        task.partitioner = ByName
        timestamp.field =
        timestamp.mode = PROCESS_TIME
        topic = test
        value.schema =
 (com.github.jcustenborder.kafka.connect.spooldir.SpoolDirCsvSourceConnectorConfig)
[2020-12-09 15:09:05,593] INFO Key or Value schema was not defined. Running schema generator. (com.github.jcustenborder.kafka.connect.spooldir.AbstractSpoolDirSourceConnector)
[2020-12-09 15:09:05,593] ERROR WorkerConnector{id=csvimport-test} Error while starting connector (org.apache.kafka.connect.runtime.WorkerConnector)
java.lang.NoClassDefFoundError: Could not initialize class com.github.jcustenborder.kafka.connect.spooldir.CsvSchemaGenerator
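
A note on this error: "Could not initialize class" (rather than a plain class-not-found message) means the JVM already attempted to run CsvSchemaGenerator's static initializer and that attempt failed, which usually points to a missing or mismatched dependency on the plugin path rather than the class itself being absent. One way to check what the plugin actually shipped with is to list its lib directory; the path below is an assumption based on where confluent-hub normally installs components on Confluent images, so adjust it to your setup:

    # Path is an assumption: confluent-hub typically installs components under
    # /usr/share/confluent-hub-components on Confluent images; adjust as needed.
    ls /usr/share/confluent-hub-components/jcustenborder-kafka-connect-spooldir/lib | grep -i jackson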

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 5 (2 by maintainers)

Top GitHub Comments

1 reaction
fjahangiri commented, Dec 28, 2020

Yes. By the way, it works fine with 2.0.46 and other versions. With schema generation I get java.lang.NoClassDefFoundError: Could not initialize class com.github.jcustenborder.kafka.connect.spooldir.CsvSchemaGenerator, and with a defined schema I get this exception instead: java.lang.NoClassDefFoundError: com/fasterxml/jackson/annotation/JsonKey
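
For what it's worth, com.fasterxml.jackson.annotation.JsonKey only exists in jackson-annotations 2.10 and later, so the second error suggests an older jackson-annotations jar is being resolved ahead of the plugin's own copy. A quick way to see which copies the worker can find (the paths are assumptions for the cp-kafka-connect image):

    # Hypothetical paths for a Confluent image; adjust to your install. A
    # pre-2.10 jackson-annotations jar shadowing the plugin's copy would
    # explain the missing JsonKey class.
    find /usr/share/confluent-hub-components /usr/share/java -name 'jackson-annotations*.jar' 2>/dev/null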

1 reaction
fjahangiri commented, Dec 28, 2020

Hi, I have the same issue in version 2.0.54. I use the confluentinc/cp-kafka-connect:5.5.1 image and install the connector with confluent-hub: confluent-hub install --no-prompt jcustenborder/kafka-connect-spooldir:latest
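
Since 2.0.46 is reported to work above, one low-risk workaround is to pin that version instead of installing :latest (standard confluent-hub owner/name:version syntax):

    # Pin the known-good plugin version rather than :latest.
    confluent-hub install --no-prompt jcustenborder/kafka-connect-spooldir:2.0.46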

Read more comments on GitHub >

