Persist raw data from kafka topic as it is

Feature request: save the raw data from a Kafka topic in a Pinot table.

Use case: We have many complex schemas, and we use Pinot to save and retrieve topic data along with a timestamp and a few other fields. We do not want to map every nested column of a complex schema into the Pinot schema, which would require many transformation functions. In some places we want the raw data stored as is in the Pinot table.

Sample data:

{ "header": { "tid": "12wee", "rid": 1, "timestamp": 1647347092337 }, "status": "200_SUCCESS", "jasData": { "sdata": -22.89122, "cnn": 0.823469, "kli": 2.238848, "olp": [ { "ovPerc": 0.032486767, "hg": 30.0, "abshi": 6.661863 } ], "terrkl": { "ovPerc": 0.9675132, "dist": [ -25.17232, -25.17232, -25.130081 ] }, "bcut": 2.77 }, "rgData": { "pre": 102033.33, "pv": 0.16, "t": 287.36, "timestamp": 1647347069000 }, "timestamp": 1647347092337 }

Issue Analytics

State:
Created a year ago
Comments: 9 (7 by maintainers)

Top GitHub Comments

1 reaction

saumya2700 commented, Apr 28, 2022

@Jackie-Jiang This sounds good to me. @saumya2700 If you’re not working on this, I can pick this up?

Yes, please go ahead.

1 reaction

Jackie-Jiang commented, Mar 29, 2022

We may consider adding a new config in the IngestionConfig to store the JSON string of the record in a field. The logic needs to be implemented in the RecordExtractor.
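
To make the suggestion concrete, here is a minimal sketch of the idea in plain Python. It is not Pinot's RecordExtractor or IngestionConfig API: the function `extract_row` and the column name `raw_record_json` are hypothetical, chosen only for illustration. It maps a couple of top-level fields and additionally stores the whole record as a JSON string in one extra field.

```python
import json
from typing import Any, Dict, Mapping

# Hypothetical column name for the unmodified record (an assumption,
# not an existing Pinot config key).
RAW_JSON_FIELD = "raw_record_json"


def extract_row(record: Mapping[str, Any]) -> Dict[str, Any]:
    """Map a few top-level fields and keep the full record as a JSON string."""
    row: Dict[str, Any] = {
        "status": record.get("status"),
        "timestamp": record.get("timestamp"),
    }
    # Serialize the complete record so nested fields need no schema mapping.
    row[RAW_JSON_FIELD] = json.dumps(record, separators=(",", ":"))
    return row


# A shortened version of the sample record from the issue.
sample = {
    "header": {"tid": "12wee", "rid": 1, "timestamp": 1647347092337},
    "status": "200_SUCCESS",
    "rgData": {"pre": 102033.33, "pv": 0.16, "t": 287.36, "timestamp": 1647347069000},
    "timestamp": 1647347092337,
}

row = extract_row(sample)
```

With this shape, the nested parts of the record stay available to anything that can parse the stored JSON string, while only the fields needed for filtering get their own columns.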

Top Results From Across the Web

How persistence works in an Apache Kafka deployment

Data retention can be controlled by the Kafka server and by per-topic configuration parameters. The retention of the data can be controlled by ......

It's Okay To Store Data In Kafka - Confluent

Data in Kafka is persisted to disk, checksummed, and replicated for fault tolerance. Accumulating more stored data doesn't make it slower.

Using Kafka as a Temporary Data Store and Data-loss ...

The period during which the data is stored by Kafka is called retention. Theoretically, you can set this period to forever. Kafka also...

Read data from Kafka topic and write into local persistent in NiFi

This recipe helps you read data from Kafka topic store and write into local persistent storage in NiFi.

Ingesting Raw Data with Kafka-connect and Spark Datasets

Then, since we have Kafka in place, using Kafka-connect allows us to perform this raw data layer ETL without writing a single line...

The race condition that TableDataManager is removed during segment download because of incorrect segment count.

consumption halted on realtime table when accessing an offset that has been already deleted from Kafka