Write nested Pocos as Parquet
I’d like to get the bytes of a Parquet file generated from a POCO structure using ChoParquetWriter:
byte[] bytes = ChoParquetWriter.SerializeAll<MyData>(data);
The POCO structure (IEnumerable<MyData> data, serialized as JSON):
[{
    "Health": {
        "Id": 99,
        "Status": false
    },
    "Safety": {
        "Id": 3,
        "Fire": 1
    },
    "Climate": [{
        "Id": 0,
        "State": 2
    }]
}]
MyData.cs
public class MyData
{
    public Health Health { get; set; }
    public Safety Safety { get; set; }
    public List<Climate> Climate { get; set; }
}
(MyData is actually even more nested but follows the same pattern)
but this gives an error: Parquet: CLR type ‘<redacted>.Climate’ is not supported, please specify one of 'System.DateTimeOffset, System.DateTime, Parquet.File.Values.Primitives.Interval, System.Decimal, System.Boolean, System.Byte, System.SByte, System.Int16, System.UInt16, System.Int32, System.Int64, System.Numerics.BigInteger, System.Single, System.Double, System.String, System.Byte[], , , ’ or use an alternative constructor.
If I exclude the Climate property, it all works fine. What am I missing?
Issue Analytics
- State:
- Created 2 years ago
- Comments:7 (3 by maintainers)
Top GitHub Comments
Confused by this when docs seem to say nested columns are supported…
Because the Parquet file doesn’t support a nested data format. You will need to flatten the data before storing it.
Here is one way to handle it
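The snippet that followed this comment is not reproduced here. As a rough sketch of what flattening could look like, the example below projects each MyData onto one row per Climate entry using plain LINQ. The property types of Health, Safety and Climate are inferred from the JSON sample in the question, and FlatRow and Flattener are hypothetical names introduced only for illustration; none of this is taken from the library or the maintainer's answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Property types are assumptions inferred from the JSON sample in the question.
public class Health { public int Id { get; set; } public bool Status { get; set; } }
public class Safety { public int Id { get; set; } public int Fire { get; set; } }
public class Climate { public int Id { get; set; } public int State { get; set; } }

public class MyData
{
    public Health Health { get; set; }
    public Safety Safety { get; set; }
    public List<Climate> Climate { get; set; }
}

// Hypothetical flat row: primitive columns only, one row per Climate entry.
public class FlatRow
{
    public int HealthId { get; set; }
    public bool HealthStatus { get; set; }
    public int SafetyId { get; set; }
    public int SafetyFire { get; set; }
    public int? ClimateId { get; set; }     // null when the record has no Climate entries
    public int? ClimateState { get; set; }
}

public static class Flattener
{
    // Assumes Health and Safety are never null; Climate may be null or empty.
    public static IEnumerable<FlatRow> Flatten(IEnumerable<MyData> data) =>
        data.SelectMany(d =>
            (d.Climate ?? Enumerable.Empty<Climate>())
                .DefaultIfEmpty() // yields a single null so the record still produces one row
                .Select(c => new FlatRow
                {
                    HealthId = d.Health.Id,
                    HealthStatus = d.Health.Status,
                    SafetyId = d.Safety.Id,
                    SafetyFire = d.Safety.Fire,
                    ClimateId = c?.Id,
                    ClimateState = c?.State
                }));
}

public static class Program
{
    public static void Main()
    {
        var data = new List<MyData>
        {
            new MyData
            {
                Health = new Health { Id = 99, Status = false },
                Safety = new Safety { Id = 3, Fire = 1 },
                Climate = new List<Climate> { new Climate { Id = 0, State = 2 } }
            }
        };

        foreach (var row in Flattener.Flatten(data))
            Console.WriteLine($"{row.HealthId},{row.HealthStatus},{row.SafetyId},{row.SafetyFire},{row.ClimateId},{row.ClimateState}");
    }
}
```

The flat rows contain only primitive properties, so they could presumably be handed to the same call as in the question, with FlatRow in place of MyData. Whether the writer accepts nullable columns such as int? is not verified here. Repeating the Health and Safety values on every Climate row trades file size for a schema that needs no nesting.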