Parquet.ParquetException: 'NewField' does not exist in this file (Schema evolution with ParquetConvert.Deserialize)

Version: 3.9.0

Runtime Version: .NET Core 2.2

OS: Windows

Expected behavior

I am having an issue with schema evolution. I added a new field to my type, and files written before the change can no longer be deserialized. Can a field be marked as optional somehow?

Actual behavior

Parquet.ParquetException: 'NewField' does not exist in this file

Steps to reproduce the behavior

  1. Serialize a collection of certain type.
  2. Add a field to the type.
  3. Deserialize using ParquetConvert.Deserialize<T>("…");

Code snippet reproducing the behavior

using (Stream fileStream = System.IO.File.OpenRead(@"C:\temp\parquet\data.parquet"))
{
    // 'positions' is declared elsewhere; MyType now contains the added field.
    positions = ParquetConvert.Deserialize<MyType>(fileStream);
}

//here
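One possible workaround, sketched here under stated assumptions rather than taken from the issue: read the old file with a class that still matches the schema it was written with, then map the rows to the current type. `MyTypeV1`, the `Id` property, and `LegacyReader` are hypothetical names introduced for illustration; only `MyType` and `NewField` come from the report. The deserializer is passed in as a delegate so the sketch does not depend on a particular Parquet.Net signature.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical shape of the type at the time the file was written.
public class MyTypeV1
{
    public int Id { get; set; }
}

// Hypothetical current shape: NewField was added after the file was written.
public class MyType
{
    public int Id { get; set; }
    public string NewField { get; set; }
}

public static class LegacyReader
{
    // Reads rows using the old class shape, then upgrades each row to the
    // current type. NewField receives an explicit default because the old
    // file has no such column.
    public static List<MyType> ReadOldFile(
        Stream input,
        Func<Stream, IEnumerable<MyTypeV1>> deserialize)
    {
        return deserialize(input)
            .Select(old => new MyType { Id = old.Id, NewField = null })
            .ToList();
    }
}
```

With Parquet.Net referenced, the delegate could presumably be `s => ParquetConvert.Deserialize<MyTypeV1>(s)`, using the same call as the snippet above but with the old class. The cost of this approach is keeping one class per historical schema version.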

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 6 (2 by maintainers)

Top GitHub Comments

1 reaction
aloneguid commented, Feb 8, 2023

Unlikely as I don’t need it personally. Although PRs are always welcome.

0 reactions
VolodymyrSenchak commented, Aug 2, 2023

I would also love to have such an attribute like [ParquetOptional].

By the way, to mimic JSON deserialization behavior, I think it would be much more expected to apply the default value, rather than throw an error, when a field declared in the class does not exist in the file.

Or are there any limitations to doing that?
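The JSON-like behavior proposed in this comment can be illustrated with a small, library-free sketch. It is not Parquet.Net's implementation or API; `LenientMapper`, `FromRow`, and `Position` are hypothetical names, and a row is modeled simply as a dictionary from column name to value. Properties with no matching column keep their default value instead of causing an exception.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

public static class LenientMapper
{
    // Builds a T from one row of values keyed by column name.
    // A property whose column is absent (or null) is left at its default.
    public static T FromRow<T>(IReadOnlyDictionary<string, object> row)
        where T : class, new()
    {
        var instance = new T();
        PropertyInfo[] props =
            typeof(T).GetProperties(BindingFlags.Public | BindingFlags.Instance);

        foreach (PropertyInfo prop in props)
        {
            if (!prop.CanWrite)
            {
                continue; // read-only properties cannot be populated
            }

            if (row.TryGetValue(prop.Name, out object value) && value != null)
            {
                prop.SetValue(instance, value);
            }
            // Otherwise the column is missing from the file: keep the default.
        }

        return instance;
    }
}

// Hypothetical class with a field that older files do not contain.
public class Position
{
    public int Id { get; set; }
    public string NewField { get; set; }
}
```

One limitation this makes visible: for a non-nullable value type such as `int`, a missing column becomes `0`, which is indistinguishable from a stored zero. That ambiguity may be one reason a library would prefer an explicit opt-in marker over silent defaults.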
