isAdjustedToUtc: false

Hello G-Research,

I’ve been working on a project to convert data from a SQL Server database to Parquet, but I hit a wall when trying to remove the UTC adjustment from the DateTime column for the writer. I’ve read through other issues and found that isAdjustedToUtc defaults to true; however, the original data I have does not use UTC. If I try using columns[i] = new Column<DateTime?>(cdt.Schema[i].Name, LogicalType.Timestamp(isAdjustedToUtc: false, timeUnit: TimeUnit.Nanos)); an exception is thrown, since the values in the database are of type DateTime and there is a mismatch with the logical type override.

System.NotSupportedException: 'unsupported logical system type System.Nullable`1[System.DateTime] with logical type Timestamp(isAdjustedToUTC=false, timeUnit=nanoseconds, is_from_converted_type=false, force_set_converted_type=false)'

I posted the same question on Stack Overflow a few days ago but haven’t received a conclusive answer, apart from a suggestion to use the ConverterFactory. I tried that based on the snippets provided, but I can’t get it working: converterFactory.GetConverter<DateTime, TimeSpanNanos>(columnDescriptor, byteBuffer: buffer);

To briefly summarize, I just want to take the DateTime values in the original database column and write them to the parquet file again as DateTime but without UTC. And since they are of the same type it feels redundant to use the LogicalFactories to cast DateTime as DateTime with no UTC. Is there an easier way or what is the best way to proceed?

I know this is not a direct issue to the library itself, and more an issue on my part, but I would really appreciate any suggestions how to handle this instance.

Issue Analytics

  • State: closed
  • Created a year ago
  • Comments: 6 (3 by maintainers)

Top GitHub Comments

1 reaction
vthemelis commented, Apr 4, 2023

Oh, sorry, I misunderstood the purpose of the logical type. I thought that I would have to provide a logical type representing an array (or list as in LogicalType.List()) of DateTimes as the last argument to the Column ctor.

Looks like just passing the logical type for the leaf type of the array is enough.
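
For illustration, the array case described above might look like the sketch below. The column name, file name, and sample values are hypothetical, and the microsecond unit is an assumption; the point is only that the logical type override describes the leaf DateTime, not the array.

```csharp
using System;
using ParquetSharp;

var columns = new Column[]
{
    // The override describes each DateTime element; no LogicalType.List() wrapper is passed.
    new Column<DateTime[]>("Timestamps",
        LogicalType.Timestamp(isAdjustedToUtc: false, timeUnit: TimeUnit.Micros)),
};

// Hypothetical data: one row holding two timestamps.
var rows = new[]
{
    new[] { new DateTime(2023, 4, 4, 9, 0, 0), new DateTime(2023, 4, 4, 10, 0, 0) },
};

using var file = new ParquetFileWriter("timestamp_arrays.parquet", columns);
using var rowGroup = file.AppendRowGroup();
using (var writer = rowGroup.NextColumn().LogicalWriter<DateTime[]>())
{
    writer.WriteBatch(rows);
}
file.Close();
```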

Thanks!

1 reaction
adamreeve commented, Apr 4, 2023

Hi @vthemelis, can you explain more what you’re trying to achieve? It sounds like you have an array of DateTimes and you want to write them as TimeZone naive timestamps in a Parquet file, so you should just be able to specify isAdjustedToUtc: false in the logical type.
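
A minimal sketch of that suggestion for a nullable DateTime column follows. The column name, file name, and sample values are hypothetical. It uses TimeUnit.Micros rather than the TimeUnit.Nanos from the original question, on the assumption that ParquetSharp pairs DateTime with millisecond or microsecond timestamps and expects its own DateTimeNanos type for nanoseconds, which would explain the NotSupportedException reported above.

```csharp
using System;
using ParquetSharp;

// Hypothetical values standing in for the rows read from SQL Server.
var timestamps = new DateTime?[]
{
    new DateTime(2023, 4, 4, 9, 30, 0, DateTimeKind.Unspecified),
    null,
};

var columns = new Column[]
{
    // isAdjustedToUtc: false marks the values as local (time-zone-naive) timestamps.
    // Micros is an assumption here; Nanos is assumed to require DateTimeNanos instead.
    new Column<DateTime?>("CreatedAt",
        LogicalType.Timestamp(isAdjustedToUtc: false, timeUnit: TimeUnit.Micros)),
};

using var file = new ParquetFileWriter("timestamps.parquet", columns);
using var rowGroup = file.AppendRowGroup();
using (var writer = rowGroup.NextColumn().LogicalWriter<DateTime?>())
{
    writer.WriteBatch(timestamps);
}
file.Close();
```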

If instead you’re talking about reading DateTime values from a Parquet file and you want them to have Kind set to DateTimeKind.Unspecified, you can use the ParquetSharp.ReadDateTimeKindAsUnspecified AppContext switch that was added in #288.
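
On the reading side, enabling that switch might look like the sketch below. The switch name is taken from the comment above and AppContext.SetSwitch is the standard .NET call; the file name and the column index are hypothetical, and setting the switch before any reader is created is an assumption about when it takes effect.

```csharp
using System;
using ParquetSharp;

// Set the switch early, before any Parquet file is opened (assumed requirement).
AppContext.SetSwitch("ParquetSharp.ReadDateTimeKindAsUnspecified", true);

using var file = new ParquetFileReader("timestamps.parquet");
using var rowGroup = file.RowGroup(0);
var numRows = checked((int) rowGroup.MetaData.NumRows);

// Column 0 is assumed to be the nullable timestamp column.
using var reader = rowGroup.Column(0).LogicalReader<DateTime?>();
DateTime?[] values = reader.ReadAll(numRows);

foreach (var value in values)
{
    // With the switch enabled, non-null values are expected to report Kind == Unspecified.
    Console.WriteLine(value.HasValue ? $"{value.Value:o} ({value.Value.Kind})" : "null");
}
file.Close();
```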
