Support the UTC formatter in the JSON Reader
Is your feature request related to a problem? Please describe.
Support the UTC formatter in the JSON Reader.
Describe the solution you’d like
Option 1:
- Try a few of the most popular formatters in turn, falling back with try-catch
- Minimal changes
Example:
LocalDateTime ldt;
String value = parser.getValueAsString();
try {
  // First try the existing formatter, which expects an offset.
  ldt = OffsetDateTime.parse(value, DateUtility.isoFormatTimeStamp).toLocalDateTime();
} catch (DateTimeParseException e1) {
  try {
    ldt = LocalDateTime.parse(value, DateUtility.utcFormatDateTime); // "yyyy-MM-dd'T'HH:mm:ss'Z'"
  } catch (DateTimeParseException e2) {
    ldt = LocalDateTime.parse(value, DateUtility.utcFormatTimeStamp); // "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'"
  }
}
OffsetDateTime utcDateTime = OffsetDateTime.of(ldt, ZoneOffset.UTC);
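The fallback idea in option 1 can be tried outside Drill with plain java.time. The following is a minimal sketch, not Drill code: the class name UtcFallbackParser and its fields are hypothetical, and the three patterns are simply the ones quoted in this issue rather than the actual DateUtility constants.

```java
import java.time.LocalDateTime;
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.List;

// Hypothetical stand-alone helper; names are illustrative and not part of Drill.
class UtcFallbackParser {
  // Pattern quoted in this issue as the fixed formatter of the old JSON reader.
  static final DateTimeFormatter ISO_WITH_OFFSET =
      DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss.SSSXX");

  // Fallback patterns proposed in option 1 ('Z' is matched as a literal character).
  static final List<DateTimeFormatter> UTC_LITERAL = List.of(
      DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss'Z'"),
      DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss.SSS'Z'"));

  static OffsetDateTime parse(String value) {
    LocalDateTime ldt = null;
    try {
      // As in the example above, keep only the local part of the parsed value.
      ldt = OffsetDateTime.parse(value, ISO_WITH_OFFSET).toLocalDateTime();
    } catch (DateTimeParseException first) {
      for (DateTimeFormatter fallback : UTC_LITERAL) {
        try {
          ldt = LocalDateTime.parse(value, fallback);
          break;
        } catch (DateTimeParseException ignored) {
          // Try the next fallback pattern.
        }
      }
      if (ldt == null) {
        throw first; // No pattern matched: report the original failure.
      }
    }
    return OffsetDateTime.of(ldt, ZoneOffset.UTC);
  }

  public static void main(String[] args) {
    System.out.println(parse("2019-09-30T20:47:43Z"));
  }
}
```

A loop over a list keeps the change small and avoids nesting one try-catch per pattern as more formats are added.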
Option 2:
- Allow the user to define the UTC formatter with the ALTER SESSION SET syntax.
- Not sure whether the framework supports extending this feature.
Example (dummy):
ALTER SESSION SET `store.json.date_formatter` = "yyyy-MM-dd'T'HH:mm:ss'Z'"
LocalDateTime ldt;
String value = parser.getValueAsString();
// hasUTCKeyword: value.indexOf('T') > 0 && value.indexOf('Z') > value.indexOf('T')
if (hasUTCKeyword(value)) {
  ldt = LocalDateTime.parse(value, session_date_formatter);
} else {
  ldt = OffsetDateTime.parse(value, session_date_formatter).toLocalDateTime();
}
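One way to avoid the hasUTCKeyword heuristic is to let the formatter report what it parsed. The sketch below assumes the session option from the dummy example is available as a plain pattern string (that option does not exist in Drill as far as this issue states) and uses the standard DateTimeFormatter.parseBest method. The class name SessionFormatterSketch is hypothetical.

```java
import java.time.LocalDateTime;
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;
import java.time.temporal.TemporalAccessor;

// Hypothetical helper; sessionPattern stands in for the proposed session option.
class SessionFormatterSketch {
  static LocalDateTime parse(String value, String sessionPattern) {
    DateTimeFormatter formatter = DateTimeFormatter.ofPattern(sessionPattern);
    // parseBest tries each query in order and returns the first that succeeds:
    // an OffsetDateTime if the pattern yields an offset, else a LocalDateTime.
    TemporalAccessor best =
        formatter.parseBest(value, OffsetDateTime::from, LocalDateTime::from);
    if (best instanceof OffsetDateTime) {
      return ((OffsetDateTime) best).toLocalDateTime();
    }
    return (LocalDateTime) best;
  }

  public static void main(String[] args) {
    System.out.println(parse("2019-09-30T20:47:43Z", "yyyy-MM-dd'T'HH:mm:ss'Z'"));
  }
}
```

With this approach the branch is driven by the user's pattern rather than by scanning the value for 'T' and 'Z'.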
Describe alternatives you’ve considered
None.
Additional context
When a date value is stored in Mongo as an ISODate (without a timezone, also called the zero timezone) and store.mongo.bson.record.reader is set to false:
{
"_id" : ObjectId("5da7760149b3f000195cabb"),
"date" : ISODate("2019-09-24T20:06:56Z")
}
Drill fails with this error:
Caused by: java.lang.Exception: Text '2019-09-30T20:47:43Z' could not be parsed at index 19
This happens because OffsetDateTime parses the date string with the fixed formatter yyyy-MM-dd'T'HH:mm:ss.SSSXX, so it does not accept the UTC format ***T***Z (also called the zero timezone):
Example 1:
yyyy-MM-dd'T'HH:mm:ss.SSS'Z'
Example 2:
yyyy-MM-dd'T'HH:mm:ss'Z'
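The failure can be reproduced with java.time alone. This is a small sketch that assumes the fixed pattern quoted above is what the reader uses. The class name FixedFormatterDemo and the errorIndex helper are made up for the illustration.

```java
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

// Hypothetical demo class; only the pattern string comes from this issue.
class FixedFormatterDemo {
  static final DateTimeFormatter FIXED =
      DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss.SSSXX");

  // Returns the index at which parsing failed, or -1 if the value parsed.
  static int errorIndex(String value) {
    try {
      OffsetDateTime.parse(value, FIXED);
      return -1;
    } catch (DateTimeParseException e) {
      return e.getErrorIndex();
    }
  }

  public static void main(String[] args) {
    // The pattern expects '.' at index 19 but finds 'Z'.
    System.out.println(errorIndex("2019-09-30T20:47:43Z")); // prints 19
  }
}
```

Note that under this pattern a value with milliseconds, such as 2019-09-30T20:47:43.000Z, does parse, because the XX offset field accepts Z. In this sketch only the form without fractional seconds fails.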
Issue Analytics
- State:
- Created 2 years ago
- Comments:8 (8 by maintainers)
@luocooong, I may be confused, but as I read the Mongo spec, it does want the UTC “Zulu” format: 2019-09-30T20:47:43Z. I verified that this format is tested and does work correctly in the new JSON loader.
When you say “without a timezone”, I think you are describing a date/time of the form 2019-09-30T20:47:43. This would be a local time. Drill uses local time internally, but Mongo (wisely) seems to use UTC time. Thus, if you are reading Mongo data, you should not see a local time (if I understand Mongo correctly). By the way, 2019-09-30T20:47:43Z does have a time zone: it is zero offset, also called GMT.
Your testing shows that the “Zulu” format is broken in the old JSON parser. Let’s try to figure out why it is broken.
I looked at the code in your call stack. It is pretty convoluted, which is another reason for the new JSON parser. The JSON parser tries to handle all maps the same. Mongo extended types are, syntactically, a JSON map. The MapVectorOutput.run() method checks for Mongo keywords. For the date/time keyword, the code then calls VectorOutput$MapVectorOutput.writeTimestamp. I suspect this is where things went wrong.
The VectorOutput$MapVectorOutput.writeTimestamp method is not unique to Mongo JSON; it is a generic vector method. As you note, at present it uses the isoFormatTime constant in DateUtility. For Mongo, it should use the UTC_FORMATTER constant.
Checking the file history, it looks like the following commit broke things: “DRILL-6242 Use java.time.Local{Date|Time|DateTime} for Drill Date, Time, Timestamp types.” (Use “Blame” on the VectorOutput.java file.) My guess is that the author wanted to make sure Drill used only local times, and did not realize that he was breaking Mongo, which requires ISO “Zulu” timestamps.
A quick check of the code suggests that only the old JsonReader uses this code path. So, you can try reverting the code to use the UTC_FORMATTER and rerun unit tests. Also check against your Mongo test case. If both of these work, then this is the simplest fix.
Now, it could be that something in the tests uses a Mongo-format date/time, but with a Drill-like local time. If so, then we can look at the problem and think about how to solve it. Let’s see if running the tests tells us if we even have this problem.
@paul-rogers Thanks for the information. But the old JSON loader uses a fixed date formatter, yyyy-MM-dd'T'HH:mm:ss.SSSXX, which requires a timezone (or offset), and the OffsetDateTime class does not accept a date string without a timezone.
https://github.com/apache/drill/blob/39b565f112122734c080324fdcbef518ced16507/exec/java-exec/src/main/java/org/apache/drill/exec/vector/complex/fn/VectorOutput.java#L353-L354
So, the old JSON loader cannot parse 2019-09-30T20:47:43Z with this fixed formatter. Interestingly, 2019-09-30T20:47:43Z is equal to 2019-09-30T20:47:43+0000 (Z is the zero timezone), but OffsetDateTime does not recognize that with this formatter. In this case, do we need to update the new JSON loader?
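The equivalence of Z and +0000 can be checked with a pattern whose fractional seconds are optional. This is only an illustrative sketch: the pattern yyyy-MM-dd'T'HH:mm:ss[.SSS]XX and the class name ZuluEquivalence are assumptions, not something taken from Drill.

```java
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical demo; the optional-fraction pattern is an assumption for illustration.
class ZuluEquivalence {
  // [.SSS] makes the milliseconds optional; XX accepts both "Z" and "+0000".
  static final DateTimeFormatter OPTIONAL_MILLIS =
      DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss[.SSS]XX");

  static OffsetDateTime parse(String value) {
    return OffsetDateTime.parse(value, OPTIONAL_MILLIS);
  }

  public static void main(String[] args) {
    OffsetDateTime zulu = parse("2019-09-30T20:47:43Z");
    OffsetDateTime explicit = parse("2019-09-30T20:47:43+0000");
    System.out.println(zulu.equals(explicit)); // prints true
  }
}
```

In this sketch the parse failure comes from the mandatory .SSS part of the fixed pattern rather than from the Z itself, since the XX field already treats Z as a zero offset.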