
INSERT creating new Hive table partition uses wrong field delimiters for text format

See original GitHub issue

Hi - When running INSERT INTO a Hive table defined as below, Presto appears to write valid data files. However, subsequent SELECTs on the table return all NULL values. When running the same SELECT in Hive, the same files are read and displayed correctly. Is this a known/fixed bug, or is there an issue with my table definition? I'm running Presto 0.152.3.

CREATE TABLE `test`(
  `d` timestamp,
  `a` string,
  `b` string,
  `c` string
)
PARTITIONED BY (
  `p` string)
ROW FORMAT SERDE
  'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
WITH SERDEPROPERTIES (
  'field.delim'='\t',
  'line.delim'='\n',
  'serialization.format'='\t')
STORED AS INPUTFORMAT
  'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
  's3://bucket'
TBLPROPERTIES (
  'serialization.null.format'=''
)
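
For background on the symptom: Hive's LazySimpleSerDe splits each line of a text file on the configured field delimiter and returns NULL for any column it cannot fill. A minimal Python sketch of that lenient parsing behaviour (the helper function and sample row are hypothetical, purely for illustration):

```python
def parse_row(line, delim, num_cols):
    """Mimic LazySimpleSerDe's lenient parsing: split the line on the
    field delimiter and pad any missing trailing columns with None (NULL)."""
    fields = line.rstrip("\n").split(delim)[:num_cols]
    return fields + [None] * (num_cols - len(fields))

# A row written with the table's declared tab delimiter parses into
# all four declared columns (d, a, b, c):
row = "2017-03-14 00:00:00\tfoo\tbar\tbaz"
print(parse_row(row, "\t", 4))
# -> ['2017-03-14 00:00:00', 'foo', 'bar', 'baz']
```

When the delimiter in the file matches `field.delim` in the SerDe properties, every column is populated; the NULLs in the issue suggest that match is broken somewhere.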

Thanks

Issue Analytics

  • State: closed
  • Created: 7 years ago
  • Comments: 9 (4 by maintainers)

Top GitHub Comments

4 reactions
gcassarini commented on May 30, 2017

Hi, I found the same behaviour in Presto version 0.170. Is that possible?

Thanks

2 reactions
ankitdixit commented on Mar 14, 2017

Looking into it some more, I found that the issue is that Presto does not use the field delimiters specified when the table was created if there is no data in the partition. The first insert is done using the default delimiters, and subsequent reads fail.
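
If that diagnosis is right, the first INSERT wrote the file with Hive's default field delimiter, \x01 (Ctrl-A), while the table's SerDe reads with \t. Splitting such a line on the tab leaves the whole row in one field and the remaining columns NULL. A hypothetical Python sketch of the mismatch (sample data invented for illustration, using the same lenient split-and-pad behaviour as LazySimpleSerDe):

```python
def parse_row(line, delim, num_cols):
    # Lenient split-and-pad, as LazySimpleSerDe does for text tables.
    fields = line.rstrip("\n").split(delim)[:num_cols]
    return fields + [None] * (num_cols - len(fields))

# Row written with Hive's default delimiter \x01 instead of the declared tab:
written = "2017-03-14 00:00:00\x01foo\x01bar\x01baz"
print(parse_row(written, "\t", 4))
# -> ['2017-03-14 00:00:00\x01foo\x01bar\x01baz', None, None, None]
```

The first "column" is then not a valid timestamp either, so it would also surface as NULL, which matches the all-NULL SELECT results in the report.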


Top Results From Across the Web

Hive create table and insert table went wrong - Stack Overflow
This issue comes if there is data type miss match or file format is not correct . is if productid is not of...
CREATE HIVEFORMAT TABLE - Spark 3.3.1 Documentation
Description. The CREATE TABLE statement defines a new table using Hive format. ... Partitions are created on the table, based on the columns...
CREATE TABLE with Hive format | Databricks on AWS
Partitions the table by the specified columns. Use the SERDE clause to specify a custom SerDe for one table. Otherwise, use the DELIMITED ......
Hive Create Table Syntax & Usage with Examples
but, I have a comma-separated file to load the data into this table hence, I've used ROW FORMAT DELIMITED FIELDS TERMINATED BY optional...
CREATE TABLE Statement | 6.3.x - Cloudera Documentation
Currently, Impala can query more types of file formats than it can create or insert into. Use Hive to perform any create or...
