
INSERT creating new Hive table partition uses wrong field delimiters for text format

See original GitHub issue

Hi - When running INSERT INTO a Hive table defined as below, Presto appears to write valid data files. However, subsequent SELECTs on the table return all NULL values. When running the same SELECT in Hive, the same files are read and displayed correctly. Is this a known/fixed bug, or is there an issue with my table definition? I'm running Presto 0.152.3.

CREATE TABLE `test`(
  `d` timestamp,
  `a` string,
  `b` string,
  `c` string
)
PARTITIONED BY (
  `p` string)
ROW FORMAT SERDE
  'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
WITH SERDEPROPERTIES (
  'field.delim'='\t',
  'line.delim'='\n',
  'serialization.format'='\t')
STORED AS INPUTFORMAT
  'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
  's3://bucket'
TBLPROPERTIES (
  'serialization.null.format'=''
)
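
For background on the symptom: Hive's LazySimpleSerDe splits each line of a text file on the configured field delimiter and returns NULL for any column it cannot fill. A minimal Python sketch of that lenient parsing behaviour (the helper function and sample row are hypothetical, purely for illustration):

```python
def parse_row(line, delim, num_cols):
    """Mimic LazySimpleSerDe's lenient parsing: split the line on the
    field delimiter and pad any missing trailing columns with None (NULL)."""
    fields = line.rstrip("\n").split(delim)[:num_cols]
    return fields + [None] * (num_cols - len(fields))

# A row written with the table's declared tab delimiter parses into
# all four declared columns (d, a, b, c):
row = "2017-03-14 00:00:00\tfoo\tbar\tbaz"
print(parse_row(row, "\t", 4))
# -> ['2017-03-14 00:00:00', 'foo', 'bar', 'baz']
```

When the delimiter in the file matches `field.delim` in the SerDe properties, every column is populated; the NULLs in the issue suggest that match is broken somewhere.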

Thanks

Issue Analytics

  • State: closed
  • Created: 7 years ago
  • Comments: 9 (4 by maintainers)

Top GitHub Comments

4 reactions
gcassarini commented on May 30, 2017

Hi, I found the same behaviour in Presto version 0.170. Is that possible?

Thanks

2 reactions
ankitdixit commented on Mar 14, 2017

Looking into it some more, I found that the issue is that Presto does not use the field delimiters specified when the table was created if there is no data in the partition. The first insert is done using the default delimiters, and subsequent reads fail.
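
If that diagnosis is right, the first INSERT wrote the file with Hive's default field delimiter, \x01 (Ctrl-A), while the table's SerDe reads with \t. Splitting such a line on the tab leaves the whole row in one field and the remaining columns NULL. A hypothetical Python sketch of the mismatch (sample data invented for illustration, using the same lenient split-and-pad behaviour as LazySimpleSerDe):

```python
def parse_row(line, delim, num_cols):
    # Lenient split-and-pad, as LazySimpleSerDe does for text tables.
    fields = line.rstrip("\n").split(delim)[:num_cols]
    return fields + [None] * (num_cols - len(fields))

# Row written with Hive's default delimiter \x01 instead of the declared tab:
written = "2017-03-14 00:00:00\x01foo\x01bar\x01baz"
print(parse_row(written, "\t", 4))
# -> ['2017-03-14 00:00:00\x01foo\x01bar\x01baz', None, None, None]
```

The first "column" is then not a valid timestamp either, so it would also surface as NULL, which matches the all-NULL SELECT results in the report.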


Top Results From Across the Web

Hive create table and insert table went wrong - Stack Overflow
This issue comes if there is data type miss match or file format is not correct . is if productid is not of...
CREATE HIVEFORMAT TABLE - Spark 3.3.1 Documentation
Description. The CREATE TABLE statement defines a new table using Hive format. ... Partitions are created on the table, based on the columns...
CREATE TABLE with Hive format | Databricks on AWS
Partitions the table by the specified columns. Use the SERDE clause to specify a custom SerDe for one table. Otherwise, use the DELIMITED ......
Hive Create Table Syntax & Usage with Examples
but, I have a comma-separated file to load the data into this table hence, I've used ROW FORMAT DELIMITED FIELDS TERMINATED BY optional...
CREATE TABLE Statement | 6.3.x - Cloudera Documentation
Currently, Impala can query more types of file formats than it can create or insert into. Use Hive to perform any create or...
