Document Hive connector table properties explicitly

Currently we only have this in code snippets.

Here is what I think is correct, without too much digging:

.. list-table:: Hive connector storage table properties
  :widths: 30, 60, 5
  :header-rows: 1

  * - Property name
    - Description
    - Default
  * - ``format``
    - File format used for the table's data files. Valid values include
      ``ORC``, ``PARQUET``, ``CSV``, ``JSON`` and others. The default is taken
      from the catalog property ``hive.storage-format``, which is ``ORC``
      unless configured otherwise.
    - ``ORC``
  * - ``partitioned_by``
    - Partitioning column for the storage table.
    - ``[]``
  * - ``bucketed_by``
    - Bucketing column for the storage table. Must be used with
      ``bucket_count``.
    - ``[]``
  * - ``bucket_count``
    - Number of buckets to group data into. Must be used with ``bucketed_by``.
    -
  * - ``sorted_by``
    - Column to sort by to determine bucketing for row.
    - ``[]``
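
A minimal sketch of how the properties in the table above might be combined in one CREATE TABLE statement. The catalog name hive, the schema web, and the table and column names are hypothetical and only there for illustration. The property names and value shapes follow the draft table and should be checked against the connector before they are relied on.

```sql
-- Hypothetical catalog (hive), schema (web), table and columns.
CREATE TABLE hive.web.page_views (
    view_time timestamp,
    user_id bigint,
    page_url varchar,
    -- Partition columns are listed last, as the Hive connector expects.
    ds date,
    country varchar
)
WITH (
    format = 'ORC',                           -- storage file format
    partitioned_by = ARRAY['ds', 'country'],  -- partitioning columns
    bucketed_by = ARRAY['user_id'],           -- must be paired with bucket_count
    bucket_count = 50,                        -- number of buckets
    sorted_by = ARRAY['view_time']            -- used together with bucketing
);
```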

There are probably other table properties, and I know that we have properties for ORC and Parquet. We should collect them all in a section or table with an anchor, so we can link to it from other places such as the materialized view docs.

Issue Analytics

  • State: open
  • Created 2 years ago
  • Comments: 5 (5 by maintainers)

Top GitHub Comments

mosabua commented, Aug 19, 2021

That does not detail what the possible values for each table property are, and it requires the user to run a query to be able to write a query. While inline documentation like this is great, it's probably not good enough. We can link to that in addition to documenting the properties in the actual docs, with possible values, explanations, and even example queries.

findepi commented, Aug 23, 2021

@mosabua yes, the available table properties should be documented in Hive connector docs.

Currently the coverage is erratic: for example, bucketed_by is mentioned only in examples, and skip_header_line_count is mentioned only in release notes.

By the way, the query SELECT * FROM system.metadata.table_properties is a good way to get the docs initialized. Also, if we could test the docs (and we already do in some rare cases!), we could test coherence between the output of SELECT * FROM system.metadata.table_properties and the documented properties. This would help keep them in sync.
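
As a rough illustration of the query mentioned above, narrowed to a single catalog: the catalog name hive is an assumption, and the column names are what I understand system.metadata.table_properties to expose, so check them against your Trino version. The output could serve as a starting list for the docs, or as the reference side of a coherence test.

```sql
-- List the table properties a catalog reports, with defaults and descriptions.
-- 'hive' is an assumed catalog name; replace it with your own.
SELECT
    property_name,
    default_value,
    type,
    description
FROM system.metadata.table_properties
WHERE catalog_name = 'hive'
ORDER BY property_name;
```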
