Elasticsearch connector documentation inconsistencies
While trying to connect Presto to an Elasticsearch index, we found a few errors, inconsistencies, and room for improvement in the documentation (version 0.220 at this point). In some cases I believe they are actually bugs (i.e. the code needs fixing), but I’m new to this and not always positive.
- In table definition files, we found that `schemaName` is not optional. If it is not defined, Presto fails to start with `[...] java.lang.IllegalArgumentException: schemaName is null or empty [...]`. (Since there is a default definition for this at the connector level, and even that is supposedly optional, I guess this is a bug.)
- The example table definition has a `hostAddress` property in it; the description below the table uses `host` (and the running code also expects `host`). We first copy-pasted the example to fill it in and for a while couldn’t understand what had gone wrong.
- When the `columns` are not defined (note: they are marked optional), Presto will start, but queries on the table fail with ‘Internal error’. (This affects `SELECT * ...` queries as well as `SHOW COLUMNS IN ...`.) They work fine if the columns are defined.
- Field descriptions are not really helpful, especially under Column Metadata. (Some of this may be because we have not gone very deep into Elastic either, though.) It took us quite a while to figure out what to use for `type` in the table definition. The `JSONPath` is still a bit of a mystery to us (what is it for and how is it used?). Also, we found most of the column metadata to be required in practice, in spite of being marked as optional.
- The `type` in the column definitions looks outright wrong: for example, it has to be defined as `varchar` for Elastic’s `string`. Maybe it is right, though, and just the description is misleading?
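Until the documentation is fixed, the pitfalls listed above can be caught before restarting Presto. The following is only a minimal sketch: `lint_table_definition` is a hypothetical helper, not part of Presto, and the `tableName` key and all sample values are assumptions for illustration. Only `schemaName`, `hostAddress`, `host`, `columns`, and the column `type` come from the observations above.

```python
import json


def lint_table_definition(definition):
    """Return warnings for the pitfalls described in this issue (Presto 0.220)."""
    warnings = []

    # schemaName is documented as optional, but startup reportedly fails without it.
    if not definition.get("schemaName"):
        warnings.append("schemaName is missing or empty")

    # The example in the docs uses hostAddress; the running code reportedly expects host.
    if "hostAddress" in definition and "host" not in definition:
        warnings.append("hostAddress is set but host is not")

    # columns is documented as optional, but queries reportedly fail without it.
    columns = definition.get("columns") or []
    if not columns:
        warnings.append("columns is missing or empty")

    # Column metadata was found to be required in practice; only type is checked here.
    for position, column in enumerate(columns):
        if not column.get("type"):
            warnings.append(f"column at position {position} has no type")

    return warnings


# Hypothetical definition that repeats the copy-paste mistake from the docs example.
sample = json.loads("""
{
    "tableName": "orders",
    "hostAddress": "localhost",
    "columns": [{"name": "id", "type": "varchar"}]
}
""")

for warning in lint_table_definition(sample):
    print(warning)
```

Running such a check on every table definition file before a restart would have pointed us at the `schemaName` and `host` problems directly, instead of leaving us with a failed startup.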
And last but not least: there is an open issue about the connector being able to process only all-lowercase Elastic field names at this point. It took us nearly a day to figure out why Presto did not return rows for a table while properly reporting its size. (I.e. `SELECT COUNT(1) FROM table` reported 13407, `SELECT COUNT(Id) FROM table` reported `NULL`, and we knew that the `Id` field was filled in for each row…) I suggest that until this gets resolved there should be an emphasized note in the docs about it.
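To make the symptom easier to picture, here is a toy model of it. Plain Python dictionaries stand in for Elasticsearch documents, and the lowercasing step inside `read_column` is an assumed explanation for the behavior, not the connector's actual code. It does not reproduce Presto's exact output; it only shows how the row count can be right while every value of a mixed-case field comes back empty.

```python
# Toy stand-ins for Elasticsearch documents with a mixed-case field name.
documents = [{"Id": 1}, {"Id": 2}, {"Id": 3}]


def read_column(docs, requested_name):
    """Look up a column the way a lowercase-only reader might (assumption)."""
    lookup_name = requested_name.lower()  # "Id" becomes "id"
    # Dictionary keys are case-sensitive, so "id" never matches "Id".
    return [doc.get(lookup_name) for doc in docs]


row_count = len(documents)             # counting rows does not touch field names
values = read_column(documents, "Id")  # every lookup misses

print(row_count)
print(values)
```

With all-lowercase field names in the documents, the same lookup returns the stored values, which matches the restriction described in the open issue.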
Top GitHub Comments
@fastcatch We’d love a PR to improve the documentation! Would you like to work on this? cc: @zhenxiao
This issue has been automatically marked as stale because it has not had any activity in the last 2 years. If you feel that this issue is important, just comment and the stale tag will be removed; otherwise it will be closed in 7 days. This is an attempt to ensure that our open issues remain valuable and relevant so that we can keep track of what needs to be done and prioritize the right things.