Cannot Set Index Pattern on Elasticsearch as a Log Handler


Apache Airflow version: 2.0.0

Kubernetes version (if you are using kubernetes) (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.2", GitCommit:"faecb196815e248d3ecfb03c680a4507229c2a56", GitTreeState:"clean", BuildDate:"2021-01-13T13:28:09Z", GoVersion:"go1.15.5", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"18+", GitVersion:"v1.18.8-aliyun.1", GitCommit:"94f1dc8", GitTreeState:"", BuildDate:"2021-01-10T02:57:47Z", GoVersion:"go1.13.15", Compiler:"gc", Platform:"linux/amd64"}

Environment:

  • Cloud provider or hardware configuration: Alibaba Cloud
  • OS (e.g. from /etc/os-release): Debian GNU/Linux 10 (buster)
  • Kernel (e.g. uname -a): Linux airflow-webserver-fb89b7f8b-fgzvv 3.10.0-1160.11.1.el7.x86_64 #1 SMP Fri Dec 18 16:34:56 UTC 2020 x86_64 GNU/Linux
  • Install tools: Helm (Custom)
  • Others: None

What happened: My Airflow deployment uses fluent-bit to capture the stdout logs from the Airflow containers and send the log messages to Elasticsearch on a remote machine. That part works well, and I can see the logs in Kibana. However, Airflow itself cannot display the logs because of this error:

ERROR - Exception on /get_logs_with_metadata [GET]
Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.8/site-packages/flask/app.py", line 2447, in wsgi_app
    response = self.full_dispatch_request()
  File "/home/airflow/.local/lib/python3.8/site-packages/flask/app.py", line 1952, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/home/airflow/.local/lib/python3.8/site-packages/flask/app.py", line 1821, in handle_user_exception
    reraise(exc_type, exc_value, tb)  
  File "/home/airflow/.local/lib/python3.8/site-packages/flask/_compat.py", line 39, in reraise
    raise value
  File "/home/airflow/.local/lib/python3.8/site-packages/flask/app.py", line 1950, in full_dispatch_request
    rv = self.dispatch_request()
  File "/home/airflow/.local/lib/python3.8/site-packages/flask/app.py", line 1936, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/home/airflow/.local/lib/python3.8/site-packages/airflow/www/auth.py", line 34, in decorated
    return func(*args, **kwargs)
  File "/home/airflow/.local/lib/python3.8/site-packages/airflow/www/decorators.py", line 60, in wrapper
    return f(*args, **kwargs)
  File "/home/airflow/.local/lib/python3.8/site-packages/airflow/utils/session.py", line 65, in wrapper
    return func(*args, session=session, **kwargs)
  File "/home/airflow/.local/lib/python3.8/site-packages/airflow/www/views.py", line 1054, in get_logs_with_metadata
    logs, metadata = task_log_reader.read_log_chunks(ti, try_number, metadata)
  File "/home/airflow/.local/lib/python3.8/site-packages/airflow/utils/log/log_reader.py", line 58, in read_log_chunks
    logs, metadatas = self.log_handler.read(ti, try_number, metadata=metadata)
  File "/home/airflow/.local/lib/python3.8/site-packages/airflow/utils/log/file_task_handler.py", line 217, in read
    log, metadata = self._read(task_instance, try_number_element, metadata)
  File "/home/airflow/.local/lib/python3.8/site-packages/airflow/providers/elasticsearch/log/es_task_handler.py", line 160, in _read
    logs = self.es_read(log_id, offset, metadata)
  File "/home/airflow/.local/lib/python3.8/site-packages/airflow/providers/elasticsearch/log/es_task_handler.py", line 233, in es_read
    max_log_line = search.count()
  File "/home/airflow/.local/lib/python3.8/site-packages/elasticsearch_dsl/search.py", line 701, in count
    return es.count(index=self._index, body=d, **self._params)["count"]
  File "/home/airflow/.local/lib/python3.8/site-packages/elasticsearch/client/utils.py", line 84, in _wrapped
    return func(*args, params=params, **kwargs)
  File "/home/airflow/.local/lib/python3.8/site-packages/elasticsearch/client/__init__.py", line 528, in count
    return self.transport.perform_request(
  File "/home/airflow/.local/lib/python3.8/site-packages/elasticsearch/transport.py", line 351, in perform_request
    status, headers_response, data = connection.perform_request(
  File "/home/airflow/.local/lib/python3.8/site-packages/elasticsearch/connection/http_urllib3.py", line 261, in perform_request
    self._raise_error(response.status, raw_data)
  File "/home/airflow/.local/lib/python3.8/site-packages/elasticsearch/connection/base.py", line 181, in _raise_error
    raise HTTP_EXCEPTIONS.get(status_code, TransportError)(
elasticsearch.exceptions.AuthorizationException: AuthorizationException(403, 'security_exception', 'no permissions for [indices:data/read/search] and User [name=airflow, backend_roles=[], request

However, when I debug with the following code, I can see the logs:

import elasticsearch
# Host list elided; es_kwargs and dsl come from the debugging session.
es = elasticsearch.Elasticsearch(['...'], **es_kwargs)
es.search(index="airflow-*", body=dsl)

When I look at the source code of the Elasticsearch provider, no index pattern is defined for the search:

https://github.com/apache/airflow/blob/88199eefccb4c805f8d6527bab5bf600b397c35e/airflow/providers/elasticsearch/log/es_task_handler.py#L216

So I assume the problem is that the user lacks permission to search all indices. How can I set an index pattern so that Airflow reads only certain indices? Thank you!
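To make the difference concrete, here is a minimal sketch of how the target of a count request changes once an index pattern is supplied. It only builds the REST path and never contacts a cluster. `count_path` is a hypothetical helper written for this illustration; it is not part of Airflow or the Elasticsearch client.

```python
from typing import Optional


def count_path(index: Optional[str] = None) -> str:
    """Build the REST path of an Elasticsearch count request.

    Hypothetical helper for illustration only. Without an index the
    request targets every index on the cluster; with a pattern it is
    limited to the indices that match.
    """
    return f"/{index}/_count" if index else "/_count"


# A search with no index set (as in the handler) is not restricted.
print(count_path())
# The manual debug call restricts the request to indices matching airflow-*.
print(count_path("airflow-*"))
```

A user whose role only grants read access to `airflow-*` would be allowed to make the second request but not the first, which matches the 403 in the traceback.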

What you expected to happen: The Airflow configuration has an option to set an Elasticsearch index pattern, so that Airflow queries only certain indices rather than every index on the Elasticsearch server.
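As a rough sketch of the requested behavior, the snippet below reads an index pattern from an `[elasticsearch]` config section and falls back to `_all` (Elasticsearch's name for every index) when nothing is set. The option name `index_patterns`, the host value, and the helper `read_index_patterns` are assumptions made for illustration; Airflow 2.0.0 as described in this issue has no such option.

```python
import configparser

# Hypothetical configuration: the index_patterns option and the host
# value are assumptions, not settings that exist in Airflow 2.0.0.
EXAMPLE_CFG = """
[elasticsearch]
host = es.example.internal:9200
index_patterns = airflow-*
"""


def read_index_patterns(cfg_text: str, default: str = "_all") -> str:
    """Return the configured index pattern, or the default if it is unset."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    # fallback covers both a missing option and a missing section.
    return parser.get("elasticsearch", "index_patterns", fallback=default)


print(read_index_patterns(EXAMPLE_CFG))
```

The returned value could then be passed as the `index` argument of the search, in the same way as the manual `es.search(index="airflow-*", ...)` call in the report.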

How to reproduce it: Click the Log button in the task popup modal to open the logs page.

Anything else we need to know: This happens every time.

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Reactions: 1
  • Comments: 5 (5 by maintainers)

Top GitHub Comments

2 reactions
imamdigmi commented, Jul 7, 2021

Hi @jedcunningham, thanks for your suggestion. I have tried it, and it works.

1 reaction
kouk commented, May 23, 2022

I have started working on this (it’s still a WIP, but any feedback would be helpful): https://github.com/apache/airflow/compare/main...kouk:support-es-index-patterns?expand=1
