Docker Quickstart - Sample Data Loading Error

Describe the bug

Running the Sample data ingestion commands produces an error:

Collecting avro-python3==1.8.2
  Downloading avro-python3-1.8.2.tar.gz (36 kB)
    ERROR: Command errored out with exit status 1:
    …
    Complete output (7 lines):
    Traceback (most recent call last):
    …
      File "/tmp/pip-install-gmtSqW/avro-python3/setup.py", line 114, in Main
        ('Python version >= 3 required, got %r' % sys.version_info)
    AssertionError: Python version >= 3 required, got sys.version_info(major=2, minor=7, micro=17, releaselevel='final', serial=0)
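
The AssertionError above is raised by a version guard in the package's setup.py when pip runs under Python 2.7. As a rough sketch of that kind of guard (the function name `python3_or_newer` is hypothetical, and this is not the actual code from avro-python3):

```python
import sys


def python3_or_newer(version_info=sys.version_info):
    """Return True when the interpreter's major version is at least 3.

    Hypothetical helper for illustration; not taken from avro-python3.
    """
    return version_info[0] >= 3


# The interpreter reported in the traceback was 2.7.17, which fails the guard.
print(python3_or_newer((2, 7, 17)))
# A 3.6 interpreter, as used in the workaround below, passes it.
print(python3_or_newer((3, 6, 0)))
```

Running a check like this inside the ingestion image is one way to confirm which interpreter pip is actually using.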

To Reproduce

Steps to reproduce the behavior:

  1. Successfully run the Quickstart.sh script
  2. Run the commands to import sample data:

docker build -t ingestion -f docker/ingestion/Dockerfile . && cd docker/ingestion && docker-compose up

Expected behavior

Sample Data is loaded into DataHub

Desktop (please complete the following information):

  • OS: Ubuntu 18.04.4 LTS - 4.15.0-91-generic #92-Ubuntu
  • Browser: N/A
  • Version: 18.04.4

Additional context

I got around the above error by changing "--from=python:2.7" to "--from=python:3.6" in datahub/docker/ingestion/Dockerfile and rerunning the docker build command for the ingestion image.

The container was successfully built but when the container started, I received a different error message:

Attaching to ingestion
ingestion | Traceback (most recent call last):
ingestion |   File "mce_cli.py", line 3, in <module>
ingestion |     from confluent_kafka import avro
ingestion |   File "/root/.local/lib/python3.6/site-packages/confluent_kafka/avro/__init__.py", line 9, in <module>
ingestion |     from confluent_kafka.avro.cached_schema_registry_client import CachedSchemaRegistryClient
ingestion |   File "/root/.local/lib/python3.6/site-packages/confluent_kafka/avro/cached_schema_registry_client.py", line 27, in <module>
ingestion |     from requests import Session, utils
ingestion | ModuleNotFoundError: No module named 'requests'
ingestion exited with code 1

Then I got around that error by adding the line "requests==2.23.0" to datahub/metadata-ingestion/mce-cli/requirements.txt.
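
Before rebuilding the image, it can help to check which of the modules named in the traceback are importable at all. A minimal sketch using only the standard library (the helper name `missing_modules` is hypothetical):

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Module names taken from the traceback above; the result depends on
# what is installed in the environment where this runs.
print(missing_modules(["requests", "confluent_kafka", "avro"]))
```

An empty list means every listed module can be located; any name that appears still needs to be added to requirements.txt or pulled in by another package.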

Again the image built and the ingestion container ran, but I received this error:

ingestion | avro.io.AvroTypeException: The datum {'auditHeader': None, 'proposedSnapshot': ('com.linkedin.pegasus2avro.metadata.snapshot.CorpUserSnapshot', {'urn': 'urn:li:corpuser:datahub', 'aspects': [{'active': True, 'displayName': 'Data Hub', 'fullName': 'Data Hub', 'email': 'datahub@linkedin.com', 'title': 'CEO'}, {}]}), 'proposedDelta': None} is not an example of the schema

At this point I gave up, because I was mucking with too many components.

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 9 (4 by maintainers)

Top GitHub Comments

skybabydad commented, Mar 29, 2020

Solved it too. Step 1: edit docker/ingestion/Dockerfile and change the Python image from 2.7 to 3.6:

#vi datahub/docker/ingestion/Dockerfile
FROM openjdk:8
COPY --from=python:3.6 / /
COPY . datahub-src
RUN cd datahub-src && ./gradlew :metadata-events:mxe-schemas:build \
    && cp -r metadata-ingestion/mce-cli . && cd metadata-ingestion/mce-cli \
    && pip install --user -r requirements.txt

Step 2: change datahub/metadata-ingestion/mce-cli/requirements.txt as follows:

avro-python3==1.8.2; python_version == '3.7'
confluent-kafka==1.1.0; python_version == '3.7'
confluent-kafka[avro]==1.1.0; python_version < '3.7'
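
The `python_version` markers after each semicolon decide which of these lines pip installs on a given interpreter. The sketch below is a simplified stand-in for pip's marker evaluation, handling only `python_version` comparisons; the function `applicable` and its helpers are hypothetical names, not part of pip:

```python
import operator

# Comparison operators supported by this simplified evaluator.
_OPS = {
    "==": operator.eq,
    "!=": operator.ne,
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
}


def _as_tuple(version):
    """Turn a version string such as '3.6' into a tuple such as (3, 6)."""
    return tuple(int(part) for part in version.split("."))


def applicable(requirements, python_version):
    """Return the requirement specifiers whose marker matches python_version."""
    selected = []
    for line in requirements:
        requirement, _, marker = line.partition(";")
        if not marker.strip():
            # No marker: the requirement always applies.
            selected.append(requirement.strip())
            continue
        name, op, value = marker.split()
        if name != "python_version":
            raise ValueError("only python_version markers are handled here")
        if _OPS[op](_as_tuple(python_version), _as_tuple(value.strip("'\""))):
            selected.append(requirement.strip())
    return selected


lines = [
    "avro-python3==1.8.2; python_version == '3.7'",
    "confluent-kafka==1.1.0; python_version == '3.7'",
    "confluent-kafka[avro]==1.1.0; python_version < '3.7'",
]
print(applicable(lines, "3.6"))
print(applicable(lines, "3.7"))
```

Under this reading, a Python 3.6 image selects only the `confluent-kafka[avro]` line, while Python 3.7 selects the first two.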

klashe1977 commented, Apr 14, 2020

This works for me now. Thank you!
