
Starting scrapyd docker container with eggs included


Hi, I’ve been experimenting a little with scrapyd on Docker and have done the following:

  • In the config file, I specified a different directory for eggs: eggs_dir = /src/eggs
  • In the Dockerfile, I added prebuilt projects to this directory: ADD eggs /src/eggs (both files are sketched below)

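For context, the relevant pieces would look roughly like this (the paths are the ones from the list above; everything else is an assumption):

scrapyd.conf (fragment):

[scrapyd]
eggs_dir = /src/eggs

Dockerfile (fragment):

# copy the prebuilt egg files into the directory scrapyd scans for projects
ADD eggs /src/eggs
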
At first glance, it looked like it was working (screenshot from the original issue not shown here).

But when I made a schedule.json POST request, it returned an error:

{"node_name": "295e305bea8e", "status": "error", "message": "Scrapy 1.4.0 - no active project\n\nUnknown command: list\n\nUse \"scrapy\" to see available commands\n"}

I could type anything into the project and spider fields and the result was the same. How can I fix this issue?
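For reference, checking and scheduling deployed projects through the scrapyd HTTP API usually looks like this (host, project, and spider names are placeholders). The "no active project" message above typically means scrapyd could not find an egg for the requested project and fell back to running scrapy without one:

# list the projects scrapyd actually knows about
curl http://localhost:6800/listprojects.json

# schedule a spider from a deployed project
curl http://localhost:6800/schedule.json -d project=myproject -d spider=myspider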

Issue Analytics

  • State: open
  • Created: 6 years ago
  • Comments: 12 (1 by maintainers)

Top GitHub Comments

6 reactions
jacob1237 commented, Nov 20, 2019

@VanDavv @iamprageeth @radyz

I managed to solve the problem without using the API. Unfortunately, there is no way to deploy Scrapy projects to scrapyd entirely without egg files (the only alternative is to override some scrapyd components), so you’ll need a simple deployment script:

build.sh:

#!/bin/sh

set -e

# The alternative way to build eggs is to use setup.py
# if you already have it in the Scrapy project's root
scrapyd-deploy --build-egg=myproject.egg

# your docker container build commands
# ...

Dockerfile:

# the "eggs" directory must match scrapyd's eggs_dir setting (default: eggs)
RUN mkdir -p eggs/myproject
COPY myproject.egg eggs/myproject/1_0.egg

CMD ["scrapyd"]

That’s all! So instead of dropping myproject.egg into the eggs folder directly, you have to create the following structure: eggs/myproject/1_0.egg, where myproject is your project name and 1_0 is the version of your project in scrapyd.
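
To make the mapping explicit: with the default eggs_dir = eggs, the layout scrapyd expects would be roughly the following (names taken from the comment above), and listversions.json can confirm that scrapyd picked the egg up (port 6800 is the scrapyd default):

# layout relative to scrapyd's working directory
# eggs/
# └── myproject/     <- project name as scrapyd will report it
#     └── 1_0.egg    <- version "1_0"

# verify after the container starts
curl "http://localhost:6800/listversions.json?project=myproject"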

4 reactions
radyz commented, Dec 1, 2018

I managed to get through this by running a background deploy after my scrapyd instance has started. I’m not sure it’s the best way, but it works for me for now.

Dockerfile

FROM python:3.6

COPY requirements.txt /requirements.txt
RUN pip install -r requirements.txt
COPY docker-entrypoint /usr/local/bin/
RUN chmod 0755 /usr/local/bin/docker-entrypoint
COPY . /scrapyd
WORKDIR /scrapyd

ENTRYPOINT ["/usr/local/bin/docker-entrypoint"]

Entrypoint script

#!/bin/bash
# give scrapyd time to start, then deploy the project in the background
bash -c 'sleep 15; scrapyd-deploy' &
# run scrapyd in the foreground as the container's main process
scrapyd
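
A possible variation on this script (not from the original comment, and assuming curl is available in the image): instead of a fixed sleep, poll scrapyd’s daemonstatus.json endpoint until the server answers, then deploy:

#!/bin/bash
# wait until scrapyd responds, then deploy the project in the background
(
  until curl -sf http://localhost:6800/daemonstatus.json > /dev/null; do
    sleep 1
  done
  scrapyd-deploy
) &
# keep scrapyd in the foreground as the container's main process
exec scrapyd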

scrapy.cfg

[settings]
default = scraper.settings

[deploy]
url = http://localhost:6800
project = projectname

This assumes you are copying your Scrapy project folder into /scrapyd and have a requirements.txt with all your dependencies (including the scrapyd server).
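
To try this setup end to end, a typical build-and-run sequence would be the following (image name and port mapping are assumptions, not part of the original comment):

docker build -t scrapyd-scraper .
docker run -d -p 6800:6800 scrapyd-scraper

# once the delayed scrapyd-deploy has run, the project should show up here
curl http://localhost:6800/listprojects.json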


