Error using 3rd party library in remote function

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Mac Mojave 10.14.5 Beta
  • Ray installed from (source or binary): binary, via pip install -U ray
  • Ray version: 0.7.1
  • Python version: 3.7.1
  • Exact command to reproduce: python3 crawl_stock_daily.py 001231

Describe the problem

I want to use TorRequest (https://github.com/erdiaker/torrequest) inside a remote function. A remote function defined without TorRequest works fine, but as soon as I call a TorRequest method inside a Ray remote function, I get the error “TypeError: can’t pickle _thread.lock objects”.

$ python3 crawl-stock-daily.py 012330
2019-07-01 21:24:18,615	INFO node.py:498 -- Process STDOUT and STDERR is being redirected to /tmp/ray/session_2019-07-01_21-24-18_614493_54252/logs.
2019-07-01 21:24:18,722	INFO services.py:409 -- Waiting for redis server at 127.0.0.1:27576 to respond...
2019-07-01 21:24:18,833	INFO services.py:409 -- Waiting for redis server at 127.0.0.1:36810 to respond...
2019-07-01 21:24:18,835	INFO services.py:806 -- Starting Redis shard with 6.87 GB max memory.
2019-07-01 21:24:18,847	INFO node.py:512 -- Process STDOUT and STDERR is being redirected to /tmp/ray/session_2019-07-01_21-24-18_614493_54252/logs.
2019-07-01 21:24:18,847	INFO services.py:1442 -- Starting the Plasma object store with 10.31 GB memory using /tmp.
Stock(name='현대모비스', code='012330')
Traceback (most recent call last):
  File "crawl-stock-daily.py", line 147, in <module>
    result = ray.get(crawl_stock.remote(stock))
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/ray/remote_function.py", line 84, in remote
    return self._remote(args=args, kwargs=kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/ray/remote_function.py", line 119, in _remote
    worker.function_actor_manager.export(self)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/ray/function_manager.py", line 348, in export
    self._do_export(remote_function)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/ray/function_manager.py", line 367, in _do_export
    pickled_function = pickle.dumps(function)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/ray/cloudpickle/cloudpickle.py", line 952, in dumps
    cp.dump(obj)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/ray/cloudpickle/cloudpickle.py", line 267, in dump
    return Pickler.dump(self, obj)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/pickle.py", line 437, in dump
    self.save(obj)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/pickle.py", line 504, in save
    f(self, obj) # Call unbound method with explicit self
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/ray/cloudpickle/cloudpickle.py", line 395, in save_function
    self.save_function_tuple(obj)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/ray/cloudpickle/cloudpickle.py", line 594, in save_function_tuple
    save(state)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/pickle.py", line 504, in save
    f(self, obj) # Call unbound method with explicit self
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/pickle.py", line 856, in save_dict
    self._batch_setitems(obj.items())
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/pickle.py", line 882, in _batch_setitems
    save(v)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/pickle.py", line 504, in save
    f(self, obj) # Call unbound method with explicit self
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/pickle.py", line 856, in save_dict
    self._batch_setitems(obj.items())
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/pickle.py", line 882, in _batch_setitems
    save(v)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/pickle.py", line 549, in save
    self.save_reduce(obj=obj, *rv)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/pickle.py", line 662, in save_reduce
    save(state)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/pickle.py", line 504, in save
    f(self, obj) # Call unbound method with explicit self
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/pickle.py", line 856, in save_dict
    self._batch_setitems(obj.items())
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/pickle.py", line 882, in _batch_setitems
    save(v)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/pickle.py", line 504, in save
    f(self, obj) # Call unbound method with explicit self
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/ray/cloudpickle/cloudpickle.py", line 395, in save_function
    self.save_function_tuple(obj)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/ray/cloudpickle/cloudpickle.py", line 594, in save_function_tuple
    save(state)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/pickle.py", line 504, in save
    f(self, obj) # Call unbound method with explicit self
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/pickle.py", line 856, in save_dict
    self._batch_setitems(obj.items())
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/pickle.py", line 882, in _batch_setitems
    save(v)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/pickle.py", line 504, in save
    f(self, obj) # Call unbound method with explicit self
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/pickle.py", line 856, in save_dict
    self._batch_setitems(obj.items())
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/pickle.py", line 882, in _batch_setitems
    save(v)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/pickle.py", line 549, in save
    self.save_reduce(obj=obj, *rv)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/pickle.py", line 662, in save_reduce
    save(state)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/pickle.py", line 504, in save
    f(self, obj) # Call unbound method with explicit self
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/pickle.py", line 856, in save_dict
    self._batch_setitems(obj.items())
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/pickle.py", line 882, in _batch_setitems
    save(v)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/pickle.py", line 549, in save
    self.save_reduce(obj=obj, *rv)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/pickle.py", line 662, in save_reduce
    save(state)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/pickle.py", line 504, in save
    f(self, obj) # Call unbound method with explicit self
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/pickle.py", line 856, in save_dict
    self._batch_setitems(obj.items())
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/pickle.py", line 882, in _batch_setitems
    save(v)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/pickle.py", line 524, in save
    rv = reduce(self.proto)
TypeError: can't pickle _thread.lock objects
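
The last frame of the traceback is the standard-library pickler rejecting a lock object it found while walking nested dictionaries. A minimal sketch of that same failure, independent of Ray and TorRequest, is shown here; the helper name `try_pickle` and the sample dictionaries are hypothetical and exist only for illustration.

```python
import pickle
import threading


def try_pickle(obj):
    """Return None if obj pickles cleanly, otherwise the TypeError message."""
    try:
        pickle.dumps(obj)
    except TypeError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    # Plain data serializes without complaint.
    print(try_pickle({"code": "012330"}))
    # Any state that holds a lock fails, however deeply it is nested.
    print(try_pickle({"client_state": {"lock": threading.Lock()}}))
```

The exact wording of the message differs between Python versions, but the exception type is `TypeError` in each case, which matches the final line of the traceback.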

Source code / logs

import re
import sys
import random
from stocks import stocks
from torrequest import TorRequest
from bs4 import BeautifulSoup
import os
import ray
import time



browsers = [
    'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36',
    'Mozilla/5.0 (Windows NT 6.2; Win64; x64; rv:16.0.1) Gecko/20121011 Firefox/16.0.1',
    'Mozilla/5.0 (iPad; CPU OS 6_0 like Mac OS X) AppleWebKit/536.26 (KHTML, like Gecko) Version/6.0 Mobile/10A5355d Safari/8536.25',
   #Chrome
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.113 Safari/537.36',
    'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.90 Safari/537.36',
    'Mozilla/5.0 (Windows NT 5.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.90 Safari/537.36',
    'Mozilla/5.0 (Windows NT 6.2; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.90 Safari/537.36',
    'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.157 Safari/537.36',
    'Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.113 Safari/537.36',
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36',
    'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36',
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.87 Safari/537.36',
    'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.87 Safari/537.36',
    # Internet Explorer
    'Mozilla/4.0 (compatible; MSIE 9.0; Windows NT 6.1)',
    'Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko',
    'Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0)',
    'Mozilla/5.0 (Windows NT 6.1; Trident/7.0; rv:11.0) like Gecko',
    'Mozilla/5.0 (Windows NT 6.2; WOW64; Trident/7.0; rv:11.0) like Gecko',
    'Mozilla/5.0 (Windows NT 10.0; WOW64; Trident/7.0; rv:11.0) like Gecko',
    'Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.0; Trident/5.0)',
    'Mozilla/5.0 (Windows NT 6.3; WOW64; Trident/7.0; rv:11.0) like Gecko',
    'Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)',
    'Mozilla/5.0 (Windows NT 6.1; Win64; x64; Trident/7.0; rv:11.0) like Gecko',
    'Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1; WOW64; Trident/6.0)',
    'Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1; Trident/6.0)',
    'Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)',
    #Chrome
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.113 Safari/537.36',
    'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.90 Safari/537.36',
    'Mozilla/5.0 (Windows NT 5.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.90 Safari/537.36',
    'Mozilla/5.0 (Windows NT 6.2; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.90 Safari/537.36',
    'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.157 Safari/537.36',
    'Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.113 Safari/537.36',
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36',
    'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36',
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.87 Safari/537.36',
    'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.87 Safari/537.36',
    # Internet Explorer
    'Mozilla/4.0 (compatible; MSIE 9.0; Windows NT 6.1)',
    'Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko',
    'Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0)',
    'Mozilla/5.0 (Windows NT 6.1; Trident/7.0; rv:11.0) like Gecko',
    'Mozilla/5.0 (Windows NT 6.2; WOW64; Trident/7.0; rv:11.0) like Gecko',
    'Mozilla/5.0 (Windows NT 10.0; WOW64; Trident/7.0; rv:11.0) like Gecko',
    'Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.0; Trident/5.0)',
    'Mozilla/5.0 (Windows NT 6.3; WOW64; Trident/7.0; rv:11.0) like Gecko',
    'Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)',
    'Mozilla/5.0 (Windows NT 6.1; Win64; x64; Trident/7.0; rv:11.0) like Gecko',
    'Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1; WOW64; Trident/6.0)',
    'Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1; Trident/6.0)',
    'Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)'
]


base_url = 'https://finance.naver.com/item/sise_day.nhn'

@ray.remote
def get_last_page_num(stock):
    tr = TorRequest(proxy_port=9050, ctrl_port=9051, password=None)
    headers = {'User-Agent': random.choice(browsers)}
    target_url = "https://finance.naver.com/item/sise_day.nhn?code=%s&page=1" % stock.code
    r = tr.get(target_url, headers=headers)
    page_re = re.compile(r'page=(\d+)')
    s = BeautifulSoup(r.text, 'lxml')
    rr = s.find('td', {"class": "pgRR"})
    rr_href = rr.a['href']
    m = page_re.search(rr_href)  # match the compiled pattern against the link, not the page body
    return int(m[1])


def parse_page(url):
    """Fetch one daily-price page and return its rows as CSV strings."""
    rows = list()
    headers = {'User-Agent': random.choice(browsers)}
    # Assumption: a fresh client per call, created the same way as in get_last_page_num.
    tor = TorRequest(proxy_port=9050, ctrl_port=9051, password=None)
    r = tor.get(url, headers=headers)
    s = BeautifulSoup(r.text, 'lxml')
    table = s.find('table', {"class": "type2"})
    for row in table.find_all('tr'):
        tds = row.find_all('td')
        if len(tds) == 7:
            if tds[0].find('span') is not None:
                date = tds[0].span.text.strip().replace('.', '-')
                close = int(tds[1].span.text.strip().replace(',',''))
                diff = tds[2].span.text.strip().replace(',','')
                if diff != '0':
                    diff_sign = '-' if 'down' in tds[2].img['src'] else '+'
                else:
                    diff_sign = ''
                diff = int(diff_sign+diff)
                # Named open_price so the built-in open() is not shadowed.
                open_price = int(tds[3].span.text.strip().replace(',',''))
                high = int(tds[4].span.text.strip().replace(',',''))
                low = int(tds[5].span.text.strip().replace(',',''))
                volume = int(tds[6].span.text.strip().replace(',',''))

                rows.append('%s,%d,%d,%d,%d,%d,%d' % (
                    date,
                    close,
                    diff,
                    open_price,
                    high,
                    low,
                    volume
                ))
    return rows

@ray.remote
def crawl_stock(stock):
    last_num = ray.get(get_last_page_num.remote(stock))
    print(last_num)
    rows = list()
    for page_num in range(1, last_num + 1):
        results = parse_page(
            '{url}?code={code}&page={page_num}'.format(url=base_url, code=stock.code, page_num=page_num)
        )
        rows.extend(results)
    
    with open('stock-data/%s.csv' % stock.code, 'w') as f:
        f.write('\n'.join(rows))
    
    return stock.code + ': done'



if __name__ == "__main__":
    ray.init()
    
    if len(sys.argv) == 2:
        target_stock_code = sys.argv[1]
        for stock in stocks:
            if stock.code == target_stock_code:
                print(stock)
                result = ray.get(crawl_stock.remote(stock))
                print(result)


    if len(sys.argv) == 1:
        start = time.time()
        result = ray.get([crawl_stock.remote(stock) for stock in stocks])
        print(result)
        end = time.time()
        print('crawl:', end - start)

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 5 (2 by maintainers)

Top GitHub Comments

pcmoritz commented, Jul 1, 2019 (2 reactions)

Hey, I suspect it will work if you do

import torrequest

and then in get_last_page_num, use torrequest.TorRequest, instead of importing TorRequest as in

from torrequest import TorRequest

This is a known limitation of cloudpickle, the library we use for serialization. It cannot handle from ... import Class if Class is not pickleable.

If this doesn’t work, can you share one entry of the stocks list so I can try to run the script?
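
One way to see the difference between the two import styles is to look at which global objects a function body actually refers to, since a serializer that ships functions by value has to deal with each of them. The sketch below does not run cloudpickle or Ray. It uses standard-library stand-ins (`queue` in place of `torrequest`, `OrderedDict` in place of `TorRequest`), and the helper `referenced_globals` is hypothetical.

```python
import queue                          # module import, like `import torrequest`
import types
from collections import OrderedDict   # class import, like `from torrequest import TorRequest`


def uses_module_attribute():
    # The only global this body touches is the module object `queue`.
    return queue.Queue()


def uses_imported_class():
    # This body touches the class object `OrderedDict` directly.
    return OrderedDict()


def referenced_globals(func):
    """Map each global name used in func's body to the object it is bound to."""
    return {
        name: func.__globals__[name]
        for name in func.__code__.co_names
        if name in func.__globals__
    }


if __name__ == "__main__":
    for func in (uses_module_attribute, uses_imported_class):
        kinds = {
            name: type(obj).__name__
            for name, obj in referenced_globals(func).items()
        }
        print(func.__name__, kinds)
```

With the module-style import the function refers only to a module, which a serializer can record by its import name. With the class-style import the function refers to the class object itself, which is the case the comment above identifies as problematic when the class is not picklable.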

VCBE123 commented, Nov 21, 2019 (0 reactions)

It seems to be because I use the SummaryWriter.
