Manylinux wheel size could be reduced by 66%

When numpy is built from source and its shared objects are run through the strip command, the resulting install directory is about 66% smaller than the one produced by the manylinux wheel (installed via pip).

[…] the strip program removes inessential information from executable binary programs and object files, thus potentially resulting in better performance and sometimes significantly less disk space usage https://en.wikipedia.org/wiki/Strip_(Unix)

The extra size is probably harmless on most systems, but it matters a lot in size-constrained environments (such as AWS Lambda).

Some developers have resorted to distributing their own stripped binaries (e.g. lambda packages for the serverless framework zappa), but it seems like a makeshift solution.

I think the problem should be solved upstream, as each library should be responsible for packaging its own optimized binaries.

System specifications

Every command below has been executed using the amazonlinux docker image. https://hub.docker.com/_/amazonlinux/

Numpy version: 1.14.2
Python version: 3.6.2

How to replicate

Prepare docker image

docker run -it amazonlinux bash
yum update -y
yum install -y findutils binutils python36-devel gcc

Install wheel & measure package size

python3 -m pip install -t wheel numpy==1.14.2
du -sh wheel

–> 57 MB

Try strip:

find wheel/ -name "*.so"|xargs strip
du -sh wheel

–> 56 MB

No real progress here. The binary wheel seems already stripped.
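
Since strip barely changes the wheel install, one way to see where the size actually sits is to list the largest files in the directory. The snippet below is only a sketch: largest_files is a hypothetical helper name, and it assumes a du that accepts -k and a find that supports -exec ... {} +, as the findutils package installed above does.

```shell
# largest_files DIR [N]: list the N largest regular files under DIR.
# Hypothetical helper for illustration; sizes are reported in KB by du -k.
largest_files() {
    dir="$1"
    count="${2:-10}"
    # du -k prints "<size-in-KB><TAB><path>"; sort numerically, largest first.
    find "$dir" -type f -exec du -k {} + | sort -rn | head -n "$count"
}

# Inspect the directory created by the pip install -t wheel step, if present.
if [ -d wheel ]; then
    largest_files wheel 10
fi
```

Running the same helper on the build directory afterwards gives a file-by-file comparison between the two installs.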

Build from source & measure package size

python3 -m pip install -t build --no-binary numpy numpy==1.14.2
du -sh build

–> 42 MB

Try strip:

find build/ -name "*.so"|xargs strip
du -sh build

–> 19 MB
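
For reference, the percentages behind these measurements can be recomputed from the du figures above. This is a small sketch; pct_reduction is a hypothetical helper, and it assumes awk is available in the container.

```shell
# pct_reduction BEFORE AFTER: print the size reduction as a percentage.
# Hypothetical helper for illustration; inputs are the du -sh figures in MB.
pct_reduction() {
    awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.1f\n", (before - after) / before * 100 }'
}

pct_reduction 57 19   # manylinux wheel vs. stripped source build -> 66.7
pct_reduction 42 19   # source build, before vs. after strip      -> 54.8
pct_reduction 57 56   # manylinux wheel, before vs. after strip   -> 1.8
```

The first figure is the 66% quoted in the title: it compares the wheel install (57 MB) with the stripped source build (19 MB).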

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Reactions: 3
  • Comments: 14 (8 by maintainers)

Top GitHub Comments

njsmith commented, Apr 17, 2018 (3 reactions)

If you do that please call it numpy-slow, so people know what they’re getting into 😃

LouisAmon commented, Apr 17, 2018 (2 reactions)

Hello @pv, thanks for looking into this.

As per the documentation:

NumPy does not require any external linear algebra libraries to be installed. However, if these are available, NumPy’s setup script can detect them and use them for building.

The problem here is that numpy is often installed alongside other Python packages from the scientific community (e.g. pandas, scipy), and together they add up to an impressive size.

Now I know each library should address its own build problem separately, but since numpy is usually the main building block… maybe including openblas isn’t what everyone needs.

Isn’t there a way to offer an alternate “lightweight” installation? Maybe something along the lines of: pip install numpy-light
