Performance Improvements

See original GitHub issue

I have been using Elodie to organize my photo library. A very impressive command line utility.

I did some profiling to see if I could improve the per-image processing time.

Import 1 file with master branch: 2.57s

python -m cProfile -o elodie.prof elodie.py import --destination="" <file>

“get_metadata” takes 74% of run time. Every image makes 8 calls to exiftool, and each call costs ~200-300ms.

Import 24 files with master branch: 73.7s

python -m cProfile -o elodie.prof elodie.py import --destination="<dst-dir>" --source="<src-dir>"

“get_metadata” takes 84% of run time. Again, every image makes 8 calls to exiftool, with each call taking ~200-300ms.
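For readers who want to inspect a profile like this without a viewer, here is a minimal sketch using the standard library's pstats module. The functions `get_metadata` and `import_file` are hypothetical stand-ins, not Elodie's real code, and the sleep only simulates a slow lookup. To read a real run, pass the path of the `elodie.prof` file produced by the command above to `pstats.Stats` instead.

```python
import cProfile
import os
import pstats
import tempfile
import time


def get_metadata():
    """Hypothetical stand-in for a slow metadata lookup."""
    time.sleep(0.01)


def import_file():
    """Hypothetical stand-in for importing one image (8 lookups)."""
    for _ in range(8):
        get_metadata()


# Profile the toy workload and write the stats to a temporary file.
prof_path = os.path.join(tempfile.mkdtemp(), "example.prof")
profiler = cProfile.Profile()
profiler.enable()
import_file()
profiler.disable()
profiler.dump_stats(prof_path)

# Load the stats file and list the most expensive calls by cumulative time.
stats = pstats.Stats(prof_path)
stats.sort_stats("cumulative").print_stats(5)

# Map function name -> number of calls, taken from the raw stats table.
calls = {func[2]: data[1] for func, data in stats.stats.items()}
print("get_metadata calls:", calls["get_metadata"])
```

Sorting by cumulative time puts functions such as “get_metadata” near the top when they dominate the run, and the call count shows how often they are invoked per import.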

After looking over the code, two things can be done to improve performance significantly:

  • Cache image metadata to reduce calls to pyexiftool
  • Initialize one instance of pyexiftool, which can create a single exiftool subprocess and speed up lookups by using the “-stay_open” parameter (right now every call to pyexiftool creates a new exiftool process)
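The two ideas above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the fork: the class names `PersistentExifTool` and `MetadataCache` are hypothetical, and the first class assumes an `exiftool` binary on the PATH. It relies on exiftool's documented `-stay_open True -@ -` mode, where arguments are read from stdin one per line, `-execute` runs them, and the output ends with a `{ready}` line.

```python
import json
import subprocess


class PersistentExifTool:
    """Hypothetical wrapper: one exiftool process reused for every lookup."""

    def __init__(self, executable="exiftool"):
        # "-stay_open True -@ -" keeps exiftool alive, reading args from stdin.
        self._proc = subprocess.Popen(
            [executable, "-stay_open", "True", "-@", "-"],
            stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True,
        )

    def metadata(self, path):
        # One argument per line; "-execute" runs the queued command.
        self._proc.stdin.write(f"-j\n{path}\n-execute\n")
        self._proc.stdin.flush()
        lines = []
        # exiftool prints "{ready}" once the command's output is complete.
        for line in iter(self._proc.stdout.readline, ""):
            if line.strip() == "{ready}":
                break
            lines.append(line)
        return json.loads("".join(lines))[0]

    def close(self):
        # Ask exiftool to leave stay_open mode, then wait for it to exit.
        self._proc.stdin.write("-stay_open\nFalse\n")
        self._proc.stdin.flush()
        self._proc.wait()


class MetadataCache:
    """Hypothetical cache: read each file's metadata at most once."""

    def __init__(self, reader):
        self._reader = reader
        self._cache = {}

    def get(self, path):
        if path not in self._cache:
            self._cache[path] = self._reader(path)
        return self._cache[path]


# Demonstration with a fake reader, so no exiftool install is needed.
calls = []


def fake_reader(path):
    calls.append(path)
    return {"SourceFile": path}


cache = MetadataCache(fake_reader)
for _ in range(8):
    cache.get("photo.jpg")
print(len(calls))  # 1: eight lookups, one underlying read
```

With exiftool installed, the two pieces would combine as `MetadataCache(PersistentExifTool().metadata)`, so each file costs one round trip to an already-running process rather than eight process launches.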

I’ve implemented these changes in a fork I created: https://github.com/amaleki/elodie/commit/afd576600d7a0ba864b6bb2f93241a00711adb7a#diff-5a2b1bdaf59cbc881a6ad218eaabdc42

Here are some results from my testing:

Importing 1 file with the above commits: 0.690s (73% improvement)

“get_metadata” now takes 32% of run time.

Importing 24 files with the above commits: 3.17s (95% improvement)

“get_metadata” now takes 37% of run time.

Results are from a Windows Docker Desktop container running Alpine Linux on a Xeon E2176M. Visualizations are from SnakeViz.

I would love to have the community review the code and see what kind of results they see. I’m hoping these aren’t “too good to be true”!

elodie-master-profile.zip elodie-amaleki-branch-commit-afd5766.zip

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Reactions: 3
  • Comments: 9 (8 by maintainers)

Top GitHub Comments

1 reaction
evanjt commented, Jan 12, 2020

Hey guys, I did a comparison using the latest commits on both forks: jmathai (75e6590) and amaleki (afd5766).

Setup

  • Removed ~/.elodie and used default settings, therefore:

    • No MapQuest API
    • Default folder settings
    • No hash tables for location or images
  • Used same source folder

    • 6,853,366,707 bytes
    • 1315 files
    • 153 subfolders
  • System hardware:

    • Intel i7-8850H (2.60 GHz, 4.30 GHz boost, 6 cores/12 threads)
    • 32 GB RAM
    • Samsung 970 1 TB SSD (images read and written on same drive, but separate from OS drive)
  • versions:

    • exiftool 11.70
    • python 3.8.1
    • linux kernel version 5.4.8-arch1-1

Results

amaleki

SUMMARY

Metric   Count
Success  1314
Error    0

time ./amaleki/elodie.py import /media/tb/elodietest/2019 --trash --destination "/media/tb/elodietest/done_amaleki/"
48.37s user 7.42s system 98% cpu 56.460 total

Files:

  • 6,852,898,461 bytes
  • 1314 files
  • 22 sub-folders

jmathai

SUMMARY

Metric   Count
Success  1314
Error    0

time ./elodie/elodie.py import /media/tb/elodietest/2019_2 --trash --destination "/media/tb/elodietest/done_jmathai/"
1132.93s user 112.78s system 98% cpu 21:02.52 total

Files:

  • 6,852,883,168 bytes
  • 1314 files
  • 22 sub-folders (due to default folder settings)

Comments

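  • The one file missing from the result (1315 source files vs. 1314 imported) is just a .directory file in the source directory
  • The smaller number of sub-folders is due to the default directory settings in the config file
  • 23.4x speedup in user CPU time (1132.93s vs. 48.37s); about 22.4x in wall-clock time (21:02.52 vs. 56.460s)
  • Total file size is 15,293 B (14.93 KB) smaller in jmathai than in amaleki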
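As a quick check of the figures in these comments, the arithmetic can be reproduced from the two `time` lines and the byte counts reported above. The helper `to_seconds` is a hypothetical name introduced only for this sketch; the numbers themselves are copied from the results.

```python
def to_seconds(stamp):
    """Convert 'SS.ss', 'MM:SS.ss' or 'HH:MM:SS.ss' into seconds."""
    total = 0.0
    for part in stamp.split(":"):
        total = total * 60 + float(part)
    return total


# Figures copied from the two `time` outputs in the results above.
runs = {
    "amaleki": {"user": 48.37, "total": to_seconds("56.460")},
    "jmathai": {"user": 1132.93, "total": to_seconds("21:02.52")},
}

user_speedup = runs["jmathai"]["user"] / runs["amaleki"]["user"]
wall_speedup = runs["jmathai"]["total"] / runs["amaleki"]["total"]

# Output folder sizes in bytes, also from the results above.
size_diff = 6_852_898_461 - 6_852_883_168

print(f"user CPU speedup:   {user_speedup:.1f}x")    # 23.4x
print(f"wall-clock speedup: {wall_speedup:.1f}x")    # 22.4x
print(f"size difference:    {size_diff} B ({size_diff / 1024:.2f} KB)")
```

The 23.4x figure matches the ratio of user CPU times; measured by elapsed wall-clock time the ratio is slightly lower.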
0 reactions
evanjt commented, Jan 17, 2020

Hi amaleki,

I don’t have the exact same folder for testing the images, but here’s another test:

Original folder: 8.0 GiB (8,546,552,827 bytes), 3,795 files, 277 sub-folders

last commit on Amaleki:fs-process-file-media-set-order (0c1142e)

Folder: 8.0 GiB (8,545,983,483 bytes), 3,795 files, 90 sub-folders

Success 3795 Error 0

83.34s user 14.09s system 97% cpu 1:39.97 total

Last commits to jmathai:master (d8cee15)

Folder: 8.0 GiB (8,545,991,675 bytes), 3,795 files, 90 sub-folders

Success 3795 Error 0

89.21s user 14.33s system 99% cpu 1:44.40 total
