question-mark
Stuck on an issue?

Lightrun Answers was designed to reduce the constant googling that comes with debugging 3rd party libraries. It collects links to all the places you might be looking at while hunting down a tough bug.

And, if you’re still stuck at the end, we’re happy to hop on a call to see how we can help out.

Very slow terraform scan on medium sized repository

See original GitHub issue

Describe the bug

I’ve just tried checkov on a medium sized repository that contains ~70 modules and around 20 Terraform workspaces. Running checkov against the entire repository takes about 1 hour and 50 minutes on current gen desktop CPU. In contrast something like tfsec only takes in the order of ~20s for the same code. My understanding is that both tools are primarily busy evaluating the logic of the Terraform code (substituting variables, instantiating modules, …) and applying each rule set is the cheaper part of the execution. Correct me if I am wrong.

cloc says the following about my Terraform folder:

$ cloc terraform/

github.com/AlDanial/cloc v 1.88  T=2.25 s (416.3 files/s, 35823.6 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
HCL                            709          12897           5047          50407
…
-------------------------------------------------------------------------------
SUM:                           935          14249           5268          60935
-------------------------------------------------------------------------------

Running checkov against individual modules takes somewhere between a ~5 to 30s which is totally fine when looking at their complexity.

Having done some light profiling of the application it appears as if the (costly?) parsing of each HCL block in the Terraform files is done over and over again for each instance of a module.

I slapped some @lru_cache(max_size=1024) annotation on to the parse_var_blocks function which resulted in a 13% reduction of execution time. Increasing the cache size to an unbounded limit had more influence on the performance but not to the point where the runtime is anywhere near what I would feel comfortable with in CI.

Anything going from here seems to require more knowledge about the code base.

commit a4606b31d29a1f5d4f392032ebc6e75f0d7ae0a2
Author: Andreas Rammhold <andreas@rammhold.de>
Date:   Fri Mar 19 17:09:20 2021 +0100

    Cache calls to find_var_blocks
    
    While using checkov on a larger terraform repository the runtime on all
    the files was about 1 hour and 50 minutes. As that large repository is
    not a good test case I trimmed it down to a smaller part of the repo
    that only took about 12.736s ± 0.153s seconds before this change and with this
    change is at 11.196s ± 0.195s which is a nice ~13% speedup in runtime.

diff --git a/checkov/terraform/parser_utils.py b/checkov/terraform/parser_utils.py
index b672c24e..50be5e27 100644
--- a/checkov/terraform/parser_utils.py
+++ b/checkov/terraform/parser_utils.py
@@ -4,6 +4,8 @@ from dataclasses import dataclass
 from enum import Enum, IntEnum
 from typing import Any, Dict, List, Optional
 
+from functools import lru_cache
+
 import hcl2
 
 
@@ -48,6 +50,7 @@ class ParserMode(Enum):
         return str(self.value)
 
 
+@lru_cache(maxsize=1024)
 def find_var_blocks(value: str) -> List[VarBlockMatch]:
     """
     Find and return all the var blocks within a given string. Order is important and may contain portions of

Profile results after applying the above caching patch:

Profile results before applying the above caching patch:

Looking at the output of strace -e openat bin/checkov .... I can see that checkov is reading some files as often as four times within one run.

To Reproduce Steps to reproduce the behavior:

  1. Checkout a larger terraform repository
  2. run checkov against the directory
  3. wait

Expected behavior I would expect a runtime in the order of seconds or minutes but not tens of minutes or almost two hours. .

Desktop (please complete the following information):

  • OS: Linux, using checkov docker container and a local python interpreter on NixOS during profiling/testing.
  • Checkov Version: 36d065755da53b59ab8371bbe349c9cb3ce526ff

Issue Analytics

  • State:closed
  • Created 2 years ago
  • Comments:19 (10 by maintainers)

github_iconTop GitHub Comments

1reaction
schosterbarakcommented, Mar 30, 2021

@andir @robeden please wait with new commits on this. @nimrodkor’s team is leading a big performance change. The PR should be opened soon. I hope we will see different results after those changes. We did see a very big change on our own repos.

0reactions
schosterbarakcommented, Apr 20, 2021

great @andir - i’m closing the issue for now. If you’d like to make further improvements, we would accept a PR.

Read more comments on GitHub >

github_iconTop Results From Across the Web

Very slow terraform scan on medium sized repository - GitLab
I've just tried checkov on a medium sized repository that contains ~70 modules and around 20 Terraform workspaces. Running checkov against the ...
Read more >
Terraform init is extremely slow · Issue #27379 - GitHub
Terraform init takes forever and it seems like it is due to provider download. This extremely slow behaviour is consistent for the last...
Read more >
Scanning with KICS for issues in Terraform
KICS can be used as a standalone scanner as it is written in Go and with ... For this post, we have a...
Read more >
250 Practice Questions For Terraform Associate Certification
The Terraform Associate certification is for Cloud Engineers specializing in operations ... For larger infrastructures, querying every resource is too slow.
Read more >
Documentation | - Terrascan
Terrascan is a static code analyzer for Infrastructure as Code. It can be installed and run in a number of different ways, and...
Read more >

github_iconTop Related Medium Post

No results found

github_iconTop Related StackOverflow Question

No results found

github_iconTroubleshoot Live Code

Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start Free

github_iconTop Related Reddit Thread

No results found

github_iconTop Related Hackernoon Post

No results found

github_iconTop Related Tweet

No results found

github_iconTop Related Dev.to Post

No results found

github_iconTop Related Hashnode Post

No results found