@matyascimbulka

This PR is loosely related to #982. The main goal is to speed up creating ZIP archives for codebases larger than 2MB when using apify push.

The issue was calling the archive.glob() function inside the loop. Each call forces the library to walk the entire current working directory (and its children). The call is also redundant because getActorLocalFilePaths already uses globby to collect all of the valid file paths to be archived.
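A rough sketch of the idea, not the actual diff: once the file list is already resolved, each path can be handed to the archive directly instead of re-globbing the working directory on every iteration. `ArchiveLike` and `addFilesToArchive` are hypothetical names used only for illustration, and the `file(path, { name })` shape is assumed to mirror the archive object used in the CLI.

```typescript
// Hypothetical minimal slice of an archiver-style API; only `file` is needed here.
interface ArchiveLike {
  file(filepath: string, data: { name: string }): unknown;
}

// Adds every already-resolved relative path to the archive one by one,
// so the archive library never has to walk the working directory itself.
// Returns the number of entries queued.
function addFilesToArchive(
  archive: ArchiveLike,
  cwd: string,
  relativePaths: string[],
): number {
  for (const relativePath of relativePaths) {
    // Source path on disk is cwd + relative path; the entry name inside
    // the archive stays relative.
    archive.file(`${cwd}/${relativePath}`, { name: relativePath });
  }
  return relativePaths.length;
}
```

With this shape the cost grows with the number of files to archive rather than with the size of the whole directory tree (including ignored folders such as node_modules).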

The slowdown is easy to measure on a generic actor for eu-monitoring-tool, which has roughly 1200 source code files and a large node_modules directory.

Here is a comparison of the compression times:

  • without fix: 137s
  • with fix: 6s

@B4nan or @vladfrangu Could you please have a look?

@B4nan changed the title from "fix: Speed up creating archives when pusing big codebases" to "perf: Speed up creating archives when pusing big codebases" on Jan 29, 2026
Member

@B4nan left a comment

looks good to me, let's wait for Vlad before merging. i guess you tested this in the mentioned eu project, so you know it works. a test would be nice, but not crucial. we use the CLI beta for e2e tests in crawlee, so we should be warned in case things fall apart

@matyascimbulka
Author

@B4nan Thanks for the review. I agree that having a test would be better, but the EU project code lives in the topmonks organisation, so I'm not sure how to demonstrate it.

@vladfrangu
Member

well, if you tested it with your codebase (or can test it with the built binaries from CI) and can confirm it works the same way, we can merge it
