This repository was archived by the owner on May 4, 2021. It is now read-only.

langstat2candidates.py requires large amounts of RAM #8

@achimr

langstat2candidates.py uses large amounts of RAM, particularly when run with the -candidates parameter (32-64 GB for large language pairs). This is because it reads the entire candidates file into memory as a dictionary, with the URLs as keys and the full candidates file lines as values. Retaining all this data seems unnecessary.
The high memory use reduces parallelizability and leads to crashes.
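
One possible direction, sketched under assumptions rather than taken from the script itself: if the later steps only need to know whether a URL already appears in the candidates file, a set of URLs could replace the URL-to-line dictionary, and the remaining input could be processed one line at a time. The function names, the tab-separated layout, and the URL being in the first field are all hypothetical here; the actual file format and the needs of langstat2candidates.py may differ.

```python
import io
from typing import Iterable, Iterator, Set


def load_candidate_urls(lines: Iterable[str], url_field: int = 0) -> Set[str]:
    """Keep only the URL of each candidates line (hypothetical layout:
    tab-separated, URL in column `url_field`); the rest is discarded."""
    urls: Set[str] = set()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) > url_field and fields[url_field]:
            urls.add(fields[url_field])
    return urls


def filter_stream(lines: Iterable[str], wanted: Set[str],
                  url_field: int = 0) -> Iterator[str]:
    """Yield, one at a time, the lines whose URL is in `wanted`.
    Nothing is accumulated, so memory does not grow with input size."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) > url_field and fields[url_field] in wanted:
            yield line


if __name__ == "__main__":
    # Made-up sample data standing in for a candidates file.
    candidates = io.StringIO(
        "http://a.example/1\tfoo\tbar\n"
        "http://b.example/2\tbaz\n"
    )
    known = load_candidate_urls(candidates)
    incoming = ["http://a.example/1\tx\n", "http://c.example/3\ty\n"]
    for hit in filter_stream(incoming, known):
        print(hit, end="")
```

With this shape, memory scales with the number of distinct URLs rather than with the total size of the candidates file. If the full line is needed later, it would have to be re-read from disk or looked up in an on-disk store instead.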
