
Potential memory retention in LexiconDecoder during streaming decoding #93

@lucgeo

Description


Bug Description

When using LexiconDecoder via the flashlight-text Python bindings for streaming speech decoding, I observe continuous memory growth during decoding, and, more importantly, the memory is not released after the decoding process ends.

Reproduction Steps

Running decoding on a 5-minute audio stream:
- Memory usage increases from ~3.06 GB (idle) to ~3.5 GB.
- After decoding ends, memory stays at ~3.5 GB.

Running a second, longer (10-minute) stream:
- Memory stays flat at 3.5 GB for the first half (likely due to reuse).
- Then grows again from 3.5 GB to ~4.0 GB during the second half.
- Memory does not drop between streams.

This suggests that memory allocated during decoding (likely to track hypotheses) is not being properly freed or reset.

Right now, re-instantiating LexiconDecoder from Python does not seem to fully release the underlying C++ memory.

Setup

Python version: 3.11.2
Flashlight-text binding: flashlight-text (PyPI), version 0.0.7; I have also tried version 0.0.8.dev312.

I'm using LexiconDecoder in an online decoding setup:

decodeBegin() -> decodeStep(...) -> decodeEnd() per stream
The decoder is reused across multiple streams; the LexiconDecoder object is manually deleted and recreated in Python.

Suspected cause

I inspected the C++ implementation (LexiconDecoder.cpp) and noticed that this member:
std::unordered_map<int, std::vector<LexiconDecoderState>> hyp_;
stores all intermediate hypotheses across frames. While hyp_ is cleared in decodeBegin(), its internal memory is never explicitly released: std::unordered_map does not free its bucket storage unless it is rehashed or destroyed.

Similarly, std::vector instances like candidates_ and candidatePtrs_ can grow over time without shrinking.
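This container behavior is easy to reproduce in isolation. The snippet below (standalone STL code, not flashlight code) shows that clear() on a std::vector leaves its capacity untouched, that shrink_to_fit() after clearing releases the buffer on common implementations (it is formally a non-binding request), and that swapping an unordered_map with a freshly constructed one is the guaranteed way to drop its nodes and bucket array:

```cpp
#include <cstddef>
#include <unordered_map>
#include <vector>

// Simulates hyp_/candidates_ growth: fill a vector, then clear() it.
// clear() destroys the elements but leaves capacity() untouched, so
// the heap block stays allocated.
std::size_t capacity_after_clear(std::size_t n) {
    std::vector<int> v(n);
    v.clear();               // size() == 0, capacity() still >= n
    return v.capacity();
}

// shrink_to_fit() after clear() asks the implementation to return the
// buffer; libstdc++, libc++, and MSVC all honor it for an empty vector.
std::size_t capacity_after_shrink(std::size_t n) {
    std::vector<int> v(n);
    v.clear();
    v.shrink_to_fit();
    return v.capacity();
}

// For unordered_map, swapping with a freshly constructed (empty) map
// releases both the nodes and the bucket array.
std::size_t buckets_after_swap(std::size_t n) {
    std::unordered_map<int, std::vector<int>> hyp;
    for (std::size_t i = 0; i < n; ++i) {
        hyp[static_cast<int>(i)].resize(8);
    }
    std::unordered_map<int, std::vector<int>>().swap(hyp);
    return hyp.bucket_count();  // back to a default-constructed map's count
}
```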

Suggestion

I would like a reset() method on the LexiconDecoder class, which calls clear() but also shrink_to_fit() on hyp_, candidates_, and candidatePtrs_.
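A minimal sketch of what such a reset() could look like. DecoderSketch is a stand-in struct with the members discussed above, and LexiconDecoderState is reduced to a placeholder so the sketch compiles on its own; the real method would live in LexiconDecoder.cpp:

```cpp
#include <unordered_map>
#include <vector>

// Placeholder for flashlight's LexiconDecoderState, present only so
// this sketch is self-contained.
struct LexiconDecoderState {
    double score = 0.0;
    int token = -1;
};

// Stand-in for LexiconDecoder with the members under discussion.
struct DecoderSketch {
    std::unordered_map<int, std::vector<LexiconDecoderState>> hyp_;
    std::vector<LexiconDecoderState> candidates_;
    std::vector<LexiconDecoderState*> candidatePtrs_;

    // Proposed reset(): swapping hyp_ with an empty map releases its
    // nodes and bucket array; clear() + shrink_to_fit() asks the
    // vectors to return their buffers as well.
    void reset() {
        std::unordered_map<int, std::vector<LexiconDecoderState>>().swap(hyp_);
        candidates_.clear();
        candidates_.shrink_to_fit();
        candidatePtrs_.clear();
        candidatePtrs_.shrink_to_fit();
    }
};
```

Calling such a reset() between streams (e.g. after decodeEnd()) would let a long-running service keep a single decoder instance without the per-stream allocations accumulating.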

Questions

Can you confirm if this behavior is expected?
Is there an existing pattern for proper cleanup when reusing the decoder for long-running services?

Thank you!
