Bug Description
When using LexiconDecoder via the flashlight-text Python bindings for streaming speech decoding, I observe continuous memory growth during decoding and, more importantly, the memory is not released after decoding ends.
Reproduction Steps
Running decoding on a 5-minute audio stream:
- Memory usage increases from ~3.06 GB (idle) to ~3.5 GB.
- After decoding ends, memory stays at ~3.5 GB.

Running a second, longer (10-minute) stream:
- Memory stays flat at ~3.5 GB for the first half (likely due to reuse).
- Memory then grows again from ~3.5 GB to ~4.0 GB during the second half.
- Memory does not drop between streams.
This suggests that memory allocated during decoding (likely to track hypotheses) is not being properly freed or reset.
Right now, re-instantiating LexiconDecoder from Python does not seem to fully release the underlying C++ memory.
Setup
Python version: 3.11.2
flashlight-text binding (PyPI): version 0.0.7; I have also tried version 0.0.8.dev312.
I'm using LexiconDecoder in an online decoding setup:
decodeBegin() -> decodeStep(...) -> decodeEnd() per stream
The decoder is reused across multiple streams; the LexiconDecoder object is manually deleted and recreated in Python.
Suspected cause
I inspected the C++ implementation (LexiconDecoder.cpp) and noticed this member:
std::unordered_map<int, std::vector<LexiconDecoderState>> hyp_;
stores all intermediate hypotheses across frames. While hyp_ is cleared in decodeBegin(), clear() only destroys the elements: std::unordered_map will not release its bucket memory unless it is rehashed or destructed.
Similarly, std::vector instances like candidates_ and candidatePtrs_ can grow over time without shrinking.
Suggestion
I would like to have a kind of reset() method on the LexiconDecoder class, which calls clear() but also shrink_to_fit() for hyp_, candidates_ and candidatePtrs_.
Questions
Can you confirm if this behavior is expected?
Is there an existing pattern for proper cleanup when reusing the decoder for long-running services?
Thank you!