
Potential memory retention in LexiconDecoder during streaming decoding #93

@lucgeo

Description


Bug Description

When using LexiconDecoder via the flashlight-text Python bindings for streaming speech decoding, I observe continuous memory growth during decoding, and, more importantly, the memory is not released after the decoding process ends.

Reproduction Steps

Running decoding on a 5-minute audio stream:
- Memory usage increases from ~3.06 GB (idle) to ~3.5 GB.
- After decoding ends, memory stays at ~3.5 GB.

Running a second, longer (10-minute) stream:
- Memory stays flat at 3.5 GB for the first half (likely due to reuse).
- Then grows again from 3.5 GB to ~4.0 GB during the second half.
- Memory does not drop between streams.

This suggests that memory allocated during decoding (likely to track hypotheses) is not being properly freed or reset.

Right now, re-instantiating LexiconDecoder from Python does not seem to fully release the underlying C++ memory.

Setup

Python version: 3.11.2
Flashlight-text binding: flashlight-text (PyPI), version 0.0.7; I have also tried version 0.0.8.dev312.

I'm using LexiconDecoder in an online decoding setup:

decodeBegin() -> decodeStep(...) -> decodeEnd() per stream
The decoder is reused across multiple streams; the LexiconDecoder object is manually deleted and recreated in Python.

Suspected cause

I inspected the C++ implementation (LexiconDecoder.cpp) and noticed that this member:
std::unordered_map<int, std::vector<LexiconDecoderState>> hyp_;
stores all intermediate hypotheses across frames. While hyp_ is cleared in decodeBegin(), its internal memory is never explicitly released: std::unordered_map does not free its bucket storage unless it is rehashed or destroyed.

Similarly, std::vector instances like candidates_ and candidatePtrs_ can grow over time without shrinking.
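This container behavior is easy to reproduce in isolation. The snippet below (standalone STL code, not flashlight code) shows that clear() on a std::vector leaves its capacity untouched, that shrink_to_fit() after clearing releases the buffer on common implementations (it is formally a non-binding request), and that swapping an unordered_map with a freshly constructed one is the guaranteed way to drop its nodes and bucket array:

```cpp
#include <cstddef>
#include <unordered_map>
#include <vector>

// Simulates hyp_/candidates_ growth: fill a vector, then clear() it.
// clear() destroys the elements but leaves capacity() untouched, so
// the heap block stays allocated.
std::size_t capacity_after_clear(std::size_t n) {
    std::vector<int> v(n);
    v.clear();               // size() == 0, capacity() still >= n
    return v.capacity();
}

// shrink_to_fit() after clear() asks the implementation to return the
// buffer; libstdc++, libc++, and MSVC all honor it for an empty vector.
std::size_t capacity_after_shrink(std::size_t n) {
    std::vector<int> v(n);
    v.clear();
    v.shrink_to_fit();
    return v.capacity();
}

// For unordered_map, swapping with a freshly constructed (empty) map
// releases both the nodes and the bucket array.
std::size_t buckets_after_swap(std::size_t n) {
    std::unordered_map<int, std::vector<int>> hyp;
    for (std::size_t i = 0; i < n; ++i) {
        hyp[static_cast<int>(i)].resize(8);
    }
    std::unordered_map<int, std::vector<int>>().swap(hyp);
    return hyp.bucket_count();  // back to a default-constructed map's count
}
```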

Suggestion

I would like a reset() method on the LexiconDecoder class, which calls clear() but also shrink_to_fit() on hyp_, candidates_, and candidatePtrs_.
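A minimal sketch of what such a reset() could look like. DecoderSketch is a stand-in struct with the members discussed above, and LexiconDecoderState is reduced to a placeholder so the sketch compiles on its own; the real method would live in LexiconDecoder.cpp:

```cpp
#include <unordered_map>
#include <vector>

// Placeholder for flashlight's LexiconDecoderState, present only so
// this sketch is self-contained.
struct LexiconDecoderState {
    double score = 0.0;
    int token = -1;
};

// Stand-in for LexiconDecoder with the members under discussion.
struct DecoderSketch {
    std::unordered_map<int, std::vector<LexiconDecoderState>> hyp_;
    std::vector<LexiconDecoderState> candidates_;
    std::vector<LexiconDecoderState*> candidatePtrs_;

    // Proposed reset(): swapping hyp_ with an empty map releases its
    // nodes and bucket array; clear() + shrink_to_fit() asks the
    // vectors to return their buffers as well.
    void reset() {
        std::unordered_map<int, std::vector<LexiconDecoderState>>().swap(hyp_);
        candidates_.clear();
        candidates_.shrink_to_fit();
        candidatePtrs_.clear();
        candidatePtrs_.shrink_to_fit();
    }
};
```

Calling such a reset() between streams (e.g. after decodeEnd()) would let a long-running service keep a single decoder instance without the per-stream allocations accumulating.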

Questions

Can you confirm if this behavior is expected?
Is there an existing pattern for proper cleanup when reusing the decoder for long-running services?

Thank you!
