@JeremyWesthead

Hi @bede

This adds server functionality, gated behind the server feature. Sorry, this PR is a little long, but the vast majority of the line changes rework existing logic rather than add new logic.
I've checked with the test suite and rsviruses17900.fastq.gz, and in my testing both give the same results as before.

Compiled without the server feature:

  • Works the same as 0.10.0
  • Harmonises how minimizers are computed for single and paired reads (opting for the most recent implementation)

Compiled with the server feature:

  • Swaps filter::run for a version based on 0.7.0, so deacon filter... runs a little slower
  • Enables deacon server some/index.bin, which preloads an index into memory for later use
  • Enables deacon client http://some.address ..., which functions equivalently to deacon filter some/index.bin ...
  • In my testing, both client and filter produce the same results as 0.10.0

In a bit more depth, compared to 0.10.0:

  • Splits out all filtering functions common to local and remote filtering:
    • JSON summary struct
    • Computing minimizers (both for paired and unpaired)
    • Whether a sequence (or pair) matches
  • This gives a slightly simpler implementation of FilterProcessor
  • Also allows local and remote filtering to share a single comparison logic
  • Adds remote_filter.rs which is based on the implementation from 0.7.0, but updated to use the newer logic
  • Makes filter.rs a simple file which imports run from either remote_filter.rs or local_filter.rs, depending on whether the server feature is enabled
  • Adds client and server commands to the CLI. client is essentially a mirror of filter, but swapping the index path for a server address
  • Adds corresponding tests for client, mirroring all tests for filter
  • A variety of code-quality fixes suggested by clippy lints; these should not affect functionality
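
For readers unfamiliar with feature-gated re-exports, the filter.rs arrangement described above could look roughly like the sketch below. This is a minimal illustration, not the code in this PR: the modules are inlined so the snippet compiles on its own, and the run signature and its string return value are hypothetical stand-ins.

```rust
// Hypothetical sketch of selecting `run` at compile time via a Cargo feature.
// In the real crate these would be separate files (local_filter.rs and
// remote_filter.rs); the bodies and signatures here are placeholders.

#[cfg(not(feature = "server"))]
mod local_filter {
    /// Stand-in for the in-process filter entry point.
    pub fn run(input: &str) -> String {
        format!("local:{input}")
    }
}

#[cfg(feature = "server")]
mod remote_filter {
    /// Stand-in for the 0.7.0-based entry point used with the server feature.
    pub fn run(input: &str) -> String {
        format!("remote:{input}")
    }
}

// Exactly one of these re-exports is compiled in, so callers can always
// refer to `run` without knowing which implementation was selected.
#[cfg(not(feature = "server"))]
pub use local_filter::run;
#[cfg(feature = "server")]
pub use remote_filter::run;
```

With this layout, building with `cargo build --features server` would select the remote-capable implementation, assuming the feature is declared as `server` in Cargo.toml.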

In terms of performance, larger input files will be processed more slowly (as sending data to the server has overheads), but for smaller input files there is a significant speed improvement. This is especially useful for removing human reads from ONT fastqs as they are basecalled.

On my laptop (Linux, x86), just loading the panhuman index takes ~6s, so processing a small fastq (4,900 reads) with deacon filter takes a total of <7s. I've seen a lot of variation in how long index loading takes, so there's a margin of ±1s on this figure.
Pre-loading an index with deacon server, then processing with the same settings, but with deacon client takes ~200ms.
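
To make the trade-off concrete, here is a rough back-of-the-envelope sketch using the figures quoted above. It assumes (this was not measured) that starting deacon server costs about the same ~6s as a single index load, and it rounds deacon filter up to 7s per small file; the function and constant names are illustrative only.

```rust
// Back-of-the-envelope estimate only; all constants are the approximate
// laptop figures quoted above, in milliseconds.

/// Approximate cost of one `deacon filter` run on a small fastq (index load included).
const FILTER_PER_FILE_MS: u64 = 7_000;
/// Assumed one-off cost of starting `deacon server` (taken to equal one index load).
const INDEX_LOAD_MS: u64 = 6_000;
/// Approximate cost of one `deacon client` run against a preloaded index.
const CLIENT_PER_FILE_MS: u64 = 200;

/// Total time when every file pays for its own index load.
fn filter_total_ms(files: u64) -> u64 {
    files * FILTER_PER_FILE_MS
}

/// Total time when the index is loaded once and reused for every file.
fn client_total_ms(files: u64) -> u64 {
    INDEX_LOAD_MS + files * CLIENT_PER_FILE_MS
}
```

Under these assumptions, ten small files would take roughly 70s with deacon filter and roughly 8s with deacon server plus deacon client.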

@bede
Owner

bede commented Sep 10, 2025

Thanks for your work and patience with this Jeremy! I'll be in touch once I've had a decent look.

@bede bede mentioned this pull request Sep 29, 2025
@bede bede force-pushed the main branch 2 times, most recently from 08ec7de to 97868a0 Compare November 20, 2025 16:56