New Plugin: Save skipped imports #6140

EmberLightVFX · 2025-10-31T11:07:55Z

Description

This plugin will save the name of the skipped song/album to a text file during import for later review.

It will also allow you to try to find the Spotify link for the skipped songs if the Spotify plugin is installed and configured.
This information can later be used together with other MB importers like Harmony.

If any song has already been written to the file, it will not be written again.

To Do

Documentation. (If you've added a new command-line flag, for example, find the appropriate page under docs/ to describe it.)
Changelog. (Add an entry to docs/changelog.rst to the bottom of one of the lists near the top of the document.)
Tests. (Very much encouraged but not strictly required.)

Extra Info

I haven't written any tests for this plugin as I'm not sure how to write tests for this kind of thing. If anyone knows how to write some good tests for this plugin I'm more than happy to add them to this PR!

I personally wrote this plugin to later use with Harmony as I currently manually save the skipped song-info from my library of not found songs, search spotify for them and add them through Harmony. Currently I have over 100 skipped songs and I'm only on the letter D in my library so this plugin will save me A LOT of time getting the information to later add.

sourcery-ai

Hey there - I've reviewed your changes - here's some feedback:

Consider caching the set of already-recorded songs in memory to avoid reading the file on every skip and improve performance.
Before writing to the configured path, ensure that its parent directory exists or create it to avoid write errors in nested directories.
You can simplify Spotify plugin retrieval by using the official plugin registry (e.g. plugins.get('spotify')) rather than iterating over find_plugins and dynamic imports.

Prompt for AI Agents

Please address the comments from this code review:

## Overall Comments
- Consider caching the set of already-recorded songs in memory to avoid reading the file on every skip and improve performance.
- Before writing to the configured path, ensure that its parent directory exists or create it to avoid write errors in nested directories.
- You can simplify Spotify plugin retrieval by using the official plugin registry (e.g. plugins.get('spotify')) rather than iterating over find_plugins and dynamic imports.

## Individual Comments

### Comment 1
<location> `beetsplug/saveskippedsongs.py:63-64` </location>
<code_context>
+        if task.choice_flag == Action.SKIP:
+            # If spotify integration is enabled, try to match with Spotify
+            link = None
+            if self.config["spotify"].get(bool):
+                link = self._match_with_spotify(task, session)
+
</code_context>

<issue_to_address>
**suggestion (bug_risk):** Boolean config retrieval may be unreliable for non-bool values.

Use `.as_bool()` for config values or ensure they are always boolean to avoid unexpected behavior.

```suggestion
            if self.config["spotify"].as_bool():
                link = self._match_with_spotify(task, session)
```
</issue_to_address>

### Comment 2
<location> `beetsplug/saveskippedsongs.py:68-70` </location>
<code_context>
+
+            result = f"{summary(task)}{' (' + link + ')' if link else ''}"
+            self._log.info(f"Skipped: {result}")
+            path = self.config["path"].get(str)
+            if path:
+                path = os.path.abspath(path)
</code_context>

<issue_to_address>
**suggestion:** Path config retrieval may not handle user home or environment variables.

Paths containing '~' or environment variables are not expanded. Use `os.path.expanduser` and `os.path.expandvars` to support these cases.

```suggestion
            path = self.config["path"].get(str)
            if path:
                # Expand user home (~) and environment variables in the path
                path = os.path.expanduser(os.path.expandvars(path))
                path = os.path.abspath(path)
```
</issue_to_address>

### Comment 3
<location> `beetsplug/saveskippedsongs.py:73-81` </location>
<code_context>
+                    except FileNotFoundError:
+                        existing = set()
+
+                    if result not in existing:
+                        with open(path, "a", encoding="utf-8") as f:
+                            f.write(f"{result}\n")
</code_context>

<issue_to_address>
**suggestion:** Duplicate detection is case-sensitive and ignores whitespace variations.

Normalizing case and whitespace before checking for duplicates will help catch near-duplicates that differ only in formatting.

```suggestion
                    try:
                        with open(path, "r", encoding="utf-8") as f:
                            existing = {
                                line.rstrip("\n").strip().lower()
                                for line in f
                            }
                    except FileNotFoundError:
                        existing = set()

                    normalized_result = result.strip().lower()
                    if normalized_result not in existing:
                        with open(path, "a", encoding="utf-8") as f:
                            f.write(f"{result}\n")
```
</issue_to_address>

### Comment 4
<location> `beetsplug/saveskippedsongs.py:140` </location>
<code_context>
+            )
+
+            # Call the Spotify API directly via the plugin's search method
+            results = spotify_plugin._search_api(  # type: ignore[attr-defined]
+                query_type=search_type,  # type: ignore[arg-type]
+                query_string=query_string,
</code_context>

<issue_to_address>
**issue:** Direct use of a private method may break with future plugin updates.

Consider using a public method instead, or clearly document the reliance on this internal API.
</issue_to_address>

### Comment 5
<location> `beetsplug/saveskippedsongs.py:42-44` </location>
<code_context>
def summary(task: "ImportTask"):
    """Given an ImportTask, produce a short string identifying the
    object.
    """
    if task.is_album:
        return f"{task.cur_artist} - {task.cur_album}"
    else:
        item = task.item  # type: ignore[attr-defined]
        return f"{item.artist} - {item.title}"

</code_context>

<issue_to_address>
**suggestion (code-quality):** Remove unnecessary else after guard condition ([`remove-unnecessary-else`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/remove-unnecessary-else/))

```suggestion
    item = task.item  # type: ignore[attr-defined]
    return f"{item.artist} - {item.title}"
```
</issue_to_address>

### Comment 6
<location> `beetsplug/saveskippedsongs.py:59` </location>
<code_context>
    def log_skipped_song(self, task: "ImportTask", session: "ImportSession"):
        if task.choice_flag == Action.SKIP:
            # If spotify integration is enabled, try to match with Spotify
            link = None
            if self.config["spotify"].get(bool):
                link = self._match_with_spotify(task, session)

            result = f"{summary(task)}{' (' + link + ')' if link else ''}"
            self._log.info(f"Skipped: {result}")
            path = self.config["path"].get(str)
            if path:
                path = os.path.abspath(path)
                try:
                    # Read existing lines (if file exists) and avoid duplicates.
                    try:
                        with open(path, "r", encoding="utf-8") as f:
                            existing = {line.rstrip("\n") for line in f}
                    except FileNotFoundError:
                        existing = set()

                    if result not in existing:
                        with open(path, "a", encoding="utf-8") as f:
                            f.write(f"{result}\n")
                    else:
                        self._log.debug(f"Song already recorded in {path}")
                except OSError as exc:
                    # Don't crash import; just log the I/O problem.
                    self._log.debug(
                        f"Could not write skipped song to {path}: {exc}"
                    )

</code_context>

<issue_to_address>
**issue (code-quality):** We've found these issues:

- Add guard clause ([`last-if-guard`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/last-if-guard/))
- Use named expression to simplify assignment and conditional ([`use-named-expression`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/use-named-expression/))
- Use f-string instead of string concatenation [×2] ([`use-fstring-for-concatenation`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/use-fstring-for-concatenation/))
</issue_to_address>

### Comment 7
<location> `beetsplug/saveskippedsongs.py:97` </location>
<code_context>
    def _match_with_spotify(
        self, task: "ImportTask", session: "ImportSession"
    ) -> Optional[str]:
        """Try to match the skipped track/album with Spotify by directly
        calling the Spotify API search.
        """
        try:
            # Try to get the spotify plugin if it's already loaded
            spotify_plugin = None
            for plugin in plugins.find_plugins():
                if plugin.name == "spotify":
                    spotify_plugin = plugin
                    break

            # If not loaded, try to load it dynamically
            if not spotify_plugin:
                try:
                    from beetsplug.spotify import SpotifyPlugin

                    spotify_plugin = SpotifyPlugin()
                    self._log.debug("Loaded Spotify plugin dynamically")
                except ImportError as e:
                    self._log.debug(f"Could not import Spotify plugin: {e}")
                    return
                except Exception as e:
                    self._log.debug(f"Could not initialize Spotify plugin: {e}")
                    return

            # Build search parameters based on the task
            query_filters: SearchFilter = {}
            if task.is_album:
                query_string = task.cur_album or ""
                if task.cur_artist:
                    query_filters["artist"] = task.cur_artist
                search_type = "album"
            else:
                # For singleton imports
                item = task.item  # type: ignore[attr-defined]
                query_string = item.title or ""
                if item.artist:
                    query_filters["artist"] = item.artist
                if item.album:
                    query_filters["album"] = item.album
                search_type = "track"

            self._log.info(
                f"Searching Spotify for: {query_string} ({query_filters})"
            )

            # Call the Spotify API directly via the plugin's search method
            results = spotify_plugin._search_api(  # type: ignore[attr-defined]
                query_type=search_type,  # type: ignore[arg-type]
                query_string=query_string,
                filters=query_filters,
            )

            if results:
                self._log.info(f"Found {len(results)} Spotify match(es)!")
                self._log.info("Returning first Spotify match link")
                return results[0].get("external_urls", {}).get("spotify", "")
            else:
                self._log.info("No Spotify matches found")

        except AttributeError as e:
            self._log.debug(f"Spotify plugin method not available: {e}")
        except Exception as e:
            self._log.debug(f"Error searching Spotify: {e}")
        return

</code_context>

<issue_to_address>
**issue (code-quality):** We've found these issues:

- Use named expression to simplify assignment and conditional ([`use-named-expression`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/use-named-expression/))
- Use the built-in function `next` instead of a for-loop ([`use-next`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/use-next/))
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨

_{Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.}

beetsplug/saveskippedsongs.py

Copilot

Pull Request Overview

This PR adds a new saveskippedsongs plugin that logs skipped songs/albums during import to a text file for later review. The plugin optionally attempts to find Spotify links for skipped items using the Spotify plugin if available and configured.

Implements SaveSkippedSongsPlugin with configuration options for Spotify integration and output file path
Integrates with the import workflow by listening to the import_task_choice event and logging when items are skipped
Adds documentation for the new plugin

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.

File	Description
beetsplug/saveskippedsongs.py	New plugin implementation that saves skipped songs to a text file and optionally searches Spotify for links
docs/plugins/saveskippedsongs.rst	Documentation for the new plugin including configuration options
docs/changelog.rst	Changelog entry for the new plugin

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

beetsplug/saveskippedsongs.py

Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com>

… both track and album

henry-oberholtzer

Hi! This looks great so far! It is probably is better to hook into the existing database querying mechanism's results (as I noted in the review) rather than having to re-invent the Spotify API wheel. It'll also make it really easy to increase how useful this is for users who may not user Spotify.

henry-oberholtzer · 2025-11-01T00:52:02Z

beetsplug/saveskippedsongs.py

+            spotify_plugin = None
+            for plugin in plugins.find_plugins():
+                if plugin.name == "spotify":
+                    spotify_plugin = plugin
+                    break


I think we can assume that if a user doesn't have spotify loaded as a plugin, they may not be interested in the spotify feature.

Could also avoid having to make as second API call as well, since by the time the user application or may skip a song, it will have probably already attempted to grab candidates from the import task. It should be available under ImportTask.candidates here, and then it'd just be a member of the AlbumInfo.info or TrackMatch.info object - which should come with the distance already calculated nicely too. Could let the user just filter what database source URLs they wanted printed with it in a config option.

I probably just don't really understand your explenation so if I'm wrong please correct me.

For the addition of the spotify links to be added you would need the spotify plugin to be configured but disabled in your config. If it's active beets will mostly just pick the spotify match as the best match and move on (This is some info I should add to the documentation now when thinking about it).

If the spotify plugin is disabled we would need to do the API call when the user presses skip to see if there is any Spotify matches (without picking it as the beets match)

henry-oberholtzer · 2025-11-01T00:55:20Z

beetsplug/saveskippedsongs.py

+__version__ = "1.0"
+
+
+def summary(task: "SingletonImportTask"):


I think the more general ImportTask is a better suited typehint than the than the derived SingletonImportTask - I don't think there's any functionality in this plugin that appears to be Singleton only?

It would be if you skipp a singleton song?

henry-oberholtzer · 2025-11-01T01:03:37Z

docs/plugins/saveskippedsongs.rst

+- **path**: Path to the file to write the skipped songs to. Default:
+  ``skipped_songs.txt``.


Might be good to specify that it stores in the user's home directory.

For me it will save the file in the dir that I run the beets command from. Should I maybe change the default path to ~/skipped_songs.txt to make it a more specified path?

JOJ0 · 2025-11-09T11:20:22Z

Hi @EmberLightVFX, thank you for the submission! Interesting idea for a plugin. On a very high level and only having skimmed through this PR's conversation and code quickly, I would like to mention that I see quite similar use-cases for this plugin and the built-in log-file feature of Beets. Before moving on with this plugin, I would like to invite you to compare what benefits this plugin offers over the built-in import log (except the Spotify feature)

So first of all, in case one of you (including @henry-oberholtzer here), hasn't used it yet. This is the crash course:

Configure in config.yaml to write skipped imports to a predefined log file (https://beets.readthedocs.io/en/stable/reference/config.html#log)
Or use beet import --log= .... for specific import runs
Run beet import --quiet
Complete interactive-required imports later with beet import --from-logfile ....

Quick and dirty here, I can certainly go into more details if you want. I'm also thinking that this feature might not be too obvious to find out how to use properly for new Beets users since it's rather poorly documented and writing a proper tutorial might also be worth putting effort into.

JOJ0 · 2025-12-06T06:45:59Z

@EmberLightVFX @henry-oberholtzer any thoughts? Did you find time to think about my comparision with the internal log-file feature? I'm setting this PR to draft for now. Should we close it already and maybe only the spotify-suggest will be a future feature? How should we proceed?

henry-oberholtzer · 2025-12-06T22:29:16Z

Import from log file seems to do a lot of what this plug aims to do. It could certainly do with more highlighting in the documentation! This plugin seems to differ from --from-logfile in that it saves candidates, so maybe that'd be a different plugin direction.

JOJ0 · 2025-12-07T07:03:35Z

Import from log file seems to do a lot of what this plug aims to do. It could certainly do with more highlighting in the documentation! This plugin seems to differ from --from-logfile in that it saves candidates, so maybe that'd be a different plugin direction.

I agree that --from-logfile is kind of badly documented. When I first replied to this thread I wanted to simplay send you docs but found they are lacking detail on how to really use it, so I started to describe it quickly myself.

Is this plugin in its current state saving candidates to the text file already? From what I read here it only saves "artist - title" (which in my opinion is not very useful for a reimport later on. Is the purpose of this plugin even to move on with importing "difficult" to find tracks/albums or did I misunderstand it in the first place? Please rephrase again (sorry I might be slow here) @EmberLightVFX what you tried to do in the docs already (Harmony MusicBrainz importer for example)

https://github.com/beetbox/beets/pull/6140/files#diff-748c48fa17458aae81d7c28f64bf287aaf5864c919b16cbb8b62333daab78acbR42

https://github.com/beetbox/beets/pull/6140/files#diff-bbab3c937e587079482d4ff607fbd08ffb8886972381c44df243746a71d8e1bdR4-R5

I don't want to be at all opposed on creating a new and fancy plugin that helps people better import stuff into their Beets library but I'm not yet sure what we are really aiming for.

This plugin does a lot of file reading/writing, must handle errors, and even calls spotify and records things. There is a lot that can go wrong. The spotify search is only artist/title, that will bring a lot of false positives. Well, this could be a little improved if it would search via ISRC and then only fall back to artist/title/album. But even then I'm questioning: When a file to import has been skipped in the importer, most probably because it was hard to find from all the sources (mb,discogs,spotify,deezer,....) why will it be helpful to search for it on spotify again? If we didn't find it already, we won't find it then either, right?

Ooor asking again: Is the purpose not to find out what is difficult to import / no matches found / save tedious and manualy importing work in a logfile to work through in batch? Is there a different use-case?

So to wrap this up:

What do both of you think about taking the time to thoroughly document the from-logifle feature first, then we would see whether it can be improved (I'm sure there is much room for improvement) and were a plugin like we have here add even more value to the existing feature (or do something else entirely)
Also: Please provide some examples of how that textfile can look like, I'm having a hard time figuring out how it looks from the mentioned code snippets above, I have an idea but might not see all the details. That would help for a better high level review or maybe a concept refinement going forward.

EmberLightVFX added 3 commits October 31, 2025 12:01

Add plugin

daacd5a

Add docs

9acc2ba

Update changelog

7432fab

Copilot AI review requested due to automatic review settings October 31, 2025 11:07

EmberLightVFX requested a review from a team as a code owner October 31, 2025 11:07

sourcery-ai bot reviewed Oct 31, 2025

View reviewed changes

Copilot AI reviewed Oct 31, 2025

View reviewed changes

EmberLightVFX and others added 7 commits October 31, 2025 12:11

Update beetsplug/saveskippedsongs.py

b2ffe3c

Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com>

Update beetsplug/saveskippedsongs.py

5879c3d

Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com>

Update beetsplug/saveskippedsongs.py

7cc7bb2

Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com>

Use SingletonImportTask instead of ImportTask as it contains info for…

a6e87c5

… both track and album

Clean up code

bdba5ff

Return None if no Spotify url was found

12f3e84

Lint docs

45ad61b

henry-oberholtzer requested changes Nov 1, 2025

View reviewed changes

JOJ0 marked this pull request as draft December 6, 2025 06:46

		__version__ = "1.0"


		def summary(task: "SingletonImportTask"):

		- path: Path to the file to write the skipped songs to. Default:
		``skipped_songs.txt``.

New Plugin: Save skipped imports #6140

Are you sure you want to change the base?

New Plugin: Save skipped imports #6140

Uh oh!

Conversation

EmberLightVFX commented Oct 31, 2025

Description

To Do

Extra Info

Uh oh!

sourcery-ai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull Request Overview

Reviewed Changes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

henry-oberholtzer left a comment

Choose a reason for hiding this comment

Uh oh!

henry-oberholtzer Nov 1, 2025

Choose a reason for hiding this comment

Uh oh!

EmberLightVFX Nov 1, 2025

Choose a reason for hiding this comment

Uh oh!

henry-oberholtzer Nov 1, 2025

Choose a reason for hiding this comment

Uh oh!

EmberLightVFX Nov 1, 2025

Choose a reason for hiding this comment

Uh oh!

henry-oberholtzer Nov 1, 2025

Choose a reason for hiding this comment

Uh oh!

EmberLightVFX Nov 1, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

JOJ0 commented Nov 9, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

JOJ0 commented Dec 6, 2025

Uh oh!

henry-oberholtzer commented Dec 6, 2025

Uh oh!

JOJ0 commented Dec 7, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

EmberLightVFX Nov 1, 2025 •

edited

Loading

JOJ0 commented Nov 9, 2025 •

edited

Loading

JOJ0 commented Dec 7, 2025 •

edited

Loading