[pull] master from beetbox:master #1

pull · 2022-04-06T04:31:54Z

See Commits and Changes for more details.



This utilises regex substitution in the substitute plugin. The previous approach only used regex to match the pattern, then replaced it with a static string. This change allows more complex substitutions, where the output depends on the input.

### Example use case

Say we want to keep only the first artist of a multi-artist credit, as in the following list:

```
Neil Young & Crazy Horse -> Neil Young
Michael Hurley, The Holy Modal Rounders, Jeffrey Frederick & The Clamtones -> Michael Hurley
James Yorkston and the Athletes -> James Yorkston
```

This would previously have required three separate rules, one for each resulting artist. By using a regex substitution, we can get the desired behaviour in a single rule:

```yaml
substitute:
  ^(.*?)(,| &| and).*: \1
```

(Capture the text until the first `,`, ` &` or ` and`, then use that capture group as the output.)

### Notes

I've kept the previous behaviour of only applying the first matching rule, but I'm not 100% sure it's the ideal approach. I can imagine both cases where you want to apply several rules in sequence and cases where you want to stop after the first match.
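To make the rule above concrete, here is a minimal Python sketch of first-match regex substitution with a capture group. It is not the plugin's code: the `RULES` table and the `substitute` helper are hypothetical names, and the choice of `search` over `match` and of case-sensitive matching are assumptions made for illustration.

```python
import re

# Hypothetical rule table mirroring the YAML rule above: (pattern, replacement).
RULES = [
    (re.compile(r"^(.*?)(,| &| and).*"), r"\1"),
]


def substitute(text, rules=RULES):
    """Apply only the first rule whose pattern matches; ignore later rules."""
    for pattern, replacement in rules:
        if pattern.search(text):
            # re.sub expands backreferences such as \1 in the replacement.
            return pattern.sub(replacement, text)
    # No rule matched: leave the text untouched.
    return text


for credit in (
    "Neil Young & Crazy Horse",
    "Michael Hurley, The Holy Modal Rounders, Jeffrey Frederick & The Clamtones",
    "James Yorkston and the Athletes",
):
    print(credit, "->", substitute(credit))
```

The lazy group `(.*?)` stops at the earliest separator, which is why the second credit yields `Michael Hurley` even though it also contains ` &` further along.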

Quick fix for #5467. Checks whether the Python path is under the Windows Store folder and, if so, exits with an error that points the user to the beets [documentation](https://beets.readthedocs.io/en/stable/guides/main.html). Happy for feedback to improve this, but I thought it best to exit as early as possible.
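A check of this kind could look roughly like the sketch below. It assumes that a Microsoft Store install can be recognised by a `WindowsApps` component in the interpreter path. The function name `is_windows_store_python` is hypothetical and this is not the code from the commit.

```python
from pathlib import PureWindowsPath


def is_windows_store_python(executable: str) -> bool:
    """Guess whether an interpreter path belongs to a Microsoft Store install.

    Assumption: Store installs live under a 'WindowsApps' directory.
    """
    parts = [part.lower() for part in PureWindowsPath(executable).parts]
    return "windowsapps" in parts


# PureWindowsPath parses Windows paths on any operating system.
print(is_windows_store_python(r"C:\Users\me\AppData\Local\Microsoft\WindowsApps\python.exe"))
print(is_windows_store_python(r"C:\Python312\python.exe"))
```

In a real entry point the argument would typically be `sys.executable`, and a positive result would lead to an early exit with a pointer to the installation guide.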


Fixes #5148.

When importing, the code that matches tracks does not consider the medium number. This causes problems on Hybrid SACDs (and other releases) where the artists, track numbers, titles, and lengths are the same on both layers.

I added a distance penalty for mismatching medium numbers.

Before:

```
$ beet imp .
/Volumes/Music/ti/Red Garland/1958 - All Mornin' Long - 1 (6 items)

  Match (95.4%):
  The Red Garland Quintet - All Mornin' Long
  ≠ media, year
  MusicBrainz, 2xHybrid SACD (CD layer), 2013, US, Analogue Productions, CPRJ 7130 SA, mono
  https://musicbrainz.org/release/6a584522-58ea-470b-81fb-e60e5cd7b21e
  * Artist: The Red Garland Quintet
  * Album: All Mornin' Long
  * Hybrid SACD (CD layer) 1
     ≠ (#2-1) All Mornin' Long (20:21) -> (#1-1) All Mornin' Long (20:21)
     ≠ (#2-2) They Can't Take That Away From Me (10:24) -> (#1-2) They Can't Take That Away From Me (10:27)
     ≠ (#2-3) Our Delight (6:23) -> (#1-3) Our Delight (6:23)
  * Hybrid SACD (CD layer) 2
     ≠ (#1-1) All mornin' long (20:21) -> (#2-1) All Mornin' Long (20:21)
     ≠ (#1-2) They can't take that away from me (10:27) -> (#2-2) They Can't Take That Away From Me (10:25)
     ≠ (#1-3) Our delight (6:23) -> (#2-3) Our Delight (6:23)
➜ [A]pply, More candidates, Skip, Use as-is, as Tracks, Group albums, Enter search, enter Id, aBort, eDit, edit Candidates?
```

Note that all tracks tagged with disc 1 get moved to disc 2 and vice versa.

After:

```
$ beet-test imp .
/Volumes/Music/ti/Red Garland/1958 - All Mornin' Long - 1 (6 items)

  Match (95.4%):
  The Red Garland Quintet - All Mornin' Long
  ≠ media, year
  MusicBrainz, 2xMedia, 2013, US, Analogue Productions, CPRJ 7130 SA, mono
  https://musicbrainz.org/release/6a584522-58ea-470b-81fb-e60e5cd7b21e
  * Artist: The Red Garland Quintet
  * Album: All Mornin' Long
  * Hybrid SACD (CD layer) 1
     ≠ (#1-1) All mornin' long (20:21) -> (#1-1) All Mornin' Long (20:21)
     ≠ (#1-2) They can't take that away from me (10:27) -> (#1-2) They Can't Take That Away From Me (10:27)
     ≠ (#1-3) Our delight (6:23) -> (#1-3) Our Delight (6:23)
  * Hybrid SACD (SACD layer) 2
     * (#2-1) All Mornin' Long (20:21)
     * (#2-2) They Can't Take That Away From Me (10:24)
     * (#2-3) Our Delight (6:23)
➜ [A]pply, More candidates, Skip, Use as-is, as Tracks, Group albums, Enter search, enter Id, aBort, eDit, edit Candidates?
```

Yay!
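The effect of a medium penalty can be illustrated with a toy model. The sketch below is not the beets matching code: the `Track` class, the weights, and the brute-force `assign` helper are all hypothetical, chosen only to show why identical tracks on two layers tie without a penalty and resolve correctly with one.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class Track:
    medium: int  # disc number
    index: int   # track number on that disc
    title: str
    length: int  # seconds


def track_distance(item, candidate, medium_weight):
    """Toy distance where 0.0 is a perfect match; all weights are illustrative."""
    dist = 0.0
    if item.title.lower() != candidate.title.lower():
        dist += 3.0
    if item.index != candidate.index:
        dist += 1.0
    # Length difference, capped at 10 seconds and scaled to [0, 1].
    dist += min(abs(item.length - candidate.length), 10) / 10
    if item.medium != candidate.medium:
        dist += medium_weight
    return dist


def assign(items, candidates, medium_weight):
    """Return a tuple giving, for each item, the index of its matched candidate.

    Brute force over all permutations, which is fine for a handful of tracks.
    """
    def total(perm):
        return sum(
            track_distance(item, candidates[j], medium_weight)
            for item, j in zip(items, perm)
        )

    return min(permutations(range(len(candidates))), key=total)


# Track 1 of each disc: same title and length (20:21 = 1221 s) on both layers.
items = [Track(1, 1, "All mornin' long", 1221), Track(2, 1, "All Mornin' Long", 1221)]
# The candidate release happens to list disc 2 first.
candidates = [Track(2, 1, "All Mornin' Long", 1221), Track(1, 1, "All Mornin' Long", 1221)]

print(assign(items, candidates, medium_weight=0.0))  # tie, so the first ordering wins
print(assign(items, candidates, medium_weight=1.0))  # same-disc pairing wins
```

With a weight of zero every pairing costs the same, so the result depends only on enumeration order. Any positive weight breaks the tie in favour of the pairing that keeps each item on its own disc.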


… getting started guide #4820

## Description

Added a quick checkpoint to ensure the config file is set up correctly prior to users importing their music library. This was something I discovered later after running into an issue with my config file, and I hope it helps new users avoid the issues I had.

This commit introduces a distance threshold mechanism for the Genius and Google backends.

- Create a new `SearchBackend` base class with a method `check_match` that performs checking.
- Start using undocumented `dist_thresh` configuration option for good, and mention it in the docs. This controls the maximum allowable distance for matching artist and title names.

These changes aim to improve the accuracy of lyrics matching, especially when there are slight variations in artist or title names, see #4791.
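As a rough illustration of the idea, the sketch below shows a base class whose `check_match` accepts a result only when both artist and title are within a threshold. Only the names `SearchBackend`, `check_match` and `dist_thresh` come from the commit message. The method signature, the `difflib`-based distance and the threshold values are assumptions and may differ from the plugin.

```python
from difflib import SequenceMatcher


def string_distance(a: str, b: str) -> float:
    """Return 0.0 for identical strings and 1.0 for completely different ones."""
    return 1.0 - SequenceMatcher(None, a.lower(), b.lower()).ratio()


class SearchBackend:
    """Hypothetical base class for backends that search before fetching."""

    def __init__(self, dist_thresh: float):
        self.dist_thresh = dist_thresh

    def check_match(self, target_artist, target_title, artist, title) -> bool:
        """Accept a result only if artist and title are both close enough."""
        worst = max(
            string_distance(target_artist, artist),
            string_distance(target_title, title),
        )
        return worst <= self.dist_thresh


lenient = SearchBackend(dist_thresh=0.25)
strict = SearchBackend(dist_thresh=0.1)
print(lenient.check_match("The Beatles", "Hey Jude", "Beatles", "Hey Jude"))
print(strict.check_match("The Beatles", "Hey Jude", "Beatles", "Hey Jude"))
```

Taking the maximum of the two distances means a perfect title match cannot compensate for a clearly wrong artist, which is the failure mode the commit message describes.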

Improve requests performance with requests.Session which uses connection pooling for repeated requests to the same host. Additionally, this centralizes request configuration, making sure that we use the same timeout and provide beets user agent for all requests.


Tidy up 'Google.is_page_candidate' method and remove 'Google.sluggify' method which was a duplicate of 'slug'. Since 'GeniusFetchTest' only tested whether the artist name is cleaned up (the rest of the functionality is patched), remove it and move its test cases to the 'test_slug' test.

* Type the response data that the Google Custom Search API returns.
* Exclude some 'letras.mus.br' pages that do not contain lyrics.
* Exclude results from Musixmatch, as we cannot access their pages.
* Improve parsing of the URL title:
  - Handle long URL titles that get truncated (end with ellipsis) for long searches.
  - Remove domains starting with 'www'.
  - Parse the title AND the artist. Previously this would only parse the title, and fetch lyrics even when the artist did not match.
* Remove now redundant credits cleanup and checks for valid lyrics.

Additionally, improve HTML pre-processing:

* Ensure a new line between blocks of lyrics text from letras.mus.br.
* Parse a missing last block of lyrics text from lacoccinelle.net.
* Parse a missing last block of lyrics text from paroles.net.
* Fix encoding issues with AZLyrics by setting response encoding to None, allowing `requests` to handle it.

If we get caught by Cloudflare, it forwards our request somewhere else and returns a validation text response. To make sure that this text is not mistaken for lyrics, we disable redirects for the Google backend, check the response code, and raise if there is a redirect attempt. This source is then skipped and the backend continues with the next one.


…tionality (#5474)

### Bug Fixes

- Fixed #4791: Resolved an issue with the Genius backend where it couldn't match lyrics if there was a slight variation in the artist's name.

### Plugin Enhancements

* **Session Management**: Introduced a `TimeoutSession` to enable connection pooling and maintain consistent configuration across requests.
* **Error Handling**: Centralized error handling logic in a new `RequestsHandler` class, which includes methods for retrieving either HTML text or JSON data.
* **Logging**: Added methods to ensure the backend name is included in log messages.

### Configuration Changes

* Added a new `dist_thresh` field to the configuration, allowing users to control the maximum tolerable mismatch between the artist and title of the lyrics search result and their item. Interestingly, this field was previously available (though undocumented) and used in the `Tekstowo` backend. Now, this threshold has also been applied to **Genius** and **Google** search logic.

### Backend Updates

* All backends that perform searches now validate each result against the configured `dist_thresh`.

#### Genius

* Removed the need to scrape HTML tags for lyrics; instead, lyrics are now parsed from the JSON data embedded in the HTML. This change should reduce our vulnerability to Genius' frequent alterations in their HTML structure.
* Documented the structure of their search JSON data.

#### Google

* Typed the response data returned by the Google Custom Search API.
* Excluded certain pages under **https://letras.mus.br** that do not contain lyrics.
* Excluded all results from MusiXmatch, as we cannot access their pages.
* Improved parsing of URL titles (used for matching item/lyrics artist/title):
  - Handled results from long search queries where URL titles are truncated with an ellipsis.
  - Enhanced URL title cleanup logic.
  - Added functionality to determine (or rather, guess) not only the track title but also the artist from the URL title.
* Similar to #5406, search results are now compared to the original item and sorted by distance. Results exceeding the configured `dist_thresh` value are discarded. The previous functionality simply selected the first result containing the track's title in its URL, which often led to returning lyrics for the wrong artist, particularly for short track titles.
* Since we now fetch lyrics confidently, redundant checks for valid lyrics and credits cleanup have been removed.

### HTML Cleanup

* Organized regex patterns into a new `Html` class.
* Adjusted patterns to ensure new lines between blocks of lyrics text scraped from `letras.mus.br` and `musica.com`.
* Modified patterns to scrape missing lyrics text on `paroles.net` and `lacoccinelle.net`.

See the diff in `test/plugins/lyrics_page.py`.
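The Google backend behaviour described above (compare each search result to the item, sort by distance, discard anything beyond `dist_thresh`) might be sketched as follows. The `SearchResult` type, the `rank_results` helper and the `difflib`-based distance are hypothetical stand-ins rather than the plugin's implementation.

```python
from difflib import SequenceMatcher
from typing import NamedTuple


class SearchResult(NamedTuple):
    artist: str
    title: str
    url: str


def distance(a: str, b: str) -> float:
    """Return 0.0 for identical strings and 1.0 for completely different ones."""
    return 1.0 - SequenceMatcher(None, a.lower(), b.lower()).ratio()


def rank_results(results, artist, title, dist_thresh):
    """Keep results within dist_thresh of the item, closest first."""
    scored = []
    for result in results:
        dist = max(distance(artist, result.artist), distance(title, result.title))
        if dist <= dist_thresh:
            scored.append((dist, result))
    scored.sort(key=lambda pair: pair[0])
    return [result for _, result in scored]


results = [
    SearchResult("Someone Else", "Help", "u1"),
    SearchResult("Beatles", "Help!", "u2"),
    SearchResult("The Beatles", "Help!", "u3"),
]
ranked = rank_results(results, "The Beatles", "Help!", dist_thresh=0.3)
print([result.url for result in ranked])
```

A first-hit strategy would have returned `u1` here because its title contains the search title. Ranking by combined artist and title distance drops it and prefers the exact match.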

pull bot added the ⤵️ pull label Apr 6, 2022

snejus force-pushed the master branch 3 times, most recently from c874689 to aa0db04 Compare November 22, 2024 01:36

snejus and others added 26 commits November 22, 2024 01:50

Increment version to 2.1.0

bc16ed1

Release: fix github-tag-action version

4a5b9a2

Release: temporarily ignore errors with bumping version and pypi push

cf3acec

Release: make sure release artefacts are present for the tagging job

0780bf3

Revert "Release: temporarily ignore errors with bumping version and p…

979f123

…ypi push" This reverts commit cf3acec.

bitesize to good first issue: Update changelog.rst (#5477)

6444111

Changelog notes need to go under Unreleased

176661b

Add a quick check for MST Store Python install

3798ac5

Update related window files to match 3.8

f0f77aa

Manually format messages

3750c63

Windows & VSCode & Python

22810b6

Update changelog

ec4b26f

Update error logging as suggested

1d63bf9

Update docs/guides/main.rst

5f4fe21

Co-authored-by: Šarūnas Nejus <[email protected]>

ARM footnotes

79d7d48

Disable OIDC for coverage uploads from forks

db71444

This is based on the following comment: codecov/codecov-action#1594 (comment)

Fix SACD Imports

32e9e58

Fix coverage upload from forks: Attempt #2 (#5514)

37a2cec

The fix is based on the following comment: codecov/codecov-action#1594 (comment)

Include test files, manual to sdist

9c4d4d9

Fixup changelog rst formatting for this and prev version

f5a0246

Fix missing changelog in the release notes

a7f00ea

Seems like this is the missing bit: https://docs.github.com/en/actions/writing-workflows/workflow-syntax-for-github-actions#using-job-outputs-in-a-matrix-job

Update dependencies

f39eb98

Skip autobpm tests if librosa isn't available

336b5b3

Except under GitHub CI, where we expect all tests to run.

JOJ0 and others added 30 commits January 24, 2025 10:44

Add missing changelog for #4982

346071c

Added a quick checkpoint to ensure the config file is set up correctl…

1ae4677

…y prior to users importing their music library

Apply config-sanity-check suggestion in docs

6161b44

Make lyrics plugin documentation slightly more clear

c40db10

Centralise request error handling

283c513

Include class name in the log messages

cb29605

Leave a single chef in the kitchen

7c2fb31

Do not try to strip cruft from the parsed lyrics text.

dd9f178

Having removed it, I found that only the Genius lyrics changed: they had an extra new line. Thus I defined a function 'collapse_newlines', which now gets called for the Genius lyrics.
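A helper of this kind can be very small. The version below is a guess at the behaviour, not the function from the commit: it assumes that "collapsing" means reducing any run of three or more newlines to a single blank line.

```python
import re


def collapse_newlines(text: str) -> str:
    """Reduce runs of three or more newlines to exactly one blank line."""
    return re.sub(r"\n{3,}", "\n\n", text)


print(repr(collapse_newlines("Verse one\n\n\nVerse two")))
```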

lyrics: Add symbols for better visual feedback in the logs

8bdc2c6

lyrics: Do not write item unless lyrics have changed

8a1ce27

Replace custom unescape implementation by html.unescape

55b7824

Remove extract_text_between

54fc67b

Genius: refactor and simplify

745c5eb

Unite Genius, Tekstowo and Google backends under the same interface

12c5eaa

Google: prioritise Songlyrics and AZlyrics sources

07d372c

Remove dependency existence checks

04054ca

I think we can make our life easier by removing these checks assuming that users follow the instructions in the docs.

Tidy up handling of backends

bdc564a

Append source to the lyrics

734bcc2

Xfail Songlyrics source

858c135

Google: add support for dainuzodziai.lt

39c479f

Do not search for Various Artists, split titles by ' / '

7389f24

Bring back Tekstowo search

dab9a0d

It was my mistake to remove search earlier - I found that in many cases it works fine.
