WIP: Refactor the IDatabase calls to write directly to the database #271

wojtek · 2025-01-12T21:38:57+01:00

wojtek commented

2025-01-12 21:38:57 +01:00

Closes #268

wojtek added 1 commit 2025-01-12 21:38:57 +01:00

Update default database path

Cargo CI / Build and Test (pull_request) Successful in 2m26s

Details

Cargo CI / Lint (pull_request) Successful in 1m19s

Details

16dc2917f0

wojtek commented

2025-01-12 21:41:03 +01:00

Step 1: get rid of pre-commit - upon failure reload database. This introduces extra writes which are to be resolved by being more selective about what is to be written.

wojtek added 1 commit 2025-01-12 22:05:22 +01:00

Remove pre_commit

Cargo CI / Build and Test (pull_request) Successful in 2m21s

Details

Cargo CI / Lint (pull_request) Successful in 1m17s

Details

a8cd1d341d

wojtek added 1 commit 2025-01-12 22:41:06 +01:00

Remove unused methods

Cargo CI / Build and Test (pull_request) Successful in 2m19s

Details

Cargo CI / Lint (pull_request) Successful in 1m18s

Details

c2973680e8

wojtek commented

2025-01-12 22:56:14 +01:00

To do:

~~Redo the _and update methods to not take arbitrary closures but sort the parent collection.~~
~~Figure out how best to make targeted updates to the DB - updating only the values that need updating is better.~~

To do: 1. ~Redo the `_and` update methods to not take arbitrary closures but sort the parent collection.~ 2. ~Figure out how best to make targeted updates to the DB - updating only the values that need updating is better.~

wojtek added 1 commit 2025-01-12 22:56:32 +01:00

Correctly sort

Cargo CI / Build and Test (pull_request) Successful in 2m18s

Details

Cargo CI / Lint (pull_request) Successful in 1m17s

Details

6a80c05909

wojtek commented

2025-01-13 19:12:42 +01:00

Find the object
Update the DB
Update the in-memory object (either by reloading from DB or directly)

1. Find the object 2. Update the DB 3. Update the in-memory object (either by reloading from DB or directly)

wojtek commented

2025-01-13 19:13:02 +01:00

Also contemplate replacing all searches through get_collection with dedicated queries -> will no longer need to keep collection in memory

wojtek commented

2025-01-13 19:27:34 +01:00

Must still ensure that all items in the library are written into the database (including any changes). At the moment the database is dumped in full after initial startup.

Rely on the database to do this logic. After all, it's meant to be clever and efficient. Hide this logic within save and make the SQL queries clever.

Remember to add an integration test which will have a database with new items that should be added/modified in the test.

Must still ensure that all items in the library are written into the database (including any changes). At the moment the database is dumped in full after initial startup. Rely on the database to do this logic. After all, it's meant to be clever and efficient. Hide this logic within `save` and make the SQL queries clever. Remember to add an integration test which will have a database with new items that should be added/modified in the test.

wojtek commented

2025-01-13 19:55:53 +01:00

Start by making save smarter - sign of success is that drop is not needed. Integration tests are crucial here.
Once save is smarter start by making individual IDatabase calls smarter and only updating when needed.

1. Start by making save smarter - sign of success is that drop is not needed. Integration tests are crucial here. 2. Once save is smarter start by making individual IDatabase calls smarter and only updating when needed.

wojtek commented

2025-01-13 22:33:05 +01:00

Options so far:

INSERT OR IGNORE followed by UPDATE ... WHERE EXISTS ... EXCEPT - if it exists, it's hit in two queries
REPLACE INTO - replaces the row rather than updating (probably does not matter - leave to implementation)

Expect few replacements and few inserts in general

Options so far: 1. INSERT OR IGNORE followed by UPDATE ... WHERE EXISTS ... EXCEPT - if it exists, it's hit in two queries 2. REPLACE INTO - replaces the row rather than updating (probably does not matter - leave to implementation) Expect few replacements and few inserts in general

wojtek commented

2025-01-13 22:33:14 +01:00

Working example

INSERT OR IGNORE INTO artists (name, mbid, sort, properties) 
VALUES ('Alex Rivers', '"None"', 'Alex Rivers', '{"key":"value"}');

UPDATE artists SET sort = 'Rivers, Alex', properties = '{"no":"u"}' 
WHERE name = 'Alex Rivers' AND mbid = '"None"' AND EXISTS (
  SELECT 1 EXCEPT SELECT 1 WHERE sort = 'Rivers, Alex' AND properties = '{"no":"u"}'
);

This is better than using != and <> as it will correctly handle sort being NULL in the destination table

Working example ``` INSERT OR IGNORE INTO artists (name, mbid, sort, properties) VALUES ('Alex Rivers', '"None"', 'Alex Rivers', '{"key":"value"}'); UPDATE artists SET sort = 'Rivers, Alex', properties = '{"no":"u"}' WHERE name = 'Alex Rivers' AND mbid = '"None"' AND EXISTS ( SELECT 1 EXCEPT SELECT 1 WHERE sort = 'Rivers, Alex' AND properties = '{"no":"u"}' ); ``` This is better than using `!=` and `<>` as it will correctly handle `sort` being NULL in the destination table

wojtek commented

2025-01-13 22:38:44 +01:00

So the rusqlite query should be:

INSERT OR IGNORE INTO artists (name, mbid, sort, properties) 
VALUES (?1, ?2, ?3, ?4);

UPDATE artists SET sort = ?3, properties = ?4 
WHERE name = ?1 AND mbid = ?2 AND EXISTS (
  SELECT 1 EXCEPT SELECT 1 WHERE sort = ?3 AND properties = ?4
);

So the rusqlite query should be: ``` INSERT OR IGNORE INTO artists (name, mbid, sort, properties) VALUES (?1, ?2, ?3, ?4); UPDATE artists SET sort = ?3, properties = ?4 WHERE name = ?1 AND mbid = ?2 AND EXISTS ( SELECT 1 EXCEPT SELECT 1 WHERE sort = ?3 AND properties = ?4 ); ```

wojtek commented

2025-01-14 08:34:51 +01:00

Following Reddit thread, this works too

INSERT INTO artists (name, mbid, sort, properties) 
VALUES ('Alex Rivers', '"None"', 'Alex Rivers', '{"key":"value"}')
ON CONFLICT(name, mbid) DO UPDATE
SET sort = 'Rivers, Alex', properties = '{"no":"u"}'
WHERE EXISTS (
  SELECT 1 EXCEPT SELECT 1 WHERE sort = 'Rivers, Alex' AND properties = '{"no":"u"}'
);

Following Reddit thread, this works too ``` INSERT INTO artists (name, mbid, sort, properties) VALUES ('Alex Rivers', '"None"', 'Alex Rivers', '{"key":"value"}') ON CONFLICT(name, mbid) DO UPDATE SET sort = 'Rivers, Alex', properties = '{"no":"u"}' WHERE EXISTS ( SELECT 1 EXCEPT SELECT 1 WHERE sort = 'Rivers, Alex' AND properties = '{"no":"u"}' ); ```

wojtek added 2 commits 2025-01-14 21:49:22 +01:00

Convenience method 64f3e2d8b7

Update tables without dropping

Cargo CI / Build and Test (pull_request) Successful in 2m53s

Details

Cargo CI / Lint (pull_request) Failing after 1m16s

Details

aea710eb32

wojtek commented

2025-01-14 22:37:35 +01:00

Thorny problem: could have acquired lib_id in merge, but lib_id is used in UNIQUE clause. Previously this was not a problem since all tables were dropped and rewritten.

Potential solution: downgrade mbid back to non-id and introduce a DB-id which is guaranteed to exist for every item in the DB. Then the DB id will be the unique constraint just like lib_id or singleton+title is for library albums or name for artists. MBID was being used as a proxy for DB id but this fails when a DB id needs to exist and none exists.

Use the DB to assign the DB id just like the library assigns lib IDs. Tough this will require querying the DB after a save.

Don't need one for artists. Beets does not assign IDs to artists and generally in case of a name clash, the more famous band wins and the other disambiguates.

Use the fact that SQL NULL values are considered different in a UNIQUE constraint (https://stackoverflow.com/questions/22699409/sqlite-null-and-unique)

Thorny problem: could have acquired lib_id in merge, but lib_id is used in UNIQUE clause. Previously this was not a problem since all tables were dropped and rewritten. Potential solution: downgrade mbid back to non-id and introduce a DB-id which is guaranteed to exist for every item in the DB. Then the DB id will be the unique constraint just like lib_id or singleton+title is for library albums or name for artists. MBID was being used as a proxy for DB id but this fails when a DB id needs to exist and none exists. Use the DB to assign the DB id just like the library assigns lib IDs. Tough this will require querying the DB after a save. Don't need one for artists. Beets does not assign IDs to artists and generally in case of a name clash, the more famous band wins and the other disambiguates. Use the fact that SQL NULL values are considered different in a UNIQUE constraint (https://stackoverflow.com/questions/22699409/sqlite-null-and-unique)

wojtek added 3 commits 2025-01-18 10:14:30 +01:00

Tests 8a9030bf61

Correct artist id-info split 7934d7eccf

Correct album id-info split

Cargo CI / Build and Test (pull_request) Failing after 1m43s

Details

Cargo CI / Lint (pull_request) Successful in 1m17s

Details

fdd6d13146

wojtek commented

2025-01-18 10:48:14 +01:00

Solution: converge on a unique database ID. Every item in the collection MUST have a database ID and this will be the only unique identifier. Everything else: name/title, lib id will get relegated to info metadata.

wojtek commented

2025-01-18 18:17:06 +01:00

By the end IDatabase should not have save. Every insert/update should be explicit. (ideally - it might be easier to just leave it for now).

By the end IDatabase should not have `save`. Every insert/update should be explicit. (ideally - it might be easier to just leave it for now).

wojtek commented

2025-01-19 10:34:18 +01:00

Remember to move artist sort from name to meta

wojtek commented

2025-01-19 18:23:53 +01:00

Q: What sets info apart from the rest of meta?

Options

get rid of distinction, make choices explicit every time.
non-info - metadata that should be carefully curated - strongly affects experience, info - needs less curation - can be fetched from the web

Q: What sets info apart from the rest of meta? Options - get rid of distinction, make choices explicit every time. - non-info - metadata that should be carefully curated - strongly affects experience, info - needs less curation - can be fetched from the web

wojtek added 3 commits 2025-01-19 20:48:39 +01:00

Move artist mbref back to info 3901c161bd

WIP: use unique artist id 3b0fa28dfc

Merge figured out

Cargo CI / Build and Test (pull_request) Failing after 1m20s

Details

Cargo CI / Lint (pull_request) Failing after 37s

Details

38a2e1f33c

wojtek commented

2025-01-20 07:40:20 +01:00

Also rebrand TrackId to TrackTitle

Cargo CI / Build and Test (pull_request) Failing after 1m20s

Details

Cargo CI / Lint (pull_request) Failing after 37s

Details

This pull request is marked as a work in progress.

View command line instructions.

Checkout

From your project repository, check out a new branch and test the changes.

git fetch -u origin 268---refactor-the-idatabase-calls-to-write-directly-to-the-database:268---refactor-the-idatabase-calls-to-write-directly-to-the-database

git checkout 268---refactor-the-idatabase-calls-to-write-directly-to-the-database

Sign in to join this conversation.

1 Participants

Notifications

Due Date

No due date set.

Dependencies

No dependencies set.

Reference: wojtek/musichoard#271