WIP: Refactor the IDatabase calls to write directly to the database #271

Draft
wojtek wants to merge 12 commits from 268---refactor-the-idatabase-calls-to-write-directly-to-the-database into main
Owner

Closes #268

Closes #268
wojtek added 1 commit 2025-01-12 21:38:57 +01:00
Update default database path
All checks were successful
Cargo CI / Build and Test (pull_request) Successful in 2m26s
Cargo CI / Lint (pull_request) Successful in 1m19s
16dc2917f0
Author
Owner

Step 1: get rid of pre-commit - upon failure reload database. This introduces extra writes which are to be resolved by being more selective about what is to be written.

Step 1: get rid of pre-commit - upon failure reload database. This introduces extra writes which are to be resolved by being more selective about what is to be written.
wojtek added 1 commit 2025-01-12 22:05:22 +01:00
Remove pre_commit
All checks were successful
Cargo CI / Build and Test (pull_request) Successful in 2m21s
Cargo CI / Lint (pull_request) Successful in 1m17s
a8cd1d341d
wojtek added 1 commit 2025-01-12 22:41:06 +01:00
Remove unused methods
All checks were successful
Cargo CI / Build and Test (pull_request) Successful in 2m19s
Cargo CI / Lint (pull_request) Successful in 1m18s
c2973680e8
Author
Owner

To do:

  1. Redo the _and update methods to not take arbitrary closures but sort the parent collection.
  2. Figure out how best to make targeted updates to the DB - updating only the values that need updating is better.
To do: 1. ~Redo the `_and` update methods to not take arbitrary closures but sort the parent collection.~ 2. ~Figure out how best to make targeted updates to the DB - updating only the values that need updating is better.~
wojtek added 1 commit 2025-01-12 22:56:32 +01:00
Correctly sort
All checks were successful
Cargo CI / Build and Test (pull_request) Successful in 2m18s
Cargo CI / Lint (pull_request) Successful in 1m17s
6a80c05909
Author
Owner
  1. Find the object
  2. Update the DB
  3. Update the in-memory object (either by reloading from DB or directly)
1. Find the object 2. Update the DB 3. Update the in-memory object (either by reloading from DB or directly)
Author
Owner

Also contemplate replacing all searches through get_collection with dedicated queries -> will no longer need to keep collection in memory

Also contemplate replacing all searches through get_collection with dedicated queries -> will no longer need to keep collection in memory
Author
Owner

Must still ensure that all items in the library are written into the database (including any changes). At the moment the database is dumped in full after initial startup.

Rely on the database to do this logic. After all, it's meant to be clever and efficient. Hide this logic within save and make the SQL queries clever.

Remember to add an integration test which will have a database with new items that should be added/modified in the test.

Must still ensure that all items in the library are written into the database (including any changes). At the moment the database is dumped in full after initial startup. Rely on the database to do this logic. After all, it's meant to be clever and efficient. Hide this logic within `save` and make the SQL queries clever. Remember to add an integration test which will have a database with new items that should be added/modified in the test.
Author
Owner
  1. Start by making save smarter - sign of success is that drop is not needed. Integration tests are crucial here.
  2. Once save is smarter start by making individual IDatabase calls smarter and only updating when needed.
1. Start by making save smarter - sign of success is that drop is not needed. Integration tests are crucial here. 2. Once save is smarter start by making individual IDatabase calls smarter and only updating when needed.
Author
Owner

Options so far:

  1. INSERT OR IGNORE followed by UPDATE ... WHERE EXISTS ... EXCEPT - if it exists, it's hit in two queries
  2. REPLACE INTO - replaces the row rather than updating (probably does not matter - leave to implementation)

Expect few replacements and few inserts in general

Options so far: 1. INSERT OR IGNORE followed by UPDATE ... WHERE EXISTS ... EXCEPT - if it exists, it's hit in two queries 2. REPLACE INTO - replaces the row rather than updating (probably does not matter - leave to implementation) Expect few replacements and few inserts in general
Author
Owner

Working example

INSERT OR IGNORE INTO artists (name, mbid, sort, properties) 
VALUES ('Alex Rivers', '"None"', 'Alex Rivers', '{"key":"value"}');

UPDATE artists SET sort = 'Rivers, Alex', properties = '{"no":"u"}' 
WHERE name = 'Alex Rivers' AND mbid = '"None"' AND EXISTS (
  SELECT 1 EXCEPT SELECT 1 WHERE sort = 'Rivers, Alex' AND properties = '{"no":"u"}'
);

This is better than using != and <> as it will correctly handle sort being NULL in the destination table

Working example ``` INSERT OR IGNORE INTO artists (name, mbid, sort, properties) VALUES ('Alex Rivers', '"None"', 'Alex Rivers', '{"key":"value"}'); UPDATE artists SET sort = 'Rivers, Alex', properties = '{"no":"u"}' WHERE name = 'Alex Rivers' AND mbid = '"None"' AND EXISTS ( SELECT 1 EXCEPT SELECT 1 WHERE sort = 'Rivers, Alex' AND properties = '{"no":"u"}' ); ``` This is better than using `!=` and `<>` as it will correctly handle `sort` being NULL in the destination table
Author
Owner

So the rusqlite query should be:

INSERT OR IGNORE INTO artists (name, mbid, sort, properties) 
VALUES (?1, ?2, ?3, ?4);

UPDATE artists SET sort = ?3, properties = ?4 
WHERE name = ?1 AND mbid = ?2 AND EXISTS (
  SELECT 1 EXCEPT SELECT 1 WHERE sort = ?3 AND properties = ?4
);
So the rusqlite query should be: ``` INSERT OR IGNORE INTO artists (name, mbid, sort, properties) VALUES (?1, ?2, ?3, ?4); UPDATE artists SET sort = ?3, properties = ?4 WHERE name = ?1 AND mbid = ?2 AND EXISTS ( SELECT 1 EXCEPT SELECT 1 WHERE sort = ?3 AND properties = ?4 ); ```
Author
Owner

Following Reddit thread, this works too

INSERT INTO artists (name, mbid, sort, properties) 
VALUES ('Alex Rivers', '"None"', 'Alex Rivers', '{"key":"value"}')
ON CONFLICT(name, mbid) DO UPDATE
SET sort = 'Rivers, Alex', properties = '{"no":"u"}'
WHERE EXISTS (
  SELECT 1 EXCEPT SELECT 1 WHERE sort = 'Rivers, Alex' AND properties = '{"no":"u"}'
);
Following Reddit thread, this works too ``` INSERT INTO artists (name, mbid, sort, properties) VALUES ('Alex Rivers', '"None"', 'Alex Rivers', '{"key":"value"}') ON CONFLICT(name, mbid) DO UPDATE SET sort = 'Rivers, Alex', properties = '{"no":"u"}' WHERE EXISTS ( SELECT 1 EXCEPT SELECT 1 WHERE sort = 'Rivers, Alex' AND properties = '{"no":"u"}' ); ```
wojtek added 2 commits 2025-01-14 21:49:22 +01:00
Update tables without dropping
Some checks failed
Cargo CI / Build and Test (pull_request) Successful in 2m53s
Cargo CI / Lint (pull_request) Failing after 1m16s
aea710eb32
Author
Owner

Thorny problem: could have acquired lib_id in merge, but lib_id is used in UNIQUE clause. Previously this was not a problem since all tables were dropped and rewritten.

Potential solution: downgrade mbid back to non-id and introduce a DB-id which is guaranteed to exist for every item in the DB. Then the DB id will be the unique constraint just like lib_id or singleton+title is for library albums or name for artists. MBID was being used as a proxy for DB id but this fails when a DB id needs to exist and none exists.

Use the DB to assign the DB id just like the library assigns lib IDs. Tough this will require querying the DB after a save.

Don't need one for artists. Beets does not assign IDs to artists and generally in case of a name clash, the more famous band wins and the other disambiguates.

Use the fact that SQL NULL values are considered different in a UNIQUE constraint (https://stackoverflow.com/questions/22699409/sqlite-null-and-unique)

Thorny problem: could have acquired lib_id in merge, but lib_id is used in UNIQUE clause. Previously this was not a problem since all tables were dropped and rewritten. Potential solution: downgrade mbid back to non-id and introduce a DB-id which is guaranteed to exist for every item in the DB. Then the DB id will be the unique constraint just like lib_id or singleton+title is for library albums or name for artists. MBID was being used as a proxy for DB id but this fails when a DB id needs to exist and none exists. Use the DB to assign the DB id just like the library assigns lib IDs. Tough this will require querying the DB after a save. Don't need one for artists. Beets does not assign IDs to artists and generally in case of a name clash, the more famous band wins and the other disambiguates. Use the fact that SQL NULL values are considered different in a UNIQUE constraint (https://stackoverflow.com/questions/22699409/sqlite-null-and-unique)
wojtek added 3 commits 2025-01-18 10:14:30 +01:00
Correct album id-info split
Some checks failed
Cargo CI / Build and Test (pull_request) Failing after 1m43s
Cargo CI / Lint (pull_request) Successful in 1m17s
fdd6d13146
Author
Owner

Solution: converge on a unique database ID. Every item in the collection MUST have a database ID and this will be the only unique identifier. Everything else: name/title, lib id will get relegated to info metadata.

Solution: converge on a unique database ID. Every item in the collection MUST have a database ID and this will be the only unique identifier. Everything else: name/title, lib id will get relegated to info metadata.
Author
Owner

By the end IDatabase should not have save. Every insert/update should be explicit. (ideally - it might be easier to just leave it for now).

By the end IDatabase should not have `save`. Every insert/update should be explicit. (ideally - it might be easier to just leave it for now).
Author
Owner

Remember to move artist sort from name to meta

Remember to move artist sort from name to meta
Author
Owner

Q: What sets info apart from the rest of meta?

Options

  • get rid of distinction, make choices explicit every time.
  • non-info - metadata that should be carefully curated - strongly affects experience, info - needs less curation - can be fetched from the web
Q: What sets info apart from the rest of meta? Options - get rid of distinction, make choices explicit every time. - non-info - metadata that should be carefully curated - strongly affects experience, info - needs less curation - can be fetched from the web
wojtek added 3 commits 2025-01-19 20:48:39 +01:00
Merge figured out
Some checks failed
Cargo CI / Build and Test (pull_request) Failing after 1m20s
Cargo CI / Lint (pull_request) Failing after 37s
38a2e1f33c
Author
Owner

Also rebrand TrackId to TrackTitle

Also rebrand TrackId to TrackTitle
Some checks failed
Cargo CI / Build and Test (pull_request) Failing after 1m20s
Cargo CI / Lint (pull_request) Failing after 37s
This pull request is marked as a work in progress.

Checkout

From your project repository, check out a new branch and test the changes.
git fetch -u origin 268---refactor-the-idatabase-calls-to-write-directly-to-the-database:268---refactor-the-idatabase-calls-to-write-directly-to-the-database
git checkout 268---refactor-the-idatabase-calls-to-write-directly-to-the-database
Sign in to join this conversation.
No description provided.