Benchmark a custom string normalisation function #138
A custom-written string normalisation function might be faster than AhoCorasick (AC) simply by being specialised for the data being processed. Things this custom function can assume:
Running with AhoCorasick
Running with custom code above
Not sure how to explain
tui::app::machine::search::benches::is_char_sensitive_alphanumeric
... It seems consistent.
In the most likely case, i.e. alphanumeric input, the performance is similar. In the least likely and most difficult case, i.e. UTF-8, the custom solution wins by about 20%.
A quick test using just "Heaven's Basement" revealed a definite advantage for the custom solution. A better sample is needed.
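For illustration only, a specialised single-pass normaliser of the kind being compared might look like the sketch below. It is not the code benchmarked in this issue: the name `normalize_custom` and the set of rules (lowercasing, folding typographic apostrophes and dashes to ASCII) are assumptions chosen to show why a hand-written loop can be cheap on mostly-ASCII input.

```rust
/// Hypothetical single-pass normaliser; NOT the code benchmarked in this issue.
/// Assumed rules: lowercase everything, fold a few typographic characters to
/// ASCII, and pass every other character through unchanged.
fn normalize_custom(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for ch in input.chars() {
        match ch {
            // Fast path: ASCII only needs a cheap ASCII lowercase.
            c if c.is_ascii() => out.push(c.to_ascii_lowercase()),
            // Assumed folds for typographic apostrophes and dashes.
            '\u{2018}' | '\u{2019}' => out.push('\''),
            '\u{2010}'..='\u{2015}' => out.push('-'),
            // General Unicode lowercase, which may expand to several chars.
            c => out.extend(c.to_lowercase()),
        }
    }
    out
}

fn main() {
    // The right single quotation mark (U+2019) is folded to an ASCII apostrophe.
    println!("{}", normalize_custom("Heaven\u{2019}s Basement"));
    // prints "heaven's basement"
}
```

A loop like this visits each character exactly once and never builds an automaton, which may be part of why a short, mostly-ASCII sample favours it.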
Using the artist list from 2024-02-19, the results are as follows:
Running with AhoCorasick:
Running with custom solution:
Still don't know what's up with
test tui::app::machine::search::benches::is_char_sensitive
since that method is untouched...
The conclusion is that on the most likely input, which is mostly alphanumeric with some non-ASCII characters and the occasional special character, AhoCorasick is ever so slightly better. That means an additional dependency in exchange for increased confidence in handling future edge cases.
For final reference, an approximation of the original method which used just string methods:
Performs with: