Description
There are some changes that I think should be addressed before the package is released on the Cheese Shop.
Result ranking
SQLite has no out-of-the-box function for ranking full-text search results. I think it makes sense to extend the scope of the package to address ranking, as it is squarely a topic of both "sqlite" and "fts".
All the code is already out there. There's the article; even though it's about peewee, an MIT-licensed package, the code can be easily extracted. Here's a gist with a module and a test case for it.
Because BM25 is a general, language-independent ranking function, its presence would make the package more complete.
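To make the idea concrete, here is a minimal sketch of a BM25 ranking function. It is not the code from the gist or from peewee; it assumes an FTS4 table queried with matchinfo(tbl, 'pcnalx'), and the names parse_match_info and bm25, the constants K1 = 1.2 and B = 0.75 (conventional defaults), and the clamping of non-positive IDF values are my own illustrative choices.

```python
import math
import struct

K1 = 1.2   # term-frequency saturation (conventional default, an assumption)
B = 0.75   # document-length normalisation (conventional default, an assumption)


def parse_match_info(blob):
    """Decode a matchinfo() blob into a tuple of 32-bit unsigned integers."""
    return struct.unpack("=%dI" % (len(blob) // 4), blob)


def bm25(raw_match_info):
    """Score one row from matchinfo(tbl, 'pcnalx'); higher means more relevant."""
    info = parse_match_info(raw_match_info)
    phrase_count, col_count, total_docs = info[0], info[1], info[2]
    avg_off = 3                      # 'a': average token count per column
    len_off = avg_off + col_count    # 'l': token count of this row per column
    hit_off = len_off + col_count    # 'x': three integers per (phrase, column)

    score = 0.0
    for phrase in range(phrase_count):
        for col in range(col_count):
            x = hit_off + 3 * (col + phrase * col_count)
            term_freq = float(info[x])           # hits in this row
            docs_with_term = float(info[x + 2])  # rows with at least one hit
            if term_freq == 0:
                continue
            idf = math.log(
                (total_docs - docs_with_term + 0.5) / (docs_with_term + 0.5)
            )
            if idf <= 0.0:
                idf = 1e-6  # keep very common terms from lowering the score
            doc_len = float(info[len_off + col])
            avg_len = float(info[avg_off + col]) or 1.0
            norm = 1 - B + B * (doc_len / avg_len)
            score += idf * (term_freq * (K1 + 1)) / (term_freq + K1 * norm)
    return score
```

With the standard sqlite3 module, such a function could be registered via conn.create_function("bm25", 1, bm25) and used as ORDER BY bm25(matchinfo(tbl, 'pcnalx')) DESC, provided the SQLite build has FTS4 enabled.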
Minimum documentation
A README should be written to give an overview and cover the basics. I can assist with it.
Also, recipes for integrating with tokenizers for major domains (CJK, Cyrillic, etc.) would be a good idea.
Minor
An underscore is undesirable in a Python module name. I suggest renaming sqlite_tokenizer.py. The "sqlite" part is obvious from context. tokenizer.py is better but still not good, as it isn't informative: the module doesn't provide a real tokenizer per se, but rather a binding to register one. binding.py may be a better name, though you may be able to coin a better one.
Make user symbols available from __init__.py so import sqlitefts is sufficient.
setup.py: url points to another package. "Operating System :: POSIX :: Linux" seems redundant with "Operating System :: OS Independent".