Updated from https://github.com/renatoviolin/next_word_prediction
A simple application that uses transformer models to predict the next word or a masked word in a sentence.
The purpose is to demo and compare the main models currently available.
The first load takes a long time because the application downloads all the models. Despite 6 models running, inference time is acceptable even on CPU.
This app implements two variants of the same task (predicting a token). The first variant treats the mask token as sitting at the end of the sentence, which simulates predicting the next word of the sentence.
In the second variant, you must insert a mask token at the position where you want the model to predict the word.
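The two variants can be sketched as a small input-preparation step. This is a minimal illustration, not the code in app.py: the function name `build_model_input` and its parameters are hypothetical, and the default mask token `<mask>` is an assumption. The literal mask token depends on the model (for example, BERT tokenizers use `[MASK]` while RoBERTa and BART use `<mask>`).

```python
def build_model_input(text: str, mask_token: str = "<mask>", next_word: bool = True) -> str:
    """Return the string to feed to a masked language model (hypothetical helper)."""
    text = text.strip()
    if next_word:
        # Variant 1: append the mask token at the end of the sentence,
        # so filling the mask amounts to predicting the next word.
        return f"{text} {mask_token}" if text else mask_token
    # Variant 2: the caller has already placed the mask token where
    # the prediction is wanted, so only validate that it is present.
    if mask_token not in text:
        raise ValueError(f"expected {mask_token!r} somewhere in the input")
    return text


if __name__ == "__main__":
    print(build_model_input("I want to"))
    print(build_model_input("I <mask> to eat", next_word=False))
```

The returned string would then be passed to whichever fill-mask model is being compared, using that model's own mask token.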
python3 -m venv venv
venv/bin/pip3 install -r requirements.txt
venv/bin/python app.py
Open your browser at http://localhost:8080
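I just tested the BART model and it also seems to work well. However, it uses a lot of memory: with it enabled, memory usage reached 2.24 GB, and API responses became slightly slower.
- Installing torch may fail; if so, install the packages one at a time: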
venv/bin/pip3 install -U torch --no-cache-dir
--no-cache-dir is the key option.
- To download the large language models through a mirror in mainland China: HF_ENDPOINT=https://hf-mirror.com venv/bin/python app.py
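The mirror setting can also be applied from Python instead of the shell. This is a minimal sketch, assuming the Hugging Face libraries read the `HF_ENDPOINT` environment variable when they are first imported, so the variable has to be set before that import. The helper name `use_hf_mirror` is hypothetical and is not part of app.py.

```python
import os


def use_hf_mirror(endpoint: str = "https://hf-mirror.com") -> str:
    """Point Hugging Face downloads at a mirror (hypothetical helper).

    Call this before importing transformers or huggingface_hub;
    setting the variable after the import is assumed to have no effect.
    """
    os.environ["HF_ENDPOINT"] = endpoint
    return os.environ["HF_ENDPOINT"]


if __name__ == "__main__":
    print(use_hf_mirror())
```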
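Optimization and refactoring: each request now computes predictions for only one language (en/cn), which reduces computation and network transfer.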
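The single-language behavior can be sketched as a dispatch on the requested language, so that only one predictor runs per request. This is an illustrative sketch under assumptions: `predict_for_language`, the `Predictor` type, and the stub predictors are hypothetical names, and the real handlers in app.py may be organized differently.

```python
from typing import Callable, Dict, List

# A predictor takes the input text and returns candidate words.
Predictor = Callable[[str], List[str]]


def predict_for_language(text: str, lang: str, predictors: Dict[str, Predictor]) -> List[str]:
    """Run only the predictor registered for `lang` (hypothetical helper)."""
    if lang not in predictors:
        raise ValueError(f"unsupported language: {lang!r}")
    # Only the selected language's predictor is called; the others are skipped,
    # which avoids their computation and keeps the response small.
    return predictors[lang](text)


if __name__ == "__main__":
    # Stub predictors standing in for the real models.
    stubs: Dict[str, Predictor] = {
        "en": lambda text: ["english-candidate"],
        "cn": lambda text: ["chinese-candidate"],
    }
    print(predict_for_language("I want to", "en", stubs))
```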