An AI-based tweet generator that automatically generates collections of tweets. It can replace the usernames and texts in real Twitter data provided by the user with generated fake usernames and texts. The tweets are generated by models trained on real tweets with the textgenrnn library.
- `textgenrnn`, which can be installed using `pip install textgenrnn`
- `ujson`, which can be installed using `pip install ujson`
- `tensorflow`, which can be installed using `pip install tensorflow`
- If other libraries are missing, install them using `pip install <library>`
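Before running the scripts, it can help to check which of the required packages are already importable. The following is a minimal sketch using only the standard library; the helper name `missing_packages` is hypothetical and not part of this repository.

```python
import importlib.util

# Packages named in the requirements list above.
REQUIRED = ["textgenrnn", "ujson", "tensorflow"]


def missing_packages(names):
    """Return the names that cannot be found in the current environment.

    find_spec only locates a top-level package; it does not import it,
    so this check stays fast even for large packages such as tensorflow.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages; install with: pip install " + " ".join(missing))
else:
    print("All required packages are importable.")
```

This only reports whether a package can be located; it does not verify that the installed versions are compatible with each other.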
Users must provide the raw Twitter data downloaded from Twitter, unzipped to a JSON file. This JSON file is fed to both the trainer and the generator.
The downloaded Twitter data is in .gz format (e.g. `Tweet_2019-07-15_08-17-17.gz`). To unzip the data to a JSON file, use the command `gunzip -c <filename>.gz > <filename>.json` (e.g. `gunzip -c Tweet_2019-07-15_08-17-17.gz > tweets.json`).
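Where `gunzip` is not available (for example on Windows), the same decompression can be done from Python. This is a sketch using the standard `gzip` and `shutil` modules; the function name `gunzip_to_json` is hypothetical and not part of this repository.

```python
import gzip
import shutil


def gunzip_to_json(gz_path, json_path):
    """Decompress gz_path into json_path.

    The data is streamed in chunks, so large downloads do not need
    to fit in memory. The original .gz file is left untouched,
    matching the behaviour of `gunzip -c`.
    """
    with gzip.open(gz_path, "rb") as src, open(json_path, "wb") as dst:
        shutil.copyfileobj(src, dst)


# Example call, equivalent to the gunzip command above:
# gunzip_to_json("Tweet_2019-07-15_08-17-17.gz", "tweets.json")
```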
Python 3.6 or later is required.
- IMPORTANT: create a subdirectory called `weights` in the same directory as the scripts for the pipeline to work.
- Run `tweet_trainer.py` to train a model using the `textgenrnn` library. It trains the model on a sample of size `k` drawn from the user-provided JSON file containing the raw Twitter data. It takes three parameters: `-i`, the JSON file storing the original tweets (e.g. `tweets.json`); `-k`, the training sample size; and `-e`, the number of training epochs. An example is `python3 tweet_trainer.py -i tweets.json -k 20000 -e 5`. Use `python3 tweet_trainer.py --help` for a detailed description of the script and its parameters.
- Run `tweet_generator.py` to replace tweets with fake usernames and texts. The fake usernames are generated by the helper script `name_generator.py`, and the fake texts are generated using the trained model. The script generates the tweets and writes them to another JSON file in batches of size `k`, which is specified by the user. It takes three parameters: `-i`, the JSON file storing the original tweets (e.g. `tweets.json`); `-o`, the JSON file to store the fake tweets (e.g. `fake_tweets.json`); and `-k`, the batch size to generate. An example is `python3 tweet_generator.py -i tweets.json -o fake_tweets.json -k 12000`. Use `python3 tweet_generator.py --help` for a detailed description of the script and its parameters.
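The steps above (creating the `weights` directory, drawing a training sample of size `k`, and processing tweets in batches of size `k`) can be sketched roughly as follows. This is an illustration, not the scripts' actual implementation. It assumes the JSON file holds one tweet object per line with a `text` field, which may not match the real data layout, and the helper names `ensure_weights_dir`, `sample_texts`, and `batches` are hypothetical.

```python
import json
import os
import random


def ensure_weights_dir(script_dir):
    """Create the 'weights' subdirectory next to the scripts if it is missing."""
    path = os.path.join(script_dir, "weights")
    os.makedirs(path, exist_ok=True)
    return path


def sample_texts(json_path, k, seed=None):
    """Return up to k randomly chosen tweet texts.

    Assumption: one JSON object per line, each with a 'text' field.
    Lines that are empty or lack 'text' are skipped.
    """
    texts = []
    with open(json_path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            tweet = json.loads(line)
            if "text" in tweet:
                texts.append(tweet["text"])
    rng = random.Random(seed)
    # Cap k so small files do not raise ValueError in sample().
    return rng.sample(texts, min(k, len(texts)))


def batches(items, k):
    """Yield consecutive slices of at most k items."""
    for start in range(0, len(items), k):
        yield items[start:start + k]
```

A sampled list of this kind is the sort of input that textgenrnn's `train_on_texts` accepts, and `batches` shows one way the generator could bound memory use by writing `k` tweets at a time.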
- name_generator.py
- tweet_trainer.py
- tweet_generator.py