Port of langchain text splitters to Ruby.
So far only the RecursiveCharacterTextSplitter is implemented. PRs for others are welcome!
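For readers new to the approach, here is a rough sketch of the idea behind recursive character splitting: try the coarsest separator first, pack the pieces into chunks up to the size limit, and fall back to finer separators for any piece that is still too long. This is not the gem's implementation. The method name `recursive_split` and its default separator list are assumptions made for illustration, and overlap between chunks is left out to keep it short.

```ruby
# Illustrative only: a simplified recursive splitter, NOT the gem's code.
# `recursive_split` and the default separators are hypothetical.
# Chunk overlap is intentionally not handled here.
def recursive_split(text, chunk_size:, separators: ["\n\n", "\n", " ", ""])
  return [text] if text.length <= chunk_size

  # Pick the first separator that occurs in the text ("" splits into characters).
  index = separators.index { |sep| sep.empty? || text.include?(sep) }
  # No usable separator left: return the oversized text unchanged.
  return [text] if index.nil?

  separator = separators[index]
  remaining = separators.drop(index + 1)

  chunks = []
  current = ""
  text.split(separator).each do |piece|
    # Try to extend the current chunk with the next piece.
    candidate = current.empty? ? piece : current + separator + piece
    if candidate.length <= chunk_size
      current = candidate
      next
    end

    # The piece does not fit: close the current chunk first.
    chunks << current unless current.empty?
    if piece.length <= chunk_size
      current = piece
    else
      # The piece alone is too long: split it with the finer separators.
      chunks.concat(recursive_split(piece, chunk_size: chunk_size, separators: remaining))
      current = ""
    end
  end
  chunks << current unless current.empty?
  chunks
end
```

With this sketch, `recursive_split("aaa bbb ccc ddd", chunk_size: 7)` returns `["aaa bbb", "ccc ddd"]`. The gem's own splitter also takes `chunk_overlap`, so its output will differ from this simplified version.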
$ gem install text_splitters
require "text_splitters"
Learn more about this splitter.
text = "Madam Speaker, Madam Vice President, our First Lady and Second Gentleman. Members of Congress and the Cabinet. Justices of the Supreme Court. My fellow Americans."
splitter = ::TextSplitters::RecursiveCharacterTextSplitter.new(chunk_size: 100, chunk_overlap: 20)
output = splitter.split(text)
output[0] # "Madam Speaker, Madam Vice President, our First Lady and Second Gentleman. Members of Congress and the Cabinet."
output[1] # "and the Cabinet. Justices of the Supreme Court. My fellow Americans."
If you want to report a bug, or have ideas, feedback or questions about the gem, let me know via GitHub issues and I will do my best to provide a helpful answer. Happy hacking!
The gem is available as open source under the terms of the MIT License.
Pull requests are welcome!
Clone the repo, run bundle install to install the dependencies, then run rake to run the tests.