In the past few days, the server example from llama.cpp has become a really useful piece of software, so much so that for many tasks it could replace the main program as the primary tool for interacting with a model.
How difficult would it be to make this server available for Falcon as well?
I have no idea how much Falcon-specific code is actually in falcon-main. Shouldn't most of the model-specific code be in the libraries, especially falcon_common and libfalcon?
How much is left to do once you've changed all the external calls in server.cpp to the corresponding calls from falcon_common and libfalcon?