Ray Aviary: Open-Source Multi-LLM Serving
We outline why you might want to consider using open LLMs (e.g. Llama 2 from Meta) rather than proprietary LLMs (e.g. ChatGPT from OpenAI), in terms of cost, options, deployment flexibility and control of data.
Presuming you want to serve open LLMs for some or all of your traffic, we then go through the self-hosted options available to you, including our own, RayLLM, which has advantages in performance, cost and scalability.
Finally, we discuss your options if you want to use open LLMs without all of the headaches of self-hosting. These include both shared endpoints (Anyscale Endpoints) and assisted self-hosting (Anyscale Private Endpoints).
Chief Scientist at Anyscale