FriendliAI announced the launch of Friendli Serverless Endpoints

FriendliAI, a leader in inference serving for generative AI, today announced the launch of Friendli Serverless Endpoints, a service for accessible development with generative AI models. The service removes the technical barriers of managing the underlying infrastructure, putting the power of cutting-edge generative AI models directly into the hands of developers, data scientists, and businesses of all sizes.

“Building the future of generative AI requires democratizing access to the technology,” says Byung-Gon Chun, CEO of FriendliAI. “With Friendli Serverless Endpoints, we’re removing the complicated infrastructure and GPU optimization hurdles that hold back innovation. Now, anyone can seamlessly integrate state-of-the-art models like Llama 2 and Stable Diffusion into their workflows at low costs and high speeds, unlocking incredible possibilities for text generation, image creation, and beyond.”

Users can seamlessly integrate open-source generative AI models into their applications with granular control at the per-token or per-step level, enabling need-specific resource usage optimizations. Friendli Serverless Endpoints comes pre-loaded with popular models like Llama 2, CodeLlama, Mistral, and Stable Diffusion.

Friendli Serverless Endpoints provides per-token billing at the lowest price on the market: $0.2 per million tokens for the Llama 2 13B model and $0.8 per million tokens for the Llama 2 70B model. The service responds to queries with 2-4x lower latency than other leading solutions that use vLLM, ensuring a smooth and responsive generative AI experience. This pricing and speed are achieved through the company’s Friendli Engine, an optimized serving engine that reduces the number of GPUs required for serving by up to 6-7x compared to traditional solutions.
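
To make the per-token pricing concrete, the short Python sketch below estimates a bill from the two rates quoted above. It is illustrative only: the model labels and the `estimate_cost` helper are hypothetical names, not part of any FriendliAI API, and it assumes every token is billed at the single quoted rate.

```python
from decimal import Decimal

# USD prices per million tokens, as quoted in the announcement.
# The dictionary keys are hypothetical labels, not official model identifiers.
PRICE_PER_MILLION_TOKENS = {
    "llama-2-13b": Decimal("0.2"),
    "llama-2-70b": Decimal("0.8"),
}


def estimate_cost(model: str, tokens: int) -> Decimal:
    """Estimate the USD cost of processing `tokens` tokens on `model`.

    Assumes all tokens are charged at the same flat per-token rate.
    """
    if tokens < 0:
        raise ValueError("tokens must be non-negative")
    rate = PRICE_PER_MILLION_TOKENS[model]
    return rate * Decimal(tokens) / Decimal(1_000_000)


if __name__ == "__main__":
    # Compare the two quoted models for a workload of 5 million tokens.
    for name in PRICE_PER_MILLION_TOKENS:
        print(f"{name}: ${estimate_cost(name, 5_000_000)}")
```

Under these assumptions, a workload of 5 million tokens would cost about $1 on the 13B model and about $4 on the 70B model.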

For those seeking dedicated resources and custom model compatibility, FriendliAI offers Friendli Dedicated Endpoints through cloud-based dedicated GPU instances, as well as Friendli Container through Docker. This flexibility ensures the perfect solution for a variety of generative AI ambitions.

“We’re on a mission to make open-source generative AI models fast and affordable,” says Chun. “The Friendli Engine, along with our new Friendli Serverless Endpoints, is a game-changer. We’re thrilled to welcome new users and make generative AI more accessible and economical, advancing our mission to democratize generative AI.”

Start Your Generative Journey Today: FriendliAI is committed to fostering a thriving ecosystem for generative AI innovation. Visit to sign up for Friendli Serverless Endpoints and unlock the transformative power of generative AI today.
