Sambaverse introduction

Sambaverse allows you to find the best open source model for your project by comparing models with a single prompt. Sambaverse uses SambaNova’s first Composition of Experts model, Samba-1, to enable instant model swapping through a single API. You can switch between multiple models on a single instance in milliseconds, adapting to your specific needs. Sambaverse provides free cloud inference for exploring open source LLMs. Read on to learn how it can benefit your research or project.
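The idea of comparing models with one prompt can be pictured as a simple fan-out: the same prompt goes to every candidate model and the answers are collected side by side. The following is a minimal sketch only. The `compare_models` helper, the model names, and the stub callables are hypothetical and stand in for real hosted models; this is not the Sambaverse API.

```python
from typing import Callable, Dict

# Hypothetical type: any callable that takes a prompt and returns a completion.
Model = Callable[[str], str]


def compare_models(prompt: str, models: Dict[str, Model]) -> Dict[str, str]:
    """Send the same prompt to every model and collect responses by model name."""
    return {name: generate(prompt) for name, generate in models.items()}


# Stub "models" used purely for illustration; a real setup would call hosted models.
stub_models: Dict[str, Model] = {
    "model-a": lambda p: f"[model-a] {p.upper()}",
    "model-b": lambda p: f"[model-b] {p[::-1]}",
}

results = compare_models("hello", stub_models)
for name, answer in results.items():
    print(f"{name}: {answer}")
```

Because every model sees an identical prompt, differences in the collected answers reflect the models rather than the input.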

Sambaverse landing page

Why expert models are the future

The emergence of generative AI is driving the next phase of digital transformation. And while generic ask-me-anything chat models are great for consumer applications, integrating generative AI into business applications requires a different approach: having specific expert models to solve every part of the business problem.

Take a research paper assistant as an example: an end user wants to search many research papers to find interesting information or check for related prior work. You may want one model to provide a chat-based interface to the end user, but behind this model you have another that analyzes charts or complex diagrams, one that interprets mathematical formulas, models that translate papers written in different languages, and models for different areas of research. This single use case could involve more than 10 models. Now multiply that by the thousands of business workflows within a large organization.

Trying to solve all of these problems with a single large model incurs the following costs:

  • Monolithic training: Fine-tuning is very expensive because every run updates the entire model, even for a narrow use case.

  • Monolithic inference: Every request requires loading the entire model, which is resource-intensive and becomes infeasible at scale.

  • Alignment tax: Training on new domains, new tasks, new languages, and new modalities can make the model perform worse on what it previously learned.

  • Granular access control: Access is hard to manage in a large monolithic model trained on many different domains. For example, it is difficult to ensure that company accounting information is only available to the finance department.

Additionally, generic models perform well across many tasks, but specialized expert models perform better on specific ones. Composing different expert models lets you match or exceed the accuracy of large general models while addressing all of the issues listed above. Solving these issues is what fueled SambaNova’s push towards the Composition of Experts architecture.

Composition of Experts (CoE)

The Composition of Experts (CoE) architecture is a system of multiple experts, where each expert is a fully trained machine learning model. The experts in a CoE model are strategically curated from the open source community and offer the latest advancements in accuracy. CoE combines the strengths of large, monolithic models with the advantages of smaller, specialized models, mitigating the limitations of each. Instead of relying on a single, resource-intensive model, SambaNova’s CoE divides tasks into specialized subfields, each handled by a dedicated expert.
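To make the idea of dividing tasks among dedicated experts concrete, here is a minimal sketch of a router that picks one expert per prompt. Everything in it is an illustrative assumption: the `Expert` class, the keyword-counting rule, and the expert names are hypothetical, and the source does not describe how SambaNova's routing actually works.

```python
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Expert:
    """A hypothetical expert: a name, trigger keywords, and a callable model."""
    name: str
    keywords: List[str]
    run: Callable[[str], str]


def route(prompt: str, experts: List[Expert], default: Expert) -> Expert:
    """Return the expert with the most keyword hits, or the default if none match."""
    words = re.findall(r"[a-z]+", prompt.lower())
    best, best_score = default, 0
    for expert in experts:
        score = sum(words.count(keyword) for keyword in expert.keywords)
        if score > best_score:
            best, best_score = expert, score
    return best


# Stub experts for illustration; each would be a fully trained model in a real CoE.
general = Expert("general-chat", [], lambda p: f"[general-chat] {p}")
finance = Expert("finance", ["revenue", "invoice", "budget"], lambda p: f"[finance] {p}")
legal = Expert("legal", ["contract", "clause", "liability"], lambda p: f"[legal] {p}")

prompt = "Summarize the liability clause in this contract."
chosen = route(prompt, [finance, legal], default=general)
print(chosen.name)
print(chosen.run(prompt))
```

The point of the sketch is the shape of the system: only the selected expert runs for a given request, so the cost of a call depends on that expert rather than on the combined size of all experts. A keyword count is used here only because it is easy to read.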

Samba-1

Sambaverse is powered by Samba-1, SambaNova’s first model implemented with the Composition of Experts architecture. Samba-1 combines the comprehensive power of trillion-parameter models with the precision and efficiency of specialized models. It offers a single API endpoint that orchestrates domain-specific experts across fields such as finance, legal, and engineering. The same API also exposes task-specific experts for operations like summarization, extraction, and content editing across multiple modalities. Samba-1 integrates these diverse models smoothly and provides flexible access and permission controls that align with an organization’s structure, ensuring data segregation and security.

Features of Samba-1

  • Leverage diverse models from one endpoint: Use a single API endpoint to build knowledge-rich applications with a wide range of domain and task experts across different configurations and modalities (including text, code, and images), and integrate these experts into your applications for practical use.

  • Customize and fine-tune models for specific tasks: Tailor Samba-1 with your own data for fine-tuning. This ensures the model aligns with your specific business requirements and enhances performance on targeted tasks.

  • Develop applications with advanced reasoning capabilities: Harness the automated routing power of Samba-1. Build applications that can make complex decisions, perform tasks, and provide solutions tailored to customer needs by combining expert models with API calls and data queries.

  • Configurable access and permissions: Samba-1 offers flexible access controls, allowing precise management of model permissions. Align permissions with organizational roles to enhance security and ensure compliance.

  • Total AI stack control: With options for on-premise and air-gapped deployments, Samba-1 ensures complete data privacy and security. Use your sensitive data for model training without external exposure.

  • Optimize application performance and efficiency: Benefit from the efficiency and lower operational costs of running Samba-1 in SambaStudio on the Dataflow hardware architecture, with its large HBM capacity. Enjoy a 10x reduction in inference cost and power consumption over alternative solutions.
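The configurable access and permissions feature in the list above can be pictured as a mapping from organizational roles to the experts each role may call. The sketch below is a hedged illustration only: the `PERMISSIONS` table, the role and expert names, and the `authorize` and `call_expert` helpers are hypothetical and do not represent Samba-1's actual permission API.

```python
from typing import Dict, Set

# Hypothetical role-to-expert permissions, aligned with organizational roles.
PERMISSIONS: Dict[str, Set[str]] = {
    "finance": {"general-chat", "finance-expert"},
    "engineering": {"general-chat", "code-expert"},
}


def authorize(role: str, expert: str) -> bool:
    """Return True only if the role is explicitly allowed to use the expert."""
    return expert in PERMISSIONS.get(role, set())


def call_expert(role: str, expert: str, prompt: str) -> str:
    """Check permissions before dispatching; the dispatch itself is stubbed out."""
    if not authorize(role, expert):
        raise PermissionError(f"role '{role}' may not use expert '{expert}'")
    return f"[{expert}] would handle: {prompt}"


print(call_expert("finance", "finance-expert", "Summarize Q3 accounts"))

try:
    call_expert("engineering", "finance-expert", "Summarize Q3 accounts")
except PermissionError as err:
    print(f"denied: {err}")
```

Because each expert is a separate model, a deny-by-default check like this can be applied per expert, which matches the example of keeping accounting information limited to the finance department.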

SN40L

Samba-1 and Sambaverse run on the most powerful AI chip, SambaNova’s SN40L. The SN40L can host models with up to 5 trillion parameters in a single instance using 12TB of DDR memory. This memory capacity enables Sambaverse’s curated list of open source models to be hosted in a single instance.
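A quick back-of-the-envelope check shows how the two figures above relate. The sketch assumes 16-bit weights (2 bytes per parameter) and decimal terabytes; both are assumptions, since the source does not state the numeric precision. It also ignores activations, caches, and any other runtime overhead.

```python
TB = 10**12  # decimal terabyte (assumption; the source does not specify TB vs TiB)


def weights_size_tb(num_params: float, bytes_per_param: float) -> float:
    """Memory needed to store the weights alone, in terabytes."""
    return num_params * bytes_per_param / TB


ddr_capacity_tb = 12.0   # from the text: 12TB of DDR memory
total_params = 5e12      # from the text: 5 trillion parameters
bytes_per_param = 2      # assumption: 16-bit weights

needed_tb = weights_size_tb(total_params, bytes_per_param)
fits = needed_tb <= ddr_capacity_tb

print(f"weights need {needed_tb:.1f} TB of {ddr_capacity_tb:.1f} TB available; fits: {fits}")
```

Under these assumptions the weights take about 10 TB, which fits within 12TB. At 4 bytes per parameter the same parameter count would need about 20 TB and would not fit, so the claim depends on the precision used.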