Brian AI models
A series of open-weights models for Web3
Read the full announcement article.
Large Language Models (LLMs) are rapidly gaining adoption across knowledge-based work worldwide. They are versatile, multi-purpose tools that support a wide array of tasks. However, in specialized fields, LLMs often require domain-specific knowledge to achieve cutting-edge performance, particularly in deeply technical fields such as Web3.
Similar domain-specific models already exist in other sectors:
Astronomy: “AstroMLab 3: Achieving GPT-4o Level Performance in Astronomy with a Specialized 8B-Parameter Large Language Model”
Finance: “BloombergGPT: A Large Language Model for Finance”
Medicine: “Towards Expert-Level Medical Question Answering with Large Language Models”
Although specialized LLMs hold great potential, the significant training resources they require have constrained their advancement. To bridge this gap, domain-specific LLMs can be developed either by training from scratch on domain-specific datasets or by applying supervised or instruction-based fine-tuning to existing general-purpose LLMs.
The Brian-8B model
The Brian-8B model, built on top of Llama-3.1-8B, is an LLM tailored for the Web3 space. It shows excellent natural language understanding of technical sentences related to Web3. To date, it is the best model of its size for Web3 practitioners. Its size makes it easy to deploy and fast at inference, which suits developers and startups well. This release is our first step in showcasing the utility of Web3-specific models and the Brian team's goal of contributing to the community.
Brian-8B is a backbone model, which means it is not suited for production use: this release is only the first phase of a longer process that will take another few months. However, developers can instruction fine-tune (IFT) our backbone model on their own supervised datasets for specific downstream tasks.
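As an illustration of what preparing a supervised dataset for instruction fine-tuning might look like, here is a minimal sketch that turns instruction/response pairs into prompt/completion records in JSON Lines format. The prompt template, the function names, and the example pairs are all assumptions made for illustration; the Brian team has not published a required format, and your training framework may expect a different one.

```python
import json

# Hypothetical prompt template; the source does not specify one.
PROMPT_TEMPLATE = "### Instruction:\n{instruction}\n\n### Response:\n"


def build_ift_record(instruction: str, response: str) -> dict:
    """Turn one supervised pair into a prompt/completion training record."""
    return {
        "prompt": PROMPT_TEMPLATE.format(instruction=instruction.strip()),
        "completion": response.strip(),
    }


def to_jsonl(pairs: list[tuple[str, str]]) -> str:
    """Serialize pairs as JSON Lines, one training record per line."""
    return "\n".join(json.dumps(build_ift_record(i, r)) for i, r in pairs)


# Made-up example pairs, for illustration only.
EXAMPLE_PAIRS = [
    (
        "What is a smart contract?",
        "A program stored on a blockchain that runs when predefined conditions are met.",
    ),
    (
        "What does ERC-20 define?",
        "A standard interface for fungible tokens on Ethereum.",
    ),
]

if __name__ == "__main__":
    print(to_jsonl(EXAMPLE_PAIRS))
```

The resulting file can then be fed to whichever fine-tuning toolchain you use; only the record-building step is shown here.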
Leveraging the unique datasets it owns (e.g., intents for transactions, QA for info/data, Solidity code), the Brian team will release several IFT models covering a wide range of Web3-related use cases:
Natural-language to JSON model
Solidity code generation model
QA model
Once released, these models will be used in the Brian Intent Recognition Engine, the core element of the Brian architecture. Accessible via the Brian API, this architecture can understand user and agent intent for Web3 interactions, provide accurate textual answers and data, build transactions, and generate smart contracts.
Read more about it here.
Why the Brian-8B model?
Amid the recent hype around AI agents and intent-based apps, we have seen most devs use OpenAI GPT models, an understandable choice considering they currently lead the market. While these models are suitable for a beta version of a Web3 product, we do not believe they will remain a good solution in the long term, especially if we want to build a new, open, and transparent Internet.
Compared to OpenAI GPT models, these are the main reasons devs should choose the Brian models:
Performance: General-purpose AI models lack effective natural language understanding of Web3-related topics. Thanks to lower perplexity (a measure of how well a language model predicts the next word; models with lower perplexity generally perform better), domain-specific models can achieve performance comparable to, if not higher than, OpenAI GPT models.
Further external fine-tuning: The Brian models will be released as open weights, which means the pre-trained weights (the learned parameters that define the model’s behavior) are available while the underlying training data and algorithms remain private. This approach allows devs to fine-tune the models further for their project's needs. For example, in a multi-agent system, each AI agent could be powered by a model fine-tuned for the specific tasks it has to perform.
Inference costs: These models will be available at a fraction of the cost of larger general-purpose models. This will also be reflected in our API tier pricing.
Decentralized inference: The Brian models (or other open-source IFT versions) could also be hosted on a decentralized inference network. The Brian team has already conducted some R&D toward the future implementation of a Decentralized Collaborative Intelligence Network.
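To make the perplexity metric mentioned under the performance point more concrete, here is a minimal sketch of how it is commonly computed: the exponential of the average negative log-likelihood per token. The per-token probabilities below are made-up numbers for illustration, not measurements of Brian-8B or any other model, and the function name is our own.

```python
import math


def perplexity(token_log_probs: list[float]) -> float:
    """Perplexity = exp of the average negative log-likelihood per token."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    avg_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(avg_nll)


# Made-up probabilities that two hypothetical models assign to each
# token of the same Web3-related sentence.
general_model = [math.log(p) for p in (0.10, 0.05, 0.20, 0.10)]
domain_model = [math.log(p) for p in (0.40, 0.30, 0.50, 0.40)]

if __name__ == "__main__":
    print("general-purpose:", perplexity(general_model))
    print("domain-specific:", perplexity(domain_model))
```

A model that assigns higher probability to the tokens that actually occur gets a lower perplexity, which is why a domain-adapted model is expected to score lower on in-domain text.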
In conclusion, these models will be a more suitable solution both for our Intent Recognition Engine (on top of which intent-based apps and AI agents capable of transacting on-chain can be built) and for innovative use cases built by other devs, who can fine-tune them for their specific needs.