Nicholas Kluge

Aira-Instruct 🤗

We have just made available an enhanced version of our language model, Aira. Aira has gone through several iterations, from closed-domain chatbots to open-domain chatbots tuned via instruction tuning and RLHF (Reinforcement Learning from Human Feedback).

This new version, Aira-Instruct, is a series of generative language models, from 124M to 1.7B parameters, available in Portuguese and English.

We also make available two reward models (used in RLHF): one created to evaluate the quality of our model's generations (RewardModel), and another to help detect and control toxicity in model generations (ToxicityModel). Both models are available in Portuguese and English.

The datasets used for training all the models mentioned, along with the training implementation, are also available on Hugging Face. 🤗

The Aira-Instruct series was developed to help researchers explore the challenges related to the alignment problem. Since these are small models (up to 1.7 billion parameters), they can be reproduced by individual researchers at a relatively low cost (~R$250.00).

Test our demo on AIRES Playground or Hugging Face!

The models and datasets were developed as part of Nicholas Kluge's Ph.D. thesis, "Dynamic Normativity: Necessary and Sufficient Conditions for Outer Alignment." This research is funded by CNPq (Conselho Nacional de Desenvolvimento Científico e Tecnológico), FAPERGS (Fundação de Amparo à Pesquisa do Estado do Rio Grande do Sul), DAAD (Deutscher Akademischer Austauschdienst), PUCRS (Pontifícia Universidade Católica do Rio Grande do Sul), and the University of Bonn.
