Under the hood
Al-Dahle says that getting LLaMA 2 ready for launch required a lot of tweaking to make the model safer and less likely to produce toxic falsehoods than its predecessor.
Meta has many past gaffes to learn from. Its Galactica language model for science was taken offline after just three days, and its earlier LLaMA model, which was intended for research purposes only, was leaked online, prompting criticism from politicians who questioned whether Meta was adequately considering the risks associated with AI language models, such as misinformation and harassment.
To mitigate the risk of repeating these mistakes, Meta applied a combination of machine learning techniques aimed at improving usefulness and safety.
Meta’s approach to training LLaMA 2 had more steps than is typical for generative AI models, says Sasha Luccioni, a researcher at AI startup Hugging Face.
The model was trained on 40% more data than its predecessor. Al-Dahle says there were two sources of training data: data that was scraped from the web, and a dataset that was fine-tuned and tweaked based on feedback from human annotators so that the model behaves in a more desirable way. The company says it didn’t use Meta user data in LLaMA 2 and excluded data from sites it knew had a lot of personal information.
Despite this, LLaMA 2 still produces offensive, harmful and otherwise problematic language, just like rival models. Meta says it didn’t remove the toxic data from the dataset, because leaving it in could help LLaMA 2 better detect hate speech, and removing it could risk accidentally filtering out some demographic groups.
However, Meta’s commitment to openness is exciting, Luccioni says, because it allows researchers like her to properly study the biases, ethics and efficiency of AI models.
The fact that LLaMA 2 is an open-source model will also allow outside researchers and developers to probe it for security flaws, which will make it safer than proprietary models, Al-Dahle says.
Liang agrees. “I’m really excited to try things out and I think it’s going to be beneficial to the community,” he says.