The algorithms that underpin artificial intelligence systems like ChatGPT cannot learn as they are used, forcing tech companies to spend billions of dollars training new models from scratch. This has worried the industry for some time, and new research suggests the problem is inherent to how the models are designed, though there may be a solution.
Most AI today is built on so-called neural networks, which are inspired by how the brain works and made up of processing units called artificial neurons. Typically, an AI goes through distinct stages during its development: first, the AI is trained, with an algorithm fine-tuning its artificial neurons to better reflect a particular dataset. Then the AI can be used to respond to new data, such as the text prompts entered into ChatGPT. However, once a model's neurons are set in the training phase, they can no longer be updated to learn from new data.
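The two stages above can be illustrated with a toy example. The sketch below (an assumed illustration, not the researchers' setup) trains a single artificial "neuron" with one weight by gradient descent, then freezes it for inference, after which new inputs only produce outputs and the weight never changes again.

```python
import numpy as np

# Training phase: fit a single weight w so that w * x approximates y = 2x.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=100)
y = 2.0 * X

w = 0.0          # the neuron's single weight, initially untrained
lr = 0.1
for _ in range(200):
    # Gradient of mean squared error with respect to w
    grad = np.mean(2 * (w * X - y) * X)
    w -= lr * grad

# Inference phase: w is fixed; the model can answer queries but not learn.
def predict(x):
    return w * x

print(round(w, 3))   # w has converged close to the true value of 2.0
```

Incorporating genuinely new data at this point would mean going back to the training loop, which for a large model means repeating the expensive process from scratch.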
This means that most large AI models need to be retrained when new data becomes available, which can be very costly, especially when the new dataset represents a large portion of the entire internet.
Researchers have wondered whether these models might be able to incorporate new knowledge after initial training, reducing costs, but it was unclear whether this was possible.
Now, Shivhansh Dohare and his colleagues at the University of Alberta in Canada have tested whether the most common AI models can be adapted to learn continually. The team found that, when exposed to new data, a huge number of artificial neurons become stuck at a value of zero, causing the models to quickly lose the ability to learn new things.
“If you think of it like a brain, it's like 90 percent of the neurons are dead,” says Dohare. “You don't have enough neurons to learn with.”
Dohare and his team started by training their AI systems on the ImageNet database, which consists of 14 million labelled images of simple objects such as houses and cats. But instead of training the AI once and then testing it multiple times on the task of distinguishing between two images, as is the standard approach, they retrained the model for each new image pair.
The researchers tested different learning algorithms in this way and found that after thousands of retraining cycles, the networks were unable to learn and their performance deteriorated, with many neurons becoming “dead” – that is, having a value of zero.
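The diagnostic the researchers describe can be sketched as follows: a hidden unit using a ReLU activation is "dead" when it outputs zero for every input, because gradients then no longer flow through it and it can no longer contribute to learning. The toy network and the deliberately broken biases below are assumptions for illustration.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def dead_units(W, b, X):
    """Boolean mask of hidden units that output zero for every sample in X."""
    H = relu(X @ W + b)           # activations, shape (n_samples, n_units)
    return np.all(H == 0.0, axis=0)

rng = np.random.default_rng(1)
X = rng.normal(size=(256, 4))     # a batch of inputs

W = rng.normal(size=(4, 8))       # one hidden layer of 8 ReLU units
b = np.zeros(8)
b[:3] = -100.0                    # push three units so far negative they never fire

mask = dead_units(W, b, X)
print(int(mask.sum()))            # 3 of the 8 units are dead
```

In the paper's experiments the units are not sabotaged by hand, of course; they drift into this state over thousands of retraining cycles, and once there they are unrecoverable by ordinary gradient descent.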
The team also trained an AI to control a simulated ant learning to walk, using reinforcement learning, a common technique that teaches an AI what success looks like and lets it figure out the rules through trial and error. They tried to adapt this technique to allow for continual learning by retraining the algorithm to walk on different surfaces, but found this also led to a significant decrease in learning ability.
The problem is inherent to the way these systems learn, says Dohare, but there is a workaround: the researchers developed an algorithm that randomly revives some neurons after each training round, which seems to mitigate the performance degradation. “When [a neuron] dies, you just bring it back to life,” says Dohare, “and now it can learn again.”
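The workaround can be sketched by extending the dead-unit check: after each training round, any unit found dead gets its incoming weights and bias re-randomised so that gradients can reach it again. This is a hedged simplification of the paper's approach; the function name, the reinitialisation scale and the toy setup are all assumptions.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def revive_dead_units(W, b, X, rng):
    """Reinitialise the weights of any hidden unit that is dead on batch X.

    Returns the number of units that were revived.
    """
    H = relu(X @ W + b)
    dead = np.all(H == 0.0, axis=0)
    n_dead = int(dead.sum())
    if n_dead:
        # Fresh small random weights and a zero bias let the unit fire again.
        W[:, dead] = rng.normal(scale=0.1, size=(W.shape[0], n_dead))
        b[dead] = 0.0
    return n_dead

rng = np.random.default_rng(2)
X = rng.normal(size=(256, 4))
W = rng.normal(size=(4, 8))
b = np.zeros(8)
b[:3] = -100.0                        # three units start out dead

n_first = revive_dead_units(W, b, X, rng)   # finds and revives the dead units
n_second = revive_dead_units(W, b, X, rng)  # a second pass finds none left
print(n_first, n_second)
```

Running the check twice shows the effect: the first pass revives the stuck units, and the second pass finds nothing to fix, so the network keeps its full complement of trainable neurons.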
The algorithm seems promising, but it needs to be tested on larger systems before we can be confident it will be useful, says Mark van der Wilk at the University of Oxford.
“Solving continuous learning is literally a billion-dollar problem,” he says. “If you have a true comprehensive solution that allows you to continuously update your models, you can dramatically reduce the cost of training these models.”
Source: www.newscientist.com