Neural Networks Warehouse
Artificial Intelligence Depot
backprop 'stability-plasticity' dilemma - any simple solutions ??
 
• backprop 'stability-plasticity' dilemma - any simple solutions ??

First of all, I'm a fairly non-academic engineer, so I'll probably not describe this correctly - so I apologise in advance for talking drivel !

Anyway, I have a small backprop-trained NN that works well after being trained with data collected from an industrial process. However, after it has been in use for a while, I have some new data I wish to add to its training set. When I tried to do this by simply training with just the new data (starting from the previously trained weights) it *forgot* all the previous stuff.

I am led to believe that this is a well-known phenomenon called the 'stability-plasticity' dilemma. I've researched it a little and have found papers on this and on 'catastrophic forgetting' (excellent name, I think) . . . the papers seem to be quite complex and don't seem to offer a simple solution to this.

Worst case I can add the new training set to the original training sets and retrain, but this will of course get slower and slower as new sets are added. My question is, is there a simple solution to this, which will allow me to retrain with just the new set, but retain the previous training ??

thanks, Gav

1 posts.
Friday 26 September, 05:27
• Neural network capacity (and consequent "forgetting")

You may combine the new training data with the old to have the neural network learn all of it. As you mention, this will increase resource requirements over time. It may be possible to reduce this cost by sub-sampling the old data, or by changing the neural architecture (some modeling systems are better at incremental learning than others). Ultimately, though, one must realize that as more data is collected, more is being expected of the neural network; it seems only reasonable that resource requirements would rise.
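The sub-sampling idea could be sketched roughly as below. This is only an illustration, not anyone's published method: the function name `build_rehearsal_set`, the parameter `n_old`, and the (input, target) pair layout are all assumptions made up for the example. It mixes every new example with a fixed-size random sample of the old ones, so the retraining set stays bounded however much old data has piled up.

```python
import random

def build_rehearsal_set(old_data, new_data, n_old, seed=0):
    """Mix all new examples with a random sub-sample of old ones.

    old_data, new_data: sequences of (input, target) pairs (assumed layout).
    n_old: how many old examples to replay (hypothetical tuning knob).
    """
    rng = random.Random(seed)
    old_list = list(old_data)
    # Never ask for more old examples than actually exist.
    k = min(n_old, len(old_list))
    replay = rng.sample(old_list, k)
    mixed = list(new_data) + replay
    # Shuffle so old and new examples are interleaved during training.
    rng.shuffle(mixed)
    return mixed

# Usage with made-up data: 1000 old pairs, 50 new pairs, replay 100 old ones.
old = [([i], [0]) for i in range(1000)]
new = [([i], [1]) for i in range(50)]
training_set = build_rehearsal_set(old, new, n_old=100)
```

How many old examples are enough to hold on to the earlier behaviour depends on the problem, so `n_old` would need to be found by experiment.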

249 posts.
Friday 26 September, 09:38
• SPD and catastrophic forgetting

Robins addressed this problem by using pseudo-rehearsal. That is, after he trained a network to criterion, he turned off the learning and fed it random input vectors (each element 0 or 1 at random) to generate outputs that reflected the current state of weight space. He then blended those random input vectors and the corresponding network outputs (as I/O teaching pairs) into the new data set.
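A minimal sketch of that recipe, as I read the description above, might look like this. The names `make_pseudo_items`, `blend` and `frozen_net` are invented for the example, and `predict` stands in for whatever forward-pass function your own network exposes with learning switched off; none of this is taken from Robins' code.

```python
import random

def make_pseudo_items(predict, n_inputs, n_items, seed=0):
    """Build (random input, current network output) teaching pairs.

    predict: callable mapping an input list to an output list; assumed to be
             the already-trained network's forward pass, with learning off.
    """
    rng = random.Random(seed)
    items = []
    for _ in range(n_items):
        # Random binary input vector: each element is 0 or 1.
        x = [rng.randint(0, 1) for _ in range(n_inputs)]
        # Record what the network says *now*, before any retraining.
        y = list(predict(x))
        items.append((x, y))
    return items

def blend(new_data, pseudo_items, seed=0):
    """Mix the new teaching pairs with the pseudo-items and shuffle them."""
    rng = random.Random(seed)
    mixed = list(new_data) + list(pseudo_items)
    rng.shuffle(mixed)
    return mixed

# Stand-in for a trained network (purely illustrative, not a real NN).
frozen_net = lambda x: [sum(x) / len(x)]

pseudo = make_pseudo_items(frozen_net, n_inputs=4, n_items=20)
new_pairs = [([1, 0, 1, 0], [1.0]), ([0, 1, 0, 1], [0.0])]
training_set = blend(new_pairs, pseudo)
```

The pseudo-items have to be generated before retraining starts, since they are meant to capture what the old weights compute. No stored copy of the original training data is needed, which is the attraction for Gav's case.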

1 posts.
Sunday 01 October, 13:53