- See Also
- Links
- “LoRA vs Full Fine-Tuning: An Illusion of Equivalence”, Shuttleworth et al 2024
- “Investigating Learning-Independent Abstract Reasoning in Artificial Neural Networks”, Barak & Loewenstein 2024
- “How Do Large Language Models Acquire Factual Knowledge During Pretraining?”, Chang et al 2024
- “Is Model Collapse Inevitable? Breaking the Curse of Recursion by Accumulating Real and Synthetic Data”, Gerstgrasser et al 2024
- “Simple and Scalable Strategies to Continually Pre-Train Large Language Models”, Ibrahim et al 2024
- “Online Adaptation of Language Models With a Memory of Amortized Contexts (MAC)”, Tack et al 2024
- “When Scaling Meets LLM Finetuning: The Effect of Data, Model and Finetuning Method”, Zhang et al 2024
- “Investigating Continual Pretraining in Large Language Models: Insights and Implications”, Yıldız et al 2024
- “RAG vs Fine-Tuning: Pipelines, Tradeoffs, and a Case Study on Agriculture”, Balaguer et al 2024
- “LLaMA Pro: Progressive LLaMA With Block Expansion”, Wu et al 2024
- “Large Language Models Relearn Removed Concepts”, Lo et al 2024
- “Language Model Alignment With Elastic Reset”, Noukhovitch et al 2023
- “In-Context Pretraining (ICP): Language Modeling Beyond Document Boundaries”, Shi et al 2023
- “Loss of Plasticity in Deep Continual Learning (Continual Backpropagation)”, Dohare et al 2023
- “Continual Diffusion: Continual Customization of Text-To-Image Diffusion With C-LoRA”, Smith et al 2023
- “Understanding Plasticity in Neural Networks”, Lyle et al 2023
- “The Forward-Forward Algorithm: Some Preliminary Investigations”, Hinton 2022
- “Broken Neural Scaling Laws”, Caballero et al 2022
- “Exclusive Supermask Subnetwork Training for Continual Learning”, Yadav & Bansal 2022
- “Learn the Time to Learn: Replay Scheduling in Continual Learning”, Klasson et al 2022
- “On the Effectiveness of Compact Biomedical Transformers (✱BioBERT)”, Rohanian et al 2022
- “Don’t Stop Learning: Towards Continual Learning for the CLIP Model”, Ding et al 2022
- “Fleet-DAgger: Interactive Robot Fleet Learning With Scalable Human Supervision”, Hoque et al 2022
- “Task-Agnostic Continual Reinforcement Learning: In Praise of a Simple Baseline (3RL)”, Caccia et al 2022
- “CT0: Fine-Tuned Language Models Are Continual Learners”, Scialom et al 2022
- “Memorization Without Overfitting: Analyzing the Training Dynamics of Large Language Models”, Tirumala et al 2022
- “Continual Pre-Training Mitigates Forgetting in Language and Vision”, Cossu et al 2022
- “Continual Learning With Foundation Models: An Empirical Study of Latent Replay”, Ostapenko et al 2022
- “DualPrompt: Complementary Prompting for Rehearsal-Free Continual Learning”, Wang et al 2022
- “Effect of Scale on Catastrophic Forgetting in Neural Networks”, Ramasesh et al 2022
- “The Dual Form of Neural Networks Revisited: Connecting Test Time Predictions to Training Patterns via Spotlights of Attention”, Irie et al 2022
- “Learning to Prompt for Continual Learning”, Wang et al 2021
- “An Empirical Investigation of the Role of Pre-Training in Lifelong Learning”, Mehta et al 2021
- “The Geometry of Representational Drift in Natural and Artificial Neural Networks”, Aitken et al 2021
- “Wide Neural Networks Forget Less Catastrophically”, Mirzadeh et al 2021
- “Lifelong Pretraining: Continually Adapting Language Models to Emerging Corpora”, Jin et al 2021
- “Continuous Coordination As a Realistic Scenario for Lifelong Learning”, Nekoei et al 2021
- “Inductive Biases for Deep Learning of Higher-Level Cognition”, Goyal & Bengio 2020
- “Learning from the Past: Meta-Continual Learning With Knowledge Embedding for Jointly Sketch, Cartoon, and Caricature Face Recognition”, Zheng et al 2020b
- “Meta-Learning through Hebbian Plasticity in Random Networks”, Najarro & Risi 2020
- “Learning to Learn With Feedback and Local Plasticity”, Lindsey & Litwin-Kumar 2020
- “Understanding the Role of Training Regimes in Continual Learning”, Mirzadeh et al 2020
- “Don’t Stop Pretraining: Adapt Language Models to Domains and Tasks”, Gururangan et al 2020
- “Never Stop Learning: The Effectiveness of Fine-Tuning in Robotic Reinforcement Learning”, Julian et al 2020
- “On Warm-Starting Neural Network Training”, Ash & Adams 2019
- “Gated Linear Networks”, Veness et al 2019
- “Learning and Evaluating General Linguistic Intelligence”, Yogatama et al 2019
- “Self-Net: Lifelong Learning via Continual Self-Modeling”, Camp et al 2018
- “Unicorn: Continual Learning With a Universal, Off-Policy Agent”, Mankowitz et al 2018
- “Meta Networks”, Munkhdalai & Yu 2017
- “PathNet: Evolution Channels Gradient Descent in Super Neural Networks”, Fernando et al 2017
- “Overcoming Catastrophic Forgetting in Neural Networks”, Kirkpatrick et al 2016
- “Repeat Before Forgetting: Spaced Repetition for Efficient and Effective Training of Neural Networks”
- “Can LLMs Learn from a Single Example?”
- Sort By Magic
- Miscellaneous
- Bibliography
See Also
Links
“LoRA vs Full Fine-Tuning: An Illusion of Equivalence”, Shuttleworth et al 2024
“Investigating Learning-Independent Abstract Reasoning in Artificial Neural Networks”, Barak & Loewenstein 2024
“How Do Large Language Models Acquire Factual Knowledge During Pretraining?”, Chang et al 2024
“Is Model Collapse Inevitable? Breaking the Curse of Recursion by Accumulating Real and Synthetic Data”, Gerstgrasser et al 2024
“Simple and Scalable Strategies to Continually Pre-Train Large Language Models”, Ibrahim et al 2024
“Online Adaptation of Language Models With a Memory of Amortized Contexts (MAC)”, Tack et al 2024
“When Scaling Meets LLM Finetuning: The Effect of Data, Model and Finetuning Method”, Zhang et al 2024
“Investigating Continual Pretraining in Large Language Models: Insights and Implications”, Yıldız et al 2024
“RAG vs Fine-Tuning: Pipelines, Tradeoffs, and a Case Study on Agriculture”, Balaguer et al 2024
“LLaMA Pro: Progressive LLaMA With Block Expansion”, Wu et al 2024
“Large Language Models Relearn Removed Concepts”, Lo et al 2024
“Language Model Alignment With Elastic Reset”, Noukhovitch et al 2023
“In-Context Pretraining (ICP): Language Modeling Beyond Document Boundaries”, Shi et al 2023
“Loss of Plasticity in Deep Continual Learning (Continual Backpropagation)”, Dohare et al 2023
“Continual Diffusion: Continual Customization of Text-To-Image Diffusion With C-LoRA”, Smith et al 2023
“Understanding Plasticity in Neural Networks”, Lyle et al 2023
“The Forward-Forward Algorithm: Some Preliminary Investigations”, Hinton 2022
“Broken Neural Scaling Laws”, Caballero et al 2022
“Exclusive Supermask Subnetwork Training for Continual Learning”, Yadav & Bansal 2022
“Learn the Time to Learn: Replay Scheduling in Continual Learning”, Klasson et al 2022
“On the Effectiveness of Compact Biomedical Transformers (✱BioBERT)”, Rohanian et al 2022
“Don’t Stop Learning: Towards Continual Learning for the CLIP Model”, Ding et al 2022
“Fleet-DAgger: Interactive Robot Fleet Learning With Scalable Human Supervision”, Hoque et al 2022
“Task-Agnostic Continual Reinforcement Learning: In Praise of a Simple Baseline (3RL)”, Caccia et al 2022
“CT0: Fine-Tuned Language Models Are Continual Learners”, Scialom et al 2022
“Memorization Without Overfitting: Analyzing the Training Dynamics of Large Language Models”, Tirumala et al 2022
“Continual Pre-Training Mitigates Forgetting in Language and Vision”, Cossu et al 2022
“Continual Learning With Foundation Models: An Empirical Study of Latent Replay”, Ostapenko et al 2022
“DualPrompt: Complementary Prompting for Rehearsal-Free Continual Learning”, Wang et al 2022
“Effect of Scale on Catastrophic Forgetting in Neural Networks”, Ramasesh et al 2022
“The Dual Form of Neural Networks Revisited: Connecting Test Time Predictions to Training Patterns via Spotlights of Attention”, Irie et al 2022
“Learning to Prompt for Continual Learning”, Wang et al 2021
“An Empirical Investigation of the Role of Pre-Training in Lifelong Learning”, Mehta et al 2021
“The Geometry of Representational Drift in Natural and Artificial Neural Networks”, Aitken et al 2021
“Wide Neural Networks Forget Less Catastrophically”, Mirzadeh et al 2021
“Lifelong Pretraining: Continually Adapting Language Models to Emerging Corpora”, Jin et al 2021
“Continuous Coordination As a Realistic Scenario for Lifelong Learning”, Nekoei et al 2021
“Inductive Biases for Deep Learning of Higher-Level Cognition”, Goyal & Bengio 2020
“Learning from the Past: Meta-Continual Learning With Knowledge Embedding for Jointly Sketch, Cartoon, and Caricature Face Recognition”, Zheng et al 2020b
“Meta-Learning through Hebbian Plasticity in Random Networks”, Najarro & Risi 2020
“Learning to Learn With Feedback and Local Plasticity”, Lindsey & Litwin-Kumar 2020
“Understanding the Role of Training Regimes in Continual Learning”, Mirzadeh et al 2020
“Don’t Stop Pretraining: Adapt Language Models to Domains and Tasks”, Gururangan et al 2020
“Never Stop Learning: The Effectiveness of Fine-Tuning in Robotic Reinforcement Learning”, Julian et al 2020
“On Warm-Starting Neural Network Training”, Ash & Adams 2019
“Gated Linear Networks”, Veness et al 2019
“Learning and Evaluating General Linguistic Intelligence”, Yogatama et al 2019
“Self-Net: Lifelong Learning via Continual Self-Modeling”, Camp et al 2018
“Unicorn: Continual Learning With a Universal, Off-Policy Agent”, Mankowitz et al 2018
“Meta Networks”, Munkhdalai & Yu 2017
“PathNet: Evolution Channels Gradient Descent in Super Neural Networks”, Fernando et al 2017
“Overcoming Catastrophic Forgetting in Neural Networks”, Kirkpatrick et al 2016
“Repeat Before Forgetting: Spaced Repetition for Efficient and Effective Training of Neural Networks”
“Can LLMs Learn from a Single Example?”
Sort By Magic
Annotations sorted by machine learning into inferred 'tags'. This provides an alternative way to browse: instead of by date order, one can browse in topic order. The 'sorted' list has been automatically clustered into multiple sections & auto-labeled for easier browsing.
Beginning with the newest annotation, it uses the embedding of each annotation to attempt to create a list of nearest-neighbor annotations, creating a progression of topics. For more details, see the link.
- representational-drift
- lifelong-learning
- adaptive-learning
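As a rough illustration of the nearest-neighbor ordering described above (a minimal sketch, not the site's actual implementation): starting from the newest annotation, greedily chain each annotation to the most cosine-similar unused one, given precomputed annotation embeddings. The `topic_progression` helper and the random stand-in embeddings are hypothetical.

```python
# Sketch only: greedy nearest-neighbor chaining over annotation embeddings,
# beginning with the newest annotation, to yield a "progression of topics".
# Assumes `embeddings` is an (n, d) array of precomputed embeddings, newest first;
# the embedding model and the clustering/auto-labeling of sections are not shown.
import numpy as np

def topic_progression(embeddings: np.ndarray) -> list[int]:
    """Return annotation indices ordered by greedy cosine-similarity chaining."""
    # L2-normalize rows so dot products are cosine similarities.
    normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    order = [0]                                  # start from the newest annotation
    remaining = set(range(1, len(normed)))
    while remaining:
        current = normed[order[-1]]
        # Append the unused annotation most similar to the current one.
        nxt = max(remaining, key=lambda i: float(current @ normed[i]))
        order.append(nxt)
        remaining.remove(nxt)
    return order

# Usage with random stand-in embeddings (real usage would embed annotation text):
rng = np.random.default_rng(0)
print(topic_progression(rng.normal(size=(5, 16))))
```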
Miscellaneous
Bibliography
- https://arxiv.org/abs/2401.08406#microsoft : “RAG vs Fine-Tuning: Pipelines, Tradeoffs, and a Case Study on Agriculture”, Balaguer et al 2024
- https://arxiv.org/abs/2312.07551 : “Language Model Alignment With Elastic Reset”, Noukhovitch et al 2023
- https://arxiv.org/abs/2206.14349 : “Fleet-DAgger: Interactive Robot Fleet Learning With Scalable Human Supervision”, Hoque et al 2022
- https://arxiv.org/abs/2205.12393 : “CT0: Fine-Tuned Language Models Are Continual Learners”, Scialom et al 2022
- https://arxiv.org/abs/2110.11526#deepmind : “Wide Neural Networks Forget Less Catastrophically”, Mirzadeh et al 2021