Week 1
History of Deep Learning and AI
- Neural Foundations: Early debates about the structure of the nervous system produced two competing theories: Reticular Theory (a single continuous network) and the Neuron Doctrine (discrete cells). The term "neuron" was coined, and the later confirmation of synapses supported the Neuron Doctrine by showing that separate neurons communicate across junctions.
- Early AI: The McCulloch-Pitts (MP) neuron was a simplified model of the biological neuron. The Perceptron added a learning algorithm for pattern recognition, and the first multilayer perceptrons extended this idea.
- AI Winter: Minsky and Papert highlighted the limitations of perceptrons, notably that a single-layer perceptron cannot represent XOR, which contributed to reduced interest and funding in neural networks.
- Key Algorithms: Backpropagation enabled efficient computation of gradients in multi-layer networks, and gradient descent used those gradients to optimize the network parameters. The Universal Approximation Theorem showed that a network with enough hidden units can, in principle, approximate any continuous function on a compact domain to arbitrary accuracy.
- Deep Revival: Unsupervised pre-training allowed for effective training of deep networks. Geoffrey Hinton's work reignited interest in deep learning.
- Recognition Breakthroughs: Deep learning matched or approached human-level performance on benchmarks for handwriting recognition, MNIST digit classification, speech recognition, and traffic sign identification.
- Computer Vision: The ImageNet competition drove rapid progress in image recognition, with Convolutional Neural Networks (CNNs) becoming dominant in computer vision tasks.
- Sequence Processing: Recurrent architectures such as Hopfield networks (associative memory) and Jordan and Elman networks (temporal data) were developed. Long Short-Term Memory (LSTM) addressed the difficulty of learning long-term dependencies in sequences.
- Seq2Seq Models: These models enabled end-to-end learning for tasks like translation. The introduction of the attention mechanism significantly improved performance.
- Transformers: This architecture, using self-attention, revolutionised NLP. Transformers became the basis for models like GPT (generative) and BERT (bidirectional encoding).
- Large Language Models: GPT-3, with 175 billion parameters, demonstrated emergent abilities such as few-shot learning. The trend is towards increasingly large models with billions or even trillions of parameters.
- AI in Games: Deep reinforcement learning mastered Atari games, and AlphaGo beat a world champion Go player. AI then advanced to more complex games such as Poker (DeepStack), Dota 2 (OpenAI Five), and StarCraft II (AlphaStar).
- Machine Translation: Evolved from manual rule-based systems to data-driven statistical methods, then to neural networks, dramatically improving translation quality.
- Generative AI: Techniques such as Variational Autoencoders (VAEs), Generative Adversarial Networks (GANs), and diffusion models enabled AI to generate realistic content. DALL-E generates images from text descriptions.
- Challenges: The "Clever Hans" effect in AI raised concerns about whether models truly understand a task or merely exploit spurious cues in the data. Bias in systems such as facial recognition highlighted fairness problems. Training large models has a significant environmental impact.
- Recent Focus: Current efforts include explainable AI, which aims to make model decisions understandable; "Green AI", which aims to reduce computational cost; and growing regulation intended to ensure ethical AI development.
- Emerging Tech: Analog AI explores using programmable resistors instead of digital transistors for more efficient AI hardware.
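
The perceptron and XOR points in the list above can be made concrete with a small sketch. The code below is an illustrative assumption, not taken from the course material: it uses a threshold unit that fires when the weighted sum is at least zero, the classic perceptron update rule, and hypothetical helper names (`train_perceptron`, `accuracy`, `xor_two_layer`). It trains a single perceptron on AND (linearly separable) and on XOR (not linearly separable), then shows that two layers of hand-wired threshold units can compute XOR.

```python
# Truth tables as ((x1, x2), target) pairs.
AND_DATA = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR_DATA = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]


def step(z):
    """Threshold activation: fire (1) when the input is at least zero."""
    return 1 if z >= 0 else 0


def predict(w, b, x):
    """Output of a single threshold unit with weights w and bias b."""
    return step(w[0] * x[0] + w[1] * x[1] + b)


def train_perceptron(samples, epochs=50, lr=1.0):
    """Perceptron learning rule: nudge the weights on every mistake."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, target in samples:
            delta = target - predict(w, b, x)
            if delta != 0:
                w[0] += lr * delta * x[0]
                w[1] += lr * delta * x[1]
                b += lr * delta
                mistakes += 1
        if mistakes == 0:  # every sample classified correctly
            break
    return w, b


def accuracy(samples, w, b):
    """Fraction of samples the unit classifies correctly."""
    correct = sum(predict(w, b, x) == target for x, target in samples)
    return correct / len(samples)


def xor_two_layer(x1, x2):
    """XOR from hand-set threshold units: AND(OR(x1, x2), NAND(x1, x2))."""
    h_or = step(x1 + x2 - 1)
    h_nand = step(-x1 - x2 + 1)
    return step(h_or + h_nand - 2)


w_and, b_and = train_perceptron(AND_DATA)
w_xor, b_xor = train_perceptron(XOR_DATA)
print("AND accuracy:", accuracy(AND_DATA, w_and, b_and))
print("XOR accuracy:", accuracy(XOR_DATA, w_xor, b_xor))
print("Two-layer XOR:", [xor_two_layer(*x) for x, _ in XOR_DATA])
```

Under these assumptions the single unit learns AND exactly, but no choice of weights lets it classify all four XOR cases, since no straight line separates the two classes. Adding a hidden layer removes that limitation, which is the observation that later made training methods for multi-layer networks so important.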
Theoretical Concepts