Target Value Optimization in Neural Networks
When training a classification neural network, each output neuron represents a class. For example, in a 3-class problem (cat, dog, rat), a typical target vector for a cat image might be [1, 0, 0].
These values (class = 1, non-class = 0) are compared to the network's output to compute the loss and update the weights.
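However, fixed values like 1 and 0 may not always yield optimal learning. A sigmoid output only approaches 0 and 1 asymptotically, so these targets push the output neurons toward saturation, where gradients become very small. Adjusting the class and non-class target values (e.g., 0.8 and 0.2) keeps the targets reachable and can improve training performance (especially with activation functions like sigmoid) by enhancing gradient flow.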
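As a rough illustration of the idea, the sketch below builds hard and softened target vectors for the cat example and compares the slope of a sigmoid at those output levels. The helper names `make_target` and `sigmoid_slope_at_output` are hypothetical and are not taken from this repository's code.

```python
def make_target(num_classes, class_index, class_value=1.0, non_class_value=0.0):
    """Build a target vector: class_value at class_index, non_class_value elsewhere."""
    target = [non_class_value] * num_classes
    target[class_index] = class_value
    return target


def sigmoid_slope_at_output(y):
    """Derivative of the sigmoid, written in terms of its output y: y * (1 - y)."""
    return y * (1.0 - y)


classes = ["cat", "dog", "rat"]
cat = classes.index("cat")

hard_target = make_target(len(classes), cat)            # class = 1, non-class = 0
soft_target = make_target(len(classes), cat, 0.8, 0.2)  # class = 0.8, non-class = 0.2

# The closer a sigmoid output gets to 1 (or 0), the smaller its slope,
# so less gradient flows back through that neuron.
for y in (0.8, 0.99, 1.0):
    print(f"output {y}: sigmoid slope {sigmoid_slope_at_output(y):.4f}")
```

The slope vanishes entirely at an output of exactly 1, which is why softened targets are expected to help most with sigmoid-like activations.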
This project explores dynamically optimizing these values during training instead of using fixed constants. Early results show improvements in training speed and confidence, though gains in test accuracy are yet to be achieved.
📄 See /docs for more details.
TO-DO
Implementation
- Implement basic σ-adaptation logic (non-class value only)
- Switch to using one non-class value for each class instead of one global one
- Fix confidence calculation: Calculate cosine similarity of output vector to closest target vector (not necessarily the correct target)
- Log current class/non-class values
- Set up structured experiments and logging
- Build a simple frontend for selecting the method, starting and stopping training, and displaying the results in a graph (maybe use TensorBoard or Weights & Biases)
- Find new ways of initializing class and non-class values
- Uniform init (poor performance)
- Soft target init
- Implement initial nudge for network preference init
- Try taking all the initial preferences and remapping them to use the full (0,1) range
- Try pushing after each batch
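The confidence item in the list above (cosine similarity of the output vector to the closest target vector) could be sketched as follows. This is a minimal pure-Python version under stated assumptions: the function names `cosine_similarity` and `confidence` are hypothetical, and returning 0.0 for a zero-length vector is a choice made here, not something the project specifies.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumption: treat an all-zero vector as zero confidence
    return dot / (norm_a * norm_b)


def confidence(output, targets):
    """Return (similarity, index) for the target vector closest to the output.

    The closest target is used even if it is not the correct class.
    """
    similarities = [cosine_similarity(output, t) for t in targets]
    best = max(range(len(targets)), key=lambda i: similarities[i])
    return similarities[best], best


# One target vector per class, using class = 0.8 and non-class = 0.2.
targets = [
    [0.8, 0.2, 0.2],  # cat
    [0.2, 0.8, 0.2],  # dog
    [0.2, 0.2, 0.8],  # rat
]

score, closest = confidence([0.7, 0.3, 0.2], targets)
print(f"closest target index: {closest}, confidence: {score:.3f}")
```

Because the closest target is chosen without looking at the label, this measures how decisively the network commits to some class, separately from whether that class is right.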
Algorithm Design
- Refine the σ-adaptation strategy (tuning, edge cases)
- Explore and prototype additional adaptation methods
Research & Evaluation
- Define evaluation metrics beyond accuracy/loss (e.g. training speed, confidence margin)
- Analyze, visualize, and summarize results
- Draft and outline the research paper
- Finish paper