Robustness is Important:
- Capacity and Trainability in Recurrent Neural Networks:
- energy spent in training?
- analogy: a Camry gets you from SF to LA
- robustness: common RNN architectures achieve the same per-task capacity
- Camry in 10 -> 100 -> 1,000 -> 10,000
Stochastic Gradient Methods:
- goal: minimize a function f(x) = E[f(x; S)] over x
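A minimal sketch of what "minimize a function" looks like for the stochastic gradient method, on a synthetic least-squares problem (the data, the stepsize choice α_k = 0.1/√k, and the iteration count are my own illustrative choices, not from the notes):

```python
import numpy as np

rng = np.random.default_rng(0)

# synthetic least-squares problem: f(x) = E[(a^T x - b)^2 / 2]
n, d = 1000, 5
A = rng.normal(size=(n, d))
x_star = rng.normal(size=d)
b = A @ x_star + 0.1 * rng.normal(size=n)

x = np.zeros(d)
for k in range(1, 5001):
    i = rng.integers(n)              # sample one data point
    g = (A[i] @ x - b[i]) * A[i]     # stochastic gradient of f(x; (a_i, b_i))
    x -= (0.1 / np.sqrt(k)) * g      # decaying stepsize alpha_k = alpha_0 / sqrt(k)

print(np.linalg.norm(x - x_star))    # small: the iterate approaches x_star
```

Each step uses only one randomly drawn example, so the per-iteration cost is independent of n; the decaying stepsizes average out the gradient noise.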
Weakly Convex Functions:
Why do we use this class?
- easy to analyze
- implemented in default packages?
- works in practice?
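The notes do not define the class; the standard definition from the weak-convexity literature is:

```latex
% f is \rho-weakly convex if adding a quadratic makes it convex:
f \ \text{is } \rho\text{-weakly convex}
\iff
x \mapsto f(x) + \frac{\rho}{2}\,\|x\|_2^2 \ \text{is convex.}
```

In particular, every convex function is 0-weakly convex, and smooth functions with ρ-Lipschitz gradient are ρ-weakly convex, which is why the class is convenient to analyze.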
Linear Regression:
- U-shaped performance curves for the algorithms (SGM, truncated, prox)
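For a single example f(x; (a, b)) = ½(aᵀx − b)², the three updates being compared have closed forms. This is my reconstruction of the standard updates (the truncated step uses the lower bound f ≥ 0; the proximal-point step has a closed form for least squares), not something stated in the notes:

```latex
% stochastic (sub)gradient step, with g_k = (a^\top x_k - b)\, a
x_{k+1} = x_k - \alpha_k g_k
% truncated-model step: never steps past the lower bound f \ge 0
x_{k+1} = x_k - \min\Bigl\{\alpha_k,\ \tfrac{f(x_k; (a,b))}{\|g_k\|_2^2}\Bigr\}\, g_k
% proximal-point step, closed form for least squares
x_{k+1} = x_k - \frac{\alpha_k\,(a^\top x_k - b)}{1 + \alpha_k \|a\|_2^2}\, a
```

Note that the truncated and proximal steps shrink automatically when the stepsize α_k is large, which is what flattens their sensitivity curves relative to plain SGM.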
Optimization Methods:
- how to solve optimization problems: build a good but simple local model of f
- minimize the model with regularization
Newton's Method:
- Taylor (second-order) model: f(y) ≈ f(x) + ∇f(x)^T (y - x) + (1/2)(y - x)^T ∇²f(x) (y - x)
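Minimizing the second-order Taylor model exactly at each iterate gives the Newton update x_{k+1} = x_k − [∇²f(x_k)]⁻¹ ∇f(x_k). A small sketch, using a made-up smooth test function (hypothetical, not from the notes):

```python
import numpy as np

def newton(grad, hess, x0, iters=20):
    # at each step, minimize the second-order Taylor model of f at x:
    # x <- x - [grad^2 f(x)]^{-1} grad f(x)
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        x = x - np.linalg.solve(hess(x), grad(x))
    return x

# hypothetical test function: f(x) = exp(x0) + x0^2 + x1^2
grad = lambda x: np.array([np.exp(x[0]) + 2 * x[0], 2 * x[1]])
hess = lambda x: np.array([[np.exp(x[0]) + 2.0, 0.0], [0.0, 2.0]])

x = newton(grad, hess, [1.0, 1.0])
print(x)  # stationary point: exp(x0) + 2*x0 = 0 and x1 = 0
```

Because the model uses curvature, convergence near the minimizer is quadratic, but each step requires solving a linear system in the Hessian.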
Composite Optimization Problems:
Modeling Composite Problems:
- build a convex model of the composite objective
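One standard convex model for a composite objective h(c(x)), with h convex and c smooth, is the prox-linear model; this is the usual construction from the composite-optimization literature, assumed here rather than stated in the notes:

```latex
f(x) = h(c(x)), \qquad
f_{x_k}(y) = h\bigl(c(x_k) + \nabla c(x_k)\,(y - x_k)\bigr)
% convex in y: h is convex and its argument is affine in y;
% the next iterate minimizes the regularized model:
x_{k+1} = \arg\min_y \ f_{x_k}(y) + \frac{1}{2\alpha_k}\,\|y - x_k\|_2^2
```

Linearizing only the smooth inner map c, while keeping h exact, yields a convex subproblem even though h(c(x)) itself is typically nonconvex (it is weakly convex under standard assumptions).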
Generic Optimization Methods:
aProx family for stochastic optimization:
Models in Stochastic Optimization:
- conditions on our models (convex case)
- lower bound
- local correctness
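The two listed conditions, plus convexity, can be written out as follows; this is my reconstruction of the model conditions in the aProx framework (hedged, since the notes only name them):

```latex
% f_x(\cdot; s) is the model of f(\cdot; s) built at the point x
\text{(i) convexity:} \quad y \mapsto f_x(y; s) \ \text{is convex}
\text{(ii) lower bound:} \quad f_x(y; s) \le f(y; s) \ \text{for all } y
\text{(iii) local correctness:} \quad f_x(x; s) = f(x; s)
```

The plain linear model of SGM satisfies (i) and (iii) but not (ii) in general; models that also lower-bound the true function cannot overshoot as badly, which is the source of the improved robustness to stepsize choice.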
Divergence of a gradient method
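The divergence is easy to see even on f(x) = x²/2 with a fixed stepsize that is too large, while a truncated-model (aProx-style) step caps the step length using the lower bound f ≥ 0 and stays stable. An illustrative sketch with my own numbers, not the talk's example:

```python
def sgm_step(x, g, alpha):
    # plain gradient step: diverges on f(x) = x^2/2 when alpha > 2
    return x - alpha * g

def truncated_step(x, fx, g, alpha):
    # truncated-model step: stepsize capped at fx / g^2, so the
    # model value never drops below the known lower bound f >= 0
    return x - min(alpha, fx / (g * g)) * g

# f(x) = x^2 / 2, gradient g = x (exact gradients, for clarity)
x_sgm = x_trunc = 10.0
alpha = 5.0  # too large for plain SGM: |1 - alpha| = 4 > 1
for _ in range(20):
    x_sgm = sgm_step(x_sgm, x_sgm, alpha)
    x_trunc = truncated_step(x_trunc, x_trunc**2 / 2, x_trunc, alpha)

print(abs(x_sgm))    # blows up geometrically: 10 * 4^20
print(abs(x_trunc))  # decays: the truncation caps the step at 1/2
```

Here the truncated step resolves to min(5, 1/2) = 1/2, so the iterate halves each step regardless of how large α is; the plain gradient iterate is multiplied by −4 each step.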
Conclusion:
- blind application of SGD is not the right answer
- care and better modeling can yield improved performance
- computational efficiency