Describe the task. It can be a feature, documentation, etc.
We noticed some variables would benefit from different learning rates + modded nano gpt runs also noticed embedding networks benefitting from separately tuned learning rates.
Hedgedoc URL, if you are keeping notes, plots, logs in hedgedoc.
No response
Area
Describe the task. It can be a feature, documentation, etc.
We noticed some variables would benefit from different learning rates + modded nano gpt runs also noticed embedding networks benefitting from separately tuned learning rates.
Hedgedoc URL, if you are keeping notes, plots, logs in hedgedoc.
No response
Area