GeistHaus

GitHub - meta-pytorch/torchforge: PyTorch-native post-training at scale

github.com

PyTorch-native post-training at scale. Contribute to meta-pytorch/torchforge development by creating an account on GitHub.

3 pages link to this URL
On the Viability of Fine-Tuning SLMs

Until recently, I held the opinion that training custom language models was inadvisable except in relatively rare cases. Needing only marginally better performance on your task was insufficient justification for customization; subsequent generations of models from foundation labs would inevitably catch up. Only when you had extreme latency constraints, specific tasks with low drift, and/or usage high enough to guarantee GPU utilization and ROI was it possible to justify the expected economics of managing the model lifecycle. And even then, a successful outcome would be unlikely because of the number of ways customization could fail. Success here means both that the custom model performs as well as or better than expected on in-domain tasks (e.g., better than available alternatives in accuracy and/or speed) and that the business achieves positive ROI as a result of training and deploying the custom model.

0 inbound links article en blog CC BY-NC 4.0