Authors: Chi Wang, Peiheng Hu
Publisher: O’Reilly Media, Inc.
Year: 2025-05-05
Pages: 124
Language: English
Format: pdf, epub
Size: 10.1 MB
Large language models (LLMs) are rapidly becoming the backbone of AI-driven applications. Without proper optimization, however, they can be expensive to run, slow to serve, and prone to performance bottlenecks. As demand for real-time AI applications grows, Hands-On Serving and Optimizing LLM Models offers a comprehensive guide to the complexities of deploying and optimizing LLMs at scale. In this hands-on book, authors Chi Wang and Peiheng Hu take a real-world approach backed by practical examples and code, and assemble essential strategies for designing robust infrastructure that can meet the demands of modern AI applications. Whether you're building high-performance AI systems or looking to deepen your knowledge of LLM optimization, this book aims to serve as a foundation for that work.

Although machine learning (ML) frameworks such as TensorFlow and PyTorch provide distinct APIs and libraries, the fundamental principles of model packaging and structure remain largely similar across them. In practice, additional optimizations are often applied to model files to improve serving performance, particularly for LLMs.