Title: Deep Reinforcement Learning with Python: RLHF for Chatbots and Large Language Models, 2nd Edition
Author: Nimish Sanghi
Publisher: Apress
Year: 2024
Pages: 650
Language: English
Format: pdf (true), epub (true)
Size: 34.1 MB
Gain a theoretical understanding of the most popular libraries in deep reinforcement learning (deep RL). This new edition focuses on the latest advances in deep RL using a learn-by-coding approach, allowing readers to assimilate and replicate the latest research in this field.
New agent environments ranging from games and robotics to finance are explained to help you try different ways to apply reinforcement learning. A chapter on multi-agent reinforcement learning covers how multiple agents cooperate or compete, while another chapter focuses on the widely used deep RL algorithm, proximal policy optimization (PPO). You'll see how reinforcement learning from human feedback (RLHF) has been used by chatbots built on Large Language Models, such as ChatGPT, to improve conversational capabilities.
You'll also review the steps for using the code on multiple cloud systems and deploying models on platforms such as Hugging Face Hub. The code is in Jupyter Notebook, which can be run on Google Colab and other similar deep learning cloud platforms, allowing you to tailor the code to your own needs.
This book is about reinforcement learning, taking readers from the basics through to advanced topics. Although this book assumes no prior knowledge of the field of reinforcement learning, it expects readers to be familiar with the basics of machine learning. Have you coded in Python? Are you comfortable working with libraries like NumPy and Scikit-learn? Have you heard of deep learning, and have you explored the basic building blocks of training simple models in PyTorch? You should answer yes to these questions to get the most out of this book. If not, I suggest you learn a bit about these concepts first. Nothing too deep: any introductory online tutorial or book from Apress on these topics will be sufficient.
In this second edition, I have made some major changes while keeping most of the content from the first edition. The main additions are related to the new developments in the field of Large Language Models (LLMs) and Multimodal Generative AI, which have revolutionized the world since late 2022. Reinforcement learning (RL) has played a crucial role in enabling this through Reinforcement Learning from Human Feedback (RLHF). This edition has a new chapter dedicated to this topic. It gives the reader a high-level overview of transformers, LLMs, and related topics like prompt engineering, Retrieval-Augmented Generation (RAG), parameter-efficient fine-tuning (PEFT), and chaining of LLMs and LLM-based auto agents, followed by a detailed explanation of the concept of RLHF. In the same chapter, you'll also explore Proximal Policy Optimization (PPO), a popular state-of-the-art RL-based algorithm that was used by OpenAI for the RLHF fine-tuning of ChatGPT.
Another addition is a chapter on multi-agent RL (MARL) and deep MARL (DMARL), which deals with more than one agent cooperating or competing in the same environment. In this chapter, I start with an introduction and go all the way to a working example. I limit the discussion to introducing the key concepts, enabling interested readers to follow specialized texts on MARL for further exploration.
This edition also covers additional topics, like hyperparameter tuning. It includes an overview of other topics like curiosity learning, the use of transformers in RL in various ways, emerging areas such as sample-efficient offline RL, decision transformers, automated curriculum learning, zero-shot RL, and various other advances in the field since the first edition. The chapter on deep Q-networks has been split into two to better organize the topic.
Whether it’s for applications in gaming, robotics, or Generative AI, Deep Reinforcement Learning with Python will help keep you ahead of the curve.
What You'll Learn:
Explore Python-based RL libraries, including StableBaselines3 and CleanRL
Work with diverse RL environments like Gymnasium, PyBullet, and Unity ML
Understand instruction finetuning of Large Language Models using RLHF and PPO
Study training and optimization techniques using Hugging Face, Weights and Biases, and Optuna
Who This Book Is For: Software engineers and machine learning developers eager to sharpen their understanding of deep RL and acquire practical skills in implementing RL algorithms from scratch.