Название: Modern Data Science with R, 2nd Edition Автор: Benjamin S. Baumer, Daniel T. Kaplan, Nicholas J. Horton Издательство: CRC Press Год: 2021 Страниц: 650 Язык: английский Формат: pdf (true) Размер: 83.4 MB
Modern Data Science with R is a comprehensive data science textbook for undergraduates that incorporates statistical and computational thinking to solve real-world data problems. Rather than focus exclusively on case studies or programming syntax, this book illustrates how statistical programming in the state-of-the-art R/RStudio computing environment can be leveraged to extract meaningful information from a variety of data in the service of addressing compelling questions.
The second edition is updated to reflect the growing influence of the tidyverse set of packages. All code in the book has been revised and styled to be more readable and easier to understand. New functionality from packages like sf, purrr, tidymodels, and tidytext is now integrated into the text. All chapters have been revised, and several have been split, re-organized, or re-imagined to meet the shifting landscape of best practice.
From a review of the first edition: "Modern Data Science with R… is rich with examples and is guided by a strong narrative voice. What’s more, it presents an organizing framework that makes a convincing argument that data science is a course distinct from applied statistics" (The American Statistician).
The book has three main sections plus supplementary appendices. Part I provides an introduction to data science, which includes an introduction to data visualization, a foundation for data management (or “wrangling”), and ethics. Part II extends key modeling notions from introductory statistics, including regression modeling, classification and prediction, statistical foundations, and simulation. Part III introduces more advanced topics, including interactive data visualization, SQL and relational databases, geospatial data, text mining, and network science. We conclude with appendices that introduce the book's R package, R and RStudio, key aspects of algorithmic thinking, reproducible analysis, a review of regression, and how to set up a local SQL database. We provide two indices: one organized by subject and the other organized by R function and package. In addition, the book features extensive cross-referencing (given the inherent connections between topics and approaches).
Intended audience: This book is intended for readers who want to develop the appropriate skills to tackle complex Data Science projects and “think with data” (as coined by Diane Lambert of Google). The desire to solve problems using data is at the heart of our approach. We acknowledge that it is impossible to cover all these topics in any level of detail within a single book: Many of the chapters could productively form the basis for a course or series of courses. Instead, our goal is to lay a foundation for analysis of real-world data and to ensure that analysts see the power of statistics and data analysis. After reading this book, readers will have greatly expanded their skill set for working with these data, and should have a newfound confidence about their ability to learn new technologies on-the-fly. This book was originally conceived to support a one-semester, 13-week undergraduate course in data science. We have found that the book will be useful for more advanced students in related disciplines, or analysts who want to bolster their data science skills. At the same time, Part I of the book is accessible to a general audience with no programming or statistics experience.
Внимание
Уважаемый посетитель, Вы зашли на сайт как незарегистрированный пользователь.
Мы рекомендуем Вам зарегистрироваться либо войти на сайт под своим именем.
Информация
Посетители, находящиеся в группе Гости, не могут оставлять комментарии к данной публикации.