Title: Data Science and Predictive Analytics: Biomedical and Health Applications using R (2nd Edition)
Author: Ivo D. Dinov
Publisher: Springer
Year: 2023
Pages: 940
Language: English
Format: PDF (true)
Size: 49.4 MB
This textbook integrates important mathematical foundations, efficient computational algorithms, applied statistical inference techniques, and cutting-edge Machine Learning approaches to address a wide range of crucial biomedical informatics, health analytics applications, and decision science challenges. Each concept in the book includes a rigorous symbolic formulation coupled with computational algorithms and complete end-to-end pipeline protocols implemented as functional R electronic markdown notebooks. These workflows support active learning and demonstrate comprehensive data manipulations, interactive visualizations, and sophisticated analytics. The content includes open problems, state-of-the-art scientific knowledge, ethical integration of heterogeneous scientific tools, and procedures for systematic validation and dissemination of reproducible research findings.
The purpose of this book is to provide a sufficient methodological foundation for a number of modern Data Science techniques, along with hands-on demonstrations of implementation protocols, the pragmatic mechanics of protocol execution, and interpretation of the results of these methods applied to concrete case studies. Successfully completing the Data Science and Predictive Analytics (DSPA) training materials will equip readers to (1) understand the computational foundations of Big Data science; (2) build critical inferential thinking; (3) lend a tool chest of R libraries for managing and interrogating raw, derived, observed, experimental, and simulated big healthcare datasets; and (4) furnish practical skills for handling complex datasets. The DSPA materials are designed to build specific Data Science skills and predictive analytic competencies, as described by the Michigan Institute for Data Science (MIDAS).
Chapter 1 (Introduction) (1) presents the DSPA Mission and Objectives and (2) discusses several driving biomedical challenges.
In Chap. 2 (Basic Visualization and Exploratory Data Analytics), we present additional R programming details about (1) loading, manipulating, visualizing, and saving R data objects; (2) sample-based statistics measuring central tendency and dispersion; (3) understanding different types of variables; (4) scraping data from public websites; (5) missing observations and cohort-rebalancing; (6) graphical techniques for exposing composition, comparison, and relationships in multivariate data; and (7) visualization and computation of 1D, 2D, 3D, and 4D distributions.
The foundations of Linear Algebra, Matrix Computing, and Regression Modeling are presented in Chap. 3. It covers (1) creation, interpretation, processing, and manipulation of second-order tensors (matrices); (2) illustrations of a variety of matrix operations and their applications; (3) demonstrations of linear modeling and solutions of matrix equations; (4) discussion of the eigen-spectra of matrices; (5) the fundamentals of multivariate linear modeling and prediction; and (6) a contrast of regression trees and model trees. This chapter also includes several complete end-to-end predictive analytics examples.
Chapter 4 (Linear and Nonlinear Dimensionality Reduction) starts with a driving motivational example that reduces a 2D dataset to a 1D signal. It covers (1) matrix rotations; (2) linear dimensionality reduction techniques such as principal component analysis (PCA), singular value decomposition (SVD), independent component analysis (ICA), and factor analysis (FA); and (3) non-linear dimensionality reduction methods such as t-distributed stochastic neighbor embedding (t-SNE) and uniform manifold approximation and projection (UMAP).
The discussion of machine learning model-based and model-free techniques commences in Chap. 5 (Supervised Classification). This chapter covers (1) lazy learning classification using the k-nearest neighbors (kNN) algorithm; (2) general divide-and-conquer approaches for splitting data into training and validation sets; (3) some basic strategies for evaluating model performance; (4) probabilistic learning using the Naïve Bayes classifier, and linear and quadratic discriminant analysis classification; (5) decision tree divide-and-conquer classification; and (6) various classification metrics (e.g., entropy, misclassification error, Gini index) and strategies for pruning decision trees.
Chapter 6 (Black Box Machine-Learning Methods) lays out the foundation of neural networks as silicon analogues to biological neurons. This chapter covers (1) a discussion of the effects of network layers and topology on the resulting neural network classification; (2) support vector machines (SVM); and (3) ensemble methods based on bagging, boosting, random forest, and adaptive boosting.
...
Chapter 10 (Specialized Machine Learning Topics) presents some technical details that may be useful to many data science practitioners, computational scientists, and engineers. Here, we discuss (1) data format conversion; (2) SQL data queries; (3) reading and writing XML, JSON, XLSX, and other data formats; (4) visualization of network bioinformatics data; (5) data streaming and on-the-fly stream classification and clustering; (6) optimization and improvement of computational performance; (7) parallel computing; and (8) integration of R, Python, C/C++, and other programming languages within a single R markdown electronic notebook.
...
The last chapter of the textbook is Chap. 14 (Deep Learning, Neural Networks). It covers (1) perceptron activation functions; (2) relations between artificial and biological neurons and networks; (3) neural nets for computing exclusive OR (XOR) and negative AND (NAND) operators; (4) classification of handwritten digits and network-based estimation of the square root function; (5) classification of natural images; (6) variational autoencoders (VAEs); (7) transfer learning; and (8) applications in text mining, image generation, etc.