
Links of interest

BOOKS
 

Advanced R https://adv-r.hadley.nz/

 

Forecasting: Principles and Practice by R.J. Hyndman and G. Athanasopoulos, Monash University, Australia. https://otexts.com/fpp2/

 

Time series analysis with R, by A. Coghlan, Wellcome Trust Sanger Institute, Cambridge, UK. https://a-little-book-of-r-for-time-series.readthedocs.io/en/latest/

 

Holmes A., Illowsky B., Dean S., 2017. Introductory Business Statistics. Rice University, OpenStax. https://openstax.org/details/books/introductory-business-statistics

 

Hanck C., Arnold M., Gerber A. and Schmelzer M., 2020. Introduction to Econometrics with R. University of Duisburg-Essen, Germany. https://www.econometrics-with-r.org/index.html

 

Mathematics for Machine Learning, by Marc Peter Deisenroth, A. Aldo Faisal, and Cheng Soon Ong, https://mml-book.github.io/

 

Applied Time Series Analysis for Fisheries and Environmental Sciences, by E. E. Holmes, M. D. Scheuerell, and E. J. Ward. https://atsa-es.github.io/atsa-labs/sec-tslab-correlation-within-and-among-time-series.html

 

Fleuret F. The Little Book of Deep Learning. https://fleuret.org/francois/lbdl.html; Deep learning course https://fleuret.org/dlc/

​

Demand Forecasting for Executives and Professionals. https://dfep.netlify.app/index.html

 

Spatial Statistics for Data Science: Theory and Practice with R. https://www.paulamoraga.com/book-spatial/

 

Unraveling principal component analysis. https://peterbloem.nl/publications/unraveling-pca

 

R for Data Science. https://r4ds.had.co.nz/

​

Deep Learning, by Ian Goodfellow, Yoshua Bengio, and Aaron Courville. https://www.deeplearningbook.org/

​

Telling Stories with Data. A range of topics on statistical communication, how to use statistical models and share results. https://tellingstorieswithdata.com/

​

Molnar C., Freiesleben T., Supervised Machine Learning for Science. How to stop worrying and love your black box. https://ml-science-book.com/

​

Gelman A., Hill J., Vehtari A. Regression and Other Stories. Cambridge University Press. https://avehtari.github.io/ROS-Examples/

​

Kabacoff R. Modern Data Visualization with R. https://rkabacoff.github.io/datavis/
 

Prince S. J. D. Understanding Deep Learning. https://udlbook.github.io/udlbook/

​

​

TUTORIALS
 

Deep Learning Book Series, by Hadrien Jean, Ph.D., https://hadrienj.github.io/deep-learning-book-series-home/

​

R-squared: Where Geometry Meets Statistics. https://blog.minitab.com/blog/statistics-and-quality-data-analysis/r-squared-sometimes-a-square-is-just-a-square

 

Cross Correlation Functions and Lagged Regressions. https://online.stat.psu.edu/stat510/lesson/8/8.2

 

Handling imbalanced data. https://dataaspirant.com/handle-imbalanced-data-machine-learning/

 

Ensemble methods: bagging and boosting. https://dataaspirant.com/ensemble-methods-bagging-vs-boosting-difference/

 

Regression ANOVA http://www.stat.yale.edu/Courses/1997-98/101/anovareg.htm

 

Making predictions from a mixed model using R. http://optimumsportsperformance.com/blog/making-predictions-from-a-mixed-model-using-r/

 

GLM and causal inference, by A. Solomon Kurz, PhD. https://solomonkurz.netlify.app/blog/2023-04-12-boost-your-power-with-baseline-covariates/

 

The Illustrated Machine Learning website. https://illustrated-machine-learning.github.io

 

An introduction to Git and GitHub. https://product.hubspot.com/blog/git-and-github-tutorial-for-beginners

 

Demystifying Fourier analysis. https://dsego.github.io/demystifying-fourier/

 

A visual introduction to machine learning. http://www.r2d3.us/

 

Treemaps to visualize heterogeneous, hierarchical data. https://blog.phronemophobic.com/treemaps-are-awesome.html

 

Practical guide to conjoint analysis. https://www.andrewheiss.com/blog/2023/07/25/conjoint-bayesian-frequentist-guide/

 

Conceptualizing functions as infinite-dimensional vectors, https://thenumb.at/Functions-are-Vectors/

​

Generalization seems to happen abruptly and long after fitting the training data (grokking) https://pair.withgoogle.com/explorables/grokking/

​

Large language models explained, https://www.understandingai.org/p/large-language-models-explained-with

​

Correspondence analysis, https://www.displayr.com/how-correspondence-analysis-works/ and http://www.sthda.com/english/articles/31-principal-component-methods-in-r-practical-guide/120-correspondence-analysis-theory-and-practice/

​

Extracting seasonality and trend from time series, https://anomaly.io/seasonal-trend-decomposition-in-r/index.html

​

Seasonal decomposition of short time series, https://robjhyndman.com/hyndsight/tslm-decomposition/

​

Wilcoxon–Mann–Whitney test and t-test https://cienciadedatos.net/documentos/17_mann%E2%80%93whitney_u_test

​

Bloom filters: a probabilistic data structure that tests whether an element is part of a set, trading accuracy for memory usage (false positives are possible, false negatives are not). They suit filtering tasks in large datasets, where they can significantly reduce resource usage. https://samwho.dev/bloom-filters/

​

Binary logistic regression in R https://statsandr.com/blog/binary-logistic-regression-in-r/
 

Moving range charts https://demingalliance.org/resources/articles/process-behaviour-charts-an-introduction

 

Six not-so-basic base R functions https://ivelasq.rbind.io/blog/not-so-basic-base-r-functions/

​

Deploy a Shinylive R app on GitHub Pages https://github.com/RamiKrispin/shinylive-r

​

Connect RStudio to Git and GitHub, https://happygitwithr.com/rstudio-git-github

​

Seeing Theory, a visual introduction to probability and statistics. https://seeing-theory.brown.edu/

​​

Linear mixed-effects models

​

Learning Statistical Models Through Simulation in R, https://psyteachr.github.io/stat-models-v1/index.html

​

Bayesian statistics and modelling

 

Pensamiento Estadístico (Statistical Thinking), by Felipe Bravo, Department of Computer Science, University of Chile. https://github.com/dccuchile/CC6104; video playlist: https://www.youtube.com/playlist?list=PLppKo85eGXiXpvRVYM5ZJEHWWofjzuiXw

​

Markov-chain Monte Carlo Interactive Gallery https://chi-feng.github.io/mcmc-demo/

 

Introduction to Bayesian statistics in R & brms https://github.com/benjamin-rosenbaum/bayesian-intro

​

(YouTube channels)

​

Playlists and individual videos on statistics, machine learning and related topics https://www.youtube.com/@statquest, with an index of the published videos at https://statquest.org/video-index/

​

Alfonso Garcia Perez, Prof. PhD, https://www.youtube.com/@alfonsogarciaperez9717, website https://www.uned.es/universidad/docentes/ciencias/alfonso-garcia-perez.html

​

Introducción a la inferencia a partir de las pendientes de rectas de regresión muestral (Introduction to inference from the slopes of sample regression lines). https://www.youtube.com/watch?v=bWG7-WmVtQE

​

​

ONLINE TOOLS AND RESOURCES

​

Statistics applets:

https://homepage.divms.uiowa.edu/~mbognar/

https://probstats.org/

​

Draw Function Graphs https://rechneronline.de/function-graphs/

 

Practice datasets by the SuperDataScience Team. https://www.superdatascience.com/pages/training

 

Human population within a distance, from any point in the world. https://www.tomforth.co.uk/circlepopulations/

 

Website of the Statistics area of the Department of Mathematics at ULPGC. https://estadistica-dma.ulpgc.es/

​

Functional Indicator Tool, application for displaying indicators related to Territorial Cohesion Policy at European and regional levels, https://apps.espon.eu/fit/

​

​

PACKAGES AND SPECIFIC TOPICS
 

Arrow in R. https://r4ds.hadley.nz/arrow.html

 

Cookbook Polars for R. https://ddotta.github.io/cookbook-rpolars/

 

R packages for visualising spatial data. https://nrennie.rbind.io/blog/2022-12-17-r-packages-for-visualising-spatial-data/

 

messydates R package. https://globalgov.github.io/messydates/

​

explore R package for exploratory data analysis https://github.com/rolkra/explore/

​

Column-oriented tables in SQLite https://github.com/dgllghr/stanchion

​

​

PAPERS
 

von Kügelgen J. et al., 2021. Simpson's Paradox in COVID-19 Case Fatality Rates: A Mediation Analysis of Age-Related Causal Effects. doi: 10.1109/TAI.2021.3073088

 

Wasserstein and Lazar, 2016. The ASA Statement on p-Values: Context, Process, and Purpose. doi: 10.1080/00031305.2016.1154108

 

Wei-Chao Lin et al., 2017. Clustering-based undersampling in class-imbalanced data. doi: 10.1016/j.ins.2017.05.008

 

Lunardon et al., 2014. ROSE: a Package for Binary Imbalanced Learning. doi: 10.32614/RJ-2014-008

 

Marnik G. Dekimpe et al., 1999. Long-run effects of price promotions in scanner markets. doi: 10.1016/S0304-4076(98)00064-5

​

Muggeo, 2008. Segmented: An R package to fit regression models with broken-line relationships. https://journal.r-project.org/articles/RN-2008-004/

​

Hewamalage, H., Ackermann, K. & Bergmeir, C. Forecast evaluation for data scientists: common pitfalls and best practices. Data Min Knowl Disc 37, 788–832 (2023). https://doi.org/10.1007/s10618-022-00894-5

​

​

MISC
 

"What machine learning is", blog post by Jason Brownlee, https://machinelearningmastery.com/what-is-machine-learning 

 

"What machine learning is", Data Science Venn Diagram by Drew Conway, http://drewconway.com/zia/2013/3/26/the-data-science-venn-diagram

 

A collection of questions and answers from data science interviews, by Youssef Hosni. https://github.com/youssefHosni/Data-Science-Interview-Questions-Answers

​

In this post, Mikkel Dengsøe describes his approach to decision-making by thinking in small bets, rather than getting slowed down waiting for conclusive data. "Pack your bags and enjoy the journey through uncertainty". https://mikkeldengsoe.substack.com/p/data-will-not-tell-you-what-to-do

​

The story of the t-test and the p-value. https://www.scientificamerican.com/article/how-the-guinness-brewery-invented-the-most-important-statistical-method-in/

​


© 2022 by Fabio Natalini.
