

Links of interest
BOOKS
Advanced R https://adv-r.hadley.nz/
Forecasting: Principles and Practice by R.J. Hyndman and G. Athanasopoulos, Monash University, Australia. https://otexts.com/fpp2/
Time series analysis with R, by A. Coghlan, Wellcome Trust Sanger Institute, Cambridge, UK. https://a-little-book-of-r-for-time-series.readthedocs.io/en/latest/
Holmes A., Illowsky B., Dean S., 2017. Introductory Business Statistics. Rice University, OpenStax. https://openstax.org/details/books/introductory-business-statistics
Hanck C., Arnold M., Gerber A. and Schmelzer M., 2020. Introduction to Econometrics with R. University of Duisburg-Essen, Germany. https://www.econometrics-with-r.org/index.html
Mathematics for Machine Learning, by Marc Peter Deisenroth, A. Aldo Faisal, and Cheng Soon Ong, https://mml-book.github.io/
Applied Time Series Analysis for Fisheries and Environmental Sciences, by E. E. Holmes, M. D. Scheuerell, and E. J. Ward. https://atsa-es.github.io/atsa-labs/sec-tslab-correlation-within-and-among-time-series.html
Fleuret F. The Little Book of Deep Learning. https://fleuret.org/francois/lbdl.html; Deep learning course https://fleuret.org/dlc/
​
Demand Forecasting for Executives and Professionals. https://dfep.netlify.app/index.html
Spatial Statistics for Data Science: Theory and Practice with R. https://www.paulamoraga.com/book-spatial/
Unraveling principal component analysis. https://peterbloem.nl/publications/unraveling-pca
R for Data Science. https://r4ds.had.co.nz/
​
Deep Learning, by Ian Goodfellow and Yoshua Bengio and Aaron Courville. https://www.deeplearningbook.org/
​
Telling Stories with Data. A range of topics on statistical communication, how to use statistical models and share results. https://tellingstorieswithdata.com/
​
Molnar C., Freiesleben T., Supervised Machine Learning for Science. How to stop worrying and love your black box. https://ml-science-book.com/
​
Gelman A., Hill J., Vehtari A. Regression and Other Stories. Cambridge University Press. https://avehtari.github.io/ROS-Examples/
​
Kabacoff R. Modern Data Visualization with R. https://rkabacoff.github.io/datavis/
Prince S. J. D. Understanding Deep Learning. https://udlbook.github.io/udlbook/
​
​
TUTORIALS
Deep Learning Book Series, by Hadrien Jean, Ph.D. https://hadrienj.github.io/deep-learning-book-series-home/
​
R-squared: Where Geometry Meets Statistics. https://blog.minitab.com/blog/statistics-and-quality-data-analysis/r-squared-sometimes-a-square-is-just-a-square
Cross Correlation Functions and Lagged Regressions. https://online.stat.psu.edu/stat510/lesson/8/8.2
Handling imbalanced data. https://dataaspirant.com/handle-imbalanced-data-machine-learning/
Ensemble methods: bagging and boosting. https://dataaspirant.com/ensemble-methods-bagging-vs-boosting-difference/
Regression ANOVA http://www.stat.yale.edu/Courses/1997-98/101/anovareg.htm
Making predictions from a mixed model using R. http://optimumsportsperformance.com/blog/making-predictions-from-a-mixed-model-using-r/
GLM and causal inference, by A. Solomon Kurz, PhD. https://solomonkurz.netlify.app/blog/2023-04-12-boost-your-power-with-baseline-covariates/
The Illustrated Machine Learning website. https://illustrated-machine-learning.github.io
An introduction to Git and GitHub. https://product.hubspot.com/blog/git-and-github-tutorial-for-beginners
Demystifying Fourier analysis. https://dsego.github.io/demystifying-fourier/
A visual introduction to machine learning. http://www.r2d3.us/
Treemaps to visualize heterogeneous, hierarchical data. https://blog.phronemophobic.com/treemaps-are-awesome.html
Practical guide to conjoint analysis. https://www.andrewheiss.com/blog/2023/07/25/conjoint-bayesian-frequentist-guide/
Conceptualizing functions as infinite-dimensional vectors, https://thenumb.at/Functions-are-Vectors/
​
Generalization seems to happen abruptly and long after fitting the training data (grokking) https://pair.withgoogle.com/explorables/grokking/
​
Large language models explained, https://www.understandingai.org/p/large-language-models-explained-with
​
Correspondence analysis, https://www.displayr.com/how-correspondence-analysis-works/, http://www.sthda.com/english/articles/31-principal-component-methods-in-r-practical-guide/120-correspondence-analysis-theory-and-practice/
​
Extracting seasonality and trend from time series, https://anomaly.io/seasonal-trend-decomposition-in-r/index.html
​
Seasonal decomposition of short time series, https://robjhyndman.com/hyndsight/tslm-decomposition/
​
Wilcoxon–Mann–Whitney test and t-test https://cienciadedatos.net/documentos/17_mann%E2%80%93whitney_u_test
​
A Bloom filter is a probabilistic data structure that tests whether an element is in a set, trading accuracy (false positives are possible, false negatives are not) for memory. Bloom filters suit filtering tasks on large datasets, where they can significantly reduce resource usage. https://samwho.dev/bloom-filters/
​
Binary logistic regression in R https://statsandr.com/blog/binary-logistic-regression-in-r/
Moving range charts https://demingalliance.org/resources/articles/process-behaviour-charts-an-introduction
Six not-so-basic base R functions https://ivelasq.rbind.io/blog/not-so-basic-base-r-functions/
​
Deploy a Shinylive R app on GitHub Pages https://github.com/RamiKrispin/shinylive-r
​
Connect RStudio to Git and GitHub, https://happygitwithr.com/rstudio-git-github
​
Seeing Theory, a visual introduction to probability and statistics. https://seeing-theory.brown.edu/
​​
Linear mixed-effects models
​
Learning Statistical Models Through Simulation in R, https://psyteachr.github.io/stat-models-v1/index.html
​
Bayesian statistics and modelling
Pensamiento Estadístico, Felipe Bravo, Department of Computer Science, University of Chile https://github.com/dccuchile/CC6104, https://www.youtube.com/playlist?list=PLppKo85eGXiXpvRVYM5ZJEHWWofjzuiXw
​
Markov-chain Monte Carlo Interactive Gallery https://chi-feng.github.io/mcmc-demo/
Introduction to Bayesian statistics in R & brms https://github.com/benjamin-rosenbaum/bayesian-intro
​
(YouTube channels)
​
Playlists and individual videos on statistics, machine learning and related topics https://www.youtube.com/@statquest, with an index of the published videos at https://statquest.org/video-index/
​
Alfonso Garcia Perez, Prof. PhD, https://www.youtube.com/@alfonsogarciaperez9717, website https://www.uned.es/universidad/docentes/ciencias/alfonso-garcia-perez.html
​
Introduction to inference from the slopes of sample regression lines. https://www.youtube.com/watch?v=bWG7-WmVtQE
​
​
ONLINE TOOLS AND RESOURCES
​
Statistics applets:
https://homepage.divms.uiowa.edu/~mbognar/
​
Draw Function Graphs https://rechneronline.de/function-graphs/
Practice datasets by the SuperDataScience Team. https://www.superdatascience.com/pages/training
Human population within a distance, from any point in the world. https://www.tomforth.co.uk/circlepopulations/
Website of the Statistics area of the Department of Mathematics at ULPGC. https://estadistica-dma.ulpgc.es/
​
Functional Indicator Tool, application for displaying indicators related to Territorial Cohesion Policy at European and regional levels, https://apps.espon.eu/fit/
​
​
PACKAGES AND SPECIFIC TOPICS
Arrow in R. https://r4ds.hadley.nz/arrow.html
Cookbook Polars for R. https://ddotta.github.io/cookbook-rpolars/
R packages for visualising spatial data. https://nrennie.rbind.io/blog/2022-12-17-r-packages-for-visualising-spatial-data/
messydates R package. https://globalgov.github.io/messydates/
​
explore R package for exploratory data analysis https://github.com/rolkra/explore/
​
Column-oriented tables in SQLite https://github.com/dgllghr/stanchion
​
​
PAPERS
von Kügelgen J. et al. 2021. Simpson's Paradox in COVID-19 Case Fatality Rates: A Mediation Analysis of Age-Related Causal Effects. doi: 10.1109/TAI.2021.3073088
Wasserstein and Lazar, 2016. The ASA Statement on p-Values: Context, Process, and Purpose. doi: https://doi.org/10.1080/00031305.2016.1154108
Wei-Chao Lin et al., 2017. Clustering-based undersampling in class-imbalanced data. doi: https://doi.org/10.1016/j.ins.2017.05.008
Lunardon et al., 2014. ROSE: a Package for Binary Imbalanced Learning. doi: 10.32614/RJ-2014-008
Marnik G. Dekimpe et al., 1999. Long-run effects of price promotions in scanner markets. doi: https://doi.org/10.1016/S0304-4076(98)00064-5
​
Muggeo, 2008. Segmented: An R package to fit regression models with broken-line relationships. https://journal.r-project.org/articles/RN-2008-004/
​
Hewamalage, H., Ackermann, K. & Bergmeir, C. Forecast evaluation for data scientists: common pitfalls and best practices. Data Min Knowl Disc 37, 788–832 (2023). https://doi.org/10.1007/s10618-022-00894-5
​
​
MISC
"What machine learning is", blog post by Jason Brownlee, https://machinelearningmastery.com/what-is-machine-learning
"What machine learning is", Data Science Venn Diagram by Drew Conway, http://drewconway.com/zia/2013/3/26/the-data-science-venn-diagram
A collection of questions and answers from data science interviews, by Youssef Hosni. https://github.com/youssefHosni/Data-Science-Interview-Questions-Answers
​
In this post, Mikkel Dengsøe describes his approach to decision-making by thinking in small bets, rather than getting slowed down waiting for conclusive data. "Pack your bags and enjoy the journey through uncertainty". https://mikkeldengsoe.substack.com/p/data-will-not-tell-you-what-to-do
​
The story of the t-test and the p-value. https://www.scientificamerican.com/article/how-the-guinness-brewery-invented-the-most-important-statistical-method-in/
​