Preface

Welcome to this book!

These are lecture notes for Data Science 122, Foundations of Data Science III, as taught at Boston University. The notes are written by Pawel Przytycki, Lisa Wobbes, and Mark Crovella.

Format

The notes are in the form of Jupyter notebooks. Demos and most figures are included as executable Python code.

Each chapter is based on a single notebook and forms the basis for roughly one lecture.

Sources

We have relied on many sources for these lecture notes, including many public domain images and other resources (thank you Wikimedia!). Some illustrations were generated using DALL-E and Stable Diffusion.

The principal sources from which we draw much text and many examples are:

  • Think Bayes, Second Edition, Allen Downey. This book in particular is the basis for the Bayesian portion of the course.

  • Data Science From Scratch, Joel Grus

  • Probabilistic Graphical Models, Koller and Friedman

  • Introduction to Probability, Dennis Sun, notes here.

  • A First Course in Probability, Sheldon Ross

  • Mathematical Statistics and Data Analysis, John A. Rice

  • Lecture notes, David Vogan, available here

  • Understanding the New Statistics, Geoff Cumming

  • Statistics Done Wrong, Alex Reinhart

  • Deep Learning, Goodfellow, Bengio, and Courville

  • Applied Stochastic Analysis, Miranda Holmes-Cerfon, available here

  • Numerical Algorithms, Justin Solomon, available here

Code

Packages used in this book include numpy, scipy, pandas, matplotlib, seaborn, and pymc.

Here are some quick instructions for making this book:

Clone the repository, then:

make requirements.txt

Create a Python environment with the necessary packages:

python -m venv bookenvironment

source bookenvironment/bin/activate

pip install -r requirements.txt

make book
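
Once the environment is active, a quick check that the packages named above can be found may save a failed build. The following is a minimal sketch and not part of the book's repository: the helper name missing_packages is hypothetical, and it assumes each package is imported under the same name it is listed by here.

```python
import importlib.util

# Package names as listed in this preface (assumed to match their import names).
PACKAGES = ["numpy", "scipy", "pandas", "matplotlib", "seaborn", "pymc"]


def missing_packages(names):
    """Return the names that cannot be located in the current environment."""
    # find_spec returns None when a top-level module cannot be found;
    # it does not import the module itself.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(PACKAGES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed packages are available.")
```

If anything is reported missing, rerunning pip install -r requirements.txt inside the activated environment is the likely fix.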