Digital Resources for Chapter 4: k-Nearest Neighbors

Below are digital resources that complement the book Practical Machine Learning with R: Tutorials and Case Studies.



R Script and Data Sources for MNIST Data Analysis


The material provided below allows you to experiment with recognizing handwritten digits from the MNIST dataset using k-Nearest Neighbors.

You can use the import() command from the rio package together with the links below to import random subsets of the MNIST dataset as well as the complete MNIST dataset:

Example:
library(rio)
DataMnist <- import("https://ai.lange-analytics.com/data/MN500.rds")

Alternatively, you can download an R script that you can use as a template to work with the various MNIST data subsets. See the link below.
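For orientation, here is a minimal sketch (not the downloadable script itself) of how an imported MNIST subset could be used to train a k-Nearest Neighbors classifier with the `class` package. It assumes the imported data frame contains a `Label` column plus one column per pixel; check `names(DataMnist)` after importing, since the actual column names may differ.

```r
library(rio)    # for import()
library(class)  # for knn()

# Import a 500-image MNIST subset (link from the list above)
DataMnist <- import("https://ai.lange-analytics.com/data/MN500.rds")
DataMnist$Label <- as.factor(DataMnist$Label)  # assumed label column

# Simple 80/20 train/test split
set.seed(123)
TrainIdx  <- sample(nrow(DataMnist), 0.8 * nrow(DataMnist))
DataTrain <- DataMnist[TrainIdx, ]
DataTest  <- DataMnist[-TrainIdx, ]

# Predictor columns = everything except the label
PredCols <- setdiff(names(DataMnist), "Label")

# knn() labels each test image by majority vote of its k nearest
# training images (here k = 3)
PredLabels <- knn(train = DataTrain[, PredCols],
                  test  = DataTest[, PredCols],
                  cl    = DataTrain$Label,
                  k     = 3)

mean(PredLabels == DataTest$Label)  # accuracy on the test set
```

The choice of `k = 3` and the 80/20 split are illustrative defaults; the book's script may use different settings.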






Free DataCamp Course About k-Nearest Neighbors


This is a free course from DataCamp that introduces k-Nearest Neighbors. It is interactive, and it provides exercises. The course covers other classification models as well, but those parts are not free.






A Video Tutorial for k-Nearest Neighbors to Recognize Handwriting


This R tutorial video by Carsten Lange shows how handwritten digits from images of the MNIST dataset can be classified using k-Nearest Neighbors.






Low Code/No Code Data Analysis with KNIME: k-Nearest Neighbors (Wine Data)


This is a blog post by Carsten Lange about how you can repeat the wine color analysis from Section 4.9 with a no-code approach in KNIME. The KNIME platform is freely available no-code/low-code software. If you have KNIME installed on your computer, you can use the links in the blog post to download the related KNIME workflow.







Prepare and Normalize Data Before Training a k-Nearest Neighbors Model


This video from DataCamp explains how to prepare data for k-Nearest Neighbors and how to normalize it.
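Because k-Nearest Neighbors is distance based, features on large scales can dominate the result. Below is a short base-R sketch of min-max normalization; the column names are illustrative only, not taken from the video or the book's data.

```r
# Min-max normalization: rescale a numeric vector to the range [0, 1]
normalize <- function(x) (x - min(x)) / (max(x) - min(x))

# Illustrative data frame (made-up values)
DataWine <- data.frame(Alcohol = c(9.5, 12.1, 14.0),
                       Sugar   = c(1.2, 45.0, 6.3))

# Apply normalize() to every column
DataWineNorm <- as.data.frame(lapply(DataWine, normalize))

# Every column now ranges from 0 to 1, so no single feature
# dominates the distance calculation in k-Nearest Neighbors.
summary(DataWineNorm)
```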






Confusion Matrix in Machine Learning


The article by Amit Chauhan in Analytics Vidhya explains the confusion matrix. It also covers accuracy, sensitivity, specificity, and other metrics.
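As a quick illustration of the metrics the article covers, the base-R sketch below builds a confusion matrix from made-up predictions (the wine-color labels are illustrative only, with "red" treated as the positive class).

```r
# Illustrative actual vs. predicted labels (made-up data)
Actual    <- factor(c("red", "red", "white", "white", "red", "white"))
Predicted <- factor(c("red", "white", "white", "white", "red", "red"))

# Confusion matrix: rows = predicted class, columns = actual class
ConfMatrix <- table(Predicted, Actual)
print(ConfMatrix)

TP <- ConfMatrix["red", "red"]      # true positives
TN <- ConfMatrix["white", "white"]  # true negatives
FP <- ConfMatrix["red", "white"]    # false positives
FN <- ConfMatrix["white", "red"]    # false negatives

Accuracy    <- (TP + TN) / sum(ConfMatrix)  # 4/6 here
Sensitivity <- TP / (TP + FN)               # true positive rate
Specificity <- TN / (TN + FP)               # true negative rate
```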