R

PCA made easy in R

Masumbuko Semba
R
In the previous post I illustrated a simple way to do Principal Component Analysis in R. I simply used the output results from prcomp() function of R base. But, I constantly find hard to the untidy output that prcomp generates and wished to get a tidy result. In this post I will illustrate the approaches that I was inspired by Claus Wilke in the post PCA tidyverse style.

Simplified Principal Component Analysis in R

Masumbuko Semba
R
Principal Component Analysis (PCA) Principal Component Analysis (PCA) is widely used to explore data. This technique allows you visualize and understand how variables in the dataset varies. Therefore, PCA is particularly helpful where the dataset contain many variables.This is a method of unsupervised learning that allows you to better understand the variability in the data set and how different variables are related. The Components in PCA are the underlying structure in the data.

The Lake Victoria bathymetry

Masumbuko Semba
I was looking for bathymetry dataset for Lake Victoria online and I came across this link. It stores several products of the bathymetry data of the Lake Victoria. Among them products is the gridded TIFF file. This dataset was created by a team from Harvard University in 2017 (Hamilton et al. 2016). They used over 4.2 million points collected over 100-years of surveys. The point data was obtained from an Admiral Bathymetry map and points collected in the field.

A unified Machine Learning Approach in R with tidymodels

Masumbuko Semba
tidymodels tidymodels is a suite of packages that make machine learning with R a breeze. R has many packages for machine learning, each with their own syntax and function arguments. tidymodels aims to provide an unified interface, which allows data scientists to focus on the problem they’re trying to solve, instead of wasting time with learning package syntax. The tidymodels has a modular approach meaning that specific, smaller packages designed to work hand in hand.

Linear and Bayesian Regression Models with tidymodels package

Masumbuko Semba
As a data scientist, you need to distinguish between regression predictive models and classification predictive models. Clear understanding of these models helps to choose the best one for a specific use case. In a nutshell, regression predictive models andclassification predictive models` fall under supervised machine learning. The main difference between them is that the output variable—in regression is numerical (or continuous) while that for classification is categorical (or discrete).