semba

t-test and interpretation in R

Masumbuko Semba
A hypothesis test is a formal statistical test used to decide whether the data provide enough evidence to reject a statistical hypothesis. This course demonstrates the following hypothesis tests in R: the one-sample t-test, the two-sample t-test, and the paired-samples t-test. Each type of test can be run using the R function t.test(). The function takes the following arguments: t.test(x, y = NULL, alternative = c("two.sided", "less", "greater"), mu = 0, paired = FALSE, var.
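As a rough illustration of the three tests named in this excerpt, the sketch below calls t.test() once for each case. The vectors before and after are simulated, hypothetical data invented for this example; they do not come from the original post.

```r
# Hypothetical data, simulated for illustration only
set.seed(42)
before <- rnorm(20, mean = 50, sd = 5)
after  <- before + rnorm(20, mean = 2, sd = 3)

# One-sample t-test: does the mean of `before` differ from 50?
one_sample <- t.test(before, mu = 50)

# Two-sample t-test: treats the vectors as independent groups
# (Welch's version, because equal variances are not assumed by default)
two_sample <- t.test(before, after, alternative = "two.sided")

# Paired t-test: treats the vectors as repeated measurements on the same units
paired <- t.test(before, after, paired = TRUE)

# Each result is a list of class "htest"; the p-value is one of its elements
print(one_sample$p.value)
print(two_sample$p.value)
print(paired$p.value)
```

Passing the same two vectors with and without paired = TRUE shows how the pairing assumption alone changes the test that is run.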

Text Mining and Wordcloud in R

Masumbuko Semba
Word clouds visualize word frequencies of either a single corpus or several corpora. Although word clouds are rarely used in academic publications, they are a common way to display language data and the topics of texts, which may be thought of as their semantic content. To show how to use word clouds, we are going to look at the State of Environment issued in 2019 by the Department of Environment in the Vice President’s Office.

Access Open Street Map features programmatically with osmdata package in R

Masumbuko Semba
OpenStreetMap (OSM) is a great source of spatial data. Most common programming languages have packages for downloading data from OSM. In this tutorial we are going to see how to download hospital feature data using R’s osmdata package (Padgham et al. 2017), plot it statically using ggplot2 (Wickham 2016), and plot it interactively using tmap (Tennekes 2018). This requires some knowledge of spatial data structures.

Simplified Principal Component Analysis in R

Masumbuko Semba
R
Principal Component Analysis (PCA) is widely used to explore data. This technique allows you to visualize and understand how the variables in a dataset vary. PCA is therefore particularly helpful when the dataset contains many variables. It is an unsupervised learning method that allows you to better understand the variability in the dataset and how different variables are related. The components in PCA are the underlying structure in the data.
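A minimal sketch of the idea, assuming base R's prcomp() and the built-in USArrests dataset. Both are choices made for this illustration and are not necessarily what the original post uses.

```r
# Built-in dataset, used here purely for illustration
data(USArrests)

# Center and scale the variables so each contributes on a comparable scale
pca <- prcomp(USArrests, center = TRUE, scale. = TRUE)

# Proportion of the total variance explained by each component
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
print(round(var_explained, 3))

# Loadings: how each original variable contributes to each component
print(pca$rotation)

# Scores: the observations expressed in the component space
print(head(pca$x))
```

The loadings show how the original variables relate to one another, while the variance proportions indicate how many components are worth keeping.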

The Lake Victoria bathymetry

Masumbuko Semba
I was looking online for a bathymetry dataset for Lake Victoria and came across this link. It stores several bathymetry data products for Lake Victoria, among them a gridded TIFF file. This dataset was created by a team from Harvard University in 2017 (Hamilton et al. 2016). They used over 4.2 million points collected over 100 years of surveys. The point data were obtained from an Admiral Bathymetry map and from points collected in the field.