org-mode org-babel ess-ample
Using dependencies
- emacs 27
- org-mode
- org-babel
- ESS (Emacs Speaks Statistics)
Inserting source commands
Type <s, press <tab>, then type R to get an empty R source block:
#+BEGIN_SRC R
#+END_SRC
Installing dplyr and ggplot2
install.packages(c("dplyr","ggplot2"))
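Re-running the install line every time the file is evaluated downloads the packages again. A minimal sketch of one way to avoid that, assuming a hypothetical helper called missing_packages (not part of the original notes) and a guard so that nothing is installed outside an interactive session:

```r
# Hypothetical helper: return the names in `pkgs` whose namespace
# cannot be loaded from the current library paths.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

to_install <- missing_packages(c("dplyr", "ggplot2"))

# Only install when something is actually missing, and only in an
# interactive session (an assumption made here to keep batch runs quiet).
if (interactive() && length(to_install) > 0) {
  install.packages(to_install)
}
```

If both packages are already present, to_install is empty and the block does nothing.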
Following along with the Wilke Lab
library(dplyr)

iris %>%
  select(-Species) %>%   # remove Species column
  kmeans(centers=3) ->   # do k-means clustering with 3 centers
  km                     # store result as `km`

# now display the results from the analysis
km
gets us
K-means clustering with 3 clusters of sizes 50, 62, 38

Cluster means:
  Sepal.Length Sepal.Width Petal.Length Petal.Width
1     5.006000    3.428000     1.462000    0.246000
2     5.901613    2.748387     4.393548    1.433871
3     6.850000    3.073684     5.742105    2.071053

Clustering vector:
  [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 [38] 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 3 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
 [75] 2 2 2 3 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 3 2 3 3 3 3 2 3 3 3 3
[112] 3 3 2 2 3 3 3 3 2 3 2 3 2 3 3 2 2 3 3 3 3 3 2 3 3 3 3 2 3 3 3 2 3 3 3 2 3
[149] 3 2

Within cluster sum of squares by cluster:
[1] 15.15100 39.82097 23.87947
 (between_SS / total_SS =  88.4 %)

Available components:
[1] "cluster"      "centers"      "totss"        "withinss"     "tot.withinss"
[6] "betweenss"    "size"         "iter"         "ifault"
in the *R:<location>* buffer.
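kmeans picks its starting centers at random, so the cluster numbering (and sometimes the cluster sizes) can differ from the output above between runs. A small sketch of how the run could be made repeatable, using base R only; the seed value 42 and nstart = 25 are arbitrary choices for illustration, not settings from the original analysis:

```r
# Fix the random number generator so repeated evaluations of this
# block give the same clustering (42 is an arbitrary, assumed seed).
set.seed(42)

# iris[, 1:4] keeps the four numeric columns, which matches dropping Species.
# nstart = 25 tries 25 random starts and keeps the best one (assumed value).
km <- kmeans(iris[, 1:4], centers = 3, nstart = 25)

# cluster sizes for this run
km$size
```

Trying several random starts makes it less likely that the result is a poor local optimum, at the cost of a little more computation.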
The source executions are contiguous: km from the previous block is still defined
km$centers
  Sepal.Length Sepal.Width Petal.Length Petal.Width
1     5.006000    3.428000     1.462000    0.246000
2     5.901613    2.748387     4.393548    1.433871
3     6.850000    3.073684     5.742105    2.071053
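The other entries listed under "Available components" can be pulled out the same way as km$centers. A short sketch, refitting the model so the block stands on its own (the seed and nstart values are assumptions for illustration):

```r
set.seed(42)  # arbitrary seed, assumed here for repeatability
km <- kmeans(iris[, 1:4], centers = 3, nstart = 25)

# number of observations in each cluster
km$size

# within-cluster sum of squares, one value per cluster
km$withinss

# share of the total variance explained by the clustering;
# this is the between_SS / total_SS figure shown in the printed summary
km$betweenss / km$totss
```

The total sum of squares splits into the within-cluster and between-cluster parts, so km$tot.withinss + km$betweenss equals km$totss.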
Plotting
# add cluster information to the iris data
# we use `factor(km$cluster)` to tell R that the cluster numbers represent distinct categories,
# not continuous values
iris_clustered <- data.frame(iris, cluster=factor(km$cluster))
library('ggplot2')

ggplot(iris_clustered, aes(x=Petal.Width, y=Sepal.Width, color=cluster, shape=Species)) +
  geom_point()

ggsave("irisclusters.png")
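The plot encodes cluster as color and Species as shape; the same comparison can be read off as counts. A minimal sketch using base R's table(), refitting the model so the block is self-contained (seed and nstart are assumed values, and the name tab is only illustrative):

```r
set.seed(42)  # arbitrary seed, assumed here for repeatability
km <- kmeans(iris[, 1:4], centers = 3, nstart = 25)

# same construction as above: attach the cluster labels as a factor
iris_clustered <- data.frame(iris, cluster = factor(km$cluster))

# rows are k-means clusters, columns are the true species
tab <- table(cluster = iris_clustered$cluster,
             species = iris_clustered$Species)
tab
```

Each column sums to 50, the number of flowers per species, so a species split across two rows shows where the clustering disagrees with the labels.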
Later I should do some analysis of public retinal fundus image corpora.