This repository holds scripts demonstrating the methods from the journal article "Geochemical characterisation of rock hydration processes using t-SNE". The dataset used in the article is not publicly available, so a public chemical assay database is used in this repository instead.
The "examples/examples.R" file walks through the available functions using a publicly available chemical assay database (see https://www.rdocumentation.org/packages/compositions/versions/1.40-1/topics/Hydrochem).
The "examples/examples.R" file should create figures which match those in "examples/examples_out".
We apply the techniques in "Geochemical characterisation of rock hydration processes using t-SNE" to a hydrochemical dataset containing water sample assays and their source river name.
The figure below shows that the Mg, Ca, Sr, and SO4 concentrations are highly predictive of the river source.
The embeddings below were generated using Rtsne with the Aitchison distance rather than the Euclidean distance. The 'subset embedding' uses only Mg, Ca, Sr, and SO4 as input.
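As a rough illustration of the distance choice described above, the sketch below computes the Aitchison distance as the Euclidean distance between centred log-ratio (clr) transformed samples and passes it to Rtsne as a precomputed distance. It is not the code in "examples/examples.R": the helper names `clr` and `aitchison_dist`, the synthetic `assays` matrix, the extra Na and Cl columns, and the perplexity value are all assumptions made for illustration, and the subset distance here is taken on the four-part subcomposition, which may differ from how the repository handles it.

```r
# Centred log-ratio (clr) transform: log of each part minus the row mean of logs.
clr <- function(x) {
  lx <- log(as.matrix(x))
  lx - rowMeans(lx)
}

# The Aitchison distance is the Euclidean distance between clr-transformed rows.
aitchison_dist <- function(x) dist(clr(x))

set.seed(1)
# Hypothetical stand-in for strictly positive assay concentrations
# (rows are samples, columns are measured parts).
parts <- c("Na", "Cl", "Mg", "Ca", "Sr", "SO4")
assays <- matrix(rlnorm(200 * length(parts)), ncol = length(parts),
                 dimnames = list(NULL, parts))

d_full   <- aitchison_dist(assays)
d_subset <- aitchison_dist(assays[, c("Mg", "Ca", "Sr", "SO4")])

# Rescaling each sample by its own positive factor leaves the distances
# unchanged, which is the property that motivates the Aitchison distance
# for compositional data.
row_scale  <- runif(nrow(assays), 1, 10)
d_rescaled <- aitchison_dist(assays * row_scale)

# Embed the subset distances with Rtsne if the package is installed.
if (requireNamespace("Rtsne", quietly = TRUE)) {
  fit <- Rtsne::Rtsne(d_subset, is_distance = TRUE, perplexity = 30)
  head(fit$Y)  # one 2-D coordinate pair per sample
}
```

Passing a `dist` object with `is_distance = TRUE` tells Rtsne to use the supplied distances rather than computing Euclidean distances on the raw concentrations.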
The Random Forest is regularised by increasing the minimum terminal node size. As this dataset presumably has little label noise (the source river of each sample is known), the regularisation is ultimately unnecessary.
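The effect of this kind of regularisation can be sketched by fitting forests over a range of minimum terminal node sizes and comparing their out-of-bag error. The example below is only a sketch under stated assumptions: it assumes the `randomForest` package and its `nodesize` argument (the package used by this repository is not stated here), and it uses synthetic two-river data with hypothetical names (`river`, `assays`, `node_sizes`) rather than the Hydrochem data.

```r
set.seed(42)
# Hypothetical stand-in data: two "rivers" with different Mg and Ca levels.
n <- 100
river <- factor(rep(c("A", "B"), each = n))
assays <- data.frame(
  Mg = rlnorm(2 * n, meanlog = rep(c(0, 1), each = n), sdlog = 0.3),
  Ca = rlnorm(2 * n, meanlog = rep(c(1, 0), each = n), sdlog = 0.3)
)

# Larger minimum terminal node sizes give shallower, more regularised trees.
node_sizes <- c(1, 5, 10, 25, 50)

if (requireNamespace("randomForest", quietly = TRUE)) {
  oob_error <- vapply(node_sizes, function(ns) {
    fit <- randomForest::randomForest(x = assays, y = river,
                                      ntree = 200, nodesize = ns)
    # Out-of-bag misclassification rate after the final tree.
    unname(fit$err.rate[fit$ntree, "OOB"])
  }, numeric(1))
  print(data.frame(nodesize = node_sizes, oob_error = oob_error))
}
```

When the labels are clean, as assumed for the river names, the out-of-bag error would be expected to change little across node sizes, which is consistent with the regularisation being unnecessary.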