tom-a-horrocks/t-SNE-geochemistry

t-SNE-geochemistry

This repository holds R scripts demonstrating the methods of the journal article "Geochemical characterisation of rock hydration processes using t-SNE". The dataset used in the article is not publicly available, so a public chemical assay database is used in this repository instead.

How to use

The "examples/examples.R" file walks through the available functions using an openly available chemical assay database, the Hydrochem dataset from the R package compositions (see https://www.rdocumentation.org/packages/compositions/versions/1.40-1/topics/Hydrochem).

Testing

The "examples/examples.R" file should create figures which match those in "examples/examples_out".

Walkthrough

We apply the techniques in "Geochemical characterisation of rock hydration processes using t-SNE" to a hydrochemical dataset containing water sample assays and their source river name.

Random Forest recursive feature elimination

The figure below shows that the Mg, Ca, Sr, and SO4 concentrations are highly predictive of the river source.

Figure: RFE
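As a rough illustration of the elimination procedure named in this section, the sketch below fits a Random Forest, records its out-of-bag (OOB) error, drops the least important feature, and repeats. It is a minimal sketch under stated assumptions and not necessarily how "examples/examples.R" implements it: the `randomForest` package, the helper name `rf_rfe`, the Gini importance ranking, and the synthetic data are all choices made here for illustration.

```r
library(randomForest)

# Hypothetical helper: recursive feature elimination ranked by Gini importance.
# x: data frame or matrix with column names; y: factor of class labels.
rf_rfe <- function(x, y, ntree = 200) {
  features <- colnames(x)
  n_features <- integer(0)
  oob_error <- numeric(0)
  dropped <- character(0)
  while (length(features) > 0) {
    fit <- randomForest(x[, features, drop = FALSE], y, ntree = ntree)
    imp <- importance(fit)  # one row per feature
    worst <- rownames(imp)[which.min(imp[, "MeanDecreaseGini"])]
    n_features <- c(n_features, length(features))
    # OOB misclassification rate after the final tree
    oob_error <- c(oob_error, unname(fit$err.rate[ntree, "OOB"]))
    dropped <- c(dropped, worst)
    features <- setdiff(features, worst)
  }
  data.frame(n_features, oob_error, dropped)
}

# Synthetic stand-in data (not the Hydrochem assays): two informative
# columns and two pure-noise columns for a three-class label.
set.seed(42)
y <- factor(rep(c("A", "B", "C"), each = 50))
x <- data.frame(
  signal1 = 3 * as.integer(y) + rnorm(150),
  signal2 = -3 * as.integer(y) + rnorm(150),
  noise1 = rnorm(150),
  noise2 = rnorm(150)
)

history <- rf_rfe(x, y)
print(history)
```

Each row of `history` reports the OOB error of the forest fitted on the features still present and the feature removed afterwards, so features that survive longest are the most predictive under this ranking.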

t-SNE embeddings

The embeddings below were generated using Rtsne with the Aitchison distance rather than the Euclidean distance. The 'subset embedding' uses only Mg, Ca, Sr, and SO4 as input.
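The Aitchison distance between two compositions equals the Euclidean distance between their centred log-ratio (clr) transforms, so it can be computed in base R and handed to Rtsne as a precomputed distance matrix. The following is a minimal sketch, not the repository's own code: the helper names `clr_transform` and `aitchison_dist`, the synthetic data, and the perplexity value are assumptions for illustration. The Rtsne call only runs if that package is installed.

```r
# Centred log-ratio transform: log of each part minus the row mean of the logs.
clr_transform <- function(x) {
  x <- as.matrix(x)
  stopifnot(all(x > 0))  # log-ratios are undefined for zero or negative parts
  log_x <- log(x)
  log_x - rowMeans(log_x)
}

# Aitchison distance = Euclidean distance between clr-transformed rows.
aitchison_dist <- function(x) dist(clr_transform(x))

# Synthetic positive "assays" (hypothetical values, not the Hydrochem data).
set.seed(1)
assays <- matrix(rexp(60 * 4), ncol = 4,
                 dimnames = list(NULL, c("Mg", "Ca", "Sr", "SO4")))
d <- aitchison_dist(assays)

# Multiplying each sample by its own positive constant (for example a unit
# change or a dilution) leaves the Aitchison distances unchanged.
rescaled <- assays * runif(nrow(assays), 0.5, 2)
print(all.equal(as.vector(d), as.vector(aitchison_dist(rescaled))))

# Embed from the precomputed distances if Rtsne is available.
if (requireNamespace("Rtsne", quietly = TRUE)) {
  emb <- Rtsne::Rtsne(as.matrix(d), is_distance = TRUE, perplexity = 10)
  print(head(emb$Y))  # one 2-D coordinate pair per sample
}
```

The scale invariance is the reason to prefer this distance for compositional data: only the ratios between elements affect the embedding, not the total concentration of a sample.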

Full embedding

Figures: tsne full embedding concentrations; tsne full embedding labels

Subset embedding

Figures: tsne subset embedding concentrations; tsne subset embedding labels

Regularised Random Forest

The Random Forest is regularised by increasing the minimum terminal node size. As this dataset presumably has little label noise (the river is known), the regularisation is ultimately unnecessary.
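One way to explore this kind of regularisation is to refit the forest over a range of minimum terminal node sizes and compare the out-of-bag (OOB) error. The sketch below assumes the `randomForest` package and its `nodesize` argument; the repository's scripts may use a different package or parameter name, and the data and grid of node sizes here are synthetic placeholders.

```r
library(randomForest)

# Synthetic stand-in data (not the Hydrochem assays): three classes,
# two informative columns, one noise column.
set.seed(42)
y <- factor(rep(c("A", "B", "C"), each = 50))
x <- data.frame(
  signal1 = 3 * as.integer(y) + rnorm(150),
  signal2 = -3 * as.integer(y) + rnorm(150),
  noise1 = rnorm(150)
)

ntree <- 200
node_sizes <- c(1, 5, 20, 50)  # hypothetical grid; larger means shallower trees

# Refit at each node size and keep the OOB error after the final tree.
oob_error <- sapply(node_sizes, function(ns) {
  fit <- randomForest(x, y, ntree = ntree, nodesize = ns)
  unname(fit$err.rate[ntree, "OOB"])
})

print(data.frame(nodesize = node_sizes, oob_error = oob_error))
```

Larger node sizes stop the trees from isolating individual samples, which helps when some labels are wrong. With clean labels, a flat or rising OOB error as the node size grows is consistent with the regularisation being unnecessary.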

About

R scripts for applying t-SNE to geochemical datasets
