Dataset Enrichment and Handshaking

Exploring ways to enrich the data.

Posted by Ahmet Cecen on May 10, 2016

Ahmet Cecen

Data Scientist / Materials Informatics

Enrichment Strategy

Because our current dataset is small, we explore whether adding simulated datasets of a similar nature, obtained from other sources, benefits the analysis.

New Dataset

A potentially compatible dataset was found online, published at Harvard Dataverse:

Microstructure results from the simulation of additive manufacturing processes with the SPPARKS Monte Carlo code. All simulations were performed on a 300 x 300 x 200 rectangular lattice. The parameters varied during the study. All length and timescales are defined within the model and refer to no actual physical system.

SimData

Length Scale Problem

Since the new data is simulated without a particular length scale, we need a way to handshake the length scales of the two datasets. This can be done visually, as in the visualization below. However, we can also use Chord Length Distributions to arrive at an objective scaling measure.

LengthScales
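As a rough illustration of what a chord length measurement involves, the sketch below counts runs of identical labels along one axis of a labeled microstructure array. It assumes the microstructure is stored as a NumPy array of integer grain or phase IDs, which the post does not state, and the function name `chord_lengths` is hypothetical. This is not necessarily how the CLDs in this post were computed.

```python
import numpy as np

def chord_lengths(labels, axis=0):
    """Return the lengths of runs of identical labels along one axis.

    Assumption: `labels` is an integer array of grain or phase IDs.
    Chords cut off by the domain boundary are included here; a more
    careful analysis might discard them.
    """
    arr = np.moveaxis(np.asarray(labels), axis, -1)
    lines = arr.reshape(-1, arr.shape[-1])  # one row per scan line
    lengths = []
    for line in lines:
        # Positions where the label changes mark chord boundaries.
        breaks = np.flatnonzero(line[1:] != line[:-1]) + 1
        edges = np.concatenate(([0], breaks, [line.size]))
        lengths.extend(np.diff(edges))
    return np.asarray(lengths)

# Tiny made-up example: two scan lines along axis 1.
toy = np.array([[1, 1, 2, 2, 2],
                [3, 3, 3, 3, 1]])
print(chord_lengths(toy, axis=1))
```

A histogram of the returned lengths, measured in pixels or voxels, gives an empirical CLD for that scan direction.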

Using Chord Length Distributions (CLD) to Handshake Length Scales

The original difference in length scales, as visible in the CLDs, is shown below. We fit a lognormal distribution to highlight trends.

AMvsOriginalCLD
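One simple way to fit a lognormal distribution to a set of chord lengths is the maximum-likelihood estimate with the location fixed at zero, which reduces to the mean and standard deviation of the log lengths. The sketch below assumes the raw chord lengths are available as an array. The fitting procedure actually used for the plot above is not stated, and the function names are hypothetical.

```python
import numpy as np

def fit_lognormal(lengths):
    """Maximum-likelihood lognormal fit (location fixed at 0).

    Returns (mu, sigma): the mean and standard deviation of log(length).
    """
    logs = np.log(np.asarray(lengths, dtype=float))
    return logs.mean(), logs.std()

def lognormal_pdf(x, mu, sigma):
    """Lognormal probability density evaluated at x > 0."""
    x = np.asarray(x, dtype=float)
    norm = x * sigma * np.sqrt(2.0 * np.pi)
    return np.exp(-((np.log(x) - mu) ** 2) / (2.0 * sigma ** 2)) / norm

def lognormal_mean(mu, sigma):
    """Mean of a lognormal distribution with parameters mu and sigma."""
    return np.exp(mu + sigma ** 2 / 2.0)

# Synthetic stand-in for measured chord lengths (not data from this study).
rng = np.random.default_rng(0)
samples = rng.lognormal(mean=1.0, sigma=0.5, size=50_000)
mu, sigma = fit_lognormal(samples)
print(mu, sigma, lognormal_mean(mu, sigma))
```

Comparing the fitted `mu` and `sigma` of the two datasets separates a pure difference in scale (a shift in `mu`) from a difference in the shape of the distribution (a change in `sigma`).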

We can then scale the simulated dataset based on the mean length scale and the fit parameters with reasonable accuracy. Here we used a scaling factor of 3 to enlarge the simulated microstructure. The resulting CLDs appear compatible, as shown below.

AMvsStrecthedCLD
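The scaling step can be sketched as follows, under the assumption that the factor is taken as the ratio of mean chord lengths (both in pixels or voxels) and rounded to an integer so that the labeled array can be enlarged by simple repetition. For a lognormal fit, multiplying every length by k shifts mu by ln(k) and leaves sigma unchanged, so a single factor can only reconcile two CLDs whose fitted sigma values are already similar. The function names and the numbers in the example are hypothetical, not values from this study.

```python
import numpy as np

def estimate_scale_factor(mean_reference, mean_simulated):
    """Ratio of mean chord lengths: how much to enlarge the simulation."""
    return mean_reference / mean_simulated

def enlarge(labels, factor):
    """Enlarge a labeled array by an integer factor along every axis.

    Each pixel or voxel is repeated `factor` times, so label values are
    preserved and every chord length is multiplied by exactly `factor`.
    """
    out = np.asarray(labels)
    for axis in range(out.ndim):
        out = np.repeat(out, factor, axis=axis)
    return out

# Made-up mean chord lengths, purely for illustration.
factor = round(estimate_scale_factor(12.0, 4.0))

toy = np.array([[1, 2],
                [3, 3]])
print(factor, enlarge(toy, factor).shape)
```

Repetition is used instead of interpolation because interpolating grain or phase IDs would create label values that do not exist in the original data.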