Metrics for the similarity of two sets of data

by FernandoP   Last Updated September 21, 2018 16:20 PM

I am trying to model a certain (discrete) behavior measured from source A, and the literature in the field has a model for a source A'.

The behavior of sources A and A' is quite similar in shape, but not in absolute value.


In this figure, the blue plot (G1) is the behavior measured for source A, the green plot (G2) is the behavior simulated with our model for source A, and the red plot (G3) is the simulated value for source A'.

My objective is to show that the shape of our model is a better approximation of this behavior than the previous model. (It is quite clear to the human eye, but I'd like to have a metric to validate our claim.) The problems are:

  • The measurements of the behavior for source A' are unavailable
  • A histogram using Mean Squared Error or even Mean Absolute Error relative to the measurements from A is not a fair metric, since the sources are different
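To make the second point concrete, here is a minimal sketch with made-up numbers (not my real data). The names `measured`, `same_shape_other_scale`, and `flat_but_close` are hypothetical and only serve the illustration: a curve with exactly the same shape but a different scale gets a much worse MSE than a flat line that happens to sit near the measurements.

```python
import numpy as np

def mse(a, b):
    """Mean squared error between two equally long sequences."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.mean((a - b) ** 2))

# Hypothetical measurements, for illustration only.
measured = np.array([1.0, 3.0, 2.0, 5.0, 4.0])

# Same shape as the measurements, but a different scale and offset.
same_shape_other_scale = 10.0 * measured + 2.0

# No shape at all: a flat line at the mean of the measurements.
flat_but_close = np.full_like(measured, measured.mean())

mse_same_shape = mse(measured, same_shape_other_scale)
mse_flat = mse(measured, flat_but_close)

print("MSE, same shape but other scale:", mse_same_shape)
print("MSE, flat line near the data:   ", mse_flat)
```

In this toy case the flat line wins on MSE even though it carries no shape information, which is why I do not consider MSE or MAE fair here.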

At the moment, I am using a scatterplot of model vs. measurement, disregarding the slope and considering only the Pearson correlation, since in both cases the model should increase and decrease at the same points as the measurement. I am therefore arguing that the greater the correlation, the greater the similarity between the model and the measurement.
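A minimal sketch of that comparison, again with made-up numbers rather than my real data. The names `pearson_r`, `model_a`, and `model_b` are hypothetical; the only library call is NumPy's `np.corrcoef`, which returns the correlation matrix of its inputs.

```python
import numpy as np

def pearson_r(model, measure):
    """Pearson correlation between model output and measurements."""
    model = np.asarray(model, dtype=float)
    measure = np.asarray(measure, dtype=float)
    # np.corrcoef returns a 2x2 matrix; the off-diagonal entry is r.
    return float(np.corrcoef(model, measure)[0, 1])

# Hypothetical measurements, for illustration only.
measured = np.array([1.0, 3.0, 2.0, 5.0, 4.0])

# Rises and falls at the same points, but on another scale.
model_a = 10.0 * measured + 2.0

# Similar range of values, but rises and falls at different points.
model_b = np.array([3.0, 1.0, 4.0, 2.0, 5.0])

r_a = pearson_r(model_a, measured)
r_b = pearson_r(model_b, measured)

print("r for model_a:", r_a)
print("r for model_b:", r_b)
```

Because the correlation is unchanged by any positive linear rescaling of the model output, the slope and offset of the scatterplot drop out and only the agreement in shape remains.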

However, this is a rather indirect measure. Is there a more direct way to do this comparison?

Any help or comment is highly appreciated.
