The Pearson correlation coefficient measures the strength of the linear relationship between two variables. Here is a tutorial:

A Beginner Guide to Pearson Correlation Coefficient – Machine Learning Tutorial

We can use it as a loss to measure the correlation between two distributions in a deep learning model. In this tutorial, we will create this loss function using TensorFlow.

## Preliminary

We will create two distributions in TensorFlow.

    import numpy as np
    import tensorflow as tf

    a = np.array([[0.15, 0.16, 0.9], [0.8, 4.15, 0.15]])
    b = np.array([[0.7, 0.12, 0.1], [0.15, 0.19, 0.05]])
    aa = tf.convert_to_tensor(a, tf.float32)
    bb = tf.convert_to_tensor(b, tf.float32)

\(aa\) and \(bb\) are two distributions. We will compute their Pearson correlation coefficient loss.

## Pearson Correlation Coefficient Loss

Similar to cosine distance loss, the Pearson correlation coefficient loss is defined as:

\(loss = 1 - p\)

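Here \(p\) is the Pearson correlation coefficient. Since \(p\) lies in \([-1, 1]\), the loss lies in \([0, 2]\): it is 0 for perfectly correlated inputs and 2 for perfectly anti-correlated inputs.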
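As a quick illustration of the definition, here is a minimal NumPy sketch. The helper name `pearson_loss` and the sample vectors are made up for this example, and it uses `np.corrcoef` rather than TensorFlow to obtain \(p\).

```python
import numpy as np

def pearson_loss(x, y):
    """Illustrative helper: 1 - p for two 1-D vectors."""
    p = np.corrcoef(x, y)[0, 1]  # off-diagonal entry is the correlation of x and y
    return 1.0 - p

x = np.array([1.0, 2.0, 3.0])
same_direction = pearson_loss(x, 2.0 * x)       # y rises with x
opposite_direction = pearson_loss(x, -2.0 * x)  # y falls as x rises

print(same_direction)      # close to 0
print(opposite_direction)  # close to 2
```

The two cases show the ends of the range: the loss shrinks toward 0 as the two vectors become more positively correlated.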

## How to compute Pearson correlation coefficient loss in TensorFlow?

We will create a function to compute this loss. Here is an example (it uses the TensorFlow 1.x API):

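    def pearson_r(y_true, y_pred):
        # Returns the Pearson loss (1 - p), averaged over rows
        x = y_true
        y = y_pred
        # Center each row by subtracting its mean
        mx = tf.reduce_mean(x, axis=1, keepdims=True)
        my = tf.reduce_mean(y, axis=1, keepdims=True)
        xm, ym = x - mx, y - my
        # L2-normalize the centered rows
        t1_norm = tf.nn.l2_normalize(xm, axis=1)
        t2_norm = tf.nn.l2_normalize(ym, axis=1)
        # Cosine distance of centered unit vectors is 1 - p
        cosine = tf.losses.cosine_distance(t1_norm, t2_norm, axis=1)
        return cosine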

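In this example, we use cosine distance loss to compute the Pearson correlation coefficient loss. The reason is that the Pearson correlation coefficient of two vectors equals the cosine similarity of the same vectors after each has been mean-centered: \(p = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_i (x_i - \bar{x})^2} \sqrt{\sum_i (y_i - \bar{y})^2}}\). Subtracting the row means and L2-normalizing therefore turns the cosine distance into \(1 - p\), which tf.losses.cosine_distance() then averages over the rows.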
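The link between cosine similarity and Pearson correlation can be checked numerically without TensorFlow. The sketch below is a NumPy re-implementation of the same steps (center, normalize, dot product), not the TensorFlow function itself. The name `rowwise_pearson` is an assumption for this example, and `np.corrcoef` serves as the reference.

```python
import numpy as np

# Same sample data as in the Preliminary section.
a = np.array([[0.15, 0.16, 0.9], [0.8, 4.15, 0.15]])
b = np.array([[0.7, 0.12, 0.1], [0.15, 0.19, 0.05]])

def rowwise_pearson(x, y):
    """Pearson r per row, computed as cosine similarity of centered rows."""
    xm = x - x.mean(axis=1, keepdims=True)
    ym = y - y.mean(axis=1, keepdims=True)
    xn = xm / np.linalg.norm(xm, axis=1, keepdims=True)
    yn = ym / np.linalg.norm(ym, axis=1, keepdims=True)
    return np.sum(xn * yn, axis=1)

r_cosine = rowwise_pearson(a, b)
# Reference values from NumPy's own correlation routine.
r_reference = np.array([np.corrcoef(a[i], b[i])[0, 1] for i in range(len(a))])

print(r_cosine)
print(r_reference)
print(1.0 - r_cosine.mean())  # loss averaged over the two rows
```

The two printed arrays should agree up to floating-point rounding, which is why the cosine distance of centered rows can stand in for \(1 - p\).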

Then we can compute the Pearson loss between \(aa\) and \(bb\).

    a_s = pearson_r(aa, bb)

    init = tf.global_variables_initializer()
    init_local = tf.local_variables_initializer()
    with tf.Session() as sess:
        sess.run([init, init_local])
        np.set_printoptions(precision=4, suppress=True)
        # Use a new name so the NumPy array a is not overwritten
        loss_value = sess.run(a_s)
        print('loss=')
        print(loss_value)

Running this code gives the loss:

0.85890067

## Evaluate our Pearson correlation coefficient loss function

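To check that our function is correct, we compare its result with scipy.stats.pearsonr(). Because the TensorFlow loss is averaged over the rows, we compute the coefficient for each row separately and then average.

Here is the example code: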

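    from scipy.stats import pearsonr

    # a and b are the NumPy arrays defined in the Preliminary section
    p1, _ = pearsonr(a[0, :], b[0, :])
    p2, _ = pearsonr(a[1, :], b[1, :])
    print(p1)
    print(p2)
    print(p1 + p2)
    # Average the two coefficients, then convert to a loss
    d = 1 - (p1 + p2) / 2
    print(d)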

Running this code, \(d\) is:

0.8589005906554071

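It is almost the same as the value of \(a_s\) computed in TensorFlow, which suggests our function is correct. The small difference is consistent with TensorFlow computing in float32 while SciPy uses float64.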
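Because the two results differ in the last digits, an exact equality test would fail. A tolerance-based comparison is one way to make the check explicit. This is a small sketch using the two values reported above, and the tolerance `1e-6` is an assumption chosen to suit float32 precision.

```python
import math

tf_loss = 0.85890067             # TensorFlow result reported above
scipy_loss = 0.8589005906554071  # SciPy result reported above

# Absolute difference between the two results
print(abs(tf_loss - scipy_loss))

# True when the values agree within a relative tolerance of 1e-6
print(math.isclose(tf_loss, scipy_loss, rel_tol=1e-6))
```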