# Implement Pearson Correlation Coefficient Loss in TensorFlow – TensorFlow Tutorial

By | March 3, 2021

The Pearson correlation coefficient measures the strength of the linear relationship between two variables. Here is a tutorial:

A Beginner Guide to Pearson Correlation Coefficient – Machine Learning Tutorial

We can use it as a loss to measure the correlation between two distributions in a deep learning model. In this tutorial, we will create this loss function using TensorFlow.

## Preliminary

We will create two distributions in TensorFlow.

import numpy as np
import tensorflow as tf
a = np.array([[0.15, 0.16, 0.9], [0.8, 4.15, 0.15]])
b = np.array([[0.7, 0.12, 0.1], [0.15, 0.19, 0.05]])

aa = tf.convert_to_tensor(a, tf.float32)
bb = tf.convert_to_tensor(b, tf.float32)

$$aa$$ and $$bb$$ are two distributions, each with two rows. We will compute the Pearson correlation coefficient loss between them, row by row.

## Pearson Correlation Coefficient Loss

Similar to the cosine distance loss, the Pearson correlation coefficient loss is defined as:

$$loss = 1 - p$$

where $$p$$ is the Pearson correlation coefficient.
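For reference, the standard definition of the Pearson correlation coefficient for two vectors $$x$$ and $$y$$ of length $$n$$ can be written as below. The symbols $$x$$, $$y$$, $$n$$, $$\bar{x}$$ and $$\bar{y}$$ are notation assumed here for illustration, with $$\bar{x}$$ and $$\bar{y}$$ denoting the means of $$x$$ and $$y$$:

$$p = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n}(y_i - \bar{y})^2}}$$

Since $$-1 \le p \le 1$$, the loss lies in $$[0, 2]$$: it is 0 for perfectly positively correlated inputs and 2 for perfectly negatively correlated inputs.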

## How to compute Pearson correlation coefficient loss in TensorFlow?

We will create a function to calculate it. Here is an example:

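def pearson_r(y_true, y_pred):
    # TensorFlow 1.x API: returns the mean of (1 - r) over the rows
    x = y_true
    y = y_pred
    # Center each row by subtracting its mean
    mx = tf.reduce_mean(x, axis=1, keepdims=True)
    my = tf.reduce_mean(y, axis=1, keepdims=True)
    xm, ym = x - mx, y - my
    # Scale the centered rows to unit length
    t1_norm = tf.nn.l2_normalize(xm, axis = 1)
    t2_norm = tf.nn.l2_normalize(ym, axis = 1)
    # Cosine distance of centered unit vectors is 1 - r, averaged over rows
    cosine = tf.losses.cosine_distance(t1_norm, t2_norm, axis = 1)
    return cosine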

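This example uses the cosine distance loss to compute the Pearson correlation coefficient loss, because the Pearson correlation coefficient of two vectors equals the cosine similarity of the same vectors after each one is mean-centered. For details, see: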

Understand the Relationship Between Pearson Correlation Coefficient and Cosine Similarity – Machine Learning Tutorial
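The relationship can be checked numerically. The following is a minimal NumPy sketch, not part of the original tutorial, that uses the first rows of `a` and `b` from the Preliminary section. The names `x`, `y`, `xm`, `ym`, `cos_centered` and `pearson` are chosen here for illustration.

```python
import numpy as np

# First rows of a and b from the Preliminary section.
x = np.array([0.15, 0.16, 0.9])
y = np.array([0.7, 0.12, 0.1])

# Center each vector by subtracting its mean.
xm = x - x.mean()
ym = y - y.mean()

# Cosine similarity of the centered vectors.
cos_centered = np.dot(xm, ym) / (np.linalg.norm(xm) * np.linalg.norm(ym))

# Pearson correlation coefficient as computed by NumPy, for comparison.
pearson = np.corrcoef(x, y)[0, 1]

print(cos_centered, pearson)
```

The two printed values should agree up to floating-point rounding, which is why centering followed by a cosine distance gives a Pearson-based loss.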

Then we can compute the Pearson loss between $$aa$$ and $$bb$$.

a_s = pearson_r(aa, bb)

init = tf.global_variables_initializer()
init_local = tf.local_variables_initializer()
with tf.Session() as sess:
    sess.run([init, init_local])
    np.set_printoptions(precision=4, suppress=True)

    # Use a new name so the NumPy array a is not overwritten
    loss_value = sess.run(a_s)

    print('loss =')
    print(loss_value)

Running this code gives the loss:

0.85890067
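The code above relies on the TensorFlow 1.x API (`tf.Session`, `tf.losses.cosine_distance`). For readers on TensorFlow 2.x, the same computation could be sketched as follows. This is an assumption-laden variant, not the tutorial's own code: it assumes eager execution, and the names `pearson_loss_tf2`, `a_np`, `b_np` and `loss_tf2` are hypothetical.

```python
import numpy as np
import tensorflow as tf  # assumes TensorFlow 2.x with eager execution


def pearson_loss_tf2(y_true, y_pred):
    """Mean of (1 - Pearson r) over the rows of two 2-D tensors."""
    # Center each row by subtracting its mean.
    xm = y_true - tf.reduce_mean(y_true, axis=1, keepdims=True)
    ym = y_pred - tf.reduce_mean(y_pred, axis=1, keepdims=True)
    # Scale the centered rows to unit length.
    xn = tf.nn.l2_normalize(xm, axis=1)
    yn = tf.nn.l2_normalize(ym, axis=1)
    # The row-wise dot product of unit vectors is the Pearson r of each row.
    r = tf.reduce_sum(xn * yn, axis=1)
    return tf.reduce_mean(1.0 - r)


# Same values as a and b in the Preliminary section.
a_np = np.array([[0.15, 0.16, 0.9], [0.8, 4.15, 0.15]])
b_np = np.array([[0.7, 0.12, 0.1], [0.15, 0.19, 0.05]])

loss_tf2 = pearson_loss_tf2(
    tf.constant(a_np, dtype=tf.float32),
    tf.constant(b_np, dtype=tf.float32),
)
print(float(loss_tf2.numpy()))
```

Writing the dot product and the mean explicitly avoids depending on the reduction behavior of the 1.x loss helper, at the cost of a couple of extra lines.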

## Evaluate our Pearson correlation coefficient loss function

To make sure our function is correct, we will compare it with scipy.stats.pearsonr(), applied to each row of the NumPy arrays a and b from the Preliminary section.

Here is the example code:

from scipy.stats import pearsonr
p1, _ = pearsonr(a[0,:], b[0,:])
p2, _ = pearsonr(a[1,:], b[1,:])
print(p1)
print(p2)
print(p1+p2)

d = 1-(p1+p2)/2
print(d)

Running this code, $$d$$ is:

0.8589005906554071

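This is almost the same as $$a_s$$ computed in TensorFlow, which means our function is correct. The small difference in the last digits is consistent with TensorFlow using float32 here while SciPy works in float64.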