# Analysis of correlation and significance of parameters

## Correlation

The study of the significance of the impact of input parameters on output parameters should begin with the analysis of the correlation of individual parameters. Three basic dependencies can be checked:

• monotonic linear
• monotonic non-linear
• quadratic

### Pearson's correlation coefficient (monotonic linear relationship)

The most basic measure of whether there is a linear correlation between parameters $x_i$ and $y_i$ is the Pearson correlation coefficient:

$r_p=\frac{\sum_{i=1}^{n}\left(x_i-\overline{x}\right)\left(y_i-\overline{y}\right)}{\sqrt{\sum_{i=1}^{n}\left(x_i-\overline{x}\right)^2}\sqrt{\sum_{i=1}^{n}\left(y_i-\overline{y}\right)^2}}$

where $\overline{x}$ and $\overline{y}$ denote the mean values of the respective parameters.

This formula can be written more compactly as

$r_p=\frac{\operatorname{cov}(x,y)}{\sqrt{\operatorname{var}(x)\operatorname{var}(y)}}$

where $x=[x_1,x_2,\ldots,x_n]$ and $y=[y_1,y_2,\ldots,y_n]$.
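The definition above translates almost term by term into code. The following is a minimal Python sketch using only the standard library; the function name `pearson` and the sample data are illustrative assumptions, not part of the original text.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient computed directly from the definition."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Numerator: sum of products of the deviations from the means.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    # Denominator: sums of squared deviations of each parameter.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical sample: y grows almost linearly with x.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]
r_p = pearson(x, y)
print(f"r_p = {r_p:.4f}")
```

Because the common factor $1/n$ (or $1/(n-1)$) cancels between the covariance and the variances, the sums of deviations can be used directly without normalising them.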

### Spearman's correlation coefficient (monotonic non-linear relationship)

Spearman's rank correlation coefficient is more general because it measures the strength of any monotonic correlation, including a non-linear one. It is given by:

$r_s=\frac{\sum_{i=1}^{n}\left(R_i-\overline{R}\right)\left(S_i-\overline{S}\right)}{\sqrt{\sum_{i=1}^{n}\left(R_i-\overline{R}\right)^2}\sqrt{\sum_{i=1}^{n}\left(S_i-\overline{S}\right)^2}}$

where $R_i$ is the rank of the observation $x_i$, $S_i$ is the rank of the observation $y_i$, and $\overline{R}$ and $\overline{S}$ are the mean values of the respective ranks $R_i$ and $S_i$.
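The rank-based formula can be sketched in Python as follows. The helper names `average_ranks` and `spearman` are illustrative. Giving tied observations the average of their positions is a common convention assumed here; the text above does not discuss ties.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of equal values that starts at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman coefficient: the Pearson formula applied to the ranks R_i, S_i."""
    R = average_ranks(x)
    S = average_ranks(y)
    n = len(R)
    mean_r = sum(R) / n
    mean_s = sum(S) / n
    num = sum((r - mean_r) * (s - mean_s) for r, s in zip(R, S))
    den = math.sqrt(sum((r - mean_r) ** 2 for r in R)
                    * sum((s - mean_s) ** 2 for s in S))
    if den == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return num / den


# Hypothetical sample: y = x**3 is monotonic but clearly non-linear.
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]
r_s = spearman(x, y)
print(f"r_s = {r_s:.4f}")
```

For this sample the ranks of $x$ and $y$ coincide, so the coefficient reaches its maximum even though the relationship is not linear.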

### Interpretation of the correlation coefficient value

Correlation type:

• $r_s>0$: positive correlation; when X increases, Y tends to increase
• $r_s=0$: no correlation; when X increases, Y sometimes increases and sometimes decreases
• $r_s<0$: negative correlation; when X increases, Y tends to decrease

Correlation strength:

• $|r_s|<0.2$: no appreciable relationship
• $0.2\le |r_s|<0.4$: weak relationship
• $0.4\le |r_s|<0.7$: moderate relationship
• $0.7\le |r_s|<0.9$: fairly strong relationship
• $|r_s|\ge 0.9$: very strong relationship
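The sign and strength thresholds listed above can be collected in a small lookup helper. This is only a sketch: the function name `describe_correlation` and the short labels it returns are illustrative choices.

```python
def describe_correlation(r):
    """Return (direction, strength) labels for a correlation coefficient r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie in [-1, 1]")
    magnitude = abs(r)
    # Thresholds follow the strength scale listed in the text.
    if magnitude < 0.2:
        strength = "none"
    elif magnitude < 0.4:
        strength = "weak"
    elif magnitude < 0.7:
        strength = "moderate"
    elif magnitude < 0.9:
        strength = "fairly strong"
    else:
        strength = "very strong"
    if r > 0:
        direction = "positive"
    elif r < 0:
        direction = "negative"
    else:
        direction = "none"
    return direction, strength


print(describe_correlation(-0.55))
```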

### Quadratic correlation coefficient (quadratic relationship)

The quadratic correlation coefficient is determined on the basis of regression analysis.

The error sum of squares $SSE$ is defined as

$SSE=\sum_{i=1}^{n}\left(y_i-\hat{y}_i\right)^2$

After fitting a second-degree polynomial to the data (i.e. determining the coefficients $a_2, a_1, a_0$), $\hat{y}_i$ is obtained by substituting $x_i$ into the approximating function:

$\hat{y}_i=a_2 x_i^2+a_1 x_i+a_0$

The total sum of squares $SST$ is

$SST=\sum_{i=1}^{n}\left(y_i-\overline{y}\right)^2$

The correlation coefficient is determined from the relationship

${r}_{q}=\sqrt{1-\frac{SSE}{SST}}$
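The procedure (fit a second-degree polynomial, compute $SSE$ and $SST$, then take the square root) can be sketched in Python. The text does not prescribe a fitting method. This sketch assumes an ordinary least-squares fit obtained by solving the normal equations with Gaussian elimination, and the names `fit_quadratic` and `quadratic_correlation` are illustrative. In practice a library routine such as NumPy's `polyfit` would typically do the fitting.

```python
import math


def fit_quadratic(x, y):
    """Least-squares coefficients (a2, a1, a0) of y ~ a2*x^2 + a1*x + a0."""
    # Power sums needed by the normal equations: s[k] = sum of x_i^k.
    s = [sum(xi ** k for xi in x) for k in range(5)]
    t = [sum(yi * xi ** k for xi, yi in zip(x, y)) for k in range(3)]
    # Augmented matrix for the unknowns (a0, a1, a2).
    m = [[s[0], s[1], s[2], t[0]],
         [s[1], s[2], s[3], t[1]],
         [s[2], s[3], s[4], t[2]]]
    # Gaussian elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        if m[col][col] == 0:
            raise ValueError("singular system: need three distinct x values")
        for row in range(col + 1, 3):
            factor = m[row][col] / m[col][col]
            for c in range(col, 4):
                m[row][c] -= factor * m[col][c]
    # Back substitution.
    a = [0.0, 0.0, 0.0]
    for row in (2, 1, 0):
        tail = sum(m[row][c] * a[c] for c in range(row + 1, 3))
        a[row] = (m[row][3] - tail) / m[row][row]
    return a[2], a[1], a[0]


def quadratic_correlation(x, y):
    """r_q = sqrt(1 - SSE/SST) for a second-degree polynomial fit."""
    a2, a1, a0 = fit_quadratic(x, y)
    mean_y = sum(y) / len(y)
    sse = sum((yi - (a2 * xi ** 2 + a1 * xi + a0)) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    if sst == 0:
        raise ValueError("r_q is undefined when all y values are equal")
    # Clamp at zero to guard against tiny negative values caused by rounding.
    return math.sqrt(max(0.0, 1.0 - sse / sst))


# Hypothetical sample: roughly y = x**2, symmetric about x = 0.
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y = [4.1, 0.9, 0.0, 1.1, 3.9]
r_q = quadratic_correlation(x, y)
print(f"r_q = {r_q:.4f}")
```

A symmetric parabola like this one is the typical case in which the Pearson and Spearman coefficients are close to zero while $r_q$ is close to one.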

## Statistical testing of the significance of the correlation coefficient

To determine whether the calculated correlation coefficient is statistically significant, the null hypothesis is stated as

${H}_{0}:\delta =0$

meaning that there is no correlation between the parameters; here $\delta$ denotes the true (population) value of the correlation coefficient. The alternative hypothesis has the form

${H}_{1}:\delta \ne 0$

Under the null hypothesis, the test statistic is assumed to follow Student's t-distribution with $k=n-2$ degrees of freedom. For the Pearson correlation coefficient, for example, the value of the statistic is

$t={r}_{p}\sqrt{\frac{n-2}{1-{r}_{p}^{2}}}$

The value of the test statistic cannot be determined when $r_p=1$, when $r_p=-1$, or when $n<3$.

In all other cases, the $p$-value determined from the statistic (read from Student's t-distribution) is compared with the assumed significance level $\alpha$:

• if $p\le \alpha$, we reject $H_0$ and accept $H_1$
• if $p>\alpha$, there is no reason to reject $H_0$

Typically, a significance level of $\alpha =0.05$ is chosen, which means accepting that the null hypothesis will be rejected in 5% of the cases in which it is actually true.

The same procedure is applied to the other correlation coefficients, substituting $r_s$ or $r_q$ for $r_p$.
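The whole test can be sketched in Python with the standard library alone. As an assumption of this sketch, the two-sided $p$-value is obtained by numerically integrating the density of Student's t-distribution with Simpson's rule. A statistics library would normally provide the distribution function directly. The names `t_pdf`, `two_sided_p_value` and `correlation_t_test` are illustrative.

```python
import math


def t_pdf(t, k):
    """Density of Student's t-distribution with k degrees of freedom."""
    log_c = (math.lgamma((k + 1) / 2) - math.lgamma(k / 2)
             - 0.5 * math.log(k * math.pi))
    return math.exp(log_c - (k + 1) / 2 * math.log1p(t * t / k))


def two_sided_p_value(t, k, steps=20000):
    """P(|T| >= |t|), using Simpson's rule on [0, |t|]; steps must be even."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, k) + t_pdf(upper, k)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, k)
    central = total * h / 3  # P(0 <= T <= |t|)
    # Subtraction loses relative precision for extremely small p-values.
    return max(0.0, 1.0 - 2.0 * central)


def correlation_t_test(r, n, alpha=0.05):
    """Return (t, p, reject_h0) for a correlation coefficient r from n pairs."""
    if n < 3 or abs(r) >= 1.0:
        raise ValueError("test statistic is undefined for n < 3 or |r| = 1")
    k = n - 2
    t = r * math.sqrt(k / (1.0 - r * r))
    p = two_sided_p_value(t, k)
    return t, p, p <= alpha


# Hypothetical inputs: the same sample size with two different coefficients.
for r in (0.5, 0.9):
    t, p, reject = correlation_t_test(r, n=10)
    print(f"r = {r}: t = {t:.3f}, p = {p:.4f}, reject H0: {reject}")
```

With ten observations, a coefficient of 0.5 does not allow the null hypothesis to be rejected at $\alpha=0.05$, whereas a coefficient of 0.9 does. This shows why the coefficient value should always be read together with the sample size.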
