Posted by: thefunkymonk | May 18, 2009

Dungeness Crab Growth

Pearson’s coefficient of correlation, ‘r’

The most common measure of “correlation” or “predictability” is Pearson’s coefficient of correlation. Pearson’s r can take any value between -1 and 1. The larger r is in absolute value, the stronger the linear association between the two variables and the more accurately you can predict one variable from knowledge of the other.

r=\frac{1}{n}\sum_{i=1}^{n}\frac{x_i -\overline{x}}{s_x}\cdot\frac{y_i -\overline{y}}{s_y}

\overline{x} = average of the x_i

\overline{y} = average of the y_i

s_x = standard deviation of the x_i (dividing by n)

s_y = standard deviation of the y_i (dividing by n)

n = number of (x_i, y_i) pairs
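To make the formula concrete, here is a minimal Python sketch that computes r exactly as written: standardize each x and y, multiply the pairs, and average. The function name `pearson_r` and the sample numbers are made up for illustration and are not the crab measurements. It assumes the standard deviations divide by n (which is what `statistics.pstdev` does), so that they match the 1/n factor in front of the sum.

```python
from statistics import mean, pstdev

def pearson_r(xs, ys):
    """Pearson's r as the average product of standardized deviations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    x_bar, y_bar = mean(xs), mean(ys)
    # pstdev divides by n, matching the 1/n factor in the formula
    s_x, s_y = pstdev(xs), pstdev(ys)
    if s_x == 0 or s_y == 0:
        raise ValueError("r is undefined when either variable is constant")
    n = len(xs)
    return sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
               for x, y in zip(xs, ys)) / n

# Made-up illustrative numbers (not the crab data)
xs = [1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 2.0, 4.0]
print(round(pearson_r(xs, ys), 3))  # 0.8
```

A perfectly straight rising line gives r = 1, a perfectly straight falling line gives r = -1, and the slightly scrambled example above lands at 0.8.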

Background: Dungeness crabs are fished on the West Coast and are important to Pacific fishermen. The population of Dungeness crabs has been threatened by over-fishing, and some believe that allowing a restricted harvest of female crabs may be the answer. The entire male population is fished out each year. To control the large year-to-year fluctuations in the number of male crabs caught, biologists have raised the question of fishing female crabs as well. The Canadian fishing industry allows fishing of female crabs, and the Canadian population doesn’t seem to show the fluctuations seen in the U.S.




The data above show premolt size plotted against postmolt size. Adding a linear trendline and computing Pearson’s coefficient shows that the two sizes are very closely correlated.


The actual and predicted values correspond very closely. The predictor is an excellent fit, judging by the high Pearson coefficient. It is based on the equation from the previous graph. The linear fit of the data gives the equation y = 0.914x + 25.803; plugging the premolt size of the crab in for “x” gives the predicted postmolt size of the crab as “y”.
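As a rough sketch of the prediction step, the trendline can be wrapped in a small Python function. The slope and intercept are the values quoted from the graph; the names `predict_postmolt` and `residuals` and the example premolt size of 100 are made up for illustration, and the units are whatever the original measurements used.

```python
SLOPE = 0.914       # slope of the trendline quoted in the post
INTERCEPT = 25.803  # intercept of the same trendline

def predict_postmolt(premolt):
    """Predicted postmolt size from premolt size, using y = 0.914x + 25.803."""
    return SLOPE * premolt + INTERCEPT

def residuals(premolt_sizes, postmolt_sizes):
    """Actual minus predicted postmolt size for each crab."""
    return [post - predict_postmolt(pre)
            for pre, post in zip(premolt_sizes, postmolt_sizes)]

# Hypothetical premolt size, not taken from the data set
print(round(predict_postmolt(100.0), 3))  # 117.203
```

Small residuals across the whole data set are what "the actual and predicted values correspond very closely" means in practice.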



Above, the actual data is plotted against a line forced through zero. It shows how close the relationship is to being perfectly linear, with perfect correlation.

After taking a much smaller sample and comparing the actual premolt size with the calculated premolt size, the correlation is much weaker; this random sample shows no correlation at all. The sample must be much larger for the result to be meaningful.
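One way to see why a small sample can be misleading is to simulate it. The sketch below does not use the crab measurements: it builds synthetic data around the trendline y = 0.914x + 25.803 with an assumed noise level (standard deviation 10) and an assumed size range (100 to 160), then compares how much r moves around across random samples of 5 points versus 100 points. All names and numbers other than the trendline are assumptions for illustration.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson's r via sums of products (algebraically the same as the z-score form)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def r_range(xs, ys, sample_size, trials, rng):
    """Smallest and largest r seen over repeated random samples of the given size."""
    rs = []
    for _ in range(trials):
        picked = rng.sample(range(len(xs)), sample_size)
        rs.append(pearson_r([xs[i] for i in picked], [ys[i] for i in picked]))
    return min(rs), max(rs)

rng = random.Random(0)  # fixed seed so the run is repeatable

# Synthetic data: assumed size range and noise level, NOT the real measurements
xs = [rng.uniform(100.0, 160.0) for _ in range(400)]
ys = [0.914 * x + 25.803 + rng.gauss(0.0, 10.0) for x in xs]

small = r_range(xs, ys, sample_size=5, trials=200, rng=rng)
large = r_range(xs, ys, sample_size=100, trials=200, rng=rng)
print("r over samples of 5:   min %.2f, max %.2f" % small)
print("r over samples of 100: min %.2f, max %.2f" % large)
```

With only a handful of points, r swings over a much wider range from one sample to the next, so a single small sample can look far weaker (or stronger) than the full data set really is.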

Conclusion: This data is closely correlated according to Pearson’s coefficient; however, a sufficient amount of data must be used for the result to be accurate. Using the linear trendline, you can very closely predict the postmolt size from the premolt size.

